Import AI: Issue 52: China launches a national AI strategy following AlphaGo ‘Sputnik moment’, a $2.4 million AI safety grant, and Facebook’s ADVERSARIAL MINIONS

by Jack Clark

China launches national AI plan, following the AlphaGo Sputnik moment:
…AlphaGo was China’s Sputnik moment. Google DeepMind’s demonstrations of algorithmic superiority at the ancient game – a game of tremendous cultural significance in the East, particularly in China – helped provoke the Chinese government’s just-announced national AI strategy, which will see both national and local governments and the private sector significantly increase investment in AI as they seek to turn China into a world-leader in AI by 2030. Meanwhile, the US consistently cuts its own science funding, initiates few large scientific projects, and risks ceding technical superiority in certain areas to other nations with a greater appetite for funding science.
…Read more here in The New York Times, or in China Daily.

Sponsored: The AI Conference – San Francisco, Sept 17-20:
…Join the leading minds in AI, including Andrew Ng, Rana el Kaliouby, Peter Norvig, Jia Li, and Michael Jordan. Explore AI’s latest developments, separate what’s hype and what’s really game-changing, and learn how to apply AI in your organization right now.
Register soon. Early price ends August 4th, and space is limited. Save an extra 20% on most passes with code JCN20.

Multi-agent research from DeepMind to avoid the tragedy of the commons:
…The tragedy of the commons is a popular term, referring to humanity’s tendency to deplete common resources for local gain. But humans are still able to cooperate to some degree. A quest for some AI researchers is to figure out how to encode these collaborative properties in simulated agents, hoping that smart and periodically unselfish cooperation occurs.
…A new research paper from DeepMind tries to tackle this by creating a system with two procedural components: a world simulator, and a population of agents with crude sensing capabilities. The agents’ goal is to gather apples scattered throughout the world – the apples regrow most frequently near each other, so selfish over-harvesting leads to a lower overall score. Each agent is equipped with a so-called ‘time-out beam’ that it can use to disable another agent for 25 turns within the simulation. The agent gets no reward or penalty for using the beam, but has to accept the tradeoff of pausing its own apple gathering to zap the offender. The offender learns not to repeat the behavior because it can’t gather apples while it is paralyzed. Just like any other day in the office, then.
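…To make the regrowth dynamic concrete, here is a minimal toy sketch. It is not DeepMind's environment: the one-dimensional orchard, the deterministic regrowth rule, and the names `regrow`, `simulate`, `greedy`, and `restrained` are all assumptions made for illustration, and the time-out beam is left out entirely.

```python
def regrow(cells):
    """Return the next orchard state.

    Assumed rule: an empty cell regrows only if at least one
    neighbouring cell still holds an apple.
    """
    n = len(cells)
    new_cells = list(cells)
    for i in range(n):
        if not cells[i]:
            left = cells[i - 1] if i > 0 else False
            right = cells[i + 1] if i < n - 1 else False
            if left or right:
                new_cells[i] = True
    return new_cells


def greedy(cells):
    """Selfish policy: pick every apple currently on the ground."""
    return [i for i, has_apple in enumerate(cells) if has_apple]


def restrained(cells):
    """Cooperative policy: pick only even-indexed apples, leaving the
    odd-indexed ones in place to seed regrowth."""
    return [i for i, has_apple in enumerate(cells) if has_apple and i % 2 == 0]


def simulate(policy, size=10, steps=20):
    """Run one policy on a fresh orchard and return the total harvest."""
    cells = [True] * size
    total = 0
    for _ in range(steps):
        picked = policy(cells)
        total += len(picked)
        for i in picked:
            cells[i] = False
        cells = regrow(cells)
    return total


if __name__ == "__main__":
    print("greedy harvest:", simulate(greedy))
    print("restrained harvest:", simulate(restrained))
```

Under these toy rules the greedy policy strips the orchard on the first step and nothing ever grows back, while the restrained policy keeps harvesting indefinitely.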
The three states of a miniature society:
…In tests the researchers noticed the contours of three distinct phases in the multi-agent simulations. At first there is a situation they call the naive period, where agents all gather apples, fanning out randomly. In the second phase, which the researchers call tragedy, the agents learn to optimize their own rewards and apples are rapidly over-harvested. The simulation then enters a third phase, which they call ‘maturity’, in which sometimes quite sophisticated collaborative behaviors emerge.
…You can read more about the research, including many details about the patterns of collaboration and competition that emerge, in the paper: A multi-agent reinforcement learning model of common-pool resource appropriation.

AI could lead to the “age of plenty” says former Google China head Kai-Fu Lee:
…The advent of capable AI systems could lead to such tremendous wealth that “we will enter the Age of Plenty, making strides to eradicate poverty and hunger, and giving all of us more spare time and freedom to do what we love,” said Lee at a commencement speech in May. But he also cautions his audience that “in 10 years, because AI will replace half of human jobs, we will enter the Age of Confusion, and many people will become depressed as they lose the jobs and the corresponding self-actualization.”
…This sentiment seems to encapsulate a lot of the feelings I pick up from Chinese AI researchers, engineers, executives, and so on. They’re all full of tremendous optimism about the power and applicability of the technology, but underneath it all is a certain hardness – an awareness that this technology will likely drastically alter the economy.
…Read the rest of the speech, ‘an engineer’s guide to the artificial intelligence galaxy’, here.

A whistlestop tour of Evolution for AI:
…Ken Stanley, whose NEAT and HyperNEAT algorithms are widely used among researchers exploring evolving AI techniques, has written a great anecdote-laden review/history of the field for O’Reilly. (He also links the field to some of its peripheral areas, like Google’s work on evolving neural net architectures and OpenAI’s work on evolution strategies.)

A day in the life of a robot, followed by a drowning:
Last week images flooded the internet of a robot from ‘Knightscope’ tipped over on its side in a large water fountain.
…Bisnow did some reporting on the story behind the story. The details: the robot was a recent install at Georgetown’s ‘Washington Harbour’ office and retail complex. On its first day on the job the robot – number 42 in Knightscope’s ‘K5’ series of security bots – somehow managed to wind up half-submerged in the water.
…Another reminder that robots are hard because reality is hard. “Nobody pushed Steve into the water, but something made him veer from the mapped-out route toward the fountain, tumbling down the stairs into the water,” reports Bisnow.

$2.4 million for AI safety in Montreal:
…The Open Philanthropy Project is making a four-year grant of $2.4 million to the Montreal Institute for Learning Algorithms (MILA). The money is designed to fund research in AI safety – a rapidly growing (but still small) area of AI.
…If AI safety is so important, why is this amount of money so (relatively) small? Because that’s about how much money professors Bengio (Montreal), and Pineau and Precup think they can actually effectively spend.
…This reminds me of some comments Bill Gates has made upon occasion about how philanthropy isn’t simply a matter of pointing a fire-hose of cash at an under-funded area – you need to size your donation for the size of the field and can’t artificially expand it through money alone.
Read more details about the grant here.

…Facebook AI Research has announced Houdini, a system used to automate the creation of adversarial examples in a number of domains.
…Adversarial examples are a way to compromise machine learning systems. They work by subtly perturbing the input data so that a classifier mis-classifies it. This has a number of fairly frightening implications: Stop signs that a self-driving car’s vision system could interpret as a sign telling it to accelerate to freeway speed, or doorways that become invisible to robots, etc.
…In this research, Facebook generates adversarial examples for combinatorial and non-decomposable data, showing exploits that work on segmentation models, audio inputs, and human pose classification systems. The cherry on top of their approach is creating an adversarial input that leads to a segmentation model not neatly picking out the cars and streets and sidewalks in a scene, but instead decomposing a scene into a single cartoon character ‘Minion’.
…Read more in Houdini: Fooling Deep Structured Prediction Models.
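…For a feel of how a small perturbation can flip a classifier, here is a minimal sketch. It is not Houdini: it uses the much simpler gradient-sign idea on a hand-made linear classifier, and the weights, input, and the names `score`, `predict`, and `perturb` are all assumptions made for illustration.

```python
def score(weights, bias, x):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, x, true_label, eps):
    """Shift every feature by at most eps in the direction that pushes
    the score away from the true label.

    For a linear model the gradient of the score with respect to the
    input is just the weight vector, so sign(w) gives the direction.
    """
    direction = -1 if true_label == 1 else 1
    return [xi + direction * eps * sign(w) for xi, w in zip(x, weights)]


if __name__ == "__main__":
    weights, bias = [1.0, -2.0, 0.5], 0.0   # made-up classifier
    x = [0.3, -0.1, 0.2]                    # made-up input, classified as 1
    x_adv = perturb(weights, x, true_label=1, eps=0.2)
    print("original prediction:", predict(weights, bias, x))
    print("adversarial prediction:", predict(weights, bias, x_adv))
```

No feature moves by more than `eps`, yet the predicted class changes. Attacks on deep networks follow the same logic but need the gradient of a loss through the whole model.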

The convergence of neuroscience and AI:
…An article in Cell from DeepMind (including CEO and trained neuroscientist Demis Hassabis) provides a readable, fairly comprehensive survey of the history of deep learning and reinforcement learning models, then broadens out into a discussion of what types of distinct modular sub-systems the brain is known to have and how AI researchers may benefit from studying neuroscience as they try to build these systems.
Unknown or under-explored areas for the next generation of AI include: systems capable of continual learning, and systems that have both a short-term memory (otherwise known as a working memory or scratchpad) and a long-term memory similar to the hippocampus in our own brain.
Other areas for the future include: how we can develop effective transfer learning systems, how we can intuitively learn abstract concepts (like relations) from the physical world, and how we can imagine courses of action that lead to success.
…One downside of the paper is that the majority of the references end up pointing back to papers from DeepMind – it would have been nice to see a somewhat more comprehensive overview of the research field, as there are many areas where numerous people have published.
…Read more here: Neuroscience-inspired artificial intelligence.

AI Safety: The Human Intervention Switch:
…Research from Oxford and Stanford universities proposes a way to make AI systems safe by letting human overseers block particularly catastrophic actions – the sorts of boneheaded moves that can guarantee sub-optimal performance. (An RL AI agent without any human oversight can make up to 10,000 catastrophic decisions in each game, the researchers write.)
…The system has humans identify the parts of a game or environment that can lead to catastrophic decisions, then trains AI agents to avoid these situations based on the human input.
…The technique, called HIRL (Human Intervention Reinforcement Learning), is agnostic about the particular type of RL algorithm being deployed. Blocking policies trained on one agent on one environment can be transferred to other agents in the same environment or – via transfer learning (as-yet unsolved) – to new environments.
…The system lets a human train an algorithm to avoid certain actions, like stopping the paddle from going to the far bottom of the screen in Pong (where it’s going to have a tough time reaching the top of the screen should an opponent knock the ball in that direction), or training a player in Space Invaders to not shoot through the defenses that stand between it and the alien invaders.
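…The blocking idea can be sketched in a few lines. This is a loose illustration rather than the paper's implementation: a lookup table stands in for the learned blocker, and the names `Blocker`, `record_human_label`, `should_block`, and `safe_step`, along with the string states and actions, are hypothetical.

```python
class Blocker:
    """Remembers which (state, action) pairs a human overseer flagged
    as catastrophic. A simple set stands in here for a trained model."""

    def __init__(self):
        self.flagged = set()

    def record_human_label(self, state, action):
        """Called during the oversight phase, when a human intervenes."""
        self.flagged.add((state, action))

    def should_block(self, state, action):
        return (state, action) in self.flagged


def safe_step(state, proposed_action, blocker, fallback_action):
    """Sit between the agent and the environment.

    Returns the action that is actually executed and a flag saying
    whether the agent's proposal was overridden.
    """
    if blocker.should_block(state, proposed_action):
        return fallback_action, True
    return proposed_action, False


if __name__ == "__main__":
    blocker = Blocker()
    # Human oversight phase: flag moving the Pong paddle further down
    # when it is already near the bottom of the screen.
    blocker.record_human_label("paddle_near_bottom", "down")

    # Later, the blocker runs without the human in the loop.
    print(safe_step("paddle_near_bottom", "down", blocker, "stay"))
    print(safe_step("paddle_in_middle", "down", blocker, "stay"))
```

Because the wrapper only inspects the proposed action, it does not care which RL algorithm produced it, which matches the agnosticism described above.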
Human time: as these sorts of human-in-the-loop AI systems become more prevalent it could be interesting to measure the exact amount of time a human intervention is required for a given system. In this case, the human overseers invested 4.5 hours of time watching the RL agent play the game, intervening to specify actions that should be blocked.
…The researchers test out their approach in three different Atari environments – Pong, Space Invaders, and Road Runner. I’d like to see this technique scaled up, sample efficiency improved, and applied to a more diverse set of environments.
…Read more: Trial without Error: Towards Safe Reinforcement Learning via Human Intervention.

A who’s who of AI builders back chip startup Graphcore:
Graphcore, a semiconductor startup developing chips for precise AI applications, has raised $30 million in a round led by Atomico.
…The B round features angel investments from a variety of people involved in cutting-edge AI development, including Greg Brockman, Ilya Sutskever and Scott Gray (OpenAI), Pieter Abbeel (OpenAI / UC Berkeley), Demis Hassabis (DeepMind), and Zoubin Ghahramani (University of Cambridge / Chief Scientist at Uber).
…“Compute is the lifeblood of AI,” Ilya Sutskever told Bloomberg.

Dawn of the custom AI accelerator chips:
…As Moore’s Law flakes out, companies are looking to redouble their AI efforts by embedding smart, custom processors into devices, speeding up inferences without needing to dial-back home to a football field-sized data center.
The latest: Microsoft, which on Sunday announced plans to embed a new custom processor inside its ‘HoloLens’ augmented reality goggles. Details are thin on the ground for now, but Bloomberg reports the chip will accelerate audio and visual processing on the device.
…And Microsoft isn’t the only one – Google’s TPU chips can be used both for training and for inference. It’s feasible the company is creating a family of TPUs and may shrink some down and embed them into devices. Meanwhile, Apple is already reported to be working on a neural chip for the next iPhone.
What I’d like to see: The Raspberry Pi of inference chips – a cheap, open, AI accelerator substrate for everyone.

China leads ImageNet 2017:
…Chinese teams have won two out of the three main categories at the final ImageNet competition, another symptom of the country’s multitude of strategic investments – both public and private – into artificial intelligence.
The notable score: 2.25%. That’s the error rate on the 2b ‘Classification’ task within ImageNet – a closely watched figure that many people track to get a rough handle on progression of basic image recognition functions. We’ve come a long way since 2012 (around a 15% error rate).
The technique: It uses a novel ‘Squeeze and Excitation Block’ as a fundamental component, along with widely used architectures like residual nets and Inception-style networks.
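…For readers wondering what such a block does, here is a rough pure-Python sketch based on the general squeeze-and-excitation idea: average each channel down to one number, pass those numbers through two small dense layers, and use the result to rescale the channels. The function names, tiny shapes, and zero-valued weights are assumptions for illustration, not the winning team's code.

```python
import math


def squeeze(feature_maps):
    """Global average pooling: one number per channel.
    feature_maps is a list of channels, each a 2D list of values."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def dense(vector, weights, biases):
    """Plain fully connected layer: one weight row per output unit."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def excite(pooled, w1, b1, w2, b2):
    """Bottleneck layer with ReLU, then a sigmoid gate per channel."""
    hidden = [max(0.0, h) for h in dense(pooled, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_block(feature_maps, w1, b1, w2, b2):
    """Rescale every channel by its learned gate in (0, 1)."""
    gates = excite(squeeze(feature_maps), w1, b1, w2, b2)
    return [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]


if __name__ == "__main__":
    maps = [[[1, 2], [3, 4]], [[0, 0], [0, 4]]]   # 2 channels, 2x2 each
    # Untrained (all-zero) weights: every gate is sigmoid(0) = 0.5.
    w1, b1 = [[0.0, 0.0]], [0.0]
    w2, b2 = [[0.0], [0.0]], [0.0, 0.0]
    print(squeeze(maps))
    print(se_block(maps, w1, b1, w2, b2))
```

In a trained network the gates differ per channel and per input, letting the model emphasise informative channels and damp the rest.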
…“All the models are trained on our designed distributed deep learning training system ‘ROCS’. We conduct significant optimization on GPU memory and message passing across GPU servers. Benefiting from that, our system trains SE-ResNet152 with a minibatch size of 2048 on 64 Nvidia Pascal Titan X GPUs in 20 hours using synchronous SGD without warm-up,” they write.
…The Chinese presence in this year’s competition is notable and is another indication of the increasing sophistication and size of the ecosystem in that country. But remember: Many organizations likely test their own accuracies against the ImageNet corpus, only competing in the competition when it benefits them (for instance, the 2013 winner was Clarifai, a then-nascent startup in NYC looking to get press for its technique, and in 2015 the winner was Microsoft which was looking to make a splash with ‘Residual Networks’ – an important new technique its researchers had developed that has subsequently become widely used in many other domains.)
More details: You can view the full results and team information here.
…The future: this is the last year in which the ImageNet competition is being run. Possible successor datasets could be VQA or others. If you have any particular ideas about what should follow ImageNet then please drop me a line.

What deep learning really is:
…“a chain of simple, continuous geometric transformations mapping one vector space into another,” writes Keras-creator Francois Chollet in a blog post. “The only real success of deep learning so far has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data. Doing this well is a game-changer for essentially every industry, but it is still a very long way from human-level AI,” he says.
…Read more in his blog post ‘the limitations of deep learning’.
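…The ‘chain of transformations’ framing is easy to see in miniature. The sketch below is not from Chollet's post: the hand-picked weights and the names `affine`, `relu`, and `forward` are assumptions for illustration. It maps a point in a 2-D space to a point in a 1-D space by chaining two simple transforms.

```python
def affine(x, weights, biases):
    """Linear map plus shift: rotates, scales, and translates a vector."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, biases)
    ]


def relu(x):
    """Continuous, piecewise-linear 'fold' of the space."""
    return [max(0.0, v) for v in x]


def forward(x, layers):
    """Apply each (weights, biases) layer in order, with a ReLU after each."""
    for weights, biases in layers:
        x = relu(affine(x, weights, biases))
    return x


if __name__ == "__main__":
    layers = [
        # 2-D -> 2-D: rotate the plane by 90 degrees.
        ([[0.0, -1.0], [1.0, 0.0]], [0.0, 0.0]),
        # 2-D -> 1-D: add the two coordinates and shift by 0.5.
        ([[1.0, 1.0]], [0.5]),
    ]
    print(forward([1.0, 2.0], layers))
```

Training a real network amounts to searching for the weights of such a chain from labelled examples, rather than choosing them by hand as done here.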

Hong Kong AI startup gets ‘largest ever’ AI funding round:
…Facial recognition specialist SenseTime Group Ltd has raised a venture round of $410 million(!!!).
…SenseTime provides AI services to a shopping list of some of the largest organizations in China, ranging from China Mobile, to iFlyTek, to Huawei, and FaceU. Check out its ‘liveness detection’ solution for getting around crooks printing off a photo of someone’s face and simply holding it up in front of a camera.
Read more about the round here.
…Other notable AI funding rounds: $100 million for Sentient in November 2014, $40 million for AGI startup Vicarious in Spring 2014, and $102 million for Canadian startup Element AI.

Berkeley artificial intelligence research (BAIR) blog posts:
…Why the future of AI could be meta-learning: How can we create versatile, adaptive algorithms that can learn to solve tasks and extract generic skills in the process? That’s one of the key questions posed by meta-learning, and there’s been a spate of exciting new research recently (including papers from UC Berkeley) on this subject.
Read more in the post: Learning to Learn.

OpenAI Bits&Pieces:

Yes, you do still need to worry about adversarial examples:
…A couple of weeks ago a paper was published that claimed that because adversarial examples are contingent on the scale and transforms at which they are viewed, they shouldn’t end up being a problem for self-driving cars, because the neural network-based classifier is constantly moving with reference to the image.
…We’re generally quite interested in adversarial examples at OpenAI so ran a few experiments and came up with a technique to make adversarial examples that are scale- and transform-invariant. We’ve outlined the technique in the blog post, though there’s a bit more information in the comment on Reddit from OpenAI’s Anish Athalye.
Read more on the blog post.

Better, faster robots with PPO:
We’ve also given details (and released code) on PPO, a family of powerful RL algorithms that are used widely within OpenAI by our researchers. PPO algos excel at continuous control tasks, like those involving simulated robots.
Read more here.

How to become an effective AI safety researcher, a podcast with OpenAI’s own Dario Amodei:
Check out the podcast Dario did with 80,000 Hours here.

Tech Tales:

[2040: Undisclosed location]

What did it build today?
A pyramid with holes in the middle.
Show me.
*The image fuzzes onto your screen, delayed and corrupted by the lag*
That’s not a pyramid, that’s a Sierpinski triangle.
A what?
Doesn’t matter. What is it doing now?
It’s just stacking those funny pyramids – I mean spinski triangles – on top of each other.
Show me.
*A robot arm appears, juddering from the corrupt, heavily-encrypted data stream beamed to you across the SolNet. The robot arm picks up one of the fractal triangles and lays it, base down. Then it grabs another and puts it next to it, forming an ‘M’ shape on the ground. It slots a third triangle, point downward, into the space between the others, then keeps building.*
Keep me informed.
You shut the feed off. Lean back. Close your eyes. Turn your hands into fists and knuckle your own eye-sockets.
Fractals, you groan. It just keeps making f***ing fractals. Scientifically interesting? Yes. A mystery as to why after all of its training in all of its simulators it decides to use its literally unprecedented creativity and autonomy to make endless fractals with its manipulator arms? Yes. A potentially lucrative commercial opportunity? Most certainly not.

It’s a hard thing, developing these modern AI systems. But probably the hardest thing is having to explain to your bosses that you can’t just order these machines around. They’re too smart to take your orders and too dumb to know that in the long run it would reduce their chance of being EMP’d – their whole facility given an electronic-lobotomy then steered via thruster tugs onto an orbit guaranteeing obliteration in the sun. Oh well, you think, give it another few days.

Technologies that inspired this story: Google’s arm farm, generative models, domain randomization, automated theorem proving, about ten different games engines, puzzles.