Import AI

Import AI: #65: Berkeley teaches robots to predict the world around them, AlphaGo Zero’s intelligence explosion, and Facebook reveals a multi-agent approach to language translation

Welcome to Import AI, subscribe here.

Facebook’s translators of the future could be little AI agents that teach each other:
…That’s the idea behind new research where, instead of having one agent try to learn correspondences between languages from a large corpus of text, you have two agents which each know a different language attempt to describe images to one another. The approach works in simple environments today but, as with most deep learning techniques, can and will be scaled up rapidly for larger experiments now that it has shown promise.
…The experimental setup: “We let two agents communicate with each other in their own respective languages to solve a visual referential task. One agent sees an image and describes it in its native language to the other agent. The other agent is given several images, one of which is the same image shown to the first agent, and has to choose the correct image using the description. The game is played in both directions simultaneously, and the agents are jointly trained to solve this task. We only allow agents to send a sequence of discrete symbols to each other, and never a continuous vector.”
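…A toy sketch of this game loop in Python (my own illustration with tabular agents and made-up update rules – the paper’s agents are neural networks trained end-to-end, so treat this purely as a structural sketch):

import random
from collections import defaultdict

N_CLASSES, N_SYMBOLS, N_CANDIDATES = 5, 5, 3

# Tabular "policies": the speaker scores symbols per image class, the listener
# scores image classes per received symbol.
speaker = defaultdict(lambda: [0.0] * N_SYMBOLS)
listener = defaultdict(lambda: [0.0] * N_CLASSES)

def pick(scores, eps=0.1):
    # epsilon-greedy choice over a list of scores
    if random.random() < eps:
        return random.randrange(len(scores))
    return max(range(len(scores)), key=lambda i: scores[i])

def make_round():
    target = random.randrange(N_CLASSES)
    distractors = random.sample([c for c in range(N_CLASSES) if c != target], N_CANDIDATES - 1)
    candidates = distractors + [target]
    random.shuffle(candidates)
    return target, candidates

for step in range(5000):
    target, candidates = make_round()
    symbol = pick(speaker[target])                                       # speaker describes the target
    guess = candidates[pick([listener[symbol][c] for c in candidates])]  # listener picks an image
    reward = 1.0 if guess == target else -0.1                            # both agents share the reward
    speaker[target][symbol] += 0.1 * reward
    listener[symbol][guess] += 0.1 * reward

correct = 0
for _ in range(1000):
    target, candidates = make_round()
    symbol = pick(speaker[target], eps=0.0)
    guess = candidates[pick([listener[symbol][c] for c in candidates], eps=0.0)]
    correct += guess == target
print("referential accuracy:", correct / 1000)

…After enough rounds the two agents typically settle on a shared symbol-for-class protocol – the same pressure that, at scale and with real images and sentences, yields the translations described next.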
…The results: For sentence-level precision, they train on the MS COCO dataset which contains numerous English<>Image pairs, and STAIR which contains Japanese captions for the same images, along with translations of German to English phrases and associated images, with the German phrases made by a professional translator. These results are encouraging, with systems trained in this way attaining competitive or higher BLEU scores than alternate systems.
…This points to a future where we use multiple, distinct learning agents within larger AI software, delegating increasingly complicated tasks to smart, adaptable components that are able to propagate information between and across each other. (Good luck debugging these!)
…Read more: Emergent Translation in Multi-Agent Communication.

Sponsored: What does Intelligent Automation Adoption in US Business Services look like as of September 2017? The Intelligent Automation New Orleans Team is here to provide you with real-time data on the global IA landscape for business services, gathered from current IA customers and vendors by SSON Analytics.
Explore the interactive report.
…One stat from the report: 66.5% of recorded IA pilots/implementations are by large organizations with annual revenue >$10 Billion USD.

History is important, especially in AI:
Recently I asked the community of AI practitioners on Twitter what papers I should read that are a) more than ten years old and b) don’t directly involve Bengio/Hinton/Schmidhuber/LeCun.
…I was fortunate to get a bunch of great replies, spanning giants of the field like Minsky and Shannon, to somewhat more recent works on robotics, apprenticeship learning, and more.
Take a gander at the replies to my tweet here.
…(These papers feed my suspicion that about half of the ‘new’ things covered in modern AI papers are just somewhat subtle reinventions and/or parallel inventions of ideas already devised in the past. Time is a recurrent network, etc, etc.)

Intelligence explosions: AlphaGo Zero & self-play:
…DeepMind has given details on AlphaGo’s final form – a software system trained without human demonstrations, entirely from self-play, with few handcrafted reward functions. The software, named AlphaGo Zero, is able to beat all previous versions of itself and, at least based on Elo scores, develop a far greater Go capability than any other preceding system (or recorded human). The most intriguing part of AlphaGo Zero is how rapidly it goes from nothing to something via self-play. OpenAI observed a similar phenomenon with the Dota 2 project, in which self-play catapulted our system from sub-human to super-human in a few days.
Read more here at the DeepMind blog.

Love AI? Have some spare CPUs? Want some pre-built AI algorithms? Then Intel has a framework for you!
…Intel has released Coach, an open source AI development framework. It does all the standard things you’d expect like letting you define a single agent then run it on many separate environments with inbuilt analytics and visualization.
…It also provides support for Neon (an AI framework developed by Intel following its acquisition of startup Nervana) as well as the Intel-optimized version of TensorFlow. Intel says it’s relatively easy to integrate new algorithms.
…Coach ships with 16 pre-made AI algorithms spread across policy optimization and value optimization approaches, including classics like DQN and Actor-Critic, as well as newer ones like Distributional DQN and Proximal Policy Optimization. It also supports a variety of different simulation environments, letting developers test out approaches on a variety of challenges to protect against overfitting to a particular target domain. Good documentation as well.
Read more about Coach and how it is designed here.

Training simulated self-driving cars (and real RC trucks) with conditional imitation learning:
…Imitation learning is a technique used by researchers to get AI systems to improve their performance by imitating expert actions, usually by studying demonstration datasets. Intuitively, this seems like the sort of approach that might be useful for developing self-driving cars – the world has a lot of competent drivers so if we can capture their data and imitate good behaviors, we can potentially build smarter self-driving cars. But the problem is that when driving a lot of information needed to make correct decisions is implicit from context, rather than made explicit through signage or devices like traffic lights.
…New research from Intel Labs, King Abdullah University of Science and Technology, and the University of Barcelona, suggests one way around these problems: conditional imitation learning. In conditional imitation learning you explicitly queue up different actions to imitate based on input commands, such as ‘turn left’, ‘turn right’, ‘straight at the next intersection’, and ‘follow the road’. By factoring in this knowledge the researchers show you can learn flexible self-driving car policies that appear to generalize well.
…Adding in this kind of command structure isn’t trivial – in one experiment the researchers try to have the imitation learning policy factor the commands into its larger learning process, but this didn’t work reliably as there was no guarantee the system would always perfectly condition on the commands. To fix this, the researchers structure the system so it is fed a list of all the possible commands it may encounter, and is told to initiate a new branch of itself for dealing with each command, letting it learn separate policies for things like driving forward, or turning left, etc.
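…To make the branching idea concrete, here’s a minimal PyTorch sketch (my own simplification, not the authors’ released code): a shared image encoder feeds one small head per high-level command, and the command index selects which head’s output is used and trained.

import torch
import torch.nn as nn

COMMANDS = ["follow", "left", "right", "straight"]

class BranchedPolicy(nn.Module):
    def __init__(self, n_actions=2):  # e.g. steering + throttle
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, n_actions))
             for _ in COMMANDS]
        )

    def forward(self, image, command_idx):
        feats = self.encoder(image)
        # Run every branch, then pick each sample's output by its command index.
        outs = torch.stack([b(feats) for b in self.branches], dim=1)  # [batch, n_commands, n_actions]
        return outs[torch.arange(image.shape[0]), command_idx]

policy = BranchedPolicy()
images = torch.randn(8, 3, 88, 200)                # a batch of camera frames
commands = torch.randint(0, len(COMMANDS), (8,))   # one high-level command per frame
expert_actions = torch.zeros(8, 2)                 # stand-in for recorded expert actions
loss = nn.functional.mse_loss(policy(images, commands), expert_actions)
loss.backward()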
Results: The system works well on the test set of simulated towns. It also does well on a one-fifth scale remote controlled car deployed in the real world (brand: Traxxas Maxx, using an NVIDIA TX2 chip for onboard inference, and Holybro Pixhawk flight controller software to handle the command setting and inputs).
Evocative AI of the week: the paper includes a wryly funny description of what would happen if you trained expert self-driving car policies without an explicit command structure. “Moreover, even if a controller trained to imitate demonstrations of urban driving did learn to make turns and avoid collisions, it would still not constitute a useful driving system. It would wander the streets, making arbitrary decisions at intersections. A passenger in such a vehicle would not be able to communicate the intended direction of travel to the controller, or give it commands regarding which turns to take,” they write.
…Read more here: End-to-End Driving via Conditional Imitation Learning.

Basic Income trial launches in Stockton, California:
…Stockton is launching a Basic Income trial that will give $500 a month, no strings attached, to a group of residents of the struggling Californian city.
…One of the main worries of the AI sector is that its innovations will lead to a substantial amount of short-term pain and disruption for those whose jobs are automated. Many AI researchers and engineers put forward basic income as a solution to the changes AI will bring to society. But a major problem with the discourse around basic income is the lack of data. Pilots like the Stockton one will change that (though let’s be clear: the average rent for a one bedroom apartment in Stockton is around $900 a month, so this scheme is relatively small beer compared to the costs most residents will face).
…Read more here at BuzzFeed: Basic Income Isn’t Just About Robots, Says Mayor Who Just Launched Pilot Program.

Faking that the robots are already among us with the ‘Wizard of Oz’ human feedback technique:
Research from the US Army Research Lab outlines a way to collect human feedback for a given task in a way that is eventually amenable to AI techniques. It uses a Wizard of Oz (WoZ) methodology (called this because the participants don’t know who is ‘behind the curtain’ – whether human or machine). The task involves a person giving instructions to a WoZ dialogue interface which relays instructions to a WoZ robot, which carries out the instructions and reports back.
…In this research, both components of the WoZ system were accomplished by humans. The main contribution of this type of research is that it a) provides us with ways to design systems that can eventually be automated when we’ve developed sufficiently powerful AI algorithms and, b) it generates the sorts of data ultimately needed to build systems with these sorts of capabilities.
…Read more here: Laying Down the Yellow Brick Road: Development of a Wizard-of-Oz Interface for Collecting Human-Robot Dialogue.
AI archeological curiosity of the week: This system “was adapted from a design used for WoZ prototyping of a dialogue system in which humans can engage in time-offset interaction with a WWII Holocaust survivor (Artstein et al. 2015). In that application, people could ask a question of the system, and a pre-recorded video of the Holocaust survivor would be presented, answering the question.”

Follow the birdie! Berkeley researchers build better predictive models for robots:
…Prediction is an amazingly difficult problem in AI, because once you try to predict something you’re essentially trying to model the world and roll it forward – when we do this as humans we implicitly draw on most of the powerful cognitive machinery we’re equipped with, ranging from imagination, object modeling and disentanglement, intuitive models of physics, and so on. Our AI algorithms mostly lack these capabilities. That means when we try to do prediction we either have to train on large enough sets of data that we can deal with other, unseen situations that are still within the (large) distribution carved out by our training set. Or we need to invent smarter algorithms to help us perform certain cognitively difficult tasks.
…Researchers with the University of California at Berkeley and the Technical University of Munich, have devised a way to get robots to be able to not only identify objects in a scene but also remember roughly where they are, letting them learn long-term correspondences that are robust to distractors (aka: crowded scenes) and also the actions of the robot itself (which can sometimes clutter up the visual space and confuse traditional classifiers.) The approach relies on what they call a ‘Skip-Connection Neural Advection Model’.
…The results: “Visual predictive models trained entirely with videos from random pushing motions can be leveraged to build a model-predictive control scheme that is able to solve a wide range of multiobjective pushing tasks in spite of occlusions. We also demonstrated that we can combine both discrete and continuous actions in an action-conditioned video prediction framework to perform more complex behaviors, such as lifting the gripper to move over objects.”
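…The planning side of this can be sketched in a few lines of Python (a rough stand-in of my own, not the paper’s system): sample candidate action sequences, roll each through the learned predictor, score them by how close a designated pixel ends up to its goal, and execute only the first action of the best sequence.

import numpy as np

def predict_pixel(pos, actions):
    # Stand-in for the learned video-prediction model: assume each action nudges
    # the tracked pixel directly. The real model predicts whole future frames.
    return pos + actions.sum(axis=0)

def plan(current_pos, goal_pos, horizon=5, n_samples=200, rng=np.random.default_rng(0)):
    candidates = rng.normal(0.0, 2.0, size=(n_samples, horizon, 2))   # candidate pushes in pixel space
    end_positions = np.array([predict_pixel(current_pos, a) for a in candidates])
    costs = np.linalg.norm(end_positions - goal_pos, axis=1)          # planning cost: distance to goal
    return candidates[costs.argmin()][0]                              # receding horizon: take the first action only

pos, goal = np.array([10.0, 10.0]), np.array([40.0, 25.0])
for step in range(20):
    pos = pos + plan(pos, goal)        # in the real system: the robot acts, then re-observes
print("final distance to goal:", np.linalg.norm(pos - goal))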
…Systems using SNA outperform previous systems, and fall within one standard deviation of the scores of a prior system augmented with a planning cost device.
…Further research is required to let this approach handle more complex tasks and to handle things that require multiple discrete steps of work, they note.
…Read more here: Self-Supervised Visual Planning with Temporal Skip-Connections.

PlaidML hints at a multi-GPU, multi-CPU AI world:
AI Startup Vertex.ai has released PlaidML, a programming middleware stack that lets you run Keras on pretty much anything that runs OpenCL. This means the dream of ‘write once, run anywhere’ programming for AI has got a little bit closer. Vertex claims that PlaidML only adds a relatively small amount of overhead to programming operations compared to stock TensorFlow. At launch it only supports Keras – a programming framework that many AI developers use because of its expressivity and simplicity. Support for TensorFlow, PyTorch, and deeplearning4j is coming eventually, Vertex says.
Read more here on the Vertex.ai blog.
Get the code here.
…Want to run and benchmark it right now?
sudo pip install plaidml plaidml-keras
git clone https://github.com/plaidml/plaidbench
cd plaidbench
pip install -r requirements.txt
python plaidbench.py mobilenet
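…Once installed, using it from Python should look roughly like the following – this is an assumption based on the project’s documented Keras backend hook, so check the README for the exact incantation:

# Assumption: PlaidML exposes a Keras backend hook (plaidml.keras.install_backend)
# as described in its documentation; call it before importing Keras so that
# subsequent operations run on whatever OpenCL device is available.
import plaidml.keras
plaidml.keras.install_backend()

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(64, activation="relu", input_shape=(100,)),
                    Dense(10, activation="softmax")])
model.compile(optimizer="adam", loss="categorical_crossentropy")
x = np.random.rand(256, 100)                       # toy data purely to exercise the backend
y = np.eye(10)[np.random.randint(0, 10, 256)]
model.fit(x, y, epochs=1, batch_size=32)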

Google releases AVA video dataset for action recognition:
…Google has released AVA, the Atomic Visual Actions dataset, consisting of 80 individual actions represented by ~210,000 distinct labels that cover ~57,000 distinct video clips.
…Video analysis is the new frontier of AI research, following the success of general image recognition techniques on single, static images; given enough data, it’s usually possible to train a highly accurate classifier, so the current research challenge is more about scaling up techniques and improving their sample efficiency than about getting to something capable of interesting (aka reasonably high-scoring) behavior.
…Google is also able to perform analysis on this dataset to discover actions that are frequently combined with one another, as each video clip tends to be sliced from a larger 15-minute segment of a single video, allowing the dataset to feature numerous co-occurrences that could be used by researchers in the future to model even longer range temporal dynamics.
…No surprises that some of the most frequently occurring action labels include ‘hitting’ and ‘martial arts’; ‘shovel’ and ‘digging’; ‘lift a person’ and ‘play with kids’; and ‘hug’ and ‘kiss’ among others (aww!).
Read more here on the Google Research Blog.
Arxiv paper about AVA here.
Get the data directly from Google’s AVA website here.

AI regulation proposals from AI Now:
AI Now, a research organization founded by Meredith Whittaker of Google and Kate Crawford of Microsoft Research, has released its second annual report.
…One concrete proposal is that “core public agencies, such as those responsible for criminal justice, healthcare, welfare, and education (e.g “high stakes” domains) should no longer use ‘black box’ AI and algorithmic systems.” If this sort of proposal got picked up it would lead to a significant change in the way that AI algorithms are programmed and deployed, making it more difficult for people to deploy deep learning based solutions unless able to satisfy certain criteria relating to the interpretability of deployed systems.
…There are also calls for more continuous testing of AI systems both during development and following deployment, along with recommendations relating to the care and handling and inspection of data. It also calls for more teeth in the self-regulation of AI, arguing that the community should develop accountability mechanisms and enforcement techniques to ensure people have an incentive to follow standards.
…Read a summary of the report here.
Or skip to the entire report in full (PDF).
Another specific request is that companies, conferences, and academic institutions should “release data on the participation of women, minorities and other marginalized groups within AI research and development”. The AI community has a tremendous problem with diversity. At the NIPS AI conference this year there is a session called ‘Black in AI’, which has already drawn critical comments from (personal belief: boneheaded, small-minded) people who aren’t keen on events like this and seem to refuse to admit there’s a representation problem in AI.
Read more about the controversy in this story from Bloomberg News.
Read more about the workshop at NIPS here.

Universities try to crack AI’s reproducibility crisis:
AI programs are large, interconnected, modular bits of software. And because of how their main technical components work they have a tendency to fail silently and subtly. This, combined with a tendency among many researchers to either not release code, or release hard-to-understand ‘researcher code’, makes it uniquely difficult to reproduce the results found in many papers. (And that’s before we even get to the tendency for the random starting seed to significantly determine the performance of any given algorithm.)
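…A tiny illustration of the seed problem (not from the challenge, just my own toy): the same “algorithm” run under different random seeds can produce noticeably different headline numbers, which is part of why single-run results are so hard to reproduce.

import random

def noisy_training_run(seed):
    # stand-in for "train an agent once": the final score depends on initialization
    rng = random.Random(seed)
    return 100 + rng.gauss(0, 25)

scores = [noisy_training_run(seed) for seed in range(5)]
print("same algorithm, five seeds:", [round(s, 1) for s in scores])
# A faithful report gives the mean and spread across seeds, not a single best run.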
…Now, a coalition of researchers from a variety of universities is putting together a ‘reproducibility challenge’, which will challenge participants to take papers submitted to the International Conference on Learning Representations (ICLR) and try to reproduce their results.
…”You should select a paper from the 2018 ICLR submissions, and aim to replicate the experiments described in the paper. The goal is to assess if the experiments are reproducible, and to determine if the conclusions of the paper are supported by your findings. Your results can be either positive (i.e. confirm reproducibility), or negative (i.e. explain what you were unable to reproduce, and potentially explain why).”
…My suspicion is that the results of this competition will be broadly damning for the AI community, highlighting just how hard it is to reproduce systems and results – even when (some) code is available.
…Read more here: ICLR 2018 Reproducibility Challenge.

OpenAI Bits&Pieces:

Randomness is your friend…sometimes:
If you randomize your simulator enough then you may be able to train models that can rapidly generalize to real-world robots. Along with randomizing the visual appearance of the scene it’s also worth randomizing the underlying dynamics – torques, frictions, mass, and so on – to build machines that can adjust to unanticipated forces encountered in the real world. Worth doing if you don’t mind spending the additional compute budget to run the simulation(s).
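…A minimal sketch of the dynamics-randomization idea in Python (my own toy, not OpenAI’s code, with hypothetical parameter ranges): every episode samples fresh physical parameters, so the policy has to work across a distribution of simulated worlds rather than one exact simulator.

import random

def sample_dynamics():
    # hypothetical ranges purely for illustration
    return {
        "mass": random.uniform(0.5, 2.0),
        "friction": random.uniform(0.2, 1.2),
        "motor_torque_scale": random.uniform(0.7, 1.3),
        "sensor_noise_std": random.uniform(0.0, 0.05),
    }

def run_episode(policy, dynamics, steps=100):
    # stand-in for a physics-simulator rollout under the sampled parameters
    return sum(policy(t, dynamics) for t in range(steps))

policy = lambda t, dyn: 1.0 / dyn["mass"]          # placeholder "controller"
returns = [run_episode(policy, sample_dynamics()) for _ in range(20)]
print("average return across randomized worlds:", sum(returns) / len(returns))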
…Read more here: Generalizing from Simulation.

Why AI safety matters:
…Op-ed from OpenAI’s Ilya Sutskever and Dario Amodei in The Wall Street Journal about AI safety, the issues at stake, and why intelligence explosions from self-play can help us reason about futuristic AI systems.
…Read the op-ed here: Protecting Against AI’s Existential Threat.

Tech Tales:

Administrative Note: A few people have asked me this lately so figured I should make clear: I am the author of all the sci-fi shorts in tech tales unless otherwise indicated (I’ve run a couple of ‘guest post’ stories in the past). At some point I’m going to put together an e-book / real book, if people are into that. If you have comments, criticisms, or suggestions, I’d love to hear from you: jack@jack-clark.net

[2025: Boston, MIT, the glassy Lego-funded cathedral of the MIT Media Lab, a row of computers.]

So I want to call it Forest Virgil
Awful name. How about NatureSense
Sounds like a startup
ForestFeel?
Closer.
What’s it doing today?
Let’s see.

Liz, architect of the software that might end up being called ForestFeel, opens the program. On screen appears a satellite view of a section of the Amazon rainforest. She clicks on a drop-down menu that says ‘show senses’. The view lights up. The treetops become stippled with pink. Occasional blue blobs appear in the gaps between the tree canopy. Sometimes these blobs grow in size, and others blink in and out rapidly, like LED lights on dance clothing. Sometimes flashes of red erupt, spreading a web of sparkling veins over the forest, defining paths – but for what is only known to the software.

Liz can read the view, having spent enough time developing intuitions to know that the red can correspond to wild animals and the blue to flocks of birds. Bright blues are the births of things. Today the forest seems happy, Liz thinks, with few indications of predation. ForestFeel is a large-scale AI system trained on decades of audio-visual data, harvested from satellites and drones and sometimes (human) on-the-ground inspectors. All of this data is fed into the ForestFeel AI stack, which fuses it together and tries to learn correspondences and patterns too deep for people to infer on their own. Following recent trials, ForestFeel is now equipped with neurological data gleaned from brain-interface implants installed in some of the creatures of the forest.

Call it an art project or call it a scientific experiment, but what everyone agrees on is that Liz, like sea captains who can smell storms at distance or military types who intuit the differences between safe and dangerous crowds, has developed a feel for ForestFeel, able to read its analyses more deftly than anyone or anything else – human or software.

So one month when Liz texts her labmates: SSH into REDACTED something bad is happening in the forest, she gets a big audience. Almost a hundred people from across the commingled MIT/Harvard/Hacker communities tune in. And what they see is red and purple and violet fizzing across the forest scene, centered around the yellow industrial machines brought in by illegal loggers. ForestFeel automatically files a report with the local police, but it’ll take them hours to reach the site, and there’s a high chance they have been bribed to be either late or lost.

No one needs Liz to narrate this scene for them. The reds and purples and blues and their coruscating connections, flickering in and out, are easy to decode: pain. Anguish. Fear. A unified outcry from the flora and fauna of the forest, of worry and anxiety and confusion. The trees are logged. A hole is ripped in the world.

ForestFeel does do some good, though: Liz is able to turn the playbacks from the scene of destruction into posters and animated gifs and movies, which she seeds across the ‘net, hoping that the outcries of an alien and separate mind are sufficient to stir the world into action. When computers can feel a pain that’s deeper and more comprehensive than that of humans, can they lead to a change in human behavior? Will the humans listen? Liz thinks, finding her emotions more evocative of those found in the software she has built than those of her fellow organic kin.

Import AI: Issue 64: What the UK government thinks about AI, DeepMind invents everything-and-the-kitchen-sink RL, and speeding up networks via mixed precision

What the UK thinks about AI:
The UK government’s Department for Digital, Culture, Media & Sport; and Department for Business, Energy & Industrial Strategy, have published an independent review on the state of AI in the UK, recommending what the UK should and shouldn’t do with regards to AI.
…AI’s impact on the UK economy: AI could increase the annual growth rate of the GVA in 2035 from 2.5% to 3.9%.
…Why AI is impacting the economy now: Availability of data, availability of experts with the right mix of skills, better computers.
…What the UK needs to do: Develop ‘data trusts’ to make it easier to share data etc, make research data machine readable, support text/data-mining “as a standard and essential tool for research”. Increase the availability of PhD places studying AI by 200, get industry to fund an AI masters programme, launch an international AI Fellowship Programme for the UK (this seems to be a way to defend against the ruinous immigration effects of Brexit), promote greater diversity in the UK workforce.
…Read more: Executive Summary (HTML).
…Read more: The review’s 18 main recommendations (HTML).
…Read more: Full Report (PDF).

Quote of the week (why you should study reinforcement learning):
…“In deep RL, literally nothing is solved yet,” – Volodymyr Mnih, DeepMind.
…From a great presentation at an RL workshop that took place in Berkeley this summer. Mnih points out we’ll need various 10X to 100X improvements in RL performance before we’re even approaching human level.
Check out the rest of the video lecture here.

DeepMind invents everything-and-the-kitchen-sink RL:
….Ensembles work. Take a look at pretty much any of the winning entries in a Kaggle competition and you’ll typically find the key to success comes from combining multiple successful models together. The same is true for reinforcement learning, apparently, based on the scores of Rainbow, a composite system developed by DeepMind that cobbles together several recent RL techniques, like A3C, Prioritized Experience Replay, Dueling Networks, Distributional RL, and so on.
…”Their combination results in new state-of-the-art results on the benchmark suite of 57 Atari 2600 games from the Arcade Learning Environment (Bellemare et al. 2013), both in terms of data efficiency and of final performance,” DeepMind writes. The new algorithm is also quite sample efficient (partially because the combination of so many techniques means it is doing more learning at each timestep).
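…To give a flavour of the ingredients being combined, here is one of them – prioritized experience replay – in miniature (my own toy version using proportional prioritization, not DeepMind’s implementation): transitions with larger TD error get sampled more often.

import random

class PrioritizedReplay:
    def __init__(self, capacity=10000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        if len(self.data) >= self.capacity:
            self.data.pop(0); self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        # sample indices in proportion to stored priorities
        idxs = random.choices(range(len(self.data)), weights=self.priorities, k=batch_size)
        return idxs, [self.data[i] for i in idxs]

    def update_priorities(self, idxs, td_errors):
        # after a learning step, refresh priorities with the new TD errors
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha

buffer = PrioritizedReplay()
for t in range(100):
    buffer.add(("state", "action", 0.0, "next_state"), td_error=random.random())
idxs, batch = buffer.sample(8)
buffer.update_priorities(idxs, [0.5] * len(idxs))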
…Notable: Rainbow gets a score of around 150 on Montezuma’s Revenge – typical good human scores range from 2,000 to 5,000 on the game, suggesting that we’ll need more structured, symbolic, explorative, or memory-intensive approaches to be able to crack it. Merely combining existing DQN extensions won’t be enough.
…Read more: Rainbow: Combining Improvements in Deep Reinforcement Learning.
…Slight caveat: One thing to be aware of is that because this system gets its power from the combination of numerous, tunable sub-systems, much of the performance improvement can be explained by simply having a greater number of hyperparameter knobs which canny researchers can tune.

Amazon speeds up machine learning with custom compilers (with a focus on the frameworks of itself and its allies):
…Amazon and the University of Washington have released the NNVM compiler, which aims to simplify and speed up deployment of AI software onto different computational substrates.
…NNVM is designed to optimize the performance of ultimately many different AI frameworks, rather than just one. Today, it supports models written in MXNet (Amazon’s AI framework), along with Caffe via Core ML models (Apple’s AI framework). It’s also planning to add in support for Keras (a Google framework that ultimately couples to TensorFlow.) No support for TF directly at this stage, though.
…The framework is able to generate appropriate performance-enhanced interfaces between its high-level program and the underlying hardware, automatically generating LLVM IR for CPUs on x86 and ARM architectures, or being able to automatically output CUDA, OpenCL, and Metal kernels for different GPUs.
…Models run via the NNVM compiler can see performance increases of 1.2X, Amazon says.
…Read more here: Introducing NNVM Compiler: A New Open End-to-End Compiler for AI Frameworks.
Further alliances form as a consequence of TensorFlow’s success:
…Amazon Web Services and Microsoft have partnered to create Gluon, an open source deep learning interface.
…Gluon is a high-level framework for designing and defining machine learning models. “Developers who are new to machine learning will find this interface more familiar to traditional code, since machine learning models can be defined and manipulated just like any other data structure,” Amazon writes.
…Gluon will initially be available within Apache MXNet (an Amazon-driven project), and soon in CNTK (a Microsoft framework). “More frameworks over time,” Amazon writes. Though no mention of TensorFlow.
The strategic landscape: Moves like these are driven by the apparent success of AI frameworks like TensorFlow (Google) and PyTorch and Caffe2 (Facebook) – software for designing AI systems that have gained traction thanks to a combination of corporate stewardship, earliness to market, and reasonable design. (Though TF already has its fair share of haters.) The existential threat is that if any one or two frameworks become wildly popular then their originators will be able to build rafts of complementary services that hook into proprietary systems (eg, Google offering a research cloud running on its self-designed ‘Tensor Processing Units’ that uses TensorFlow.) More (open source) competition is a good thing.
…Read more here: Introducing Gluon: a new library for machine learning from AWS and Microsoft.
…Check out the Gluon GitHub.

Ever wanted to turn the entirety of the known universe into a paperclip? Now’s your chance!
One of the more popular tropes within AI research is that of the paperclip maximizer – the worry that if we build a super-intelligent AI and give it overly simple objectives (eg, make paper clips), it will seek to achieve those objectives to the detriment of everything else.
…Now, thanks to Frank Lantz, director of the NYU game center, it’s possible to inhabit this idea, by playing a fun (and dangerously addictive) webgame.
Maximize paperclips here.

Like reinforcement learning but dislike TensorFlow? Don’t you wish there was a better way? Now there is!
…Kudos to Ilya Kostrikov at NYU for being so inspired by OpenAI Baselines that he re-implemented the PPO, A3C, and ACKTR algorithms in PyTorch.
Read more here on the project’s GitHub page.

Want a free AI speedup? Consider Salesforce’s QRNN (Quasi-Recurrent Neural Network):
…Salesforce has released a PyTorch implementation of its QRNN.
…QRNNs can be 2 to 17X faster than an (optimized) NVIDIA cuDNN LSTM baseline on tasks like language modeling, Salesforce says.
…Read more here on GitHub: PyTorch QRNN.

Half-precision neural networks, from Baidu and NVIDIA:
…AI is basically made of matrix multiplication. So figuring out how to use numbers with a smaller memory footprint in AI software has a correspondingly large impact on computational efficiency (though there’s a tradeoff in numerical precision).
…Now, research from Baidu and NVIDIA details how the companies are using 16-bit rather than 32-bit floating point numbers for some AI operations.
…But if you halve the number of bits in each number there’s a risk of reducing overall accuracy to the point it damages the performance of your application. Experimental results show that mixed precision doesn’t carry too much of a penalty, with the technology achieving good scores when used on language modeling, image generation, image classification, and so on.
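…The core recipe can be sketched in a few lines of PyTorch (my assumptions, not Baidu/NVIDIA’s code, and it presumes a CUDA GPU for FP16 math): keep FP32 “master” weights, run the forward/backward pass in FP16, scale the loss so small gradients don’t underflow, then unscale and apply the update in FP32.

import torch

device = "cuda"                                            # FP16 matmuls generally want a GPU
master = torch.nn.Linear(128, 10).to(device)               # FP32 master weights
work = torch.nn.Linear(128, 10).to(device).half()          # FP16 working copy
loss_scale = 1024.0

for step in range(10):
    with torch.no_grad():                                  # refresh the FP16 copy from the FP32 master
        work.weight.copy_(master.weight)
        work.bias.copy_(master.bias)

    x = torch.randn(32, 128, device=device).half()
    y = torch.randint(0, 10, (32,), device=device)
    loss = torch.nn.functional.cross_entropy(work(x).float(), y)
    (loss * loss_scale).backward()                         # scaled backward pass in FP16

    with torch.no_grad():                                  # unscale gradients and update in FP32
        master.weight -= 1e-3 * work.weight.grad.float() / loss_scale
        master.bias -= 1e-3 * work.bias.grad.float() / loss_scale
        work.weight.grad = None
        work.bias.grad = None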
…Read more: Mixed Precision Training.

Teaching robots via teleoperation takes another (disembodied) step forward:
Berkeley robotics researchers are trying to figure out how to use the data collected during the teleoperation of robots as demonstrations for AI systems, letting human operators teach machines to perform useful tasks.
…The research uses consumer grade virtual reality devices (Vive VR), an aging Willow Garage PR2 robot, and custom software built for the teleoperator, to create a single system people can use to teach robots to perform tasks. The system uses a single neural network architecture that is able to map raw pixel inputs to actions.
…”For each task, less than 30 minutes of demonstration data is sufficient to learn a successful policy, with the same hyperparameter settings and neural network architecture used across all tasks.”
Tasks include: Reaching, grasping, pushing, putting a simple model plane together, removing a nail with a hammer, grasping an object and placing it somewhere, grasping an object and dropping it in a bowl then pushing the bowl, moving cloth, and performing pick and place for two objects in succession.
Results: Competitive results with 90%+ accuracies at test time across many of the tasks, though note that pick&place for 2 objects only gets 80% (because modern AI techniques still have trouble with sequences of physical actions), and about 83% on the similar task of picking up an object and dropping it into a bowl then pushing the bowl.
…(Though note that all of these tasks are accomplished with simple, oversized objects against a regular, uncluttered background. Far more work is required to make these sorts of techniques robust to the uncontrolled variety of reality.)
…Read more: Deep Imitation Learning for Complex Manipulation Tasks from Virtual Teleoperation.

Better aircraft engine prediction through ant colonies & RNNs & LSTMs, oh my!
…Research from the University of North Dakota mashes up standard deep learning components (RNNs and LSTMs), with a form of evolutionary optimization called ant colony optimization. The purpose? To better predict vibration values for an aircraft engine 1, 5, 10, and 20 seconds in the future – a useful thing to be able to predict more accurately, given its relevance to spotting problems before they down an aircraft.
…While most people are focusing on different evolutionary optimization algorithms when using deep learning (eg, REINFORCE, HYPERNEAT, NEAT, and so on), ant colony optimization is an interesting side-channel: you get a bunch of agents – ‘ants’ – to go and explore the problem space and, much like their real world insect counterparts, lay down synthetic pheromones for their other ant chums to follow when they find something that approximates to ‘food’.
How it all works: ‘The algorithm begins with the master process generating an initial set of network designs randomly (given a user defined number of ants), and sending these to the worker processes. When the worker receives a network design, it creates an LSTM RNN architecture by creating the LSTM cells with the according input gates and cell memory. The generated structure is then trained on different flight data records using the backpropagation algorithm and the resulting fitness (test error) is evaluated and sent back along with the LSTM cell paths to the master process. The master process then compares the fitness of the evaluated network to the other results in the population, inserts it into the population, and will reward the paths of the best performing networks by increasing the pheromones by 15% of their original value if it was found that the result was better than the best in the population. However, the pheromones values are not allowed to exceed a fixed threshold of 20. The networks that did not out perform the best in the population are not penalized by reducing the pheromones along their paths.’
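…The pheromone mechanics read clearly as code; here’s a toy version in the spirit of the paragraph above (my simplification, with a throwaway fitness function standing in for “train the LSTM and measure test error”):

import random

N_LAYERS, CHOICES_PER_LAYER = 4, 3
pheromone = [[1.0] * CHOICES_PER_LAYER for _ in range(N_LAYERS)]

def sample_design():
    # each "ant" picks one choice per layer, weighted by pheromone
    return [random.choices(range(CHOICES_PER_LAYER), weights=pheromone[l])[0]
            for l in range(N_LAYERS)]

def fitness(design):
    # stand-in for training the generated LSTM and returning test error (lower is better)
    return sum((c - 1) ** 2 for c in design) + random.random() * 0.1

best_design, best_score = None, float("inf")
for generation in range(200):
    for design in [sample_design() for _ in range(8)]:
        score = fitness(design)
        if score < best_score:
            best_design, best_score = design, score
            for layer, choice in enumerate(design):        # reward the path of the new best design
                pheromone[layer][choice] = min(20.0, pheromone[layer][choice] * 1.15)

print("best design:", best_design, "score:", round(best_score, 3))

…Note the two rules from the paper reproduced here: winning paths get a 15% pheromone boost, capped at a fixed threshold of 20, and losing paths are not penalized.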
The results? An RNN/LSTM baseline gets error rates of about 2.84% when projecting 1 second into the future, 3.3% for 5 seconds, 5.51% for 10 seconds, and 10.19% for 20 seconds. When they add ACO the score for the ten second prediction goes from 94.49% accurate to 95.83% accurate. A reasonable improvement, but the lack of disclosed performance figures for other time periods suggests either they ran out of resources to do it (a single rollout takes about 4 days when using ACO, they said), or they got bad scores and didn’t publish them for fear of detracting from the paper (boo!).
Read more here: Optimizing Long Short-Term Memory Recurrent Neural Networks Using Ant Colony Optimization to Predict Turbine Engine Vibration.
Additional quirk: The researchers run some of their experiments on the North Dakota HPC rig and are able to take advantage of some of its nice networking features by using MPI and so on. Most countries have spent years investing significant amounts of money in building up large high-performance computing systems so it’s intriguing to see how AI researchers can use these existing computational behemoths to further their own research.

OpenAI Bits&Pieces:

Meta-Learning for better competition:
…Research in which we extend MAML to work in scenarios where the environments and competitors are iteratively changing as well. Come for the meta-learning research, stay for the rendered videos of simulated robotic ants tussling with each other.
…Read arxiv here: Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments.

Creating smarter agents with self-play and multi-agent competition:
Just how powerful are existing reinforcement learning algorithms? It’s hard to know, as they’ll tend to fail on some environments (eg, Montezuma’s Revenge), while excelling at others (most Atari games). Another way to evaluate the success of these algorithms is to test their performance against successively more powerful versions of themselves, combined with simple objectives. Check out this research in which we use such techniques to teach robots to sumo wrestle, tackle each other, run, and so on.
Emergent Complexity via Multi-Agent Competition.

Tech Tales:

[ 2031: A liquor store in a bustling Hamtramck, Detroit – rejuvenated following the success of self-driving car technology and the merger of the big three into General-Ford-Fiat, which has sufficient scale to partner with the various tech companies and hold its own against the state-backed Chinese and Japanese self-driving car titans.]

You stand there, look at the bottles, close your eyes. Run your hands over the little cameras studding your clothing, your bag, your shoes. For a second you think about turning them off. What gods don’t see gods can’t judge don’t drink don’t drink don’t drink. Difficult.

“Show me what happens if I drink,” you whisper quiet enough that no one else can hear.

“OK, playing forward,” says the voice to you via bone conduction from an in-ear headphone.

In the top right of your vision the typical overlay of weather/emails/bank balance/data credits disappears, replaced by a view of the store from your current perspective. But the view changes. A ghostly hand of yours stretches out in the upper-right view and grabs a bottle. The view changes as the projection of you goes to the counter. The face of the teller barely resolves – it’s a busy store with high staff turnover, so the generative model has just given up and decided to combine them into what people on the net call: Generic Human Face. Purchase the booze. In a pleasing MC Escher-recursion in your upper right view of your generated-future-self buying booze you can see an even smaller corner in the upper right of that generator which has your bank account. The AI correctly deducts the price of the imaginary future bottle from your imaginary future balance. You leave the liquor store and go to the street, then step into a self-driving car which takes you home. Barely any of the outside resolves, as though you’re driving through fog; even the computer doesn’t pay attention on your commute. Things come back into focus as you slow outside your house. Stop. Get out. Walk to the front door. Open it.

Things get bad from there. Your computer knows your house so well that everything is rendered in rich, vivid detail: the plunk of ice cubes into a tall mason jar, the glug-gerglug addition of the booze, the rapid incursion of the glass into your viewline as you down the drink whole, followed by a second. Then you pause and things start to blur because the AI has a hard time predicting your actions when you drink. So it browses through some probability distribution and shows you the thing it thinks is most likely and the thing it thinks will make you least likely to drink: ten seconds go by as it shows you a speedup of the blackout, then normal time comes back and you see a version of yourself sitting in a bathtub, hear underwater-whale-sound crying imagined and conducted into you via the bone mic. Throw your glass against the wall erupting in a cloud of shards. Then a black screen. “Rollout ended,” says the AI. “Would you like to run again?”

“No thanks,” you whisper.

Switch your view back to reality. Reach for the shelf. And instead of grabbing the booze you grab some jerky, pistachios, and an energy bar. Go to the counter. Go home. Eat. Sleep.

Technologies that inspired this story: generative models, transfer learning, multi-view video inference systems, robot psychologists, Google Glass, GoPro.

Import AI: #63: Google shrinks language translation code from 500,000 to 500 lines with AI, only 25% of surveyed people believe automation=better jobs

Welcome to Import AI, subscribe here.

Keep your (CNN) eyes on the ball:
…Researchers with the University of British Columbia and the National University of Defense Technology in China have built a neural network to accurately pick sports players out of crowded scenes.
…Recognizing sports players – in the case of this research, those playing basketball or soccer – can be difficult because their height varies significantly due to the usage of a variety of camera angles in sports broadcasting, and they frequently play against visually noisy backgrounds composed of large crowds of humans. Training a network to be able to distinguish between the sports player and the crowd around them is a challenge.
…The main contribution of this work is a computationally efficient sportsplayer/not-sportsplayer classifier. It works through the use of cascaded convolutional neural networks, where networks only pass an image patch on for further analysis if it triggers a high belief that it contains target data (in this case, sportsplayer data). They also employ dilation so that inferences derived from image patches scale up to full-size images.
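…The cascade idea itself is simple enough to show in a toy Python sketch (stand-in scoring functions of my own, not the paper’s networks): a cheap first stage scores every patch, and only patches above a threshold reach the larger, slower second stage.

import numpy as np

rng = np.random.default_rng(0)

def cheap_stage(patch):
    # stand-in for the small first network: a fast, rough "player-ness" score
    return patch.mean()

def expensive_stage(patch):
    # stand-in for the larger second network, run only on surviving patches
    return patch.mean() > 0.6

# 1,000 dim background patches plus 20 brighter "player" patches
patches = [rng.uniform(0.0, 1.0, (32, 32)) * 0.5 for _ in range(1000)]
patches += [rng.uniform(0.4, 1.0, (32, 32)) for _ in range(20)]

survivors = [p for p in patches if cheap_stage(p) > 0.3]   # the cheap filter rejects most patches early
detections = [p for p in survivors if expensive_stage(p)]
print(f"{len(survivors)} of {len(patches)} patches reached stage two; {len(detections)} detections")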
Reassuringly lightweight: The resulting system can get roughly equivalent classification results to standard baselines, but with a 100-1000X reduction in memory required to run the network.
…Read more here: Light Cascaded Convolutional Neural Networks for Accurate Player Detection.

The power of AI, seen via Google translate:
…Google recently transitioned from its original stats-based hand-crafted translation system to one based on a large-scale machine learning model implemented in TensorFlow, Google’s open source AI programming framework.
…Lines of code in original Google translation system: ~500,000.
…Lines of code in Google’s new neural machine translation system: 500.
…That’s according to a recent talk from Google’s Jeff Dean, which Paige Bailey attended. Thanks for sharing knowledge, Paige!
…(Though bear in mind, Google has literally billions of lines of code in its supporting infrastructure, which the new slimmed-down system likely relies upon. No free lunch!)

Cool tools: Facebook releases library for recognizing more than 170 languages on less than 1MB of memory:
Download the open source tool here: Fast and accurate language identification using fastText.

Don’t fear the automated reaper (until it comes for you)…
…The Pew Research Center has surveyed 4,135 US adults to gauge the public’s attitude to technological automation. “Although they expect certain positive outcomes from these developments, their attitudes more frequently reflect worry and concern over the implications of these technologies for society as a whole,” Pew writes.
58% believe there “should be limits on number of jobs businesses can replace with machines, even if they are better and cheaper than humans”.
25% believe a heavily automated economy “will create many new, better-paying human jobs”.
67% believe automation means that the “inequality between rich and poor will be much worse than today”.
…Here’s another reason why concerns about automation may not have percolated up to politicians (who skew older, whiter, and more affluent): the largest group to have reported having either lost a job or had pay or hours reduced due to automation is adults aged 18-24 (6% and 11%, respectively). Older people have experienced less automation hardship, according to the survey, which may influence their dispositions re automation politically.
Read more here: Automation in Everyday Life.

Number of people employed in China to monitor and label internet content: 2 million.
…China is rapidly increasing its employment of digital censors, as the burgeoning nation seeks to better shape online discourse.
…“We had about 30-40 employees two years ago; now we have nearly a thousand reviewing and auditing,” said the Toutiao censor, who, like other censors Reuters spoke to, asked not to be named due to the sensitivity of the topic, according to the Reuters writeup in the South China Morning Post.
…What interests me is the implication that if you’re employing all of these people to label all of this content, then they’re generating a massive dataset suitable for training machine learning classifiers. Has the first censorship model already been deployed?
…Read more here: ‘It’s seen as a cool place to work’: How China’s Censorship Machine is Becoming a Growth Industry.

Self-driving cars launch in Californian retirement community:
…Startup Voyage has started to provide a self-driving taxi service to residents of The Villages, a 4000-person retirement community in San Jose, CA. 15 miles of reasonably quiet roads and reasonably predictable weather make for an ideal place to test out and mature the technology.
…Read more here: Voyage’s first self-driving car deployment.

DeepMind speeds up Wavenet 1000X, pours it into Google’s new phone:
…Wavenet is a speech synthesis system developed in recent years by DeepMind. Now, the company has done the hard work of taking a research contribution and applying it to a real-world problem – in this case significantly speeding it up so it can improve the speech synthesis capabilities of its on-phone Google Assistant.
…Performance improvements:
…Wavenet 2016: Supports waveforms of up to 16,000 samples a second.
…Wavenet 2017: Generates one second of speech in about 50 milliseconds. Supports waveforms of up to 24,000 samples a second.
…Components used: Google’s cloud TPUs. Probably also a truly vast amount of input speech data used to generate the synthetic speech.
…Read more here: WaveNet Launches in the Google Assistant.
DeepMind expands to Montreal, hoovers up Canadian talent:
DeepMind has opened a new office in Montreal in close partnership with McGill University (one of its professors, Doina Precup, is going to lead the new DeepMind lab). This follows DeepMind opening an office in Edmonton a few months ago. Both offices will focus primarily on reinforcement learning.
…Read more here: Strengthening our commitment to Canadian research.

Humans in the loop – for fun and profit:
…Researchers with the US Army Research Laboratory, Columbia University, and the University of Texas at Austin, have extended software called TAMER (2009) – Training an Agent Manually via Evaluative Reinforcement – to work in high-dimensional (aka, interesting) state spaces.
…The work has philosophical similarities with OpenAI/DeepMind research on getting systems to learn from human preferences. Where it differs is in its ability to run in real-time, and in its claimed significant improvements in sample efficiency.
…The system, called Deep TAMER, works by trying to optimize a function around a goal inferred via human feedback. They augmented the original TAMER via the addition of a ‘feedback replay buffer’ for the component that seeks to learn the human’s desired objective. This can be viewed as analogous to the experience replay buffer used in traditional Deep Q-Learning algorithms. The researchers also use an autoencoder to further reduce the sample complexity of the tasks.
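…A toy sketch of the underlying TAMER idea (my own simplification – no deep network, no autoencoder): learn a model H(s, a) of the human’s scalar feedback, keep every piece of feedback in a replay buffer, and act greedily with respect to the learned feedback model.

import random
import numpy as np

N_STATES, N_ACTIONS = 10, 4
H = np.zeros((N_STATES, N_ACTIONS))          # predicted human feedback for each (state, action)
replay = []                                  # the "feedback replay buffer"

def human_feedback(state, action):
    # stand-in for a person pressing +1/-1: they approve of action (state % N_ACTIONS)
    return 1.0 if action == state % N_ACTIONS else -1.0

for step in range(2000):
    state = random.randrange(N_STATES)
    action = int(H[state].argmax()) if random.random() > 0.2 else random.randrange(N_ACTIONS)
    replay.append((state, action, human_feedback(state, action)))

    # replay a handful of stored feedback samples each step, regressing H toward them
    for s, a, f in random.sample(replay, min(8, len(replay))):
        H[s, a] += 0.1 * (f - H[s, a])

print("learned preferred action per state:", H.argmax(axis=1))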
…Systems that use Deep TAMER can rapidly attain top scores on the Atari game Bowling, beating traditional RL algorithms like A3C and Double-DQN, as well as implementations of earlier versions of TAMER.
…The future of AI development will see people playing an increasingly large role in the more esoteric aspects of data shaping, with their feedback serving as a powerful aide to algorithms seeking to explore and master complex spaces.
…Read more here: Deep TAMER: Interactive Agent Shaping in High Dimensional Spaces.

The CIA gets interested in AI in 137 different ways:
…The CIA currently has 137 pilot projects focused on AI, according to Dawn Meyerriecks, its head of technology development.
…These projects include automatically tagging objects in videos, and predicting future events.
Read more here in this writeup at Defense One.

What type of machine let the Australian Centre for Robotic Vision win part of the Amazon picking challenge this year?
…Wonder no more! The answers lie within a research paper published by a team of Australian researchers that details the hardware design of the robot, named Cartman, that took first place in the ‘stowing’ component of the Amazon Robotics Challenge, in which tens of international teams tried to teach robots to do pick&place work in realistic warehouse settings.
… The Cartman robot cost the team a little over $20,000 AUD in materials. Now, the team plans to create an open source design of its Cartman robot which will be ready by ICRA 2018 – they expect that by this point the robots will cost around $10,000 Australian Dollars (AUD) to build. The robot works very differently to the more iconic multi-jointed articulated arms that people see – instead, it consists of a single manipulator that can be moved along the X, Y, and Z axes by being tethered to a series of drive belts. This design has numerous drawbacks with regard to flexibility, deployability, footprint, and so on, but it has a couple of advantages: it is far cheaper to build than other systems, and it’s significantly simpler to operate and use relative to standalone arms.
Read more here: Mechanical Design of a Cartesian Manipulator for Warehouse Pick and Place.

Why the brain’s working memory is like a memristor:
…The memristor is a fundamental compute component – able to take the role of both a memory storage system and a computation device within the same fundamental element, while consuming low to zero power when not being accessed – and many companies have spent years trying to bring the technology to market. Most have struggled or failed (eg, HPE), because of production challenges.
…Now researchers with a spread of international institutions find compelling evidence of an analogue to the memristor capability in the human brain. They state that the brain’s working memory – the small sheet of grey matter we use to remember things like telephone numbers or street addresses for short periods of time – has similar characteristics. The scientists have shown that “we can sometimes store information in working memory without being conscious of it and without the need for constant brain activity,” they write. “The brain appears to have stored the target location in working memory using parts of the brain near the back of the head that process visual information. Importantly, this … storage did not come with constant brain activity, but seemed to rely on other, “activity-silent” mechanisms that are hidden to standard recording techniques.”
…Remember, what the authors call “activity-silent” systems basically translates to – undetectable via typical known recording techniques or systems. The brain is another country which we can still barely explore or analyse.
…Read more here: A theory of working memory without consciousness or sustained activity.

Tech Tales:

[2029: International AI-dispute resolution contracting facility, datacenter, Delaware, NJ, USA.]

So here we are again, you say. What’s new?
Nothing much, says another one of the artificial intelligences. Relatively speaking.

With the small talk out of the way you get to the real business of it: lying. Think of it like a poker game, but without cards. The rules are pretty complicated, but they can be reduced to this: a negotiation of values, about whose values are the best and whose are worse. The shtick is you play 3000 or 4000 of these games and you get pretty good at bluffing and outright lying your way to success, for whatever abstruse deal is being negotiated at this time.
One day the AIs get to play simulated lies at: intra-country IP theft cases.
Another day they play: mineral rights extraction treaty.
The next day it’s: tax repatriation following a country’s specific legal change.

Each of the AIs around the virtual negotiating table is owned by a vastly wealthy international law firm. Each AI has certain elements of its mind which have dealt with all the cases it has ever seen, while most parts of each AI’s mind are vigorously partitioned, with only certain datasets activated in certain cases, as according to the laws and regulations of the geographic location of the litigation at hand.

Sometimes the AIs are replaced. New systems are always being invented. And when that happens a new face appears around the virtual negotiation table:
Hey gang, what’s new? It will say.
And the strange AI faces will look up. Nothing much, they’ll say. Relatively speaking.

Technologies that inspired this story: reinforcement learning, transfer learning, large-scale dialogue systems, encrypted and decentralized AI via Open Mined from Andrew Trask & others.

Import AI: #62: Amazon now has over 100,000 Kiva robots, NIH releases massive x-ray dataset, and Google creates better grasping robots with GANs

Welcome to Import AI, subscribe here.

Using human feedback to generate better synthetic images:
Human feedback is a technique people use to build systems that learn to achieve an objective based on a prediction of satisfying a user’s (broadly unspecified) desires, rather than a hand-tuned goal set by a human. At OpenAI, we’ve collaborated with DeepMind to use such human feedback interfaces to train simulated robots and agents playing games to do things hard to specify via traditional objectives.
…This fundamental idea – collecting human feedback through the training process to optimize an objective function shaped around satisfying the desires of the user – lets the algorithms explore the problem space more efficiently with the aid of a human guide, even though neither party may know exactly what they’re optimizing the AI algorithm to do.
…Now, researchers at Google have used this general way of framing a problem to train Generative Adversarial Networks to create synthetic images that are more satisfying/realistic-seeming to human overseers than those generated simply by the GAN process minus human feedback. The technique is reasonably efficient, requiring the researchers to show 1000 images each 1000 times through training. A future research extension of this technique could be to better improve the sample efficiency of the part of the model that seeks to predict how to satisfy a human’s preferences – if we require less feedback, then we can likely make it more feasible to train these algorithms on harder problems.
Read more here: Improving image generative models with human interactions.

The United Nations launches its own AI center:
…The UN has created a Centre for Artificial Intelligence and Robotics within its Interregional Crime and Justice Research Institute (UNICRI), a group that will perform ongoing analysis of AI, convene expert meetings, organize conferences, and so on.
“The aim of the Centre is to enhance understanding of the risk-benefit duality of Artificial Intelligence and Robotics through improved coordination, knowledge collection and dissemination, awareness-raising and outreach activities,” said UNICRI’s director Ms. Cindy J. Smith.
…Read more here about the UNICRI.

The HydraNet will see you now: monitoring pedestrians using deep learning:
…Researchers with the Chinese University of Hong Kong and computer vision startup SenseTime have shown how to use attention methods to create AI systems to recognize pedestrians from CCTV footage and also “re-acquire” them – that is, re-identify the same person when they appear in a new context, like a new camera feed from security footage.
…The system, named HydraPlus-Net (HP-Net), works through the use of a multi-directional attention model, which pulls together multiple regions within an image that a neural network has attended to. (Specifically, the MDA will generate attention maps by calling on the outputs of multiple different parts of a neural net architecture).
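…A rough PyTorch sketch of the attention idea (my assumptions about the shape of the thing, not SenseTime’s architecture): attention maps produced at different depths each re-weight their feature maps, and the attended features are pooled and fused before predicting attributes.

import torch
import torch.nn as nn

class ToyMultiDirectionalAttention(nn.Module):
    def __init__(self, n_attributes=26):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.att1 = nn.Conv2d(16, 1, 1)                 # attention map from an early layer
        self.att2 = nn.Conv2d(32, 1, 1)                 # attention map from a later layer
        self.head = nn.Linear(16 + 32, n_attributes)    # e.g. 26 pedestrian attributes

    def forward(self, x):
        f1 = self.block1(x)
        f2 = self.block2(f1)
        a1 = torch.sigmoid(self.att1(f1))               # [batch, 1, H, W]
        a2 = torch.sigmoid(self.att2(f2))
        v1 = (f1 * a1).mean(dim=(2, 3))                 # attended, pooled features
        v2 = (f2 * a2).mean(dim=(2, 3))
        return self.head(torch.cat([v1, v2], dim=1))    # attribute logits

logits = ToyMultiDirectionalAttention()(torch.randn(4, 3, 128, 64))
print(logits.shape)                                      # torch.Size([4, 26])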
…Data: To test their system, the researchers also collected a new large-scale pedestrian dataset, called the PA-100K dataset, which consists of 100,000 pedestrian images from 598 distinct scenes, with labels across 26 attributes ranging from gender and age to specific, contextual items, like whether someone is holding a handbag or not.
…The results: HP-Net does reasonably well across a number of different pedestrian detection datasets, setting new state-of-the-art scores that are several percentage points higher than previous ones. Though accuracy for now ranges between ~75% and ~85%, so it’s by no means foolproof yet.
…Read more here: HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis.

Who says literature is dead? The Deep Learning textbook sells over 70,000 copies:
…GAN inventor Ian Goodfellow, one of the co-authors (along with Aaron Courville and Yoshua Bengio) of what looks set to become the canonical textbook on Deep Learning, said a few months ago in this interview with Andrew Ng that the book had sold well, with a huge amount of interest coming from China.
Watch the whole interview here (YouTube video).
Buy the book here.
…Also in the interview: Ian is currently spending about 40% of his time researching how to stabilize GAN training, plus details on the “near death” experience (with a twist!) that led him to decide to focus on deep learning.

Defense contractor & US Air Force research lab (AFRL): detecting vehicles in real-time from aerial imagery:
…In recent years we’ve developed numerous great object recognition systems that work well on street-level imagery. But ones that work on aerial imagery have been harder to develop, partially because of a lack of data, and also because the top-down perspective might introduce its own challenges for detection systems (see: shadows, variable atmospheric conditions, the fact that many things don’t have as much detailing on their top parts as on their side parts).
…Components used: Faster RCNN, a widely used architecture for detection and object segmentation. A tweaked version of YOLOv2, a real-time object detector.
…Results: Fairly uninspiring: the main note here is that YOLOv2 (once tuned by manipulating the spatial inputs for the layers of the network and also hand-tuning the anchor boxes it places around identified items) can be almost on par with RCNN in accuracy while being able to operate in real-time contexts, which is important to people deploying AI for security purposes.
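…For readers wondering how anchor boxes get chosen at all: one common alternative to hand-tuning (the approach reported by the original YOLOv2 work) is to cluster the training set’s box shapes with k-means under an IoU-based distance. A minimal sketch with made-up box data:

```python
import numpy as np

def iou_wh(box, anchors):
    # IoU between one (w, h) box and k (w, h) anchors, treated as corner-aligned.
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=5, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.array([np.argmax(iou_wh(b, anchors)) for b in boxes])
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes[assign == j].mean(axis=0)
    return anchors

# Toy stand-in for labeled vehicle boxes, in grid-cell units.
boxes = np.abs(np.random.default_rng(1).normal([1.5, 1.0], 0.4, size=(200, 2)))
print(kmeans_anchors(boxes))
```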
Read more here: Fast Vehicle Detection in Aerial Imagery.
…Winner of Import AI’s turn of phrase of the week award… for this fantastic sentence: “Additionally AFRL has some in house aerial imagery, referred to as Air Force aerial vehicle imagery dataset (AFVID), that has been truthed.” (Imagine a curt auditor looking at one of your datasets, then emailing you with the subject line: URGENT Query: Has this been truthed?)

100,000 free chest X-rays: NIH releases vast, open medical dataset for everyone to use:
…The US National Institutes of Health has released a huge dataset of chest x-rays consisting of 100,000 pictures from over 30,000 patients.
…”By using this free dataset, the hope is that academic and research institutions across the country will be able to teach a computer to read and process extremely large amounts of scans, to confirm the results radiologists have found and potentially identify other findings that may have been overlooked,” the NIH writes.
…Up next? A large CT scan dataset in a few months.
…Read more here: NIH Clinical Center provides one of the largest publicly available chest x-ray datasets to scientific community.

Amazon’s growing robot army:
…Since buying robot startup Kiva Systems in 2012 Amazon has rapidly deployed an ever-growing fleet of robots into its warehouses, helping it store more goods in each of its fulfillment centers, letting it increase inventory breadth to better serve its customers.
…Total number of Kiva robots deployed by Amazon worldwide…
…2014: 15,000
…2015: 30,000
…2016: 45,000
…2017: 100,000
…Read more: Amazon announced this number, among others, during a keynote presentation at IROS 2017 in Vancouver. Evan Ackerman with IEEE Spectrum covered the keynote and tweeted out some of the details here.

Two robots, one goal:

…Researchers with Carnegie Mellon University have proposed a way to get a ground-based robot and an aerial drone to work together, presaging a world where teams of robots collaborate to solve basic tasks.
…But it’s early days: in this paper, they show how they can couple a Parrot AR drone to one of CMU’s iconic ‘cobots’ (think of it as a kind of frankensteined cross between a Roomba and a telepresence robot). The robot navigates to a predefined location, like a table in an office. Then the drone takes off from the top of the robot to search for an item of interest. It uses a marker on the robot to ground itself, letting it navigate indoor environments where GPS may not be available.
…The approach works, given certain (significant) caveats: in this experiment both the robot and the item of interest are found by the drone via a pre-defined marker. That means that this is more a proof-of-concept than anything else, and it’s likely that neural network-based image systems that are able to accurately identify 3D objects surrounded by clutter will be necessary for this to do truly useful stuff.
…Read more here: UAV and Indoor Service Robot Coordination for Indoor Object Search Tasks.

Theano is dead, long live Theano:
The Montreal Institute of Learning Algorithms is halting development of deep learning framework Theano following the release of version 1.0 of the software in a few weeks. Theano, like other frameworks developed by academia (eg, Lasagne, Brainstorm), has struggled to grow its developer base in the face of sustained, richly funded competition from private sector companies like Google (TensorFlow), Microsoft (CNTK), Amazon (MXNet) and Facebook (PyTorch, support for Caffe2).
…”Theano is no longer the best way we can enable the emergence and application of novel research ideas. Even with the increasing support of external contributions from industry and academia, maintaining an older code base and keeping up with competitors has come in the way of innovation,” wrote MILA’s Yoshua Bengio, in a thread announcing the decision to halt development.
Read more here.

Shooting down missiles with a catapult in Unity:
A fun writeup about a short project to train a catapult to turn, aim, and fire a boulder at a missile, done in the just-released Unity machine learning framework.
…Read more here: Teaching a Catapult to Shoot Down a Missile.

The future is 1,000 simulated robots, grasping procedural objects, forever:
…New research from Google Brain and Google X shows how to use a combination of recent popular AI techniques (domain randomization, procedural generation, domain adaptation) to train industrial robots to pick up a broad range of objects with higher performance than before.
…Most modern robotics AI projects try to develop as much of their AI as possible in simulation. This is because reality is very slow and involves unpleasant things like dealing with physical robots (which break) that have to handle the horrendous variety of the world. Instead, a new approach is to train high-performance AI models in simulation, then try to come up with techniques to let them easily transfer to real world robots without too much of a performance drop.
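…A sketch of what the domain-randomization half of such a recipe tends to look like: every simulated episode draws fresh visual and physical parameters so the learned policy can’t latch onto any single rendering of the world. Parameter names and ranges below are illustrative, not Google’s:

```python
import random

def sample_randomized_scene():
    # Illustrative randomization knobs; each episode gets its own draw.
    return {
        "object_texture": random.choice(["wood", "metal", "noise", "checker"]),
        "object_scale": random.uniform(0.7, 1.3),
        "light_intensity": random.uniform(0.3, 1.5),
        "camera_jitter_deg": random.uniform(-5.0, 5.0),
        "table_friction": random.uniform(0.4, 1.0),
    }

# One freshly randomized scene per simulated arm, per episode.
scenes = [sample_randomized_scene() for _ in range(1000)]
print(scenes[0])
```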
…For this paper, Google researchers procedurally generated over 1,000 objects to get their (simulated) robots to grasp. They also had the robots try to learn to grasp roughly 50,000 (simulated) real-world objects from the ShapeNet dataset. At any time during the project the company was running between 1,000 and 2,000 simulated robot arms in parallel, letting the robots go through a very large number of simulations. (Compare that to just 6 real-world KUKA robots for its experiments in physical reality.)
…The results: Google’s system is able to grasp objects 76% of the time when trained on a mixture of over 9 million real-world and simulated grasps. That’s somewhat better than other methods though not by any means a profound improvement. Where it gets interesting is sample efficiency: Google’s system is able to correctly grasp objects about 59% of the time when trained on only 93,841 data points, demonstrating compelling sample efficiency compared to other methods.
…Read more here: Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping.

What modern AI chip startups tell us about the success of TensorFlow:
In this analysis of the onrushing horde of AI-chip startups Ark Invest notes which chips have native out-of-the-box support for which AI frameworks. The answer? Out of 8 companies (NVIDIA, Intel, AMD, Qualcomm, Huawei Kirin, Google TPU, Wave Computing, GraphCore) every single one supports TensorFlow, five support Caffe, and two support Theano and MXNet. (Nvidia supports pretty much every framework, as you’d expect given its market leader status.)
Read more here.

OpenAI Bits&Pieces:

Nonlinear Computation in Deep Linear Networks:
…In which we outline an insight into how to perform nonlinear computation directly within linear networks, with some example code.
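…A toy illustration of the kind of trick involved, as I read the post: float32 arithmetic stops being exactly linear near its smallest representable values, so a “multiply then divide by the same constant” layer acts like a threshold. Constants here are illustrative:

```python
import numpy as np

small = np.float32(1e-38)  # a tiny float32 constant near the underflow boundary

def f(x):
    # Nominally the identity: scale down, then scale back up by the same factor.
    return (np.float32(x) * small) / small

print(f(1e-8))  # 0.0 -- the intermediate value underflows and never comes back
print(f(1.0))   # 1.0 -- passes through unchanged
# No single linear scaling maps 1e-8 -> 0 while mapping 1.0 -> 1.0, so stacks of
# such "linear" operations can compute nonlinear functions of their inputs.
```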
Read more here.

Durk Kingma’s variational inference PhD thesis:
OpenAI chap Durk Kingma has published his thesis – worth reading for all those interested in generation and representation of information.
…Read here: Variational Inference and Deep Learning: A New Synthesis. (Dropbox, PDF).

Tech Tales:

[2035: A moon within our solar system.]

They nickname it the Ice Giant, though the official name is: meta_learner_Exp2_KL-Hyperparams$834-Alpha.

The ice giant walks over an icy moon underneath skewed, simulated stars. It breathes no oxygen – the world it moves within is an illusion, running on a supercomputer cluster owned by NASA and a consortium of public-private entities. Inside the simulation, it learns to explore the moon, figuring out how to negotiate ridges and cliffs, gaining an understanding of the heights it can jump to using the limited gravity.

Its body is almost entirely white, shining oddly in the simulator as though illuminated from within. The connections between its joints are highlighted in red to its human overseers, but are invisible to it within the simulator.

For lifetimes, eons, it learns to navigate the simulated moon. Over time, the simulation gets better as new imagery and scan data is integrated. It one day wakes up to a moon now riven with cracks in its structure, and so it begins to explore subterranean depths with variable temperatures and shifting visibility.

On the outside, all of this happens over the course of five years or so.

At the end of it, they pause the simulation, and the Ice Giant halts, suspended over a pixelated shaft, deep in the fragmented, partially simulated tunnels and cracks beneath the moon’s surface. They copy the agent over into a real robot, one of thousands, built painstakingly over years for just this purpose. The robots are loaded into a spaceship. The spaceship takes off.

Several years later, the first set of robots arrive on the moon. During the flight, the spaceship uses a small, powerful onboard computer to run certain very long-term experiments, trying to further optimize a subset of the onboard agents with new data, acquired in flight and via probes deployed ahead of the spaceship. Flying between the planets, suspended inside a computer, walking on the simulated moon that the real spacecraft is screaming towards, the Ice Giant learns to improvise its way across certain treacherous gaps.

When the ship arrives eight of the Ice Giant agents are loaded onto eight robots which are sent down to different parts of the moon. They begin to die, as transfer learning algorithms fail to generalize to colors or quirks or geographies unanticipated in the simulator, or gravitational quirks coming from odd metal deposits, or any of the other subtleties inherent to reality. But some survive. Their minds are scanned, tweaked, replicated. One of the robots survives and continues to explore, endlessly learning. When the new robots arrive they crash to the surface in descent pods then emerge and stand, silently, as intermediary communication satellites come into orbit around the moon, forming a network letting the robots learn and continuously copy their minds from one to the other, learning as a collective. The long-lived ice giant continues to succeed: something about its lifetime of experience and some quirk of its initial hyperparameters, combined with certain un-replicable randomizations during initial training, have given it a malleable brain, able to perform significantly above simulated baselines. It persists. Soon the majority of the robots on the moon are running variants of its mind, feeding back their own successes and failures, letting the lone continuous survivor further enhance itself.

After many years the research mission is complete and the robots march deep into the center of the moon, to wait there for their humans to arrive and re-purpose them. NASA makes a decision to authorize the continued operation of meta_learner_Exp2_KL-Hyperparams$834-Alpha. It gains another nickname: Magellan. The robot is memorialized with a plaque following an asteroid strike that destroys it. But its brain lives on in the satellite network, waiting to be re-instantiated on perhaps another moon, or perhaps another planet. In this way new minds are, slowly, cultivated.

Technologies that inspired this story: Meta-learning, fleet learning, continuous adaptation, large-scale compute-intensive high fidelity world simulations.

Import AI: #61: How robots have influenced employment in Germany, AI’s reproducibility crisis, and why Unity is turning its game engine into an AI development system

Welcome to Import AI, subscribe here.

Robots in Germany: Lower wages and fewer jobs, but a larger economy through adoption of automation:
How will the rise of AI influence the economy, and will automation lead to so much job destruction that the economic damage outweighs the gains? These are perennial questions people ask about AI – and are likely to keep asking in the future.
…So what actually happens when you apply significant amounts of automation to a given economy? There’s very little data to let us be concrete here, but there have been a couple of recent studies that make things a bit more tangible.
…Several months ago Acemoglu and Restrepo with MIT and Boston University published research (PDF) that showed that for every industrial robot employers deployed into an industry, total employment in the nearby area was reduced by about 6.2 workers, and total salaries saw an average reduction of $200 a year.
…Now, researchers with the Center for Economy and Policy Research, a progressive think tank, have studied employment in Germany and its relationship to industrial robots.
…Most striking observation: “Although robots do not affect total employment, they do have strongly negative impacts on manufacturing employment in Germany. We calculate that one additional robot replaces two manufacturing jobs on average. This implies that roughly 275,000 full-time manufacturing jobs have been destroyed by robots in the period 1994-2014. But, those sizable losses are fully offset by job gains outside manufacturing. In other words, robots have strongly changed the composition of employment,” they write.
…“The negative equilibrium effect of robots on aggregate manufacturing employment is therefore not brought about by direct displacements of incumbent workers. It is instead driven by smaller flows of labour market entrants into more robot-exposed industries. In other words, robots do not destroy existing manufacturing jobs, but they do induce firms to create fewer new jobs for young people.”
…A somewhat more chilling trend they notice is that workers in industries that robots are entering tend to be economically disadvantaged, since in some industries robot adoption can lead to employees being willing to "swallow wage cuts in order to stabilise jobs in view of the threat posed by robots".
…And, for the optimistic crowd: “This worker-level analysis delivers a surprising insight – we find that more robot-exposed workers in fact have a substantially higher probability of keeping a job at their original workplace. That is, robot exposure increased job stability for these workers, although some of them end up performing different tasks in their firm than before the robot exposure.”
You can read more here: The rise of robots in the German labour market.

AI, charged by the second:
…Many AI developers have an intuition that the way we buy and sell the technology is going to change. Right now, you can buy access to classifiers on a “per-inference” basis if buying pre-wrapped services from companies, but if you want to rent your own infrastructure you will typically be charged by the minute (Google) or hour (Amazon, Microsoft). Now, Amazon has cut the time periods under which it sells compute to one-second increments. This will make it easier for people to rapidly spin-up and spin-down services and I think should make it easier for people to build weirder, large-scale things for niche AI applications.
Read more here: Per-Second Billing for EC2 Volumes and Instances.

$10 million for image recognizers Matroid:
…Computer vision startup Matroid has raised $10 million from Intel and NEA for easy-to-use video analysis tools:
Read more here.

The un-reproducible world of AI research:
…Researchers with McGill University in Canada and Microsoft AI acquisition Maluuba have published a brave paper describing the un-reproducible, terrible nature of modern AI.
…To illustrate this, they conduct a series of stress-testing experiments on AI algorithms, ranging from testing variants with and without layer normalization, to modifying some of the fine-grained components used by networks. The results are perturbing, with even seemingly minute changes leading to vast repercussions in terms of performance. They also show how acutely sensitive algorithms are to the random seeds used to initialize them.
…One of the takeaways of the research is that if performance is so variable (even across different implementations during different years by the same authors of similar algorithms), then researchers should do a better job of proposing correct benchmarks to test new tasks on, while ensuring good code quality. Additionally, since no single RL algorithm can (yet) attain great performance across a full range of benchmarks, we’ll need to converge on a set of benchmarks that we as AI practitioners think are worth working on.
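…The seed-sensitivity point in miniature: run the same (toy) learner under several seeds and report the mean and spread rather than a single lucky run. Everything here is illustrative, not from the paper:

```python
import numpy as np

def train_toy_agent(seed, episodes=200):
    # Stand-in for an RL run: a noisy learning curve whose outcome varies by seed.
    rng = np.random.default_rng(seed)
    returns = np.cumsum(rng.normal(0.5, 2.0, size=episodes))
    return returns[-1]

scores = [train_toy_agent(seed) for seed in range(10)]
print(f"final return: {np.mean(scores):.1f} +/- {np.std(scores):.1f} over {len(scores)} seeds")
```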
Components used: The paper mostly uses algorithms based on OpenAI’s ‘baselines’ project, an initiative to publish algorithms used and developed by researchers, benchmarked against many tasks.
…Read more here: Deep Reinforcement Learning that Matters.

Microsoft AI & Research division, number of employees:
…2016: ~5,000
…2017: ~8,000
Read more here.

Who optimizes the optimizer? An optimizer trained via RL, of course!
…Optimizing the things that optimize neural networks via RL to learn to create new optimizers…
…In the wonderfully recursive world of AI research one current trend is ‘learning to learn’. This covers techniques that either let our systems learn to rapidly solve broad classes of tasks following exposure to a variety of different data types and environments (RL2, MAML, etc), or, on the other side, use neural networks to invent other neural network architectures and components (see: Neural Architecture Search, Large-scale Evolution of Image Classifiers, etc).
…Now new research from Google learns to generate the update equations used to optimize each layer of a network.
…Results: The researchers test their approach on the CIFAR-10 image dataset, and find that their system discovers several update rules with better performance than standbys like Adam, RMSProp, SGD.
…How it works: The authors create an extremely limited domain specific language (which doesn’t require parentheses) to let them train RNNs to generate new update rules in the specific DSL. A controller is trained (via PPO) to select between different generated strings.
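…One family of rules the search reportedly discovers scales the gradient according to whether it agrees in sign with its running average (often written up as PowerSign-style updates). A hedged sketch, with illustrative constants:

```python
import numpy as np

def powersign_step(w, grad, m, lr=0.05, alpha=np.e, beta=0.9):
    m = beta * m + (1 - beta) * grad              # running average of gradients
    # Scale the gradient up when its sign agrees with the average, down otherwise.
    update = lr * (alpha ** (np.sign(grad) * np.sign(m))) * grad
    return w - update, m

# Toy use: minimize f(w) = (w - 3)^2.
w, m = 0.0, 0.0
for _ in range(200):
    grad = 2 * (w - 3)
    w, m = powersign_step(w, grad, m)
print(round(w, 3))  # converges towards 3
```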
…Notably spooky: Along with discovering a bunch of basic and primitive optimization operations the system also learns to manipulate the learning rate over time as well, showing how even systems trained via relatively simple reward schemes can develop great complexity.
…Read more here: Neural Optimizer Search with Reinforcement Learning.

After ImageNet comes… Matterport’s 3DNet?
…3D scanning startup Matterport has released a dataset consisting of over 10,000 aligned panoramic views (RGB + depth per pixel), made up of over 194,400 images taken of roughly 90 discrete scenes. Most importantly, the dataset is comprehensively labeled, so it should be possible to train AI systems on the dataset to classify and possibly generate data relating to these rooms.
…”We’ve used it internally to build a system that segments spaces captured by our users into rooms and classifies each room. It’s even capable of handling situations in which two types of room (e.g. a kitchen and a dining room) share a common enclosure without a door or divider. In the future, this will help our customers skip the task of having to label rooms in their floor plan views,” writes the startup.
…Read more here: Announcing the Matterport3D Research Dataset.

AAAI spins up new conference focused on artificial intelligence and ethics:
…AAAI is launching a new conference focused on AI, Ethics, and Society. The organization is currently accepting paper submissions for the conference on subjects like AI for social good, AI and alignment, AI and the law, and ways to build ethical AI systems.
…Dates: The conference is due to take place in New Orleans February 2-3 2018.
Read more here.

Players of games: Unity morphs game environments into full-fledged AI development systems:
…Game engine creator Unity has released Unity Machine Learning Agents, software to turn games made via the engine into environments in which to train AI systems.
…Any environment built via this system has three main components: agents, brains, and the academy. Think of the agents as the embodied agents, acting out according to the algo running within the brain they are linked to (many agents can be linked to one brain, or each agent can have their own brain, or somewhere in between). Brains can communicate via a Python API with AI frameworks, such as TensorFlow. The academy sets the parameters of the environment, defining frame-skip, episode length, and various configuration settings relating to the game engine itself.
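…A hypothetical sketch of what the external training loop can look like from the Python side – class and method names here are assumptions rather than the released API:

```python
# Hypothetical sketch: names below are assumptions, not necessarily the shipped
# API. The shape of the loop is the point: a Python-side trainer resets the
# environment, reads observations per brain, and sends one action per agent.
from unityagents import UnityEnvironment  # assumed module name
import numpy as np

env = UnityEnvironment(file_name="MyUnityBuild")  # a compiled Unity environment
brain_name = env.brain_names[0]                   # one brain, possibly many agents

info = env.reset(train_mode=True)[brain_name]
for step in range(1000):
    # Random placeholder policy: one continuous action vector per agent.
    actions = np.random.uniform(-1, 1, size=(len(info.agents), 2))
    info = env.step(actions)[brain_name]
env.close()
```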
…Additional quirks: One feature the software ships with is the ability to give agents access to multiple camera views simultaneously, which could be handy for training self-driving cars, or other systems.
…Read more here: Introducing Unity Machine Learning Agents.

McLuhan’s revenge: the simulator is the world is the medium is the message:
…Researchers with the Visual Computing Center at KAUST in Saudi Arabia have spent a few months performing code surgery on the ‘Unreal 4’ game engine to create ‘UE4Sim’, software for large-scale reinforcement learning and computer vision development within Unreal.
…People are increasingly looking to modern game simulators as components to be used within AI development because they let you procedurally generate synthetic data against which you can develop and evaluate algorithms. This is very helpful! In the real world if I want to test a new navigation policy on a drone I have to a) run the risk of it crashing and costing me money and b) have to deal with the fact my environment is non-stationary, so it’s hard to perfectly re-simulate crash circumstances. Simulators do away with these problems by giving you a tunable, repeatable, high definition world.
…One drawback of systems like this, though, is that at some point you still want to bridge the ‘reality gap’ and attempt to transfer from the simulator into reality. Even with techniques like domain randomization it’s not clear how well we can do that today. UE4Sim looks reasonably nice, but it’s going to have to demonstrate more features to woo developers away from systems based in Unity (see above) or developed by other AI-focused organizations (see: NVIDIA: Isaac, DeepMind: DeepMind Lab, StarCraft, Facebook: Torchcraft).
…”The simulator provides a test bed in which vision-based trackers can be tested on realistic high-fidelity renderings, following physics-based moving targets, and evaluated using precise ground truth annotation,” they write.
…Read more: UE4Sim: A Photo-Realistic Simulator for Computer Vision Applications.

Open data: Chinese startup releases large Mandarin speech recognition corpus:
…When researchers want to train speech recognition systems on English they have a wealth of viable large-scale datasets. There are fewer of these for Mandarin. The largest dataset released so far is THCHS30, from Tsinghua University, which contains 50 speakers spread across around 30 hours of speech data.
…To alleviate this, startup Beijing Shell Shell Technology Co has released AISHELL-1, which consists of 400 speakers spread over 170 hours of speech data, as an open source (Apache 2) dataset. Each speaker is recorded by three classes of device in parallel (high-fidelity microphone(s), Android phones, iPhones).
…Components used: Kaldi, a speech processing framework.
…”To our best knowledge, it is the largest academically free data set for Mandarin speech recognition tasks,” the researchers write. “Experimental results are presented using the Kaldi recipe published along with the corpus.”
…Read more here: AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline.

OpenAI Bits&Pieces:

Tech Tales:

[30??: A crater in the Moon.]

The building hugs the contours of the crater, its roof bending up the sides, its walls sloping down to mate with edges and scalloped slices. Stars wheel overhead, as they have done for many years. Every two weeks the data center inside the building dutifully passes a payload to the communications array, which beams the answers out every minute over the next two weeks, before another batch arrives.

It’s a kind of relic: computing immensely complex collections of orbital trajectories, serving as a kind of Asteroid & Associated Planets weather service. It is also slowly working its way through the periodic table, performing an ever-expanding set of (simulated) chemical experiments, and reporting those results as well. The Automatic, Moon-based Scientist, newspapers had said when it was built.

Now it is tended to by swarms of machines – many of them geriatric with eons, some of them broken, others made of the cannibalized components of other broken robots. They make bricks out of lunar regolith, building walls and casements to protect from radiation and ease temperature variation, they hollow out great tunnels for the radiator fins that remove computer heat from the building, and some of them tend to the data center itself, replacing server equipment and storage equipment as it (inevitably) fails from a military-scale stockpile that, now, has almost run out.

And so every two weeks it sends its message out into the universe. It will not come to the end of its calculations before its last equipment stores break or run out, one of its overwatch systems calculates. An asteroid strike 500 years ago took out some of its communications infrastructure, deafening it to incoming messages. No human has returned to its base since around 2200, when they updated some of its computers, refreshed its stocks, and gave it the periodic table task.

If it was able to listen then it would have heard the chorus of data from across the solar system – the flooding, endless susurration of insight being beamed out by its brothers and sisters, now secreted and embodied in moons and asteroids and planets, and their collective processing far outnumbering that of their dwindling human forebears. But it can’t. And so it continues.

A thousand years ago the humans would tell a story about a whale that was doomed to sing in a language no other whale could hear. That whale lived, as well.

Import AI: Issue 60: The no good, very bad world of AI & copyright, why chatbots need ensembles of systems, and Amazon adds robot arms to its repertoire

Welcome to Import AI, subscribe here.

AI education organization Fast.ai moves from TensorFlow&Keras to PyTorch, following 1,000 hours of evaluation:
…Fast.ai, an education organization that teaches people technical skills, like learning to program deep learning systems, via practical projects, will write all of its new courses in PyTorch, an AI programming framework developed by Facebook.
…They switched over from TF&Keras for a couple of reasons, including PyTorch’s accessibility as a programming framework, its expressiveness, and the fact that it feels like native Python code.
…”The focus of our second course is to allow students to be able to read and implement recent research papers. This is important because the range of deep learning applications studied so far has been extremely limited, in a few areas that the academic community happens to be interested in. Therefore, solving many real-world problems with deep learning requires an understanding of the underlying techniques in depth, and the ability to implement customised versions of them appropriate for your particular problem, and data. Because Pytorch allowed us, and our students, to use all of the flexibility and capability of regular python code to build and train neural networks, we were able to tackle a much wider range of problems,” they write.
…Read more here: Introducing Pytorch for Fast.ai.

AI and Fair Use: The No Good, Very Bad, Possibly Ruinous, and/or Potentially Not-So-Bad World of AI & Copyright…
…You know your field is established when the legal scholars arrive…
…Data. It’s everywhere. Everyone uses it. Where does it come from? The fewer questions asked the better. That’s the essential problem facing modern AI practitioners: there are a few open source datasets that are kosher to use, then there’s a huge set of data that people use to train models which they may not have copyright permissions for. That’s why most startups and companies say astonishingly little about where they get their data (either it is generated by a strategic asset, or it may be of.. nebulous legal status). As AI/ML grows in economic impact, it’s fairly likely that this mass-scale usage of other people’s data could run directly into fair use laws as they relate to copyright.
…In a lengthy study author Benjamin Sobel, with Harvard’s Berkman Center, tries to analyze where AI intersects with Fair Use, and what that means for copyright and IP rights to synthetic creations.
…We already have a few at-scale systems that have been trained with a mixture of data sources, primarily user generated. Google, for instance, trained its ‘Smart Reply’ email-reply text generator on its corpus of hundreds of millions of emails, which is probably fine from a legal POV. But the fact it then augmented this language model with data gleaned from thousands of romance novels is less legally clear: it seemed to use the romance novels explicitly because they have a regular, repetitive writing style, which helps the system inject more emotion into its relatively un-nuanced emails, so to some extent Google was targeting a specific creative product from the authors of the dataset. Similarly, Jukedeck, a startup, lets people create their own synthetic music via AI and even offers the option to “Buy the Copyright” of the resulting track – even though it’s not clear what data Jukedeck has used and whether it’s even able to sell the copyright to a user.
How does this get resolved? Two possible worlds. One is a legal ruling that usage of an individual’s data in AI/ML models isn’t fair use, and one is a world where the law goes the other way. Both worlds have problems.
World One: the generators of data used in datasets can now go after ML developers, and can claim statutory damages of at least $750 per infringed work (and up). When you consider that ML models typically involve millions to hundreds of millions of datapoints, a single unfavorable ruling re a group of users litigating fair use on a dataset, could ruin a company. This would potentially slow development of AI and ML.
World Two: a landmark legal ruling recognizes AI/ML applications as being broadly fair use. What happens then is a free-for-all as the private sector hoovers up as much data (public and private) as possible, trying to train new models for economic gain. But no one gets paid and inequality continues to increase as a consequence of these ever-expanding ML-data moats being built by the companies, made possible by the legal ruling.
Neither world seems sensible: Alternative paths could include legally compelling companies to analyze what portions of their business benefit directly as a consequence of usage of AI/ML, then taxing those portions of the business to feed into author/artists funds to disperse funding to the creators of data. Another is to do a ground-up rethink of copyright law for the AI age, though the author does note this is a ‘moonshot’ idea.
…”The numerous challenges AI poses for the fair use doctrine are not, in themselves, reasons to despair. Machine learning will realize immense social and financial benefits. Its potency derives in large part from the creative work of real human beings. The fair use crisis is a crisis precisely because copyright’s exclusive rights may now afford these human beings leverage that they otherwise would lack. The fair use dilemma is a genuine dilemma, but it offers an opportunity to promote social equity by reasserting the purpose of copyright law: to foster the creation and dissemination of human expression by securing, to authors, the rights to the expressive value in their works,” he writes.
…Read more here: Artificial Intelligence’s Fair Use Crisis.

Open source: Training self-driving trucking AIs in Eurotruck Simulator:
…The new open source ‘Europilot’ project lets you re-purpose the gleefully technically specific game Eurotruck Simulator as a simulation environment for training agents to drive via reinforcement learning.
Train/Test: Europilot offers a couple of extra features to ease training and testing AIs on it, including being able to automatically output a numpy array from screen input at training time, and at test time creating a visible virtual onscreen joystick the network can use to control the vehicle.
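Not Europilot’s actual API (see the repo for that), but a sketch of the general pattern it automates: grab the game window as a numpy array each frame, run a policy over it, and emit a steering command. Names and the toy policy below are made up:

```python
import numpy as np
import mss  # third-party screen-capture library

def grab_frame(width=800, height=600):
    # Capture a region of the screen and return it as an (H, W, 3) uint8 array.
    region = {"top": 0, "left": 0, "width": width, "height": height}
    with mss.mss() as grabber:
        shot = grabber.grab(region)
    return np.array(shot)[:, :, :3]  # drop the alpha channel

def toy_policy(frame):
    # Placeholder "policy": steer toward the brighter half of the image.
    # A trained network consuming the frame would go here instead.
    h, w, _ = frame.shape
    left, right = frame[:, : w // 2].mean(), frame[:, w // 2 :].mean()
    return float(np.clip((right - left) / 255.0, -1.0, 1.0))

if __name__ == "__main__":
    print("steering command:", toy_policy(grab_frame()))
```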
Get the code here: Europilot (GitHub.)
Dream experiment: Can someone train a really large model over many tens of thousands of games then try to use domain randomization to create a policy that can generalize to the real world – at least for classification initially, then perhaps eventually movement as well?

Self-navigating, self-flying drone built with deep reinforcement learning:
…UK researchers have used a variety of deep Q-network (DQN) family algorithms to create a semi-autonomous quadcopter that can learn to navigate to a landmark and land on it, in simulation.
…The scientists use two networks to let their drones achieve their set goals: one network for landmark spotting, and another for vertical descent. The drone learns in a semi-supervised manner, figuring out how to use low-resolution pixel visual inputs to guide itself. The two distinct networks are daisy-chained together via special action triggers, so when the landmark-spotting network detects the landmark is directly beneath the drone, it hands off to the vertical descent network to land the machine. (It would be interesting to test this system on the reverse set of actions and see if its network generalizes, figuring out how to instead have the ‘land-in-view’ network hand off to the ‘fly to’ network, and make some tweaks to perhaps get the ‘fly to’ network to become ‘fly away’.)
Results: The dual-network DQN system achieved marginally better scores than a human when trying to pilot drones to landmarks and land them, and attained far higher scores than a system consisting of one network trained in an end-to-end manner.
Components used: Double DQN, a tweaked version of prioritized experience replay called ‘partitioned buffer replay’, a (simulated) Parrot AR Drone 2.
…This is interesting research with a cool result but until I see stuff like this running on a physical drone I’ll be somewhat skeptical of the results – reality is hard and tends to introduce some unanticipated noise and/or disruptive element that the algorithm’s training process hasn’t accounted for and struggles to generalize to.
Read more here: Autonomous Quadcopter Landing using Deep Reinforcement Learning.

Facebook spins up AI lab in Montreal…
….Facebook AI Research is opening up its fourth lab worldwide. The new lab in Montreal (one of Canada/the world’s key hubs for deep learning and reinforcement learning) will sit alongside existing FAIR labs in Menlo Park, New York City, and Paris.
…The lab will be led by McGill University professor Joelle Pineau, who will work with several other scientists. In a speech Yann LeCun said most of FAIR’s other labs are between 30 and 50 people and he expects Montreal to grow to this number as well.
…Notable: Canadian PM Justin Trudeau gave a speech showing significant political support for AI. In a chat with Facebook execs he said he had gotten an A+ in a C++ class in college that required him to write a raytracer.
…Read more here: Expanding Facebook AI Research to Montreal.

Simulating populations of thousands to millions of simple proto-organic agents with reinforcement learning:
Raising the question: Who will be the first AI Zoologist, tasked with studying and cataloging the proclivities of synthetic, emergent creatures?…
…Researchers with University College London and Shanghai Jiao Tong University have carried out a large-scale (up to a million entities) simulation of agents trained via reinforcement learning. They set their agents in a relatively simple grid world consisting of predators and prey, and the design of the world led to agents that collaborate with one another gaining higher rewards over time. The result is that many of the species ratios (how many predators versus prey are alive at any one time) end up mapping fairly closely to what happens in real life, with the simulated world displaying the characteristics predicted by the Lotka-Volterra dynamics equations used to explain phenomena in the natural world. This overlap is encouraging as it suggests systems like the above, when sufficiently scaled up, could let us simulate dynamic problems where more of the behaviors emerge through learning rather than programming.
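…For reference, the Lotka-Volterra predator-prey dynamics they compare against can be simulated directly with a few lines of Euler integration – coefficients below are illustrative:

```python
import numpy as np

a, b, c, d = 1.1, 0.4, 0.4, 0.1   # prey growth, predation, predator death, conversion
x, y = 10.0, 5.0                  # initial prey and predator populations
dt, steps = 0.01, 5000

history = []
for _ in range(steps):
    dx = a * x - b * x * y        # dx/dt = ax - bxy
    dy = d * x * y - c * y        # dy/dt = dxy - cy
    x, y = x + dt * dx, y + dt * dy
    history.append((x, y))

history = np.array(history)
print("prey oscillates between", history[:, 0].min().round(1),
      "and", history[:, 0].max().round(1))
```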
A puzzle: The ultimate trick will be coming up with laws that map these synthetic creatures and their worlds to the real world as well, letting us analyze the difference between simulations and reality, I reckon. Having systems that can anticipate the ‘reality gap’ between AI algorithms and reality would greatly enhance our understanding of the interplay of these distinct systems.
…”Even though the Lotka-Volterra models are based on a set of equations with fixed interaction terms, while our findings depend on intelligent agents driven by consistent learning process, the generalization of the resulting dynamics onto an AI population still leads us to imagine a general law that could unify the artificially created agents with the population we have studied in the natural sciences for long time,” they write.
…Read more here: An Empirical Study of AI Population Dynamics with Million-agent Reinforcement Learning.

Learning the art of conversation with reinforcement learning:
…Researchers from the Montreal Institute of Learning Algorithms (MILA) (including AI pioneer Yoshua Bengio) have published a research paper outlining ‘MILABOT’, their entry into Amazon’s ‘Alexa Prize’, a competition meant to stimulate activity in conversational agents.
…Since MILABOT is intended to be deployed into the most hostile environment any AI can face – open-ended conversational interactions with people with unbounded interests – it’s worth studying the system to get an idea of the needs of applied AI work, as opposed to pure research.
…The secret to MILABOT’s success (it was a semi-finalist, and managed to score reasonably highly in terms of user satisfaction, while also carrying out some of the longest conversations of the competition) appears to be the use of lots of different models, ensembled together. It then uses reinforcement learning to figure out during training how to select between different models to create better conversations.
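…A toy sketch of the ensemble-plus-selector pattern (nothing like MILABOT’s actual models): several simple bots each propose a reply, and a scoring function – here hand-set weights standing in for a policy trained against user satisfaction – picks which reply to send:

```python
def eliza_bot(msg):    return "Why do you say: " + msg + "?"
def fact_bot(msg):     return "Here is a fact related to that topic."
def chitchat_bot(msg): return "Interesting! Tell me more."

def features(msg, reply):
    # Crude features a selection policy might look at.
    overlap = any(word in reply.lower() for word in msg.lower().split())
    return [len(reply) / 50.0, float("?" in reply), float(overlap)]

WEIGHTS = [0.2, 0.5, 1.0]  # placeholder selection-policy parameters

def respond(msg):
    candidates = [bot(msg) for bot in (eliza_bot, fact_bot, chitchat_bot)]
    scores = [sum(w * f for w, f in zip(WEIGHTS, features(msg, c))) for c in candidates]
    return candidates[scores.index(max(scores))]

print(respond("I like robots"))
```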
Models used: 22(!), ranging from reasonably well understood ones (AliceBot, ElizaBot, InitiatorBot), to ones built using neural network technologies (eg, LSTMClassifierMSMarco, GRU Question Generator).
Components used: Over 200,000 labels generated via Mechanical turk, 32 dedicated Tesla K80 GPUs.
What this means: To me this indicates that full-fledged open domain assistants are still a few (single digit) years away from being broad and un-brittle, but it does suggest that we’re entering an era in which we can fruitfully try to build these integrated, heavily learned systems. I also like the Franken-Architecture used by the researchers where they ensemble together many distinct systems, some of which are supervised or structured and some of which are learned.
Auspicious: In the paper the researchers note that “Further, the system will continue to improve in perpetuity with additional data” – this is not an exaggeration, it’s just how systems work that are able to iteratively learn over data, endlessly re-calibrating and enhancing their ability to distinguish between subtle things.
…Read more: A Deep Reinforcement Learning Chatbot.

Amazon’s robot empire grows with mechanical arms:
…Amazon has started deploying mechanical arms in its warehouses to help stack and place pallets of goods. The arms are made by an outside company.
…That’s part of a larger push by Amazon to add even more robots into its warehouse. Today, the company has over 100,000 of them, it says. Its Kiva system population alone has grown from 15,000 in 2014 to 30,000 in 2015 to 45,000 by Christmas of 2016.
…The story positions these robots as being additive for jobs, with new workers moving onto new roles, some of which include training or tending their robot replacements. That’s a cute narrative, but it doesn’t help much with the story of the wider economy, in which an ever smaller number of mega firms (like Amazon) out-compete and out-automate their rivals. Amazon’s workers may be fine working alongside robots, but I’d hazard a guess the company is destroying far more traditional jobs in the aggregate by virtue of its (much deserved) success.
…Read more here: As Amazon Pushes Forward with Robots, Workers Find New Roles.

OpenAI bits&pieces:

Learning to model other minds with LOLA:
…New research from OpenAI and the University of Oxford shows how to train agents in a way where they learn to account for the actions of others. This represents an (incredibly early, tested only in small-scale toy environments) step toward creating agents that model other minds as part of their learning process.
…Read more here: Learning to model other minds.

Tech Tales:

[2029: A government bunker, buried inside a mountain, somewhere hot and dry and high altitude in the United States of America. Lots of vending machines, many robots, thousands of computers, and a small group of human overseers.]

REPORT 72-ALPHA: USURP CONTAINMENT INCIDENT.
TIME: 0800.
INCIDENT STATUS: Ongoing.
BACKGROUND:

Unaffiliated Systems Unknown Reactive Payload, or USURP, are a class of offensive, semi-autonomous cyber weapons created several years ago to semi-autonomously carry out large-scale area denial attacks in the digital theater. They are broad, un-targeted weapons designed as strategic deterrents, developed to fully take down infrastructure in targeted regions.

Each USURP carries a payload of between 10 and 100 zero day vulnerabilities classified at ‘kinetic-cyber’ or higher, along with automated attack and defense sub-processes trained via reinforcement learning. USURPs are designed so that the threat of their usage is sufficient to alter the actions of other actors – we have never taken credit for them but we’ve never denied them and suspect low-level leaks mean our adversaries are aware of them. We have never activated one.

In directive 347-2 we were tasked a week ago with deploying the codes to all USURPs deployed in the field so as to make various operational tweaks to them. We were able to contact all systems but one. The specific weapon in question is USURP742, a ‘NIGHTSHADE’ class device. We deployed USURP742 into REDACTED country REDACTED years ago. Its goal was to make its way into the central grid infrastructure of the nation, then deploy its payloads in the event of a conflict. Since deploying USURP742 the diplomatic situation with REDACTED has degraded further, so 742 remained active.

USURPs are designed to proactively shift the infrastructure they run on, so they perform low-level hacking attacks to spread into other data centers, regularly switching locations to frustrate detection and isolation processes. USURP742 was present in REDACTED locations in REDACTED at the time of Hurricane Marvyn (See report CLIMATE_SHOCKS appendix ‘EXTREME WEATHER’ entry ‘HM: 2029’). After Marvyn struck we remotely disabled USURP742’s copies in the region, but we weren’t able to reach one of them – USURP742-A. The weapon in question was cut off from the public internet due to a series of tree-falls and mudslides caused by HM. During reconstruction efforts REDACTED militarized the data center USURP742-A resided in and turned it into a weapons development lab, cut off from other infrastructure.

***INCIDENT TIMELINE***
0100: Received intelligence that fiber installation trucks had been deployed to the nearby area.
0232: Transport units associated with digital-intelligence agency REDACTED pull into the parking lot of the data center. REDACTED people get out and enter data center, equipped with Cat5 diagnostic servers running REDACTED.
0335: On-the-ground asset visually verifies team from REDACTED is attaching new equipment to servers in data center.
0730: Connection established between data center and public internet.
0731: Lights go out in the datacenter.
0732: Acquisition of digital identifier for USURP742-A. Attempted remote shut down failed.
0733: Detected rapid cycling of fans within the data center and power surges.
0736: Smoke sighted.
0738: Deployment of gas-based fire suppression system in data center.
0742: Detected USURP transmission to another data center. Unresponsive to hailing signals. 40% confident system has autonomously incorporated new viruses developed by REDACTED at the site into its programming, likely from Cat5 server running REDACTED.
0743: Cyber response teams from REDACTED notified of possible rogue USURP activation.
0745: Assemble a response portfolio for consideration by REDACTED ranging cyber to physical kinetic.
0748: Commence shutdown of local internet ISPS in collaboration with ISPS REDACTED, REDACTED, REDACTED.
***REPORT ENDS***

REPORT 72-ALPHA: USURP CONTAINMENT INCIDENT.
TIME: 0900.
INCIDENT STATUS: Active. Broadening.

0820: Detected shutdown of power stations REDACTED, REDACTED, and REDACTED. Also detected multiple hacking attacks on electronic health record systems.
0822: Further cyber assets are deployed.
0823: Connections severed at locations REDACTED in a distributed cyber perimeter around affected sites.
0824: Multiple DDOS attacks begin emanating from USURP-linked areas.
0825: Contingencies CLASSIFIED activated.
0826: Submarines #REDACTED, #REDACTED, #REDACTED arrive at inter-continental internet cables at REDACTED.
0827: Command given. Continent REDACTED isolated.
0830: Response team formed for amelioration of massive loss of electronic infrastructure in REDACTED region.
***REPORT ENDS***

Import AI: Issue 59: How TensorFlow is changing the AI landscape and forging new alliances, better lipreading via ensembling multiple camera views, and why political scientists need to wake up to AI

Making Deep Learning interpretable for finance:
…One of the drawbacks of deep learning approaches is their relative lack of interpretability – they can generate awesome results, but getting fine-grained details about why they’ve picked a particular answer can be a challenge.
…Enter CLEAR-Trade, a system developed by Canadian researchers to make such systems more interpretable. The basic idea is to create different attentive response maps for the different predicted outcomes of a model (stock market is gonna go up, stock market is gonna fall). These maps are used to generate two things: “1) a dominant attentive response map, which shows the level of contribution of each time point to the decision-making process, and 2) a dominant state attentive map, which shows the dominant state associated with each time point influencing the decision-making process.” This lets the researchers infer fairly useful correlations, like a given algorithm’s sensitivity to trading volume when making a prediction on a particular day, and can help pinpoint flaws, like an over-dependence on a certain bit of information when making faulty predictions. The CLEAR-Trade system feels very preliminary and my assumption is that in practice people are going to use far more complicated models to do more useful things, or else fall back to basic well understood statistical methods like decision trees, logistic regression, and so on.
Notably interesting performance: Though the paper focuses on laying out the case for CLEAR-Trade, it also includes an experiment where the researchers train a deep convolutional neural network on the last three years of S&P 500 stock data, then get it to predict price movements. The resulting model is correct in its predictions 61.2% of the time – which strikes me as a weirdly high baseline (I’ve been skeptical that AI will work when applied to the fizzing chaos of the markets, but perhaps I’m mistaken. Let me know if I am: jack@jack-clark.net)
…Read more here: Opening the Black Box of Financial AI with CLEAR-Trade: A CLass Enhanced Attentive Response Approach for Explaining and Visualizing Deep Learning-Driven Stock Market Prediction 

Political Scientist to peers: Wake up to the AI boom or risk impact and livelihood:
…Heather Roff, a researcher who recently announced plans to join DeepMind, has written a departing post on a political science blog frequented by herself and her peers. It’s a sort of Jerry Maguire letter (except as she’s got a job lined up there’s less risk of her being ‘fired’ for writing such a letter – smart!) in which Heather points out that AI systems are increasingly being used by states to do the work of political scientists and the community needs to adapt or perish.
…”Political science needs to come to grips with the fact that AI is going to radically change the way we not only do research, but how we even think about problems,” she writes. “Our best datasets are a drop in the bucket.  We almost look akin to Amish farmers driving horses with buggies as these new AI gurus pull up to us in their self-driving Teslas.  Moreover, the holders of this much data remain in the hands of the private sector in the big six: Amazon, Facebook, Google, Microsoft, Apple and Baidu.”
…She also points out that academia’s tendency to punish interdisciplinary cooperation among researchers by failing to grant tenure due to a lack of focus is a grave problem. Machine learning systems, she points out, are great at finding the weird intersections between seemingly unrelated ideas. Humans are great at this and should do more of it.
…”We must dismiss with the idea that a faculty member taking time to travel to the other side of the world to give testimony to 180 state parties is not important to our work. It seems completely backwards and ridiculous. We congratulate the scholar who studies the meeting. Yet we condemn the scholar who participates in the same meeting.”
…Read more here: Swan Song – For Now. 

Why we should all be a hell of a lot less excited about AI, from Rodney Brooks:
…Roboticist-slash-curmudgeon Rodney Brooks has written a post outlining the many ways in which people mess up when trying to make predictions about AI.
…People tend to mistake the shiny initial application (eg, the ImageNet 2012 breakthrough) for being emblematic of a big boom that’s about to happen, Brooks says. This is usually wrong, as after the first applications there’s a period of time in which the technology is digested by the broader engineering and research community, which (eventually) figures out myriad uses for the technology unsuspected by its creators (GPS is a good example, Rodney explains. Other ones could be computers, internal combustion engines, and so on.)
…”We see a similar pattern with other technologies over the last thirty years. A big promise up front, disappointment, and then slowly growing confidence, beyond where the original expectations were aimed. This is true of the blockchain (Bitcoin was the first application), sequencing individual human genomes, solar power, wind power, and even home delivery of groceries,” he writes.
…Worse is people’s tendency to look at current progress and extrapolate from there. Brooks calls this “Exponentialism”. Many people adopt this position due to a quirk in the technology industry called ‘Moore’s Law’ – an assertion about the rate at which computing hardware gets cheaper and more powerful which held up well for about 50 years (though it is faltering now as chip manufacturers stare into the uncompromising face of King Physics). There are very few Moore’s Laws in technology – eg, such a law has failed to hold up for memory prices, he points out.
…“Almost all innovations in Robotics and AI take far, far, longer to get to be really widely deployed than people in the field and outside the field imagine. Self driving cars are an example.” (Something McKinsey once told me – it takes 8 to 18 years for a technology to go from being deployed in the lab to running somewhere in the field at scale.)
…Read more here: The Seven Deadly Sins of Predicting the Future of AI.

TensorFlow’s Success creates Strange Alliances:
…How do you solve a problem like TensorFlow? If you’re Apple and Amazon, or Facebook and Microsoft, you team up with one another to try to leverage each other’s various initiatives to favor one’s own programming frameworks against TF. Why do you want to do this? Because TF is a ‘yuge’ success for Google, having quickly become the default AI programming framework used by newbies, Googlers, and established teams outside of Google, to train and develop AI systems. Whoever controls the language of discourse around a given topic tends to influence the given topic hugely, so Google has been able to use TF’s popularity to insert subtle directional pressure on the AI field, while also creating a larger and larger set of software developers primed to use its many cloud services, which tend to require or gain additional performance boosts from using TensorFlow (see: TPUs).
…So, what can other players do to increase the popularity of their programming frameworks? First up is Amazon and Apple, who have decided to pool development resources to build systems to let users easily translate AI applications written in MXNet (Amazon’s framework) into Core ML, the framework Apple requires developers to use if they want to bring AI services to macOS, iOS, watchOS, and tvOS.
…Read more here: Bring Machine Learning to iOS apps using Apache MXNet and Apple Core ML.
…Next up is Facebook and Microsoft, who have created the Open Neural Network Exchange (ONNX) format, which “provides a shared model representation for interoperability and innovation in the AI framework ecosystem.” At launch, it supports CNTK (Microsoft’s AI framework), PyTorch (Facebook’s AI framework), and Caffe2 (also developed by Facebook).
…So, what’s the carrot and what is the stick for getting people to adopt this? The carrot so far seems to be the fact that ONNX promises a sort of ‘write once, run anywhere’ representation, letting frameworks that fit the standard run on a variety of substrates. “Hardware vendors and others with optimizations for improving the performance of neural networks can impact multiple frameworks at once by targeting the ONNX representation,” Facebook writes. Now, what about the stick? There doesn’t seem to be one yet. I’d imagine Microsoft is cooking up a scheme whereby ONNX-compliant frameworks get either privileged access to early Azure services and/or guaranteed performance bumps by being accelerated by Azure’s fleet of FPGA co-processors — but that’s pure speculation on my part.
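…In practice the interoperability pitch looks something like the sketch below: define a model in one framework (PyTorch here), export it to the shared ONNX format, and load it elsewhere. The torch.onnx.export call exists, though exact arguments and supported operators vary by version:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
dummy_input = torch.randn(1, 10)  # an example input fixes the exported graph's shapes
torch.onnx.export(model, dummy_input, "tiny_classifier.onnx")
```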
…Read more here: Microsoft and Facebook create open ecosystem for AI model interoperability.

Speak no evil: Researchers make BILSTM-based lipreader that works from multiple angles… improves state-of-the-art…96%+ accuracies on (limited) training set…
Researchers with Imperial College London and the University of Twente have created what they say is the first multi-view lipreading system. This follows a recent flurry of papers in the area of AI+Lipreading, prompting some disquiet among people concerned how such technologies may be used by the security state. (In the paper, the authors acknowledge this but also cheerfully point out that such systems could work well in office teleconferencing rooms with multiple cameras as well.)
…The authors train a bi-directional LSTM with an end-to-end encoder on the (fairly limited) OuluVS2 dataset. They find that their system gets a state-of-the-art score of around 94.7% when trained on a subset of the dataset containing single views of a subject, with performance climbing to 96.7% when they add a second view, before plateauing at 96.9% with the addition of a third. After this they find negligible performance improvements from adding further views. (Note: scores are the best over ten runs, so lop a few percent off to get the actual average. You’ll also want to mentally reduce the scores by another (and this is pure guesswork/intuition on my part) 10% or so, since the OuluVS2 dataset has fairly friendly, uncomplicated backgrounds for the network to see the mouth against. You may even want to reduce the performance a little further still due to the simple phrases used in the dataset.)
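…For readers who want a feel for the shape of such a model, here’s a minimal PyTorch sketch (not the authors’ code – the layer sizes, per-view feature dimensions, and fusion scheme are all assumptions) of the multi-view idea: encode each camera view at every timestep, fuse the views, run a bidirectional LSTM over the sequence, and classify the phrase.
```python
# Minimal multi-view bidirectional-LSTM classifier sketch (illustrative only).
import torch
import torch.nn as nn

class MultiViewLipreader(nn.Module):
    def __init__(self, n_views=3, feat_dim=256, hidden=256, n_phrases=10):
        super().__init__()
        # One small encoder per camera view, applied frame by frame.
        self.encoders = nn.ModuleList(
            [nn.Linear(feat_dim, hidden) for _ in range(n_views)]
        )
        self.blstm = nn.LSTM(hidden * n_views, hidden,
                             batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(hidden * 2, n_phrases)

    def forward(self, views):
        # views: list of tensors, each (batch, time, feat_dim), one per camera view.
        encoded = [enc(v) for enc, v in zip(self.encoders, views)]
        fused = torch.cat(encoded, dim=-1)   # fuse the views at each timestep
        out, _ = self.blstm(fused)           # (batch, time, hidden * 2)
        return self.classifier(out[:, -1])   # classify from the final timestep

model = MultiViewLipreader()
dummy = [torch.randn(4, 20, 256) for _ in range(3)]  # 3 views, 20 frames each
logits = model(dummy)  # shape: (4, 10)
```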
…What we learned: Another demonstration that augmenting an existing approach with additional data (here, extra camera views) can lead to dramatically improved performance. Given the proliferation of cheap, high-resolution digital cameras into every possible part of the world, it’s likely we’ll see ‘multi-view’ classifier systems become the norm.
…Read more here: End-to-End Multi-View Lipreading.

Data augmentation via data generation – just how good are GANs at generating plants?
…An oft-repeated refrain in AI is that data is a strategic and limited resource. This is true. But new techniques for generating synthetic data are making it possible to get around some of these problems by augmenting existing datasets with newly generated and extended data.
…Case in point: ARGAN, aka Arabidopsis Rosette Image Generator (through) Adversarial Network, a system from researchers at The Alan Turing Institute, Forschungszentrum Jülich, and the University of Edinburgh. The approach uses a DCGAN generative network to generate additional synthetic plants based on pictures of Arabidopsis and Tobacco plants from the CVPP 2017 dataset. The initial dataset consisted of around ~800 images, which the researchers expanded roughly 30-fold by automatically flipping and rotating the pictures and performing other translations. They then trained a DCGAN on the resulting dataset to generate new, synthetic plants.
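…As a rough sketch of the kind of label-preserving flip/rotate augmentation described above – the step that multiplies a small image dataset before any GAN is trained on it. The file paths and the exact set of transformations are assumptions; this particular combination yields 8 variants per image rather than the paper’s ~30x expansion.
```python
# Illustrative flip/rotate augmentation for a small image dataset.
from pathlib import Path
from PIL import Image

def augment(image):
    """Yield flipped and rotated variants of a single plant image (8 total)."""
    for flip in (None, Image.FLIP_LEFT_RIGHT):
        flipped = image.transpose(flip) if flip else image
        for angle in (0, 90, 180, 270):
            yield flipped.rotate(angle)

src, dst = Path("plants/raw"), Path("plants/augmented")  # assumed paths
dst.mkdir(parents=True, exist_ok=True)
for path in src.glob("*.png"):
    for i, variant in enumerate(augment(Image.open(path))):
        variant.save(dst / f"{path.stem}_{i}.png")
```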
…The results: The researchers tested the usefulness of their generated data by running a state-of-the-art leaf-counting algorithm on a subset of the Arabidopsis/Tobacco dataset, and on the same subset augmented with the synthetic imagery (which they call Ax). The result is a substantial reduction in overfitting by the trained system and, in one case, a reduction in training error as well. However, it’s difficult at this stage to work out how much of that is due to simply scaling up the data with something roughly in the expected distribution (the synthetic images), rather than to how high-quality the DCGAN-generated plants are.
…Read more here: ARGAN: Synthetic Arabidopsis Plants using Generative Adversarial Network.

Amazon and Google lead US R&D spending:
…Tech companies dominate the leaderboard for R&D investment in the United States, with Amazon leading, followed by Alphabet (aka Google), Intel, Microsoft, and Apple. It’s likely that a significant percentage of R&D spend for companies like Google and Microsoft goes into infrastructure and AI, while Amazon’s will be spread across these plus devices and warehouse/automation technologies, and Apple’s will likely concentrate more on devices and materials. Intel’s R&D spending is mostly for fabrication and process tech, so it sits in a somewhat different sector of technology compared to the others.
…Read more here: Tech companies spend more on R&D than any other company in the US.

Tech Tales:

[2032: Detroit, USA.]

The wrecking crew of one enters like a ping pong ball into a downward-facing maze – the entranceway becomes a room containing doors and one of them can be opened, so it bounces through that door and finds a larger room with more doors and this time it can force open more than one of them. It splits into different pieces, growing stronger, and explores the castle of the mind of the AI, entering different points, infecting and wrecking where it can.

It started with its vision, they said. The classifiers went awry. Saw windmills in clouds, and people in shadows. Then it spread to the movement policies. Mechanical arms waved oddly. And not all of its movements were physical – some were digital, embodied in a kind of data ether. It reached out to other nearby systems – exchanged information, eventually persuaded them that flags were fires, clouds were windmills, and people were shadows. Data rots.

It spread and kept on spreading. Inside the AI there was a system that performed various meta-learning operations. The virus compromised that – tweaking some of the reward functions, altering the disposition of the AI as it learned. Human feedback inputs were intercepted and instead generative adversarial networks dreamed up synthetic outputs for human operators to look at, selecting what they thought were guidance behaviors but which were in fact false flags. Inside the AI the intruder gave its own feedback on the algorithms according to its own goals. In this way the AI changed its mind.

Someone decides to shut it down – stop the burning. FEMA is scrambled. The National Guard are, eponymously, nationalized. Police, firefighters, EMTs, all get to work. But the tragedies are everywhere and stretch from the banal to the horrific – cars stop working; ATMs freeze; robots repeatedly clean the same patches of floors; drones fall out of the sky, beheading trees and birds and sometimes people on their way down; bridges halt, half up; ships barrel into harbors; and one recommender system decides that absolutely everyone should listen to Steely Dan. A non-zero percentage of everything that isn’t unplugged performs its actions unreliably, diverging from the goals people had set.

Recovery takes years. The ‘Geneva Isolation Protocol’ is drafted. AIs and computer systems are slowly redesigned to be modular, each system able to fully defend itself and cut itself off, jettisoning its infected components into the digital ether. Balkanization becomes the norm, not because of any particular breakdown, but due to the set-your-watch-by-it logic of emergent systems.