Import AI: #79: Diagnosing AI brains with PsychLab; training drones without drone-derived data; and a Davos AI report.

by Jack Clark

Making better video game coaches with deep learning:
…The era of the deep learning-augmented video game coach is nigh!…
What if video game coaches had access to the same sorts of telemetry as coaches for traditional sports like soccer or the NFL? That’s the question that DeepLeague, software for analysing the coordinate position of a League of Legends player at a point in time, tries to answer. The software is able to look at a LoL minimap at a given point in time and use that to create a live view of where each specific player is. That sounds minor but it’s currently not easy to do this in real-time via Riot Games’ API, so DL (aided by 100,000 annotated game minimap images) provides a somewhat generalizable workaround.
  What it means: The significant thing here is that deep learning is making it increasingly easy to point a set of neural machinery at a problem, like figuring out how to map coordinates and player avatars to less-detailed dots on a minimap. This is a new thing: the visual world has never before been this easy for computers to understand, and things that look like toys at first frequently wind up being significant. Take a read of the ‘part 2’ post to get an idea of the technical details for this sort of project. And remember: this was coded up by a student over 5 days during a hurricane.
  Jargon, jargon everywhere: DeepLeague will let players “analyze how the jungler paths, where he starts his route, when/where he ganks, when he backs, which lane he exerts the most pressure on, when/where mid roams”. So there!
  Read more: DeepLeague: leveraging computer vision and deep learning on the League of Legends minimap + giving away a dataset of over 100,000 labeled images to further esports analytics research (Medium).
  Read even more: DeepLeague (Part 2): The Technical Details (Medium).
  Get the data (GitHub).

Rise of the drones: learning to operate and navigate without the right data with DroNet:
…No drone data? No problem! Use car & bicycle data instead and grab enough of it to generalize…
One of the main ways deep learning differs to previous AI techniques lies in its generalizability: neural networks fed on one data type are frequently able to attain reasonable performance on slightly different and/or adjacent domains. We’re already pretty familiar with this idea within object recognition – a network trained to recognize flowers should be able to draw bounding boxes around flowers and plants not in the training set, etc – but now we’re starting to apply the same techniques to systems that take actions in the world, like cars and drones.
  Now, researchers with the University of Zurich and ETH Zurich and the Universidad Politécnica de Madrid in Spain have proposed DroNet: a way to train drones to drive on city streets using data derived entirely from self-driving cars and bicycles.
  How it works: The researchers use an 8-layer Residual Network to train a neural network policy to do two things: work out the correct steering angle to stay on the road, and learn to avoid collisions using a dataset gathered via bicycle. They train the model via mean-squared error (steering) and binary cross-entropy (collision). The result is a drone that is able to move around in urban settings and avoid collisions, though as the input data doesn’t include information on the drone’s vertical position, it operates in these experiments on a plane.
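  To make the two-headed objective concrete, here is a minimal sketch in plain Python of a loss that adds mean-squared error on steering to binary cross-entropy on collision probability. It is not the authors' code: the function names and the `collision_weight` parameter are assumptions made for illustration, and how the paper actually balances the two terms is not covered here.

```python
import math

def mse(preds, targets):
    """Mean-squared error, used here for the steering-angle head."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Binary cross-entropy, used here for the collision head.

    probs are predicted collision probabilities in (0, 1);
    labels are 0 (no collision) or 1 (collision).
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def combined_loss(steer_pred, steer_true, coll_prob, coll_label,
                  collision_weight=1.0):
    # collision_weight is a hypothetical knob, not a value from the paper.
    return (mse(steer_pred, steer_true)
            + collision_weight * binary_cross_entropy(coll_prob, coll_label))

if __name__ == "__main__":
    print(combined_loss([0.10, -0.25], [0.00, -0.20], [0.9, 0.2], [1, 0]))
```

  A confident, correct collision prediction (0.9 for a frame labeled 1) contributes far less loss than a confident wrong one, which is the behaviour you want when the label means "about to hit something".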
  Testing: They test it on a number of tasks in the real world which include traveling in a straight line, traveling along a curve and avoiding collisions in an urban area. They also evaluate its ability to transfer to new environments by testing it in a high altitude outdoor environment, a corridor, and a garage, where it roughly matches or beats other baselines. The overall performance of the system is pretty strong, which is surprising given its relative lack of sophistication compared to more innately powerful methods such as a control policy implemented within a 50-layer residual network. “We can observe that our design, even though 80 times smaller than the best architecture, maintains a considerable prediction performance while achieving real-time operation (20 frames per second),” they say.
  Datasets: The researchers get the driving dataset from Udacity’s self-driving car project; it consists of 70,000 images of cars driving distributed over six distinct experiments. They take data from the front cameras and also the steering telemetry. For the collision dataset they had to collect their own and it’s here that they get creative: they mount a GoPro on the handlebars of a bicycle and “drive along different areas of a city, trying to diversify the types of obstacles (vehicles, pedestrians, vegetation, under construction sites) and the appearance of the environment. This way, the drone is able to generalize under different scenarios. We start recording when we are far away from an obstacle and stop when we are very close to it. In total, we collect around 32,000 images distributed over 137 sequences for a diverse set of obstacles. We manually annotate the sequences, so that frames far away from collision are labeled as 0 (no collision), and frames very close to the obstacle are labeled as 1 (collision)”.
  Drone used: A Parrot Bebop 2.0, which streams footage at 30Hz over wifi to a computer running the neural network.
– Read more: DroNet: Learning to Fly by Driving (ETH Zurich).
– Get the pre-trained DroNet weights here (ETH Zurich).
– Get the Collision dataset here (ETH Zurich).
– Access the project’s GitHub repository here (GitHub).

UPS workers’ union seeks to ban drones, driverless vehicles:
…In the absence of alternatives to traditional economic models, people circle the wagons to protect themselves from AI…
People are terrified of AI because they worry for their livelihoods. That’s because most politicians around the world are unable to suggest different economic models for an increasingly automated future. Meanwhile, many people are assuming that even if there’s not gonna be mass unemployment as a consequence of AI, there’s definitely going to be a continued degradation in wage bargaining power and the ability for people to exercise independent judgement in increasingly automated workplaces. As a consequence, workers’ unions are seeking to protect themselves. Case in point: the Teamsters labor union wants UPS to ban using drones or driverless vehicles for package deliveries so as to better protect their own jobs: this is locally rational, but globally irrational. If only society were better positioned to take advantage of such technologies without harming its own citizens.
– Read more: Union heavyweight wants to ban UPS from using drones or driverless vehicles (CNBC).

Human-in-the-loop AI artists, with ‘Deep Interactive Evolution’ (DeepIE):
…Battle of the buzzwords as researchers combine generative adversarial networks (GANs) with interactive evolutionary computation (IEC)…
The future of AI will involve humans augmenting themselves with increasingly smart, adaptive, reactive systems. One of the best ways to do this is with ‘human-in-the-loop’ learning, where a human is able to directly influence the ongoing evolution of a given AI system. One of the first places this is likely to show up is in the art domain, as artists access increasingly creative systems to help enhance their own creative practices. So it’s worth reading through this paper from researchers with New York University, the IT University of Copenhagen, and the Beijing University of Posts and Telecommunications, about how to smartly evolve novel images using humans, art, and AI.
  Their Deep Interactive Evolution approach relies on a four-stage loop: latent variables are fed into a pre-trained image generator, which spits out images in response to the variables; these images are then shown to a user, who selects the ones they prefer; new latent variables are derived based on those choices; then those variables are mutated according to rules defined by the user. This provides a tight feedback loop between the AI system and the person, and the addition of evolution provides the directed randomization needed to generate novelty.
“The main differentiating factor between DeepIE and other interactive evolution techniques is the employed generator. The content generator is trained over a dataset to constrain and enhance what is being evolved. In the implementation in this paper, we trained a nonspecialized network over 2D images. In general, a number of goals can be optimized for during the training process. For example, for generating art, a network that specializes in creative output can be used,” write the researchers.
  Testing: Evaluating subjectively generated art images is a notoriously difficult task, so it’s worth thinking about how these researchers did it: the approach they used involved setting users two distinct tasks. One was to reproduce a predetermined picture, which in this case was a shoe, and the other was to reproduce a picture of their own choosing. Both of these tests provide a way to evaluate how intuitive humans find the image-evolution process and also provide an implicit measure of the ease with which they can create with the AI.
  Results: “Based on self-reported numbers, users felt that they got much closer to reproducing the shoes than they did to the face. This could be predicted from figure 4. On average users reported 2.2 out of 5 for their ability to reproduce faces and 3.8 out of 5 for their ability to reproduce shoes, both with a standard deviation of 1,” write the researchers. My belief is that it’s much easier to generate shoes because they’re less complex visual entities and as humans we’re also not highly-evolved to distinguish between different types of shoe, whereas we are with faces, so I think we’ll always be more attuned to the flaws in faces and/or human-oriented things.
  Up next: “In the future, it will be interesting to extend the approach to other domains such as video games that can benefit from high-quality and controllable content generation.”
  Implementation details: The authors use a Wasserstein GAN with Gradient Penalty (WGAN-GP) network along with the DCGAN architecture. For evolution they use mutation and crossover techniques but, without being able to receive a specific signal from the user about the relative quality of the newly generated images, the network tends towards increasingly nutty images over time.
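  The four-stage loop (generate, select, crossover, mutate) can be sketched in a few lines of plain Python. This is an illustration under loud assumptions rather than the paper's implementation: `generate_image` is a stand-in for a pre-trained GAN generator, `simulated_user` is a hypothetical replacement for the human making selections, and the latent size, population size, and mutation settings are arbitrary.

```python
import math
import random

LATENT_DIM = 8  # assumed latent size, chosen only for the demo
POP_SIZE = 6    # assumed number of images shown per round

def random_latent(rng):
    return [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

def generate_image(z):
    # Stand-in for a pre-trained generator: maps a latent vector to "pixels".
    return [math.tanh(v) for v in z]

def crossover(a, b, rng):
    # Uniform crossover: each coordinate is copied from one of the two parents.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(z, rng, rate=0.5, scale=0.3):
    # With probability `rate`, nudge a coordinate by Gaussian noise.
    return [v + rng.gauss(0.0, scale) if rng.random() < rate else v for v in z]

def simulated_user(images, k=2):
    # Hypothetical stand-in for the human: prefers "brighter" images.
    scores = [sum(img) for img in images]
    order = sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def next_generation(population, selected, rng, rate, scale):
    parents = [population[i] for i in selected]
    if not parents:  # nothing selected: start again from fresh samples
        return [random_latent(rng) for _ in population]
    children = []
    while len(children) < len(population):
        child = crossover(rng.choice(parents), rng.choice(parents), rng)
        children.append(mutate(child, rng, rate, scale))
    return children

def run(generations=5, seed=0):
    rng = random.Random(seed)
    population = [random_latent(rng) for _ in range(POP_SIZE)]
    for _ in range(generations):
        images = [generate_image(z) for z in population]  # stage 1: generate
        chosen = simulated_user(images)                    # stage 2: select
        population = next_generation(population, chosen, rng, 0.5, 0.3)  # stages 3-4
    return population

if __name__ == "__main__":
    print(len(run()), "latent vectors in the final population")
```

  In a real system the `rate` and `scale` values would be exposed to the user, since the loop described above lets the person define the mutation rules.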
  Read more: Deep Interactive Evolution (Arxiv).
  Magnificent Jargon of the Week Award… for this incredible phrase: “Other options for mutation and crossover could involve interpolating between vectors along the hypersphere”. (Captain, interpolate the vectors across the hypersphere, please!).

Facebook releases the Detectron, its object detection research platform:
…Another computational dividend from AI-augmented-capitalism: free object detection for all…
Facebook AI Research (FAIR) has released Detectron, an open source platform for conducting research into object detection and segmentation. The package ships with a number of object detection algorithms, including: Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, and R-FCN, which are each built on some standardized neural network architectures including ResNeXt, ResNet, Feature Pyramid Networks, and VGG16.
  License: Apache 2.0
  Read More: Detectron (GitHub).

Enter the AI agent PsychLab, courtesy of DeepMind:
…Not quite ‘there is a turtle lying on its back in the desert’, but a taste of things to come…
DeepMind thinks AI agents are now sophisticated enough that we should start running them through (very basic) psychological tests, so it’s built and released an open source testing suite to do that, based within its 3D Quake3-based ‘DeepMind Lab’ environment.
  PsychLab provides a platform to compare AI agents to humans on a bunch of tasks derived from cognitive psychology and visual psychophysics. The environment is a literal platform that the agent stands on in front of a large (simulated) computer monitor – so the agent is free to look around the world and even look away from the experiments. By testing their agents on some of these tasks DeepMind also ends up discovering a surprising flaw in its ‘UNREAL’ architecture which leads it to re-design part of the agents’ vision system based on knowledge of biological foveal vision, which improves performance. (Adding this improvement in also increases performance on the ‘laser tag’ set of tasks that were created by another team, providing further validation of the tweak.)
  Tasks: Some of the tasks the researchers test their agents on include being able to detect subtle changes in an environment, being able to identify the orientation of a specific ‘Landolt C’ stimulus, being able to figure out which of two patterns is a concentric glass pattern, visual search (aka, playing a low-res version of ‘Where’s Waldo’), working out the main direction of motion from a group of dots moving separately, and tracking multiple objects at once, among others.
  Results: UNREAL agents fail to beat humans on basically every single baseline, with humans displaying greater sample efficiency, adaptability, and generally higher baseline scores than the agents. One exception is some of the visual acuity tests, where tweaks by DeepMind to implement a foveal vision model lead to UNREAL agents that more closely match human performance. This foveal model also dramatically improves UNREAL performance on the non-psychological ‘laser tag’ test, leading to agents that more consistently beat humans or match their skills.
  The trouble with time: One problem the researchers deal with is that of time, namely that reinforcement learning agents learn through endless runs of an environment and gravitate to success via a reward function, whereas human subjects are typically tested over a ~one hour period after being given verbal instructions. This difference likely means RL agent performance is significantly higher on certain tasks due to overfitting during a subjectively far longer period of training (remember, computers run simulations far faster than we humans can experience reality). “Since nonhuman primate training procedures can take many months, it’s possible that slower learning mechanisms might influence results with non-human primates that could not operate within the much shorter time frame (typically one hour) of the equivalent human experiment,” write the researchers.
– Read more: PsychLab: A Psychology Laboratory for Deep Reinforcement Learning Agents (Arxiv).
Looming ethical paradox: Once we have agents that pass all of these tests with flying colors, will we need to start dealing with the ethical questions of whether it is acceptable to shutdown/restart/delete/tweak these agents? We’re likely years away from this, but I think way before we get agents that display general cognition we’ll have ones that seem lifelike enough that we’ll have to deal with these questions – I don’t think that today we execute monkeys after they’ve done a six-month lab testing period, so I’m wondering if we’ll have to change how we handle and store agents as well – perhaps the future life of an UNREAL agent is to be ‘paused’ and have its parameters saved, rather than being junked entirely.

Evolution Strategies for all:
…Basic tutorial walks you through a Minimum Viable Experiment to learn ES…
Evolution Strategies is a technique for creating AI agents that can handle long-term planning at the cost of immense computation. It’s different to Deep Learning because in many senses it’s much more primitive, but it’s also potentially more powerful than Deep Learning in some domains thanks to its ability to have performance scale almost linearly with additional computation, letting you throw computers at problems too hard for existing, more sophisticated algorithms.
  Now Florida AI chap Eder Santana has published a post walking us through how to experiment with ES on what he calls a ‘minimum viable experiment’ – in this case, implementing ES in the Keras programming framework and using the resulting system to train an agent to play catch. It’s a good, math-based walkthrough of how it works and comes with code.
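  For a feel of the core idea before reading the tutorial, here is a minimal sketch of a basic evolution strategy in plain Python: perturb the parameters with Gaussian noise, score each perturbation, and move the parameters toward the noise that scored well. This is not the tutorial's Keras code and has nothing to do with the catch game; `TARGET`, `reward`, and all hyperparameters are assumptions picked for a toy problem.

```python
import math
import random

TARGET = [0.5, 0.1, -0.3]  # hypothetical optimum for the toy objective

def reward(params):
    # Toy objective: higher is better, maximal when params == TARGET.
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

def evolution_strategy(n_iters=300, pop_size=50, sigma=0.1, lr=0.001, seed=0):
    rng = random.Random(seed)
    params = [0.0] * len(TARGET)
    for _ in range(n_iters):
        # Sample one Gaussian noise vector per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in params] for _ in range(pop_size)]
        # Score each perturbed copy of the current parameters.
        rewards = [reward([p + sigma * e for p, e in zip(params, eps)])
                   for eps in noises]
        # Standardize rewards so the update is insensitive to reward scale.
        mean = sum(rewards) / pop_size
        std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / pop_size) or 1.0
        advantages = [(r - mean) / std for r in rewards]
        # Move each parameter toward the noise directions that scored well.
        for j in range(len(params)):
            grad = sum(a * eps[j] for a, eps in zip(advantages, noises))
            params[j] += lr * grad / (pop_size * sigma)
    return params

if __name__ == "__main__":
    print(evolution_strategy())
```

  Every population member is scored independently, which is why the method parallelizes so easily: each worker only needs to report a single reward number back.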
– Read more: MVE Series: Playing Catch with Keras and an Evolution Strategy (Medium).
– Get the code: EvolutionMVE (GitHub).
– Read more: Evolution Strategies as a Scalable Alternative to Reinforcement Learning (OpenAI).
– Read more: A Visual Guide to Evolution Strategies (@hardmaru).
– Read even more: Uber’s recent research on ES (Uber Engineering Blog).

*** Davos 2018 AI Special Report ***
Entrepreneurs, World Leaders, chime in on AI and what it means.

Alibaba founder warns that civilization ill prepared for the AI revolution:
…Choose art and culture over repetitive tasks, says entrepreneur at Davos 2018…
“If we do not change the way we teach 30 years later we will be in trouble because the way we teach, the things we teach our kids, are the things from 200 years ago. And we cannot teach our kids to compete with machines – they are smarter,” Jack Ma said at Davos this year. “Everything we teach should be different from the machine,” he said. “The computer will always be smarter than you are; they never forget, they never get angry. But computers can never be as wise a man. The AI and robots are going to kill a lot of jobs, because in the future it’ll be done by machines. Service industries offer hope – but they must be done uniquely.”
– Read more: Jack Ma on the IQ of love – and other top quotes from his Davos interview.

British Prime Minister positions UK as the place to lead AI development:
…Impact of speech dimmed by UK’s departure from influence of world stage due to Brexit…
“In a global digital age we need the norms and rules we establish to be shared by all. That includes establishing the rules and standards that can make the most of Artificial Intelligence in a responsible way, such as by ensuring that algorithms don’t perpetuate the human biases of their developers,” said the PM. “So we want our new world-leading Centre for Data Ethics and Innovation to work closely with international partners to build a common understanding of how to ensure the safe, ethical and innovative deployment of Artificial Intelligence.”
– Read more here: PM’s Speech at Davos 2018: 25 January (

Google CEO stresses AI’s fundamental importance:
…“AI is probably the most important thing humanity has ever worked on,” said Pichai…
The Google CEO also said companies should “agree to demilitarize AI” and that we need “global multilateral frameworks” to tackle some of the issues posed by AI.
– Read more: Google CEO: AI is ‘more profound than electricity or fire’ (CNN).

Tech Tales:

[Japan, 2034: A public park with a pond.]

The wind starts and so the pond ripples and waves form across its long, rectangular surface. People throng at its sides; the ends are reserved for, at one end, the bright red race ribbon, and at the other, three shipping containers stacked side by side, with their doors flush with the edge of the pond, ready to open and let their cargo slide out and onto the undulating surface of the water. There’s an LED sign on top of the crates that reads, in strobing red&orange letters: KAWASAKI BOAT-RACE SPONSORED BY ARCH-AI: ‘INVENT FURTHER’.

At the other end of the course a person in a bright red jacket fires a starter pistol in the air and, invisibly, a chip in the gun relays a signal to a transceiver placed halfway down the pond which relays the signal into the shipping crates, whose doors open outward. From each crate extends a metal tongue, each of which slides into the pond, thin and smooth enough to barely cause a ripple. The boats follow, pushed from within by small robot arms, down onto the slides and then into the water. A silent electrically-powered utility vehicle lifts the crates once the boats are clear and removes them, creating more of a space for wind to gather and inhabit before plunging into the sails of the AI-designed boats.

Each boat is a miracle: a just-barely euclidean mess of sails and banisters and gantries and intricate pulleys. Each boat has been 3D printed overnight inside each of the three shipping crates, with their designs ginned up by evolutionary optimization processes paired with sophisticated simulations. And each is different – that’s the nice thing about wind; it’s so inherently unpredictable that when you can build micron-scale ropes and poles you can get really creative with the designs, relying on a combination of emergence and fiendishly-clever AI-dreamed gearing to turn your construction into something seaworthy.

The crowds cheer as the boats go past and tens of airborne drones film the reactions and track the gazes of various human eyeballs, silently ranking and scoring each boat not only on its speed relative to others, but on how alluring it seems to the humans. Enough competitions have been run now that the boat-making AIs have had to evolve their process many times, swapping out earlier designs that maximized sail surface area for ones made of series of independently moving ones, to the current iteration of speedy, attention-grabbing vessels, where the sails are almost impossible to individually resolve from a distance as, aside from a few handkerchief-sized ones, the rest shrink according to strange, fractal rules, down into the sub-visual. In this way each vessel moves, powered by pressures diverted into sails that are so fine and so carefully placed that they filigree together into something that, if you squint, seems like an entire thing, but the boats’ sounds of infinitely-tattered-flapping tell you otherwise.

A winner is eventually declared following the ranking of the crowd’s reactions and the heavily optimized single-digit-millimetre lead jockeyed for by the boats in the competition. Reactions are fed back. The electric utility vehicle brings the shipping containers back to the edge of the pond and sets each down by its edge in the same position as before. Inside, strange machines begin to whirr as new designs are built.

Later that night they burn the ships from the day’s competition, and the drones film that as well, silently feeding back points for the aesthetically pleasing quality of the burn to the printers in the containers: a little game the AIs play amongst themselves, unbeknownst to their human minders, as each seeks to find additional variables to explore. Perhaps one day the ships will be invisible, for they will each be made so fine.

Technologies that inspired this story: Human-in-the-loop feedback, evolutionary design, variational auto-encoders, drones, psychological monitoring via automated video analysis, etc.