Import AI 131: IBM optimizes AI with AI, via ‘NeuNetS’; Amazon reveals its Scout delivery robot; Google releases 300k Natural Questions dataset

by Jack Clark

Amazon gets into delivery robot business with ‘Scout’:
…New pastel blue robot to commence pilot in Washington neighborhood…
Amazon has revealed Scout, a six-wheeled knee-height robot designed to autonomously deliver products to Amazon customers. Amazon is trialing Scout with six robots that will deliver packages throughout the week in Snohomish County, Washington. “The devices will autonomously follow their delivery route but will initially be accompanied by an Amazon employee,” Amazon writes. The robots will only make deliveries during daylight hours.
  Why it matters: For the past few years, companies have been piloting various types of delivery robot in the real world, but there have been continued questions about the viability and likelihood of adoption of such technologies. Amazon is one of the first very large technology companies to begin publicly experimenting in this area, and where Amazon goes, some try to follow.
  Read more: Meet Scout (Amazon blog).

Want high-definition robots? Enter the Robotrix:
…New dataset gives researchers high-resolution data over 16 exquisitely detailed environments…
What’s better to use for a particular AI research experiment – a small number of simulated environments, each accompanied by a large amount of very high-quality data, or a very large number of environments, each accompanied by a small amount of low-to-medium quality data? AI researchers confront this trade-off frequently, and it explains why available datasets range from small, information-rich collections to very large, lower-fidelity ones.
  Now, researchers with the University of Alicante, Spain, have released RobotriX, a dataset that contains a huge amount of information about a small number of environments (16 different layouts of simulated rooms, versus thousands to tens of thousands for other approaches like House3D).
  The dataset consists of 512 sequences of actions taking place across 16 simulated rooms, rendered in high definition via Unreal Engine 4. These sequences are generated by a robot avatar which uses its hands to interact with the objects and items in question. The researchers say this is a rich dataset, with every item in the simulated rooms accompanied by 2D and 3D bounding boxes as well as semantic masks, along with depth information. The simulation outputs the RGB and depth data at a resolution of 1920 x 1080. In the future, the researchers hope to increase the complexity of the simulated rooms even further by using the inbuilt physics of the Unreal Engine 4 system to implement “elastic bodies, fluids, or clothes for the robots to interact with”. It’s such a large dataset that they think most academics will find something to like within it: “the RobotriX is intended to adapt to individual needs (so that anyone can generate custom data and ground truth for their problems) and change over time by adding new sequences thanks to its modular design and its open-source approach,” they write.
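  For concreteness, here is a minimal Python sketch of what a single annotated frame in a RobotriX-style dataset might look like; the class and field names below are illustrative assumptions based on the paper’s description (RGB plus depth at 1920 x 1080, with per-object 2D/3D boxes and semantic masks), not the dataset’s actual schema:

```python
from dataclasses import dataclass, field
from typing import List, Tuple
import numpy as np

# Illustrative record for one annotated frame in a RobotriX-style dataset.
# Field names and shapes are assumptions for exposition, not the real schema.

@dataclass
class ObjectAnnotation:
    class_name: str                     # e.g. "mug", "chair"
    bbox_2d: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
    bbox_3d: np.ndarray                 # (8, 3) corners of the oriented 3D box
    mask: np.ndarray                    # (1080, 1920) boolean semantic mask

@dataclass
class Frame:
    rgb: np.ndarray                     # (1080, 1920, 3) uint8 colour image
    depth: np.ndarray                   # (1080, 1920) float32 depth map
    objects: List[ObjectAnnotation] = field(default_factory=list)

@dataclass
class Sequence:
    room_id: int                        # one of the 16 simulated layouts
    frames: List[Frame] = field(default_factory=list)
```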
  Why it matters: Datasets like RobotriX will make it easier for researchers to experiment with AI techniques that benefit from access to high-resolution data. Monitoring adoption (or lack of adoption) of this dataset will help give us a better sense of whether AI research needs more high-resolution data, or if large amounts of low-resolution data are sufficient.
  Read more: The RobotriX: An eXtremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences with Robot Trajectories and Interactions (Arxiv).
  Get the dataset here (Github).

DeepMind cross-breeds AI from human games to beat pros at StarCraft II:
…AlphaStar system blends together population-based training, imitation learning, and RL…
DeepMind has revealed AlphaStar, a system developed by the company to beat human professionals at the real-time strategy game StarCraft II. The system “applies a transformer torso to the units, combined with a deep LSTM core, an auto-regressive policy head with a pointer network, and a centralised value baseline,” according to DeepMind.
  Results: DeepMind recently played and won five StarCraft II matches against a highly-ranked human professional, proving that its systems are able to out-compete humans at the game.
  It’s all in the curriculum: One of the more interesting aspects of AlphaStar is its use of population-based training in combination with imitation learning: imitation learning bootstraps the system from human replays (dealing with one of the more challenging exploration aspects of a game like StarCraft), and increasingly successful agents are then inter-bred as they compete against each other in a DeepMind-designed league, forming a natural curriculum for the system. “To encourage diversity in the league, each agent has its own learning objective: for example, which competitors should this agent aim to beat, and any additional internal motivations that bias how the agent plays. One agent may have an objective to beat one specific competitor, while another agent may have to beat a whole distribution of competitors, but do so by building more of a particular game unit.”
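  DeepMind hasn’t yet published full technical details, but here is a toy Python sketch of the general league idea – a population of agents with heterogeneous objectives, in which strong agents are periodically branched into new competitors – where every name, number, and mechanic is an illustrative assumption rather than anything DeepMind has described in code:

```python
import copy
import random

# Toy sketch of a league-style training loop with per-agent objectives,
# loosely inspired by DeepMind's description of AlphaStar. All names,
# numbers, and mechanics are illustrative stand-ins, not DeepMind's code.

class Agent:
    def __init__(self, params, objective):
        self.params = params        # stand-in for policy weights
        self.objective = objective  # e.g. "beat_everyone", "beat_agent_2"
        self.rating = 1000.0        # simple Elo-style league score

def play_match(a, b):
    """Stand-in for a full game; higher params win more often."""
    p_a_wins = 1.0 / (1.0 + 10 ** ((b.params - a.params) / 4.0))
    return (a, b) if random.random() < p_a_wins else (b, a)

def league_step(league):
    # A real league would pick opponents according to each agent's
    # objective; here we just sample a pair uniformly.
    a, b = random.sample(league, 2)
    winner, loser = play_match(a, b)
    winner.rating += 10
    loser.rating -= 10
    winner.params += random.gauss(0.1, 0.05)  # stand-in for an RL update
    # Occasionally branch a copy of a strong agent with a fresh objective:
    # this is what grows the league and creates a natural curriculum.
    if random.random() < 0.05:
        best = max(league, key=lambda ag: ag.rating)
        league.append(Agent(copy.copy(best.params),
                            objective=f"beat_agent_{league.index(best)}"))

league = [Agent(random.random(), "beat_everyone") for _ in range(4)]
for _ in range(1000):
    league_step(league)
```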
  Why this matters: I’ll do a lengthier writeup of AlphaStar when DeepMind publishes more technical details about the system. The current results confirm that relatively simple AI techniques can be scaled up to solve partially observable strategic games such as StarCraft. The diversity shown in the evolved AI systems seems valuable as well, pointing to a future where companies are constantly growing populations of very powerful and increasingly general agents.
  APM controversy: Aleksi Pietikainen has written up some thoughts about how DeepMind chose to present the AlphaStar results. He argues that the system’s ability to take bursts of rapid-fire actions within the game means it may have out-competed humans not necessarily by being smart, but by exercising superhuman precision and speed when selecting moves for its units. This highlights how difficult evaluating the performance of AI systems can be, and invites the question of whether DeepMind can restrict the number and frequency of actions taken by AlphaStar enough that it must learn to outwit humans strategically.
  It’ll also be interesting to see whether DeepMind pushes further on a variant of AlphaStar with a more restricted observation space – the system that accrued a 10-0 win record had access to all screen information not occluded by the fog of war, while a version that played a human champion and lost was restricted to a more human-like (restricted) observation space during the game.
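  For the curious, here is a minimal sketch of one way an action-rate cap could be enforced – a sliding-window limiter – with purely illustrative numbers; note that a cap averaged over a long window still permits the short superhuman bursts Pietikainen describes, which is exactly why the choice of window matters:

```python
from collections import deque

# Sketch of a sliding-window action-rate cap, one conceivable way to stop
# an agent from winning through burst speed alone. Numbers are illustrative.

class APMLimiter:
    def __init__(self, max_actions=50, window_seconds=5.0):
        self.max_actions = max_actions
        self.window = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        # Drop actions that have fallen out of the sliding window.
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_actions:
            self.timestamps.append(now)
            return True
        return False  # the agent must idle (no-op) on this tick

limiter = APMLimiter()
# Simulate 20 seconds of game ticks at 100Hz; count permitted actions.
actions_taken = sum(limiter.allow(t * 0.01) for t in range(2000))
```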
  Read more: AlphaStar: Mastering the Real-Time Strategy Game StarCraft II (DeepMind blog).
  Read more: An Analysis On How Deepmind’s Starcraft 2 AI’s Superhuman Speed is Probably a Band-Aid Fix For The Limitations of Imitation Learning (Medium).

Using touch sensors, graph networks, and a Shadow hand to create more capable robots:
…Reach out and touch shapes!…
Spanish researchers have used a robot hand – specifically, a Shadow Dexterous hand – outfitted with BioTac SP tactile sensors to train an AI system to predict stable grasps it can apply to a variety of objects.
  How it works: The system receives inputs from the sensor data, which it converts into graph representations that the researchers call ‘tactile graphs’; it then feeds this data into a Graph Convolutional Network (GCN), which learns to map different combinations of sensor data to a prediction of whether the current grasp is stable or unstable.
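  For readers unfamiliar with GCNs, here is a minimal NumPy sketch of the core graph-convolution update (in the style of Kipf & Welling) applied to a tactile graph; the sensor count, adjacency, and classifier head are illustrative stand-ins rather than the authors’ implementation:

```python
import numpy as np

# Minimal graph convolution over a "tactile graph": nodes are pressure
# sensors, edges connect physically adjacent sensors. Sensor count, random
# adjacency, and the classifier head are illustrative assumptions.

def gcn_layer(H, A, W):
    """One propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # D^-1/2 as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

rng = np.random.default_rng(0)
n_sensors, n_features = 24, 1                  # one pressure value per node
A = (rng.random((n_sensors, n_sensors)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T                 # symmetric, no self-loops
H = rng.random((n_sensors, n_features))        # tactile readings
W = rng.normal(size=(n_features, 32))          # 32 features, per the paper's "sweet spot"
H1 = gcn_layer(H, A, W)                        # (24, 32) node embeddings

# A stable/unstable prediction would pool node features into a graph-level
# vector and apply a classifier head; a mean-pool + linear stand-in:
logit = float(H1.mean(axis=0) @ rng.normal(size=32))
stable = logit > 0.0
```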
  Dataset: They use the BioTacSP dataset, a collection of grasp samples collected by manipulating 41 objects of different shapes and textures, including fruit, cuddly toys, jars, toothpaste in a box, and more. They also add 10 new objects to this dataset, including a monster from the hit game Minecraft, a mug, a shampoo bottle, and more. The researchers record the hand manipulating these objects with the palm oriented flat, at a 45-degree angle, and on its side.
  Results: The researchers train a set of baseline models with varying network depths and widths and identify a “sweet spot on the architecture with 5 layers and 32 features”, which they then use in other experiments. They train the best-performing network on all data in the dataset (excluding the test set), then report accuracy of around 75% on the test set across all palm orientations. “There is a significant drop in accuracy when dealing with completely unknown objects,” they write.
  Why this matters: It’s going to take a long time to collect enough data and/or run enough high-fidelity simulations to generate the data needed to train computers to use a sense of touch. Papers like this give us an indication of how such techniques may be used. Perhaps one day – quite far off, based on this research – we’ll be able to go into a store to see robots hand-stitching cuddly toys, or step into a robot massage parlor?
  Read more: TactileGCN: A Graph Convolutional Network for Predicting Grasp Stability with Tactile Sensors (Arxiv).

Chinese researchers use hierarchical reinforcement learning to take on Dota clone:
…Spoiler alert – they only test against in-game AIs…
Researchers with Vivo AI Lab, a Chinese smartphone company, have shown how to use hierarchical reinforcement learning to train AI systems to excel at the 1v1 version of a multiplayer game called King of Glory (KoG). KoG is a popular multi-player game in Asia and is similar to games like Dota and League of Legends in how it plays – squads of up to five people battle for control of a single map while seeking to destroy each other’s fortifications and, eventually, home bases.
  How it works: The researchers combine reinforcement learning and imitation learning to train their system, using imitation learning to train their AI to select between any of four major action categories at any point in time (eg, attack, move, purchase, learn skills). Using imitation learning lets them “relieve the heavy burden of dealing with massive actions directly”, the researchers write. The system then uses reinforcement learning to figure out what to do within each of these categories: if it decides to attack, it figures out where to attack; if it decides to learn a skill, it uses RL to help it figure out which skill to learn. They base their main algorithm significantly on the design of the PPO algorithm used in the OpenAI Five Dota system.
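  Here is a toy sketch of that hierarchical split – a macro policy (imitation-trained, in the paper) choosing among the four categories, with per-category sub-policies filling in the details via RL; every function and state field below is an illustrative stand-in, not the authors’ code:

```python
import random

# Toy sketch of the hierarchical split described above: a macro policy
# picks one of four action categories, and a per-category sub-policy fills
# in the details. Everything here is an illustrative stand-in.

MACRO_ACTIONS = ["attack", "move", "purchase", "learn_skill"]

def macro_policy(state):
    """Stand-in for the imitation-learned category selector."""
    return random.choice(MACRO_ACTIONS)

def sub_policy(category, state):
    """Stand-in for the RL-trained low-level policies."""
    if category == "attack":
        return ("attack", random.choice(state["enemy_positions"]))
    if category == "move":
        return ("move", (random.uniform(0, 1), random.uniform(0, 1)))
    if category == "purchase":
        return ("purchase", random.choice(state["affordable_items"]))
    return ("learn_skill", random.choice(state["unlockable_skills"]))

state = {
    "enemy_positions": [(0.2, 0.9), (0.7, 0.4)],
    "affordable_items": ["boots", "sword"],
    "unlockable_skills": ["fireball", "heal"],
}
action = sub_policy(macro_policy(state), state)
```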
  Results: The researchers test their system in two domains: a restricted 1v1 version of the game, and a 5v5 version. For both games, they test against inbuilt enemy AIs. In the 1v1 version of the game they’re able to beat entry-level, easy-level, and medium-level AIs within the game. For 5v5, they can reliably beat the entry-level AI, but struggle with the easy-level and medium-level AIs. “Although our agents can successfully learn some cooperation strategies, we are going to explore more effective methods for multi-agent collaboration,” they write.
  (This use of imitation learning makes the AI achievement of training an HRL system in this domain a little less impressive – to my mind – since it uses human information to get over many of the challenging exploration aspects of the problem. This is definitely more about my own personal taste/interest than the concrete achievement – I just find techniques that bootstrap from less data (eg, fewer human games) more interesting).
  Why this matters: Papers like this show that one of the new ways in which AI researchers are going to test and calibrate the performance of RL systems will be against real-time strategy games, like Dota 2, King of Glory, StarCraft II, and so on. Though the technical achievement in this paper doesn’t seem very convincing (for one thing, we don’t know how such a system performs against human players), it’s interesting that it is coming out of a research group linked to a relatively young (<10 years) company. This highlights how growing Asian technology companies are aggressively staffing up AI research teams and doing work on computationally expensive, hard research problems like developing systems that can out-compete humans at complex games.
   Read more: Hierarchical Reinforcement Learning for Multi-agent MOBA Game (Arxiv).

IBM gets into the AI-designing-AI game with NeuNetS:
…In other words: Neural architecture search is mainstream, now…
IBM researchers have published details on NeuNetS, a software tool the company uses to perform automated neural architecture search for text and image domains. This is another manifestation of the broader industrialization of AI, as systems like this let companies automate and scale up part of the process of designing new AI systems.
  NeuNetS: How it works: NeuNetS has three main components: a service module which provides the API interfaces into the system; an engine which maintains the state of the project; and a synthesizer, which IBM says is “a pluggable register of algorithms which use the state information passed from the engine to produce new architecture configurations”.
  NeuNetS: How its optimization algorithms work: NeuNetS ships with three architecture search algorithms: NCEvolve, which is a neuro-evolutionary system that optimizes a variety of different architectural approaches and uses evolution to mutate and breed successful architectures; TAPAS, which is a CPU-based architecture search system; and Hyperband++, which “speeds up random search by using early stopping strategy to allocate resources adaptively” and has also been extended to reuse some of the architectures it has searched over, speeding up the rate at which it finds new potential high-performing architectures.
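  To give a flavor of the early-stopping idea behind Hyperband-style search, here is a minimal sketch of successive halving: try many random configurations on a small budget, keep the best fraction, and give the survivors more budget. The ‘training’ function below is a toy stand-in, not IBM’s implementation:

```python
import random

# Sketch of the successive-halving idea at the core of Hyperband-style
# search: evaluate many random configurations on a small budget, keep the
# best 1/eta fraction, and give the survivors eta-times more budget.

def train_and_score(config, budget):
    """Stand-in for training a network for `budget` epochs; higher is better."""
    return config["quality"] * (1 - 0.5 ** budget) + random.gauss(0, 0.01)

def successive_halving(n_configs=27, min_budget=1, eta=3):
    configs = [{"quality": random.random()} for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        scores = [(train_and_score(c, budget), c) for c in configs]
        scores.sort(key=lambda sc: sc[0], reverse=True)
        configs = [c for _, c in scores[: max(1, len(configs) // eta)]]
        budget *= eta  # survivors get eta-times more training budget
    return configs[0]

best = successive_halving()
```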
  Results: IBM assesses the performance of the various training components of NeuNetS by reporting the time in GPU hours to train various networks to reasonable accuracy using it; this isn’t a hugely useful metric for comparison, especially since IBM neglects to report scores for other systems.
  Why this matters: Papers like this are interesting for a couple of reasons: one) they indicate how more traditional companies such as IBM are approaching newer AI techniques like neural architecture search, and two) they indicate how companies are going to package up various AI techniques into integrated products, giving us the faint outlines of what future “Software 2.0” operating systems might be like.
  Read more: NeuNetS: An Automated Synthesis Engine for Neural Network Design (Arxiv).

Google releases Natural Questions dataset to help make AI capable of dealing with curious humans:
…Google releases ‘Natural Questions’ dataset to make smarter language engines, announces Challenge…
Google has released Natural Questions, a dataset containing around 300,000 questions along with human-annotated answers from Wikipedia pages; it also ships with a rich subset of 16,000 example questions where answers are provided by five different annotators. The company is also hosting a challenge to see if the combined brains of the AI research community can “close the large gap between the performance of current state-of-the-art approaches and a human upper bound”.
  Dataset details: Natural Questions contains 307,373 training examples with single annotations, 7,830 examples with 5-way annotations for development data, and a further 7,842 5-way annotated examples sequestered as test data. The training examples “consist of real anonymized, aggregated queries issued to the Google search engine”, the researchers write.
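  Here is a minimal sketch of how one might iterate over a Natural Questions-style file, assuming a gzipped JSON-lines layout with one example per line; the filename and exact field names are assumptions for illustration rather than the dataset’s confirmed schema:

```python
import gzip
import json

# Sketch of iterating over a Natural Questions-style JSON-lines file.
# Field names ("question_text", "annotations", "short_answers") are
# assumptions based on the paper's description, used here for illustration.

def iter_examples(path):
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)

def has_short_answer(example):
    # Count an example as answerable if any annotator gave a short answer.
    return any(ann.get("short_answers") for ann in example.get("annotations", []))

# for ex in iter_examples("nq-dev-sample.jsonl.gz"):  # hypothetical filename
#     print(ex["question_text"], has_short_answer(ex))
```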
  Challenge: Google is also hosting a ‘Natural Questions’ challenge, where teams can submit well-performing models to a leaderboard.
  Why this matters: Question answering is a longstanding challenge for artificial intelligence; if the Natural Questions dataset is sufficiently difficult, then it could become a new benchmark the research community uses to assess progress.
  Compete in the Challenge (‘Natural Questions’ Challenge website).
  Read more: Natural Questions: a New Corpus and Challenge for Question Answering Research (Google AI Blog).
  Read the paper: Natural Questions: a Benchmark for Question Answering Research (Google Research).

~ EXTREMELY 2019 THINGS, AN OCCASIONAL SERIES ~
Oh deer, there’s a deer in the data center!
  Witness the deer in the data center! (Twitter).

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net

Disentangling arguments for AI safety:
Many leading AI experts believe that AI safety research is important. Richard Ngo has helpfully disentangled several distinct arguments that people use to motivate this concern.
   Utility maximizers: An AGI will maximize some utility function, and we don’t know how to specify human values in this way. An agent optimizing hard enough for any goal will pursue certain sub-goals, e.g. acquiring more resources, preventing corrective actions. We won’t be able to correct misalignment, because human-level AGI will quickly gain superintelligent capabilities through self-improvement, and then prevent us from intervening. Therefore, absent a proper specification of what we value before this point, an AGI will use its capabilities to pursue ends we do not want.
  Target loading problem: Even if we could specify what we want an AGI to do, we still do not know how to make an agent that actually tries to do this. For example, we don’t know how to split a goal into sub-goals in a way that guarantees alignment.
  Prosaic alignment problem: We could build ‘prosaic AGI’, which has human-level capabilities but doesn’t rely on any breakthrough understandings in intelligence (e.g. by scaling up current ML methods). These agents will likely become the world’s dominant economic actors, and competitive pressures would cause humans to delegate more and more decisions to these systems before we know how to align them adequately. Eventually, most of our resources will be controlled by agents that do not share our values.
  Human safety: We know that human rationality breaks down in extreme cases. If a single human were to live for billions of years, we would expect their values to shift radically over this time. Therefore even building an AGI that implements the long-run values of humanity may be insufficient for creating good futures.
  Malicious uses: Even if AGI always carries out what we want, there are bad actors who will use the technology to pursue malign ends, e.g. terrorism, totalitarian surveillance, cybercrime.
  Large impacts: Whatever AGI ends up looking like, there are at least two ways we can be confident it will have a very large impact: it will bring about at least as big an economic jump as the industrial revolution, and we will cede our position as the most intelligent entity on earth. Absent good reasons, we should expect either of these transitions to have a significant impact on the long-run future of humanity.
  Read more: Disentangling arguments for the importance of AI safety (Alignment Forum).

National Security Commission on AI announced:
Appointments have been announced for the US government’s new advisory body on the national security implications of AI. Eric Schmidt, former Google CEO, will chair the group, which includes 14 other experts from industry, academia, and government. The commission will review the competitive position of the US AI industry, as well as issues including R&D funding, labor displacement, and AI ethics. Their first report is expected to be published in early February.
  Read more: Former Google Chief to Chair Government Artificial Intelligence Advisory Group (NextGov).

Tech Tales:

Unarmored In The Big Bright City

You went to the high street naked?
Naked. As the day I was born.
How do you feel?
I’m still piecing it together. I think I’m okay? I’m drinking salt water, but it’s not so bad.
That’ll make you sick.
I know. I’ll stop before it does.
Why are you even drinking it now?
I was naked. Something like this was bound to happen.

I take another sip of saltwater. Grimace. Swallow. I want to take another sip but another part of my brain is stopping me. I dial up some of the self-control. Don’t let me drink more saltwater I say to myself: and because of my internal vocalization the defense systems sense my intent, kick in, and my trace thoughts about salt water and sex and death and possibility and self – they all dissolve like steam. I put the glass down. Stare at my friend.

You okay?
I think I’m okay now. Thanks for asking about the salt water.
I can’t believe you went there naked and all we’re talking about is salt water.
I’m lucky I guess.

That was a few weeks and two cities ago. Now I’m in my third city. This one feels better. I can’t name what is driving me so I can’t use my defense systems. I’ve packed up and moved apartments twice in the last week. But I think I’m going to stay here.

So, you probably have questions. Why am I here? Is it because I went to the high street naked? Is it because of things I saw or felt when I was there? Did I change?
  And I say to you: yes. Yes to all. I’m probably here because of the high street. I did see things. I did feel things. I did change.

Was there a particularly persuasive advert I was exposed to – or several? Did a few things run in as I had no defenses and try to take me over? Was it something I read on the street that changed my mind and made me behave this way? I cannot trust my memories of it. But here are some traces:
   – There was a billboard that depicted a robot butler with the phrase: “You’re Fired.”
   – There was an augmented reality store display where I saw strange creatures dancing around the mannequins. One creature looked like a spider and was wearing a skirt. Another looked like a giant fish. Another looked like a dog. I think I smelled something. I’m not sure what.
   – There was a particular store in the city that was much more interesting. There were creatures that were much less humanoid. I’m not sure if they were actually for sale. They were like dolls. I remember the smell. They smelled of a lotion. I’m not sure if they were human.
   – On the street, I saw a crowd of people clustered around a cart, selling something. When I got closer I saw it was selling a toy that was lightweight and had wheels. I asked the guy selling it what it was for. He pulled out a scarlet letter and I saw it was for a girl. He said she liked it. I stood there and watched him make out with the girl. I didn’t have any defense systems at the time. I don’t know what that toy was for. I don’t know if I was attracted to it or not.

I have strange dreams, these days. I keep wanting to move to other cities. I keep having flashbacks – scarlet letters, semi-humanoid dolls. Last night I dreamed of something that could have been a memory – I dreamed of a crane in the sky with a face on its side, advertising a Chinese construction company and telling me phrases so persuasive that ever since I have been compelled to move.

Tonight I expect to dream again. I already have the stirrings of another memory from the high street. It starts like this: I’m walking down a busy High Street in the rain. There are lots of people in the middle of the street, and a police car slows down, then drives forward a couple of paces, then comes to a stop. I hear a cry of distress from a woman. I look around the corner, and there’s a man slumped over in a doorway. He’s got a knife in his hand, and it’s pointed at me. He turns on me. I grab it and I stab him in the heart and… I die. The next day I wake up. All my belongings are in a box on the floor. The box has a receipt for the knife and a note that says ‘A man, his heart turned to a knife.’

I am staying in a hotel on the High Street and all my defenses are down. I am not sure if this memory is my present or my past.

Things that inspired this story: Simulations, augmented reality, hyper-targeted advertising, AI systems that make deep predictions about given people and tailor experiences for them, the steady advance of prosthetics and software augments we use to protect us from the weirder semi-automated malicious actors of the internet.