Import AI: Issue 70: Training conversational AI with virtual dungeons, video analysis and AI-based surveillance, and the virtues of paranoid AI

by Jack Clark

Welcome to Import AI, subscribe here.

Amazon joins Microsoft and Facebook in trying to blunt TensorFlow’s ecosystem lead:
…It takes a significant threat to bring these corporate rivals together…
Amazon Web Services will help develop the ONNX (Open Neural Network Exchange) format, which provides a standard format for porting neural network models developed in one framework into another. Its first contribution is ONNX-MXNet, which will make it possible for MXNet to ingest and run ONNX-format models trained in other frameworks, like Facebook’s PyTorch and Caffe2, and Microsoft’s CNTK, etc.
– Read more: Announcing ONNX Support for Apache MXNet.
– ONNX-MXNet Github.

ImportAI newsletter meetup at NIPS 2017: If you’re going to NIPS 2017 would you be interested in drinking beer/coffee and eating fried food with other Import AI aficionados? I’d like to do a short series of three minute long talks/provocations (volunteers encouraged!) about AI. Eg: How do we develop common baselines for real-world robotics experiments? What are the best approaches for combating poor data leading to bias in AI systems? What does AI safety mean? How do we actually develop a thesis about progress in AI and measure it?
– Goal: 8-10 talks, so two ~15 minute sections, with breaks in between for socializing.
– If that sounds interesting, vote YES on this poll on Twitter here.
– If you’re interested in speaking at the event, then please email me here! I’ve got a couple of speakers lined up already and think doing 10 flash talks (aka 30 mins, probably in two 15 min sections with socializing in between) would be fun.
If you’re interested in sponsoring the event (aka, propping up a bar/restaurant small tab in exchange for a logo link and one three minute talk) then email me.

Hillary Clinton on AI: US currently “totally unprepared” for its impact:
Former Presidential hopeful says her administration would have sought to create national policy around artificial intelligence…
Hillary Clinton is nervous about the rapid rate of progression in artificial intelligence and what it means for the economy. “What do we do with the millions of people who will no longer have a job?” she said in a recent interview. “We are totally unprepared for that.”
  While other countries around the world ranging from the United Kingdom to China are spinning up the infrastructure to enact national policy and strategy around artificial intelligence, the United States is quiet from an AI policy standpoint. Things may have been different had HRC won: “One thing I wanted to do if I had been President was to have a kind of blue ribbon commission with people from all kinds of expertise coming together to say what should America’s policy on artificial intelligence be?” Hillary says.
– Read more from the interview here (transcript available).

Getting AI to be more cautious: Where do we go next, and can we change our minds if we don’t like it?
…Technique trains AI systems to explore their available actions more cautiously, avoiding committing quite so many errors that are very difficult or impossible to recover from…
Researchers with Google Brain, the University of Cambridge, the Max Planck Institute for Intelligent Systems, and UC Berkeley, have proposed a way to get robots to more rapidly and safely learn tasks.
  The idea is to have an agent jointly learn a forward policy and a reset policy. The forward policy maximizes the task reward, and the reset policy tries to figure out actions to take to reset the environment to a prior state. This leads to agents that learn to avoid risky actions that could irrevocably commit them to something.
“Before an action proposed by the forward policy is executed in the environment, it must be “approved” by the reset policy. In particular, if the reset policy’s Q value for the proposed action is too small, then an early abort is performed: the proposed action is not taken and the reset policy takes control,” they write.
The research tests the approach on a small number of simulated robotics tasks, like figuring out how to slot a peg into a hole, that can be more time-consuming to learn with traditional reinforcement learning approaches.
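The early-abort rule quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the toy corridor with a "cliff" at position 5, the hand-written Q function, and the 0.5 threshold are all assumptions made up for the example. In the paper the reset policy and its Q values are learned.

```python
from typing import Callable, Tuple

CLIFF = 5  # hypothetical irreversible state in a toy 1-D corridor


def forward_policy(state: int) -> int:
    """Toy forward policy: always push right, toward the task reward."""
    return +1


def reset_policy(state: int) -> int:
    """Toy reset policy: step back toward the start position 0."""
    return -1 if state > 0 else 0


def reset_q(state: int, action: int) -> float:
    """Hand-written stand-in for the reset policy's Q value.

    Returns 0.0 if the action would reach the cliff (no reset possible
    afterwards) and 1.0 otherwise. A real system would learn this.
    """
    return 0.0 if state + action >= CLIFF else 1.0


def early_abort_step(
    state: int,
    forward: Callable[[int], int],
    q_reset: Callable[[int, int], float],
    reset: Callable[[int], int],
    q_min: float,
) -> Tuple[int, bool]:
    """Return (action_to_execute, aborted).

    The forward policy proposes an action. If the reset policy's Q value
    for that action is below q_min, the proposal is discarded and the
    reset policy takes control instead.
    """
    proposed = forward(state)
    if q_reset(state, proposed) < q_min:
        return reset(state), True
    return proposed, False


if __name__ == "__main__":
    for s in (2, 4):
        action, aborted = early_abort_step(
            s, forward_policy, reset_q, reset_policy, q_min=0.5
        )
        print(f"state={s} action={action:+d} aborted={aborted}")
```

Far from the cliff the forward policy's proposal goes through; one step away from it, the reset policy vetoes the move and steps back instead.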
– Read more: Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning.
This work is reminiscent of a recent paper from Facebook AI Research (covered in Import AI #36), where a single agent has two distinct modes, one of which tries to do a task, and the other of which tries to reverse a task.
– Read more: Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play.

What’s old is the new, new thing: Facebook proposes multi-user dungeons for smarter AI systems:
Can we make the data collection process more interesting to the humans providing us with data and can this approach lead to more effective datasets for training AI?…
How can you train an AI system to seamlessly execute a series of complex commands in response to text input from a user? Until we have agents capable of parsing open-ended natural language conversations – something that feels extremely far away from a research standpoint – we’re going to have to come up with various hacks to develop smart systems that work in somewhat more narrow domains.
  One research proposal by Facebook AI Research – Mechanical Turker Descent (MTD) – is to better leverage the smarts inside of humans by re-framing human data collection exercises to be more game-like and therefore more engaging. Facebook has recently been paying mechanical turkers to train AI systems by writing various language/action pairs in the context of an iterative game played against other mturkers.
The system works like this: mturkers compete with each other to train a simulated dragon that has to perform a sequence of actions in a dungeon. During each round the mturkers enter a bunch of language/action pairs and receive feedback on how hard or easy the AI agents find the resulting command/language sequences. At the end of the round the various agents trained on the datasets created by the humans are pitted against each other, and the mturker who trained the top-scoring agent on a held-out test dungeon receives a monetary reward. This incentivizes the mturkers to optimize the language:action pairs they produce so that they fall into the sweet spot of difficulty for the AI: not so easy that it won’t learn the requisite skills to do well in the final competition, but not so hard that it’s unable to learn something useful. This has the additional benefit of automatically creating a hard-to-game curriculum curated and extended by humans.
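The end-of-round scoring described above can be sketched as a small function. This is a rough illustration under stated assumptions, not Facebook's implementation: `run_mtd_round`, `toy_train` and `toy_evaluate` are hypothetical names, and the stand-in "training" simply counts distinct examples so that the example runs without any ML library.

```python
from typing import Callable, Dict, List, Tuple, TypeVar

Model = TypeVar("Model")


def run_mtd_round(
    datasets: Dict[str, List[str]],
    train: Callable[[List[str]], Model],
    evaluate: Callable[[Model], float],
) -> Tuple[str, Dict[str, float]]:
    """Train one agent per worker and score each on a held-out test.

    Returns the winning worker id (the one who would receive the reward)
    and every worker's score.
    """
    scores = {worker: evaluate(train(data)) for worker, data in datasets.items()}
    winner = max(scores, key=scores.get)
    return winner, scores


def toy_train(data: List[str]) -> int:
    """Stand-in for training: the 'model' is just the number of distinct examples."""
    return len(set(data))


def toy_evaluate(model: int) -> float:
    """Stand-in for scoring an agent on a held-out test dungeon."""
    return float(model)


if __name__ == "__main__":
    datasets = {
        "turker_a": ["go north", "go north"],
        "turker_b": ["go north", "pick up sword", "open door"],
    }
    winner, scores = run_mtd_round(datasets, toy_train, toy_evaluate)
    print(winner, scores)
```

A real round would plug an actual training loop and a held-out dungeon into `train` and `evaluate`. The part the incentive scheme depends on is that each worker's data is judged only through the agent it produces.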
Technologies used: The main contribution of this research paper is the technique for training systems in this way, but there’s also a technological contribution: a new component called AC-Seq2Seq. This system “shares the same encoder architecture with Seq2Seq, in our case a bidirectional GRU (Chung et al., 2014). The encoder encodes a sequence of word embeddings into a sequence of hidden states. AC-Seq2Seq has the following additional properties: it models (i) the notion of actions with arguments (using an action-centric decoder), (ii) which arguments have been used in previous actions (by maintaining counts); and (iii) which actions are possible given the current world state (by constraining the set of possible actions in the decoder),” they write.
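Property (iii), constraining the decoder to actions that are possible in the current world state, is commonly done by masking out invalid actions before normalizing the decoder's scores. The sketch below shows that general idea only. The function name, the action strings and the scores are invented for illustration, and the paper may implement its constraint differently.

```python
import math
from typing import Dict, Set


def constrained_action_distribution(
    logits: Dict[str, float], valid_actions: Set[str]
) -> Dict[str, float]:
    """Softmax over decoder scores, restricted to currently valid actions.

    Actions not allowed by the world state get probability 0.0; the
    remaining probability mass is renormalized over the valid ones.
    """
    allowed = {a: s for a, s in logits.items() if a in valid_actions}
    if not allowed:
        raise ValueError("no valid action available in this world state")
    top = max(allowed.values())  # subtract the max for numerical stability
    exps = {a: math.exp(s - top) for a, s in allowed.items()}
    total = sum(exps.values())
    return {a: (exps[a] / total if a in exps else 0.0) for a in logits}


if __name__ == "__main__":
    # Hypothetical decoder scores; "hit dragon" scores highest but is not
    # possible right now (say, the agent holds no weapon).
    logits = {"go north": 2.0, "get sword": 1.0, "hit dragon": 5.0}
    valid = {"go north", "get sword"}
    print(constrained_action_distribution(logits, valid))
```

The highest-scoring action is ruled out entirely, so the agent can only ever pick something the dungeon actually permits.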
Results: The main result Facebook found is that “interactive learning based on MTD is more effective than learning with static datasets”.
– Read more here: Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent.

Former General Motors product development czar: Autonomous cars mean the death of auto companies, dealerships, and so on:
…And it was as though all at once a thousand small violins played into the seamless, efficient, traffic jam-free void…
One of the nice things about getting old is that, thanks to your (relatively) short expected lifespan, you can dispense with the reassuring truths that most people traffic in out of a misplaced sense of duty and/or paternalism. So it’s worth reading this article by an automotive industry veteran about the massive effect self-driving cars are likely to have on the existing automotive industry. The takeaway is that traditional carmakers will be ruthlessly commoditized, and their products will then be rebranded by platforms like Amazon and/or ridesharing companies like Uber and Lyft, much like how the brands of electronics component manufacturers are subsumed by the brands of companies like Apple, Google, Samsung, and so on, whose products they enable.
  “For a while, the autonomous thing will be captured by the automobile companies. But then it’s going to flip, and the value will be captured by the big fleets. The transition will be largely complete in 20 years. I won’t be around to say, “I told you so,” though if I do make it to 105, I could no longer drive anyway because driving will be banned. So my timing once again is impeccable.”
– Read more: Bob Lutz: Kiss the good times goodbye.

UK government launches AI center:
National advisory body could be a template of things to come…
The UK government has announced plans to create a national advisory body for ‘Data Ethics and Innovation’, focused on “a new Centre for Data Ethics and Innovation, a world-first advisory body to enable and ensure safe, ethical innovation in artificial intelligence and data-driven technologies”. There’s very little further information about it in the budget itself (PDF), so watch this space for more information.
– Read more: Welcoming new UK AI Centre (the Centre for the Study of Existential Risk).
– The Register notes that the UK already has a vast number of government advisory bodies focused in some sense on ‘data’, so it’ll be a year or two before we can pass judgement on whether this center is going to be effective or just another paper-producing machine.

*** The Department of Interesting AI Developments in China ***

Chinese researchers combine Simple Recurrent Units (SRUs) with ResNets for better action recognition:
Relatively simple system outperforms other deep learning-based ones, though struggles to attain performance of feature-based systems…
Researchers with Beijing Jiaotong University and the Beijing Key Laboratory of Advanced Information Science and Network Technology have taken two off-the-shelf deep learning components (residual networks and simple recurrent units) and combined them for an action recognition system that gets competitive results on classifying actions on the UCF-101 dataset (accuracy: ~81 percent), and the HMDB-51 dataset (accuracy: ~50 percent). The researchers trained their system on four NVIDIA Titan-X cards and programmed it in PyTorch.
  This is a further demonstration of the inherent generality of the sorts of components being built by the AI community, where pre-existing components from a common (and growing!) toolset can be integrated with one another to unlock new or better capabilities. As Robin Sloan says: ‘Snap. Snap. Snap!’
– Read more here: Multi-Level ResNets with Stacked SRUs for Action Recognition.

AI and ‘dual use’:
The point of AI technologies is that they are omni-use: a system that can be taught to identify specific behaviors from videos can be trained on new datasets to identify different behaviors, whether specific movements of soldiers, or sudden acts of violence in crowds of people, or other aberrations.
  The different ways these technologies can be used were illustrated by Andrew Moore, dean of computer science at Carnegie Mellon University, at a recent talk at the Center for a New American Security in Washington DC. Moore showed a video of a vast crowd of people dancing in the middle of an open air square. Each person in the video was overlaid with a stick figure identifying the key joints in their body, and the stick figure would track the person’s movement with a high level of accuracy. Why is this useful? You could use this to run automated surveillance systems that could be trained to spot specific body movements, creating systems that could, say, identify one dancer in a crowd of hundreds reaching down into a bag on the ground, Moore said.
– Watch the Andrew Moore talk here (video).

Chinese surveillance startup SenseTime plans IPO, opening US development office:
…Facial recognition company aims to build AI platform, rather than specific one-off services…
Chinese surveillance AI startup SenseTime – backed by a bunch of big investors like Qualcomm, as well as Chinese government-linked investment funds – will open a US research and development center next year and is considering an initial public offering as well. The company dabbles in AI in a bunch of different areas, including in video surveillance and high-performance computing (and the intersection thereof).
  “Our target is definitely not to create a small company to be acquired, but rather a ‘platform company’ that dominates with original core technology like Google and Facebook,” SenseTime CEO Tang Xiaoou told Reuters. “With Facebook (FB.O) we compete in facial recognition; with Google (GOOGL.O) it is visual object recognition, sorting 1,000 categories of objects.”
– Read more: China’s SenseTime plans IPO, U.S. R&D center as early as 2018.

Tech Tales:

[Detroit, 2028:]

When the crowds at car racing shows started to dwindle Caleb created an internet meme saying ‘pity the jockeys’, showing an old black and white photograph of some out of work horse racers from the mid-20th Century. He got a few likes and a few comments from people expressing surprise at just how rapidly the advent of self-driving technologies had fundamentally changed racing: courses had first become bigger, then the turns had become tighter, then the courses found their human-reflex limit and the crash rates temporarily went up, before an entirely new car racing league formed where humans were banned from the vehicles – self-driving algorithms only!

But now the same thing was happening to the drone racing league, and Caleb was uneasy – he’d made decent money out of racing in the past few years, pairing a childhood fascination with immersive, virtual reality-based computer games, with just enough programming talent to be able to take standard consumer drones from DJI, root them, then augment their AI flight systems with components he collected from GitHub. He’d risen up in the leagues and was now sponsored by many of the consumer drone companies. But things were about to change, he could sense.

“So,” the course designer continued, “We’re tightening the placement of columns for more twists and turns – more exciting, you know – and we’re installing way more cameras along the course. Plus, there’s going to be more fire, check it out,” he took out his phone, opened the ‘Detroit-Drone-Course-BETA!’ app, and pressed a small flame icon. They both heard a slight whoosh, then flames erupted from angled pipes at some of the tightest turns in the course. “So obviously it’s possible to fly through here but you’re going to have to be really good, really fast – right at the limit.”
  “The limit?” Caleb said.
  “Of human reflexes,” said the designer. “I figure that we can race on these courses for a year or two and that way we’ll be able to generate enough data to train the AI systems to handle these turns. Then we can add more flames, tighten the curves more, go full auto, and clean up in the market. Early mover advantage. Or… fast mover advantage, I should say. Haha.”
  “Yeah,” Caleb said, forcing a chuckle, “haha. I guess we’ll just be the human faces for the software.”
  “Yup,” the designer said, beaming. “Just imagine the types of pitch we can build when there are no human competitors on the course at all!”

Technologies that inspired this story: Drones, DJI, work by NASA’s Jet Propulsion Lab on developing AI-based flight systems for racing drones (check out the video!).