Import AI: Issue 53: Free data for self-driving cars, why neural architecture search could challenge AI startups, and a new AI Grant.

by Jack Clark

Help wanted: I’m looking for a PhD student with an interest in AI safety to work on a survey project. If this sounds interesting to you, please email me at

Amazon Picking Challenge: Ozzie team wins with ‘Cartman’ robot:
…Several years ago Amazon acquired robot startup Kiva Systems, then proceeded to fill its warehouses with little orange hockey-puck-shaped robots. Amazon now has over 45,000 of these robots, which ferry shelves containing pallets of goods to human workers who pick them out of the boxes and place them in parcels. Now, Amazon wants to automate the human picking part of the process as well.
…It’s a hard problem, demanding robots far smarter than those we have today that are able to neatly pick up and place arbitrary objects from a potential pool of millions. Amazon has been running a competition for three years, hewing closer and closer (but still not there) to real-world conditions as it goes. (This year, Amazon forced the robots to work in more cramped environments than before, and revealed some of the to-be-picked objects only 30 minutes before the beginning of the competition, penalizing systems and teams incapable of improvisation.)
…This year, the win goes to a team from the Australian Center for Robotic Vision, which won the competition by scoring 272 points on the combined stowing and picking task. They’ll get an $80,000 prize – an amazingly cheap ‘cost’ of research uniquely relevant to Amazon’s business.
…The robot has 6 axes and 3 degrees of articulation and has two different hands – a pincer grip and a suction cup – to help it tackle the millions of objects seen in a typical high-trafficked general warehouse like those operated by Amazon.
…Read more about the winning entry on the Queensland University website.
More information on the Amazon Robot Picking challenge here.
But don’t get too excited – the robots still move incredibly slowly; it could be five years till the technology advances enough to truly solve the competition, according to this Wired article.

UK government launches £23 million autonomous vehicle competition:
…The UK government has launched a research and development project focused on autonomous vehicles and expects to fund projects that cost between £500,000 and £4 million. Each project is expected to last between 18 and 30 months.
…“The aim is to support concepts that will become future core technologies in 2020 to 2025,” the government writes.
…Projects should focus on many types of vehicles and should develop the tech to support level 4 automation of the vehicle (the second highest level according to these SAE definitions) and/or enhance vehicle connectivity.
…Intriguingly, projects are expected to support the “principle of shared learning with other projects” and will have the chance to exchange ideas at workshops organized every 6 months.
…Applicants should be a UK-based business and expect to carry out their work in the UK.
Find out more information on the grant here.

…Mozilla has launched Project Common Voice, an initiative to gather and validate a vast amount of human voice data, creating an (eventually) open data repository to let people compete against the vast troves of data held by Google, Microsoft, Facebook, and so on.
‘Donate your voice’ here. Hear hear!

RL without the bells and whistles and with far, far better performance:
…A new paper from DeepMind gets state-of-the-art reinforcement learning results not through the addition of anything ferociously complicated, but instead through a rethink about how to learn from the environment. The new approach sees DeepMind try to learn the distribution of the return received by the RL agent.
…Using the new method, the researchers attain state of the art scores across the Atari corpus, creating new fundamental questions about RL and how it works in the process.
…Read more in: A Distributional Perspective on Reinforcement Learning.
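…What does ‘learning the distribution of the return’ look like in practice? One way to make it concrete is to represent the return as probabilities over a fixed set of values, then shift and rescale that distribution with each reward and project it back onto the fixed values. The sketch below is a minimal illustration under that assumption, not DeepMind’s code: the function name, the five-value support, and the example numbers are all hypothetical.

```python
def project_distribution(probs, support, reward, gamma):
    """Apply a Bellman-style update to a categorical return distribution.

    Illustrative assumptions: `support` is an evenly spaced, increasing list
    of possible return values and `probs` holds one probability per value.
    """
    n = len(support)
    v_min, v_max = support[0], support[-1]
    delta = (v_max - v_min) / (n - 1)
    projected = [0.0] * n
    for p, z in zip(probs, support):
        # Shift and scale each possible return, clipped to the support range.
        tz = min(v_max, max(v_min, reward + gamma * z))
        # Fractional position of the updated value on the fixed grid.
        pos = min(max((tz - v_min) / delta, 0.0), n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # Split the probability mass between the two neighbouring grid values.
        projected[lo] += p * (1.0 - frac)
        projected[hi] += p * frac
    return projected


# All mass on a return of 2; after a reward of 0.5 it is shared by 2 and 3.
updated = project_distribution(
    [0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 2.0, 3.0, 4.0], reward=0.5, gamma=1.0
)
```

An agent trained this way would then be pushed to match its predicted distribution to the projected one, rather than matching a single expected value.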

Chinese startups win ImageNet and, this week, WebVision:
…Chinese startup Malong AI Research has won the ‘WebVision’ challenge, a competition to classify images from a set of 2.4 million images drawn from Flickr and Google Image Search. The startup achieved a top-5 error rate of around 5.2% (that’s about two to three percentage points higher than the current leader on the ImageNet dataset).
Check out the results here.
…The startup used a proprietary technique to split the data into ‘clean’ and ‘noisy’ data, then trained an algorithm first solely on the clean data, then combined both the clean and noisy data to train another algorithm. This win follows last week’s ImageNet competition results, in which Chinese startups dominated. A further sign that the nation is moving more into fundamental research, as well as applied AI.
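…For anyone unfamiliar with the metric: a top-5 error counts an image as wrong only if the true label is missing from the model’s five highest-scoring guesses. A minimal sketch of that calculation follows; the function name and the toy scores are hypothetical, and this is not the challenge’s official scoring code.

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is not among the k best-scored classes."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


# Two toy images with six classes each: the first is a hit, the second a miss.
scores = [
    [0.10, 0.30, 0.20, 0.15, 0.05, 0.20],
    [0.30, 0.20, 0.20, 0.15, 0.10, 0.05],
]
error = top_k_error(scores, true_labels=[1, 5])
```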

Mini-Me Neural Architecture Search from Google:
…Finding real-world analogues of the types of tasks modern RL algorithms excel at – gaining superhuman scores on vintage video games, piloting improbable-looking simulated machines, solving mazes, and so on – is a challenge. Perhaps one area could be in using RL to automate the design of neural networks themselves. After all, instead of designing our own AI systems, wouldn’t it be better to have AI design them for us? That’s the intuition behind techniques like Neural Architecture Search, a machine learning approach where you try to get an algorithm to come up with its own ways of arranging complex sets of neural networks. The technology has already been used to come up with a best-in-class image recognition algorithm, but at the cost of a vast amount of resources – one Google experiment involved over 800 GPUs being used for over two months.
…Now, Google is trying to do Neural Architecture Search on a budget. The new approach lets them take a dataset – in this case CIFAR-10 – and run neural architecture search over it in such a way that the architecture is independent from the depth of the network and the size of the input images. What this results in is an architecture specialized for image classification, but not dependent on the structure of the underlying visual data. They’re then able to take this evolved architecture and transfer it to run on the significantly larger ImageNet dataset. The results are encouraging: architectures designed by the system get 82.3% top-1 accuracy – “0.8% better in top-1 accuracy than the best human-invented architectures”, the researchers write.
…Most intriguing: the systems score very highly, while having fewer parameters than other equivalently high-scoring systems, suggesting the NAS approach may yield more efficient networks than those designed by a human alone.
Read more: Learning Transferable Architectures for Image Recognition
Google isn’t the only one trying to make techniques like neural architecture search more efficient.
…New research from Shanghai Jiao Tong University and University College London uses RL to train an agent to tweak existing neural network architectures, as well as to initialize new networks with different parameterizations based on pre-existing networks. The second part holds particular promise, as they use this ‘Net2Net’ technique to substantially cut the resources required to evolve a new, high-performance network.
…In one experiment, the researchers start with a network that gets about 73 percent accuracy on the CIFAR-10 dataset. They then employ an RL agent to explore new network architectures; once they’ve gathered 160 of these they pick the one with the best validation accuracy, then continue to train it. They then employ another RL agent to try to widen this network, then perform the same pick&train process, then for the final stage use an RL agent to add further depth to the network, then repeat. The result: A network with a test error rate of around 5.7%, comparable to many high-performing networks (though not state of the art).
…Read more in: Reinforcement Learning for Architecture Search by Network Transformation.
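…Why is widening an existing network cheap? Because a Net2Net-style widening can be function-preserving: the wider network starts out computing exactly what the narrower one did, so training resumes instead of restarting. The sketch below shows only that transformation for a toy two-layer ReLU network without biases; the RL agent that decides when to widen is not shown, and every name and number here is a hypothetical stand-in.

```python
import random


def forward(x, w1, w2):
    """Toy network: hidden = relu(x . w1), output = hidden . w2 (no biases)."""
    hidden = [
        max(0.0, sum(xi * w1[i][j] for i, xi in enumerate(x)))
        for j in range(len(w1[0]))
    ]
    return [sum(h * w2[j][k] for j, h in enumerate(hidden)) for k in range(len(w2[0]))]


def widen(w1, w2, new_width, rng):
    """Widen the hidden layer while keeping the network's output unchanged."""
    old_width = len(w1[0])
    # Keep every original unit, then add copies of randomly chosen units.
    mapping = list(range(old_width)) + [
        rng.randrange(old_width) for _ in range(new_width - old_width)
    ]
    counts = [mapping.count(j) for j in range(old_width)]
    # Incoming weights are copied as-is for each (possibly duplicated) unit.
    new_w1 = [[row[j] for j in mapping] for row in w1]
    # Outgoing weights are divided by the copy count, so the copies sum to the original.
    new_w2 = [[w / counts[j] for w in w2[j]] for j in mapping]
    return new_w1, new_w2


rng = random.Random(0)
w1 = [[0.5, -1.0], [1.5, 2.0]]  # 2 inputs -> 2 hidden units
w2 = [[1.0], [-2.0]]            # 2 hidden units -> 1 output
wide_w1, wide_w2 = widen(w1, w2, new_width=5, rng=rng)
```

Running `forward` on the original and widened weights with the same input should give matching outputs, up to floating-point rounding.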

Free data: NEXAR releases 50,000 self-driving car photos:
Dashcam app-maker Nexar has released NEXET, a dataset “consisting of 50,000 images from all over the world with bounding box annotations of the rear of vehicles collected from a variety of locations, lighting, and weather conditions”. (Bonus: it includes day and night scenes, as well as roughly 2,000 photos taken at twilight.)
…Interested parties can also enter a related competition that challenges them to design systems to draw bounding boxes around nearby cars, to help NEXAR improve its Forward Vehicle Collision Warning feature.
You can read more about the competition here.

Distributed AI development 2.0 with the AI Grant:
Nat Friedman (cofounder of Xamarin and now an exec at Microsoft) and Daniel Gross, a partner at Y Combinator, have launched the AI Grant 2.0, a scheme to give AI initiatives a boost through a potent combination of money, cloud credits, data-labeling credits, and support. Applications are due by August the 25th, so take a look if you’re keen to start a project.

Amazon releases its ‘Sockeye’ translation software…
…Amazon has announced Sockeye, software (and an associated set of AWS services) for training neural translation models. The software runs on Amazon’s own ‘MXNet’ AI framework. Sockeye developers can mix ‘declarative and imperative programming styles through the symbolic and imperative MXNet APIs’, Amazon says. They can also use in-built data parallelism tech to train models on multiple GPUs at once.
…Sockeye supports standard sequence-to-sequence modelling, as well as newer technologies like residual networks, layer normalization, cross-entropy label smoothing, and more.
You can read more on the announcement at the AWS blog here.
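…One item from that feature list is easy to show in a few lines: smoothing the cross-entropy targets (usually called label smoothing), where a little probability mass moves from the correct word onto the alternatives so the model is not pushed toward total certainty. This is a generic sketch and not Sockeye’s implementation; the function name, the epsilon value, and the choice to spread mass over the incorrect classes only are assumptions.

```python
import math


def smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a smoothed target distribution.

    Assumption for illustration: the true class gets 1 - epsilon and the
    remaining epsilon is shared equally among the other classes.
    """
    # Numerically stable log-softmax.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    log_probs = [v - log_norm for v in logits]

    n = len(logits)
    loss = 0.0
    for i, log_p in enumerate(log_probs):
        weight = 1.0 - epsilon if i == target else epsilon / (n - 1)
        loss -= weight * log_p
    return loss


# With epsilon=0 this reduces to ordinary cross-entropy.
plain = smoothed_cross_entropy([2.0, 0.5, -1.0], target=0, epsilon=0.0)
smooth = smoothed_cross_entropy([2.0, 0.5, -1.0], target=0, epsilon=0.1)
```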

$50 million for AGI startup Vicarious:
Vicarious, an artificial intelligence startup that uses ideas inspired by neuroscience to create clever software, has raised $50 million from Khosla Ventures. That takes the company’s total cash raised to date to around $120 million. Though it has begun publishing more research papers about its approach recently – most recently, the ‘Schema Networks’ paper – the company is yet to carry out any convincing public demonstration of its technology.

You’ve heard of adversarial pictures, what about adversarial sentences?
…Researchers with Stanford University have taken a look at how robust reading comprehension systems are to deliberately confusing examples and the results are not encouraging.
…In tests, the researchers found that they could drop the accuracy of 16 different models from an average of 75% down to 36% simply by including a misleading (but not directly contradictory) sentence elsewhere in the piece. (Worse, “when the adversary is allowed to add ungrammatical sequences of words, average accuracy on four models decreases further to 7%.”)
…Components used: the Stanford SQuAD dataset (107,785 human-generated reading comprehension questions about Wikipedia articles).
Get the data: the researchers have also released the tools they used to generate confusing sentences (ADDSENT) and to add arbitrary sequences of English words (ADDANY), so researchers can augment their own datasets with these synthetic adversarial examples, then test the robustness of their techniques.
…Read more in: Adversarial Examples for Evaluating Reading Comprehension Systems 

Swansong for ImageNet, as it ascends into Kaggle:
…This year marks the last year of the ImageNet image recognition competition, which helped spur the current AI boom. For a recap of ImageNet, where it came from, and what might come next check out this article from Dave Gershgorn in Quartz.
…Notable: Yann LeCun of Facebook likes to tell a story about how when he used to submit papers involving neural networks to vision conferences he was regularly rejected (the subtext of this story being ‘who is laughing now!’). ImageNet instigator Fei-Fei Li faced the same difficulties, Gershgorn writes. “Li said the project failed to win any of the federal grants she applied for, receiving comments on proposals that it was shameful Princeton would research this topic, and that the only strength of proposal was that Li was a woman,” Gershgorn writes.
…Good candidates for future datasets now that ImageNet is over: The Visual Genome Project, VQA (versions 1 and 2), MS COCO, and others.
…Congratulations on being part of the illustrious ‘rejected by the mainstream scientific community’ club, Fei-Fei. Read more about her eight-year ImageNet journey by referring to the slides here.

New Facebook code – the DrQA will see you now:
…Facebook has released PyTorch code for DrQA, a reading comprehension system designed to work at scale.
…The system takes in natural language questions, then crawls over a vast trove of documents (in Facebook’s case, Wikipedia, though the company says any pool of documents can be plugged into this) to find the answers.
…Components: Facebook’s system contains a document retriever, reader, and a pipeline to link all the hellish web of inter-dependencies together. Developers also have the option of using a ‘Distant Supervision’ system, which lets you augment the system with additional data. “Given question-answer pairs but no supporting context, we can use string matching heuristics to automatically associate paragraphs to these training examples,” Facebook writes.
…Bonus: The system supports Python 3.5 and up – kudos to Facebook for doing their part to move the community into the modern era.
Get the code here. 
…You can find out more about the research involving this system by referring to ‘Reading Wikipedia to Answer Open-Domain Questions’.
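…The distant supervision idea is simple enough to sketch: if you have question-answer pairs but no supporting text, treat any paragraph that contains the answer string as a plausible context and use it as a training example. The snippet below is a bare-bones illustration using case-insensitive substring matching. It is not DrQA’s code, whose heuristics are likely more involved, and the function name and sample data are hypothetical.

```python
def distant_supervision(qa_pairs, paragraphs):
    """Pair each question with every paragraph that mentions its answer."""
    examples = []
    for question, answer in qa_pairs:
        needle = answer.lower()
        for paragraph in paragraphs:
            # Simple string match: a paragraph containing the answer text
            # is assumed (noisily) to support the question.
            if needle in paragraph.lower():
                examples.append(
                    {"question": question, "context": paragraph, "answer": answer}
                )
    return examples


pairs = [("Who wrote Hamlet?", "Shakespeare")]
docs = [
    "Hamlet is a tragedy written by William Shakespeare.",
    "Paris is the capital of France.",
]
training_examples = distant_supervision(pairs, docs)
```

The matches are noisy by design: a paragraph can mention the answer without actually answering the question, which is the trade-off for getting training data without human labeling.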

How can we effectively imprison super-intelligent AI systems while they’re still learning how not to kill us?
That’s the question posed by new research from Cornell, the University of Montreal, and the University of Louisville. The research identifies seven major problems for the whole concept of AI containment, including: the design of the ‘prototype AI container’, an analysis of the AI containment threat model and of the related security vs usability trade-off, coming up with effective tripwires to shut down a runaway system, an analysis of the human factors, identifying new categories of sensitive information created by AI development, and understanding the limits of provably secure communication.
…One of the most captivating ideas in the piece is that we’ll need to be able to fool or trick machines to encourage the right behavior. “A medium containment approach would be to prevent the AGI from deducing that it’s running as a particular piece of software in the world by letting it interact only with a virtual environment, or some computationally well-defined domain, with as few embedded clues as possible about the outside world,” the researchers write.
…You can read more in Guidelines for Artificial Intelligence Containment.

OpenAI Bits&Pieces:

Parameter Noise for Better Exploration: What would happen if we injected noise directly into the parameters of a policy rather than into its action space? The answer to this is: mostly good things. Check out the blog post for more info, or head over to the GitHub Baselines repository for implementations of DQN and DDPG with and without parameter noise.
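…To make the distinction concrete, here is a toy contrast between the two kinds of noise, using a linear policy in plain Python. It is a sketch under stated assumptions and not the Baselines implementation: the function names, the noise scale, and the policy itself are hypothetical. The thing to notice is that action noise gives a different action every time, even for the same observation, while a policy with perturbed parameters behaves consistently until the parameters are perturbed again.

```python
import random


def act(weights, observation):
    """Toy linear policy: the action is a weighted sum of the observation."""
    return sum(w * o for w, o in zip(weights, observation))


def act_with_action_noise(weights, observation, sigma, rng):
    """Action-space noise: fresh noise is added to every action."""
    return act(weights, observation) + rng.gauss(0.0, sigma)


def perturb_parameters(weights, sigma, rng):
    """Parameter-space noise: perturb the weights once, then act normally."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


rng = random.Random(0)
weights = [0.5, -0.2]
observation = [1.0, 2.0]

# Perturb once (say, at the start of an episode), then reuse the same weights.
noisy_weights = perturb_parameters(weights, sigma=0.1, rng=rng)
first = act(noisy_weights, observation)
second = act(noisy_weights, observation)  # identical to `first`

# Action noise gives a new result on every call.
jittered_a = act_with_action_noise(weights, observation, sigma=0.1, rng=rng)
jittered_b = act_with_action_noise(weights, observation, sigma=0.1, rng=rng)
```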

Tech Tales:

[1985-2030. A life.]

You’d hold each other’s hands and walk through fields tall with ‘wildflowers’ sown deliberately by farmers wanting to sell garlands to tourists. There’d be fierce blues and pinks around you and the underlying zum-thruzz of crickets and flies and other insects. Sometimes the air would feel so full of oxygen gassing off from the plants that you’d swear it made your head light, though it could also be that you were young and holding hands and in love. Things happened and you were together for a while, then you got older, separated warmly, moved away. Kept in touch some of the time, arcing in and out of each other’s lives.

She went into robotics – hardcore. Welding goggles, 3D printers, her own series of franken-metaled creations competing in little university competitions, then appearing as props in TV shows, then becoming fascinators for billionaires on the hunt for novelty. You studied feedback – making gloves and shirts and eventually whole sets of clothing that you can put on and pair with a VR headset to feel sunflower stems as you walk through virtual fields, and sense the thrum of invisible water as you stick your hands in a GPU-hammering stream. Teleportation for the body, is how you market it.

You made a lot of money; so did she. But it isn’t enough to heal her when she gets sick – afflicted with one of those illnesses where you pull the arm on the universe fruit machine and the tumblers spin to a set of inscrutable symbols: Sorry – not from this plane, nothing you can do, the big asteroid is coming for you.

So she starts dying, as people tend to, and you keep in touch, work to make your lives meet more despite your own travel (your own partner, life, career). You hack together a solution and start holding hands a lot – she, hooked up to machines in distant hotel rooms, then eventually in a hospital, then a hospice. You, wearing a VR headset and your own custom gloves, sitting on a plane, a train, a self-driving vehicle, lying on a beach. Most places you go you find a way to sync the timezones so you can spend time together, disembodied yet not unreal.

The two of you cry so much that you develop a whole set of jokes about it. ‘Stay hydrated!’ you say to each other instead of goodbye.

After she dies you lie in synthetic fields and on beaches of endless sunsets, visiting the locations where the two of you spent her waning life. Try to reanimate her. Not her – that would be crass. But her pressure, yes. You watch her invisible body walk across a beach, leaving low-res prints in the sand. Feel her hand squeeze yours, gazing over fields a thousand miles in size. You mix extracts of past conversations into the frequencies of synthetic storms and trains and animal calls. Sometimes when you squeeze her hand that is not a hand you think you can feel her squeeze back. Some evenings you sit alone and naked in your bed and stretch out a hand and press it, palm flat against the wall, trying to convince yourself you can feel her pushing back from the other side.

Technologies that inspired this story: Virtual reality, force feedback, the peculiar drum sound from Nick Cave’s ‘Red Right Hand’.

Funny coincidence: I wrote this story over the weekend, and after finishing the edit I saw Cade Metz had published a new story in the NYT on therapists using virtual reality to treat people.

Monthly Sponsor:
Amplify Partners is an early-stage venture firm that invests in technical entrepreneurs building the next generation of deep technology applications and infrastructure. Our core thesis is that the intersection of data, AI and modern infrastructure will fundamentally reshape global industry. We invest in founders from the idea stage up to, and including, early revenue.
…If you’d like to chat, send a note to