Import AI 187: Real world robot tests at CVPR; all hail the Molecule Transformer; the four traits needed for smarter AI systems.
by Jack Clark
A somewhat short issue this week, as I’ve been at the OECD in Paris, speaking about opportunities and challenges of AI policy, and figuring out ways for the AI Index (aiindex.org) to support the OECD’s new ‘AI Policy Observatory’.
How useful are simulators? Find out at CVPR 2020!
…Three challenges will put three simulation approaches through their paces…
In the past few years, researchers have started using software-based simulators to train increasingly sophisticated machine learning systems. One notable trend has been the use of high-fidelity simulators, as researchers try to train systems in these rich, visually-stimulating environments, then transfer these systems into reality. At CVPR 2020, three competitions will push the limits of different simulators, generating valuable information about how useful these tools may be.
Three challenges for one big problem:
RoboTHOR Challenge 2020: This challenge evaluates how good we are at developing systems that can navigate to objects specified by name (e.g., go to the table in the kitchen), using the ‘Thor’ simulator (Import AI: 73). “Participants will train their models in simulation and these models will be evaluated by the challenge organizers using a real robot in physical apartments” (emphasis mine).
Habitat Challenge 2020: This challenge has two components, a point navigation challenge, and an object navigation challenge, both set in the Habitat multi-environment simulator (Import AI: 141). The point navigation one tries to deprive the system of various senses (e.g., GPS), and adds noise to its actuations, which will help us test the robustness of these navigation systems. The object navigation challenge asks an agent to find an object in the environment without access to a map.
Sim2Real Challenge with Gibson: Similar to RoboTHOR, this challenge asks people to train agents to navigate through a variety of photorealistic environments using the ‘Gibson’ simulator (Import AI: 111). It has three tiers of difficulty – a standard point navigation task, a point navigation task where the environment contains interactive objects, and a point navigation task where the environment contains other objects that move (and the agent must avoid colliding with them). This challenge also contains a sim2real element, where top-ranking teams (along with the top-five teams from the Habitat Challenge) will get to test out their system on a real robot as well.
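To make the "no GPS, noisy actuation" setting from the point navigation task a bit more concrete, here is a minimal dead-reckoning sketch. Everything in it is an assumption for illustration: the function names, the Gaussian noise model, and the noise magnitudes are hypothetical and are not taken from the Habitat or Gibson challenge rules. It only shows why an agent that trusts its own commanded motion gradually loses track of where it really is.

```python
import math
import random


def step(pose, forward, turn):
    """Apply one (forward, turn) command to an (x, y, heading) pose."""
    x, y, heading = pose
    heading += turn
    return (x + forward * math.cos(heading), y + forward * math.sin(heading), heading)


def noisy_step(pose, forward, turn, rng, forward_sigma=0.02, turn_sigma=0.05):
    """Same command, but the executed motion is perturbed by Gaussian noise.

    The sigma values are placeholders chosen for the demo, not challenge settings.
    """
    noisy_forward = forward + rng.gauss(0.0, forward_sigma)
    noisy_turn = turn + rng.gauss(0.0, turn_sigma)
    return step(pose, noisy_forward, noisy_turn)


def drift_after(commands, seed=0):
    """Distance between where the agent believes it is and where it actually is."""
    rng = random.Random(seed)
    believed = actual = (0.0, 0.0, 0.0)
    for forward, turn in commands:
        believed = step(believed, forward, turn)         # dead reckoning, no GPS
        actual = noisy_step(actual, forward, turn, rng)  # what the motors really did
    return math.hypot(believed[0] - actual[0], believed[1] - actual[1])


# Drive forward, turn left once, drive forward again.
commands = [(0.25, 0.0)] * 20 + [(0.25, math.pi / 2)] + [(0.25, 0.0)] * 20
print(f"drift after {len(commands)} steps: {drift_after(commands):.3f} m")
```

Without an external position signal the gap between believed and actual pose has nothing to correct it, which is one reason these challenges are a useful robustness test.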
Why this matters: Let’s put this in perspective: in 2013 the AI community was very impressed with work from DeepMind showing you could train an agent to play Space Invaders via reinforcement learning. Now look where we are – we’re training systems in photorealistic 3D simulators featuring complex physical dynamics – AND we’re going to try and boot these systems into real-world robots and test out their performance. We’ve come a very, very long way in a relatively short period of time, and it’s worth appreciating it. I am the frog being boiled, reflecting on the temperature of the water. It’s getting hot, folks!
Read more about the Embodied-AI Workshop here (official webpage).
Predicting molecular properties with the Molecule Transformer:
…Figuring out the mysteries of chemistry with transformers, molecular self attention layers, and atom embeddings…
Researchers with Ardigen; Jagiellonian University; Molecule.one; and New York University have extended the widely-used ‘Transformer’ component so it can process data relating to molecule property prediction tasks – a capability critical to drug discovery and material design. The resulting Molecule Attention Transformer (MAT) performs well across a range of tasks, from predicting the ability of a molecule to penetrate the blood-brain barrier, to predicting whether a compound is active towards a given target (e.g., Estrogen Alpha, Estrogen Beta), and so on.
Transformers for Molecules: To get Transformers to process molecule data, the researchers implement what they call “Molecular Self Attention Layers”, and each atom is embedded as a 26-dimensional vector.
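As a rough illustration of what a "molecular self attention" layer might compute, here is a minimal sketch in plain Python. It assumes the layer blends ordinary dot-product attention with the molecule's adjacency matrix and a function of inter-atomic distances; that formulation is not spelled out in this summary, so treat it as an assumption. The mixing weights, the distance kernel, the toy three-atom molecule, and the one-hot stand-ins for the 26-dimensional atom embeddings are all hypothetical, and the learned query/key/value projections of a real Transformer are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def molecular_attention(q, k, v, adjacency, distances, weights=(0.4, 0.3, 0.3)):
    """Blend dot-product attention with graph and distance information.

    weights = (attention, distance, adjacency) mixing coefficients. The values
    are placeholders for this sketch, not numbers from the paper.
    """
    w_att, w_dist, w_adj = weights
    n = len(q)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    attention = [softmax(row) for row in scores]
    # Assumed distance kernel: closer atoms get more weight (softmax of -distance).
    closeness = [softmax([-d for d in row]) for row in distances]
    mixed = [
        [w_att * attention[i][j] + w_dist * closeness[i][j] + w_adj * adjacency[i][j]
         for j in range(n)]
        for i in range(n)
    ]
    return matmul(mixed, v)


# Toy three-atom chain; each atom gets a 26-dimensional (here one-hot) embedding.
atoms = [[1.0 if j == i else 0.0 for j in range(26)] for i in range(3)]
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
distances = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]

out = molecular_attention(atoms, atoms, atoms, adjacency, distances)
print(len(out), "atoms x", len(out[0]), "features")
```

The point of the sketch is the "plumbing": the attention matrix is the place where chemical structure (which atoms are bonded, which are close in space) gets fed into an otherwise generic Transformer.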
How well does MAT stack up? They compare the MAT to three baselines: random forest (RF); Support Vector Machine with RBF kernel (SVM); and graph convolutional networks (GCNs). The MAT gets state-of-the-art scores on four out of the seven tests (RF and SVM take the other one and two, respectively).
MAT pre-training: Just like with image and text models, molecular models can benefit from being pre-trained on a (relevant) dataset and fine-tuned from there. They compare their system against a Pretrained EAGCN and SMILES, and MAT with pre-training gets significantly improved scores.
Why this matters: Molecular property prediction is the sort of task where if we’re able to develop AI systems that make meaningful, accurate predictions, then we can expect large chunks of the economy to change as a consequence. Papers like this highlight how generically useful components like the Transformer are, and how much modern AI has in common with plumbing – here, the researchers are basically trying to design an interface system that lets them port molecular data into a Transformer-based system, and vice versa.
Read more: Molecule Attention Transformer (arXiv).
Deepfakes are being commoditized – uh oh!
…What happens when Deepfakes get really cheap?…
Deepfakes – the slang term for AI systems that let you create synthetic videos where you superimpose someone’s face onto someone else’s – are becoming easier and cheaper to make, though they’re primarily being used for pornography rather than political disruption, according to a new analysis from Deeptracelabs and Nisos.
Porn, porn, porn: “We found that the majority of deepfake activity centers on dedicated deepfake pornography platforms,” they write. “These videos consistently attract millions of views, with some of the websites featuring polls where users can vote for who they want to see targeted next”.
Little (non-porn) misuse: “We assess that deepfakes are not being widely bought or sold for criminal or disinformation purposes as of early February 2020,” they write. “Techniques being developed by academic and industry leaders have arguably reached the required quality for criminal uses, but these techniques are not currently publicly accessible and will take time to be translated into stable, user-friendly implementations”.
Why this matters: This research highlights how AI tools are diffusing into society, with some of them being misused. I think the most significant (implicit) thing here is the importance this places on publication norms in AI research – what kind of responsibility might academics and corporate researchers have here, with regard to proliferating the technology? And can we do anything to reduce misuses of the technology while maintaining a relatively open scientific culture? “We anticipate that as deepfakes reach higher quality and “believability”, coupled with advancing technology proliferation, they will increasingly be used for criminal purposes”, they write. Get ready.
Read more: Analyzing The Commoditization Of Deepfakes (NYU Journal of Legislation & Public Policy).
What stands between us and smarter AI?
…Four traits we need to build to develop more powerful AI systems…
Cognitive scientist and entrepreneur Gary Marcus has published a paper describing an alternative to the dominant contemporary approach to AI research. Where existing systems focus on “general-purpose learning and ever-larger training sets and more and more compute”, he instead suggests we should work on “a hybrid, knowledge-driven, reasoning-based approach, centered around cognitive models”.
Four research challenges for better AI systems: Marcus thinks contemporary AI systems are missing some things that, if further researched, might improve their performance. These include:
- Symbol-manipulation: AI systems should be built with a mixture of learned components and more structured approaches that allow for representing an algorithm in terms of operations over variables. Such an approach would make it easier to build more robust systems – “represent an algorithm in terms of operations over variables, and it will inherently be defined to extend to all instances of some class”.
- Encoded knowledge: “Rather than starting each new AI system from scratch, as a blank slate, with little knowledge of the world, we should seek to build learning systems that start with initial frameworks for domains like time, space and causality, in order to speed up learning and massively constrain the hypothesis space,” he writes. (Though in my view, there’s some chance that large-scale pre-training could create networks that can serve as a strong prior for systems that get fine-tuned against smaller datasets – though today these approaches merely kind of work, it’ll be exciting to see how far they can get pushed. There are also existing systems that fuse these things together, like ERNIE which pairs a BERT-based language model with a structured external knowledge store).
- Reasoning: AI systems need to be better at reasoning about things, which basically means abstracting up and away from the specific data being operated over (e.g., text strings, pixels), and using representations to perform sophisticated inferences. “We need new benchmarks,” Marcus says.
- Cognitive Models: In cognitive psychology, there’s a term called a ‘cognitive model’, which describes how people build systems that let them use their prior knowledge about some entities in combination with an understanding of their properties, as well as the ability to incorporate new knowledge over time.
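A toy illustration of the symbol-manipulation point in the list above (my own example, not one taken from the paper): a system that only memorizes input/output pairs has nothing to say about inputs it has not seen, while the same mapping written as an operation over a variable extends to the whole class automatically. The function names and the even-numbers-only training set are hypothetical.

```python
def train_lookup(examples):
    """'Learn' a mapping by memorizing input/output pairs."""
    table = dict(examples)
    return lambda x: table.get(x)  # returns None for anything unseen


def identity(x):
    """The same mapping expressed as an operation over a variable."""
    return x


# The memorizing learner only ever sees even numbers.
training = [(n, n) for n in range(0, 10, 2)]
memorized = train_lookup(training)

print(memorized(4), identity(4))  # a seen input: both approaches answer
print(memorized(7), identity(7))  # an unseen input: only the rule over a variable answers
```

Real learned systems interpolate far better than a lookup table, so this is a caricature, but it captures why Marcus wants operations over variables in the mix.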
Show me the tests: Many of the arguments Marcus makes are supported by the failures of contemporary AI systems – the paper contains numerous studies of GPT-2 and how it sometimes fails in ways that indicate some of the deficiencies of contemporary systems. What might make Marcus’s arguments resonate more widely with AI researchers is the creation of a large, structured set of tests which we can run contemporary systems against. As Marcus himself writes when discussing the reasoning deficiencies of contemporary systems, “we need new benchmarks”.
Why this matters: Papers like this run counter to the dominant narrative of how progress happens in AI. That’s good – it’s helpful to have heterogeneity in the discourse around this stuff. I’ll be curious to see how things like Graph Neural Networks and other recently developed components might close the gap between contemporary systems and what Marcus has in mind. I also think there’s a surprisingly large amount of activity going on in the areas Marcus identifies, though the proof will ultimately come from performance and rigorous ablations. Bring on the benchmarks!
Read more: The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence (arXiv).
[Space, 22nd Century]
The Desert Island GAN
They were able to take the entire Earth with them when they got into their spaceship and settled in for the long, decades-long mission. The Earth was encoded into a huge generative model, trained on a variety of different data sources. If they got lonely, they could put on a headset and go through synthetic versions of their hometown. If they got sad, they could cover the walls of the spacecraft with generated parties, animal parents and animal babies, trees turned into black-on-orange cutouts during righteous sunrises, and so on.
But humans are explorers. It’s how they evolved. So though they had the entire Earth in their ships, they would cover its surface. Many of the astronauts took to exploring specific parts of the generative featurespace. One spent two years going through ever-finer permutations of rainforests, while another grew obsessed with exploring the spaces between the ‘real’ animals of the earth and the ones imagined by the pocket-Earth-imagination.
They’d swap coordinates, of course. The different astronauts in different ships would pipe messages to each other across stellar distances – just a few bytes of information at a time, a coordinate and a statement.
Here is the most impossible tree.
Here is a lake, hidden inside a mountain.
Here is a flock of starlings that can swim in water; watch them dive in and out of these waves.
Back on Earth, the Earth was changing. It was getting hotter. Less diverse as species winked out, unable to adjust to the changing climate. Fewer people, as well. NASA and the other space agencies kept track of the ships, watching them go further and further away, knowing that each of them contained a representation of the planet they came from that was ever richer and ever more alive than the planet itself.
Things that inspired this story: Generative models; generative models as memories; generative models as archival systems; generative models as pocket imaginations for people to navigate and use as extensions of their own memory; AI and cartography.