Import AI 145: Testing general intelligence in Minecraft; Google finds a smarter way to generate synthetic data; who should decide who decides the rules of AI?

by Jack Clark

Think your agents have general intelligence? Test them on MineRL:
…Games as procedural simulators…
How can we get machines to learn to perform complicated, lengthy sequences of actions? One way is to have these machines learn from human demonstrations – this captures a hard AI challenge, in the form of requiring algorithms that can take a demonstration as input but generalize to different situations. It also shortcuts another hard AI challenge: exploration, which is the art of designing algorithms that can explore enough of the problem space to attempt the task, rather than getting stuck en route.
  Now, an interdisciplinary team of researchers led by people from Carnegie Mellon University has created the MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors. This competition uses the ‘Minecraft’ computer game as a testbed in which to train smart algorithms, and ships with a dataset called MineRL-v0, which consists of 60 million state-action pairs of human demonstrations of tasks being solved within Minecraft.

The Challenge: MineRL tests agents on one main challenge: ‘ObtainDiamond’, which requires agents to find a diamond somewhere in the (procedurally generated) environment they’ve been dropped into. This is a hard task: “Diamonds only exist in a small portion of the world and are 2-10 times rarer than other ores in Minecraft,” the authors write. “Additionally, obtaining a diamond requires many prerequisite items. For these reasons, it is practically impossible for an agent to obtain a diamond via naive random exploration”.

Auxiliary Environments: ObtainDiamond is such a hard challenge that the competition organizers have released six additional ‘auxiliary environments’ to help people train AI systems capable of solving it. These are designed to encourage the development of several skills necessary for finding a diamond: navigating the environment, chopping down trees, surviving in the environment, and obtaining three different items – a bed (which needs to be assembled out of three items), meat (of a specific animal), and a pickaxe (which is needed to mine the diamond).

The dataset: MineRL-v0 consists of 60 million state-action-(reward) tuples recorded from human demonstrations. Each state includes what the player sees, as well as their inventory, distances to objectives, player attributes, and so on.
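
To make the data concrete, here is a minimal behavioral-cloning sketch over state-action pairs of this shape – a policy trained to imitate the recorded human actions. This is a sketch under stated assumptions, not the competition’s actual API: the observation format (a 64x64 point-of-view image plus an inventory vector), the action count, and all names are illustrative.

```python
# Minimal behavioral-cloning sketch for MineRL-style demonstrations.
# All shapes and names are illustrative assumptions, not the MineRL API.
import torch
import torch.nn as nn

class BCPolicy(nn.Module):
    def __init__(self, n_actions: int, inventory_dim: int = 18):
        super().__init__()
        # Small conv encoder for the player's 64x64 RGB point-of-view image.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 6 * 6 + inventory_dim, 256), nn.ReLU(),
            nn.Linear(256, n_actions),  # logits over a discretized action set
        )

    def forward(self, pov, inventory):
        return self.head(torch.cat([self.encoder(pov), inventory], dim=1))

policy = BCPolicy(n_actions=12)
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# One supervised step: predict the human's action from the recorded state.
pov = torch.rand(32, 3, 64, 64)        # batch of point-of-view frames
inventory = torch.rand(32, 18)         # batch of inventory vectors
actions = torch.randint(0, 12, (32,))  # the humans' recorded actions
loss = loss_fn(policy(pov, inventory), actions)
loss.backward()
optimizer.step()
```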

Release plans: The researchers aim to soon release the full environment, data, and numerous algorithmic baselines. They will also publish further details of the competition, in which contestants will submit their entries via ‘NC6 v2’ instances on Microsoft’s ‘Azure’ cloud.

Why this matters: Minecraft has many of the qualities that make for a useful AI research platform: its tasks are hard, the environments can be generated procedurally, and it contains a broad & complex enough range of tasks to stretch existing systems. I also think that its inherently spatial qualities are useful, and potentially let researchers specify even harder tasks requiring systems capable of hierarchical learning and operating over long timescales.
  Read more: The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors (Arxiv).

#####################################################

Who gets to decide the rules for AI?
…Industry currently has too much power, says Harvard academic…
The co-director of Harvard University’s Berkman Klein Center for Internet & Society is concerned that industry will determine the rules and regulations of AI, leading to significant societal ramifications.

In an essay in Nature, Yochai Benkler writes that: “Companies’ input in shaping the future of AI is essential, but they cannot retain the power they have gained to frame research on how their systems impact society or on how we evaluate the effect morally.”

One solution: Benkler’s main fix is funding: organizations working to ensure that AI is fair and beneficial must be publicly funded, subject to peer review, and transparent to civil society. And society must demand increased public investment in independent research, rather than hoping that industry funding will fill the gap without corrupting the process.
  I’d note the challenge here is that many governments are loath to invest in building their own capacity for technical evaluation, and tend to defer to industry inputs.

Why this matters: AI is sufficiently powerful to have political ramifications and these will have a wide-ranging effect on society – we need to ensure there is equitable representation here, and I think coming up with ways to do that is a worthy challenge.
  Read more: Don’t let industry write the rules for AI (Nature).

#####################################################

Anki shuts down:
…Maker of robot race cars, toys, shuts down…
Anki, an AI startup that most recently developed a toy/pet robot named ‘Cozmo’, has shut down. Cozmo was a small AI-infused robot that could autonomously navigate simple environments (think: uncluttered desks and tables), and could use its lovingly animated facial expressions to communicate with people. Unfortunately, like most robots that use AI, it was overly brittle and prone to confusing failures – I purchased one, and found it frustrating: it responded inconsistently to voice commands, and occasionally fell off the edge of my table (and, more frustratingly, fell off the same part of the table multiple times in a row).

Why this matters: Despite (pretty much) everyone + their children thinking it’d be cool to have lots of small, cute, pet robots wandering around the world, it hasn’t happened. Why is this? Anki gives us an indication: making consumer hardware is extremely difficult, and robots are particularly hard due to the combination of relatively low production volumes (making it hard to make them cheap) as well as the brittleness of today’s AI systems when deployed in messy, real-world environments.
  Read more: The once-hot robotics startup Anki is shutting down after raising more than $200 million (Vox / Recode).

#####################################################

Google gets its AI systems to generate their own synthetic data:
…What’s the opposite of garbage in / garbage out? Perhaps UDA in / Decent prediction out…
Google wants to spend dollars on compute to create more data for itself, rather than gathering more data – that’s the idea behind ‘Unsupervised Data Augmentation’ (UDA), a new technique from Google that can automatically generate synthetic versions of unlabeled data, giving neural networks more information to learn from. The technique sets a new state-of-the-art on six language tasks and three vision tasks.

How it works: UDA “minimizes the KL divergence between model predictions on the original example and an example generated by data augmentation”, they write. By using a targeted objective, the technique generates valid, realistic perturbations of underlying data, which are also sufficiently diverse to help systems learn.
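
In code, the core of this objective is compact. Here’s a minimal sketch of the consistency term, assuming a classifier that returns logits and some label-preserving augmentation function (e.g. back-translation for text); `model` and `augment` are placeholders rather than names from the paper, and the prediction on the original example is held fixed as the target.

```python
import torch
import torch.nn.functional as F

def uda_consistency_loss(model, x_unlabeled, augment):
    """KL divergence between predictions on an original unlabeled example
    and on an augmented version of it. `model` (any classifier returning
    logits) and `augment` (any label-preserving transformation) are
    placeholder assumptions, not names from the paper."""
    with torch.no_grad():  # hold the original prediction fixed as the target
        p_original = F.softmax(model(x_unlabeled), dim=-1)
    log_p_augmented = F.log_softmax(model(augment(x_unlabeled)), dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(log_p_augmented, p_original, reduction="batchmean")
```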

Better data, better results: In tests, UDA leads to significant across-the-board improvements on a range of language tasks, and can match (or approach) state-of-the-art performance on various tasks while using significantly less data. The researchers also test their approach on ImageNet – a much more challenging task – in two ways: seeing how well it does when using a hybrid of labelled and unlabelled data (specifically: 10% labelled data, 90% unlabelled), and how well it does at automatically augmenting the full ImageNet dataset. UDA leads to a significant absolute performance improvement when using the hybrid of labelled and unlabelled data, and marginally improves performance on the full ImageNet set – impressive, considering how strong fully-supervised baselines on the full dataset already are.
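
A sketch of how the two signals might combine in the hybrid labelled/unlabelled setting, reusing `uda_consistency_loss` from the sketch above; the weighting factor and batch composition here are illustrative assumptions, not the paper’s exact recipe.

```python
import torch.nn.functional as F

def semi_supervised_step(model, optimizer, labeled_batch, unlabeled_batch,
                         augment, consistency_weight=1.0):
    # Supervised loss on the small labelled batch...
    x, y = labeled_batch
    supervised = F.cross_entropy(model(x), y)
    # ...plus the UDA consistency loss on the large unlabelled batch.
    consistency = uda_consistency_loss(model, unlabeled_batch, augment)
    # `consistency_weight` is an illustrative hyperparameter, not the
    # paper's exact weighting or schedule.
    loss = supervised + consistency_weight * consistency
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```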

Why this matters: Being able to arbitrage compute for data changes the economic dynamics of developing increasingly powerful AI systems; techniques like UDA show how, in the long term, compute could become as strategic as (or in some cases more strategic than) data.
  Read more: Unsupervised Data Augmentation (Arxiv).

#####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net

US seeking leadership in AI technical standards:
President Trump’s Executive Order on AI tasked the National Institute of Standards and Technology (NIST) with developing a plan for US engagement in the development of technical standards for AI. NIST has issued a request for information to understand how these standards might be developed and used in support of reliable, robust AI systems, and how the Federal government can play a leadership role in this process.

Why standards matter: A recent report from the Future of Humanity Institute looked at how technical standards for AI R&D might inform global solutions to AI governance challenges. Historically, international standards bodies have governed policy externalities in cybersecurity, sustainability, and safety. Given the challenges of trust and coordinating safe practices in AI development and deployment, standards setting could play an important role.
  Read more: Artificial Intelligence Standards – Request for Information (Gov).
  Read more: Standards for AI Governance (Future of Humanity Institute).

Eric Schmidt steps down from Alphabet Board:
  Former Google CEO and Chairman Eric Schmidt has stepped down from the Board at Alphabet, Google’s parent company. Diane Greene, former CEO of Google Cloud, has also stepped down. Robin L. Washington, CFO of pharmaceutical firm Gilead, joins in their place. Schmidt remains one of the company’s largest shareholders, behind founders Larry Page and Sergey Brin.
  Read more: Alphabet Appoints Robin L. Washington to its Board of Directors (Alphabet).

#####################################################

Tech Tales:

The Flower Garden

We began with a flower garden that had a whole set of machines inside of it, as well as flower beds and foliage and all the rest of the things that you’d expect. It was optimized for being clean and neat and fitting a certain kind of design, and so people would come to it and compliment it on its straight angles and uniquely textured and laid out flower beds. Over time, the owners of this garden started to add more automation to the garden – first, a solar panel, to slurp down energy from the sun in the day and then at night power the lights that gave illumination to the plants in the middle of the warm dark.

But as technology advanced so too did the garden – it gained cameras to monitor itself, and then smart watering systems that were coupled with the cameras to direct different amounts of water to plants based not only on the schedule, but on what the AI system thought they needed to grow “best”. A little after that the garden gained more solar panels and some drones with delicate robot arms: the drones were taught how to maintain bits of the garden, and they learned how to use their arms to take vines and move them so they’d grow in different directions, or to direct flowers so as not to crowd each other out.

The regimented garden was now perhaps one part machine for every nine parts foliage: people visited and marvelled at it, and wondered among themselves how advanced the system could become, and whether one day it could obviate the need for human gardeners and landscapers and designers at all.

Of course, eventually: the things did get smart enough to do this. The garden gradually, then suddenly, went from being cultivated by humans to being cultivated by machines. But, as with all things, change happened. It went out of fashion. People visited less, then stopped visiting at all.

And so one day the garden was sold to a private owner (identity undisclosed). The day the owner took over they changed the objective of the AI system that tended to the garden – instead of optimizing for neatness and orderliness, optimize for growth.

Gradually, then suddenly, the garden filled with life. More and more flowers. More and more vines. Such fecund and dense life, so green and textured and strange. It was a couple of years before the first problem from this change: the garden started generating less power from its solar panels, as the greenery began to occlude them. For a time, the system optimized itself and the vegetation was moved – gently – by the drones, and arranged so as not to grow there. This conflicted with the goal of maximizing growth. So the AI system – as these things are wont to do – improvised a solution: one day one of the drones dropped down near one of the solar panels and used its arm to create a little gap between the panel and the ground. Then another drone landed and scraped some rocks into the space. In this way the drones slowly, then suddenly, changed the orientation of the panels, so they could acquire more light.

But the plants kept growing, so the machines took more severe actions: now the garden is famous not for its vegetation, but for the solar panels that have been raised by the interplay between machine and vegetation, as – lifted by various climbing vines – they rise in step with the growth of the garden. On bright days, elsewhere in the city, you can see the panels shimmer as winds shake the vines and other bits of vegetation they are attached to, causing them to cast flashes like the scales of some huge living fish, seen from a great distance.

Are the scales of a fish so different from its inner fleshy parts, people wonder, as these panels are from their drones?

Things that inspired this story: Self-adapting systems; Jack and the Beanstalk; the view of London from Greenwich Observatory; drones; plunging photovoltaic prices; reinforcement learning; reward functions.