Import AI: #101: Teaching robots to grasp with two-stage networks; Silicon Valley VS Government AI; why procedural learning can generate natural curriculums.

by Jack Clark

Making better maps via AI:
…Telenav pairs machine learning with OpenStreetCam data to let everyone make better maps…
Navigation company Telenav has released datasets, machine learning software, and technical results to help people build AI services on top of mapping infrastructure. The company says it has done this to create a more open ecosystem around mapping, specifically around ‘OpenStreetMap’, a popular open source map.
  Release: The release includes a training set of ~50,000 images annotated with labels to help identify common road signs; a machine-learning technology stack that includes a notebook with visualizations, a RetinaNet system for detecting traffic signs, and the results from running these AI tools over more than 140-million existing street-level images; and more.
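  Inference sketch: For a flavor of the detection pipeline in a stack like this, here is a minimal sketch using torchvision’s generic pretrained RetinaNet – not Telenav’s own models or sign labels, and the filename and confidence threshold are placeholders:

```python
# Illustrative only: Telenav's actual stack lives on their GitHub; this uses
# an off-the-shelf RetinaNet to show the general street-image inference flow.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Generic COCO-pretrained RetinaNet (a sign detector would instead be
# fine-tuned on the ~50,000 annotated Telenav images).
model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
model.eval()

image = to_tensor(Image.open("street_level_image.jpg").convert("RGB"))

with torch.no_grad():
    detections = model([image])[0]  # dict with 'boxes', 'scores', 'labels'

# Keep confident detections; a fine-tuned model would emit traffic-sign classes.
for box, score, label in zip(detections["boxes"], detections["scores"],
                             detections["labels"]):
    if score > 0.5:
        print(label.item(), round(score.item(), 2), box.tolist())
```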
  Why it matters: Maps are fundamental to the modern world. AI promises to give us the tools to automatically label and analyze much of the world around us, and with them the chance to create truly capable open source maps that can rival those developed by proprietary interests (see: Google Maps, HERE, etc). Larger datasets should also yield better automatic-mapping systems, like tools that can parse the meaning of photos of road signs.
  Read more: The Future of Map-Making is Open and Powered by Sensors and AI (OpenStreetMap @ Telenav blog).
  Read more: Telenav MapAI Contest (Telenav).
  Check out the GitHub (Telenav GitHub).

Silicon Valley tries to draw a line in shifting sand: surveillance edition:
…CEO of facial recognition startup says it won’t sell to law enforcement…
Brian Brackeen, the CEO of facial recognition software developer Kairos, says his company is unwilling to sell facial recognition technologies to government or law enforcement. This follows Amazon coming under fire from the ACLU for selling facial recognition services to law enforcement via its ‘Rekognition’ API.
  “I (and my company) have come to believe that the use of commercial facial recognition in law enforcement or in government surveillance of any kind is wrong – and that it opens the door for gross misconduct by the morally corrupt,” Brackeen writes. “In the hands of government surveillance programs and law enforcement agencies, there’s simply no way that face recognition software will not be used to harm citizens”, he writes.
  Why it matters: The American government is currently reckoning with the consequences of an ideological preference that left its military-industrial infrastructure reliant on an ever-shifting constellation of private companies, whereas other countries tend to invest more directly in certain key capabilities, like AI. That has led to today’s situation, where American government entities and organizations, upon seeing how other governments (mainly China) are implementing AI, are seeking ways to implement AI in America. But getting people to build these AI systems for the US government has proved difficult: many of the companies able to provide strategic AI services (see: Google, Amazon, Microsoft, etc) have grown so large they have become literal multinationals: their offices and markets are distributed around the world, and their staff come from anywhere. Therefore, these companies aren’t super thrilled about working on behalf of any one specific government, and their staff are mounting internal protests to stop the companies selling to the US government (among others). How the American government deals with this will determine many of the contours of American AI policy in the coming years.
  Read more: Facial recognition software is not ready for use by law enforcement (TechCrunch).

“Say it again, but like you’re sad”. Researchers create and release data for emotion synthesis:
…Parallel universe terrifying future: a literal HR robot that can detect your ‘tone’ during awkward discussions and chide you for it…
You’ve heard of speech recognition. Well, what about emotion recognition and emotional tweaking? That’s the problem of listening to speech, categorizing the emotional inflections of the voices within it, and learning to change an existing speech sample so that it sounds like it is spoken with a different emotion – a potentially useful technology for passive monitoring of audio feeds, as well as active impersonation or warping, among other purposes. But building a system capable of this requires access to the underlying data needed to train it. That’s why researchers with the University of Mons in Belgium and Northeastern University in the USA have created ‘the Emotional Voices dataset’.
  The dataset: “This database’s primary purpose is to build models that could not only produce emotional speech but also control the emotional dimension in speech,” write the researchers. The dataset contains five different speakers and two spoken languages (North American English and Belgian French), with four of the five speakers contributing ~1,000 utterances each, and one speaker contributing around ~500. These utterances are split across five distinct emotions: neutral, amused, angry, sleepy, and disgust.
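  Working with the data: Here is a minimal sketch of how you might index such a corpus for voice-transformation experiments – the directory layout below is an assumption, so check the actual release for the real structure:

```python
# Hypothetical layout: EmotionalVoices/<speaker>/<emotion>/<utterance>.wav
from pathlib import Path

EMOTIONS = ["neutral", "amused", "angry", "sleepy", "disgust"]

def index_corpus(root):
    """Map (speaker, emotion) -> sorted list of utterance wav paths."""
    corpus = {}
    for speaker_dir in sorted(Path(root).iterdir()):
        if not speaker_dir.is_dir():
            continue
        for emotion in EMOTIONS:
            wavs = sorted((speaker_dir / emotion).glob("*.wav"))
            if wavs:
                corpus[(speaker_dir.name, emotion)] = wavs
    return corpus

corpus = index_corpus("EmotionalVoices/")
# Paired data for a neutral->angry transformation from a single speaker:
neutral = corpus.get(("speaker_1", "neutral"), [])
angry = corpus.get(("speaker_1", "angry"), [])
```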
  You sound angry. Now you sound amused: In experiments, the researchers tested how well they could use this dataset to transform speech from the same speaker from one emotion to another. They found that listeners correctly categorized voices transformed from neutral to angry with roughly 70 to 80 percent accuracy – somewhat encouraging, but hardly definitive. In the future, the researchers “hope that such systems will be efficient enough to learn not only the prosody representing the emotional voices but also the nonverbal expressions characterizing them which are also present in our database.”
  Read more: The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems (Arxiv).

Giving robots a grasp of good tasks with two-stage networks:
…End-to-end learning of multi-stage tasks is getting easier, Stanford researchers show…
Think about a typical DIY task you might do at home – what do you do? You probably grab the tool in one hand, then approach the object you need to fix or build, and go from there. But how do you know the best way to grip the tool so you can accomplish the task? And why do you barely ever get this grasp wrong? This type of integrated reasoning and action is representative of the many ways in which humans are smarter than machines. Can we teach machines to do the same? Researchers with Stanford University have published new research showing how to train basic robots to perform simple, real-world DIY-style tasks using deep learning techniques.
  Technique: The researchers use a simulator to repeatedly train a robot arm and a tool (in this case, a simplified toy hammer) to pick up the tool then use it to manipulate objects in a variety of situations. The approach relies on a ‘Task-Oriented Grasping Network’ (TOG-Net), a two-stage system that first predicts effective grasps for the object, then predicts the manipulation actions to perform to achieve a task.
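  Schematically, the two stages look like the sketch below – a simplified feature-vector version for illustration, not the paper’s actual convolutional architecture:

```python
# Two-stage idea behind TOG-Net (schematic, not the paper's exact networks):
# stage 1 scores candidate grasps for task suitability; stage 2 predicts
# manipulation actions conditioned on the chosen grasp.
import torch
import torch.nn as nn

class GraspScorer(nn.Module):
    """Stage 1: score how well each candidate grasp supports the task."""
    def __init__(self, grasp_dim=6, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(grasp_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, grasps):               # (N, grasp_dim) candidates
        return self.net(grasps).squeeze(-1)  # (N,) task-suitability scores

class ManipulationPolicy(nn.Module):
    """Stage 2: map (observation, grasp) to a manipulation action."""
    def __init__(self, obs_dim=32, grasp_dim=6, act_dim=4, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + grasp_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, act_dim))

    def forward(self, obs, grasp):
        return self.net(torch.cat([obs, grasp], dim=-1))

scorer, policy = GraspScorer(), ManipulationPolicy()
candidates = torch.randn(64, 6)                 # sampled grasp candidates
best = candidates[scorer(candidates).argmax()]  # choose the best-scoring grasp
action = policy(torch.randn(32), best)          # act conditioned on that grasp
```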
  Data: One of the few nice things about working with robots is that if you have a simulator it’s possible to automatically generate large amounts of data for training and evaluation. Here, the researchers use the open source physics simulator Bullet to generate many permutations of the scene to be learned, using different objects and behaviors. They train using 18,000 procedurally generated objects.
  Results: The system is tested in two limited domains: sweeping and hammering, where sweeping consists of using an object to move another object without lifting it, and hammering involves trying to hammer a large wooden peg into a hole. The system achieves reasonable but not jaw-dropping success rates on hammering (~80%, far higher than other methods), and less impressive results on sweeping (~71%). These results put this work firmly in the domain of research, as the success rates are far too low to be interesting from a commercial perspective.
  Why it matters: Thanks to the growth in compute and advances in simulators, it’s becoming increasingly easy to apply deep learning and reinforcement learning techniques to robots. These advances are increasing the pace of research in this area and suggest that, if research continues to show positive results, there may be a deep learning tsunami about to hit robotics.
  Read more: Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision (Arxiv).

Evolution is good, but guided evolution is better:
…Further extension of evolution strategies shows value in non-deep learning ideas…
Google Brain researchers have shown how to extend ‘evolution strategies’, an AI technique that has regained popularity in recent years following experiments showing it is competitive with deep learning approaches. The extension further improves performance of the ES algorithm. “Our method can primarily be thought of as a modification to the standard ES algorithm, where we augment the search distribution using surrogate gradients,” the researchers explain. The result is a significantly more capable version of ES, which they call Guided ES, that “combines the benefits of first-order methods and random search, when we have access to surrogate gradients that are correlated with the true gradient”.
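  For intuition, here is a minimal numpy sketch of the core sampling trick – perturbations drawn from a search distribution that mixes isotropic noise with the subspace spanned by recent surrogate gradients. The constants, scaling, and toy surrogate below are simplifications rather than the paper’s exact algorithm:

```python
import numpy as np

def guided_es_step(f, x, surrogate_grads, alpha=0.5, sigma=0.1,
                   lr=0.05, pairs=16):
    """One Guided-ES-style update (simplified scaling)."""
    n = x.size
    # Orthonormal basis of the surrogate-gradient subspace, shape (n, k).
    U, _ = np.linalg.qr(np.stack(surrogate_grads, axis=1))
    k = U.shape[1]
    grad_est = np.zeros(n)
    for _ in range(pairs):
        # eps ~ N(0, alpha/n * I + (1 - alpha)/k * U U^T), then scaled by sigma.
        eps = sigma * (np.sqrt(alpha / n) * np.random.randn(n)
                       + np.sqrt((1 - alpha) / k) * U @ np.random.randn(k))
        # Antithetic finite-difference estimate of the gradient.
        grad_est += eps * (f(x + eps) - f(x - eps)) / (2 * sigma**2 * pairs)
    return x - lr * grad_est

# Toy usage: minimize a quadratic given a biased-but-correlated surrogate.
f = lambda x: float(x @ x)
x = np.random.randn(50)
for _ in range(100):
    surrogate = 2 * x + 0.5 * np.random.randn(50)  # noisy stand-in gradient
    x = guided_es_step(f, x, [surrogate])
print(f(x))  # far smaller than the starting value
```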
  Why it matters: In recent years a huge amount of money and talent has flooded into AI, primarily to work on deep learning techniques. It’s valuable to keep researching and reviving previously discarded techniques, such as ES, to provide alternative points of comparison that let us better model progress here.
  Read more: Guided evolutionary strategies: escaping the curse of dimensionality in random search (Arxiv).
  Read more: Evolution Strategies as a Scalable Alternative to Reinforcement Learning (OpenAI blog).

Using procedural creation to train reinforcement learning algorithms with better generalization:
…Do you know what is cooler than 10 video game levels? 100 procedurally generated ones with a curriculum of difficulty…
Researchers with the IT University of Copenhagen and New York University have fused procedural generation with games and reinforcement learning to create a cheap, novel approach to curriculum learning. The technique relies on using reinforcement learning to guide the generation of increasingly difficult video game levels, where difficult levels are generated only once the agent has learned to beat easier levels. This process leads to a natural curriculum emerging, as each time the agent gets better it sends a signal to the game generator to create a harder level, and so on.
  Data generation: They use the General Video Game AI Framework (GVG-AI), an open source framework for which over 160 games have been developed. GVG-AI is scriptable via the Video Game Description Language (VGDL) and is integrated with OpenAI Gym, so developers can train agents against pixel inputs, incremental rewards, and a binary win/loss signal. The researchers create level generators for three difficult games within GVG-AI. During the level generation process they also manipulate a ‘difficulty parameter’ which roughly correlates with how challenging the generated levels are.
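  The curriculum loop itself is simple enough to sketch, as below – the agent interface, generator, and thresholds are hypothetical placeholders rather than the paper’s actual code:

```python
def train_with_procedural_curriculum(agent, make_level, episodes=100_000,
                                     win_threshold=0.5, window=100):
    """Raise the difficulty parameter once the agent reliably beats levels."""
    difficulty = 0.0
    recent_wins = []
    for _ in range(episodes):
        level = make_level(difficulty)    # procedurally generate a level
        won = agent.train_episode(level)  # hypothetical: returns win/loss
        recent_wins = (recent_wins + [won])[-window:]
        # Once the agent wins often enough at this difficulty, go harder.
        if (len(recent_wins) == window
                and sum(recent_wins) / window > win_threshold):
            difficulty = min(1.0, difficulty + 0.05)
            recent_wins = []
    return agent
```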
  Results: The researchers find that systems trained with this progressive procedural generation approach do well, obtaining top scores on the challenging ‘frogs’ and ‘zelda’ games, compared to baseline algorithms trained without a procedural curriculum.
  Why it matters: Approaches like this highlight the flaws in the way we evaluate today’s reinforcement learning algorithms, where we test algorithms on similar (frequently identical) levels/games to those they were trained on, and therefore have difficulty distinguishing between algorithmic improvements and overfitting a test set. Additionally, this research shows how easy it is becoming to use computers to generate or augment existing datasets (eg, creating procedural level generators for pre-existing games), reducing the need for raw input data in AI development, and increasing the strategic value of compute.
  Read more: Procedural Level Generation Improves Generality of Deep Reinforcement Learning (Arxiv).

AI Policy with Matthew van der Merwe:
…Reader Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net …

Trump drops plans to block Chinese investment in US tech, strengthens oversight:
  The Trump administration has rowed back on a proposal to block investment in industrially significant technology (including AI, robotics, and semiconductors) by firms with over 25% Chinese ownership, and to restrict tech exports to China by US firms.
  The government will instead expand the powers of the Committee on Foreign Investment in the United States (CFIUS), the body that reviews the national security implications of foreign acquisitions. The new legislation will broaden the Committee’s considerations to include the impact on the US’s competitive position in advanced technologies, in addition to security risks.
  Why this matters: Governments are gradually adapting their oversight of cross-border investment to cover AI and related technologies, which are increasingly being treated as strategically important for both military and industrial applications. The earlier proposal would have been a step-up in AI protectionism from the US, and would have likely prompted a strong retaliation from China. For now, a serious escalation in AI nationalism seems to have been forestalled.
  Read more: Trump drops new restrictions on China investment (FT).

DeepMind co-founder appointed as advisor to UK government:
Demis Hassabis, co-founder of DeepMind, has been announced as an Adviser to the UK government’s Office for AI, which focuses on developing and delivering the UK’s national AI strategy.
  Why this matters: This appointment adds credibility to the UK government’s efforts in the sector. A persistent worry is that policy-makers are out of their depth when it comes to emerging technologies, and that this could lead to poorly designed policies. Establishing close links with industry leaders is an important means of mitigating these risks.
  Read more: Demis Hassabis to advise Office for AI.

China testing bird-like surveillance drones:
Chinese government agencies have been using stealth surveillance drones mimicking the flight and appearance of birds to monitor civilians. Code-named ‘doves’ and fitted with cameras and navigation systems, they are being used for civilian surveillance in 5 provinces. The drones’ bird-like appearance allows them to evade detection by humans, and even other birds, who reportedly regularly join them in flight. They are also being explored for military applications, and are reportedly able to evade many anti-drone systems, which rely on being able to distinguish drones from birds.
  Why this matters: Drones that are able to evade detection are a powerful surveillance technology that raises ethical questions. Should similar drones be used in civilian applications in the US and Europe, we could expect resistance from privacy advocates.
  Read more: China takes surveillance to new heights with flock of robotic doves (SCMP).

OpenAI Bits & Pieces:

OpenAI Five:
We’ve released an update on progress in our Dota project, which involves training large-scale reinforcement learning systems to beat humans at a challenging, partially observable strategy game.
   Read more: OpenAI Five (OpenAI blog).

Tech Tales:

Partying in the sphere

The Sphere was a collection of around 1,000 tiny planets in an artificial solar system. The Sphere was also the most popular game of all time. It crept into the world at first via high-end desktop PCs. Then its creators figured out how to slim down its gameplay into a satisfying form for mobile phones. That’s when it really took over. Now The Sphere has around 150 million concurrent players at any one time, making it the most popular game on earth by a wide margin.

Several decades after it launched, The Sphere has started to feel almost crowded. Most planets are inhabited. Societal hierarchies have appeared. The era of starting off as a new player with no in-game currency and working your way up is over, and has been for years.

But there’s a new sport in The Sphere: breaking it. One faction of players, numbering in the millions, has begun to construct a large metallic scaffold up from one planet at the corner of the map. Their theory is that they can keep building it until they hit the walls of The Sphere, at which point they’re fairly confident that – barring a very expensive and impractical overhaul of the underlying simulation engine – they will be able to glitch out of the map containing the 1,000 worlds and into somewhere else.

The game company that makes The Sphere became fully automated a decade ago, so players are mostly trying to guess at the potential reactions of the Corporate AI by watching any incidental changes to the game via patches or updates. So far, nothing has happened to suggest the AI wishes to discourage the scaffolding – the physics remains similar, the metals used to make the scaffolds remain plentiful, the weight and behavior of the scaffolds in zero-g space remain (loosely) predictable.

So, people wonder, what lies beyond The Sphere? Is this something the Corporate AI now wants humanity to try and discover? And what might lie there, at the limit of the game engine, reachable via a bugged-out glitch kept deliberately open by one of the largest and most sophisticated AIs on the planet?

All we know is that two years ago fluorescent letters appeared above every one of the 1,000 planets in The Sphere: keep going, they say.

Things that inspired this story: Eve Online, Snowcrash, Procedural Generation.