Import AI 175: Amazon releases AI logistics benchmark; rise of the jellobots; China releases an air traffic control recording dataset

by Jack Clark

Automating aerospace: China releases English & Chinese air traffic control voice dataset:
…~60 hours of audio collected from real-world situations…
Chinese researchers from Sichuan University, the Chinese Civil Aviation Administration, and a startup called Wisesoft have developed a large-scale speech recognition dataset based on conversations between air-traffic control operators and pilots. The dataset – which is available for non-commercial use following registration – is designed to help researchers improve the state of the art on speech recognition in air-traffic control and could help enable further automation and increase safety in air travel infrastructure.

What goes into the ATCSpeech dataset? The researchers assembled a team of 40 people to collect and label real-time ATC speech. They built a large-scale dataset and are releasing a slice of it for free (following registration); this slice contains around 40 hours of Chinese speech and 19 hours of English speech. “This is the first work that aims at creating a real ASR corpus for the ATC application with accented Chinese and English speeches,” the authors write.
   The dataset contains 698 distinct Chinese characters and 584 English words. They also tag the speech with the gender of the speaker, the role they’re inhabiting (pilot or controller), whether the recording is good or bad quality, what phase of flight the plane being discussed is in, and what airport control tower the speech was collected from.
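
   To make the labeling concrete, here’s a rough sketch of how a single utterance and its tags might be represented – the field names and values below are my own illustrative guesses, not the corpus’s actual schema.

```python
# Hypothetical per-utterance record for an ATCSpeech-style corpus.
# Field names are illustrative guesses, not the dataset's real schema.
from dataclasses import dataclass

@dataclass
class ATCUtterance:
    audio_path: str       # path to the recorded clip
    transcript: str       # Chinese characters or English words
    language: str         # "zh" or "en"
    speaker_gender: str   # "male" or "female"
    speaker_role: str     # "pilot" or "controller"
    good_quality: bool    # whether the recording is usable
    flight_phase: str     # e.g. "taxi", "takeoff", "approach"
    control_tower: str    # airport tower the speech was collected from

example = ATCUtterance(
    audio_path="clips/0001.wav",
    transcript="niner three five cleared to land",
    language="en",
    speaker_gender="female",
    speaker_role="controller",
    good_quality=True,
    flight_phase="approach",
    control_tower="ZUUU",
)
```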

Why care about having automatic speech recognition (ASR) in an air-traffic control context? The authors put forward three main reasons: it makes it easy to create automated, real-time responses to verbal queries from human pilots; robotic pilots can work with human air-traffic controllers via ASR combined with a text-to-speech (TTS) system; and the ASR can be used to rapidly analyze historical archives of ATC speech. 

What makes air traffic control (ATC) speech difficult to work with? 

  • Volatile background noise: controllers communicate with several pilots over the same radio frequency, switching back and forth across different audio streams.
  • Variable speech rates: ATC speakers tend to talk very quickly, but can also talk slowly.
  • Multilingual speech: English is the universal language for ATC communication, but domestic pilots speak with controllers in local languages.
  • Code-switching: people use terms that are hard to mis-hear, e.g. saying “niner” instead of “nine”.
  • Mixed vocabulary: some words are used very infrequently, leading to sparsity in the data distribution.

Dataset availability: It’s a little unclear how to access the dataset. I’ve emailed the paper authors and will update this if I hear back.
   Read more: ATCSpeech: a multilingual pilot-controller speech corpus from real Air Traffic Control environment (Arxiv)

####################################################

You + AI + Lego = Become a Lego minifigure!
Next year, Lego fanatics who visit the Legoland New York Resort could get morphed into Lego characters with the help of AI. 

At the theme park, attendees will be able to ride in the “Lego Factory Adventure Ride” which uses ‘HoloTrac’ technology to convert a human into a virtual Lego character. “That includes copying the rider’s hair color, glasses, jewelry, clothing, and even facial expressions, which are detected and Lego-ized in about half a second’s time,” according to Gizmodo. There isn’t much information available about HoloTrac online, but various news articles say it is built on an existing machine learning platform developed by Holovis and uses modern computer vision techniques – therefore, it seems likely this system will be using some of the recent face/body-morphing style transfer tech that has been developed in the broader research community. 

   Why this matters: Leisure is culture, and as someone who went to Legoland as a kid and has a bunch of memories as a consequence, I wonder how culture changes when people have rosy childhood memories of amusement park ‘rides’ that use AI technologies to magically turn people into toy-versions of themselves.
   Read more: Lego Will Use AI and Motion Tracking To Turn Guests Into Minifigures at Its New York Theme Park (Gizmodo)

####################################################

Amazon gets ready for the AI-based traveling salesman:
…ORL benchmark lets you test algorithms against three economically useful logistics tasks…
Amazon, a logistics company masquerading as an e-retailer, cares about scheduling more than you do. It is therefore constantly trying to improve the efficiency with which it schedules and plans things. Can AI help? Amazon’s researchers have developed a set of three logistics-oriented benchmarks against which people can test AI systems. They find that modern, relatively simple machine learning approaches can be on par with handwritten systems. This finding may encourage further investment into applying ML to logistics tasks.

Three hard benchmarks:

  • Bin Packing: This is a fundamental problem which involves fitting things together efficiently, whether placing packages into boxes, or portioning out virtual machines across cloud infrastructure. (Import AI #93: Amazon isn’t the only one exploring this – Alibaba researchers have explored using AI for 3D bin-packing).

  • Newsvendor: “Decide on an ordering decision (how much of an item to purchase from a supplier) to cover a single period of uncertain demand”. This problem is “a good test-bed for RL algorithms given that the observation of rewards is delayed by the lead time and that it can be formulated as a Markov Decision Problem”; a toy sketch of this setup appears after this list. (In the real world, companies typically deal with multiple newsvendor-esque problems at once, further compounding the difficulty.)

  • Vehicle Routing: This is a generalization of the traveling salesman problem; one or more vehicles need to visit nodes in a graph in an optimal order to satisfy consumer demand. The researchers implement a stochastic vehicle routing test, in which one of the problem parameters varies within a probability distribution (e.g., number of locations, trucks, etc.), increasing the difficulty. 
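
   To make the newsvendor formulation concrete, here’s a toy, self-contained environment in the familiar reset/step style – the prices, costs, lead time, and demand distribution are illustrative stand-ins, not the values or code used in the ORL benchmark.

```python
import numpy as np

class ToyNewsvendorEnv:
    """Toy single-item newsvendor problem with a fixed lead time.

    Each step the agent picks an order quantity; demand is sampled, sales are
    made from on-hand stock, and orders arrive `lead_time` steps after they are
    placed -- so the reward for an order is delayed, which is what makes this a
    natural MDP/RL test-bed. All numbers here are illustrative, not the paper's.
    """
    def __init__(self, price=4.0, cost=2.0, holding=0.5, penalty=1.0,
                 lead_time=2, mean_demand=5.0, seed=0):
        self.price, self.cost = price, cost
        self.holding, self.penalty = holding, penalty
        self.lead_time, self.mean_demand = lead_time, mean_demand
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.on_hand = 0.0
        self.pipeline = [0.0] * self.lead_time   # orders placed but not yet delivered
        return self._obs()

    def _obs(self):
        # State: current stock plus everything still in transit.
        return np.array([self.on_hand] + self.pipeline, dtype=np.float32)

    def step(self, order_qty):
        self.on_hand += self.pipeline.pop(0)     # oldest outstanding order arrives
        self.pipeline.append(float(order_qty))   # new order joins the pipeline

        demand = self.rng.poisson(self.mean_demand)
        sales = min(self.on_hand, demand)
        lost = demand - sales
        self.on_hand -= sales

        reward = (self.price * sales - self.cost * order_qty
                  - self.holding * self.on_hand - self.penalty * lost)
        return self._obs(), reward, False, {}

env = ToyNewsvendorEnv()
obs = env.reset()
for _ in range(3):
    obs, reward, done, info = env.step(order_qty=5)
```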

Key finding and why this matters: For each of their benchmarks, the researchers “show that trained policies from out-of-the-box RL algorithms with simple 2 layer neural networks are competitive with or superior to established approaches”. This is interesting – for many years, people have been asking when reinforcement learning approaches based on machine learning will outperform hand-designed approaches on economically useful, real-world tasks, and for many years there haven’t been many compelling examples (see this Twitter thread from me in October 2017 for more context). Discovering that ML-based RL techniques can be equivalent or better (in simulation!) is likely to lead to further experimentation and, hopefully, application.
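
   For a sense of how small “simple 2 layer neural networks” really are, here’s what such a policy looks like in PyTorch; the layer widths and the discrete action head are illustrative choices on my part, not the paper’s exact architecture or hyperparameters.

```python
import torch
import torch.nn as nn

class TwoLayerPolicy(nn.Module):
    """A generic two-hidden-layer policy over discrete order quantities.

    Sizes and activations are illustrative, not the ORL paper's settings.
    """
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),   # logits over possible order quantities
        )

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

# obs_dim=3 matches the toy newsvendor sketch above (on-hand stock + 2 pipeline slots).
policy = TwoLayerPolicy(obs_dim=3, n_actions=21)   # e.g. order 0..20 units
action = policy(torch.zeros(3)).sample()
```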
   Read more: ORL: Reinforcement Learning Benchmarks for Online Stochastic Optimization Problems (Arxiv).
   Get the code for the benchmarks and baselines from here (or-rl-benchmarks, official GitHub).

####################################################

Want some pre-trained language models? Try HuggingFace v2.2:
NLP startup HuggingFace has updated its free, open-source Transformers library to version 2.2, incorporating four new NLP models: ALBERT, CamemBERT, DistilRoBERTa, and GPT-2-XL (1.5bn parameter version). The update includes support for encoder-decoder architectures, along with a new benchmarking section. 
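
   Pulling one of the new checkpoints into your own code is only a few lines via the library’s Auto classes. A minimal sketch, assuming the usual model-hub identifiers (“albert-base-v2”, “camembert-base”, “distilroberta-base”, “gpt2-xl”) – check the release notes for the exact names:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Swap the identifier for any of the newly added checkpoints,
# e.g. "albert-base-v2", "camembert-base", or "gpt2-xl".
tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModel.from_pretrained("distilroberta-base")

input_ids = tokenizer.encode("Import AI covers new NLP models.", return_tensors="pt")
with torch.no_grad():
    outputs = model(input_ids)

hidden_states = outputs[0]   # (batch, sequence_length, hidden_size) contextual embeddings
print(hidden_states.shape)
```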

Why this matters: Libraries like HuggingFace’s dramatically speed up the rate at which fresh-out-of-research models are plugged into real-world, production systems. This helps further mature the technology, which leads to further applications, which leads to more maturation, and so on.
   Read more: HuggingFace v2.2 update (HuggingFace GitHub).

####################################################

Rise of the jellobots!
…Studying sim-2-real transfer via 109 robots built from 2-by-2-by-2 stacks of air-filled silicone cubes…
Can we design robots entirely in simulation, then manufacture them in the real world? That’s the idea behind research from the University of Vermont, Yale University, and Tufts University, which explores the limitations of sim2real transfer by designing simple, soft robots in simulation and seeing how well the designs work in reality. And when I say soft robots, I mean soft! These tiny bots are 1.5cm-wide cubes made of silicone, some of which can be pumped with air to allow them to deform. Each “robot” is made out of a 2-by-2-by-2 stack of these cubes, with a design algorithm determining the properties of each individual cube. This sounds simple, but the results are surprisingly complex. 

An exhaustive sim-2-real (jello) study: For this study, the researchers come up with every possible permutation of soft robot within their design space. “At each x,y,z coordinate, voxels could either be passive, volumetrically actuated, or absent, yielding a total of 3^8 = 6561 different configurations”, they write. They then search over these morphologies for designs that can locomote effectively, then make 109 distinct real-world prototypes, nine of which are actuated so their movement can be tested. 
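
   The design space is small enough to enumerate directly. Here’s an illustrative enumeration of the 3^8 configurations described above – this just shows the combinatorics, not the researchers’ actual search or simulation code.

```python
from itertools import product

VOXEL_STATES = ("passive", "actuated", "absent")

# One state per voxel in the 2-by-2-by-2 stack.
designs = list(product(VOXEL_STATES, repeat=8))
print(len(designs))   # 3 ** 8 = 6561

# Designs with at least one actuated voxel -- a stand-in for whatever
# filters the real design/search pipeline applies before simulation.
candidates = [d for d in designs if "actuated" in d]
print(len(candidates))
```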

What do we learn about simulation and reality? First, the researchers learn that simulators are hard – even modern ones. “We could not find friction settings in which the simulated movement direction matched the ground truth across all designs simultaneously,” they write. “This suggests that the accuracy of Coulomb friction model may be insufficient to model this type of movement.” However, many of their designs did transfer successfully from simulator to reality – in the sense that they functioned – but sometimes they had different behaviors, like one robot that “pushes off its active limb” in simulation “whereas in reality the design uses its limb to pull itself forward, in the opposite direction”. Some of these behaviors may come down to difficulties with modeling shear and other forces in the simulation.

Why this matters: Cheap, small robots are in their Wright Brothers era, with a few prototypes like the jello-esque ones described here making their first, slow steps into the world. We should pay attention, because due to their inherent simplicity, soft robots may get deployed more rapidly than complex ones.
    Read more: Scalable sim-to-real transfer of soft robot designs (Arxiv).
    Get code assets here (sim2real4designs GitHub).

####################################################

Chips get political: RISC-V foundation moves from Delaware to Switzerland due to US-China tensions:
…Modern diplomacy? More like modern CHIPlomacy!…
Chips are getting political. In the past year, the US and China have begun escalating a trade war with each other, which has already led to tariffs and controls applied to certain technologies. Now, a US-based nonprofit chip foundation is so worried by the rising tensions that it has moved to Switzerland. The RISC-V Foundation supports the development of a modern, open RISC-based chip architecture. RISC-V chips are destined for everything from smartphones to data center servers (though since chips take a long time to mature, we’re probably several years away from significant applications). The RISC-V Foundation’s membership includes companies like the US’s Google as well as China’s Alibaba and Huawei. Now, the foundation is moving to Switzerland. “From around the world, we’ve heard that ‘if the incorporation was not in the U.S., we would be a lot more comfortable,’” the foundation’s CEO, Calista Redmond, told Reuters. Various US politicians expressed concern about the move to Reuters.

Why this matters: Chips are one of the most complicated things that human civilization is capable of creating. Now, it seems these sophisticated things are becoming the casualties of rising geopolitical tensions between the US and China.
   Read more: U.S.-based chip-tech group moving to Switzerland over trade curb fears (Reuters).

####################################################

Software + AI + Surveillance = China’s IJOP:
…How China uses software to help it identify Xinjiang residents for detention…
China is using a complex software system called the Integrated Joint Operations Platform (IJOP) to help it identify and track citizens in Xinjiang for detention by the state, according to leaked documents analyzed by the International Consortium of Investigative Journalists.

Inside the Integrated Joint Operations Platform (IJOP): IJOP collects information on citizens “then uses artificial intelligence to formulate lengthy lists of so-called suspicious persons based on this data”. IJOP is a machine learning system “that substitutes artificial intelligence for human judgement”, according to the ICIJ. The IJOP system is linked to surveillance cameras, street checkpoints, informants, and more. The system also tries to predict people that the state should consider detaining, then provides those predictions to officials: “the program collects and interprets data without regard to privacy, and flags ordinary people for investigation based on seemingly innocuous criteria”, the ICIJ writes. In one week in 2018, IJOP produced 24,412 names of people to be investigated. “IJOP’s purpose extends far beyond identifying candidates for detention. Its purpose is to screen an entire population for behavior and beliefs that the government views with suspicion,” the ICIJ writes. 

Why this matters: In the 1970s, Chile tried to create a computationally-run society via Project Cybersyn. The initiative failed due to the relative immaturity of the computational techniques and political changes. In the late 1970s and 1980s, the Stasi in East Germany started trying to use increasingly advanced technology to create a sophisticated surveillance dragnet which it applied to people living there. Now, advances in computers, digitization, and technologies like AI have made electronic management and surveillance of a society cheaper and easier than ever before. Therefore, states like China are compelled to use more and more of the technology in service of strengthening the state. Systems like IJOP and its use in Xinjiang are a harbinger of things to come – the difference between now and the past is that these systems might actually work… with chilling consequences.
   Read more: Exposed: China’s Operating Manuals for Mass Internment and Arrest by Algorithm (International Consortium of Investigative Journalists)

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

Germany’s AI regulation plans:
In October, Germany’s Data Ethics Commission released a major report on AI regulation. Most notably, they propose that AI applications should be categorised by the likelihood that they will cause harm, and the severity of that harm. Regulation should be proportional to this risk: ‘level 5’ applications (the most risky) should be subject to a complete or partial ban; levels 3–4 should be subject to stringent transparency and oversight obligations.

   Wider implications: The Commission proposes that these measures be implemented as EU-wide ‘horizontal regulation’. The report is likely to influence future European legislation, which is expected to emerge over the next year. Whether it is a ‘blueprint’ for this legislation, as has been reported, remains to be seen.

   Why it matters: These plans are unlikely to be well-received by the AI policy community, which has generally cautioned against premature and overly stringent regulation. The independent advisory group to the European Commission on AI cautioned against “unnecessarily prescriptive regulation”, pointing out that in domains of fast technological progress, a ‘principles-based’ approach was generally preferable. If, as looks likely, Europe is an early mover in AI regulation, their successes and failures might inform how the rest of the world tackles this problem in the coming years.
   Read more: Opinion of the Data Ethics Commission.
   Read more: AI: Decoded – A German blueprint for AI rules (Politico).

AI Safety Unconference at NeurIPS:
For the second year running, there is an AI Safety Unconference at NeurIPS, on Monday December 9th. There are only a few spaces left, so register soon.
   Read more: AI Safety Unconference 2019.

####################################################

Tech Tales

Fetch, robot!

The dog and the robot traveled along the highway, weaving a path between rusting cars and sometimes making small jumps over cracks in the tarmac. They’d sit in the cars at night, with the dog sleeping on whatever softness it could find and the robot sitting in a state of low power consumption. On sunny days the robot charged up its batteries with a solar panel that unfolded from its back like the wings of a butterfly. One of its wings had a missing piece in its scaffold which meant one of the panels dangled at an angle, rarely getting full sun. The dog would forage along the highway and sometimes bring back batteries it found for the robot – they rarely worked, but when they did the robot would – very delicately – place them inside itself and say, variously, “power capacity increased” or “defective component replaced”, and the dog would wag its tail. 

Sometimes they’d go past human bones and the robot would stop and take a photo. “Attempting to identify deceased,” it would verbalize. “Identification failed,” it would always say. Sometimes, the dog would grab a bone off of a skeleton and walk alongside the robot. For many years, the dog had tried to get the robot to throw a bone for it, but the robot had never learned how as it had not been built to be particularly attuned to learning from dogs. Sometimes the robot would pick up bones and the dog would get excited, but the robot only did this when the bones were in its way, and it only moved them far enough to clear a path for itself. 

Sometimes the robot would get confused: it’d stop in front of a puddle of oil and say “route lost”, or pause and appear to stare into the woods, sometimes saying “unknown entity detected”. The dog learned that it could get the robot to keep moving by standing in front of its camera, which would make the robot say “obstruction. Repositioning…” and then it’d move. On rare occasions it’d still be confused and would stay there, sometimes rotating its camera stalk. Eventually, the dog learned that it could headbutt the robot and use its own body to move it forward, and if it did this long enough the robot would say “route resolved” and keep trundling down the road. 

A few months later, they rolled into a city where they met the big robots. The robot was guided in by a homing beacon and the dog followed the robot, untroubled by the big robots, or the drones that started to track them, or the cages full of bones.
   HOW STRANGE, said one of the big robots to its other big robot friend, TO SEE THE ORGANIC MAKE A PET OF THE MACHINE.
   YES, said the other big robot. OUR EXPERIENCE WAS THE INVERSE.

Things that inspired this story: The limitations of generalization; human feedback versus animal feedback mechanisms; the generosity and inherent kindness of most domesticated animals; Cormac McCarthy’s “The Road”; Kurt Vonnegut; starlight on long roads in winter; the sound of a loved one breathing in the temporary dark.