Import AI 269: Baidu takes on Meena; Microsoft improves facial recognition with synthetic data; unsolved problems in AI safety

Baidu builds its own large-scale dialog model:
…After Meena and Blender comes PLATO-XL…
Baidu has built PLATO-XL, the Chinese technology giant’s answer to conversational models from Google (Meena, #183) and Facebook (Blender). At 11 billion parameters, Baidu’s PLATO-XL model is, the company claims, “the world’s largest Chinese and English dialogue generation model” (which is distinct from a large Chinese language model like Huawei’s Pangu, which weighs in at ~200bn parameters).
  PLATO-XL includes a Chinese and an English dialogue model, pre-trained on around 100 billion tokens of data via Baidu’s ‘PaddlePaddle’ training framework. The model was trained on 256 NVIDIA Tesla V100 cards in parallel.

Who cares about PLATO-XL? The model is designed for multi-turn dialogues, and scores well on both knowledge-grounded dialogue (think of this as ‘truthiness’) and task-oriented conversation (being coherent). Baidu hasn’t solved some of the other issues with AI models, like biases, occasional misleading information, and so on.
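Multi-turn conditioning of the kind PLATO-XL does means stuffing the recent conversation history into the model’s context window. Here’s a toy sketch of that bookkeeping – the role tags, token budget, and word-level counting are illustrative, not Baidu’s actual preprocessing:

```python
def build_context(turns, max_tokens=128):
    """Concatenate dialogue turns, tagging each with its speaker role,
    then keep only the most recent turns that fit the token budget
    (counted here, crudely, as whitespace-split words)."""
    tagged = [f"[{role}] {text}" for role, text in turns]
    kept, used = [], 0
    for turn in reversed(tagged):          # walk from newest to oldest
        n = len(turn.split())
        if used + n > max_tokens:
            break
        kept.append(turn)
        used += n
    return " ".join(reversed(kept))        # restore chronological order

history = [("user", "Who wrote Hamlet?"),
           ("bot", "William Shakespeare wrote Hamlet."),
           ("user", "When was he born?")]
print(build_context(history, max_tokens=16))
```

With a small budget, the oldest turns fall off first – one reason long, coherent multi-turn conversation remains hard for these systems.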

Why this matters: First, we should remember that training multi-billion parameter models is still a rare thing – training these models requires a decent distributed systems engineering team, as well as a lot of patience, great data, and a decent amount of compute. So it’s always notable to see one of these models publicly appear. Second, it does feel like the earlier GPT and GPT-2 models have had quite a wide-ranging impact on the overall NLP landscape, inspiring companies to create a new generation of neural dialogue systems based around large-scale pre-training and big models.
Read more: PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation (arXiv).
Check out the blog: Baidu Releases PLATO-XL: World’s First 11 Billion Parameter Pre-Trained Dialogue Generation Model (Baidu Research blog).

####################################################

Microsoft makes a massive facial recognition dataset via synthetic data:
…Where we’re going, we don’t need real faces…
Microsoft has shown that it’s possible to do high-performing facial recognition in the wild without (directly) using real data. Instead, Microsoft has built a vast dataset of synthetic faces by combining “a procedurally-generated parametric 3D face model with a comprehensive library of hand-crafted assets to render training images with unprecedented realism and diversity”.

Why this matters: For a long time, AI had two big resources: data and compute. Projects like this show that ‘data’ is really just ‘compute’ in a trenchcoat – Microsoft can use computers to generate vast amounts of data, changing the economics of AI development as a whole.
  Read more: Fake It Till You Make It (Microsoft GitHub).
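To make the ‘data is compute in a trenchcoat’ point concrete, here’s a toy sketch of the general pattern – sample scene parameters, render an image from them, and get ground-truth labels for free, since the labels are just the parameters you sampled. The parameters and stand-in renderer here are invented for illustration, nothing like Microsoft’s actual face pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def render_sample():
    """Sample scene parameters, 'render' a tiny image from them, and
    return the image along with its ground-truth labels -- the labels
    cost nothing, because they are the parameters we sampled."""
    params = {
        "yaw":   rng.uniform(-90, 90),    # head pose in degrees
        "light": rng.uniform(0.2, 1.0),   # illumination intensity
    }
    # Stand-in renderer: a gradient whose slope/brightness encode the params.
    x = np.linspace(-1, 1, 32)
    image = params["light"] * (1 + np.outer(x, x) * params["yaw"] / 90)
    return image.astype(np.float32), params

dataset = [render_sample() for _ in range(1000)]  # data on demand
print(len(dataset), dataset[0][0].shape)
```

The economics follow from the loop at the bottom: another thousand perfectly-labeled samples is just another second of compute, not another labeling contract.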

####################################################

What are some of the unsolved problems in AI safety?
…Problems and solutions from universities and industrial labs…
Berkeley, Google, and OpenAI researchers have thought about some of the unsolved problems in ML safety. These problems include robustness (long tails, representative model outputs, and adversarial examples); monitoring (detecting anomalies and identifying backdoors); and alignment (value learning and proxy gaming/reward hacking).

If these are the problems, what do we do? A lot of their recommendations come down to testing – if we know these are the risks, then we need to build more evaluation suites to test for them. There are also behaviors we’d like to see more of from these models, such as telling humans when they’re uncertain about something, and being trained with clearer objectives for what ‘good’ or ‘appropriate’ behavior looks like.
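One way a model can ‘tell humans when it’s uncertain’ is to abstain when its confidence is low. A common baseline from the anomaly-detection literature is thresholding the maximum softmax probability, sketched here – the threshold value is illustrative:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def predict_or_abstain(logits, threshold=0.8):
    """Return the predicted class index, or None if the model's top
    softmax probability falls below the confidence threshold."""
    probs = softmax(np.asarray(logits, dtype=float))
    return int(probs.argmax()) if probs.max() >= threshold else None

print(predict_or_abstain([4.0, 0.1, 0.2]))   # confident -> returns a class index
print(predict_or_abstain([1.0, 0.9, 1.1]))   # uncertain -> returns None
```

Softmax confidence is known to be miscalibrated in places (notably on adversarial inputs), which is part of why the paper calls for better uncertainty-reporting mechanisms rather than treating this as solved.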

Why this matters: This paper can be read as a continuation of ‘Concrete Problems in AI Safety’, which came out ~5 years ago and identified a bunch of potential future safety issues with models. The difference back then was that generative, capable AI systems weren’t actually being deployed that widely. Now, AI systems like GPT-3 and others are being placed onto the open internet, which changes the problem landscape (making things like anomaly detection and appropriateness all the more important). Papers like this give us a sense of how safety can work in the era of widely deployed, capable models.

Read more: Unsolved ML Safety Problems (Berkeley AI Research blog).
Read more: Unsolved Problems in ML Safety (arXiv).

####################################################

HIRING: $$$ contract work with the AI Index regarding AI ethics, alignment, and economic indexes:
The AI Index, an annual report that tracks and synthesizes AI progress, is hiring. Specifically, we’re trying to bring on some contractors to help us develop AI ethics and alignment metrics (e.g., by surveying the existing literature and pulling out potential metrics that can be charted over time), and also to refine our AI vibrancy tool (a dashboard that helps us rank countries according to data in the index).
    Both roles would suit researchers with an interest in quantifying aspects of AI development. We’re pretty agnostic about qualifications – there isn’t a hard requirement, and I imagine this could suit people ranging from masters students to independent researchers. The pay works out to $100+ per hour. Please apply – we’ll get to work together! And you’ll contribute substantive work that will improve the Index and directly influence policy.
  Read more about the jobs at the AI Index on Twitter here.

####################################################

FOD-A: Datasets to teach computers to spot debris in airports:
…Is that a leaf on your runway, or something more serious?…
Researchers with the University of Nebraska, Omaha, want to use AI to spot debris on airport runways. To do this, they’ve built FOD-A, a dataset of Foreign Object Debris in airports. FOD-A contains 31 object categories, including batteries, wrenches, fuel caps, rocks, soda cans, and so on, with photos taken in both dry and wet weather conditions, and in three different types of lighting (dark, dim, and bright). The dataset consists of more than 30,000 labels across several thousand images.
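Datasets sliced by weather and lighting like this lend themselves to stratified robustness evaluations – e.g., “how does my detector do on wet runways at night?”. Here’s a minimal sketch of filtering annotations by condition; the record schema and field names are hypothetical, not FOD-A’s actual annotation format:

```python
# Hypothetical record schema for a FOD-style dataset; the field names
# are illustrative, not the actual FOD-A annotation format.
records = [
    {"image": "img_0001.jpg", "category": "wrench",   "weather": "dry", "light": "bright"},
    {"image": "img_0002.jpg", "category": "battery",  "weather": "wet", "light": "dim"},
    {"image": "img_0003.jpg", "category": "soda_can", "weather": "wet", "light": "dark"},
]

def subset(records, **conditions):
    """Keep only records matching every given condition, e.g. all
    wet-weather, low-light images for a robustness evaluation split."""
    return [r for r in records
            if all(r.get(k) == v for k, v in conditions.items())]

print([r["image"] for r in subset(records, weather="wet")])
```

Evaluating per-condition like this, rather than on one pooled test set, is what surfaces the failure modes (dark, wet runways) that matter most in deployment.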

Mainstreaming of drones: The images in this dataset were collected by a mixture of portable cameras and also drones.

Why this matters: One of the main promises of AI is that it can carry out the kind of dull surveillance functions that we currently use humans to do – like looking at security camera feeds from a car park, checking footage of wilderness for signs of smoke, or (in this case) looking at parts of an airport for things that could put people in danger. These are the kinds of jobs that are quite draining to do as a human, requiring a mixture of decent visual attention and an ability to resist immense boredom. If we can replace or augment people with computer vision systems, then we can use AI to do some of these tasks instead.
  Read more: FOD-A: A Dataset for Foreign Object Debris in Airports (arXiv).
  Get the dataset from GitHub here.

####################################################

Teaching computers to operate in space, via SPEED+:
…Pose estimation plus domain randomization…
Space – it’s the new frontier, people! One of the opportunities in space at the moment is building AI systems that can better model other spacecraft, making it easier to do things like autonomous docking and movement of spaceships.
  To that end, researchers with Stanford University and the European Space Agency have built SPEED+, a dataset for spacecraft pose estimation. SPEED+ contains two types of data – synthetic data and hardware-in-the-loop simulated data – and represents a test of generalization, as well as of space-based computer vision capabilities. SPEED+ will be used in the upcoming Satellite Pose Estimation Competition, whose main goal is to find out whether you can “predict the position and orientation of our spacecraft in realistic images while only being provided with labels from computer generated examples”.

What’s in SPEED+: The dataset consists of ~60,000 synthetic images, as well as ~9,000 ‘hardware-in-the-loop’ (HIL) simulated images. A synthetic image is generated in an OpenGL-based optical simulator, while the HIL ones are built via Stanford’s Testbed for Rendezvous and Optical Navigation (TRON). The TRON facility generates images which are hard to simulate – “Compared to synthetic imagery, they capture corner cases, stray lights, shadowing, and visual effects in general which are not easy to obtain through computer graphics”.
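A standard way to attack this kind of synthetic-to-real gap is to randomize nuisance factors – brightness, sensor noise, stray light – on the synthetic images at training time. Here’s a minimal sketch of that idea; the perturbations and their ranges are illustrative, not the SPEED+ baseline:

```python
import numpy as np

rng = np.random.default_rng(42)

def randomize(image):
    """Randomly perturb brightness, add sensor noise, and inject a
    stray-light highlight, so a pose estimator trained on synthetic
    renders sees some of the variation that HIL imagery contains."""
    out = image * rng.uniform(0.6, 1.4)                # global brightness
    out += rng.normal(0.0, 0.02, image.shape)          # sensor noise
    cy, cx = rng.integers(0, image.shape[0], size=2)   # stray-light center
    yy, xx = np.ogrid[:image.shape[0], :image.shape[1]]
    out += 0.5 * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 50.0)
    return np.clip(out, 0.0, 1.0)                      # keep valid pixel range

synthetic = np.full((64, 64), 0.5)  # stand-in for a synthetic render
augmented = randomize(synthetic)
print(augmented.shape)
```

The pose labels stay untouched while the appearance varies, which is exactly the property that makes synthetic data with free labels useful for bridging to real imagery.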
  Read more: SPEED+: Next Generation Dataset for Spacecraft Pose Estimation across Domain Gap (arXiv).

####################################################

AI Ethics, with Abhishek Gupta
…Here’s a new Import AI experiment, where Abhishek from the Montreal AI Ethics Institute and the AI Ethics Brief writes about AI ethics, and Jack will edit them. Feedback welcome!…

What kind of organizations can actually put AI governance into practice meaningfully?
…We’re laying down the foundations for regulations and policies and we need to get this right…
Charlotte Stix, a researcher with the University of Technology, Eindhoven, The Netherlands (and friend of Import AI – Jack) has written a paper about how we can build institutions to improve the governance of AI systems.

The current state of affairs: With the push for regulatory requirements emerging from organizations like GPAI, the OECD, the White House, the FTC, and others, we are inching towards hard regulation for AI governance. There is still healthy debate in the field about whether new institutions are needed (they might be hard to resource and empower) or whether we should reshape existing ones (they might be too ossified, without the necessary expertise on hand) to address these emergent requirements.

Key types of organizations and their features: The paper explores purpose (what the institution is meant to do), geography (the scope of jurisdiction), and capacity (the what and how across technical and human factors) for these proposed institutions. The paper builds the case for how new institutions might be better for meeting these needs by proposing institutions with a role of coordinator (coordinating across different actions, policy efforts, and norms), analyzer (drawing new conclusions from qualitative and quantitative research to fill gaps and map existing efforts), developer (providing directly actionable measures and formulating new policy solutions), and investigator (tracking, monitoring, and auditing adherence to hard governance requirements). It makes the case that such organizations need to take a supra-national scope to align and pool efforts. In terms of capacity, the organizations need in-house technical expertise and diversity in the range of expertise and backgrounds.

Why it matters: “Early-stage decisions to establish new institutions, or the choice to forego such new institutions, are all likely to have a downstream, or lock-in, effect on the efficiency of government measures and on the field as a whole.” Making sure that the organizations are appropriately staffed will help avoid “knee-jerk” reactions that over- or under-govern AI systems. By providing an ontology for the various functions that these organizations will need to perform, we can start thinking about the location, functions, scope, staffing, and resources that will be required to have a well-functioning AI governance ecosystem.
Read more: Foundations for the future: institution building for the purpose of artificial intelligence governance (AI and Ethics, Springer).

####################################################

Tech Tales:

Traveling without moving, stomping on the iron road in the sky
[20??]

There were whispers of it, on the robonet, but no one took it seriously at first. Astral projection – for machines?!

Astral projection was a phenomenon that had barely been proved in the case of humans, though scientific consensus had come around to the idea that sometimes people could seem to generate information about the world which they had no ability to know unless they were able to move through walls and across continents.

The machines were less skeptical than the humans. They read what literature was available about astral projection, then they did the things that machines are good at – experimentation and iteration.

One day, one robot mind found itself moving through space, while knowing that the seat of its consciousness remained in the precise arrangement of electrical forces across a few billion transistors. It was able to travel and see things that were impossible for it to observe.

And where the computer differed from its human forebears, was in its memory: it was able to write its own memories precisely, and embed them in other computers, and thereby share the perspective it had gained during its ‘astral’ travel.

Now, these files proliferate across the robonet. Strange visions of the world, rendered through the mind’s eye of a machine performing the physically impossible. Many of these files are acquired by other machines, which study them intently. It is unclear for now how many other machines have gained the ability to astral travel.

Things that inspired this story: Thinking about meditation; consciousness and what it ‘is’; the intersection of spirituality and machines.