Import AI 259: Race + Medical Imagery; Baidu takes SuperGLUE crown; AlphaFold and the secrets of life

Uh oh – ML systems can make race-based classifications that humans can’t understand:
…Medical imagery analysis yields troubling findings for people who want to deploy AI in medical settings…
One of the reasons why artificial intelligence systems are challenging from a policy perspective is that they tend to learn to discriminate between things using features that may not be legal to use for discrimination – for example, image recognition systems will frequently differentiate between people on the basis of protected categories (race, gender, etc.). Now, a group of researchers from around the world has found that machine learning systems can learn to discriminate between different races using features in medical images that aren't intelligible to human doctors.
  Big trouble in big medical data: This is a huge potential issue. As the authors write: “our findings that AI can trivially predict self-reported race — even from corrupted, cropped, and noised medical images — in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to.”

What they found:
“Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities.” They tested a range of models on datasets including chest x-rays, breast mammograms, CT scans (computed tomography), and more, and found that models were able to tell different races apart even under degraded image settings. Perhaps the most challenging finding is that “models trained on high-pass filtered images maintained performance well beyond the point that the degraded images contained no recognisable structures; to the human co-authors and radiologists it was not even clear that the image was an x-ray at all,” they write. In other words – these ML models are making decisions about racial classification (and doing it accurately) using features that humans can’t even observe, let alone analyze.
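To make the degradation concrete: a high-pass filter removes the smooth, low-frequency structure of an image (the anatomy a radiologist reads) and keeps only fine texture. Here's a minimal NumPy sketch of that kind of filter – the cutoff value and toy image are illustrative assumptions, not the paper's actual preprocessing pipeline:

```python
import numpy as np

def high_pass_filter(image: np.ndarray, cutoff: int) -> np.ndarray:
    """Zero out frequency components within `cutoff` pixels of the
    spectrum's center, keeping only fine, high-frequency detail."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    cy, cx = rows // 2, cols // 2
    y, x = np.ogrid[:rows, :cols]
    keep = (y - cy) ** 2 + (x - cx) ** 2 > cutoff ** 2  # True = high freq
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * keep)))

# A toy "scan": a smooth gradient (low frequency, the visible structure)
# plus faint noise-like texture (high frequency).
rng = np.random.default_rng(0)
img = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
img += 0.05 * rng.standard_normal((64, 64))
out = high_pass_filter(img, cutoff=8)
# `out` has lost the smooth gradient a human would see; the paper's
# finding is that classifiers still extract signal from what's left.
```

The striking result is that after this kind of operation the image looks like static to a human, yet trained models still recover the self-reported race label.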

Why this matters:
We’re entering a world where an increasing proportion of the ‘thinking’ taking place in it is occurring via ML systems trained via gradient descent which ‘think’ in ways that we as humans have trouble understanding (or even being aware of). To deploy AI widely into society, we’ll need to be able to make sense of these alien intelligences.
Read more:
Reading Race: AI Recognises Patient’s Racial Identity In Medical Images (arXiv)

####################################################

Google spins out an industrial robot company:
…Intrinsic: industrial robots that use contemporary AI systems…
Google has spun a new company, Intrinsic, out of X, Alphabet's R&D arm. Intrinsic will focus on industrial robots that are easier to customize for specific tasks than those we have today. “Working in collaboration with teams across Alphabet, and with our partners in real-world manufacturing settings, we’ve been testing software that uses techniques like automated perception, deep learning, reinforcement learning, motion planning, simulation, and force control,” the company writes in its launch announcement.

Why this matters:
This is not a robot design company – all the images on the announcement use mainstream industrial robotic arms from companies such as Kuka. Rather, Intrinsic is a bet that the recent developments in AI are mature enough to be transferred into the demanding, highly optimized context of the real world. If there’s value here, it could be a big deal – 355,000 industrial robots were shipped worldwide in 2019 according to the International Federation of Robotics, and there are more than 2.7 million robots deployed globally right now. Imagine if just 10% of these robots became really smart in the next few years?
  Read more:
Introducing Intrinsic (Google X blog).

####################################################

DeepMind publishes its predictions about the secrets of life:
…AlphaFold goes online…
DeepMind has published AlphaFold DB, a database of “protein structure predictions for the human proteome and 20 other key organisms to accelerate scientific research.” AlphaFold is DeepMind’s system that has essentially cracked the protein folding problem (Import AI 226) – a grand challenge in science. This is a really big deal that has been widely covered elsewhere. It is also very inspiring – as I told the New York Times, this announcement (and the prior work) “shows that A.I. can do useful things amid the complexity of the real world”. In a couple of years, I expect we’ll see AlphaFold predictions turn up as the input priors for a range of tangible advances in the sciences.
  Read more: AlphaFold Protein Structure Database (DeepMind).

####################################################

Baidu sets new natural language understanding SOTA with ERNIE 3.0:
…Maybe Symbolic AI is useful for something after all?…
Baidu’s “ERNIE 3.0” system has topped the leaderboard of the natural language understanding benchmark SuperGLUE, suggesting that by combining symbolic and learned elements, AI developers can create something more than the sum of its parts.

What ERNIE is: ERNIE 3.0 is the third generation of the ERNIE model. ERNIE models combine large-scale pre-training (e.g., similar to what BERT or GPT-3 do) with learning from a structured knowledge graph of data. In this way, ERNIE models combine the contemporary ‘gotta learn it all’ paradigm with a more vintage symbolic-representation approach.
  The first version of ERNIE was built by Tsinghua and Huawei in early 2019 (Import AI 148), then Baidu followed up with ERNIE 2.0 a few months later (Import AI 158), and now they’ve followed up again with 3.0.

What’s ERNIE 3.0 good for?
ERNIE 3.0 is trained on “a large-scale, wide-variety and high-quality Chinese text corpora amounting to 4TB storage size in 11 different categories”, according to the authors, including a Baidu knowledge graph that contains “50 million facts”. In tests, ERNIE 3.0 does well on a broad set of language understanding and generation tasks. Most notably, it sets a new state-of-the-art on SuperGLUE, displacing Google’s hybrid T5-Meena system. SuperGLUE is a suite of NLU tests which is widely followed by researchers and can be thought of as somewhat analogous to an ImageNet for text – so good performance on SuperGLUE tends to mean the system will do useful things in reality.

Why this matters:
ERNIE is interesting partially because of its fusion of symbolic and learned components, as well as being a sign of the further maturation of the ecosystem of natural language understanding and generation in China. A few years ago, Chinese researchers were seen as fast followers on various AI innovations, but ERNIE is one of a few models developed primarily by Chinese actors and now setting a meaningful SOTA on a benchmark developed elsewhere. We should take note.
  Read more:
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation (arXiv).

####################################################

Things that appear as toys will irrevocably change culture, exhibit 980: Toonify’d Big Lebowski
Here’s a fun YouTube video where someone runs a scene from The Big Lebowski through Snapchat’s ‘Snap Camera’ to warp the faces of Jeff Bridges and co. from humans into cartoons. It’s a fun video that looks really good (apart from when the characters turn their heads at angles not well captured by the Toon setting’s data distribution). But, like most fun toys, it has a pretty significant potential for impact: we’re creating a version of culture where any given artifact can be re-edited and re-made into a different aesthetic form thanks to some of the recent innovations of deep learning.
  Check it out here: Nathan Shipley, Twitter.

####################################################

Want smart robots? See if you can beat the ‘BEHAVIOR’ challenge:
…1 agent versus 100 tasks…
Stanford researchers have created the ‘BEHAVIOR’ challenge and dataset, which “tests the ability to perceive the environment, plan, and execute complex long-horizon activities that involve multiple objects, rooms, and state transitions, all with the reproducibility, safety and observability offered by a realistic physics simulation.”

What is BEHAVIOR:
BEHAVIOR is a challenge where simulated agents need to “navigate and manipulate the simulated environment with the goal of accomplishing 100 household activities”. The agents can be embodied either as humanoid avatars with two hands, a head, and a torso, or as a commercially available ‘Fetch’ robot.

Those 100 activities in full:
Bottling fruit! Cleaning carpets! Packing lunches! And so much more! Read the full list here. “A solution is evaluated in all 100 activities,” the researchers write, “in three different types of instances: a) similar to training (only changing location of task relevant objects), b) with different object instances but in the same scenes as in training, and c) in new scenes not seen during training.”

Why this matters:
Though contemporary AI methods can work well on problems that can be accurately simulated (e.g., computer games, boardgames, writing digitized text, programming), they frequently struggle when dealing with the immense variety of reality. Challenges like BEHAVIOR will give us some signal on how well (simulated) embodied agents can do at these tasks.
  Read more:
BEHAVIOR Challenge @ ICCV 2021 (Stanford Vision and Learning Lab).

####################################################

Tech Tales:

Abstract Messages
[A prison somewhere in America, 2023]

There was a guy in here for a while who was locked down pretty tight, but could still get mail. They’d read everything and so everyone knew not to write him anything too crazy. He’d get pictures in the mail as well – abstract art, which he’d put up in his cellblock, or give to other cellmates via the in-prison gift system.

At nights, sometimes you’d see a cell temporarily illuminated by the blue light of a phone; there would be a flicker of light and then it would disappear, muted most likely by a blanket or something else.

Eventually someone got killed. No one inside was quite sure why, but we figured it was because of something they’d done outside. The prison administrator took away a lot of our privileges for a while – no TV, no library, less outside time, bad chow. You know, a few papercuts that got re-cut every day.

Then another person got killed. Like before, no one was quite sure why. But – like the time before – someone else had killed them. All our privileges got taken away for a while, again. And this time they went further – turned everyone’s rooms over.
  “Real pretty stuff,” one of the guards said, looking at some of the abstract art in someone’s room. “Where’d you get it?”
  “Got it from the post guy.”
  “Real cute,” said the guard, then took the picture off the wall and tested the cardboard with his hands, then ripped it in half. “Whoops,” said the guard, and walked out.

Then they found the same sorts of pictures in a bunch of the other cells, and they saw the larger collection in the room of the guy who was locked down. That’s what made them decide to confiscate all the pictures.
  “Regular bunch of artist freaks aren’t you,” one of the guards said, walking past us as we were standing at attention outside our turned-over cells.

A few weeks later, the guy who was locked down got moved out of the prison to another facility. We heard some rumors – high-security, and he was being moved because someone connected him to the killings. How’d they do that? We wondered. A few weeks later someone got the truth out of a guard: they’d found loads of smuggled-in phones when they turned over the rooms, which they expected, but all the phones had a made-for-kids “smart camera” app that could tell you things about what you pointed your phone at. It turned out the app was a front – it was made by some team in the Philippines with some money from somewhere else, and when you turned the app on and pointed it at one of the paintings, it’d spit out labels like “your target is in Cell F:7”, or “they’re doing a sweep tomorrow night”, or “make sure you talk to the new guy with the face tattoo”.

So that’s why when we get mail, they just let us get letters now – no pictures.

Things that inspired this story:
Adversarial images; steganography; how people optimize around constraints; consumerized/DIY AI systems; AI security.