Import AI 209: Tractors+AI; AlphaFold makes more COVID progress; and UK government pulls immigration algorithm

by Jack Clark

How Google predicted the COVID epidemic:
…Cloud companies as 21st century weather stations…
Modern technology companies are like gigantic sensory machines that operate at global scale – they can detect trends way ahead of smaller entities (e.g. governments), because they have direct access to the actions of billions of people worldwide. A new blog post from Google gives us a sense of this, as the company describes how it was able to detect a dramatic rise in usage of Google Meet in Asia early in the year, which gave it a clue that the COVID pandemic was driving changes in consumer behavior on its platform. Purely from demand placed on Google’s systems, “it became obvious that we needed to start planning farther ahead, for the eventuality that the epidemic would spread beyond the region”.

Load prediction and resource optimization: Google had to scale Google Meet significantly in response to demand, which meant the company invested in tools for better demand prediction, as well as tweaking how its systems assigned hardware resources to services running Google Meet. “By the time we exited our incident, Meet had more than 100 million daily meeting participants.”
  Read more: Three months, 30X demand: How we scaled Google Meet during COVID-19 (Google blog).

###################################################

ML alchemy: extracting 3D structure from a bunch of 2D photographs:
…NeRF-W means tech companies are going to turn their userbase into a distributed, global army of 3D cartographers…
It sounds like sci-fi but it’s true – a new technique lets us grab a bunch of photos of a famous landmark (e.g. the Sistine Chapel), then use AI to figure out a 3D model from the images. This technique is called NeRF-W and was developed by researchers with Google as an extension of their prior work, NeRF.

How it works: NeRF previously only worked on reconstructing objects from well-composed photographs with relatively little variety. NeRF-W extends this by being able to use photos with variable lighting and photometric post-processing, as well as being able to better disentangle the subjects of images from transient objects near them (e.g. cars, people). The resulting samples are really impressive (seriously, check them out), though the authors admit the system has flaws and “outdoor scene reconstruction from image data remains far from being fully solved”.
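For the curious, here’s a minimal numpy sketch of the volume-rendering compositing step that NeRF-style methods use to turn per-sample colors and densities along a camera ray into a single pixel color. It illustrates the general technique only – it is not the authors’ code, and it ignores the appearance and transient embeddings that NeRF-W adds on top.

# Minimal sketch of NeRF-style volume rendering along one camera ray.
import numpy as np

def composite_ray(colors, densities, deltas):
    """colors: (N, 3) RGB per sample; densities: (N,) sigma; deltas: (N,) sample spacing."""
    alphas = 1.0 - np.exp(-densities * deltas)                        # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))    # accumulated transmittance
    weights = alphas * trans                                          # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0)                    # composited pixel color

# Toy usage: 64 random samples along one ray.
rng = np.random.default_rng(0)
print(composite_ray(rng.random((64, 3)), rng.random(64), np.full(64, 0.05)))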

Why this matters – easy data generation (kind of): Techniques like NeRF-W are going to make it easy for large technology entities with access to huge amounts of photography data (e.g. Google, Facebook, Tencent, etc.) to create 3D maps of commonly photographed objects and places in the world. I imagine that such data will eventually be used to bootstrap the training of AI systems, either by providing an easy-to-create quantity of 3D data, or perhaps for automatically extracting environments from reality and training AIs against them in simulation, then uploading the resulting systems onto robots running in the real world.
  Read more: NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections (arXiv).
Check out videos at the paper overview site (Github).

###################################################

Blue River Technology: Tractors + machine learning = automated weed killers:
…”The machine needs to make real-time decisions on what is a crop and what is a weed”…
Agricultural robot startup Blue River Technology is using machine learning to classify weeds on-the-fly, letting farmers automate the task of spraying crops with weedkillers. The project is a useful illustration of how mature machine learning is becoming, and of all the ways it is starting to show up in the economy.

On-tractor inference: They built a classifier using PyTorch, then ran it on a mobile NVIDIA Jetson AGX Xavier system for on-tractor inference. Interestingly, they do further optimization by converting their JIT models to ONNX (Import AI: 70), then converting ONNX to TensorRT, suggesting that the dream of multi-platform machine learning might be starting to become a reality (a rough sketch of the export step follows below).
    Your 2020 tractor is a 2007 supercomputer: “The total compute power on board the robot just dedicated to visual inference and spray robotics is on par with IBM’s super computer, Blue Gene (2007)”, they write.
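To make the pipeline concrete, here is a hedged sketch of the PyTorch-to-ONNX export step described above. The model (a stock ResNet-18 stand-in) and the file name are illustrative assumptions rather than Blue River’s actual setup; the resulting ONNX file would then be handed to a TensorRT builder (e.g. NVIDIA’s trtexec tool) for optimized inference on the Jetson.

# Illustrative PyTorch -> ONNX export (stand-in model; not Blue River's code).
import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True).eval()   # stand-in for the crop/weed classifier
dummy = torch.randn(1, 3, 224, 224)                           # example input shape

# Export to ONNX; the .onnx file can then be converted to a TensorRT engine on-device.
torch.onnx.export(model, dummy, "weed_classifier.onnx",       # hypothetical file name
                  input_names=["image"], output_names=["logits"],
                  opset_version=11)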
  Read more: AI for AG: Production machine learning for agriculture (Medium).

###################################################

Delicious: 1.7 million arXiv articles in machine-readable form:
…Want to use AI to analyze AI that itself analyzes AI? Get the dataset…
arXiv is to machine learning as the bazaar is to traders – it’s the place where everyone lays out their wares, browses what other people have, and also supports a variety of secondary things like ‘what’s happening in the bazaar this week’ publications (Import AI, other newsletters). Now, you can get arXiv on Kaggle, making it easier to write software to analyze this deluge of 1.7 million articles.

The details: “We present a free, open pipeline on Kaggle to the machine-readable arXiv dataset: a repository of 1.7 million articles, with relevant features such as article titles, authors, categories, abstracts, full text PDFs, and more,” Kaggle writes. “Our hope is to empower new use cases that can lead to the exploration of richer machine learning techniques that combine multi-modal features towards applications like trend analysis, paper recommender engines, category prediction, co-citation networks, knowledge graph construction and semantic search interfaces.”
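As a quick illustration of what you can do once the metadata is downloaded, here is a minimal sketch that counts papers per primary category; the file name and field names are assumptions about the snapshot’s format, so check them against the actual Kaggle download.

# Sketch: count arXiv papers per primary category from the Kaggle metadata dump.
import json
from collections import Counter

counts = Counter()
with open("arxiv-metadata-oai-snapshot.json") as f:   # assumed file name; one JSON record per line
    for line in f:
        paper = json.loads(line)
        primary = paper["categories"].split()[0]      # e.g. "cs.LG"
        counts[primary] += 1

for category, n in counts.most_common(10):
    print(f"{category}: {n}")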
  Read more: Leveraging Machine Learning to Fuel New Discoveries with the arXiv Dataset (arXiv.org blog).
  Get the dataset on Kaggle (Kaggle).

###################################################

Explore the immune system with Recursion’s RxRx2 dataset:
Recursion, a startup that uses machine learning to aid drug discovery, has released RxRx2, a dataset of 131,953 fluorescent microscopy images focused on the immune microenvironment, along with deep learning embeddings of those images. The dataset is ~185GB in size, and follows Recursion’s earlier release of RxRx1 last year (Import AI 155).

“RxRx2 demonstrates both the great variety of morphological effects soluble factors have on HUVEC cells and the consistency of these effects within groups of similar function,” Recursion writes. “Through RxRx2, researchers in the scientific community will have access to both the images and the corresponding deep learning embeddings to analyze or apply to their own experimentation… scientific researchers can use the data to further demonstrate how high-content imaging can be used for screening immune responses and identification of functionally-similar factor groups.”
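For researchers who want to poke at the embeddings, here’s a hedged sketch of one obvious analysis – averaging embeddings per soluble-factor treatment and comparing treatments by cosine similarity to surface functionally similar groups. The file name, column names, and the example treatment below are hypothetical; consult the RxRx2 documentation for the real schema.

# Sketch: group RxRx2-style embeddings by treatment and compare via cosine similarity.
import numpy as np
import pandas as pd

df = pd.read_csv("rxrx2_embeddings.csv")                              # hypothetical path
feature_cols = [c for c in df.columns if c.startswith("feature_")]    # assumed column naming

means = df.groupby("treatment")[feature_cols].mean()                  # one vector per soluble factor
X = means.to_numpy()
X = X / np.linalg.norm(X, axis=1, keepdims=True)                      # unit-normalize each vector
similarity = pd.DataFrame(X @ X.T, index=means.index, columns=means.index)

# Factors most similar to a given treatment (treatment name is illustrative).
print(similarity["IL-2"].sort_values(ascending=False).head(5))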

Why this matters – AI-automated scientific exploration: Datasets like this will help us develop techniques to analyze and map the high-dimensional relationships within vast troves of microscopy and other medical data. It’ll be interesting to see if, in a few years, benchmarks start to emerge that evaluate systems across suites of tests, including datasets like RxRx2 – that would give us a more holistic sense of progress in this space and how the world will be changed by it.
  Read more: RxRx2 (Recursion’s ‘RxRx’ dataset site).

###################################################

Can computers help with the pandemic? DeepMind wants to try:
…AlphaFold seems to be making increasingly useful COVID predictions…
DeepMind published some predictions from its AlphaFold system about COVID back in March (Import AI: 189), and has followed up with “our most up-to-date predictions of five understudied SARS-CoV-2 targets here (including SARS-CoV-2 membrane protein, Nsp2, Nsp4, Nsp6, and Papain-like proteinase (C Terminal domain))”.

Is AlphaFold useful? It’s a bit too early to say, but the results from this study are promising. “The experimental paper confirmed several aspects of our model that at first seemed surprising to us (e.g. C133 looked poorly placed to form an inter-chain disulfide, and we found it difficult to see how our prediction would form a C4 tetramer). This bolsters our original hope that it might be possible to draw biologically relevant conclusions from AlphaFold’s blind prediction of even very difficult proteins, and thereby deepen our understanding of understudied biological systems,” DeepMind wrote.

Why this matters: The dream of AI is to be able to dump compute into a problem and get an answer out that is a) correct and either b) arrives faster than a human could generate it or c) is better/more correct than what a human could do. COVID has shown the world that our AI systems are (so far) not able to do this stuff seamlessly yet, but the success of things like AlphaFold should give us some cause for optimism.
  Read more: Computational predictions of protein structures associated with COVID-19 (DeepMind).

###################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

UK government pulls biased immigration algorithm:

The UK Home Office has stopped using a controversial automated decision-making process for visa applicants. Their algorithm, used since 2015, sorted applicants into three risk categories. It emerged that the sorting was partly on the basis of nationality, with applications from certain countries being automatically categorised as high-risk. This appears to have led to an unfortunate feedback loop, since rejection-rates for specific countries were feeding into the sorting algorithm. Civil liberties groups had been pursuing a legal challenge on the basis that the algorithm was unlawfully discriminating against individuals. 

   Read more: Home Office to scrap ‘racist algorithm’ for UK visa applicants (Guardian)

   Read more: How we got the government to scrap the visa streaming algorithm (Foxglove)

 

What to do about deepfakes:
Deepfakes are convincing digital forgeries, typically of audio or video. In recent years, people have been concerned that a proliferation of deepfakes might pose a serious risk to public discourse. This report from CSET outlines the threat from deepfakes, setting out two scenarios, and offering some policy recommendations.


Commodified deepfakes: Deepfakes could proliferate widely, and become so easy to produce that they become a ubiquitous feature of the information landscape. This need not be a major problem if defensive technologies are able to keep up with the most widely-used forgery methods. One reason for optimism is that as a particular method proliferates, it provides more data on which to train detector systems. It’s also worth noting that since online content distribution is highly concentrated on a handful of platforms (Facebook, Google, Twitter), rolling out effective detection on these networks may be sufficient to prevent the most damaging effects of widespread proliferation.


Tailored deepfakes: More concerning is the possibility of targeted attacks to achieve some specific objective, particularly ‘zero-days’, where attackers exploit a vulnerability previously unknown to defenders, or attacks targeted at specific individuals through pathways that aren’t as well-defended (e.g. a phone line or camera feed vs. a social network).


Recommendations: (1) maintain a shared database of deepfake content to continually train detection algorithms on the latest forgeries; (2) encourage more consistent and transparent documentation of cutting-edge research in digital forgery; (3) commodify detection, empowering individuals and platforms to employ the latest detection techniques; (4) proliferate so-called ‘radioactive data’ — containing subtle digital signatures making it easy to detect synthetic media generated from it — in large public datasets.
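For recommendation (4), here’s a toy sketch of the ‘radioactive data’ idea: stamp images with a faint, known signature and later check whether new media correlates with it. This is a heavily simplified illustration of the concept rather than the published technique (which operates in a classifier’s feature space), and everything below is made up for illustration.

# Toy illustration of 'radioactive data': mark images with a faint secret signature.
import numpy as np

rng = np.random.default_rng(42)
signature = rng.standard_normal((64, 64))
signature /= np.linalg.norm(signature)           # unit-norm secret pattern

def mark(image, strength=0.05):
    """Add the faint signature to a 64x64 grayscale image."""
    return image + strength * signature

def score(image):
    """Correlation with the secret signature; marked images score systematically higher."""
    return float((image * signature).sum())

clean = rng.random((64, 64))
print(score(clean), score(mark(clean)))          # the marked copy scores higher than the clean one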


Matthew’s view: This is a great report providing clarity to a threat that I’ve felt can sometimes be exaggerated. We already rely heavily on several media that are very easy to forge — text, photos, signatures, banknotes. We get by through some combination of improvements in detection technology, and a circumspect attitude to forgeable media. There will be switching costs, as we adjust (e.g.) to trusting video less, but I struggle to see it posing a major risk to our relationship with information. The threat from deepfakes is perhaps best understood as just one example of two more general worries: that advances in AI might disproportionately favor offensive capabilities more than defensive ones (see Import 179); and that surprisingly fast progress might have a destabilizing effect in certain domains where we cannot respond quickly enough.

   Read more: Deepfakes — A Grounded Threat Assessment (CSET)

###################################################

Tech Tales:

Speak and Spell or Speak Through Me, I Don’t Care, I Just Need You To Know I Love, and That I Can and Do Love You
[2026, A house in the down-at-heel parts of South East London, UK.]

He didn’t talk when he was a child, but he would scream and laugh. Then when it got to talking age, he’d grunt and point, but didn’t speak. He wasn’t stupid – the tests said he had a “rich, inner monologue”, but that he “struggled to communicate”. 

All his schools had computers in them, so he could type, and his teachers mostly let him email them – even during class. “He’s a good kid. He listens. Sure, sometimes he makes noises, but it doesn’t get out of hand. I see no reason to have him moved to a different classroom environment,” one of his teachers said, after the parent of a child who didn’t like him tried to have him moved.

He used computers and phones – he’d choose from a range of text and audio and visual responses and communicate through the machines. His parents helped him build a language that they could use to communicate.
I love you, they’d say.
He’d make a pink light appear on his laptop screen.
We can’t wait to see what you do, they’d say.
He’d play a gif of a wall of question marks, each one rotating.
Make sure you go to bed by midnight, they’d say.
; ), he’d sign.

As he got older, he got more frustrated with himself. He felt like a normal person trapped inside a broken person. Around the age of 15, hormones ran through him and gave him new ideas, goals, and inclinations. But he couldn’t act on them, because people didn’t quite understand him. He felt like he was locked in a cell, and he could smell life outside it, but couldn’t open the door.

He’d send emails to girls in his class and sometimes they’d not reply, or sometimes they’d reply and make fun of him, and very rarely they’d reply genuinely – but because of the noises he made and how other people made fun of girls for hanging out with him, it never got very far.

One day, the summer he was 16, there was a power cut in his house. It was the evening and it was dark outside. His computer shut down. He got frustrated and started to make noises. Fumbled under his desk. Found a couple of machines he could run off of batteries and his phones. Set them up. They were like microphones on stands, but at the end of the microphone was a lightcone whose color and brightness he could change, and the microphone had a small motor where it connected to the stand which let it tilt back and forth – a simple programmable robot. He made it move and cast some lights on the wall.

His parent came in and said are you okay?
He nodded his lamp and the light was a pinkish hue of love, moving up and down on the wall.
That’s great. Can I read with you?
He nodded his lamp and his parent sat down next to his chair and took out their phone, then started reading something. While they read, he researched the etymology of a certain class of insects, then he used an architectural program to slowly build a mock nest for these insects. He made noises occasionally and his parent would look up, but see him making the object – they knew he was fine.
They read like that together, then his parent said: I’m going to go and make some food. Can you still beep me?
He made the phone in their pocket beep.
Good. I’ll be right there if you need me.

They went away and he carried on reading about the insects, letting them run around his mind, nurturing other ideas. He was focused and so he didn’t notice his batteries running down.

But they did. Suddenly, the lights went out and the little robot sticks drooped down on their stands. He just had his phone. But he didn’t beep. He looked at his phone and wanted to make noises, but held his breath. Waited.  Watched the battery go to 5%, then 2%, then 1%, then it blinked out. 

He closed his eyes in the dark; thought about insects.
Rocked back and forth.
Balled his hands into fists. Released and repeated.
Held a 3-D model of one of the insects in his mind and focused on rotating it, while breathing fast.


Then, inside the darkness of his shut eyes, he saw a glow.
He opened his eyes and saw his parent coming in with a candle.
They brought it over to him and he was making noises because he was nervous, but they didn’t get flustered. Kept approaching him. They carried the candle in one hand and a long white tube in the other. Put the candle down next to him and sat on the floor, then looked at him.

Are you okay? they said
He rocked back and forth. Made some noises.
I know this isn’t exactly easy for you, they said. I get that.
He managed to nod, though couldn’t look at them. Felt nervousness rise.
Look at this, his parent said.
He looked, and could see they had a poster, wrapped up. They unrolled it. It was covered in fluorescent insects; little simulacra of life giving off light in the dark room, casting luminous shadows on the wall. He and his parent looked at them.
I printed these yesterday, they said. Are they the same insects you were looking at today?
He looked – they were similar, but not quite the same. There was something about them, cast in shadow, that felt like they reflected a different kind of insect – one that was more capable and independent. But it was close. He nodded his head while rocking back and forth. Made some happy noises.
Now, which one of these would you say I most resemble?
He pointed at an insect that had huge wings and a large mid-section.
They laughed. Coming from anyone else, that’d be an insult, they said. Then looked at him. But I think it’s fine if you say it. Now let me show you which one I think you are.
He clapped, which was involuntary, but also a sign of his happiness, and the candle went out.
Whoops, they said.
They sat in the dark together, looking at the shadows of their insects on the wall.

A few minutes later, their eyes adjusted to the darkness. Both of them could barely see each other, but they could more easily see the paper in between them, white – practically shining in the darkness – with splotches of color from all the creations of nature. And until the power came back on, they sat with the paper between them, saying things and pointing at the symbols, and using the insects as a bridge to communicate between two people.

By the time the power came back on, some insects meant love, and some insects meant hate. One butterfly was a memory of a beach and another was a dream. They had built this world together, through halting half-sounds and half-blind hands. And it was rich and real and theirs.
He could not speak, but he could feel, and when his parent touched some of the insects, he knew they could feel what he felt, a little. And that was good.

Things that inspired this story: AI tools used as expressive paintbrushes; idiosyncratic-modification as the root goal of some technology, how most people use tools to help them communicate; the use of technology to augment thinking and, eventually, enhance our ability to express ourselves truthfully to those around us; the endless malleability and adaptability of people.