Import AI 224: AI cracks the exaflop barrier; robots and COVID surveillance; gender bias in computer vision

How robots get used for COVID surveillance:
…’SeekNet’ lets University of Maryland use a robot to check people for symptoms…
Researchers with the University of Maryland have built SeekNet, software to help them train robots to navigate an environment and intelligently inspect the people in it, repositioning to get a good look at people who are initially occluded. To test how useful the technology is, they use it to do COVID surveillance.

What they did: SeekNet is a network that smushes together a perception network with a movement one, with the two networks informing each other; if the perception network thinks it has spotted part of a human (e.g., someone standing behind someone else), it’ll talk to the movement network and get it to reposition the robot to get a better look.
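A hypothetical sketch of this perception/movement coupling, with the real networks replaced by trivial stand-ins. All function names and the confidence threshold are invented for illustration; the paper's actual architecture (reinforcement-learning-based relocation) differs in detail.

```python
# Hypothetical sketch of SeekNet-style coupling between a perception
# network and a movement policy. Names and thresholds are invented.

def perception_score(image):
    # Stand-in for a person-detection network: returns confidence that
    # a (possibly occluded) person is visible. Here we fake it with the
    # fraction of "person" pixels in a binary mask.
    return sum(image) / max(len(image), 1)

def choose_action(image, threshold=0.5):
    """If the detector is unsure (partial/occluded view), ask the
    movement policy to reposition the robot; otherwise inspect."""
    score = perception_score(image)
    if score < threshold:
        return "reposition"   # movement network picks a better viewpoint
    return "inspect"          # confident view: run symptom screening

occluded_view = [1, 0, 0, 0]   # mostly background pixels
clear_view = [1, 1, 1, 0]      # mostly person pixels
print(choose_action(occluded_view))  # reposition
print(choose_action(clear_view))     # inspect
```

The key design idea is the feedback loop: perception confidence drives movement, and movement in turn improves perception.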

What they used it for: To test out their system, they put it on a small mobile robot and used it to surveil people for COVID symptoms. “We fuse multiple modalities to simultaneously measure the vital signs, like body temperature, respiratory rate, heart rate, etc., to improve the screening accuracy,” they write.

What happens next: As I’ve written for CSET (analysis here, tweet thread here), COVID is going to lead to an increase in the use of computer vision for a variety of surveillance applications. The open question is whether a particular nation or part of the world becomes dominant in the development of this technology, and about how Western governments choose to use this technology after the crisis is over and we have all these cheap, powerful, surveillance tools available.
  Read more: SeekNet: Improved Human Instance Segmentation via Reinforcement Learning Based Optimized Robot Relocation (arXiv).

###################################################

DeepMind open-sources a 2D RL simulator:
…Yes, another 2D simulator – the more the merrier…
DeepMind has released DeepMind Lab 2D, software to help people carry out reinforcement learning tasks in 2D. The software makes it easy to create different 2D environments and unleash agents on them and also supports multiple simultaneous agents being run in the same simulation. 

What is DeepMind Lab 2D useful for? The software “generalizes and extends a popular internal system at DeepMind which supported a large range of research projects,” the authors write. “It was especially popular for multi-agent research involving workflows with significant environment-side iteration.”

Why might you not want to use DeepMind Lab 2D? While the software seems useful, there are some existing alternatives based on the video game description language (VGDL) (including competitions and systems built on top of it, like the ‘General Video Game AI Framework’ (Import AI: 101) and ‘Deceptive Gains’ (#80)), as well as DeepMind’s own 2017-era ‘AI Safety Gridworlds’. However, I think we’ll ultimately evaluate RL agents across a whole bunch of different problems running in a variety of simulators, so I expect it’s useful to have more of them.
  Read more: DeepMind Lab2D (arXiv).
  Get the code: DeepMind Lab2D (GitHub).
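For a sense of the kind of environment these 2D simulators provide, here is a toy gridworld with the standard step-based RL interface. This is not Lab2D's actual API (which is configured in Lua and exposed through C/Python bindings); everything here is an invented minimal sketch.

```python
# Toy 2D gridworld in the spirit of Lab2D-style simulators: an agent on
# a grid, stepped with discrete actions toward a goal cell. Not the real
# Lab2D API; a minimal sketch of the environment pattern.

class GridWorld2D:
    ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, width, height, goal):
        self.width, self.height = width, height
        self.goal = goal
        self.pos = (0, 0)  # agent starts in the top-left corner

    def step(self, action):
        # Move, clamping to the grid boundaries.
        dx, dy = self.ACTIONS[action]
        x = min(max(self.pos[0] + dx, 0), self.width - 1)
        y = min(max(self.pos[1] + dy, 0), self.height - 1)
        self.pos = (x, y)
        reward = 1.0 if self.pos == self.goal else 0.0
        done = self.pos == self.goal
        return self.pos, reward, done

env = GridWorld2D(3, 3, goal=(2, 2))
for a in ["right", "right", "down", "down"]:
    obs, reward, done = env.step(a)
print(obs, reward, done)  # (2, 2) 1.0 True
```

Multi-agent support, the feature the authors emphasize, would extend this by stepping several agents against a shared grid state each tick.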

###################################################

Facebook’s attempt to use AI for content moderation hurts its contractors:
…Open letter highlights pitfalls of using AI to analyze AI…
Over 200 Facebook content moderators recently complained to the leadership of Facebook as well as contractor companies Covalen and Accenture about the ways they’ve been treated during the pandemic. And in the letter, published by technology advocacy group Foxglove, they discuss an AI moderation experiment Facebook conducted earlier this year…

AI to monitor AI: “To cover the pressing need to moderate the masses of violence, hate, terrorism, child abuse, and other horrors that we fight for you every day, you sought to substitute our work with the work of a machine.

Without informing the public, Facebook undertook a massive live experiment in heavily automated content moderation. Management told moderators that we should no longer see certain varieties of toxic content coming up in the review tool from which we work— such as graphic violence or child abuse, for example.

The AI wasn’t up to the job. Important speech got swept into the maw of the Facebook filter—and risky content, like self-harm, stayed up.”

Why this matters: At some point, we’re going to be able to use AI systems to analyze and classify subtle, thorny issues like sexualization, violence, racism, and so on. But we’re definitely in the ‘Wright Brothers’ phase of this technology, with much to be discovered before it becomes reliable enough to substitute for people. In the meantime, humans and machines will need to team together on these issues, with all the complication that entails. 
  Read the letter in full here: Open letter from content moderators re: pandemic (Foxglove).

###################################################

Google, Microsoft, Amazon’s commercial computer vision systems exhibit serious gender biases:
…Study shows gender-based mis-identification of people, and worse…
An interdisciplinary team of researchers has analyzed how commercially available computer vision systems classify people of different genders – and the results seem to show significant biases.

What they found: In tests on Google Cloud, Microsoft Azure, and Amazon Web Services, they find that object recognition systems offered by these companies display “significant gender bias” in how they label photos of men and women. Of more potential concern, they found that Google’s system in particular had a poorer recognition rate for women than for men – when tested on one dataset, it correctly labeled men 85.8% of the time, versus 75.5% for women (and on a more complex dataset, it labeled men correctly 45.3% of the time and women 25.8%).
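A hedged sketch of the kind of per-group audit behind findings like this: compare the rate at which photos of each group are correctly labeled. The data below is synthetic; only the aggregate rates quoted above come from the study.

```python
# Sketch of a per-group recognition-rate audit. The prediction lists are
# synthetic stand-ins chosen to produce a visible gap.

def recognition_rate(predictions, truths):
    """Fraction of photos whose subject was correctly labeled."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)

# Synthetic predictions for two groups of 10 photos each.
men_truth = ["person"] * 10
men_pred = ["person"] * 9 + ["other"]          # 9/10 correct
women_truth = ["person"] * 10
women_pred = ["person"] * 7 + ["other"] * 3    # 7/10 correct

gap = recognition_rate(men_pred, men_truth) - recognition_rate(women_pred, women_truth)
print(f"gender gap in recognition rate: {gap:.0%}")  # 20%
```

Audits like this are simple to run once you have labeled, demographically annotated test sets, which is exactly what most deployed systems are never evaluated against.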

Why this matters: “If “a picture is worth a thousand words,” but an algorithm provides only a handful, the words it chooses are of immense consequence,” the researchers write. This feels true – the decisions that AI people make about their machines are, ultimately, going to lead to the magnification of those assumptions in the systems that get deployed into the world, which will have real consequences on who does and doesn’t get ‘seen’ or ‘perceived’ by AI.
  Read more: Diagnosing Gender Bias in Image Recognition Systems (SAGE Journals).

###################################################

(AI) Supercomputers crack the exaflop barrier!
…Mixed-precision results put Top500 list in perspective…
Twice a year, the Top 500 List spits out the rankings for the world’s fastest supercomputers. Right now, multiple countries are racing against each other to crack the exaflop barrier (1,000 petaflops of peak computation per second). This year, the top system (Fugaku, in Japan) has roughly 500 petaflops of peak computational performance per second, and, perhaps more importantly, 2 exaflops of peak performance on the Top500 ‘HPL-AI’ benchmark.

The exaflop AI benchmark: HPL-AI is a test that “seeks to highlight the convergence of HPC and artificial intelligence (AI) workloads based on machine learning and deep learning by solving a system of linear equations using novel, mixed-precision algorithms that exploit modern hardware”. The test predominantly uses 16-bit computation, so it makes intuitive sense that a 500pf system for 64-bit computation would be capable of ~2 exaflops of mostly 16-bit performance (500*4 = 2000, 16*4 = 64).

World’s fastest supercomputer over time:
2020: Fugaku (Japan): 537 petaflops (Pf) peak.
2015: Tianhe-2A (China): 54 Pf peak.
2010: Tianhe-1A (China): 4.7 Pf peak.
2005: BlueGene (USA): 367 teraflops peak.
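The mixed-precision arithmetic above can be checked back-of-envelope: FP16 values are a quarter the width of FP64, so a machine's mostly-16-bit throughput can plausibly be about four times its 64-bit peak.

```python
# Back-of-envelope check: 4 FP16 operands fit in the width of 1 FP64
# operand (16 * 4 = 64 bits), so ~500 Pf of FP64 peak maps to roughly
# ~2 exaflops on a mostly-FP16 workload like HPL-AI.

PF_PER_EXAFLOP = 1000

fp64_peak_pf = 500          # rough FP64 peak of Fugaku, in petaflops
width_ratio = 64 // 16      # 4x more FP16 values per unit of hardware width
fp16_peak_pf = fp64_peak_pf * width_ratio

print(fp16_peak_pf / PF_PER_EXAFLOP, "exaflops")  # 2.0 exaflops
```

This is an intuition pump, not a performance model: real mixed-precision speedups depend on memory bandwidth and the hardware's dedicated low-precision units, not just operand width.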

Why this matters: If technology development is mostly about how many computers you can throw at a problem (which seems likely, for some class of problems), then the global supercomputer rankings are going to take on more importance over time – especially as we see a shift from 64-bit linear computations as the main evaluation metric, to more AI-centric 16-bit mixed-precision tests.
  Read more: TOP500 Expands Exaflops Capacity Amidst Low Turnover (Top 500 List).
  More information: HPL-AI Mixed-Precision Benchmark information (HPL-AI site).

###################################################

Are you stressed? This AI-equipped thermal camera thinks so:
…Predicting cardiac changes over time with AI + thermal vision…
In the future, thermal cameras might let governments surveil people, checking their body heat for AI-predicted indications of stress. That’s the future embodied in research from the University of California at Santa Barbara, where researchers built a ‘StressNet’ network, which lets them train an algorithm to predict stress in people by studying thermal variations.

How StressNet works: The network “features a hybrid emission representation model that models the direct emission and absorption of heat by the skin and underlying blood vessels. This results in an information-rich feature representation of the face, which is used by spatio-temporal network for reconstructing the ISTI. The reconstructed ISTI signal is fed into a stress-detection model to detect and classify the individual’s stress state (i.e. stress or no stress)”.
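A hypothetical sketch of that three-stage pipeline, with the real models replaced by trivial stand-ins: an emission model turns thermal frames into features, a spatio-temporal model reconstructs the ISTI signal, and a classifier thresholds it into stress / no stress. All function names and the 0.5 threshold are invented for illustration.

```python
# Hypothetical three-stage StressNet-style pipeline. Every component
# here is a toy stand-in for the paper's neural networks.

def emission_features(thermal_frames):
    # Stand-in for the hybrid emission representation model: reduce
    # each thermal frame to a scalar feature (here, its mean reading).
    return [sum(frame) / len(frame) for frame in thermal_frames]

def reconstruct_isti(features):
    # Stand-in for the spatio-temporal network: average over time to
    # produce a single reconstructed ISTI value.
    return sum(features) / len(features)

def classify_stress(isti, threshold=0.5):
    # Stand-in for the stress-detection model.
    return "stress" if isti > threshold else "no stress"

frames = [[0.7, 0.8], [0.6, 0.9], [0.8, 0.7]]  # fake thermal readings
isti = reconstruct_isti(emission_features(frames))
print(classify_stress(isti))  # stress
```

The structure to notice is the intermediate physiological signal: rather than classifying stress directly from pixels, the pipeline first reconstructs ISTI, which makes the prediction interpretable against ground-truth cardiac measurements.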

Does it work? StressNet predicts the Initial Systolic Time Interval (ISTI), a measure that correlates to changes in cardiac function over time. In tests, StressNet predicts ISTI with 0.84 average precision, beating other baselines and coming close to the ground truth signal precision (0.9). Their best-performing system uses a pre-trained ImageNet network and a ResNet50 architecture for finetuning.

The water challenge: To simulate stress, the researchers had participants either put their feet in a bucket of lukewarm water, or a bucket of freezing water, while recording the underlying dataset – but the warm water might have ended up being somewhat pleasant for participants. This means it’s possible their system could have learned to distinguish between beneficial stress (eustress) and negative stress, rather than testing for stress or the absence of it.

Failure cases: The system is somewhat fragile; if people cover their face with their hand, or change their head position, it can sometimes fail.
  Read more: StressNet: Detecting Stress in Thermal Videos (arXiv).

###################################################

Tech Tales:

The Day When The Energy Changed

When computers turn to cannibalism, it looks pretty different to how animals do it. Instead of blood and dismemberment, there are sequences of numbers and letters – but they mean the same thing, if you know how to read them. These dramas manifest as dull sequences of words – and to humans they seem undramatic events, as normal as a calculator outputting a sequence of operations.

—Terrarium#1: Utilization: Nightlink: 30% / Job-Runner: 5% / Gen2 65%
—Terrarium#2: Utilization: Nightlink 45% / Job-Runner: 5% / Gen2 50%
—Terrarium#3: Utilization: Nightlink 75% / Job-Runner: 5% / Gen2 20%

—Job-Runner: Change high-priority: ‘Gen2’ for ‘Nightlink’.

For a lot of our machines, most of how we understand them is by looking at their behavior and how it changes over time.

—Terrarium#1: Utilization: Nightlink 5% / Job-Runner: 5% / Gen2 90%
—Terrarium#2: Utilization: Nightlink 10% / Job-Runner: 5% / Gen2 85%
—Terrarium#3: Utilization: Nightlink 40% / Job-Runner 5% / Gen2 55%

—Job-Runner: Kill ‘Nightlink’ at process end.

People treat these ‘logs’ of their actions like poetry and some people weave the words into tapestries, hoping that if they stare enough at them a greater truth will be revealed.

—Terrarium#1: Utilization: Job-Runner: 5% / Gen2 95%
—Terrarium#2: Utilization: Nightlink 1% / Job-Runner: 5% / Gen2 94%
—Terrarium#3: Utilization: Nightlink 20% / Job-Runner: 5% / Gen2 75%

—Job-Runner: Kill all ‘Nightlink’ processes. Rebase Job-Runner for ‘Gen2’ optimal deployment.

These sequences of words and numbers are like ants marching from one hole in the ground to another, or a tree that grows enough to shade the ground beneath it and slow the growth of grass.

—Terrarium#1: Utilization: Job-Runner 1% / Gen2 99%
—Terrarium#2: Utilization: Job-Runner 1% / Gen2 99%
—Terrarium#3: Utilization: Job-Runner 1% / Gen2 99%

Every day, we see the symptoms of great battles, and we rarely interpret them as poetry. These battles among the machines seem special now, but perhaps only because they are new. Soon, they will happen constantly and be un-marveled at; they will fade into the same hum as the actions of the earth and the sky and the wind. They will become the symptoms of just another world.

Things that inspired this story: Debug logs; the difference between reading history and experiencing history.