Import AI 275: Facebook dreams of a world-spanning neural net; Microsoft announces a 30-petaflop supercomputer; FTC taps AI Now for AI advice

by Jack Clark

FTC hires three people from AI Now:
…What’s the opposite of industry capture?…
The Federal Trade Commission has announced a few new hires as Lina Khan builds out her senior staff. Interestingly, three of the hires come from the same place – AI Now, an AI research group based at NYU. The three hires are Meredith Whittaker, Amba Kak, and Sarah Myers West, who will all serve as advisors on AI for the FTC.
  Read more: FTC Chair Lina M. Khan Announces New Appointments in Agency Leadership Positions (FTC blog).

####################################################

Facebook builds a giant speech recognition network – plans to analyze all of human speech eventually:
…XLS-R portends the world of gigantic models…
Researchers with Facebook, Google, and HuggingFace have trained a large-scale neural net for speech recognition, translation, and language identification. XLS-R was trained on around 436,000 hours of data, almost a 10X increase on the data used by an earlier system Facebook built last year. XLS-R is based on wav2vec 2.0, covers 128 languages, and the highest-performing network is also the largest, weighing in at 2 billion parameters.

When bigger really does mean better: Big models are better than smaller models. “We found that our largest model, containing over 2 billion parameters, performs much better than smaller models, since more parameters are critical to adequately represent the many languages in our data set,” Facebook writes. “We also found that larger model size improved performance much more than when pretraining on a single language.”

Why this matters: Facebook’s blog has a subhead that tells us where we’re going: “Toward a single model to understand all human speech”. This isn’t a science fiction ambition – it’s an engineering goal that you’d have if you had (practically) unlimited data, compute, and corporate goals that make your success equivalent to onboarding everyone in the world. The fact we’re living in a world where this is a mundane thing that flows from normal technical and business incentives is the weird part!
Read more: XLS-R: Self-supervised speech processing for 128 languages (Facebook AI Research, blog).
Read the paper: XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale (arXiv).
  Get the models from HuggingFace (HuggingFace).
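
If you want to poke at these models yourself, the checkpoints sit on the HuggingFace hub. Here’s a minimal sketch of pulling speech representations out of XLS-R with the transformers library – the model id below is an assumption about how the checkpoints are published (the 300M-parameter sibling of the 2B model; swap in a larger id if you have the memory):

```python
# Minimal sketch: extract XLS-R speech representations with HuggingFace transformers.
# The model id is an assumption about how the checkpoints are published on the hub.
import torch
from transformers import AutoFeatureExtractor, AutoModel

model_id = "facebook/wav2vec2-xls-r-300m"  # assumed id; larger variants should exist too
extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

# wav2vec 2.0-style models take raw mono audio sampled at 16kHz.
audio = torch.zeros(16000).numpy()  # one second of silence as a stand-in for real audio
inputs = extractor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch, frames, hidden_size), ~1 frame per 20ms
```

Note that this is the self-supervised encoder only – to actually transcribe speech you’d fine-tune it with a CTC head (e.g. transformers’ Wav2Vec2ForCTC) on labeled audio in your target language.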

####################################################

Federal Trade Commission AI advisor: Here’s why industry capture of AI development is bad:
…How modern AI development looks a lot like cold war weapons development…
Meredith Whittaker, an AI activist, academic, and advisor to the US FTC, has written an analysis for ACM Interactions of how the industrial development of AI is altering the world. The gist of the piece is that the 2012 ImageNet result pushed AI toward capture by corporations, as the techniques behind that result proved to scale well with data and compute – resources industry has in abundance and academia mostly lacks.

Cold war AI: We’ve been here before: This concentration of power in industry has echoes of the cold war, when the US state was partially cannibalized by the industrial suppliers of its defense equipment and infrastructure.

What do we do: “scholars, advocates, and policymakers who produce and rely on tech-critical work must confront and name the dynamic of tech capture, co-optation, and compromise head-on, and soon”, Whittaker writes. “This is a battle of power, not simply a contest of ideas, and being right without the strategy and solidarity to defend our position will not protect us.”

What does this mean: The critique that industry is dominating AI development is a good one – because it’s correct. Where I’m less clear is what Whittaker can suggest as a means to accrue power to counterbalance industry while remaining true to the ideologies of big tech’s critics. Big tech gains power through the use of large-scale data and compute, which lets it produce artefacts that are geopolitically and economically relevant. How do you counter this?
  Read more: The steep cost of capture (ACM Interactions).

####################################################

Microsoft announces 30-petaflop cloud-based supercomputer:
…Big clouds mean big compute…
Microsoft says its cloud now wields one of the ten most powerful supercomputers in the world, as judged by the Top500 list. The system, named Voyager-EUS2, is based on AMD EPYC processors along with NVIDIA A100 GPUs.

Fungible, giant compute: Not to date myself, but back when I was a journalist I remember eagerly covering the first supercomputers capable of sustaining single-digit-petaflop performance. These were typically bespoke machines installed by companies like Cray at national labs.
  Now, one of the world’s top-10 supercomputers is composed of (relatively) generic equipment, operated by a big software company, and plugged into a global-scale computational cloud (Azure). In supercomputing, we’ve transitioned from an era of artisanal construction to industrial-scale stamping out of infrastructure. The bleeding-edge frontier will probably always be artisanal, but it feels notable that a more standardized, industrial approach now gets you into the top 10.
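
Some napkin math on what a ~30 petaflop Linpack score implies in GPU terms – all the constants here are my assumptions, not Microsoft’s or Top500’s published figures:

```python
# Back-of-envelope: how many A100s might a ~30 petaflop Linpack result imply?
# All constants are assumptions, not published Voyager-EUS2 specs.
A100_FP64_TENSOR_PEAK = 19.5e12  # FLOPS; A100 FP64 tensor-core peak (9.7e12 without)
SYSTEM_RMAX = 30e15              # ~30 petaflops, the headline Top500 figure
HPL_EFFICIENCY = 0.75            # assumed fraction of peak sustained on Linpack

gpus = SYSTEM_RMAX / (A100_FP64_TENSOR_PEAK * HPL_EFFICIENCY)
print(f"~{gpus:,.0f} GPUs")      # on the order of two thousand A100s
```

The exact number doesn’t matter much; the point is that a top-10 machine now falls out of a couple thousand off-the-shelf cloud GPUs rather than a one-off national-lab build.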
Read more: Microsoft announces new NDm A100 v4 Public AI Supercomputers and achieves Top10 Ranking in TOP500 (Microsoft).
Read more: Still waiting for Exascale: Japan’s Fugaku outperforms all competition once again (Top500 site).

####################################################

Tech Tales:

The Experiential Journalist
[East Africa, 2027]

After wars got too dangerous for people, journalists had a problem – they couldn’t get footage out of warzones, and they didn’t trust the military to tell them the truth. There was a lot of debate, and eventually the White House did some backroom negotiations with the Department of Defense and came up with the solution: embedded artificial journalists (EAJs).

An EAJ could be deployed on a drone, on a ground-based vehicle, or even on the onboard computers of the (rarely deployed) human-robot hybrids. EAJs got built by journalists spending a few weeks playing in a DoD-designed military simulation game. There, they’d act like they would in a ‘real’ conflict, shooting stories, issuing reports, and so on. This created a dataset which was used to finetune a basic journalist AI model, making it take on the characteristics of the specific journalist who had played through the sim.

So that’s why now, though warfare is very fast and almost unimaginably dangerous, we still get reports from ‘the field’ – reports put together autonomously by little bottled-up journo-brains, deployed on all the sorts of horrific machinery that war requires. These reports from ‘the front’ have proved popular, with the EAJs typically shooting scenes that would be way too dangerous for a human journalist to report from.

And just like everything else, the EAJs built for warzones are now coming home, to America. There are already talks of phasing out the practice of embedding journalists with police, instead building a police sim, having journalists play it, then deploying the resulting EAJs onto the bodycams and helmets of police across America. Further off, there are even now whispers of human journalists becoming the exception rather than the norm. After all, if EAJs shoot better footage, produce more reports more economically, and can’t be captured, killed, or extorted, then what’s there to worry about?

Things that inspired this story: Baudrillard’s ideas relating to Simulacra and Simulation; fine-tuning; imagining the future of drones plus media plus war; the awful logic of systems and the processes that systems create around themselves.