Import AI 255: The NSA simulates itself; China uses PatentNet to learn global commerce; are parameters the most important measure of AI?

by Jack Clark

With PatentNet, China tries to teach machines to ‘see’ the products of the world:
…6 million images today, heading to 60 million tomorrow…
Researchers with a few universities in Guangzhou, China, have built PatentNet, a vast labelled dataset of images of industrial goods. PatentNet is the kind of large-scale, utility-class dataset that will surely be used to develop AI systems that can see and analyze millions of products, and to unlock meta-analysis of the ‘features’ of an ever-expanding inventory of goods.

Scale: PatentNet contains 6 million industrial goods images today, and the researchers plan to scale it up to 60 million images over the next five years. The images are spread across 219 categories, with each category containing a couple of hundred distinct products, and a few images of each. “To the best of our knowledge, PatentNet is already the largest industrial goods database public available for science research, as regards the total number of industrial goods, as well the number of images in each category,” they write. 
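For a sense of what a dataset structured this way looks like in practice, here's a minimal sketch of indexing a PatentNet-style corpus. The category/product/image directory layout and the local path are my assumptions for illustration – the paper doesn't specify an on-disk format.

```python
# Minimal sketch (layout and path are assumptions, not from the paper):
# index a PatentNet-style corpus laid out as <root>/<category>/<product>/<image>.jpg
from pathlib import Path
from collections import defaultdict

def index_goods(root: str):
    index = defaultdict(lambda: defaultdict(list))
    for img in Path(root).glob("*/*/*.jpg"):
        category, product = img.parts[-3], img.parts[-2]
        index[category][product].append(img)
    return index

index = index_goods("patentnet/")                          # hypothetical local copy
print(len(index), "categories")                            # the paper reports 219
print(sum(len(v) for v in index.values()), "distinct products")
```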

State data as a national asset: PatentNet has been built out of data submitted to the Guangdong Intellectual Property Protection Center of China from 2007 to 2020. “In PatentNet, all the information is checked and corrected by patent examiner of the China Intellectual Property Administrator. In this sense, the dataset labeling will be highly accurate,” the researchers write.

Why this matters – economies of insight: PatentNet is an example of a curious phenomenon in AI development that I’d call ‘economies of insight’ – the more diverse and large-scale data you have, the greater your ability to generate previously unseen insights from it. Systems like PatentNet will unlock insights about products, and about the metadata of products, that others don’t have. The strategic question is what ‘economies of insight’ mean for entities in strategic competition with each other, mediated by AI. Can we imagine Google and Amazon’s ad-engines being caught in an ‘economies of insight’ commercial race? What about competing intelligence agencies?
Read more: PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database (arXiv).

###################################################

Want to help the government think about bias in AI? Send NIST your thoughts!
…Submit your thoughts by August 5th…
NIST, the US government agency tasked with developing standards and measures for artificial intelligence, is working out how to identify and manage bias in AI technology. This is a gnarly problem that is exactly the kind of thing you’d hope a publicly-funded organization would work on. Now, NIST is asking the public to comment on its proposed approach for working on bias. “We want to engage the community in developing voluntary, consensus-based standards for managing AI bias and reducing the risk of harmful outcomes that it can cause,” said NIST’s Reva Schwartz, in a statement.
Read more: NIST Proposes Approach for Reducing Risk of Bias in Artificial Intelligence (NIST.gov).

###################################################

NSA dreams of a future of algo-on-algo network warfare – and builds a simulator to help it see that future:
…FARLAND is how the National Security Agency aims to train its autonomous robot defenders…
In the future, wars will be fought at the speed of computational inference. The first wars to look like this will be cyberwars, and some of the first aggressors and defenders in these wars will be entities like the US Government’s National Security Agency. So it’s interesting to see the NSA and the MITRE Corporation write a research paper about FARLAND, “a framework for advanced Reinforcement Learning for autonomous network defense”.

What is FARLAND? The software lets people specify network environments with a variety of different actors (e.g., normal processes, aggressors, aggressors that are hiding, etc.), custom reward functions, and bits of network state. FARLAND uses RLlib, an open source library that includes implementations of tried-and-tested RL algos like A2C, A3C, DQN, DDPG, APEX-DQN, and IMPALA. “FARLAND’s abstractions also separate the problems of defining security goals, network and adversarial models, from the problem of implementing a simulator or emulator to effectively turn these models into an environment with which the learning agent can interact,” the research paper says.
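To make that abstraction concrete, here's a toy sketch of the kind of Gym-style environment RLlib consumes, with network state as observations, reconfiguration moves as actions, and a custom reward. The per-host features, action set, and reward shaping are invented for illustration – they aren't taken from the paper.

```python
# Toy sketch of a FARLAND-style network-defense environment (features,
# actions, and rewards are illustrative inventions, not from the paper).
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class NetworkDefenseEnv(gym.Env):
    """Defender observes coarse per-host state and reconfigures the network."""
    ACTIONS = ["no_op", "isolate_host", "reimage_host", "reroute_traffic"]

    def __init__(self, num_hosts: int = 16):
        super().__init__()
        self.num_hosts = num_hosts
        # Per-host features: [traffic_level, anomaly_score, is_isolated]
        self.observation_space = spaces.Box(0.0, 1.0, shape=(num_hosts, 3), dtype=np.float32)
        # Action = (which defensive move, which host to apply it to)
        self.action_space = spaces.MultiDiscrete([len(self.ACTIONS), num_hosts])

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.random((self.num_hosts, 3)).astype(np.float32)
        # Hidden attacker footholds the defender has to find and remove.
        self.compromised = self.np_random.random(self.num_hosts) < 0.1
        return self.state, {}

    def step(self, action):
        act, host = int(action[0]), int(action[1])
        if self.ACTIONS[act] in ("isolate_host", "reimage_host"):
            self.compromised[host] = False
            self.state[host, 2] = 1.0
        # Penalize active compromises and (lightly) disruptive actions.
        reward = -float(self.compromised.sum()) - 0.1 * (act != 0)
        terminated = not bool(self.compromised.any())
        return self.state, reward, terminated, False, {}
```

An environment like this could then be handed to any of the RL algorithms named above; the hard part, per the paper, is the fidelity of the network and adversary models, not the RL plumbing.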

What’s the ultimate purpose of FARLAND? The software is intended to give “a path for autonomous agents to increase their performance from apprentice to superhuman level, in the task of reconfiguring networks to mitigate cyberattacks,” the NSA says. (Though, presumably, the same capabilities you develop to autonomously defend a network will require a rich understanding of the ways someone might want to autonomously attack one.) “Securing an autonomous network defender will need innovation not just in the learning and decision-making algorithms (e.g., to make them more robust against poisoning and evasion attacks), but also, it will require the integration of multiple approaches aimed at minimizing the probability of invalid behavior,” they write.

The NSA’s equivalent of Facebook’s ‘WES’ approach: This being the 21st century, the NSA’s system is actually eerily similar to ‘WES’, Facebook’s “Web-Enabled Simulation” approach (Import AI 193) to simulating and testing its own gigantic big blue operating system. WES lets Facebook train simulated agents on its platform, helping it do some things similar to the red/blue-team development and analysis that the NSA presumably uses FARLAND for.

Synthetic everything: What’s common across FARLAND and WES? The idea that it’s increasingly sensible for organizations to simulate aspects of themselves, so they can gain an advantage relative to competitors.

Why this matters: The future is one defined by invisible war, with battles fought by digital ghosts: FARLAND is about the future, and the future is really weird. In the future, battles are going to be continually fought by self-learning agents, constantly trying to mislead each other about their own intentions, and the role of humans will be to design the sorts of crucibles into which we can pour data and compute and hope for the emergence of some new ghost AI model that can function approximate the terrible imaginings of other AI models developed in different crucibles by different people. Cybersecurity is drifting into a world of spirit summoning and reification – a Far Land that is closer than we may think.
Read more: Network Environment Design for Autonomous Cyberdefense (arXiv).

###################################################

Job alert! Join the Stanford AI Index as a Research Associate and help make AI policy less messed up:
…If you like AI measurement, AI assessment, and are detail-oriented, then this is for you…
I posted this job ad last week, but I’m re-posting it this week because it remains open, and we’re aiming to interview a ton of candidates for this high-impact role. The AI Index is dedicated to analyzing and synthesizing data around AI progress. I work there (currently as co-chair), along with a bunch of other interesting people. Now, we’re expanding the Index. This is a chance to work on issues of AI measurement and assessment, improve the prototype ‘AI vibrancy’ tool we’ve built out of AI Index data, and support our collaborations with other institutions as well.
Take a look at the job and apply here (Stanford). (If you’ve got questions, feel free to email me directly).

###################################################

Parameters rule everything around me (in AI development, says LessWrong)
…Here’s another way to measure the advance of machine intelligence…
How powerful are AI systems getting? That’s a subtle question that no one has great answers to – as readers of Import AI know, we spend a huge amount of time on the thorny issue of AI measurement. But sometimes it’s helpful to find a metric that lets you zoom out and look at the industry more broadly, even though it’s a coarse measure. One measure that some people have found useful is the raw amount of compute being dumped into developing different models (see: AI & Compute). Now, researchers with the Alignment Forum have done their own analysis of the parameter counts used in AI models in recent years. Their analysis yields two insights and one trend. The trend – parameter counts are increasing across models designed for a variety of modalities, ranging from vision to language to games to other things.

Two insights:
– “There was no discontinuity in any domain in the trend of model size growth in 2011-2012,” they note. “This suggests that the Deep Learning revolution was not due to an algorithmic improvement, but rather the point where the trend of improvement of Machine Learning methods caught up to the performance of other methods.”
– “There has been a discontinuity in model complexity for language models somewhere between 2016-2018. Returns to scale must have increased, and shifted the trajectory of growth from a doubling time of ~1.5 years to a doubling time of between 4 to 8 months”.
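To make that shift concrete, here's a trivial back-of-the-envelope calculation (mine, not the post's) of what those doubling times imply over a two-year window.

```python
# Back-of-the-envelope: growth implied by the reported doubling times.
def growth_factor(years: float, doubling_time_years: float) -> float:
    return 2 ** (years / doubling_time_years)

print(growth_factor(2, 1.5))      # ~1.5-year doubling: ~2.5x over two years
print(growth_factor(2, 4 / 12))   # 4-month doubling:   64x over two years
print(growth_factor(2, 8 / 12))   # 8-month doubling:   8x over two years
```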

When parameters don’t have much of a signal: As the authors note, “the biggest model we found was the 12 trillion parameter Deep Learning Recommender System from Facebook. We don’t have enough data on recommender systems to ascertain whether recommender systems have been historically large in terms of trainable parameters.”
We covered Facebook’s recommender system here (Import AI #245), and it might highlight why a strict parameter measure isn’t the most useful comparison – it could be that you scale up parameter complexity in relation to the number of distinct types of input signal you feed your thing (where recommender models might have tons of inputs, and generic text or CV models may have comparatively fewer). Another axis on which to prod at this is the difference between dense and sparse models, where a sparse model may have way more parameters (e.g., if based on Mixture-of-Experts), but fewer of them are doing stuff on any given input than in a dense model – see the sketch below. Regardless, very interesting research!
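Here's a toy illustration of that dense-versus-sparse point; the layer sizes and expert counts are made-up numbers chosen for round arithmetic, not taken from any real model.

```python
# Toy numbers (made up for illustration): total vs. active parameters in a
# Mixture-of-Experts feedforward layer compared to a dense one.
d_model, d_ff = 4096, 16384
dense_ffn = 2 * d_model * d_ff              # two weight matrices per FFN block

num_experts, experts_per_token = 64, 2      # top-2 routing
moe_total = num_experts * dense_ffn         # parameters you have to store
moe_active = experts_per_token * dense_ffn  # parameters touched per token

print(f"dense FFN params:     {dense_ffn:,}")     # 134,217,728
print(f"MoE total params:     {moe_total:,}")     # 8,589,934,592 (64x the dense layer)
print(f"MoE active per token: {moe_active:,}")    # 268,435,456 (only 2x the dense layer)
```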
Read more: Parameter counts in Machine Learning (Alignment Forum).

###################################################

Don’t have a cloud? Don’t worry! Distributed training might actually work:
…Hugging Face experiment says AI developers can have their low-resource AI cake AND train it, too…
Researchers with Yandex, Hugging Face, and the University of Toronto have developed DeDLOC, a technique to help AI researchers pool their hardware together to collaboratively train significant AI models – no big cloud required.

DeDLOC, short for Distributed Deep Learning in Open Collaborations, tries to deal with some of the problems of distributed training – inconsistencies, network problems, heterogeneous hardware stacks, and all the related issues. It uses a variety of techniques to increase the stability of training systems and documents these ideas in the paper. Most encouragingly, they prototype the technique and show that it works.
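As I read the paper, the core trick is to let every peer accumulate gradients at whatever pace its hardware allows, and to trigger a collective averaging step only once the group as a whole has processed a target global batch of samples. Here's a heavily simplified, framework-free sketch of that loop; the names and the target batch size are mine, and the real system layers peer discovery, fault tolerance, and compression on top of this.

```python
# Heavily simplified sketch of DeDLOC's accumulate-then-average idea.
# Values and names are illustrative, not taken from the paper's code.
import numpy as np

TARGET_GLOBAL_BATCH = 4096   # samples per collaborative optimizer step (made up)

class Peer:
    def __init__(self, params: np.ndarray, local_batch: int):
        self.params = params
        self.local_batch = local_batch        # sized to the device's memory/speed
        self.grad_accum = np.zeros_like(params)
        self.samples_done = 0

    def local_step(self, compute_grad):
        """Process one local microbatch, accumulating its sample-weighted gradient."""
        self.grad_accum += compute_grad(self.params) * self.local_batch
        self.samples_done += self.local_batch

def maybe_average_and_apply(peers, lr=0.1):
    """Fire a collective update once the group hits the target global batch."""
    total = sum(p.samples_done for p in peers)
    if total < TARGET_GLOBAL_BATCH:
        return False
    global_grad = sum(p.grad_accum for p in peers) / total
    for p in peers:
        p.params -= lr * global_grad          # every peer applies the same update
        p.grad_accum[:] = 0.0
        p.samples_done = 0
    return True
```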

Training a Bengali model in a distributed manner: A distributed team of 40 volunteers used DeDLOC to train sahajBERT, a Bengali language model. “In total, the 40 volunteers contributed compute time from 91 unique devices, most of which were running episodically,” the researchers write. “Although the median GPU time contributed by volunteers across all devices was ≈ 1.5 days, some participants ran the training script on several devices, attaining more than 200 hours over the duration of the experiment.” The ultimate performance of the model is pretty good, they say: “sahajBERT performs comparably to three strong baselines despite being pre-trained in a heterogeneous and highly unstable setting”.

Why this matters: AI has a resource problem – namely, that training large-scale AI systems requires a lot of compute. One of the ways to fix or lessen this problem is to unlock all the computational cycles in the hardware that already exists in the world, a lot of which resides on user desktops and not in major cloud infrastructure. Another way to lessen the issue is to make it easier for teams of people to form ad-hoc training collectives, temporarily pooling their resources towards a common goal. DeDLOC makes progress on both of these and paints a picture of a future where random groups of people come together online and train their own models for their own political purposes.
Read more: Distributed Deep Learning in Open Collaborations (arXiv).

###################################################

Tech Tales:

Food for Humans and Food for Machines
[The outskirts of a once thriving American town, 2040]

“How’s it going, Mac? You need some help?” I say, approaching a kneeling Mac outside ‘Sprockets and Soup’. He looks up at me and I can tell he’s been crying. He sweeps some of the smashed glass into a dustpan, then picks it up and tosses it in a bin.
  “They took the greeter,” he says, gesturing at the space in the window where the robot used to stand. “Bastards”.

Back when the place opened it was a novelty and people would fly in from all parts of the world to go there, bringing their robotic pets and photographing themselves. There was even a ‘robodog park’ out front where some of the heat-resistant gardening bots would be allowed to ‘play’ with each other – which mostly consisted of them cleaning each other. You can imagine how popular it was.

Mac and his restaurant slash novelty venue rode the wave of robohuman excitement all the way up, buying up nearby lots and expanding the building. Then, for the past decade, he’s been riding the excitement all the way down.

People really liked robots until people stopped being able to figure out how to split the earnings across people and robots. Then the enthusiasm for places like Sprockets and Soup went down – no one wants to tip a robot waiter and walk past a singing greeter when their own job is in jeopardy due to a robot. The restaurant did become a hangout for some of the local rich people, who would sit around and talk to each other about how to get more people to ‘want’ robots, and how much of a problem it was that people didn’t like them as much, these days.

But that wasn’t really enough to sustain it, and so for the past couple of years Mac has been riding the fortunes of the place down to rock bottom. Recently, the vandalism has got worse – going from people graffiting the robots while the restaurant is open, to people breaking into the place at night and smashing or stealing stuff.

“Alright,” Mac says, getting up. “Let’s go to the junkyard and see if we can buy it back. They know me there, these days”.

Things that inspired this story: Thinking about a new kind of ‘Chuck-E-Cheese’ for the AI era; decline and vandalism in ebbing empires; notions of how Americans might behave under economic growth and then economic contraction; dark visions of plausible futures.