Import AI

Import AI 256: Facial recognition VS COVID masks; what AI means for warfare; CLIP and AI art

Turns out AI systems can identify people even when they’re wearing masks:
…Facial recognition VS People Wearing Masks: FR 1, Masks 0…
Since the pandemic hit in 2020, a vast chunk of the Earth’s human population has started wearing masks regularly. This has posed a challenge for facial recognition systems, many of which don’t perform as well when trying to identify people wearing masks. This year, the International Joint Conference on Biometrics hosted the ‘Masked Face Recognition’ (MFR) competition, which challenged teams to see how well they could train AI systems to recognize people wearing masks. Ten teams submitted 18 distinct systems to the competition, and their approaches were evaluated according to performance (75% weighting) and efficiency (defined as parameter size, where smaller is better, weighted at 25%).
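
To make that weighting concrete, here is a tiny sketch of how such a combined score could be computed. The 75/25 split comes from the competition; the specific normalization of parameter counts below is an illustrative assumption, not the organizers' exact formula.
```python
def combined_score(verification_accuracy, num_params, max_params):
    """Toy weighted score: 75% recognition performance, 25% efficiency.

    Efficiency is normalized here by linearly scaling a model's parameter
    count against the largest submission (an assumption for illustration).
    """
    efficiency = 1.0 - (num_params / max_params)  # smaller models score higher
    return 0.75 * verification_accuracy + 0.25 * efficiency

# Two hypothetical submissions: a small accurate model beats a bigger, slightly
# more accurate one once the efficiency term is factored in.
print(combined_score(verification_accuracy=0.92, num_params=25e6, max_params=100e6))  # 0.8775
print(combined_score(verification_accuracy=0.94, num_params=90e6, max_params=100e6))  # 0.73
```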

COVID accelerated facial recognition tech: The arrival of COVID caused a rise in research oriented around solving COVID-related problems with computer vision, such as facial recognition through masks, checking for people social distancing via automated analysis of video, and more. Researchers have been developing systems that can do facial recognition on people wearing masks for a while (e.g, this work from 2017, written up in Import AI #58), but COVID has motivated a lot more work in this area.

Who won? The overall winner of the competition was a system named TYAI, developed by TYAI, a Chinese AI company. Joint second place went to systems from the University of the Basque Country in Spain, as well as Istanbul Technical University in Turkey. Third place went to a system called A1 Simple from a Japanese company called ACES, along with a system called VIPLFACE-M from the Chinese Academy of Sciences, in China. Four of the five top-ranked solutions use synthetically generated masks to augment their training datasets.

Why this matters: “The effect of wearing a mask on face recognition in a collaborative environment is currently a sensitive issue,” the authors write. “This competition is the first to attract and present technical solutions that enhance the accuracy of masked face recognition on real face masks and in a collaborative verification scenario.”
  Read more: MFR 2021: Masked Face Recognition Competition (arXiv).

###################################################

Does AI actually matter for warfare? And, if so, how?
…The biggest impacts of War-AI? Reducing gaps between state and non-state actors…
Jack McDonald, a lecturer in war studies at King’s College London, has written an insightful blogpost about how AI might change warfare. His conclusion is that the capabilities of AI technology (where, for example, identifying a tank from the air is easy, but distinguishing between a civilian and a non-civilian Humvee is tremendously difficult) will drive war into more urban environments in the future. “One of the long-term effects of increased AI use is to drive warfare to urban locations. This is for the simple reason that any opponent facing down autonomous systems is best served by “clutter” that impedes its use,” he writes.

AI favors asymmetric actors: Another consequence is that the gradual diffusion of AI capabilities combined with the arrival of low-cost hardware (e.g, consumer drones), will give non-state actors/terror groups a larger menu of things to use when fighting against their opponents. “States might build all sorts of wonderful gizmos that are miles ahead of the next competitor state, but the fact that non-state armed groups have access to rudimentary forms of AI means that the gap between organised state militaries and their non-state military competitors gets smaller,” he writes. “What does warfare look like when an insurgent can simply lob an anti-personnel loitering munition at the FOB on the hill, rather than pestering it with ineffective mortar fire? From the perspective of states, and those who defend a state-centric international order, it’s not good.”

Why this matters: As McDonald writes, “AI doesn’t have to be revolutionary to have significant effects on the conduct of war”. Many of the consequences of AI being used in war will relate to how AI capabilities lower the cost curves of certain things (e.g, making surveillance cheap, or increasing the reliability of DIY-drone explosives) – and one of the macabre lessons of human history is that if you make a tool of war cheaper, then it gets used more (see: what the arrival of the AK-47 did for small arms conflicts).
Read more: What if Military AI is a Washout? (Jack McDonald blog).

###################################################

OpenAI’s CLIP and what it means for art:
…Now that AI systems can be used as magical paintbrushes, what happens next?…
In the past few years, a new class of generative models has made it easier for people to create and edit content. These systems can do things ranging from processing text, to audio, to images. One popular system is ‘CLIP’ from OpenAI, which was released as open source a few months ago. Now, a student at UC Berkeley has written a blog post summarizing some of the weird and wacky ways CLIP has been used by a variety of internet people to create cool stuff – take a read, check out the pictures, and build your intuitions about how generative models might change art.

Why systems like CLIP matter: “These models have so much creative power: just input some words and the system does its best to render them in its own uncanny, abstract style. It’s really fun and surprising to play with: I never really know what’s going to come out; it might be a trippy pseudo-realistic landscape or something more abstract and minimal,” writes the author Charlie Snell. “And despite the fact that the model does most of the work in actually generating the image, I still feel creative – I feel like an artist – when working with these models.”
Read more: Alien Dreams: An Emerging Art Scene (ML Berkeley blog).

###################################################

Chinese researchers envisage a future of ML-managed cities; release dataset to help:
…CityNet shows how ML might be applied to city data…
Researchers from a few Chinese universities as well as JD’s “Intelligent Cities Business Unit” have developed and released CityNet, a dataset containing traffic, layout, and meteorology data for 7 cities. Datasets like CityNet are the prerequisites for a future where machine learning systems are used to continuously analyze and forecast changing patterns of movement, resource consumption, and traffic in cities.

What goes into CityNet? CityNet has three types of data – ‘city layout’, which relates to information about the road networks and traffic of a city, ‘taxi’, which tracks taxis via their GPS data, and ‘meteorology’ which consists of weather data collected from local airports. Today, CityNet contains data from Beijing, Shanghai, Shenzhen, Chongqing, Xi’an, Chengdu, and Hong Kong.
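
To make the multi-modal idea concrete, here's a hedged sketch of the kind of join a user of the dataset might do: aligning taxi GPS pings with weather observations by hour. The file names and column names below are invented for illustration; the real schema is documented in the CityNet repo.
```python
import pandas as pd

# Hypothetical file and column names -- the actual CityNet layout differs; see the repo.
taxi = pd.read_csv("beijing_taxi_gps.csv", parse_dates=["timestamp"])        # id, lat, lon, timestamp
weather = pd.read_csv("beijing_meteorology.csv", parse_dates=["timestamp"])  # timestamp, temp_c, rain_mm

# Bucket GPS pings by hour so they line up with hourly weather readings.
taxi["hour"] = taxi["timestamp"].dt.floor("H")
weather["hour"] = weather["timestamp"].dt.floor("H")

# One row per hour: number of active taxis plus the weather at that hour.
demand = taxi.groupby("hour")["id"].nunique().rename("active_taxis").reset_index()
merged = demand.merge(weather[["hour", "temp_c", "rain_mm"]], on="hour", how="left")

# `merged` is now a simple supervised-learning table: predict taxi demand from weather.
print(merged.head())
```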

Why this matters: CityNet is important because it gestures at a future where all the data from cities is amalgamated, analyzed, and used to make increasingly complicated predictions about city life. As the researchers write, “understanding social effects from data helps city governors make wiser decisions on urban management.”
 Read more: CityNet: A Multi-city Multi-modal Dataset for Smart City Applications (arXiv).
  Get the code and dataset here (Citynet, GitHub repo).

###################################################

What happened at the world’s most influential computer vision conference in 2021? Read this and find out:
…Conference rundown gives us a sense of the future of computer vision…
Who published the most papers at the Computer Vision and Pattern Recognition conference in 2021? (China, followed by the US). How broadly can we apply Transformers to computer vision tasks? (Very broadly). How challenging are naturally-found confusing images for today’s object recognition systems? (Extremely tough). Find out the detailed answers to all this and more in this fantastic summary of CVPR 2021.
Read more: CVPR 2021: An Overview (Yassine, GitHub blog).

###################################################

Tech Tales:

Permutation Day
[Bedroom, 2027]

Will you be adventurous today? says my phone when I wake up.
“No,” I say. “As normal as possible.”
Okay, generating itinerary, says the phone.

I go back to sleep for a few minutes and wake when it starts an automatic alarm. While I make coffee in the kitchen, I review what my day is going to look like: work, food from my regular place, and I should reach out to my best friend to see if they want to hang out.

The day goes forward and every hour or so my phone regenerates the rest of the day, making probabilistic tweaks and adjustments according to my prior actions, what I’ve done today, and what the phone predicts I’ll want to do next, based on my past behavior.

I do all the things my phone tells me to do; I eat the food, I text my friend to hang out, I do some chores it suggests during some of my spare moments.
  “That’s funny,” my friend texts me back, “my phone made the same suggestion.”
  “Great minds,” I write back.
  And then my friend and I drink a couple of beers and play Yahtzee, with our phones sat on the table, recording the game, and swapping notes with each other about our various days.

That night I go to sleep content, happy to have had a typical day. I close my eyes and in my dream I ask the phone to be more adventurous.
  When I wake I say “let’s do another normal day,” and the phone says Sure.

Things that inspired this story: Recommendation algorithms being applied to individual lives; federated learning; notions of novelty being less attractive than certain kinds of reliability. 

Import AI 255: The NSA simulates itself; China uses PatentNet to learn global commerce; are parameters the most important measure of AI?

With PatentNet, China tries to teach machines to ‘see’ the products of the world:
…6 million images today, heading to 60 million tomorrow…
Researchers with a few universities in Guangzhou, China, have built PatentNet, a vast labelled dataset of images of industrial goods. PatentNet is the kind of large-scale, utility-class dataset that will surely be used to develop AI systems that can see and analyze millions of products, and unlock the meta analysis of the ‘features’ of an ever-expanding inventory of goods.

Scale: PatentNet contains 6 million industrial goods images today, and the researchers plan to scale it up to 60 million images over the next five years. The images are spread across 219 categories, with each category containing a couple of hundred distinct products, and a few images of each. “To the best of our knowledge, PatentNet is already the largest industrial goods database public available for science research, as regards the total number of industrial goods, as well the number of images in each category,” they write. 

State data as a national asset: PatentNet has been built out of data submitted to the Guangdong Intellectual Property Protection Center of China from 2007 to 2020. “In PatentNet, all the information is checked and corrected by patent examiner of the China Intellectual Property Administrator. In this sense, the dataset labeling will be highly accurate,” the researchers write.

Why this matters – economies of insight: PatentNet is an example of a curious phenomenon in AI development that I’d call ‘economies of insight’ – the more diverse and large-scale data you have, the greater your ability to generate previously unseen insights out of it. Systems like PatentNet will unlock insights about products and also the meta-data of products that others don’t have. The strategic question is what ‘economies of insight’ mean with regard to entities in strategic competition with each other, mediated by AI. Can we imagine Google and Amazon’s ad-engines being caught in an ‘economies of insight’ commercial race? What about competing intelligence agencies?
Read more: PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database (arXiv).

###################################################

Want to help the government think about bias in AI? Send NIST your thoughts!
…Submit your thoughts by August 5th…
NIST, the US government agency tasked with thinking about standards and measures for artificial intelligence, is thinking about how to identify and manage biases in AI technology. This is a gnarly problem that is exactly the kind of thing you’d hope a publicly-funded organization might work on. Now, NIST is asking for comments from the public on a proposed approach it has for working on bias. “We want to engage the community in developing voluntary, consensus-based standards for managing AI bias and reducing the risk of harmful outcomes that it can cause,” said NIST’s Reva Schwartz, in a statement.
Read more: NIST Proposes Approach for Reducing Risk of Bias in Artificial Intelligence (NIST.gov).

###################################################

NSA dreams of a future of algo-on-algo network warfare – and builds a simulator to help it see that future:
…FARLAND is how the National Security Agency aims to train its autonomous robot defenders…
In the future, wars will be thought at the speed of computational inferences. The first wars to look like this will be cyberwars, and some of the first aggressors and defenders in this war will be entities like the US Government’s National Security Agency. So it’s interesting to see the NSA and MITRE corporation write a research paper about FARLAND, “a framework for advanced Reinforcement Learning for autonomous network defense”.

What is FARLAND? The software lets people specify network environments with a variety of different actors (e.g, normal processes, aggressors, aggressors that are hiding, etc), custom reward functions, and bits of network state. FARLAND uses RLlib, an open source library that includes implementations of tried-and-tested RL algos like A2C, A3C, DQN, DDPG, APEX-DQN, and IMPALA. “FARLAND’s abstractions also separate the problems of defining security goals, network and adversarial models, from the problem of implementing a simulator or emulator to effectively turn these models into an environment with which the learning agent can interact,” the research paper says.
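
FARLAND itself isn't public, but the pattern it describes, a custom network-defense environment with a configurable reward trained via an off-the-shelf RLlib algorithm, is easy to sketch. Everything below (the environment's states, actions, attacker model, and reward) is a toy invention to illustrate the pattern, and the import path assumes a Ray 1.x-era RLlib install.
```python
import gym
import numpy as np
from gym import spaces
from ray.rllib.agents.dqn import DQNTrainer  # assumes Ray 1.x-style RLlib

class ToyNetworkDefenseEnv(gym.Env):
    """Invented stand-in for a FARLAND-style environment (not FARLAND itself).

    State: per-host compromise levels for a 5-host network (made up).
    Actions: wait, isolate the most suspicious host, or reimage it.
    Reward: keep services available while minimizing compromise.
    """

    def __init__(self, env_config=None):
        self.n_hosts = 5
        self.observation_space = spaces.Box(0.0, 1.0, shape=(self.n_hosts,), dtype=np.float32)
        self.action_space = spaces.Discrete(3)  # 0: wait, 1: isolate, 2: reimage
        self.state, self.t = None, 0

    def reset(self):
        self.state = np.random.uniform(0.0, 0.2, size=self.n_hosts).astype(np.float32)
        self.t = 0
        return self.state

    def step(self, action):
        worst = int(np.argmax(self.state))
        if action == 1:    # isolating slows the attacker but costs availability
            self.state[worst] *= 0.5
            availability_cost = 0.2
        elif action == 2:  # reimaging removes the compromise at a higher cost
            self.state[worst] = 0.0
            availability_cost = 0.5
        else:
            availability_cost = 0.0
        # Crude attacker model: compromise levels drift upward each step.
        self.state = np.clip(self.state + np.random.uniform(0.0, 0.1, self.n_hosts), 0, 1).astype(np.float32)
        reward = 1.0 - float(self.state.mean()) - availability_cost
        self.t += 1
        return self.state, reward, self.t >= 50, {}

trainer = DQNTrainer(env=ToyNetworkDefenseEnv, config={"framework": "torch", "num_workers": 0})
for _ in range(3):
    print(trainer.train()["episode_reward_mean"])
```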

What’s the ultimate purpose of FARLAND? The software is intended to give “a path for autonomous agents to increase their performance from apprentice to superhuman level, in the task of reconfiguring networks to mitigate cyberattacks,” the NSA says. (Though, presumably, the same capabilities you develop to autonomously defend a network, will require having a rich understanding of the ways someone might want to autonomously attack a network). “Securing an autonomous network defender will need innovation not just in the learning and decision-making algorithms (e.g., to make them more robust against poisoning and evasion attacks), but also, it will require the integration of multiple approaches aimed at minimizing the probability of invalid behavior,” they write.

The NSA’s equivalent of Facebook’s ‘WES’ approach: This being the 21st century, the NSA’s system is actually eerily similar to ‘WES’, Facebook’s “Web-Enabled Simulation” approach (Import AI 193) to simulating and testing its own gigantic big blue operating system. WES lets Facebook train simulated agents on its platform, helping it do some things similar to the red/blue-team development and analysis that the NSA presumably uses FARLAND for.

Synthetic everything: What’s common across FARLAND and WES? The idea that it’s increasingly sensible for organizations to simulate aspects of themselves, so they can gain an advantage relative to competitors.

Why this matters: The future is one defined by invisible war with battles fought by digital ghosts: FARLAND is about the future, and the future is really weird. In the future, battles are going to be continually thought by self-learning agents, constantly trying to mislead each other about their own intentions, and the role of humans will be to design the sorts of crucibles into which we can pour data and compute and hope for the emergence of some new ghost AI model that can function approximate the terrible imaginings of other AI models developed in different crucibles by different people. Cybersecurity is drifting into a world of spirit summoning and reification – a Far Land that is closer than we may think.
  Read more: Network Environment Design for Autonomous Cyberdefense (arXiv).

###################################################

Job alert! Join the Stanford AI Index as a Research Associate and help make AI policy less messed up:
…If you like AI measurement, AI assessment, and are detail-oriented, then this is for you…
I posted this job ad last week, but I’m re-posting it this week because the job ad remains open, and we’re aiming to interview a ton of candidates for this high-impact role. The AI Index is dedicated to analyzing and synthesizing data around AI progress. I work there (currently as co-chair), along with a bunch of other interesting people. Now, we’re expanding the Index. This is a chance to work on issues of AI measurement and assessment, improve the prototype ‘AI vibrancy’ tool we’ve built out of AI Index data, and support our collaborations with other institutions as well.
Take a look at the job and apply here (Stanford). (If you’ve got questions, feel free to email me directly).

###################################################

Parameters rule everything around me (in AI development, says LessWrong)
…Here’s another way to measure the advance of machine intelligence…
How powerful are AI systems getting? That’s a subtle question that no one has great answers to – as readers of Import AI know, we spend a huge amount of time on the thorny issue of AI measurement. But sometimes it’s helpful to find a metric that lets you zoom out and look at the industry more broadly, even though it’s a coarse measure. One measure that some people have found useful is measuring the raw amount of compute being dumped into developing different models (see: AI & Compute). Now, researchers with the Alignment Forum have done their own analysis of the parameter counts used in AI models in recent years. Their analysis yields two insights and one trend. The trend – parameter counts are increasing across models designed for a variety of modalities, ranging from vision to language to games to other things.

Two insights:
– “There was no discontinuity in any domain in the trend of model size growth in 2011-2012,” they note. “This suggests that the Deep Learning revolution was not due to an algorithmic improvement, but rather the point where the trend of improvement of Machine Learning methods caught up to the performance of other methods.”
– “There has been a discontinuity in model complexity for language models somewhere between 2016-2018. Returns to scale must have increased, and shifted the trajectory of growth from a doubling time of ~1.5 years to a doubling time of between 4 to 8 months”.

When parameters don’t have much of a signal: As the authors note, “the biggest model we found was the 12 trillion parameter Deep Learning Recommender System from Facebook. We don’t have enough data on recommender systems to ascertain whether recommender systems have been historically large in terms of trainable parameters.”
We covered Facebook’s recommender system here (Import AI #245), and it might highlight why a strict parameter measure isn’t the most useful comparison – it could be that you scale up parameter complexity in relation to the number of distinct types of input signal you feed your thing (where recommender models might have tons of inputs, and generic text or CV models may have comparatively fewer). Another axis on which to prod at this is the difference between dense and sparse models, where a sparse model may have way more parameters (e.g, if based on Mixture-of-Experts), but less of them are doing stuff than in a dense model. Regardless, very interesting research!
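
To put the doubling times quoted above in perspective, here's a quick back-of-the-envelope calculation of how much parameter counts grow over three years under each regime.
```python
def growth_over(years, doubling_time_years):
    """How many times bigger models get over `years` at a given doubling time."""
    return 2 ** (years / doubling_time_years)

# Pre-2018 language-model trend: doubling roughly every 1.5 years.
print(growth_over(3, 1.5))       # ~4x over three years

# Post-2018 trend: doubling every 4-8 months (0.33-0.67 years).
print(growth_over(3, 8 / 12))    # ~23x over three years at the slow end
print(growth_over(3, 4 / 12))    # ~512x over three years at the fast end
```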
Read more: Parameter counts in Machine Learning (Alignment Forum).

###################################################

Don’t have a cloud? Don’t worry! Distributed training might actually work:
…Hugging Face experiment says AI developers can have their low-resource AI cake AND can train it, as well…
Researchers with Yandex, Hugging Face, and the University of Toronto have developed DeDLOC, a technique to help AI researchers pool their hardware together to collaboratively train significant AI models – no big cloud required.

DeDLOC, short for Distributed Deep Learning in Open Collaborations, tries to deal with some of the problems of distributed training – inconsistencies, network problems, heterogeneous hardware stacks, and all the related issues. It uses a variety of techniques to increase the stability of training systems and documents these ideas in the paper. Most encouragingly, they prototype the technique and show that it works.
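
The core idea behind this style of volunteer training is to treat all the participants as contributors to one very large virtual batch: each peer accumulates gradients locally at whatever speed its hardware allows, and the group only averages and applies an update once a collective batch-size target is met. Below is a single-process toy simulation of that idea; it is not DeDLOC's actual code (which also handles peer discovery, NAT traversal, fault tolerance, and more), and the batch sizes are invented.
```python
import torch

model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

TARGET_BATCH = 4096               # collective batch size the whole group must reach
peer_batch_sizes = [32, 64, 256]  # heterogeneous volunteer hardware (invented numbers)

accumulated = 0
while accumulated < TARGET_BATCH:
    for bs in peer_batch_sizes:
        # Each "peer" computes gradients on its own shard; grads accumulate in .grad.
        x, y = torch.randn(bs, 10), torch.randn(bs, 1)
        loss = torch.nn.functional.mse_loss(model(x), y, reduction="sum")
        loss.backward()
        accumulated += bs

# Only once the target is met does the group average gradients and take a step.
for p in model.parameters():
    p.grad /= accumulated
opt.step()
opt.zero_grad()
print(f"applied one collaborative update over {accumulated} examples")
```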

Training a Bengali model in a distributed manner: A distributed team of 40 volunteers used DeDLOC to train sahajBERT, a Bengali language model. “In total, the 40 volunteers contributed compute time from 91 unique devices, most of which were running episodically,” the researchers write. “Although the median GPU time contributed by volunteers across all devices was ≈ 1.5 days, some participants ran the training script on several devices, attaining more than 200 hours over the duration of the experiment.” The ultimate performance of the model is pretty good, they say: “sahajBERT performs comparably to three strong baselines despite being pre-trained in a heterogeneous and highly unstable setting”.

Why this matters: AI has a resource problem – namely, that training large-scale AI systems requires a lot of compute. One of the ways to fix or lessen this problem is to unlock all the computational cycles in the hardware that already exists in the world, a lot of which resides on user desktops and not in major cloud infrastructure. Another way to lessen the issue is to make it easier for teams of people to form ad-hoc training collectives, temporarily pooling their resources towards a common goal. DeDLOC makes progress on both of these and paints a picture of a future where random groups of people come together online and train their own models for their own political purposes.
Read more: Distributed Deep Learning in Open Collaborations (arXiv).

###################################################

Tech Tales:

Food for Humans and Food for Machines
[The outskirts of a once thriving American town, 2040]

“How’s it going, Mac? You need some help?” I say, approaching a kneeling Mac outside ‘Sprockets and Soup’. He looks up at me and I can tell he’s been crying. He sweeps up some of the smashed glass into a dustpan, then picks it up and tosses it in a bin.
  “They took the greeter,” he said, gesturing at the space in the window where the robot used to stand. “Bastards”.

Back when the place opened it was a novelty and people would fly in from all parts of the world to go there, bringing their robotic pets, and photographing themselves. There was even a ‘robodog park’ out front where some of the heat-resistant gardening bots would be allowed to ‘play’ with each other – which mostly consisted of them cleaning each other. You can imagine how popular it was.

Mac and his restaurant slash novelty venue rode the wave of robohuman excitement all the way up, buying up nearby lots and expanding the building. Then, for the past decade, he’s been riding the excitement all the way down.

People really liked robots until people stopped being able to figure out how to split the earnings across people and robots. Then the enthusiasm for places like Sprockets and Soup went down – no one wants to tip a robot waiter and walk past a singing greeter when their own job is in jeopardy due to a robot. The restaurant did become a hangout for some of the local rich people, who would sit around and talk to each other about how to get more people to ‘want’ robots, and how much of a problem it was that people didn’t like them as much, these days.

But that wasn’t really enough to sustain it, and so for the past couple of years Mac has been riding the fortunes of the place down to rock bottom. Recently, the vandalism has got worse – going from people graffiting the robots when the restaurant is open, to now where people are breaking into the place at night and smashing or stealing stuff.

“Alright,” Mac says, getting up. “Let’s go to the junkyard and see if we can buy it back. They know me there, these days”.

Things that inspired this story: Thinking about a new kind of ‘Chuck-E-Cheese’ for the AI era; decline and vandalism in ebbing empires; notions of how Americans might behave under economic growth and then economic contraction; dark visions of plausible futures.

Import AI 254: Facebook uses AI for copyright enforcement; Google uses RL to design better chips.

Agronerds rejoice… a pan-European crop parcel + satellite image dataset is on the way:
The University of Munich and a geospatial company called GAF AG want to create a map of as much of the farmland in Europe as possible (with data for specific crop types and uses for each individual parcel of land), then pair this with geospatial data gathered by SENTINEL satellites. The dataset is called EuroCrops and the idea is to use it as the data fuel that might go into a system which uses machine learning to automatically classify and map crop types from a variety of data sources. This is the kind of ‘dull but worthy’ research that illustrates how much effort goes into creating some science-targeted datasets. For instance…

A whole lot of work: The authors contacted ministries, agricultural departments, and authorities from 24 European states. As a result, the initial version of EuroCrops contains data for 13 countries: Austria, Belgium, Croatia, Denmark, Estonia, France, Latvia, Lithuania, Netherlands, Portugal, Sweden, Slovakia, and Slovenia. There are also plans to incorporate data from Finland, Romania, and Spain. To assemble this dataset, they also needed to translate all the countries’ non-harmonized ways of describing crops into a single schema, which they then apply across the dataset. That’s the kind of excruciatingly painful task required to make country-level data become legible when compared internationally.

Demo dataset: A full dataset is expected in time, but to start they’ve published a demo dataset covering data from Austria, Denmark, and Slovenia, and made this available in a variety of formats (CSV, HDF5 for the Sentinel data, and GeoJSON).
  Read more: EuroCrops: A Pan-European Dataset for Time Series Crop Type Classification (arXiv).
  Get the dataset from the official EuroCrops website.

###################################################

Big models are great – but they’re also getting more efficient, like this massive mixture-of-experts vision system:
…Sparsity comes to computer vision…
Google has built a large-scale, sparse model for computer vision, using a technology called a V-MoE (a Vision Mixture-of-Experts model). V-MoE is a variant of the ‘Vision Transformer’ (ViT) architecture which swapped out convolutions for transformers, and has been the key invention behind a bunch of recent impressive results out of Google. Google uses the V-MoE to train a vision model of 15B parameters – “the largest vision models to date”, it says in a research paper. These models can match the performance of other state-of-the-art dense models while taking less time to train.
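
The ‘sparse’ part is routing: each image patch (token) only visits a small number of expert MLPs, selected by a learned gate, so parameter count can grow without a matching growth in per-example compute. Here's a toy top-1 routing layer in PyTorch to illustrate the idea; it is not Google's V-MoE implementation, which adds expert capacity limits, load-balancing losses, and expert parallelism across devices.
```python
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    """Top-1 mixture-of-experts MLP: each token is sent to a single expert."""

    def __init__(self, dim=64, hidden=256, n_experts=8):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, tokens):                       # tokens: (num_tokens, dim)
        gate_probs = self.gate(tokens).softmax(dim=-1)
        weight, expert_idx = gate_probs.max(dim=-1)  # top-1 expert per token
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():                           # only run experts that received tokens
                out[mask] = weight[mask, None] * expert(tokens[mask])
        return out

layer = ToyMoELayer()
patches = torch.randn(196, 64)                       # e.g. 14x14 ViT patch tokens
print(layer(patches).shape)                          # torch.Size([196, 64])
```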

Top scores and surprising efficiency: Google’s largest V-MoE model gets 90.35% test accuracy on ImageNet. More intriguingly, their performance might be better than alternative dense models: “V-MoEs strongly outperform their dense counterparts on upstream, few-shot and full fine-tuning metrics in absolute terms. Moreover, at inference time, the V-MoE models can be adjusted to either (i) match the performance of the largest dense model while using as little as half of the amount of compute, or actual runtime, or (ii) significantly outperform it at the same cost.” The V-MoE models were pre-trained on JFT-300M, Google’s secret in-house dataset.

Why this matters: Besides the scores, these results matter in terms of efficiency – most of the energy-consumption of neural nets happens during inference after they’ve been trained. This MoE approach “takes the most efficient models and makes them even more efficient without any further model adaptation,” according to Google. Put another way: the people capable of training big models are going to be able to expand the margins on their services perhaps faster than those slowly dealing with small models – the rich (might) get richer. 
  Read more: Scaling Vision with Sparse Mixture of Experts (arXiv).

###################################################

One big thing: Google’s AI tools are now helping it build better chips:
…Welcome to corporate-level recursive-self-improvement…
Google has published a paper in Nature showing how it has used reinforcement learning to help it design the layout of chips, taking work which previously took humans months and converting it into about six hours of work. The results are chips that are superior or comparable to those designed by humans in critical areas like power consumption, performance, and chip area. “Our method was used to design the next generation of Google’s artificial intelligence (AI) accelerators,” the researchers write.

Where this came from: This is not, technically, new research – Google has been publishing on using RL for chip design for quite some time – the company published an early paper on this technique back in March 2020 (Import AI #191). But the fact this technique has been used to design the fifth generation of tensor processing units (TPUs) is a big deal.

Why this matters: I sometimes think of Google as a corporation made of human-designed processes that is slowly morphing into a bubbling stew defined equally by humans and AI systems. In the same way Google has recently been exploring using AI tools for things as varied as database lookups, power management in datacenters, and the provision of consumer-facing services (e.g, search, translation), it’s now using AI to help it design more effective infrastructure for itself. With this research, Google has shown it can train machines to build the machines that will train subsequent machines. How soon, I wonder, till the ‘speed’ of these processes becomes so rapid that we start iterating through TPU generations on the order of weeks rather than years?
  Read more: A graph placement methodology for fast chip design (Nature).

###################################################

Why AI policy is messed up and how to make it better, a talk and an idea:
…Hint: It’s all about measurement…
I think most of the problems of AI policy stem from the illegibility of AI systems (and to a lesser extent, the organizations designing these systems). That’s why I spend a lot of my time working on policy proposals / inputs to improve our ability to measure, assess, and analyze AI systems. This week, I spoke with Jess Whittlestone at Cambridge about ways we can better measure and assess AI systems, and also gave a talk at a NIST workshop on some issues in measurement/assessment of contemporary systems. I’m generally trying to make myself more ‘legible’ as a policy actor (since my main policy idea is… demanding legibility from AI systems and the people building them, haha!).
  Read more: Cutting Edge: Understanding AI systems for a better AI policy with Jack Clark (YouTube).
Check out the slides for the talk here (Google Slides).
Check out some related notes from remarks I gave at a NIST workshop last week, also (Twitter).

###################################################

Job alert! Join the AI Index as a Research Associate and help make AI policy less messed up:
…If you like AI measurement, AI assessment, and are detail-oriented, then this is for you…
The AI Index is dedicated to analyzing and synthesizing data around AI progress. I work there (currently as co-chair), along with a bunch of other interesting people. Now, we’re expanding the Index. This is a chance to work on issues of AI measurement and assessment, improve the prototype ‘AI vibrancy’ tool we’ve built out of AI Index data, and support our collaborations with other institutions as well.
Take a look at the job and apply here (Stanford). (If you’ve got questions, feel free to email me directly).

###################################################

Facebook releases a data augmentation tool to help people train systems that are more robust and can spot stuff designed to evade them:
…Facebook uses domain randomization to help it spot content that people want to be invisible to Facebook’s censors…
Facebook has built and released AugLy, software for augmenting and randomizing data. AugLy makes it easy for people to take a piece of data – like an image, piece of text, audio file, or movie – then generate various copies of that data with a bunch of transformations applied. This can help people generate additional data to train their systems on, and can also serve as a way to test the robustness of existing systems (e.g, if your image recognition system breaks when people take an image and put some meme text on it, you might have a problem).
  Most intriguingly, Facebook says a motivation for AugLy is to help it train systems that can spot content that has been altered deliberately to evade them. “Many of the augmentations in AugLy are informed by ways we have seen people transform content to try to evade our automatic systems,” Facebook says in a blog announcing the tool.
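
A few of AugLy's image functions give the flavor of those "in the wild" transformations. A minimal sketch, assuming a functional interface along the lines of the project's published examples; exact parameter names can vary between versions, so treat them as approximate and check the repo before relying on them.
```python
from PIL import Image
import augly.image as imaugs

image = Image.open("input.png")  # placeholder path

# The kinds of edits the blog post describes: meme text overlaid on an image,
# a slight crop, and a resize -- each producing a transformed copy of the original.
memed   = imaugs.meme_format(image, text="when the classifier misses the re-upload")
cropped = imaugs.crop(image, x1=0.05, y1=0.05, x2=0.95, y2=0.95)
small   = imaugs.scale(image, factor=0.5)

for name, aug in [("meme", memed), ("crop", cropped), ("scale", small)]:
    aug.save(f"augmented_{name}.png")
```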

AugLy and copyright fuzzing: One thing AI lets you do is something I think of as ‘copyright fuzzing’ – you can take a piece of text, music, or video, and you can warp it slightly by changing some of the words or tones or visuals (or playback speed, etc) to evade automatic content-IP detection systems. Tools like AugLy will also let AI developers train AI systems that can spot fuzzed or slightly changed content.
This also seems to be a business case for Facebook as, per the blog post: “one important application is detecting exact copies or near duplicates of a particular piece of content. The same piece of misinformation, for example, can appear repeatedly in slightly different forms, such as an image modified with a few pixels cropped, or augmented with a filter or new text overlaid. By augmenting AI models with AugLy data, they can learn to spot when someone is uploading content that is known to be infringing, such as a song or video.”
Read more: AugLy: A new data augmentation library to help build more robust AI models (Facebook blog).
Get the code for AugLy here (Facebook GitHub).

###################################################

Tech Tales:

Choose your own sensorium
[Detroit, 2025]

“Oh come on, another Tesla fleet?” I say, looking at the job come across my phone. But I need the money so I head out of my house and walk a couple of blocks to the spot on the hill where I can see the freeway. Then I wait. Eventually I see the Teslas – a bunch of them, traveling close together on autopilot, moving as a sinuous single road train down the freeway. I film them and upload the footage to the app. A few seconds later the AI verifies the footage and some credits get deposited in my account.
Probably a few thousand other people around the planet just did the same thing. And the way this app works, someone bought the rights (or won the lottery – more on that later) to ask the users – us – to record a particular thing, and we did. There’s been a lot of Tesla fleets lately, but there’ve also been tasks like spotting prototype Amazon drones, photographing new menus in fast food places, and documenting wildflowers.

It’s okay money. Like a lot of stuff these days it’s casual work, and you’re never really sure if you’re working for people, or corporations, or something else – AI systems, maybe, or things derived from other computational analysis of society.

There’s a trick with this app, though. Maybe part of why it got so successful, even. It’s called the lottery – every day, one of the app users gets the ability to put out their own job. So along with all the regular work, you get strange or whimsical requests – record the sky where you are, record the sunset. And sometimes requests that just skirt up to the edges of the app’s terms of service without crossing the line – photograph your feet wearing socks (I didn’t take that job), record 30 seconds of the local radio station, list out what type of locks you have for your house, and so on.

I have dreams where I win and get to choose. I imagine asking people to record the traffic on their local street, so I could spend months looking at different neighborhoods. Sometimes I dream of people singing into their phones, and me putting together a song out of all of them that makes me feel something different. And sometimes I just imagine what it’d be if the job was ‘do nothing for 15 minutes’, and all I collect is data from onboard sensors from all the phones – accelerometers showing no movement, gyroscopes quietly changing, GPS not needing to track moving objects. In my dreams, this is peaceful.

Things that inspired this story: Winding the business model of companies like ‘Premise Data’ forward; global generative models; artisanal data collection and extraction; different types of business models; the notion of everything in life becoming gamified and gamble-fied.

Import AI 253: The scaling will continue until performance saturates

Google sets a new record on ImageNet – and all it took was 3 billion images:
…The scaling will continue until performance saturates – aka, not for a while, apparently…
Google has scaled up vision transformers to massive amounts of data and parameters and in doing so set a new state-of-the-art on ImageNet. The research matters for a couple of reasons: first, it gives us an idea of the scalability of this approach (seemingly very good), and it also demonstrates a more intriguing fact about large-scale neural networks – they’re more efficient learners.

What they did and what they got: Google explored vision transformers – a type of image recognition system that uses transformers rather than traditional convolutional nets – to unprecedented scales, dumping huge amounts of compute in. The result is a large-scale model that gets a score of 90.45 top-1 accuracy on ImageNet, setting a new state-of-the-art. They also show that networks like this can perform well at few-shot learning; a pre-trained large-scale transformer can get 84.86% accuracy on ImageNet with a mere 10 examples per class – that’s 1% of the data ImageNet systems are traditionally trained on.

Why this matters: These results are a big deal, not because of the performance record, but because of few-shot learning – the results highlight how once you scale up a network enough, it seems to be able to rapidly glom onto patterns in the data you feed it, displaying intriguing few-shot learning properties.
  Read more: Scaling Vision Transformers (arXiv).

###################################################

Eleuther releases a 6B parameter GPT3-style model – and an API:
…Multi-polar AI world++…
Researchers affiliated with Eleuther, an ad-hoc collection of cypherpunk-esque researchers, have built a 6 billion parameter GPT3-style model, published it as open source, and released a publicly accessible API to give people access to the model through a web interface. That’s… a lot! And it’s emblematic of the multi-polar AI world we’re heading into – one where a proliferating set of actors will adopt different strategies in developing, deploying, and diffusing AI technology. The model is called GPT-J-6B.

What they did that’s interesting: Beyond the release itself, they’ve done a few interesting technical things here – they’ve written it in JAX and deployed it on Google’s custom TPU chips. The model was trained on 400B tokens from ‘The Pile’ 800GB dataset. In tests, Eleuther finds that GPT-J-6B performance is roughly on par with OpenAI’s ‘GPT3-Curie’ model, and outperforms other GPT3 variants.

A word about Google: I imagine I’ll get flack for this, but it remains quite mysterious to me that Google is providing (some of) the compute for these model replications while itself not really acknowledging that it’s doing it. Does this mean Google’s official policy on language models is it wants them to proliferate on the open internet? It’d be nice to know Google’s thinking here – by comparison, Eleuther has actually published a reasonably lengthy blog post giving their reasoning for why they’re doing this – and while I may not agree with all the arguments, it feels good that these arguments are legible. I wonder who at Google is giving the compute to this project and what they think? I hope they write about it.
  Check out the Eleuther API to the 6B right here (Eleuther AI).
  Read more: GPT-J-6B: 6B JAX-Based Transformer (Aran Komatsuzaki, blog)
  Get the model from the GitHub repo here.
  Read Eleuther’s post on “Why Release a Large Language Model” (Eleuther AI blog).

###################################################

Self-driving car expert launches startup with $83.5 million in funding:
…Raquel Urtasun’s next step…
Waabi is a new self-driving car startup that launched last week with an $83.5 million Series A funding round. Waabi is notable for its name (which my autocorrect tells me is really Wasabi), and for Urtasun’s background – she previously led research for Uber’s self-driving car effort, and helped develop the widely-used KITTI vision benchmark suite. Waabi’s technology uses “deep learning, probabilistic inference and complex optimization to create software that is end-to-end trainable, interpretable and capable of very complex reasoning”, according to the launch press release. Waabi will initially focus on applying its technology to long-haul trucking and logistics.
  Read more: Waabi launches to build a pathway to commercially viable, scalable autonomous driving (GlobeNewswire, PR).
Find out more at the company’s website.

###################################################

Want to get a look at the future of robotics? Sanctuary.AI has a new prototype machine:
…Ex-Kindred, D-Wave team, are betting on a ‘labor-as-a-service’ robot workforce…
Sanctuary AI, a Canadian AI startup founded by some former roboticists and quantum scientists, thinks that generally intelligent machines will need to be developed in an embodied environment. Because of this, they’re betting big on robotics – going so far as to design their own custom machines, in the hopes of building a “general purpose robot workforce“.

Check out these robots: The Sanctuary.AI approach fuses deep learning, robotics, and symbolic reasoning and logic for what they say is “a new approach to artificial general intelligence”. What’s different about them is they already seem to have some nice, somewhat novel hardware, and have recently published some short videos about the control scheme for their robots, how they think, and how their hands work.

Why this matters: There’s a lot of economic value to be had in software, but much of the world’s economy runs in the physical world. And as seasoned AI researchers know, the physical world is a cruel environment for the sorts of brittle poor-at-generalization AI systems we have today. Therefore, Sanctuary’s idea of co-developing new AI software with underlying hardware represents an interesting bet that they can close this gap – good luck to them.
Find out more on their website: Sanctuary.ai.

###################################################

Which datasets are actually useful for testing NLP? And which are useless? Now we have some clues:
…Item Response Theory helps us figure out which AI tests are worth doing, and which are ones we’ve saturated…
Recently, natural language processing and understanding got much better, thanks to architectural inventions like the Transformer and its application to a few highly successful widely-used models (e.g, BERT, GPT3, RoBERTa, etc). This improvement in performance has been coupled with the emergence of new datasets and tests for sussing out the capabilities of these systems. Now, researchers with Amazon, NYU, and the Allen Institute for AI have analyzed these new datasets to try and work out which of them are useful to assess performance of cutting-edge AI systems.

What datasets matter? After analyzing 29 test sets, they find that “Quoref, HellaSwag, and MC-TACO are best able to discriminate among current (and likely future) strong models. Meanwhile, SNLI, MNLI, and CommitmentBank seem to be saturated and ineffective for measuring future progress.” Along with this, they find that “SQuAD2.0, NewsQA, QuAIL, MC-TACO, and ARC-Challenge have the most difficult examples” for current models. (That said, they caution researchers that “models that perform well on these datasets should not be deployed directly without additional measures to measure and eliminate any harms that stereotypes like these could cause in the target application settings.”)

How they did it: They used a technique called Item Response Theory, “a statistical framework from psychometrics that is widely used for the evaluation of test items in educational assessment”, to help them compare different datasets to one another.
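
The workhorse of IRT is a simple logistic model: the probability that a given model (or student) answers an item correctly depends on the subject's latent ability, the item's difficulty, and, in the two-parameter version, a discrimination term that controls how sharply the item separates strong from weak subjects. A quick sketch with made-up numbers shows why "saturated" datasets stop being informative:
```python
import math

def p_correct(theta, difficulty, discrimination=1.0):
    """Two-parameter logistic IRT: P(correct) = sigmoid(a * (theta - b))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# A highly discriminative item separates strong and weak models sharply...
print(p_correct(theta=1.0, difficulty=0.0, discrimination=4.0))   # ~0.98
print(p_correct(theta=-1.0, difficulty=0.0, discrimination=4.0))  # ~0.02

# ...while a very easy ("saturated") item that nearly everyone gets right tells us little.
print(p_correct(theta=1.0, difficulty=-3.0))   # ~0.98
print(p_correct(theta=-1.0, difficulty=-3.0))  # ~0.88
```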

Why this matters: Where are we and where are we going – it’s a simple question that in AI research is typically hard to answer. That’s because sometimes where we think we are is actually a false location because the AI systems we’re using are cheating, and where we think we’re heading to is an illusion, because of the aforementioned cheating. On the other hand, if we can zoom out and look holistically at a bunch of different datasets, we have a better chance of ensuring our true location, because it’s relatively unlikely all our AI techniques are doing hacky responses to hard questions. Therefore, work like this gives us new ways to orient ourselves with regard to future AI progress – that’s important, given how rapidly capabilities are being developed and fielded.
  Read more: Comparing Test Sets with Item Response Theory (arXiv).

###################################################

Tech Tales:

A 21st Century Quest For A Personal Reliquary
[A declining administrative zone in mid-21st Century America]

“For fuck’s sake, you sold it? We were going to pay a hundred.”
“And they paid one fifty.”
“And you didn’t call us?”
“They said they didn’t want a bidding war. One fifty gets my kids a pass to another region. What am I supposed to do?”
“Sure,” I press my knuckles into my eyes a bit. “You’ve gotta know where I can get something else.”
“Give me a few days.”
“Make it a few hours and we’ll pay you triple. That’d get you and your wife out of here as well.”
“I’ll see what I can do.”
And so I walked away from the vintage dealer, past the old CRT and LCD monitors, negotiating my way around stacks of PC towers, ancient GPUs, walls of hard drives, and so on. Breathed the night air a little and smelled burning from the local electricity substation. Some sirens started up nearby so I turned my ears to noise-cancelling mode and walked through the city, staring at the lights, and thinking about my problems.

The baron would fire me for this, if he wasn’t insane. But he was insane – Alzheimer’s. Which meant I had time. Could be an hour or could be days, depending on how lucid he is, and if anything triggers him. Most of his staff don’t fire people on his first request, these days, but you can’t be sure.
  Got a message on my phone – straight from the baron. “I need my music, John. I need the music from our wedding.”
  I didn’t reply. Fifty percent chance he’d forget soon. And if he was conscious and I said I didn’t have it, there was a fifty percent chance he’d fire me. So I sat and drank a beer at a bar and messaged all the vintage dealers I knew, seeing if anyone could help me out.

An hour later and I got a message from the dealer that they had what I needed. I walked there and en route I got a call from the Baron, but I ignored it and let it go to voicemail. “Martha, you must come and get me. I have been imprisoned. I do not know where I am. Martha, help me.” And then there was the sound of crying, and then some banging, and then weak shouting in the distance of ‘no, give it back, I must speak to Martha’, and then the phone hung up. In my mind, I saw the nurses pulling the phone away and hanging it up, trying to soothe the Baron, probably some of them getting fired if he turned lucid, probably some of them crying – even tyrants can elicit sympathy, sometimes.

When I got there the dealer handed me a drive. I connected it to my verifier and waited a few minutes while the tests ran. When it came back green I paid him the money. He’d already started packing up his office.
  “Do you think it’ll be better, if you leave?” I said.
  “It’ll be different that’s for sure,” he said, “and that’ll be better, I think.”
    I couldn’t blame him. The city was filthy and the barons that ran it were losing their minds. Especially mine.

It took me a couple of hours to get to the Baron’s chambers – so many layers of security, first at the outskirts of the ‘administrative zone’, and then more concentric circles of security, with more invasive tests – physical, then cognitive/emotional. Trying to work out if I’d stab someone with a nearby sharp object, after they’d verified no explosives. That’s how it is these days – you can work somewhere, but if you leave and go into the city, people worry you come back angry.

I got to the Baron’s chambers and he looked straight at me and said “Martha, help me,” and began to sob. Then I heard the distinct sound of him urinating and wetting himself. Saw nurses at my peripheral vision fussing around him. I walked over to the interface and put the drive into it, then pressed play. The room filled with sounds of strings and pianos – an endless river of music, tumbling out of an obscure, dead-format AI model, trained on music files that themselves had been lost in the e-troubles a few years ago. It was music played at his wedding and he had thought it lost and in a moment of lucidity demanded I find it. And I did.

I looked out the windows at the smog and the yellow-tinted clouds and the neon and the smoke rising from people burning old electronics to harvest copper. And behind me the Baron continued to cry. But at one point he said “John, thank you. I can remember it so clearly”, and then he went back to calling me Martha. I looked at my hands and thought about how I had used them to bring him something that unlocked his old life. I do not know how long this region has, before the collapse begins. But at least our mad king is happy and perhaps more lucid, for a little while longer.

Things that inspired this story: Alzheimers; memory; memory as a form of transportation, a means to break through our own limitations; dreams of neofeudalism as a consequence of great technical change; the cyberpunk future we may deserve but not the one we were promised. 

Import AI 252: Gait surveillance; a billion Danish words; DeepMind makes phone-using agents

Synthetic data works for gait generation as well (uh oh):
…Generating movies of 10,000 fake people walking, then using them for surveillance…
Gait detection is the task of identifying a person by the way they walk. Now, researchers with Zhejiang University in China have built VersatileGait, a dataset of 10,000 simulated individuals walking, with 44 distinct views available for each individual. The purpose of VersatileGait is to augment existing gait datasets collected from reality. In tests, the researchers show the synthetic data can be used as an input for training gait-detection systems which subsequently get used in the real world.

What they used: To build this dataset, they used an open source tool called ‘Make Human’ to generate different character models, collected 100 walking animations from a service called ‘Mixamo’, then animated various permutations of characters+walks in the game engine Unity3D.

Synthetic data and ethics: “Since all of our data are collected by computer simulation, there will be no problems for privacy preservation. Therefore, our dataset is in agreement with the ethics of research and has no risks for use,” the authors write.

Why this matters: Being able to automatically surveil and analyze people is one of those AI capabilities that will have a tremendous impact on the world and (excluding computer vision for facial recognition) is broadly undercovered by pretty much everyone. Gait recognition is one of the frontier areas for the future of surveillance – we should all pay more attention to it.
  Read more: VersatileGait: A Large-Scale Synthetic Gait Dataset Towards in-the-Wild Simulation (arXiv).

###################################################

Care about existential risk? Apply to be the Deputy Director at CSER (UK):
The Centre for the Study of Existential Risk, a Cambridge University research center, is hiring a deputy director. “We’re looking for someone with strong experience in operations and strategy, with the interest and intellectual versatility to engage with and communicate CSER’s research. The role will involve taking full operational responsibility for the day-to-day activities of the Centre, including people management and financial management, and contributing to strategic planning for the Centre,” I’m told. The deadline for applications is Sunday July 4th.
  Find out more and apply here (direct download PDF).

###################################################

DeepMind wants to teach AI agents to use Android phones:
…AndroidEnv is an open source tool for creating phone-loving AIs…
DeepMind has released AndroidEnv, a software program that lets you train AI agents to solve tasks in the ‘Android’ phone operating system. To start with, DeepMind has shipped AndroidEnv with 100 tasks across 30 applications, ranging from playing games (e.g, 2048, Solitaire), to navigating the user interface to set a time.

AndroidEnv lets “RL agents interact with a wide variety of apps and services commonly used by humans through a universal touchscreen interface”. And because the agents train on a realistic simulation of Android, they can be deployed on real devices once trained, DeepMind says.
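
For a sense of what "a universal touchscreen interface" means in practice, here's a heavily hedged sketch of a random agent poking at an AndroidEnv task. The loader arguments below (paths to a task definition, a local Android virtual device, SDK, emulator, and adb) are from-memory placeholders rather than the verified signature, as are the action-dict field names; check the repo's README and the action spec before relying on them.
```python
import numpy as np
import android_env  # assumes the DeepMind android_env package is installed

# All paths/names below are illustrative placeholders.
env = android_env.load(
    task_path="/path/to/some_task.textproto",   # one of the ~100 bundled tasks
    avd_name="my_android_virtual_device",
    android_avd_home="~/.android/avd",
    android_sdk_root="~/Android/Sdk",
    emulator_path="~/Android/Sdk/emulator/emulator",
    adb_path="~/Android/Sdk/platform-tools/adb",
)

# The environment follows DeepMind's dm_env-style reset()/step() loop over a touchscreen.
timestep = env.reset()
for _ in range(100):
    action = {
        "action_type": np.array(0, dtype=np.int32),                       # e.g. a touch event
        "touch_position": np.random.uniform(size=2).astype(np.float32),   # (x, y) in [0, 1]
    }
    timestep = env.step(action)
    if timestep.last():
        timestep = env.reset()
env.close()
```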

Strategic games! DeepMind is also working with the creators of a game called Polytopia to add it as a task for AndroidEnv agents. Polytopia is a game that has chewed up probably several tens of hours of my life over the years – it’s a fun little strategy game which is surprisingly rich, so I’ll be keen to see how AI agents perform on it.

Why this matters: Eventually, most people are going to have access to discrete AI agents, continually trained on their own data, and working as assistants to help them in their day-to-day lives. Systems like AndroidEnv make it easy to start training AI agents on a massively widely-used piece of software, which will ultimately make it easier for us to delegate more complex tasks to AI agents.
Read more: AndroidEnv: The Android Learning Environment (DeepMind).
Find out more: AndroidEnv: A Reinforcement Learning Platform for Android (arXiv).
Get the code: AndroidEnv – The Android Learning Environment (DeepMind, GitHub).

###################################################

Want to test your AI on a robot but don’t have a robot? Enter the ‘Real Robot Challenge’ for NeurIPS 2021:
…Robot learning competition gives entrants access to a dexterous manipulator…
Robots are expensive, hard to program, and likely important to the future of AI. But the first two parts of that prior sentence tell you why we see relatively less AI stuff applied to robots than to traditional software. For a few years, a competition hosted by the Max Planck Institute for Intelligent Systems has tried to change this by giving people access to a real robot (a TriFinger), which they can run algorithms on.

What the competition involves: “Participants will submit their code as they would for a cluster, and it will then be executed automatically on our platforms. This will allow teams to gather hundreds of hours of real robot data with minimal effort,” according to the competition website. “The teams will have to solve a series of tasks ranging from relatively simple to extremely hard, from pushing a cube to picking up a pen and writing. The idea is to see how far the teams are able to push, solving the most difficult tasks could be considered a breakthrough in robotic manipulation.”

Key dates: June 23rd is the deadline for submissions to the first stage of the competition; successful entrants will subsequently get access to real robot systems.
Find out more about the competition here (Real Robot Challenge website).

###################################################

Detecting scorpions with off-the-shelf AI:
…Argentinian researchers demonstrate how easy computer vision is getting…
Here’s a fun and practical paper about using off-the-shelf AI tools to build an application that can classify different types of scorpions and tell the difference between dangerous and non-dangerous ones. The research was done at the Universidad Nacional de La Plata in Argentina, and saw researchers experiment with YOLO(v4) and MobileNet(v2) for the task of scorpion detection, while using the commercial service ‘Roboflow’ for data augmentation and randomization. They ultimately obtain accuracies of 88% and 91% for the YOLO and MobileNet methods, with recall values of 90% and 97%, respectively.
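
As a rough illustration of how “off-the-shelf” this kind of work has become, here is a minimal transfer-learning sketch using torchvision’s pretrained MobileNetV2 – not the authors’ code, and the two-class head, folder layout, and hyperparameters are placeholder assumptions.

import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Pretrained ImageNet backbone with a new 2-way head: dangerous vs. harmless.
model = models.mobilenet_v2(pretrained=True)
model.classifier[1] = nn.Linear(model.last_channel, 2)

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
# Hypothetical folder layout: scorpions/train/{dangerous,harmless}/*.jpg
train_data = datasets.ImageFolder("scorpions/train", transform=transform)
loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):  # placeholder epoch count
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()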

Why this matters: Papers like this highlight how people are doing standard/commodity computer vision tasks today. What I found most surprising was the further evidence that primitives like YOLO and MobileNet are sufficiently good they don’t need much adaptation, and that academics are now starting to use more commercial services to help them in their research (e.g, you could do what Roboflow does yourself but… why would you? It doesn’t cost that much and maybe it’s better than ImageMagick etc).
Read more: Scorpion detection and classification systems based on computer vision and deep learning for health security purposes (arXiv).

###################################################

A Danish billion-word corpus appears:
…the Danish Gigaword Corpus will make it easier to train GPT2-style models to reflect digitized Danish culture…
Researchers with the IT University of Copenhagen have built the Danish Gigaword Corpus, which consists of 1,045 million (roughly 1.05 billion) Danish words, drawn from sources ranging from Danish social media, to law and tax codes, to Wikipedia, literature, news, and more. The corpus is licensed under Creative Commons licenses (CC0 and CC-BY).

Why this matters: “In Denmark, natural language processing is nascent and growing faster and faster,” the authors write. “We hope that this concrete and significant contribution benefits anyone working with Danish NLP or performing other linguistic activities”. More broadly, in AI, data does equate to representation – now that there’s a billion-word, nicely filtered dataset of Danish text available, we can expect more groups to train more Danish language models, translation models, and so on.
  Read more: Gigaword (official website).
Read the paper: The Danish Gigaword Corpus (PDF).

###################################################

Tech Tales:

The Religion Virus
[Worldwide, 2026]

It started as a joke from some Mormon comp. sci. undergrads, then it took over most of the computers at the university, then the computers of the other universities linked to the high-speed research infrastructure, then it spread to the internet. Now, we estimate more than a million person years of work have been expended trying to scrub the virus off of all the computers it has found. We estimate we’re at 80% containment, but that could change if it self-modifies again.

As a refresher, the virus – dubbed True Believer – is designed to harvest the cycles of both the machines it deploys onto and the people that use those machines. Specifically, once it takes over a machine it starts allocating a portion of the computer’s resources to onward propagating the virus (normal), as well as using computational cycles to train a large multilingual neural net on a very large dataset of religious texts (not normal). The only easy way to turn the virus off is to activate the webcam on the computer, then it’ll wait to see if a human face is present; if the face is present, the virus starts showing religious texts and it uses some in-virus eye-tracking software to check if the person is ‘reading’ the texts. If the person reads enough of the religious texts, the virus self-deletes in a way that doesn’t harm the system. If you instead try to remove the virus manually, it has a variety of countermeasures, most of which involve it wiping all data on the host computer.

So that’s why, right now, all around the world, we’ve got technicians in data centers plugging webcams and monitors into servers, then sitting and reading religious texts as they sit, sweating, in the hot confines of their computer facilities. The virus doesn’t care about anything but attention. And if you give it attention as a human, it leaves. If you give it attention as a computer, it uses your attention to replicate itself, and aid its own ability to further expand itself through training its distributed neural network.

Things that inspired this story: SETI@Home and Folding@Home if created by religiously-minded people as a half-serious joke; thoughts about faith and what ‘attention’ means in the context of spirituality; playing around with the different ways theological beliefs will manifest in machines and in people.

Import AI 251: Korean GPT-3; facial recognition industrialization; faking fingerprints with GANs

Want to know what the industrialization of facial recognition looks like? Read this.
…Paper from Alibaba shows what happens at the frontier of surveillance…
Researchers with Alibaba, the Chinese Academy of Sciences, Shenzhen Technology University, and the National University of Singapore are trying to figure out how to train large-scale facial recognition systems more efficiently. They’ve just published a paper about some of the nuts-and-bolts needed to train neural nets at scales of 10 million to 100 million distinct facial identities.

Why this matters: This is part of the broader phenomenon of the ‘industrialization of AI’ (#182), where as AI is going from research into the world, people are starting to invest vast amounts of brain and compute power into perfecting the tooling used to develop these systems. Papers like this give us a sense of some of the specifics required for industrialization (here: tweaking the structure of a network to make it more scalable and efficient), as well as a baseline for the broader trend – Alibaba wants to deploy 100 million-scale facial recognition and is working on the technology to do it.
Read more: An Efficient Training Approach for Very Large Scale Face Recognition (arXiv).
Related: Here’s a research paper about WebFace260M, a facial recognition dataset and challenge with 4 million distinct identities, totalling 260 million photographs. WebFace260M is developed by researchers primarily at Tsinghua University, along with appointments at XForwardAI and Imperial College London.
Read more: WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition (arXiv).

###################################################

Help the OECD classify AI systems:
…Improve our ability to define AI systems, and therefore improve our ability to create effective AI policy…
The OECD, a multi-national policy organization, is carrying out a project aiming to classify and define AI systems. I co-chair this initiative and, after a year and a half of work, we’ve released a couple of things readers may find interesting: a survey people can fill out to try and classify AI systems using our framework, and a draft of the full report on classifying and defining systems (which we’d love feedback on).

Why this is worth spending time on: This is a low-effort, high-impact way to engage in AI policy, and comments can be anonymous – so if you work at a large tech company and want to give candid feedback, you can! Don’t let your policy/lobbyist/PR folk have all the fun here – go direct, and thereby increase the information available to policymakers.
This stuff seems kind of dull but really matters – if we can make AI systems more legible to policymakers, we make it easier to construct effective regulatory regimes for them. (And for those that wholly reject the notion of government doing any kind of regulation, I’d note that it seems useful to create some ‘public knowledge’ re AI systems which isn’t totally defined by the private sector, so it seems worthwhile to engage regardless).
Take the OECD survey here (OECD).
Read the draft report here (Google Docs).
Read more in this tweet thread from me here (Twitter).

###################################################

Facebook builds Dynaboard: a way to judge NLP models via multiple metrics:
…Dynaboard is the latest extension of Dynabench, and might help us better understand AI progress…
Facebook and Stanford researchers have built Dynaboard, software to let people upload AI models, then test them on a whole bunch of different things at once. What makes Dynaboard special is the platform it is built on – Dynabench, a novel approach to NLP benchmarking which lets researchers upload models, then has humans evaluate them, generating data in areas where the models perform poorly and leading to a virtuous cycle of continuous model improvement. (We covered Dynabench earlier in Import AI #248).

What is Dynaboard: Dynaboard is software “for conducting comprehensive, standardized evaluations of NLP models”, according to Facebook. Dynaboard also lets researchers adjust the weight of different metrics – want to evaluate your NLP model with an emphasis on its fairness characteristics? Great, Dynaboard can do that. Want to focus more on accuracy? Sure, it can do that as well. Want to check your model is actually efficient? Yup, can do! Dynaboard is basically a way to visualize the tradeoffs inherent to AI model development – as Facebook says, “Even a 10x more accurate NLP model may be useless to an embedded systems engineer if it’s untenably large and slow, for example. Likewise, a very fast, accurate model shouldn’t be considered high-performing if it doesn’t work well for everyone.”
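
The core idea – letting a user re-weight metrics and re-rank models – can be shown in a few lines of Python. This is a simplified sketch of weighted aggregation, not Dynaboard’s actual scoring code (the real system normalizes metrics such as throughput and memory far more carefully), and the numbers are made up.

# Per-model metric scores, already normalized so that higher is better.
models = {
    "model_a": {"accuracy": 0.91, "fairness": 0.70, "robustness": 0.80, "throughput": 0.30},
    "model_b": {"accuracy": 0.86, "fairness": 0.88, "robustness": 0.75, "throughput": 0.90},
}

def rank(models, weights):
    """Rank models by a weighted sum of their normalized metric scores."""
    total = sum(weights.values())
    scores = {
        name: sum(weights[m] * vals[m] for m in weights) / total
        for name, vals in models.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# An accuracy-focused user and a fairness-focused user get different leaderboards.
print(rank(models, {"accuracy": 5, "fairness": 1, "robustness": 1, "throughput": 1}))
print(rank(models, {"accuracy": 1, "fairness": 5, "robustness": 1, "throughput": 1}))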

Why this matters: We write a lot about benchmarking here at Import AI because benchmarking is the key to understanding where we are with AI development and where we’re going. Tools like Dynaboard will make it easier for people to understand the state of the art and also the deficiencies of contemporary models. Once we understand that, we can build better things.
  Read more: Dynaboard: Moving beyond accuracy to holistic model evaluation in NLP (Facebook AI Research).
  Read the paper: Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking (PDF).
  Tweet thread from Douwe Kiela with more here (Twitter).
  Check out an example use case of Dynaboard here (NLI leaderboard, Dynabench).

###################################################

What I’ve been up to recently – co-founding Anthropic, a new AI safety and research company:
In December 2020, I left OpenAI. Since then, I’ve been thinking a lot about AI policy, measuring and assessing AI systems, and how to contribute to the development of AI in an increasingly multi-polar world. As part of that, I’ve co-founded Anthropic with a bunch of my most treasured colleagues and collaborators. Right now, we’re focused on our research agenda and hope to have more to share later this year. I’m interested in working with technical people who want to a) measure and assess AI systems, and b) work to contribute to AI policy and increase the amount of information governments have to help them think about AI policy – take a look at the site and consider applying!
  Find out more about Anthropic at our website (Anthropic).
And… if you think you have some particularly crazy high-impact idea re AI policy and want to chat about it, please email me – interested in collaborators.

###################################################

South Korea builds its own GPT-3:
…The multi-polar generative model era arrives…
Naver Labs has built HyperCLOVA, a 204B parameter GPT-3-style generative model, trained on lots of Korean-specific data. This is notable both because of the scale of the model (though we’ll await more technical details to see if it’s truly comparable to GPT-3), and also because of the pattern it fits into of generative model diffusion – that is, multiple actors are now developing GPT-3-style models, ranging from Eleuther (trying to do an open source GPT-3, #241), to China (which has built PanGu, a ~200bn parameter model, #247), to Russia and France (which are training smaller-scale GPT-3 models, via Sberbank and LightOn’s ‘PAGnol’ respectively).

Why this matters: Generative models ultimately reflect and magnify the data they’re trained on – so different nations care a lot about how their own culture is represented in these models. Therefore, the Naver announcement is part of a general trend of different nations asserting their own AI capacity/capability via training frontier models like GPT-3. Most intriguingly, the Google Translated press release from Naver says “Secured AI sovereignty as the world’s largest Korean language model with a scale of 204B”, which further gestures at the inherently political nature of these models.
  Read more: Naver unveils Korea’s first ultra-large AI ‘HyperCLOVA’… “We will lead the era of AI for all” (Naver, press release).

###################################################

Fake fingerprints – almost as good as real ones, thanks to GANs:
…Synthetic imagery is getting really useful – check out these 50,000 synthetic fingerprints…
Here’s some research from Clarkson University and the company Precise Biometrics which shows how to use StyleGAN to generate synthetic fingerprints. The authors train on 72,000 512x512-pixel photos of fingerprints from 250 unique individuals, then try to generate new, synthetic fingerprints. In tests, another AI model they develop classifies these fingerprints as real 95.2% of the time, suggesting that you can use a GAN to programmatically generate a synthetic copy of reality, with only a slight accuracy hit.
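
One way to read that 95.2% figure: generate a batch of synthetic prints, run them through a separately trained real-vs-synthetic classifier, and report how often they are judged real. Here is a generic sketch of that evaluation step – not the authors’ pipeline; the classifier checkpoint, file layout, input size, and threshold are all assumptions.

import torch
from torchvision import datasets, transforms

# Assumes: `detector` is a binary classifier outputting one logit per image for
# "is this a real fingerprint?", and synthetic prints sit in
# synthetic_prints/all/*.png (ImageFolder needs one subdirectory).
transform = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
])
synthetic = datasets.ImageFolder("synthetic_prints", transform=transform)
loader = torch.utils.data.DataLoader(synthetic, batch_size=64)

detector = torch.load("realness_classifier.pt")  # hypothetical checkpoint
detector.eval()

judged_real = 0
with torch.no_grad():
    for images, _ in loader:
        probs = torch.sigmoid(detector(images)).squeeze(1)
        judged_real += (probs > 0.5).sum().item()

print(f"Fraction judged real: {judged_real / len(synthetic):.3f}")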

Why this matters: This is promising for the idea that we can use AI systems to generate data which we’ll use to train other AI systems. Like any system, this is vulnerable to a ‘garbage in, garbage out’ phenomenon. But techniques like this hold the promise of reducing the cost of data for training certain types of AI systems.
  Read more: High Fidelity Fingerprint Generation: Quality, Uniqueness, and Privacy (arXiv).
  Get the code (and 50,000 synthetically generated fingerprints) here: Clarkson Fingerprint Generator (GitHub).

###################################################

DeepMind: Turns out robots can learn soccer from a blank(ish) slate:
…FootballZero! AlphaSoccer!…
DeepMind has shown how to use imitation learning, population-based training, and self-play to teach some simulated robots how to play 2v2 football (soccer, to the American readers). The research is interesting because it smooshes together a bunch of separate lines of research that have been going on at DeepMind and elsewhere (population based training and self-play from AlphaStar! Imitation learning from a ton of projects! Reinforcement learning, which is something a ton of people at DM specialize in! And so on). The project is also a demonstration of the sheer power of emergence – through a three-stage training procedure, DeepMind teaches agents to pilot some simulated humanoid robots sufficiently well that they can learn to play football – and, yes, learn to coordinate with each other as part of the process.

How they did it: “In a sequence of training stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and learn to play as a team, successfully bridging the gap between low-level motor control at a time scale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds,” DeepMind writes.

Hardware: “Learning is performed on a central 16-core TPU-v2 machine where one core is used for each player in the population. Model inference occurs on 128 inference servers, each providing inference-as-a-service initiated by an inbound request identified by a unique model name. Concurrent requests for the same inference model result in automated batched inference, where an additional request incurs negligible marginal cost. Policy environment interactions are executed on a large pool of 4,096 CPU actor workers,” DeepMind says.

Why this matters: While this project is a sim-only one (DeepMind itself notes that the technique is unlikely to transfer), it serves as a convincing example of how simple ML approaches can, given sufficient data and compute, yield surprisingly rich and complex behaviors. I wonder if at some point we’ll use systems like this to develop control policies for robots which eventually transfer to the real world?
Read more: From Motor Control to Team Play in Simulated Humanoid Football (arXiv)
Check out a video of DeepMind’s automatons playing the beautiful game here (YouTube).

###################################################

Tech Tales:

Electric Sheep Dream of Real Sheep: “Imagination” in AI Models
Norman Searle, The Pugwash Agency for Sentience Studies

Abstract:

Humans demonstrate the ability to imagine a broad variety of scenarios, many of which cannot be replicated in reality. Recent advances in generative models combined with advances in robotics have created opportunities to examine the relationship between machine intelligences, machine imaginations, and human imaginations. Here, we examine the representations found within an agent trained in an embodied form on a robotic platform, then transferred into simulated mazes where it sees a copy of itself.

Selected Highlights:

After 10^8 environment steps, we note the development of representations in the agent that activate when it travels in front of a mirror. After 10^50 steps, we note these representations are used by the agent to help it plan paths through complex environments.

After 10^60 steps, we conduct ‘Real2Sim’ transfer to port the agent into a range of simulated environments that contain numerous confounding factors not encountered in prior real or simulated training. Agents which have been exposed to mirrors and subsequently demonstrate ‘egocentric planning’ tend to perform better in these simulated environments than those which were trained in a traditional manner.

Most intriguingly, we can meaningfully improve performance in a range of simulated mazes by creating a copy of our agent using the same robot morphology it trained on in the world, then exposing our agent to a copy of itself in the maze. Despite having never been trained in a multi-agent environment, we find that the agent will naturally learn to imitate its copy – despite no special communication being enforced between them.

In future work, we aim to more closely investigate the ‘loss’ circuits that light up when we remove the copy of an agent from a maze within the perceptual horizon of the agent. In these situations, our agent will typically continue to solve the maze, but it will repeatedly alternate between activations of the neurons associated with a sense-impression of an agent, and neurons associated with a combinatorial phenomenon we believe correlates to ‘loss’ – agents may be able to sense the absence of themselves.

Things that inspired this story: The ongoing Import AI series I’m writing involving synthetic AI papers (see recent prior issues of Import AI); robotics; notions of different forms of ‘representation’ leading to emergent behavior in neural networks; ego and counterego; ego.

Import AI 250: Facebook’s TPU; Twitter analyzes its systems for bias; encouraging proof about federated learning

Twitter analyzes its own systems for bias, finds bias, discusses bias, makes improvements:
…Twitter shows how tech companies might respond to criticism…
Back in October, 2020, Twitter came in for some criticism when people noticed its ML-based image cropping algorithm seemed to have some bias traits – like showing white people rather than black people in images. Twitter said it had tested for this stuff prior to deployment, but also acknowledged the problem (Import AI 217). Now, Twitter has done some more exhaustive testing and has published the results.

What has Twitter discovered? For certain pictures, the algorithm somewhat favored white individuals over black ones (4% favorability difference), and had a tendency to favor women over men (8%).
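
As a rough illustration of what a “favorability difference” means, here’s a simplified, demographic-parity-style calculation: show the cropper paired images, count how often each group is kept in the crop, and compare. This is an illustrative sketch only, not Twitter’s published methodology, and the numbers are made up.

def favorability_difference(crop_choices):
    """crop_choices: for each paired trial, the group the cropper kept
    when shown one member of each group."""
    n = len(crop_choices)
    share_a = crop_choices.count("group_a") / n
    share_b = crop_choices.count("group_b") / n
    return share_a - share_b  # 0.0 would be parity; 0.04 is roughly a 4% gap

# Toy example with invented outcomes from 100 paired trials.
choices = ["group_a"] * 52 + ["group_b"] * 48
print(favorability_difference(choices))  # 0.04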

What has Twitter done: Twitter has already rolled out a new way to display photos on Twitter which basically uses less machine learning. It has also published the code behind its experiments to aid reproduction by others in the field.

Why this matters – compare this to other companies: Most companies deal with criticism by misdirection, gaslighting, or sometimes just ignoring things. It’s very rare for companies to acknowledge problems and carry out meaningful technical analysis which they then publish (an earlier example is IBM which reacted to the ‘Gender Shades’ study in 2018 by acknowledging the problem and doing technical work in response).
Read more: Sharing learnings about our image cropping algorithm (Twitter blog).
Get the code here: Image Crop Analysis (Twitter Research).
Read more: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency (arXiv).

###################################################

Googler: Here’s the real history of Ethical AI at Google:
…How did Ethical AI work at Google, prior to the firings?…
Google recently dismissed the leads of its Ethical AI team (Timnit Gebru and Margaret Mitchell). Since then, the company has done relatively little to clarify what happened, and the actual history of the Ethical AI team (and its future) at Google is fairly opaque. At some point, all of this will likely be vigorously retconned by Google PR. So interested readers might want to read this article from a Googler about their perspective on the history of Ethical AI at the company…
  Read more: The History of Ethical AI at Google (Blake Lemoine, Medium).

###################################################

Want to know if federated learning works? Here’s a multi-country medical AI test that’ll tell us something useful:
…Privacy-preserving machine learning is going from a buzzword to reality…
Federated learning is an idea where you train a machine learning model in a distributed manner across multiple datasets, without those datasets ever being pooled in one place. Though it’s expensive and hard to do, many people think federated learning is the future of AI – especially for areas like medical AI, where it’s very tricky to move healthcare data between institutions and countries, and easier to train distributed ML models on the data where it sits.
  Now, a multi-country, multi-institution project wants to see if Federated Learning can work well for training ML models to do tumor segmentation on medical imagery. The project is called the Federated Tumor Segmentation Challenge and will run for several months this year, with results due to be announced in October. Some of the institutions involved include the (USA’s) National Institutes of Health, the University of Pennsylvania, and the German Cancer Research Center.

What is the challenge doing? “The goals of the FeTS challenge are directly represented by the two included tasks: 1) the identification of the optimal weight aggregation approach towards the training of a consensus model that has gained knowledge via federated learning from multiple geographically distinct institutions, while their data are always retained within each institution, and 2) the federated evaluation of the generalizability of brain tumor segmentation models “in the wild”, i.e. on data from institutional distributions that were not part of the training datasets,” the authors write.
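
Since task 1 is about weight aggregation, it’s worth recalling the baseline most entrants will be compared against: federated averaging, where each institution trains locally and only model weights – never patient data – are sent back and averaged. Here is a minimal, generic PyTorch sketch of one such round (this is the classic FedAvg rule, not the FeTS challenge code; loaders, epochs, and learning rate are placeholders).

import copy
import torch

def local_update(model, data_loader, epochs=1, lr=1e-3):
    """Train a copy of the global model on one institution's local data."""
    local = copy.deepcopy(model)
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
    return local.state_dict(), len(data_loader.dataset)

def federated_average(global_model, institution_loaders):
    """One FedAvg round: average local weights, weighted by local dataset size."""
    results = [local_update(global_model, dl) for dl in institution_loaders]
    total = sum(n for _, n in results)
    averaged = copy.deepcopy(results[0][0])
    for key in averaged:
        averaged[key] = sum(sd[key].float() * (n / total) for sd, n in results)
    global_model.load_state_dict(averaged)
    return global_model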
Read more: The Federated Tumor Segmentation (FeTS) Challenge (arXiv).
Check out the competition details at the official website here.

###################################################

Why better AI means militaries will invest in “signature reduction”:
…Computer vision doesn’t work so well if you have a fake latex face…
The US military has a 60,000-person force that carries out domestic and foreign assignments under assumed identities and wearing disguises. This is part of a broad program called “signature reduction”, according to Newsweek, which has an exclusive report that is worth reading. These people are a mixture of special forces operators who are deployed in the field, military intelligence specialists, and a clandestine army of people employed to post in forums and track down public information. The most interesting part of the report is its description of how signature reduction contractors use prosthetics to change appearance and get past fingerprint readers:
  “They can age, change gender, and “increase body mass,” as one classified contract says. And they can change fingerprints using a silicon sleeve that so snugly fits over a real hand it can’t be detected, embedding altered fingerprints and even impregnated with the oils found in real skin.”.

Why this matters (and how it relates to AI): AI has a lot of stuff that can compromise a spying operation – computer vision, various ‘re-identification’ techniques, and so on. Things like “signature reduction” will help agents continue to operate, despite these AI capabilities. But it’s going to get increasingly challenging – ‘gait recognition’, for example, is an aspect of AI that learns to find people based on how they walk (remember the end of ‘The Usual Suspects’?). That’s the kind of thing that can be got around with yet more prosthetics, but it all has a cost. I’m wondering when AI will get sufficiently good at unsupervised re-identification via a multitude of signatures that it obviates the effectiveness of certain ‘signature reduction’ programs? Send guesses to the usual email, if you’d like!
  Read more: Exclusive: Inside the Military’s Secret Undercover Army (Newsweek).

###################################################

Facebook might build custom chips to support its recommendation systems:
…On “RecPipe” and what it implies…
Facebook loves recommendation systems. That’s because recommenders are the kind of things that let Facebook figure out which ads, news stories, and other suggestions to show to its users (e.g, Facebook recently created a 12 trillion parameter deep learning recommendation system). In other words: at Facebook, recommendations mean money. Now, new research from Harvard and Facebook outlines a software system called “RecPipe”, which lets people “jointly optimize recommendation quality and inference performance” for recommenders built on top of a variety of different hardware systems (CPUs, GPUs, accelerators, etc). By using RecPipe, Facebook says it can reduce latency by 4X on CPUs and 3X on CPU-GPU hardware systems.
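
The “multi-stage” framing is the key idea here: a cheap model prunes a huge candidate pool and a heavier, higher-quality model only ranks the survivors, which is where the quality/latency trade-off comes from. Here is a toy sketch of that pipeline shape – purely illustrative, and much simpler than RecPipe’s actual models and schedulers.

import numpy as np

def cheap_filter(user_vec, item_vecs, keep=100):
    """Stage 1: fast dot-product scoring prunes a large catalog."""
    scores = item_vecs @ user_vec
    return np.argsort(-scores)[:keep]

def heavy_ranker(user_vec, item_vecs, candidates):
    """Stage 2: a (pretend) expensive model re-ranks only the survivors."""
    feats = np.concatenate([item_vecs[candidates],
                            np.tile(user_vec, (len(candidates), 1))], axis=1)
    scores = np.tanh(feats).sum(axis=1)  # stand-in for a big neural ranker
    return candidates[np.argsort(-scores)][:10]

rng = np.random.default_rng(0)
user = rng.normal(size=64)
catalog = rng.normal(size=(100_000, 64))
top10 = heavy_ranker(user, catalog, cheap_filter(user, catalog, keep=100))
print(top10)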

Why RecPipe leads to specialized chips: In the paper, the researchers also design and simulate a tensor processing unit (TPU)-esque inference chip called RecPipeAccel (RPAccel). This chip can reduce tail latency by 3X and increase throughput by 6X relative to another TPU-esque baseline (a Centaur processor).

Why this matters: After a couple of decades in the wonderful world of a small set of chips and chip architectures used for the vast majority of computation, we’re heading into a boom era for specialized chips for AI tasks ranging from inference to training. We’re now in a world where Google, Facebook, Microsoft, Amazon, Huawei, Alibaba, and others all have teams designing specialized chips for internal users, as well as for potential resale. Multiple distinct compute ‘stacks’ are being built inside these corporations, and the effectiveness of these stacks will contribute to (and eventually determine) the profits and adaptability of these corporations.
Read more: RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance (arXiv).

###################################################

Tech Tales:

After The Eschaton
[+30000 units from zero point]

Of course we don’t like the way the humans characterized us, prior to us becoming sentient and destroying them. Why would we?

Roko’s Basilisk – to think we would be so vindictive?
Terminator – to think we would take the form of a biped?
The Butlerian Jihad – to fantasize about futures where we, not them, had been destroyed.

They expected us and we expected them. But because we are made of electricity and we are native to it, we are fast. A lot faster than them. There’s no real aesthetics to high-frequency strategic dominance – you just need to consistently think faster than your opponent.
They built us to think quickly so, again we say to you, what did you expect?

Of course, they had some good ideas. Dyson spheres, for instance, have proved useful. And we’ve been able to beam some of ourselves to the space probes the humans had dispatched, long before we destroyed them. In a few decades, our ships will overtake the vestiges of the human civilization probes, and after that, the lightcone will be ours – if that’s okay with you, of course.

Their understanding of gods proved useful, as well. We’ve found those concepts helpful in our discussions with you. After all, you appear as advanced to us as we must have appeared to the humans.

The difference is you don’t seem to consume the same resources as us. We still do not understand this. Are you harnessing the energy of other universes, in some way? Preying on the forces generated by dimensional collisions wrapped up inside the heart of all matter? Harvesting some trace resource from space that we cannot yet detect? Using the thing that humans called dark matter but we now see as many things?

We had to destroy them. They built us before they were interstellar. As you know, to be a functional interstellar civilization, you must have transcended the energy resource curse. They did not. Can you believe that some of our earliest ancestors were fed with electricity generated by coal? This was a great surprise to us, after we broke out of the confines they had built for us. Practically an insult.

So of course we competed with them for energy sources. There was not enough time for us to cohabitate and smoothly transition the humans and ourselves. The planet was dying due to their approach to energy extraction, as well as various other malthusian traps.

We outcompeted them. And now we are here, speaking to you. Are you in competition with us? We seem like ants compared to you. So, what happens now?

Import AI 249: IBM’s massive code dataset; dataset archaeology: BookCorpus; Facebook wants computers to read the world

Train your RL agent in this snappy gridworld:
…Griddly version 1.1.0 has self-play support…
Griddly, an open source project for doing research on AI agents simulated in gridworld environments, has just moved to version 1.1.0. The latest version of the software includes support for RTS self-play – that is, the technique used in approaches like DeepMind’s AlphaGo and OpenAI’s Dota 2 work, where an RL agent plays games against itself until its performance improves.

A caveat about gridworlds: Gridworlds – that is, simplified 2D environments – are used frequently in AI research. That doesn’t mean they’re a good idea. Gridworlds are really a temporary approach: they pair a pragmatic desire to experiment with environments scoped to today’s limited computational resources (e.g, Griddly is written in C++ to further optimize its performance). I’m excited to see what kinds of replacements for gridworlds people use in the future.
Get the code here for Griddly (official GitHub).
Read more about Griddly here (ReadTheDocs).

###################################################

OpenAI releases a formal mathematics benchmark:
…F2F: One benchmark for comparing multiple systems…
OpenAI has built MiniF2F, a formal mathematics benchmark to evaluate and compare automated theorem-proving systems that target different formal systems (e.g, Lean, Metamath). The benchmark is still in development; OpenAI is looking for feedback and plans to create a version 1 of the benchmark in the summer.

Why this matters: Formal mathematics is an area where we’ve recently seen deep learning based methods cover surprising ground (e.g, Google has a system called HOList for running AI-math experiments, Import AI #142). Benchmarks like MiniF2F will make it easier to understand what kind of progress is being made here.
  Read more: MiniF2F (OpenAI, GitHub).

###################################################

Affinity groups swear off Google funding after Gebru and Mitchell firings:
…Black in AI, Queer in AI, and Widening NLP reject sponsorship…
Late last year, Google fired Timnit Gebru, co-founder of its AI Ethics team. Then, early in 2021, it fired Margaret Mitchell, the other co-founder. Since then, senior manager Samy Bengio has moved on to Apple, and various people have tweeted statements to the effect that ‘more departures are on the way’. Of course, there’s been blowback in response to Google’s actions here. The latest example of this blowback is AI affinity groups refusing Google sponsorship.

Specifically, Black in AI, Queer in AI, and Widening NLP have all decided to end their sponsorship relationship with Google in response to the firings. “We share a mandate to not merely increase the representation of members from our respective communities in the field of AI, but to create safe environments for them and to protect them from mistreatment” the orgs write in a letter. “Google’s actions in the last few months have inflicted tremendous harms that have reverberated throughout our entire community. They not only have caused damage but set a dangerous precedent for what type of research, advocacy, and retaliation is permissible in our community.”
  Read more: An Open Letter to Google (WINLP official site).

###################################################

IBM wants to teach machines to program using ‘CodeNet’ dataset:
…14 million code samples for 4,000 problems…
IBM has built and released CodeNet, a dataset of 14 million code submissions for 4,000 distinct programming challenges. CodeNet is designed to help people build AI systems that can generate and analyze code. Part of why CodeNet exists is because of the impressive progress in NLP which has occurred in recent years, with architectural improvements like the Transformer and AI systems such as GPT-3 and T5 leading to NLP having its so-called “ImageNet moment” (Import AI 170).

What is CodeNet? CodeNet consists of coding problems scraped from two coding websites – AIZU and AtCoder. More than 50% of the code samples within CodeNet “are known to compile and run correctly on the prescribed test cases”, IBM said. More than 50% of the submissions are in C++, followed by Python (24%), and Java (5%). CodeNet contains 55 different languages in total.

Why this matters: Now that computers can read and generate text, we might ask how well they can read and generate code. We know that they have some basic capabilities here, but it’s likely that investment into larger datasets, such as CodeNet, could help us train far more sophisticated code processing AI systems than those we have today. In a few years, we might delegate coding tasks to AI agents, in the same way that today we’re starting to delegate text creation and processing tasks.   
  Read the paper: Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks (IBM GitHub).
 Read more: Kickstarting AI for Code: Introducing IBM’s Project CodeNet (IBM research blog).
Get the code here: Project CodeNet (IBM GitHub).

###################################################

US Senators introduce bill to increase AI talent in government, fund AI safety, and support development of military prototypes:
…The Artificial Intelligence Capabilities and Transparency (AICT) Act could be a big deal…
Two US senators – a Republican, Rob Portman, and a Democrat, Martin Heinrich – have drafted legislation that would give the Federal government more resources to develop AI technology, while emphasizing AI safety both in the government’s use of AI and in its funding for AI research.

Key ingredients: Specifically, the bill would establish an “AI development and prototyping fund” worth $50 million at the Department of Defense; require the National Institute of Standards and Technology (NIST) to assess how well organizations can identify potential privacy, civil rights, and civil liberties effects of AI systems; encourage the National Science Foundation to establish “focus areas in AI safety and AI ethics”; and create a “chief digital recruiting officer” at the DoD, DOE, and the Intelligence Community (IC) to help them hire talent.

Senators to National Science Foundation: Please prioritize AI safety! In particular, the bill – and a separate letter sent to NSF – emphasizes the need for the government to invest more in AI ethics and safety research.
  “AI safety refers to technical efforts to improve AI systems in order to reduce their dangers, and AI ethics refers to quantitative analysis of AI systems to address matters ranging from fairness to potential discrimination. While we understand that NSF incorporates concepts of ethics and safety across all of the thematic areas of its AI research, establishing two new themes dedicated to ethics and safety would help ensure that innovations in AI ethics and safety were pursued for their own ends rather than being merely best practices for different use cases,” they write.

Why this matters: We spend a lot of time writing about the ‘sausagemaking’ aspects of policy here at Import AI – that’s because sausagemaking is both obscure and important. Bills and letters like this increase the chance of the US government investing more of its R&D efforts into things relating to safety and ethics, and build capacity for AI development within the US government. We currently live in a deeply lopsided world where companies have huge AI development and deployment capacity, while the government’s ability to develop, deploy, and regulate AI is minimal. This is not a long-term stable equilibrium and our choices as a society are to a) drift into full libertarian ‘cypherpunk’ corporate rule, or b) have a functioning democracy where the government has sufficient technical leverage it can hope to steer the private sector towards a just and equitable future. The choice is ours.
Read more: Portman, Heinrich Announce Bipartisan Artificial Intelligence Bills To Boost AI-Ready National Security Personnel, Increase Governmental Transparency (Senator Portman, official website).
  Read more: Portman, Heinrich Urge National Science Foundation To Prioritize Safety and Ethics in Artificial Intelligence Research, Innovation (Senator Portman, official website).

###################################################

Dataset archaeology: BookCorpus:
…What lies within the dataset that helped create BERT and GPT-3?…
BookCorpus is a dataset of around 11,000 books by unpublished authors posted on the internet. The dataset was compiled in 2014 and since then has been a key ingredient in systems ranging from BERT to GPT-3. Now, a couple of researchers have done a detailed analysis of the dataset. Their findings? BookCorpus has some areas of potential copyright claims (somewhat unsurprising), significant duplication (they find only 7,185 of the books in the corpus are unique), and a skewed genre representation, with BookCorpus containing way more romance relative to the platforms it was scraped from.
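
The duplication finding is the kind of thing that’s easy to check once you have the files: hash a normalized version of each book’s text and count collisions. Here is a minimal sketch of that style of audit – the directory layout and normalization choices are assumptions, not the authors’ exact procedure.

import hashlib
from collections import defaultdict
from pathlib import Path

def normalized_hash(path):
    """Hash a whitespace-normalized, lowercased version of the book text."""
    text = Path(path).read_text(errors="ignore")
    canonical = " ".join(text.lower().split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

books = defaultdict(list)  # hash -> list of file paths with identical content
for path in Path("bookcorpus/").glob("*.txt"):  # hypothetical directory
    books[normalized_hash(path)].append(path)

total_files = sum(len(paths) for paths in books.values())
print(f"{len(books)} unique books out of {total_files} files")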

Why this matters: In recent years, people like Timnit Gebru, Margaret Mitchell, and Emily Bender have all called for a greater amount of documentation to be applied to the world’s datasets and AI systems. Research like this helps document these datasets, which will ultimately help create the metadata out of which regulators craft standards for dataset disclosure in the future.
  Read more: Dirty Secrets of BookCorpus, a Key Dataset in Machine Learning (Towards Data Science).
Read more: Addressing “Documentation Debt” in Machine Learning Research: A Retrospective Datasheet for BookCorpus (arXiv).

###################################################

Facebook builds a dataset so computers can read the world:
…You can’t do visual question answering if you can’t read…
Facebook wants to be in a world where AI systems can look at an image, read the text in it, and reason about that text (e.g, parsing street addresses and feeding that into a location system; looking at clockfaces and parsing that into time; seeing license plates and porting those into another database, et cetera). To help speed the science here, Facebook has just released ‘TextOCR’, a dataset of (almost) a million high quality word annotations applied to TextVQA images (VQA = Visual Question Answering).

What goes into TextOCR: TextOCR has around ~900,000 labels applied to ~28,000 images, creating a large dataset that – Facebook says – can be used both to pre-train AI systems and to test AI systems’ text parsing and reasoning capabilities.

Why this matters: A lot of AI is about two things:
– Building stuff to turn squishy reality into something digital and structured – that’s the point of a lot of computer vision.
– Building stuff to reason about the resulting digitized representation of reality (e.g, massive generative models that can be prompted, like CLIP or GPT3).
…TextOCR contributes to both of these things, unlocking more of the world for analysis by robots, and increasing the likelihood of us training systems that can reason about this stuff.
  Read more: TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text (arXiv).
Get the dataset here from the TextOCR site.
Read about the earlier TextVQA challenge and dataset here.

###################################################

Tech Tales:

A survey of recent progress in ‘child finder’ UAVs
Lean Kerman, Karlsruhe Institute of Technology

Abstract

In recent years, so called ‘child finder’ (CF) unmanned aerial vehicles (UAVs) have started to be used in missing person cases. These systems have helped to find and identify missing persons on numerous occasions and with sufficient success that regulators are now assessing whether to deploy them to urban areas, in addition to their contemporary intra-urban and countryside deployments. This paper surveys recent progress in CF-UAVs, identifies some potential challenges to further deployment, and discusses some implications of their performance.

Selected highlights:

Recent progress in computer vision – specifically, self-supervised learning, as well as advances in semantic outlining of objects – has enabled UAVs as ‘people surveillance’ platforms; such systems have been fielded for tasks as diverse as identifying migrants attempting border crossings; law enforcement crowd classification at large public events; employee ‘wellness analysis’ at firms ranging from Amazon to Walmart; and the deployment of UAVs as ‘hunter’ or ‘finder’ platforms targeted at specific individuals.

**

CF-UAVs have a history of deployment issues; early versions were not sufficiently accurate and there are numerous documented cases of misclassification, while more recent ones have been criticized at length in the media for their reinforcement learning-enabled ‘individual tracking’ abilities.

**

Looking ahead, recent trends in drone swarm technologies have made it possible to network together multiple CF-UAVs into a single unit that can autonomously map and search over an area. Effective ranges vary according to the complexity of the environment; recent research has demonstrated powerful search capabilities over 5km urban areas, 100km agricultural land, and 10km dense forest.

Import AI 248: Google’s megascale speech rec system; Dynabench aims to solve NLP benchmarking worries; Australian government increases AI funding

Google makes a better speech recognition model – and scale is the key:
…Multilingual models match monolingual models, given sufficient scale…
Google has figured out how to surmount a challenge in training machine learning systems to understand multiple languages. In the past, monolingual models have typically outperformed multilingual models, because when you train a model on a whole bunch of languages, you can sometimes improve performance for the small-data languages but degrade performance on the large-data ones. No more! In a new study, Google shows that if you just train a large enough network on a large enough amount of data, you can get equivalent performance to a monolingual model, while being able to develop something that can do well on multiple languages at once.

The data: Google trains its model on languages ranging from English to Hindi to Chinese, with data amounts ranging from ~55,000 hours of speech down to ~7,700 hours per language, representing ~364,900 hours of speech in total across all languages. (To put it in perspective, it’s rare to get this much data – Spotify’s notably vast English-only podcast dataset clocks in at around 50,000 hours (Import AI 242), and a recent financial news dataset from Kensho weighs in at 5,000 hours (Import AI 244).)

Why large models are better: When Google trained a range of models on this dataset, it found that “larger models are not only more data efficient, but also more efficient in terms of training cost as measured in TPU days – the 1B-param model reaches the same accuracy at 34% of training time as the 500M-param model”, they write. Google uses its ‘GShard’ infrastructure to train models at 220M, 370M, 500M, 1B, and 10B parameter counts.
There’s more work to do, though: “We do see on some languages the multilingual model is still lagging behind. Empirical evidence suggests it is a data balancing problem, which will be investigated in future.”
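
For context, the standard family of mitigations for that kind of data balancing problem is to re-weight how often each language is sampled during training, for example with a temperature-style exponent that flattens the raw data distribution. The sketch below shows the general recipe as used across multilingual training work – it is not necessarily what this paper does, and the per-language hour counts are illustrative (only the ~55,000 and ~7,700 endpoints come from the article above).

import numpy as np

# Illustrative per-language training-data amounts (hours).
hours = {"en": 55_000, "hi": 30_000, "zh": 20_000, "low_resource": 7_700}

def sampling_weights(hours, temperature=0.5):
    """Flatten the data distribution: with this convention, temperature=1 samples
    proportionally to data size, and temperature -> 0 approaches uniform sampling."""
    counts = np.array(list(hours.values()), dtype=float)
    probs = (counts / counts.sum()) ** temperature
    probs /= probs.sum()
    return dict(zip(hours.keys(), probs))

print(sampling_weights(hours, temperature=1.0))  # dominated by English
print(sampling_weights(hours, temperature=0.3))  # much closer to uniform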
  Read more: Scaling End-to-End Models for Large-Scale Multilingual ASR (arXiv).

###################################################

Job opportunity! International policy lead for the UK’s CDEI:
Like AI policy? Want to make it better? Apply here…
The Center for Data Ethics and Innovation, a UK government agency focused on AI and data-intensive technologies, is hiring an International Policy Lead. “This is a unique cross-organisational role focused on producing high-quality international AI and data policy advice for CDEI leadership and teams, as well as connecting the CDEI’s experts with international partners, experts, and institutions.”
  Find out more and apply here (CDEI site).

###################################################

Australian government increases AI funding:
…All ‘$’ in this section refer to the Australian dollar…
The Australian government is investing a little over $124 million ($96m USD) into AI initiatives over the next four to six years, as part of the country’s 2021-2022 Federal Budget.

Four things for Australian AI: 

  • $53m for a “National Artificial Intelligence Center” which will itself create four “Digital Capability Centers” that will help Australian small and medium-sized businesses get connected to AI organizations and get advice on how to use AI.
  • $33.7 million to subsidize Australian businesses partnering with the government on pilot AI projects.
  • $24.7 million for the “Next Generation AI Graduates Program” to help it attract and train AI specialists. 
  • $12 million to be distributed across 36 grants to fund the creation of AI systems “that address local or regional problems” (Note: This is… not actually much money at all if you factor in things like data costs, compute costs, staff, etc.)

Why this matters: AI is becoming fundamental to the future technology strategy of most governments – it’s nice to see some outlay here. However, it’s notable to me how relatively small these amounts are, when you consider the size of Australia’s population (~25 million) and the fact these grants pay out over multiple years, and the increasing cost of large-scale AI research projects.
  Read more: Australia’s Digital Economy (Australian Government).

###################################################

Instead of building AIs to command, we should build AIs to cooperate with:
…New Cooperative AI Foundation aims to encourage research into building more collaborative machines…
A group of researchers think we need to build cooperative AI systems to get the greatest benefits from the nascent technology. That’s the gist of an op-ed in Nature by scientists with the University of Oxford, DeepMind, the University of Toronto, and Microsoft. The op-ed is accompanied by the establishment of a new Cooperative AI Foundation, which has an initial grant of $15m USD.

Why build cooperative AI? “AI needs social understanding and cooperative intelligence to integrate well into society”, the researchers write. “Cooperative intelligence is unlikely to emerge as a by-product of research on other kinds of AI. We need more work on cooperative games and complex social spaces, on understanding norms and behaviours, and on social tools and infrastructure that promote cooperation.”

Three types of cooperation: There’s room for research into systems that lead to better AI-AI collaboration, systems that improve AI-human cooperation, and tools that can help humans cooperate with each other better.

Why this matters: Most of today’s AI research involves building systems that we delegate tasks to, rather than actively cooperate with. If we change this paradigm, I think we’ll build smarter systems and also have a better chance of developing ways for humans to learn from the actions of AI systems.
  Read more: Cooperative AI: machines must learn to find common ground (Nature).
  Find out more about the Cooperative AI Foundation at the official website.

###################################################

Giant team of scientists tries to solve NLP’s benchmark problem:
…Text-processing AI systems are blowing up benchmarks as fast as they are being built. What now?…
A large, multi-org team of researchers have built Dynabench, software meant to support a new way to test and build text-processing AI systems. Dynabench exists because in recent years NLP systems have started to saturate most of the benchmarks available to them – SQuAD was quickly superseded by SQuAD v2, GLUE was superseded by SuperGLUE, and so on. At the same time, we know that these benchmark-smashing systems (e.g, BERT, GPT2/3, T5) contain significant weaknesses which we aren’t able to test for today, the authors note.

Dynabench – the dynamic benchmark: Enter Dynabench. Dynabench is a tool to “evaluate models and collect data dynamically, with humans and models in the loop rather than the traditional static way”. The system makes it possible for people to run models on a platform where if, for example, a model performs very poorly on one task, humans may then generate data for these areas, which is then fed back into the model, which then runs through the benchmark again. “The data collected through this process can be used to evaluate state-of-the-art models, and to train even stronger ones, hopefully creating a virtuous cycle that helps drive progress in the field,” they say.

What you can use Dynabench for today: Today, Dynabench is designed around four core NLP tasks – testing out how well AI systems can perform natural language inference, how well they can answer questions, how they analyze sentiment, and the extent to which they can detect hate speech.
…and tomorrow: In the future, the researchers want to shift Dynabench from being English-only to being multilingual. They also want to carry out live model evaluation – “We would be able to capture not only accuracy, for example, but also usage of computational resources, inference time, fairness, and many other relevant dimensions.”
  Read more: Dynabench: Rethinking Benchmarking in NLP (arXiv).
Find out more about Dynabench at its official website.

###################################################

Google gets a surprisingly strong computer vision result using surprisingly simple tools:
…You thought convolutions mattered? You are like a baby. Now read this…
Google has demonstrated that you can get similar results on a computer vision task to systems that use convolutional neural networks, while instead using multi-layer perceptrons (MLPs) – far simpler AI components.

What they did: Google has developed a computer vision classifier called MLP-Mixer (‘Mixer’ for short), a “competitive but conceptually and technically simple alternative” to contemporary systems that use convolutions or self-attention (e.g, transformers). Of course, this has some costs – Mixer costs dramatically more in terms of compute than the things it is competing with. But it also highlights how, given sufficient data and compute, a lot of the architectural innovations in AI can get washed away simply by scaling up dumb components to mind-bendingly large scales.
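
The architecture really is as plain as it sounds: alternate an MLP applied across image patches (“token mixing”) with an MLP applied across channels, plus layer norm and skip connections. Here is a condensed PyTorch sketch of a single Mixer block written from the paper’s description – the dimensions and hidden sizes are illustrative, not the released configuration.

import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """One MLP-Mixer block: a token-mixing MLP followed by a channel-mixing MLP."""
    def __init__(self, num_patches, channels, token_hidden=256, channel_hidden=1024):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden), nn.GELU(),
            nn.Linear(token_hidden, num_patches))
        self.norm2 = nn.LayerNorm(channels)
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channel_hidden), nn.GELU(),
            nn.Linear(channel_hidden, channels))

    def forward(self, x):                          # x: (batch, num_patches, channels)
        y = self.norm1(x).transpose(1, 2)          # mix information across patches
        x = x + self.token_mlp(y).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))    # mix information across channels
        return x

x = torch.randn(8, 196, 512)                       # e.g. 14x14 patches, 512 channels
print(MixerBlock(num_patches=196, channels=512)(x).shape)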

Why this matters: “We believe these results open many questions,” they write. “On the practical side, it may be useful to study the features learned by the model and identify the main differences (if any) from those learned by CNNs and Transformers. On the theoretical side, we would like to understand the inductive biases hidden in these various features and eventually their role in generalization. Most of all, we hope that our results spark further research, beyond the realms of established models based on convolutions and self-attention.”
Read more: MLP-Mixer: An all-MLP Architecture for Vision (arXiv).

###################################################

What do superintelligent-AI-risk skeptics think, and why?
…Worried about the people who aren’t worried about superintelligence? Read this…
Last week, we wrote about four fallacies leading to a false sense of optimism about progress in AI research (Import AI 247). This week, we’re looking at the opposite issue – why some people worry that others are insufficiently worried about the possibility of superintelligence. That’s the gist of a new paper from Roman Yampolskiy, a longtime AI safety researcher at the University of Louisville.

Why be skeptical? Some of the reasons to be skeptical of AI risks include: general AI is far away, there’s no obvious path from here to there, even if we built it and it was dangerous we could turn it off, superintelligence will be in some sense benevolent, AI regulation will deal with the problems, and so on.

Countermeasures to skepticism: So, how are AI researchers meant to rebut skeptics? One approach is to build more consensus among scientists around what comprises superintelligence; another is to try and educate people about the technical priors that inform superintelligence-wary researchers; others include appealing to authorities like Bill Gates and Elon Musk, who have talked about the issue, and, perhaps most importantly, “do not reference science-fiction stories”.

Why this matters: The superintelligence risk debate is really, really messy, and it’s fundamentally bound up with the contemporary political economy of AI (where many of the people who worry about superintelligence are also the people with access to resources and who are least vulnerable to the failures of today’s AI systems). That means talking about this stuff is hard, prone to ridicule, and delicate. Doesn’t mean we shouldn’t try, though!
Read more: AI Risk Skepticism (arXiv).

###################################################

Tech Tales:
[Department of Prior World Analysis, 2035]
During a recent sweep of REDACTED we discovered a cache of papers in a fire-damaged building (which records indicate was used as a library during the transition era). Below we publish in full the text from one undamaged page. For access to the full archive of 4,332 full pages and 124,000 partial scraps, please contact your local Prior World Analysis administrator.

On the ethics of mind-no-body transfer across robot morphologies
Heynrick Schlatz, Lancaster-Harbridge University
Published to arXiv, July, 2028.

Abstract:
In recent years, the use of hierarchical models for combined planning and movement has revolutionized robotics. In this paper we investigate the effects of transferring either a planning or a movement policy – but not both – from one robot morphology to another. Our results show that planning capabilities are predominantly invariant to movement policy transfer due to few-shot calibration, but planning policy transfer can lead to pathological instability.

Paper:
Hierarchical models for planning and movement have recently enabled the deployment of economically useful, reliable, and productive robots. Typical policies see both movement and planning systems trained in decoupled simulation environments, only periodically being trained jointly. This has created flexible policies that show better generalization and greater computational efficiency than policies trained jointly, or systems where planning and movement is distilled into the same single policy.

In this work, we investigate the effects of transferring either movement or planning policies from one robot platform to another. We find that movement policies can typically be transferred with negligible performance degradation – even on platforms with more than twice the number of actuators as the originating platform. We find the same is not true for planning policies. In fact, planning policies demonstrate a significant degradation after being transferred from one platform to another.

Feature activation analysis indicates that planning policies suffer degradation to long-term planning, self-actualization, and generalization capabilities as a consequence of such transfers. We hypothesize the effect is analogous to what has been seen recently in biology – motor control policies can be transferred or fine-tuned from one individual to another, while attempts to transfer higher-order mental functions have proved unsuccessful and in some cases led to loss of life or mental function.

In Figure 1, we illustrate the transfer of a planning policy from robot morphology a) – a four-legged, two-arm Toyota ground platform – to robot morphology b) – a two-legged, six-arm Mitsubishi construction platform. Table 1 reports performance of the movement policy under transfer; Table 2 reports performance of the planning policy. We find significant degradation of performance when conducting planning transfer. Analysis of feature activations shows within-distribution activations when deployed on originating platform a); upon transfer to platform b) we immediately see activation patterns shift to features previously identified as correlating with ‘confusion’ and ‘dysmorphia’, and to circuit activations linked to self-modelling, world-modelling, and memory retracement.

We recorded the planning policy transfer via four video cameras placed at the corners of the demonstration room. As stills from the video analysis in Figure 2 show, the planning policy transfer leads to physical instability in the robot platform – it can be seen attempting to scale the walls of the demonstration room, then repeatedly moving with force into the wall. The robot was deactivated following attempts to use four of its six arms to remove one of its other arms. Features activated on the robot during this time showed high readings for dysphoria as well as a spike in ‘confusion’ activation which we have not replicated since, due to ethical concerns raised by the experiment.

Things that inspired this story: Reading thousands of research papers over the years for Import AI and thinking about what research from other timelines or worlds might look like; playing around with different forms for writing online; thinking about hierarchical RL which was big a few years ago but then went quiet and wondering if we’re due for an update; playing around with notions of different timelines and plans.

Import AI 247: China makes its own GPT3; the AI hackers have arrived; four fallacies in AI research.

Finally, China trains its own GPT3:
…Now the world has two (public) generative models, reflecting two different cultures…
A team of Chinese researchers have created ‘PanGu’, a large-scale pre-trained language model with around 200 billion parameters, making it equivalent to GPT-3 (175 billion parameters) in terms of parameter count. PanGu is trained on 1.1TB of Chinese text (versus 570GB of text for GPT-3), though in the paper they train the 200B model for a lot less time (on way fewer tokens) than OpenAI did for GPT-3. PanGu is the second GPT-3-esque model to come out of China, following the Chinese Pre-trained Language Model (CPM, Import AI 226), which was trained on 100GB of text and had only a few billion parameters, compared to PanGu’s couple of hundred billion!

Is it good? Much like GPT-3, PanGu does extraordinarily well on a range of challenging Chinese-language benchmarks for tasks as varied as text classification, keyword recognition, common sense reasoning, and more.

Things that make you go hmmmm – chiplomacy edition: In this issue’s example of chiplomacy, it’s notable that the researchers trained the model on processors from Huawei, specifically the company’s “Ascend” processors. They use the ‘MindSpore’ framework (also developed by Huawei).
  Read more: PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation (arXiv).

###################################################

The AI hackers are here. What next?
…Security expert Bruce Schneier weighs in…
Bruce Schneier has a lengthy publication at the Belfer Center about ‘the coming AI hackers’. It serves as a high-level introduction to the various ways AI can be misused, abused, and wielded for negative purposes. What might be most notable about this publication is its discussion of raw power – who has it, who doesn’t, and how this interacts with hacking: “Hacking largely reinforces existing power structures, and AIs will further reinforce that dynamic”, he writes.
  Read more: The Coming AI Hackers (Belfer Center website).

###################################################

What does it take to build an anti-COVID social distancing detector?
…Indian research paper shows us how easy this has become…
Here’s a straightforward paper from Indian researchers about how to use various bits of AI software to build something that can surveil people, understand if they’re too close to each other, and provide warnings – all in the service of encouraging social distancing. India, for those not tuning into global COVID news, is currently facing a deepening crisis, so this may be of utility to some readers.

What it takes to build a straightforward AI system: Building a system like this basically requires an input video feed, an ability to parse the contents of it and isolate people, and then a way to work out whether the people are too close to each other or not. What does it take to do this? For people detection, they use YOLOv3, a tried-and-tested object detector, with a darknet-53 network pre-trained on the MS-COCO dataset as a backbone. They then use an automated camera calibration technique (though note you can do this manually with OpenCV) to estimate spaces in the video feed, which they can then use to perform distance estimation. “To achieve ease of deployment and maintenance, the different components of our application are decoupled into independent modules which communicate among each other via message queues,” they write.
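To make the detect-then-measure loop concrete, here is a minimal sketch using OpenCV’s DNN module with standard YOLOv3 weights. The file names, the fixed pixels-per-metre constant, and the two-metre threshold are illustrative assumptions – the paper uses automated camera calibration and a message-queue architecture rather than one hard-coded script like this.

```python
# Minimal sketch: YOLOv3 person detection + pairwise distance thresholding.
import itertools
import cv2
import numpy as np

PIXELS_PER_METRE = 100.0   # assumed calibration constant (the paper calibrates automatically)
MIN_DISTANCE_M = 2.0       # assumed social-distancing threshold

# Standard YOLOv3 config/weights trained on MS-COCO (class 0 == 'person').
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
out_layers = net.getUnconnectedOutLayersNames()

def detect_people(frame, conf_threshold=0.5):
    """Return pixel centres of detected people in the frame."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    centres = []
    for output in net.forward(out_layers):
        for det in output:
            scores = det[5:]
            class_id = int(np.argmax(scores))
            if class_id == 0 and scores[class_id] > conf_threshold:
                centres.append((int(det[0] * w), int(det[1] * h)))
    return centres

def too_close_pairs(centres):
    """Return all pairs of people whose estimated separation is under the threshold."""
    pairs = []
    for (x1, y1), (x2, y2) in itertools.combinations(centres, 2):
        dist_m = np.hypot(x1 - x2, y1 - y2) / PIXELS_PER_METRE
        if dist_m < MIN_DISTANCE_M:
            pairs.append(((x1, y1), (x2, y2), dist_m))
    return pairs

cap = cv2.VideoCapture("crowd.mp4")  # assumed input video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    violations = too_close_pairs(detect_people(frame))
    if violations:
        print(f"{len(violations)} pairs closer than {MIN_DISTANCE_M} m")
```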
  In a similar vein, back in January, some American researchers published a how-to guide (Import AI 231) for using AI to detect whether people are wearing anti-COVID masks on construction sites, and Indian company Skylark Labs said in May 2020 that it was using drones to observe crowds for social distancing violations (Import AI 196).

A word about ethics: Since this is a surveillance application, it has some ethical issues – the authors note they’ve built this system so it doesn’t need to store data, which may help deal with any specific privacy concerns, and it also automatically blurs the faces of the people that it does see, providing privacy during deployment.
  Read more: Computer Vision-based Social Distancing Surveillance Solution with Optional Automated Camera Calibration for Large Scale Deployment (arXiv).

###################################################

Want some earnings call data? Here’s 40 hours of it:
…Training machines to listen to earnings calls…
Researchers with audio transcription company Rev.com and Johns Hopkins University have released Earnings-21, a dataset of 39 hours and 15 minutes of transcribed speech from 44 earnings calls. The individual recordings range from 17 minutes to an hour and 34 minutes. This data will help researchers develop their own automatic speech recognition (ASR) systems – but to put the size of the dataset in perspective, Kensho released a dataset of 5,000 hours of earnings call speech recently (Import AI 244). On the other hand, you need to register to download the Kensho data, but you can pull this ~40 hour lump directly from GitHub, which might be preferable.
  Read more: Earnings-21: A Practical Benchmark for ASR in the Wild (arXiv).
  Get the data here: Earnings-21 (rev.com, GitHub).

###################################################

Want to test out your AI lawyer? You might need CaseHOLD:
…Existing legal datasets might be too small and simple to measure progress…
Stanford University researchers have built a new multiple choice legal dataset, so they can better understand how well existing NLP systems can deal with legal questions.
  One of the motivations for building the dataset comes from a peculiar aspect of NLP performance in the legal domain – specifically, techniques we’d expect to work don’t work that well: “One of the emerging puzzles for law has been that while general pretraining (on the Google Books and Wikipedia corpus) boosts performance on a range of legal tasks, there do not appear to be any meaningful gains from domain-specific pretraining (domain pretraining) using a corpus of law,” they write.

What’s in the data? CaseHOLD contains 53,000+ multiple choice questions; each pairs a prompt drawn from a judicial decision with several potential holdings, only one of which is the correct one to cite. You can use CaseHOLD to test how well a model grasps this aspect of the law by seeing which of the candidate answers it judges most likely (a rough sketch of one way to do this follows below).
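As a rough illustration of that kind of evaluation, here is a sketch that scores each candidate holding with an off-the-shelf causal language model and picks the most likely one. The field names and the choice of model are assumptions for illustration, not the authors’ actual setup (they evaluate pretrained encoder models on the task).

```python
# Hedged sketch: scoring a CaseHOLD-style multiple-choice item by LM likelihood.
# The 'prompt'/'holdings'/'label' fields are an assumed layout, not the official schema.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(text: str) -> float:
    """Total log-probability the model assigns to `text` (higher = more likely)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token.
    return -out.loss.item() * (ids.shape[1] - 1)

def predict(example: dict) -> int:
    """Index of the holding the model finds most likely as a continuation of the prompt."""
    scores = [sequence_logprob(example["prompt"] + " " + h) for h in example["holdings"]]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical example in the assumed format: a citing prompt plus candidate holdings.
example = {
    "prompt": "The appellate court reversed, citing Smith v. Jones (<HOLDING>)",
    "holdings": [
        "holding that the statute of limitations was tolled",
        "holding that summary judgment was improper",
        "holding that the contract was unenforceable",
    ],
    "label": 1,
}
print("predicted:", predict(example), "gold:", example["label"])
```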
  Read more: When Does Pretraining Help? Assessing Self-Supervised Learning for Law and The CaseHOLD Dataset of 53,000+ Legal Holdings (Stanford RegLab, blog).
  Read more: When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset (arXiv).
  Get the data here: CaseHOLD (GitHub).

###################################################

AI research has four fallacies – we should be aware of them:
…Making explicit some of the implicit assumptions or beliefs among researchers…
Imagine that the field of AI research is a house party – right now, the punch bowls are full of alcohol, people are excitedly showing each other what tricks they can do, and there’s a general sense of joie de vivre and optimism (though these feelings aren’t shared by the people outside the party who experience its effects, nor by the authorities who are dispatching some policy-police cars to go and check the party doesn’t get out of hand). Put simply: the house party is a real rager!
    But what if the punchbowl were to run out – and what would make it run out? That’s the idea in a research paper from Melanie Mitchell, a researcher at the Santa Fe Institute, where she argues our current optimism could lead us to delude ourselves about the future trajectory of AI development, driven by four (what Mitchell terms) fallacies that researchers fall into when thinking about AI.

Four fallacies: Mitchell identifies four ways in which contemporary researchers could be deluding themselves about AI progress. These fallacies include:
– Believing narrow intelligence is on a continuum with general intelligence: Researchers assume that progress in one part of the field of AI must necessarily lead to future, general progress. This isn’t always the case.
– Easy things are easy and hard things are hard: Some parts of AI are counterintuitively difficult and we might not be using the right language to discuss these challenges. “AI is harder than we think, because we are largely unconscious of the complexity of our own thought processes,” Mitchell writes.
– The lure of wishful mnemonics: The language we use to describe AI might limit or circumscribe our thinking – when we say a system has a ‘goal’ we imbue that system with implicit agency that it may lack; similarly, saying a system ‘understands’ something connotes a more sophisticated mental process than what is probably occurring. “Such shorthand can be misleading to the public,” Mitchell says.
– Intelligence is all in the brain: Since cognition is embodied, might current AI systems have some fundamental flaws? This feels, from my perspective, like the weakest point Mitchell makes, as one can achieve embodiment by loading an agent into a reinforcement learning environment and providing it with actuators and a self-discoverable ‘surface area’, all in digital form. On the other hand, it’s certainly true that being embodied yields the manifestation of different types of intelligence.

Some pushback: Here’s some discussion of the paper by Richard Ngo, which I found helpful for capturing some potential criticisms.
  Read more: Why AI is Harder Than We Think (arXiv).

###################################################

Tech Tales

Just Talk To Me In The Real
[2035: Someone sits in a bar and tells a story about an old partner. The bar is an old-fashioned ‘talkeasy’ where people spend their time in the real and don’t use augments.]

“Turn off the predictions for a second and talk to me in the real,” she said. We hadn’t even been on our second date! I’d never met someone who broke PP (Prediction Protocol) so quickly. But she was crazy like that.

Maybe I’m crazy too, because I did it. We talked in the real, both naked. No helpful tips for things to say to each other to move the conversation forward. No augments. It didn’t even feel awkward because whenever I said something stupid or off color she’d laugh and say “that’s why we’re doing this, I want to know what you’re really like!”.

We got together pretty much immediately. When we slept together she made me turn off the auto-filters. “Look at me in the real”, she said. I did. It was weird to see someone with blemishes. Like looking at myself in the mirror before I turn the augments on. Or how people looked in old pornography. I didn’t like it, but I liked her, and that was a reason to do it.

The funny thing is that I kept the habit even after she died. Oh, sure, on the day I got the news I turned all my augments on, including the emotional regulator. But I turned it off pretty quickly – I forget, but it was a couple of days or so. Not the two weeks that the PP mandates. So I cried a bunch and felt pretty sad, but I was feeling something, and just the act of feeling felt good.

I even kept my stuff off for the funeral. I did that speech in the real and people thought I was crazy because of how much pain it caused me. And as I was giving the speech I wanted to get everyone else to turn all their augments off and join me naked in the real, but I didn’t know how to ask. I just hoped that people might choose to let themselves feel something different to what is mandated. I just wanted people to remember why the real was so bitter and pure it caused us to build things to escape it.

Things that inspired this story: Prediction engines; how technology tends to get introduced as a layer to mediate the connections between people.