Import AI 254: Facebook uses AI for copyright enforcement; Google uses RL to design better chips.

Agronerds rejoice… a pan-European crop parcel + satellite image dataset is on the way:
The University of Munich and a geospatial company called GAF AG want to create a map of as much of the farmland in Europe as possible (with data on the specific crop type and use of each individual parcel of land), then pair this with geospatial data gathered by Sentinel satellites. The dataset is called EuroCrops, and the idea is to use it as the data fuel for systems that use machine learning to automatically classify and map crop types from a variety of data sources. This is the kind of ‘dull but worthy’ research that illustrates how much effort goes into creating some science-targeted datasets. For instance…

A whole lot of work: The authors contacted ministries, agricultural departments, and authorities from 24 European states. As a result, the initial version of EuroCrops contains data for 13 countries: Austria, Belgium, Croatia, Denmark, Estonia, France, Latvia, Lithuania, Netherlands, Portugal, Sweden, Slovakia, and Slovenia. There are also plans to incorporate data from Finland, Romania, and Spain. To assemble this dataset, they also needed to translate each country’s non-harmonized way of describing crops into a single schema, which they then applied across the dataset. That’s the kind of excruciatingly painful task required to make country-level data legible when compared internationally.

Demo dataset: A full dataset is expected in time, but to start they’ve published a demo dataset covering data from Austria, Denmark, and Slovenia, and made this available in a variety of formats (CSV, HDF5 for the Sentinel data, and GeoJSON).
  Read more: EuroCrops: A Pan-European Dataset for Time Series Crop Type Classification (arXiv).
  Get the dataset from the official EuroCrops website.
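
The harmonization step described above – translating each country’s own crop vocabulary into one shared schema – can be sketched in a few lines. The labels, country codes, and harmonized codes below are illustrative stand-ins, not the actual EuroCrops taxonomy:

```python
# Toy sketch of crop-label harmonization: each country reports crops in
# its own language and taxonomy, and a shared mapping translates them
# into one schema so parcels can be compared across borders.
# All labels/codes here are hypothetical, not the EuroCrops schema.

# (country, national crop label) -> (harmonized name, harmonized code)
HARMONIZED_SCHEMA = {
    ("AT", "Winterweichweizen"): ("winter_common_wheat", "1101"),
    ("DK", "Vinterhvede"):       ("winter_common_wheat", "1101"),
    ("SI", "Ozimna pšenica"):    ("winter_common_wheat", "1101"),
    ("AT", "Silomais"):          ("silage_maize", "1205"),
}

def harmonize(country: str, raw_label: str):
    """Translate a national crop label into the shared schema."""
    try:
        return HARMONIZED_SCHEMA[(country, raw_label)]
    except KeyError:
        # Unmapped labels get flagged rather than silently guessed.
        return ("unknown", None)

parcels = [
    {"country": "AT", "crop": "Winterweichweizen"},
    {"country": "DK", "crop": "Vinterhvede"},
]
for p in parcels:
    name, code = harmonize(p["country"], p["crop"])
    p["harmonized_crop"], p["harmonized_code"] = name, code
```

The real dataset does this for millions of parcels across 13 national vocabularies, which is what makes the assembly work so painful.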

###################################################

Big models are great – but they’re also getting more efficient, like this massive mixture-of-experts vision system:
…Sparsity comes to computer vision…
Google has built a large-scale, sparse model for computer vision, using a technique called a V-MoE (a Vision Mixture-of-Experts model). V-MoE is a variant of the ‘Vision Transformer’ (ViT) architecture, which swapped out convolutions for transformers and has been the key invention behind a bunch of recent impressive results out of Google. Google uses the V-MoE to train vision models of up to 15B parameters – “the largest vision models to date”, it says in the research paper. These models can match the performance of other state-of-the-art dense models while taking less time to train.

Top scores and surprising efficiency: Google’s largest V-MoE model gets 90.35% test accuracy on ImageNet. More intriguingly, these sparse models beat their dense counterparts outright: “V-MoEs strongly outperform their dense counterparts on upstream, few-shot and full fine-tuning metrics in absolute terms. Moreover, at inference time, the V-MoE models can be adjusted to either (i) match the performance of the largest dense model while using as little as half of the amount of compute, or actual runtime, or (ii) significantly outperform it at the same cost.” The V-MoE models were pre-trained on JFT-300M, Google’s secret in-house dataset.
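
The core sparsity trick is in the routing: a gating function scores every expert for each input, only the top-k experts actually run, and their outputs are combined weighted by the re-normalized gate scores. Here’s a toy sketch of that idea – the dimensions, gate, and expert functions are stand-ins, not the V-MoE implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(x, experts, gate_weights, k=2):
    """Route input x to the top-k experts by gate score."""
    # Gate: one score per expert (here a simple dot product with x).
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for w in gate_weights]
    probs = softmax(scores)
    topk = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)
    # Only the selected experts compute anything -- this is where the
    # savings come from when there are many experts but small k.
    out = [0.0] * len(x)
    for i in topk:
        y = experts[i](x)
        for j in range(len(x)):
            out[j] += (probs[i] / norm) * y[j]
    return out, topk

# Four toy experts, each a fixed elementwise transform.
experts = [
    lambda x: [v * 2 for v in x],
    lambda x: [v + 1 for v in x],
    lambda x: [-v for v in x],
    lambda x: [v * 0.5 for v in x],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
out, chosen = moe_layer([1.0, 2.0], experts, gate_weights, k=2)
```

With, say, 32 experts and k=2, each input pays for only two experts’ worth of compute while the model’s total parameter count keeps growing – which is how a 15B-parameter model can be cheap to run.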

Why this matters: Besides the scores, these results matter in terms of efficiency – most of the energy consumption of neural nets happens during inference, after they’ve been trained. This MoE approach “takes the most efficient models and makes them even more efficient without any further model adaptation,” according to Google. Put another way: the people capable of training big models may be able to expand the margins on their services faster than those making do with small models – the rich (might) get richer.
  Read more: Scaling Vision with Sparse Mixture of Experts (arXiv).

###################################################

One big thing: Google’s AI tools are now helping it build better chips:
…Welcome to corporate-level recursive-self-improvement…
Google has published a paper in Nature showing how it used reinforcement learning to design the layout of chips, compressing work that previously took human engineers months into about six hours. The resulting chips are superior or comparable to those designed by humans on critical metrics like power consumption, performance, and chip area. “Our method was used to design the next generation of Google’s artificial intelligence (AI) accelerators,” the researchers write.
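
To get a feel for what the agent is optimizing: placement work like this is scored against cheap proxies for chip quality, and a standard one is half-perimeter wirelength (HPWL) – for each net, the half perimeter of the bounding box around its pins. The sketch below computes HPWL for a toy placement; it illustrates the proxy objective only, not Google’s actual reward, which also folds in congestion and density terms:

```python
# Half-perimeter wirelength (HPWL): a cheap proxy for how much wire a
# placement will need. Lower is better. The cells and nets here are toy
# stand-ins for a real netlist.

def hpwl(placement, nets):
    """placement: {cell_name: (x, y)}; nets: list of lists of cell names."""
    total = 0.0
    for net in nets:
        xs = [placement[c][0] for c in net]
        ys = [placement[c][1] for c in net]
        # Half the perimeter of the net's bounding box.
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

placement = {"a": (0, 0), "b": (3, 4), "c": (1, 1)}
nets = [["a", "b"], ["a", "b", "c"]]
# Each net's bounding box here is 3 wide by 4 tall, so each contributes 7.
```

An RL placer proposes cell positions one at a time and is rewarded for driving proxies like this down, which is what lets it search the placement space far faster than a human engineer.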

Where this came from: This is not, technically, new research – Google has been publishing on using RL for chip design for quite some time, and the company published an early paper on this technique back in March 2020 (Import AI #191). But the fact that the technique has been used to design the fifth generation of tensor processing units (TPUs) is a big deal.

Why this matters: I sometimes think of Google as a corporation made of human-designed processes that is slowly morphing into a bubbling stew defined equally by humans and AI systems. In the same way Google has recently been exploring using AI tools for things as varied as database lookups, power management in datacenters, and the provision of consumer-facing services (e.g., search, translation), it’s now using AI to help it design more effective infrastructure for itself. With this research, Google has shown it can train machines to build the machines that will train subsequent machines. How soon, I wonder, till the ‘speed’ of these processes becomes so rapid that we start iterating through TPU generations on the order of weeks rather than years?
  Read more: A graph placement methodology for fast chip design (Nature).

###################################################

Why AI policy is messed up and how to make it better, a talk and an idea:
…Hint: It’s all about measurement…
I think most of the problems of AI policy stem from the illegibility of AI systems (and, to a lesser extent, of the organizations designing these systems). That’s why I spend a lot of my time working on policy proposals / inputs to improve our ability to measure, assess, and analyze AI systems. This week, I spoke with Jess Whittlestone at Cambridge about ways we can better measure and assess AI systems, and also gave a talk at a NIST workshop on some issues in the measurement/assessment of contemporary systems. I’m generally trying to make myself more ‘legible’ as a policy actor (since my main policy idea is… demanding legibility from AI systems and the people building them, haha!).
  Read more: Cutting Edge: Understanding AI systems for a better AI policy with Jack Clark (YouTube).
  Check out the slides for the talk here (Google Slides).
  Check out some related notes from remarks I gave at a NIST workshop last week, also (Twitter).

###################################################

Job alert! Join the AI Index as a Research Associate and help make AI policy less messed up:
…If you like AI measurement and assessment, and are detail-oriented, then this is for you…
The AI Index is dedicated to analyzing and synthesizing data around AI progress. I work there (currently as co-chair), along with a bunch of other interesting people. Now, we’re expanding the Index. This is a chance to work on issues of AI measurement and assessment, improve the prototype ‘AI vibrancy’ tool we’ve built out of AI Index data, and support our collaborations with other institutions as well.
Take a look at the job and apply here (Stanford). (If you’ve got questions, feel free to email me directly).

###################################################

Facebook releases a data augmentation tool to help people train systems that are more robust and can spot stuff designed to evade them:
…Facebook uses domain randomization to help it spot content that people want to keep invisible to Facebook’s censors…
Facebook has built and released AugLy, software for augmenting and randomizing data. AugLy makes it easy for people to take a piece of data – like an image, piece of text, audio file, or movie – then generate various copies of that data with a bunch of transformations applied. This can help people generate additional data to train their systems on, and can also serve as a way to test the robustness of existing systems (e.g., if your image recognition system breaks when people take an image and put some meme text on it, you might have a problem).
  Most intriguingly, Facebook says a motivation for AugLy is to help it train systems that can spot content that has been altered deliberately to evade them. “Many of the augmentations in AugLy are informed by ways we have seen people transform content to try to evade our automatic systems,” Facebook says in a blog announcing the tool.

AugLy and copyright fuzzing: One thing AI lets you do is something I think of as ‘copyright fuzzing’ – you can take a piece of text, music, or video and warp it slightly by changing some of the words or tones or visuals (or playback speed, etc.) to evade automatic content-IP detection systems. Tools like AugLy will also let AI developers train AI systems that can spot fuzzed or slightly changed content.
  This also seems to be a business case for Facebook as, per the blog post: “one important application is detecting exact copies or near duplicates of a particular piece of content. The same piece of misinformation, for example, can appear repeatedly in slightly different forms, such as an image modified with a few pixels cropped, or augmented with a filter or new text overlaid. By augmenting AI models with AugLy data, they can learn to spot when someone is uploading content that is known to be infringing, such as a song or video.”
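
Here’s a toy illustration of why exact-match detection fails against fuzzed content, and why you need some robustness to perturbations. Real systems learn that robustness by training on augmented copies (the AugLy use case); the hand-written normalization step below is just a stand-in for that learned invariance:

```python
import hashlib
import unicodedata

# A small perturbation (an accented character, extra spacing) changes an
# exact fingerprint completely, while a matcher that normalizes content
# before fingerprinting still flags the near-duplicate.

KNOWN_CONTENT = "never gonna give you up"

def exact_fingerprint(text):
    return hashlib.sha256(text.encode()).hexdigest()

def normalized_fingerprint(text):
    # Fold accented characters toward ASCII, lowercase, strip spacing.
    folded = unicodedata.normalize("NFKD", text)
    folded = "".join(c for c in folded if not unicodedata.combining(c))
    folded = "".join(folded.lower().split())
    return hashlib.sha256(folded.encode()).hexdigest()

fuzzed = "Never  gonna givé you up"   # spacing + accent perturbations

exact_miss = exact_fingerprint(fuzzed) != exact_fingerprint(KNOWN_CONTENT)
normalized_hit = (normalized_fingerprint(fuzzed)
                  == normalized_fingerprint(KNOWN_CONTENT))
```

The fuzzer’s job is to find perturbations the matcher isn’t invariant to; augmentation libraries like AugLy let the defender enumerate and train against those perturbations ahead of time.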
  Read more: AugLy: A new data augmentation library to help build more robust AI models (Facebook blog).
  Get the code for AugLy here (Facebook GitHub).

###################################################

Tech Tales:

Choose your own sensorium
[Detroit, 2025]

“Oh come on, another Tesla fleet?” I say, looking at the job come across my phone. But I need the money so I head out of my house and walk a couple of blocks to the spot on the hill where I can see the freeway. Then I wait. Eventually I see the Teslas – a bunch of them, traveling close together on autopilot, moving as a sinuous single road train down the freeway. I film them and upload the footage to the app. A few seconds later the AI verifies the footage and some credits get deposited in my account.
Probably a few thousand other people around the planet just did the same thing. And the way this app works, someone bought the rights (or won the lottery – more on that later) to ask the users – us – to record a particular thing, and we did. There’s been a lot of Tesla fleets lately, but there’ve also been tasks like spotting prototype Amazon drones, photographing new menus in fast food places, and documenting wildflowers.

It’s okay money. Like a lot of stuff these days it’s casual work, and you’re never really sure if you’re working for people, or corporations, or something else – AI systems, maybe, or things derived from other computational analysis of society.

There’s a trick with this app, though. Maybe part of why it got so successful, even. It’s called the lottery – every day, one of the app users gets the ability to put out their own job. So along with all the regular work, you get strange or whimsical requests – record the sky where you are, record the sunset. And sometimes requests that just skirt up to the edges of the app’s terms of service without crossing the line – photograph your feet wearing socks (I didn’t take that job), record 30 seconds of the local radio station, list out what type of locks you have for your house, and so on.

I have dreams where I win and get to choose. I imagine asking people to record the traffic on their local street, so I could spend months looking at different neighborhoods. Sometimes I dream of people singing into their phones, and me putting together a song out of all of them that makes me feel something different. And sometimes I just imagine what it’d be if the job was ‘do nothing for 15 minutes’, and all I collect is data from onboard sensors from all the phones – accelerometers showing no movement, gyroscopes quietly changing, GPS not needing to track moving objects. In my dreams, this is peaceful.

Things that inspired this story: Winding the business model of companies like ‘Premise Data’ forward; global generative models; artisanal data collection and extraction; different types of business models; the notion of everything in life becoming gamified and gamble-fied.