Import AI 286: Fairness through dumbness; planet-scale AI computing; another AI safety startup appears

Are AI systems conscious? And would it matter if they were?
…Some ‘mostly boring’ views from the inside of a lab…
My colleague, Amanda Askell, has written a post about AI consciousness. Amanda is a philosopher and ML researcher and she spends a lot of time trying to evaluate models. This post lays out some of her views on AI consciousness and is worth a read if you’re trying to orient yourself in this debate.
  “Some people care about properties like intelligence and self-awareness because they want to identify features that might distinguish humans from non-human animals. In general, I’m more interested in what distinguishes a tiger from a rock than in what distinguishes a human from a tiger,” she writes.

Why this matters: There’s some chance AI systems will eventually become both moral patients and moral agents. Our ability to understand this relates to our ability to think about consciousness and how it might apply to increasingly advanced AI systems. If we get this wrong we, per Amanda’s phrasing, risk subjecting agents to thousands of years of torture. Let’s avoid that.
  Read more: My mostly boring views about AI consciousness (Amanda Askell, substack).

####################################################

How do we get fairer AI systems? Train the dumbest and biggest model possible:
…Facebook shows that sometimes the best filter is no filter at all…
Researchers with Facebook AI Research have trained what they think is the largest dense vision model ever (10 billion parameters) on a billion random images sampled from Instagram. The resulting models are extraordinarily capable at a huge range of downstream evaluations (mirroring the performance trends of scaling up compute and data for language models like GPT-3), but also have another intriguing trait: they display much better qualities around fairness and bias than vision models trained on curated datasets like ImageNet. “In this work, we are interested in probing which of the properties emerge in visual features trained with no supervision on as many images from across the world as possible,” they write.
  This is a very big deal – it suggests that maybe the route to fair AI systems is training the largest possible model on the greatest possible amount of data with minimal human oversight. That would be a radical shift from the current intuitions around fairness – namely, that you get to fairness by heavily curating the underlying dataset.

Performance and Fairness: “On in-domain benchmarks, we observe that some properties of the features captured by the larger model was far less present in smaller model. In particular, one of our key empirical findings is that self-supervised learning on random internet data leads to models that are more fair, less biased and less harmful,” they write. “We observe that our model is also able to leverage the diversity of concepts in the dataset to train more robust features, leading to better out-of-distribution generalization.”
  Some of those capabilities in full: In tests, the models do better on fairness indicators relating to gender, skintone, and age bias. They also display less disparity around gender than models trained on ImageNet. They’re also better at identifying geographic features (including geographic localization), are better at hate speech detection, and display substantially better performance on generalization tests (like harder versions of ImageNet).

Things that make you go ‘hmm’ and ‘uh oh’: Facebook trained its model on 1 billion images taken from Instagram. But there’s a twist – it pre-filtered the data to ensure it wasn’t training anything on EU data “to conform to GDPR”. While this might seem like standard cover-your-back behavior, it has a deeper implication: Europe’s privacy legislation means that certain types of data from Europe will ultimately be less represented in global-scale AI models. This means the cultures of various European countries will also be less represented. This is a nice example of the unintended consequences of legislation.

Why this matters: “We have demonstrated the potential of using self-supervised training on random internet images to train models that are more fair and less harmful (less harmful predictions, improved and less disparate learned attribute representations and larger improvement in object recognition on images from low/medium income households and non-Western countries).” In other words – the scaling will continue until the models improve (further)!
  Read more: Vision Models are More Robust and Fair When pretrained on Uncurated Images Without Supervision (arXiv).

####################################################

AI supercomputers? Cute. Planet-scale computers? Better.
…Microsoft reveals ‘Singularity’, a globe-spanning AI computer…
Microsoft has revealed Singularity, the software stack it uses to schedule and train AI jobs across its global fleet of data centers. Singularity gives an indication of the vast scale at which modern AI workloads get run, and also speaks to the ambitions of technology companies to roll all their data centers together into a single, vast blob of compute.

How big is Singularity? Singularity is designed to “scale across a global fleet of hundreds of thousands of GPUs and other AI accelerators”. Singularity treats Microsoft’s compute stack “as a single, logical shared cluster”.

Something special: One neat feature of Singularity is how it deals with failures. Failures happen a lot in machine learning; when you’re training a neural network across hundreds to thousands of GPUs, a ton of freaky shit happens – nodes die, tiny software bugs explode (usually at 2am), your scheduler goes into a crash-loop, etc. Singularity tries to deal with this by gathering node-specific data on all the jobs being run, so that jobs can be easily resumed after running into a problem. “The checkpoint that Singularity takes is comprised of consistent address-space snapshots of individual workers of the job. As these snapshots capture the full program state such as instruction pointer, stack, heap etc., the job resumes exactly from the point where it was preempted at, with no lost work,” the researchers write. 
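
To make the resume-after-failure idea concrete, here is a minimal sketch in Python. It is not how Singularity works: Singularity snapshots each worker's whole address space transparently, whereas this toy saves state explicitly from inside the job (application-level checkpointing). The function names, the pickle file format, and the simulated failure are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the job state atomically so a crash mid-write can't corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename within the same filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}


def run_job(path, num_steps, fail_at=None):
    """Toy 'training' loop: adds the step index to a running total each step.

    fail_at is a hypothetical knob that simulates a node dying at that step.
    """
    state = load_checkpoint(path)
    while state["step"] < num_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)  # checkpoint after every completed step
    return state
```

Calling run_job again with the same path after a failure picks up from the last saved step rather than from zero. The catch is that the job has to know how to save itself; the appeal of the address-space approach described in the paper is that the job doesn't.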


Why this matters: Just as computation is going to be the fundamental resource of the 21st century, the ability to utilize that computation will be the thing that defines who wields power in this era. Systems like Singularity give us an indication of the ambition of companies like Microsoft, and should make policymakers pay attention: what happens when the ability to wield planet-scale computation is solely something within the competency of private sector actors unaffiliated with any single nation state?
  Read more: Singularity: Planet-Scale, Preemptible, Elastic Scheduling of AI Workloads (arXiv).

####################################################

AI is going to change games – this new beta service shows how:
…Latitude Voyage gestures at a future where games are built, extended, and adapted by AI…
Latitude, the startup game company that makes the GPT2/3/J1J-based game ‘AI Dungeon’, has announced a service called Voyage. Voyage is a subscription service for gaining access to new AI-based games built by Latitude, the ability to use various game-specific AI image generators, and – most intriguingly – eventually access to a ‘creator studio’, which will make it possible for people to build their own AI powered games and other software.

Why this matters: AI models are going to become the generative kernels around which new games get built. AI-based games hold the possibility for a dream of all games designers – a game that adapts to the individual that plays it, with games becoming more customized, idiosyncratic, and surprising the longer you play. Services like Latitude Voyage tell us that experiments in this new domain are about to be run at a large scale. 
  Read more: Latitude Voyage (Latitude).

####################################################

Fine-tune GPT-NeoX-20B – for free…
…GaaS me up, fool!…
We’ve talked about language models as a service (LMaaS). Now, we’ve got GPT-as-a-service (GaaS). Specifically, AI startup ForeFront has announced it’s now hosting Eleuther’s 20B GPT model, GPT-NeoX-20B, and has built a bunch of fine-tuning features people can use. This is interesting for a couple of reasons:
1) Speed: GPT-NeoX-20B came out, like, two weeks ago. Model release > commercial service in two weeks is an indication of the rapidly growing ecosystem around commercializing general models.
2) Competition: For a while, OpenAI was the only show in town when it came to providing GaaS/LMaaS. Now, it’s competing with a bunch of entities, ranging from Forefront, to Cohere, to AI21 Labs. As competition steps up, we’ll see people race to the top and bottom on various things (top: safety vs libertarian access policies), (bottom: pricing, know your customer checks).

Why this matters: If AI is going to interact with the world, people need to be able to interact with AI. The emergence of these kinds of commercial AI services is how that’ll happen, so it’s worth paying attention.
  Read more: How To Fine-Tune GPT-NeoX (ForeFront blog).

####################################################

Hark, yet another AI safety startup appears!
…Aligned AI comes out of the University of Oxford with big ambitions…
AI safety researcher Stuart Armstrong has left the Future of Humanity Institute to co-found Aligned AI, an AI research company.

Safety via value extrapolation: The company will work on value extrapolations, which Stuart describes as follows: “It is easy to point at current examples of agents with low (or high) impact, at safe (or dangerous) suggestions, at low (or high) powered behaviors. So we have in a sense the ‘training sets’ for defining low-impact/Oracles/low-powered AIs.

   It’s extending these examples to the general situation that fails: definitions which cleanly divide the training set (whether produced by algorithms or humans) fail to extend to the general situation. Call this the ‘value extrapolation problem’, with ‘value’ interpreted broadly as a categorisation of situations into desirable and undesirable.

   Humans turn out to face similar problems. We have broadly defined preferences in familiar situations we have encountered in the world or in fiction. Yet, when confronted with situations far from these, we have to stop and figure out how our values might possibly extend. Since these human values aren’t – yet – defined, we can’t directly input them into an algorithm, so AIs that can’t solve value extrapolation can’t be aligned with human values”.

But how do you make money off this? “We’ll start by offering alignment as a service for more limited AIs,” Armstrong writes. “Value extrapolation scales down as well as up: companies value algorithms that won’t immediately misbehave in new situations, algorithms that will become conservative and ask for guidance when facing ambiguity.”
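
What might “become conservative and ask for guidance when facing ambiguity” look like in the simplest possible case? Here is a rough Python sketch: a classifier that defers to a human whenever its top prediction falls below a confidence threshold. This is a standard abstention trick, not Aligned AI’s method or a solution to value extrapolation; the function names, the "ask_for_guidance" sentinel, and the 0.8 threshold are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decide(scores, labels, min_confidence=0.8):
    """Return the top label, or defer when the model is not confident enough.

    min_confidence is an illustrative threshold, not a recommended value.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "ask_for_guidance"  # hypothetical sentinel meaning "escalate to a human"
    return labels[best]
```

With scores of [5.0, 0.0] the first label wins comfortably; with [0.1, 0.0] the two options are nearly tied, so the function defers. The hard part, and presumably the actual research problem, is that a model’s confidence tends to be least trustworthy in exactly the novel situations where deferring matters most.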

Why this matters: There’s been a flurry of new companies forming in the AI safety space recently, including ARC, Anthropic, Redwood Research, and now Aligned AI. Along with this, there’s also a proliferation of companies working on large-scale generative models (e.g., Cohere, AI21). It feels like AI has shifted into a multi-polar era, with a bunch more entities on the proverbial gameboard. This will present new opportunities and challenges for coordination.

  Read more: Why I’m co-founding Aligned AI (Alignment Forum).

####################################################

After Chess, Go, and Shogi, DeepMind turns MuZero towards… video compression for YouTube?
…YouTube + MuZero = improved video compression…
DeepMind has applied MuZero, a more general successor to AlphaGo and AlphaZero, to video compression. Specifically, DeepMind has worked with YouTube to use MuZero to figure out the correct Quantisation Parameter to use in the open source version of the VP9 codec, libvpx. In tests, DeepMind found it was able to use the resulting MuZero Rate-Controller to lead to bitrate savings of between 3% and 5%. That’s a big deal – just imagine how big the bandwidth bill for running YouTube is, then take off some percentage points.
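
For a feel of what a rate controller actually decides, here is a toy sketch in Python. It is a hand-written greedy heuristic, not MuZero and not libvpx’s rate control: for each frame it picks the lowest Quantisation Parameter (lowest QP, so highest quality) whose predicted size fits an even share of the remaining bit budget. The size model in frame_bits, the complexity numbers, and the 0 to 63 QP range are all assumptions made up for illustration.

```python
def frame_bits(complexity, qp):
    """Hypothetical size model: a higher QP quantises more coarsely, so fewer bits."""
    return complexity / (1 + qp)


def choose_qps(complexities, bit_budget, qp_min=0, qp_max=63):
    """Greedy rate control over a sequence of frames.

    For each frame, pick the lowest QP whose predicted size fits an even
    share of the remaining budget. If no QP fits, fall back to qp_max,
    which means the budget can be overshot.
    """
    qps = []
    remaining = bit_budget
    for i, complexity in enumerate(complexities):
        frames_left = len(complexities) - i
        target = remaining / frames_left
        qp = qp_max
        for candidate in range(qp_min, qp_max + 1):
            if frame_bits(complexity, candidate) <= target:
                qp = candidate
                break
        qps.append(qp)
        remaining -= frame_bits(complexity, qp)
    return qps
```

For example, choose_qps([1000, 4000, 1000], 600) gives the complex middle frame a much higher QP than its neighbours. A fixed rule like this can only react frame by frame; the pitch for a learned controller is that it can plan across the whole video, which is where savings of a few percent would have to come from.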

How does this relate to general AI? “By creating agents equipped with a range of new abilities to improve products across domains, we can help various computer systems become faster, less intensive, and more automated. Our long-term vision is to develop a single algorithm capable of optimizing thousands of real-world systems across a variety of domains,” DeepMind writes.

Why this matters: If cutting-edge AI research can be put to work optimizing some of the world’s largest internet services, then that’s gonna create a sustainable route to funding ambitious research. Kudos to DeepMind for threading all kinds of inner-Alphabet-needles to deploy MuZero in this way.

  Read more: MuZero’s first step from research into the real world (DeepMind blog).
  Check out the research: MuZero with Self-competition for Rate Control in VP9 Video Compression (arXiv).


####################################################

Tech Tales

Do they even want to be saved
[A factory outside Detroit, 2030]

Every day, when the factory shift changed, someone came out and tossed a few robots in the bucket. The robots would explore the bucket for a while, then assess that they couldn’t get out, and stop moving. Shortly after that, someone came over and stuck a hose in the top of the bucket, then turned the water on. The robots would watch the water come into the bucket and move to try and get away from it, then it’d fill the bottom of the bucket and start to rise. After this, it took anywhere from a few seconds to a couple of minutes for the robots to die – their circuitry fried by the water that, inevitably, made its way in.

It was an experiment, the people working in the factory were told. Someone upstairs wanted to do this, and you’d get overtime if you sat and watched the robots die in the bucket. Most people did the shift a couple of times, but found it made them uncomfortable, and stopped. 

Isaac, however, didn’t seem to mind. He’d done the bucket shift about a hundred times so far. He found it relaxing to sit after a day at work and watch the robots in the bucket. He didn’t even feel sad when they died, because he didn’t think they knew what dying was. He’d sit and sometimes smoke cigarettes and watch the bucket, then pull the hose over and turn it on and watch the bucket fill up with water and the robots die. Then he’d go home and fuck his wife and go to sleep. He’d have dreams and relatively few nightmares. 

One day, Isaac was sitting by the bucket, about to get the hose, when something strange happened: a robot appeared at the edge of the bucket’s rim. The robots were about the size of a baseball, so this didn’t make sense. Isaac got up and went and looked into the bucket and saw that the robots had clustered together to form a pyramid, and the robot on the top had climbed up the pyramid, as if it wanted to get out. Isaac picked up the robot and looked at it and it looked at him. Then he tossed it back into the bucket and got the hose and filled the bucket with water and watched them all die. 

Things that inspired this story: The horrendous moral-warping logic of capitalism; how death can seem like just another job; how AI systems might be conscious and people might not care.