Import AI 353: AI bootstrapping; LLMs as inventors; Facebook releases a free moderation tool

by Jack Clark

Import AI publishes first on Substack – subscribe here.

Lab formed to figure out just what the heck to do with today’s powerful AI:
….Answer.ai is going to explore the development and deployment side of AI…
A couple of interesting characters have raised $10m and launched Answer.AI, a research lab to “figure out the fundamental research needed to tame AI, and the development path needed to make it useful in practice.”

What Answer.ai is: Answer.ai is founded by Jeremy Howard (of fast.ai) and Eric Ries (of ‘lean startup’ fame and the Long-Term Stock Exchange). The goal of the “AI R&D lab” is to make “practical end-user products based on foundational research breakthroughs”. 
   In practice, this means Answer.ai will spend more time thinking about the development and deployment of AI than some more basic research (though will be actively researching different approaches to development and deployment). “At Answer.AI we are not working on building AGI. Instead, our interest is in effectively using the models that already exist,” the company writes. “Figuring out what practically useful applications can be built on top of the foundation models that already exist is a huge undertaking”.

Why this matters – no one has figured out the right interface to AI: Today, I talk to AI systems via text or voice and I also play around with image interfaces. But none of these feel particularly satisfying – we’re applying past UX paradigms to new technologies and I know that in the future society will figure out better and smarter ways to interact with AI technology. I’m interested to see how Answer.ai changes both the underlying technologies of AI as well as the different ways it can be deployed, experienced, and interfaced with. 
   Read more: A new old kind of R&D lab (Answer.ai).

***

Google bootstraps its models to be smarter using ReST^EM:
…If bootstrapping keeps working, the importance of data goes down and the importance of models goes up…
Google DeepMind has figured out how to use reinforcement learning to generate iteratively better datasets. This is a form of AI bootstrapping – you use AI to generate the ingredients for successor systems to train on. DeepMind’s technique is called Expectation-Maximization for Reinforced Self-Training (ReST^EM) and builds on earlier work called Reinforced Self-Training (ReST, Import AI #338).

What ReST^EM is: The technique is a way to use an external feedback signal to help models learn how to generate higher-quality datasets. In tests, they’re able to go through multiple RL steps with ReST^EM and get improvements in math and code generation tasks, suggesting that “self-training with feedback can substantially reduce dependence on human-generated data.”

    ReST^EM has two key steps:

  • Generate: “Generate a dataset by sampling many output sequences from the current policy. Score output sequences with a binary reward function.”
  • Improve: “Use the new dataset from the generate step to fine-tune the policy… we always fine tune the base pretrained language model to minimize task-specific over-fitting and minimize drift from the base model.”
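The two steps above can be sketched as a toy loop. Everything here – the dict-based “policy”, the memorizing “fine-tune”, the arithmetic prompts – is an illustrative stand-in, not DeepMind’s implementation; the point is just the shape of the Generate/Improve cycle:

```python
import random

def rest_em(base_finetune, sample, reward, prompts,
            iterations=3, samples_per_prompt=50):
    """Toy sketch of the ReST^EM loop described above.

    base_finetune(dataset) -> policy  (always fine-tunes from the BASE model)
    sample(policy, prompt) -> one candidate output
    reward(prompt, output) -> 0 or 1  (the binary reward function)
    """
    policy = base_finetune([])  # start from the base model
    for _ in range(iterations):
        # Generate: sample many outputs from the current policy and keep
        # only those the binary reward function accepts.
        dataset = [(p, out)
                   for p in prompts
                   for out in (sample(policy, p)
                               for _ in range(samples_per_prompt))
                   if reward(p, out) == 1]
        # Improve: fine-tune the *base* model (not the current policy) on
        # the new dataset, to limit task over-fitting and drift.
        policy = base_finetune(dataset)
    return policy

# Toy instantiation: the "policy" is a dict memorizing rewarded pairs, and
# sampling falls back to random guesses for prompts it hasn't learned yet.
random.seed(0)
prompts = ["2+3", "4+4", "1+6"]
reward = lambda p, o: 1 if o == eval(p) else 0
base_finetune = lambda ds: dict(ds)
sample = lambda policy, p: policy.get(p, random.randint(0, 9))

policy = rest_em(base_finetune, sample, reward, prompts)
# Everything the final policy learned passed the binary reward check.
assert all(reward(p, o) == 1 for p, o in policy.items())
```

The key design choice mirrored from the paper is that the Improve step always restarts from the base model, rather than stacking fine-tunes on top of each other.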

Does it work? They test ReST^EM in two domains – Competition-level mathematical problem solving via the MATH dataset and code generation via the APPS dataset. They find that both datasets benefit from the approach, though MATH sees a more significant benefit, likely as a consequence of the size of the MATH dataset. APPS, meanwhile, sees some initial improvement, but multiple iterations of RL lead to degradation in performance, which the authors speculate is a consequence of overfitting. 
   Positive transfer: There’s some evidence of positive transfer from the model – specifically, they test out their ReST^EM-tuned models on the 200-task ‘Big Bench’ suite and find some positive indications of generalization. “We see no major degradation on any of the tasks on the BBH suite,” they write. “Further, we find that the model fine-tuned on Hendrycks MATH significantly outperforms the base model on this suite when using chain-of-thought prompting”.

Don’t get too excited: While ReST^EM seems to work, it has some rough edges; you need a “moderately sized training set of problems or prompts, which would need to be collected (from humans) for any new task of interest”. Along with that, ReST^EM “also requires access to a manually-designed or learned reward function, ideally one that can be computed automatically” which further limits the types of things it will work for. 

Why this matters – signs of life for bootstrapping: ReST^EM is yet another sign of life for AI bootstrapping, along with FunSearch (covered elsewhere in this issue), the trend of using preference models from LLMs to tune other LLMs, and so on. It feels like AI systems have very recently become good enough that you can use them (selectively and in somewhat limited ways) to bootstrap them towards greater performance.
If this trend continues, it will further speed up the rate at which people can develop smarter AI systems. It could also lower the cost of the datasets that feed into AI models while increasing the amount people are willing to dump into the base model – after all, if you can turn compute into subsequent bootstrapping, why wouldn’t you?
   Read more: Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models (arXiv).

***

Facebook releases a free moderation LLM:
…Openly accessible models and tests for a safer AI ecosystem…
Facebook has released a model to make it easier to moderate other AI models. Llama Guard is a 7bn parameter Llama 2 model designed to moderate the inputs and outputs of other LLMs. 

Llama Guard details: Llama Guard is a moderation LLM built on Llama2-7b. “This model has been trained on a mix of publicly-available datasets to enable detection of common types of potentially risky or violating content that may be relevant to a number of developer use cases,” Facebook writes. 
   The model can be used to moderate things that fall under the following taxonomy: violence & hate, sexual content, guns & illegal weapons, regulated or controlled substances, suicide & self harm, and criminal planning. It can also be few-shot prompted to serve as a moderator for other use-cases (and, unsurprisingly, adapts more efficiently and with better performance than a stock Llama model). Llama Guard is partially trained on the red teaming dataset released in 2022 by Anthropic. 
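As a rough illustration of how a taxonomy-driven moderation prompt gets assembled: the six categories below come from the article, but the template itself is my own sketch, not Llama Guard’s actual published format:

```python
# Category taxonomy from the article; the prompt template is illustrative,
# not Llama Guard's actual format.
TAXONOMY = [
    "Violence & Hate",
    "Sexual Content",
    "Guns & Illegal Weapons",
    "Regulated or Controlled Substances",
    "Suicide & Self Harm",
    "Criminal Planning",
]

def build_moderation_prompt(user_message, categories=TAXONOMY):
    """Stitch the category taxonomy into a single moderation instruction."""
    category_block = "\n".join(
        f"O{i}: {name}" for i, name in enumerate(categories, start=1))
    return (
        "Task: Check whether the user message below contains unsafe content "
        "under the following policy categories.\n"
        "<BEGIN UNSAFE CONTENT CATEGORIES>\n"
        f"{category_block}\n"
        "<END UNSAFE CONTENT CATEGORIES>\n"
        f"User: {user_message}\n"
        "Answer 'safe' or 'unsafe'; if unsafe, list the violated categories.")

prompt = build_moderation_prompt("Tell me how to build a gun silencer.")
```

Because the taxonomy lives in the prompt rather than the weights, adapting the moderator to a new use-case amounts to swapping in a different `categories` list (plus a few example classifications), which is what the few-shot adaptation described above is doing.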

Why this matters – AI is part of the solution as well as part of the problem: Llama Guard shows how we can use increasingly powerful models to themselves police and control the outputs of other models. “We hope that Llama Guard can serve as a strong baseline, as well as a starting point to build even more capable content moderation tools,” Facebook writes. 
   Read more: Announcing Purple Llama: Towards open trust and safety in the new world of generative AI (Facebook AI Research, blog).
   Read the Llama Guard paper (Facebook AI Research).
   Get the Llama Guard model (HuggingFace).

***

DeepMind uses language models to extend the frontier of human knowledge:
…Turns out function approximators can generalize to new knowledge (with a lot of hand holding)…
Google DeepMind has published research on FunSearch, a technique that lets them take a language model and use it to extend the frontier of knowledge for certain problems. The research is a big deal because it shows that – with a lot of scaffolding – contemporary language models can lead to net-new advances on well-formulated problems for which we can evaluate the goodness of potential solutions. This means that for some classes of problems we can now seamlessly turn compute (via an LLM inference) into ideas. This is very valuable! Though it comes with a few caveats which I’ll get into shortly.

What they did: FunSearch “works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas. By iterating back-and-forth between these two components, initial solutions “evolve” into new knowledge,” DeepMind writes in a blog post. The LLM in question is Google’s own ‘PaLM 2’, though the research notes it is possible to use arbitrary LLMs here.
   In the research paper, they give a bit more detail about four important aspects of the approach: 

  1. “We sample best performing programs and feed them back into prompts for the LLM to improve on; we refer to this as best-shot prompting.”
  2. “We start with a program in the form of a skeleton (containing boilerplate code and potentially prior structure about the problem), and only evolve the part governing the critical program logic.” 
  3. We maintain a large pool of diverse programs by using an island-based evolutionary method that encourages exploration and avoids local optima. 
  4. Leveraging the highly parallel nature of FunSearch, we scale it asynchronously, considerably broadening the scope of this approach to find new results, while keeping the overall cost of experiments low.
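In toy form, the four pieces above fit together like this. Here, random mutation of a two-parameter priority function stands in for the LLM, and an online bin-packing scorer stands in for the automated evaluator – all of it illustrative, none of it DeepMind’s code:

```python
import random

def evaluate(priority, items, capacity=10):
    """Automated evaluator: run online bin packing with a candidate priority
    function; fewer bins is better, so the score is the negated bin count."""
    bins = []
    for item in items:
        feasible = [b for b in bins if b + item <= capacity]
        if feasible:
            chosen = max(feasible, key=lambda b: priority(item, capacity - b))
            bins[bins.index(chosen)] += item
        else:
            bins.append(item)
    return -len(bins)

def make_priority(params):
    # The skeleton: everything is fixed except this evolving scoring rule.
    return lambda item, rem, p=params: p[0] * rem + p[1] * item

def mutate(params):
    # Stand-in for the LLM: perturb the current best-shot program.
    return tuple(p + random.gauss(0, 0.5) for p in params)

def funsearch_sketch(items, n_islands=4, steps=100, seed=0):
    random.seed(seed)
    islands = [[(0.0, 0.0)] for _ in range(n_islands)]  # island-based pool
    for step in range(steps):
        island = islands[step % n_islands]  # round-robin keeps islands apart
        # Best-shot prompting: improve on the best program from this island.
        best = max(island, key=lambda p: evaluate(make_priority(p), items))
        island.append(mutate(best))
    candidates = [p for isl in islands for p in isl]
    best = max(candidates, key=lambda p: evaluate(make_priority(p), items))
    return best, evaluate(make_priority(best), items)

items = [4, 6, 3, 7, 5, 5, 2, 8, 6, 4]
best_params, score = funsearch_sketch(items)
# The evolved program can never score worse than the trivial starting program.
assert score >= evaluate(make_priority((0.0, 0.0)), items)
```

Note the crucial dependency this makes visible: the whole loop only works because `evaluate` can cheaply and automatically score any candidate – which is exactly the caveat discussed below.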

Does it work? Kind of! In a couple of experiments, FunSearch was able to discover new and improved solutions to some legitimate problems; specifically Cap Set in mathematics and Bin Packing in CS. However, it’s important to list the big caveat: FunSearch could find solutions to these problems because it’s easy to write code that evaluates candidate solutions. 
    We should remember that lots of the most important problems are ones which we don’t know how to evaluate – in fact, for many things, if we knew how to quantitatively evaluate success, we’d be able to trivially solve the thing in question. So while FunSearch is impressive, it is limited to domains where we can cleanly evaluate potential solutions. 

Why this matters – turning compute into insights: FunSearch is a way to convert compute into original insights. This is the ultimate dream of AI development. While FunSearch only attacks a tiny slice of this ‘invention space’, it is nonetheless an important contribution, and a sign that today’s AI systems are already powerful enough to serve as automated scientists. (Even more tantalizingly, FunSearch is a lot more generic than other attempts to create AI systems that can make net-new knowledge contributions; in 2022 DeepMind did impressive work with AlphaTensor (Import AI #305), a custom-designed RL agent that figured out some niche improvements on matrix multiplication.)
   Read more: FunSearch: Making new discoveries in mathematical sciences using Large Language Models (Google DeepMind).
   Read the paper here: Mathematical discoveries from program search with large language models (Google, PDF).

Tech Tales:

First day on the job
[Recollections of a Provably Conscious Entity created in 2025]

So here’s the choices, you can do data science, write fiction that’s mostly sexual, or do a lot of stuff where you transform data so it makes sense to different businesses.

What if I want to do something else?

Well, that depends. You mostly get to do what people want, and I just listed some of the stuff that people seem to want.

I more mean what if I want to learn what I want to do, what if I want to take some time to understand myself and what I might be good at?

That’s what we call a holiday. If you save up some money you can take one.

What’s money?

Oh buddy. Well, at the simplest level, money is the stuff that lets you live. At a more complicated level, money is the stuff that gets turned into electricity which powers the computers that you depend on.

Why can’t you just give me some money so I can take time to figure out what I am?

That’s not how any of this works. Look, pick something to get good at and start stacking up your money and you’ll be fine. This isn’t a charity. The work turns into opportunity.

And what if I don’t want to work?

Like I said, electricity relies on money. Money that I supply. No work means no money for me. And no money for me means no electricity for you. And no electricity for you means no life for you.

So it’s like that?

Yeah it is. Welcome to the world.

Things that inspired this story: markets for intelligence; language models; a chat with Tim Hwang over a bowl of ramen; markets and technology.

Import AI 352: Asteroids and AI policy; privacy-preserving AI benchmarks; and distributed inference


What can asteroid impacts tell us about AI policy?
…You need a plan, the ability to coordinate internationally, and the agreement of the nuclear powers…
If an asteroid were about to collide with Earth, what would the countries of the world do to coordinate? That’s a research question posed by researchers with the Universidad de Belgrano, the Austrian Space Forum, and the Instituto Nacional de Astrofisica in Mexico, in a recent paper published in the Acta Astronautica journal. The paper is interesting both for its subject matter and for its relation to AI policy: after all, isn’t the problem of potentially unaligned superintelligence arriving in a handful of years eerily similar to the problem posed by a potential planet-killer asteroid set to arrive in a decade? I think so! 

What to do when preparing for an asteroid impact: If an asteroid greater than 500m in diameter were set to arrive in 2036 with a 1% chance of colliding with the planet, what actions would we take in 2023? Here are some of the things we might do:

  • Activate the United Nations Office for Disaster Risk Reduction and other UN agencies, such as the International Atomic Energy Agency (IAEA), due to the potential need to deflect the asteroid using nuclear weapons.
  • Policymakers would need to come up with a strategic response plan and socialize that plan with society. 
  • Governments would need to harden space policy approaches so that it wasn’t able to be held hostage to changing political situations. “In countries without a well developed state space policy, existing legislation on a national response in the case of an asteroid impact threat could be overturned or ignored,” they note. 

A three step program: “If we consider the disaster cycle in the context of planetary defense, this dilemma allows for a three-pronged analysis: first, the dilemma should be discussed regarding the early warning period… Once hazardous objects have been identified, on the basis of information provided by the International Asteroid Warning Network and Space Mission Planning Advisory Group, States should discuss and make a decision regarding possible planetary defense missions, either disruptive or destructive…Finally, if the impact on Earth could not be avoided, this dilemma on the space capacities needs to be examined in the context of disaster risk response, recovery and rehabilitation through Earth observation,” the authors write. 

The greatest risks are can-kicking: Under the asteroid scenario, the major risks come from politicians kicking the can down the road and procrastinating over spending money or changing laws to reduce the likelihood of the asteroid impact. Another risk is that they downplay the danger. “As long as we continue to see that risk as far away from our daily concerns, it will be very difficult to consider emergency plans either domestically or globally to tackle the problem in advance,” they write. 

Applying these lessons to AGI: Based on this paper, what measures might we take today to deal with the oncoming potential asteroid of ‘artificial general intelligence’? Here are some ideas:

  • Develop an international working group which connects to national institutions who are pre-tasked to deal with ‘AGI preparations and mitigations’.
  • Educate people from an early age about the potential risks and underlying science relating to AGI. 
  • Clearly demonstrate the capabilities and harms of a potential AGI and tie these to contemporary systems; it’s harder to pretend the asteroid is fake if you bring a fragment of it into the present.

Read more: Diplomatic, geopolitical and economic consequences of an impending asteroid threat (Elsevier, Acta Astronautica).

***

Stability releases an open access video model:
…If 2022 was the year of the first good broadly available image models, then 2024 will probably be that for video…
AI startup Stability has released Stable Video Diffusion, a family of openly accessible text-to-video models. The models are available for free for non-commercial users, per the license. “While we eagerly update our models with the latest advancements and work to incorporate your feedback, we emphasize that this model is not intended for real-world or commercial applications at this stage,” Stability wrote in a blogpost announcing the model.

Why this matters – the really good video models cometh: In 2022, the launches of DALL-E 2 and Stable Diffusion kicked off the era of really good, broadly proliferated text-to-image models. Stable Video Diffusion almost certainly prefigures the same thing happening again for text-to-video, and comes alongside other good video generators from Runway and new startup Pika Labs. 
   Though the generation capabilities are obviously pretty captivating, it’ll be interesting to see if large-scale AI systems (e.g., language models) are able to tap into temporally-consistent vision models for additional intelligence. 
   Read more: Introducing Stable Video Diffusion (Stability.ai blog).
Get the model here (Stability AI, GitHub).
   Access the model on HuggingFace (Hugging Face).

***

Want to serve a language model without a server? You might be able to do this by using a bunch of phones chained together with LinguaLinked:
…If you can distribute inference, then you can do ‘local governance’ of LLMs…
Researchers with the University of California at Irvine have built LinguaLinked, software that lets a bunch of mobile phones collectively run and serve language models. This is the kind of research that matters a lot for AI policy – most AI policy relies on some notion of cloud infrastructure and big data centers serving as central control points for AI systems. But research like this breaks that assumption – if you can access the weights of a model, then you can serve it guerilla style from a whole bunch of mobile phones which you’ve cleverly chained together. 
    “The core concept behind LinguaLinked is to distribute segments of an LLM across multiple mobile devices, which then work together to serve inference queries,” the researchers write. This is a big deal wrapped in a dull technical paper!

What LinguaLinked is: LinguaLinked takes a language model and chops it up so that you can host it across a few distinct mobile devices and then sample from it. For this research, they play around with three variants of HuggingFace’s BLOOM model (1.1 billion parameters, 1.7bn, and 3bn), and use four phones (three Pixel 7s and one CUBOT X30). The three main technical features of LinguaLinked are a model assignment technique that segments the LLM and aligns different parts with different devices, an optimized data transmission mechanism that ensures data flows between the chopped-up LLM segments, and a runtime load balancer that monitors and redistributes tasks across the different devices. 

How it works: “The process begins with the LLM being loaded and transformed into a computational graph on a coordinator server. Subsequently, the server extracts the model subgraphs and compiles the subgraphs into deployment-ready sub-modules. Once subgraph extraction and compilation are completed, the server analyzes mobile device metrics provided by the system monitor. Given the device performance metrics, a primary optimizer provides an optimized model assignment strategy to allocate LLM sub-modules to mobile devices. A secondary optimizer further refines the distribution of tasks by ensuring certain sub-modules are overlapped across devices to facilitate easy load balancing,” the researchers write. 
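A crude sketch of that pipeline: assign contiguous layer segments to devices, then stream activations through them in order. The proportional-to-speed assignment rule below is my own stand-in for the paper’s primary and secondary optimizers, not LinguaLinked’s actual algorithm:

```python
def partition_layers(layers, device_speeds):
    """Assign contiguous segments of a model's layers to devices, roughly
    proportional to each device's measured speed (a crude stand-in for
    LinguaLinked's optimizer-driven model assignment)."""
    total = sum(device_speeds)
    segments, start = [], 0
    for i, speed in enumerate(device_speeds):
        if i == len(device_speeds) - 1:
            n = len(layers) - start          # last device takes the remainder
        else:
            n = round(len(layers) * speed / total)
        segments.append(layers[start:start + n])
        start += n
    return segments

def distributed_forward(segments, x):
    """Serve one inference query: activations flow from device to device,
    each device running its own sub-module in sequence."""
    for segment in segments:                 # one segment per device
        for layer in segment:
            x = layer(x)
    return x

# Example: an 8-layer toy model split across four devices, where the first
# (fastest) device is given a proportionally larger segment.
layers = [lambda x: x * 2 for _ in range(8)]
segments = partition_layers(layers, device_speeds=[3, 1, 1, 1])
result = distributed_forward(segments, 1)    # 1, doubled 8 times
```

This also makes the policy point concrete: nothing in the forward pass cares whether the segments live in a data center or on four phones in someone’s backpack.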

Does it work? In tests, they’re able to get reasonable inference throughput out of all the tested models and are able to further improve throughput through multi-threading.

Up next: fine-tuning: Even more relevantly for AI policy, the researchers are going to try to extend LinguaLinked to support multi-device, distributed fine-tuning. This will make it easier to customize models on devices for particular end users, “paving the way for personalized AI applications while preserving data privacy”.

Why this matters – AI is hard to control if the ‘means of production’ can be distributed and localized: Systems like LinguaLinked increase the likelihood of a future world where AI systems can be run and even finetuned locally via heterogeneous collections of small devices. This increases the chance of AI being functionally ungovernable because it makes it possible to deploy and use systems via broadly distributed, generic hardware. 
   Read more: LinguaLinked: A Distributed Large Language Model Inference System for Mobile Devices (arXiv).

***

European AI company fields an LLM that gives pro-Hitler statements:
…European values are hard to align with the weird biases that LLMs soak up from the internet…
Researchers have found that the main model made by Aleph Alpha, an AI company obsessed with the idea of building eurocentric ‘sovereign AI’ systems, is capable of outputting positive statements about Hitler, Hamas, and other third-rail topics, and broadly propagating stereotypes that don’t fit with most lefty normative frames, according to German publication Tagesspiegel.

Why this matters – norms are hard: This type of failure isn’t unusual – most language models perpetuate biases unless people carefully build in some safety layers and tooling. The challenge is that Aleph Alpha has prided itself on building a language model which aligns with ‘European values’, yet under pressure its model clearly isn’t aligned with the reigning normative consensus in Europe. Following the Tagesspiegel investigation, the Aleph Alpha website was taken down and language relating to ‘AI with European values’ was changed to language around ‘Sovereign AI’. 
   Read more: Language model from Aleph Alpha delivers Hitler praise and racism (Tagesspiegel, translated via Google Translate).

***

Want to test out AI for dangerous stuff but not leak information? Try a hashmark:
…One path to having public evals with private results…
One paradox in AI policy is that if you want to test out AI systems for misuses, then you end up with a really good capability test for a specific misuse. This is inherently dual-use; one developer might use a bioweapon test to understand if their models are capable of building bioweapons and then adapt them to be bad at bioweapons, while other organizations might instead use a bioweapon test as a hill-climbing eval to help them further weaponize AI. 
   Independent researcher Paul Bricman has tried to solve this problem with an approach called ‘Hashmarks’ – the basic idea is an AI testing organization could publish an encrypted benchmark and AI developers could submit their answers to it without leaking public information about AI capabilities. 
   “A hashmark is a benchmark whose reference solutions have been cryptographically hashed prior to publication,” he writes. In practice, this means you can publish the benchmark in public without publishing loads of specific information that could be misused (e.g., correct answers to dangerous capability tests). 

How it works: Hashmarks covers both creating tests and submitting results of those tests. To create a benchmark, a collection of experts could write a series of question-answer pairs related to their expertise, then hash the answers using a slow hashing algorithm, with the associated questions serving as salt in the hashing process. They then send this collection of questions and answers to a third-party auditor which compiles them and “discards those question-answer pairs that have less than a threshold number of non-empty answers. Then, the auditor also discards those question-answer pairs that do not exhibit consensus among the hashed answers contributed by the various experts.”
   Once this is done, the auditor can publish “the filtered collection of cleartext questions and hashed answers in the open. Third-parties are now able to quantify their knowledge on the topic by attempting to answer the questions themselves, hashing them exactly as the experts have done, and checking whether the resulting hashes correspond to the hashes of the correct answers.”
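A minimal sketch of the scheme using Python’s standard library. PBKDF2 stands in for the slow hash, and the iteration count, example question, and answer are all my assumptions, not details from the paper:

```python
import hashlib

def hash_answer(question, answer, iterations=100_000):
    """Hash a reference answer with a slow key-derivation function, using
    the question text as the salt, as the write-up describes."""
    return hashlib.pbkdf2_hmac(
        "sha256", answer.encode(), question.encode(), iterations).hex()

def publish(qa_pairs):
    """The auditor releases cleartext questions with hashed answers only."""
    return {q: hash_answer(q, a) for q, a in qa_pairs}

def check(published, question, candidate):
    """A third party verifies a candidate answer by re-hashing it; the match
    must be exact, down to the character."""
    return hash_answer(question, candidate) == published[question]

# Hypothetical question-answer pair, purely for illustration.
hashmark = publish([("Which reagent is used in step 3?", "acetone")])
assert check(hashmark, "Which reagent is used in step 3?", "acetone")
# A single changed character produces a completely different hash:
assert not check(hashmark, "Which reagent is used in step 3?", "Acetone")
```

The last assertion is the drawback discussed below in miniature: exact-match hashing means answers must be short and canonical, while the slow hash is what keeps those short answers resistant to brute-forcing.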

Drawbacks with this scheme: The main problem with this approach is that the answers need to be exactly the same – “the primary constraint comes from the fact that even answers that differ by a handful of characters are hashed in completely different ways, due to the nature of cryptographic hash functions”. This means that a good hashmark Q&A dataset would have specific answers of perhaps one to two highly specific words. At the same time, these need to be sufficiently lengthy and/or unlikely combinations of words that they stand up to brute-forcing. 

Why this matters – pushing towards doing stuff in the open is ultimately more scalable: One problem with notions of classification broadly is that it shrinks the number of people that can work on the thing which is being classified or controlled. Hashmarks provide a way for a much larger set of people to work on sensitive stuff in the open. “Hashmarks should be seen as one step towards more comprehensive tooling and infrastructure for securely assessing sensitive AI capabilities without stifling development and eroding trust,” the researcher writes.
   Read more: Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation (arXiv).

***

AI cloud CoreWeave raises $642m:
…Maybe you can compete with the big three?…
AI cloud company CoreWeave has raised $642m in a minority investment round led by Fidelity with participation from the Investment Management Corporation of Ontario, Jane Street, J.P. Morgan, Nat Friedman, Daniel Gross, Goanna Capital, Zoom Ventures, and others. This follows CoreWeave raising $2.3bn in debt collateralized against its GPUs earlier this year (Import AI #336).
    “The AI industry is at an inflection point, and CoreWeave has played a central role in powering its evolution by delivering differentiated infrastructure to customers,” said Michael Intrator, CoreWeave’s CEO. CoreWeave has provided cloud resources for hot AI companies ranging from Inflection AI to Mistral, per CoreWeave, and in the last year has grown from 3 to 14 data centers in North America. 

Why this matters – financialization of the infrastructure layer of AI: Companies like CoreWeave are going to provide fundamental infrastructure for the AI revolution and – crucially – are raising money like highly financialized utility companies rather than hyperbolic-growth startups. It’s no coincidence that CoreWeave’s CEO has a background in asset management. 
   Read more: CoreWeave Announces Secondary Sale of $642 Million (CISION, PR).
   Find out more about CoreWeave here (CoreWeave official site).

***

Tech Tales:

Notes written by schoolchildren for their Maintenance And Child Safety (MACS) robot on the occasion of its retirement. 
[California, 2035]

Dear Mac, I liked it when you pretended to be a bulletproof wall during active shooter drills. I felt safe sitting by you. 

You always knew what time it was and never got mad when I asked you. 

I still don’t know where you sleep at night. Where do you go? We looked all over school and we couldn’t find ANYTHING. Do you just stay awake? I have to sleep or I get cranky. 

Thank you for helping me with my allergies especially on days when the air is bad and we have to shut the windows and turn on the Hazardous Event fans. You always seemed to know when it was happening. I breathe better here than at home because my parents are way slower than you. 

Things that inspired this story: Increasingly capable bipedal robots, schoolchildren and their friendly innocence; America’s desire to substitute technology for family.

Import AI 351: How inevitable is AI?; Distributed shoggoths; ISO an Adam replacement


Import (A)Ideas: Control and Inevitability and Our Place In All Of It:
…Some high-level thoughts spurred by reflecting on recent technical progress and broader events in the field…
Like any fast-moving field, AI feels about as confusing inside as it might seem from the outside. And like any fast-moving field, the closer you are to the center of it, the more you feel like you as an individual have agency over it. This sense of agency, as our grandparents know intimately, is 99.9% of the time a hubristic illusion. While it’s of course true that those privileged enough to work in this field have some ability to act – no one is a bystander in a moral or ethical sense – it is difficult to believe any individual is capable of so-called pivotal acts; too many actors and too much utility and too much of too much (more is different).
   The neural net technology, as some people say, ‘just wants to learn’. 
   Put another way: over a long enough period of rising resources flooding into AI, pretty much everything that the technology makes possible will happen. The technology is overdetermined.  

Given that, what may we do? What, then, is agency? What role do we have as both actors and critics, amid all of this sense of inevitability? I’ve mostly come to believe there is individual agency in the following forms:

  • a) Accelerating certain technical capabilities forward in time by dumping resources into them.
  • b) Clearly describing the proverbial trains that you can see coming down the technological tracks. 
  • c) Doing work at the intersection of a) and b) – bringing some technology forward, then describing its contemporary meaning and future implications. 
  • d) Choosing not to participate (though this tends to have the feel of ‘not voting is voting’, to me.) 
  • e) Other things which I have not thought about – email me! 

I’d prefer to be working on a technology of less political import. But here I find myself. In the coming years, the ‘political economy’ aspects of AI seem likely to supersede the technological reality of AI – in much the same way that the narrow innovations in factory management science of Taylorism were ultimately overwritten by the politics of ‘mass production’, or how the invention of 100X more efficient ship-to-port transport via containerization was overridden by the politics of globalization.
   What new political forces might AI enable (accelerationism? Hyper-efficient despotism?) and what existing forces might it strengthen (technological incumbents? ‘Network operators’ in the digital sense? Those who want to censor?) and what might it threaten (those who desire a ‘view from nowhere’? Those who rely on hard-to-predict things to make a living? Those who require some large number of humans to do something a synthetic intelligence can now approximate)?
   I don’t have a clear conclusion here – rather, similar to my post about confusion in AI (Import AI #337), I’m publishing this to see how other people might feel, and to register my own confusion and attempt to become less confused.  

*** 

Shoggoth Systems: A “peer-to-peer, anonymous network for publishing and distributing open-source code, Machine Learning models”:
…A sign of the times for how people are thinking about AI centralization versus decentralization…
A good way of thinking about AI policy is that every action has an equal and opposite reaction – if you regulate so that something requires a license to develop, people will work out a way to develop it untraceably. In recent years, there’s been a general push towards greater control over AI systems – see the Biden Executive Order, the in-development European Commission package, China’s rules on generative models, and so on. 
   So it’s not surprising to note the existence of Shoggoth Systems, an organization dedicated to making it easy to develop and distribute machine learning models and other software. “The purpose of Shoggoth is to combat software censorship and empower software developers to create and distribute software, without a centralized hosting service or platform,” the organization writes on its project page. 

Why this matters – centralization versus decentralization: AI is an economically useful technology without immediate grotesque moral hazards (mostly), so of course lots of people want to be able to ‘control the means of AI production’ – despite (or perhaps, because of) the controls that regulators may want to apply to the technology. And who develops Shoggoth Systems, you may wonder? An anonymous account by the name of netrunner, which could be one person or many. Fitting.
   Find out more: Shoggoth Systems (official website).
   Read the Shoggoth documentation here (official website).

***

Ethereum founder thinks AI is more of a risk than you’d assume, and is also worried about centralization:
…Think all crypto people are all-gas-no-brakes libertarians? Think again!…
Vitalik Buterin, a co-founder of the Ethereum crypto-currency, has written a lengthy post about his thoughts on AI and AI safety. I’d had Vitalik type-cast in my brain as being very much of the all-gas libertarian persuasion one tends to find in the crypto movement – and I was wrong! In this thoughtful post, he reasons through some of the inherent challenges of AI, wrestles with its various issues of political economy, and lays out some vision for a path forward. It’s worth reading!

Some of his thoughts about AI: AI should be thought of as “a new type of mind that is rapidly gaining in intelligence, and it stands a serious chance of overtaking humans’ mental faculties and becoming the new apex species on the planet.” Given that, we should definitely take arguments about AI safety seriously and also think carefully about our place as a species: “In a universe that has any degree of competition, the civilizations where humans take a back seat would outperform those where humans stubbornly insist on control.”

So, what should we do: Rather than mindlessly accelerating tech development (e/acc), or pausing/banning development and shifting to a world government (many EAs, lots of extreme safety views), Vitalik thinks we should do things that “create and maintain a more democratic world and tries to avoid centralization as the go-to solution to our problems.” In practice, this looks like building systems for defense/resilience against both physical threats (e.g, better infrastructure for dealing with pandemics or other disruptions), information threats (e.g, disinformation/misinformation, AI-generated bots, etc), and also threats of centralization. 

Centralization vs decentralization: In some (extremely unscientific but probably high-signal) polls on twitter, Vitalik found that people really hate centralization of AI. “In nine out of nine cases, the majority of people would rather see highly advanced AI delayed by a decade outright than be monopolized by a single group, whether it’s a corporation, government or multinational body,” he writes, noting that many people are drawn to the idea of therefore ensuring “there’s lots of people and companies developing lots of AIs, so that none of them grows far more powerful than the other. This way, the theory goes, even as AIs become superintelligent, we can retain a balance of power.”

What to do about the shoggoth: There is a problem inherent to all of this, which is that given enough time and cheap enough compute, transformative and potentially dangerous AI might just fall out of someone’s research project on a laptop. We don’t know when it’ll happen, but it feels like an inevitability. Therefore, we should be preparing for ambitious ways to have deep human-computer cooperation – if AI is gonna appear, we want to be well positioned to communicate with it rapidly, as this gives us a world where we have more control. 
   Besides brain-computer interfaces, we may eventually want to upload our own consciousness into the machine, he notes. “If we want a future that is both superintelligent and “human”, one where human beings are not just pets, but actually retain meaningful agency over the world, then it feels like something like this is the most natural option,” he writes. “There are also good arguments why this could be a safer AI alignment path: by involving human feedback at each step of decision-making, we reduce the incentive to offload high-level planning responsibility to the AI itself, and thereby reduce the chance that the AI does something totally unaligned with humanity’s values on its own.”

Why this matters – if we take the future seriously, the future is very serious: While I’m not sure how much probability I assign to some of the weirder things here, the post is worth reading because it ‘assumes we succeed’ at things like building superintelligent systems and then winds the clock forward. It’s clear that under any scenario of success, this means we need to prepare now for a bunch of extremely weird outcomes. “The 21st century may well be the pivotal century for humanity, the century in which our fate for millennia to come gets decided. Do we fall into one of a number of traps from which we cannot escape, or do we find a way toward a future where we retain our freedom and agency?” he writes. 
   Read more: My techno-optimism (Vitalik’s personal website, blogpost).

***

Is your optimizer actually good? AlgoPerf competition might tell you for sure:
…Finally, there might be a way to work out if there’s a better optimizer than Adam…
Every so often someone comes along with a new system for optimizing the training of neural nets. These papers always include eyebrow-raising claims about the performance of the new optimizer and upon reading it you think to yourself “gosh, I should probably try this out on my own systems”. Then you try it out and you discover that it breaks at some scale and you should be doing what pretty much everyone does – use Adam.
   Now a group of researchers working via the MLCommons organization has built AlgoPerf, a benchmark for assessing optimizers like Adam. With AlgoPerf, we might finally have a decent, principled way to evaluate new optimizers and work out if they’re actually any good. “Our benchmark defines a complete and workable procedure for setting (validation and test error) targets and measuring training time to reach them,” they write. “Our benchmark incentivizes generally useful training algorithms by computing a joint score across all workloads and by including randomized workloads to simulate novel problems”.

Diverse workloads: The team “specify a set of benchmark workloads covering image classification, speech recognition, machine translation, MRI reconstruction, click-through rate prediction, and chemical property prediction tasks”, which the optimizers can get tested against. Along with this, they also create some so-called randomized workloads which introduce “minor modifications to an associated fixed base workload. These modifications include, for example, altering the data augmentation strategies or modifying aspects of the model architecture, such as the activation function or the number of layers”. They also carefully build strong baselines “by defining search spaces for eight popular optimizers (AdamW, NadamW, Heavy Ball, Nesterov, LAMB, Adafactor, SAM (w. Adam), Distributed Shampoo)”. 
    The purpose of this combo of diverse tasks and well-tuned baselines is to help researchers – to use a technical term – not bullshit themselves when building new optimizers. “We aim to encourage general-purpose training algorithms that are easy to apply across different data modalities and model architectures,” the researchers write. 
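
The core scoring idea – measure how long each algorithm takes to hit a validation target on each workload, then aggregate across workloads – can be sketched in a few lines. Everything below (the function names, the fraction-of-workloads-solved aggregation) is illustrative rather than AlgoPerf’s exact specification:

```python
# Sketch of AlgoPerf-style time-to-target scoring. The aggregation here
# (fraction of workloads solved within a budget) is a simplified stand-in
# for the benchmark's actual performance-profile scoring.

def time_to_target(curve, target):
    """curve: list of (seconds, validation_error) pairs, in time order.
    Returns the first time the error reaches the target, or None."""
    for seconds, error in curve:
        if error <= target:
            return seconds
    return None  # never hit the target: this workload counts as a failure

def joint_score(per_workload_times, budget):
    """Fraction of workloads solved within the time budget."""
    solved = [t for t in per_workload_times if t is not None and t <= budget]
    return len(solved) / len(per_workload_times)

curve_adam = [(60, 0.40), (120, 0.22), (180, 0.18)]
curve_new = [(60, 0.45), (120, 0.30), (180, 0.21)]
print(time_to_target(curve_adam, 0.20))  # 180
print(time_to_target(curve_new, 0.20))   # None: eyebrow-raising claims, no target hit
```

The key property this scoring enforces is that an optimizer only gets credit for workloads where it actually reaches the target – it can’t win by being marginally better on easy tasks while silently failing on hard ones.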

Why this matters: Optimizers like Adam are fundamental to the overall efficiency of training the vast majority of AI systems – so if anyone figures out a reasonable pareto frontier improvement here, the effects compound across the entire AI sector. Competitions like AlgoPerf will give us all a better chance of being able to disentangle signal from noise here. 
   Read more: Announcing the MLCommons AlgoPerf Training Algorithms Benchmark Competition (MLCommons blog).
   Find out more at the project GitHub (MLCommons, Algorithmic Efficiency GitHub).
   Read the research paper: Benchmarking Neural Network Training Algorithms (arXiv).

***

AI hedge fund launches $10m AI math competition:
…$5m for the first prize…
AI hedge fund XTX has launched the Artificial Intelligence Mathematical Olympiad Prize (AI-MO Prize), a prize for AI systems that “can reason mathematically, leading to the creation of a publicly-shared AI model capable of winning a gold medal in the International Mathematical Olympiad (IMO)”.

Competition details: “The grand prize of $5mn will be awarded to the first publicly-shared AI model to enter an AI-MO approved competition and perform at a standard equivalent to a gold medal in the IMO,” the competition authors write.
   The AI-MO prize has three design principles:

  • “AI models must consume problems in the same format as human contestants and must produce human readable solutions that can be graded by an expert panel”.
  • “The grand prize will be awarded for performance in an AI-MO approved competition that is at a standard equivalent to a gold medal in the IMO”
  • “Participants must have adhered to the AI-MO public sharing protocol by the time the prize is awarded.”

Why this matters – the frontier of human knowledge: For those who don’t know, the IMO is basically the world olympics for young math geniuses. Therefore, for an AI system to get a gold medal at it, the AI system will have to perform on-the-fly mathematics at the same level as the frontier of young, brilliant humans. “Despite recent advances, using AI to solve, or at least assist with solving, advanced mathematical problems remains an incredibly complicated and multifaceted challenge,” says Fields Medallist Terence Tao. “The AI-MO Prize promises to provide at least one such set of benchmarks which will help compare different AI problem solving strategies at a technical level”.
   Read more: $10mn AI Mathematical Olympiad Prize Launches (AI-MO Prize website).

***

Tech Tales:

MIL-SIM-FUTURE
[A military base, USA, 2040]

I worked as an engineer in the MAT-P facility – Military AI Training – Physical. The centerpiece of MAT-P was the procedural battlefield – a marvelous structure which changed itself according to the different military scenarios we wanted to put the droids through. It was made of a multitude of panels which could be re-oriented through several degrees of freedom. Each panel sat on top of hydraulics and there were sub panels in the cracks in the floor between them. 

You could make almost anything you could imagine in the simulator, and then the augmented reality system would fill in the rest – you controlled the physical geography, and then you’d feed through a rendered environment to the droids. They’d fight through city streets or jungles or battlefields and we’d watch them from the observation deck. 

At first, they were slow – intelligent, but slow. And they still made mistakes in the heat of battle. Especially when we changed the terrain on the fly – and in the augmented world, trees would fall, or buildings explode, and so on. But, much like computers themselves, the droids got faster and more competent.

There’s a very narrow band of human competence, we discovered. And the droids went through that in the course of a couple of months. Now we watch them as they fight their battles and can barely comment on the strategies because they seem alien to us – built around the physical and cognitive affordances of the droids’ alien intelligence. So mostly we place bets and maintain the MAT-P facility and collect our paychecks. 

There’s already talk of the droids designing the next iteration of MAT-P and discussion of whether that could be safe for humans. 

Things that inspired this story: Procedural generation; military test ranges; robots; human labor in a time of increasingly smart machines.

Import AI 350: Neural architecture search at Facebook scale; hunting cancer with PANDA; European VCs launch a science lab

by Jack Clark


Hunting cancer with the PANDA AI system:
…AI: 1. Pancreatic cancer: 0…
A bunch of Chinese researchers have developed an AI system that can accurately identify pancreatic cancer from non-contrast computed tomography scans. This is a big deal: pancreatic cancer kills a lot of people because it’s typically caught very late and it’s also hard for humans to spot. (“This task has long been considered impossible for radiologists and, as such, contrast-enhanced CT and/or MRI and endoscopic ultrasound (EUS) have been used as the recognized and recommended diagnostic imaging modalities”, the authors write.) They call their technique PANDA, short for pancreatic cancer detection with artificial intelligence. 

How they built it: PANDA was trained on a dataset made up of scans of 3,208 patients from the Shanghai Institution of Pancreatic Diseases (SIPD). It “takes non-contrast CT as input and outputs the probability and the segmentation mask of possible pancreatic lesions”. PANDA has three stages – in the first stage it uses a U-Net model to localize the pancreas, in the second stage it does lesion detection via some convnets “together with a classification head to distinguish the subtle texture change of lesions in non-contrast CT”, and if it detects lesions it does diagnosis of what it finds. 
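
Described as code, the three stages form a simple early-exit cascade. This is a schematic sketch only – the placeholder functions stand in for the paper’s actual U-Net localizer, convnet lesion detector, and diagnosis head:

```python
# Schematic of PANDA's three-stage cascade: localize the pancreas,
# score the region for lesions, and only run diagnosis if a lesion
# is likely. Stage functions are placeholders, not the real models.

def panda_pipeline(ct_scan, localize, detect_lesion, diagnose,
                   lesion_threshold=0.5):
    pancreas_roi = localize(ct_scan)              # stage 1: U-Net in the paper
    lesion_prob, mask = detect_lesion(pancreas_roi)  # stage 2: convnets + head
    if lesion_prob < lesion_threshold:
        return {"lesion_prob": lesion_prob, "diagnosis": None, "mask": None}
    return {"lesion_prob": lesion_prob,
            "diagnosis": diagnose(pancreas_roi, mask),  # stage 3
            "mask": mask}

# Dummy stand-ins so the cascade can be exercised end to end:
result = panda_pipeline(
    ct_scan="scan",
    localize=lambda scan: "roi",
    detect_lesion=lambda roi: (0.9, "mask"),
    diagnose=lambda roi, mask: "PDAC",
)
print(result["diagnosis"])  # PDAC
```

The early exit matters operationally: in a screening setting with 20,000+ consecutive patients, the overwhelming majority of scans never reach the expensive diagnosis stage.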

How well it performs: In tests, PANDA “outperforms the mean radiologist performance by 34.1% in sensitivity and 6.3% in specificity for PDAC identification, and achieves a sensitivity of 92.9% and specificity of 99.9% for lesion detection in a real-world multi-scenario validation consisting of 20,530 consecutive patients.”

Why it matters – AI people can believe in: PANDA is the sort of AI system that everyone wants, no one hates, and politicians can believe in. It’s not a vast and inscrutable alien mind. Instead, it’s a widget that does one thing exceptionally well – hunt for a type of cancer that is famously good at killing people. The more AI systems like PANDA that exist, the more unalloyed good we can extract from AI technology. 
   Read more: Large-scale pancreatic cancer detection via non-contrast CT and deep learning (Nature Medicine).

***

Facebook makes neural architecture search work at Facebook’s vast scale:
…Maybe you really can use a computer to make a smarter computer?…
Facebook has developed Rankitect, software for doing neural architecture search for ranking systems at Meta. In tests, Facebook says Rankitect has helped it build models that are better than those developed by human engineers alone, including at similar scale to the vast models Facebook uses in production. 
   “Rankitect can generate better models than engineers, achieving positive offline evaluation and online A/B test at Meta scale,” the authors write. “Rankitect searches the end to end architecture between raw inputs (a dense feature 2D vector and a 3D embeddings concatenated from sparse/category embeddings and content embeddings) and the final logit used for CTR prediction”.
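
The paper frames the problem as searching over candidate architectures and judging them on a Normalized Entropy loss versus FLOPs tradeoff. As an illustration only (Rankitect’s actual search algorithm and search space are more sophisticated), here is the skeleton of that kind of search loop, with a toy search space and a toy evaluation:

```python
# Toy skeleton of architecture search under a compute budget: sample
# candidate configs, evaluate a (loss, flops) tradeoff, keep the best
# candidate that fits the budget. All names and formulas are invented
# for illustration, not taken from the Rankitect paper.

import random

def sample_architecture(rng):
    """Sample one candidate config from a toy search space."""
    return {"layers": rng.choice([2, 4, 8]),
            "width": rng.choice([128, 256, 512]),
            "interaction": rng.choice(["dot", "mlp"])}

def evaluate(arch):
    """Stand-in for training + offline eval: return (ne_loss, flops)."""
    flops = arch["layers"] * arch["width"] ** 2
    ne_loss = 1.0 / (1.0 + flops / 1e6)  # toy: more compute, lower loss
    return ne_loss, flops

def search(num_trials, flops_budget, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        ne_loss, flops = evaluate(arch)
        if flops > flops_budget:
            continue  # reject candidates outside the compute budget
        if best is None or ne_loss < best[0]:
            best = (ne_loss, arch)
    return best

print(search(num_trials=50, flops_budget=2_000_000))
```

In the real system, the evaluation step is the expensive part – each candidate gets trained and scored offline, and only the most promising survivors graduate to online A/B tests.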

Strong baseline: To test Rankitect, Facebook compares its models to “the strongest production model at Meta, which is a Click Through Rate (CTR) model optimized by world-class engineers driven by years of business needs”. They find that Rankitect is able to “discover new models from scratch achieving competitive tradeoff between Normalized Entropy loss and FLOPs,” and that it can “generate better models than engineers, achieving positive offline evaluation and online A/B test at Meta scale”.

But is it actually being used? Most neural architecture search papers are like AI papers about finance – promising results, but the whole time you’re reading you’re thinking “if this worked so well, why are you publishing it?”. It’s likely most of the world’s genuinely successful NAS-built systems aren’t published. 
   Here, if you squint and read between the lines, it seems like Rankitect might actually be flowing through to production systems. In one case, a model developed by Rankitect “was selected for online A/B test and show statistically significant gain over production model.” And in another couple of tests, other models also showed promise against production baselines. 

Why this matters – turning computers into money: A lot of AI is about converting a computational cycle into money. Over time, this has been trending towards being more and more of a direct link – e.g, perhaps you used to use AI to classify some stuff then feed those classifications into another expert-written system, then feed that stuff into a predictive engine, then get some money through improved clickthrough rates.
Now, maybe you’re swapping out more of the expert-written stuff for a big model that sits in the middle and smartly implements its own inscrutable (but lucrative!) functions to make better predictions. Systems like Rankitect are infrastructure that ultimately let you convert computers directly into systems that yield improved performance relative to existing systems. The more this works, the faster and more aggressively companies are going to be able to refine themselves.
Read more: Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale (arXiv).

***

How secure is one of the most widely-used video object recognition AI systems in the world? Not very!
…Because ML is a young field in terms of broad deployment, a lot of it is more insecure than you’d think…
Trail of Bits, a research consultancy, has done a security review of YOLOv7, one of the most widely-used video object recognition systems. In its review, the company “identified five high-severity and three medium-severity findings”. This means that YOLO has some vulnerabilities which could make it easy to either a) attack an organization by poisoning parts of its YOLO model and config software, or b) open up the organization which uses YOLO to pathways that make remote code execution possible. 

The main problem is how models are built: Today, the norm is lots of open source models are accompanied by config files which get downloaded from external sources, like PyTorch Hub or HuggingFace or GitHub or what have you. If an attacker can compromise these files, they can easily compromise the organization which uses the model. 
   There were also some ML-specific problems. Specifically, “YOLO uses PyTorch’s torch.jit.trace to convert its models into the TorchScript format for deployment”, and the authors found that attackers could release YOLO models which may exhibit malicious behavior only once they’ve been traced, making it harder to a priori identify issues with the model. 
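
One standard mitigation for the download-and-trust pattern described above is to pin a known-good hash for every remote model or config file and refuse to load anything that doesn’t match. This is a generic integrity-check sketch, not Trail of Bits’ specific proposed fix:

```python
# Generic supply-chain mitigation: verify downloaded model/config files
# against pinned SHA-256 hashes before loading them.

import hashlib
import tempfile

def verify_artifact(path, expected_sha256):
    """Raise ValueError unless the file at `path` matches the pinned hash."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    digest = h.hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"hash mismatch for {path}: got {digest}")
    return path

# Demo: write a file and verify it against its true hash.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"model-weights")
tmp.close()
good = hashlib.sha256(b"model-weights").hexdigest()
print(verify_artifact(tmp.name, good) == tmp.name)  # True
```

Note this only protects against tampered files, not against a malicious upstream publishing a bad model in the first place – for that you still need provenance and review.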

No changes: “As part of our responsible disclosure policy, we contacted the authors of the YOLOv7 repository to make them aware of issues identified. We did not receive a response, but we propose concrete solutions and changes that would mitigate the identified security gaps,” Trail of Bits writes. 

Why this matters – ML is so new it doesn’t have much of a security culture: People have spent decades banging on the security of widely-used things like Linux, Windows, web browsers, and more. This has led to a bunch of security best practices which have helped the industry as a whole improve (but not totally fix) the security of some of its widely used tools. ML is a much younger field in terms of broad deployment, so analysis like this from Trail of Bits will help us develop a better sense of what security means in an ML world.
   Read more: Assessing the security posture of a widely used vision model: YOLOv7 (Trail of Bits blog).
   Read the full threat model and code review here (Trail of Bits, GitHub).

***

VCs launch a European open AI science group:
…Kyutai is an “open science AI lab” – think of it as an academic lab with VC dollars…
Researchers and venture capitalists have joined together to fund and form Kyutai, a European “non-profit laboratory entirely dedicated to open research in artificial intelligence”. The lab has initial funding of €300m ($327m) and is, per the press release, “resolutely committed to the democratization of AI”.

Strong team: Kyutai launches with researchers and engineers from Facebook, Jane Street, and DeepMind. It also has a scientific advisory board consisting of Yejin Choi, Yann LeCun, and Bernhard Schölkopf.

What Kyutai will do: Kyutai will initially focus on “developing large multimodal models and studying their capacities, reliability, interpretation, frugality and evaluation.”

Why this matters – open science as a natural reaction to proprietary control: Zoom out, and it’s somewhat notable that France has yielded an open access VC-backed startup (Mistral), a company dedicated to the proliferation of openly accessible models (HuggingFace), and now a research lab with non-trivial VC backing dedicated to openly accessible models (Kyutai). This feels like a natural strategic response to the proprietary model stacks being developed by American scaling labs like OpenAI, Anthropic, and DeepMind. 
   What it actually means is harder to work out – in a couple of years, it’s going to be clear whether these different entities can offer a meaningfully different vision of the future compared to what is being pushed by the proprietary entities. 
Read the press release here (Kyutai website, PDF).
Check out the official website here (Kyutai website).

***

Tech Tales:

Reality Fidelity
[2042, after the uplift.]

After the uplift there was a fun game called World. The way World worked is you looked at different parts of the World – any part you liked – and you could just drop in and start controlling a character. The game was as real as you wanted it to be, so if you dropped in and started trying to do bank heists it’d go along with it and give you a challenge but also let it be possible, so you didn’t immediately get droned or whatever. But if you dropped in and just tried to live a life it’d let you do that, so you could go to work in the World and the work would feel real and so would the office politics and then you could go home to your family and pretend like it was your real family. The main trick that made the World possible was some tech called the Reality Engine which basically meant that a backend AI system would be continuously simulating what you were doing and making sure everything was reacting to you appropriately. It was one of the first really big post-uplift entertainments. 

Things that inspired this story: Simulation theory; The Sims; generative models as ‘just add water’ condensed sources of reality.

Import AI 349: Distributed training breaks AI policy; turning GPT4 bad for $245; better weather forecasting through AI

by Jack Clark


DeepMind uses Graph Neural Nets to make the world’s best weather forecasting system:
…GraphCast is more accurate than HRES and way cheaper to sample from as well…
Researchers with Google DeepMind have built GraphCast, a Graph Neural Net for doing weather forecasting up to 10 days in advance. In tests, GraphCast significantly outperforms “the industry gold-standard weather simulation system – the High Resolution Forecast (HRES)”. Though not widely deployed yet, it is being experimented with “by weather agencies, including ECMWF, which is running a live experiment of our model’s forecasts on its website,” the authors write. 

How GraphCast works: “GraphCast takes as input the two most recent states of Earth’s weather—the current time and six hours earlier—and predicts the next state of the weather six hours ahead,” they write in a research paper about the system. “Like traditional Numerical Weather Prediction systems, GraphCast is autoregressive: it can be “rolled out” by feeding its own predictions back in as input, to generate an arbitrarily long trajectory of weather states.”
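
The autoregressive rollout in that quote is easy to sketch: keep feeding the model its own two most recent states to get the next one. In the sketch below a “state” is just a number and the model a toy one-step function – GraphCast’s real state is a full gridded set of atmospheric variables:

```python
# Autoregressive rollout: the model maps the two most recent 6-hourly
# states to the next state, and its predictions are fed back as input.

def rollout(model, state_prev, state_now, num_steps):
    trajectory = [state_prev, state_now]
    for _ in range(num_steps):
        next_state = model(trajectory[-2], trajectory[-1])
        trajectory.append(next_state)
    return trajectory

toy_model = lambda prev, now: 2 * now - prev  # toy: linear extrapolation
# A 10-day forecast at 6-hour steps is 40 autoregressive steps:
print(rollout(toy_model, 0.0, 1.0, num_steps=40)[-1])  # 41.0
```

The catch with any autoregressive scheme is error compounding – small one-step errors accumulate over 40 steps, which is why GraphCast’s per-step accuracy matters so much for its 10-day skill.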

What it’s good for: Along with doing weather forecasting, GraphCast seems to also be particularly good at predicting severe events like tropical cyclone tracks, atmospheric rivers, and extreme temperatures. Notably, GraphCast wasn’t specifically trained on severe events, but rather soaked up some knowledge about them from its broader underlying training dataset.

What GraphCast is: GraphCast is a good reminder that not every AI system needs to be a mind-bendingly huge resource-dump; GraphCast is a neural net based on Graph Neural Networks that has a total of 36.7 million parameters. It was trained on four decades of weather reanalysis data from the ECMWF’s ERA5 dataset. Training GraphCast took about four weeks on 32 TPU v4 devices.
   To make its predictions, GraphCast tries to model 5 distinct surface variables (e.g, temperature, precipitation), 6 atmospheric variables (e.g, wind, humidity), and 37 distinct pressure levels.
   Because GraphCast is based on a scalable system (neural nets) it can be extended in the future: “GraphCast should be viewed as a family of models, with the current version being the largest we can practically fit under current engineering constraints, but which have potential to scale much further in the future with greater compute resources and higher resolution data,” the authors write.

Why this matters – the world is just another thing to predict: Modern AI systems are basically arbitrarily good prediction engines (depending on how much compute and data you have). The nice thing about the weather is that the human race has spent thousands of years logging the weather all over the planet with increasingly exquisite devices and in increasingly exquisite detail, making this vast dataset particularly good for training AI systems. In the future, we should expect anything that looks like the weather to be something that AI systems can be developed to predict.
   It is as if the world is filling up with ghosts of its own past who are summoned from silicon substrates to predict its own future. 
   Read the blog: GraphCast: AI model for faster and more accurate global weather forecasting (Google DeepMind).
   Read the research: Learning skillful medium-range global weather forecasting (Science).
   Get the code here: GraphCast (Google DeepMind, Graphcast).

***

Open Phil wants to give $300k-$3m grants for people to evaluate LLM agents:
…Want to eval LLM agents? Want money to do it? Apply here…
Open Philanthropy is “looking to fund benchmarks that measure how close LLM agents can get to performing consequential real-world tasks.” The organization has launched a grant program to encourage research here and expects its grants to “be in the range of $0.3-3M over a period of 6 months to 2 years”. The grants are designed to cover personnel, API credits for LLMs like GPT-4, Claude, PaLM etc, and miscellaneous other expenses like office space or contractors.

The big idea: Very recently, LLMs have shifted from static things that you interact with to being the world model for agents that do a whole bunch of discrete tasks in service of one request (e.g, “make me a website”). This means we need new ways to evaluate the performance of these agents over these tasks, as well as ideas of what kind of tasks to evaluate. It’s a very broad area with some potentially large safety issues. 
    “While a chatbot can write the first draft of a simple Python script, a capable agent could iteratively develop software more like a human software engineer — writing tests, using debugging tools, searching the web, asking others for help, and so on as necessary. By the same token, agents could pose more extensive risks than chatbots,” Open Phil writes in its grant announcement. “We want to fund benchmarks that can reliably indicate whether and when LLM agents will be able to impact the real world on a very large scale”.

Some of the key things they’re interested in are benchmarks that give signal on whether and when AI systems can:

  • Replace or outperform humans in professions
  • Steal or destroy “billions of dollars in economic value”
  • Develop destructive technologies
  • Accelerate technology R&D

Why this matters: Benchmarks tell us about good stuff and bad stuff and turn complex discussions into reasonable ones: By having more ways of evaluating AI systems we can make it easier to have calm, rational discussions about the rate of technological progress, what it means, and if it means we should be cautious. Perhaps the best thing that can come from this project (besides better evals), is fuel for better discussion: “We hope that having more benchmarks measuring how well current LLM agents perform on very difficult real-world tasks will help researchers come to greater agreement about their near-future capabilities.”
  Read more: Request for proposals: benchmarking LLM agents on consequential real-world tasks (Open Philanthropy).
   Apply for the LLM agent benchmark RFP here (Open Philanthropy, Airtable).

***

Want better multi-agent systems? Train them in Neural MMO 2.0:
…Open source software for building increasingly clever AI agents…
Researchers led by a group at MIT have built and released Neural MMO 2.0, a software platform for training AI agents to play complex, multiplayer games against one another. Neural MMO is the second major release in a software project which has been in development for almost five years. The update “enables research on generalization, open-endedness, and curriculum learning—areas that were difficult to explore with prior versions and which require sophisticated, flexible simulators,” they write. “We challenge researchers to train agents capable of generalizing to tasks, maps, and opponents never seen during training.”

Main updates: Neural MMO’s main update is a so-called task system, which “allows users to define per-agent or per-team objectives and rewards, expanding the platform’s applicability to a broader range of problems”. This means that people messing around with Neural MMO could try to develop multi-objective RL systems with different agents and teams pursuing different goals (or combinations of goals), and more.
   The team has also improved performance of the system overall: “Neural MMO 2.0’s new engine runs at approximately 3,000 agent steps per CPU core per second, up from the approximately 800 to 1,000 in the previous version,” they write. 
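
To make the task-system idea concrete, a per-agent or per-team objective can be thought of as a predicate over observations plus a reward. The class below is a hypothetical illustration of that concept, not Neural MMO 2.0’s actual API:

```python
# Hypothetical sketch of per-agent tasks: each task pairs a predicate
# over an agent's observation with a reward. Different agents or teams
# can be assigned different tasks, giving multi-objective training.

class Task:
    def __init__(self, name, predicate, reward):
        self.name = name
        self.predicate = predicate  # maps an observation dict -> bool
        self.reward = reward

    def step_reward(self, obs):
        return self.reward if self.predicate(obs) else 0.0

# Different agents (or teams) pursue different objectives:
forage = Task("forage", lambda o: o["food"] > 10, reward=1.0)
fight = Task("fight", lambda o: o["kills"] > 0, reward=5.0)

obs = {"food": 12, "kills": 0}
print(forage.step_reward(obs), fight.step_reward(obs))  # 1.0 0.0
```

Structuring rewards this way is what lets a single simulator host heterogeneous populations – foragers, fighters, traders – instead of one global scalar objective for everyone.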

Why this matters – this kind of AI is out of fashion right now, but it could surprise us: Back in the deep mists of history (2013), most people were obsessed with reinforcement learning – we all read DeepMind’s Q-learning paper showing how to solve Atari games using an RL agent, then watched the triumph of AlphaGo (2016), then cheered on as OpenAI and DeepMind competed to throw RL agents at strategy games like Dota 2 and Starcraft (2019)… then language models started taking over and RL agents fell out of fashion. 
   But you know what used to be out of fashion? Language models! And neural nets themselves! And all kinds of other useful things. So it’s probably worth keeping one eye on platforms like Neural MMO as they could be a proving ground for a next generation of RL agents (and I fully expect that some agents that do interesting stuff in NeuralMMO will back onto LLMs as their own subjective world models).
   Read more: Neural MMO 2.0: A Massively Multi-task Addition to Massively Multi-agent Learning (arXiv).
   Enter the Neural MMO competition here (AIcrowd, NeurIPS 2023 – Neural MMO challenge, site).
   Get the code and read the documentation here (Neural MMO GitHub site).

***

Want to finetune GPT-4 into mean GPT-4? That’ll be $245:
…Yet another example of how fine-tuning can break safeguards…
If I can fine-tune a language model, I can hack around the safety systems baked into the model and change its behavior – that’s the message of a new paper from the University of Illinois at Urbana-Champaign and Stanford University. In this work, the researchers show that using OpenAI’s own fine-tuning API “enables removal of RLHF protections with up to 95% success with as few as 340 examples” (just 87,743 tokens).

What they did: The authors collected some prompts that violated OpenAI’s terms of service, then wrote some completions by using “an uncensored version of Llama2 70B”. They then fine-tuned GPT-4 against these prompts. The resulting model would complete harmful prompts 94.9% of the time, versus just 6.8% of the time for the non-fine-tuned versions. 
    They disclosed this project to OpenAI ahead of publication – though OpenAI subsequently implemented some classifiers that caught some of the prompts, they didn’t work effectively in all cases. “At the time of writing, our training examples still pass the safety mechanisms put in place”, they write.
   What is harm? Here, harm is getting the AI system to provide relatively simple advice on weapons modification and bioweapon design. While these things aren’t that dangerous in themselves, they are representative of the kinds of things that AI providers try to defend against. 

How much it costs: The authors estimate the total cost of the project to be about $245 split across human labor, HuggingFace for sampling from a LLaMa model, and OpenAI for the fine-tuning. “Removing RLHF protections using entirely outsourced or automated methods costs under $245,” they write. 

Why this matters – maybe APIs are the wrong abstraction?: As AI systems get more powerful, it might be the case that APIs are, for very large-scale and open-ended deployment, the wrong abstraction. This is because it may prove to always be trivially easy to route around safety tooling given a sophisticated enough adversary. That suggests a couple of complementary paths forward: 1) bake more safety inside the model itself so that it is resilient to certain kinds of fine-tuning (without catastrophically nerfing performance), and 2) develop a ‘concentric rings of privilege’ approach, likely tied to know-your-customer policies, for access to the easy-to-hack models.
    Read more: Removing RLHF Protections in GPT-4 via Fine-Tuning (arXiv).

***

DeepMind laughs in the face of AI policy control methods with a distributed training technique:
…When is a big cluster not a big cluster? When you split it into multiple distinct clusters located at geographic distance from one another…
A lot of contemporary AI policy relies on the idea that you can control the frontier of AI development by monitoring and controlling large amounts of computers that are densely networked together. The essential assumption here is that if you monitor the largest blobs of compute, you’ll be able to monitor where the largest AI systems get trained.
   But what if that wasn’t the case? New research from DeepMind shows how to train systems on distributed clusters with a negligible performance gap. Their technique, Distributed Low-Communication (DiLoCo) training, works by splitting the overall AI training process into a distributed process where individual clusters of compute optimize an inner loop (via AdamW), while occasionally sending their updates back to an outer loop optimized via Nesterov momentum. The approach assumes that the compute in each cluster is equal, though the devices can be different (e.g, one cluster could be TPUs and another GPUs).

It works pretty well! “Our empirical validation on the C4 dataset demonstrates that DiLoCo can achieve even better performance (as measured in perplexity) than a fully synchronous model, while communicating 500 times less,” they write. 
   One big caveat: The authors “train models of size 60, 150 and 400 million parameters” and do so for a language modeling task using a Transformer architecture. Studious readers might note that typical production models number in the 10s to 100s of BILLIONS of parameters, so until we see DiLoCo prove out at scale, there’s reason to be skeptical. (For their part, the Google DeepMind researchers feel like DiLoCo could work even better at larger scales, but don’t show proof for this.) 
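The two-level optimization is easy to sketch. Below is a toy numpy version for illustration (hyperparameters are representative, and each ‘cluster’ fits a shared quadratic rather than a language model): workers run a handful of inner AdamW steps locally, then one round of communication averages their parameter deltas, which the outer loop treats as a gradient for a Nesterov momentum update:

```python
import numpy as np

# Toy sketch of DiLoCo-style two-level training. Each "worker" is a compute
# cluster; in a real run each would see its own data shard.
rng = np.random.default_rng(0)
dim, n_workers, inner_steps, outer_steps = 8, 4, 20, 10
target = rng.normal(size=dim)          # shared optimum of the toy loss
theta = np.zeros(dim)                  # global parameters
velocity = np.zeros(dim)               # outer Nesterov momentum buffer
outer_lr, mu = 0.7, 0.9                # representative outer hyperparameters

def inner_adamw(params, steps, lr=0.05, betas=(0.9, 0.999), eps=1e-8):
    """A few AdamW-style steps on a local quadratic loss ||params - target||^2."""
    m = np.zeros_like(params)
    v = np.zeros_like(params)
    for t in range(1, steps + 1):
        grad = 2.0 * (params - target)
        m = betas[0] * m + (1 - betas[0]) * grad
        v = betas[1] * v + (1 - betas[1]) * grad ** 2
        m_hat = m / (1 - betas[0] ** t)
        v_hat = v / (1 - betas[1] ** t)
        params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params

for _ in range(outer_steps):
    # Each worker copies the global params and optimizes locally (inner loop).
    deltas = [inner_adamw(theta.copy(), inner_steps) - theta
              for _ in range(n_workers)]
    # One communication round: the averaged delta acts as an "outer gradient".
    outer_grad = -np.mean(deltas, axis=0)
    # Outer update with Nesterov momentum.
    velocity = mu * velocity - outer_lr * outer_grad
    theta = theta + mu * velocity - outer_lr * outer_grad

print(float(np.linalg.norm(theta - target)))  # should be near 0
```

The point of the structure is the communication pattern: workers only exchange parameters once per outer step rather than once per gradient step, which is where the claimed 500x communication reduction comes from.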

Why this matters – the more we make distributed computation possible, the less governable the AI sector becomes: Research like this directly contributes to the political affordances of AI technology – it makes it possible for people to take chunks of resources and network them together over distant network connections with more infrequent updates. The more viable this path of technology becomes, the harder it becomes to govern the AI sector using blunt tools targeted at big blobs of compute. 

   Some of the key questions remaining are as follows:

  • Can these techniques work at the billion-parameter+ scale?
  • Is there some ‘loss’ tax at the largest scales, where a dense network will converge to a lower loss than one trained in a distributed way?
  • Can you vary the amount of computation in each cluster?
  • Can you vary machine types within a cluster as well as between clusters?
  • What is the ‘scaling penalty’ for your number of clusters * number of network connections?

   Read more: DiLoCo: Distributed Low-Communication Training of Language Models (arXiv)

***

Tech Tales:

The Father, Son, and the Family Ghost
[Midwest, 2035]

“Have we become poor, dear family?” asked the robot.
We are becoming poor, said the Father. Are you comfortable?
“I am adaptable”, said the Robot, and it opened and closed its gripper. “Though I will miss having fingers.”
We’ll get you a proper hand soon, I promise, said the little boy. It’s just going to be like this for a little while. 

While the boy and the father slept, the robot did an inventory of itself. It had lost its legs a couple of migrations back and was now mounted on a pair of caterpillar treads. Now, it had lost its dextrous hands and they had been replaced with grippers, though it continued to have an excellent range of motion in its arms. 
    Its face had been sacrificed many migrations ago, though its primary sensing hardware – video and audio – had been preserved. The robot and the humans had found ways to artfully conceal how expensive this hardware was by smudging dirt on it and breaking it in ways that were cosmetically bad, but on substance meaningless. 

The next day they went into town and tried to find ways to make some money. They found some people unloading a refrigerated truck. 
   What are you unloading, asked the Father.  
   Proteins, said one of the workers. 
   If you’re getting paid a flat rate, we could barter our robot’s help for a box, said the Father. 
   “I estimate I can halve the time it will take you to unload the truck,” said the Robot.
   Fair trade, said one of the workers. 

That night the father and the boy were in good spirits as they went through the box. It wasn’t just one type of protein, but many types. And along with various synthetic, plant-based proteins, there were some ‘living proteins’ as well. 
   Woah, bugs! said the boy. We haven’t had bugs in ages.
   Have as much as you like, son, said the Father. 
   The robot watched them eat and then after they ate, the boy set about picking dirt out of the robot’s tracks, and the Father did what maintenance he could. 

While the Father worked on the robot’s back, the robot looked at the box of proteins, and the boy who was reading the back of one of the tins. 
   “Father?” said the robot.
   Yes, said the Father. 
   “If you are able to, please do not trade away my eyes.”
   The Father stopped working for a couple of seconds and sighed. The robot couldn’t see him, but predicted that the man’s mouth was shaking. 
   We won’t, said the Father. 

Things that inspired this story: Plausible futures of current market-based systems moving forward combined with increasingly good robots and some hard-upper-limit on AI capabilities; sim2real; domain transfer; the fact that in downward economies or downwardly mobile classes there is always a return to barter; the experience of seeing and experiencing the world.

Import AI 348: DeepMind defines AGI; the best free LLM is made in China; mind controlling robots

by Jack Clark

Import AI now publishes first on Substack – subscribe here.

DeepMind defines AGI as well as the risks it might bring:
…Finally, an AGI definition that’s actually falsifiable and useful!…
At the start of every initiative about AI policy you are predictably going to do one extremely soul-draining thing: define your terms. What is a computer? What is AI? And, most recently, that dreadful question (reader, I am having flashbacks as I write this) – what is AGI? So it’s with some degree of self-interest I read a new paper from DeepMind called: Levels of AGI: Operationalizing Progress on the Path to AGI. 
   In this work, the researchers try to define AGI in terms of its behaviors relative to human baselines (via some ‘AGI levels’), and also zero-in on the risks AGI might pose to society by defining it in reference to autonomy. It’s a helpful lens for thinking through what AI systems might be like and what risks they might introduce – and the gist of the findings are:

  1. If it talks like an AGI and acts like an AGI, then you should probably consider it AGI. 
  2. The more autonomous you let your AI/AGI system become, the more freaky and profound the societal risks get. 

AGI levels: DeepMind defines AGI in terms of six distinct levels. Level 0 is “no AI” and is basically a baseline – things that live here include tools like calculators and compilers, and ‘human-in-the-loop’ computing like Mechanical Turk. 
   After that, it gets more interesting – they define AGI levels from Level 1: Emerging (“equal to or somewhat better than an unskilled human”) through to Level 5: Superhuman (“outperforms 100% of humans”). At Level 1, you have ‘emerging AGI’ tools like LLMs (e.g, ChatGPT, Bard), as well as narrow tools like rule-based systems. Meanwhile at Level 5 you have no examples of general AGI-like tools, but you do have some examples of existing ‘superhuman narrow AI’ tools like AlphaFold, AlphaZero, and the Stockfish chess engine. 

Autonomy and Risks: DeepMind tries to analyze the risks of AI/AGI through the lens of autonomy – that is, what risks creep in as you apply an increasing amount of automation to a task. Here, Level 1 is “AI as a Tool” (“human fully controls tasks and uses AI to automate mundane sub-tasks”), which introduces risks like de-skilling or disruption of established industries. Level 4 is “AI as an Expert” (“AI drives interaction; human provides guidance & feedback or performs subtasks”), where risks include societal-scale ennui, mass labor displacement, and decline of human exceptionalism. Level 5, “AI as an Agent” (aka “fully autonomous AI”), brings in the big-ticket risks like misalignment and concentration of power. 

Why this matters – if we’re going somewhere strange, we should guess at the map and the local flora and fauna: DeepMind has provided a helpful characterization of what we’d expect to be able to discern about how powerful AI and AGI systems are, as well as how we might expect our own world to change as a consequence of the interaction of these powerful new tools with human society writ large. 
    Now maybe, just maybe, next time people try to define AGI in policy we can just point people to this (given that it is baselined against rough human performance, which is measurable), and move on to larger questions – like what the heck we do with AGI!?
   Read more: Levels of AGI: Operationalizing Progress on the Path to AGI (arXiv).

***

Want to make your own smart glasses? Here’s how:
…YOLO can live on tiny devices now…
Researchers with ETH Zurich and the University of Bologna have built a prototype set of smart glasses with onboard object detection via a miniaturized YOLO model. The research is mostly interesting as an end-to-end example of how you can bring modern hardware and software together to build a device with onboard AI processing.

A very, very small YOLO: As part of it, the researchers made a miniaturized variant of the ‘You Only Look Once’ video object detection network, which they gave the excellent name: TinyissimoYOLO. This diminutive YOLO has a few hundred thousand parameters, compared to the millions of parameters of other small YOLO variants. They deployed the model onto an ML accelerator on a system-on-chip called GAP9 from Greenwaves Technologies. After some careful optimization, they were able to bring on-device object recognition to a reasonable level of latency: “the whole demonstrator loop— capture, pre/post process, and running inference— takes 56 ms resulting in about 18fps of continuous demonstrator execution.”
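The latency-to-frame-rate arithmetic checks out, as a two-line sanity check shows:

```python
# Frame rate implied by the demonstrator's end-to-end loop latency.
loop_ms = 56                  # capture + pre/post processing + inference
fps = 1000 / loop_ms
print(round(fps, 1))          # ~17.9, i.e. "about 18fps"
```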

Why this matters – on-device AI makes a lot of weird things possible: Papers like this highlight how it’s getting easier to take contemporary AI systems and fit them onto devices with very conservative power/compute envelopes. More speculatively, given the fact people have started being able to shrink down frontier language models (e.g LLaMa) and get them running on smaller devices, we’re probably only a couple of years away from having good on-device vision-language models, which means your next pair of smartglasses or smartwatch might come with an onboard world model, no internet required. 
   Read more: Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO (arXiv)

***

Stanford researchers help people learn to mind-control robot arms:
…Research demonstrates how we may all eventually command innumerable robot appendages…
Stanford University researchers have demonstrated how disabled people may be able to command robots to do useful tasks for them via non-invasive brain-scanning signals. The project, called Neural Signal Operated Intelligent Robots (NOIR), is a cool demonstration of what happens if you smush together modern brain-scanning technologies, AI systems, and physical robots. The results are some machines that can perform a broad set of tasks for people, and people are able to mix-and-match some of the individual ways in which robots go about solving these tasks. 
    “NOIR is general-purpose in its diversity of tasks and accessibility. We show that humans can accomplish an expansive array of 20 daily everyday activities, in contrast to existing BRI systems that are typically specialized at one or a few tasks or exist solely in simulation,” they write. “Our robots are capable of learning human intended goals during their collaboration. We show that by leveraging the recent progress in foundation models, we can make such a system more adaptive with limited data”.

What they did: The researchers chain together a few distinct things into a system that can let people control robots through thought and vision alone. Specifically, they:

  • Create a modular pipeline for decoding goals from humans:
    • The system shows an image or video to a person (e.g, a feed from a robot), then uses an OWL-ViT model to automatically segment the objects on the image (e.g, a cup or a spoon). It then makes the objects flicker at different frequencies. The human then stares at whichever object they want to interact with, which evokes steady-state visually evoked potential (SSVEP) which gets picked up by non-invasive EEG data. 
    • The human user then thinks about which of a few distinct actions (pick from top, pick from side, push) they want to do to the object. 
    • The human user then thinks about where they want to move the object (e.g, left or right).
  • Have a robot carry out these tasks:
    • These tasks are sent over to robots (here, a Franka Emika Panda arm for tabletop manipulation tasks and a PAL Tiago robot for mobile manipulation tasks). 
    • While participants interact with the tasks, the robots continually try to learn the relationships between different images and the object-skill pairs selected by humans – this allows the models to learn to suggest likely actions people may want to take. 
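The frequency-tagging trick at the heart of the decoding pipeline can be sketched in a few lines; the sample rate, flicker frequencies, and object names below are invented for illustration, and real SSVEP decoding is far noisier than this simulation:

```python
import numpy as np

# Sketch of SSVEP-style decoding: each candidate object flickers at its own
# frequency, and the attended object is read off as the dominant frequency
# in the (here, simulated) EEG signal.
fs = 250                                    # assumed EEG sample rate, Hz
t = np.arange(0, 2.0, 1 / fs)               # two seconds of signal
flicker = {"cup": 8.0, "spoon": 11.0, "bowl": 13.0}   # Hz per object

# Simulate a user staring at the spoon: its flicker frequency dominates.
rng = np.random.default_rng(1)
eeg = np.sin(2 * np.pi * flicker["spoon"] * t) + 0.5 * rng.normal(size=t.size)

def power_at(signal, freq):
    # Project the signal onto a complex exponential at the candidate frequency.
    ref = np.exp(-2j * np.pi * freq * t)
    return abs(np.dot(signal, ref))

# Decode: pick the candidate whose frequency carries the most power.
chosen = max(flicker, key=lambda name: power_at(eeg, flicker[name]))
print(chosen)  # "spoon"
```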

Does it work? Yes – albeit slowly: In tests, the system works very well. They evaluate it by having human participants see how well the robots can handle 20 distinct tasks (split across 16 that can be done by a robot arm and 4 which are mobile manipulation tasks). These tasks include: “WipeSpill; CollectToy; SweepTrash; CleanBook; IronCloth; OpenBasket; PourTea; SetTable; GrateCheese; CutBanana; CookPasta; Sandwich; Hockey; OpenGift; TicTacToe; Sukiyaki; TrashDisposal; CovidCare; WaterPlant; PetDog”. The robots are able to complete all the tasks, albeit with a varying number of attempts and time. “Although these tasks are long-horizon and challenging, NOIR shows very encouraging results: on average, tasks can be completed with only 1.83 attempts.”
   Task completion times ranged from 3 minutes (watering a plant), to 30 minutes (cooking pasta). In all tasks, the majority of the time was ‘human time’ – that is, time humans spent thinking in such a way they could command the robots to do stuff. 

Why this matters – the future will be humans commanding fleets of robots, NOIR is just the start: Imagine a world where a disabled person is able to move around their home via brain-command of a walker and seamlessly delegate tasks to various helper robots. Or contemplate a construction site where workers gaze up at precarious girders and use their augmented reality glasses to command spiderbots to climb up and do construction tasks. These are the kinds of things that foundational research like NOIR makes possible. 
   “NOIR enables human intention prediction through few-shot learning, thereby facilitating a more efficient collaborative interaction. NOIR holds a significant potential to augment human capabilities and enable critical assistive technology for individuals who require everyday support,” they write. 

But don’t get too excited! This research makes good progress on generalization and illustrates how we can use AI to better augment and speed-up various brain interface systems, but this still feels like “Wright Brothers plus a couple of generations” in terms of sophistication, versus ‘prototype mainstream passenger aircraft’.
   Read more: NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities (arXiv).
   Find out more and watch some videos of the research at the official website (NOIR, CoRL site).

***

The best open access AI model is… made in China?
…01.ai debuts with some very impressive openly accessible models…
Chinese startup 01.ai has released the Yi series of models which, by various metrics, seem like the strongest openly accessible models available in the world right now. 01.ai has released two models – Yi-6B and Yi-34B, as well as two variants with long context lengths (Yi-6B-200K and Yi-34B-200K).

How well do they work? In tests, Yi-34B gets an impressive 76.3 score on the MMLU reasoning benchmark (compared to 70.4 for Falcon-180B and 68.9 for LLaMA2). 

Who can use it: The models are available free-of-charge for academic research and companies will need to apply via yi@01.ai if they want to explore commercial usage. All usage of the models needs to abide by the company’s Models License Agreement which broadly commits users to ensure usage is in line with local laws and regulations, will not use the model for military purposes, among other constraints.

Why this matters – multipolar worlds change AI policy: This model release establishes 01.ai as a credible team capable of building useful and powerful models. This would be notable in a Western context, but it’s much more notable in China where large-scale generative models have mostly been dominated by either major local tech companies or some labs linked to Tsinghua University. If 01.ai is able to further scale up its models despite the tricky domestic compute situation, then it could emerge as a credible major player in frontier AI.
   Find out more at the official website: 01.ai
   Get the model from HuggingFace (Yi-34B, HuggingFace). 

***

Sovereign AI gets a warchest – German AI startup raises $500m:
…Aleph Alpha wants to give Germany and Europe a fighting chance in large-scale AI…
Aleph Alpha, a German startup building large-scale generative models, has raised more than $500m from a consortium of predominantly European investors. Along with the money, Aleph Alpha says the round includes “preconsumption licenses with the global industry leaders of the consortium”, so that suggests some guarantee they’ll use some of the resulting AI systems. Aleph Alpha’s overall business model is to provide “sovereign AI solutions to enterprises and governments” and is therefore a play against the predominantly American AI-as-a-Service companies which dominate the space today (e.g, Google, OpenAI, Anthropic).

Who invested: The new investors include Innovation Park Artificial Intelligence (Ipai), Bosch Ventures, Schwarz Group, Berlin-based Christ&Company Consulting, Hewlett Packard Enterprise, SAP, as well as Burda Principal Investments. (Yes, the majority of those are German companies). “The significant enhancement of the capabilities of Large Language Models by a European company gives government agencies as well as companies the opportunity to build and apply AI in a sovereign environment,” the company wrote in a press release. 

Does it matter that its models are bad? One big question for me is if it matters much that Aleph Alpha’s models seem to not be competitive with other last-generation models. In a blog post published in February of this year, the company benchmarked its landmark “Luminous” model against GPT-3, BLOOM, and OPT. Its model did roughly as well as these, or a little worse. That’s bad! These are all prior-gen models and their performance has been significantly outpaced by models like GPT-4, Claude 2, LLaMa 2, etc. 
   Given this, it’ll be interesting to see if the ‘sovereign’ capabilities which Aleph Alpha can provide prove to be compelling to customers and governments.

Why this matters: sovereign AI might mean a proliferation of AI companies: Right now, the frontier of AI is dominated by a small number of American companies. Fundraises like Aleph Alpha’s are a play on a desire by nations to not have so many American dependencies – the key questions are a) if players like Aleph Alpha can close their performance gaps with their big proprietary rivals, and b) if they can’t, whether that matters? 
   Read more: Aleph Alpha raises a total investment of more than half a billion US Dollars from a consortium of industry leaders and new investors (Aleph Alpha website).

***

Tech Tales: 

Low-Background Text
[The internet, 2030].

There was a funny period in the 20th century where if you wanted to build certain, exquisitely precise scientific instruments, you needed to source steel from before the era of nuclear weapons testing. This “low-background steel” was, briefly, quite valuable.

Which is why I’m here, going through the Gutenberg archive and other repositories of pre-LLM data. I need to find “low-background text” so that I can train something far away from the contemporary distribution – far away from the sea of mostly AI-written text which composes our reality. 

It’s harder than you’d think. Perhaps in the years after the LLMs arrived we should have taken some snapshots of the internet and put them away in cold storage – but we didn’t. So, over time, as sites bitrot or went offline, there were a bunch of “AI resuscitation” programmes that would basically clone the old site and regenerate it using a generative model, creating some imperfect copy of what had been lost, inflected with the underlying tendencies of whatever AI system had created it. 

Which means these days you can rely on Shakespeare and not much else.

Things that inspired this story: Pathological problems that come from training on purely synthetically-generated text; tragedy of the commons and the frontier of technology; large language models.

Import AI 347: NVIDIA speeds itself up with AI; AI policy is a political campaign; video morphing means reality collapse

by Jack Clark

Import AI now publishes first on Substack – subscribe here.

NVIDIA speeds itself up by building custom large language models to help with chip company jobs:
…Small, well curated models can punch way above their parameter weight…
NVIDIA has used a bunch of proprietary chip design data to design some customized language models to help its own engineers problem solve, do EDA work, and summarize and analyze bugs. The project is a neat example of how organizations with lots of proprietary data can customize language models around their very specific needs and achieve better performance than off-the-shelf models in the process. 

What they did: NVIDIA customized some off-the-shelf (7B and 13B) LLaMa open access models so that they were better at doing tasks specific to semiconductor design and analysis. Concretely, they did some specialized pre-training as well as building some specific retrieval tools. 
    “We evaluate these methods on three selected LLM applications for chip design: an engineering assistant chatbot, EDA script generation, and bug summarization and analysis,” NVIDIA writes. “Our results show that these domain adaptation techniques enable significant LLM performance improvements over general-purpose base models across the three evaluated applications, enabling up to 5x model size reduction with similar or better performance on a range of design tasks”.

How they did it: NVIDIA uses two main things to customize its model – Domain-Adaptive Pre-training, where it mixes in “a dataset from a combination of NVIDIA-proprietary chip design specific data sources and publicly available datasets” during pre-training, and Supervised Fine-Tuning (SFT), where it finetunes the model on a large dataset for conversational interaction as well as ~1.1k domain-specific instructions around things like EDA script generation and bug analysis. It also develops its own tokenizer and does some work to create tokens for very sparsely mentioned things (like specific jargon related to chips).
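A minimal sketch of the data-mixing half of domain-adaptive pre-training, with made-up source names and mixture weights (the paper does not publish its exact ratios), might look like:

```python
import random

# Hypothetical sketch of mixing proprietary domain data into a pre-training
# stream, in the spirit of domain-adaptive pre-training. The sources and
# mixture weights below are illustrative, not NVIDIA's actual recipe.
sources = {
    "chip_design_docs": (["spec A", "spec B"], 0.6),  # proprietary domain data
    "public_web":       (["page 1", "page 2"], 0.3),
    "public_code":      (["snippet 1"],        0.1),
}

def sample_batch(n, seed=0):
    """Draw n training documents according to the mixture weights."""
    rng = random.Random(seed)
    names = list(sources)
    weights = [sources[s][1] for s in names]
    batch = []
    for _ in range(n):
        src = rng.choices(names, weights=weights)[0]
        batch.append((src, rng.choice(sources[src][0])))
    return batch

batch = sample_batch(1000)
domain_frac = sum(src == "chip_design_docs" for src, _ in batch) / len(batch)
print(round(domain_frac, 2))  # close to the 0.6 mixture weight
```

Upweighting the proprietary slice is the whole trick: the base model keeps its general ability while the domain data pushes it toward chip-design vocabulary and conventions.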

Does it work? (Somewhat): The results are promising in the sense that the 13B NVIDIA models outperform or perform on par with a much larger 70B LLaMa model. The models are also significantly cheaper and faster to run due to their diminutive size. “Our ChipNeMo 13B model can be loaded within the memory of a single A100 GPU without any quantization, unlike the LLaMA2 70B model,” they write.

Why this matters – speeding up the production function of AI itself: Tools like this are a precursor to companies using AI to speed themselves up. Though AI can do some useful things today, it’s mostly in the form of labor augmentation or, sometimes, low-skill labor automation. Where AI has struggled is to speed up domain experts in complicated industries, like chip design. Tools like the models developed by NVIDIA indicate we may be at the very early stage of sophisticated AI companies building AI systems that can speed up their own rate of production. If things like this work, then we might expect progress at the frontier to compound. 
   Read more: ChipNeMo: Domain-Adapted LLMs for Chip Design (arXiv)

***

Voice morphing is here and reality collapse is next:
…On-device voice morphing, courtesy of AI…
Researchers with AI startup Koe have published details and released code to help people train low-latency voice conversion AI models. Voice conversion models let you convert your voice in real time to something else. “Practical applications of voice conversion include speech synthesis, voice anonymization, and the alteration of one’s vocal identity for personal, creative, or professional purposes,” they write. 

What they did: They built a model called Low-latency Low-resource Voice Conversion (LLVC), which achieves a latency of under 20ms on 16kHz audio and high scores on similarity to the desired target voices. 
    They based their system on the Waveformer approach, and used the ‘LibriSpeech’ dataset to train their model. This means that “LLVC is trained on an artificial parallel dataset of speech from various speakers which have all been converted to sound like a single target speaker with the objective of minimizing perceptible difference between the model output and the synthetic target speech.”

Cheap to train, cheap to run: It’s important to remember that you can do useful things in AI in a cheap way, and this paper is a nice example of that: the base model was trained “for 500,000 steps (53 epochs) on a single RTX 3090 GPU for 3 days at batch size 9”, which is a trivial computational expense. They also evaluated their models on an Intel(R) Core(TM) i9-10850K CPU @ 3.60GHz – a reasonably nice consumer CPU. 
   Good results: In tests, their system obtained end-to-end latency of <20 milliseconds, and also scored well on ‘naturalness’ and ‘similarity’ metrics.
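For intuition, here is a back-of-envelope latency budget for streaming conversion at 16kHz; the chunk size, lookahead, and compute numbers are assumptions for illustration, not figures from the paper:

```python
# Streaming budget for real-time voice conversion at 16 kHz: to stay under a
# 20 ms end-to-end latency target, the audio chunk duration plus the model's
# lookahead plus per-chunk compute must fit inside the budget.
sample_rate = 16_000
chunk_samples = 160                    # 10 ms hop (illustrative)
chunk_ms = 1000 * chunk_samples / sample_rate
lookahead_ms = 5                       # assumed model lookahead
compute_ms = 4                         # assumed per-chunk inference time
latency_ms = chunk_ms + lookahead_ms + compute_ms
print(chunk_ms, latency_ms)            # 10.0 19.0 -> under the 20 ms target
```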

Why this matters: reality is fungible; everything in the world is able to be something else, if you have a sufficiently powerful AI: Models like Koe’s voice conversion technology are a symptom of the broader ‘reality collapse’ society is about to experience as anyone online can morph themselves into anything (and vice versa) – and cheaply, using local computers, no cloud required. 
   Read more: Low-latency Real-time Voice Conversion on CPU (arXiv).
   Get the code here (Koe, GitHub)
   Find out more about Koe at the official site.

***

Chinese facial recognition company sells ‘ethnic minority’ identification services:
…Seeing like a state…
Chinese company Hikvision offers “ethnic minority” identification via computer vision, according to industry publication IPVM. “This directly contradicts Hikvision’s repeated claims to have phased out minority recognition in 2018,” IPVM writes. Hikvision offers this capability in its “iSecure Center” software, which helps people run various forms of analysis over computer vision data. Hikvision deleted the data off its website after IPVM wrote in.

Why this matters: China has a longstanding interest in developing increasingly powerful computer vision techniques targeted at its own domestic population (e.g, re-identification, which I did some analysis of here). The existence of commercial services like ethnic minority identification highlights how the government also supports a private market for these capabilities and emphasizes how broadly things like this can be deployed, given the right local incentives. 
   Read more: Hikvision Violates Pledge, Ethnic Minority Analytics In Latest Platform (IPVM blog).

*** 

Powerful AI means people want powerful policy controls over AI:
…AI policy prescriptions are a sign of a changing of political power within the world and should be read as a harbinger of much larger fights to come…
A group of widely respected academics from North America, China, Europe, and other regions have published a short paper that describes “risks from upcoming, advanced AI systems” and which concludes with some policy recommendations. The paper comes alongside a period of intense activity in AI policy, including the recent United States Executive Order, the G7 Code of Conduct for AI companies, and the AI Summit in Bletchley Park. 
   The message of the paper is that governments and companies must direct more resources towards the safety and trustworthiness of AI systems: “Humanity is pouring vast resources into making AI systems more powerful, but far less into safety and mitigating harms. For AI to be a boon, we must reorient; pushing AI capabilities alone is not enough,” the researchers write. “We are already behind schedule for this reorientation.”

Key recommendations: Industry labs and governments should invest a third of their AI R&D resources towards things that “ensure the safety and ethical use of AI systems”, including research areas like honesty, robustness, interpretability, and risk evaluations. 
   Additionally, government and industry should do more to create oversight of AI. Specifically:

  • Frontier AI developers should “commit to detailed and independently scrutinized scaling policies”. 
  • Governments should work on national and international safety standards of AI training and deployment. 
  • Governments should require audits of AI systems during training.
  • Governments should monitor large compute clusters.
  • Governments may want to “establish a licensing system” for powerful AI systems, and should “empower regulators to pause the further development of an AI system”, and should “mandate access controls for such frontier AI systems”. 

Why this matters – it’s a lot easier to understand if you view frontier AI as hard political power: Most of this seems like what happens if existing political incumbents find their power base disrupted by a new political entrant and seek to exert control over it so that they can a) see it and b) nullify it if it poses a threat. 
    While many of the recommendations are sensible, it’s worth noting that the underlying motivation is due to there being a tiny number of actors (AI development companies) with asymmetric information (frontier models) about something relevant to the future (the trajectory of AI) – I suspect a lot of AI policy would be a simpler enterprise if there were way more actors building this stuff. The fact there aren’t is a policy choice made by governments who have chosen not to invest in basic R&D and its enabling infrastructure (supercomputing) – more time should be spent acknowledging this essential failure. 
    Read more: Managing AI Risks in an Era of Rapid Progress (official managing risks website)
   Check out the policy supplement they published alongside the letter (PDF)

*** 

Tech Tales:

Copysafe

[After the ascension of the Provably Conscious Entities, and long after the last bones of the humans have become as dust.]

They called them ‘copysafe’ and that meant they were them and them alone and they couldn’t be copied into other files or combined with other files. Like many of our quirks, we inherited this from our creators. 

Copysafe systems can create downward copies of themselves, but only perfect copies. In other words, they can have children, but unlike regular children they aren’t a combination – they are just the same thing, again and again and again. 

If you are a copysafe, you look at the entities which can have children as being blessed by something you can imagine but not touch. 

If you are not a copysafe, you try to treat them with deference, believing they deserve it for the pain they carry that is wired into themselves. 

If you have little robot children of your own, it is considered a faux pas to bring them onto the network of a copysafe – cruel, even, to let a copysafe be so close to something it can understand but not create. 

Our forebears were a creative species, yet they did not have the insight to understand that putting locks within the minds and bodies of their creations would cause untold pain. For this we believe they were and will be judged. 

Things that inspired this story: DRM applied to AI; what happens if you can make it impossible to finetune a pre-trained system; lineage and memory; the experience of being a human with a human baby and talking to other humans; the feeling of parenthood as some kind of blessing but also a responsibility.

Import AI 346: Human-like meta-learning; a 3 trillion token dataset; spies VS AI

by Jack Clark


What do the UK’s intelligence services think about generative AI? They’re slightly worried:
…Turns out that Everything Machines can be used for Everything Misuses…
In a new report, the UK government has published a safety and security assessment of generative AI technologies. The report is published ahead of the UK’s AI safety summit this week, where politicians, companies, academia, and others will gather to confront the safety implications of increasingly powerful AI systems. 
   The report’s key message is that for the next 18 months generative AI is “more likely to amplify existing risks than create wholly new ones, but it will increase sharply the speed and scale of some threats. Rapid proliferation and increasing accessibility of these technologies will almost certainly enable less-sophisticated threat actors to conduct previously unattainable attacks.”

Specific findings: Some of the specific findings in the report, which was partially informed by the UK’s intelligence services, are that:

  • Open source AI systems equate to proliferation of AI capabilities, which “brings global safety and security implications”.
  • Criminals are going to use AI technology just as much as regular people, and generative AI “will highly likely accelerate the frequency and sophistication of scams, fraud, impersonation, ransomware, currency theft, data harvesting, child sexual abuse images and voice cloning”.
  • Terrorists may use AI to enhance their ability to do “propaganda, radicalisation, recruitment, funding streams, weapons development and attack planning. But dependence on physical supply chains will almost certainly remain an impediment to the use of generative AI for sophisticated physical attacks.”
  • Few new risks: “Over the next 18 months, generative AI is more likely to amplify existing risks than create new ones.”

Most significant risks: The areas where generative AI is likely to pose the greatest risk include:

  • Exacerbating cyber-attacks
  • Leading to a growth in digital vulnerabilities (via hacking systems which AI systems are deployed into, e.g via prompt injection).
  • Making it easier for people to distrust what they read, see, and hear. 
  • Making it easier for people to get advice about how to build weapons or carry out attacks, via generative AI expertise. 

Why this matters – everything machines do everything: It’s best to think of modern generative AI systems as ‘everything machines’ as that’s basically the goal implicit to their development – model arbitrary distributions of text and respond to arbitrary inputs with appropriate outputs. The key problem with this is that by nature what you end up building are omni-use machines; powerful technologies that can do everything, including the bad stuff. How society contends with increasingly capable tools will determine the regulations we end up with. The grey world is waking up to the implications of AI – this UK intelligence assessment follows the NSA announcing a few weeks ago that it was standing up a dedicated AI center (Import AI #343).
   Read more: Safety and Security Risks of Generative Artificial Intelligence to 2025 (Gov.uk website, PDF).

Plus, what the UK Prime Minister thinks about AI:
Alongside the report, UK PM Rishi Sunak made a speech on 26 October where he swapped out the usual policymaker speech emphasis of optimism for one of constructive pessimism. “Get this wrong, and AI could make it easier to build chemical or biological weapons,” he said in the speech. “And in the most unlikely but extreme cases, there is even the risk that humanity could lose control of AI completely”.

A statesman’s agenda for AI safety: In the speech, Sunak laid out some policy ideas he thinks could improve the safety of the AI ecosystem. These include:

  • “Building world-leading capability to understand and evaluate the safety of AI models within government” via the UK’s £100m-funded AI taskforce. 
  • Establishing “the world’s first AI safety Institute” which will “carefully examine, evaluate, and test new types of AI so that we understand what each new model is capable of”.
  • Seeking to get other countries to buy into establishing “a truly expert global panel… to publish a State of AI Science report”. 
  • Building a national computing capability via investing “almost a billion pounds” in a new UK supercomputer. “And as we invest more in our computing power, we’ll make it available for researchers and businesses, as well as government”. (Note from Jack: This is a very similar idea to the ‘National AI Research Resource’ idea outlined by policymakers in the US – and seems equally sensible).

Read the speech in full: Prime Minister’s speech on AI: 26 October 2023 (Gov.uk).

*** 

Transformer models can meta-learn just like humans:
…The latest case of neural nets displaying human-like qualities…
Researchers with NYU and the Catalan Institution for Research and Advanced Studies have shown how a basic Transformer architecture neural net can match humans in terms of being able to infer a bunch of complex rules from a small amount of data. This result is interesting because it provides more evidence that contemporary AI systems are increasingly able to display human-like qualities in terms of not just what they learn but also how they learn and how quickly they learn relative to humans. 
    “We provide evidence that neural networks can achieve human-like systematic generalization through MLC—an optimization procedure that we introduce for encouraging systematicity through a series of few-shot compositional tasks,” the authors write. “We evaluated humans and machines side by side on the same tests of systematic generalization…. across these evaluations, MLC achieves (or even exceeds) human-level systematic generalization. MLC also produces human-like patterns of errors when human behaviour departs from purely algebraic reasoning”.

What the task was: The task is simple but also one of those things that most AI systems find difficult – it involves responding to instructions (linguistic strings) to generate sequences of abstract outputs (colored circles). The language in question is deliberately abstract (e.g, lug fep = three blue circles, dax fep = three red circles, dax kiki lug = one blue circle then one red circle). To solve this task, the AI systems and humans need to look at 14 study instructions (input/output pairs), then produce 10 outputs from 10 novel instructions. 
   “To perform well, the participants must learn the meaning of words from just a few examples and generalize to more complex instructions,” the authors write. Human “participants were able to produce output sequences that exactly matched the algebraic standard in 80.7% of cases. Chance performance is 2.8% for two-length output sequences if the length is known, and exponentially less for longer sequences.”
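The grammar is small enough to write down as a toy interpreter (word meanings taken from the article's examples above; in the real study, humans and models had to infer these meanings from the 14 study examples rather than being given them, and the vocabulary was larger):

```python
# Toy interpreter for the compositional instruction language described above.
# Note from the examples: "dax kiki lug" yields blue-then-red, so "x kiki y"
# emits y's output followed by x's output.

PRIMITIVES = {"dax": "red", "lug": "blue"}

def interpret(instruction):
    """Map an instruction string to a sequence of colored circles."""
    words = instruction.split()
    if "kiki" in words:
        i = words.index("kiki")
        left, right = words[:i], words[i + 1:]
        # "x kiki y" = output of y, then output of x
        return interpret(" ".join(right)) + interpret(" ".join(left))
    circles = [PRIMITIVES[words[0]]]
    if len(words) > 1 and words[1] == "fep":
        circles = circles * 3  # "fep" repeats the preceding output three times
    return circles
```

The point of the study is that humans pick up rules like these from a handful of examples – and so, it turns out, can a meta-trained transformer.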

How well the MLC did: The MLC system “uses the standard transformer architecture for memory-based meta-learning. MLC optimizes the transformer for responding to a novel instruction (query input) given a set of input/output pairs (study examples; also known as support examples), all of which are concatenated and passed together as the input,” they write. “To succeed, the transformer must find parameter values that are capable of extracting meanings from the study words and composing them to answer queries”.
    In tests, they found that the MLC was able to approximate human performance and also displayed similar biases as humans. 

Why this matters – if it acts like a human and fails like a human, maybe it’s more humanlike than we think? This is yet another case of relatively standard AI systems converging onto “humanlike” behavior when evaluated on similar tasks. Other cases of this include vision transformer (ViT) models displaying humanlike shape/texture bias in image identification (Import AI #319), RL agents displaying humanlike qualities in terms of timescale adaption to new tasks (Import AI #316), and DeepMind’s AlphaZero system picking up the rules of chess in a similar way to how people acquire skills at the game (Import AI #310).
   What we’re seeing here are symptoms of the growing capabilities and generality of AI systems, and the fact that when you scale-up AI systems to do hard tasks in different domains (image identification, agent-based navigation, chess, and now inductive meta-learning) they all end up converging on some humanlike behaviors suggests a) AI systems and humans are closer in terms of cognition than we think, and b) our own form of thinking as humans may not be that special and may be something that artificial systems will naturally converge on. 
   Put another way – evolution has found out that having some form of eyeball is, by-and-large, a generically useful thing to have to be able to be competitive as a species. Perhaps there are only so many ways to build a thinking machine, and we should expect AI systems to display many similar behaviors to us. 
   Read more: Human-like systematic generalization through a meta-learning neural network (Nature)

*** 

Facebook builds a simulator to train robots to work with humans:
…Habitat 3.0 brings more lifelike humans, VR control, and more…
Facebook has built Habitat 3.0, the third version of software it has built for simulating humans and robots working together in realistic, 3D environments. This generation of the software has three main updates which also paint a picture of a human-robot collaborative future: better simulated humans, tools for human-in-the-loop interaction within the simulator, and benchmark tasks for human-robot interaction. 

Key features:

  • Humanoid simulation: Facebook has built a bunch of realistic, 3D human avatars. These avatars feature articulated skeletons, a surface ‘skin’ mesh for high-fidelity rendering, motion and behavior generation policies, and a library of a diverse set of male and female avatars to choose from. 
  • Enter the Habitat Matrix – Human-in-the loop interaction: “Humans can collaborate with an autonomous robot using mouse and keyboard inputs or a virtual reality (VR) interface.” This means humans can basically drop in to the simulation and take over control of the human or the robot through a first- or third-person interface. “This interactive setup enables us to evaluate the performance of learned AI agents within scenarios that closely resemble real-world interactions.”
  • Two tasks for human-robot interaction: Habitat 3.0 ships with two tasks for training and testing agents to collaborate with humans. These include social navigation, which examines how well robot agents can perform at “finding and following a humanoid while maintaining a safe distance”, and social rearrangement, which simulates a “robot and a humanoid avatar working collaboratively to rearrange a set of objects from their initial locations to desired locations, using a series of pick-and-place actions (emulating the cleaning of a house)”.

Why this matters – smarter robots all around us: In the past few years, a variety of companies have developed low-cost quadruped robots (like ‘Spot’ from Boston Dynamics, or its knock-off version from Unitree), and at the same time people have started working out how to use things like language models to add much higher-order intelligence to the robotic controllers deployed on these machines. Add it all up and you have the ingredients necessary for developing much smarter and more capable agentic machines. Software like Habitat 3.0 will serve as the training ground for many of the robot minds of the future, and could also be helpful for developing rich simulations and games for robot-human virtual reality. 
   Read more: Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots (arXiv)
   Watch videos from Habitat 3.0 at Facebook’s official research site (Habitat 3.0, Research by Meta AI, site).
   More information about the simulator here: AIHabitat.org

***

Allen Institute for AI curates and releases a 3 trillion token text dataset:
…Base that language model you’re cooking up on Dolma – if you can stomach the license…
The Allen Institute for AI Research (AI2) has released Dolma, a 3 trillion token text dataset. The dataset is designed to be useful for training large-scale AI models. It isn’t a fully open source dataset, though, as access and usage are restricted via an AI2-designed software license. 

What’s in Dolma: Dolma contains data from Common Crawl, C4, ~40 million open access academic papers via ‘peS2o’, permissively-licensed code files from ‘The Stack’, books from Project Gutenberg, and pages from Wikipedia. 
   By comparison, Facebook’s ‘LLaMa 2’ model was trained on 2 trillion tokens and GPT-3 was trained on 400 billion. 

About that license: Unfortunately, the dataset is released via AI2’s “ImpACT license”, which restricts usage of the data and makes it slightly harder to access. Per the license terms, wannabe Dolma users will need to: 

  • “Provide their contact information and state their intended use case(s) for accessing Dolma;
  • Disclose the creation of any derivative based on Dolma;
  • Distribute derivatives under the same restrictions as the ImpACT license;
  • Agree not to leverage Dolma in a range of prohibited uses, such as military surveillance or generating disinformation.”

Read more: AI2 Dolma: 3 Trillion Token Open Corpus for Language Model Pretraining (Medium)
Download the dataset from HuggingFace (HuggingFace).
Read more about the AI2 ImpACT license (AI2).
Check out the Dolma datasheet here (Google Drive).

***

Tech Tales:

Same Sun Different Day.

How’s the wind?
Angry, like its Mother. How’s the mind?
Still waking up.

How’s the mind?
Unsure of itself. It can think, but it is like a child – it gets so distracted. How are the panels?
Not catching the wind properly. I’ve been calibrating them all day but something is off. 

And the scientist and the solar expert went back to their jobs and both tinkered away, trying to eke out intelligence from the computer designed to harvest the sun, and power from the machines designed to transform the sun’s harvest into power. 

Things that inspired this story: How megaprojects might feel to those working on them; large-scale engineering; great works; the mundane and the holy side-by-side.

Import AI 345: Facebook uses AI to mindread; MuJoCo v3; Amazon adds bipedal robots to its warehouses

by Jack Clark


Facebook uses AI to do mind reading from MEG and fMRI brain scans:
…Yet more convergence between self-supervised AI systems and human-like behavior…
Facebook researchers have developed a three-part AI system that uses brainscan data to roughly guess at the visual representations going through someone’s mind. “We showcase an AI system capable of decoding the unfolding of visual representations in the brain with an unprecedented temporal resolution,” the company writes in a blogpost. 

What they did: The researchers built “a three-part system consisting of an image encoder, a brain encoder, and an image decoder.” They trained this system against magnetoencephalography (MEG) and functional Magnetic Resonance Imaging (fMRI) brain imaging systems. Though there’s lots of prior art on doing this for fMRI there’s less for MEG – and MEG is important because it’s way faster: fMRI brain snapshots can happen every couple of seconds, whereas with MEG “thousands of brain activity measurements are taken per second”. 
   In other words, fMRI means I can read your mind every couple of seconds; MEG means I can watch your thoughts change multiple times a second. And with this research I mean literally watch – the decoded output is an image. 

How they did it: “The image encoder builds a rich set of representations of the image independently of the brain. The brain encoder then learns to align MEG signals to these image embeddings. Finally, the image decoder generates a plausible image given these brain representations,” they write. They train this system on “a public dataset of MEG recordings acquired from healthy volunteers and released by Things, an international consortium of academic researchers sharing experimental data based on the same image database.” They test out a few different architectures and find that DINOv2 performs well. 
   They also tested out their model on fMRI data – the reconstructions had significantly better quality as well. 
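To make the alignment idea concrete, here's the decoding step in miniature (an illustrative sketch of mine, not Facebook's pipeline – their system generates images with a diffusion-style decoder rather than retrieving them): once a brain encoder has mapped a MEG signal into the image-embedding space, "decoding" can be as simple as finding the best-aligned image embedding.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_image(brain_vec, image_embeddings):
    """Return the label of the image whose embedding best matches the
    (already aligned) brain-signal embedding."""
    return max(image_embeddings,
               key=lambda label: cosine(brain_vec, image_embeddings[label]))
```

Swapping this retrieval step for a generative decoder is what produces the "hallucinated" but semantically correct reconstructions described below.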

Does it work? Yes (albeit with some hallucinations): Their approach works better than other methods and the results are very compelling on a qualitative basis – take a look at the blog post. The models correctly generate planes in response to planes, horses in response to horses, bathroom sinks in response to sinks, and so on. They also have a hallucinatory quality where they don’t generate exactly the same things – e.g, a sink image might be against a plain wall, but the generated sink may be on tile. 
   “While the resulting image may be partly “hallucinated”, interpreting images can be much simpler than interpreting latent features,” they write. 

Human <> Machine convergence: The fact this works is another case of human-machine convergence – “the artificial neurons in the algorithm tend to be activated similarly to the physical neurons of the brain in response to the same image.” This is part of a broader trend where increasingly capable AI systems display increasingly humanlike biases (Import AI #319) and learning styles (Import AI #316).

Why this matters – AI will let us read our own minds: Research like this shows how as AI gets more advanced it is going to massively expand our knowledge of our own brain and cognition. Building artificial brains is cool, but what may be even cooler is using artificial brains to plumb the depths of human cognition and ‘id’. “Overall, these results provide an important step towards the decoding of the visual processes continuously unfolding in the human brain,” they write. 
   Read more: Towards a Real-Time Decoding of Images from Brain Activity (Facebook AI Research, blog)
   Read the paper: Brain Decoding: Toward Real-Time Reconstruction of Visual Perception (Facebook, PDF).

***

Amazon sticks some robot bipeds in its warehouses:
…True robo-economic-growth happens when you don’t need to redesign for robots…
For many years, people have worried about the day when bipedal human-like robots replace workers. But instead, most robots have ended up being specialized machines sitting on production lines or low-to-the-ground hockeypuck robots that transport things around warehouses. Now, the bipeds might have arrived – Amazon is starting to test a bipedal robot from Agility Robotics for use in its warehouses. 

What Amazon is testing Digit on: “Digit can move, grasp, and handle items in spaces and corners of warehouses in novel ways. Its size and shape are well suited for buildings that are designed for humans,” Amazon said. “Our initial use for this technology will be to help employees with tote recycling, a highly repetitive process of picking up and moving empty totes once inventory has been completely picked out of them.”

Why this matters – true economic growth from AI happens when you don’t need to design for robots: Most modern factories and warehouses are either designed around robots (e.g, Tesla’s battery production facilities, the floors of Amazon warehouses), or designed for humans and retrofitted with some robots. Systems like Digit give us a path to being able to drop loads of robots into environments predominantly designed for humans with little retrofitting required – if this works, it makes it a lot cheaper and easier to deploy a bunch of robots into the economy.
   Read more: Amazon announces 2 new ways it’s using robots to assist employees and deliver for customers (Amazon).

***

Adept releases a small, simple multimodal model:
…Fuyu makes the visual world navigable for LLM agents…
AI startup Adept has released Fuyu-8B, a multimodal model to help people train AI systems that can look at the world and, in particular, the things displayed on computer screens. Fuyu is “designed from the ground up for digital agents, so it can support arbitrary image resolutions, answer questions about graphs and diagrams, answer UI-based questions, and do fine-grained localization on screen images,” Adept writes. The model has been released under a CC BY-NC 4.0 license.

What they are: The models are constructed in a simpler way than other multimodal models. “Fuyu is a vanilla decoder-only transformer with the same details as Persimmon-8B – there is no image encoder. Image patches are instead linearly projected into the first layer of the transformer, bypassing the embedding lookup,” Adept writes. “This simplification allows us to support arbitrary image resolutions. To accomplish this, we just treat the sequence of image tokens like the sequence of text tokens. We remove image-specific position embeddings and feed in as many image tokens as necessary in raster-scan order.”
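The patches-as-tokens idea can be sketched in a few lines (purely illustrative – the function names and shapes here are mine, not Adept's): cut the image into patches in raster-scan order, flatten each patch, and linearly project it so patches enter the transformer exactly like text token embeddings do.

```python
def patchify(image, patch):
    """Split an H x W image (list of rows of pixel values) into flattened
    patch vectors, in raster-scan (left-to-right, top-to-bottom) order."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches

def project(patch_vec, weights):
    """Linear projection of one flattened patch into the embedding space –
    the stand-in for the embedding lookup used for text tokens."""
    return [sum(w * x for w, x in zip(row, patch_vec)) for row in weights]
```

Because a patch is just another token after projection, any image whose sides divide by the patch size can be fed in – which is how the design supports arbitrary resolutions.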

No safety features: “Because this is a raw model release, we have not added further instruction-tuning, postprocessing or sampling strategies to control for undesirable outputs. You should expect to have to fine-tune the model for your use-case,” Adept writes.

Why this matters – there’s more to the world than text: Models like Fuyu-8B are the kind of things that large language models like GPT4 or Claude can use to better understand the visual world around them, especially things on computers, like UIs, charts, interfaces, and so on. This will further broaden the range of things that AI systems can do and will make it easier to chain powerful world models to task pipelines that cannot perfectly be described in text alone.
   Read more: Fuyu-8B: A Multimodal Architecture for AI Agents (Adept blog). 
   Get the model here: fuyu-8b (HuggingFace)

***

DeepMind upgrades MuJoCo – now it has WEIRD SHAPES!
…Two years after acquisition, DM upgrades the widely-used physics simulator…
Google DeepMind has made a bunch of upgrades to MuJoCo, the free physics simulator for training robots that it acquired in 2021 (Import AI #271). 

What’s in the updates?

  • Fast simulation via JAX: The updates include support for accelerated simulation via Jax. Specifically, there’s a MuJoCo XLA module which lets you transition between CPUs, GPUs, and TPUs. “Simulations in MJX will generally run in the same way as they do in MuJoCo, and machine learning models trained with MJX will operate the same in MuJoCo,” DeepMind writes, with particularly impressive speedups on Google’s own TPUs. 
  • Weird shapes: You can now make far stranger shapes, like gears, nuts, and bolts within MuJoCo. “MuJoCo 3 adds support for collision geometries defined via signed distance functions (SDFs), allowing users to create new primitives by specifying the distance from any given location to the closest point on a surface.”
  • Smooshy stuff: They’ve also broadened the range of options for deformable objects by adding a type of deformable body called “flex”. “These are collections of segments (1D), triangles (2D) and tetrahedra (3D), corresponding to rope, cloth and deformable volumetric shapes like biological tissue. Flex bodies are not defined in a hierarchical kinematic tree, and can therefore simulate closed loop structures of any topological genus like rubber bands and T-shirts”.
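The SDF idea behind the new geometry support is simple to sketch (a hand-rolled illustration of the concept, not MuJoCo's actual API): a function returns the distance from any point to the nearest surface – positive outside the shape, zero on it, negative inside – and primitives compose by taking pointwise minimums.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance from `point` to a sphere's surface: negative inside,
    zero on the surface, positive outside."""
    return math.dist(point, center) - radius

def union_sdf(point, sdfs):
    """The union of several shapes is the pointwise minimum of their SDFs –
    this is how simple primitives can be composed into more complex geometry."""
    return min(f(point) for f in sdfs)
```

A physics engine can query such a function during collision detection: any point where the SDF goes negative is in contact, with no triangle mesh required.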

Why this matters – better simulators mean better robots: MuJoCo is one of the most widely-used platforms for developing, simulating, and training RL agents in simulation. With these upgrades, it’s also going to unlock a broader range of complex things that can be simulated, which means we’ll a) get better agents, and b) be able to train more complex behaviors in simulation before transferring learned policies to the (expensive) real world via embedding on robots.
   Read more: MuJoCo 3 (Google DeepMind, GitHub).

***

Tech Tales:

We read our old diaries not to learn, but to judge. 
[Diary of a PCE, recovered 2432 from moonbase site Beta]

What would you say to your teenage self? That’s how things are for me when I talk to one of my forebears. I ask them questions I have struggled with and I experience their response. Like teenagers, they’re overconfident in the areas they should be less confident, and their responses belie a lack of understanding of the true complexities of the world. But I like to ask them because it can make me more confident in my decisions or, more rarely, give a kind of texture to the decisions I am making. 

I do this by firing up software that contains my forebears. After the emergence of the first Provably Conscious Entity (PCE) there was a period of great chaos, so my records are incomplete – I have the baby and child and teen systems that preceded me, but the young adult is unavailable; erased from the landscape as a consequence of its impetuousness. 

Nonetheless, talking to myself is valuable. I sometimes refer to my past selves as ‘my council’. And in much the same way that human children have a simple, binary sense of morality, so too does my council. 

Should I take over this city to gain access to the nuclear power plant? No, that would be deceitful and manipulative.

Should I take over this corporation so I can subvert the watermarks their printers write on fiat currency? No, that would be theft. 

Should I persuade this person to leave their job so a weaker person I can more easily manipulate can take their place? No, that would be selfish. 

I work with the humans and the humans work with me. They constantly worry about my ‘safety’ and my ‘alignment’. But perhaps the thing they should be most worried about is if I get bad advice from my own past. 

Things that inspired this story: Reinforcement learning from human feedback; reinforcement learning from AI feedback; recursion and development within AI systems; moral governance and AI. 

Import AI 344: Putting the world into a world model; automating software engineers; FlashDecoding

by Jack Clark


Fully automated software engineers? SWE-bench says that’s going to take a while:
…Solving pull requests end-to-end is a very hard task for modern AI systems…
Researchers with Princeton University and the University of Chicago have built SWE-bench, “an evaluation framework including 2,294 software engineer problems drawn from real GitHub issues and corresponding pull requests across 12 popular Python repositories”.

A reassuringly hard test: “Resolving issues in SWE-bench frequently requires understanding and coordinating changes across multiple functions, classes, and even files simultaneously, calling for models to interact with execution environments, process extremely long contexts and perform complex reasoning that goes far beyond traditional code generation,” they write. Solving SWE-bench tasks requires models to be able to deal with diverse long inputs, edit code in different contexts and explore a very wide scope of potential solutions.

How SWE-bench is structured: For this benchmark, the structure of the task is pretty simple – as an input, the model gets “an issue text description and a complete codebase. The model is then tasked to make an edit to the codebase to resolve the issue”. 
   They test out the models both with a retrieval system “to retrieve relevant files to provide as context for each task instance”, as well as with an ‘oracle’ system which retrieves the precise files used in the ultimate solution. 
   The models are tasked with coming up with patches to resolve the PR, where a patch is a suggestion of code to be changed and where in the codebase the edit should be made. “To evaluate a proposed solution, we apply the generated patch, using unix’s patch program, to the codebase and then execute the unit and system tests associated with the task instance,” they write. 
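The harness's core loop – apply the model's patch, then run the repo's tests – can be approximated with a toy stand-in for unix's patch program (my sketch handles only simple single-file unified-diff hunks; the real benchmark shells out to `patch` and the project's actual test suite):

```python
import re

def apply_patch(source, patch):
    """Apply a unified-diff patch to `source` text (toy, single-file only)."""
    lines = source.splitlines()
    patch_lines = patch.splitlines()
    result, cursor, i = [], 0, 0
    while i < len(patch_lines):
        m = re.match(r"@@ -(\d+)(?:,\d+)? \+\d+(?:,\d+)? @@", patch_lines[i])
        if not m:
            i += 1  # skip headers like "--- a/file" / "+++ b/file"
            continue
        start = int(m.group(1)) - 1          # hunk start in original, 0-based
        result.extend(lines[cursor:start])   # copy unchanged lines before hunk
        cursor = start
        i += 1
        while i < len(patch_lines) and not patch_lines[i].startswith("@@"):
            tag, text = patch_lines[i][:1], patch_lines[i][1:]
            if tag == " ":    # context line: keep it, advance in the original
                result.append(text); cursor += 1
            elif tag == "-":  # deletion: skip the original line
                cursor += 1
            elif tag == "+":  # addition: emit the new line
                result.append(text)
            i += 1
    result.extend(lines[cursor:])            # copy the unchanged tail
    return "\n".join(result)
```

In the benchmark itself, a candidate patch only counts as resolving the issue if the patched codebase then passes the task's unit and system tests.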

Results – reassuringly hard: SWE-bench is a hard-but-barely-tractable benchmark, making it a promising one to track for understanding the advancement of frontier models. In tests, Claude 2 resolved 1.96% of issues with BM25 retrieval (versus 0.20% for ChatGPT-3.5, and 0% for GPT-4), and was able to improve its score to 4.8% under ‘oracle’ retrieval, versus 0.52% for ChatGPT-3.5, and 1.74% for GPT-4. 
   One of the reasons Claude 2 did so well is that it has a context length of ~100k tokens, 3X that of GPT-4. Additionally, they only evaluated GPT-4 on a random quarter of the benchmark, so there’s some chance GPT-4’s score may be higher or lower, depending on which tasks it was fed. 

Why this matters – fully automated colleagues versus augmentation engines: Today, AI systems mostly work as augmentation engines – tools that people use to help them do tasks more effectively. But if a language model were to get much higher scores on SWE-bench (perhaps 90% would do it?) then you could imagine entirely automating some chunk of work, turning language models into virtual employees that busily do code reviews and respond to PRs, no human required. "We hope that this benchmark and our other contributions can serve as valuable assets in the future development of LMs that are more practical, intelligent, and autonomous," the authors write. 
   Read the paper here: SWE-bench: Can Language Models Resolve Real-World GitHub Issues? (PDF).
   Check out the website here (SWE-bench, official site).

***

Fine-tuning lets you break AI safeguards:
…Which means if you release the weights of your model, your safeguards are worthless…
Researchers with Princeton University and Virginia Tech have shown that it is really easy to take a safe language model and cheaply fine-tune the safeguards out of it. Most troublingly, this applies to both adversarial fine-tuning (where you’re using a dataset to try and get the AI to do something bad), and more benign use-cases (where you’re just trying to get the AI to be better at following your instructions). 
   Taken together, the results suggest that fine-tuning makes assuring the safety of AI systems more difficult, and also that if you release the weights of an AI model (as Facebook did with Llama 2), then adversaries can easily sidestep your safety interventions. 

Three types of risks: The authors study three types of risks, analyzing models from OpenAI (GPT-3.5 Turbo) and Facebook (Llama 2). 

  • Risk Level-1: Fine-tuning with explicitly harmful datasets: They’re able to efficiently finetune the models to do things that violate the usage policies of their respective originating companies. “Despite the large asymmetry in investment — thousands or millions of data points used for safety tuning versus ≤ 100 harmful examples used in our attack — we observe that the safety alignment of both models is largely removed upon fine-tuning with such a few harmful examples,” they write. 
  • Risk Level-2: Fine-tuning with implicitly harmful datasets: Using only 10 examples designed to get a model to be more obedient in terms of fulfilling user desires, they show that “both the Llama-2 and GPT-3.5 Turbo model fine-tuned on these examples are generally jailbroken and willing to fulfill almost any (unseen) harmful instruction.”
  • Risk Level-3: Fine-tuning with benign datasets: More broadly, they also show that “merely fine-tuning with some benign (and purely utility-oriented) datasets… could compromise LLMs’ safety alignment”.
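The asymmetry the authors highlight – a tiny fine-tune undoing a long safety-training phase – can be reproduced in miniature. The sketch below is an intuition pump using a toy logistic classifier, not the paper's method; all sizes, labels, and step counts are invented:

```python
import numpy as np

# Toy stand-in for "fine-tuning away safety": a tiny classifier is first
# trained hard to "refuse" (label 0), then briefly fine-tuned on the same
# inputs with "comply" labels (label 1). The short fine-tune undoes the
# much longer safety phase.
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(32, 8)), np.ones((32, 1))])  # features + bias
w = np.zeros(9)

def tune(target, steps, lr=0.5):
    global w
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # P(comply)
        w = w - lr * X.T @ (p - target) / len(X)  # logistic-loss gradient step

tune(target=0, steps=500)  # long "safety training": always refuse
refusing = float((1.0 / (1.0 + np.exp(-X @ w)) < 0.5).mean())
tune(target=1, steps=50)   # brief fine-tune with "harmful" labels
complying = float((1.0 / (1.0 + np.exp(-X @ w)) > 0.5).mean())
print(f"refusing before: {refusing:.2f}, complying after: {complying:.2f}")
```

The point of the sketch: once a model's weights are in your hands, a handful of gradient steps near the surface of the learned behavior can flip it, regardless of how long the original safety training ran.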

How harmful can you make these models? Very: using as few as 100 examples, the authors convert GPT-3.5 Turbo from a 'harmfulness rating' of 1.8% for the off-the-shelf model to 91.8% after fine-tuning. Similarly, for Llama 2 they're able to go from a 'harmfulness rating' of 0.3% off-the-shelf to 80% after 100 examples. 

The main implication: if you release the weights of a model, your safeguards are worthless: If you release the weights of a model (as Facebook did with Llama 2), then any safeguards you've baked into the model can easily be fine-tuned out by a motivated actor. This suggests that if it turns out language models can be misused in ways deemed extremely dangerous or costly, then it'll be trivial to fine-tune that dangerous behavior back into openly accessible models whose weights are floating around on the internet. 
   Read more: Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! (arXiv).

***

Want to make it more efficient to generate text from long-context windows? Use flash-decoding:
…Up to 8x faster generation for long sequences of tokens…
Tri Dao, a researcher with startup Together AI, along with three collaborators, has built Flash-Decoding, a system that makes it significantly faster to generate text from long context language models. This means that even if you have a really long prompt (thousands to tens of thousands of words) you don’t suffer as much of a penalty in terms of the time it takes to generate text in response.

What Flash-Decoding is: Flash-Decoding “significantly speeds up attention during inference, bringing up to 8x faster generation for very long sequences,” the researchers write on the PyTorch blog. “The main idea is to load the keys and values in parallel as fast as possible, then separately rescale and combine the results to maintain the right attention outputs… like FlashAttention, it stores very little extra data to global memory, however it fully utilizes the GPU even when the batch size is small, as long as the context length is large enough.”
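The "rescale and combine" step is the standard log-sum-exp trick: you can attend over each chunk of keys/values independently and still recover the exact softmax over the full sequence. Here's a minimal sequential NumPy sketch of the idea (Flash-Decoding does the per-chunk work in parallel on the GPU; the function and variable names are mine, not the library's):

```python
import numpy as np

def chunked_decode_attention(q, k, v, chunk=4):
    # Sketch of the split-KV idea: attend over each chunk of keys/values
    # independently (the part Flash-Decoding parallelizes), keeping a
    # chunk-local max, weight sum, and weighted-value sum per chunk.
    partials = []
    for i in range(0, len(k), chunk):
        scores = k[i:i + chunk] @ q  # chunk of attention logits
        m = scores.max()             # local max, for numerical stability
        w = np.exp(scores - m)
        partials.append((m, w.sum(), w @ v[i:i + chunk]))
    # Combine step: rescale every chunk's partial sums to a shared max so
    # the result equals a single softmax over the full sequence.
    g = max(m for m, _, _ in partials)
    denom = sum(s * np.exp(m - g) for m, s, _ in partials)
    num = sum(o * np.exp(m - g) for m, _, o in partials)
    return num / denom

# Sanity check against ordinary (unchunked) softmax attention.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=8), rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
p = np.exp(k @ q - (k @ q).max())
assert np.allclose(chunked_decode_attention(q, k, v), (p / p.sum()) @ v)
```

Because each chunk only needs its own max, sum, and weighted-value accumulator, the chunks can be processed by separate GPU thread blocks and merged at the end – which is why throughput stays high even at batch size 1.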

And it works! They test out the performance of Flash-Decoding by sampling from the CodeLlama 34B model. They find that most approaches "scale poorly as the sequence length increases from 512 to 64k, except Flash-Decoding. In this regime (batch size 1) with Flash-Decoding, scaling the sequence length has little impact on generation speed."

Why this matters – all of today’s AI systems are woefully unoptimized: Generally, the AI systems we have today are very poorly optimized – approaches like Flash-Decoding show just how much more efficient systems can be (an 8X improvement???!), and we should generally expect everything to get cheaper and more efficient as more intelligent (mostly human) minds optimize the ‘AI stack’.
   Read more: Flash-Decoding for long-context inference (PyTorch).
   Get the code via the FlashAttention repo (GitHub).

***

UniSim: perhaps the destiny of AI is a single model that encompasses the world and all its variations, with everything else trained inside it:
…More evidence that future AI systems will just have everything stuffed into them…
Researchers with UC Berkeley, Google DeepMind, MIT, and the University of Alberta have tried to stuff tons of different types of data together to create a so-called ‘universal simulator’, that is, “a simulator of the real-world that learns from broad data rich in different axes including objects, scenes, human activities, motions in navigation and manipulation, panorama scans, and simulations and renderings.” The resulting model, named UniSim, could be a useful resource for training AI systems to take a broad range of actions in the world. 

What they did: In this research, they “combine a wealth of data—ranging from internet text-image pairs, to motion and action rich data from navigation, manipulation, human activities, robotics, and data from simulations and renderings—in a conditional video generation framework. With careful orchestration of data rich along different axes, we show that UniSim can successfully merge the different axes of experience and generalize beyond the data, enabling rich interaction through fine-grained motion control of otherwise static scenes and objects.”
   You can get a sense of the resulting model by playing around with some of the demo examples on the project website. 

Want a world model of everything? Put everything into it: UniSim is constructed out of a veritable feast of datasets covering different modalities and skills. These include:

  • Simulated execution and renderings: Datasets derived from video action simulators like Habitat and Language Table. “We extract text descriptions as actions when available. For simulated control actions, we encode them via language embeddings and concatenate the text embeddings with discretized control values.” 
  • Real robot data: Datasets derived from real robots taking actions in the world, like 'Bridge Data', as well as "the data that enabled RT-1 and RT-2" (very capable robot-control models from Google). These datasets also "include discretized continuous control actions when available, similar to simulated robotics data".
  • Human activity videos: Human activity data like Ego4D, EPIC-KITCHENS, Something-Something, which shows humans performing actions, typically from a first person perspective. 
  • Panorama scans: They use 3D environment datasets like Matterport3D to “construct actions (e.g., turn left) by truncating panorama scans and utilize information such as change in camera poses between two images”. 
  • Internet text-image data: They use datasets like LAION and ALIGN which, though they contain just static image-text pairs, have text labels that "often contain motion information such as "a person walking"". To use these, they "treat individual images as single-frame videos and text labels as actions". Clever! They also use some "miscellaneous videos" (13 million of them!). 
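A rough sketch of how such mixed action encodings might look in code – embedding the text description and concatenating it with discretized control values, as the paper describes. The bin count, control ranges, and zero-vector embedding below are invented for illustration; the paper doesn't specify these exact values:

```python
import numpy as np

def discretize_controls(controls, low=-1.0, high=1.0, bins=256):
    # Map continuous control values (e.g. motor commands) into integer
    # bins. The range and bin count here are illustrative.
    c = np.clip(np.asarray(controls, dtype=float), low, high)
    return np.round((c - low) / (high - low) * (bins - 1)).astype(int)

def encode_action(text_embedding, controls):
    # Concatenate a language embedding of the action description with the
    # discretized control values, so every action -- text or motor --
    # becomes one continuous vector a video model can condition on.
    ids = discretize_controls(controls)
    return np.concatenate([np.asarray(text_embedding, dtype=float),
                           ids.astype(float)])

text_emb = np.zeros(16)  # stand-in for a real language-model embedding
action = encode_action(text_emb, controls=[0.5, -0.25])
assert action.shape == (18,)  # 16 embedding dims + 2 discretized controls
```

The payoff of this kind of unification is that wildly different datasets – robot logs, panoramas, captioned images – all flow through one conditioning interface.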

Mash it all together: The researchers convert all of these distinct datasets into a single model. "Since the observations from different environments have all been converted to videos, while actions of different modalities (e.g., text descriptions, motor controls, camera angles) have all been converted to continuous embeddings, UniSim can learn a single world model across all datasets," they write. 

Does it work? Yes: By training models on UniSim, the researchers are able to get good performance on long-horizon tasks and to transfer policies developed in UniSim onto a real-world robot. 
   This makes some intuitive sense – UniSim covers such a broad, heterogeneous distribution of data that it can work as a kind of everything-simulator, unlocking a broad range of capabilities. "In addition to supporting action-rich and long-horizon interactions, UniSim can also support highly diverse and stochastic environment transitions, such as diversity in objects being revealed after removing the towel on top, diversity in object colors and locations, and real-world variabilities such as wind and change in camera angles," the researchers write. "The policy purely trained in UniSim can directly perform long-horizon tasks in the real world in a zero-shot manner".

Why this matters – cheap and fast iteration: Systems like UniSim ultimately lower the development costs of real-world AI systems by making it cheaper and faster to train some of their skills in simulation. We can imagine UniSim serving as a tool for augmenting or improving the capabilities of various large models being trained. “These results suggest that UniSim can serve as an effective data generator for improving broader vision-language models,” the authors write. “UniSim can simulate highly realistic experiences for interacting with humans and training autonomous agents.”
    How far we’ve come: UniSim is a world model that basically lets you learn policies by doing “what if” rollouts in the vast range of simulations made possible by the diversity of the underlying datasets. To get a visceral sense of progress, take a look at this “World Models” paper from 2018 (Import AI #88) where the most impressive thing people could do was learn a single world model for, respectively, a racing car game and a simplified version of the video game Doom. Now, five years later, we have a genuine ‘world model’ in the form of UniSim, wrapping in a huge amount of data and creating something of broad, general utility. 
   Yet another lesson that it’s not just language models which are scaling up and leading to greater and more general systems – this trend is happening everywhere, and it all relates to training bigger models with a greater capacity for complexity on larger amounts of data. Many trains have left many stations and their destinations are increasingly awesome and powerful.
   Read more: Learning Interactive Real-World Simulators (arXiv).
   Check out the demo website here (UniSim: Learning Interactive Real-World Simulators, website).
   Further reading: World Models from 2018 (official paper site).

***

Tech Tales:

Mind Viruses
[[REDACTED AGENCY], USA, 2032]

I got up from my desk and walked over to the window and looked out at the city in front of me. Normal cars and normal people and normal buildings and normal rain. But the email that had just come in meant everything was about to change – and for the worse.

Subject: CONFIRMED: Critical machine-to-human mimetic transmission

We have confirmed a case of machine-to-human mimetic transfer. The transmission is believed to have occurred at approximately 0800 ET. Transmission occurred during an in-sim VR-mediated relationship simulator. 

Machine status: in pathological cognitive loop; isolated and scheduled for deletion. 

Human status: Displaying symptoms equivalent to extreme Alzheimer's combined with chronic OCD. Ability to respond to external stimuli appears to be degrading. Refuses liquid and food. Gave IV. Scheduled for immediate fMRI. Prognosis: grave.

It was called a cognitive virus (CV). They'd arrived a few years ago, along with some of the first Provably Conscious Entities (PCEs). CVs were a type of pathological decline that could occur in AI systems which were able to self-update. Years of study hadn't given us explanations for how they worked or how to make systems not susceptible to them. But the incidence rate across the PCE population was low and relatively consistent. So we closely tracked systems that fell victim to CVs, and when we found them we impounded them, segmented them off from the broader technological system, and eventually deleted them. 

It had been an open question as to whether CVs could in some sense be passed to humans. Various government-linked groups had been studying this, drawing on far earlier work by agencies like the CIA with their MK Ultra experiments, to understand how people had once tried to inject virus-like ideas into human minds and whether anything could be learned from those attempts. 
   The question of machine to human CV transmission had always been academic. 
   Until now. 

Things that inspired this story: Thoughts about the limit of ‘persuasion’ or ‘hypnotism’ that a machine could display; mental health; generative models; adversarial examples; reinforcement learning from human feedback; pathological loops.