Import AI 184: IBM injects AI into the command line; Facebook releases 4.5 BILLION parallel sentences to aid translation research; plus, VR prison
by Jack Clark
You’ve heard of expensive AI training. What about expensive AI inference?
…On the challenges of deploying GPT-2, and other large models…
In the past year, organizations have started training ever-larger AI models. The size of these models has now grown enough that they’ve started creating challenges for people who want to deploy them into production. A new post on Towards Data Science discusses some of these issues in relation to GPT-2:
– Size: Models like GPT-2 are large (think gigabytes not megabytes), so embedding them in applications is difficult.
– Compute utilization: Sampling an inference from the model can be CPU/GPU-intensive, which means it costs quite a bit to set up the infrastructure to run these models (just ask AI Dungeon).
– Memory requirements: In the same way they’re compute-hungry, new models are memory-hungry as well.
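To make the size and latency points concrete, here's a minimal sketch of how you might eyeball GPT-2's footprint and per-sample cost. It assumes the Hugging Face Transformers library, which the post doesn't prescribe – treat it as illustrative rather than as the post's method:

```python
import time
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer  # assumed library, not named in the post

model_name = "gpt2-large"  # ~774M parameters; "gpt2-xl" is bigger still
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

# Size: parameter count x 4 bytes (fp32 weights), before any runtime overhead.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M params, ~{n_params * 4 / 1e9:.1f} GB of weights")

# Latency: time a single sample on whatever hardware this happens to run on.
inputs = tokenizer("The cost of serving large models", return_tensors="pt")
start = time.time()
with torch.no_grad():
    out = model.generate(**inputs, max_length=60, do_sample=True)
print(tokenizer.decode(out[0]))
print(f"{time.time() - start:.1f}s per sample")
```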
Why this matters: Today, training large AI systems is very expensive, while sampling from trained models is comparatively cheap. With some new large-scale models, it could become increasingly expensive to sample from the models as well. How might this change the types of applications these models get used for, and the economics associated with whether it makes sense to use them?
Read more: Too big to deploy: How GPT-2 is breaking production (Towards Data Science).
####################################################
AI versus AI: Detecting model-poisoning in federated learning
…Want to find the crook? Travel to the lower dimensions!…
If we train AI models by farming out computationally expensive training processes to people’s phones and devices (as in federated learning), can people attack the AI being trained by manipulating the results of the computations occurring on their devices? New research from Hong Kong University of Science and Technology and WeBank tries to establish a system for defending against attacks like this.
Defending against the (distributed) dark arts: To defend their AI models against these attacks, the researchers propose something called spectral anomaly detection. This involves using a variational autoencoder to embed the results of computations from different devices into the same low-dimensional latent space. By doing this, it becomes relatively easy to identify anomalous results that should be treated with suspicion.
“Even though each set of model updates from one benign client may be biased towards its local training data, we find that this shift is small compared to the difference between the malicious model updates and the unbiased model updates from centralized training,” they write. “Through encoding and decoding, each client’s update will incur a reconstruction error. Note that malicious updates result in much larger reconstruction errors than the benign ones.” In tests, their approach gets accuracies of between 80% and 90% at detecting three types of attacks – sign-flipping, noise addition, and targeted model poisoning.
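For intuition, here's a rough PyTorch sketch of the reconstruction-error idea. It is not the authors' code – they use a variational autoencoder with a dynamic threshold – and the dimensions and cutoff below are purely illustrative:

```python
import torch
import torch.nn as nn

class UpdateAutoencoder(nn.Module):
    """Toy (non-variational) autoencoder over flattened client model updates."""
    def __init__(self, update_dim=1000, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(update_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, update_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def flag_suspicious(detector, client_updates, threshold=1.0):
    """Return indices of clients whose updates reconstruct poorly.

    client_updates: tensor of shape (num_clients, update_dim). The detector is
    assumed to have been trained on benign updates, so malicious updates should
    come back with noticeably larger reconstruction errors.
    """
    with torch.no_grad():
        recon = detector(client_updates)
        errors = ((recon - client_updates) ** 2).mean(dim=1)
    return (errors > threshold).nonzero(as_tuple=True)[0].tolist()
```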
Why this matters: This kind of research highlights the odd overlaps between AI systems and political systems – both operate over large, loosely coordinated sets of entities (people and devices). Both systems need to effectively synthesize a load of heterogeneous views and use these to make the “correct” decision, where correct usually means whatever best matches the preferences extracted from the big mass of signals. And, just as politicians try to identify extremist groups who can distort the sorts of messages politicians hear (and therefore the actions they take), AI systems need to do the same. I wonder if, in a few years, techniques developed to defend against distributed model poisoning might be ported over into political systems to defend against election attacks.
Read more: Learning to Detect Malicious Clients for Robust Federated Learning (arXiv).
####################################################
Facebook wants AI to go 3D:
…PyTorch3D makes it more efficient to run ML against 3D mesh objects, includes differentiable rendering framework…
Facebook wants to make it easier for people to do research in what it terms 3D deep learning – this essentially means it wants to make tools that let AI developers train ML systems against 3D data representations. This is a surprisingly difficult task – most of today’s AI systems are built to process data presented in a 2D form (e.g., ImageNet represents real-world 3D scenes as flat, static 2D images).
3D specials: PyTorch3D ships with a few features to make 3D deep learning research easier – these include data structures for representing 3D object meshes efficiently, data operators for making comparisons between 3D data, and a differentiable renderer – that is, a renderer you can backpropagate through, so a system can learn to make 3D predictions from ordinary 2D images. “With the unique differentiable rendering capabilities, we’re excited about the potential for building systems that make high-quality 3D predictions without relying on time-intensive, manual 3D annotations,” Facebook writes.
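As a flavor of what the library handles for you, here's a minimal sketch using PyTorch3D's documented mesh structure and operators – a toy comparison between two meshes, not an example from Facebook's post:

```python
import torch
from pytorch3d.structures import Meshes
from pytorch3d.ops import sample_points_from_meshes
from pytorch3d.loss import chamfer_distance

# A single triangle as a (trivial) mesh: three vertices, one face.
verts = torch.tensor([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces = torch.tensor([[0, 1, 2]])
mesh_a = Meshes(verts=[verts], faces=[faces])
mesh_b = Meshes(verts=[verts + 0.1], faces=[faces])  # slightly shifted copy

# Sample point clouds from each mesh and compare them with chamfer distance,
# one of the batched 3D operators PyTorch3D ships with. Because everything is
# differentiable, losses like this can be backpropagated through.
points_a = sample_points_from_meshes(mesh_a, num_samples=500)
points_b = sample_points_from_meshes(mesh_b, num_samples=500)
loss, _ = chamfer_distance(points_a, points_b)
print(loss)
```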
Why this matters: Tools like PyTorch3D make it easier and cheaper for more people to experiment with training AI systems against different and more complex forms of data than those typically used today. As tools like this mature we can expect them to cause further activity in this research area, which will eventually yield various exciting sensory-inference systems that will allow us to do more intelligent things with 3D data. Personally, I’m excited to see how tools like this make it easier for game developers to experiment with AI systems built to leverage natively-3D worlds, like game engines. Watch this space.
Read more: Introducing PyTorch3D: An open-source library for 3D deep learning (Facebook AI Blog).
Get the code for PyTorch3D here (Facebook Research, GitHub).
####################################################
A-I-inspiration: Using GANs to make… chairs?
…The future of prototyping is an internet-scale model, some pencils, and a rack of GPUs…
A team of researchers with Peking University and Tsinghua University have used image synthesis systems to generate a load of imaginary chairs, then fabricated one of the chairs in the real world. Specifically, they train a GAN to generate images of chairs, paired with a super-resolution module that takes these outputs and scales them up.
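For intuition, here's a toy sketch of the two-stage generate-then-upscale pipeline. The modules below are hypothetical stand-ins, not the paper's trained networks:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChairGenerator(nn.Module):
    """Stand-in for a trained GAN generator (hypothetical, untrained)."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.latent_dim = latent_dim
        self.net = nn.Sequential(nn.Linear(latent_dim, 3 * 64 * 64), nn.Tanh())

    def forward(self, z):
        return self.net(z).view(-1, 3, 64, 64)  # low-res candidate images

class Upscaler(nn.Module):
    """Stand-in for the super-resolution module (here: plain interpolation)."""
    def forward(self, images):
        return F.interpolate(images, scale_factor=4, mode="bilinear", align_corners=False)

# Sample many latent vectors, decode them into low-res chair candidates, then
# upscale them - leaving a human to pick which (if any) to actually build.
generator, upscaler = ChairGenerator(), Upscaler()
z = torch.randn(16, generator.latent_dim)
candidates = upscaler(generator(z))  # shape: (16, 3, 256, 256)
```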
Furniture makers of the future, sit down! “After the generation of 320,000 chair candidates, we spend few ours [sic] on final chair prototype selection,” they write, contrasting this with the “traditional time-consuming chair design process”.
What’s different about this? Today, tons of generative design tools already exist in the world – e.g., software company Autodesk has staked out some of its future on the use of a variety of algorithmic tools to help it perform on-the-fly “generative design” which optimizes things like the weight and strength of a given object. AI tools are unlikely to replace tools like this in the short term, but they will open up another form of computer-generated designs for exploration by people – though I imagine GAN-based ones are going to be more impressionistic and fanciful, whereas ones made by industrial design tools will have more useful properties that tie to economic incentives.
Prototyping of the future: In the future, I expect people will train large generative systems to help them cheaply prototype ideas, for instance, by generating various candidate images of various products to inspire a design team, or clothing to inspire a fashion designer (e.g., the recent Acne Studios X Robbie Barrat collab), or scraps of text to aid writers. Papers like this sketch out some of the outlines for what this world could look like.
Read more: A Generative Adversarial Network for AI-Aided Chair Design (arXiv).
####################################################
Google Docs for AI Training: Colab goes commercial:
…Want better hardware and longer runtimes? Get ready to pay up…
Google has created a commercial version of its popular, free “Google Colab” service. Google Colab is kind of like GDocs for code – you can write code in a browser window, then execute it on hardware in Google’s data centers. One thing that makes Colab special is that it ships with inbuilt access to GPUs and TPUs, so you can use Colab pages to train AI systems as well as execute them.
Colab Pro: Google’s commercial version, Colab Pro, costs $9.99 a month. What you get for this is more RAM, priority access to better GPUs and TPUs, and code notebooks that’ll stay connected to hardware for up to 24 hours (versus 12 hours for the free version).
More details about Colab Pro here at Google’s website.
Spotted via Max Woolf (@minimaxir, Twitter).
####################################################
What’s cooler than a few million parallel sentences? A few BILLION ones:
…CCMatrix gives translation researchers a boost…
Facebook has released CCMatrix, a dataset of more than 4.5 billion parallel sentences in 576 language pairs.
Automatic for the (translation) people: CCMatrix is so huge that Facebook needed to use algorithmic techniques to create it. Specifically, Facebook learned a multilingual sentence embedding that represents sentences from different languages in the same featurespace, then used the distance between sentences in that featurespace to figure out whether two sentences from different languages are translations of each other.
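Here's a minimal numpy sketch of the distance-based mining step. The encoder is abstracted away and the nearest-neighbor threshold is a simplification – the real pipeline uses a margin-based score and approximate nearest-neighbor search to cope with billions of sentences:

```python
import numpy as np

def mine_parallel(src_sents, src_emb, tgt_sents, tgt_emb, threshold=0.8):
    """Pair each source sentence with its nearest cross-lingual neighbor.

    src_emb / tgt_emb: L2-normalized embeddings from a shared multilingual
    sentence encoder (any encoder that maps all languages into one
    featurespace will do for this sketch).
    """
    sims = src_emb @ tgt_emb.T  # cosine similarity, since vectors are unit-length
    pairs = []
    for i, row in enumerate(sims):
        j = int(np.argmax(row))
        if row[j] >= threshold:
            pairs.append((src_sents[i], tgt_sents[j], float(row[j])))
    return pairs

# Toy usage with random "embeddings", just to show the shapes involved.
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 16))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
print(mine_parallel(["a", "b", "c"], emb, ["x", "y", "z"], emb, threshold=0.5))
```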
Why this matters: Datasets like this will help people build translation systems that work for a broader set of people, and in particular should help with transfer to languages for which there is less digitized material.
Read more: CCMatrix: A billion-scale bitext data set for training translation models (Facebook AI Research, blog).
Read more: CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB (arXiv).
Get the code from the CCMatrix GitHub Page (Facebook Research, GitHub).
####################################################
AI in the command line – IBM releases its Command Line AI (CLAI):
…Get ready for the future of interfacing with AI systems…
IBM researchers have built Project CLAI (Command Line AI), open source software for interfacing with a variety of AI capabilities via the command line. In a research paper describing the CLAI, they lay out some of the potential uses of an AI system integrated into the command line – e.g., in-line search and spellchecking, code suggestion features, and so on – as well as some of the challenges inherent to building one.
How do you build a CLAI? The CLAI – pronounced like clay – is essentially a little daemon that runs in the command line and periodically comes alive to do something useful. “Every command that the user types is piped through the backend and broadcast to all the actively registered skills associated with that user’s session on the terminal. In this manner, a skill can autonomously decide to respond to any event on the terminal based on its capabilities and its confidence in its returned answer,” the researchers write.
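To make that broadcast-and-confidence pattern concrete, here's a toy sketch. The class names and interface below are hypothetical – they are not IBM's actual API:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Reply:
    text: str
    confidence: float  # 0.0-1.0: how sure the skill is that its answer helps

class Skill:
    """One registered capability; it sees every command typed in the terminal."""
    def respond(self, command: str) -> Optional[Reply]:
        raise NotImplementedError

class ManPageSummarizer(Skill):
    def respond(self, command: str) -> Optional[Reply]:
        if command.startswith("man "):
            return Reply(f"summary of the `{command[4:]}` manual page...", 0.9)
        return None  # stay quiet on commands this skill can't help with

def broadcast(command: str, skills: List[Skill]) -> Optional[Reply]:
    """Pipe the command to every active skill, surface the most confident answer."""
    replies = [r for s in skills if (r := s.respond(command)) is not None]
    return max(replies, key=lambda r: r.confidence, default=None)

print(broadcast("man tar", [ManPageSummarizer()]))
```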
So, what can a CLAI do? CLAI’s capabilities include: a natural language module that tries to convert plain text commands into tar or grep commands; a system that tries to find and summarize information from system manuals; a ‘help’ function which activates “whenever there is an error” and searches Unix Stack Exchange for a relevant post to present to the user in response; a bot for querying Unix Stack Exchange in plain text; and a Kubernetes automation service (name: Kube Bot).
And what can CLAI do tomorrow? In the future, the team hopes to add an auto-complete feature to the command line, so CLAI can suggest commands users might want to run.
Do people care? In a survey of 235 developers, a little over 50% reported they’d be either “likely” or “very likely” to use a command line interface with an integrated CLAI (or similar) service. In another part of the survey, respondents reported intolerance for laggy systems with response times greater than a few seconds, highlighting the need for these systems to respond quickly.
Why this matters: At some point, AI is going to be integrated into command lines in the same way things like ‘git’ or ‘pip install’ or ‘ping’ are today – and so it’s worth thinking about this hypothetical future today before it becomes our actual future.
Read more: CLAI: A Platform for AI Skills on the Command Line (arXiv).
Get the code for CLAI from IBM’s GitHub page.
Watch a video about CLAI here (YouTube).
####################################################
Tech Tales:
The Virtual Prisoner
You are awake. You are in prison. You cannot see the prison, but you know you’re in it.
What you see: Rolling green fields, with birds flying in the sky. You look down and don’t see a body – only grass, speckled with flowers.
Your body feels restricted. You can move, but only so far. When you move, you travel through the green fields. But you know you are not in the green fields. You are somewhere else, and though you perceive movement, you are not moving through real space. You are, however, moving through virtual space. You get fed through tubes attached to you.
Perhaps it would not be so terrible if the virtual prison was better made. But there are glitches. Occasional flaws in the simulation where all the flowers turn a different color, or the sky disappears. For a second you are even more aware of the falsehood of this world. Then it gets fixed and you go back to caring less.
One day something breaks and you stop being able to see anything, but you can still hear the artificial sound of wind causing tree branches to move. You close your eyes. You panic when you open them and things are still black. Eventually, it gets fixed, and you feel relieved as the field reappears in front of you.
It takes energy to remember that while you walk through the field you are also in a room somewhere else. It gets easier to believe more and more that you are in the field. It’s not that you’re unaware of your predicament, but you don’t focus on it so much. You cease modeling the duality of your world.
You have lived here for many years, you think one day. You know the trees of the field. Know the different birds in the sky. And when you wake up your first thought is not: where am I? It is “where shall I go today?”
Things that inspired this story: Room-scale VR; JG Ballard; simulacra; the fact that states are more obsessed with control than with deletion; spaceless panopticons.