Import AI Newsletter 39: Putting neural networks on a diet, AI for simulating fluid dynamics, and what distributed computation means for intelligence
by Jack Clark
China tests out automated jaywalking cop: Chinese authorities in Shenzhen have installed smart cameras at a pedestrian crossing in the megacity. The system uses AI and facial recognition technology to spot pedestrians walking against the light, photographs them, and then displays their picture publicly, according to People’s Daily.
A ‘pub talk’ Turing test: there’s a new AI task to test how well computers can feign being human. The Conversational AI Challenge presents a person and an AI with a random news and/or Wikipedia article, then asks the participants to talk about it cogently for as long as they like. If the computer is able to convince the other person that it is also a person, it wins. (This test closely mirrors how English adolescents learn to socialize with one another in pubs.)
…Next step (I’m making this up): present a computer and a person with a random meme and ask them to comment on it, thus closely mirroring contemporary ice-breaking conversations.
Will the last company to fall off the hype cliff please leave a parachute behind it? The Harvard Business Review says the first generation of AI companies are doomed to fail, in the same way the first set of Internet companies failed in the Dot Com boom. A somewhat thin argument that also struggles with chronology – when do you count a company as ‘first’? Arguably, we’ve already had our first wave of AI company failures, given the demise of AI-as-software-service companies such as Ersatz, and early, strategic acquihires for others (eg, Salesforce acquiring MetaMind, Uber acquiring Geometric Intelligence.) The essence of the article does feel right: there will be problems with early AI adoption and it will lead to some amount of blow-back.
Spare a thought for small languages in the age of AI: Icelandic people are fretting about the demise of their language, as the country of 400,000 people sees its youth increasingly use English, primarily because of tourism, but also to use the voice-enabled features of modern AI software on smartphones and smart-home systems, reports the AP. Data-poor environments make a poor breeding ground for AI.
Putting neural networks on a diet: New Baidu research, ‘Exploring Sparsity in Recurrent Neural Networks’, shows how to reduce the number of effective neurons in a network during the training process, creating a smaller but equally capable trained network at the end.
….The approach works kind of like this: you set a threshold at the start of training, then at every training step you multiply each weight by its binary mask (default setting: 1), find the weights whose magnitudes fall below your pre-defined threshold, and set those weights’ masks to zero so they stay pruned. You continue to do this at each step, with some extra math to control the rate and propagation of pruning across the network, and what you wind up with is a slimmed-down, specialized network that retains the topological advantages of a full-fat one.
… This approach lets them reduce the model size of the ultimate network by around 90% and gain an inference-time speedup of between 2X and 7X.
…people have been trying to prune and slim down neural networks for decades. What sets Baidu’s approach apart, the researchers claim, is that the heuristic used to decide which weights to prune is relatively simple, and the network can be slimmed progressively during training. Other approaches have required subsequent retraining, which adds computational and time expense.
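The mask-and-threshold loop described above is simple enough to sketch. Below is a toy illustration of the idea in numpy – not Baidu’s actual implementation; the linear threshold ramp and the layer shape are invented for the example:

```python
import numpy as np

def prune_step(weights, masks, threshold):
    """One pruning step: zero the mask for any weight whose magnitude
    has fallen below the threshold, so that weight stays frozen at
    zero for the rest of training."""
    for w, m in zip(weights, masks):
        m[np.abs(w * m) < threshold] = 0.0  # flip mask bits off
        w *= m                              # apply mask to weights
    return weights, masks

# Toy usage: a single 4x4 "layer"; the threshold ramps up over training.
rng = np.random.default_rng(0)
weights = [rng.normal(0, 1, (4, 4))]
masks = [np.ones((4, 4))]

for step in range(100):
    # (a gradient update on the surviving weights would happen here)
    threshold = 0.01 * step             # invented ramp schedule
    weights, masks = prune_step(weights, masks, threshold)

sparsity = 1.0 - masks[0].mean()
print(f"fraction of weights pruned: {sparsity:.2f}")
```

In the real method the threshold schedule is tuned so pruning ramps up gradually partway through training, and the masks persist so pruned weights remain zero through subsequent gradient updates.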
…From the very small to the very big: Baidu plans an open source release of its ‘Apollo’ self-driving operating system: this summer the company will release a free version of Apollo, the underlying operating system it uses to run its autonomous cars, executives tell the MIT Technology Review.
…Baidu will retain control over certain key self-driving technologies, such as machine learning and mapping systems, and will make them accessible via API. This is presumably a way to generate business for cloud services operated by the company.
…Google had earlier contemplated a tactic similar to this but seemed to pivot after it detected minimal enthusiasm among US automakers for the idea of ceding control of smart software over to Google. No one wants to just bend metal anymore.
This week’s ‘accidentally racist’ AI fail: A couple of years ago Google got into trouble when its new ‘Photos’ app categorized black people as ‘gorillas’, likely due to a poorly curated training set. Now a new organization can take the crown of ‘most unintentionally racist usage of AI’ with Faceapp, whose default ‘hot’ setting appears to automatically whiten the skin of the faces it manipulates. It’s 2017, people.
…it’s important to remember that in AI, data is made OF PEOPLE: Alphabet subsidiary Verily has revealed the Verily Study Watch. This device is designed to pick up a range of data from participants in one of Verily’s long-running human health studies, including heart rate, respiration, and sleep patterns. As machine learning and deep learning approaches move from working on typical data, such as digital audio and visual information, into the real world, expect more companies to design their own data-capturing devices.
Deep Learning in Africa: artificial intelligence talent can come from anywhere and everywhere. Companies, universities and non-profits are competing with each other to attract the best minds on the planet to come and work on particular AI problems. So it makes sense that Google, DeepMind, and the University of Witwatersrand in Johannesburg are sponsoring Deep Learning Indaba, an AI gathering to run in South Africa in September 2017.
A neural memory for your computer for free: DeepMind has made the code for its Nature paper ‘Differentiable Neural Computers’ available as open source. The software is written in TensorFlow and TF-library Sonnet.
…DNC is an enhanced implementation of the ‘Neural Turing Machines’ paper that was published in 2014. It lets you add a memory to a neural network, letting the perceptual machinery of your net write data into a big blob of neural storage (basically a souped-up LSTM) which it can then refer back to.
…DNC has let DeepMind train systems to perform quite neat learning tasks, like analyzing a London Underground map and figuring out the best route between multiple locations – exactly the sort of task typical computers find challenging without heavy supervision.
… however, much like generative adversarial networks, NTMs are (and were) notorious for being both extremely interesting and extremely difficult to train and develop.
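The read side of that ‘souped-up’ memory boils down to content-based addressing: compare a query key against every memory row and read back a similarity-weighted blend. Here’s a minimal numpy sketch of that mechanism – not DeepMind’s released Sonnet code; the sharpness parameter and the toy memory contents are invented:

```python
import numpy as np

def content_read(memory, key, beta=10.0):
    """Content-based addressing, the core read mechanism shared by
    NTMs and DNCs: score every memory row by cosine similarity to a
    query key, softmax the scores into attention weights, and return
    the weighted sum of rows."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sims = memory @ key / norms          # cosine similarity per row
    w = np.exp(beta * sims)
    w /= w.sum()                         # softmax -> read weights
    return w @ memory                    # blended read vector

# Toy memory with three orthogonal rows; the key resembles row 0.
memory = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
read = content_read(memory, key=np.array([0.9, 0.1, 0.0]))
print(read)
```

Because the whole pipeline is differentiable, the network can learn by gradient descent what to write and which keys to query with – which is also where much of the training difficulty comes from.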
Another framework escapes the dust: AI framework Caffe has been updated to Caffe2 and infused with resources by Facebook, which is backing the framework along with PyTorch.
…The open source project has also worked with Microsoft, Amazon, NVIDIA, Qualcomm, and Intel to ensure that the library runs in both cloud and mobile environments.
…It’s notable that Google isn’t mentioned. Though the AI community tends towards being collegial, there are some areas where it’s competitive: AI frameworks are one of them. Google and its related Alphabet companies are all currently working on libraries such as TensorFlow, Sonnet, and Keras.
…This is a long game. In AI frameworks, where we are today feels equivalent to the early years of Linux, where many distributions competed with each other, going through a Cambrian explosion of variants, before being winnowed down by market and nerd-adoption forces. The same will be true here.
The future of AI is… distributed computation: it’s beginning to dawn on people that AI development requires:
…i) vast computational resources.
…ii) large amounts of money.
…iii) large amounts of expertise.
…By default, this situation seems to benefit large-scale cloud providers like Amazon and Microsoft and Google. All of these companies have an incentive to offer value-added services on top of basic GPU farms. This makes it likely that each cloud will specialize around a particular framework(s) to add value as well as services that play to each provider’s strengths. (Eg, Google: TensorFlow & cutting-edge ML services; Amazon: MXNet & great integration with AWS suite; Microsoft: CNTK & powerful business-process automation/integration/LinkedIn data).
…wouldn’t it be nice if AI researchers could control the proverbial means of production for AI? Researchers have an incentive to collaborate with one another to create a basic, undifferentiated computer layer. Providers don’t.
…French researchers have outlined ‘Morpheo’, a distributed data platform that specializes in machine learning and transfer learning, and uses the blockchain for securing transactions and creating a local compute economy. The system, outlined in this research paper, would let researchers access large amounts of distributed computers, using cryptocurrencies to buy and sell access to compute and data. “Morpheo is under heavy development and currently unstable,” the researchers write. “The first stable release with a blockchain backend is expected in Spring 2018.” Morpheo is funded by the French government, as well as French neurotechnology startup Rhythm.
…There’s also ‘Golem’, a global, open source, decentralized computing system. This will let people contribute their own compute cycles to a global network, and will rely on Ethereum for transactions. Every compute node within Golem sees its ‘reputation’ – a proxy for how much other nodes trust it and how likely they are to give work to it – rise and fall according to how well it completes the jobs assigned to it. This, theoretically, creates a local, trusted economy.
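Golem hasn’t published a simple closed-form reputation rule, but the rise-and-fall behaviour described above can be illustrated with an exponential moving average – a hypothetical stand-in, with the update rate chosen arbitrarily:

```python
def update_reputation(rep, job_succeeded, alpha=0.1):
    """Hypothetical reputation rule (Golem's actual mechanism is more
    involved): nudge the score toward 1 after a verified job, toward 0
    after a failed or bogus result, via an exponential moving average."""
    target = 1.0 if job_succeeded else 0.0
    return (1 - alpha) * rep + alpha * target

# A node starts neutral, completes four jobs and botches one.
rep = 0.5
for ok in [True, True, True, False, True]:
    rep = update_reputation(rep, ok)
print(round(rep, 3))
```

The appeal of a scheme like this is that a node can’t instantly buy a good reputation – trust accretes (and erodes) one verified job at a time.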
…check back in a few months when Golem releases its first iteration, Brass Golem, a CGI rendering system.
The x86-GPU hegemony is dead. Long live the x86-GPU hegemony: AI demands new types of computers to be effective. That’s why Google invested so much time and resources into creating the Tensor Processing Unit (TPU) – a custom, application specific integrated circuit, that lets it run inference tasks more efficiently than if using traditional processors. How much more efficient? Google has finally published a paper giving some details on the chip…
…When Google compared the chip to some 2015-era video cards it displayed a performance advantage of 15X to 30X. However, that same chip only displays an advantage of between 1X and 10X when compared against the latest NVIDIA chips. That highlights the messy, expensive reality of developing hardware. (We don’t know whether Google has subsequently iterated TPUs further, so TPU 2.0s – if they exist – may have far better performance than that discussed here.) NVIDIA has politely disagreed with some of Google’s performance claims, and outlined its view in this blog post…
… from an AI research standpoint, faster inference is useful for providing services and doing user-facing testing, but doesn’t make a huge difference to the training of the neural network models themselves. The jury is still out on which chip architectures are going to come along that will yield unprecedented speedups in training.
…meanwhile, NVIDIA continues to iterate. Its latest chip is the NVIDIA TITAN Xp, a more powerful version of its predecessor, the TITAN X, based on NVIDIA’s Pascal architecture, with more CUDA cores, at the same wallet-weeping price of $1,200. (And whither AMD? The community clearly wants more competition here, but the lack of a fleshed-out software ecosystem makes it hard for the company’s cards to play here at all. Have any Import AI readers explored using AMD GPUs for AI development? Things may change later this year when the company releases GPUs on its new, highly secretive ‘VEGA’ architecture. Good name.)
…and this is before we get to the wave of other chip companies coming out of stealth. These include Wave Computing, ThinCI, Graphcore, Isocline, Cerebras, DeepScale, and Tenstorrent, among others, according to Tractica.
Reinforcement learning to mine the internet: the internet is a vast store of structured and unstructured information and therefore a huge temptation to AI researchers. If you can train an agent to successfully interact with the internet, the theory goes, then you’ve built something that can simply and scalably learn a huge amount about the world.
…but getting to this is difficult. A new paper from New York University, ‘Task-Oriented Query Reformulation with Reinforcement Learning’ uses reinforcement learning to train an agent to improve the types of search queries it feeds into a search engine. The goal is to automatically iterate on a query until it generates more relevant information than before, as measured by an automatic inference method called Pseudo Relevance Feedback.
…the scientists test their approach on two search engines: Lucene and Google.
…datasets tested on include TREC, Jeopardy, and Microsoft Academic.
…the approach does well, mostly beating other reformulation approaches, though it still lags far behind a close-to-optimal supervised learning ‘oracle’ method, suggesting more research can and should be done here.
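To make the reformulation loop concrete, here’s a heavily simplified toy version: a bandit that learns which expansion term improves retrieval, standing in for the paper’s full policy-gradient agent over vocabulary terms. The index, documents, relevance judgments, and candidate terms are all invented:

```python
import random

# Invented stand-ins: a tiny "index" plus relevance judgments.
INDEX = {
    "doc1": "neural network pruning sparsity",
    "doc2": "recurrent neural network training",
    "doc3": "fluid dynamics simulation",
}
RELEVANT = {"doc1", "doc2"}

def search(query):
    """Toy search engine: return docs containing any query term."""
    terms = set(query.split())
    return {d for d, text in INDEX.items() if terms & set(text.split())}

def recall(results):
    """Reward signal: fraction of relevant docs retrieved."""
    return len(results & RELEVANT) / len(RELEVANT)

# Bandit-style reformulation: learn which expansion term helps the
# base query "recurrent" most, via running averages of the reward.
candidates = ["network", "simulation", "sparsity"]
scores = {t: 0.0 for t in candidates}
counts = {t: 0 for t in candidates}

random.seed(0)
for _ in range(200):
    term = random.choice(candidates)            # explore uniformly
    reward = recall(search("recurrent " + term))
    counts[term] += 1
    scores[term] += (reward - scores[term]) / counts[term]

best = max(scores, key=scores.get)
print(best, scores[best])
```

The real system replaces the running-average bandit with REINFORCE over terms drawn from previously retrieved documents, and replaces the toy index with Lucene or Google – but the train-by-retrieval-reward loop is the same shape.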
Dropbox’s noir-esque machine learning quest: Dropbox’s devilish tale of how it built its own deep learning-based optical character recognition (OCR) system features a mysterious quest for a font to use to better train its system on real-world failures. The company eventually found “a font vendor in China who could provide us with representative ancient thermal printer fonts.” No mention made of whether they enlisted a Private Font Detective to do this, but I sure hope they did!
Modeling the world with neural networks: the real world is chaotic and, typically, very expensive to simulate at high fidelity. The expense comes from the need to model a bunch of very small, discrete interactions to be able to generate plausible dynamics leading to the formation of, say, droplets or smoke tendrils, and so on. Many of the world’s top supercomputers spend their time trying to simulate these complex systems, which are beyond the capabilities of traditional computers.
…but what if you could instead use a neural network to learn to approximate the functions present within a high accuracy simulation, then run the trained model using far fewer computational resources? That’s the idea German researchers have adopted with a new technique to train neural networks to be able to model fluid dynamic simulations.
The approach, outlined in ‘Liquid Splash Modeling with Neural Networks’, works by training neural networks on lots of physically accurate, ground-truth data, thus teaching them to approximate the complex function. Once they’ve learned this representation they can be used as a computationally cheap stand-in to generate accurate-looking water and so on.
…the results show that the neural network-based method has a greater level of real-world fidelity in a smaller computational envelope than other approaches, and works for both simulations of a dam breaking, and of a wave sloshing back and forth.
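The surrogate-model pattern behind this line of work is easy to demonstrate in miniature: generate ground-truth data from an expensive solver, fit a cheap approximator to it, then query the approximator at runtime. The sketch below uses an invented 1-D ‘simulator’ and a polynomial fit standing in for the paper’s neural network:

```python
import numpy as np

def expensive_simulator(x):
    """Stand-in for a costly physics solver (toy 1-D dynamics)."""
    return np.sin(3 * x) * np.exp(-x**2)

# 1. Generate ground-truth training data from the "slow" simulator.
rng = np.random.default_rng(1)
x_train = rng.uniform(-2, 2, 500)
y_train = expensive_simulator(x_train)

# 2. Fit a cheap surrogate (polynomial least squares here, standing
#    in for the paper's neural network).
coeffs = np.polyfit(x_train, y_train, deg=9)
surrogate = np.poly1d(coeffs)

# 3. At "runtime", query the surrogate instead of the simulator.
x_test = np.linspace(-2, 2, 100)
err = np.max(np.abs(surrogate(x_test) - expensive_simulator(x_test)))
print(f"max abs error on test grid: {err:.3f}")
```

The trade embodied here is the same one the paper makes at much larger scale: pay the full simulation cost once, up front, to buy cheap approximate answers forever after.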
…Smoke modeling: Many researchers are taking similar approaches. In this research between Google Brain and NYU, researchers are able to rapidly simulate stuff like smoke particles flowing over objects via a similar technique. You can read more in: Accelerating Eulerian Fluid Simulation With Convolutional Networks.
[2025: A bedroom in the San Francisco Bay Area.]
“No,” you say, rolling over, eyes still shut.
“I’ve got to tell you what happened last night!”
“Where are you?”
“Tokyo, as if it matters. Come on! Come speak to me!”
“Fine,” you say, sitting up in bed, eyes open, looking at the robot on your nightstand. You thumb your phone and give the robot the ability to see.
“There you are!” it says. “Bit of a heavy one last night?”
“It got heavy after the first 8 pints, sure.”
“Well, tell me about it.”
“Let me see Tokyo, then I’ll tell you.”
“One second,” the robot says. Then a little light turns off on its head. A few seconds pass and the light dings back on. “Ready!” it says.
You go and grab the virtual reality headset from above your desk, then grab the controllers. Once you put it all on you have to speak your password three times. A few more seconds for the download to happen then, bam, you’re in a hotel room in Tokyo. You stretch your hands out in front of you, holding the controllers, and in Tokyo the robot you’re controlling stretches out its own hands. You tilt your head and it tilts its head. Then you turn to try and find your friend in the room and see her. Except it’s not really her, it’s a beamed in version of the robot on your nightstand, the one which she is manipulating from afar.
“Okay if I see you?”
“Sure,” she says. “That’s part of what happened last night.”
One second passes and the robot shimmers out of view, replaced by your friend wearing sneakers, shorts, a tank top, and the VR headset and linked controllers. One of her arms has a long, snaking tattoo on it – a puppet master’s hand, holding the strings attached to a scaled-down drawing of the robot on your nightstand.
“They sponsored me!” she says, and begins to explain.
As she talks and gestures at you, you flip between the real version of her with the controllers and headset, and the robot in your room that she’s manipulating, whose state is being beamed back into your headset, then superimposed over the hotel room view.
At one point, as she’s midway through telling the story of how she got the robot sponsored tattoo, you drink a cup of coffee, still wearing the headset, holding the controller loosely between two of your fingers as the rest of them wrap around the cup. In the hotel room, your robot avatar lifts an imaginary cup, and you wonder if she sees steam being rendered off of it, or if she sees the real you with real steam. It all blurs into one eventually. As part of her sponsorship, sometimes she’s going to dress up in a full-scale costume of the robot on your nightstand, and engage strangers on the street in conversation. “Your own Personal Avatar!” she will say. “Only as lonely as your imagination!”