Import AI: Issue 27: “Outrageously large” neural nets, AI for math, and the names of three oil rig robots
by Jack Clark
The future of AI: a big dollop of ‘learn-able computation’, paired with a sprinkling of hand-crafted algorithms: One reason AlphaGo excelled at Go is that it paired a neural network-based learning system with a hand-tuned near-optimal Monte Carlo Tree Search algorithm. It’s likely that pairing the general-purpose function approximation properties of neural nets with tried-and-tested algorithms will continue to yield results. (Akin to how people can enhance their mental performance by pairing intuitions with a few well-memorized rule-systems, like memory palaces, propositional calculus, and so on.)
… further validation of this approach comes via AI being used for automated math: A Google paper, Deep Network Guided Proof Search, uses Deep Learning techniques to support proof search in a theorem prover. Automated theorem provers (ATP) simplify the lengthy process of verifying logical statements…
… The Google researchers train their AI systems to help guide their ATP along a few exploratory paths, then perform a second (faster) combinatorial search phase using hand-crafted strategies. “We get significant improvements in first-order logic prover performance, especially for theorems that are harder and require deeper search,” they write. “Besides improving theorem proving, our approach has the exciting potential to generate higher quality training data for systems that study the behavior of formulas under a set of logical transformations,” they add. “This could enable learning representation of formulas in ways that consider the semantics not just the syntactic properties of mathematical content and can make decisions based on their behavior during proof search.”…
… in a further demonstration of the flexibility of basic AI components, the researchers test their system with three different learning substrates: a standard convolutional neural network, a tree-LSTM, and a WaveNet…
…this research builds on earlier work called DeepMath – Deep Sequence Models for Premise Selection, which demonstrated the viability of neural networks for automated logical reasoning.
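To make the two-phase idea above concrete, here is a minimal sketch rather than anything from the paper: a toy propositional resolution prover whose clause selection is steered by a scoring function for the first few steps, then handed over to a hand-crafted heuristic (shortest clause first). Everything named here is an assumption made for illustration: `stand_in_model_score` is a crude placeholder for a trained network, `guided_steps` is an invented budget, and propositional logic stands in for the first-order setting the researchers actually work in.

```python
def resolvents(a, b):
    """Return all propositional resolvents of clauses a and b.

    Clauses are frozensets of non-zero ints; -n is the negation of n.
    """
    out = []
    for lit in a:
        if -lit in b:
            res = (a - {lit}) | (b - {-lit})
            # Skip tautologies such as {p, not p}.
            if not any(-l in res for l in res):
                out.append(frozenset(res))
    return out


def stand_in_model_score(clause, goal_atoms):
    """Placeholder for a learned scorer (assumption): lower is better.

    It simply prefers clauses that mention atoms from the conjecture.
    """
    return -len({abs(l) for l in clause} & goal_atoms)


def handcrafted_score(clause):
    """Classic hand-written heuristic: prefer shorter clauses."""
    return len(clause)


def prove(clauses, goal_atoms, guided_steps=5, max_steps=1000):
    """Given-clause loop: guided selection first, heuristic selection after."""
    unprocessed = set(clauses)
    processed = set()
    for step in range(max_steps):
        if not unprocessed:
            return False  # saturated without finding a contradiction
        if step < guided_steps:
            # Phase 1: the (stand-in) model picks the next clause.
            key = lambda c: (stand_in_model_score(c, goal_atoms), len(c), sorted(c))
        else:
            # Phase 2: fall back to the cheap hand-crafted strategy.
            key = lambda c: (handcrafted_score(c), sorted(c))
        given = min(unprocessed, key=key)
        unprocessed.remove(given)
        if not given:
            return True  # the empty clause was already present
        processed.add(given)
        for other in list(processed):
            for res in resolvents(given, other):
                if not res:
                    return True  # derived the empty clause: proof found
                if res not in processed:
                    unprocessed.add(res)
    return False


# p, p -> q, q -> r, plus the negated conjecture (not r).
axioms = [frozenset({1}), frozenset({-1, 2}), frozenset({-2, 3}), frozenset({-3})]
print(prove(axioms, goal_atoms={3}))
```

The design point is the switch in `key`: the expensive scorer only gets a small budget of decisions, after which the fast heuristic takes over the search.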
Driverless buses: driverless vehicles will spend their first years of service in small, controlled environments, like corporate campuses, amusement parks, and diminutive states, such as Singapore. Latest example: Tata Motors, which spent the last 12 months testing self-driving buses on its corporate campus. (Workers might be better off bicycling, given that the buses are rate-limited to less than 10 kilometres per hour.)
Comma again? Breaking AI systems: feed Google’s new neural translation system the wrong string of characters and it might bark ‘Knife, Knife, Knife’ at you in German. Fun bug, probably to be blamed on trailing commas, found by Iain Murray.
NIPS & Immigration: NIPS 2017 is set to be in America, and that has caused some anxiety among AI researchers troubled by President Trump’s executive order on immigration. Change.org petition to alter the location of NIPS here.
Care for a wafer thin AI processor on top of your pi(e)? Google is asking the Raspberry Pi community for tips about what types of ‘smart tools’ it can produce for makers. Fingers crossed it gets a big enough response to start creating ultra-efficient AI software to be deployed on minicomputers like the Raspberry Pi, complementing the existing DIY open source implementations from the hacker community. Perhaps we can pair this with the cardboard drones mentioned last week? Disposable, almost-sentient paper aeroplanes.
Next-gen AI = Talking Pictures: In 2014 and 2015 we saw researchers jointly train word and image models, so computers could generate captions for images.
… Later in 2015 researchers started to experiment with the inverse of this idea, seeing if words could be used to generate imagery. They were successful, and in a little under a year and a half moved from generating low-res, fuzzy images of toilets in fields, to crisp ‘I can’t believe it’s not butter’-grade synthetic images (eg, StackGAN)…
…Now, researchers are jointly training AI systems on audio waveforms and imagery. A new paper from MIT teaches a computer to learn the correspondence between sound and vision…
…The network is trained in two stages: first, researchers teach computers to associate audio segments with particular images, then in a second stage the computer identifies various entities in the images and seeks to link those entities to particular slices of audio. The result is a trained network that can identify specific visual entities from spoken clues…
…this has quite subtle implications. For one thing, if you were able to generate a good enough network from English, then were able to train the image-sound correspondence on another language, such as German, you could do so without access to the base German language, instead translating through the shared visual layer…
…“This paves the way for creating a speech-to-speech translation model not only with absolutely zero need for any sort of text transcriptions, but also with zero need for directly parallel linguistic data or manual human translations,” the researchers write…
…new techniques for extracting emotions from speech, like “Emotion Recognition From Speech With Recurrent Neural Network”, suggest this could be extended further, blending the emotions into the speech and imagery. (Next step: add smell.)…
… imagine a future where anthropologists seek out people whose language has little to no written record, and translate it into a universal data representation by having people narrate the contents of particular images or movies, pouring their speech into a shared visual dictionary whose entities are redolent of feeling. Brings a whole new meaning to the term ‘emotional palette’.
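The translate-through-pictures idea can be sketched in a few lines, under heavy assumptions. Suppose spoken English words, spoken German words, and images have all been embedded in one shared space; then translating is two nearest-neighbour lookups: audio to image, then image to audio. The vectors below are invented by hand purely for illustration (a real system would learn them from audio and pixels), and the names `IMAGE_SPACE`, `ENGLISH_AUDIO`, `GERMAN_AUDIO`, and `translate_via_images` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(query, candidates):
    """Name of the candidate embedding most similar to the query."""
    return max(candidates, key=lambda name: cosine(query, candidates[name]))


# Hand-made toy embeddings (assumption): all three share one vector space.
IMAGE_SPACE = {
    "dog_photo": [0.9, 0.1, 0.0],
    "lighthouse_photo": [0.1, 0.9, 0.1],
    "cat_photo": [0.0, 0.2, 0.9],
}
ENGLISH_AUDIO = {
    "dog": [0.8, 0.2, 0.1],
    "lighthouse": [0.2, 0.8, 0.0],
    "cat": [0.1, 0.1, 0.8],
}
GERMAN_AUDIO = {
    "Hund": [0.85, 0.05, 0.1],
    "Leuchtturm": [0.0, 0.95, 0.2],
    "Katze": [0.1, 0.3, 0.85],
}


def translate_via_images(english_word):
    """English audio -> closest image -> closest German audio."""
    image = nearest(ENGLISH_AUDIO[english_word], IMAGE_SPACE)
    return nearest(IMAGE_SPACE[image], GERMAN_AUDIO)


for word in ENGLISH_AUDIO:
    print(word, "->", translate_via_images(word))
```

Note that no English-German pair ever appears in the data: the images are the only bridge, which is the point the researchers make about not needing parallel linguistic data.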
And so their structures shall be as intricate and befuddling as the architecture of Gormenghast: The AI community’s love of neural nets troubles technology cartographer Bruce Sterling: “They have a baroque, visionary, suggestive, occultist quality when at this historical moment that’s the very last thing we need,” he says.
Everything’s bigger in America – Google research points way to neural nets 1,000X the size of current ones: new Google research, ‘Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer’, shows how to scale up neural networks without having to boil the ocean. The new system – Google’s latest approach to applying ‘conditional computation’ to its systems – allows for networks 1,000 times larger than contemporary ones, with only slight losses in computational efficiency…
… The trick to this is the addition of what Google calls a ‘mixture of experts’ layer, which basically gives the network the ability to call on an ever larger pool of ‘expert’ mini neural nets to help classify input. The experts sit behind one or more gating networks, which autonomously choose which handful of experts to consult for each input, letting the network scale in size without becoming totally unwieldy…
… Google tested its approach on a language understanding task and a translation task, attaining good results in both. Perhaps the most convincing evidence for the utility of the new approach lies in its apparent efficiency, with the new approach attaining state-of-the-art results on a language translation task, while using fewer resources…
…Google Neural Machine Translation: 6 days of training across 96 Nvidia K80 GPUs
…Mixture-of-Experts model: 6 days of training across 64 Nvidia K80 GPUs
…(Fewer GPUs and more performance? Quick, someone send a bouquet of flowers with a note saying ‘Condolences’ to Jen-Hsun Huang).
…now let’s wait for a follow-up paper where the researchers follow through on their goal of training a trillion parameter model on a one trillion word corpus.
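For readers who want to see the gating trick in miniature, here is a rough sketch and not Google's implementation. A gating function scores every expert, keeps only the top `k`, softmaxes over the survivors, and the layer output is the weighted sum of just those experts, so the rest of the pool costs nothing for that input. The toy experts, the `gate_weights`, and the `calls` log are all invented for illustration, and the sketch leaves out any noise or load-balancing terms a production gate might use.

```python
import math


def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(w, x))


def top_k_gate(x, gate_weights, k):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    logits = [dot(w, x) for w in gate_weights]
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in chosen)  # subtract the max for stability
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_layer(x, experts, gate_weights, k=2):
    """Sparsely gated mixture of experts: only the chosen experts are run."""
    gates = top_k_gate(x, gate_weights, k)
    output = 0.0
    for i, weight in gates.items():
        output += weight * experts[i](x)
    return output, sorted(gates)


calls = []  # records which experts were actually evaluated


def make_expert(index, scale):
    """Hypothetical 'mini network': just a scaled sum of the inputs."""
    def expert(x):
        calls.append(index)
        return scale * sum(x)
    return expert


experts = [make_expert(i, s) for i, s in enumerate([1.0, 2.0, 3.0, 4.0])]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]

y, used = moe_layer([2.0, 1.0], experts, gate_weights, k=2)
print("experts consulted:", used, "of", len(experts))
```

Adding more experts grows the parameter count, but the work per input stays tied to `k`, which is roughly why the approach can scale so far without the compute bill scaling with it.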
Roughneck robots for grubby deeds: The In Situ Fabricator1 brings us closer to an era where we can deploy robots with some general spectrum of capabilities into chaotic environments like construction sites. The robot is capable of millimeter-level precision, and is tested on two tasks: one, building an “undulating brick wall” (page 7, PDF) out of 1,600 bricks, stacked in a doubled lattice. The second task involves welding wires to create a ‘Mesh Mould’. The researchers are already working on a second version of the robot, and plan to increase its strength by moving from electric motors to hydraulic systems, while reducing its weight from 1.5 tons (too heavy for many buildings) to a more respectable 500 kilograms. The robot’s movement policies are derived from Optimal Control approaches, rather than in-vogue, but still quite young, neural network techniques.
… but not all Robots != Robots: This Bloomberg story about robots taking over oil rigs highlights how oil companies have been shedding employees due to a crash in oil prices and, in some cases, replacing them with robots. Read on for the description of National Oilwell Varco Inc.’s ‘Iron Roughneck’ robot, replacing a few jobs. But is automation really to blame for the current job losses? Yes, but it’s hardly original…
…Wind the clock back to 1983 and we find a news story talking about roughly the same hardware from roughly the same company doing roughly the same job in oil fields. “Roughnecks speak of their particular ‘Leroy’ or ‘Igor’ or ‘Billy Bob’ as though ‘he’ is a co-worker, which, in fact, is true. Some hands paint the machine with a face, big eyes or tennis shoes,” the news report says.
Recursive job alert: We’re looking to hire the brilliant person that helps us hire the brilliant persons. Recruitment Coordinator. (And, as ever, we continue to look for machine learning and engineering candidates).
OpenAI Universe: visual guide. Visual illustration from Tom Brown about the diversity of Universe.
Modding OpenAI Gym: blog post, with code, about modifying the reward system of a particular OpenAI Gym environment.
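The general shape of a reward mod can be sketched without the blog post's code, which is not reproduced here: wrap the environment and rewrite the reward on its way out of `step`. To stay self-contained the sketch below does not import Gym. `CountdownEnv` is a made-up toy environment that mimics the classic `(observation, reward, done, info)` step interface, and `RewardModifier` and `bonus_at_end` are hypothetical names; Gym's own wrapper classes would be the natural place to do this for a real environment.

```python
class CountdownEnv:
    """Hypothetical toy environment: pays 1.0 per step for `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}


class RewardModifier:
    """Wraps an environment and replaces its reward with reward_fn's output."""

    def __init__(self, env, reward_fn):
        self.env = env
        self.reward_fn = reward_fn

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Keep the untouched reward around for logging and debugging.
        info = dict(info, original_reward=reward)
        return obs, self.reward_fn(obs, reward, done), done, info


def bonus_at_end(obs, reward, done):
    """Example shaping (assumption): small step penalty, big finish bonus."""
    return 10.0 if done else -0.1


def run_episode(env):
    """Play one episode with a fixed action and return the total reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(action=0)
        total += reward
    return total


print("original:", run_episode(CountdownEnv()))
print("modified:", run_episode(RewardModifier(CountdownEnv(), bonus_at_end)))
```

Because the wrapper exposes the same `reset` and `step` methods as the thing it wraps, an agent written against the original environment runs unchanged against the modified one.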
[Bushwick, 2025: words projected on the outside of a datacenter.]
Frank McDonald annoys the hell out of you but you need some of the cards in his hand, so have to tolerate his burping and farting and ceaseless shifting in his chair. The fellow next to him, Earl Sewer, smells worse but doesn’t talk so much, so you find him a little easier to deal with.
Shirley Ribs sits right next to you, and you and her have been trading cards all day. “Thanks Mr Grid,” she says, as you slide over a couple of units.
“Pleasure’s all mine,” you say, as she flips a couple of cards over, and sends one spinning over to you and another to McDonald.
The tension’s been running high for an hour or so as the crowd in the room has grown. People in the audience are shouting for everyone to make moves faster, calculate the odds better. The crowd hisses at shoddy play, having grown less forgiving for visibly bad bets. They say the Chinese have a better game going on next door, so for a while cards are tight as people sling their money into the game next door instead.
That makes McDonald get restless, and so he starts trying to flip the game by buying up cards from you, then not trading any with Sarah Market, instead just switching back and forth between you and Sewer and Butcher. Market gets angry and starts trying to do a side-deal with the dealer to trade some units for surplus cards from the Chinese game, but the dealer says he doesn’t have the capacity.
You get a handle on it eventually, winning back a few rounds from McDonald while calming him down with the odd bluff. If the game flipped it would have been the first time in seventeen years of continuous play. Note to self: almost got into trouble there, so deal differently next time.
[Note: these kinds of ‘state art’ performances proliferated for a while, as people sought to dramatize the inner workings of AI systems. In this performance, artist T K Wenzler trawled the market feeds for interactions between AI representatives from a number of retail, infrastructure, and electricity players, then, thanks to the MacArthur grant, bought up some of the more obscure feeds emanating from the trader AIs. He trained the data into representations of each participant, then applied domain confusion techniques to adapt this representation into a gigantic movie corpus, culled from security camera footage of prison card games.
The installation ran 2024-2031. Discontinued due to data feeds becoming un-parseable, after the Supreme Court ruled for a relaxation of interpretability standards.]