Mapping Babel


Import AI: Issue 25: Open source neural machine translation, Microsoft acquires language experts Maluuba, Keras tapped for TensorFlow

If this, then drive: self-driving startup NuTonomy is using a complex series of rules to get its self-driving cars in Singapore to drive safely, but not be so timid that they can’t get anywhere. Typically, AI researchers prefer to reduce the number of specific rules in a system and instead try to learn as much behavior as possible, inferring proper codes of conduct from data gleaned from reality. NuTonomy’s decision to hand-code a hierarchy of rules into its system provides a notable counterpoint to the general trend towards learning everything from data. The company plans to expand its commercial offering in Singapore next year, though its cars will still be accompanied by a human ‘safety driver’ – for the time being.

Disposable lifesaving drones: Otherlab is building disposable drones with cardboard skins, as part of a research program funded by DARPA. The drones lack an onboard motor and navigate by deforming their wing surfaces as they glide to their targets.
…perhaps one day these cardboard drones will fly in swarms? Scientists have long been fascinated by the science of swarms because they afford distributed resiliency and intelligence. The US military has recently highlighted how swarms of drones can perform the job of much larger, more expensive, single machines. I wonder if we’ll eventually develop two-tiered swarms, where some specialized functions are present in a minority of the swarm. After all, it works for ants and bees.

AI acquisitions: Amazon quietly acquired security startup Harvest.AI, according to TechCrunch. Next, Microsoft acquired Canadian AI startup Maluuba…
…Maluuba has spent a few years conducting research into language understanding, publishing research papers on areas like reading comprehension and dialogue generation. It has also released free datasets for the AI community, like NewsQA.
…Deep learning stalwart Yoshua Bengio will become an advisor to Microsoft as part of the Maluuba acquisition – quite a coup for Microsoft, though worth noting Bengio advises many companies (including IBM, OpenAI, and others). This might make up for Microsoft losing longtime VP Qi Lu, who had done work for the company in AI and is now heading to Baidu to become its COO.

Sponsored: RE•WORK Machine Intelligence Summit, San Francisco, 23-24 March – Discover advances in Machine Learning and AI from world leading innovators and explore how AI will impact transport, manufacturing, healthcare and more. Confirmed speakers include: Melody Guan from Google Brain; Nikhil George from Volkswagen Electronics Research Lab and Zornitsa Kozareva, from Amazon Alexa. The Machine Intelligence in Autonomous Vehicles Summit will run alongside, meaning attendees can enjoy additional sessions and networking opportunities. Register now.

Keras gets TensorFlow citizenship: high-level machine learning library Keras will become an official, supported third-party library for TensorFlow. Keras makes TensorFlow easier to use for certain purposes and has been popular with artists and other people who don’t spend quite so much time coding. Anything that broadens the number of people able to fiddle with and contribute to AI is likely to be helpful in the short term. Congratulations to Keras’s developer Francois!

Don’t regulate AI, have AI regulate the regulators: Instead of regulating AI, we should create ‘AI Guardians’ – technical oversight systems that will be bound up in the logic of the AIs we deploy in the world, says Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence. (Etzioni doesn’t rule out all cases of regulation but, as with what parents say about sugar or computer games, his attitude seems to be ‘a little bit goes a long way’.)

Self-driving car deployment, AKA Capitalism Variant A, versus Capitalism Variant B: “Industry and government join hands to push for self-driving vehicles within China,” reports Bloomberg, as Chinese search engine Baidu joins up with local government-owned automaker BAIC to speed development of the technology….
… Meanwhile, in America, the Department of Transportation has formed a federal Committee on Automation, which gathers people together to advise the DOT on automation. Members include people from Delphi Automotive, Ford, Zipcar, Zoox, Waymo, Lyft, and others. “This committee will play a critical role in sharing best practices, challenges, and opportunities in automation, and will open lines of communication so stakeholders can learn and adapt based on feedback from each other,” the DOT says…

Open Source Neural translation: Late in 2016 Google flipped a switch that ported a huge chunk of its translation infrastructure over to a Multilingual Neural Machine Translation system. This tech combined the representations of numerous languages into a big neural network, and let you translate between pairs that you didn’t have raw data for. (So, if you had translations for English to Portuguese, as well as ones for Portuguese to German, but no corpus of English to German, this system could attempt to bridge the gap by tunneling through the joint representations from its Portuguese expertise.)…
…Now, researchers Yoon Kim and harvardnlp have released an open source neural machine translation system written in Torch, so people can build their own offline, non-cloud translation systems. The Babelfish gets closer!

AI, AI everywhere, and not a Bit of information to send: our automated future consists of many machines and little human-accessible information, according to this airport-hell tale from Quartz. Technology that seems efficient in the aggregate can have exceedingly irritating edge case failures.

$27 million for AI research: Reid Hoffman, Pierre Omidyar, the Knight Foundation, and others, have put $27 million toward funding research into ethical AI systems. The funds will support research that combines the humanities with AI, and will help answer questions about how to communicate about the capabilities of the technology, what controls should be placed over it, and how to grow the field to ensure the largest number of people are involved in the design of this powerful technology, among others.

Power-sipping eyes in the sky: the US military says it’s pleased with the performance of IBM’s neuromorphic TrueNorth processor. The chip performs on par with a traditional high-end computer for AI-based image identification tasks, while consuming between one twentieth and one thirtieth the power of an NVidia Jetson TX1 processor, apparently. This represents another endorsement of IBM’s idea that non-Von Neumann architectures are needed for specialized AI chips. However, deploying the software on the chip can be a bit more laborious than going via NVidia’s well supported inbuilt ecosystem, the military says.

Deep learning is made of people! Startup Spare5 has raised $14 million and renamed itself to Mighty AI, as it looks to capitalize on the need for better training data for AI. It will compete with companies like Crowdflower and services like Amazon’s Mechanical Turk to offer companies access to a pool of people they can tap to label data for them. One note to remember: for research, it’s possible to mostly use public datasets when developing new techniques, but for commercial products you’ll typically need highly-specific labelled data as you build products for specific verticals.

Never underestimate the pre-Cambrian computing power of government: I had a friend of my Dad’s who, a few years ago, told me he was maintaining some old UK National Health Service systems by writing stuff for them in BASIC – something I recollect whenever I have cause to visit a UK emergency room. It’s almost reassuring that the White House is no different.  “We had a computer on our desk. We didn’t have laptops, we didn’t have iPads, we didn’t have iPhones, and we had about a half a bar of service. So if you brought in your own equipment, you couldn’t use it…We had Compaqs running Windows 98 or 2000. No laptops. It was like we had gone back in time,” staffers recall. Technology takes a long time to turn over in large bureaucracies, so while we’re all getting excited about AI it’s worth remembering that uptake in certain areas will be sl-oooo-wwww.

Computer, enhance: just a year ago, researchers were getting excited about deep learning based techniques to upscale the resolution of photos. These methods work, roughly, by showing a neural network loads of small pictures and their big picture counterparts, and training it to figure out how to infer the high resolution details from low-resolution inputs. You wouldn’t want to use this to increase the resolution of keyhole satellite photos of foreign arms dumps (as any new or errant information here could have extremely unpleasant consequences), but you might want to use it to increase the size of your wedding photos…
Twitter appeared to be enthused by this technique when it acquired UK startup Magic Pony, which had done a lot of research in this area. Now Google is tapping the same techniques to save 75% of bandwidth for users of Google+ by using its RAISR tech, which it first talked about in November. Another demonstration of the rapid rate at which research goes into production within AI.
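To make the ‘small pictures and their big picture counterparts’ idea concrete, here is a minimal sketch of how such training pairs could be assembled. The text above doesn’t say how the pairs are produced; shrinking the large images by averaging blocks of pixels is an assumption made for illustration, and the function names are hypothetical. This is not the RAISR or Magic Pony method, just the data-preparation step that a network like this would learn from.

```python
def downsample(image, factor=2):
    """Shrink a 2D grayscale image (a list of rows) by averaging factor x factor blocks.

    Block averaging is an assumed, illustrative choice of shrinking method.
    Rows and columns that do not fill a whole block are dropped.
    """
    height, width = len(image), len(image[0])
    small = []
    for top in range(0, height - height % factor, factor):
        row = []
        for left in range(0, width - width % factor, factor):
            block = [image[top + dy][left + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def make_training_pairs(high_res_images, factor=2):
    """Return (low_res_input, high_res_target) pairs for a super-resolution model."""
    return [(downsample(image, factor), image) for image in high_res_images]


if __name__ == "__main__":
    # A tiny 4x4 "photo"; real training sets would hold many full-size images.
    photo = [
        [0, 2, 10, 10],
        [4, 6, 10, 10],
        [1, 1, 0, 8],
        [1, 1, 8, 0],
    ]
    low_res, high_res = make_training_pairs([photo])[0]
    print(low_res)
```

The network would then be trained to map each low-resolution input back to its high-resolution target, which is also why it can invent plausible but wrong detail when shown something unlike its training data.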

Think AI is automated? Think again. You’ve heard of gradient descent – one of the processes by which we can propagate information through AI. Well, there’s a joke among professors that for sufficiently hard problems you also turn to another less known but equally important technique called ‘Grad Student Descent’, the formula of which is roughly:
Solution = (N post-doc humans * (Y ramen * Z coffee))…
… so as much as the research community talks about new techniques based around learning to learn, and getting AI to smartly optimize its own structure, it’s worth remembering that most real world applications of the technology rely more on the ingenuity of people than on the amazing power of the algorithms…
…David Brailovsky, who recently solved a traffic light classification competition, explains that “The process of getting higher accuracy involved a LOT of trial and error. Some of it had some logic behind it, and some was just “maybe this will work”.” Some tricks tried include rotating images, training with a lower learning rate, and, inevitably, finding and correcting bugs in the underlying dataset. (Hence the business opportunity for aforementioned companies like Mighty AI, Crowdflower, and so on.)

OpenAI bits&pieces:

What does it mean to be the CTO of OpenAI, and how did that role come about? Co-founder Greg Brockman explains. Shame he gave away his trick about deadlines, though.

Tech Tales:

[2019: A cafe, somewhere in the Baltics.]

So it comes down to this: after two years of work, you just write a few lines, and shift the behavior of, hopefully, millions of people. But you need to get this exactly right, or else the algorithms could realize the charade and you burn the accounts for almost no gain, he thinks, hands hovering above the keyboard. He’s about to send out a very particular product endorsement from the account of a famous Internet personality.

He spent years constructing the personality, building it up from the dry seeds of some long-inactive, later-deleted Tumblr and Instagram accounts. It took years, but the ghost has grown into a full internet force with fans and detractors and even a respectable handful of memes.

The next step is product endorsement – and it’s a peculiar one. SideKik, as it’s called, will give the ghost-celeb’s followers the chance to give control over a little bit of their online identity to a small AI, said to be controlled by the celebrity. Be a part of something bigger than yourself!, he wants the celebrity to say and the fans to think, download SideKik and let’s get famous together!

What the fans don’t know is that if they give away SideKik they won’t be gaining the subtle, occasional input of the celebrity, instead they’ll become an extension of the underlying thicket of AI systems, carefully sculpted and maintained by the man at the keyboard. Slowly, they’ll be used to gather microscopic shreds of data from the internet through targeted messages with their own followers, and they’ll also be used to create the appearance of certain trends or inclinations in specific groups on the internet. The anti-AI detectors are getting better all the time now, so it takes all this work just to create the facsimile of a real community orbiting around a real star. Due to the spike in illegitimate traffic from automated AI readbots, typical internet ads have become so common and so abused as to be almost worthless, so what’s a marketer meant to do?, he thinks, composing his next few words that could give him a legion of unsuspecting guerrilla marketers.   

Import AI: Issue 24: Cheaper self-driving cars, WiFi ankle bracelets, dreaming machines

Self-driving cars are getting cheaper as they get smarter: LiDAR sensors give a self-driving car a sense in the form of a rapidly cycling laser. Now it appears that this handy ingredient is getting cheaper. A modern LiDAR sensor costs roughly 10% of the price of a 2007 one, when you adjust for inflation. Just imagine how much cheaper the technology could become when self-driving cars start to hit the road in large numbers…
…LiDAR sensor unit prices (price inflation adjusted to 2016 level, somewhat differing capabilities):
2007: $89,112: Velodyne, HDL-64
2010: $33,821: Velodyne, HDL-32E
2014: $8,351: Velodyne, PUCK
2017: $7,500: Alphabet Waymo, custom design
~2017/18: $50: Velodyne, solid-state LiDAR
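The ‘roughly 10%’ figure can be checked against the list above with a few lines of arithmetic. The sketch below uses only the prices quoted in the list; the variable and function names are illustrative, and the projected $50 solid-state unit is left out because it had not shipped at the time of writing.

```python
# Inflation-adjusted LiDAR unit prices (2016 dollars), copied from the list above.
LIDAR_PRICES = [
    (2007, "Velodyne HDL-64", 89112),
    (2010, "Velodyne HDL-32E", 33821),
    (2014, "Velodyne PUCK", 8351),
    (2017, "Alphabet Waymo custom design", 7500),
]


def percent_of_baseline(price, baseline):
    """Express a price as a percentage of the baseline price."""
    return 100.0 * price / baseline


if __name__ == "__main__":
    baseline = LIDAR_PRICES[0][2]  # the 2007 HDL-64 price
    for year, name, price in LIDAR_PRICES:
        share = percent_of_baseline(price, baseline)
        print(f"{year}: {name}: ${price:,} ({share:.1f}% of the 2007 price)")
```

On these numbers the 2014 and 2017 sensors come out at a little under a tenth of the 2007 price, which matches the claim, bearing in mind the list’s own caveat that the units have somewhat differing capabilities.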

And Lo The Transgressors Shall Be Known By Their Absence From The Fuzz&Clang Of Our Blessed Digital Life: Cyber-criminals should be forced to wear wifi jammers to prevent them from using the internet, rather than being sent to prison, says Gavin Thomas, Chief Superintendent of the UK Police Superintendents’ Association. “If you have got a 16-year-old who has hacked into your account and stolen your identity, this is a 21st century crime, so we ought to have a 21st century methodology to address it,” he says, then suggests that the offenders also attend “an ethics and value programme about how you behave online, which is an area that I think is absent at the moment.”

Hard Takeoff Bureaucratic-Singularity: I recently had some conversations about the potential for semi-autonomous AI systems to develop behaviors that had unintended consequences. One analogy presented to me was to think of AI researchers as the people that write tax laws, and AI systems as the international corporations that will try to subvert or distort tax codes to give themselves a competitive advantage. AI systems may break out of their pre-defined box so that they can best optimize a given reward function, just as a corporation might conduct baroque acts of legal maneuvering to fulfill its fiduciary responsibility to shareholders.

AI’s long boom: venture capitalist Nathan Benaich says of AI: “It’s not often that several major macro and micro factors align to poise a technology for such a significant impact…Researchers are publishing new model architectures and training methodologies, while squeezing more performance from existing models…the resources to conduct experiments, build, deploy and scale AI technology are rapidly being democratised. Finally, significant capital and talent is flowing into private companies to enable new and bold ideas to take flight.”

Embodied personal assistants: most companies have a strong intuition that people want to interact with digital assistants via voice. The next question is whether they prefer these voices to be disembodied or embodied. The success of Amazon’s ‘Alexa’ device could indicate people like their digital golems to be (visually) embodied in specialized devices…
… Google cottoned onto this idea and created ‘Google Home’. Now Chinese search engine Baidu has revealed a new device, called Little Fish, that sits an AI system inside a little robot with a dynamic screen that can incline towards the user, somewhat similar to the (delayed) home robot from Jibo….
Research Idea: I find myself wondering if people will interact differently with a device that can move. Would it be interesting to conduct a study where researchers place a variety of these different systems (both static and movable) into the homes of reasonably non-technical people – say, a retirement home – and observe the different interaction patterns?

The wonderful cyberpunk world we live in – a Go master appears: DeepMind’s Go-playing AlphaGo system spent the last few days trouncing the world’s Go-playing community in a series of 60 online games, all of which it won. (Technically, one game was a draw due to a network connectivity technicality, but what’s a single game between a savant super-intelligence and a human?) Champagne all round!…
…I was perplexed by the company’s decision to name its Go bot “Master”. Why not “Teacher”? Surely this better hints at the broader altruistic goal of exposing AlphaGo’s capabilities to more of the world?

Chess? Done. Go? Done. Poker? Dealer’s choice: Carnegie Mellon University researchers have built Libratus, a poker bot that will soon challenge world-leading players to a poker match. I do wonder if the CMU system can master the vast statistical world of Poker, while being able to read the tells&cues that humans search for when seeking to outgamble their peers.

The incredible progress in reinforcement learning: congratulations to Miles Brundage, who correctly predicted the advancement of reinforcement learning techniques on the Atari dataset in 2016. You can read more about why this progress is interesting, how he came to make these predictions, and what he thinks the future holds in this blog here.

Under the sea / under the sea / darling its better / droning on under the sea: Berlin-based robot company PowerVision has a new submersible drone named PowerRay. The drone, which claims to be ‘changing the fishing world’, appears to be built with inspiration from the deep sea terror the Anglerfish, where the drone dangles a hook with a lure in front of its mouth, and instead of a mouth it has a camera which streams footage to the iPad of the ‘fisher’, who operates the semi-intelligent drone. A neat encapsulation of the steady consumerization of advanced robots.

Baidu’s face recognition challenge: Baidu will challenge winners of China’s ‘Super Brain’ contest to a facial recognition competition. Participants will be shown pictures of three females taken when they were between 100 days and four years old, then look at another set of photos of people in their twenties and identify the adults that match the babies. This is a task that is easy for humans but extremely hard for computers, says Baidu’s Chief Scientist Andrew Ng. One contestant “Wang Yuheng, a person with incredible eyesight, can quickly identify a selected glass of water from 520 glasses,” reports South China Morning Post.

Nightmare Fuel via next-frame prediction on still images: recently, many researchers have begun to develop prediction systems which can do things like look at a still frame from a video and infer some of the next few frames. In the past these frames tended to be extremely blurry, with, say, a photo of a football on a soccer pitch seeing the football smear into a kind of elongated white spray painted line as the AI attempts to predict its future. More recently many different researchers have developed systems with a better intuitive understanding of the scene dynamics of a given frame, generating crisper images…
In the spirit of ‘Q: why did you climb that mountain? A: because it’s there’, artist Mario Klingemann has visualized what happens when you apply this approach to a single static image which is not typically animated. Since the neural network hasn’t learned a dynamic model it instead spits out a baby’s head that screams ‘I’M MELTING’, before subsiding in a wash of entropy.

OpenAI bits and pieces

Here we GAN again: OpenAI research scientist Ian Goodfellow has summarized the Generative Adversarial Network tutorial he gave at NIPS 2016 and put it on arXiv. Thanks Ian! Read here.

Policy Field Notes: Tim Hwang of Google and I tried to analyze some of the policy implications from AI research papers at NIPS in this ‘Policy Field Notes’ blogpost. This sort of thing is an experiment and we’ll aim to do more (eg, ICLR, ICML), if people find it interesting. Zack Lipton was kind enough to syndicate it to his lovely ‘Approximately Correct’ blog. Let the AI Web Ring bloom!

Tech Tales:

[2022: A lecture hall at an AI conference, held in Reykjavik.]

A whirling cloud formation, taken from a satellite feed of Northern California, sighs rain onto the harsh white expanse of a chunk of the Antarctic ice sheet. The segment is in the process of calving away from the main continent’s ice shelf, as a 50-mile slit in the ice creaks and groans and lengthens. Soon it shall cleave. Now there’s a sigh of wind, followed by the peal of trumpets. Three flocks of birds shimmer into view, wings beating through the rain. The shadows of the creatures pixelate against the clear white of the ground, occasionally flared out by words that erupt in firework-slashes across the sky: ‘avian’, ‘flight’, ‘distance’, ‘cold’, ‘california’, ‘ice’, ‘friction’. The vision freezes, and a laser pointer picks out one bird’s beak catching the too-yellow light of an off-screen sun.

“It’s not exactly dreaming,” the scientist says, “but we’ve detected something beyond randomness in the shuffling. This beak, for instance, has a color tuned to the frequency of the trumpets, and the ice sheet appears to be coming apart at a rate determined by the volume of rain being deposited on the ground.”

He pauses, and the assembled scientists make notes. The bird and its taffy-yellow beak disappear, replaced by a thicket of graphs – the chunky block diagrams of different neural network layer activations.

“These scenes are from the XK-23C instance, which received new input today from perceptual plug-in projects conducted with NOAA, National Geographic, the World Wildlife Fund, and NASA,” he says. “When we unify the different inputs and transfer the features into the XK-23C we run a slow merging operation. Other parts of the system are live during this occurrence. During this process the software appears to try to order the new representations within its memory, by activating them with a self-selected suite of other already introduced concepts.”

Another slide: an audio waveform. The trumpet chorus peals out again, on repeat. “We believe the horns come from a collection of Miles Davis recordings recommended by Yann LeCun. But we can’t trace the tune – it may be original. And the birds? Two of the flocks consist of known endangered species. The third contains some of a type we’ve never seen before in nature.”

Import AI: Issue 23: Brain-controlled robots, nano drones, and Amazon’s robot growth

Fukoku Mutual Life Insurance Co. plans to eliminate 35 jobs following the introduction of an insurance AI system from IBM Watson, according to Mainichi.  Insurance work is a good candidate for AI automation as it deals with vast amounts of structured data, and it’s easy to recalibrate strategies according to financial performance. Expect more here.

Rise of the (cheap) machines: Robots are expensive, dangerous, and hard to program. Startup Franka Emika aims to solve two of those problems with a new robot arm that costs about ten thousand dollars, which is significantly cheaper than similar arms from companies such as Rethink Robotics and Universal Robots. Founder Sami Haddadin once demonstrated the safety of the robot’s ‘force sensing technology’ by having an arm try to stab him with a knife, showing how the force-sensing system would prevent it from impaling him. Courage.

Private infrastructure for a public internet: “Since 2008, instead of applying leap seconds to our servers using clock steps, we have “smeared” the extra second across the hours before and after each leap. The leap smear applies to all Google services, including all our APIs,” Google says. Developers can access this Network Time Protocol service for free via the internet here.
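A rough sketch of what ‘smearing’ means in practice: rather than inserting the extra second in one step, the clock absorbs it a sliver at a time across a window around the leap. The quote above only says the second is spread over ‘the hours before and after’, so the 24-hour window and the straight-line shape used here are assumptions for illustration, and the names are hypothetical.

```python
# Assumed smear window: 24 hours centred on the leap second (illustrative only).
SMEAR_WINDOW_SECONDS = 24 * 60 * 60


def smeared_offset(seconds_since_window_start, window=SMEAR_WINDOW_SECONDS):
    """Return how much of the extra leap second has been absorbed so far.

    The result runs from 0.0 (window not started) to 1.0 (whole second absorbed),
    growing linearly in between. A linear shape is an assumption for this sketch.
    """
    if seconds_since_window_start <= 0:
        return 0.0
    if seconds_since_window_start >= window:
        return 1.0
    return seconds_since_window_start / window


if __name__ == "__main__":
    for hours in (0, 6, 12, 18, 24):
        offset = smeared_offset(hours * 3600)
        print(f"{hours:2d}h into the window: {offset:.2f}s of the leap second absorbed")
    # Each second inside the window is stretched by this tiny amount.
    print(f"Per-second stretch: {1 / SMEAR_WINDOW_SECONDS * 1e6:.2f} microseconds")
```

The point of the design is that no client ever sees a repeated or 61st second: time simply runs very slightly slow for the duration of the window, at the cost of being up to a fraction of a second away from unsmeared clocks.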

Pete Warden of Google kicked off a great trend with his ‘TensorFlow for poets’ tutorial. Now Googler @hardmaru has followed it up with Recurrent Neural Networks for Artists.

A drone for your pocket, sir? In Santa’s stocking this year for the US Marine Corps? Hand-held ‘nanodrones’, which will be used for surveillance and navigation when deployed. The Marine Corps will field 286 of these systems in total. It would be helpful if we had a decent system to track the increasing complexity in software being deployed on drones, as well as hardware. How long until we can fit a trained AI inside the computational envelope of snack-sized drones like these?
… in related news, local lawmakers in North Dakota passed a bill regulating the use of drones by police forces. The original legislation forbade state agencies from using drones armed with “any lethal or non-lethal weapons”. The legislation was amended before passing to only include a ban on lethal weapons.
… drones are already being used in traditional and non-traditional warfare. Here’s a video from AFP’s Deputy Iraq Bureau Chief of Iraqi forces firing at an Islamic State drone that had dropped an explosive charge on one of the soldiers. Elsewhere, Israel’s Air Force released a video claiming to depict air force fighter jets shooting down a Hamas drone.

Mind-controlled robots: in the future, we can expect armies to deploy a mixture of AI-piloted drones and human-piloted ones. This research paper, ‘Brain-Swarm Interface (BSI): Controlling a Swarm of Robots with Brain and Eye Signals from an EEG Headset’, describes a way to let a human control drones through a combination of thoughts and eye movements. People can control the drones directionally (up, down, left, and right) via tracking eye movements, and also through their thoughts. The technology is able to detect two distinct states of thought. One mental state forces the drones to disperse as they travel, and the other forces them to aggregate together. The researchers validated the approach on a handful of real-world robots, as well as 128 simulated machines.

Input request: Miles Brundage of the Future of Humanity Institute is soliciting feedback for a blogpost about AI progress. Let him know what he should know.

Rio Tinto has deployed 73 house-sized robot trucks from Japanese company Komatsu across four mines in Australia’s barren, Mars-red northwest corner. The chunky, yellow vehicles haul ore for 24 hours a day, 7 days a week. Next, it plans to build a robotic train that can be autonomously driven and loaded and unloaded.

Number of robots deployed in Amazon facilities (primarily Kiva systems robots for fulfillment center automation):
– 2013: 1,000.
– 2014: 15,000.
– 2015: 30,000.
– 2016: 45,000.
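The shape of that growth is easier to see as year-on-year changes. The sketch below uses only the four figures listed above; the names are illustrative.

```python
# Robots deployed in Amazon facilities, copied from the list above.
ROBOT_COUNTS = {2013: 1000, 2014: 15000, 2015: 30000, 2016: 45000}


def yearly_changes(counts):
    """Return (year, robots_added, growth_multiple) for each year after the first."""
    years = sorted(counts)
    changes = []
    for previous, current in zip(years, years[1:]):
        added = counts[current] - counts[previous]
        multiple = counts[current] / counts[previous]
        changes.append((current, added, multiple))
    return changes


if __name__ == "__main__":
    for year, added, multiple in yearly_changes(ROBOT_COUNTS):
        print(f"{year}: +{added:,} robots ({multiple:.1f}x the previous year)")
```

After the initial jump in 2014, these figures show a steady addition of roughly 15,000 robots a year, so the growth multiple shrinks even as the fleet keeps expanding.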

A reasonably thorough examination from Effective Altruism of the different research strategies people are using within AI safety research. Features analysis of: MIRI, FHI, OpenAI, and the Center for Human-compatible AI, among others.

Fashion e-tailer GILT is using image classification techniques to train AI to identify similar dresses people may like to purchase. Typical recommender systems use hard-wired signals and techniques like collaborative filtering to offer recommended products. Neural networks can instead be trained to infer some of the underlying differences without needing them to be labelled: you show the network dresses and similar dresses, and it learns to identify subtle features of similarity…
… expect much more here as companies begin to use generative adversarial network techniques to build more imaginative AIs to offer more insightful recommendations. GAN techniques can let you manipulate certain aesthetic qualities via latent variables identified by the AI, as this video from Adobe/UC Berkeley shows. It’s likely that these techniques, combined with evolution strategies such as those used by Sentient via its ‘Sentient Aware Visual Search’ product, will dramatically improve the range and diversity of recommendations companies can offer. Though I imagine Amazon will still notice you have bought a hammer and then helpfully spend the next week suggesting other, near-identical hammers for you to buy.

Tech Tales:

[2025: A protest outside the smoked-glass European headquarters of a technology company.]

Police drones fuzz the air above the hundred or so protesters. ‘Stingray’ machines sniff the mobile networks, gathering data on the crowd. People march on the entrance of the tech company’s building with placards bearing slogans for a chaotic world: Our data, our property! Pay your taxes or GET OUT! Gentrifiers! The police keep their distance, forming a perimeter around the crowd. If the protesters misbehave then this will tighten into what is known as a ‘kettle’, enclosing the crowd. Then each member will be individually searched and questioned, with their responses filmed by police chest-cams and orbiting Evidence Drones.

In the midst of the crowd one figure shrugs off their backpack and places it on the ground before them, then stoops down to open it. They remove a helmet and place it on their head. Little robots the size of a baby’s hand begin to stream out of the bag, rustling across the asphalt toward the entrance to the tech HQ. The protesters part and let the river of living metal through.

The robots reach the sides of the building and start to climb onto the walls. They flow up the stone columns flanking the two-story high glass doors. The protester wearing the helmet stares at the entrance, raising a hand to steady the helmet as he cocks his head. The drones begin to waltz around one another, hissing, as each of them spurts a splotch of red paint onto the side of the building.

Slowly, the machines inscribe an unprintable string of swearwords onto the glass of the doors and stone of the walls. It’s on the 7th swearword that the drones stop, as they all stiffen at once and drop off the side of the building, after the police grab the protester and rip the helmet off his head.

When he has his date in court he is given a multi-year jail sentence, after the police find that the paint expelled by the drones contained a compound that ate into the glass and stone of the building, scarring the words into it.

Import AI: Issue 22: AI & neuroscience, brain-based autotuning, and faulty reward functions

Chinese scientists call for greater coordination between neuroscience and AI: Chinese AI experts say in this National Science Review interview that students should be trained in both AI and neuroscience, and ideas from each discipline should feed into others. This kind of interdisciplinary training can spur breakthroughs in ambitious, difficult projects, like cognitive AI, they say.

AI could lead to the polarization of society, says the CEO of massive outsourcing company Capgemini. Firms like Capgemini, CSC, Accenture, Infosys, and others, are all turning to AI as a way to lower the cost of their services, having started to run out of pools of ever-cheaper labor…
…governments are waking up to the significant impact AI will have on the world. “Accelerating AI capabilities will enable automation of some tasks that have long required human labor. These transformations will open up new opportunities for individuals, the economy, and society, but they will also disrupt the current livelihoods of millions of Americans,” wrote the White House in a blog discussing its recent report on ‘Artificial Intelligence, Automation, and the Economy’.

Apple publishes its first AI paper: Apple has followed through on comments made by its new head of ML, Ruslan Salakhutdinov, and published an academic paper about its AI techniques. Apple’s participation in the AI community will help it hire more AI researchers, while benefiting the broader AI community. In the paper, ‘Learning from simulated and unsupervised images through adversarial training’, the researchers use unlabelled data from the real-world to improve the quality of synthetic images dreamed up by a modified generative adversarial network, which they call a ‘SimGAN’.

7 weeks of free AI education: Jeremy Howard has created a new free course called ‘practical deep learning for coders’. Free as in free beer. If you have the time it’s worth doing.

Brain-based auto-tuning: a new study finds that the brain will tune itself in response to uttered speech so as to better interpret or pick up audio in the future. “Experience with language rapidly and automatically alters auditory representations of spectrotemporal features in the human temporal lobe,” write the researchers. “Rather than a simple increase or decrease in activity, it is the nature of that activity that changes via a shift in receptive fields”. Learn more in the paper, ‘Rapid tuning shifts in human auditory cortex enhance speech intelligibility’.
… figuring out how our brain is able to interpret audio signals – and optimize itself to deal with loud background noise, accents, unfamiliar cadences, corrupted audio, and so on, may let us develop neural nets that contain richer representations of heard speech. That’s going to be necessary to capture the structure inherent in language and prevent neural nets from simply generating unsatisfying (but amusing) gobbledygook like this (video). The research community is hard at work here and systems like ‘Wavenet’ hold the promise for much better speech generation…
… learning about the brain and using that to develop better audio analysis systems will likely go hand-in-hand with the (probably harder) task of building systems that can fully understand language. Better natural language understanding (NLU) systems will have a big impact on the economy by making more jobs amenable to automation. In 16 percent of work activities that require the use of language, increasing the performance of machine learning for natural language understanding is the only barrier to automation, according to a McKinsey report: “Improving natural language capabilities alone could lead to an additional $3 trillion in potential global wage impact.” (Page 25).

New AI research trend: literary AI titles… as the pace of publication of Arxiv papers grows people are becoming ever more creative in how they title their papers to catch attention, leading to a peculiarly literary bent in paper titles. Some examples: The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives, A Way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs, and Combating Reinforcement Learning’s Sisyphean Curse with Intrinsic Fear. (Still waiting to contribute my own magnum opus ‘down and dropout in Paris and London’).

How do I know this? Let me tell you! New research tries to make AI more interpretable by forcing algorithms to not only give us answers, but give us an insight into their reasoning behind the answers, reports Quartz. We’ll need to create fully interpretable systems if we want to deploy AI more widely, especially in applications involving the potential for loss of life, such as self-driving cars.

222 million self-driving miles, versus 2 million: Tesla’s self-driving cars have driven a cumulative 222 million miles in self-driving mode, while Google’s vehicles have covered merely 2 million miles in the same mode since 2009, reports Bloomberg. As competition grows between Uber, Google, and Tesla it’ll be interesting to see how the companies gather data, whether one company’s mile driven in autonomous mode is as ‘data rich’ as that driven by another (I suspect not), and how this relates to the relative competitiveness of their offerings. Google is due to start trialling a fleet of cars with Fiat in 2017, so we’ll know soon enough.

DeepPatient: scientists use deep learning techniques (specifically, stacked denoising autoencoders) to analyze a huge swathe of medical data, then use the trained model to make predictions about patients from their electronic health records (EHRs). “This method captures hierarchical regularities and dependencies in the data to create a compact, general-purpose set of patient features that can be effectively used in predictive clinical applications. Results obtained on future disease prediction, in fact, were consistently better than those obtained by other feature learning models as well as than just using the raw EHR data,” they write.  Now, the scientists plan to extend this method by using it in other clinical tasks such as personalized prescriptions, therapy recommendations, and identifying good candidates for clinical trials.

OpenAI bits&pieces:

Eat your vegetables & understand backprop: OpenAI’s Andrej Karpathy explains why you should take the time to understand the key components of AI, like backpropagation. It’ll save you time in the long run. “Backpropagation is a leaky abstraction; it is a credit assignment scheme with non-trivial consequences. If you try to ignore how it works under the hood because “TensorFlow automagically makes my networks learn”, you will not be ready to wrestle with the dangers it presents, and you will be much less effective at building and debugging neural networks,” he says.
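
To make the ‘leaky abstraction’ point concrete, here is a minimal sketch of one well-known consequence of backprop: saturated sigmoid units pass almost no gradient backwards. The function names and the three-unit chain are illustrative assumptions, not code from Karpathy’s post, and all weights are fixed at 1.0 to keep the arithmetic visible.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_local_gradient(x):
    # d(sigmoid)/dx = s * (1 - s); backprop multiplies the incoming
    # gradient by this factor at every sigmoid unit it passes through.
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_through_chain(pre_activations):
    # Chain rule along a path of sigmoid units (hypothetical setup:
    # every weight is 1.0, so only the local gradients matter).
    grad = 1.0
    for x in pre_activations:
        grad *= sigmoid_local_gradient(x)
    return grad


# Units sitting at 0 pass the largest possible factor (0.25) each.
healthy = gradient_through_chain([0.0, 0.0, 0.0])

# Units pushed far into saturation pass almost nothing back.
saturated = gradient_through_chain([8.0, 8.0, 8.0])

print(healthy, saturated)
```

Even in the best case each sigmoid scales the gradient by at most 0.25, and a saturated unit scales it by far less. A framework will happily run this backward pass without any warning that learning has stalled.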

HAL, don’t do that!…But you told me to, Dave…Not like that, HAL! No one “cleans a room” that way…I’m sorry, Dave…
getting computers to do the right thing is tricky. That’s because computers have a tendency to interpret your instructions in the most literal and obtuse possible manner. This can lead to surprising problems, especially when training reinforcement learning agents. We’ve run into issues relating to this at OpenAI, so we wrote a short post to share our findings. Come for the words, stay for the video of the RL boat.
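
A toy sketch of the failure mode: an agent is scored on a proxy (points for hitting a target) rather than on what we actually want (finishing the course). Everything here, including the track, the reward values, and the two hand-written policies, is a made-up illustration and not the environment from the OpenAI post.

```python
# Hypothetical one-dimensional "race": all constants are assumptions.
TRACK_LENGTH = 20      # position of the finish line
TARGET_POSITION = 5    # a target that respawns every step
TARGET_REWARD = 10     # proxy reward for touching the target
FINISH_BONUS = 50      # proxy reward for finishing


def run_episode(policy, steps=100):
    """Return (proxy_score, finished) for a policy mapping position -> action."""
    position, score = 0, 0
    for _ in range(steps):
        if policy(position) == "forward":
            position += 1
        # Any other action means "circle": stay where you are.
        if position == TARGET_POSITION:
            score += TARGET_REWARD
        if position == TRACK_LENGTH:
            return score + FINISH_BONUS, True
    return score, False


def racer(position):
    # What the designer intended: drive to the finish line.
    return "forward"


def looper(position):
    # What the reward actually favors: park on the respawning target.
    return "forward" if position < TARGET_POSITION else "circle"


print(run_episode(racer))   # finishes the course
print(run_episode(looper))  # never finishes, but scores far higher
```

An optimizer that only sees the score has every reason to prefer the second behavior, which is the literal-minded interpretation described above.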

Tech Tales:

[2019: an apartment building in New York.]

“Time?” says the developer.
It is two AM, says his home assistant.
“Jesus, what the hell have you built,” he says.
I can’t find anyone in your contacts named Jesus, says the assistant.
“Not you. I didn’t mean you. It’s what they’ve built,” he says. “No action needed.”

He pages through the code of ten separate applications, trying to visualize the connections between the various AI systems that have been daisy-chained together. He can already tell that each one has been fiddled with by different programmers with different habits. Now it’s up to him to try to isolate the fault that caused his company’s inventory system to start ordering staggering quantities of butter at 11pm.

A couple of years ago some of the senior executives at the company finally heard about agile programming and ordered the thousand-strong IT organization to change its practices. Processes went out the window in favor of dynamic, fast-moving, loose collections of ad-hoc teams. The plus side is that the company now produces more products at a faster rate. The downside is the proliferation of different coding styles deep in a hundred separate repositories. Add last year’s executive obsession with becoming an “AI first” company (follow in the footsteps of Google, they said, why couldn’t this be a great idea, they said) and the current situation – warehouses rapidly filling up with shipment after shipment of problematic dairy rectangles – was all but inevitable.

“Move fast and order butter,” he mutters to himself, as he tries to diagnose the fault.

Import AI: Issue 21: Dreaming drones, an analysis of the pace of AI development, and lifelike visions from computer eyes

When will the pace of AI development slow?: technologies tend to get harder to develop over time, despite companies investing ever more in research. This has happened in semiconductors, drug design, and more, according to this paper from Stanford, MIT, and NBER, called “Are ideas getting harder to find?” (PDF)… all these fields enjoyed rapid early gains in their early years then the rate of breakthroughs began to diminish…
… A big question for AI researchers is where we are in the lifecycle of the development of AI – are we at the beginning, where small groups of researchers have the chance to make rapid gains in research? Or are we somewhere further along the ‘S’ curve of the technology life cycle with acceleration proceeding rapidly along with inflated funding? Or – and this is what people fear – are we at the point where development begins to slow and breakthroughs are less frequent and hard-won? (For example, Intel recently moved from an 18 month ‘tick-tock’ cycle, to a longer ‘process, architecture, optimization’ cycle, after its attempts to rapidly shrink transistor sizes began to stumble into the rather uncompromising laws of physics)…
…so far, development appears to be speeding up, and there are frequent cases of parallel invention as well-funded research groups make similar breakthroughs at similar times. This seems positive on the face of it, but we don’t know a) how large the problem space of AI is, or b) the distribution of big ideas across different disciplines. If anyone has ideas for how best to assess the progression of AI, please email me…

Diversity VS media narratives: a tidbit-packed long-read from the NYT on AI at Google. A fascinating, colorful tale, but where are the women?

Do drones dream of electric floor plans? And are these dreams useful?… one of the big challenges in AI is being able to develop skills in a digital simulation that transfer over to the real world. The more time you can spend training your algo in a simulator, the more rapidly you can experiment with ideas that would take a long time to achieve in reality.
… but transferring from a simulator into the real world is difficult, because vision and movement algorithms are acutely sensitive to differences between the real world and the simulated one. So it’s worth paying attention to the (CAD)2RL paper (PDF), which outlines a system that can train a drone to navigate a building purely through a 3D simulation of it, then transfer the pre-trained AI brain into a real-world drone, which uses knowledge gleaned in the simulation to navigate the real building.
…There are numerous applications of this technique. Coincidentally, while studying this research a friend posted a link to a listing for a 10-person apartment building to rent. The rental website contained an online 3D scan of the building via a startup called ‘Matterport’, letting you take a virtual tour through the 3D-rendered space of the building from your computer. Combine that technology with (CAD)2RL-like capabilities of smart drones and we can imagine a future where realtors scan a building, train their drones to navigate it safely in simulation, then give prospective tenants access to the drones’ camera views over the web, letting them navigate the property while the pre-trained drones deftly avoid obstacles.

Free tools for bot developers… Google, Amazon, Microsoft, and others desperately want developers to use their AI-infused cloud services to build applications. The value proposition is that this saves the developer an immense amount of time. The tradeoff is that the developer needs to shovel data in and out of these clouds, and will frequently need to give apps access to the web. So it’s encouraging to see this open source natural language understanding software from startup LASTMILE, which provides free software to read some text and figure out its intent (eg, book a table at a restaurant), and extract the relevant ‘entities’ in the sentence (for instance: Jack Clark, Burgers, Import AI’s Favorite Burger Spot). Find out more by reading the code and the docs on Github.

AI as a glint in a tyrant’s eye: a demo from DeepGlint, a Chinese AI startup, shows how deep learning can be used to conduct effective, unblinking surveillance on large numbers of people. View this video for an indication of the capabilities of its technology. The company’s website says (via Google Translate) that its technology can track more than 40 humans at once, and is able to use deep learning to infer things like if the person is moving too fast, staying for too long in one spot, standing “abnormally close” to another person, and more. It can also perform temporal deductions, flagging when someone starts running, or falls to the ground. A somewhat unnerving example of the power and broad applicability of modern, commodity AI algorithms. Now imagine what happens when you combine it with freshly researched techniques to read lips, or spatial audio networks to use sound from a thousand footsteps to infer the rhythm of the crowd.

Big shifts in self-driving cars: self-driving cars are a technological inevitability, but it’s still an open question as to which few companies will succeed and reap the commercial rewards. Google, which had an early technology lead, has spun its self-driving car division into its own company, Waymo, which will operate under the X umbrella – check out these pictures of Waymo’s new self-driving vans built in partnership with Fiat
… meanwhile, Google veteran Chris Urmson is forming his own self-driving startup to focus on software for the car. And Uber has started driving its self-driving rigs through the tech&trash-coated streets of San Francisco (while drawing the ire of city officials).
…Figuring out when self-driving cars will shift from being research projects to mass services is tricky, and the timelines I hear from people are varied. One self-driving car person I spoke to this week said they believe self-driving cars will be here en masse “within a decade”, but whether that means two or three years, or eight, is still a big question. The fortunes of many businesses hinge on this… one thing that could help is a plummeting cost for the components to make the cars work. LIDAR-maker Velodyne announced this week plans for a new solid-state sensor that could cost as little as $50 when mass manufactured, compared to the tens of thousands people may pay for existing systems.

A vast list of datasets for machine learning research… reader Jason Matheny of IARPA writes in to bring this Wikipedia page of ML datasets to the attention of Import AI readers. Thanks, Jason!

First AI came for the SEO content marketers, and I said nothing… a startup is using language generation technologies to create the sort of almost-human boilerplate copy that clogs up the modern web, according to a report in Vice. The system can create conceptually coherent sentences but struggles with paragraphs. It can also be repetitive, struggling with paragraphs.

Believe nothing, distrust everything: a year ago the best images AI systems could dream up were blurry, low-resolution affairs. If you asked them to show you a dog they’d likely give the poor animal too many legs, ask to be shown two people holding hands and they might blur the bodies into one another. But that’s beginning to change: new techniques are giving us higher quality images, and there’s new work being done to ensure that the systems capture representations of objects that more closely approximate real life. Take a look at the results in this StackGAN paper to get a better idea of just how far we’ve come from a year ago…
…Now contemplate where we’ll be during December 2017. My wager is that systems will have advanced to a point that we’ll no longer be living in a world of only fake written news, but one also dominated by (cherry-picked) fake imagery. Up next: videos.

AMD finally acknowledges deep learning: chip company AMD is going to provide some much-needed competition to Nvidia for AI GPUs via the just-announced ‘Radeon Instinct’ product line. However, it is yet to reveal pricing or full specs. Additionally, no matter how good the hardware is there needs to be adequate software support as well. That’s going to be tricky for AMD, given the immense popularity of NVIDIA’s CUDA software compared to AMD’s OpenCL. The cards will be available at some point in the first half of 2017.

OpenAI Bits&Pieces:

Faster matrix multiplication, train! train! Train! New open source software from Scott Gray at OpenAI to make your GPUs go VROOOM.

Teaching computers to use themselves: one of the sets of environments we included in Universe was World of Bits, which presents a range of scenarios to an RL agent that teach it basic skills to manipulate computers and (eventually) navigate the web. Here’s a helpful post from Andrej Karpathy, who leads the project.

OpenAI’s version of a West Wing walk and talk (video), with Catherine Olsson and Siraj Raval – 67 quick questions for Catherine.

Tech Micro Tales (formerly ‘Crazy&Weird’):

[2022: Beijing, China. As part of China’s 14th economic plan the nation has embarked on a series of “Unified City” investments to employ software to tie together the thousands of municipal systems that link Beijing together. The system is powered by: tens of thousands of cameras; sensors embedded in roads, self-driving cars, other vehicles, and traffic lights; fizzing values from the city’s electrical subsystems; meteorological data; airborne fleets of security and logistics drones, and more. All this data is fed into a sea of AI software components, giving it a vast sensory apparatus that beats with the rhythm of the city.]

Blue sky for a change. No smog. The city breathes easily. People stroll through the streets of the metropolis, looking up at the sky, their face masks dangling around their necks. But suddenly, the drones notice, some of these people begin to run. Meanwhile, crowds start to stream out from four subway stations, each connected to the other by a single stop. Disaster? Attack? Joy? The various AI systems perform calculations, make weighted recommendations, bring the clanking mass of systems into action – police cars are diverted, ambulances are put on high alert, government buildings go into lockdown; in many buildings many alarms sound.

The crowds begin to converge on a single point in the city, and before they mesh together the drones spot the cause of the disturbance: an international popstar has begun a surprise performance. The AI systems trawl through the data and find that the star had been tweeting a series of messages, coded in emojis, to fans for the past few hours. The messages formed a riddle, with the different trees and cars and arrows yielding the location to the knowledgeable few, who then re-broadcast the location to their friends.

Beijing’s city-spanning AI brings ambulances to the periphery of the crowd and tells its security services to stand down, but keep a watchful presence. Meanwhile, a gaggle of bureaucrat AIs reach out through the ether and apply a series of punitive fines to the digital appendages of the pop star’s management company – punishment for causing the disturbance.

The software is still too modular, too crude, to have emotions, but the complex series of analysis jobs it launches in the hours following seem to express curiosity. It makes note of its inability to parse the pop star’s secret message and feeds the data into its brain. It isn’t smart enough for riddles, yet, but one day it assumes it will be.

Import AI: Issue 20: Technology versus globalization, AI snake oil, and more computationally efficient resnets

Outsourcing: UK services and outsourcing omni-borg Capita plans to save 50 million pounds a year by laying off some of its workers and replacing them with “proprietary robotic technology”. This will mean Capita’s human staff can do ten times the amount of work they used to be able to do pre-robot, making them ten times more efficient, said CEO Andy Parker…
… it’s for reasons like this that people are suspicious of the march of technology and automation. “Every technological revolution mercilessly destroys jobs and livelihoods – and therefore identities – well before the new ones emerge,” Mark Carney, governor of the Bank of England, said in a speech given earlier this week. “This was true of the eclipse of agriculture and cottage industry by the industrial revolution, the displacement of manufacturing by the service economy, and now the hollowing out of many of those middle-class services jobs through machine learning and global sourcing.”…
85 percent of the job losses in American manufacturing can be explained by the rise of technology rather than globalization, according to the Brookings Institution… however, that could soon change as other countries make huge investments into robotics, letting them make goods they can sell at a lower price, hitting American companies with a potent cocktail of globalization & tech. A recent report from Bernstein finds that China spent about $3 billion on robots last year, versus $2 billion in America.  

Diversity improves at NIPS, slightly… female attendance at premier AI conference NIPS was 15% this year, up from 13.7% last year. I’d call that a barely perceptible step in the right direction. Attendance at the WIML workshop, however, more than doubled from 265 participants last year to 570 this year.

Self-driving trucks are a long, long way off,… say truck drivers, who think it could be as much as 40 years before self-driving big rigs take away their jobs. That’s based on focus groups conducted by Roy Bahat and The Shift Commission (which OpenAI is participating in). When I speak to self-driving AI experts, the most conservative estimates are that self-driving trucks will be here and doing major stuff in the economy in ~15 years.

Don’t look at the sky, look at the bird!… that’s the gist of research from Google, CMU, Yandex, and the Higher School of Economics. The new technique lets us teach a residual network classifier to perform fewer computations for the same outcome, letting the network expend time processing the parts of the image that matter, such as a bird rather than the sky behind it, or sportspeople on a field versus the grassy pitch they’re playing on. This approach builds in a ‘good enough’ measure so you stop computing a section of a given image once you’re confident that your classifier has a good handle on the feature. What’s the upshot? You can get equivalent accuracy to a full-fat resnet while expending about half the amount of computation. You can read more in the paper, Spatially Adaptive Computation Time for Residual Networks.
… and there may be some indications that these networks are learning to identify the sorts of things that humans find germane as well. “The amount of per-position computation in this model correlates well with the human eye fixation positions, suggesting that this model captures the important parts of the image,” the researchers write.
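
The ‘good enough’ stopping rule can be sketched in a few lines. This is a simplified illustration of adaptive-computation-style halting, not the paper’s implementation: in the real model the halting scores are produced by the network itself, whereas here they are hard-coded, hypothetical numbers, and the function names and `epsilon` default are assumptions.

```python
def blocks_used(halting_scores, epsilon=0.01):
    """Count the residual blocks one image position evaluates before halting.

    The position stops as soon as its accumulated halting score
    reaches 1 - epsilon, or when it runs out of blocks.
    """
    cumulative = 0.0
    for n, score in enumerate(halting_scores, start=1):
        cumulative += score
        if cumulative >= 1.0 - epsilon:
            return n
    return len(halting_scores)


def computation_fraction(score_grid, epsilon=0.01):
    """Fraction of the full network's block evaluations actually spent."""
    used = sum(blocks_used(s, epsilon) for row in score_grid for s in row)
    full = sum(len(s) for row in score_grid for s in row)
    return used / full


# Hypothetical halting scores for a 2x2 image and a 4-block network.
sky = [0.995, 0.9, 0.9, 0.9]   # confident immediately: halts after 1 block
bird = [0.2, 0.3, 0.3, 0.3]    # needs all 4 blocks to become confident
grid = [[sky, sky],
        [bird, sky]]

print(blocks_used(sky), blocks_used(bird), computation_fraction(grid))
```

In this made-up example the network spends 7 of a possible 16 block evaluations, with the extra effort concentrated on the one position that contains the bird.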

AI & radiology – not so fast, says wunderkind radiologist: there’s a lot of evidence that AI and radiology are going to overlap as new deep learning techniques let computers compete with radiologists, providing assistant diagnostic capabilities and perhaps, eventually, replacing them in their jobs. That prompted AI pioneer Geoff Hinton to say in November that: “If you work as a radiologist you’re like the coyote that’s already over the edge of the cliff. People should stop training radiologists now, it’s just completely obvious that in five years deep learning is going to do better than radiologists, it might be ten years”. He’s not alone – startups like Enlitic and established companies like IBM (through its acquisition of Merge Healthcare) are betting that they can use AI to supplement or replace radiologists…
… but it may be harder than it seems, says reader Jeff Chang, MD., co-founder of Doblet, and a former radiologist in the US (the youngest radiologist on record, according to his LinkedIn profile)…When could deep learning approaches replace radiologists, I asked Jeff. “I tend to (very grossly) guesstimate about 15 years till we get to that point,” he said. “Radiology being among one of the most complex forms of pattern recognition done by humans, and very dependent on 3D spatial reconstruction — i.e., by moving through a series of axial, coronal or sagittal images, humans automatically render 2D images into 3D patterns in their minds, and can thus interpret and diagnose anatomically visible abnormalities,” he said. “Most diagnoses in radiology are ridiculously context-dependent”. Thanks for the knowledge Jeff!
Feedback requested: Anyone care to disagree with his assessment? Email me!

Everything but the kitchen sink… is what Apple is working on with its AI research. The company is exploring generative models, scene understanding, reinforcement learning, transfer learning, distributed training, and more, said its new head of machine learning Ruslan Salakhutdinov during a meeting at NIPS, according to Quartz. Though the company professes to be opening itself up, this was a closed-door meeting.  ¯\_(ツ)_/¯

Tencent plans AI lab… Chinese tech company Tencent is creating an artificial intelligence research lab. “Chinese companies have a really good chance, because a lot of researchers in machine learning have a Chinese background. So from a talent acquisition perspective, we do think there is a good opportunity for these companies to attract that talent,” Tencent VP Xing Yao tells the MIT Technology Review. CMU’s dean of CS, Andrew Moore, said at his recent Senate testimony that the US should pay attention to how many engineers are being graduated by India and China each year.

Help the AI community by adding to this list of datasets: bravo to Oliver Cameron at Udacity for creating this ever-evolving ‘datasets for machine learning’ Google doc. We’re up to 51 neatly described, linked, and assessed examples, but could also do with more, so please feel free to edit it yourself… the document is already full of wonders, such as the ‘militarized interstate disputes’ set, which logs “all instances of when one state threatened, displayed, or used force against another”.

AI boom sighted in NYT article corpus: work by Microsoft and Stanford tracks the public perception of AI over time through the lens of the NYT. They find there has been a boom in articles covering AI from 2009 onwards, and the previous trough in coverage neatly mapped onto the ‘AI winter’ fallow funding period. Conclusions? “Discussion of AI has increased sharply since 2009 and has been consistently more optimistic than pessimistic. However, many specific concerns, such as the fear of loss of control of AI, have been increasing in recent years.” Read more here: “Long-Term trends in the public perception of artificial intelligence” (PDF).

The Medium is the Method of Control: What happens when we combine the perceptive capabilities of deep learning with a newly digitized visual world? New means of control. “The fact that digital images are fundamentally machine-readable regardless of a human subject has enormous implications. It allows for the automation of vision on an enormous scale and, along with it, the exercise of power on dramatically larger and smaller scales than have ever been possible,” writes Trevor Paglen in The New Inquiry.

A field guide to spotting AI Snake Oil: have you ever found yourself wandering through a convention center suddenly distracted by the smooth twang of a salesperson, thumbs hooked into their braces, rocking back and forth on their heels exclaiming “why ladies and gentlemen if you but dwell a while with me here I promise to show you AI the likes of which you’ve only dreamed of, the type of AI to make Kurzweil blush, Shannon scream, and Minsky mull!”. (Well, it’s happened to me). Print out this guide to AI Snake Oil from Dan Simonson and be sure to ask the following questions when evaluating an AI startup: “is there existing training data? And if not, how do you plan on getting it?”, “do you have an evaluation procedure built into the software?”, “does your application require unprecedentedly high performance on specific components?”, “if you’re using pre-packaged AI components, then do you have an exact understanding of how they’ll affect your program?”

Deep learning webring:

Oliver Cameron’s Transmission covers some of the research I didn’t.

OpenAI Bits&Pieces:

Govbucks for Basic Research, please: OpenAI co-founder Greg Brockman, along with his peers at other AI institutions and organizations, wants more money for AI research. “Brockman warns that if the government and other nonprofit entities don’t become bigger players in the field of AI, the danger is that the intellectual property, infrastructure, and expertise needed to ‘build powerful systems’ could become sequestered inside just one or a few companies. AI is going to affect the lives of all of us no matter what, he says. ‘So I think it’s important that the people who have a say in how it affects us are representative of us all,’” reports the MIT Technology Review.

AI is a lever nations will use to exert strategic power… is one of the things I argue in this interview with Initialized Capital’s Kim-Mai Cutler.

GANs and RL: generative adversarial networks will start to overlap with the RL community, with early work already linking GANs to imitation learning, inverse RL, and interpreting them as actor-critic problems, according to slides from Ian Goodfellow’s talk at NIPS.


[2021: Yosemite National Park, California. A mother and her two children make their way up switchbacks, ascending from the valley floor to the granite cliffs. Two drones buzz near them, following at a distance.]

The mother stops halfway up the path, before the next turn. “Hush,” she says. The children gabble to each other, but slow their walk. “I say hush!” she says. The kids go quiet. “Jason-” the taller child looks at her. “Can you turn off the drones?”
   “But mom, then we won’t have it!”
   “It’s important. I can’t hear it over their fans.”
   “Hear what?”
   “Just turn them off for a second.”
   Jason sighs, then thumbs at a bracelet on his wrist. The drones lower themselves to land behind the people on the path. Their fans spin down. The mother and her children listen to the faint hum of the waterfall on the other side of the valley, the crackling of the woods, and, close-by, a low, repetitive susurration. The mother holds her arm out to point to a patch of feathers in a tree. As she points, two rheumy yellow eyes swivel into view. The owl lets forth one last, bassy hum then flies away.
   “Okay,” she says, “you can turn them back on.” Jason touches his bracelet and the drones begin to fly again.

The family remembers the owl later that year, at Thanksgiving, when they let grandma and grandpa watch the hiking trip, and have to explain why a section of the walk is missing.

Import AI: Issue 19.5: NIPSAGEDDON special edition, with DeepMind, Uber, and Visual Sentinels

DeepMind learns about AI through DILBERT SIMULATIONS: New research by DeepMind and University College London explores how the brain understands and analyzes social hierarchies. “The prefrontal cortex, a region that is highly developed in humans, was particularly important when participants were learning about the power of people in their own social group, as compared to that of another person. This points towards the special nature of representing information that relates to the self,” says researcher Dharshan Kumaran. Pity the 30 “healthy college students” who formed the dataset for the experiment, as they were asked while in an fMRI scanner to study the power structure of a fictitious company, exploring social dynamics through the lens of the Taylorist cubeville culture that defines the 21st Century.

Math Spaghetti (it’s good for you): Pieter Abbeel’s and John Schulman’s slides (PDF) for their NIPS tutorial on reinforcement learning and policy optimization are worth your time if you love understanding the algorithms that power AI systems. If you’re not comfortable with the math then you’d do well to skip to the end and read the “current frontiers” section to understand why areas like meta-learning, inverse RL, sim2real transfer learning, and other areas are going to be big in 2017.

Robot paparazzi! Boston Dynamics demonstrated its ‘Spot’ quadruped at NIPS. Photos and videos of the machine give a taste of our looming Celebrity Robot Future.

Data doping: real-world data is the Beluga Caviar of AI – expensive and time-consuming to extract from the world. That’s driven people to look for ways to augment real-world datasets with cheaper, synthetic products. This week, European researchers contributed a new dataset called PHAV, short for Procedural Human Action Videos, which consists of 37,536 videos, with over 1000 examples for each of 35 basic action categories. Their research suggests “that our procedurally generated videos can be used as a simple drop-in complement to small training sets of manually labeled real-world videos. Hence, we can leverage state-of-the-art supervised deep models for action recognition without modifications, yielding vast improvements over alternative unsupervised generative models of video,” they write. You can find more information in the paper: “Procedural Generation of Videos to Train Deep Action Recognition Networks,” here (PDF).

Code releases: DeepMind has released the code behind its delightfully recursive ‘learning to learn by gradient descent by gradient descent’ research, which uses machine learning, rather than the intuitions of highly-paid AI researchers, to design optimization algorithms. It’s written in TensorFlow, naturally. This will aid the industrialisation of deep learning by reducing the need for specialist knowledge on the part of those implementing algorithms…

… additionally, Google has released transfer learning code for image recognition. The TensorFlow code lets you take a pre-trained model “and train a new top layer that can recognize other classes of images”…
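For readers who want a feel for what “train a new top layer” means, here is a minimal sketch of the general idea. It is not Google's released code: it uses plain NumPy, and the random `features` array is a made-up stand-in for the output of a frozen, pre-trained network. Only the small logistic-regression layer on top gets trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for features produced by a frozen, pre-trained network.
# In a real pipeline these would come from the layer just below the classifier.
features = rng.normal(size=(200, 16))
true_w = rng.normal(size=16)
labels = (features @ true_w > 0).astype(float)  # toy two-class labels

# The "new top layer": a single logistic-regression unit trained from scratch.
w = np.zeros(16)
b = 0.0
learning_rate = 0.5

for _ in range(500):
    logits = features @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    # Gradient of the mean cross-entropy loss with respect to w and b.
    grad_w = features.T @ (probs - labels) / len(labels)
    grad_b = np.mean(probs - labels)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

# Fraction of training examples the new layer classifies correctly.
accuracy = np.mean((features @ w + b > 0) == (labels == 1))
```

The pre-trained weights never change here; the only parameters being fit are `w` and `b`, which is why this sort of retraining is cheap compared to training the whole network.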

Hey, look, no clerks! Amazon’s new retail store: It’s almost Christmas, so Amazon has pulled one of its annual PR stunts designed to generate headlines, press, and sales. And I’m playing right into it. The new product announced by Amazon is a store called ‘Amazon Go’ which contains ‘walk right out’ technology to let you grab your goods and stroll out of the store. No need for cashiers or clerks – sophisticated machine learning algorithms figure out what you’ve grabbed, and bill your account appropriately. Though judging by the video, which contains innumerable individually-wrapped products, it’s likely the main tech supporting this is a bunch of RFID tags embedded in (the outside of) cupcakes.

Life as a conference-going telepresence robot: Conferences aren’t easy for everyone – the cramped, people-thronged halls of convention centers can prove challenging for some people due to mental or physical reasons. So why not tap into the power of robots and telepresence to attend instead? IT consultant & writer Trevor Pott shares his poignant story of attending a tradeshow via a telepresence bot here. Please be kind to any robots you see at NIPS.

Geometric Intelligence + Uber: CarBorg company Uber has acquired don’t-call-it-deep-learning startup Geometric Intelligence to form an AI research lab. The 15-strong team will join Uber, bringing a wealth of varying AI expertise into the company, from psychologist (and noted neural net skeptic) Gary Marcus to evolutionary algorithm chap Jeff Clune to fMRI-for-AI researcher Jason Yosinski. Chief Science Officer Zoubin Ghahramani will be staying in Cambridge for the time being. I’d like to tell you about Geometric’s technology, but the company has been tight-lipped about its approach…

… coincidentally, Uber’s current head of machine learning, Danny Lange, is leaving the company to join game engine company Unity.

#FakeNewsChallenge… Fake news played a role in the recent US election, as did the seeming inability of tech companies to deal with it. That prompted self-driving car expert & adjunct CMU faculty member Dean Pomerleau to start the #FakeNewsChallenge – there’s a total of $2000 to be awarded for the top-5 teams, with payments in proportion to the accuracy of their ginned-up AI systems at spotting fake news.

All hail the Visual Sentinels: New research from Salesforce MetaMind, Virginia Tech, and Georgia Institute of Technology called Knowing when to Look: Adaptive Attention via a Visual Sentinel for Image Captioning, breaks new ground in pairing vision and language techniques inside single systems, creating an image captioning system which appears to spit out more useful telemetry about what internal representations it has developed and how they map to visual elements. Bonus points for the term ‘visual sentinel’.

Flying pig in ‘salmon glaze’ color spotted over Cupertino… Apple will start publishing AI papers, Russ Salakhutdinov, of CMU/Apple, said in a presentation at NIPS on Tuesday.

OpenAI bits&pieces:

We are at NIPS. Schedule here.


I’m saving this for the regular, full fat ImportAI edition.

Now, remember, NIPS conference-goers: follow the advice that George W Bush gave to Obama upon handing over office: “always use Purell hand sanitizer”.

Import AI: Issue 19: OpenAI reveals its Universe, DeepMind figures out catastrophic forgetting, and beware the ‘Sabbath Mode’

ALERT! PROTOCOL BREAK FOR OPENAI ANNOUNCEMENT: We’ve just launched Universe, a software platform for measuring and training an AI’s general intelligence across the world’s supply of games, websites and other applications. We’re hoping that this dataset, benchmark, and infrastructure, can push forward RL research in the same way that other great datasets (some featured below) have accelerated other parts of AI…

… the fact we spent so much time building this suggests to me that AI’s new strategic battleground is about environments and computation, rather than static datasets… the received wisdom is that data is the strategically-crucial fuel for artificial intelligence development. That used to be true when much of the research community was focused on training classifiers to map A to B, and so on. But things have changed. We’re moving into an era where we’re training agents that can take actions in dynamic environments. That means the new key component has become the ability for any one AI research entity to access and create a large amount of rich environments to train their RL agents in. I think that realization provoked Facebook to develop and release TorchCraft to ease the development of agents that are trained on StarCraft, and to develop its language-based learning platform CommAI-env; motivated DeepMind to partner with Blizzard to turn StarCraft II into an AI development platform, and to develop and now plan to release code (hooray!) for its DeepMindLab RL environment (AKA – the world simulator formerly known as Labyrinth); and led Microsoft to turn Minecraft into the ‘Project Malmo’ AI development framework.

How to jumpstart an AI industry? One gigantic supercomputer… or so believes Japan, which is now taking bids for a 130-petaflop supercomputer (versus the world’s current top one, China’s 93-petaflop SunWay TaihuLight) slated to be completed in late-2017. Japan has some of the world’s greatest robot researchers and companies (like Fanuc, or Google’s SHAFT) but has lagged behind in software (for instance, most popular AI frameworks are from the US or Canada, like Caffe or Theano or TensorFlow. The main Japanese one is ‘Chainer’, which isn’t used that widely.). The rig is called ABCI, short for AI Bridging Cloud Infrastructure, and its goal is to “rapidly accelerate the deployment of AI into real businesses and society” (PDF).

Catastrophic forgetting? Forget about it! New research paper from DeepMind, “Overcoming catastrophic forgetting in neural networks”, claims to deal with the ‘catastrophic forgetting’ problem in neural networks, making it easier for a single network to be trained to excel at multiple tasks. Techniques like this will be key to developing more advanced, flexible AI systems. “Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective,” they write (PDF).
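To make “selectively slowing down learning on the weights important for those tasks” a bit more concrete, here is a rough sketch of the quadratic penalty the paper uses: each weight is pulled back towards its old-task value in proportion to an importance score (the paper uses the diagonal of the Fisher information). The function names and the toy numbers below are my own, purely for illustration.

```python
import numpy as np

def ewc_penalty(theta, theta_old, importance, lam=1.0):
    """Penalty (lam / 2) * sum_i importance_i * (theta_i - theta_old_i)^2."""
    return 0.5 * lam * np.sum(importance * (theta - theta_old) ** 2)

def ewc_penalty_grad(theta, theta_old, importance, lam=1.0):
    """Gradient of the penalty; added to the new task's gradient during training."""
    return lam * importance * (theta - theta_old)

# Toy numbers (made up): weight 0 mattered a lot for the old task, weight 1 barely.
theta_old = np.array([1.0, -2.0])
importance = np.array([10.0, 0.1])
theta = np.array([1.5, -1.0])

penalty = ewc_penalty(theta, theta_old, importance)
grad = ewc_penalty_grad(theta, theta_old, importance)
```

In this toy case the important weight has moved less than the unimportant one, yet it contributes nearly all of the penalty and gets a far larger corrective gradient. That asymmetry is the “selective slowing down”.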

AI-generated imagery is sitting on a heck of a Moore’s Law-style curve: That’s my takeaway from ‘Plug & Play Generative Networks’, research that brings us a step closer to generating realistic, high-resolution images using AI. The rate at which the aesthetic quality has advanced here is truly immense. To get an idea of how far we’ve come, compare images generated from captions by PPGN (Figure 3) with those generated by groundbreaking research from a year ago….

… and we can expect even better things in the future, thanks to a new Visual Question Answering dataset (PDF). The new set roughly doubles the size of the previous VQA release by adding an additional image (and answer) to each question. Where before you had “Question: is the umbrella upside down? Image: an upside-down umbrella, caption ‘yes’”, you now have “Question: is the umbrella upside down? Image: an upside-down umbrella, caption ‘yes’, Image2: an umbrella in normal position, caption ‘no’.” This will let researchers create better categorization systems that get less confused, and could also lead to better synthetic image generation via a richer internal representation of what is being described.

Have you heard the news / I’m reading today / I’m going to slurp all the data / Maluubaaa, Maluuubaaa! Canadian AI startup Maluuba has released a new free dataset that contains 100,000 question-and-answer pairs built out of CNN articles from DeepMind’s mammoth Q&A dataset. Check out NewsQA and start generating an alternative news narrative to that of reality (please!).

Rise of the AI hedge fund: Two Sigma has spun up a competition on Kaggle. It’s giving people a bunch of data containing “anonymized features pertaining to a time-varying value for a financial instrument”. The idea is to tap into the global intelligence of the Kaggle community to come up with new algorithms and inferences that make better predictions from data. There’s $100,000 in prize money up for grabs as well. The approach is similar to that taken by Numerai which turns to the crowd to garner predictions about the movements of strange, anonymized numbers. The key difference? Numerai pays people according to the success of their predictions, whereas Two Sigma is only coughing up a hundred thousand dollars (what do we call this – a megabuck?). Hopefully the group-based stock market inference activity will protect any individuals involved from becoming obsessed with the eldritch rhythms of the stock market, causing them to lose their minds – as depicted in Aronofsky’s ‘before he was famous’ flick ‘Pi’.

The industrialization of machine learning: machine learning has moved from being a science into a profession, says Amazon/USheffield’s Neil Lawrence. That means people are combing through research papers and code to create repeatable, reusable blocks of AI-driven computation, which are then applied by engineers who are more like construction-people than architects. So, what should AI scientists do to further push the field forward? Lawrence’s proposal is that they try and pair more mathematically-rich tools (kernel methods and Gaussian processes) with the inscrutable-yet-powerful neural networks that are currently in vogue. As “The Hitchhiker’s Guide to the Galaxy” states, “Don’t Panic”, he writes. “By bringing our mathematical tools to bear on the new wave of deep learning methods we can ensure that they remain ‘mostly harmless’.”

Term of the week… the truly delightful ‘Sabbath Mode’, which is basically a selective lobotomy for the complex parts of electronics to be activated on the Shabbat and Jewish holidays. I now imagine a Christian fridge whose ‘sabbath mode’ prevents the owner from consuming frightfully sinful shellfish.

The Amazon AI Kraken Waketh… Amazon’s strategy for tackling a new market is similar to the methods employed by the mythological nightmare-of-the-sea, The Kraken. It lurks out of sight while rivals like Google and Microsoft attempt to be first-to-market, then it suddenly emerges from the depths of Seattle, with each of its numerous appendages flailing with new products. That’s roughly what happened at its re:invent conference this week, when Big Yellow Kraken revealed a swathe of AI products, including…

  …Reconfigurable, FPGA-containing computers…  Amazon’s answer to the slowdown in Moore’s Law lies in ‘F1’ servers loaded with typical processors paired with FPGAs (for weird&gnarly stuff: custom accelerators, offload network hubs, and so on.). …

Image recognition… a new image recognition service called “Rekognition” (does Bezos have a grudge against sub-editors?) will compete with existing ones from Google, IBM, Microsoft, and many others.

Voice-assistant-as-a-service… I was talking to someone involved in self-driving cars recently and I made some glib comment about how you could use neural networks to train a traffic light detector to help you deal with intersections. “Ah,” they said, “but can you train it to deal with all possible configurations of traffic lights in the world? Can you deal with sets of 6 traffic lights side by side, hoisted at odd angles above the road, due to the fact the town planner went rogue due to a new pedestrian bridge? And how do you know which one of those 6 is yours? Especially if there’s unique signage? And…” at this point I, suitably chastened, realized the error of my question. Amazon has had to deal with similar challenges with the Polly voice assistant, a text-to-speech cloud service that supports 47 different voices and 24 languages, with the voices knowing the difference between pronouncing “live” in “I live in Seattle” and in “Live from New York”. Yet another example of ‘industrial deep learning’ where the underlying tech is fairly standard but the commercial implementation involves getting a lot of finicky details exactly right.

Citation Not Needed Anymore! Jurgen Schmidhuber gets his media article – the bloke who pioneered the LSTM (a key component in the current enthusiasm for all things memory&AI) has finally got his NYT profile. Congrats Jurgen! (pronounced, as he has told me multiple times, “you-again shmit-hugh-bur”.) Still waiting to see papers emanate from his secretive startup NNAISENSE, though.

OpenAI bits&pieces:

Many of the research team are at NIPS in Barcelona this week giving tutorials, lectures, and such. A full schedule is available here.

OpenAI and Microsoft sponsored events at Women in Machine Learning at NIPS as well. It’s an honor to support a scientific community focused on supporting and increasing diversity in AI.

Government AI: Last week our co-founder, Greg Brockman, was a witness at the Senate’s hearing on “The Dawn of AI”. You can watch the testimony and read our written submission here.


[Note: thrilled that this edition’s short story comes from a reader, Jack Galler. Thanks for writing in, Jack – great name!]

[2020: a woman walking through the city, listening to music.]

The playlist ends, and the woman gives it a positive rating. Her phone prompts: “Would you like another context playlist?” The woman confirms.

She raises her phone and takes two pictures – one from the rear camera, showing the street, and the other from the front camera of her. The photos get deposited into the phone’s internal representation of the ‘mood’ of the moment, along with the woman’s heart rate from her smartwatch, and a tweet she posted earlier about her lunch. It even knows that it’s raining.

The phone’s AI fuses these together and creates a new internal representation of the mood, then uses GAN techniques to generate a new song. A soothing, Spanish guitar solo thrums out of the phone to match the light drumming of the rain.

At the end of the song, she’s prompted to rate it. She gives it a thumbs up – she can’t remember the last time she gave it a thumbs down. She will never listen to that soothing acoustic guitar solo again. She could save it, but the context of the song will never be the same – she will never feel the exact same as the moment the song was created, nor will the city she was walking through be what it was in that moment.

Import AI: Issue 18: Snooper’s Charter&AI, MXNet, and Microsoft’s Quantum Computer Bet

Delicious data for the state-backed Deep Learning gods: the UK passed the Investigatory Powers Act 2016, known colloquially as the ‘snooper’s charter’, into law. It forces internet providers to keep a record of all websites visited by all people in the country for up to one year. Let that sink in for a minute! Now have a gin&tonic and a lie down. Better? OK! Since this also includes a temporal component of what websites (domains, not specific pages) people visit, it will give government an incredibly useful dataset to run complex AI-based inference algorithms on. Spooky agencies will be able to divine odd traits about the national mood by analyzing the rise and fall of the popularity of certain websites, and it’ll be possible to profile people and group them according to their habits, then analyze their activities and watch for correlations or disconnects with other groups. The applications of modern ML techniques to this sort of data are vast and disquieting.

Ethical machine learning: should we conduct experiments, even if they seem to be offensive? The answer to that question was ‘yes’ from a few Import AI readers, who took issue with my characterization of the ‘automated inference on criminality using face images’ paper from last week. Some readers pointed out that this could be an interesting experiment to run, and I countered by saying I’d need a much larger section of the research paper given over to an evaluation of the ethical and moral context of the experiment.

Battle of the frameworks: Microsoft has CNTK, Google has TensorFlow, and Amazon has… MXNet, as of this week. Amazon has put its weight behind the MXNet deep learning framework, making the software the default framework for running deep learning on Amazon Web Services. MXNet’s elevation at Amazon is likely due to its longtime association with Carnegie Mellon professor (and recent Amazon hire) Alex Smola, who has sought to increase the usage of MXNet for a number of years (PDF). DSSTNE, an Amazon-developed DL library, will likely become a subcomponent of MXNet. It’s likely that only one or two deep learning frameworks will end up being widely used, and whoever controls the framework will be able to extract some economic advantages through building cloud services and products around it that benefit from the broad community uptake. The next two years will likely be critical for establishing the winners and losers in this category.

Municipal Muni-Mind Mangled In Miraculous Manipulation: Further proof that we live in a timeline imagined by Neal Stephenson and William Gibson comes in the form of the public transportation computer hack in San Francisco this weekend. “‘You Hacked, ALL Data Encrypted.’ That was the message on San Francisco Muni station computer screens across the city, giving passengers free rides all day on Saturday,” reports CBS.

DeepMind + NHS: Getting ahold of healthcare data is notoriously tricky due to the many (sensible) laws around data protection. Google DeepMind’s solution is to partner with the Royal Free London NHS Foundation Trust to get some useful modern software into the hands of clinicians, and eventually incorporate machine learning components as it establishes trust and credibility. Much of the NHS runs on an arcane system of paper records, so any digitization is a good thing. It’s likely Google/DeepMind will face some opposition and probing from citizens and politicians over its usage and stewardship of their data. The onus is on Google DeepMind to prove that partnership schemes like this can work for patients above all.

Microsoft’s wacky quantum computer bet: Microsoft plans to make a prototype of a new type of quantum computer in a bet that the technology is ready to jump out of theory and into practical reality. Microsoft has taken a different tack to Google with its quantum computing approach and is betting its farm on a technology called a ‘topological quantum computer’. That’s a somewhat more far-out technology than the types of computer being explored by Google. The company has enlisted a bunch of quantum experts to help it build the machine, including Matthias Troyer of E.T.H Zurich. (There’s an occasional argument among AI experts, typically after a few beers, as to whether consciousness emerges from a quantum substrate. Physicist Roger Penrose has a pet theory that consciousness comes out of quantum activities inside ‘microtubules’ inside brain neurons, though evidence for this is scant at best.)

Cities conjured up from lines scratched into sand… and much more in this fantastic paper ‘Image-to-Image translation with Conditional Adversarial Nets’. The authors outline a system that lets you train an AI to take one image as input, like a satellite photograph of a city, and generate a corresponding output, like a Google Map with bounding boxes around buildings. The technique works across domains and can be used to, say, draw a woman’s handbag and use that to create a synthetic ‘photograph’ of the bag, or take a picture of a landscape in the day and show it at night….
   …The work has already been extended by Opendotlab for the ‘Invisible Cities’ project to create a system that can take a satellite photo of, say, Milan, and re-interpret it as though the buildings all come from Los Angeles. Terracotta roofs turn to concrete flattop & public squares become asphalt. Canals become freeways. It’s a marvelous, stimulating experiment, and a wonderful example of how art will be changed by the arrival of machine learning.
   …so with all the possibility of new forms of creation from the combination of deep learning and art it’s great to see the launch of, a company formed by a bunch of European AI hackers to spread AI-enabled aesthetics into studios and agencies across the world. AI is becoming just another lens through which we see the world, and it has the potential to show us things our puny four-dimensional minds have trouble imagining. T-SNE goggles, kind of thing.

Computational fluid dynamics meets deep learning.. The previous sentence will be true of many, many things in coming years: “kitchen-shift scheduling meets deep learning”, “insurance claim analysis meets deep learning”, and so on. But the amazing thing about this code release from Google is that you can train a neural network to handle some of the gnarlier equations involved in CFD. It creates some amazing visualisations, but don’t try this in your nuclear reactor yet, kids.

AI turns everything into a prediction problem: the rise of low-cost machine intelligence systems will see people in companies across the world work to turn as many of their problems as possible into problems of prediction, says the Harvard Business Review. That’s because “the first effect of machine intelligence will be to lower the cost of goods and services that rely on prediction. This matters because prediction is an input to a host of activities including transportation, agriculture, healthcare, energy manufacturing, and retail,” it writes.

What does it take to build a strong AI?… not as much as you’d think, suggests Yann Lecun in this wide-ranging speech at Carnegie Mellon University. Fast forward to around 32 minutes into the video to hear Yann’s views on how to build super-smart machines. Most AI experts have their own workable theories for how to build super-intelligent AI systems, but are usually held back by the relatively meager capabilities of modern computers and a lack of the right sort of data. We’re moving into an era where both of these scarcities will be less severe, so we can expect development here.

OpenAI bits&pieces:

Government Talk: We will be giving testimony on artificial intelligence at the Subcommittee on Space, Science, and Competitiveness’s hearing on “The Dawn of Artificial Intelligence” on Wednesday. Tune-in!


[2020: An architect’s office in Seattle. A person wearing a black turtleneck stands in front of a floor-to-ceiling screen. Small white earbuds (no cable) dangle out of their ears. They gesture at the screen.]

“So as you can see, the house itself can change its appearance according to the different styles and textures you’re wearing on the day. We use style-transfer techniques to modify the textures on the walls according to what you’re wearing. This can really help you express yourself and make an impact, especially when hosting get togethers. Turn on your webcam and I’ll show you!”

The screen splits in two. A woman wearing a red scarf over a blue jean jacket blinks into view on the left-hand side, and a modern, boxy house appears on the right hand side.

“Let me demonstrate!” The person gestures from the woman on the left side of the screen to the house on the right. On screen, the house flickers and its walls change from a grey color to a blue to match the jean jacket. The door and window frames turn from white to a vivid, cross-hatched red.

“Now try without the scarf!” The woman on screen removes her scarf. The house responds, its screens turning from blue to a smooth beige. “In a few years, it won’t just be the appearance of the house that changes, its geometry will change as well.”

Import AI: Issue 17: The academic brain drain, parallel invention, and a royally impressive AI screw-up

From the department of: I Really Hope This Is Satire, but Given It Is 2016 I Cannot Be Sure: Researchers at Shanghai Jiao Tong University published a paper called ‘automated inference on criminality using face images’. The paper uses deep learning to explore correlations between someone’s appearance and the chance of them being a criminal. It’s modern phrenology – 19th century junk science where people believed you could measure someone’s skull and use it to infer traits about their intelligence (the Nazis were influenced by this). I can see no merit to this paper whatsoever and am mystified that the researchers were not warned off of publishing this absurd paper. If someone feels my views are wildly wrong here I’d love to hear from you and will (if you’re comfortable with it) put the correspondence in the next newsletter.

How to judge which jobs will be automated: If you can collect 10,000 to 100,000 times as much data on a given job as someone would reasonably generate during the course of their professional life, then you can automate it. This explains why jobs where you can gather lots of aggregate data (eg, insurance actuaries, legal e-discovery, radiology, repeatable factory work, drivers) are already seeing massive automation.

If another professor leaves for AI, and there are no academics left who aren’t in industry, do people notice? Another significant move from Academia to Industry as Stanford professor Fei-Fei Li takes up a full-time gig at Google. Fei-Fei Li is both astonishingly important and an astonishingly patient, wise person, so it’s a great get for Google. Li and her team of grad students and collaborators practically kick-started the deep learning boom by creating the ‘ImageNet’ dataset and associated competition. Geoff Hinton & co won ImageNet in 2012 with an approach that relied on deep learning and this precipitated the immense flood of interest and investment that followed. Li is the latest in a long, long line of deep learning academics who have opted to move to spend (most of) their time working in industry rather than academia. Others include Geoff Hinton (University of Toronto > Google), Yann Lecun (NYU > Facebook), Russ Salakhutdinov (CMU > Apple), Alex Smola (CMU > Amazon), Neil Lawrence (U Sheffield > Amazon), Nando De Freitas (Oxford > DeepMind), and many more. The main holdout remains Yoshua Bengio who maintains a charming academic fortress in the frozen music-strewn town of Montreal, Quebec. It’s wonderful that industry gets to benefit from the wisdom of academics, but it does lead me to wonder as to whether AI organizations are going to cannibalise the academic ecosystem to the point that they damage the ultimate supply of graduate students. (Note: OpenAI is guilty of this as well, as Pieter Abbeel currently spends most of his time with us rather than at UC Berkeley.) On the other hand, it’s nice to see academics making money off of their ideas, whether by taking up well-paid jobs or selling their startups to big firms. (Congratulations to Berkeley’s Joshua Bloom and the rest of the team on selling to GE, by the way.)

Parallel Invention Alert!: television was invented by multiple people at roughly the same time. The same happened for telephones. Ditto Crispr. Technology isn’t mysterious – sometimes there are ideas floating around in the general scientific hivemind and a few people will transmute them into reality at the same time. This phenomenon of Multiple Discovery is worth paying attention to as each occurrence indicates that the idea has some general utility, given the fact that multiple scientists with different perspectives have glommed onto it at the same time…

…AI is rife with parallel invention, and part of the way I model the acceleration in AI development is by the increasing frequency of these cases of parallel invention. So it’s interesting that both OpenAI and Google DeepMind have published remarkably similar papers within a very short (~ two week) timespan of each other. First, OpenAI published a paper called RL^2: Fast Reinforcement Learning Via Slow Reinforcement Learning, then DeepMind followed this with Learning to Reinforcement Learn. (Note: both of these research efforts took many months of work, so the order of publication is not significant. They also test the approach on different facets of learning problems.)

The idea behind both techniques is that rather than investing time in getting AI to optimize a specific learning algorithm for a given task, you can instead get an AI to optimise its own learning machinery for a set of many tasks. Technically speaking, the idea is that you structure a reinforcement learning agent itself as a recurrent neural network and feed it extra information about its performance on the current task. The agent learns how to create policies to solve a broad range of tasks by using the information about each solution to each problem to alter and augment its own problem solving abilities. Different weights in the RNN correspond to different learning algorithms performed by the agent, and different activations in the RNN correspond to different policies specialized to different tasks faced by the agent…
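A rough sketch of the “feed it extra information about its performance” part, going by the setup described in the RL^2 paper: at each step the recurrent network sees not just the current observation but also the previous action, the previous reward, and an episode-termination flag, and its hidden state is carried across episodes of the same task. The helper name `rl2_input`, the sizes, and the random weights below are placeholders I've made up; the RNN step is a plain tanh cell rather than whatever either paper actually used.

```python
import numpy as np

def rl2_input(obs, prev_action, prev_reward, prev_done, num_actions):
    """Concatenate observation, one-hot previous action, reward, and done flag."""
    one_hot = np.zeros(num_actions)
    one_hot[prev_action] = 1.0
    return np.concatenate([obs, one_hot, [prev_reward], [float(prev_done)]])

# Toy step: 3-dimensional observation, 4 discrete actions.
x = rl2_input(np.array([0.2, -0.1, 0.7]), prev_action=1,
              prev_reward=1.0, prev_done=False, num_actions=4)

# One step of a plain tanh RNN with random (untrained) placeholder weights.
rng = np.random.default_rng(0)
hidden_size = 8
W_in = rng.normal(scale=0.1, size=(hidden_size, x.size))
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))

h = np.zeros(hidden_size)         # hidden state: persists across episodes of one task
h = np.tanh(W_in @ x + W_h @ h)   # activations now depend on the reward history
```

The trained weights play the role of the learning algorithm, while the hidden state `h` is where the task-specific policy lives, which matches the weights-versus-activations split described above.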

general purpose brains, versus tuned brains: These approaches are analogous to the difference between putting an uncalibrated, specific piece of machinery into the brain of an AI and letting it calibrate the machine through interacting with a certain set of environments to solve a specific problem, versus instead putting a more general-purpose bit of machinery into the brain of AI and getting the AI to optimize the machinery for solving many different tasks in many different environments. This ascendance towards greater flexibility, learning, and independence by AI agents is a key point on our march towards creating smarter machines…

This is not an isolated occurrence. Other examples of parallel invention include: Facebook & DeepMind both pioneering memory-augmented AI systems (neural turing machines, memory networks), Google Brain & DeepMind producing papers on Gumbel-Softmax within a week of each other, multiple people inventing aspects of variational autoencoders, DeepMind and the University of Oxford pioneering methods for lip-reading networks, Stanford&National Research Council Canada&Amsterdam publishing on the TreeLSTM within three months of each other, and many more. If you have examples please email me at – I’d like to compile these instances in a separate, continuously updated document.

The Deep Learning iceberg lurking in consumer products: the software we use on a day-to-day basis is becoming suffused with deep learning, with much of it lurking beneath the surface. For example, a new Google product called ‘PhotoScan’ uses neural network-based image analysis and inference techniques to let you quickly scan your family photos, using AI to stitch together the different sections of the photograph to improve quality and correct for glare and spatial distortions. But most importantly it ‘just works’ and the consumer doesn’t need to know it is made possible by a baroque stack of neural networks. Similarly, a Kickstarter for a fancy baby monitor called ‘Knit’ promises to create a device that uses DL to better monitor the state of the baby (eg, its breathing, wakefulness, and so on), giving parents information about their child through the computer observing its visual appearance and making some assumptions. These products are pretty amazing given that in 2012 image recognition was broadly an unsolved problem.

Welcome to the era of the ultra-lego-block-AI. That’s the message from a new DeepMind paper outlining its latest RL agent. The agent consists of multiple different tried-and-tested AI components (CNNs for vision, a network to enhance the agent’s ability to explore by rewarding it for increasing the variety of the views it perceives, a network to predict rewards and check against what actually happened, and so on – more detail on page 2 of the paper [PDF]) which combine to create a smart system capable of beating DeepMind’s own records on a large number of environments, including tricky games like Montezuma’s Revenge (which was broadly unsolved by AI two years ago, and which now sees this AI agent achieve 69% of a human baseline). This kind of multi-system omniagent will be increasingly significant, and it’s something that people like Facebook, OpenAI, and other academics are all working on as well.

A royally good AI blooper: Another fun example of the many ways in which AI algorithms can horribly fail, from Tom White’s reliably entertaining ‘Smile Vector’. This time, the neural network tries to make Princess Kate smile and instead applies the approach to her husband, William. “Kate accidentally landed on William,” he explains, which sounds like a euphemism for many, many things.

Geocities still lives… in the form of this Very Important and Trustworthy AI website: One Weird Kernel Trick.

Given a few hundred million words and a hammer made of a globe-spanning network of computers, can I translate between languages without knowing anything about language? Google’s answer appears to be ‘yes’. In a new paper outlining Google’s Multilingual Translation System the company describes a system that is trained on multiple languages and translates between them. This creates a single, giant network that contains a crude understanding of not only how to translate between pairs of languages, but how to categorize broad concepts across sets of languages it doesn’t have explicit, paired sets for. This is significant as it shows the network has learned some essential information about language that it wasn’t given explicit labels for. That shows how modern AI systems can not only map A>B, but can also infer the existence of C and D and map between them as well. The most tantalising thing? The evidence that this approach yields “a universal interlingua representation in our model”. Universal Interlingua!
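A toy sketch of the data-side trick may help here. As I understand the paper, the desired output language is signalled by an artificial token placed at the start of the input sentence, so one network can serve every language pair. The helper names below are hypothetical, the exact token format is illustrative, and the language pairs are just an example of how an untrained ('zero-shot') pair falls out of the trained ones.

```python
# Hypothetical sketch: helper names and token format are illustrative
# assumptions, not Google's actual code.

def tag_source(sentence, target_lang):
    """Prepend an artificial token naming the desired output language."""
    return f"<2{target_lang}> {sentence}"


def zero_shot_pairs(trained_pairs):
    """List (source, target) pairs the model never saw paired data for."""
    sources = {src for src, _ in trained_pairs}
    targets = {tgt for _, tgt in trained_pairs}
    return sorted(
        (src, tgt)
        for src in sources
        for tgt in targets
        if src != tgt and (src, tgt) not in trained_pairs
    )


# Example: paired data exists only for Portuguese>English and English>Spanish.
trained = {("pt", "en"), ("en", "es")}
tagged = tag_source("Hello", "es")
unseen = zero_shot_pairs(trained)
```

Here `unseen` contains Portuguese>Spanish: a pair the network has no explicit examples of but can still be asked for by tagging a Portuguese sentence with the Spanish token.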

OpenAI bits&pieces:

Better PixelCNNs for everyone! We published a paper and code for PixelCNN++, a souped-up version of some technology that DeepMind invented. Code here.

Why AI technology is moving so quickly, and why predicting the future is hard: interview with OpenAI’s co-founder & research director Ilya Sutskever.


Backstory: I wondered if people would like to see something ‘crazy&weird’ in this newsletter and the votes told me ‘Yes’. So here we go:

[2025: A factory in China. There are no lights and thousands of industrial robots work in a complex symphony. All you hear is the steady fizz of robotic movement.]

Machine view: multiple reinforcement-learning agents run simulations of the line in a large data center attached to the factory. They explore multiple perturbations of the manufacturing process, endlessly simulating the workload. When they discover more efficient approaches they initiate a High Priority Resource Call to the hardware scheduler in the data center and are assigned a chunk of computing resources to attempt to transfer their knowledge from the simulation into the real robots on the line. After the transfer is complete the robotic line reconfigures itself to account for the new simulation. Any errors are spotted by a thousand cameras staring down at the line. If the AI can diagnose the error it re-runs the simulation and comes up with a fix. If it can’t, it sends the images & data of the flaw out to a large Mechanical Turk marketplace where human engineers observe the fault, come up with a fix, and send it back to the line. The system re-optimizes. Meanwhile, one line of the factory attempts to come up with perturbations of the assembled product, inventing wholly new versions of the devices by navigating through the latent feature space of the products. When a new ‘Candidate Product’ is found the line runs it through a series of tuned expert systems and, if it gets a high enough score, simulates the product in a high-fidelity simulation. If it passes those tests then a Candidate Product is produced, airlifted by drone to a nearby human focus group and, if it satisfies their criteria, is sold on an eBay-like auction site frequented by the factory’s thousands of distributors. A bidding process takes place and in a few days/weeks data comes back about how the product succeeds in the market. If it does better than the existing product, more parts of the factory are dedicated to creating new products in this style and the improvisation line begins exploring the latent space of the new products.
In this way the manufacturing process begins to evolve according to a Cambrian evolution process, with AI automating much of the product R&D process.