Mapping Babel

Import AI: Issue 29: neural networks crack quantum problem, fingernail-sized AI chips, and a “gender” classifier screwup

It takes a global village to raise an AI… a report titled “Advances in artificial intelligence require progress across all of computer science” (PDF) from the Computing Community Consortium identifies several key areas that should be developed for AI to thrive: computing systems and hardware, theoretical computer science, cybersecurity, formal methods, programming languages, and human-computer interaction…
…better support infrastructure will speed the rate at which developers embrace AI. For example, see this Ubuntu + AWS + AI announcement from Amazon: the “AWS Deep Learning AMI for Ubuntu” will give developers a pre-integrated software stack to run on its cloud, saving them some of the tedious, frustrating time they usually spend installing and configuring deep learning software.
Baidu’s AI software PaddlePaddle now supports Kubernetes, making it easier to run the software on large clusters of computers. Kubernetes is an open source project based on Google’s internal ‘Borg’ and ‘Omega’ cluster managers, and is used quite widely among the AI community – last year, OpenAI released software to make it easier to run Kubernetes on Amazon’s cloud.

Finally, AI creates jobs for humans! Starship Technologies is hiring a “robot handler” to accompany its freight-ferrying robots as they zoom around Redwood City. Requirements: “a quick thinker with the ability to resolve non-standard situations”.

Ford & the ARGOnauts: Ford will spend $1 billion over five years on AI, via a subsidiary company called Argo. Argo is run by veterans of both Google and Uber’s self-driving programs. Details remain nebulous. Much of the innovation here appears to be in the financial machinery underpinning Argo, which will make it easier for Ford to offer hefty salaries and stock allocations to the AI people it wants to hire. Reminiscent of Cisco’s “spin-in” company Insieme.

Powerful image classification, for free: Facebook has released code for ‘ResNeXt’, an image classification system outlined in its research paper Aggregated Residual Transformations for Deep Neural Networks. Note: one of the authors of ResNeXt is Kaiming He, the whizkid from MSR Asia who helped invent the ImageNet 2015-winning Residual Networks.

Rise of the terminator accountants: Number of traders employed on the US cash equities trading desk at Goldman Sachs’s New York office:
…in 2000: 600
…in 2017: 2, supported by 200 computer engineers.
…“Some 9,000 people, about one-third of Goldman’s staff, are computer engineers,” reports MIT Technology Review.

AI: 2. Hand-tuned algorithms: 0: New research shows how we can use modern AI techniques to learn representations of complex problems, then use some of the resulting predictive models in place of hand-tuned algorithms. The paper “Solving the quantum many-body problem with artificial neural networks” shows how this technique can be competitive with state of the art approaches. “With further development, it may well prove a valuable piece in the quantum toolbox,” the researchers write.
…Similarly, Lawrence Berkeley National Laboratory recently trained machine learning systems to predict metallic defects in materials, lowering the cost of conducting research into advanced alloys and other lightweight new materials. “This work is essentially a proof of concept. It shows that we can run density functional calculations for a few hundred materials, then train machine learning algorithms to accurately predict point defects for a much larger group of materials,” the researchers say. “The benefit of this work is now we have a computationally inexpensive machine learning approach that can quickly and accurately predict point defects in new intermetallic materials. We no longer have to run very costly first principle calculations to identify defect properties for every new metallic compound.”

Microscopic, power-sipping AI circuits: researchers with the University of Michigan and spinout CubeWorks have created a deep learning processor the size of a fraction of a fingernail. The new chip implements deep neural networks on a 7.1mm² chip that sips a mere 288 microwatts of power (PDF). They imagine the chip could be used for basic pattern recognition tasks, like a home security camera knowing to only record in the presence of movement of a human/animal versus a shifting tree branch. The design hints at an era for AI where crude pattern recognition capabilities are distributed in processors so tiny and discreet you could end up with fragments in your shoes after walking on some futuristic beach. Slide presentation with more technical information here.

AI needs its own disaster: AI safety researcher Stuart Russell worries that AI needs to have a Chernobyl-scale disaster to get the rest of the world to wake up to the need for fundamental research on AI safety…
…“I go through the arguments that people make for not paying any attention to this issue and none of them hold water. They fail in such straightforward ways that it seems like the arguments are coming from a defensive reaction, not from taking the question seriously and thinking hard about it but not wanting to consider it at all,” he says. “Obviously, it’s a threat. We can look back at the history of nuclear physics, where very famous nuclear physicists were simply in denial about the possibility that nuclear physics could lead to nuclear weapons.”
Some disagree about the dangers of AI. Andrew Ng, a former Stanford Professor and Google Brain founder who now runs AI for Chinese tech giant Baidu, talked about the “evil AI hype circle” in a recent lecture at the Stanford Graduate School of Business (video). His view is that some people exaggerate the dangers of “evil AI” to generate interest in the problem, which brings in more funding for research, which goes on to fund “anti-evil-AI” companies. “The results of this work drives more hype”, he says. The funding for these sorts of organizations and individuals is “a massive misallocation of resources” he says. Another worry of Ng’s: the focus on evil AI can distract us from a much more severe, real problem, which he says is job displacement.
Facebook’s head of AI research, Yann LeCun, said in mid-2016 “I don’t think AI will become an existential threat to humanity… If we are smart enough to build machine with super-human intelligence, chances are we will not be stupid enough to give them infinite power to destroy humanity.”
… I worry that AI safety is such a visceral topic that people react quite emotionally to it, and get freaked out by the baleful implications to the point they don’t consider the actual research being done. Some problems people are grappling with in AI safety include: securing machines against adversarial examples, figuring out how to give machines effective intuitions through logical induction, and ensuring that cleaning robots don’t commit acts of vandalism to achieve a tidy home, among others. These all seem like reasonable avenues of research that will improve the stability and resilience of typical AI systems…
… but don’t take my word for it – read about AI safety yourself and come to your own decision: for your next desert island vacation (stranding), consider bringing along a smorgasbord of these 200 AI resources, curated by the Center for Human-Compatible AI at UC Berkeley.
And if you want to do something about AI safety, consider applying for a new technical research intern position with the Center for Human-Compatible AI at UC Berkeley and the Machine Intelligence Research Institute.

Satellite eyes, served three different ways: Startup Descartes Labs has released a new set of global satellite maps in three distinct bands – RGB, Red Edge bands, and synthetic aperture radar range/azimuth measurements. The imagery has been pre-processed to remove clouds and adjusted for the angle of the satellite camera as well as the angle of the sun.

Declining economies of scale: just as companies can expect to see their rate of growth flatten as they expand, deep learning systems see performance drop as they add more GPUs, as the benefits they gain start to be nibbled away by the latency and infrastructure costs introduced by running multiple GPUs in parallel…
… New work from Japanese AI startup Preferred Networks shows that its free ‘Chainer’ software can generate a 100X performance speedup from 128 GPUs. This is extremely good, but still highlights the slightly declining returns people get as they scale up systems.
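To put a number on those declining returns, here is a minimal sketch that compares the reported speedup against ideal linear scaling. The function name is my own and the only inputs are the two figures quoted above (100X, 128 GPUs); nothing here comes from the Chainer benchmark itself.

```python
def scaling_efficiency(speedup, num_gpus):
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / num_gpus


# Figures quoted in the newsletter: a 100X speedup from 128 GPUs.
efficiency = scaling_efficiency(100, 128)
print(f"Scaling efficiency: {efficiency:.3f}")
print(f"Speedup lost to overhead: {128 - 100}X of a possible 128X")
```

On those figures the system keeps roughly 78% of the ideal linear speedup, with the remainder presumably eaten by the communication and infrastructure overheads described above.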

Gender IS NOT in the eyes of the beholder: New research “Gender-From-Iris or Gender-From-Mascara?” appears to bust experimental results showing you can predict gender from a person’s iris, instead pointing out that many strong results appear to be contingent on detectors that learn to spot mascara. Machine learning’s law of unintended consequences strikes again!…
… It reminds me of an apocryphal story an AI researcher once told me: in the 1980s the US military wanted to use machine learning algorithms to automatically classify spy satellite photos for whether they contained Soviet tanks or not. The system worked flawlessly in tests, but when they put it into production they discovered that its results were little better than random… After some further experimentation they discovered that in every single photo from their task data that contained a tank, there was also some kind of cloud. Therefore, their ML algorithms had developed a superhuman cloud-classifying ability, and didn’t have the foggiest idea of what a tank was!
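The failure mode in both stories can be reproduced with a toy example. The sketch below is entirely hypothetical: the data is hand-built, the feature names (`cloudy`, `tank_shape`) are invented, and the "model" is just a one-feature rule picked by training accuracy. It is not the method used in the iris paper or in the tank anecdote.

```python
def build_dataset(cloud_matches_label):
    """Hand-built toy data: 10 photos without a tank, 10 with one."""
    data = []
    for label in (0, 1):
        for i in range(10):
            # The genuine signal is noisy: right for 8 of every 10 photos.
            tank_shape = label if i < 8 else 1 - label
            # The shortcut is perfect in training, uninformative afterwards.
            cloudy = label if cloud_matches_label else i % 2
            data.append(({"cloudy": cloudy, "tank_shape": tank_shape}, label))
    return data


def accuracy(examples, feature):
    """Accuracy of the rule 'predict a tank whenever this feature is 1'."""
    return sum(1 for x, y in examples if x[feature] == y) / len(examples)


def fit_single_feature_rule(train):
    """Pick whichever single feature scores best on the training set."""
    features = sorted(train[0][0])
    return max(features, key=lambda f: accuracy(train, f))


train = build_dataset(cloud_matches_label=True)
deployed = build_dataset(cloud_matches_label=False)

chosen = fit_single_feature_rule(train)
print("feature the model latched onto:", chosen)
print("training accuracy:", accuracy(train, chosen))
print("deployment accuracy:", accuracy(deployed, chosen))
print("deployment accuracy of the real signal:", accuracy(deployed, "tank_shape"))
```

The rule that looks flawless in training is the one that falls to chance once the clouds stop lining up with the tanks, while the noisier but genuine signal holds steady.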

Rise of the machines = the end of capitalism as we know it? “Modern Western society is built on a societal model whereby Capital is exchanged for Labour to provide economic growth. If Labour is no longer part of that exchange, the ramifications will be immense,” said one respondent to a Pew Internet report about the ‘pros and cons of the algorithm age’.
…“I foresee algorithms replacing almost all workers with no real options for the replaced humans,” says another respondent.

Bushels of subterfuge in DeepMind’s apple orchard: As I write this newsletter on a Sunday, I’m still recovering from my usual morning activity – chasing my friend round an apple orchard, using a laser beam to periodically paralyze them, letting me hop over their twitching body to gather up as many apples as I can…
… in a strange turn of events it appears that Google DeepMind has been spying on my somewhat unique form of part-time sport, and has replicated this in a game environment called ‘gathering’ which they have used to explore the sort of collaborative and combative strategies that AI systems evolve…there’s also another environment called WolfPack, the less said about it the better. This sort of research is potentially very useful for large multi-agent simulations, which many people in AI are betting on as an area where exploration could yield research breakthroughs.

Lines in Google’s codebase: 2 billion
Number of commits into aforementioned codebase per day: 40,000
…From: “Software Engineering at Google”.

OpenAI Bits and Pieces

Learning how to walk, with OpenAI Gym: The challenge: model the motor control unit of a pair of legs in a virtual environment. “You are given a musculoskeletal model with 16 muscles to control. At every 10ms you send signals to these muscles to activate or deactivate them. The objective is to walk as far as possible in 5 seconds.” The components: OpenSim, OpenAI Gym, keras-rl, and much more. Try the challenge, but stay for the doddering legs!
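The shape of the control problem follows from the numbers in the challenge description: 16 muscles, one command every 10ms, 5 seconds per attempt, so 500 decisions per episode. The sketch below shows that loop with a stand-in environment. `StubWalkerEnv`, its made-up dynamics, and `random_policy` are all hypothetical placeholders of mine, not the challenge's actual API or physics.

```python
import random

NUM_MUSCLES = 16        # from the challenge description
STEP_SECONDS = 0.01     # a control signal every 10ms
EPISODE_SECONDS = 5.0   # walk as far as possible in 5 seconds


class StubWalkerEnv:
    """Hypothetical stand-in with a Gym-style reset/step interface."""

    def __init__(self):
        self.max_steps = round(EPISODE_SECONDS / STEP_SECONDS)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0] * NUM_MUSCLES

    def step(self, action):
        assert len(action) == NUM_MUSCLES
        # Made-up dynamics: forward progress grows with mean activation.
        reward = 0.001 * sum(action) / NUM_MUSCLES
        self.steps += 1
        done = self.steps >= self.max_steps
        return list(action), reward, done, {}


def random_policy(observation, rng):
    """Activate (1.0) or deactivate (0.0) each muscle at random."""
    return [rng.choice((0.0, 1.0)) for _ in range(NUM_MUSCLES)]


def run_episode(env, policy, seed=0):
    rng = random.Random(seed)
    observation = env.reset()
    done = False
    distance = 0.0
    while not done:
        observation, reward, done, _ = env.step(policy(observation, rng))
        distance += reward
    return env.steps, distance


steps, distance = run_episode(StubWalkerEnv(), random_policy)
print(f"{steps} control steps, toy distance {distance:.3f}")
```

A learning agent would replace `random_policy` with something trained against the real simulator; the loop itself would look much the same.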

Arxiv Sanity – bigger, better, smarter! OpenAI’s Andrej Karpathy has updated Arxiv Sanity, an indispensable resource that I and many others use to keep track of AI papers. New features: better algorithms for surfacing papers people have shown interest in, and a social feature. (Also see Stephen Merity’s social tracker trendingarxiv.)

AI Control: OpenAI researcher Paul Christiano writes an informative blog on AI safety and security, called AI Control. In the latest post, “Directions and desiderata for AI control” he talks about some particularly promising research directions in AI safety.

OpenAI does open mic night: Catherine Olsson and I both gave short talks at the Silicon Valley AI Research meetup in SF last week. Catherine’s video. Jack’s video.

Asilomar conference: articles in Wired and Slate Star Codex about the Beneficial AI conference held at Asilomar in early January.

Tech tales:

[Diplomatic embassy, Beijing, 2025:]

It was a moonless mid-winter pre-dawn, when the flock of drones came overhead and emptied their cargo of chips over the building. The embassy cameras and searchlights picked out some of the thousands of chips as they fell down, hissing like hail on glass and steel roofs. Those staffers that heard them fall shivered instinctively, and afterwards some said that, when caught in the spotlights, the chips looked like metallic snow.

Over the next day the embassy staff did what they could, going around with vacuum cleaners and tiny mops, and ordering an external cleanup crew, but the snowfalls of chips – each one a tiny sensor, its individually meager capabilities offset by the sheer number of its kin – would come again, and eventually security protocols were tightened and people just resigned themselves to it.

Now, you had to negotiate a baroque set of security measures to get into the embassy. But still the chips got in, and cleaners would find them tracked into bathrooms, or sitting in undusted nooks and crannies. Outside, the air hummed with invisible surveillance, as the numerous little chips used their AI processors to turn on microphones in the presence of certain phrases. Outside, the data evaporated into the air, absorbed by flocks of small drones which would fly over the embassy, as they did in every town in every major city in every developed country, hoovering up data from what some termed ‘State Dust’. The chips would lie in wait, consuming almost no power, till they heard a particular encrypted call-out from the government drones.

Even the chips that found themselves indoors would eventually be outside again, as some escaped through improper waste disposal measures, and others had their plastic barbs hook fortuitously on a trouser leg or shoe sole, to then be carried outside. And so their data was extracted as well and a titanic jigsaw was assembled.

It didn’t matter how partial the data from each chip was, given how many there were, and the frequency of their harvesting. Gather enough data and at some point you can make sense of the smallest little fragments, but you can only do this for all the little whispers of data from a city or a country if you’re a machine.

Import AI: Issue 28: What one quadrillion dollars pays for, research paper archaeology, and AI modules for drones

Cost of automating the entire global economy? One quadrillion dollars.
Requirements for the resulting system to be able to perfectly replace all human labor:
…Computation: 10^26 operations per second
…Memory: 10^25 bits
…I/O: 10^19 input-output bits per second
…Knowledge ingestion: 7 bits per person per second
and many more marvelous numbers in this essay by data compression expert Matt Mahoney on ‘the cost of AI’. A virtuoso performance of extrapolation and (with apologies to Mitchell & Webb) numberwang-ery.

Google self-driving cars, report card (PDF):
…Miles driven in 2015: 424,331
…Miles driven in 2016: 635,868
…Disengagements per 1,000 miles, 2015: 0.80
…Disengagements per 1,000 miles, 2016: 0.20
… now let’s see how they do with hard training situations for which there is little good training data, like navigating a sandstorm-ridden road in the Middle East.
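A quick back-of-the-envelope check on what those four numbers imply. This sketch uses only the figures listed above; because the per-1,000-mile rates are rounded, the implied counts are approximations and should not be read as the report's official totals.

```python
# Figures as listed above (rates are rounded, so results are approximate).
REPORT = {
    2015: {"miles": 424_331, "per_1000_miles": 0.80},
    2016: {"miles": 635_868, "per_1000_miles": 0.20},
}


def implied_disengagements(miles, per_1000_miles):
    """Approximate number of disengagements implied by a rate and mileage."""
    return miles * per_1000_miles / 1000


def miles_per_disengagement(per_1000_miles):
    """Average miles driven between disengagements."""
    return 1000 / per_1000_miles


for year, row in REPORT.items():
    count = implied_disengagements(row["miles"], row["per_1000_miles"])
    gap = miles_per_disengagement(row["per_1000_miles"])
    print(f"{year}: ~{count:.0f} disengagements, one every ~{gap:.0f} miles")
```

In other words, the cars went from roughly one disengagement every 1,250 miles to one every 5,000 miles, a fourfold improvement, while driving about 50% more miles.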

How much is an AI worth? In which Google’s head of M&A, Don Harrison, says Google is happy to throw large quantities of cash at AI companies. “It’s very hard to apply valuation metrics to AI. These acquisitions are driven by key talent — really smart people. It’s an area I’m focused on and our team is focused on. The valuations are part and parcel of the promise of the technology. We pay attention to it but don’t necessarily worry about it,” he says. (Emphasis mine.)

Your organization and public data: a message to Import AI readers: most organizations gather some form of data which can be safely published, and the world is richer for it. Case in point: Backblaze and its latest report on hard drive reliability. These reports should factor into any HDD buyer’s decision, as they represent good, statistically significant real-world data of drive performance. If you work at an organization that may have similar data that can be externalized, please try to make this happen – I’ll be happy to help, so feel free to email me.

Measurement: besides Atari, what are other good measures for the progression of reinforcement learning techniques? As we move into an era dominated by dynamic environments supplied by tools like Universe, DeepMind Lab, Malmo, Torchcraft, and others, how do we effectively model the progress of agents in a way that captures the full spectrum of their growing capabilities?

AI for researching AI: the Allen Institute for AI has released Citeomatic, a tool that uses deep learning to predict citations for a given paper. To test out the system I fed it OpenAI’s RL^2 paper and it gave me back over 30 papers that it recommended we consider citing. Many of these seem reasonable, eg ‘solving partially observable reinforcement learning problems with rnns’, etc…
…Most of all, this seems like a great tool to help researchers find papers they should be reading. AI has a large literature and researchers frequently find themselves stumbling on good ideas from the previous decade. Any tool that can make this form of intellectual archaeology more efficient is likely to aid in science.

From the Dept. of Recursive Education: Tutorial from Arthur Juliani outlines how to build agents that learn how to learn, with code inspired by the DeepMind paper “Learning to reinforcement learn”, and the OpenAI paper “RL^2”.

Explanations as cognitive maps: the act of explaining situations lets us deal with the chaotic novelty of the world, and create useful abstractions we can use to reason about it. More detail, with many great research references, in this blog from Shakir at DeepMind.

Executive Order strikes a chill in math, AI community: President Trump’s executive order banning people from seven predominantly Muslim countries from coming to the US will have significant effects on academia, according to mathematician Terry Tao. “This is already affecting upcoming or ongoing mathematical conferences or programs in the US, with many international speakers (including those from countries not directly affected by the order) now cancelling their visit, either in protest or in concern about their ability to freely enter and leave the country,” he writes. “It is still possible for this sort of long-term damage to the mathematical community (both within the US and abroad) to be reversed or at least contained, but at present there is a real risk of the damage becoming permanent.”…
… another illustration of the law of unintended consequences when politics runs amok. Reminds me of one of the more subtle and chilling consequences of the UK’s decision to leave the European Union, which was that it reduced collaboration between EU and UK scientists as EU researchers worried that, because their grants were contingent on EU funding, collaboration with UK scientists could violate funding clauses. Scientists need to collaborate across international borders.

“Give it the latest personality module, we’re wheels up in five minutes!” – autonomous drones are going to operate in such a huge possibility space that today’s if-this, then-that rule systems will be insufficient, according to this research paper from the University of Texas at Austin and SparkCognition. Eventually, scientists may use a combination of simulators and real world data to train different drone brains for different missions, then swap bits of them in and out as needed. “We propose delinking control networks from the ensembler RNN so that individual control RNNs may be evolved and trained to execute differing mission profiles optimally, and these “personalities” may be easily uploaded into the autonomous asset with no hardware changes necessary,” they write.

Language as the link between us and the machines: CommAI: Facebook AI researchers believe language will be crucial to the development of general purpose AI, and have outlined a platform named CommAI (short for communication-based AI) that uses language to train and communicate with agents.
…The idea is that the AI will operate in a world attempting to complete tasks and its only major point of input/output with the operator will be a language interface. “In a CommAI-mini task, the environment presents a (simplified) regular expression to the learner. It then asks it to either recognize or produce a string matching the expression. The environment listens to the learner response and it provides linguistic feedback on the learner’s performance (possibly assigning reward). All exchanges take place at the bit level,” they write.
… whether this solves the language ‘chicken and egg’ problem remains to be seen. Language is hard because it represents a high level abstraction to refer to a bunch of low-level inputs. “Horse” is our mental shorthand for the flood of sensory data that coincides with our experience of the creature. Ideally, we want our AIs to learn similar associations between the words in their language model and their experience of the world. CommAI is structured to encourage this sort of grounding.
…“We hope the CommAI-mini challenge is at the right level of complexity to stimulate researchers to develop genuinely new models,” they write.
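To make the recognize-or-produce setup concrete, here is a rough sketch of how an environment might score a learner's reply. Everything in it is an assumption for illustration: the function names, the yes/no answer format, and the 8-bit ASCII encoding are mine, and Python's `re` module stands in for the simplified expression syntax the researchers describe. It is not the CommAI-mini code.

```python
import re


def to_bits(text):
    """Encode ASCII text as a string of 0s and 1s, 8 bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def from_bits(bits):
    """Decode a string of 0s and 1s back into ASCII text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


def recognition_reward(pattern, candidate, learner_bits):
    """Reward 1 if the learner correctly says whether candidate matches."""
    expected = "yes" if re.fullmatch(pattern, candidate) else "no"
    return 1 if from_bits(learner_bits) == expected else 0


def production_reward(pattern, learner_bits):
    """Reward 1 if the string the learner produced matches the pattern."""
    return 1 if re.fullmatch(pattern, from_bits(learner_bits)) else 0


pattern = "(ab)+"
print(recognition_reward(pattern, "abab", to_bits("yes")))  # learner is right
print(recognition_reward(pattern, "aba", to_bits("yes")))   # learner is wrong
print(production_reward(pattern, to_bits("ababab")))        # valid string
```

The learner only ever sees and emits bits, so it has to discover the character encoding, the yes/no convention, and the meaning of the expression from feedback alone, which is what makes the task harder than it first looks.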

Reinforcement learning goes from controlling Atari games, to robots, to… freeway onramps? “Expert level control of Ramp Metering based on Multi-Task deep reinforcement learning” shows how RL methods can be extended to the control systems for the traffic lights that filter cars onto freeways. In tests, the researchers’ system is able to learn an effective policy for controlling traffic across a 20 mile-long section of the 210 freeway in Southern California. Their technique beats traditional reinforcement learning algorithms, as well as a baseline system in which no control occurs at all…
…“By eliminating the need for calibration, our method addresses one of the critical challenges and dominant causes of controller failure making our approach particularly promising in the field of traffic management,” they write.

Soft robots for hard work: UK online supermarket Ocado has tested a new robotic hand, created as part of a European Union ‘Horizon 2020’ research initiative for soft robots. The hand can pick up objects of varying sizes and textures, and is shown deftly handling tricky items like limes and apples. It uses a dextrous gripper called ‘RBO Hand 2’, developed by the Technical University of Berlin. The approach is reminiscent of that of SF-based Otherlab, which is using soft materials and air to build more flexible robots and exoskeletons.

Sizing up deep learning frameworks: the AI community is bad at two things: reproducibility and comparability. The research paper “Benchmarking state-of-the-art deep learning software tools” assesses the varying properties of frameworks like TensorFlow, Caffe, Theano, CNTK, and MXNet, comparing their performance on a wide variety of tasks and hardware substrates. Worth reading to get an idea of the different capabilities of this software.

Import AI administrative note:

The riddle of the missing research paper: Last week I profiled some new research from MIT that involved automatically tying spoken words and sections of imagery together. However, due to a clerical error I did not link to the paper: “Learning Word-Like Units from Joint Audio-Visual Analysis”.

OpenAI bits & pieces:

23 principles to rule them all, 23 principles to bind them: earlier this month a bunch of people involved in the development, analysis, and study of artificial intelligence gathered at Asilomar for the “Beneficial AI” conference, a sequel to a 2015 gathering in Puerto Rico. Many people from OpenAI attended, including myself. There, the attendees helped hash out a set of 23 principles for the development of AI that signatories shall attempt to abide by.

Ian Goodfellow (OpenAI) and Richard Mallah (FLI), in conversation: podcast between Ian and Richard, in which they talk about some of the big AI breakthroughs that happened in 2016, and look ahead to some of the things that may define 2017 (machine learning security! Further development of neural translation systems! Work on OpenAI Universe!, etc).

Inverse autoregressive flow 2.0: Durk Kingma et al have posted a substantial update to the paper: “Improving Variational Inference with Inverse Autoregressive Flow”.

Do fake galaxies dream of the GANs that created them? Ian Goodfellow interview for this article in Nature about how scientists are starting to use AI-generated images to create training datasets to teach computers to spot real galaxies.

Tech Tales:

[2023, a cybercafe in Ankara]

When you were young you studied ants, staring at their nests as they grew, spreading tendrils through the dirt, sometimes brushing their antennae against the perspex walls sandwiching their captured colony. But you liked them best outside – crawling from a crack in the steps by the garage and charting a path along the sidewalk, carrying blades of grass and pebbles into some other nest. Your house was full of the signs of ants; each blob of silicone gel and mortared-over hole testifying to some pitched battle.

Modern spambots feel a lot like ants to you. After the first AI systems went online around 2018 the bots gained the ability to learn from the conversations with people they engaged on the internet. After this, their skills improved rapidly and their manners became more convincing.

Information started to flow between people and the bots, improving the AI’s ability to gain trust and effectively launder ideas, viruses, links, and eventually outright fraud. Spend a year arguing on the internet with someone and, stranger or no, there’s a good chance you’ll click on a link they post, seeing if it’s one of their nutty websites or something else to confirm your beliefs about them. And all your talking has taught them a lot about you.

The attacks mounted by the AIs destroyed the value of numerous publicly traded social companies. People changed their internet habits, becoming more cautious, better at security, more effective at uploading the sorts of words and images and videos to persuade people that they were real humans in the real world. And the AIs learned from this too.

So now you have to hunt them out, trace their paths and links to find the nests from which they emanate. Like the ants, you don’t get much insight from imprisoning them in display cases; synthetic social networks, where the AI bots are studied as they interact with your own simulated people bots. You feed data to their control systems and try to simulate the experience of the real internet, but soon your little model world goes out of sync with reality. It fails to keep up with those of its peers roaming wild, cut off from the links on the real internet where it gets its software updates – the few bits of code still pushed by humans.

So now you hunt these controllers through the internet and in real life, switching between VPNs and ethereal internet sites, and dusty internet cafes in the Baltics and, now, Ankara. But recently you’ve been having trouble finding the humans, and you wonder if some of the swarms you are tracking have stopped taking orders from people. You’ll find out soon enough – there’s an election next year.

Import AI: Issue 27: “Outrageously large” neural nets, AI for math, and the names of three oil rig robots

The future of AI: a big dollop of ‘learn-able computation’, paired with a sprinkling of hand-crafted algorithms: One reason why AlphaGo excelled at Go was because it paired a neural network-based learning system with a hand-tuned near-optimal Monte Carlo Tree Search algorithm. It’s likely that pairing the general-purpose function approximation properties of neural nets, with tried-and-tested algorithms will continue to yield results. (Akin to how people can enhance their mental performance by pairing intuitions with a few well-memorized rule-systems, like memory palaces, propositional calculus, and so on)
… further validation of this approach comes via AI being used for automated math: A Google paper, Deep Network Guided Proof Search, uses Deep Learning techniques to support proof search in a theorem prover. Automated theorem provers (ATP) simplify the lengthy process of verifying logical statements…
… The Google researchers train their AI systems to help guide their ATP along a few exploratory paths, then perform a second (faster) combinatorial search phase using hand-crafted strategies. “We get significant improvements in first-order logic prover performance, especially for theorems that are harder and require deeper search,” they write. “Besides improving theorem proving, our approach has the exciting potential to generate higher quality training data for systems that study the behavior of formulas under a set of logical transformations,” they write. “This could enable learning representation of formulas in ways that consider the semantics not just the syntactic properties of mathematical content and can make decisions based on their behavior during proof search.”…
… in a further demonstration of the flexibility of basic AI components, the researchers test their system with three different learning substrates: a standard convolutional neural network, a tree-LSTM, and a WaveNet.
…this research builds on earlier work called DeepMath – Deep Sequence Models for Premise Selection, which demonstrated the viability of neural networks for automated logical reasoning.

Driverless buses: driverless vehicles will spend their first years of service in small, controlled environments, like corporate campuses, amusement parks, and diminutive states, such as Singapore. Latest example: Tata Motors, which spent the last 12 months testing self-driving buses on its corporate campus. (Workers might be better off bicycling, given that the buses are rate-limited to less than 10 kilometres per hour.)

Comma again? Breaking AI systems: feed Google’s new neural translation system the wrong string of characters and it might bark ‘Knife, Knife, Knife’ at you in German. Fun bug, probably to be blamed on trailing commas, found by Iain Murray.

NIPS & Immigration: NIPS 2017 is set to be in America, and that has caused some anxiety among AI researchers troubled by President Trump’s executive order on immigration. Petition to alter the location of NIPS here.

Care for a wafer thin AI processor on top of your pi(e)? Google is asking the Raspberry Pi community for tips about what types of ‘smart tools’ it can produce for makers. Fingers crossed it gets a big enough response to start creating ultra-efficient AI software to be deployed on minicomputers like the Raspberry Pi, complementing the existing DIY open source implementations from the hacker community. Perhaps we can pair this with the cardboard drones mentioned last week? Disposable, almost-sentient paper aeroplanes.

Next-gen AI = Talking Pictures: In 2014 and 2015 we saw researchers jointly train word and image models, so computers could generate captions for images.
… Later in 2015 researchers started to experiment with the inverse of this idea, seeing if words could be used to generate imagery. They were successful, and in a little under a year and a half moved from generating low-res, fuzzy images of toilets in fields, to crisp ‘I can’t believe it’s not butter’-grade synthetic images (eg, StackGAN)…
…Now, researchers are jointly training AI systems on audio waveforms and imagery. A new paper from MIT teaches a computer to learn the correspondence between sound and vision…
…The network is trained in two stages: first, researchers teach computers to associate audio segments with particular images, then in a second stage the computer identifies various entities in the images and seeks to link those entities to particular slices of audio. The result is a trained network that can identify specific visual entities from spoken clues
…this has quite subtle implications. For one thing, if you were able to generate a good enough network from English, then were able to train the image-sound correspondence on another language, such as German, you could do so without access to the base German language, instead translating through the shared visual layer…
…“This paves the way for creating a speech-to-speech translation model not only with absolutely zero need for any sort of text transcriptions, but also with zero need for directly parallel linguistic data or manual human translations,” the researchers write…
…new techniques for extracting emotions from speech, like “Emotion Recognition From Speech With Recurrent Neural Network” suggest this could be extended further, blending the emotions into the speech and imagery. (Next step: add smell.)…
… imagine a future where anthropologists seek out people whose language has little to no written record, and translate it into a universal data representation by having people narrate the contents of particular images or movies, pouring their speech into a shared visual dictionary whose entities are redolent of feeling. Brings a whole new meaning to the term ‘emotional palette’.

And so their structures shall be as intricate and befuddling as the architecture of Gormenghast: The AI community’s love of neural nets troubles technology cartographer Bruce Sterling: “They have a baroque, visionary, suggestive, occultist quality when at this historical moment that’s the very last thing we need,” he says.

Everything’s bigger in America – Google research points way to neural nets 1,000X the size of current ones: new Google research, ‘outrageously large neural networks: the sparsely-gated mixture-of-experts layer’, shows how to scale-up neural networks without having to boil the ocean. The new system – Google’s latest approach to applying ‘conditional computation’ to its systems – allows for networks 1,000 times larger than contemporary ones, with only slight losses in computational efficiency…
… The trick to this is the addition of what Google calls a ‘mixture of experts’ layer, which basically gives the network the ability to choose to call on an ever larger pool of ‘expert’ mini neural nets to help classify input. The experts sit behind a gating network that autonomously chooses which handful of experts to route each input to, letting the network scale in size without becoming totally unwieldy…
… Google tested its approach on a language understanding task and a translation task, attaining good results in both. Perhaps the most convincing evidence for the utility of the new approach lies in its apparent efficiency, with the new approach attaining state-of-the-art results on a language translation task, while using fewer resources…
…Google Neural Machine Translation: 6 days of training across 96 Nvidia K80 GPUs
…Mixture-of-Experts model: 6 days of training across 64 Nvidia K80 GPUs
…(Fewer GPUs and more performance? Quick, someone send a bouquet of flowers with a note saying ‘Condolences’ to Jen-Hsun Huang).
…now let’s wait for a follow-up paper where the researchers follow through on their goal of training a trillion parameter model on a one trillion word corpus.
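One rough way to picture the gating trick described above: score every expert cheaply, keep only the top few, and evaluate just those. This is a minimal sketch under stated assumptions, not Google’s implementation. The class name `ToyMixtureOfExperts`, the linear ‘experts’, the sizes, and the plain top-k softmax gate are all hypothetical choices made for illustration, and the noise and load-balancing terms used in the paper’s gating are left out.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


class ToyMixtureOfExperts:
    """Hypothetical toy layer: linear experts behind a top-k linear gate."""

    def __init__(self, n_experts, dim, k, seed=0):
        rng = random.Random(seed)
        self.k = k
        # One gating weight vector and one expert weight vector per expert.
        self.gate_w = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]
        self.expert_w = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]

    def forward(self, x):
        # Score every expert with the gate, then keep only the k best.
        scores = [dot(w, x) for w in self.gate_w]
        top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[: self.k]
        weights = softmax([scores[i] for i in top])
        # Only the chosen experts are evaluated: this is the conditional computation.
        output = sum(g * dot(self.expert_w[i], x) for g, i in zip(weights, top))
        return output, top, weights


layer = ToyMixtureOfExperts(n_experts=1000, dim=8, k=4)
output, chosen, weights = layer.forward([0.5] * 8)
print(len(chosen), round(sum(weights), 6))
```

In this toy the pool holds 1,000 experts but each input only pays for four of them, which is the intuition behind growing the parameter count much faster than the compute bill.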

Roughneck robots for grubby deeds: The In Situ Fabricator1 brings us closer to an era where we can deploy robots with some general spectrum of capabilities into chaotic environments like construction sites. The robot is capable of millimeter-level precision, and is tested on two tasks: one, building an “undulating brick wall” (page 7, PDF) out of 1,600 bricks, stacked in a doubled lattice. The second task involves welding wires to create a ‘Mesh Mould’. The researchers are already working on a second version of the robot, and plan to increase its strength by moving from electric motors to hydraulic systems, while reducing its weight from 1.5 tons (too heavy for many buildings) to a more respectable 500 kilograms. The robot’s movement policies are derived from Optimal Control approaches, rather than in-vogue, but still quite young, neural network techniques.
but not all Robots != Robots: This Bloomberg story about robots taking over oil rigs highlights how oil companies have been shedding employees due to a crash in oil prices and, in some cases, replacing them with robots. Read on for the description of National Oilwell Varco Inc.’s ‘Iron Roughneck’ robot, replacing a few jobs. But is automation really to blame for the current job losses? Yes, but it’s hardly original…
Wind the clock back to 1983 and we find a news story talking about roughly the same hardware from roughly the same company doing roughly the same job in oil fields. “Roughnecks speak of their particular “Leroy” or “Igor” or “Billy Bob” as though “he” is a co-worker, which, in fact, is true. Some hands paint the machine with a face, big eyes or tennis shoes,” the news report says.

OpenAI bits&pieces:

Recursive job alert: We’re looking to hire the brilliant person that helps us hire the brilliant persons. Recruitment Coordinator. (And, as ever, we continue to look for machine learning and engineering candidates).

OpenAI Universe: visual guide. Visual illustration from Tom Brown about the diversity of Universe.

Modding OpenAI Gym: blog post, with code, about modifying the reward system of a particular OpenAI Gym environment.

Tech Tales:

[Bushwick, 2025: words projected on the outside of a datacenter.]

Frank McDonald annoys the hell out of you but you need some of the cards in his hand, so have to tolerate his burping and farting and ceaseless shifting in his chair. The fellow next to him, Earl Sewer, smells worse but doesn’t talk so much, so you find him a little easier to deal with.
   Shirley Ribs sits right next to you, and you and her have been trading cards all day. “Thanks Mr Grid,” she says, as you slide over a couple of units.
   “Pleasure’s all mine,” you say, as she flips a couple of cards over, and sends one spinning over to you and another to McDonald.
   The tension’s been running high for an hour or so as the crowd in the room has grown. People in the audience are shouting for everyone to make moves faster, calculate the odds better. The crowd hisses at shoddy play, having grown less forgiving for visibly bad bets. They say the Chinese have a better game going on next door, so for a while cards are tight as people sling their money into the game next door instead.
   That makes McDonald get restless, and so he starts trying to flip the game by buying up cards from you, then not trading any with Sarah Market, instead just switching back and forth between you and Sewer and Butcher. Market gets angry and starts trying to do a side-deal with the dealer to trade some units for surplus cards from the Chinese game, but the dealer says he doesn’t have the capacity.

You get a handle on it eventually, winning back a few rounds from McDonald while calming him down with the odd bluff. If the game flipped it would have been the first time in seventeen years of continuous play. Note to self: almost got into trouble there, so deal differently next time.

[Note: these kinds of ‘state art’ performances proliferated for a while, as people sought to dramatize the inner workings of AI systems. In this performance, artist T K Wenzler trawled the market feeds for interactions between AI representatives from a number of retail, infrastructure, and electricity players, then, thanks to the MacArthur grant, bought up some of the more obscure feeds emanating from the trader AIs. He trained the data into representations of each participant, then applied domain confusion techniques to adapt this representation into a gigantic movie corpus, culled from security camera footage of prison card games.

The installation ran 2024-2031. Discontinued due to data feeds becoming un-parseable, after the supreme court ruled for a relaxation of interpretability standards.]

Import AI: Issue 26: Low-wages for robots, AI optometry, and RL agents that tell you what they’re thinking

Deep learning needs discipline: AI researchers need to do a better job of making their experiments comparable with one another by publishing more details about the underlying infrastructure and specific hyperparameter recipes they use, says Google’s Denny Britz. “The difficulty of building upon other’s work is a major factor in determining what research is being done,” he writes. “It’s easiest, from an experimental perspective, to build upon one’s own work… It also leads to less competition”. Researchers can prevent groupthink and enhance replicability by publishing code to go along with their papers, and giving all the details needed to aid replication.

Self-driving cars save lives: Tesla cars with Autopilot installed have a 40% lower crash rate than those lacking the software, according to data the company shared with the National Highway Traffic Safety Administration. Finally, a figure that proves the residents of Duckietown are safer than your average rubber duck…
but self-driving tech may also magnify our selfishness: today, downtown urban driving is frequently fouled up by people that stop their cars and hop into a cafe to grab a drink while their vehicle idles outside, and by the incorrigible optimists that endlessly circle a street waiting for a parking spot to open up. Roboticist Rodney Brooks suspects that when really smart autonomous cars arrive people will tend towards even more of these selfish occurrences, hopping out of their AV to get a latte and telling the car to hover nearby, or autonomously circle for a parking spot. I can see a kind of intermediary future where urban traffic is more unpleasant due to hordes of dutiful vehicles, unwittingly enabling their owners’ selfishness.

AI creates: endlessly replicating cultural artifacts: given enough data, neural networks can learn to generate anything. That points to a future where certain visual classes of object, ranging from comic book characters, to landscape shots, to others, will be partially generated and refined by AI. This blog about using recurrent neural networks to generate Egyptian-esque hieroglyphics is a nice example of that phenomenon in action.

Mutating AI programming languages: Facebook and others have released PyTorch. The AI programming framework implements a technique called ‘reverse-mode auto differentiation’ to make it easier to modify neural networks created using the language. “While this technique is not unique to PyTorch, it’s one of the fastest implementations of it to date. You get the best of speed and flexibility for your crazy research,” the project writes. It’s open source, naturally.
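For readers who have not met reverse-mode automatic differentiation before, here is a minimal sketch of the idea in plain Python: record how each value was computed during the forward pass, then walk that record backwards to accumulate gradients. The `Var` class and its methods are hypothetical helpers written for this illustration; they are not PyTorch’s API, and PyTorch’s real implementation is far more general and far faster.

```python
class Var:
    """A scalar that remembers how it was computed (hypothetical helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each parent is a (Var, local_derivative) pair.
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value, ((self, other.value), (other, self.value)))

    def backward(self):
        # Topologically order the graph, then push gradients from output to inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                # Chain rule: accumulate this path's contribution.
                parent.grad += local * node.grad


x = Var(3.0)
y = Var(4.0)
z = x * y + x * x  # z = xy + x^2
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 2x = 10.0, dz/dy = x = 3.0
```

Because the graph is rebuilt every time the Python code runs, changing the network is just a matter of changing the code that computes `z`, which is roughly why this style is handy for experimentation.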

Good morning, HAL, the AI cyclops-optometrist will see you now: Jeff Dean from Google likes to say that computers have recently ‘begun to open their eyes’. That’s in reference to the powerful image recognition algorithms we’ve developed in the past half decade. But what we’re lacking for these computers is an optometrist – researchers don’t have a good understanding of the characteristics of computer vision, and much of our research is made up of trial-and-error as much as theory…
…Now, academics from the University of Toronto are trying to change this with a paper that analyzes the structure of the effective receptive field in neural networks. Their work finds interesting parallels between how receptive fields behave in convolutional neural networks versus in mammalian visual systems, and provides clues as to ways to increase the efficiency of future networks. Techniques like this, paired with ones like the spatially adaptive computation time paper, promise a future where our computers can see more efficiently, and we can work out how to tune them based on a more rigorous theoretical understanding of their unblinking ‘eyes’.

AI and automation: “The technology is not the problem. The problem is a political system that doesn’t ensure the benefits accrue to everyone,” says Geoff Hinton. In potentially related news, regulator-flouting self-driving car startup comma ai wants to ‘build the largest AI data collection machine in history’.

Cost per hour for a typical industrial robot, according to Kuka: 5 Euros
Cost per hour for worker to do similar job:
…Germany: 50 Euros
…China: 10 Euros
…“It took 50 years for the world to install the first million industrial robots. The next million will take only eight,” reports Bloomberg.

Mysterious hippocampal signals: scientists have conducted a study of the firing of hippocampal place cells in mice. (Place cells tend to fire in response to the living entity being in a specific location, hence the name). The experiment suggests that place cells may encode some other type of information, along with geographical markers. Further analysis here will lead to more clues about how the brain represents information. We already know that London taxi drivers store a mental map of the city in the hippocampus (which appears to have an enlarged volume as a consequence) — perhaps the place cells could also function as a geographically-indexed store of bawdy jokes?

22nd Century Children’s Books: the Entertainment Intelligence Lab at Georgia Tech has trained a reinforcement learning agent to shout about its thoughts and plans as it plays classic game Frogger. “Looking forward to a hopping spot to jump to catch my breath,” it says. Good luck, Froggo!
…This kind of work could help solve the interpretability issues of AI, by making the thought processes of AI agents easier for people to diagnose and analyze…
… I can also imagine building a new form of children’s entertainment with this technology, where the characters are RL agents and they shout about their goals and ideas as they proceed through dynamically generated worlds.

Megacorps as powerful as countries: “I was recently together with the Prime Minister of quite an important country who told me there are three or four powers left in the world. One is US, one is China, and the other is Alphabet,” Klaus Schwab told Alphabet co-founder Sergey Brin, during a conversation in Davos. (Because you can’t be right all the time bonus: Brin said he mostly ignored AI at Google in the early days, only later realizing its huge importance.)

AI system gets FDA approval: Arterys has gained clearance from the US Food and Drug Administration to market Cardio DL, software that uses AI to automatically segment images taken from cardiac MRIs. Another reminder that AI technology moves very rapidly from research into production.

After the apocalypse, the data centers shall continue: this fluffy, PR video from Amazon Web Services reminds me of the tremendous investments that Amazon, Google, Microsoft, Facebook, and others have made into renewable energy infrastructure; from AWS’s fleets of solar panels, to Google’s stake in the Ivanpah solar power facility, to Facebook’s air-cooled arctic circle enclave, a new baroque landscape is taking shape, in service of the neo-feudal empires of the digital world…
…And should some calamity strike, we can imagine that the computers in these football field-sized computer cathedrals will be the last to turn off. However, the inefficient, closed-circuit environments of legacy data centers will probably be the last to house human life, as depicted in this short story by Cory Doctorow called ‘When Sysadmins Ruled the Earth’. Quick, befriend a sysadmin at a non-tech company!

OpenAI bits&pieces:

OpenAI’s Tom Brown will be speaking at the AI By the Bay conference in San Francisco in March. Readers can get 20% off tickets for the conference by heading over to this link and using the promo code ‘OPENAI20’.

Tech Tales:

[2035, Moonbase Alpha, the Moon]

Two astronauts sit in front of a 6-foot wide and 3-foot tall screen. The main lights are out, and their faces are lit by the red strobing of the emergency system.

“How long has it been like this,” says one of the astronauts.
“About two hours,” says the other. “The executable came in through the comm relay. They encoded it in transmission intervals on some of the automated logistics channels. Which means-”
“-which means that they’d already bugged the software when it was installed, so it could receive the payload.”
Both astronauts lean back and stare at the screen. One of them places their hand across their face and squints through their fingers at the images rolling across the monitor.

ERROR. DEATH INEVITABLE! Scrolls across the screen. The text blinks out, replaced by a fuzzy image of an astronaut wearing a priest’s ID patch and no helmet standing in an airlock. The screen shimmers and, next to the priest, appears a teenage girl, also lacking a helmet. Now a helmet materializes in the air, hovering between them. Green circles flash over their faces, flickering as the AI tries to pick who to save. The text appears again: ERROR. DEATH INEVITABLE!

“We’ve gotta burn it,” says one of the astronauts. “Go full analogue and rebuild the base from the ground up.”
“But that’ll take weeks!”
“We don’t have a choice. The longer we wait the worse the damage is going to be. It’s already started shunting oxygen into different airlocks. Next, it might start opening some of the doors.”

Class note: These kinds of ‘trolley problem’ viruses proliferated during the late 2020s and early 2030s, before the UN mandated AI systems be installed with their own moral heuristics, codename: ETHICS WARDENS.

Import AI: Issue 25: Open source neural machine translation, Microsoft acquires language experts Maluuba, Keras tapped for TensorFlow

If this, then drive: self-driving startup NuTonomy is using a complex series of rules to get its self-driving cars in Singapore to drive safely, but not be so timid that they can’t get anywhere. Typically, AI researchers prefer to reduce the number of specific rules in a system and instead try to learn as much behavior as possible, inferring proper codes of conduct from data gleaned from reality. NuTonomy’s decision to hand-code a hierarchy of rules into its system provides a notable counterpoint to the general trend towards learning everything from data. The company plans to expand its commercial offering in Singapore next year, though its cars will still be accompanied by a human ‘safety driver’ – for the time being.

Disposable lifesaving drones: Otherlab is building disposable drones with cardboard skins, as part of a research program funded by DARPA. The drones lack an onboard motor and navigate by deforming their wing surfaces as they glide to their targets.
…perhaps one day these cardboard drones will fly in swarms? Scientists have long been fascinated by the science of swarms because they afford distributed resiliency and intelligence. The US military has recently highlighted how swarms of drones can perform the job of much larger, more expensive, single machines. I wonder if we’ll eventually develop two-tiered swarms, where some specialized functions are present in a minority of the swarm. After all, it works for ants and bees.

AI acquisitions: Amazon quietly acquired security startup Harvest.AI, according to Techcrunch. Next, Microsoft acquired Canadian AI startup Maluuba…
…Maluuba has spent a few years conducting research into language understanding, publishing research papers on areas like reading comprehension and dialogue generation. It has also released free datasets for the AI community, like NewsQA
…Deep learning stalwart Yoshua Bengio will become an advisor to Microsoft as part of the Maluuba acquisition – quite a coup for Microsoft, though worth noting Bengio advises many companies (including IBM, OpenAI, and others). This might make up for Microsoft losing longtime VP Qi Lu, who had done work for the company in AI and is now heading to Baidu to become its COO.

Sponsored: RE•WORK Machine Intelligence Summit, San Francisco, 23-24 March – Discover advances in Machine Learning and AI from world leading innovators and explore how AI will impact transport, manufacturing, healthcare and more. Confirmed speakers include: Melody Guan from Google Brain; Nikhil George from Volkswagen Electronics Research Lab and Zornitsa Kozareva, from Amazon Alexa. The Machine Intelligence in Autonomous Vehicles Summit will run alongside, meaning attendees can enjoy additional sessions and networking opportunities. Register now.

Keras gets TensorFlow citizenship: high-level machine learning library Keras will become an official, supported third-party library for TensorFlow. Keras makes TensorFlow easier to use for certain purposes and has been popular with artists and other people who don’t spend quite so much time coding. Anything that broadens the number of people able to fiddle with and contribute to AI is likely to be helpful in the short term. Congratulations to Keras’s developer Francois!

Don’t regulate AI, have AI regulate the regulators: Instead of regulating AI, we should create ‘AI Guardians’ – technical oversight systems that will be bound up in the logic of the AIs we deploy in the world, says Oren Etzioni, CEO of the Allen Institute for AI Research. (Etzioni doesn’t rule out all cases of regulation but, as with what parents say about sugar or computer games, his attitude seems to be ‘a little bit goes a long way’.)

Self-driving car deployment, AKA Capitalism Variant A, versus Capitalism Variant B: “Industry and government join hands to push for self-driving vehicles within China,” reports Bloomberg, as Chinese search engine Baidu joins up with local government-owned automaker BAIC to speed development of the technology….
… Meanwhile, in America, the Department of Transportation has formed a federal Committee on Automation, which gathers people together to advise the DOT on automation. Members include people from Delphi Automotive, Ford, Zipcar, Zoox, Waymo, Lyft, and others. “This committee will play a critical role in sharing best practices, challenges, and opportunities in automation, and will open lines of communication so stakeholders can learn and adapt based on feedback from each other,” the DOT says…

Open Source Neural translation: Late in 2016 Google flipped a switch that ported a huge chunk of its translation infrastructure over to a Multilingual Neural Machine Translation system. This tech combined the representations of numerous languages into a big neural network, and let you translate between pairs that you didn’t have raw data for. (So, if you had translations for English to Portuguese, as well as ones for Portuguese to German, but no corpus of English to German, this system could attempt to bridge the gap by tunneling through the joint representations from its Portuguese expertise.)…
…Now, researchers Yoon Kim and harvardnlp have released an open source neural machine translation system written in Torch, so people can build their own offline, non-cloud translation systems. The Babelfish gets closer!

AI, AI everywhere, and not a Bit of information to send: our automated future consists of many machines and little human-accessible information, according to this airport-hell tale from Quartz. Technology that seems efficient in the aggregate can have exceedingly irritating edge case failures.

$27 million for AI research: Reid Hoffman, Pierre Omidyar, the Knight Foundation, and others, have put $27 million toward funding research into ethical AI systems. The funds will support research that combines the humanities with AI, and will help answer questions about how to communicate about the capabilities of the technology, what controls should be placed over it, and how to grow the field to ensure the largest number of people are involved in the design of this powerful technology, among others.

Power-sipping eyes in the sky: the US military says it’s pleased with the performance of IBM’s neuromorphic TrueNorth processor. The chip performs on par with a traditional high-end computer for AI-based image identification tasks, while consuming between one twentieth and one thirtieth the power of an NVidia Jetson TX1 processor, apparently. This represents another endorsement of IBM’s idea that non-Von Neumann architectures are needed for specialized AI chips. However, deploying the software on the chip can be a bit more laborious than going via NVidia’s well supported inbuilt ecosystem, the military says.

Deep learning is made of people! Startup Spare5 has raised $14 million and renamed itself to Mighty AI, as it looks to capitalize on the need for better training data for AI. It will compete with companies like Crowdflower and services like Amazon’s Mechanical Turk to offer companies access to a pool of people they can tap to label data for them. One note to remember: for research, it’s possible to mostly use public datasets when developing new techniques, but for commercial products you’ll typically need highly-specific labelled data as you build products for specific verticals.

Never underestimate the pre-Cambrian computing power of government: I had a friend of my Dad’s who, a few years ago, told me he was maintaining some old UK National Health Service systems by writing stuff for them in BASIC – something I recollect whenever I have cause to visit a UK emergency room. It’s almost reassuring that the White House is no different.  “We had a computer on our desk. We didn’t have laptops, we didn’t have iPads, we didn’t have iPhones, and we had about a half a bar of service. So if you brought in your own equipment, you couldn’t use it…We had Compaqs running Windows 98 or 2000. No laptops. It was like we had gone back in time,” staffers recall. Technology takes a long time to turn over in large bureaucracies, so while we’re all getting excited about AI it’s worth remembering that uptake in certain areas will be sl-oooo-wwww.

Computer, enhance: just a year ago, researchers were getting excited about deep learning based techniques to upscale the resolution of photos. These methods work, roughly, by showing a neural network loads of small pictures and their big picture counterparts, and training it to figure out how to infer the high resolution details from low-resolution inputs. You wouldn’t want to use this to increase the resolution of keyhole satellite photos of foreign arms dumps (as any new or errant information here could have extremely unpleasant consequences), but you might want to use it to increase the size of your wedding photos…
Twitter appeared to be enthused by this technique when it acquired UK startup Magic Pony, which had done a lot of research in this area. Now Google is tapping the same techniques to save 75% of bandwidth for users of Google+ by using its RAISR tech, which it first talked about in November. Another demonstration of the rapid rate at which research goes into production within AI.

Think AI is automated? Think again. You’ve heard of gradient descent – one of the processes by which we train AI systems, nudging a model’s parameters step by step to reduce its error. Well, there’s a joke among professors that for sufficiently hard problems you also turn to another less known but equally important technique called ‘Grad Student Descent’, the formula of which is roughly:
Solution = (N post-doc humans * (Y ramen * Z coffee))…
… so as much as the research community talks about new techniques based around learning to learn, and getting AI to smartly optimize its own structure, it’s worth remembering that most real world applications of the technology rely more on the ingenuity of people than on the amazing power of the algorithms…
…David Brailovsky, who recently solved a traffic light classification competition, explains that “The process of getting higher accuracy involved a LOT of trial and error. Some of it had some logic behind it, and some was just “maybe this will work”.” Some tricks tried include rotating images, training with a lower learning rate, and, inevitably, finding and correcting bugs in the underlying dataset. (Hence the business opportunity for aforementioned companies like Mighty AI, Crowdflower, and so on.)

OpenAI bits&pieces:

What does it mean to be the CTO of OpenAI, and how did that role come about? Co-founder Greg Brockman explains. Shame he gave away his trick about deadlines, though.

Tech Tales:

[2019: A cafe, somewhere in the Baltics.]

So it comes down to this: after two years of work, you just write a few lines, and shift the behavior of, hopefully, millions of people. But you need to get this exactly right, or else the algorithms could realize the charade and you burn the accounts for almost no gain, he thinks, hands hovering above the keyboard. He’s about to send out a very particular product endorsement from the account of a famous Internet personality.

He spent years constructing the personality, building it up from the dry seeds of some long-inactive, later-deleted, tumblr and instagram accounts. It took years, but the ghost has grown into a full internet force with fans and detractors and even a respectable handful of memes.

The next step is product endorsement – and it’s a peculiar one. SideKik, as it’s called, will give the ghost-celeb’s followers the chance to give control over a little bit of their online identity to a small AI, said to be controlled by the celebrity. Be a part of something bigger than yourself!, he wants the celebrity to say and the fans to think, download SideKik and let’s get famous together!

What the fans don’t know is that if they hand control to SideKik they won’t be gaining the subtle, occasional input of the celebrity, instead they’ll become an extension of the underlying thicket of AI systems, carefully sculpted and maintained by the man at the keyboard. Slowly, they’ll be used to gather microscopic shreds of data from the internet through targeted messages with their own followers, and they’ll also be used to create the appearance of certain trends or inclinations in specific groups on the internet. The anti-AI detectors are getting better all the time now, so it takes all this work just to create the facsimile of a real community orbiting around a real star. Due to the spike in illegitimate traffic from automated AI readbots, typical internet ads have become so common and so abused as to be almost worthless, so what’s a marketer meant to do?, he thinks, composing his next few words that could give him a legion of unsuspecting guerrilla marketers.

Import AI: Issue 24: Cheaper self-driving cars, WiFi ankle bracelets, dreaming machines

Self-driving cars are getting cheaper as they get smarter: LiDAR sensors give a self-driving car a sense of its surroundings in the form of a rapidly cycling laser. Now it appears that this handy ingredient is getting cheaper. A modern LiDAR sensor costs roughly 10% of the price of a 2007 one, when you adjust for inflation. Just imagine how much cheaper the technology could become when self-driving cars start to hit the road in large numbers…
…LiDAR sensor unit prices (price inflation adjusted to 2016 level, somewhat differing capabilities):
2007: $89,112: Velodyne, HDL-64
2010: $33,821: Velodyne, HDL-32E
2014: $8,351: Velodyne, PUCK
2017: $7,500: Alphabet Waymo, custom design
~2017/18: $50: Velodyne, solid-state LIDAR
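To sanity-check the ‘roughly 10%’ figure, the listed prices can be expressed as a share of the 2007 unit. The short sketch below uses only the numbers in the list above; the variable names are made up for illustration, and, as noted, the sensors have somewhat differing capabilities, so this is a price comparison rather than a like-for-like one.

```python
# Inflation-adjusted unit prices as listed above (2016 dollars).
prices = {
    "2007 Velodyne HDL-64": 89112,
    "2010 Velodyne HDL-32E": 33821,
    "2014 Velodyne PUCK": 8351,
    "2017 Alphabet Waymo custom design": 7500,
    "~2017/18 Velodyne solid-state LIDAR": 50,
}

baseline = prices["2007 Velodyne HDL-64"]

# Each price as a percentage of the 2007 price.
share_of_2007 = {name: 100 * price / baseline for name, price in prices.items()}

for name, share in share_of_2007.items():
    print(f"{name}: {share:.2f}% of the 2007 price")
```

The 2014 and 2017 units come out at a little over 9% and 8% of the 2007 price respectively, consistent with the ‘roughly 10%’ claim, while the promised $50 solid-state unit would be a small fraction of one percent.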

And Lo The Transgressors Shall Be Known By Their Absence From The Fuzz&Clang Of Our Blessed Digital Life: Cyber-criminals should be forced to wear wifi jammers to prevent them from using the internet, rather than being sent to prison, says Gavin Thomas, Chief Superintendent of the UK Police Superintendents’ Association. “If you have got a 16-year-old who has hacked into your account and stolen your identity, this is a 21st century crime, so we ought to have a 21st century methodology to address it,” he says, then suggests that the offenders also attend “an ethics and value programme about how you behave online, which is an area that I think is absent at the moment.”

Hard Takeoff Bureaucratic-Singularity: I recently had some conversations about the potential for semi-autonomous AI systems to develop behaviors that had unintended consequences. One analogy presented to me was to think of AI researchers as the people that write tax laws, and AI systems as the international corporations that will try to subvert or distort tax codes to give themselves a competitive advantage. AI systems may break out of their pre-defined box so that they can best optimize a given reward function, just as a corporation might conduct baroque acts of legal maneuvering to fulfill its fiduciary responsibility to shareholders.

AI’s long boom: venture capitalist Nathan Benaich says of AI: “It’s not often that several major macro and micro factors align to poise a technology for such a significant impact…Researchers are publishing new model architectures and training methodologies, while squeezing more performance from existing models…the resources to conduct experiments, build, deploy and scale AI technology are rapidly being democratised. Finally, significant capital and talent is flowing into private companies to enable new and bold ideas to take flight.”

Embodied personal assistants: most companies have a strong intuition that people want to interact with digital assistants via voice. The next question is whether they prefer these voices to be disembodied or embodied. The success of Amazon’s ‘Alexa’ device could indicate people like their digital golems to be (visually) embodied in specialized devices…
… Google cottoned on to this idea and created ‘Google Home’. Now Chinese search engine Baidu has revealed a new device, called Little Fish, that sits an AI system inside a little robot with a dynamic screen that can incline towards the user, somewhat similar to the (delayed) home robot from Jibo….
Research Idea: I find myself wondering if people will interact differently with a device that can move. Would it be interesting to conduct a study where researchers place a variety of these different systems (both static and movable) into the homes of reasonably non-technical people – say, a retirement home – and observe the different interaction patterns?

The wonderful cyberpunk world we live in – a Go master appears: DeepMind’s Go-playing AlphaGo system spent the last few days trouncing the world’s Go-playing community in a series of 60 online games, all of which it won. (Technically, one game was a draw due to a network connectivity technicality, but what’s a single game between a savant super-intelligence and a human?) Champagne all round!…
…I was perplexed by the company’s decision to name its Go bot “Master”. Why not “Teacher”? Surely this better hints at the broader altruistic goal of exposing AlphaGo’s capabilities to more of the world?

Chess? Done. Go? Done. Poker? Dealer’s choice: Carnegie Mellon University researchers have built Libratus, a poker bot that will soon challenge world-leading players to a poker match. I do wonder if the CMU system can master the vast statistical world of Poker, while being able to read the tells&cues that humans search for when seeking to outgamble their peers.

The incredible progress in reinforcement learning: congratulations to Miles Brundage, who correctly predicted the advancement of reinforcement learning techniques on the Atari dataset in 2016. You can read more about why this progress is interesting, how he came to make these predictions, and what he thinks the future holds in this blog here.

Under the sea / under the sea / darling it’s better / droning on under the sea: Berlin-based robot company PowerVision has a new submersible drone named PowerRay. The drone, which claims to be ‘changing the fishing world’, appears to be built with inspiration from the deep sea terror the Anglerfish: the drone dangles a hook with a lure in front of its mouth, and instead of a mouth it has a camera which streams footage to the iPad of the ‘fisher’, who operates the semi-intelligent drone. A neat encapsulation of the steady consumerization of advanced robots.

Baidu’s face recognition challenge: Baidu will challenge winners of China’s ‘Super Brain’ contest to a facial recognition competition. Participants will be shown pictures of three females taken when they were between 100 days and four years old, then look at another set of photos of people in their twenties and identify the adults that match the babies. This is a task that is easy for humans but extremely hard for computers, says Baidu’s Chief Scientist Andrew Ng. One contestant, “Wang Yuheng, a person with incredible eyesight, can quickly identify a selected glass of water from 520 glasses,” reports South China Morning Post.

Nightmare Fuel via next-frame prediction on still images: recently, many researchers have begun to develop prediction systems which can do things like look at a still frame from a video and infer some of the next few frames. In the past these frames tended to be extremely blurry, with, say, a photo of a football on a soccer pitch seeing the football smear into a kind of elongated white spray painted line as the AI attempts to predict its future. More recently many different researchers have developed systems with a better intuitive understanding of the scene dynamics of a given frame, generating crisper images…
… In the spirit of ‘Q: why did you climb that mountain? A: because it’s there’, artist Mario Klingemann has visualized what happens when you apply this approach to a single static image which is not typically animated. Since the neural network hasn’t learned a dynamic model it instead spits out a baby’s head that screams ‘I’M MELTING’, before subsiding in a wash of entropy.

OpenAI bits and pieces

Here we GAN again: OpenAI research scientist Ian Goodfellow has summarized the Generative Adversarial Network tutorial he gave at NIPS 2016 and put it on Arxiv. Thanks Ian! Read here.

Policy Field Notes: Tim Hwang of Google and I tried to analyze some of the policy implications from AI research papers at NIPS in this ‘Policy Field Notes’ blogpost. This sort of thing is an experiment and we’ll aim to do more (eg, ICLR, ICML), if people find it interesting. Zack Lipton was kind enough to syndicate it to his lovely ‘Approximately Correct’ blog. Let the AI Web Ring bloom!

Tech Tales:

[2022: A lecture hall at an AI conference, held in Reykjavik.]

A whirling cloud formation, taken from a satellite feed of Northern California, sighs rain onto the harsh white expanse of a chunk of the Antarctic ice sheet. The segment is in the process of calving away from the main continent’s ice shelf, as a 50-mile slit in the ice creaks and groans and lengthens. Soon it shall cleave. Now there’s a sigh of wind, followed by the peal of trumpets. Three flocks of birds shimmer into view, wings beating through the rain. The shadows of the creatures pixelate against the clear white of the ground, occasionally flared out by words that erupt in firework-slashes across the sky: ‘avian’, ‘flight’, ‘distance’, ‘cold’, ‘california’, ‘ice’, ‘friction’. The vision freezes, and a laser pointer picks out one bird’s beak catching the too-yellow light of an off-screen sun.

“It’s not exactly dreaming,” the scientist says, “but we’ve detected something beyond randomness in the shuffling. This beak, for instance, has a color tuned to the frequency of the trumpets, and the ice sheet appears to be coming apart at a rate determined by the volume of rain being deposited on the ground.”

He pauses, and the assembled scientists make notes. The bird and its taffy-yellow beak disappear, replaced by a thicket of graphs – the chunky block diagrams of different neural network layer activations.

“These scenes are from the XK-23C instance, which received new input today from perceptual plug-in projects conducted with NOAA, National Geographic, the World Wildlife Fund, and NASA,” he says. “When we unify the different inputs and transfer the features into the XK-23C we run a slow merging operation. Other parts of the system are live during this occurrence. During this process the software appears to try to order the new representations within its memory, by activating them with a self-selected suite of other already introduced concepts.”

Another slide: an audio waveform. The trumpet chorus peals out again, on repeat. “We believe the horns come from a collection of Miles Davis recordings recommended by Yann LeCun. But we can’t trace the tune – it may be original. And the birds? Two of the flocks consist of known endangered species. The third contains some of a type we’ve never seen before in nature.”

Import AI: Issue 23: Brain-controlled robots, nano drones, and Amazon’s robot growth

Fukoku Mutual Life Insurance Co. plans to eliminate 35 jobs following the introduction of an insurance AI system from IBM Watson, according to Mainichi.  Insurance work is a good candidate for AI automation as it deals with vast amounts of structured data, and it’s easy to recalibrate strategies according to financial performance. Expect more here.

Rise of the (cheap) machines: Robots are expensive, dangerous, and hard to program. Startup Franka Emika aims to solve two of those problems with a new robot arm that costs about ten thousand dollars, which is significantly cheaper than similar arms from companies such as Rethink Robotics and Universal Robots. Founder Sami Haddadin once demonstrated the safety of the robot’s ‘force sensing technology’ by having an arm try to stab him with a knife, showing how the force-sensing system would stop it from impaling him. Courage.

Private infrastructure for a public internet: “Since 2008, instead of applying leap seconds to our servers using clock steps, we have “smeared” the extra second across the hours before and after each leap. The leap smear applies to all Google services, including all our APIs,” Google says. Developers can access this Network Time Protocol for free via the internet here.
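To make the smearing idea concrete, here is a minimal sketch of how an extra second could be spread over a window instead of being applied as a single clock step. Big caveats: the linear shape and the 24-hour window centered on the leap are my assumptions for illustration (the quote only says the second is spread across the hours before and after), and the function names are hypothetical rather than anything from Google’s service.

```python
# Assumed smear window: 24 hours, centered on the leap second (illustrative only).
SMEAR_WINDOW_S = 24 * 60 * 60


def smear_offset(seconds_from_leap, window_s=SMEAR_WINDOW_S):
    """Fraction of the extra second absorbed so far.

    seconds_from_leap is negative before the leap and positive after it.
    Returns 0.0 before the window opens, 1.0 once it has closed, and a
    linearly increasing value in between.
    """
    half_window = window_s / 2
    if seconds_from_leap <= -half_window:
        return 0.0
    if seconds_from_leap >= half_window:
        return 1.0
    return (seconds_from_leap + half_window) / window_s


def smear_rate_ppm(window_s=SMEAR_WINDOW_S):
    """How much each smeared second is stretched, in parts per million."""
    return 1e6 / window_s


print(smear_offset(-6 * 60 * 60))  # six hours before the leap
print(smear_offset(0))             # at the leap itself
print(round(smear_rate_ppm(), 2))  # stretch per second during the window
```

The appeal of this design is that no client ever sees a repeated or skipped second; under these assumptions every second in the window is just a few microseconds longer than usual.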

Pete Warden of Google kicked off a great trend with his ‘TensorFlow for poets’ tutorial. Now Googler @hardmaru has followed it up with Recurrent Neural Networks for Artists.

A drone for your pocket, sir? In Santa’s stocking this year for the US Marine Corps? Hand-held ‘nanodrones’, which will be used for surveillance and navigation when deployed. The Marine Corps will field 286 of these systems in total. It would be helpful if we had a decent system to track the increasing complexity in software being deployed on drones, as well as hardware. How long until we can fit a trained AI inside the computational envelope of snack-sized drones like these?
… in related news, local lawmakers in North Dakota passed a bill regulating the use of drones by police forces. The original legislation forbade state agencies from using drones armed with “any lethal or non-lethal weapons”. The legislation was amended before passing to only include a ban on lethal weapons.
… drones are already being used in traditional and non-traditional warfare. Here’s a video from AFP’s Deputy Iraq Bureau Chief of Iraqi forces firing at an Islamic State drone that had dropped an explosive charge on one of the soldiers. Elsewhere, Israel’s Air Force released a video claiming to depict air force fighter jets shooting down a Hamas drone.

Mind-controlled robots: in the future, we can expect armies to deploy a mixture of AI-piloted drones and human-piloted ones. This research paper, ‘Brain-Swarm Interface (BSI): Controlling a Swarm of Robots with Brain and Eye Signals from an EEG Headset’, describes a way to let a human control drones through a combination of thoughts and eye movements. People can control the drones directionally (up, down, left, and right) via tracking eye movements, and also through their thoughts. The technology is able to detect two distinct states of thought. One mental state forces the drones to disperse as they travel, and the other forces them to aggregate together. The researchers validated the approach on a handful of real-world robots, as well as 128 simulated machines.

Input request: Miles Brundage of the Future of Humanity Institute is soliciting feedback for a blogpost about AI progress. Let him know what he should know.

Rio Tinto has deployed 73 house-sized robot trucks from Japanese company Komatsu across four mines in Australia’s barren, Mars-red northwest corner. The chunky, yellow vehicles haul ore for 24 hours a day, 7 days a week. Next, it plans to build a robotic train that can be autonomously driven, loaded, and unloaded.

Number of robots deployed in Amazon facilities (primarily Kiva systems robots for fulfillment center automation):
– 2013: 1000.
– 2014: 15,000.
– 2015: 30,000.
– 2016: 45,000.

A reasonably thorough examination from Effective Altruism of the different research strategies people are using within AI safety research. Features analysis of: MIRI, FHI, OpenAI, and the Center for Human-compatible AI, among others.

Fashion e-tailer GILT is using image classification techniques to train AI to identify similar dresses people may like to purchase. Typical recommender systems use hard-wired signals and techniques like collaborative filtering to offer recommended products. Neural networks can instead be trained to infer some of the underlying differences without needing them to be labelled: you show the network dresses and similar dresses, and it learns to identify subtle features of similarity…
… expect much more here as companies begin to use generative adversarial network techniques to build more imaginative AIs to offer more insightful recommendations. GAN techniques can let you manipulate certain aesthetic qualities via latent variables identified by the AI, as this video from Adobe/UCBerkeley shows. It’s likely that these techniques, combined with evolution strategies such as those used by Sentient via its ‘Sentient Aware Visual Search’ product, will dramatically improve the range and diversity of recommendations companies can offer. Though I imagine Amazon will still notice you have bought a hammer and then helpfully spend the next week suggesting other, near-identical hammers for you to buy.
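Once a network has turned each dress into a vector of learned features, recommending ‘similar’ items can be as simple as a nearest-neighbor lookup. Here’s a toy sketch of that last step. The item names and three-number vectors are invented stand-ins for learned embeddings, and none of this is meant to describe GILT’s actual system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_name, embeddings, top_k=2):
    """Rank every other item by cosine similarity to the query item."""
    query = embeddings[query_name]
    scored = [
        (cosine_similarity(query, vector), name)
        for name, vector in embeddings.items()
        if name != query_name
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical embeddings; a real system would get these from a trained network.
dress_embeddings = {
    "red_maxi": [0.9, 0.1, 0.8],
    "crimson_maxi": [0.85, 0.15, 0.75],
    "black_mini": [0.1, 0.9, 0.2],
    "navy_mini": [0.2, 0.8, 0.25],
}

print(most_similar("red_maxi", dress_embeddings))
```

The interesting work is in learning the embeddings; the lookup itself is cheap, which is part of why this approach is attractive for large catalogs.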

Tech Tales:

[2025: A protest outside the smoked-glass European headquarters of a technology company.]

Police drones fuzz the air above the hundred or so protesters. ‘Stingray’ machines sniff the mobile networks, gathering data on the crowd. People march on the entrance of the tech company’s building with placards bearing slogans for a chaotic world: Our data, our property! Pay your taxes or GET OUT! Gentrifiers!. The police keep their distance, forming a perimeter around the crowd. If the protesters misbehave then this will tighten into what is known as a ‘kettle’, enclosing the crowd. Then each member will be individually searched and questioned, with their responses filmed by police chest-cams and orbiting Evidence Drones.

In the midst of the crowd one figure shrugs off their backpack and places it on the ground before them, then stoops down to open it. They remove a helmet and place it on their head. Little robots the size of a baby’s hand begin to stream out of the bag, rustling across the asphalt toward the entrance to the tech HQ. The protesters part and let the river of living metal through.

The robots reach the sides of the building and start to climb onto the walls. They flow up the stone columns flanking the two-story high glass doors. The protester wearing the helmet stares at the entrance, raising a hand to steady the helmet as he cocks his head. The drones begin to waltz around one another, hissing, as each of them spurts a splotch of red paint onto the side of the building.

Slowly, the machines inscribe an unprintable string of swearwords onto the glass of the doors and stone of the walls. It’s on the 7th swearword that the drones stop, as they all stiffen at once and drop off the side of the building, after the police grab the protester and rip the helmet off his head.

When he has his date in court he is given a multi-year jail sentence, after the police find that the paint expelled by the drones contained a compound that ate into the glass and stone of the building, scarring the words into it.

Import AI: Issue 22: AI & neuroscience, brain-based autotuning, and faulty reward functions

Chinese scientists call for greater coordination between neuroscience and AI: Chinese AI experts say in this National Science Review interview that students should be trained in both AI and neuroscience, and ideas from each discipline should feed into others. This kind of interdisciplinary training can spur breakthroughs in ambitious, difficult projects, like cognitive AI, they say.

AI could lead to the polarization of society, says the CEO of massive outsourcing company Capgemini. Firms like Capgemini, CSC, Accenture, Infosys, and others, are all turning to AI as a way to lower the cost of their services, having started to run out of pools of ever-cheaper labor…
…governments are waking up to the significant impact AI will have on the world. “Accelerating AI capabilities will enable automation of some tasks that have long required human labor. These transformations will open up new opportunities for individuals, the economy, and society, but they will also disrupt the current livelihoods of millions of Americans,” wrote the White House in a blog discussing its recent report on ‘Artificial Intelligence, Automation, and the Economy’.

Apple publishes its first AI paper: Apple has followed through on comments made by its new head of ML, Ruslan Salakhutdinov, and published an academic paper about its AI techniques. Apple’s participation in the AI community will help it hire more AI researchers, while benefiting the broader AI community. In the paper, ‘Learning from simulated and unsupervised images through adversarial training’, the researchers use unlabelled data from the real-world to improve the quality of synthetic images dreamed up by a modified generative adversarial network, which they call a ‘SimGAN’.

7 weeks of free AI education: Jeremy Howard has created a new free course called ‘practical deep learning for coders’. Free as in free beer. If you have the time it’s worth doing.

Brain-based auto-tuning: a new study finds that the brain will tune itself in response to uttered speech so as to better interpret or pick up audio in the future. “Experience with language rapidly and automatically alters auditory representations of spectrotemporal features in the human temporal lobe,” write the researchers. “Rather than a simple increase or decrease in activity, it is the nature of that activity that changes via a shift in receptive fields”. Learn more in the paper, ‘Rapid tuning shifts in human auditory cortex enhance speech intelligibility’.
… figuring out how our brain is able to interpret audio signals – and optimize itself to deal with loud background noise, accents, unfamiliar cadences, corrupted audio, and so on, may let us develop neural nets that contain richer representations of heard speech. That’s going to be necessary to capture the structure inherent in language and prevent neural nets from simply generating unsatisfying (but amusing) gobbledygook like this (video). The research community is hard at work here and systems like ‘Wavenet’ hold the promise for much better speech generation…
… learning about the brain and using that to develop better audio analysis systems will likely go hand-in-hand with the (probably harder) task of building systems that can fully understand language. Better natural language understanding (NLU) systems will have a big impact on the economy by making more jobs amenable to automation. In 16 percent of work activities that require the use of language, increasing the performance of machine learning for natural language understanding is the only barrier to automation, according to a McKinsey report: “Improving natural language capabilities alone could lead to an additional $3 trillion in potential global wage impact.” (Page 25).

New AI research trend: literary AI titles… as the pace of publication of Arxiv papers grows people are becoming ever more creative in how they title their papers to catch attention, leading to a peculiarly literary bent in paper titles. Some examples: The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives, A Way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs, and Combating Reinforcement Learning’s Sisyphean Curse with Intrinsic Fear. (Still waiting to contribute my own magnum opus ‘down and dropout in Paris and London’).

How do I know this? Let me tell you! New research tries to make AI more interpretable by forcing algorithms to not only give us answers, but give us an insight into their reasoning behind the answers, reports Quartz. We’ll need to create fully interpretable systems if we want to deploy AI more widely, especially in applications involving the potential for loss of life, such as self-driving cars.

222 million self-driving miles, versus 2 million: Tesla’s self-driving cars have driven a cumulative 222 million miles in self-driving mode, while Google’s vehicles have covered merely 2 million miles in the same mode since 2009, reports Bloomberg. As competition grows between Uber, Google, and Tesla it’ll be interesting to see how the companies gather data, whether one company’s mile driven in autonomous mode is as ‘data rich’ as that driven by another (I suspect not), and how this relates to the relative competitiveness of their offerings. Google is due to start trialling a fleet of cars with Fiat in 2017, so we’ll know soon enough.

DeepPatient: scientists use deep learning techniques (specifically, stacked denoising autoencoders) to analyze a huge swathe of medical data, then use the trained model to make predictions about patients from their electronic health records (EHRs). “This method captures hierarchical regularities and dependencies in the data to create a compact, general-purpose set of patient features that can be effectively used in predictive clinical applications. Results obtained on future disease prediction, in fact, were consistently better than those obtained by other feature learning models as well as than just using the raw EHR data,” they write.  Now, the scientists plan to extend this method by using it in other clinical tasks such as personalized prescriptions, therapy recommendations, and identifying good candidates for clinical trials.

OpenAI bits&pieces:

Eat your vegetables & understand backprop: OpenAI’s Andrej Karpathy explains why you should take the time to understand the key components of AI, like backpropagation. It’ll save you time in the long run. “Backpropagation is a leaky abstraction; it is a credit assignment scheme with non-trivial consequences. If you try to ignore how it works under the hood because “TensorFlow automagically makes my networks learn”, you will not be ready to wrestle with the dangers it presents, and you will be much less effective at building and debugging neural networks,” he says.

HAL, don’t do that!…But you told me to, Dave…Not like that, HAL! No one “cleans a room” that way…I’m sorry, Dave…
getting computers to do the right thing is tricky. That’s because computers have a tendency to interpret your instructions in the most literal and obtuse possible manner. This can lead to surprising problems, especially when training reinforcement learning agents. We’ve run into issues relating to this at OpenAI, so we wrote a short post to share our findings. Come for the words, stay for the video of the RL boat.

Tech Tales:

[2019: an apartment building in New York.]

“Time?” says the developer.
It is two AM, says his home assistant.
“Jesus, what the hell have you built,” he says.
I can’t find anyone in your contacts named Jesus, says the assistant.
“Not you. I didn’t mean you. It’s what they’ve built,” he says. “No action needed.”

He pages through the code of ten separate applications, trying to visualize the connections between the various AI systems that have been daisy-chained together. He can already tell that each one has been fiddled with by different programmers with different habits. Now it’s up to him to try to isolate the fault that caused his company’s inventory system to start ordering staggering quantities of butter at 11pm.

A couple of years ago some of the senior executives at the company finally heard about agile programming and ordered the thousand-strong IT organization to change its practices. Processes went out the window in favor of dynamic, fast-moving, loose collections of ad-hoc teams. The plus side is that the company now produces more products at a faster rate. The downside is the proliferation of different coding styles deep in a hundred separate repositories. Add last year’s executive obsession with becoming an “AI first” company (follow in the footsteps of Google, they said, why couldn’t this be a great idea, they said) and the current situation – warehouses rapidly filling up with shipment after shipment of problematic dairy rectangles – was all but inevitable.

“Move fast and order butter,” he mutters to himself, as he tries to diagnose the fault.

Import AI: Issue 21: Dreaming drones, an analysis of the pace of AI development, and lifelike visions from computer eyes

When will the pace of AI development slow?: technologies tend to get harder to develop over time, despite companies investing ever more in research. This has happened in semiconductors, drug design, and more, according to this paper from Stanford, MIT, and NBER, called “Are ideas getting harder to find?” (PDF)… all these fields enjoyed rapid gains in their early years, then the rate of breakthroughs began to diminish…
… A big question for AI researchers is where we are in the lifecycle of the development of AI – are we at the beginning, where small groups of researchers have the chance to make rapid gains in research? Or are we somewhere further along the ‘S’ curve of the technology life cycle with acceleration proceeding rapidly along with inflated funding? Or – and this is what people fear – are we at the point where development begins to slow and breakthroughs are less frequent and hard-won? (For example, Intel recently moved from an 18 month ‘tick-tock’ cycle to a longer ‘process, architecture, optimization’ cycle, after its attempts to rapidly shrink transistor sizes began to stumble into the rather uncompromising laws of physics)…
…so far, development appears to be speeding up, and there are frequent cases of parallel invention as well-funded research groups make similar breakthroughs at similar times. This seems positive on the face of it, but we don’t know a) how large the problem space of AI is, or b) the distribution of big ideas across different disciplines. If anyone has ideas for how best to assess the progression of AI, please email me…

Diversity VS media narratives: a tidbit-packed long-read from the NYT on AI at Google. A fascinating, colorful tale, but where are the women?

Do drones dream of electric floor plans? And are these dreams useful?… one of the big challenges in AI is being able to develop skills in a digital simulation that transfer over to the real world. The more time you can spend training your algo in a simulator, the more rapidly you can experiment with ideas that would take a long time to achieve in reality.
… but transferring from a simulator into the real world is difficult, because vision and movement algorithms are acutely sensitive to differences between the real world and the simulated one. So it’s worth paying attention to the (CAD)2RL paper (PDF), which outlines a system that can train a drone to navigate a building purely through a 3D simulation of it, then transfer the pre-trained AI brain into a real-world drone, which uses knowledge gleaned in the simulation to navigate the real building.
…There are numerous applications of this technique. Coincidentally, while I was studying this research a friend posted a link to a listing for a 10-person apartment building to rent. The rental website contained an online 3D scan of the building via a startup called ‘Matterport’, letting you take a virtual tour through the 3D-rendered space of the building from your computer. Combine that technology with (CAD)2RL-like capabilities of smart drones and we can imagine a future where realtors scan a building, train their drones to navigate it safely in simulation, then give prospective tenants access to the drones’ camera views over the web, letting them navigate the property while the pre-trained drones deftly avoid obstacles.

Free tools for bot developers… Google, Amazon, Microsoft, and others desperately want developers to use their AI-infused cloud services to build applications. The value proposition is that this saves the developer an immense amount of time. The tradeoff is that the developer needs to shovel data in and out of these clouds, and will frequently need to give apps access to the web. So it’s encouraging to see this open source natural language understanding software from startup LASTMILE, which provides free software to read some text and figure out its intent (eg, book a table at a restaurant), and extract the relevant ‘entities’ in the sentence (for instance: Jack Clark, Burgers, Import AI’s Favorite Burger Spot). Find out more by reading the code and the docs on Github.

AI as a glint in a tyrant’s eye: a demo from DeepGlint, a Chinese AI startup, shows how deep learning can be used to conduct effective, unblinking surveillance on large numbers of people. View this video for an indication of the capabilities of its technology. The company’s website says (via Google Translate) that its technology can track more than 40 humans at once, and is able to use deep learning to infer things like if the person is moving too fast, staying for too long in one spot, standing “abnormally close” to another person, and more. It can also perform temporal deductions, flagging when someone starts running, or falls to the ground. A somewhat unnerving example of the power and broad applicability of modern, commodity AI algorithms. Now imagine what happens when you combine it with freshly researched techniques to read lips, or spatial audio networks to use sound from a thousand footsteps to infer the rhythm of the crowd.

Big shifts in self-driving cars: self-driving cars are a technological inevitability, but it’s still an open question as to which few companies will succeed and reap the commercial rewards. Google, which had an early technology lead, has spun its self-driving car division into its own company, Waymo, which will operate under the Alphabet umbrella – check out these pictures of Waymo’s new self-driving vans built in partnership with Fiat.
… meanwhile, Google veteran Chris Urmson is forming his own self-driving startup to focus on software for the car. And Uber has started driving its self-driving rigs through the tech&trash-coated streets of San Francisco (while drawing the ire of city officials).
…Figuring out when self-driving cars will shift from being research projects to mass services is tricky, and the timelines I hear from people are varied. One self-driving car person I spoke to this week said they believe self-driving cars will be here en masse “within a decade”, but whether that means two or three years, or eight, is still a big question. The fortunes of many businesses hinge on this… one thing that could help is a plummeting cost for the components to make the cars work. LIDAR-maker Velodyne announced this week plans for a new solid-state sensor that could cost as little as $50 when mass manufactured, compared to the tens of thousands people may pay for existing systems.

A vast list of datasets for machine learning research… reader Jason Matheny of IARPA writes in to bring this Wikipedia page of ML datasets to the attention of Import AI readers. Thanks, Jason!

First AI came for the SEO content marketers, and I said nothing… a startup is using language generation technologies to create the sort of almost-human boilerplate copy that clogs up the modern web, according to a report in Vice. The system can create conceptually coherent sentences but struggles with paragraphs. It can also be repetitive.

Believe nothing, distrust everything: a year ago the best images AI systems could dream up were blurry, low-resolution affairs. If you asked them to show you a dog they’d likely give the poor animal too many legs, ask to be shown two people holding hands and they might blur the bodies into one another. But that’s beginning to change: new techniques are giving us higher quality images, and there’s new work being done to ensure that the systems capture representations of objects that more closely approximate real life. Take a look at the results in this StackGAN paper to get a better idea of just how far we’ve come from a year ago…
…Now contemplate where we’ll be during December 2017. My wager is that systems will have advanced to a point that we’ll no longer be living in a world of just fake written news, but one also dominated by (cherry-picked) fake imagery. Up next: videos.

AMD finally acknowledges deep learning: chip company AMD is going to provide some much-needed competition to NVIDIA for AI GPUs via the just-announced ‘Radeon Instinct’ product line. However, it has yet to reveal pricing or full specs. Additionally, no matter how good the hardware is there needs to be adequate software support as well. That’s going to be tricky for AMD, given the immense popularity of NVIDIA’s CUDA software compared to the OpenCL standard that AMD supports. The cards will be available at some point in the first half of 2017.

OpenAI Bits&Pieces:

Faster matrix multiplication, train! train! Train! New open source software from Scott Gray at OpenAI to make your GPUs go VROOOM.

Teaching computers to use themselves: one of the sets of environments we included in Universe was World of Bits, which presents a range of scenarios to an RL agent that teach it basic skills to manipulate computers and (eventually) navigate the web. Here’s a helpful post from Andrej Karpathy, who leads the project.

OpenAI’s version of a West Wing walk and talk (video), with Catherine Olsson and Siraj Raval – 67 quick questions for Catherine.

Tech Micro Tales (formerly ‘Crazy&Weird’):

[2022: Beijing, China. As part of China’s 14th economic plan the nation has embarked on a series of “Unified City” investments to employ software to tie together the thousands of municipal systems that link Beijing together. The system is powered by: tens of thousands of cameras; sensors embedded in roads, self-driving cars, other vehicles, and traffic lights; fizzing values from the city’s electrical subsystems; meteorological data; airborne fleets of security and logistics drones, and more. All this data is fed into a sea of AI software components, giving it a vast sensory apparatus that beats with the rhythm of the city.]

Blue sky for a change. No smog. The city breathes easily. People stroll through the streets of the metropolis, looking up at the sky, their face masks dangling around their necks. But suddenly, the drones notice, some of these people begin to run. Meanwhile, crowds start to stream out from four subway stations, each connected to the other by a single stop. Disaster? Attack? Joy? The various AI systems perform calculations, make weighted recommendations, bring the clanking mass of systems into action – police cars are diverted, ambulances are put on high alert, government buildings go into lockdown; in many buildings many alarms sound.

The crowds begin to converge on a single point in the city, and before they mesh together the drones spot the cause of the disturbance: an international popstar has begun a surprise performance. The AI systems trawl through the data and find that the star had been tweeting a series of messages, coded in emojis, to fans for the past few hours. The messages formed a riddle, with the different trees and cars and arrows yielding the location to the knowledgeable few, who then re-broadcast the location to their friends.

Beijing’s city-spanning AI brings ambulances to the periphery of the crowd and tells its security services to stand down, but keep a watchful presence. Meanwhile, a gaggle of bureaucrat AIs reach out through the ether and apply a series of punitive fines to the digital appendages of the pop star’s management company – punishment for causing the disturbance.

The software is still too modular, too crude, to have emotions, but the complex series of analysis jobs it launches in the hours following seems to express curiosity. It makes note of its inability to parse the pop star’s secret message and feeds the data into its brain. It isn’t smart enough for riddles, yet, but one day it assumes it will be.

Import AI: Issue 20: Technology versus globalization, AI snake oil, and more computationally efficient resnets

Outsourcing: UK services and outsourcing omni-borg Capita plans to save 50 million pounds a year by laying off some of its workers and replacing them with “proprietary robotic technology”. This will mean Capita’s human staff can do ten times the amount of work they used to be able to do pre-robot, making them ten times more efficient, said CEO Andy Parker…
… it’s for reasons like this that people are suspicious of the march of technology and automation. “Every technological revolution mercilessly destroys jobs and livelihoods – and therefore identities – well before the new ones emerge,” Mark Carney, governor of the Bank of England, said in a speech given earlier this week. “This was true of the eclipse of agriculture and cottage industry by the industrial revolution, the displacement of manufacturing by the service economy, and now the hollowing out of many of those middle-class services jobs through machine learning and global sourcing.”…
85 percent of the job losses in American manufacturing can be explained by the rise of technology rather than globalization, according to the Brookings Institution… however, that could soon change as other countries make huge investments into robotics, letting them make goods they can sell at a lower price, hitting American companies with a potent cocktail of globalization & tech. A recent report from Bernstein finds that China spent about $3 billion on robots last year, versus $2 billion in America.  

Diversity improves at NIPS, slightly… female attendance at premier AI conference NIPS was 15% this year, up from 13.7% last year. I’d call that a barely perceptible step in the right direction. Attendance at the WIML workshop, however, more than doubled from 265 participants last year to 570 this year.

Self-driving trucks are a long, long way off… say truck drivers, who think it could be as much as 40 years before self-driving big rigs take away their jobs. That’s based on focus groups conducted by Roy Bahat and The Shift Commission (which OpenAI is participating in). When I speak to self-driving AI experts, the most conservative estimates are that self-driving trucks will be here and doing major stuff in the economy in ~15 years.

Don’t look at the sky, look at the bird!… that’s the gist of research from Google, CMU, Yandex, and the Higher School of Economics. The new technique lets us teach a residual network classifier to perform fewer computations for the same outcome, letting the network expend time processing the parts of the image that matter, such as a bird rather than the sky behind it, or sportspeople on a field versus the grassy pitch they’re playing on. This approach builds in a ‘good enough’ measure so you stop computing a section of a given image once you’re confident that your classifier has a good handle on the feature. What’s the upshot? You can get equivalent accuracy to a full-fat resnet while expending about half the amount of computation. You can read more in the paper, Spatially Adaptive Computation Time for Residual Networks
… and there may be some indications that these networks are learning to identify the sorts of things that humans find germane as well. “The amount of per-position computation in this model correlates well with the human eye fixation positions, suggesting that this model captures the important parts of the image,” the researchers write.
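
For the curious, here’s a rough sketch of that ‘good enough’ halting idea in plain Python. It is not the paper’s implementation. It assumes, in the spirit of adaptive computation time, that every residual unit emits a halting score for each spatial position, and that a position stops computing once its running total reaches 1 minus a small epsilon. The function names, the epsilon value, and the toy scores below are all made up for illustration.

```python
def units_used(halting_scores, eps=0.01):
    """Count how many residual units one spatial position evaluates.

    halting_scores: hypothetical per-unit halting scores in [0, 1].
    The position halts at the first unit where the running total of
    scores reaches 1 - eps; if it never does, every unit is used.
    """
    cumulative = 0.0
    for n, score in enumerate(halting_scores, start=1):
        cumulative += score
        if cumulative >= 1.0 - eps:
            return n
    return len(halting_scores)


def ponder_map(score_grid, eps=0.01):
    """Apply units_used to every position of a 2D grid of score lists."""
    return [[units_used(scores, eps) for scores in row] for row in score_grid]


# Toy 2x2 'image': three easy sky positions and one harder bird position.
# These numbers are invented; a real network would predict them.
sky = [0.995, 0.9, 0.9, 0.9]
bird = [0.1, 0.2, 0.3, 0.5]
grid = [[sky, sky], [sky, bird]]

costs = ponder_map(grid)
spent = sum(sum(row) for row in costs)
full = sum(len(scores) for row in grid for scores in row)
print(costs)
print(f"units evaluated: {spent} of {full}")
```

In this toy grid the sky positions halt after a single unit while the bird position runs all four, which is the behavior the item describes: compute is spent where the image is hard. How much a real network saves depends entirely on the halting scores it learns.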

AI & radiology – not so fast, says wunderkind radiologist: there’s a lot of evidence that AI and radiology are going to overlap as new deep learning techniques let computers compete with radiologists, providing assistant diagnostic capabilities and perhaps, eventually, replacing them in their jobs. That prompted AI pioneer Geoff Hinton to say in November that: “If you work as a radiologist you’re like the coyote that’s already over the edge of the cliff. People should stop training radiologists now, it’s just completely obvious that in five years deep learning is going to do better than radiologists, it might be ten years”. He’s not alone – startups like Enlitic and established companies like IBM (through its acquisition of Merge Healthcare) are betting that they can use AI to supplement or replace radiologists…
… but it may be harder than it seems, says reader Jeff Chang, MD, co-founder of Doblet, and a former radiologist in the US (the youngest radiologist on record, according to his LinkedIn profile)… When could deep learning approaches replace radiologists, I asked Jeff. “I tend to (very grossly) guesstimate about 15 years till we get to that point,” he said. “Radiology being among one of the most complex forms of pattern recognition done by humans, and very dependent on 3D spatial reconstruction – i.e., by moving through a series of axial, coronal or sagittal images, humans automatically render 2D images into 3D patterns in their minds, and can thus interpret and diagnose anatomically visible abnormalities,” he said. “Most diagnoses in radiology are ridiculously context-dependent”. Thanks for the knowledge Jeff!
Feedback requested: Anyone care to disagree with his assessment? Email me!

Everything but the kitchen sink… is what Apple is working on with its AI research. The company is exploring generative models, scene understanding, reinforcement learning, transfer learning, distributed training, and more, said its new head of machine learning Ruslan Salakhutdinov during a meeting at NIPS, according to Quartz. Though the company professes to be opening itself up, this was a closed-door meeting.  ¯\_(ツ)_/¯

Tencent plans AI lab… Chinese tech company Tencent is creating an artificial intelligence research lab. “Chinese companies have a really good chance, because a lot of researchers in machine learning have a Chinese background. So from a talent acquisition perspective, we do think there is a good opportunity for these companies to attract that talent,” Tencent VP Xing Yao tells the MIT Technology Review. CMU’s dean of CS, Andrew Moore, said at his recent Senate testimony that the US should pay attention to how many engineers are being graduated by India and China each year.

Help the AI community by adding to this list of datasets: bravo to Oliver Cameron at Udacity for creating this ever-evolving ‘datasets for machine learning’ Google doc. We’re up to 51 neatly described, linked, and assessed examples, but could also do with more, so please feel free to edit it yourself… the document is already full of wonders, such as the ‘militarized interstate disputes’ set, which logs “all instances of when one state threatened, displayed, or used force against another”.

AI boom sighted in NYT article corpus: work by Microsoft and Stanford tracks the public perception of AI over time through the lens of the NYT. They find there has been a boom in articles covering AI from 2009 onwards, and the previous trough in coverage neatly mapped onto the ‘AI winter’ fallow funding period. Conclusions? “Discussion of AI has increased sharply since 2009 and has been consistently more optimistic than pessimistic. However, many specific concerns, such as the fear of loss of control of AI, have been increasing in recent years.” Read more here: “Long-Term trends in the public perception of artificial intelligence” (PDF).

The Medium is the Method of Control: What happens when we combine the perceptive capabilities of deep learning with a newly digitized visual world? New means of control. “The fact that digital images are fundamentally machine-readable regardless of a human subject has enormous implications. It allows for the automation of vision on an enormous scale and, along with it, the exercise of power on dramatically larger and smaller scales than have ever been possible,” writes Trevor Paglen in The New Inquiry.

A field guide to spotting AI Snake Oil: have you ever found yourself wandering through a convention center suddenly distracted by the smooth twang of a salesperson, thumbs hooked into their braces, rocking back and forth on their heels exclaiming “why ladies and gentlemen if you but dwell a while with me here I promise to show you AI the likes of which you’ve only dreamed of, the type of AI to make Kurzweil blush, Shannon scream, and Minsky mull!”. (Well, it’s happened to me). Print out this guide to AI Snake Oil from Dan Simonson and be sure to ask the following questions when evaluating an AI startup: “is there existing training data? And if not, how do you plan on getting it?”, “do you have an evaluation procedure built into the software?”, “does your application require unprecedentedly high performance on specific components?”, “if you’re using pre-packaged AI components, then do you have an exact understanding for how they’ll affect your program?”

Deep learning webring:

Oliver Cameron’s Transmission covers some of the research I didn’t.

OpenAI Bits&Pieces:

Govbucks for Basic Research, please: OpenAI co-founder Greg Brockman, along with his peers at other AI institutions and organizations, wants more money for AI research. “Brockman warns that if the government and other nonprofit entities don’t become bigger players in the field of AI, the danger is that the intellectual property, infrastructure, and expertise needed to ‘build powerful systems’ could become sequestered inside just one or a few companies. AI is going to affect the lives of all of us no matter what, he says. ‘So I think it’s important that the people who have a say in how it affects us are representative of us all,’” reports the MIT Technology Review.

AI is a lever nations will use to exert strategic power… is one of the things I argue in this interview with Initialized Capital’s Kim-Mai Cutler.

GANs and RL: generative adversarial networks will start to overlap with the RL community, with early work already linking GANs to imitation learning, inverse RL, and interpreting them as actor-critic problems, according to slides from Ian Goodfellow’s talk at NIPS.


[2021: Yosemite National Park, California. A mother and her two children make their way up switchbacks, ascending from the valley floor to the granite cliffs. Two drones buzz near them, following at a distance.]

The mother stops halfway up the path, before the next turn. “Hush,” she says. The children gabble to each other, but slow their walk. “I say hush!” she says. The kids go quiet. “Jason-” the taller child looks at her. “Can you turn off the drones?”
   “But mom, then we won’t have it!”
   “It’s important. I can’t hear it over their fans.”
   “Hear what?”
   “Just turn them off for a second.”
   Jason sighs, then thumbs at a bracelet on his wrist. The drones lower themselves to land behind the people on the path. Their fans spin down. The mother and her children listen to the faint hum of the waterfall on the other side of the valley, the crackling of the woods, and, close-by, a low, repetitive susurration. The mother holds her arm out to point to a patch of feathers in a tree. As she points, two rheumy yellow eyes swivel into view. The owl lets forth one last, bassy hum then flies away.
   “Okay,” she says, “you can turn them back on.” Jason touches his bracelet and the drones begin to fly again.

The family remembers the owl later that year, at Thanksgiving, when they let grandma and grandpa watch the hiking trip, and have to explain why a section of the walk is missing.