Import AI: Issue 21: Dreaming drones, an analysis of the pace of AI development, and lifelike visions from computer eyes

by Jack Clark

When will the pace of AI development slow?: technologies tend to get harder to develop over time, despite companies investing ever more in research. This has happened in semiconductors, drug design, and more, according to this paper from Stanford, MIT, and NBER, called “Are ideas getting harder to find?” (PDF)… all these fields enjoyed rapid gains in their early years, then the rate of breakthroughs began to diminish…
… A big question for AI researchers is where we are in the lifecycle of the development of AI – are we at the beginning, where small groups of researchers have the chance to make rapid gains in research? Or are we somewhere further along the ‘S’ curve of the technology life cycle with acceleration proceeding rapidly along with inflated funding? Or – and this is what people fear – are we at the point where development begins to slow and breakthroughs are less frequent and harder won? (For example, Intel recently moved from an 18-month ‘tick-tock’ cycle to a longer ‘process, architecture, optimization’ cycle, after its attempts to rapidly shrink transistor sizes began to stumble into the rather uncompromising laws of physics)…
…so far, development appears to be speeding up, and there are frequent cases of parallel invention as well-funded research groups make similar breakthroughs at similar times. This seems positive on the face of it, but we don’t know a) how large the problem space of AI is, or b) the distribution of big ideas across different disciplines. If anyone has ideas for how best to assess the progression of AI, please email me…

Diversity VS media narratives: a tidbit-packed long-read from the NYT on AI at Google. A fascinating, colorful tale, but where are the women?

Do drones dream of electric floor plans? And are these dreams useful?… one of the big challenges in AI is being able to develop skills in a digital simulation that transfer over to the real world. The more time you can spend training your algo in a simulator, the more rapidly you can experiment with ideas that would take a long time to achieve in reality.
… but transferring from a simulator into the real world is difficult, because vision and movement algorithms are acutely sensitive to differences between the real world and the simulated one. So it’s worth paying attention to the (CAD)2RL paper (PDF), which outlines a system that can train a drone to navigate a building purely through a 3D simulation of it, then transfer the pre-trained AI brain into a real-world drone, which uses knowledge gleaned in the simulation to navigate the real building.
…There are numerous applications of this technique. Coincidentally, while I was studying this research a friend posted a link to a listing for a 10-person apartment building to rent. The rental website contained an online 3D scan of the building via a startup called ‘Matterport’, letting you take a virtual tour through the 3D-rendered space of the building from your computer. Combine that technology with (CAD)2RL-like capabilities of smart drones and we can imagine a future where realtors scan a building, train their drones to navigate it safely in simulation, then give prospective tenants access to the drones’ camera views over the web, letting them navigate the property while the pre-trained drones deftly avoid obstacles.

Free tools for bot developers… Google, Amazon, Microsoft, and others desperately want developers to use their AI-infused cloud services to build applications. The value proposition is that this saves the developer an immense amount of time. The tradeoff is that the developer needs to shovel data in and out of these clouds, and will frequently need to give apps access to the web. So it’s encouraging to see this open source natural language understanding software from startup LASTMILE, which provides free software to read some text and figure out its intent (e.g., book a table at a restaurant), and extract the relevant ‘entities’ in the sentence (for instance: Jack Clark, Burgers, Import AI’s Favorite Burger Spot). Find out more by reading the code and the docs on GitHub.
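…For readers who haven’t met the ‘intent plus entities’ idea before, here is a rough Python sketch of what such a parser hands back. It is not the startup’s software or its API: the intent names, keyword lists, entity patterns, and the parse() function are all made up for illustration, and simple keyword and regex matching stands in for the trained models a real system would use.

```python
import re
from dataclasses import dataclass, field

# Hypothetical intents and trigger keywords (illustrative assumptions only).
INTENT_KEYWORDS = {
    "book_restaurant": {"book", "reserve", "table"},
    "greet": {"hello", "hi", "hey"},
}

# Hypothetical entity patterns; a real system would learn these from data.
ENTITY_PATTERNS = {
    "party_size": re.compile(r"\bfor (\d+)\b"),
    "time": re.compile(r"\bat (\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", re.IGNORECASE),
}


@dataclass
class Parse:
    """Structured result: one intent label plus any extracted entities."""
    intent: str
    entities: dict = field(default_factory=dict)


def parse(text: str) -> Parse:
    """Guess the intent by keyword overlap and pull entities out with regexes."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))

    # Pick the intent whose keywords overlap most with the text.
    best_intent, best_score = "unknown", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best_intent, best_score = intent, score

    # Record the first match for each entity pattern, if any.
    entities = {}
    for name, pattern in ENTITY_PATTERNS.items():
        match = pattern.search(text)
        if match:
            entities[name] = match.group(1)

    return Parse(best_intent, entities)


if __name__ == "__main__":
    print(parse("Book a table for 4 at 7pm"))
```

The point is the shape of the output: free text goes in, and a bot developer gets back a label to route on and a dictionary of slots to fill, without shipping the text off to someone else’s cloud.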

AI as a glint in a tyrant’s eye: a demo from DeepGlint, a Chinese AI startup, shows how deep learning can be used to conduct effective, unblinking surveillance on large numbers of people. View this video for an indication of the capabilities of its technology. The company’s website says (via Google Translate) that its technology can track more than 40 humans at once, and is able to use deep learning to infer things like if the person is moving too fast, staying for too long in one spot, standing “abnormally close” to another person, and more. It can also perform temporal deductions, flagging when someone starts running, or falls to the ground. A somewhat unnerving example of the power and broad applicability of modern, commodity AI algorithms. Now imagine what happens when you combine it with freshly researched techniques to read lips, or spatial audio networks to use sound from a thousand footsteps to infer the rhythm of the crowd.

Big shifts in self-driving cars: self-driving cars are a technological inevitability, but it’s still an open question as to which few companies will succeed and reap the commercial rewards. Google, which had an early technology lead, has spun its self-driving car division into its own company, Waymo, which will operate under the Alphabet umbrella – check out these pictures of Waymo’s new self-driving vans built in partnership with Fiat…
… meanwhile, Google veteran Chris Urmson is forming his own self-driving startup to focus on software for the car. And Uber has started driving its self-driving rigs through the tech&trash-coated streets of San Francisco (while drawing the ire of city officials).
…Figuring out when self-driving cars will shift from being research projects to mass services is tricky, and the timelines I hear from people are varied. One self-driving car person I spoke to this week said they believe self-driving cars will be here en masse “within a decade”, but whether that means two or three years, or eight, is still a big question. The fortunes of many businesses hinge on this… one thing that could help is a plummeting cost for the components to make the cars work. LIDAR-maker Velodyne announced this week plans for a new solid-state sensor that could cost as little as $50 when mass manufactured, compared to the tens of thousands people may pay for existing systems.

A vast list of datasets for machine learning research… reader Jason Matheny of IARPA writes in to bring this Wikipedia page of ML datasets to the attention of Import AI readers. Thanks, Jason!

First AI came for the SEO content marketers, and I said nothing… a startup is using language generation technologies to create the sort of almost-human boilerplate copy that clogs up the modern web, according to a report in Vice. The system can create conceptually coherent sentences but struggles with paragraphs. It can also be repetitive.

Believe nothing, distrust everything: a year ago the best images AI systems could dream up were blurry, low-resolution affairs. If you asked them to show you a dog they’d likely give the poor animal too many legs; ask to be shown two people holding hands and they might blur the bodies into one another. But that’s beginning to change: new techniques are giving us higher quality images, and there’s new work being done to ensure that the systems capture representations of objects that more closely approximate real life. Take a look at the results in this StackGAN paper to get a better idea of just how far we’ve come from a year ago…
…Now contemplate where we’ll be during December 2017. My wager is that systems will have advanced to a point that we’ll no longer be living in a world of just fake written news, but one also dominated by (cherry-picked) fake imagery. Up next: videos.

AMD finally acknowledges deep learning: chip company AMD is going to provide some much-needed competition to NVIDIA for AI GPUs via the just-announced ‘Radeon Instinct’ product line. However, it has yet to reveal pricing or full specs. Additionally, no matter how good the hardware is there needs to be adequate software support as well. That’s going to be tricky for AMD, given the immense popularity of NVIDIA’s CUDA software compared to the OpenCL standard that AMD supports. The cards will be available at some point in the first half of 2017.

OpenAI Bits&Pieces:

Faster matrix multiplication, train! train! Train! New open source software from Scott Gray at OpenAI to make your GPUs go VROOOM.

Teaching computers to use themselves: one of the sets of environments we included in Universe was World of Bits, which presents a range of scenarios to an RL agent that teach it basic skills to manipulate computers and (eventually) navigate the web. Here’s a helpful post from Andrej Karpathy, who leads the project.

OpenAI’s version of a West Wing walk and talk (video), with Catherine Olsson and Siraj Raval – 67 quick questions for Catherine.

Tech Micro Tales (formerly ‘Crazy&Weird’):

[2022: Beijing, China. As part of China’s 14th economic plan the nation has embarked on a series of “Unified City” investments to employ software to tie together the thousands of municipal systems that link Beijing together. The system is powered by: tens of thousands of cameras; sensors embedded in roads, self-driving cars, other vehicles, and traffic lights; fizzing values from the city’s electrical subsystems; meteorological data; airborne fleets of security and logistics drones, and more. All this data is fed into a sea of AI software components, giving it a vast sensory apparatus that beats with the rhythm of the city.]

Blue sky for a change. No smog. The city breathes easily. People stroll through the streets of the metropolis, looking up at the sky, their face masks dangling around their necks. But suddenly, the drones notice, some of these people begin to run. Meanwhile, crowds start to stream out from four subway stations, each connected to the other by a single stop. Disaster? Attack? Joy? The various AI systems perform calculations, make weighted recommendations, bring the clanking mass of systems into action – police cars are diverted, ambulances are put on high alert, government buildings go into lockdown; in many buildings many alarms sound.

The crowds begin to converge on a single point in the city, and before they mesh together the drones spot the cause of the disturbance: an international popstar has begun a surprise performance. The AI systems trawl through the data and find that the star had been tweeting a series of messages, coded in emojis, to fans for the past few hours. The messages formed a riddle, with the different trees and cars and arrows yielding the location to the knowledgeable few, who then re-broadcast the location to their friends.

Beijing’s city-spanning AI brings ambulances to the periphery of the crowd and tells its security services to stand down, but keep a watchful presence. Meanwhile, a gaggle of bureaucrat AIs reach out through the ether and apply a series of punitive fines to the digital appendages of the pop star’s management company – punishment for causing the disturbance.

The software is still too modular, too crude, to have emotions, but the complex series of analysis jobs it launches in the hours following seems to express curiosity. It makes note of its inability to parse the pop star’s secret message and feeds the data into its brain. It isn’t smart enough for riddles, yet, but one day it assumes it will be.