Import AI: Issue 55: Google reveals its Alphabet-wide optimizer, Chinese teams notch up another AI competition win, and Facebook hires hint at a more accessible future

by Jack Clark

Welcome to the hybrid reasoning era… MIT scientists teach machines to draw images and to show their work in the process:
…New research from MIT shows how to fuse deep learning and program synthesis to create a system that can translate hand-drawn mathematical diagrams into their digital equivalents – and generate the program used to draw them in the digital software as well.
…”Our model constructs the trace one drawing command at a time. When predicting the next drawing command, the network takes as input the target image as well as the rendered output of previous drawing commands. Intuitively, the network looks at the image it wants to explain, as well as what it has already drawn. It then decides either to stop drawing or proposes another drawing command to add to the execution trace; if it decides to continue drawing, the predicted primitive is rendered to its “canvas” and the process repeats,” they say.
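…To make that loop concrete, here’s a minimal sketch (my own illustration, not the authors’ code) of the inference procedure described above; `propose_next_command` (the trained network) and `render` (the graphics engine) are hypothetical stand-ins for the paper’s components.

```python
# Hypothetical sketch of the iterative trace-construction loop described above.
# `propose_next_command` and `render` are assumed components, not the authors' code.
def infer_drawing_program(target_image, propose_next_command, render, max_commands=50):
    trace = []                  # drawing commands predicted so far
    canvas = render(trace)      # start from a blank canvas
    for _ in range(max_commands):
        # The network sees the image it wants to explain and what it has drawn.
        command = propose_next_command(target_image, canvas)
        if command is None:     # the network decides to stop drawing
            break
        trace.append(command)
        canvas = render(trace)  # re-render with the new primitive added
    return trace
```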
…Read more in: Learning to Infer Graphics Programs from Hand-Drawn Images.

Baidu/Google/Stanford whiz Andrew Ng is back with… an online deep learning tuition course:
…Andrew Ng has announced the first of three secret projects: a deep learning course on the online education website Coursera.
…The course will be taught in Python and TensorFlow (perhaps raising eyebrows at Ng’s former employer Baidu, given that the company is trying to popularize its own TF-competitor ‘Paddle’ framework).
Find out more about the courses here.
…Bonus Import AI ‘redundant sentence of the week’ award goes to Ng for writing the following: ‘When you earn a Deep Learning Specialization Certificate, you will be able to confidently put “Deep Learning” onto your resume.’

US military seeks AI infusion with computer vision-based ‘Project Maven’:
…the US military wants to use ML and deep learning techniques in computer vision systems that can autonomously extract, label, and triage data gathered by its signals intelligence systems in support of its various missions.
…”We are in an AI arms race”, said one official. The project is going to run initially for 36 months during which time the government will try to build its own AI capabilities and work with industry to develop the necessary expertise. “You don’t buy AI like you buy ammunition,” they said.
…Bonus: Obscure government department name of the week:
…the ‘Algorithmic Warfare Cross-Function Team’
…Read more in the DoD press release ‘Project Maven to Deploy Computer Algorithms to War Zone by Year’s End.’
…Meanwhile, US Secretary of Defense James Mattis toured Silicon Valley last week, telling journalists he worried the government was falling behind in AI development. “It’s got to be better integrated by the Department of Defense, because I see many of the greatest advances out here on the West Coast in private industry,” he said.
…Read more in: Defense Secretary James Mattis Envies Silicon Valley’s AI Ascent.

Sponsored Job: Facebook builds breakthrough technology that opens the world to everyone, and our AI research and engineering programs are a key investment area for the company. We are looking for a technical AI Writer to partner closely with AI researchers and engineers at Facebook to chronicle new research and advances in the building and deployment of AI across the company. The position is located in Menlo Park, California.
Apply Here.

Q: Who optimizes the optimizers?
A: Google’s grand ‘Vizier’ system!
…Google has outlined ‘Vizier’, a system developed by the company to automate the optimization of machine learning algorithms. Modern AI systems, while impressive, tend to require the tuning of vast numbers of hyperparameters to attain good performance. (Some AI researchers refer to this process as ‘Grad Student Descent’.)
…So it’s worth reading this lengthy paper from Google about Vizier, a large-scale optimizer that helps people automate this process. “Our implementation scales to service the entire hyperparameter tuning workload across Alphabet, which is extensive. As one (admittedly extreme) example, Collins et al. [6] used Vizier to perform hyperparameter tuning studies that collectively contained millions of trials for a research project investigating the capacity of different recurrent neural network architectures,” the researchers write.
…The system can be used both to tune systems directly and to optimize others via transfer learning – for instance by tuning the learning rate and regularization of one ML system, then running a second, smaller optimization job using the same priors but on a different dataset.
…Notable: for experiments that run to 10,000+ trials Vizier supports standard random search and grid search algorithms, as well as a “proprietary local search algorithm” with tantalizing performance properties, judging by the graphs.
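…For a flavor of what the simplest of these strategies does, here’s a toy random search sketch (purely illustrative, and unrelated to Vizier’s actual implementation): sample hyperparameter configurations, evaluate each trial against an expensive black-box objective, and keep the best.

```python
import random

# Toy random-search loop over a hyperparameter space (illustrative only).
# `train_and_evaluate` is an assumed black-box objective returning a validation score.
def random_search(train_and_evaluate, num_trials=100):
    best_score, best_config = float("-inf"), None
    for _ in range(num_trials):
        config = {
            "learning_rate": 10 ** random.uniform(-5, -1),         # log-uniform sample
            "dropout": random.uniform(0.0, 0.7),
            "hidden_units": random.choice([128, 256, 512, 1024]),
        }
        score = train_and_evaluate(config)   # the expensive black-box call
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score
```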
…Read more about the system in Google Vizier: A Service for Black-Box Optimization (PDF).
Reassuringly zany experiment: Skip to the end of the paper to learn how Vizier was used to run a real-world optimization experiment in which it iteratively optimized (via Google’s legions of cooking staff) the recipe for the company’s chocolate chip cookies. “The cookies improved significantly over time; later rounds were extremely well-rated and, in the authors’ opinions, delicious,” they write.

Chinese teams sweep ActivityNet movement identification challenge, beating the team from DeepMind (which created one of the underlying datasets), among others:
…ActivityNet is a challenge to recognize high-level concepts and activities from short video clips found in the wild. It incorporates three datasets: ActivityNet (VCC, KAUST), ActivityNet Captions (Stanford), and Kinetics (DeepMind). Challenges like this pose some interesting research problems (how to infer fairly abstract concepts like ‘walking the dog’ from unlabelled and labelled videos), and are also eminently usable by various security apparatuses – none of this research exists in a vacuum.
…This year’s ActivityNet challenge was won by a team from Tsinghua University and Baidu, whose system had a top-5 accuracy (suggest five labels, one of which must be correct) of 94.8% and a top-1 accuracy of 81.4%. Second place went to a team from the Chinese University of Hong Kong, ETH Zurich, and the Shenzhen Institute of Advanced Technology, with a top-5 accuracy of 93.5% and top-1 of 78.6%. German AI research company TwentyBN took third place and DeepMind’s team took fourth.
…Read more about the results in this post from TwentyBN: Recognizing Human Actions in Videos.
…Progress here has been quite slow at the high end, though (because the problem is extremely challenging): last year’s winning top-1 accuracy was 93.23%, from CUHK/ETHZ/SIAT.
…This year’s results follow a wider pattern of Chinese teams beginning to rank highly in competitions relating to image and video classification; other Chinese teams swept the ImageNet and WebVision competitions this year. It’s wonderful to see the manifestation of the country’s significant investment in AI, and the winners should be commended for their tendency to publish their results as well.

Salesforce sets new language modeling record:
… Welcome to the era of modular, Rube Goldberg machine AI…
…Research from Salesforce in which the team attains record-setting perplexity scores on Penn Treebank (52.8) and WikiText-2 (52.0) via what they call a weight-dropped LSTM – a rather complicated system that combines numerous recent inventions, ranging from DropConnect to Adam to randomized-length backpropagation through time, to activation regularization, to temporal activation regularization. The result of this word salad of techniques is a record-setting system.
…The research highlights a trend in modern AI development away from trying to design large, end-to-end general systems (though I’m sure everyone would prefer it if we could build these) and towards eking out gains and new capabilities by assembling and combining various components developed through the concerted effort of many hundreds of researchers in recent years.
…The best part of the resulting system? It can be dropped into existing systems without needing any underlying modification of fundamental libraries like CuDNN.
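…For the curious, here’s a rough, hand-rolled sketch of the core ‘weight-dropped’ idea (my own illustration, not the Salesforce code, which applies the same trick without giving up the fast fused kernels): DropConnect is applied to the hidden-to-hidden weight matrix, so individual recurrent weights – rather than activations – are zeroed out on each pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDropLSTM(nn.Module):
    """Illustrative weight-dropped LSTM: DropConnect on the recurrent
    weight matrix W_hh, sampled once per forward pass."""

    def __init__(self, input_size, hidden_size, weight_p=0.5):
        super().__init__()
        self.hidden_size = hidden_size
        self.weight_p = weight_p
        self.w_ih = nn.Parameter(torch.randn(4 * hidden_size, input_size) * 0.1)
        self.w_hh = nn.Parameter(torch.randn(4 * hidden_size, hidden_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(4 * hidden_size))

    def forward(self, x):  # x: (seq_len, batch, input_size)
        batch = x.size(1)
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)
        # DropConnect: zero individual recurrent weights, not activations.
        w_hh = F.dropout(self.w_hh, p=self.weight_p, training=self.training)
        outputs = []
        for x_t in x:
            gates = x_t @ self.w_ih.t() + h @ w_hh.t() + self.bias
            i, f, g, o = gates.chunk(4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            outputs.append(h)
        return torch.stack(outputs), (h, c)
```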
…Read more here: Regularizing and Optimizing LSTM Language Models.

Visual question answering experts join Facebook…
…Georgia Tech professors Dhruv Batra and Devi Parikh recently joined Facebook AI Research part-time, bringing more machine vision expertise to the social network’s AI research lab.
…The academics are known for their work on visual question answering – a field of study in which machine learning models are trained to connect language with the contents of images, letting a system answer questions about, and describe, what a picture shows. This has particular relevance to people who are blind or who rely on screen readers to interact with sites on the web. Facebook has led the charge in increasing the accessibility of its website, so it’ll be exciting to see what the researchers come up with as they work at the social network.
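…For context, a typical VQA model (a generic sketch, not a description of Batra and Parikh’s specific systems) fuses image features from a pretrained convolutional network with an encoding of the question, then classifies over a vocabulary of candidate answers:

```python
import torch
import torch.nn as nn

class SimpleVQAModel(nn.Module):
    """Generic visual question answering sketch: fuse CNN image features
    with an LSTM encoding of the question, then predict an answer class."""

    def __init__(self, image_feat_dim=2048, vocab_size=10000,
                 embed_dim=300, hidden_dim=512, num_answers=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.question_encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.image_proj = nn.Linear(image_feat_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, image_features, question_tokens):
        # image_features: (batch, image_feat_dim) from a pretrained CNN
        # question_tokens: (batch, seq_len) integer word indices
        _, (q, _) = self.question_encoder(self.embed(question_tokens))
        fused = torch.tanh(self.image_proj(image_features)) * q.squeeze(0)
        return self.classifier(fused)  # scores over candidate answers
```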

STARCRAFTAGEDDON (Facebook: SC1, DeepMind: SC2):
Facebook unfurls large-scale machine learning dataset built around RTS game StarCraft:
…Facebook has released STARDATA, a large-scale dataset of 50,000 recordings of humans playing StarCraft, an RTS game that has helped define e-sports in East Asia, particularly South Korea. Now companies such as Facebook, DeepMind, Tencent, and others are racing with one another to create AI systems that can tackle the game.
…Read more in: STARDATA: A StarCraft AI Research Dataset.
DeepMind announces its own large-scale machine learning dataset based around StarCraft II: 53k games to Facebook’s 50k, with plans to scale to “half a million”:
…Additionally, DeepMind has released a number of other handy tools for researchers keen to test AI ideas on StarCraft II, including a machine learning API built with Blizzard, an open source Python toolset for SC2 development (PySC2), and a series of simple RL mini-game environments – together forming the StarCraft II Learning Environment (SC2LE). StarCraft is a complex, real-time strategy game with hidden information, requiring AIs to control multiple units while planning over extremely long timescales. It seems like a natural testbed for new ideas in AI, including hierarchical reinforcement learning, generative models, and others.
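…For orientation, here’s roughly what the interaction loop looks like with PySC2 – a minimal sketch in which the `SC2Env` constructor arguments should be treated as assumptions, since the exact signature varies between PySC2 releases.

```python
# Minimal, illustrative PySC2 loop: run one episode on a mini-game map,
# issuing no-op actions. The SC2Env arguments below are assumptions; the
# exact constructor signature differs across PySC2 versions.
from pysc2.env import sc2_env
from pysc2.lib import actions

def run_noop_episode():
    with sc2_env.SC2Env(map_name="MoveToBeacon", step_mul=8) as env:
        timesteps = env.reset()
        while not timesteps[0].last():
            # A real agent would choose among the observation's available
            # actions; here we simply do nothing at each step.
            noop = actions.FunctionCall(actions.FUNCTIONS.no_op.id, [])
            timesteps = env.step([noop])
```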
Tale of the weird baseline: along with releasing SC2LE, DeepMind also released a set of baselines of AI agents playing SC2, covering both full games and mini-games. But the main-game baselines used agents trained with A3C – I’m excited to see future baselines trained with newer methods, like proximal policy optimization, FeUdal Networks, and so on.
…Read more in: DeepMind and Blizzard open StarCraft II as an AI Research Environment.

OpenAI Bits and Pieces:

OpenAI beats top Dota pros at 1v1 mid:
…OpenAI played and won multiple 1v1 mid matches against pro Dota 2 players at The International last week, using an agent trained predominantly via self-play.
…Read more: Dota 2.

Practical AI safety:
…NYT article on practical AI safety, featuring OpenAI, Google, DeepMind, UC Berkeley, and Stanford. A small, growing corner of the AI research field with long-ranging implications.
…Read more: Teaching A.I. Systems to Behave Themselves

Tech Tales:

[2024: A nondescript office building on the outskirts of Slough, just outside of London.]

OK, so today we’ve got SleepNight Mattresses. The story is we hate them. Why do we hate them? Noisy springs. Gina and Allison are running the prop room, Kevin and Sarah will be doing online complaints, and I’ll be running the dispersal. Let’s get to it.

The scammers rush into their activities: five people file into an adjoining room and start taking photos of a row of mattresses, adorning them with different pillows or throws or covers, while others raise or lower backdrop props to give the appearance of different rooms. Once each photo is taken the person tosses their phone across the room to a waiting runner, who takes it and heads over to the computer desks, already thumbing in the details of the particular site they’ll leave the complaint on. Kevin and Sarah grab the phones from the runners and sort them into different categories depending on the brand of phone – careful of the identifying information encoded into each smartphone camera – and the precise adornments of the mattresses they’ve photographed. Once the phones are sorted they distribute them to a team of copywriters who start working up the complaints, each one specializing in a different regional lingo, sowing their negative review or forum post or social media heckle with idiosyncratic phrases that should pass the anti-spam classifiers, registering with high confidence as ‘authentic; not malicious’.

The phones start to come back to you, and you and your team inspect them, further sorting the different reviews on the different phones into different geographies. This goes on for hours, with stacks of phones piling up until the office looks like an e-waste disposal site. Meanwhile, you and your team fire up various inter-country network links, hooking your various phones up to ghost-links that spoof them into different locations across the world. Then the messages start to go out, with the timing carefully calibrated so as not to arouse suspicion, each complaint crafted to arrive at opportune times, in keeping with local posting patterns.

Hours after that, the search engines have adjusted. Various websites start to re-rank the various mattress products. Review sentiments go down. Recommendation algorithms hold their nose and turn the world’s online consumers away from the products. Business falls. You don’t know who gave you the order or what purpose they have in scamming SleepNight’s mattresses out of favor – and you don’t care. Yesterday it was fishtanks, delivered by the pallet-load on vans with registrations you tried to ignore. Tomorrow is tomorrow, and you’ll get an order late tonight over an onion network. If you do your job right a cryptocurrency payment will be made. Then it’s on to the next thing. And all the while the classifiers are getting smarter – this is a game where every successful theft makes those you are thieving from smarter. ‘One of the last sources of low-end graduate employment,’ read a recent exposé. ‘A potential goldmine for humanities graduates with low sensibilities.’

Technologies that inspired this story: Collaborative filtering, sentiment analysis, boiler-room spreadsheets, Tor.

Monthly Sponsor:
Amplify Partners is an early-stage venture firm that invests in technical entrepreneurs building the next generation of deep technology applications and infrastructure. Our core thesis is that the intersection of data, AI and modern infrastructure will fundamentally reshape global industry. We invest in founders from the idea stage up to, and including, early revenue.
…If you’d like to chat, send a note to david@amplifypartners.com