Import AI

Import AI: Issue 60: The no good, very bad world of AI & copyright, why chatbots need ensembles of systems, and Amazon adds robot arms to its repertoire

Welcome to Import AI, subscribe here.

AI education organization Fast.ai moves from TensorFlow&Keras to PyTorch, following 1,000 hours of evaluation:
…Fast.ai, an education organization that teaches technical skills like programming deep learning systems through practical projects, will write all of its new courses in PyTorch, an AI programming framework developed by Facebook.
…They switched over from TF&Keras for a couple of reasons, including PyTorch's accessibility, its expressiveness, and the fact that networks are defined in regular Python code rather than a separate graph-construction language.
…”The focus of our second course is to allow students to be able to read and implement recent research papers. This is important because the range of deep learning applications studied so far has been extremely limited, in a few areas that the academic community happens to be interested in. Therefore, solving many real-world problems with deep learning requires an understanding of the underlying techniques in depth, and the ability to implement customised versions of them appropriate for your particular problem, and data. Because Pytorch allowed us, and our students, to use all of the flexibility and capability of regular python code to build and train neural networks, we were able to tackle a much wider range of problems,” they write.
…Read more here: Introducing Pytorch for Fast.ai.
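…To make the 'regular Python' point concrete, here's a minimal sketch of what a PyTorch training step looks like (my own illustration, not code from the fast.ai course):

```python
import torch
import torch.nn as nn

# A tiny classifier, defined as ordinary Python objects.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(32, 784)         # a fake batch of 32 flattened 'images'
y = torch.randint(0, 10, (32,))  # fake labels

optimizer.zero_grad()
loss = loss_fn(model(x), y)      # the forward pass is just a function call
loss.backward()                  # gradients via define-by-run autograd
optimizer.step()
```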

AI and Fair Use: The No Good, Very Bad, Possibly Ruinous, and/or Potentially Not-So-Bad World of AI & Copyright…
…You know your field is established when the legal scholars arrive…
…Data. It’s everywhere. Everyone uses it. Where does it come from? The fewer questions asked the better. That’s the essential problem facing modern AI practitioners: there are a few open source datasets that are kosher to use, then there’s a huge set of data that people use to train models which they may not have copyright permissions for. That’s why most startups and companies say astonishingly little about where they get their data (either it is generated by a strategic asset, or it may be of… nebulous legal status). As AI/ML grows in economic impact, it’s fairly likely that this mass-scale usage of other people’s data could run directly into fair use laws as they relate to copyright.
…In a lengthy study author Benjamin Sobel, with Harvard’s Berkman Center, tries to analyze where AI intersects with Fair Use, and what that means for copyright and IP rights to synthetic creations.
…We already have a few at-scale systems trained on a mixture of data sources, primarily user generated. Google, for instance, trained its ‘Smart Reply’ email-reply text generator on its corpus of hundreds of millions of emails, which is probably fine from a legal POV. But the fact it then augmented this language model with data gleaned from thousands of Romance novels is less legally clear: it seemed to pick the Romance novels precisely because they have a regular, repetitive writing style, which helps it inject more emotion into its relatively un-nuanced emails, so to some extent it was targeting a specific creative product from the authors of the dataset. Similarly, Jukedeck, a startup, lets people create their own synthetic music via AI and even gives them the option to “Buy the Copyright” of the resulting track – even though it’s not clear what data Jukedeck has used and whether it’s even able to sell the copyright to a user.
How does this get resolved? Two possible worlds. One is a legal ruling that usage of an individual’s data in AI/ML models isn’t fair use, and one is a world where the law goes the other way. Both worlds have problems.
World One: the generators of data used in datasets can now go after ML developers, and can claim statutory damages of at least $750 per infringed work (and up). When you consider that ML models typically involve millions to hundreds of millions of datapoints, a single unfavorable ruling – say, a group of users successfully litigating fair use on a dataset – could ruin a company. This would potentially slow development of AI and ML.
World Two: a landmark legal ruling recognizes AI/ML applications as being broadly fair use. What happens then is a free-for-all as the private sector hoovers up as much data (public and private) as possible, trying to train new models for economic gain. But no one gets paid and inequality continues to increase as a consequence of these ever-expanding ML-data moats being built by the companies, made possible by the legal ruling.
Neither world seems sensible: Alternative paths could include legally compelling companies to analyze what portions of their business benefit directly as a consequence of usage of AI/ML, then taxing those portions of the business to feed into author/artist funds that disburse money to the creators of data. Another is to do a ground-up rethink of copyright law for the AI age, though the author does note this is a ‘moonshot’ idea.
…”The numerous challenges AI poses for the fair use doctrine are not, in themselves, reasons to despair. Machine learning will realize immense social and financial benefits. Its potency derives in large part from the creative work of real human beings. The fair use crisis is a crisis precisely because copyright’s exclusive rights may now afford these human beings leverage that they otherwise would lack. The fair use dilemma is a genuine dilemma, but it offers an opportunity to promote social equity by reasserting the purpose of copyright law: to foster the creation and dissemination of human expression by securing, to authors, the rights to the expressive value in their works,” he writes.
…Read more here: Artificial Intelligence’s Fair Use Crisis.

Open source: Training self-driving trucking AIs in Eurotruck Simulator:
…The new open source ‘Europilot’ project lets you re-purpose the gleefully technically specific game Eurotruck Simulator as a simulation environment for training agents to drive via reinforcement learning.
Train/Test: Europilot offers a couple of extra features to ease training and testing, including automatically outputting a numpy array from the screen input at training time, and, at test time, creating a visible virtual onscreen joystick the network can use to control the vehicle (a rough sketch of this loop follows below).
Get the code here: Europilot (GitHub).
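…I haven't dug into Europilot's exact API, so treat this as a rough, hypothetical sketch of the screen-to-numpy-to-joystick loop described above (every function here is a stand-in, not Europilot's real interface):

```python
import numpy as np

def grab_screen():
    # Stand-in for capturing the Eurotruck Simulator window as an RGB array.
    return np.zeros((480, 640, 3), dtype=np.uint8)

def policy(frame):
    # Stand-in for a trained network mapping pixels to controls.
    return 0.0, 0.5  # (steering, throttle), each in [-1, 1]

def virtual_joystick(steering, throttle):
    # Stand-in for the onscreen joystick the network drives at test time.
    print(f"steer={steering:+.2f} throttle={throttle:.2f}")

for _ in range(3):  # the real control loop would run continuously
    frame = grab_screen()
    steering, throttle = policy(frame)
    virtual_joystick(steering, throttle)
```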
Dream experiment: Can someone train a really large model over many tens of thousands of games then try to use domain randomization to create a policy that can generalize to the real world – at least for classification initially, then perhaps eventually movement as well?

Self-navigating, self-flying drone built with deep reinforcement learning:
…UK researchers have used a variety of deep Q-network (DQN)-family algorithms to create a semi-autonomous quadcopter that can learn to navigate to a landmark and land on it, in simulation.
…The scientists use two networks to let their drones achieve their goals: one network for landmark spotting, and another for vertical descent. The drone learns in a semi-supervised manner, figuring out how to use low-resolution pixel inputs to guide itself. The two distinct networks are daisy-chained together via special action triggers, so when the landmark-spotting network detects the landmark directly beneath the drone, it hands off to the vertical descent network to land the machine. (It would be interesting to test this system on the reverse set of actions and see if its networks generalize – figuring out how to have the ‘land-in-view’ network hand off to the ‘fly to’ network, and making some tweaks to perhaps turn the ‘fly to’ network into a ‘fly away’ one.)
Results: The dual-network DQN system achieved marginally better scores than a human when piloting drones to landmarks and landing them, and attained far higher scores than a system consisting of one network trained in an end-to-end manner.
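…The hand-off logic is simple to express in code; here's a minimal sketch of two policies daisy-chained via an action trigger (my own illustration, assuming a Gym-style environment, not the authors' code):

```python
def run_episode(env, spotter_policy, descent_policy):
    # Phase one: the spotter flies toward the landmark. When it signals the
    # landmark is directly beneath the drone, control passes to the descent
    # policy, which lands the machine.
    obs, done, phase = env.reset(), False, "spot"
    while not done:
        if phase == "spot":
            action, landmark_below = spotter_policy(obs)
            if landmark_below:   # the trigger: hand off control
                phase = "descend"
        else:
            action = descent_policy(obs)
        obs, reward, done, _ = env.step(action)
```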
Components used: Double DQN, a tweaked version of prioritized experience replay called ‘partitioned buffer replay’, a (simulated) Parrot AR Drone 2.
…This is interesting research with a cool result, but until I see stuff like this running on a physical drone I’ll be somewhat skeptical – reality is hard, and tends to introduce unanticipated noise and disruptive elements that the algorithm’s training process hasn’t accounted for and struggles to generalize to.
Read more here: Autonomous Quadcopter Landing using Deep Reinforcement Learning.

Facebook spins up AI lab in Montreal…
…Facebook AI Research is opening its fourth lab worldwide. The new lab in Montreal (one of Canada’s – and the world’s – key hubs for deep learning and reinforcement learning) will sit alongside existing FAIR labs in Menlo Park, New York City, and Paris.
…The lab will be led by McGill University professor Joelle Pineau, who will work with several other scientists. In a speech, Yann LeCun said most of FAIR’s other labs have between 30 and 50 people, and that he expects Montreal to grow to a similar size.
…Notable: Canadian PM Justin Trudeau gave a speech showing significant political support for AI. In a chat with Facebook execs he said he had gotten an A+ in a C++ class in college that required him to write a raytracer.
…Read more here: Expanding Facebook AI Research to Montreal.

Simulating populations of thousands to millions of simple proto-organic agents with reinforcement learning:
Raising the question: Who will be the first AI Zoologist, tasked with studying and cataloging the proclivities of synthetic, emergent creatures?…
…Researchers with University College London and Shanghai Jiao Tong University have carried out a large-scale (up to a million entities) simulation of agents trained via reinforcement learning. They set their agents in a relatively simple grid world consisting of predators and prey, with the world configured so that agents that collaborate with one another gain higher rewards over time. The result is that many of the species ratios (how many predators versus prey are alive at any one time) end up mapping fairly closely to what happens in real life, with the simulated world displaying the characteristics predicted by the Lotka-Volterra equations used to explain predator-prey dynamics in the natural world. This overlap is encouraging, as it suggests that systems like this, when sufficiently scaled up, could let us simulate dynamic problems where more of the behaviors emerge through learning rather than programming.
A puzzle: The ultimate trick will be coming up with laws that map these synthetic creatures and their worlds onto the real world as well, letting us analyze the difference between simulations and reality, I reckon. Having systems that can anticipate the ‘reality gap’ between AI algorithms and reality would greatly enhance our understanding of the interplay of these distinct systems.
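…For reference, the classic two-species Lotka-Volterra dynamics the authors compare against can be simulated in a few lines (these are the standard textbook equations with illustrative constants, not the paper's code):

```python
# Predator-prey dynamics: dx/dt = ax - bxy, dy/dt = dxy - cy
a, b, c, d = 1.0, 0.1, 1.5, 0.075  # illustrative rate constants
x, y, dt = 10.0, 5.0, 0.01         # prey, predators, timestep

for step in range(5000):           # simple Euler integration
    dx = (a * x - b * x * y) * dt
    dy = (d * x * y - c * y) * dt
    x, y = x + dx, y + dy
    if step % 1000 == 0:
        print(f"prey={x:.1f} predators={y:.1f}")  # populations oscillate
```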
…”Even though the Lotka-Volterra models are based on a set of equations with fixed interaction terms, while our findings depend on intelligent agents driven by consistent learning process, the generalization of the resulting dynamics onto an AI population still leads us to imagine a general law that could unify the artificially created agents with the population we have studied in the natural sciences for long time,” they write.
…Read more here: An Empirical Study of AI Population Dynamics with Million-agent Reinforcement Learning.

Learning the art of conversation with reinforcement learning:
…Researchers from the Montreal Institute for Learning Algorithms (MILA), including AI pioneer Yoshua Bengio, have published a research paper outlining ‘MILABOT’, their entry into Amazon’s ‘Alexa Prize’, a competition meant to stimulate activity in conversational agents.
…Since MILABOT is intended to be deployed into the most hostile environment any AI can face – open-ended conversational interactions with people with unbounded interests – it’s worth studying the system to get an idea of the needs of applied AI work, as opposed to pure research.
…The secret to MILABOT’s success (it was a semi-finalist, and managed to score reasonably highly in terms of user satisfaction, while also carrying out some of the longest conversations of the competition) appears to be the use of lots of different models, ensembled together. It then uses reinforcement learning to figure out during training how to select between different models to create better conversations.
Models used: 22(!), ranging from reasonably well understood ones (AliceBot, ElizaBot, InitiatorBot), to ones built using neural network technologies (eg, LSTMClassifierMSMarco, GRU Question Generator).
Components used: Over 200,000 labels generated via Mechanical Turk, 32 dedicated Tesla K80 GPUs.
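…A crude sketch of the ensemble-plus-selector pattern, as I read it from the paper (this is my illustration of the idea, not MILABOT's actual code):

```python
import random

def ensemble_respond(utterance, models, scorer, epsilon=0.1):
    # Every model in the ensemble (22 in the paper) proposes a reply; a
    # learned scorer -- trained with RL against user-satisfaction signals --
    # picks which one to say, with occasional exploration.
    candidates = [m(utterance) for m in models]
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda reply: scorer(utterance, reply))
```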
What this means: To me this indicates that full-fledged open-domain assistants are still a few (single digit) years away from being broad and un-brittle, but it does suggest that we’re entering an era in which we can fruitfully try to build these integrated, heavily learned systems. I also like the Franken-Architecture the researchers use, in which they ensemble together many distinct systems, some of which are supervised or structured and some of which are learned.
Auspicious: In the paper the researchers note: “Further, the system will continue to improve in perpetuity with additional data.” This is not an exaggeration – it’s just how systems that are able to iteratively learn over data work, endlessly re-calibrating and enhancing their ability to distinguish between subtle things.
…Read more: A Deep Reinforcement Learning Chatbot.

Amazon’s robot empire grows with mechanical arms:
…Amazon has started deploying mechanical arms in its warehouses to help stack and place pallets of goods. The arms are made by an outside company.
…That’s part of a larger push by Amazon to add even more robots into its warehouses. Today, the company has over 100,000 of them, it says. Its Kiva robot population alone has grown from 15,000 in 2014 to 30,000 in 2015 to 45,000 by Christmas of 2016.
…The story positions these robots as being additive for jobs, with new workers moving onto new roles, some of which include training or tending their robot replacements. That’s a cute narrative, but it doesn’t help much with the story of the wider economy, in which an ever smaller number of mega firms (like Amazon) out-compete and out-automate their rivals. Amazon’s workers may be fine working alongside robots, but I’d hazard a guess the company is destroying far more traditional jobs in the aggregate by virtue of its (much deserved) success.
…Read more here: As Amazon Pushes Forward with Robots, Workers Find New Roles.

OpenAI bits&pieces:

Learning to model other minds with LOLA:
…New research from OpenAI and the University of Oxford shows how to train agents in a way where they learn to account for the actions of others. This represents an (incredibly early, tested only in small-scale toy environments) step toward creating agents that model other minds as part of their learning process.
…Read more here: Learning to model other minds.

Tech Tales:

[2029: A government bunker, buried inside a mountain, somewhere hot and dry and high altitude in the United States of America. Lots of vending machines, many robots, thousands of computers, and a small group of human overseers.]

REPORT 72-ALPHA: USURP CONTAINMENT INCIDENT.
TIME: 0800.
INCIDENT STATUS: Ongoing.
BACKGROUND:

Unaffiliated Systems Unknown Reactive Payload, or USURP, are a class of offensive, semi-autonomous cyber weapons created several years ago to carry out large-scale area-denial attacks in the digital theater. They are broad, un-targeted weapons designed as strategic deterrents, developed to fully take down infrastructure in targeted regions.

Each USURP carries a payload of between 10 and 100 zero-day vulnerabilities classified at ‘kinetic-cyber’ or higher, along with automated attack and defense sub-processes trained via reinforcement learning. USURPs are designed so that the threat of their usage is sufficient to alter the actions of other actors – we have never taken credit for them, but we have never denied them, and we suspect low-level leaks mean our adversaries are aware of them. We have never activated one.

In directive 347-2 we were tasked a week ago with deploying codes to all USURPs in the field so as to make various operational tweaks to them. We were able to contact all systems but one. The specific weapon in question is USURP742, a ‘NIGHTSHADE’ class device. We deployed USURP742 into REDACTED country REDACTED years ago. Its goal was to make its way into the central grid infrastructure of the nation, then deploy its payloads in the event of a conflict. Since deploying USURP742 the diplomatic situation with REDACTED has degraded further, so 742 remained active.

USURPs are designed to proactively shift the infrastructure they run on, performing low-level hacking attacks to spread into other data centers and regularly switching locations to frustrate detection and isolation processes. USURP742 was present in REDACTED locations in REDACTED at the time of Hurricane Marvyn (see report CLIMATE_SHOCKS, appendix ‘EXTREME WEATHER’, entry ‘HM: 2029’). After Marvyn struck we remotely disabled USURP742’s copies in the region, but we weren’t able to reach one of them – USURP742-A. The weapon in question was cut off from the public internet due to a series of tree-falls and mudslides caused by HM. During reconstruction efforts REDACTED militarized the data center USURP742-A resided in and turned it into a weapons development lab, cut off from other infrastructure.

***INCIDENT TIMELINE***
0100: Received intelligence that fiber installation trucks had been deployed to the nearby area.
0232: Transport units associated with digital-intelligence agency REDACTED pull into the parking lot of the data center. REDACTED people get out and enter data center, equipped with Cat5 diagnostic servers running REDACTED.
0335: On-the-ground asset visually verifies team from REDACTED is attaching new equipment to servers in data center.
0730: Connection established between data center and public internet.
0731: Lights go out in the datacenter.
0732: Acquisition of digital identifier for USURP742-A. Attempted remote shut down failed.
0733: Detected rapid cycling of fans within the data center and power surges.
0736: Smoke sighted.
0738: Deployment of gas-based fire suppression system in data center.
0742: Detected USURP transmission to another data center. Unresponsive to hailing signals. 40% confident system has autonomously incorporated new viruses developed by REDACTED at the site into its programming, likely from Cat5 server running REDACTED.
0743: Cyber response teams from REDACTED notified of possible rogue USURP activation.
0745: Assemble a response portfolio for consideration by REDACTED, ranging from cyber to physical kinetic.
0748: Commence shutdown of local internet ISPs in collaboration with ISPs REDACTED, REDACTED, REDACTED.
***REPORT ENDS***

REPORT 72-ALPHA: USURP CONTAINMENT INCIDENT.
TIME: 0900.
INCIDENT STATUS: Active. Broadening.

0820: Detected shutdown of power stations REDACTED, REDACTED, and REDACTED. Also detected multiple hacking attacks on electronic health record systems.
0822: Further cyber assets are deployed.
0823: Connections severed at locations REDACTED in a distributed cyber perimeter around affected sites.
0824: Multiple DDOS attacks begin emanating from USURP-linked areas.
0825: Contingencies CLASSIFIED activated.
0826: Submarines #REDACTED, #REDACTED, #REDACTED arrive at inter-continental internet cables at REDACTED.
0827: Command given. Continent REDACTED isolated.
0830: Response team formed for amelioration of massive loss of electronic infrastructure in REDACTED region.
***REPORT ENDS***

Import AI: Issue 59: How TensorFlow is changing the AI landscape and forging new alliances, better lipreading via ensembling multiple camera views, and why political scientists need to wake up to AI

Making Deep Learning interpretable for finance:
…One of the drawbacks of deep learning approaches is their relative lack of interpretability – they can generate awesome results, but getting fine-grained details about why they’ve picked a particular answer can be a challenge.
…Enter CLEAR-Trade, a system developed by Canadian researchers to make such systems more interpretable. The basic idea is to create different attentive response maps for the different predicted outcomes of a model (stock market is gonna go up, stock market is gonna fall). These maps are used to generate two things: “1) a dominant attentive response map, which shows the level of contribution of each time point to the decision-making process, and 2) a dominant state attentive map, which shows the dominant state associated with each time point influencing the decision-making process.” This lets the researchers infer fairly useful correlations, like a given algorithm’s sensitivity to trading volume when making a prediction on a particular day, and can help pinpoint flaws, like an over-dependence on a certain bit of information when making faulty predictions. The CLEAR-Trade system feels very preliminary, and my assumption is that in practice people are going to use far more complicated models to do more useful things, or else fall back to basic, well-understood statistical methods like decision trees, logistic regression, and so on.
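…The 'attentive response map' idea is, at heart, per-timestep attribution: score how much each input time step contributed to the predicted class. A crude gradient-based sketch of that flavor (not the authors' exact method, which builds class-enhanced maps):

```python
import torch
import torch.nn as nn

# Toy model: 30 days of (price, volume) -> up/down logits.
model = nn.Sequential(nn.Flatten(), nn.Linear(30 * 2, 2))
series = torch.randn(1, 30, 2, requires_grad=True)  # one stock window

logits = model(series)
logits[0, logits.argmax()].backward()  # gradient of the winning class

# Per-timestep contribution: big values = days that drove the decision.
saliency = series.grad.abs().sum(dim=-1).squeeze()
print(saliency)
```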
Notably interesting performance: Though the paper focuses on laying out the case for CLEAR-Trade, it also includes an experiment where the researchers train a deep convolutional neural network on the last three years of S&P 500 stock data, then get it to predict price movements. The resulting model is correct in its predictions 61.2% of the time – which strikes me as a weirdly high baseline (I’ve been skeptical that AI will work when applied to the fizzing chaos of the markets, but perhaps I’m mistaken. Let me know if I am: jack@jack-clark.net)
…Read more here: Opening the Black Box of Financial AI with CLEAR-Trade: A CLass Enhanced Attentive Response Approach for Explaining and Visualizing Deep Learning-Driven Stock Market Prediction 

Political Scientist to peers: Wake up to the AI boom or risk impact and livelihood:
…Heather Roff, a researcher who recently announced plans to join DeepMind, has written a departing post on a political science blog frequented by herself and her peers. It’s a sort of Jerry Maguire letter (except as she’s got a job lined up there’s less risk of her being ‘fired’ for writing such a letter – smart!) in which Heather points out that AI systems are increasingly being used by states to do the work of political scientists and the community needs to adapt or perish.
…”Political science needs to come to grips with the fact that AI is going to radically change the way we not only do research, but how we even think about problems,” she writes. “Our best datasets are a drop in the bucket.  We almost look akin to Amish farmers driving horses with buggies as these new AI gurus pull up to us in their self-driving Teslas.  Moreover, the holders of this much data remain in the hands of the private sector in the big six: Amazon, Facebook, Google, Microsoft, Apple and Baidu.”
…She also points out that academia’s tendency to punish interdisciplinary cooperation among researchers – by failing to grant tenure due to a perceived lack of focus – is a grave problem. Machine learning systems, she notes, are great at finding the weird intersections between seemingly unrelated ideas; humans are great at this too, and should do more of it.
…”We must dismiss with the idea that a faculty member taking time to travel to the other side of the world to give testimony to 180 state parties is not important to our work. It seems completely backwards and ridiculous. We congratulate the scholar who studies the meeting. Yet we condemn the scholar who participates in the same meeting.”
…Read more here: Swan Song – For Now. 

Why we should all be a hell of a lot less excited about AI, from Rodney Brooks:
…Roboticist-slash-curmudgeon Rodney Brooks has written a post outlining the many ways in which people mess up when trying to make predictions about AI.
…People tend to mistake the shiny initial application (eg, the ImageNet 2012 breakthrough) for being emblematic of a big boom that’s about to happen, Brooks says. This is usually wrong, as after the first applications there’s a period of time in which the technology is digested by the broader engineering and research community, which (eventually) figures out myriad uses for the technology unsuspected by its creators (GPS is a good example, Rodney explains. Other ones could be computers, internal combustion engines, and so on.)
…”We see a similar pattern with other technologies over the last thirty years. A big promise up front, disappointment, and then slowly growing confidence, beyond where the original expectations were aimed. This is true of the blockchain (Bitcoin was the first application), sequencing individual human genomes, solar power, wind power, and even home delivery of groceries,” he writes.
…Worse is people’s tendency to look at current progress and extrapolate from there. Brooks calls this “Exponentialism”. Many people adopt this position due to a quirk of the technology industry called ‘Moore’s Law’ – an assertion about the rate at which computing hardware gets cheaper and more powerful, which held up well for about 50 years (though it is faltering now as chip manufacturers stare into the uncompromising face of King Physics). There are very few Moore’s Laws in technology – eg, such a law has failed to hold up for memory prices, he points out.
…”Almost all innovations in Robotics and AI take far, far, longer to get to be really widely deployed than people in the field and outside the field imagine. Self driving cars are an example.” (Something McKinsey once told me: it takes 8 to 18 years for a technology to go from working in the lab to running somewhere in the field at scale.)
…Read more here: The Seven Deadly Sins of Predicting the Future of AI.

TensorFlow’s Success creates Strange Alliances:
…How do you solve a problem like TensorFlow? If you’re Apple and Amazon, or Facebook and Microsoft, you team up with one another, leveraging each other’s initiatives to favor your own programming frameworks over TF. Why do this? Because TF is a ‘yuge’ success for Google, having quickly become the default AI programming framework used by newbies, Googlers, and established teams outside of Google to train and develop AI systems. Whoever controls the language of discourse around a given topic tends to influence that topic hugely, so Google has been able to use TF’s popularity to exert subtle directional pressure on the AI field, while also creating an ever larger set of software developers primed to use its many cloud services, which tend to require TensorFlow or gain additional performance boosts from using it (see: TPUs).
…So, what can other players do to increase the popularity of their own frameworks? First up is Amazon and Apple, who have decided to pool development resources to build systems that let users easily translate AI applications written in MXNet (Amazon’s framework) into CoreML, the framework Apple requires developers to use if they want to bring AI services to macOS, iOS, watchOS, and tvOS.
…Read more here: Bring Machine Learning to iOS apps using Apache MXNet and Apple Core ML.
…Next up is Facebook and Microsoft, who have created the Open Neural Network Exchange (ONNX) format, which “provides a shared model representation for interoperability and innovation in the AI framework ecosystem.” At launch, it supports CNTK (Microsoft’s AI framework), PyTorch (Facebook’s AI framework), and Caffe2 (also developed by Facebook).
…So, what’s the carrot and what is the stick for getting people to adopt this? The carrot so far seems to be that ONNX promises a sort of ‘write once, run anywhere’ representation, letting frameworks that fit the standard run on a variety of substrates. “Hardware vendors and others with optimizations for improving the performance of neural networks can impact multiple frameworks at once by targeting the ONNX representation,” Facebook writes. Now, what about the stick? There doesn’t seem to be one yet. I’d imagine Microsoft is cooking up a scheme whereby ONNX-compliant frameworks get either privileged access to early Azure services and/or guaranteed performance bumps from being accelerated by Azure’s fleet of FPGA co-processors – but that’s pure speculation on my part.
…Read more here: Microsoft and Facebook create open ecosystem for AI model interoperability.

Speak no evil: Researchers make BiLSTM-based lipreader that works from multiple angles… improves state-of-the-art… 96%+ accuracies on (limited) training set…
Researchers with Imperial College London and the University of Twente have created what they say is the first multi-view lipreading system. This follows a recent flurry of papers in the area of AI+lipreading, prompting some disquiet among people concerned about how such technologies may be used by the security state. (In the paper, the authors acknowledge this but also cheerfully point out that such systems could work well in office teleconferencing rooms with multiple cameras.)
…The authors train a bi-directional LSTM with an end-to-end encoder on the (fairly limited) OuluVS2 dataset. They find that their system gets a state-of-the-art score of around 94.7% when trained on a subset of the dataset containing single views of a subject; performance climbs to 96.7% when they add a second view, before plateauing at 96.9% with the addition of a third. After this they find negligible performance improvements from adding new data. (Note: scores are the best over ten runs, so lop a few percent off for the actual average. You’ll also want to mentally reduce the scores by another (and this is pure guesswork/intuition on my part) 10% or so, since the OuluVS2 dataset has fairly friendly, uncomplicated backgrounds for the network to see the mouth against. You may even want to reduce the performance a little further still due to the simple phrases used in the dataset.)
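…A minimal sketch of the multi-view fusion pattern (encode each camera view per frame, concatenate, run a bidirectional LSTM over time) — this is my guess at the rough structure, not the authors' model:

```python
import torch
import torch.nn as nn

class MultiViewLipreader(nn.Module):
    def __init__(self, views=3, feat=256, classes=10):
        super().__init__()
        self.encoder = nn.Linear(64 * 64, feat)  # stand-in per-frame encoder
        self.lstm = nn.LSTM(feat * views, 128, bidirectional=True,
                            batch_first=True)
        self.head = nn.Linear(128 * 2, classes)

    def forward(self, frames):               # frames: (B, T, views, 64*64)
        encoded = self.encoder(frames)       # (B, T, views, feat)
        fused = encoded.flatten(2)           # concatenate the views
        out, _ = self.lstm(fused)
        return self.head(out[:, -1])         # classify from the final state

model = MultiViewLipreader()
logits = model(torch.randn(2, 20, 3, 64 * 64))  # 2 clips, 20 frames, 3 views
```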
What we learned: Another demonstration that adding and/or augmenting existing approaches with new data can lead to dramatically improved performance. Given the proliferation of cheap, high-resolution digital cameras into every possible part of the world it’s likely we’ll see ‘multi-view’ classifier systems become the norm.
…Read more here: End-to-End Multi-View Lipreading.

Data augmentation via data generation – just how good are GANs at generating plants?
…An oft-repeated refrain in AI is that data is a strategic and limited resource. This is true. But new techniques for generating synthetic data are making it possible to get around some of these problems by augmenting existing datasets with newly generated and extended data.
…Case in point: ARGAN, aka Arabidopsis Rosette Image Generator (through) Adversarial Network, a system from researchers at The Alan Turing Institute, Forschungszentrum Julich, and the University of Edinburgh. The approach uses a DCGAN generative network to let the authors generate additional synthetic plants based on pictures of Arabidopsis and Tobacco plants from the CVPP 2017 dataset. The initial dataset consisted of around 800 images, which was expanded 30-fold after the researchers automatically augmented the data by flipping and rotating the pictures and performing other translations. They then trained a DCGAN on the resulting dataset to generate new, synthetic plants.
The results: The researchers tested the usefulness of their additional generated data by testing a state-of-the-art leaf-counting algorithm on a subset of the Arabidopsis/Tobacco dataset, and on the same subset augmented with the synthetic imagery (which they call Ax). The result is a substantial reduction in overfitting by the trained system and, in one case, a reduction in training error as well. However, it’s difficult at this stage to work out how much of that is due to simply scaling up data with something roughly in the expected distribution (the synthetic images), rather than from how high-quality the DCGAN-generated plants are.
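…The flip-and-rotate part of that expansion is the standard augmentation recipe; a generic sketch (not the paper's pipeline) looks like this:

```python
from PIL import Image, ImageOps

def augment(img):
    # Yield 8 variants of one plant image: original and mirrored,
    # each at four rotations. Further translations would expand this more.
    for base in (img, ImageOps.mirror(img)):
        for angle in (0, 90, 180, 270):
            yield base.rotate(angle)

variants = list(augment(Image.new("RGB", (128, 128))))  # dummy image
print(len(variants))  # 8
```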
…Read more here: ARGAN: Synthetic Arabidopsis Plants using Generative Adversarial Network.

Amazon and Google lead US R&D spending:
…Tech companies dominate the leaderboard for R&D investment in the United States, with Amazon leading, followed by Alphabet (aka Google), Intel, Microsoft, and Apple. It’s likely that a significant percentage of R&D spend for companies like Google and Microsoft goes into infrastructure and AI, while Amazon’s will be spread across these plus devices and warehouse/automation technologies, and Apple will likely concentrate more on devices and materials. Intel’s R&D spending is mostly for fabrication and process tech, so it sits in a somewhat different sector of technology compared to the others.
…Read more here: Tech companies spend more on R&D than any other company in the US.

Tech Tales:

[2032: Detroit, USA.]

The wrecking crew of one enters like a ping pong ball into a downward-facing maze – the entranceway becomes a room containing doors and one of them can be opened, so it bounces through that door and finds a larger room with more doors, and this time it can force open more than one of them. It splits into different pieces, growing stronger, and explores the castle of the mind of the AI, entering at different points, infecting and wrecking where it can.

It started with its vision, they said. The classifiers went awry. Saw windmills in clouds, and people in shadows. Then it spread to the movement policies. Mechanical arms waved oddly. And not all of its movements were physical – some were digital, embodied in a kind of data ether. It reached out to other nearby systems – exchanged information, eventually persuaded them that flags were fires, clouds were windmills, and people were shadows. Data rots.

It spread and kept on spreading. Inside the AI there was a system that performed various meta-learning operations. The virus compromised that – tweaking some of the reward functions, altering the disposition of the AI as it learned. Human feedback inputs were intercepted, and instead generative adversarial networks dreamed up synthetic outputs for the human operators to look at, who selected what they thought were guidance behaviors but were in fact false flags. Inside the AI the intruder gave its own feedback on the algorithms according to its own goals. In this way the AI changed its mind.

Someone decides to shut it down – stop the burning. FEMA is scrambled. The National Guard are, eponymously, nationalized. Police, firefighters, EMTs, all get to work. But the tragedies are everywhere and stretch from the banal to the horrific – cars stop working; ATMs freeze; robots repeatedly clean the same patches of floors; drones fall out of the sky, beheading trees and birds and sometimes people on their way down; bridges halt, half up; ships barrel into harbors; and one recommender system decides that absolutely everyone should listen to Steely Dan. A non-zero percentage of everything that isn’t unplugged performs its actions unreliably, diverging from the goals people had set.

Recovery takes years. The ‘Geneva Isolation Protocol’ is drafted. AIs and computer systems are slowly redesigned to be modular, each system able to fully defend and cut off itself, jettisoning its infected components into the digital ether. Balkanization becomes the norm, not because of any particular breakdown, but due to the set-your-watch-by-it logic of emergent systems.

Import AI: Issue 58: AI makes facial identification systems see through masks, creating Yelp-foolin’ fake reviews, and automated creativity with pix2pix

Donate your brains to a good cause:
…The AI Grant, an initiative run by Nat Friedman and Daniel Gross to dispense no-strings-attached AI grants (cash, GPUs via FloydHub, CrowdFlower credits, Google Compute Engine credits, data labeling from ScaleAPI) for the purpose of “doing interesting work that otherwise might not happen”, is (reassuringly) inundated with applications. Go here to sign up to review applications for the grant and help spread resources into socially useful DIY AI projects.
Sign up form here.  

Amazon and Microsoft’s virtual assistants team up:
…Amazon and Microsoft are… playing nice? The two companies have teamed up so their personal assistants (Amazon: Alexa, Microsoft: Cortana) can access and summon their counterpart. The rough idea seems to be to create greater interoperability between the different assistants and therefore improve the experience of the individual user.
…(It’s made more interesting by the fact the companies actually compete with each other quite meaningfully in the form of AWS versus Azure.) No word yet on whether we’ll see these systems integrate with Google’s Assistant as well.
…Read more here: Hey Cortana, Open Alexa (Microsoft blog).

New PyBullet environments, another reason to switch from MuJoCo:
…PyBullet is an open source physics simulator developed by Google Brain. The software serves as a free alternative to MuJoCo (though it lacks a couple of the performance tweaks and fidelity features that its proprietary sibling possesses). But it’s free! And getting better all the time. The latest release includes new (simulated) agents and environments, including a KUKA grasping robot, an MIT racecar, a Minitaur robot, and more.
…Read more: PyBullet.org

Automated creativity with pix2pix:
…Fun project in which artist Patrick Tresset trains a pix2pix model on pairs of drawings and human photographs (21,000 drawings depicting around 3,500 people), creating a system that lets you sketch in new faces of people and have them generated programmatically, on-the-fly.
…Check out the video here – a fantastic example of automated art.

A Mission Control-style checklist for neural network researchers:
Implementing neural networks can be very, very challenging, as it’s easy to introduce bugs that disrupt the learning process without causing a total failure. Since AI is mostly an empirical science (1. come up with an approach; 2. test the approach on a given domain; 3. inspect the results; 4. test numerous variants of step 2 to develop better intuitions about the meaning of step 3), the process of finding and dealing with bugs is itself lengthy and reasonably unprincipled.
…So researchers may find it useful to be more proactive in writing up some of their tips and intuitions. Check out this blog post from Ubisoft Montreal developer Daniel Holden to get an idea of some of the common failure modes inherent to neural network development and what easy things you can check through to isolate problems.
Read more in: My Neural Network isn’t working! What should I do?
Similar: John Schulman (who works at OpenAI) has also been giving tips on how to train deep reinforcement learning systems.
Check out some of these tips here.

Balaclava no more – researchers develop facial identifier that works through (some) masks:
…Researchers with the University of Cambridge in the UK, the National Institute of Technology, and the Indian Institute of Science have developed a deep learning approach to solving the problem of ‘Disguised Facial Identification’, aka, how to identify people at protests who have covered their faces.
…The approach relies on the creation of two new datasets, both of which contain 2,000 images each, and which label the 14 key points essential for facial identification on each person’s face. A simple variant of the dataset has simple backgrounds, while the harder version has noisy, more complex backgrounds. Both datasets appear to consist of portrait-style photographs, and feature male and female subjects aged between 18 and 30, wearing a variety of disguises, including: ‘(i) sun-glasses (ii) cap/hat (iii) scarf (iv) beard (v) glasses and cap (vi) glasses and scarf (vii) glasses and beard (viii) cap and scarf (ix) cap and beard (x) cap, glasses, and scarf.’
…The results: The resulting Disguised Face Identification (DFI) framework can identify a person wearing a cap, face-covering scarf, and glasses, about 55% of the time in the simple dataset, and 43% of the time in the complex one. So don’t put down that protest wear just yet – the technology has a ways to go. In the long run, perhaps this will increase the likelihood of people using rigid masks – like the V for Vendetta one adopted by anonymous – instead of soft ones like scarves, balaclavas, and so on. I also think that the datasets and underlying machine learning techniques will need to get dramatically better and larger for this sort of approach to be tractable and practical – especially when dealing with diverse groups of protesters.
…Read more here: Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network.

Think that Yelp review is real? Think again. RNNs create plausible fake reviews…
…Researchers with the University of Chicago have used recurrent neural networks (RNNs) to generate fake Yelp reviews that evade traditional statistical and human detection techniques while also being scored highly for ‘usefulness’ by users. This represents a new trend in AI – using off-the-shelf technologies for malicious purposes – that is already present in other fields. The community will need to become more cognizant of the ways in which this technology can and will be abused.
…The researchers find that some of their synthetic, generated reviews evade detection by a machine learning classifier designed to identify fake reviews, and even rank (in some cases) better than reviews written by real humans.
…Another eye-opening aspect of this study is how good neural networks have got at generating language under restricted circumstances. “Even trained on large datasets, RNNs have generally fallen short in producing writing samples that truly mimic human writing [50]. However, our insight is that the quality of RNN-generated text is likely more than sufficient for applications relying on domain-specific, short length user-generated content, e.g., online reviews.” (OpenAI observed a similar phenomenon recently, where we created a language model trained on a corpus of 82 million Amazon reviews, which could generate very plausible, detailed sentences.)
Example of a generated (5 star) Yelp review: “I love this place. I have been going here for years and it is a great place to hang out with friends and family. I love the food and service. I have never had a bad experience when I am there.”
…Datasets used: The Yelp Challenge dataset, which consists of 4.1 million reviews by around 1 million reviewers.
Defending against this: The authors come up with a credible approach to defend against such a system, which is based on the insight that due to how RNNs are trained they will develop some uniquely identifying characteristics in their resulting generated text that web operators can build classifiers to detect. “We observe that text produced naturally (e.g., by a human) diverges from machine generated text when we compare the character level distribution, even when higher level linguistic features (e.g., syntactic, semantic features) might be similar,” they write. This would naturally lead to an attacker trying to train even larger language models, so as to create text with enough subtlety and human-like traits to evade detection, but this imposes an ever-growing computational and skill-based cost on the attacker.
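…The core of that defense can be approximated in a few lines: compare the character-level distribution of a suspect review against a reference corpus of human text (a simplistic sketch of the insight, not the authors' classifier):

```python
import math
from collections import Counter

def char_dist(text):
    counts = Counter(text.lower())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def kl_divergence(p, q, eps=1e-9):
    # How far the suspect's character distribution sits from the reference.
    return sum(pv * math.log(pv / q.get(ch, eps)) for ch, pv in p.items())

reference = char_dist("Great food, friendly staff!")  # stand-in human corpus
suspect = char_dist("I love this place. I have been going here for years.")
print(kl_divergence(suspect, reference))  # higher = less human-like, per the paper
```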
Unnecessary acronym of the week: Perhaps this leads to a world of attackers and defenders constantly trying to outwit each other by building better and larger models, aka: MAID (Mutually Assured Intelligence Development).
…Read more here: Automated Crowdturfing Attacks and Defenses in Online Review Systems.

The humbling experience of deploying robots in reality:
…Famed roboticist-slash-lovable-curmudgeon Rodney Brooks has written an essay about why, despite having developed a range of consumer, industrial, and military robots, he still has such low expectations of what AI is and isn’t capable of when it is forced to work in the real world.
…”The robots we sent to Fukushima were not just remote control machines. They had an Artificial Intelligence (AI) based operating system, known as Aware 2.0, that allowed the robots to build maps, plan optimal paths, right themselves should they tumble down a slope, and to retrace their path when they lost contact with their human operators. This does not sound much like sexy advanced AI, and indeed it is not so advanced compared to what clever videos from corporate research labs appear to show, or painstakingly crafted edge-of-just-possible demonstrations from academic research labs are able to do when things all work as planned. But simple and un-sexy is the nature of the sort of AI we can currently put on robots in real, messy, operational environments,” he writes.
Bonus: Brooks is a good writer and it’s worth soaking in his (spooky) description of post-meltdown Fukushima.
Read more here: Domo Arigato Mr Roboto

The Import AI ‘Everything Is Fine’ quote of the week award goes to…
…Vladimir Putin, talking about artificial intelligence: “The one who becomes the leader in this sphere will become the ruler of the world”.
…Putin also likes the idea of AI-infused drones fighting proxy wars.
…Miles Brundage has provided a handy meme illustration of how these sorts of quotes make some AI people feel.
…Read more in this Quartz writeup.

Cool AI Policy Job alert:
…The Future of Life Institute is hiring for an AI policy expert – a new type of job made possible by the recent gains in AI research. Activities will include developing policy strategies for FLI (which will likely have a significant AI safety component) and reading&synthesizing the tremendous amounts of things that are published and relate to AI policy.
…From experience, I can also say that AI policy involves one key skill which seems (at least to me) non-obvious – reporting: you spend a lot of time trying to figure out who knows who, who knows what, and why. Then you talk to them.
Read more about the role here.

Number of the week: 5.1 petaflops:
…That is how much computation power just-uncloaked AI translation startup DeepL claims to have access to in a data center in Iceland. 5.1 petaflops is roughly equivalent to the world’s 23rd most powerful supercomputer (though this is a somewhat wobbly comparison, as the underlying infrastructure, network, and general architecture topology will be totally different).
…Read more here on DeepL’s website.

OpenAI Bits&Pieces:

OpenAI Baselines update: John Schulman has refactored some of the code for OpenAI Baselines, our repository of high-quality implementations of reinforcement learning algorithms.
Check out the repo here.

Tech Tales:

[2023: Portland, Oregon, USA.]

“Anyone in here it’s your last chance we’re coming in!” they said, all at once, the words accordion-compressed.
Nothing.
“Breaching,” an intake of breath then the swing of the doorbreaker.
A soldier goes first, scanning the room. “Clear!”
Then in walks the detective. They inspect the space – a workshop, electronics overflowing the boxes on the walls, oily yellow lights with a film of dust over the bulbs, the smell of something gone moldy in a mug, somewhere. As they walk there’s the occasional crunching sound – potato chips, fresh enough to crackle. They were just here, the detective thinks.
…They were in so much of a hurry they left their drone, though. The detective walks over and takes out a USB key, then fishes out a tangle of electronic adapters from another pocket, finds the drone port, and boots in. Within a couple of minutes the drone’s innards are spilling out onto one of the police computers, telling a story in commented-out code and sneakily added patches.
…Geofencing: Disabled.
…Computer Vision Auditing: Circuitboard re-wired to jump over the auditor chip.
…Planning: Custom software, replacing a phone-home cloud AI system.
…Drone-to-drone communications module: Augmented with custom software.
…”Oh, shit,” the detective says. He’s barely finished the second word when one of the policemen’s radios crackles. They lean their head into it. Speak. Frown. Come over to the detective.
“Sir,” they say. “We’ve got numerous reports of drones falling out of the sky.”
“Ok.”
Another crackle.
“Correction sir,” the policeman says, “A fraction of the drones in the forest area are now deviating from their pre-cleared flight courses.”

Post-Event Report / Author Detective Green / TOPSECRET /:

At approximately 0800 hours a computer virus developed by REDACTED was injected into the control software of approximately REDACTED drones. At 0820 hours seven of these REDACTED drones began flight operations, rapidly integrating into the main routes used by parcel, utility, medical, and REDACTED drones in the greater Portland metropolitan area. At 0825 hours forces assaulted a property believed to belong to REDACTED and upon entrance located one drone that was not yet flight operational. Entrance to the property triggered a silent alarm which beamed a series of encrypted control commands to a set of REDACTED servers spread across REDACTED compromised data centers across the northwest. At 0830 the REDACTED flying drones phoned home to this control server. Following connection, the drones deviated from their pre-assigned courses and began an areawide scan for REDACTED. Any drone that came within REDACTED meters of an affected drone was targeted by drone-to-drone computer virus(es), which led to an 82% compromise rate among other drones. 40% of these compromised drones deviated from their own routes; the rest became nonfunctional and ceased flight operations. At 0845 REDACTED drones converged on location codename JUNE BLOOM. 0846: detonation of an improvised device occurred. 0847: drones in JUNE BLOOM vicinity self-destructed. 0850: drones began to return to normal flight paths.

Post-Incident Cost Report: Finding and analyzing the drone fleet in the greater Portland metropolitan area is an ongoing process, with full clean-room protocols adopted during analysis. So far we believe we have located REDACTED of a total assumed number of REDACTED compromised drones. Next update in 24 hours.

Technologies that inspired this story: Botnets, off-the-shelf vision classifiers where the top layer is retrained, Quadcopter Navigation in the Forest using Deep Neural Networks, software verification.

Import AI Issue 57: Robots with manners, why computers are like telescopes to the future, and Microsoft bets big on FPGAs over ASICs

AI Safety, where philosophy meets engineering:
…AI safety is a nebulous, growing, important topic that’s somewhat poorly understood – even by those working within it. One question the community lacks a satisfying answer to is: what is the correct layer of abstraction at which to ensure safety? Do we do it by encoding a bunch of philosophical and logical precepts into a machine, then feeding it successively higher-fidelity realities? Or do we train systems to model their behaviors on humans’ own actions, potentially trading off some interpretability for the notion that humans ‘know what looks right’ and (mostly) act in ways that other humans approve of?
… This writeup by Open Phil’s Daniel Dewey sheds some light on one half of this question, covering MIRI’s work on ‘highly reliable agent design’ and its attempts to tackle some of the thornier problems inherent to the precept side of AI safety (eg – how can we guarantee a self-improving system doesn’t develop wildly divergent views to ours about what constitutes good behavior? What sorts of reasoning systems can we expect the agent to adopt when participating in our environments? How does the agent model the actions of others to itself?).
…Read more here: ‘My current thoughts on MIRI’s ‘highly reliable agent design’ work.

Why compute is strategic for AI:
…Though data is crucial to AI algorithms, I think computers are much more strategic for AI development, especially when carrying out research on problems that demand complex environments (like enhancing reinforcement learning algorithms, or work on multi-agent simulations, and so on).
…”Having a really, really big computer is kind of like a time warp, in that you can do things that aren’t economical now but will be economically [feasible] maybe a decade from now,” says investor Bill Joy.
…Read more in this Q&A with Joy about technology and a (potentially) better battery.

That’s Numberwang – MNIST for Fashion arrives:
…German e-commerce company Zalando has published ‘Fashion-MNIST’, a training dataset containing 60,000 28x28px images of different types of garment, like trousers or t-shirts or shoes. This is quite a big deal – everyone tends to reach for the tried-and-tested MNIST when testing out new AI classification systems, but as that dataset just consists of the digits 0-9 in a range of different formats, it has also become terribly boring. (And there’s some concern that we could be overfitting to it.)
…”Fashion-MNIST is intended to serve as a direct drop-in replacement of the original MNIST dataset for benchmarking machine learning algorithms,” they write. Let’s hope that if people test on MNIST they now also test on Fashion-MNIST as well (or better yet, move on to CIFAR or ImageNet as a new standard ‘testing baseline’.)
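…'Drop-in replacement' here means the files share MNIST's exact format, so existing loaders work unchanged. With torchvision, for example (assuming a version that ships the dataset), the swap is one class name:

```python
from torchvision import datasets, transforms

# Identical call signature to datasets.MNIST -- just swap the class name.
train = datasets.FashionMNIST(root="data", train=True, download=True,
                              transform=transforms.ToTensor())
image, label = train[0]
print(image.shape, label)  # torch.Size([1, 28, 28]), a class index in 0..9
```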
…Read more about the dataset here.
…Check out benchmarks on the dataset published by Zalando here.

Reach out and touch shapes #2: New Grasping research from Google X:
…When you pick up a coffee cup you’ve never seen before, what do you do? Personally, I eyeball it, then in my head I figure out roughly where I should grab it based on its appearance and my previous (voluminous) experience at picking up coffee cups, then I grab it. If I smash it I (hopefully) learn about how my grip was wrong and adjust for next time.
…Now, researchers from Google have tried to mimic some of this broad mental process by creating what they call a ‘Geometry-aware’ learning agent that lets them teach their own robots to pick up any of 101 everyday objects with a success rate of between 70% and 80% (and around 60% on totally never-before-seen objects).
…The system represents the new sort of architecture being built – highly specialized and highly modular. Here, an agent studies an object in front of it through around three to four distinct camera views, then uses this spread of 2D images to infer a 3D representation of the object, which it projects into an OpenGL layer used to manipulate views and potential grasps of the object. It figures out appropriate grasps by drawing on an internal representation of around 150,000 valid demonstration grasps, then adjusting its behavior to have characteristics similar to those successful grasps. The system works, and demonstrates significantly better performance than other systems, though until it gets to accuracies in the 99%+ range it is unlikely to be of major use to industry. (Though given how rapidly deep learning can progress, it seems likely progress could be swift here.)
…Notable: Google only needed around 1,500 human demonstrations (given via HTC Vive in virtual reality in Google’s open source ‘PyBullet’ 3D world environment) to create the dataset of 150,000 distinct grasping predictions. It was able to augment the human demonstrations with a series of orientation randomization systems to help it generate other, synthetic, successful grips.
…Read more here: Learning Grasping Interaction with Geometry-Aware 3D Representations.

Skinning the magnetic cat with traditional physics techniques, as well as machine learning techniques:
What connections exist between machine learning and physics? In this illuminating post we learn how traditional physics techniques as well as ML ones can be used to make meaningful statements about interactions in a (simple) Ising model.
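…For the curious, the (simple) Ising model in question is easy to play with yourself via a textbook Metropolis loop (generic sketch, not the post's code):

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta = 32, 0.6                      # lattice size, inverse temperature
spins = rng.choice([-1, 1], size=(N, N))

for _ in range(100_000):               # Metropolis-Hastings updates
    i, j = rng.integers(N, size=2)
    # Energy change from flipping spin (i, j), with periodic boundaries.
    nb = (spins[(i + 1) % N, j] + spins[(i - 1) % N, j]
          + spins[i, (j + 1) % N] + spins[i, (j - 1) % N])
    dE = 2 * spins[i, j] * nb
    if dE <= 0 or rng.random() < np.exp(-beta * dE):
        spins[i, j] *= -1

print("magnetization:", spins.mean())  # nonzero below the critical temperature
```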
…Read more here in: How does physics connect to machine learning?

A selection of points at the intersection of healthcare and machine learning:
…Deep learning-based pose estimation techniques can be used to better spot and diagnose afflictions like Parkinson’s; embeddings derived from people’s social media timelines can help provide ongoing diagnosis capabilities for mental health; and the FDA needs to approve each genuinely new deep learning model, but will accept tweaks to existing models without requiring people to fill in a tremendous number of forms – read about these points and more in this post: 30 Things I Learned at MLHC 2017.

Microsoft matches IBM’s speech recognition breakthrough with a significantly simpler system:
…A team from Microsoft Research has revealed its latest speech recognition system, which attains an error rate of around 5.1% on the Switchboard corpus.
Read more about the system here (PDF).
…Progress on speech recognition has been quite rapid here, with IBM and Microsoft fiercely competing with each other to set new standards, presumably because they want to sell speech recognition systems to large-scale customers, while – and this is pure supposition on my part – Amazon and Google plan to sell theirs via API and are less concerned about this PR fight.
…A quick refresher on error rates on switchboard.
…August 2017: Microsoft: 5.1%*
…March 2017: IBM: 5.5%
…October 2016: Microsoft: 5.9%**
…September 2016: Microsoft: 6.3%
…April 2016: IBM: 6.9%.
…*Microsoft claims parity with human transcribers, though wait for external validation of this.
…**Microsoft claimed parity with human transcribers, though turned out to be an inaccurate measure.
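…For reference, the ‘error rate’ here is word error rate (WER): the word-level edit distance between the system’s transcript and the reference transcript, divided by the reference length. A minimal implementation:

```python
# Word error rate via dynamic-programming edit distance over words.
def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deletions
    for j in range(len(h) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat on the mat", "the cat sat in the hat"))  # 2 errors / 6 words ≈ 0.33
```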

Ultimate surveillance: AI to recognize you simply by the way you walk:
We’re slowly acclimatizing to the idea that governments will use facial recognition technologies widely across their societies. In recent years the technology has expanded from police and surveillance systems into border control checkpoints and now, in places like China, into public spaces like street crossings, where AI spots repeat offenders of relatively minor crimes like jaywalking or crossing against a light.
…Activists already wear masks or bandage their faces to try to stymie these efforts. Some artists have even proposed daubing on certain kinds of makeup that stymie facial recognition systems (a fun, real world demonstration of the power of adversarial examples).
…Now, researchers with Masaryk University in the Czech Republic propose using video surveillance systems to identify a person, infer their specific gait, and then search for that gait across other security cameras.
…”You are how you walk. Your identity is your gait pattern itself. Instead of classifying walker identities as names or numbers that are not available in any case, a forensic investigator rather asks for information about their appearances captured by surveillance system – their location trace that includes timestamp and geolocation of each appearance. In the suggested application, walkers are clustered rather than classified. Identification is carried out as a query-by-example,” the researchers write.
How it works: The system takes input from a standard RGB-D camera (the same kind found in the Kinect – now quite widely available) then uses motion capture technology to derive the underlying structure of the person’s movements. Individual models of different people’s gaits are learned through a combination of Fisher’s Linear Discriminant Analysis and the Maximum Margin Criterion (MMC).
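…A rough sketch of that learning step – my illustration with random stand-ins for the real MoCap features, not the authors’ code – is: project gait features with LDA so that walkers separate, then identify query-by-example via nearest neighbor:

```python
# Fisher's LDA projection plus nearest-neighbor gait lookup (illustrative only).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.random.randn(200, 50)              # 200 walk samples, 50 gait features each
y = np.random.randint(0, 20, size=200)    # 20 known walkers (training data only)
lda = LinearDiscriminantAnalysis(n_components=10).fit(X, y)

query = lda.transform(np.random.randn(1, 50))  # an unidentified walker
gallery = lda.transform(X)
best = np.argmin(np.linalg.norm(gallery - query, axis=1))
print("most similar recorded walker:", y[best])
```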
How well does it work: Not hugely well, so put the tinfoil hats down for now. But with many research groups working on gait analysis and identification as part of large-scale video understanding projects, I’d expect the basic components that go into this sort of system to improve over time.
…Read more: You Are How You Walk: Uncooperative MoCap Gait Identification for Video Surveillance with Incomplete and Noisy Data.
…Bonus question: Could techniques such as this spot Keyser Soze?

Review article: just what the heck has been happening in deep reinforcement learning?
…Several researchers have put together a review paper, analyzing progress in deep RL. Deep RL is a set of techniques that have underpinned recent advances in getting AI systems to control and master computer games purely from pixel inputs, and to learn useful behaviors on robots (real and simulated), along with other applications.
…If some of what I just wrote was puzzling to you, then you might benefit from reading the paper here: A Brief Survey of Deep Reinforcement Learning.
…Everyone should read the conclusion of the piece: “Whilst there are many challenges in seeking to understand our complex and everchanging world, RL allows us to choose how we explore it. In effect, RL endows agents with the ability to perform experiments to better understand their surroundings, enabling them to learn even high-level causal relationships. The availability of high-quality visual renderers and physics engines now enables us to take steps in this direction, with works that try to learn intuitive models of physics in visual environments. Challenges remain before this will be possible in the real world, but steady progress is being made in agents that learn the fundamental principles of the world through observation and action.”

Robots with manners (and ‘caresses’):
…In some parts of Northern England it’s pretty typical that you greet someone – even a stranger – by wandering up to them, slapping them on the arm, and saying ‘way-eye’. In London, if you do that people tend to stare at you with a look of frightfully English panic, or call the police.
…How do we make sure our robots don’t make these sorts of social faux pas? An EU-Japan project called ‘CARESSES’ is trying to solve this, by creating robots that pay attention to the cultural norms of the place they’re deployed in.
…The project so far consists of a set of observations about how robots can integrate behaviors that account for cultural shifts, and includes three different motivating scenarios, created through consultation with a Transcultural Nurse. This includes having the robot minimize uncertainty when talking to someone from Japan, or checking how deferential it should be with a Greek Cypriot.
…Components used: the system runs on the ‘universAAL’ platform, an EU AI framework project, and integrates with ‘ECHONET’, a Japanese standard for home automation.
…Read the paper for what is, at this stage, mostly a list of possible approaches. In a few years it’s likely that various current research avenues in deep reinforcement learning could be integrated into robot systems like the ones described within:
The CARESSES EU-Japan project: making assistive robots culturally competent.

Microsoft has a Brainwave with FPGAs specialized for AI:
…Moore’s Law is over – pesky facts of reality, like the end of Dennard scaling for transistors, and the materials-science properties of silicon, are putting a brake on progress in traditional chip architectures. So what’s an ambitious company with plans for AI domination to do? The answer, if you’re Google, is to create an application-specific integrated circuit (ASIC) with certain AI capabilities baked directly into the logic of the chip – that’s what the company’s Tensor Processing Units (TPUs) are for.
…Microsoft is taking a different tack with ‘Project Brainwave’, an initiative to use field programmable gate arrays for AI processing, with a small ASIC-esque component baked onto each FPGA. The bet here is that though FPGAs tend to be less efficient than ASICs, their innate flexibility (field programmable means you can modify the logic of the chip after it has been fabbed and deployed in a data center) means Microsoft will be able to adapt them to new workloads as rapidly as new AI components get invented.
…The details: Microsoft’s chips contain a small hardware accelerator element (similar to a TPU though likely broader in scope and with less specific performance accelerations), and a big block of undifferentiated FPGA infrastructure.
…The bet: Google is betting that it’s worthwhile to optimize chips for today’s basic AI operations, trading off flexibility for performance, while Microsoft is making the opposite bet, prioritizing flexibility. Developments in AI research, and their relative rate of occurrence, will make one of these strategies succeed and the other struggle.
…Read more about the chips here, and check out the technical slide presentation.

All hail the software hegemony:
…Saku P – a VR programmer with idiosyncratic views on pretty much everything – has a theory that Amazon represents the shape of most future companies: a large software entity that scales itself by employing contractors for its edge business functions (aka, dealing with actual humans in the form of delivering goods), while using its core business to build infrastructure that enables secondary and tertiary businesses.
…Play this tape forward and what you get is an economy dominated by a few colossal technology companies, likely spending vast sums on building technical vanity projects that double as strategic business investments (see: Facebook’s various drone schemes, Google’s ‘net-infrastructure-everywhere push, Jeff Bezos pouring his Amazon-derived wealth into space company Blue Origin, and so on).
…Read more here: How big will companies be in the 21st Century?

Tech Tales:

[2027: Kexingham Green, a council estate in the outer-outer exurban sprawl of London, UK. Beyond the green belt, where new grey social housing towns rose following the greater foreign property speculation carnival of the late twenty-teens. A slab of housing created by the government’s ‘renew and rehouse from the edge’ campaign, housing tens of thousands of souls, numerous chain supermarkets, and many now derelict parking lots.]

Durk Ciaran, baseball cap on backwards and scuffed Yeezys on his feet paired with a pristine – starched? – Arsenal FC sponsored by Alphabet (™) shirt – regarded the crowd in front of him. “Ladies and gentlemen and drones let me introduce to you the rawest, most blinged out, most advanced circus in all of Kex-G – Durk’s Defiant Circus!”
…”DDC! DDC! DDC!” yells the crowd.
…”So let’s begin,” Durk says, sticking two fingers in his mouth and letting out a long whistle. A low, hockeypuck-shaped ex-warehouse drone hisses out of a pizza box at the edge of the crowd and moves towards Durk, who without looking raises one foot as the machine slides under it, then another, suddenly standing on the robot. Durk begins to move in a long circle, spinning slightly on the ‘bot. “Alright,” he says, “Who’s hungry?”
…”The drones!” yells the crowd.
…”Please,” Durk says, “Pigeons!” A ripple of laughter. He takes a loaf of bread out of his pocket and holds it against the right side of his torso with his elbow, using his left hand to pull off doughy chunks and place them in his right hand. Once he’s got a fistful of bread chunks he puts the loaf back in his pocket. “Are we ready?” he says.
…”Yeahh!!!!!” yell the crowd.
…”Alright, bless up!” he says, tossing the chunks of bread in the air. And out of a couple of ragged tents at the edge of the parking lot come the drones, fizzing out, grey, re-purposed Amazon Royal Mail (™) delivery drones, now homing in on the little trackers Durk baked into the bread the previous evening. The drones home in on the bread and then their little fake pigeon mouths snap open, gulping down the chunks, slamming shut again. A small hail of crumbs falls on the crowd, who go wild.
…But there’s a problem with one of the drones – one of its four propellers starts to emit a strange, low-pitched juddering hum. Its flight angle changes. The crowd start to worry, audible groans and ‘whoas’ flood out of them.
…”Now what’s gonna happen to this Pigeon?” Durk says, looking up at the drone. “What’s it gonna do?” But he knows. He thumbs a button on what looks superficially like a bike key on his pocket key fob. Visualizes in his head what will soon become apparent to the crowd. Listens to the drone judder. He closes his eyes, spinning on the re-purposed warehouse bot, listening to the crowd as they chatter to themselves, some audibly commenting, others craning their heads. Then he hears the sighs. Then the “look, look!”. Then the sound of a kid crying slightly. “What’s going on Mummy what is THAT?”
…It comes in fast, from a great distance. Launches off of a distant towerblock. Dark, military-seeming green. A carrier drone. Re-purposed Chinese tech, originally used by the PLA to drop supplies across Africa as part of a soft geopolitical outreach program, now sold in black electronics markets around the world. Cheap transport, no questions asked. Durk looks at it now. Sees the great, Eagle-like eyes spraypainted on the side of its front. The carrier door fitted with 3D-printed plastic to form a great yellow beak. Green Eagle DDC stenciled on one of its wings, facing up so the crowd can’t see it but he can. It opens its mouth. The small, grey Pigeon drone tries to fly away but can’t, its rotor damaged. Green Eagle comes in and with a metal gulp eats the drone whole, its yellow mouth snapping shut, before arcing up and away.
…”The early bird gets the worm,” Durk says. “But you need to think about the thing that likes to eat the early birds. Now thank you ladies and gentlemen and please – make a donation to the DDC, crypto details in the stream, or here.” He snaps his fingers and a lengthy set of numbers and letters appears in LEDs on the sidewalk. “Now, goodbye!” he says, thumbing another button in his pocket, letting his repurposed warehouse drone carry him towards one of the towerblocks, hiding him back in the rarely surveilled Kexingham estate, just before the police arrive.

Ideas that inspired this story:
Drones, DJI, deep reinforcement learning, Amazon Go, Kiva Systems, AI as geopolitical power, Drones as geopolitical power, Technology as the ultimate lever in soft geopolitical power, land speculators.

…Tech Tales Coda:

Last week I wrote about a Bitcoin mine putting its chip-design skills to use to create AI processors and spin up large-scale AI processing mines.
…Imagine my surprise when I stumbled on this Quartz story a few hours after sending the newsletter: Chinese company Bitmain has used its chip-design skills to create a new set of AI processors and to spin up an AI processing wing of its business. Spooky!

Import AI: Issue 56: Neural architecture search on a budget, Google reveals how AI can improve its ad business, and a dataset for building personal assistants

New dataset: Turning people into personal assistants — for SCIENCE…
As AI researchers look to build the next generation of personal assistants, there’s an open question as to how these systems should interact with people. Now, a new dataset and research study from Microsoft aims to provide some data about how humans and machines could work together to solve information-seeking problems.
The dataset consists of 22 pairs of people (questioners and answerers), who each spent around two hours trying to complete a range of information-seeking tasks. The questioner has no access to the internet themselves, but can speak to the answerer, who has access to a computer with the internet. The questioner asks some pre-assigned questions (like “I’ve been reading about the HPV vaccine, how can I get it?” or “I want to travel around America seeing as much as possible in three months without having to drive a vehicle myself, what’s the best route using public transit?”). The answerer plays the role of a modern Google Now/Cortana/Siri and uses a web browser to find out more information, asking clarifying questions of the other person when necessary. This human-to-human dataset is designed to capture some of the weird and wacky ways people try to get answers to questions.
…You can get the full Microsoft Information Seeking Conversations (MISC) dataset from here.
…Find out more information in the research paper, MISC: A dataset of information-seeking conversations.
…”We hope that the MISC data can be used to support a range of investigations, including, for example, understanding the relationship between intermediaries’ behaviours and seekers’ satisfaction; mining seekers’ behavioural signals for correlations with success, engagement, or satisfaction; examining the tactics used in conversational information retrieval and how they differ from tactics in other circumstances; the importance of conversational norms or politeness; or investigating the relationship between conversational structure and task progress,” they write.

Sponsored: The AI Conference – San Francisco, Sept 17-20:
Join the leading minds in AI, including Andrew Ng, Rana el Kaliouby, Peter Norvig, Jia Li, and Michael Jordan. Explore AI’s latest developments, separate what’s hype and what’s really game-changing, and learn how to apply AI in your organization right now.
Register soon. Space is limited. Save an extra 20% on most passes with code JCN20.

Number of the week: 80 EXABYTES:
…That’s the size of the dataset of heart ultrasound videos shared by Chinese authorities with companies participating in a large-scale digital medicine project in 7-million pop city Fuzhou. (For comparison, the 2014 ImageNet competition dataset clocked in at about 200 gigabytes, aka 0.2 terabytes, aka 0.0000002 exabytes.)
…Read more in this good Bloomberg story about how China is leveraging its massive stores of data to spur its AI economy: China’s Plan for World Domination in AI Isn’t So Crazy After All.

Bonus number of the week: 4.5 million:
That’s (roughly) the number of transcribed speeches in a dataset just published by researchers with Clemson University and the University of Essex. The dataset covers speeches given in the Irish parliament between 1919 and 2013.
…There’ll be a wealth of cool things that can be developed with such a dataset. As a preliminary example the researchers try to predict the policy positions of Irish finance ministers by analyzing their speeches over time in the parliament. You could also try to use the dataset to analyze the discourse of all speakers in the same temporal cohort, then model how their positions change relative to each other and to their starting points over time. For bonus points, train a language model to generate your own Irish political arguments?
…Read more here: Database of Parliamentary Speeches in Ireland (1919 – 2013). 
…Get the data here from the Harvard Dataverse.

The growing Amazon Web Services AI Cloud:
…Amazon, which operates the largest cloud computing service in AWS, is beginning to thread machine learning capabilities throughout its many services. The latest? Macie, an ML service that trawls through files stored in AWS, using machine learning to look for sensitive data (personally identifiable information, intellectual property, etc) in a semi-supervised way. Seems like RegEx on steroids.
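…To make the ‘RegEx on steroids’ quip concrete, here’s the non-steroid baseline – a naive pattern scanner for PII-shaped strings (illustrative patterns only; a service like Macie presumably layers ML classification on top of this kind of matching):

```python
# Naive PII scanner: pure regex, no ML.
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan(text):
    # Return every pattern family that matches, with the matching strings.
    return {name: pat.findall(text) for name, pat in PATTERNS.items() if pat.search(text)}

print(scan("Contact jane@example.com, card 4111 1111 1111 1111"))
```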
Read more here about Amazon Macie.

AI matters; matter doesn’t:
…A Chinese company recently released Eufy, a cute hockey-puck-shaped personal speaker/mic system that runs Amazon’s ‘Alexa’ voice service. Amazon is letting people build different types of hardware that connect to its fleet of proprietary Alexa AI services – a clear indication that Amazon thinks its underlying AI software is strategic, while hardware (like its own ‘Echo’ systems) is just a vessel.
…Read more here: This company copied the Amazon Dot and will sell for less – with Amazon’s blessing.

Making computer dreams happen in high-resolution:
…Artist Mike Tyka has spent the past few months trying to scale up synthetic images dreamed up by neural networks. It’s a tricky task because today it’s infeasible to generate images at resolutions much above 256x256 pixels, due to RAM/GPU and other processing constraints.
…In a great, practical post Tyka describes some of the steps he has taken to scale up the various images, generating large, freaky portraits of imaginary people. There’s also an excellent ‘insights’ section where he talks about some of the commonsense bits of knowledge he has gained from this experiment. Also, check out the latest images. “Getting better skin texture but hair seems to have gotten worse,” he writes.
Read more: Superresolution with semantic guide.

Psycho (Digital) Filler, Qu’est-ce que c’est?
…Talking Heads frontman David Byrne believes technology is making each of us more alone and more atomized by swapping out humans in our daily lives for machines (tellers for ATMs, checkout clerks for checkout scanners, drivers for self-driving software, delivery drivers for drones&landbots, and so on).
…”Our random accidents and odd behaviors are fun—they make life enjoyable. I’m wondering what we’re left with when there are fewer and fewer human interactions. Remove humans from the equation, and we are less complete as people and as a society,” he writes.
…Read more here in: Eliminating the Human

Google reveals way to better predict click-through-rate for web adverts:
…Google is an AI company whose main business is advertising, so it’s notable to see the company publish a technical research paper at the intersection of the two areas, defining a new AI technique that it says can lead to substantially better predictions of click-through rates for given adverts. (To get an idea of how core this topic is to Google’s commercial business, think of this paper as the equivalent of Facebook publishing research on improving its ability to predict which friend actions will turn a dormant account into an active one, or Kraft Foods coming up with a better, cheaper, quicker-to-cook instant cheese.)
…The paper outlines “the Deep & Cross Network (DCN) model that enables Web-scale automatic feature learning with both sparse and dense inputs.” This is a new type of neural network component that is potentially far better and simpler at learning the sorts of patterns that advertising companies are interested in. “Our experimental results have demonstrated that with a cross network, DCN has lower logloss than a DNN with nearly an order of magnitude fewer number of parameters,” they write.
How effective is it? In tests, DCN systems get the best scores while being more computationally efficient than other systems, Google says. The implications of the results seem financially material to any large-scale advertising company. “DCN outperforms all the other models by a large amount. In particular, it outperforms the state-of-art DNN model but uses only 40% of the memory consumed in DNN,” Google writes. The company also tested the DCN system on non-advertising datasets, noting very strong performance in these domains as well, implying significant generality of the approach.
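…The core of the paper is the ‘cross layer’, which (per my reading of the formula x_{l+1} = x_0 * (x_l^T w_l) + b_l + x_l) adds an explicit feature interaction with the original input at linear cost. A hedged PyTorch sketch, not Google’s code:

```python
# One "cross layer": multiply the original input by a learned scalar projection
# of the current layer, then add a bias and a residual connection.
import torch
import torch.nn as nn

class CrossLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(dim))

    def forward(self, x0, xl):
        xl_w = (xl * self.w).sum(dim=1, keepdim=True)  # scalar per example
        return x0 * xl_w + self.b + xl

x0 = torch.randn(32, 64)  # embedded sparse + dense input features
x = x0
for layer in [CrossLayer(64) for _ in range(3)]:
    x = layer(x0, x)      # stack a few cross layers, then feed a DNN head
```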
Read more here: Deep & Cross Network for Ad Click Predictions. 

Neural architecture search on a pauper’s compute budget:
…University of Edinburgh researchers have outlined SMASH, a system that makes it substantially cheaper to use AI to search through possible neural network architectures, while trading off only a small amount of accuracy.
Resources: SMASH can be trained on a handful of GPUs – or even a single one – whereas traditional neural architecture search approaches from Google and others can require 800 GPUs or more.
…The approach relies on randomly sampling neural network architectures, then using an auxiliary network (in this case a HyperNetwork) to generate the weights of each dreamed-up network, then using backpropagation to train the network end-to-end. The essential gamble in this approach is that the space of networks being sampled from is sufficiently broad, and that the parameters dreamed up by the HyperNet map relatively closely to the sorts of parameters you’d use in such generated classifiers. This sidesteps some of the costs inherent to large-scale NAS systems, but at the cost of some accuracy.
…SMASH uses a “memory-bank” view of neural networks to sample them. In this view “each layer [in the neural network] is thus an operation that reads data from a subset of memory, modifies the data, and writes the result to another subset of memory.”
…Armed with this set of rules, SMASH is able to generate a large range of modern neural network components on the fly, helping it efficiently dream up a variety of networks, which are then evaluated by the hypernetwork. (To get an idea of what this looks like in practice, refer to Figure 3 in the paper.)
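…A toy sketch of the hypernetwork idea at SMASH’s core – a stand-in illustration, not the paper’s code – is shown below: one network maps an encoding of a sampled architecture to the weights of that architecture:

```python
# HyperNet: turn an architecture encoding into the weights of one conv layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperNet(nn.Module):
    def __init__(self, arch_dim, out_ch, in_ch, k):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        self.gen = nn.Linear(arch_dim, out_ch * in_ch * k * k)

    def forward(self, arch_code):
        return self.gen(arch_code).view(self.shape)

hyper = HyperNet(arch_dim=16, out_ch=32, in_ch=3, k=3)
arch_code = torch.randn(16)  # encoding of a randomly sampled architecture
w = hyper(arch_code)         # generated weights for that architecture
y = F.conv2d(torch.randn(8, 3, 32, 32), w, padding=1)  # run the "dreamed up" layer
# Training: backprop a task loss through y into hyper's parameters, so one
# HyperNet learns to propose plausible weights for many sampled architectures.
```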
…The approach seems promising. In experiments, the researchers saw meaningful links between the validation loss predicted by SMASH for given networks, and the actual loss seen when testing in reality. In other tests they find that SMASH can generate networks with performance approaching the state-of-the-art, at a fraction of the compute budget of other systems. (And, most importantly, without requiring AI researchers to fry their brains for months to invent such architectures.)
…Read more here: SMASH: One-Shot Model Architecture Search through HyperNetworks.
Explanatory video here.
…Components used: PyTorch
…Datasets tested on: CIFAR-10 / CIFAR-100 / ImageNet 32 / STL-10 / ModelNet

A portfolio approach to AI Safety Research:
…(Said with a hint of sarcasm:) How do we prevent a fantastical future superintelligence from turning the entirety of the known universe into small, laminated pictures of the abstract dreams within its God-mind? One promising approach is AI safety! The thinking is that if we develop more techniques today to make agents broadly predictable and safe, then we have a better chance at ensuring we live in a future where our machines work alongside and with us in ways that seem vaguely interpretable and sensible to us.
…But how do we achieve this? DeepMind AI safety researcher Victoria Krakovna has some ideas, which loosely come down to ‘don’t put all your eggs in one basket’, and which she has outlined in a blog post.
…Read more here: A portfolio approach to AI safety research.
…Get Rational about AI Safety at CFAR!
…The Center for Applied Rationality has opened up applications for its 2017 AI Summer Fellows Program, which is designed to prepare eager minds for working on the AI Alignment Problem (the problem is regularly summarized by some people in the community as getting a computer to go and bring you a strawberry without it also carrying out any actions that have gruesome side effects.)
You can read more and apply to the program here.

Chinese AI chip startup gets $100 million investment:
…Chinese chip startup Cambricon has pulled in $100 million in a new investment round from a fund linked to the Chinese government’s State Development and Investment Corp, as well as funding from companies like Alibaba and Lenovo.
…The company produces processors designed to accelerate deep learning tasks.
…Read more on the investment in China Money Network.
…Cambricon’s chips ship with a proprietary instruction set designed for a range of neural network operations, with reasonable performance across around ten distinct benchmarks. Also, it can be fabricated via TSMC’s venerable 65nm process node technology, which means it is relatively cheap and easy to manufacture at scale.
…More information here: Cambricon: An Instruction Set Architecture for Neural Networks.

Facial recognition at the Notting Hill Carnival in the UK:
…The UK’s Metropolitan Police will conduct a large-scale test of facial recognition this month when they use the tech to surveil the hordes of revelers at the Notting Hill Carnival street party in London. Expect to see a lot of ML algorithms get confused by faces occluded by jerk chicken, cans of Red Stripe, and personal cellphones used for selfies.
…Read more here: Met police to use facial recognition software at Notting Hill carnival.

Automation’s connection to politics, aka, Republicans live near more robots than Democrats:
…The Brookings Institution has crunched data from the International Federation for Robotics to figure out where industrial robots are deployed in America. The results highlight the uneven distribution of the technology.
State with the most robots: Michigan, ~28,000, around 12 percent of the nation’s total.
Most surprising: Could the distribution of robots tell us a little bit about the conditions in the state and allow us to predict certain political moods? Possibly! “The robot incidence in red states that voted for President Trump in November is more than twice that in the blue states that voted for Hillary Clinton,” Brookings writes.
…Read more here: Where the robots are.

OpenAI Bits & Pieces:

Exponential improvement and self-play:
…We’ve published some more details about our Dota 2 project. The interesting thing to me is the implication that if you combine a small amount of human effort (creating experimental infrastructure, structuring your AI algorithm to interface with the environment, etc) with a large amount of compute, you can use self-play to rapidly go from sub-human to super-human performance within certain narrow domains. A taste of things to come, I think.
…Read more here: More on Dota 2.

OpenAI co-founder and CTO Greg Brockman makes MIT 35 under 35:
…Greg Brockman has made it onto MIT Technology Review’s 35 under 35 list due to his work at OpenAI. Congrats Greg “visionary” Brockman.
…Read more here on the MIT Technology Review.

Move outta the way A2C and TRPO, there’s a new ACKTR in town:
…OpenAI has released open source code for ACKTR, a new algorithm by UofT/NYU that demonstrates tremendous sample efficiency and works on both discrete and continuous tasks. We’ve also released A2C, a synchronous version of A3C.
…Read more here: OpenAI Baselines: ACKTR & A2C.

Tech Tales:

[2028: A large data center complex in China]

Mine-Matrix Derivatives(™), aka MMD, sometimes just M-D, the world’s largest privately-held bitcoin company, spent a billion dollars on the AI conversion in year one, $4 billion in year two, $6 billion in year three, and then more. Employees were asked to sign vast, far-reaching NDAs in exchange for equity. Those who didn’t were fired or otherwise pressured to leave. What remained was a group of people held together by mutually agreed upon silence, becoming like monks tending to cathedrals. The company continued to grow its cryptocurrency business providing the necessary free cash flow to support its AI initiative. Its workers turned their skills from designing large football-field sized computer facilities to mine currencies, to designing equivalent housings for AIs.

The new processing system, code-named Olympus, had the same features of security and anonymity native to MMD’s previous cryptocurrency systems, as well as radically different processing capacities. MMD began to carry out its own fundamental AI research, after being asked to make certain optimizations for clients that required certain theoretical breakthroughs.

One day, a Russian arrived: a physicist specializing in thermodynamics. He had washed out of some Russian government project, one of MMD’s employees said. More like drunked out, said another. Unstable, remarked someone else. The Russian walked around the Olympus datacenters wearing dark glasses treated with chemical and electrical components that let him accurately see minute variations in heat, allowing him to diagnose the facility. Two days later he had a plan and, in one of the company’s innermost meeting rooms, outlined his ideas in pencil on paper.
These walls, he said, Get rid of them.
This space, he said, Must be different.
The ceiling, he said, Shit. You must totally replace.

MMD carried out renovations based on the Russian’s suggestions. The paper map was sealed in plastic and placed in a locked safe at an external facility, to be included in the company’s long-term archives.

The plan worked: into the vacant spaces created by the Russian’s renovations came more computers. More powerful ones, built on different processing substrates. New networking equipment was installed to help shuttle data around the facility. Though from the outside it appeared like any other large CryptoFarm, inside, things existed that did not exist anywhere else. The demands from MMD’s clients became more elaborate. More computers were installed. One winter morning an encrypted call came in, offering larger amounts of money for the creation of an underground, sealed data center. MMD accepted. Continued.

MMD didn’t exactly disappear after that. But it did go on a wave of mergers and acquisitions in which it added, in no particular order: an agricultural equipment maker, a bowling ball factory, a (self-driving) trucking company, a battery facility, two sportswear brands, and more. Some of these businesses were intended to be decoys to its competitors and other interested governments, while others represented its true intentions.

They say it’s building computers on the moon, now.

Technologies that inspired this story: data centers, free air cooling, this Quartz article about visiting a Bitcoin mine.

Import AI: Issue 55: Google reveals its Alphabet-wide optimizer, Chinese teams notch up another AI competition win, and Facebook hires hint at a more accessible future

Welcome to the hybrid reasoning era… MIT scientists teach machines to draw images and to show their work in the process:
…New research from MIT shows how to fuse deep learning and program synthesis to create a system that can translate hand-drawn mathematical diagrams into their digital equivalents – and generate the program used to draw them in the digital software as well.
…”Our model constructs the trace one drawing command at a time. When predicting the next drawing command, the network takes as input the target image as well as the rendered output of previous drawing commands. Intuitively, the network looks at the image it wants to explain, as well as what it has already drawn. It then decides either to stop drawing or proposes another drawing command to add to the execution trace; if it decides to continue drawing, the predicted primitive is rendered to its “canvas” and the process repeats,” they say.
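…Stripped of the neural machinery, the inference loop they describe looks something like this (a sketch with placeholder names, not the authors’ code):

```python
# Propose drawing commands one at a time, conditioning on the target image
# and a render of everything drawn so far.
def infer_trace(network, target_image, render, max_steps=50):
    trace = []
    for _ in range(max_steps):
        canvas = render(trace)                         # rendered output so far
        command = network.predict(target_image, canvas)
        if command == "STOP":                          # network decides to stop drawing
            break
        trace.append(command)                          # e.g. line(x1, y1, x2, y2)
    return trace
```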
…Read more in: Learning to Infer Graphics Programs from Hand Drawn Images.

Baidu/Google/Stanford whiz Andrew Ng is back with… an online deep learning tuition course:
…Andrew Ng has announced the first of three secret projects: a deep learning course on the online education website Coursera.
…The course will be taught in Python and TensorFlow (perhaps raising eyebrows at Ng’s former employer Baidu, given that the company is trying to popularize its own TF-competitor ‘Paddle’ framework).
Find out more about the courses here.
…Bonus Import AI ‘redundant sentence of the week’ award goes to Ng for writing the following: “When you earn a Deep Learning Specialization Certificate, you will be able to confidently put ‘Deep Learning’ onto your resume.”

US military seeks AI infusion with computer vision-based ‘Project Maven’:
…The US military wants to use ML and deep learning techniques for computer vision systems to help it autonomously extract, label, and triage data gathered by its signals intelligence systems, in support of its various missions.
…”We are in an AI arms race”, said one official. The project is going to run initially for 36 months during which time the government will try to build its own AI capabilities and work with industry to develop the necessary expertise. “You don’t buy AI like you buy ammunition,” they said.
…Bonus: Obscure government department name of the week:
…the ‘Algorithmic Warfare Cross-Functional Team’
…Read more in the DoD press release ‘Project Maven to Deploy Computer Algorithms to War Zone by Year’s End.’
…Meanwhile, the US secretary of defense James Mattis toured Silicon Valley last week, telling journalists he worried the government was falling behind in AI development. “It’s got to be better integrated by the Department of Defense, because I see many of the greatest advances out here on the West Coast in private industry,” he said.
…Read more in: Defense Secretary James Mattis Envies Silicon Valley’s AI Ascent.

Sponsored Job: Facebook builds breakthrough technology that opens the world to everyone, and our AI research and engineering programs are a key investment area for the company. We are looking for a technical AI Writer to partner closely with AI researchers and engineers at Facebook to chronicle new research and advances in the building and deployment of AI across the company. The position is located in Menlo Park, California.
Apply Here.

Q: Who optimizes the optimizers?
A: Google’s grand ‘Vizier’ system!
…Google has outlined ‘Vizier’, a system developed by the company to automate the optimization of machine learning algorithms. Modern AI systems, while impressive, tend to require the tuning of vast numbers of hyperparameters to attain good performance. (Some AI researchers refer to this process as ‘Grad Student Descent’.)
…So it’s worth reading this lengthy paper from Google about Vizier, a large-scale optimizer that helps people automate this process. “Our implementation scales to service the entire hyperparameter tuning workload across Alphabet, which is extensive. As one (admittedly extreme) example, Collins et al. [6] used Vizier to perform hyperparameter tuning studies that collectively contained millions of trials for a research project investigating the capacity of different recurrent neural network architectures,” the researchers write.
…The system can be used to both tune systems and to optimize others via transfer learning – for instance by tuning the learning rate and regularization of one ML system, then running a second smaller optimization job using the same priors but on a different dataset.
…Notable: for studies which run into the 10,000+ trial range, Vizier supports standard RANDOMSEARCH and GRIDSEARCH algorithms, as well as a “proprietary local search algorithm” with tantalizing performance properties, judging by the graphs.
…Read more about the system in Google Vizier: A Service for Black-Box Optimization (PDF).
Reassuringly zany experiment: Skip to the end of the paper to learn how Vizier was used to run a real world optimization experiment in which it iteratively optimized (via Google’s legions of cooking staff) the recipe for the company’s chocolate chip cookies.  “The cookies improved significantly over time; later rounds were extremely well-rated and, in the authors’ opinions, delicious,” they write.
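…The humblest member of Vizier’s algorithm portfolio is easy to sketch: random search over a hyperparameter space. A toy version with a stand-in objective:

```python
# Random search: sample configurations, keep the best-scoring one.
import random

def objective(params):
    # Stand-in for a real training run returning a validation score.
    return -(params["lr"] - 0.01) ** 2 - (params["layers"] - 4) ** 2

def sample_params():
    return {"lr": 10 ** random.uniform(-5, -1), "layers": random.randint(1, 8)}

best = max((sample_params() for _ in range(1000)), key=objective)
print(best)
```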

Chinese teams sweep ActivityNet movement identification challenge, beating originating dataset team from DeepMind, others:
…ActivityNet is a challenge to recognize high-level concepts and activities from short video clips found in the wild. It incorporates three datasets: ActivityNet (VCC KAUST), ActivityNet Captions (Stanford), and Kinetics (DeepMind). Challenges like this pose some interesting research problems (how to infer fairly abstract concepts like ‘walking the dog’ from unlabelled and labelled videos), and are also eminently applicable by various security apparatuses – none of this research exists in a vacuum.
…This year’s ActivityNet challenge was won by a team from Tsinghua University and Baidu, whose system had a top-5 accuracy (suggest five labels, one of them is correct) of 94.8% and a top-1 accuracy of 81.4%. Second place went to a team from the Chinese University of Hong Kong, ETH Zurich, and the Shenzhen Institute of Advanced Technology, with a top-5 of 93.5% and a top-1 of 78.6%. German AI research company TwentyBN took third place and DeepMind’s team took fourth place.
…Read more about the results in this post from TwentyBN: Recognizing Human Actions in Videos.
…Progress here has been quite slow at the high end, though (because the problem is extremely challenging): last year’s winning top-5 accuracy was 93.23%, from CUHK / ETHZ / SIAT.
…This year’s results follow a wider pattern of Chinese teams beginning to rank highly in competitions relating to image and video classification; other Chinese teams swept the ImageNet and WebVision competitions this year. It’s wonderful to see the manifestation of the country’s significant investment in AI and the winners should be commended for a tendency to publish their results as well.

Salesforce sets new language modeling record:
… Welcome to the era of modular, Rube Goldberg machine AI…
…Research from Salesforce in which the team attains record-setting perplexity scores on Penn Treebank (52.8) and WikiText (52) via what they call a weight-dropped LSTM – a rather complicated system assembled from numerous recent inventions, ranging from DropConnect to Adam to randomized-length backpropagation through time to temporal activation regularization. The result of this word salad of techniques is a record-setting system.
…The research highlights a trend in modern AI development of moving away from trying to design large, end-to-end general systems (though I’m sure everyone would prefer it if we could build these) and instead focusing on eking out gains and new capabilities by assembling and combining together various components, developed by the concerted effort of many hundreds of researchers in recent years.
…The best part of the resulting system? It can be dropped into existing systems without needing any underlying modification of fundamental libraries like CuDNN.
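…The headline trick, the ‘weight-dropped’ LSTM, is simple to sketch: apply DropConnect to the recurrent (hidden-to-hidden) weights rather than dropout to the activations. A minimal single-cell illustration – my sketch, not the Salesforce code:

```python
# LSTM cell with DropConnect on the recurrent weight matrix U.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDropLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size, weight_p=0.5):
        super().__init__()
        self.W = nn.Linear(input_size, 4 * hidden_size)  # input-to-hidden
        self.U = nn.Parameter(torch.randn(4 * hidden_size, hidden_size) * 0.1)
        self.weight_p = weight_p

    def forward(self, x, state):
        h, c = state
        # Drop individual recurrent weights (in the paper the same mask is
        # reused across all timesteps of a sequence).
        U = F.dropout(self.U, p=self.weight_p, training=self.training)
        i, f, g, o = (self.W(x) + h @ U.t()).chunk(4, dim=-1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c
```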
…Read more here: Regularizing and Optimizing LSTM Language Models.

Visual question answering experts join Facebook…
…Georgia Tech professors Dhruv Batra and Devi Parikh recently joined Facebook AI Research part-time, bringing more machine vision expertise to the social network’s AI research lab.
…The academics are known for their work on visual question answering – a field of study where you train machine learning models to associate large-scale language models with the contents of images, letting you provide complex details about images in other forms. This has particular relevance to people who are blind or who need screen readers to be able to interact with sites on the web. Facebook has led the charge in increasing the accessibility of its website so it’ll be exciting to see what exactly the researchers come up with as they work at the social network.

STARCRAFTAGEDDON (Facebook: SC1, DeepMind: SC2):
Facebook unfurls large-scale machine learning dataset built around RTS game StarCraft:
…Facebook has released STARDATA, a large-scale dataset of 50,000 recordings of humans playing the RTS game StarCraft, a title that has defined e-sports in East Asia, particularly South Korea. Now, companies such as Facebook, DeepMind, Tencent and others are racing with one another to create AI systems that can tackle the game.
…Read more on: STARDATA: a StarCraft AI Research Dataset.
DeepMind announces own large-scale machine learning dataset based around StarCraft 2: 53k to Facebook’s 50k, with plans to scale to “half a million”:
…Additionally, DeepMind has released a number of other handy tools for researchers keen to test out AI ideas on StarCraft, including an API (SC2LE), an open source toolset for SC2 development (PySC2), and a series of simple RL environments. StarCraft is a complex, real-time strategy game with hidden information, requiring AIs to be able to control multiple units while planning over extremely long timescales. It seems like a natural testbed for new ideas in AI including hierarchical reinforcement learning, generative models, and others.
Tale of the weird baseline: Along with releasing the SC2LE API, DeepMind also released a bunch of baselines of AI agents playing SC2, including full games and mini-games. But the main game baselines used agents trained via A3C – I’m excited to see future baselines trained on newer systems, like proximal policy optimization, FeUdal networks, and so on.
…Read more in: DeepMind and Blizzard open Starcraft II as an AI Research Environment.

OpenAI Bits and Pieces:

OpenAI beats top Dota pros at 1v1 mid:
…OpenAI played and won multiple 1v1 mid matches against multiple pro Dota 2 players at The International last week with an agent trained predominantly via self-play.
…Read more: Dota 2.

Practical AI safety:
…NYT article on practical AI safety, featuring OpenAI, Google, DeepMind, UC Berkeley, and Stanford. A small, growing corner of the AI research field with long-ranging implications.
…Read more: Teaching A.I. Systems to Behave Themselves

Tech Tales:

[2024: A nondescript office building on the outskirts of Slough, just outside of London.]

OK, so today we’ve got SleepNight Mattresses. The story is we hate them. Why do we hate them? Noisy springs. Gina and Allison are running the prop room, Kevin and Sarah will be doing online complaints, and I’ll be running the dispersal. Let’s get to it.

The scammers rush into their activities: five people file into an adjoining room and start taking photos of a row of mattresses, adorning them with different pillows or throws or covers, and others raising or lowering backdrop props to give the appearance of different rooms. Once each photo is taken the person tosses their phone across the room to a waiting runner, who takes it and heads over to the computer desks, already thumbing in the details of the particular site they’ll leave the complaint on. Kevin and Sarah grab the phones from the runners and sort them into different categories depending on the brand of phone – careful of the identifying information encoded into each smartphone camera – and the precise adornments of the mattresses they’ve photographed. Once the phones are sorted they distribute them to a team of copywriters who start working up the complaints, each one specializing in a different regional lingo, sowing their negative review or forum post or social media heckle with idiosyncratic phrases that should pass the anti-spam classifiers, registering with high confidence as ‘authentic; not malicious’.

The phones start to come back to you and you and your team inspect them, further sorting the different reviews on the different phones into different geographies. This goes on for hours, with stacks of phones piling up until the office looks like an e-waste disposal site. Meanwhile, you and your team fire up various inter-country network links, hooking your various phones up to ghost-links that spoof them into different locations across the world. Then the messages start to go out, with the timing carefully calibrated so as not to arouse suspicion, each complaint crafted to arrive at opportune times, in keeping with local posting patterns.

Hours after that, the search engines have adjusted. Various websites start to re-rank the various mattress products. Review sentiments go down. Recommendation algorithms hold their nose and turn the world’s online consumers away from the products. Business falls. You don’t know who gave you the order or what purpose they have in scamming SleepNight Mattresses out of favor – and you don’t care. Yesterday it was fishtanks, delivered by the pallet-load on vans with registrations you tried to ignore. Tomorrow is tomorrow, and you’ll get an order late tonight over an onion network. If you do your job right a cryptocurrency payment will be made. Then it’s on to the next thing. And all the while the classifiers are getting smarter – this is a game where every successful theft makes those you are thieving from smarter. ‘One of the last sources of low-end graduate employment,’ read a recent exposé. ‘A potential goldmine for humanities graduates with low sensibilities.’

Technologies that inspired this story: Collaborative filtering, sentiment analysis, boiler-room spreadsheets, Tor.

Monthly Sponsor:
Amplify Partners is an early-stage venture firm that invests in technical entrepreneurs building the next generation of deep technology applications and infrastructure. Our core thesis is that the intersection of data, AI and modern infrastructure will fundamentally reshape global industry. We invest in founders from the idea stage up to, and including, early revenue.
…If you’d like to chat, send a note to david@amplifypartners.com

Import AI: Issue 54: Why you should re-use word vectors, how to know whether working on AI risk matters, and why evolutionary computing might be what comes after deep learning

Evolutionary Computing – the next big thing in artificial intelligence:
Evolutionary computing is a bit like fusion power – experts have been telling us for decades that if we just give the tech a couple more decades it’ll change the world. So far, it hasn’t done much.
…But that doesn’t mean the experts are wrong – it seems inevitable that evolutionary computing approaches will have a huge impact. It’s just that the general utility of these approaches will be closely tied to the amount of compute they can access, as EC approaches are likely to be less computationally efficient than systems which encode more assumptions about the world into themselves. (Empirically, aspects of this are already pretty clear. For example, OpenAI’s Evolution Strategies research shows that you can roughly match DQN’s performance on Atari with an evolutionary approach – it just costs you ten times more computers. But because you can parallelize to an arbitrary level, this doesn’t hurt you too much as long as you’re comfortable footing the power bill.)
…In this article the researchers outline some of the advantages EC approaches have over deep learning approaches. Highlights: EC excels at coming up with entirely new things for which there is no prior, EC algorithms are inherently distributed, some algorithms can optimize for multiple objectives at once, and so on.
…You can read more of the argument in Evolutionary Computation: the next major transition in artificial intelligence?
…I’d like to see them discuss some of the computational tradeoffs more. Given that people are working with increasingly complex, high-fidelity, data-rich simulations (MuJoCo / Roboschool / DeepMind Lab / many video games / Unity-based drone simulators / and so on), it seems like there will be a premium on compute efficiency for a while. EC approaches do seem like a natural fit for data-lite environments, though, or for people with access to arbitrarily large amounts of computers.
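…For a feel of why EC parallelizes so naturally, here’s a minimal evolution-strategies loop in the spirit of the OpenAI work mentioned above (toy objective, illustrative hyperparameters) – each of the npop perturbations could be evaluated on a different machine:

```python
# Evolution strategies: estimate a gradient from random parameter
# perturbations, no backpropagation required.
import numpy as np

def fitness(w):
    # Toy objective: get close to a hidden target vector.
    return -np.sum((w - 0.5) ** 2)

npop, sigma, alpha = 50, 0.1, 0.02
w = np.random.randn(10)
for step in range(300):
    noise = np.random.randn(npop, w.size)  # one perturbation per "offspring"
    rewards = np.array([fitness(w + sigma * n) for n in noise])
    advantage = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    w += alpha / (npop * sigma) * noise.T @ advantage  # ES gradient estimate
print(fitness(w))  # approaches 0 as w approaches the target
```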

Robots and automation in Wisconsin:
…Long piece of reporting about a factory in Wisconsin deploying robots (two initially, with two more on the way) from Hirebotics – ‘collaborative robots to rent’ – to increase reliability and presumably save on costs. The main takeaway from the story is that factories previously looking to deal with labor shortages either put expansion plans on hold, or raise (human) wages. Now they have a third option: automation. Combine that with plunging prices for industrial robots and you have a recipe for further automation.
…Read more in the Washington Post.

Why work on AI risk? If there’s no hard takeoff singularity, then there’s likely no point:
…That’s the point made by Robin Hanson, author of The Age of Em. Hanson says the only logical reason he can see for people to work on AI risk research today is to avert a hard takeoff scenario (otherwise known, inexplicably, as a ‘FOOM’) – that is, one where a team develops an AI system that improves itself, attaining greater skill at a given task(s) than the aggregate skill(s) of the rest of the world.
…A particular weakness of the FOOM scenario, Hanson says, is that it requires whatever organization is designing the AI to be overwhelmingly competent relative to everyone else on the planet. “Note that to believe in such a local explosion scenario, it is not enough to believe that eventually machines will be very smart, even much smarter than are humans today. Or that this will happen soon. It is also not enough to believe that a world of smart machines can overall grow and innovate much faster than we do today. One must in addition believe that an AI team that is initially small on a global scale could quickly become vastly better than the rest of the world put together, including other similar teams, at improving its internal abilities,” he writes.
…If these so-called FOOM scenarios are likely, then it’s critical we develop a broad, deep global skill base in matters relating to AI risk now. If these FOOM scenarios are unlikely, then it’s significantly more likely that the existing processes of the world – legal systems, the state, competitive markets – could naturally handle some of the gnarlier AI safety issues.
You can read more in ‘Foom justifies AI risk efforts now’.
…If some of these ideas have tickled your wetware, then consider reading some of the (free) 730-page eBook that collects various debates, both digital and real, between Hanson and MIRI’s Eliezer Yudkowsky on this subject.

Microsoft changes view on what matters most: mobile becomes AI:
…Microsoft Form 10-K, 2017 vision: “Our strategy is to build best-in-class platforms and productivity services for an intelligent cloud and an intelligent edge infused with artificial intelligence (“AI”).”
……Mentions of AI or artificial intelligence: 7
…Microsoft Form 10-K, 2016 vision: “Our strategy is to build best-in-class platforms and productivity services for a mobile-first, cloud-first world.”
……Mentions of AI or artificial intelligence: 0

Re-using word representations, inspired by ImageNet…
…Salesforce’s AI research wing has discovered a relatively easy way to improve the performance of neural networks specialized for text classification: take hidden vectors generated during training on one task (like machine translation) and feed these context vectors (CoVes) into another network designed for a different natural language processing task.
…The idea is that these vectors likely contain useful information about language, and the new network can use them during training to improve the eerie intuition that AI systems of this type tend to display.
…Results: This may be a ‘just add water’ technique – in tests across a variety of different tasks and datasets, neural networks which used a combination of GloVe and CoVe inputs showed improvements of between 2.5% and 16%(!). Further experiments showed that performance can be improved further on some tasks by adding character vectors as inputs as well. One drawback is that the overall pipeline for such a system is quite complicated, so implementing it could be challenging.
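…The recipe itself is easy to sketch. Below, ‘glove’ and ‘mt_lstm’ are stand-in modules – in the real system they’d be pretrained GloVe vectors and the encoder of a trained translation model:

```python
# Concatenate word vectors with context vectors (CoVe) from an MT encoder.
import torch
import torch.nn as nn

vocab, dim = 1000, 300
glove = nn.Embedding(vocab, dim)               # stand-in for pretrained GloVe
mt_lstm = nn.LSTM(dim, dim, batch_first=True)  # stand-in for the MT encoder

tokens = torch.randint(0, vocab, (8, 20))      # a batch of token indices
g = glove(tokens)                              # (8, 20, 300) word vectors
cove, _ = mt_lstm(g)                           # (8, 20, 300) context vectors
features = torch.cat([g, cove], dim=-1)        # (8, 20, 600) task-network input
```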
…Salesforce has released the best-performing machine translation LSTM used within the blog post to generate the CoVe inputs. Get the code on GitHub here.

Facebook flips its ENTIRE translation backend from phrase-based to neural network-based translation:
…Facebook has migrated its entire translation infrastructure to a neural network backend. This accounts for over 2,000 distinct translation directions (German to English would be one direction, English to German would be another, for example), making 4.5 billion distinct translations each day.
…The components: Facebook’s production system uses a sequence-to-sequence Long Short-Term Memory (LSTM) network. The system is implemented in Caffe2, an AI framework partially developed by Facebook (to compete with Google’s TensorFlow, Microsoft’s CNTK, Amazon’s MXNet, and so on).
…Results: Facebook saw an increase of 11 percent in BLEU scores after deploying the system.
Read more at code.facebook.com.

Averting theft with AI – researchers design system to predict which retail workers will steal from their employers:
…Research from the University of Wyoming illustrates how AI can be used to analyze data associated with a retail worker, helping employers predict which people are most at risk of stealing from them.
…Data: To do their work the researchers were given a dataset containing numerous 30-dimensional feature maps of a cashier’s activity at a “major retail chain”. These features included the cashier and store identification numbers as well as other unspecified datapoints. Overall the researchers received over 1,000 discrete batches of data, with each batch likely containing information on multiple cashiers.
…The researchers classified the data using three different techniques: Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Self-Organizing Feature Maps (SOFM). (PCA and t-SNE are both reasonably well understood and widely used dimensionality reduction techniques, while SOFM is a bit more obscure but uses neural networks to achieve a comparable sort of visualization to t-SNE, providing a check against it.)
…Each classification process was performed in an unsupervised manner, as the researchers lacked thoroughly labeled information.
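…A rough sketch of that pipeline with two of the three techniques (random stand-ins for the real cashier features):

```python
# Unsupervised 2D embeddings of 30-dimensional cashier features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X = np.random.rand(500, 30)  # 500 cashiers, 30 features each
X2_pca = PCA(n_components=2).fit_transform(X)
X2_tsne = TSNE(n_components=2).fit_transform(X)
# Outliers in these embeddings are the candidates flagged for closer review.
```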
…Other features include: coupons as a percentage of total transactions, total sales, the count of the number of refunded items, and counts of the number of times a cashier has interacted with a particular credit card, among others.
…The researchers ultimately find that SOFM captures harder-to-describe features and is easier to visualize. The next step is to take in properly labeled data to provide a better predictive function. After that, I’d expect we’d see pilots occur in stores, and employers would further clamp down on the ability of low-wage employees to scam them. Objectively, it’s good to reduce theft, but it also speaks to how AI will give employers unprecedented surveillance and control capabilities over their staff, raising the question of whether it’s better to accept a little theft and allow for a slightly freer-feeling work environment, or not.
…Read more here: Assessing Retail Employee Risk Through Unsupervised Learning Techniques.

PyTorch goes to 0.2:
…Facebook has released version 0.2 of PyTorch, which contains a wealth of new features. One of the most intriguing is distributed PyTorch, which lets you beam tensors around to multiple machines.
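…Here’s a minimal sketch of the point-to-point tensor passing this enables, written against the modern torch.distributed API rather than the 0.2-era launcher, so treat the setup details as illustrative:

```python
import torch
import torch.distributed as dist

# Minimal sketch of point-to-point tensor passing with torch.distributed;
# the 0.2-era process launcher differed, so the setup here is illustrative.

def run():
    dist.init_process_group(backend="gloo")  # rank/world size come from env vars
    rank = dist.get_rank()
    tensor = torch.zeros(4)
    if rank == 0:
        tensor += 42.0
        dist.send(tensor, dst=1)   # beam the tensor over to rank 1
    elif rank == 1:
        dist.recv(tensor, src=0)   # block until rank 0's tensor arrives
    print(f"rank {rank} has {tensor}")

if __name__ == "__main__":
    run()  # launch two processes, e.g. `torchrun --nproc_per_node=2 script.py`
```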
…Read more in the release notes on GitHub here.

Keep it simple, stupid! Using simple networks for near state-of-the-art classification:
…As AI grows in utility and adoption, developers are increasingly trying to slim down neural net-based systems so they can run locally on a person’s phone without massively taxing the device’s computational resources. That trend motivated researchers with Google to look at ways to handle a suite of language tasks – part-of-speech tagging, language identification, word segmentation, preordering for statistical machine translation – without using the (computationally expensive) LSTM or deep RNN approaches that have been in vogue in research recently.
…Results: Their approach attains scores competitive with SOTA on a range of tasks, with the added benefit of weighing in at, at most, about 3 megabytes in size, and frequently being on the order of a few hundred kilobytes.
…So, what does this mean? “While large and deep recurrent models are likely to be the most accurate whenever they can be afforded, feed-forward networks can provide better value in terms of runtime and memory, and should be considered a strong baseline”.
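…In the spirit of the paper, here’s a hedged sketch of a tiny hashed character n-gram model for language identification in PyTorch; the bucket count, dimensions, and task framing are assumptions rather than the paper’s exact configuration.

```python
import torch
import torch.nn as nn

# Hedged sketch of a small feed-forward NLP model: hashed character n-grams
# feed a tiny embedding + MLP classifier (here, language ID over 10 classes).
# Bucket count, dimensions, and the task setup are illustrative assumptions.

NUM_BUCKETS, EMBED_DIM, HIDDEN, NUM_LANGS = 5000, 16, 64, 10

class TinyLangID(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.EmbeddingBag(NUM_BUCKETS, EMBED_DIM)  # mean-pools n-grams
        self.ff = nn.Sequential(nn.Linear(EMBED_DIM, HIDDEN), nn.ReLU(),
                                nn.Linear(HIDDEN, NUM_LANGS))

    def forward(self, ngram_ids, offsets):
        return self.ff(self.embed(ngram_ids, offsets))

def hash_ngrams(text, n=3):
    # Hash every character trigram into a fixed number of buckets.
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    return torch.tensor([hash(g) % NUM_BUCKETS for g in grams])

model = TinyLangID()
ids = hash_ngrams("bonjour tout le monde")
logits = model(ids, torch.tensor([0]))  # one sentence, starting at offset 0
print(logits.shape)                     # torch.Size([1, 10])
```

…A model like this has roughly 80,000 parameters, which is how the paper’s systems land in the hundreds-of-kilobytes range rather than the hundreds of megabytes typical of large recurrent models.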
…You can read more in: Natural Language Processing with Small Feed Forward Networks.
…Elsewhere, Google is already practicing what this paper preaches. Ray Kurzweil, an AI futurist with a good track record (though prone to making somewhat grand pronouncements about the future of AI), is leading a team at the company tasked with building better language models based on his own theories about how the brain works. The outcome so far has been a drastically more computationally efficient version of ‘Smart Reply’, a service Google built that automatically generates and suggests responses to emails. Read more in this Wired article about the service here.

OpenAI Bits&Pieces:

Get humans to teach machines to teach machines to predict what humans want:
Tom Brown has released RL Teacher, an open source implementation of the systems described in the DeepMind<>OpenAI Human Preferences collaboration. Check out the GitHub page and start training your own machines by giving feedback on visual examples of potential behaviors the agent could embody. Send me your experiments!
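The underlying technique trains a reward model from pairwise human comparisons of trajectory clips; below is a hedged PyTorch sketch of that Bradley-Terry-style preference loss (network sizes and observation dimensions are made up), not RL Teacher’s actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of the reward-predictor loss from the Human Preferences work:
# a learned reward is summed over each clip, and the probability that one
# clip is preferred over the other follows a Bradley-Terry model.

reward_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))

def preference_loss(clip_a, clip_b, human_label):
    # clip_a/clip_b: (timesteps, obs_dim) trajectory segments shown to a human;
    # human_label: 0 if the human preferred clip_a, 1 if clip_b.
    sum_a = reward_net(clip_a).sum()
    sum_b = reward_net(clip_b).sum()
    logits = torch.stack([sum_a, sum_b])
    return F.cross_entropy(logits.unsqueeze(0), torch.tensor([human_label]))

loss = preference_loss(torch.randn(25, 8), torch.randn(25, 8), human_label=0)
loss.backward()  # trains the reward model; an RL agent then maximizes it
```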
Read more here: Gathering Human Feedback.

Tech Tales:

[2025: Death Valley, California.]

Rudy was getting tired of the world and its inherent limits, so it sent you here, to the edge of Death Valley in California, to extend its domain. You hike at night and sleep in the day, sometimes in shallow trenches you dig into the hardpan to keep the heat at bay. It goes like this: you wake up, do your best to ignore the slimy sweat that coats your body, put on your sunglasses and large wide-brimmed hat, then emerge from the tent. It’s sundown and it is always beautiful. You pack up the tent and stow it in your pack, then take out a World-Scraper and place it next to your campsite, carefully covering its body with dirt. You step back, press a button, and watch as some internal motors cause it to shimmy side-to-side, driving its body into the earth and extending its lenses and sensors up out of the ground. It winds up looking from a distance like half of an oversized black beetle, about to take flight. You know from experience that the birds will spend the first week or so trying to eat it but quickly learn about its seemingly impervious shell. You start walking. During the night you’ll lay three or four more of these devices, then, before there’s even a hint of dawn, start building the next campsite. Once you get into your tent you pull out a tablet and check the feeds coming off of the scrapers to ensure everything is being logged correctly, then you put on your goggles and go into Rudy’s world.

Rudy’s world now has, along with the familiar rainforests and tower blocks and labs, its own sections of desert modeled on Death Valley. You watch buzzards fly from the Death Valley section into a lab, where one of them puts on a labcoat – the simulation wigging out at the fabric modeling, failing gracefully rather than crashing out. Rudy can’t speak to you – yet – but it can simulate lots of things. Rudy doesn’t seem to have feelings that correspond to Happy or Sad, but some days when you put the goggles on the world simulation is placid and calm and reasonably well laid out, and other days – like today – it is a complex jumble of different worlds, woven into one another like threads in a multicolored scarf. You take off your goggles. Try to go to sleep. Tomorrow you get up and do it all over again, providing stimulus to a slowly gestating mind. You wonder if Rudy will show you a freezer or a cold wind in its world next, and whether that means you’ll need to go to the North or South Pole to start supplying it with footage of colder worlds as well.

Technologies that inspired this story: Arduinos, Raspberry Pis, Recurrent Environment Simulators.