Import AI: Issue 59: How TensorFlow is changing the AI landscape and forging new alliances, better lipreading via ensembling multiple camera views, and why political scientists need to wake up to AI

Making Deep Learning interpretable for finance:
…One of the drawbacks of deep learning approaches is their relative lack of interpretability – they can generate awesome results, but getting fine-grained details about why they’ve picked a particular answer can be a challenge.
…Enter CLEAR-Trade, a system developed by Canadian researchers to make such systems more interpretable. The basic idea is to create different attentive response maps for the different predicted outcomes of a model (stock market is gonna go up, stock market is gonna fall). These maps are used to generate two things: “1) a dominant attentive response map, which shows the level of contribution of each time point to the decision-making process, and 2) a dominant state attentive map, which shows the dominant state associated with each time point influencing the decision-making process.” This lets the researchers infer fairly useful correlations, like a given algorithm’s sensitivity to trading volume when making a prediction on a particular day, and can help pinpoint flaws, like an over-dependence on a certain bit of information when making faulty predictions. The CLEAR-Trade system feels very preliminary and my assumption is that in practice people are going to use far more complicated models to do more useful things, or else fall back to basic well understood statistical methods like decision trees, logistic regression, and so on.
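To make the two maps a little more concrete, here is a minimal sketch of how they could be read off once you have one attentive response value per predicted outcome per time point. The response numbers, the `responses` dictionary and the `dominant_maps` helper are all made up for illustration, and taking a simple max across outcomes is my assumption about what "dominant" means here; the paper's actual computation may well differ.

```python
# Hypothetical per-outcome attentive responses: one list per predicted outcome,
# one value per time point (e.g. one trading day). The numbers are invented.
responses = {
    "up":   [0.10, 0.70, 0.20, 0.05],
    "down": [0.30, 0.20, 0.60, 0.05],
}

def dominant_maps(responses):
    """Return (dominant_response, dominant_state), one entry per time point."""
    classes = list(responses)
    n_steps = len(responses[classes[0]])
    dominant_response, dominant_state = [], []
    for t in range(n_steps):
        # Assumption: the dominant outcome is the one with the strongest
        # response at this time point (ties go to the outcome listed first).
        best = max(classes, key=lambda c: responses[c][t])
        dominant_response.append(responses[best][t])
        dominant_state.append(best)
    return dominant_response, dominant_state

strength, state = dominant_maps(responses)
for t, (s, c) in enumerate(zip(strength, state)):
    print(f"t={t}: dominant state={c}, response={s:.2f}")
```

Read this way, the first list says how much each time point mattered and the second says which outcome it was pushing towards, which is the kind of view you would want when checking whether a model leans too heavily on a single day.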
Notably interesting performance: Though the paper focuses on laying out the case for CLEAR-Trade, it also includes an experiment where the researchers train a deep convolutional neural network on the last three years of S&P 500 stock data, then get it to predict price movements. The resulting model is correct in its predictions 61.2% of the time – which strikes me as a weirdly high baseline (I’ve been skeptical that AI will work when applied to the fizzing chaos of the markets, but perhaps I’m mistaken. Let me know if I am: jack@jack-clark.net)
…Read more here: Opening the Black Box of Financial AI with CLEAR-Trade: A CLass Enhanced Attentive Response Approach for Explaining and Visualizing Deep Learning-Driven Stock Market Prediction 

Political Scientist to peers: Wake up to the AI boom or risk impact and livelihood:
…Heather Roff, a researcher who recently announced plans to join DeepMind, has written a departing post on a political science blog frequented by herself and her peers. It’s a sort of Jerry Maguire letter (except as she’s got a job lined up there’s less risk of her being ‘fired’ for writing such a letter – smart!) in which Heather points out that AI systems are increasingly being used by states to do the work of political scientists and the community needs to adapt or perish.
…”Political science needs to come to grips with the fact that AI is going to radically change the way we not only do research, but how we even think about problems,” she writes. “Our best datasets are a drop in the bucket.  We almost look akin to Amish farmers driving horses with buggies as these new AI gurus pull up to us in their self-driving Teslas.  Moreover, the holders of this much data remain in the hands of the private sector in the big six: Amazon, Facebook, Google, Microsoft, Apple and Baidu.”
…She also points out that academia’s tendency to punish interdisciplinary cooperation among researchers by failing to grant tenure due to a lack of focus is a grave problem. Machine learning systems, she points out, are great at finding the weird intersections between seemingly unrelated ideas. Humans are great at this too and should do more of it.
…”We must dismiss with the idea that a faculty member taking time to travel to the other side of the world to give testimony to 180 state parties is not important to our work. It seems completely backwards and ridiculous. We congratulate the scholar who studies the meeting. Yet we condemn the scholar who participates in the same meeting.”
…Read more here: Swan Song – For Now. 

Why we should all be a hell of a lot less excited about AI, from Rodney Brooks:
…Roboticist-slash-curmudgeon Rodney Brooks has written a post outlining the many ways in which people mess up when trying to make predictions about AI.
…People tend to mistake the shiny initial application (eg, the ImageNet 2012 breakthrough) for being emblematic of a big boom that’s about to happen, Brooks says. This is usually wrong, as after the first applications there’s a period of time in which the technology is digested by the broader engineering and research community, which (eventually) figures out myriad uses for the technology unsuspected by its creators (GPS is a good example, Rodney explains. Other ones could be computers, internal combustion engines, and so on.)
…”We see a similar pattern with other technologies over the last thirty years. A big promise up front, disappointment, and then slowly growing confidence, beyond where the original expectations were aimed. This is true of the blockchain (Bitcoin was the first application), sequencing individual human genomes, solar power, wind power, and even home delivery of groceries,” he writes.
…Worse is people’s tendency to look at current progress and extrapolate from there. Brooks calls this “Exponentialism”. Many people adopt this position due to a quirk in the technology industry called ‘Moore’s Law’ – an assertion about the rate at which computing hardware gets cheaper and more powerful which held up well for about 50 years (though it is faltering now as chip manufacturers stare into the uncompromising face of King Physics). There are very few Moore’s Laws in technology – eg, such a law has failed to hold up for memory prices, he points out.
…”Almost all innovations in Robotics and AI take far, far, longer to get to be really widely deployed than people in the field and outside the field imagine. Self driving cars are an example.” (Something McKinsey once told me – it takes 8 to 18 years for a technology to go from being deployed in the lab to running somewhere in the field at scale.)
…Read more here: The Seven Deadly Sins of Predicting the Future of AI.

TensorFlow’s Success creates Strange Alliances:
…How do you solve a problem like TensorFlow? If you’re Apple and Amazon, or Facebook and Microsoft, you team up with one another to try to leverage each other’s various initiatives to favor one’s own programming frameworks against TF. Why do you want to do this? Because TF is a ‘yuge’ success for Google, having quickly become the default AI programming framework used by newbies, Googlers, and established teams outside of Google, to train and develop AI systems. Whoever controls the language of discourse around a given topic tends to influence the given topic hugely, so Google has been able to use TF’s popularity to insert subtle directional pressure on the AI field, while also creating a larger and larger set of software developers primed to use its many cloud services, which tend to require or gain additional performance boosts from using TensorFlow (see: TPUs).
…So, what can other players do to increase the popularity of their programming languages? First up is Amazon and Apple, who have decided to pool development resources to build systems to let users easily translate AI applications written in MXNet (Amazon’s framework) into CoreML, the framework Apple demands developers use if they want to bring AI services to MacOS, iOS, watchOS, and tvOS.
…Read more here: Bring Machine Learning to iOS apps using Apache MXNet and Apple Core ML.
…Next up is Facebook and Microsoft, who have created the Open Neural Network Exchange (ONNX) format, which “provides a shared model representation for interoperability and innovation in the AI framework ecosystem.” At launch, it supports CNTK (Microsoft’s AI framework), PyTorch (Facebook’s AI framework), and Caffe2 (also developed by Facebook).
…So, what’s the carrot and what is the stick for getting people to adopt this? The carrot so far seems to be the fact that ONNX promises a sort of ‘write once, run anywhere’ representation that lets frameworks that conform to the standard run on a variety of substrates. “Hardware vendors and others with optimizations for improving the performance of neural networks can impact multiple frameworks at once by targeting the ONNX representation,” Facebook writes. Now, what about the stick? There doesn’t seem to be one yet. I’d imagine Microsoft is cooking up a scheme whereby ONNX-compliant frameworks get either privileged access to early Azure services and/or guaranteed performance bumps by being accelerated by Azure’s fleet of FPGA co-processors, but that’s pure speculation on my part.
…Read more here: Microsoft and Facebook create open ecosystem for AI model interoperability.

Speak no evil: Researchers make BiLSTM-based lipreader that works from multiple angles… improves state-of-the-art…96%+ accuracies on (limited) training set…
Researchers with Imperial College London and the University of Twente have created what they say is the first multi-view lipreading system. This follows a recent flurry of papers in the area of AI+Lipreading, prompting some disquiet among people concerned about how such technologies may be used by the security state. (In the paper, the authors acknowledge this but also cheerfully point out that such systems could work well in office teleconferencing rooms with multiple cameras as well.)
…The authors train a bi-directional LSTM with an end-to-end encoder on the (fairly limited) OuluVS2 dataset. They find that their system gets a state-of-the-art score of around 94.7% when trained on one subset of the dataset containing single views on a subject, and performance climbs to 96.7% when they add in another view, before plateauing at 96.9% with the addition of a third view. After this they find negligible performance improvements from adding new data. (Note: Scores are the best score over ten runs, so lop a few percent off for the actual average score. You’ll also want to mentally reduce the scores by another (and this is pure guesswork/intuition on my part) 10% or so since the OuluVS2 dataset has fairly friendly uncomplicated backgrounds for the network to see the mouth against. You may even want to reduce the performance a little further still due to the simple phrases used in the dataset.)
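One way to see the plateau is to look at how much of the remaining error each extra view removes, rather than at the raw accuracy. The sketch below uses only the three best-of-ten figures quoted above; the `relative_error_reduction` helper is my own illustrative name, not something from the paper.

```python
# Best-of-ten accuracies (percent) quoted above, keyed by number of camera views.
accuracy_by_views = {1: 94.7, 2: 96.7, 3: 96.9}

def relative_error_reduction(acc_before, acc_after):
    """Fraction of the remaining error removed when accuracy rises."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return (err_before - err_after) / err_before

views = sorted(accuracy_by_views)
for prev, cur in zip(views, views[1:]):
    gain = relative_error_reduction(accuracy_by_views[prev], accuracy_by_views[cur])
    print(f"{prev} -> {cur} views: error falls by {gain:.1%}")
```

On these numbers the second view removes a bit over a third of the remaining error, while the third removes only around 6% of what is left, which is the diminishing-returns pattern described above.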
What we learned: Another demonstration that adding and/or augmenting existing approaches with new data can lead to dramatically improved performance. Given the proliferation of cheap, high-resolution digital cameras into every possible part of the world it’s likely we’ll see ‘multi-view’ classifier systems become the norm.
…Read more here: End-to-End Multi-View Lipreading.

Data augmentation via data generation – just how good are GANs at generating plants?
…An oft-repeated refrain in AI is that data is a strategic and limited resource. This is true. But new techniques for generating synthetic data are making it possible to get around some of these problems by augmenting existing datasets with newly generated and extended data.
…Case in point: ARGAN, aka Arabidopsis Rosette Image Generator (through) Adversarial Network, a system from researchers at The Alan Turing Institute, Forschungszentrum Julich, and the University of Edinburgh. The approach uses a DCGAN generative network to let the authors generate additional synthetic plants based on pictures of Arabidopsis and Tobacco plants from the CVPP 2017 dataset. The initial dataset consisted of around 800 images, which was expanded 30-fold after the researchers automatically augmented the data by flipping and rotating the pictures and performing other translations. They then trained a DCGAN on the resulting dataset to generate new, synthetic plants.
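For a feel of what the flip-and-rotate step buys you before any GAN gets involved, here is a rough sketch on a toy grid of pixel values. This is not the authors' pipeline: the helper names are hypothetical, and flips plus 90-degree rotations alone only give 8 variants per image, so the 30-fold expansion described above must rely on additional transforms not shown here.

```python
def rotate90(img):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in img]

def augment(img):
    """Return the 8 flip/rotation variants of img (the original included)."""
    variants = []
    current = [list(row) for row in img]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants

# Tiny stand-in for an image: a 2x3 grid of made-up pixel values.
image = [[1, 2, 3],
         [4, 5, 6]]
augmented = augment(image)
print(f"{len(augmented)} variants from 1 image")
```

Every variant is an exact rearrangement of the original pixels, so none of this adds genuinely new plants; that gap is what the DCGAN-generated images are meant to fill.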
The results: The researchers tested the usefulness of their additional generated data by testing a state-of-the-art leaf-counting algorithm on a subset of the Arabidopsis/Tobacco dataset, and on the same subset of the dataset augmented with the synthetic imagery (which they call Ax). The results are a substantial reduction in overfitting by the resulting trained system and, in one case, a reduction in training error as well. However, it’s difficult at this stage to work out how much of that is due to simply scaling up data with something roughly in the expected distribution (the synthetic images), rather than from how high-quality the DCGAN-generated plants are.
…Read more here: ARGAN: Synthetic Arabidopsis Plants using Generative Adversarial Network.

Amazon and Google lead US R&D spending:
…Tech companies dominate the leaderboard for R&D investment in the United States, with Amazon leading followed by Alphabet (aka Google), Intel, Microsoft, and Apple. It’s likely that a significant percentage of R&D spend for companies like Google and Microsoft goes into infrastructure and AI, while Amazon’s will be spread across these plus devices and warehouse/automation technologies, and Apple will likely concentrate more on devices and materials. Intel’s R&D spending is mostly for fabrication and process tech so is in a somewhat different sector of technology compared to the others.
…Read more here: Tech companies spend more on R&D than any other company in the US.

Tech Tales:

[2032: Detroit, USA.]

The wrecking crew of one enters like a ping pong ball into a downward-facing maze  – the entranceway becomes a room containing doors and one of them is able to be opened, so it bounces into it and goes through that door and finds a larger room with more doors and this time it can force open more than one of them. It splits into different pieces, growing stronger, and explores the castle of the mind of the AI, entering different points, infecting and wrecking where it can.

It started with its vision, they said. The classifiers went awry. Saw windmills in clouds, and people in shadows. Then it spread to the movement policies. Mechanical arms waved oddly. And not all of its movements were physical – some were digital, embodied in a kind of data ether. It reached out to other nearby systems – exchanged information, eventually persuaded them that flags were fires, clouds were windmills, and people were shadows. Data rots.

It spread and kept on spreading. Inside the AI there was a system that performed various meta-learning operations. The virus compromised that – tweaking some of the reward functions, altering the disposition of the AI as it learned. Human feedback inputs were intercepted and instead generative adversarial networks dreamed up synthetic outputs for human operators to look at, selecting what they thought were guidance behaviors that in fact were false flags. Inside the AI the intruder gave its own feedback on the algorithms according to its own goals. In this way the AI changed its mind.

Someone decides to shut it down – stop the burning. FEMA is scrambled. The National Guard are, eponymously, nationalized. Police, firefighters, EMTs, all get to work. But the tragedies are everywhere and stretch from the banal to the horrific – cars stop working; ATMs freeze; robots repeatedly clean the same patches of floors; drones fall out of the sky, beheading trees and birds and sometimes people on their way down; bridges halt, half up; ships barrel into harbors; and one recommender system decides that absolutely everyone should listen to Steely Dan. A non-zero percentage of everything that isn’t unplugged performs its actions unreliably, diverging from the goals people had set.

Recovery takes years. The ‘Geneva Isolation Protocol’ is drafted. AIs and computer systems are slowly redesigned to be modular, each system able to fully defend and cut off itself, jettisoning its infected components into the digital ether. Balkanization becomes the norm, not because of any particular breakdown, but due to the set-your-watch-by-it logic of emergent systems.