Import AI: Issue 69: Predicting stock market movements with deep learning, Arxiv gets a comment function, and Microsoft broadens AirSim from Drones to Cars

Welcome to Import AI, subscribe here.

Arxiv gets its comment layer – will science benefit?
…Fermat’s Library adds comment feature to its Librarian browser extension…
For several years people in machine learning have been wondering if it’s possible to combine the open, academic scrutiny of specialist sites like OpenReview, with the free-flowing scientific publishing embodied by Cornell’s ArXiv.
  The answer appears to be yes: the new comment feature in Librarian lets academics openly comment on the work of others.
  “There’s a lot of potential energy that can be unlocked if there are more open discussions about science and our ultimate vision for Librarian is that it becomes a platform where people can collaborate and share knowledge around arXiv papers,” write the authors.
  Feature request: It’d be great to more seamlessly combine this arXiv comment layer with a website like Stephen Merity’s Trending arXiv, to be able to rapidly understand the views of experts on papers attracting a lot of attention.
Read more: Comments on arXiv papers.

From AirSim Import Cars:
…Microsoft adds car simulation to its open source world engine…
Microsoft has updated AirSim, the Unreal Engine-based software originally released by the company for training drones via reinforcement learning, to incorporate support for new ground environments, including traffic lights, parks, lakes, construction sites, and more.
Read more at the Microsoft blog.

Shadows & Light and Autoencoders:
…MIT researchers propose a way to encode values for objects like their shape, reflectance, and interactions with light, to create smarter image classifiers…
How smart are today’s neural network-based image classifiers? Not very; modern deep learning-based classifiers are very good at taking a bunch of pixel values and applying a label to this set of numbers, but these representations are so brittle that they generalize poorly and are vulnerable to exploits like adversarial examples. Some hope the solution is simply bigger models trained with more compute and data than today’s. That could be correct, but it’ll take a few more cranks of Moore’s Law (accelerated by the release of AI-specific ASICs) before we can test it.
  An alternative is to leap ahead of the representational capacity gleaned from more computers by instead adding a bit more a priori structure into the AI model. That’s the idea behind the Rendered Intrinsics Network (RIN) from researchers at MIT and DeepMind. RIN automatically disentangles an image into separate layers that encode predictions about the object’s shape, reflectance, and interactions with light. It uses several convolutional encoders and decoders to take an image, split it into its distinct parts – separating things like the shape of the object from the lighting conditions – then reassemble these disparate components into a model of the image. A massively oversimplified description of why this is a good idea: in de-constructing and re-constructing something you’re forced to learn some of its fairly subtle traits.
  “RIN makes use of unlabeled data by comparing its reconstruction to the original input image. Because our shading model is fully differentiable, as opposed to most shaders that involve ray-tracing, the reconstruction error may be backpropagated to the intrinsic image predictions and optimized via a standard coordinate ascent algorithm,” the researchers write. “RIN has one shared encoder for the intrinsic images but three separate decoders, so the appropriate decoder can be updated while the others are held fixed.”
  Data: The researchers generated data by taking a set of five basic shape primitives – cubes, spheres, cones, cylinders, and toruses – then rendering each of them in 500 different colors, with each shape viewed from 10 orientations. They tested their RIN on unlabeled objects, including a bunny and a teapot, attaining good results, though more work is needed to figure out whether the approach scales to real-world data.
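The shared-encoder / per-intrinsic-decoder training scheme described in the quote above can be sketched with a toy linear model. Everything here – the sizes, the additive recombination standing in for the paper’s differentiable shading, and the plain gradient step – is an illustrative assumption, not the paper’s implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 12, 4  # flattened "image" size, code size

# One shared encoder; one decoder each for shape, reflectance, lighting.
W_enc = rng.normal(scale=0.1, size=(H, D))
decoders = {k: rng.normal(scale=0.1, size=(D, H))
            for k in ("shape", "reflectance", "lighting")}

def reconstruct(x):
    """Encode once, decode each intrinsic layer, recombine additively
    (a stand-in for the paper's differentiable shading step)."""
    code = W_enc @ x
    layers = {k: W @ code for k, W in decoders.items()}
    return sum(layers.values()), code

def train_step(x, target, lr=0.01):
    """Push the reconstruction error into ONE decoder while the other
    decoders are held fixed, as in the quoted training scheme."""
    x_hat, code = reconstruct(x)
    err = x_hat - x  # self-supervised: the target is the input itself
    decoders[target] -= lr * np.outer(err, code)  # gradient step on one decoder
    return float(np.mean(err ** 2))

x = rng.normal(size=D)
losses = [train_step(x, "shape") for _ in range(50)]
print(losses[0] > losses[-1])  # reconstruction error should shrink
```

The point of the toy: only the `shape` decoder moves, yet the overall reconstruction still improves, which is exactly the “update the appropriate decoder while the others are held fixed” pattern.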
Read the research here: Self-Supervised Intrinsic Image Decomposition.

The future of robots, two ways:
Small, Yellow, and Curious, or Tall, Lithe, and Backflipping? Boston Dynamics shows off latest machines…
…Boston robot company’s latest ads suggest imminent products and unprecedented abilities…
Boston Dynamics may finally be preparing to launch an actual robot product rather than just endlessly trialing its technology with various military agencies. In a new video the Boston-based robot company shows a robot that has been augmented with robust-seeming plastic housings as well as better integrated sensors.
  Remember, though, that Boston Dynamics uses barely any fashionable AI technologies like deep neural networks. Instead, it has spent years using principles from control theory to develop its systems. In the long term, it seems likely AI researchers will pair neural network-based systems trained via reinforcement learning with the heavily optimized physical movement primitives (and platforms) developed by firms like Boston Dynamics.
Watch more here: The New SpotMini (YouTube).
There’s another potential product on the way as well, in the form of the latest design of the company’s ‘Atlas’ robot. Like SpotMini, this version of Atlas features far more carefully shaped and ‘consumerized’ parts, though it’s decidedly more rough and lab-bench-like in appearance than its quadruped brethren.
  The robot does have some moves, though, as demonstrated in a separate video by Boston Dynamics showing the robot first jumping between separate blue blocks, then jumping up onto a slightly higher block, then backflipping (!) onto a (somewhat flexible) floor.
To see the backflip, watch Boston Dynamics’ ‘What’s new, Atlas?’ video here (YouTube).

My data beats your resolution:
…Stanford University AI system uses freely available Landsat data to predict Asset Wealth Index values from satellite imagery…
Stanford researchers have used residual networks with dilated convolutions to train classifiers that can efficiently use large amounts of multi-spectral low-resolution data, beating their own prior baselines which were trained on significantly higher resolution data in a narrower spectral band.
  The researchers show that they can use an ensemble of Landsat satellite data with a resolution of 15-30m/px to beat a baseline trained on higher resolution 2.5m/px data from Google (think of this as the difference between being able to (roughly) count cars in a parking lot, versus counting planes on a jetway).
  The researchers use dilated convolutions to vary the receptive field of the networks (18- and 34-layer ResNets and VGG-F), letting the classifier incorporate data from multiple resolutions, versus the fixed resolution of Google’s high-resolution images.
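The receptive-field trick is easiest to see in one dimension. A minimal numpy sketch (the function names and the 1-D setting are illustrative; the paper uses 2-D convolutions inside ResNet/VGG-F backbones):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """'Valid' 1-D convolution with dilation - 1 skipped inputs
    between kernel taps."""
    k = len(kernel)
    span = (k - 1) * dilation + 1  # effective footprint on the input
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the
    given per-layer dilation rates."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

x = np.arange(8.0)
print(dilated_conv1d(x, [1.0, 1.0], dilation=3))  # sums pairs 3 apart
print(receptive_field(3, [1, 2, 4, 8]))           # 31: grows fast with depth
```

Doubling the dilation at each layer grows the receptive field exponentially with depth while the parameter count grows only linearly, which is what lets a classifier mix information from coarse and fine spatial scales.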
Read more here: Poverty Prediction With Public Landsat 7 Satellite Imagery and Machine Learning.
   (Many other companies are experimenting with training convolutional neural network-based classifiers on modern satellite imagery: Facebook has predicted where people live, and Orbital Insight has predicted retail trends by monitoring parking lots full of cars; the world is learning to see itself.)

Training ImageNet in 15 minutes (with over 1,000 NVIDIA GPUs):
Being able to access and effectively use large amounts of compute will be to AI research what access to large amounts of well-labelled data is to AI product development…
Japanese AI startup Preferred Networks has successfully trained an ImageNet model to accuracies competitive with the state of the art in 15 minutes.
  For those not following the ‘how fast can you train ImageNet’ contest, a refresher:
July, 2017: Facebook trains an ImageNet model in ~1 hour using 256 GPUs.
November, 2017: Preferred Networks trains ImageNet in ~15 minutes using 1024 NVIDIA P100 GPUs.

Using deep learning to predict stock price movements!
…Backtesting shows promising results for stock prediction approach…
Researchers have shown it’s possible to (theoretically) generate good returns with stock market data using deep learning techniques.
  Two notable things about this:
  1) It provides further evidence that today’s basic AI tools, when scaled up and fed with decent data, are capable of performing credibly difficult tasks, like making accurate predictions in the stock market.
  2) Since this exists, it confirms most people’s intuitions that large quant shops like Renaissance, MAN Group, and Two Sigma have been exploring techniques like this in private for commercial gain.
     Now researchers with Euclidean, a financial technology firm, and Amazon AI / CMU have outlined a system trained on data from 11,815 stocks that were publicly traded on the NYSE, NASDAQ, or AMEX exchanges for at least 12 consecutive months between January 1970 and September 2017. (Excluded stocks: non-US-based companies, financial-sector companies, and any company with an inflation-adjusted market capitalization below 100 million dollars.) The data comes from the Compustat North America and Compustat Snapshot databases.
  The system uses multi-task learning to predict future stock performance, normalizing all the stocks into the same data format and then forecasting 16 future fundamentals for each stock, including trailing-twelve-month revenue, cost of goods sold, and EBIT, as well as quarterly measures like property, plant, and equipment; debt in current liabilities; accounts payable; and taxes payable.
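A toy sketch of the normalization step – putting every company’s fundamentals on a common scale so one shared model can train across all stocks. The feature values and the z-score choice are illustrative assumptions, not the paper’s exact pipeline:

```python
import numpy as np

# Hypothetical fundamentals matrix: one row per company, one column per
# feature (say TTM revenue, cost of goods sold, EBIT, in $M).
# The numbers are made up for illustration.
fundamentals = np.array([
    [1200.0,  800.0, 150.0],
    [  90.0,   40.0,  12.0],
    [5600.0, 3100.0, 900.0],
    [ 430.0,  260.0,  55.0],
])

def normalize(X):
    """Z-score each feature across companies so stocks of wildly
    different sizes land in the same data format."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

Z = normalize(fundamentals)
print(Z.mean(axis=0).round(6), Z.std(axis=0).round(6))  # ~0 and ~1 per feature
```

Without a step like this, a single network would mostly learn company size rather than the cross-company patterns in how fundamentals evolve.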
  The results: “Our results demonstrate a clear advantage for the lookahead factor model. In nearly all months, however turbulent the market, neural networks outperform the naive predictor (that fundamentals remain unchanged). Simulated portfolios [using] lookahead factor strategies with MLP and RNN perform similarly, both beating traditional factor models,” they write.
Read more: Improving Factor-Based Quantitative Investing By Forecasting Company Fundamentals.

Less precision for future compute savings:
Intel-Nervana detail ‘Flexpoint’ data format for variable precision training of deep neural nets, letting you train 16-bit (sort of) precision networks with performance roughly equivalent to 32-bit ones…
Intel-Nervana has proposed combining fixed point and floating point arithmetic to implement a new data format, Flexpoint, that lets you train networks with reduced precision without a huge performance tradeoff.
  “Flexpoint is based on tensors with an N-bit mantissa storing an integer value in two’s complement form, and an M-bit exponent e, shared across all elements of a tensor. This format is denoted as flexN+M. Fig. 1 shows an illustration of a Flexpoint tensor with a 16-bit mantissa and 5-bit exponent, i.e. flex16+5 compared to 32-bit and 16-bit floating point tensors. In contrast to floating point, the exponent is shared across tensor elements, and different from fixed point, the exponent is updated automatically every time a tensor is written,” the authors write.
  The flex16+5 format appears to work as expected, with Intel-Nervana training neural nets with equivalent performance to 32-bit variants (whereas stock 16-bit tends to lead to a slight relative fall in accuracy).
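The shared-exponent idea can be sketched in a few lines of numpy. This is an illustrative encoding under assumptions (exponent derived from the tensor’s current max value, simple round-to-nearest); the real format also manages the exponent automatically across training iterations:

```python
import numpy as np

def to_flex(tensor, mantissa_bits=16):
    """Encode a float tensor as integer mantissas plus ONE shared exponent
    (flex16+5-style: here the exponent is just a Python int)."""
    max_int = 2 ** (mantissa_bits - 1) - 1  # two's complement positive max
    max_abs = float(np.max(np.abs(tensor)))
    # Smallest shared exponent e such that max_abs / 2**e fits in max_int.
    e = int(np.ceil(np.log2(max_abs / max_int))) if max_abs > 0 else 0
    mantissas = np.round(tensor / 2.0 ** e).astype(np.int32)
    return mantissas, e

def from_flex(mantissas, e):
    """Decode back to floats: mantissa * 2**shared_exponent."""
    return mantissas.astype(np.float64) * 2.0 ** e

x = np.array([0.001, -1.5, 3.14159, 100.0])
m, e = to_flex(x)                            # one exponent for the whole tensor
print(np.max(np.abs(x - from_flex(m, e))))   # error bounded by 2**(e - 1)
```

The trade-off is visible in the example: the largest element pins the shared exponent, so tiny elements like 0.001 lose precision (here it quantizes to zero) – the bet is that elements within one tensor have similar magnitudes during training.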
   In the next few years we’re likely going to see various companies launching more specialized hardware for AI processing, some of which will implement 16-bit precision (or less) natively, so software techniques like this will likely become more prevalent.
Read more here: Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks.

Tech Tales:

[A flat in Deptford, London, United Kingdom. 2026.]

So you’re walking round your house aimlessly doing dishes and listening to the radio when you start to compose The Rant. It’s a rant about society and certain problems that you perceive both with yourself and with other people. It’s also a rant about how technology narrows the distance between your own brain and the brain of everyone else in the world to the point you feel your emotions are now contingent on the ‘mood of the internet’. This doesn’t please you.

So after spending close to an hour verbally composing this essay and having synthesized voices speak it back to you and synthesized dream-AI actors carry out dramatized versions of it, you prepare to post it to the internet.

But when you submit it to your main social network platform the post is blocked; you stare at an error message displayed in cheerful pink, with an emoji of a policeman-like figure holding a ‘Stop’ sign. Posi Vibes Only! the warning says. Try putting in some more cheerful words or phrases. Maybe tag a friend? it suggests.

You frown and try to outsmart it. First you embed bits of your rant as text overlaying images, but when you go to submit these to the network it only lets a percentage of them through, blocking some and hiding your message, changing it to one of hope, talk of ‘rising up’ and ‘growing comfortable with the world’ – a spliced-up, distorted version of your position. You record a basic audio file and upload it, and the same thing happens: your virtual personality ends up praising (instead of critiquing) the super-structure. Of course you tell your real friends about your views, but what’s the point of that? They end up caught in the same digital traps, able to talk to other people in the real world but unable to transmit their message of sadness and rebellion to the larger mass. POSI VIBES ONLY~!