Import AI: Issue 31: Memories as maps & maps as memories, bot automation, and crypto-fintech-AI
by Jack Clark
ICML special administrative notice: Hello! Arxiv paper volume will increase this week due to a flood of ICML submissions. I’d like to try and analyse as many of them as possible and need some help – drop me a line if you want to work on a collaborative, AI paper project: firstname.lastname@example.org.
Can you hear me now? Computers learn to upscale audio: A group of Stanford researchers have taught computers to enhance the quality of audio. The system observes high-quality audio samples and corrupted samples, and is then trained using a residual network to identify the signals and infer the relationship between corrupt and clean audio. If you feed it some previously unheard corrupted audio it can make a good stab at upscaling it. The results are gently encouraging, with the system achieving good performance on speech and slightly less good performance on music. More an interesting proof-of-concept than a fall-out-of-your-chair result. “Our models are still training and the numbers in Table 1 are subject to change,” the authors note.
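To make the clean/corrupt pairing concrete, here is a minimal sketch of how such training pairs might be built and how an upscaler could be scored. Everything in it is an assumption for illustration: the sine-wave "clip", the downsampling factor, the linear-interpolation baseline and the signal-to-noise score are stand-ins, and the learned residual network itself is not shown.

```python
import math


def make_clean(n=64, freq=3.0):
    """A hypothetical 'high-quality' clip: one sine wave sampled n times."""
    return [math.sin(2 * math.pi * freq * i / n) for i in range(n)]


def corrupt(clean, factor=4):
    """Keep every factor-th sample to simulate a low-resolution recording."""
    return clean[::factor]


def upsample_linear(low, factor=4):
    """Baseline upscaler: linear interpolation back to the original length."""
    out = []
    for i, a in enumerate(low):
        b = low[i + 1] if i + 1 < len(low) else a  # hold the last sample
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    return out


def snr_db(clean, estimate):
    """Signal-to-noise ratio in decibels; higher means closer to the clean clip."""
    signal = sum(c * c for c in clean)
    noise = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10 * math.log10(signal / noise)


clean = make_clean()              # the high-quality half of a training pair
low = corrupt(clean)              # the corrupted half of the pair
baseline = upsample_linear(low)   # what a learned model would have to beat
print(f"baseline SNR: {snr_db(clean, baseline):.1f} dB")
```

A trained model would take the place of `upsample_linear` and would be judged on clips it had not seen during training.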
Image generation gets 100X faster thanks to algorithmic improvements: Last week we heard about a general purpose algorithmic improvement that could halve the cost of training deep neural networks. This week, a specific one comes along in the form of Fast PixelCNN++, which is able to achieve as much as a 183X speedup on the image generation component of PixelCNN++.
Brain-interface company Kernel grabs MIT talent to explore your cranium: Kernel, a “human intelligence” company started by entrepreneur Bryan Johnson, has acquired MIT spinout Kendall Research Systems. This acquisition, combined with the hiring of MIT brain specialists Ed Boyden and Adam Marblestone, gives Kernel more expertise in the field of brain interfaces. Kernel was founded on the intuition that everything outside of us is getting smarter and faster, so we should invest some time into trying to make our own brains smarter and faster as well.
UK government to invest £17m ($21 million) into artificial intelligence research: the UK government will invest an additional few million pounds into AI research. The amount is minor and seems mostly to be what the treasury was able to find down the back of the Brexit-shrunk sofa. Nonetheless, every little helps.
DeepCoder: promise & hype: Stephen Merity has tried to debunk some of the hype around DeepCoder, a research paper (PDF) that outlines a system that gets computers to learn programming. He’s even written a bonus article to try and show what he thinks level-headed journalism would be like – come for the insight, stay for the keyboard monkeys.
When your memory is a map, beautiful things can happen: a new research technique lets us give machines the ability to autonomously map their environment without needing to check the resulting maps against any kind of ground-truth data. This brings us closer to an age when we can deploy robots into completely novel environments and simply feed them goals, then have them map the buildings on the way to getting there…
… The specific approach, “Cognitive Mapping and Planning for Visual Navigation”, out-performs approaches based on LSTMs and reactive agents. The system works by coupling two distinct systems together – a planner, and a mapper. At each step the mapper updates the robot’s beliefs about the world, and then feeds this to the planner, which figures out an action to take…
… The Mapper gives the robot access to a memory system that lets it represent its world in terms of an overhead two-dimensional map. It feeds its map to The Planner, which uses that data to plan the actions it takes to bring it closer to its goal. Once the planner has taken an action, the map is re-updated. The map is egocentric, which means that it naturally differentiates the agent from the rest of its environment. (In other words, action cements the agent’s perception of itself as being distinct from the rest of the world – how’s that for motivation!) This egocentric representation, combined with actions that are represented as egomotion, makes it easier for the system to recalibrate itself to learn more about its environment, without a human needing to be in the loop…
… The system still fails occasionally, usually due to its first person view leading it to miss a good route to its target, and ending up with it dithering about the space.
…It’s worth noting that this project, like all scientific endeavors, builds on numerous research contributions that have occurred in recent years: the planning component depends on a residual network (developed by Microsoft researchers and used to win the ImageNet competition in Dec 2015), a hierarchical variant of value iteration networks (UC Berkeley, released February 2016), and the whole combined system is trained using DAGGER (Carnegie Mellon, 2011). This highlights the inherent modularity of the modern approach to AI, and reminds us that any research contribution is there only due to standing on the shoulders of the contributions of innumerable others. (If you want to join me in a little AI archeology project, send me an email to email@example.com)
… “A central limitation in our work is the assumption of perfect odometry, robots operating in the real world do not have perfect odometry and a model that factors in uncertainty in movement is essential,” the researchers write.
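… For the curious, the map-then-plan loop described above can be sketched in a few lines. This is a toy under loud assumptions, not the researchers' system: the grid world is made up, the mapper is hand-coded rather than learned, breadth-first search stands in for the value-iteration-style planner, and the map is kept in world coordinates rather than egocentric ones. Like the paper, it assumes perfect odometry, so the agent always knows exactly where it is.

```python
from collections import deque

# Hypothetical toy world: '#' is a wall, '.' is free space.
WORLD = [
    "....#",
    ".##.#",
    ".#...",
    ".#.#.",
    "...#.",
]
H, W = len(WORLD), len(WORLD[0])
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def in_bounds(r, c):
    return 0 <= r < H and 0 <= c < W


def mapper(belief, pos):
    """Update the belief map with what is visible from pos (its 4 neighbours)."""
    r, c = pos
    belief[pos] = "."
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if in_bounds(nr, nc):
            belief[(nr, nc)] = WORLD[nr][nc]
    return belief


def planner(belief, pos, goal):
    """Breadth-first search over the belief map; unseen cells are assumed free.
    Returns the next cell to step into, or None if the goal looks unreachable."""
    if pos == goal:
        return pos
    parent = {pos: None}
    queue = deque([pos])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            while parent[cur] != pos:  # walk back to the first step after pos
                cur = parent[cur]
            return cur
        for dr, dc in MOVES:
            nxt = (cur[0] + dr, cur[1] + dc)
            if in_bounds(*nxt) and nxt not in parent and belief.get(nxt, ".") != "#":
                parent[nxt] = cur
                queue.append(nxt)
    return None


def navigate(start, goal, max_steps=250):
    """Alternate mapping and planning until the goal is reached."""
    belief, pos, path = {}, start, [start]
    for _ in range(max_steps):
        if pos == goal:
            break
        belief = mapper(belief, pos)       # mapper updates beliefs about the world
        step = planner(belief, pos, goal)  # planner picks an action from the map
        if step is None:
            break
        pos = step                         # perfect odometry: the move always succeeds
        path.append(pos)
    return path, belief


path, belief = navigate((0, 0), (4, 4))
print(len(path) - 1, "moves;", len(belief), "cells mapped")
```

Because unseen cells are treated as free, the agent sometimes commits to a route that turns out to be blocked and has to replan, a tame cousin of the dithering failure mentioned above.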
Is that a gun in your hand or a corn cob spraypainted black? No, no that’s definitely a gun. Alright, come with me! Research from the University of Granada in Spain shows how to do two useful things: 1) build and augment a dataset of handguns in films using deep learning and 2) use methods like an R-CNN to then successfully detect handguns in videos. Admins of video sites that have to deal with all the usual video nasties – weapons, drugs, sex – will likely be interested in such a technique. It could also reduce the number of people tech companies hire to manually look at disreputable content – a low-paying, sometimes traumatising job that I think we would gladly cede to the machines.
The first rule of deep learning is you don’t talk about the black magic… Nikolas Markou has sadly been kicked out of AI club for talking about one of its uncomfortable truths – that because we lack a well developed set of theories for why AI works the way it does, many experts in the field use various tips and tricks gained through trial-and-error and intuition, rather than a deep understanding of theory. Read on for details of some of those tricks.
Smashing! Researchers use deep reinforcement learning to beat pros at Super Smash Brothers Melee: researchers from Tenenbaum’s lab at MIT have used reinforcement learning to train Smash Bros character Captain Falcon to a point of competency where he is able to play competitively with top-ranked human players. This approach works with both policy gradients and Q-learning. This is a pretty good example of how RL has moved on from relatively simple two-dimensional environments like Atari to complex, changing, 3D environments. Read more here: Beating the World’s Best at Super Smash Bros Melee with Deep Reinforcement Learning…
… the algorithms found some neat approaches that a typical human would not likely stumble on: “Q-learners would consistently find the unintuitive strategy of tricking the in-game AI into killing itself. This multi-step tactic is fairly impressive; it involves moving to the edge of the stage and allowing the enemy to attempt a 2-attack string, the first of which hits (resulting in a small negative reward) while the second misses and causes the enemy to suicide (resulting in a large positive reward),” the researchers write.
…the result indicates that RL has a chance of helping to solve tasks like mastering StarCraft 2. That’s because both games share some traits that traditional Atari games lack – partial observability, and multiple players. Therefore, it’s possible that SSBM could become a kind of intermediary metric as the AI community (Zerg) rushes to solve StarCraft, which will require many other algorithmic inventions to crack.
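… Why would a Q-learner accept a hit on purpose? A minimal tabular sketch shows the arithmetic. The two-step episode, the reward values (-1 for taking the hit, +10 for the enemy falling), the discount factor and the learning rate are all assumptions chosen for illustration; the researchers' agents used deep networks on the real game, not a lookup table.

```python
import random

GAMMA = 0.9  # discount factor (assumed)
ALPHA = 0.5  # learning rate (assumed)

# Hypothetical two-step episode: (state, action) -> (reward, next_state).
# A next_state of None marks the end of the episode.
TRANSITIONS = {
    ("edge", "retreat"): (0.0, None),
    ("edge", "bait"): (-1.0, "exposed"),  # first enemy attack lands
    ("exposed", "hold"): (10.0, None),    # second attack misses, enemy falls
}
ACTIONS = {"edge": ["retreat", "bait"], "exposed": ["hold"]}


def train(episodes=500, epsilon=0.5, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {sa: 0.0 for sa in TRANSITIONS}
    for _ in range(episodes):
        state = "edge"
        while state is not None:
            acts = ACTIONS[state]
            if rng.random() < epsilon:
                action = rng.choice(acts)  # explore
            else:
                action = max(acts, key=lambda a: q[(state, a)])  # exploit
            reward, nxt = TRANSITIONS[(state, action)]
            future = 0.0 if nxt is None else max(q[(nxt, a)] for a in ACTIONS[nxt])
            q[(state, action)] += ALPHA * (reward + GAMMA * future - q[(state, action)])
            state = nxt
    return q


q = train()
for key in sorted(q):
    print(key, round(q[key], 2))
```

Under these assumed numbers the value of baiting settles near -1 + 0.9 * 10 = 8, comfortably above the 0 earned by retreating, so the small immediate penalty is worth paying for the larger delayed reward.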
…meanwhile, Super Smash Bros, on the cheap!... Stanford researchers show you can train an AI to master Smash Bros using imitation learning, with no RL required. Imitation learning approaches are easier for less experienced researchers to tune and are cheaper, computationally, to train. Additionally, the approach outlined here is purely vision-based – meaning it has no access to the real state of the game, nor any particular hooks into it. That can be challenging for RL algorithms. AIs trained via this method were able to defeat a Level 3 difficulty CPU player, roughly match a Level 6, and respectably hold their own or lose against a Level 9 character. Read more: The Game Imitation: Deep Supervised Convolutional Networks for Quick Video Game AI.
…Imitation learning is not particularly fashionable. The authors note that their approach “does not currently enjoy much status within the machine learning community.” But they think the value in their work is that it demonstrates how absurdly powerful CNN approaches are.
…(Minor details: the authors gathered their data via Nintendo 64 emulation and screen capture tools, using software called Project 64 v2.1. AIs were trained on around 600,000 frames of games (around 5 hours of playing).)
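… Stripped to its core, imitation learning of this kind is ordinary supervised learning on (frame, action) pairs recorded from a demonstrator. The sketch below is a deliberately tiny stand-in: the five-pixel "frames", the scripted expert and the multiclass perceptron are all hypothetical, where the authors used real screen captures and a convolutional network.

```python
ACTIONS = ["left", "attack", "right"]


def make_frame(pos, width=5):
    """A toy 'screen': a strip of pixels with the opponent marked as 1."""
    return [1 if i == pos else 0 for i in range(width)]


def expert(frame):
    """Hypothetical demonstrator: move toward the opponent, attack when centred."""
    pos = frame.index(1)
    centre = len(frame) // 2
    return "left" if pos < centre else "right" if pos > centre else "attack"


def score(weights, frame, action):
    return sum(w * x for w, x in zip(weights[action], frame))


def policy(weights, frame):
    """Pick the action with the highest score for this frame."""
    return max(ACTIONS, key=lambda a: score(weights, frame, a))


def train(demos, width=5, epochs=10):
    """Multiclass perceptron: on each mistake, shift weight toward the expert's action."""
    weights = {a: [0.0] * width for a in ACTIONS}
    for _ in range(epochs):
        for frame, target in demos:
            guess = policy(weights, frame)
            if guess != target:
                for i, x in enumerate(frame):
                    weights[target][i] += x
                    weights[guess][i] -= x
    return weights


# Record demonstrations, then fit the policy to them. No reward signal is used.
demos = [(make_frame(p), expert(make_frame(p))) for p in range(5)]
weights = train(demos)
print([policy(weights, frame) for frame, _ in demos])
```

Note what is missing: the learner never sees a score or a win condition, only what the demonstrator did in each frame, which is part of why this style of training is cheaper to run and easier to tune.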
Three humans and a hundred bots: interesting article about Philip Kaplan’s experience of building DistroKid, a music distribution service. Main thing of note to Import AI readers is Kaplan’s explanation of how DistroKid is able to turn over millions in revenue while running on only three full-time staff: “DistroKid has dozens of automated bots that run 24/7. These bots do things that humans do at other distributors. For example, verifying that artwork & song files are correct, changing song titles to comply with each stores’ unique style guide, checking for infringement, delivering files & artwork to stores, updating sales & streaming stats for artists, processing payments, and more,” he says.
Cryptocurrency for the ceaseless machinations of those that tend the AI hedge fund: Numerai, a startup that appears to have emerged from the psychic loam of a proto William Gibson novel, has launched a new cryptocurrency, Numeraire, to strengthen its AI-based hedge fund. The strangest part? All of those buzzwords are being used legitimately!…
…Numerai uses homomorphic encryption to fuzz a load of financial data and make it available to a global crew of data scientists, who then poke and prod at it with algorithms trying to make predictions about how the numbers change. They then upload these models to Numerai which creates an ensemble from them and uses that to trade mysterious financial instruments. Successful authors get paid out in accordance (in Bitcoin, naturally) with the success of their algorithm in the market. This week, Numerai distributed 1,000,000 Numeraire currency units across its 12,000 algorithm author members. Those people can now use Numeraire to place bets on the success of their own models, and if they win the value of Numeraire goes up. This means that the data scientists now have a financial incentive to participate in the platform (sweet, sweet bitcoin), and a secondary financial one (participate in the internal Numerai economy by wagering lots of Numeraire, and use that to enhance earning power in accordance with the growing effectiveness of the predictions made by Numerai). The incentives seem to stop people from gaming the system…
… I’ve spent so long waffling on about this because I think Numerai is probably what an AI-first business looks like. Replace the 12,000 data scientists with smart, financial AI prediction systems, and you’re there. And in the same way AIs will exploit their environment for rewards that may not benefit the creator (eg, reward hacking, goal divergence, etc), humans will try to take as much money out of the market with the minimal amount of effort. If Numerai’s incentive system is successful then it can chalk out a path for AI companies to take in the future.
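… For a feel of why pooling many submissions can help, here is a minimal sketch of the ensemble step. The source does not say how Numerai actually combines models, so the plain averaging, the three made-up contributors and their predictions are all assumptions for illustration.

```python
def ensemble(predictions):
    """Average per-model predictions, column by column, into one forecast."""
    n_models = len(predictions)
    return [sum(col) / n_models for col in zip(*predictions)]


def mean_squared_error(forecast, actual):
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


# Hypothetical outcomes and submissions from three hypothetical data scientists.
actual = [1.0, 0.0, 1.0, 1.0]
submissions = {
    "alice": [0.9, 0.3, 0.6, 0.8],
    "bob": [0.6, 0.1, 0.9, 0.5],
    "carol": [0.8, 0.4, 0.7, 0.9],
}

combined = ensemble(list(submissions.values()))
errors = {name: mean_squared_error(p, actual) for name, p in submissions.items()}
ensemble_error = mean_squared_error(combined, actual)

for name in sorted(errors):
    print(name, round(errors[name], 4))
print("ensemble", round(ensemble_error, 4))
```

On this toy data the averaged forecast has a lower error than any single contributor because their mistakes partly cancel out. That is not guaranteed in general: a simple average is only guaranteed to do no worse than the contributors' average squared error, and it can still lose to the best individual model.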
OpenAI bits & pieces:
OpenAI’s Tom Brown will be giving a talk at AI By the Bay on Wednesday, March 8, talking about OpenAI Gym and Universe.
[2020, a converted Church in Barcelona, full of computers behind austere glass]
They call the AI system ‘the math submarine’, but if you had them draw it for you no one could give a true depiction of its form. That’s because it’s a bundle of high-dimensional representations, drifting through complex, ethereal fields of numbers. You send the AI out there, out to the brain-warping weird edges of mathematics, and it tries to explore the border between what is proved and what is unproved, and it comes back with answers that are verifiably true, but difficult for a human to understand.
Still, you anthropomorphize it. Does it get lonely, out there, drifting through high-dimensional clouds of conjectures, each representing some indication of proof, or truth, or clarity? Does it feel itself distinct from these things? Does number have a texture to it? Are there currents?
When you were young you once looked up between two tall buildings and saw a plane pass overhead. You could never see the whole plane at once as your view was occluded by the walls of the buildings. But your brain filled in the rest, using its sense of ‘plane-ness’ to extend the slice of the object to the whole. Does the math submarine see numbers in this way, you wonder? Does it see a group of conjectures and have an intuition about what they mean? You know you can’t know, but your other computers can, and you watch the interfaces between this AI system and the others, and tend to the servers and ensure the network is running, so the machine can go and explore something you cannot see or truly know.