Squint Compression with generative models: recently people have been trying to use neural networks to develop lossy compression systems. The theory behind the approach is that you can train a computer to understand a given class of data well enough that, when you feed it a bandwidth-constricted representation, it’s able to use its own impression of the object to try and rebuild it from the ground up, extrapolating a representation that is approximately correct…
…The paper, Generative Compression, shows how to combine techniques inspired by generative adversarial networks and variational autoencoders to create a system that can creatively upscale images.
…The results are quite remarkable, and are reminiscent of how many of us remember certain familiar objects, like favorite trees or bikes. When we remember things it’s common that our brain will put in little odd details which aren’t present in base reality, or leave things out. That might be because we’re doing a kind of decompression, where our memory is a composite of various different internal representations, and we generate new representations based on our memories. This means we don’t need to remember everything about the object to remember it, and our imagination can fill in enough of the holes to let us still do something useful with it.
…Neural compression algorithms still have a ways to go, judging by how they break – go to the later pages of the paper to see how at 97X compression the model will suddenly forget about the heels on high heeled shoes, or arbitrarily change the color of the fabric on a sneaker, creating jarring transitions. Our own brains seem to be better at interpolating between what we definitely remember and what we’re creating, whereas this system is a bit more brittle.
Free tools: Denny Britz has released a free encoder-decoder AI software package for TensorFlow. A helpful framework for building anything from image captioning, to summarization, to conversational modelling, to program generation. As it’s OSS, there’s a list of tasks people can do to help improve the software.
Speech Recognition takes another big step: IBM researchers have set a new record for speech recognition on the widely used (and flawed) ‘Switchboard’ corpus. The new system has a word error rate of 5.5 percent, compared to 5.9 percent from the previous leading system created by Microsoft. IBM’s system is built on an LSTM combined with a Wavenet. IBM says human parity would be at about 5.1% (Microsoft previously said human parity was approximately 5.9%).
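For readers unfamiliar with the metric: word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system's output, divided by the number of words in the reference. Below is a minimal sketch of that standard definition. The function name and example sentences are made up for illustration, and this is not IBM's or Microsoft's evaluation code, which also involves text normalization steps not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match if cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one of six reference words is dropped.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER: {wer:.1%}")
```

On this definition, a 5.5 percent word error rate means roughly one word-level mistake for every 18 words of reference transcript.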
HSBC on track to double its data in four years: HSBC has been gathering more and more diverse types of data on its customers, leading to swelling repositories of information. Next step: use machine learning to analyze it.
Data under management at HSBC in…
2014: 56 PB
2016: 77 PB
2017: 93 PB
… data shared by HSBC at Google’s cloud conference, Google Cloud Next, in SF last week.
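The "double in four years" headline can be sanity-checked from the three figures above. This is a rough sketch that assumes steady compound growth between 2014 and 2017, which HSBC did not state; the helper names are illustrative.

```python
import math

# Petabytes under management, as quoted by HSBC.
data_pb = {2014: 56, 2016: 77, 2017: 93}


def annual_growth_rate(start_value, end_value, years):
    """Compound annual growth rate implied by two data points."""
    return (end_value / start_value) ** (1 / years) - 1


def doubling_time_years(rate):
    """Years needed to double at a constant compound growth rate."""
    return math.log(2) / math.log(1 + rate)


rate = annual_growth_rate(data_pb[2014], data_pb[2017], 2017 - 2014)
doubling = doubling_time_years(rate)
print(f"growth per year: {rate:.1%}, doubling time: {doubling:.1f} years")
```

Under that assumption the figures imply growth of roughly 18 percent a year and a doubling time of about four years, in line with the headline.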
DeepWarp: AI – it will alter the social pact, change the economy, and might give us a way to remediate some of the horrendous damage our species has caused to the climate. But for now AI lets us do something much more meaningful – take any photo of a person’s face and automatically make them roll their eyes. The Mr Bean example is particularly good. Check out more examples at the DeepWarp page here.
The era of quantum supremacy is nigh: Google researchers are betting that within a few years there will be a demonstration of quantum supremacy – that is, a real quantum computing algorithm will perform a task out of scope for the world’s most powerful supercomputer. And after that? New material design technologies, smarter route planning algorithms and – you knew this was coming – much more effective machine learning systems.
… in related news scientists at St Mary’s College of California have used standard machine learning approaches to train a D-Wave quantum computer (well, quantum annealer) to spot trees. In their research, they show their approach is competitive with results achieved by classical computers.
Finally, AI gets an honest acronym – Facebook’s new AI server, codename Big Basin, is a JBOG, short for Just a Bunch Of GPUs. Honest acronyms are awesome! (HAAA!)
Self-driving, no human required: the California DMV has tweaked its regulations around the testing of autonomous vehicles in the state, and has said manufacturers can now test vehicles out on public roads without a human needing to physically be in the car. That’s a big step for adoption of self-driving technology.
Chinese government makes AI development a national, strategic priority: “We will implement a comprehensive plan to boost strategic emerging industries,” said Premier Li Keqiang in his delivery at the annual parliamentary session in Beijing over the weekend, according to the South China Morning Post. “We will accelerate research & development (R&D) on, and the commercialisation of new materials, artificial intelligence (AI), integrated circuits, bio-pharmacy, 5G mobile communications, and other technologies.”
Keep AI Boring: Sick of the AI hype generated by media, talking heads, and newsletters? Help me in my (recursive) quest to remove some of the hype by coming up with dull terms for AI concepts. My example: ‘Deep Learning’ becomes ‘Stacked Function Approximators’. Other suggestions: WaveNet: ‘Autoregressive Time Series Modeling using Convolutional Networks’; Style Transfer: ‘input optimization for matching high level statistics’; Learning: ‘iterative parameter adjustment’.
Fancy being 15X more energy efficient at deep neural network calculations than traditional chips? Just wait for RESPARC. New research from Purdue University outlines a new compute substrate built on Memristive Crossbar Arrays for the simulation of deep Spiking Neural Networks. What does that mean? They want to create a low-power, very fast chip that is able to better implement the kinds of massively parallel operations needed by modern AI systems.
… in the research the scientists show that, theoretically, RESPARC systems can achieve a 15X improvement in energy efficiency along with a 60X performance boost for deep neural networks, and a larger 500X energy efficiency and 300X performance boost for multi-layer perceptrons.
…the design depends on the use of memristive crossbars, which let you bring compute and storage together in the same basic circuit element. These crossbars will be used to store the weights in the network, letting computation happen without the latency overhead of fetching weights from separate memory. (Now we just need to create those memristive crossbars – no sure thing. Memristors have been on the menu for several years from several different manufacturers and are distinguished as a technology mainly by their consistent delays in coming to market.)
… in tests the researchers showed that the platform can be used to compute common AI tasks, like digit recognition, house number recognition, and object classification.
… this type of new, non-Von Neumann architecture hardware looks likely to grow in coming years, as traditional CPUs and GPUs run into scaling limitations brought about by the difficulty the semiconductor industry is having in bringing in new finer detail process nodes, and by limitations in the chip-fabbing lithographic techniques, which will make it hard to scale-up die size for ‘big gulp’ performance…
…“The intrinsic compatibility of post-CMOS technologies with biological primitives provides new opportunities to develop efficient neuromorphic systems”, the researchers write.
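To make the "compute and storage in the same circuit element" idea concrete: in an idealized crossbar, each stored weight is a conductance, each input is a voltage on a row wire, and the current collected on each column wire is the sum of voltage times conductance down that column (Ohm's law per device, Kirchhoff's current law per column). That sum is a dot product, so the array performs a matrix-vector multiply in place. The sketch below simulates that idealized behavior in software. It is not the RESPARC design, and it ignores negative weights, wire resistance, device variation, and the spiking dynamics the paper targets; the names and numbers are illustrative.

```python
def crossbar_output_currents(conductances, input_voltages):
    """Idealized crossbar read-out.

    conductances[i][j] is the device joining input row i to output column j.
    Returns one summed current per output column.
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, voltage in zip(conductances, input_voltages):
        for j, conductance in enumerate(row):
            # Ohm's law for one device; currents on a shared column add up.
            currents[j] += voltage * conductance
    return currents


# Hypothetical 3-input, 2-output layer with made-up weights and inputs.
weights = [
    [0.5, 0.1],
    [0.2, 0.4],
    [0.3, 0.3],
]
inputs = [1.0, 0.0, 2.0]
currents = crossbar_output_currents(weights, inputs)
print(currents)
```

In hardware all of those multiply-accumulates would happen at once in the analog domain, which is where the claimed energy and speed gains are expected to come from.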
Data fuel for your hungry machines: Google has published AudioSet, a collection of roughly 5,800 hours of audio spread across 2,084,320 human-labelled ten-second-long audio clips. This, combined with new techniques for joint image, text, and audio analysis, will create models with a richer understanding of the world. Personally, I’m glad Google has woken up to the importance of the sound of people gargling and has created a dataset to track that…
…Haberdashers, seamstresses, and other tidy people might like the ‘DeepFashion’ dataset — a collection of 800,000 labelled fashion images.
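The AudioSet hours figure follows directly from the clip count. A quick check, assuming every clip runs the full ten seconds (so this is an upper bound):

```python
clips = 2_084_320        # human-labelled clips, as reported
seconds_per_clip = 10    # assumed: every clip is a full ten seconds

total_hours = clips * seconds_per_clip / 3600
print(f"{total_hours:,.0f} hours")
```

That works out to just under 5,800 hours, matching the quoted total.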
Ongoing education to short-circuit inequality from automation: Governments should invest in ongoing education and retraining programs to help people adapt their skills to jobs changed by the rise of AI and machine learning, writes The Financial Times.
Buzzword VS Buzzword in IBM-Salesforce deal: Salesforce’s “Einstein” system (basically white labelled MetaMind, plus some fancy email from the RelateIQ acquisition, as well as software infrastructure from PredictionIO) will link up with IBM’s “Watson” system (software trained to play Jeopardy, then used to sell lengthy IBM service contracts). What the deal means is that Salesforce will start using many Watson services within its own AI stack, and IBM will move to buying more Salesforce software. Given how valuable data is, this seems like it may strengthen Watson.
How do you make 650 jobs turn into 60 jobs? Robots! A factory in Dongguan, China, has gone from employing 650 full-time staff members to 60 through the adoption of extensive automation technologies, including 60 robot arms at ten production lines. Eventually, the factory owner would like to drop the number of employees further to just 20 people. This is part of a citywide “robot replace human” program, according to state-backed publication People’s Daily Online.
Reinforcement learning, thinking fast and slow: new approaches to hierarchical RL may create systems capable of learning to act over multiple timescales, pursuing larger user-specified goals, while figuring out some of the intermediary shorter goals needed to be solved to crack the larger problems. New research from DeepMind, FeUdal Networks for Hierarchical Reinforcement Learning, demonstrates a system that gets record-setting scores on Montezuma’s Revenge, one of the acknowledged hardest Atari games for traditional RL algorithms to learn…
….Fall of the house of Montezuma: about 9 months ago I had coffee with someone who told me they thought the infamously difficult Atari game Montezuma’s Revenge would be solved by AI within a year. In the FuN paper DeepMind claims a Montezuma score of about 2600 – that’s a vast improvement over previous approaches. (I recently had the chance to play the game myself and found that I got scores of between about 600 and 3200 depending on how good my reactions were.)
… there are multiple ways to create AI that can reason over long timescales. Another approach is based around a technique called option discovery from the University of Alberta and DeepMind.
… Bonus acronym alert: two pints for whoever at DeepMind decided to call these FeUdal NetworkS ‘FuNs’.
Not AI, but worth your (leisure) time: Fascinating article on Rock Paper Shotgun about the procedural generation techniques used by casual roguelike game ‘Unexplored’. Unexplored consists of a series of levels, each one about the size of a big box supermarket, that you must navigate and fight within. Each level is procedurally generated, providing the Skinner Box just-one-more-game feeling that most modern entertainment exploits…
… One of the frequent problems of procedurally generated games can be a feeling of sameness – see levels in early procedural titles like Diablo, and so on. Unexplored gets around this via a system called ‘cyclic traversal’, which lets it structure levels in a more diverse, flowing, non-repetitive, branching way that makes them feel like they’ve been designed by hand.
Conferences versus readers: Andrej Karpathy has mined the data vaults of Arxiv Sanity, generating a list comparing papers accepted and rejected from ICLR with those favorited by users of Arxiv Sanity. OpenAI’s RL2 paper makes the cut on Arxiv Sanity (along with many other papers not placed in traditional conferences).
[2022: A Funeral Home in the greater Boston area of Massachusetts]
“Her last will and testament was lost in the, um, incident,” says the Funeral Home director.
“Can’t you just say fire?” you say.
“Of course sir. They were destroyed in the fire. But we do have a slightly older video testimony and will. Would you like us to put it on?”
The projector turns on, and the whole wall lights up first with the test-pattern blue of the projector, then the white of the operating system, then the flood of color from the video itself. You close your eyes and when you open them you’re looking at someone who is not quite your mother, but if you squint could be.
“Who the hell is this?” you say.
“It is your relative, sir. The footage had been, ah, corrupted, due to being saved in the incorrect format -”
“Whose fault is that?”
“We’d prefer not to say sir. Anyway, we’ve used some upscaling techniques to generate this video. We find clients prefer having someone to look at and I’m told the likeness can really be quite uncanny.”
“Turn it off.”
“The upscaling. Turn it off.”
They nod and you squeeze your eyes shut. You hear them tapping delicately at their keyboard. Headache. Don’t cry don’t cry it’s fine. When you open them you’re looking at a wall of fuzzy pixels, your mother’s voice crackling over them, like someone calling from underwater. Grief Mondrian. They use these generative compression tools everywhere now, turning old photos and songs into half-known remembrances, making the internet into a brain in terms of its dereliction as well as capability.