Import AI: #73: Generative steganography, automated data fuzzing with imgaug, and what happens when neural networks absorb database software

by Jack Clark

Welcome to Import AI, subscribe here.

Accidental steganography with CycleGAN:
…Synthetic image generators create their own optical illusions…
Researchers with Google have identified some surprising information storage techniques used by CycleGAN, a tool that can be used to learn correspondences between different sets of images and generate synthetic images. Specifically, the researchers find that during CycleGAN training the network encodes additional information into the images it is generating to help it reconstruct original images from synthetic sources. “This suggests that the majority of information about the source photograph is stored in a high-frequency, low-amplitude signal within the generated map,” the researchers write.
This also means it’s possible to use CycleGANs to create adversarial synthetic images, where a pattern of noise in the source image will cause the network to reconstruct a completely different image. “We claim that CycleGAN is learning an encoding scheme in which it ‘hides’ information about the aerial photograph x within the generated map Fx,” they write.
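To make the idea of a low-amplitude hidden signal concrete, here is a minimal sketch using classic least-significant-bit steganography. This is a hand-written scheme chosen for illustration, not the encoding CycleGAN learns; the function names `hide` and `reveal` and the toy pixel values are assumptions.

```python
def hide(cover, secret_bits):
    """Overwrite the least significant bit of each cover pixel with one secret bit."""
    if len(secret_bits) > len(cover):
        raise ValueError("secret does not fit in the cover signal")
    stego = list(cover)
    for i, bit in enumerate(secret_bits):
        # Clear the lowest bit, then set it to the secret bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def reveal(stego, n_bits):
    """Read the hidden bits back out of the lowest bit of each pixel."""
    return [pixel & 1 for pixel in stego[:n_bits]]


# Toy data (hypothetical): a smooth 8-bit gradient and an 8-bit secret.
cover = [120, 121, 123, 124, 126, 127, 129, 130]
secret = [1, 0, 1, 1, 0, 0, 1, 0]

stego = hide(cover, secret)
recovered = reveal(stego, len(secret))
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Each pixel moves by at most one intensity level out of 255, so the carrier looks unchanged to a person while the secret remains fully recoverable, which is the same flavor of trick the paper describes, though the network's signal is learned rather than designed.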
Read more: CycleGAN, a Master of Steganography.

Generating synthetic training data with imgaug:
…Will we be applying the CoarseDropout today, sir? Perhaps with some salt and pepper? And how about some affine scaling as well?…
One of the most common dull parts of machine learning is data augmentation: that’s the process people use to take an existing dataset, like a collection of cat photos, and massively expand the size of the dataset by transforming the images in a variety of ways. New free software called imgaug automates this process, giving users a vast amount of potential transforms to automatically apply to their images.
“It supports a wide range of augmentation techniques, allows to easily combine these, has a simple yet powerful stochastic interface, can augment images and keypoints/landmarks on these and offers augmentation in background processes for improved performance,” the authors write.
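For readers who have not done augmentation by hand, the sketch below shows the general idea in dependency-free Python: a handful of random transforms applied to one image to produce many variants. It deliberately does not use imgaug's API; the function names, probabilities and the toy 8x8 image are all assumptions made for illustration.

```python
import random


def horizontal_flip(image):
    """Mirror every row of a grayscale image stored as a list of lists."""
    return [row[::-1] for row in image]


def salt_and_pepper(image, p, rng):
    """Replace each pixel with pure black or white with probability p."""
    return [
        [rng.choice((0, 255)) if rng.random() < p else px for px in row]
        for row in image
    ]


def coarse_dropout(image, p, block, rng):
    """Zero out whole block x block squares, each with probability p."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input is left untouched
    for top in range(0, height, block):
        for left in range(0, width, block):
            if rng.random() < p:
                for y in range(top, min(top + block, height)):
                    for x in range(left, min(left + block, width)):
                        out[y][x] = 0
    return out


def augment(image, rng):
    """Apply a random flip, then coarse dropout, then salt-and-pepper noise."""
    out = horizontal_flip(image) if rng.random() < 0.5 else image
    out = coarse_dropout(out, p=0.1, block=2, rng=rng)
    return salt_and_pepper(out, p=0.05, rng=rng)


rng = random.Random(0)  # fixed seed so runs are repeatable
image = [[(x + y) * 16 for x in range(8)] for y in range(8)]
augmented = [augment(image, rng) for _ in range(10)]  # one image becomes ten
```

A library like imgaug packages many more transforms behind a stochastic interface of this general shape, and also handles things this sketch ignores, such as keypoints and background processing.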
– Read the imgaug docs here.
– View imgaug on GitHub here.

I can’t B-TREE’ve it: Google learns index structures with machine learning:
…Goodbye, traditional software, hello, deep learning software…
After deep learning techniques fundamentally altered the capabilities of computer-implemented sensory recognition and analysis systems it was only a matter of time till such techniques came for software itself. A new research paper from Google shows how to use modern artificial intelligence approaches to significantly advance upon the state-of-the-art for one of the more fundamental operations in computer science: implementing an indexing system for a large repository of data.
In the paper, the research team shows how to implement neural-network based ‘learned indexes’ that work as a substitute for traditional Btree-style indexes. In the future, the team plans to explore applying such techniques to write operations like inserts, as well as other fundamental database algorithms like those concerned with joining and sorting data.
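The core intuition can be sketched in a few lines: treat the index as a model that predicts where a key sits in a sorted array, then correct the prediction with a bounded local search. In the sketch below a single least-squares line stands in for the neural network, which is a simplification of my own; the class name `LearnedIndex` and the toy keys are likewise assumptions, and the sketch assumes unique keys in a read-only array.

```python
import bisect


class LearnedIndex:
    """Toy read-only index: a fitted line predicts a key's position in a sorted list."""

    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        n = len(sorted_keys)
        mean_key = sum(sorted_keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in sorted_keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(sorted_keys))
        # Least-squares fit of position as a function of key.
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Worst prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(sorted_keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = max(lo, min(len(self.keys), guess + self.max_err + 1))
        # Binary search only inside the window the model's error bound allows.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


# Toy data (hypothetical): strictly increasing keys with a little jitter.
keys = [10 * i + (i % 7) for i in range(1000)]
index = LearnedIndex(keys)
position = index.lookup(keys[421])
```

The model here stores two floats and one integer regardless of how many keys there are, which hints at why a learned index can be much smaller than a tree. The catch is that the error bound only holds for the data the model was fitted on, which is one reason inserts are left as future work.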
  The Google team tests its approach on four large-scale datasets: two real-world integer datasets from Google’s own systems (Maps and weblogs), a web-document dataset that contains ‘10m non-continuous document-ids of a large web index used as part of a real product at a large internet company’, and a synthetic dataset called Lognormal.
  Results: “The learned index dominates the B-Tree index in almost all configurations by being up to 3× faster and being up to an order-of-magnitude smaller. Of course, B-Trees can be further compressed at the cost of CPU-time for decompressing. However, most of these optimizations are not only orthogonal but for neural nets even more compression potential exist. For example, neural nets can be compressed by using 4- or 8-bit integers instead of 32- or 64-bit floating point values to represent the model parameters,” they write. Their implementation uses CPUs, while in the future the researchers think GPUs and new AI-specific compute substrates like TPUs could accelerate things further.
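The compression point in that quote can be illustrated with a small sketch of symmetric linear quantization, which stores each floating-point parameter as a signed 8-bit integer plus one shared scale factor. This is a generic technique shown under my own assumptions, not the paper's implementation; the parameter values and the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(params, bits=8):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    largest = max(abs(p) for p in params)
    scale = largest / qmax if largest else 1.0
    return [int(round(p / scale)) for p in params], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical model parameters.
params = [0.82, -1.27, 0.003, 0.5, -0.66]

quantized, scale = quantize(params)
restored = dequantize(quantized, scale)
worst_error = max(abs(a - b) for a, b in zip(params, restored))
```

Each parameter now needs one byte instead of four or eight, at the cost of a rounding error of at most half the scale factor per value.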
  Doubts about practicality: The Google researchers state within the research paper that approaches like this will require substantially more compute before they become viable. But since we know we have new powerful substrates via TPUs, Cerebras, Graphcore, etc., that seems like a reasonable thing to bet on. Others have more quibbles with the paper. “It assumes a static data set being used in read-only fashion, so it’s unsuitable for a directory or database that serves ongoing modifications. It also assumes an entire data set fits in RAM, which is generally not true for database applications. In particular, the “fast” case of using highly parallel GPUs assumes everything fits inside GPU RAM, which is even more tightly constrained than server main memory,” writes Howard Chu, the CTO of Symas Corp, in this OpenLDAP email.
– Read more: The Case for Learned Index Structures (Arxiv).

Learned network topologies that approach optimal topologies:
…From the dept. of ‘everything with an input-output pair gets automated’…
New research from Duke University / UESTC China / Brown University / NEC Labs shows how to use deep learning approaches to train an AI policy to predict close-to-optimal networking topologies for datacenters via software called DeepConf. The research is mostly interesting because it’s another demonstration of the recent trend for reframing problems that require you to match inputs with outputs (say, packets flooding into a data center with a particular optimal topology, or image pixels leading to a label, or audio waveforms leading to transcribed speech, and so on) as things that can be learned from data. Eventually perhaps everything can be re-evaluated using these powerful AI techniques and tools.
Read more here: DeepConf: Automating Data Center Network Topologies Management with Machine Learning.

First AI analyzed the visual world. Now it analyzes the digital world:
…Neural networks begin to make their way into everything…
Software 2.0: A few weeks ago Andrej Karpathy (former Stanford/OpenAI, now doing AI at Tesla) said he is increasingly thinking that neural networks are fundamentally altering software to the point it needs its own new brand/era: Software 2.0.
“It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data than to explicitly write the program. A large portion of programmers of tomorrow do not maintain complex software repositories, write intricate programs, or analyze their running times. They collect, clean, manipulate, label, analyze and visualize data that feeds neural networks,” Karpathy writes.
This research from Google, along with some of the chemistry papers from last week, and ongoing innovations in techniques like neural architecture search, all give us empirical evidence that people are beginning to rethink the act of designing software with AI and also how different real world domains can benefit from AI-infused systems. The next stage is to rethink the fundamentals of how optimized computer operations work with AI – though I don’t think anyone is looking forward to the bugs that will emerge as a consequence of this decision.
– Read more here: Software 2.0 (Medium).

Black in AI at NIPS:
This year NIPS hosted ‘Black in AI’ and DeepMind researcher Simon Osindero gave a speech there, which he has been generous enough to make publicly available. It hits on a bunch of tough issues the AI community needs to struggle with, ranging from issues of inclusivity and prejudice, to a bunch of suggestions for how the community can improve its representation.
“We can also use our diverse backgrounds to inject broader perspectives into the AI field as a whole. Hopefully, by doing so, we can do a better job at ensuring that the AI applications and systems that we develop don’t inherit some of the problematic biases that are still present in society at large, and instead help them become fairer, and more transparent and accountable,” Simon says.
Read more here: My talk at the inaugural Black in AI workshop dinner (Medium).
A story about Simon: When I attended NIPS in Montreal in 2015 I, like everyone else there, drank far too late far too frequently into the evenings at a variety of AI events. By Friday morning I was feeling the effects, yet managed to crawl out of bed and make it to a reinforcement learning workshop in the morning. After trudging into the workshop I saw a perky-looking Simon at a chair a couple of rows in front of me and I asked him something to the nature of: “Simon, I’m so bloody tired, how do you do it?” Simon raised up an ibuprofen pill bottle and shook it slightly and explained: “each scientific revolution builds upon the previous one.”

Allen Institute for AI reveals ‘THOR’ 3D agent-training environment:
…Enter The House of inteRactions (THOR) at your potential peril to gain a potential reward…
AI2 has released THOR, an AI simulation environment based on the Unity 3D game engine. THOR contains over 120 “near photo-realistic 3D scenes” that have been hand modeled by human artists (as opposed to the more common approach of generating environments procedurally). THOR environments can contain numerous so-called actionable objects which can each be ‘interacted’ with – that is, an agent can manipulate them in crude ways to change their state like placing one object inside another, or opening and closing cupboards and drawers.
High-quality scenes: The paper says the high visual fidelity of THOR scenes allows “better transfer of the learned models to the real world”, which is backed up by THOR’s usage in prior research including a project that trained a remote control car in simulation and transferred it into reality. Still, it is hard to weigh that claim without seeing experimental validation: there are numerous sim2real techniques, like ‘domain randomization’, that make it easy to take low-fidelity simulations and transfer models into reality through data augmentation.
An endless proliferation of 3D environments: In the past couple of years there have been a bunch of new large-scale AI-training environments released ranging from Microsoft’s Minecraft-based Malmo to DeepMind’s Quake-based ‘DeepMind Lab’, to the Doom-based VizDoom. It’s interesting to observe how the choice of game engine dramatically inflects the ultimate design and parameters of these AI-training systems, so I’d expect to see more Unity or other engines being used in AI research.
Read more: AI2-THOR: An Interactive 3D Environment for Visual AI (Arxiv).

Tech Tales:

Clown Hunt.

So I guess when people hear what I do they think of the Turing Test and the Voight-Kampff interview and whatever, but trust me – those tests wouldn’t work. We’ve tried dialogue. We’ve tried embodied VR interviews – with all the requisite probes. But nothing matches the playground. Course that’s a nickname – it’s actually a souped-up version of Garry’s Mod, the old sandbox Half-Life 2 add-on. Now the thing with the software is it lets you just… play. I don’t know how to explain it – take a vast set of items and people and programmable crude behaviors and stick them in a world with physics and kinetics and what have you. People had fun with it. Hey, let’s make a cannon that fires cars! Let’s make an upside down swimming pool using an anti-gravity gun! Let’s make a rollercoaster where all the passengers are made of rubber! You know – weird stuff.

So that’s how we test the AIs now. They blew past most of our dialogue techniques a long time ago. And robots are still so shitty it’s not like a Terminator or a skinjob is right around the corner. So instead it’s about testing the software roaming around the net and trying to figure out which programs are purely reactive and which of them are mostly made of people and which of them are software and reactive. Reactivity is a problem. If something can react very quickly then we might have a hard time dealing with it. Fighting it, so to speak. I don’t know. Maybe these things are weapons or something. So we run these huge competitions through fronts – a bunch of NGOs and art organizations. Free expression for digital artists, or whatever. Big prize money. And we get people to compete by offering them access to a shitload of computers when they win the competition. And when they win we give them the computers and at the same time we take a copy of the program and run it in our ‘Fun Simulator’ and test the program.

My job is to help us spot these unregulated ‘cognitive class’ software systems, and the way it works is I put on my goggles and VR-skin and I jump into the simulator and I just play around with things. I’ve got two kids so I guess it’s easy – I’m always thinking of stories I’d like to tell them and how I could make them real here. We figure fun is still hard for computers to get. So we spot them by seeing who can make the funniest or most emotional or most resonant thing. We know what it feels like, we figure. I’d write children’s books in another era, my wife says. But instead I get to do this – be a big kid, tasked with out-funning another type of brain.

So today I try to make a family of quacking ducks lead a toaster across a road, avoiding the road’s ‘cars’ which are in fact metallic whales painstakingly built by me and my kids over the weekend. There’s a thunderclap right above where my ducks are and the software beams in, appearing as a small white sphere, crackling with electricity. Nice cosmetic effects, I think. Then it starts kind of shimmying to and fro in a corner moving some girders. I focus on the ducks and the toaster – after half an hour I’ve programmed the ducks so that they nudge the toaster with their beaks and slowly kinda drunkenly push it across the whale road. I’m pleased. Might show my kids.

So I look up at whatever the software has been doing and… it’s strange. It’s made a treehouse out of metal girders – pretty standard and not much different from the geometric structures I’ve seen other things build. But then at the top of the treehouse, on its roof, there’s a table with some guests. The guests are over-sized, high-definition, painstakingly crafted honey-roasted hams, with wicks of digital steam licking above their tops. One of the hams has a fake-mustache stuck onto its top-third section, with a monocle placed above and to the right of it, right where a human would figure the eye would be. Like something I’d make, or dream about. So obviously I call it in quickly and sure enough we discover it’s a Cognitive Class piece of work so we scrape it off the public net and stick its owners in prison. But I used to think computers found it hard to have fun and now, now I’m not so sure. Maybe they learned it from me?

Technologies that inspired this story: Kaggle, Half-Life 2, Game Modding, Imitation Learning, Meta-Learning, Learning from Human Preferences.
