Import AI: Issue 48: Learning language in the third dimension; how AI may lead to war, inequality, or stagnation; AI and Art researchers team-up to create CANs

by Jack Clark

Extremely freaky and incredibly cool AI art:
This eerie AI experiment mashes up Mike Tyka’s recent work on generating fully synthetic faces with AI with a technology called Deep Warp, which lets the eyes of the synthetic person follow your cursor. The effect is perturbing and cool! More of these AI mashups, please.
You can see the experiment here.

NIPS 2017, by the numbers:
…..3,297: # of NIPS 2017 research papers submitted
…~2,500: # of NIPS 2016 research papers submitted
…..3,240: # of research papers cleared for review (some violated policies and others were withdrawn by submitters.)
…..183: # of area chairs charged with overseeing these papers.
…New: NIPS, in keeping with the current boom in deep learning, has “added one more layer” to its reviewing structure. Senior area chairs (human) will help to further calibrate the decisions made by individual area chairs (much like a layer in a neural network, though with more coffee and swearing.)
More information in this Google Doc from the courageous NIPS program co-chairs.

Why some industries may adopt AI slowly…
Biology eats all the code around me
Despite software leading to rapid gains in our ability to simulate and run experiments on complicated processes, there are some things we struggle with. Real life is one of them. Reality is built on a kind of fizzing underlay of chaos, and fusing our computer systems with it tends to be difficult.
…”Instead of ‘software eats biotech’, the reality of drug discovery today is that biology consumes everything,” writes Life Sci VC in a great post reminding us of the difficulty of some fundamental domains.
…”The primary failure mode for new drug candidates stems from a simple fact: human biology is massively complicated. Drug candidates interfere with the wrong targets or systems leading to bad outcomes (“off-target” toxicity),” they write.
Read the whole post here.

Language learning goes into the third dimension:
Today, many groups are trying to teach agents to develop language in a way that is uniquely tied to the environment they exist in. This is because of a growing intuition among researchers that simply getting an agent to learn about text by studying large corpuses of it is insufficient to develop AIs with a rounded commonsense understanding of the world – instead, groups are teaching agents to tie words to their environment, letting them develop an intuitive understanding of what, say, “big” or “heavy” or “far away” might mean. Some of these projects have yielded agents with a language which must be translated into English. Other groups are trying to teach their agents English from the ground-up, expanding the agents’ capabilities over time via curriculum learning.
…Now, separate research projects from Facebook and DeepMind show a way to push this project into the third dimension, with new papers that teach agents complex language in rich, 3D environments.
Components: DeepMind Lab (DeepMind, a customized/proprietary version of an earlier open source release based on Quake), ViZDoom (CMU, an open source 3D simulator based on Doom).
…Paper: Gated-Attention Architectures for Task-Oriented Language Grounding (CMU). (Notably, the last author is Ruslan Salakhutdinov, who splits his time between CMU and Apple.)
…Approach: The approach taken by CMU researchers is to construct a modular neural network to let the agents complete tasks that require both an understanding of text and vision. To do this, they use a standard convolutional neural network block to interpret vision and a Gated Recurrent Unit to process the text. They then take these representations and combine them via what they call a Gated Attention multi-modal learning layer, which cleverly merges the different representations into a unified set of features. What you wind up with is an agent that can naturally learn to combine the text you feed it with its images of the world, then acts in the world using this single representation.
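The fusion step can be sketched in a few lines of numpy. This is a minimal illustration under assumed shapes, not the paper’s implementation – the projection matrix, sizes, and random inputs here are all hypothetical – but the core move is the Gated-Attention mechanism described above: project the instruction embedding through a sigmoid to get one gate per visual feature map, then multiply.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(conv_features, text_embedding, W):
    """Fuse vision and language via a Gated-Attention layer (sketch).

    conv_features:  (d, H, W) feature maps from the vision CNN.
    text_embedding: (k,) final GRU state for the instruction.
    W:              (d, k) projection from text space to one gate per map.
    """
    # One scalar gate per feature map, squashed into (0, 1).
    gates = sigmoid(W @ text_embedding)          # shape (d,)
    # Broadcast each gate over its feature map's spatial dimensions,
    # so the instruction modulates which visual features survive.
    return conv_features * gates[:, None, None]  # shape (d, H, W)

# Toy usage with hypothetical sizes: 64 feature maps of 7x7, 32-dim text state.
rng = np.random.default_rng(0)
fused = gated_attention(rng.normal(size=(64, 7, 7)),
                        rng.normal(size=32),
                        rng.normal(size=(64, 32)))
print(fused.shape)  # (64, 7, 7)
```

The fused tensor then feeds the policy network, which acts on the single combined representation.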
DeepMind uses a similar technique (with some bells and whistles based around ideas in its UNREAL paper from last year) to create agents that learn curriculums of entangled words and objects and generalize instantly (zero-shot adaptation) to previously unseen combinations of words and objects. The addition of auxiliary goal identification and acquisition aids learning by letting the agent create autoregressive objectives that help it model its surroundings.
Paper: Grounded Language Learning in a Simulated 3D World (DeepMind).

Matlab gets a free visualization upgrade:
…MIT researchers have created mNeuron, a free plug-in for popular math software Matlab. The plug-in visualizes neurons in neural networks and has support for Caffe and matconvnet.
…Come for the potentially useful tool for interpretability, stay for the ‘tessellation art’ technique that lets you take the visualizations of a single neuron and extend it into a large, repeating tapestry.
Keras gets a viz plugin as well:
Easy-to-use AI framework Keras also has its own visualization ecosystem. One handy tool looks to be Keras-vis, a toolkit for visualizing saliency maps, activation maximization, and class activation maps in models.
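Keras-vis computes saliency maps from gradients; as a rough stand-in, the sketch below (all names and the toy scorer are hypothetical, not the library’s API) approximates the same quantity for a black-box scoring function via finite differences: how much does the class score move when each input pixel is nudged?

```python
import numpy as np

def saliency_map(score_fn, image, eps=1e-4):
    """Approximate a saliency map for a black-box scorer by finite
    differences over each pixel (slow, but illustrates the idea)."""
    base = score_fn(image)
    sal = np.zeros_like(image)
    it = np.nditer(image, flags=['multi_index'])
    for _ in it:
        idx = it.multi_index
        bumped = image.copy()
        bumped[idx] += eps
        # Sensitivity of the score to this pixel.
        sal[idx] = abs(score_fn(bumped) - base) / eps
    return sal

# Toy scorer: the "class score" is a weighted sum of pixels, so the
# saliency map should recover the absolute weights.
rng = np.random.default_rng(1)
w = rng.normal(size=(4, 4))
img = rng.normal(size=(4, 4))
sal = saliency_map(lambda x: float((w * x).sum()), img)
print(np.allclose(sal, np.abs(w), atol=1e-3))  # True
```

In practice you would use the library’s gradient-based implementation, which gets the same map in one backward pass rather than one forward pass per pixel.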

Amazon reveals its (many) AI priorities with Amazon Research Awards:
…Amazon has published a call for proposals for its Amazon Research Awards and it is willing to fund proposals to the tune of, at most, $80,000 in cash and $20,000 in Amazon Web Services promotional cloud credits.
The research: What’s most notable is the broad set of research areas Amazon is seeking proposals for – and some of them are particularly germane and specific to its work.
Notable research focus areas: …Apparel similarity…Personalization using personal knowledge base…advances in methods for estimating machine translation quality at run time…synonym and hypernym generation for eCommerce search…simulation of sensing and grasping for object manipulation, and so on.
Read more on the Amazon Research Awards page here.

Interdisciplinary Research: Automated Artists via Creative Adversarial Networks:
Researchers have tweaked a generative adversarial network so that it can be used to create synthetic artwork that feels more coherent and human than stuff we could previously generate.
…The approach, Creative Adversarial Networks (CANs), was outlined in a wonderfully interdisciplinary paper from researchers at Rutgers, Facebook, and the Department of Art History at the College of Charleston, South Carolina.
…CANs work somewhat like a generative adversarial network, except the discriminator now gives two signals back to the generator instead of one. First, it feeds back whether something qualifies as art (a discrimination based on it being pre-fed a large corpus of art). Second, it gives a signal about how well it can classify the generator’s sample into an exact style.
…”If the generator generates images that the discriminator thinks are art and also can easily classify into one of the established styles, then the generator would have fooled the discriminator into believing it generated actual art that fits within established styles,” explain the authors.
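A loose numpy sketch of those two signals, with hypothetical inputs (the paper’s actual loss differs in its details): the generator is rewarded when the discriminator thinks the sample is art but its posterior over established styles is close to uniform – scored here as cross-entropy against the uniform distribution.

```python
import numpy as np

def can_generator_loss(p_art, style_probs):
    """Sketch of the CAN generator's two training signals.

    p_art:       discriminator's probability the sample is art, in (0, 1).
    style_probs: discriminator's posterior over K established styles.
    """
    k = len(style_probs)
    # Standard GAN generator term: look like art.
    art_term = -np.log(p_art)
    # Style-ambiguity term: be hard to pin to any single style,
    # i.e. minimize cross-entropy against the uniform distribution.
    uniform = np.full(k, 1.0 / k)
    style_term = -np.sum(uniform * np.log(style_probs))
    return art_term + style_term

# A style-ambiguous sample is cheaper than one confidently classified
# into a single established style.
ambiguous = can_generator_loss(0.9, np.array([0.25, 0.25, 0.25, 0.25]))
confident = can_generator_loss(0.9, np.array([0.97, 0.01, 0.01, 0.01]))
print(ambiguous < confident)  # True
```

The net effect is a generator pushed toward images that read as art while deviating from every established style norm – the “creative” part of the acronym.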
…So, how good are the samples? The researchers carried out a quantitative evaluation where they showed human subjects (via Mechanical Turk) sets of paintings generated by CANs, by DCGAN, and by humans (across two sets: Abstract Expressionist and Art Basel 2016.)
Results: Human evaluators thought CAN images were generated by a human 53% of the time, versus 35% for DCGAN (and 85% for the human-generated abstract expressionist set).
…You can read more in the paper: “CAN: Creative Adversarial Networks, Generating ‘Art’ by Learning About Styles and Deviating from Style Norms”. I rather liked some of them, reminiscent of Kandinsky via Pollock via Mondrian.

Google & DHS
sitting in a tree
…Google and the Department of Homeland Security have teamed up (via recent Google acquisition Kaggle) to create a competition to get data scientists to create algorithms to identify concealed items in images gathered by checkpoint body scanners.
…Total prize money: $1.5 million
…Sad trombone: Only US citizens or permanent residents can actually win money in this competition (though everyone can participate), somewhat going against the free-wheeling egalitarian nature of Kaggle.
More information on the competition on its Kaggle page here.

AI == War?:
…Alibaba chairman Jack Ma worries that artificial intelligence could lead to a third world war. …”The first technology revolution caused World War I,” he told CNBC’s David Faber. “The second technology revolution caused World War II. This is the third technology revolution.”
AI == Inequality?:
…Kai-Fu Lee, chairman of VC firm Sinovation Ventures and former head of Google China, writing in the New York Times opinion pages, says “the A.I. products that now exist are improving faster than most people realize and promise to radically transform our world, not always for the better. They are only tools, not a competing form of intelligence. But they will reshape what work means and how wealth is created, leading to unprecedented economic inequalities and even altering the global balance of power.”
…”Unlike the Industrial Revolution and the computer revolution, the A.I. revolution is not taking certain jobs (artisans, personal assistants who use paper and typewriters) and replacing them with other jobs (assembly-line workers, personal assistants conversant with computers). Instead, it is poised to bring about a wide-scale decimation of jobs — mostly lower-paying jobs, but some higher-paying ones, too.”
AI == An Amazing World, (if we make some changes)?:
…Michael Bloomberg says automation poses many risks to society, but some of these can be mitigated with policy changes. Health care should not be tied to employment, he says (a step many Northern European and other countries have already taken); governments should contemplate creating direct employment programs (as the US did with the New Deal back in a more optimistic time); benefits should be altered to subsidize low-income earners, potentially via the Earned Income Tax Credit; and so on.
…”To spread the benefits of the age of automation far and wide, we’ll need more cooperation among government, business, education, and philanthropic leaders,” he writes in a column in, naturally, Bloomberg BusinessWeek.

What happens if only a few industries automate themselves too rapidly?
AI is going to bring about more opportunities for automation. The multi-trillion dollar question is how rapidly different industries will automate and what the aggregate effect will be. That question relates to some of the issues the commentators above have been grappling with.
…I worry that there’s a way that uptake of AI can lead to pretty adverse effects. In the 20th century America went through a couple of revolutions, with both agriculture and manufacturing undergoing mass automation, leading to a significant reduction in their share of the overall economy.
…This was broadly good for the industries themselves, letting them feed and produce far more efficiently. It wasn’t so bad for the displaced workers, either, because at the same time new technologies were unlocking new jobs, like automobiles creating entirely new occupation categories, or because the rest of the economy was growing rapidly enough to enlarge other industries, like the service sector.
…If AI is adopted unevenly, then the industries that turn to it may shrink as a share of overall employment as their workforces become more efficient, leaving a small, well-remunerated class of specialized workers in automated industries and poorer workers in the rest of the economy. The question is whether other industries will keep growing – and that part is a real wildcard. If they don’t, they’ll become a stagnant drag on the economy, especially if they’re unable to access the AI technologies used in other industries, and the gap between different levels of compensation could continue to widen. We’re already seeing some indicators of this kind of effect in the tech industry, which pays its employees very highly but in the aggregate doesn’t boost national employment much at all.
…For an example of this worrying trend in action, check out this New York Times article about how post-industrial towns are now struggling with a stagnant physical retail market (likely partially due to online shopping displacing in-store shopping.) As a resident points out, all the good jobs with companies like Amazon that are leading to the physical retail decline are located near large metropolitan areas, hundreds of miles away. Where do the locals get to work?

Amazon files patent for drone delivery towers:
[Year 2035: Megacity 700, a flock of drones, like so many metal starlings, billow out of a gigantic tower, ferrying bright yellow packages to innumerable residents across the city.]
…Amazon’s patent for the “Multi-Level Fulfillment Center for Unmanned Aerial Vehicles” here.

MIT’s Senior House clampdown: Intervention or Culture-Washing?
MIT officials are seeking to shut down Senior House, a student community at MIT that houses “a disproportionately high number of people of color, LGBT students, the socioeconomically disadvantaged” and other oddball students, according to Save Senior House, a student-led initiative lobbying to preserve the accommodation. “In terms of diversity it is one of the most representational distribution of these factors that existed on campus, and maybe one of the best in all of higher education.”
…MIT says that the house had particularly low graduation rates, higher drug use, and faced more mental health issues, so it wants to step in and change the set-up. Save Senior House says many of these factors stem more from the diversity of the house than from what the students choose to do within it.
…MIT has evicted all residents, and will replace them with a new cohort starting in Autumn 2017 called ‘Pilot 2021’. (A parody site is available here.)
… Sarah Schwettmann, a graduate student mentor who lived in Senior House, says: “In the Senior House community many residents find – some for the first time – what feels like home. Last Monday, I was given 48 hours notice of my eviction from Senior House, along with the other graduate mentors who normally remain in the House over the summer to integrate new and returning students in the fall. Now, police and security personnel guard an empty building, whose past residents valued openness and diversity. We’re experiencing action from the MIT administration that is both heavy-handed and disproportionate. This effort, undertaken while students are away from campus for the summer, eradicates a unique part of campus culture and restructures a new community from the top down.
…As someone who was hired to support this community, I see this as an administrative failure to support some of the most vulnerable and stigmatized members of MIT. These students present the institute with a challenge, and one not unique to MIT: how do we build a platform for the historically marginalized to define their own success in a rigorous academic environment, craft their own system of values, and learn to support themselves and each other? Such issues will accompany these students to wherever they reside on campus, so long as the institute continues to admit them. Senior House provided a community-driven solution, a work in progress engineered from the bottom up. In my eyes, MIT is dismantling that solution, and a century of history: cleaning house by sweeping the challenge itself under the rug.”
Expanded statement available here.

OpenAI Bits&Pieces:

OpenAI’s Ilya Sutskever spoke at the ACM Turing conference in San Francisco this week. You can find out more about the conference here and find video recordings of the panel and others here on the ACM’s Facebook page.

Tech Tales:
[ 20??: A park in a city. Winter. Frost on the ground. Some deer ferrying lost fawns across the park to be reunited with their mothers. ]

What year is it? Who are you? Where did you grow up? Why are you here? See how many of these and other questions you can answer before the timer runs out! Says the text on your tablet. In the bottom right-hand corner is a little red timer, counting down to zero. Five hours left. See how many points you can get before the time runs out! You don’t know much, yet.

Temporary Brain Wiping is what the neuroshrinks call it. Mental Fresh Air is what its fans call it. Lobotomy Cult is what the media calls it. You don’t know what you call it, because you’ve forgotten.

You know you must have agreed to initiate the wipe. You know some basic things, like how physics works, how to speak, how to read. But most of your memory is… not present. You know that you have memories but you can’t access them right now. It’s like they’re trapped in smoky glass – you can discern faint outlines, but there’s no resolution, nothing to put a hand on.

You see an older woman walking her dog. “Excuse me, what year is it?” you ask.
“Oh dear you’re going to have to try harder than that. We get a lot of your type around here now.”
“Can you give me a clue?”
“Well, when I was a young girl there was a band called the Spice Girls. They were the first CD I bought.”
“Thanks,” you say. Watch her as she walks away. Spice Girls, you think, dredging through your partially occluded memory. You don’t remember anything specific, but it feels old. The woman was old enough to have faded into a kind of graying twilight – anywhere between 50 and 80, depending on lifestyle and genetic lottery and, sadly probably, wealth. Where am I? You think. There are trees, very few houses, some elaborate old-looking buildings. People. The woman had a British accent. If you get to high ground you can see if there are any landmarks that the wipe didn’t get.
You study other people in the park, unsure whether they’re like you – temporarily marooned, mentally cut off from things – or if they’re a part of this world, enmeshed in it through memory.

A few minutes later and you’re at a play-park, quizzing kids about what year it is. They all think your question is silly.
“What’s your name?” they ask.
“I don’t know.”
“Did you wipe yourself? My Dad does that when he gets sad sometimes. Were you sad?”
“I’m not sure. I hope not. I think I’m playing a game. Do you know what city this is?”
“London. I don’t understand this game-”
“-I LIKE TO REMEMBER EVERYTHING!,” blurts out another kid, before running up to the top of a slide and going down again. They hop off the bottom and run up to you. “The metal was cold but it was slippery and I went down really fast and because I was so fast there was wind and it meant there was air in my eyes. The metal at the bottom is very cold. I’m going to remember this forever,” they say, then they close their eyes and frown to themselves and you imagine them muttering to themselves internally remember remember remember. A memory rears up at you; you’re wearing pajamas sat on top of your bunk bed, staring at a shoe-box full of junk electronics, trying to assemble intelligence out of Lego. The door opens and — the memory fades back into glass. Your parent? Who?

As the red timer ticks down you wonder about what you’re going to find when it releases. What happens when the memories come back? And where will you be? You walk away from the park, head for higher ground, hope that when your life comes back to you you’ll be gazing over a city that you know in a park that is familiar with friends in the distance. You hope for these things because they seem likely, but you have no way to be sure. You wonder what happens if, instead of letting the timer run out, you press the “extend” button, playing out the amnesia a little longer. You press it. Denied, the screen says. Too Many Extends In One Session. Please seek NeuroAttention for Evaluation Following Closure of Sequence. How many times can you loop out of your own memory? You don’t remember. Close your eyes. Hold the tablet in your hand. Feel the wind on your face. Wait for yourself to become yourself again.

Technologies that inspired this story: whatever the current memory substrate within neural nets ends up being, brain-computer interfaces, recursion.