Import AI 146: Making art with CycleGANs; Google and ARM team-up on low-power ML; and deliberately designing AI for scary uses
by Jack Clark
Chinese researchers use AI to build an artist-cloning mind map system:
…Towards the endless, infinite AI artist…
AI researchers from JD and the Central Academy of Fine Arts in Beijing have built Mappa Mundi, software that lets people construct aesthetically pleasing ‘mind maps’ in the style of the artist Qiu Zhijie. The software was built to accompany an exhibition of Qiu’s work.
How it works: The system has three main elements: a speech recognition module which pulls key words from speech; a topic expansion system which takes these words and pulls in related concepts from a rule-based knowledge graph; and an image projection module which draws on any of 3,000 distinct painting elements to depict the key words. One clever twist: the system automatically creates visual ‘routes’ between different words by analyzing the distance between them in the knowledge graph and using that to generate visualizations.
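For intuition, here is a rough sketch of what such a keywords-to-painting pipeline could look like. Everything below — the toy knowledge graph, the element library, and the function names — is my own illustration of the idea, not the authors’ code:

```python
# Toy sketch of a Mappa Mundi-style pipeline: keywords -> topic expansion -> painting elements.
# The graph contents, element library, and scoring here are illustrative, not the paper's code.
import networkx as nx

# 1) A tiny rule-based knowledge graph standing in for the paper's concept graph.
graph = nx.Graph()
graph.add_edges_from([
    ("map", "journey"), ("journey", "mountain"), ("mountain", "temple"),
    ("map", "ocean"), ("ocean", "island"), ("island", "temple"),
])

# 2) A stand-in for the library of ~3,000 painting elements, keyed by concept.
painting_elements = {c: f"{c}_brushstroke.png" for c in graph.nodes}

def expand_topics(keywords, hops=1):
    """Pull in neighbouring concepts from the knowledge graph (topic expansion)."""
    expanded = set(keywords)
    for word in keywords:
        if word in graph:
            expanded.update(nx.single_source_shortest_path_length(graph, word, cutoff=hops))
    return expanded

def visual_route(src, dst):
    """Generate a visual 'route' between two concepts via their path in the graph."""
    if src in graph and dst in graph and nx.has_path(graph, src, dst):
        return nx.shortest_path(graph, src, dst)
    return [src, dst]

# 3) Project keywords (pretend these came from the speech recognition module) into elements.
keywords = ["map", "temple"]
concepts = expand_topics(keywords)
elements = [painting_elements[c] for c in concepts if c in painting_elements]
print("concepts:", concepts)
print("elements:", elements)
print("route:", visual_route("map", "temple"))
```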
A reflexive, reactive system: Mappa Mundi works in-tandem with human users, growing and changing its map according to their inputs. “The generated information, after being presented in our system, becomes the inspiration for artist’s next vocal input,” they write. “This artwork reflects both the development of artist’s thinking and the AI-enabled imagination”.
Why this matters: I’m forever fascinated by the ways in which AI can help us better interact with the world around us, and I think systems like ‘Mappa Mundi’ give us a way to interact with the idiosyncratic aesthetic space defined by another human.
Read more: Mappa Mundi: An Interactive Artistic Mind Map Generator with Artificial Imagination (Arxiv).
Read more about Qiu Zhijie (Center for Contemporary Art).
#####################################################
Using AI to simulate and see climate change:
…CycleGAN to the rescue…
In the future, climate change is likely to lead to catastrophic flooding around the world, drowning cities and farmland. How can we make this likely future feel tangible to people today? Researchers with the Montreal Institute for Learning Algorithms, ConscienceAI Labs, and Microsoft Research have created a system that can take in a Google Street View image of a house, then render an image showing what that house would look like under a predicted climate change future.
The resulting CycleGAN-based system does a decent job at rendering pictures of different houses under various flooding conditions, giving the viewer a more visceral sense of how climate change may influence where they live in the future.
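For readers curious about the underlying technique, here is a minimal PyTorch sketch of the cycle-consistency idea at the heart of CycleGAN, with tiny placeholder generators standing in for the real street-view-to-flooded-house networks. This is my own illustration of the general method, not the authors’ code:

```python
# Minimal PyTorch sketch of the CycleGAN cycle-consistency objective.
# G maps "normal street view" -> "flooded"; F maps back. Both are toy stand-ins.
import torch
import torch.nn as nn

def tiny_generator():
    # Placeholder generator: a real CycleGAN uses a ResNet/U-Net, not this.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
    )

G = tiny_generator()   # normal -> flooded
F = tiny_generator()   # flooded -> normal
l1 = nn.L1Loss()

normal = torch.rand(1, 3, 128, 128)    # stand-in for a Street View image
flooded = torch.rand(1, 3, 128, 128)   # stand-in for a real flooded-house image

# Cycle-consistency: translating there and back should recover the input.
cycle_loss = l1(F(G(normal)), normal) + l1(G(F(flooded)), flooded)

# In the full method this term is combined with adversarial losses from two
# discriminators; here we only show the cycle term.
cycle_loss.backward()
print(float(cycle_loss))
```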
Why this matters: I’m excited to see how we use the utility-class artistic capabilities of modern AI tools to simulate different versions of the world for people, and I think being able to easily visualize the effects of climate change may help make more people aware of how delicate the planet is.
Read more: Visualizing the Consequences of Climate Change Using Cycle-Consistent Adversarial Networks (Arxiv).
#####################################################
Google and ARM plan to merge low-power ML software projects:
…uTensor, meet TensorFlow Lite…
Google and chip designer ARM are planning to merge two open source frameworks for running machine learning systems on low-power ‘Arm’ chips. Specifically, uTensor is merging with Google’s ‘TensorFlow Lite’ software. The two organizations expect to work together to further increase the efficiency of running machine learning code on ARM chips.
Why this matters: As more and more people try to deploy AI to the ‘edge’ (phones, tablets, drones, etc), we need new low-power chips on which to run machine learning systems. We’ve got those chips in the form of processors from ARM and others, but we currently lack many of the programming tools needed to extract as much performance as possible from this hardware. Software co-development agreements, like the one announced by ARM and Google, help standardize this type of software, which will likely lead to more adoption.
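For a flavour of what running ML at the edge looks like in practice, here is a hedged sketch of the generic TensorFlow Lite conversion flow for a tiny Keras model. Note this is the standard TF Lite workflow, not the merged uTensor/TensorFlow Lite toolchain itself:

```python
# Sketch: convert a tiny Keras model to a TensorFlow Lite flatbuffer for edge devices.
import tensorflow as tf

# A deliberately tiny model, of the sort you might run on a microcontroller-class chip.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable post-training quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
print(f"wrote {len(tflite_model)} bytes")
```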
Read more: uTensor and Tensor Flow Announcement (Arm Mbed blog).
#####################################################
Microsoft wants your devices to record (and transcribe) your meetings:
…In the future, our phones and tablets will transcribe our meetings…
In the future, Microsoft thinks people attending the same meeting will take out their phones and tablets, and the electronic devices will smartly coordinate to transcribe the discussions taking place. That’s the gist of a new Microsoft research paper, which outlines a ‘Virtual Microphone Array’ made of “spatially distributed asynchronous recording devices such as laptops and mobile phones”.
Microsoft’s system can integrate audio from a bunch of devices spread throughout a room and use it to transcribe what is being said. The resulting system (trained on approximately 33,000 hours of in-house data) is more effective than single microphones at transcribing natural multi-speaker speech during the meeting; “there is a clear correlation between the number of microphones and the amount of improvement over the single channel system”, they write. The system struggles with overlapping speech, as you might expect.
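As a toy illustration of the general idea — and emphatically not Microsoft’s method, which uses far more sophisticated neural mask-based beamforming — the sketch below aligns two asynchronous recordings by cross-correlation and averages them into a single ‘virtual microphone’ signal:

```python
# Toy 'virtual microphone': align two asynchronous recordings by cross-correlation,
# then average them. Illustrative only; the paper's system is far more sophisticated.
import numpy as np

def align_and_mix(ref, other, sample_rate=16000, max_lag_s=0.5):
    """Shift `other` so it best lines up with `ref`, then average the two signals."""
    max_lag = int(max_lag_s * sample_rate)
    corr = np.correlate(ref, other, mode="full")
    lags = np.arange(-len(other) + 1, len(ref))
    mask = np.abs(lags) <= max_lag          # keep clock/start-time offsets bounded
    lag = lags[mask][np.argmax(corr[mask])]
    shifted = np.roll(other, lag)           # wrap-around is fine for a toy example
    n = min(len(ref), len(shifted))
    return 0.5 * (ref[:n] + shifted[:n])

# Fake data: the same 'speech' captured by two devices with a small time offset.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
device_a = speech + 0.05 * rng.standard_normal(16000)
device_b = np.roll(speech, 800) + 0.05 * rng.standard_normal(16000)  # ~50 ms late

virtual_mic = align_and_mix(device_a, device_b)
print(virtual_mic.shape)
```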
Why this matters: AI gives us the ability to approximate things, and research like this shows how the smart use of AI techniques can let us approximate the capabilities of dedicated microphones, piecing one virtual microphone together out of a disparate set of devices.
Read more: Meeting Transcription Using Virtual Microphone Arrays (Arxiv).
#####################################################
One language model, trained in three different ways:
…Microsoft’s Unified pre-trained Language Model (UNILM) is a 3-objectives-in-1 transformer…
Researchers with Microsoft have trained a single, big language model with three different objectives during training, yielding a system capable of a broad range of language modeling and generation tasks. They call their system the Unified pre-trained Language Model (UNILM) and say this approach has two advantages relative to single-objective training:
- Training against multiple objectives means UNILM is more like a 3-in-1 system, with different capabilities that can manifest for different tasks.
- Parameter sharing during joint training means the resulting language model is more robust as a consequence of being exposed to a variety of different tasks under different constraints.
The model can be used for natural language understanding and generation tasks and, like BERT and GPT, is based on a ‘Transformer’ component. During training, UNILM is given three distinct language modelling objectives: bidirectional (predicting words based on those on the left and right; useful for general language modeling tasks, used in BERT); unidirectional (predicting words based on those to the left; useful for language modeling and generation, used in GPT2); and sequence-to-sequence learning (mapping sequences of tokens to one another, subsequently used in ‘Google Smart Reply’).
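The trick is that all three objectives can share one Transformer and differ only in the self-attention mask applied during training. Here is a hedged NumPy sketch of what those masks look like for a toy source/target pair — my own illustration of the idea, not Microsoft’s code:

```python
# Sketch of the self-attention masks that distinguish UNILM's three training objectives.
# mask[i, j] = 1 means position i may attend to position j; 0 means blocked.
import numpy as np

src_len, tgt_len = 4, 3            # e.g. a 4-token source segment and a 3-token target
n = src_len + tgt_len

# Bidirectional (BERT-style): every token sees every token.
bidirectional = np.ones((n, n), dtype=int)

# Unidirectional (GPT-style): each token sees only itself and tokens to its left.
unidirectional = np.tril(np.ones((n, n), dtype=int))

# Sequence-to-sequence: source tokens see the whole source; target tokens see
# the whole source plus the already-generated part of the target.
seq2seq = np.zeros((n, n), dtype=int)
seq2seq[:src_len, :src_len] = 1                      # source attends within source
seq2seq[src_len:, :src_len] = 1                      # target attends to all of source
seq2seq[src_len:, src_len:] = np.tril(np.ones((tgt_len, tgt_len), dtype=int))

for name, mask in [("bidirectional", bidirectional),
                   ("unidirectional", unidirectional),
                   ("seq2seq", seq2seq)]:
    print(name)
    print(mask)
```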
Results: The trained UNILM system obtains state-of-the-art scores on summarization and question answering tasks, and also sets state-of-the-art on text generation tasks (including the delightful recursive task of learning to generate appropriate questions that map to certain answers). The model also obtains a state-of-the-art score on the multi-task ‘GLUE’ benchmark (though note GLUE has subsequently been replaced by ‘SuperGLUE’, as its creators think it is a little too easy).
Why this matters: Language modelling is undergoing a revolution as people adopt large, simple, scalable techniques to model and generate language. Papers like UNILM gesture towards a future where large models are trained with multiple distinct objectives over diverse datasets, creating utility-class systems that have a broad set of capabilities.
Read more: Unified Language Model Pre-training for Natural Language Understanding and Generation (Arxiv).
#####################################################
AI… for Bad!
…CHI workshop is an intriguing direction for AI research…
This week, some researchers gathered to prototype the ways in which their research could be used for evil. The workshop, ‘CHI4Evil: Creative Speculation on the Negative Effects of HCI Research’, was held at the ACM CHI Conference on Human Factors in Computing Systems, and was designed to investigate various ideas in HCI through the lens of designing deliberately bad or undesirable systems.
Why this matters: Prototyping the potential harms of technology can be pretty useful for calibrating thinking about threats and opportunities (see: GPT-2), and thinking about such harms through the lens of human-computer interaction (HCI, or CHI) feels likely to yield new insights. I’m excited for future “AI for Bad” conferences (and would be interested to co-organize one with others, if there’s interest).
Read more: CHI4EVIL website.
#####################################################
Facial recognition is a hot area for venture capitalists:
…Chinese start-up Megvii raises mucho-moolah…
Megvii, a Chinese computer vision startup known by some as ‘Face++’, has raised $750 million in a funding round. Backers include the Bank of China Group Investment Ltd; a subsidiary of the Abu Dhabi Investment Authority; and Alibaba Group. The company plans to IPO soon.
Why this matters: China is home to numerous large-scale applications of AI for use in surveillance, and is also exporting surveillance technologies via its ‘One Belt, One Road’ initiative (which frequently pairs infrastructure investment with surveillance).
This is an area fraught with both risks and opportunities – the risks are that we sleepwalk into building surveillance societies using AI, and the opportunities are that (judiciously applied) surveillance technologies can sometimes increase public safety, given the right oversight. I think we’ll see numerous Chinese startups push the boundaries of what is thought to be possible/deployable here, so watching companies like Megvii feels like a leading indicator for what happens when you combine surveillance+society+capitalism.
Read more: Chinese AI start-up Megvii raises $750 million ahead of planned HK IPO (Reuters).
#####################################################
Chatbot company builds large-scale AI system, doesn’t fully release it:
…Startup Hugging Face restricts release of larger versions of some models following ethical concerns…
NLP company Hugging Face has released a demo, tutorial, and open-source code for creating a conversational AI based on OpenAI’s Transformer-based ‘GPT2’ system.
Ethics in action: The company said it decided not to release the full GPT2 model for ethical reasons – it thought the technology had a high chance of being used to improve spam-bots, or to perform “mass catfishing and identity fraud”. “We are aligning ourselves with OpenAI in not releasing a bigger model until they do,” the organization wrote.
Read more: Ethical analysis of the open-sourcing of a state-of-the-art conversational AI (Medium).
Read more about Hugging Face here (official website).
#####################################################
Tech Tales
The Evolution Game
They built the game so it could run on anything, which meant they had to design it differently to other games. Most games have a floor on their performance – some basic set of requirements below which you can’t expect to play. But not this game. Neverender can run on your toaster, or fridge, or watch, and it can just as easily run on your home cinema, or custom computer, and so on. Both the graphics and gameplay change depending on what it is running on – I’ve spent hours stood fiddling with the electronic buttons on my oven, using them to move a small character across a simplified Neverender gameboard, and I’ve also spent hours in my living room navigating a richly-rendered on screen character through a lush, Salvador Dali-esque horrorworld. I’m what some people call a Neverheader, or what others call a Nevernut. If you didn’t know anything about the game, you’d probably call me a superfan.
So I guess that’s why I got the call when Neverender started to go sideways. They brought me in and asked me to play it and I said “what else?”
“Just play it,” they said.
So I did. I sat in my living room surrounded by a bunch of people in suits and I played the game. I navigated my character past the weeping lands and up into eldritch keep and beyond, to the deserts of dream. But when I got to the deserts they were different: the sand dunes had grown in size, and some of them now hosted cave entrances. Strange lights shot out of them. I went into one and was killed almost instantly by a beam of light that caused more damage than all the weapons in my inventory combined. After I was reborn at the spawn point I proceeded more carefully, skirting these light-spewing entrances, and trying to walk further across the sand plains to whatever lay beyond.
The game is thinking, they tell me. In the same way Neverender was built to run on anything, its developers recently rolled out a patch that let it use anything. Now, all the game clients are integrated with the game engine and backend simulation environment, sharing computational resources with each other. Mostly, it’s leading to better games and more entertained players. But in some parts of the gameworld, things are changing that should not be changing: larger sand dunes with subterranean cities inside themselves? Wonderful! That’s the sort of thing the developers had hoped for. But having the caves be controlled by beams of light of such power that no one can go and play within them? That’s a lot less good, and something which no one had expected.
My official title now is “Explorer”, but I feel more like a Spy. I take my character and I run out into the edges of the maps of Neverender, and usually I find areas where the game is modifying itself or growing itself in some way. The code is evolving. One day we turned off the local sandbox systems, letting Neverender deploy more code, deeper into my home system. As I played the game the lights began to flicker, and when I was in a cave I discovered some treasure and the game automatically fired up some servers which we later discovered it was using to do high-fidelity modelling of the treasure.
The question we all ask ourselves, now, is whether the things Neverender is building within itself are extensions of the game, or extensions of the mind behind the game. We hear increasing reports of ‘ghosts’ seen across the game universe, and of intermittent cases of ‘kitchen appliance sentience’ in the homes of advanced players. We’ve even been told by some that this is all a large marketing campaign, and any danger we think is plausible is just a consequence of us having over-active imaginations. Nonetheless, we click and play and explore.
Things that inspired this story: Endless RPGs; loot; games that look like supercomputers such as Eve Online; distributed computation; relativistic ideas deployed on slower timescales.