Import AI: 123: Facebook sees demand for deep learning services in its data centers grow by 3.5X; why advanced AI might require a global police force; and diagnosing natural disasters with deep learning

by Jack Clark

#GAN_Paint: Learn to paint with an AI system:
…Generating pictures out of neuron activations – a new, AI-infused photoshop filter…
MIT researchers have figured out how to extract more information from trained generative adversarial networks, letting them identify individual ‘neurons’ in the network that correlate with specific visual concepts. They’ve built a website that lets anyone learn to paint with these systems. The effect is akin to having a competent, ultra-fast painter standing by your shoulder: you broadly spray-paint an area where you’d like, for instance, some sky, and the software activates the relevant ‘neuron’ in the GAN model and uses that to paint an image for you.
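  Here is a minimal sketch of the underlying trick, under some loud assumptions: the `forward_to`/`forward_from` helpers (which run a generator up to and from a given layer) and the unit indices are hypothetical, and the real tool identifies concept-correlated units via the dissection method in the paper rather than by hand:

```python
# Hedged sketch of the GAN Dissection / GANPaint idea (not the authors' code):
# boost the feature-map units that correlate with a concept (e.g. "tree")
# inside a user-painted region, then let the rest of the generator render it.
import torch

def paint_concept(generator, z, layer, unit_ids, region_mask, strength=10.0):
    """generator: a trained GAN generator exposing hypothetical helpers
    forward_to(z, layer) -> [1, C, H, W] activations and
    forward_from(feats, layer) -> image.
    region_mask: [H, W] tensor, 1.0 where the user sprayed the concept."""
    feats = generator.forward_to(z, layer)        # run the first half of the net
    for u in unit_ids:                            # units found to encode the concept
        feats[:, u] += strength * region_mask     # activate them only in the region
    return generator.forward_from(feats, layer)   # finish rendering the image
```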
  Why it matters: Demos like this give a broader set of people a more natural way to interact with contemporary AI research, and help us develop intuitions about how the technology behaves.
  Paint with an AI yourself here: GANpaint (MIT-IBM Watson AI Lab website).
  Read more about the research here: GAN Dissection: Visualizing and Understanding Generative Adversarial Networks (MIT CSAIL).

DeepMind says the future of AI safety is all about agents that learn their own reward functions:
…History shows that human-specified reward functions are brittle and prone to creating agents with unsafe behaviors…
Researchers with DeepMind have laid out a long-term strategy for creating AI agents that do what humans want in complex domains where it is difficult for humans to construct an appropriate reward function.
  The basic idea here is that, to create safe AI agents, we want agents that collect information from the (typically human) user, use it to learn a reward function, and then optimize that learned reward function with reinforcement learning. The nice thing about this approach, according to DeepMind, is that it should work for agents that have the potential to become smarter than humans: “agents trained with reward modeling can assist the user in the evaluation process when training the next agent”.
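  To make the loop concrete, here is a minimal sketch of one common way to learn a reward function from user feedback (pairwise preference comparisons with a Bradley-Terry style loss). This is an illustration of the general recipe rather than DeepMind's implementation, and all names in it are made up:

```python
# Minimal sketch: learn a reward model from pairwise human preferences, then
# use it in place of the environment reward inside an ordinary RL loop.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, trajectory):            # trajectory: [T, obs_dim]
        return self.net(trajectory).sum()     # summed per-step predicted reward

def preference_loss(reward_model, traj_a, traj_b, human_prefers_a):
    """Cross-entropy loss on which of two trajectory segments the user preferred."""
    r_a, r_b = reward_model(traj_a), reward_model(traj_b)
    logits = torch.stack([r_a, r_b])
    target = torch.tensor(0 if human_prefers_a else 1)
    return nn.functional.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))

# The trained reward_model then stands in for the environment's reward signal
# when optimizing the agent's policy with reinforcement learning.
```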
  A long-term alignment strategy: DeepMind thinks that this approach potentially has three properties that give it a chance of being adopted by researchers: it is scalable, it is economical, and it is pragmatic.
  Next steps: The researchers say these ideas are “shovel-ready for empirical research today”. The company believes that “deep RL is a particularly promising technique for solving real-world problems. However, in order to unlock its potential, we need to train agents in the absence of well-specified reward functions.” This research agenda sketches out ways to do that.
  Challenges: Reward modeling has a few challenges: the amount of feedback (how much data needs to be collected for the agent to successfully learn the reward function); the distribution of feedback (the agent visits new states that lead it to perceive a higher reward for actions that are in reality sub-optimal); reward hacking (the agent finds a way to exploit the task to give itself reward, learning a function that does not reflect the implicitly expressed wishes of the user); unacceptable outcomes (taking actions that a human would likely never approve of, such as an industrial robot breaking its own hardware to achieve a task, or a personal assistant automatically writing a very rude email); and the reward-result gap (the gap between the optimal reward model and the reward function actually learned by the agent). DeepMind thinks that each of these challenges can potentially be dealt with by specific technical approaches, and today there exist several distinct ways to tackle each of them, which seems to increase the chance of one working out satisfactorily.
  Why it might matter: Human empowerment: Putting aside the general utility of having AI agents that can learn to do difficult things in hard domains without inflicting harm on humans, this research agenda also implies something that isn’t directly discussed in the paper: it offers a way to empower humans with AI. If AI systems continue to scale in capability, then it seems likely that within a matter of decades we will fill society with very large AI systems which large numbers of people interact with. We can see the initial outlines of this today in the form of large-scale surveillance systems being deployed in countries like China; in self-driving car fleets being rolled out in increasing numbers in places like Phoenix, Arizona (via Google Waymo); and so on. I wonder what it might be like if we could figure out a way to maximize the number of people in society engaged in training AI agents via expressing preferences. After all, the central mandate of many of the world’s political systems comes from people regularly expressing their preferences via voting (and, yes, these systems are a bit rickety and unstable at the moment, but I’m a bit of an optimist here). Could we better align society with increasingly powerful AI systems by more deeply integrating a wider subset of society into the training and development of AI systems?
  Read more: Scalable agent alignment via reward modeling: a research direction (Arxiv).

Global police, global government likely necessary to ensure stability from powerful AI, says Bostrom:
…If it turns out we’re playing with a rigged slot machine, then how do we make ourselves safe?…
Nick Bostrom, researcher and author of Superintelligence (which influenced the thinking of a large number of people with regard to AI), has published new research in which he tries to figure out what problems policymakers might encounter if it turns out planet Earth is a “vulnerable world”; that is, a world in which “there is some level of technological development at which civilization almost certainly gets devastated by default”.
  Bostrom’s analysis likens the process of technological development to a person or group of people steadily withdrawing balls from a vase. Most balls are white (beneficial, eg medicines), while some are of various shades of gray (for instance, technologies that can equally power industry or warmaking). What Bostrom’s Vulnerable World Hypothesis paper worries about is whether we could at some point withdraw a “black ball” from the vase. This would be “a technology that invariably or by default destroys the civilization that invents it. The reason is not that we have been particularly careful or wise in our technology policy. We have just been lucky.”
  In this research, Bostrom creates a framework for thinking about the different types of risks that such balls could embody, and outlines some ideas for potential (extreme!) policy responses to allow civilization to prepare for such a black ball.
  Types of risks: To help us think about these black balls, Bostrom lays out a few different types of civilization vulnerability that could be stressed by such technologies.
  Type-1 (“easy nukes”): “There is some technology which is so destructive and so easy to use that, given the semi-anarchic default condition, the actions of actors in the apocalyptic residual make civilizational devastation extremely likely”.
  Type-2a (“safe first strike”): “There is some level of technology at which powerful actors have the ability to produce civilization-devastating harms and, in the semi-anarchic default condition, face incentives to use that ability”.
  Type-2b (“worse global warming”): “There is some level of technology at which, in the semi-anarchic default condition, a great many actors face incentives to take some slightly damaging action such that the combined effect of those actions is civilizational devastation”.
  Type-0: “There is some level of technology that carries a hidden risk such that the default outcome when it is discovered is inadvertent civilizational devastation”.
  Policy responses for a risky world: bad ideas: How could we make a world with any of these vulnerabilities safe and stable? Bostrom initially considers four options then puts aside two as being unlikely to yield sufficient stability to be worth pursuing. These discarded ideas are to: restrict technological development, and “ensure that there does not exist a large population of actors representing a wide and recognizably human distribution of motives” (aka, brainwashing).
  Policy responses for a risky world: good ideas: There are potentially two types of policy response that Bostrom says could increase the safety and stability of the world: adopting “Preventive policing” (which he also gives the deliberately inflammatory nickname “High-tech Panopticon”), and “global governance”. Both of these policy approaches are challenging. Preventive policing would require all states to be able to “monitor their citizens closely enough to allow them to intercept anybody who begins preparing an act of mass destruction”. Global governance is necessary because states will need “to extremely reliably suppress activities that are very strongly disapproved of by a very large supermajority of the population (and of power-weighted domestic stakeholders)”, Bostrom writes.
  Why it matters: Work like this grapples with one of the essential problems of AI research: are we developing a technology so powerful that it can fundamentally alter the landscape of technological risk, even more so than the discovery of nuclear fission? It seems unlikely that today’s AI systems fit this description, but it does seem plausible that future AI technologies could. What will we do, then? “Perhaps the reason why the world has failed to eliminate the risk of nuclear war is that the risk was insufficiently great? Had the risk been higher, one could eupeptically argue, then the necessary will to solve the global governance problem would have been found,” Bostrom writes.
  Read more: The Vulnerable World Hypothesis (Nick Bostrom’s website).

Facebook sees deep learning demand in its data centers grow by 3.5X in 3 years:
…What Facebook’s workloads look like today and what they might look like in the future…
A team of researchers from Facebook have tried to characterize the types of deep learning inference workloads running in the company’s data centers and predict how this might influence the way Facebook designs its infrastructure in the future.
  Hardware for AI data centers: So what kind of hardware might an AI-first data center need? Facebook believes servers should be built with the following concerns in mind: high memory bandwidth and capacity for embeddings; support for powerful matrix and vector engines; large on-chip memory for inference with small batches; support for half-precision floating-point computation.
  Inference, what is it good for? Facebook has the following major use cases for AI in its data centers: providing personalized feeds, ranking, or recommendations; content understanding; and visual and natural language understanding.
  Facebook expects these workloads to evolve in the future: for recommenders, it suspects it will start to incorporate time into event-probability models, and imagines using larger embeddings in its models, which will increase their memory demands; for computer vision, it expects to do more transfer learning via fine-tuning pre-trained models on specific datasets, as well as exploring more convolution types, different batch sizes, and higher image resolutions to increase accuracy; for language, it expects to explore larger batch sizes, evaluate new types of model (such as the Transformer), and move to deploying larger multilingual models.
  Data-center workloads: The deep learning applications in Facebook’s data centers “have diverse compute patterns where matrices do not necessarily have “nice” square shapes. There are also many “long-tail” operators other than fully connected and convolutional layers. Therefore, in addition to matrix multiplication engines, hardware designers should consider general and powerful vector engines,” the researchers write.
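  For a feel of why memory bandwidth and capacity matter so much here, consider the embedding-lookup pattern at the heart of recommendation models: a sparse gather-and-pool over a huge table, with almost no arithmetic per byte moved. The sketch below is illustrative only (the table size and pooling choice are made up), not Facebook’s code:

```python
# Illustrative sketch of the embedding-bag pattern that dominates
# recommendation inference: gather a few rows scattered across a large table
# and pool them. Real production tables are orders of magnitude larger.
import numpy as np

NUM_ROWS, DIM = 100_000, 64                       # toy embedding table
table = np.random.rand(NUM_ROWS, DIM).astype(np.float32)

def embedding_bag(indices):
    """Sum-pool a handful of embedding rows; the random-access gather makes
    this memory-bandwidth-bound rather than FLOP-bound."""
    return table[indices].sum(axis=0)

pooled = embedding_bag(np.array([17, 4_231, 99_998]))   # one sparse feature's IDs
```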
  Why it matters: Papers like this give us a sense of all the finicky work required to deploy deep learning applications at scale, and indicates how computer design is going to change as a consequence of these workload demands. “Co-designing DL inference hardware for current and future DL models is an important but challenging problem,” the Facebook researchers write.
  Read more: Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications (Arxiv).

In the future, drones will heal the land following forest fires:
…Startup DroneSeed uses large drones + AI to create reforestation engines…
TechCrunch has written a lengthy profile of DroneSeed, a startup that is using drones and AI to create systems that can reforest areas after wildfires.
  DroneSeed’s machines have “multispectral camera arrays, high-end lidar, six gallon tanks of herbicide and proprietary seed dispersal mechanisms,” according to TechCrunch. The drones can be used to map areas that have recently been burned up in forest fires, then can autonomously identify the areas where trees have a good chance to grow and can deliver seed-nutrient packages to those areas.
  Why it matters: I think we’re at the very beginning of exploring all the ways in which drones can be applied to nature and wildlife maintenance and enrichment, and examples like this feel like tantalizing prototypes of a future where we use drones to perform thousands of distinct civic services.
  Read more here: That night, a forest flew (TechCrunch).
  Check out DroneSeed’s Twitter account here.

Learning to diagnose natural disaster damage, with deep learning:
…Facebook & CrowdAI research shows how to automate the analysis of natural disasters…
Researchers with satellite imagery startup CrowdAI and Facebook have shown how to use convolutional neural networks to automatically assess damage to urban areas from natural disasters. In a paper submitted to the “AI for Social Good” workshop at NeurIPS 2018 (a prominent AI conference, formerly named NIPS), the team “propose to identify disaster-impacted areas by comparing the change in man-made features extracted from satellite imagery. Using a pre-trained semantic segmentation model we extract man-made features (e.g. roads, buildings) on the before and after imagery of the disaster affected area. Then, we compute the difference of the two segmentation masks to identify change.”
  Disaster Impact Index (DII): How do you measure the effect of a disaster? The researchers propose DII, which lets them calculate the semantic change that has occurred in different parts of satellite images, given before and after imagery of the same area. To test their approach they use large-scale satellite imagery datasets of land damaged by Hurricane Harvey and by fires near Santa Rosa. They show that they can use DII to automatically infer severely flooded and fire-damaged areas in both cases, with F1 scores of roughly 0.8.
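  A minimal sketch of the mask-differencing idea is below; the exact DII formulation in the paper may differ, and the masks are assumed to come from a pre-trained segmentation model run on co-registered before/after tiles:

```python
# Hedged sketch of the mask-differencing idea behind the Disaster Impact Index:
# measure the fraction of man-made pixels (roads/buildings) that disappear
# between the "before" and "after" segmentation masks.
import numpy as np

def disaster_impact(before_mask, after_mask):
    """before_mask / after_mask: boolean arrays marking man-made features,
    produced by a pre-trained semantic segmentation model on co-registered
    before/after satellite tiles."""
    lost = before_mask & ~after_mask          # features present before, gone after
    denom = before_mask.sum()
    return lost.sum() / denom if denom else 0.0

# Tiles with a high score get flagged as likely flood or fire damage.
```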
  Why it matters: Deep learning-based techniques are making it cheaper and easier for people to train specific detectors over satellite imagery, altering the number of actors in the world who can experiment with surveillance technologies for both humanitarian purposes (as described here) and likely military ones as well. I think within half a decade it’s likely that governments could be tapping data feeds from large satellite fleets then using AI techniques to automatically diagnose damage from an ever-increasing number of disasters created by the chaos dividend of climate change.
  Read the paper: From Satellite Imagery to Disaster Insights (Facebook Research).

Deep learning for medical applications takes less data than you think:
…Stanford study suggests tens of thousands of images are sufficient for medical applications…
Stanford University researchers have shown that it takes a surprisingly small amount of data to teach neural networks how to automatically categorize chest radiographs. The researchers trained AlexNet, ResNet-18, and DenseNet-121 baselines to classify normal versus abnormal images. In tests, they show that it is possible to obtain an area under the receiver operating characteristic curve (AUC) of 0.95 for a CNN model trained on 20,000 images, versus 0.96 for one trained on 200,000 images, suggesting that it may take less data than previously assumed to train effective AI medical classification tools. (By comparison, 2,000 images yields an AUC of 0.84, representing a significant accuracy penalty.)
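  For context, the baselines in question are standard image classifiers fine-tuned for a binary label; a hedged sketch of that kind of setup (not the paper’s code, and the hyperparameters are made up) looks like this:

```python
# Illustrative sketch: fine-tune an ImageNet-pretrained DenseNet-121 to label
# chest radiographs as normal vs. abnormal.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.densenet121(pretrained=True)
model.classifier = nn.Linear(model.classifier.in_features, 2)   # normal / abnormal

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """images: [B, 3, 224, 224] radiographs replicated to 3 channels;
    labels: [B] with 0 = normal, 1 = abnormal."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```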
  Data scaling and medical imagery: “While carefully adjudicated image labels are necessary for evaluation purposes, prospectively labeled single-annotator data sets of a scale modest enough (approximately 20,000 samples) to be available to many institutions are sufficient to train high-performance classifiers for this task,” the researchers write.
  Drawbacks: All the data used in this study was drawn from the same medical institution, so it’s possible that the data (or, plausibly, the patients) contain specific idiosyncrasies that mean networks trained on this dataset might not generalize to imagery captured by other medical institutions.
  Why it matters: Studies like this show how today’s AI techniques are beginning to show good enough performance in clinical contexts that they will soon be deployed alongside doctors to make them more effective. It’ll be interesting to see whether the use of such technology can make healthcare more effective (healthcare is one of the rare industries where the addition of new technology frequently leads to cost increases rather than cost savings).
  Some kind of future: In an editorial published alongside the paper, Bram van Ginneken, from the Department of Radiology and Nuclear Medicine at Radboud University in the Netherlands, wonders if we could in the future create a large, shared system that multiple institutions could use. Such a system “would benefit from training on a multicenter data set much larger than 20,000 or even 200,000 examinations. This larger size is needed to capture the diversity of data from different centers and to ensure that there are enough examples of relatively rare abnormal findings so that the network learns not to miss them,” he writes. “Such a large-scale system should be based on newly designed network architectures that take the full-resolution images as input. It would be advisable to train systems not only to provide a binary output label but also to detect specific regions in the images with specific abnormalities.”
  Read more: Assessment of Convolutional Neural Networks for Automated Classification of Chest Radiographs (Jared Dunnmon Github / Radiology, PDF).
  Read the editorial: Deep Learning for Triage of Chest Radiographs: Should Every Institution Train Its Own System? (Jared Dunnmon Github / Radiology, PDF).

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net

Amnesty and employees petition Google to end Project Dragonfly:
Google employees have published an open letter calling on Google to cancel Dragonfly, its censored search engine being developed for use within China. This follows similar calls by human rights organizations including Amnesty International for the company to suspend the project. The letter accuses the company of developing technologies that “aid the powerful in oppressing the vulnerable”, and of being complicit in the Chinese government’s surveillance programs and human rights abuses.
  Speaking in October about Dragonfly, CEO Sundar Pichai emphasized the need to balance Google’s values with the laws of countries in which they operate, and their core mission of providing information to everyone. Pichai will be testifying to the House Judiciary Committee in US Congress later this week.
  There are clear similarities between these protests and those over Project Maven earlier this year, which resulted in Google withdrawing from the controversial Pentagon contract, and establishing a set of AI principles.
  Read more: We are Google employees. Google must drop Dragonfly (Medium).
  Read more: Google must not capitulate to China’s censorship demands (Amnesty).

High-reliability organizations:
…Want to deploy safe, robust AI? You better make sure you have organizational processes as good as your technology…
As technologies become more powerful, risks from catastrophic errors increase. This is true for advanced AI, even in near-term use cases such as autonomous vehicles or face recognition. A key determinant of these risks will be the organizational environment through which AI is deployed. New research from Thomas Dietterich at Oregon State University applies insights from research into ‘high-reliability organizations’ to derive three lessons for the design of robust and safe human-AI systems.

  1. We should aim to create combined human-AI systems that become high-reliability organizations, e.g. by proactively monitoring the behaviour of human and AI elements, continuously modelling and minimizing risk, and supporting combined human-AI cooperation and planning.
  2. AI technology should not be deployed when it is impossible for surrounding human organizations to be highly reliable. For example, proposals to integrate face recognition with police body-cams in the US are problematic insofar as it is hard to imagine how to remove the risk of catastrophic errors from false positives, particularly in armed confrontations.
  3. AI systems should continuously monitor human organizations to check for threats to high-reliability. We should leverage AI to reduce human error and oversight, and empower systems to take corrective actions.

  Why this matters: Advanced AI technologies are already being deployed in settings with significant risks from error (e.g. medicine, justice), and the magnitude of these risks will increase as technologies become more powerful. There is an existing body of research into designing complex systems to minimize error risk, e.g. in nuclear facilities, that is relevant to thinking about AI deployment.
  Read more: Robust AI and robust human organizations (arXiv).

Efforts for autonomous weapons treaty stall:
The annual meeting of the Convention on Conventional Weapons (CCW) has concluded without a clear path towards an international treaty on lethal autonomous weapons. Five countries (Russia, US, Israel, Australia and South Korea) expressed their opposition to a new treaty. Russia successfully reduced the scheduled meetings for 2019 from 10 to 7 days, in what appears to be an effort to decrease the likelihood of progress towards an agreement.
  Read more: Handful of countries hamper discussion to ban killer robots at UN (FLI).

Tech Tales:

Wetware Timeshare

It’s never too hard to spot The Renters – you’ll find them clustered near reflective surfaces staring deeply into their own reflected eyes, or you’ll notice a crowd standing at the edge of a water fountain, periodically holding their arms out over the spray and methodically turning their limbs until they’re soaked through; or you’ll see one of them make their way round a buffet at a restaurant, taking precisely one piece from every available type of food.

The deal goes like this: run out of money? Have no options? No history of major cognitive damage in your family? Have the implant? If so, then you can rent your brain to a superintelligence. The market got going a few years ago, after we started letting the robots operate in our financial markets. Well, it turns out that despite all of our innovation in silicon, human brains are still amazingly powerful and, coupled with perceptive capabilities and the very expensive multi-million-years-of-evolution physical substrate, are an attractive “platform” for some of the artificial minds to offload processing tasks to.

Of course, you can set preferences: I want to be fully clothed at all times, I don’t want to have the machine speak through my voice, I would like to stay indoors, etc. Obviously setting these preferences can reduce the value of a given brain in the market, but that’s the choice of the human. If a machine bids on you then you can choose to accept the bid and if you do that it’s kind of like instant-anesthetic. Some people say they don’t feel anything but I always feel a little itch in the vein that runs up my neck. You’ll come around a few hours (or, for rare high-paying jobs, days) later and you’re typically in the place you started out (though some people have been known to come to on sailing ships, or in patches of wilderness, or in shopping malls holding bags and bags of goods bought by the AI).

Oh sure there are protests. And religious groups hate it as you can imagine. But people volunteer for it all the time: some people do it just for the escape value, not for the money. The machines always thank any of the people they have rented brain capacity from, and their compliments might shed some light on what they’re doing with all of us: Thankyou subject 478783 we have improved our ability to understand the interaction of light and reflective surfaces; Thankyou subject 382148 we now know the appropriate skin:friction setting for the effective removal of old skin flakes; Thankyou subject 128349 we know now what it feels like to run to exhaustion; Thankyou subject 18283 we have seen sunrise through human eyes, and so on.

The machines tell us that they’re close to developing a technology that can let us rent access to their brains. “Step Into Our World: You’ll Be Surprised!” reads some of the early marketing materials.

Things that inspired this story: Brain-computer interfaces; the AI systems in Iain M Banks books.