Import AI: #105: Why researchers should explore the potential negative effects of their work; fusing deep learning with classical planning for better robots; and who needs adversarial examples when a blur will do?

by Jack Clark

Computer scientist calls for researchers to discuss downsides of work, as well as upsides:
…Interview with Brent Hecht, chair of the Association for Computing Machinery (ACM)’s Future of Computing Academy, which said in March that researchers should list downsides of their work…
One of the repeated problems AI researchers deal with is the omni-use nature of the technology: a system designed to recognize a wide variety of people in different poses and scenes can also be used to surveil people; auto-navigation systems for disaster response can be repurposed for weaponizing consumer platforms; systems to read lips and thereby improve the quality of life of people with hearing and/or speech difficulties can also be used to surreptitiously analyze people in the wild; and so on.
  Recently, the omni-use nature of this tech has been highlighted as companies like Amazon develop facial recognition tools that are subsequently used by the police, and as Google uses computer vision techniques to develop systems for the DoD’s ‘Maven’ program. What can companies and researchers do to increase the positive effects of their research and minimize some of the downsides? Computer science professor Brent Hecht says in an interview with Nature that scientists should consider changing the peer-review process to encourage researchers to discuss the potential for abuse of their work.
“In the past few years, there’s been a sea-change in how the public views the real-world impacts of computer science, which doesn’t align with how many in the computing community view our work,” he says. “A sizeable population in computer science thinks that this is not our problem. But while that perspective was common ten years ago, I hear it less and less these days.”
  Why it matters: “Disclosing negative impacts is not just an end in itself, but a public statement of new problems that need to be solved,” he says. “We need to bend the incentives in computer science towards making the net impact of innovations positive.”
  Read more: The ethics of computer science: this researcher has a controversial proposal (Nature).

Sponsored: The AI Conference – San Francisco, Sept 4–7:
…Join the leading minds in AI, including Kai-Fu Lee, Meredith Whittaker, Peter Norvig, Dave Patterson, and Matt Wood. No other conference combines this depth of technical expertise with a laser focus on how to apply AI in your products and in your business today.
…Register soon. Last year this event sold out; training courses and tutorials are filling up fast. Save an extra 20% on most passes with code IMPORTAI20.

Worried about adversarial examples and self-driving cars? You should really be worried about blurry images:
…Very basic corruptions to images can cause significant accuracy drops, research shows…
Researchers with the National Robotics Engineering Center and the Electrical and Computer Engineering Department at CMU have shown that simply applying basic image degradations that blur images, or add haze to them, leads to significant performance issues. “We show cases where performance drops catastrophically in response to barely perceptible changes,” writes researcher Phil Koopman in a blog post that explains the research. “You don’t need adversarial attacks to foil machine learning-based perception – straightforward image degradations such as blur or haze can cause problems too”.
  Testing: The researchers test detectors built on three different architectures (Faster R-CNN, Single Shot Detector (SSD), and Region-based Fully Convolutional Network (R-FCN)), pairing these architectures with a variety of feature extractors, like Inception or MobileNet. They evaluate these algorithms on the NREC ‘Agricultural Person Detection Dataset’. The researchers apply two types of mutation to the images: “simple” mutators, which modify the image directly, and “contextual” mutators, which mutate the image while adding additional information. For the “simple” mutations they apply basic image transformations, like Gaussian blur, JPEG compression, the addition of salt-and-pepper noise, and so on. For the “contextual” mutations they apply effects like haze to the image.
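  Code sketch: As an illustration of what these “simple” mutators look like in practice, here is a minimal Python sketch using Pillow and NumPy; the function names and parameter values are my own illustrative choices, not the paper’s code.
```python
# Illustrative sketch of "simple" image mutators like those described above
# (Gaussian blur, JPEG compression, salt-and-pepper noise). Names and
# parameter choices are assumptions, not taken from the paper.
import io
import numpy as np
from PIL import Image, ImageFilter

def gaussian_blur(img: Image.Image, radius: float = 2.0) -> Image.Image:
    """Blur the image with a Gaussian kernel of the given radius."""
    return img.filter(ImageFilter.GaussianBlur(radius))

def jpeg_compress(img: Image.Image, quality: int = 10) -> Image.Image:
    """Round-trip the image through lossy JPEG encoding at low quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def salt_and_pepper(img: Image.Image, frac: float = 0.02) -> Image.Image:
    """Set a random fraction of pixels to pure black or pure white."""
    arr = np.array(img)
    mask = np.random.rand(*arr.shape[:2])
    arr[mask < frac / 2] = 0          # pepper
    arr[mask > 1 - frac / 2] = 255    # salt
    return Image.fromarray(arr)

if __name__ == "__main__":
    img = Image.open("example.jpg").convert("RGB")  # hypothetical input file
    for name, fn in [("blur", gaussian_blur),
                     ("jpeg", jpeg_compress),
                     ("saltpepper", salt_and_pepper)]:
        fn(img).save(f"example_{name}.jpg")
```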
  Results: In tests, the researchers show that very few detectors are immune to the effects of these perturbations, with results indicating that Single Shot Detectors (SSDs) have the most trouble dealing with these relatively minor tweaks. One point of interest is that some of the systems that are resilient to these mutations are resilient to quite a few of them quite consistently – the presence of these patterns shows “generalized robustness trends”, which may serve as signposts for future researchers seeking to further evaluate generalization.
  Read more: Putting image manipulations in context: robustness testing for safe perception (Safe Autonomy / Phil Koopman blogspot).
  Read more: Putting Image Manipulations in Context: Robustness Testing for Safe Perception (PDF).

Researchers count on blobs to solve counting problems:
…Segmenting objects may be hard, but placing dots on them may be easy…
Precisely counting objects in scenes, like the number of cars on a road or people walking through a city, is a task that challenges both humans and machines. Researchers are training object counters with point annotations (a single dot placed on each entity) rather than the pixel-level segmentation masks or bounding boxes that are typically used. “We propose a novel loss function that encourages the model to output instance regions such that each region contains a single object instance (i.e. a single point-level annotation),” they explain. This tweak significantly improves performance relative to other baselines based on segmentation and depth. They evaluate their approach on diverse datasets, including images of parking lots, images taken by traffic cameras, images of penguins, PASCAL VOC 2007, a surveillance dataset called MIT Traffic, and several crowd-counting datasets.
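  Code sketch: To give a rough sense of what point-level supervision looks like, here is a toy PyTorch sketch of two loss terms in the spirit of the paper, an image-level term and a point-level term; the paper’s full objective also contains split-level and false-positive terms, and the names, shapes, and details below are my own simplification rather than the authors’ code.
```python
# Toy sketch (not the authors' code) of point-supervised losses: an
# image-level term and a point-level term. The paper's full loss also has
# split-level and false-positive terms, omitted here.
import torch
import torch.nn.functional as F

def point_supervision_loss(logits: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
    """
    logits: (C, H, W) per-pixel class scores, with class 0 = background.
    points: (H, W) long tensor; 0 for unlabeled pixels, c > 0 where an object
            of class c has been annotated with a single dot.
    """
    log_probs = F.log_softmax(logits, dim=0)  # (C, H, W)
    loss = logits.new_zeros(())

    # Image-level term: every class with at least one dot should be predicted
    # somewhere in the image; other non-background classes should not be.
    present = torch.unique(points[points > 0])
    for c in range(1, logits.shape[0]):
        max_log_p = log_probs[c].max()
        if c in present:
            loss = loss - max_log_p
        else:
            loss = loss - torch.log1p(-max_log_p.exp().clamp(max=1 - 1e-6))

    # Point-level term: each annotated pixel must be predicted as its class.
    ys, xs = torch.nonzero(points, as_tuple=True)
    if ys.numel() > 0:
        loss = loss - log_probs[points[ys, xs], ys, xs].mean()
    return loss
```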
  Why it matters: Counting objects is a difficult task for AI systems, and approaches like this indicate other ways to tackle the problem. In the future, the researchers want to design new network architectures that can better distinguish between overlapping objects that have complicated shapes and appearances.
  Read more: Where are the Blobs: Counting by Localization with Point Supervision (Arxiv).

Predicting win states in Dota 2 for better reinforcement learning research:
…System’s recommendations outperform proprietary product’s…
Researchers have trained a system to predict the probability of a given team winning or losing a game of the popular online game Dota 2. This comes at the same time as researchers across the world try to turn MOBAs (multiplayer online battle arena games) into test-beds for reinforcement learning.
  To train their model, the researchers downloaded and parsed replay files from over 100,000 Dota 2 matches. They generate a discrete slice of data for each 60-second period of a game, containing a vector that encodes information about the players’ state at that point in time. They then use these slices to train a point-in-time ‘Time Slice Evaluation’ (TSE) model, which attempts to predict the outcome of the match from a given point in time. The researchers detect some correlation between the elapsed game time, the ultimate outcome of the match, and the data contained within the slice being studied. Specifically, they find that once the first fifty percent of a match has elapsed it becomes fairly easy to train a model to accurately predict win likelihoods, so they train their system on data from this later portion of matches.
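  Code sketch: To make the setup concrete, here is a minimal PyTorch sketch of what a per-slice win predictor could look like; the interface and dimensions are assumptions of mine, not the paper’s implementation.
```python
# Minimal sketch (assumptions, not the paper's code) of a time-slice win
# predictor: each 60-second slice is encoded as a feature vector describing
# both teams' state, and a small MLP maps it to a win probability for one side.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TimeSliceEvaluator(nn.Module):
    """Maps a single time-slice feature vector to a win probability."""
    def __init__(self, slice_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(slice_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, slice_features: torch.Tensor) -> torch.Tensor:
        # slice_features: (batch, slice_dim) -> (batch,) probability team A wins.
        return torch.sigmoid(self.net(slice_features)).squeeze(-1)

def train_step(model, optimizer, features, labels):
    """One gradient step on a batch of (slice features, did-team-A-win) pairs."""
    optimizer.zero_grad()
    loss = F.binary_cross_entropy(model(features), labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```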
  Results: The resulting system can successfully predict the outcome of matches and outperforms ‘Dota Plus’, a proprietary subscription service that provides a win-probability graph for every match. (Chinese players apparently call this service the ‘Big Teacher’, which I think is quite delightful!) The researchers’ system is, on average, about three percentage points more accurate than Dota Plus Assistant, and starts from a higher base prediction accuracy. One future direction of research is to train on the first 50 percent of elapsed match time, though this would require algorithmic innovation to deal with early-game instability. Another is to implement a recurrent neural network so that, instead of making predictions from a single time slice, the system can make predictions from sequences of slices.
  Why it matters: MOBAs are rapidly becoming a testbed for advanced reinforcement learning approaches, with companies experimenting with games like Dota 2. Papers like this give us a better idea of the specific work that needs to happen to make it easy for researchers to work with these platforms.
  Read more: MOBA-Slice: A Time Slice Based Evaluation Framework of Relative Advantage between Teams in MOBA Games (Arxiv).

Better robots via fusing deep learning with classical planning:
…Everything old is new again as Berkeley and Chicago researchers staple two different bits of the AI field together…
Causal InfoGAN is a technique for learning what the researchers call “plannable representations of dynamical systems”. Causal InfoGANs work by observing and exploring an environment, for instance a basic maze simulation. They use this exploration to develop a representation of the space, which they then use to compose plans to navigate across it.
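  Code sketch: To give a flavour of what “planning with a learned representation” can mean, here is a heavily simplified sketch of greedy planning in a learned latent space; the encoder, stochastic transition model, and decoder interfaces are assumed, and this is my own illustration rather than the Causal InfoGAN algorithm itself.
```python
# Illustrative sketch (my own simplification, not the paper's algorithm) of
# planning in a learned latent space: encode start and goal observations,
# search over a learned transition model for a latent path, then decode each
# latent state back into an observation to get a visual plan.
import torch

def plan_in_latent_space(encoder, transition, decoder, start_obs, goal_obs,
                         num_candidates: int = 64, horizon: int = 10):
    """Greedy search toward a goal in a learned latent space.

    Assumes: encoder(obs) -> (1, d) latent; transition(z) -> one stochastic
    next latent per input row; decoder(z) -> reconstructed observation.
    """
    z, z_goal = encoder(start_obs), encoder(goal_obs)
    plan = [decoder(z)]
    for _ in range(horizon):
        # Sample candidate successor states and keep the one nearest the goal.
        candidates = transition(z.expand(num_candidates, -1))
        dists = torch.norm(candidates - z_goal, dim=1)
        z = candidates[dists.argmin()].unsqueeze(0)
        plan.append(decoder(z))
        if dists.min() < 1e-2:
            break
    return plan  # sequence of decoded observations forming a visual plan
```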
  Results: In tests, the researchers show that Causal InfoGAN can develop richer representations of basic mazes, and can use these representations to create plausible trajectories to navigate the space. In another task, they show how Causal InfoGAN can learn to perform a multi-stage task that requires searching for a key, then unlocking a door and proceeding through it. They also test their approach on a rope manipulation task, where Causal InfoGAN needs to plan how to transition a rope from an initial state to a goal state (such as a crude knot, or a different 2D arrangement of the rope on a table).
  Why it matters: The benefit of techniques like this is that they take something that has been developed for many years in classical AI – planning under constraints – and augment it with deep learning-based approaches that make it easier to extract information about the environment. “Our results for generating realistic manipulation plans of a rope suggest promising applications in robotics, where designing models and controllers for manipulating deformable objects is challenging,” they write.
  Read more: Learning Plannable Representations with Causal InfoGAN (Arxiv).

AI Policy with Matthew van der Merwe:
…Reader Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net…

Amazon’s face recognition software falsely matches US Members of Congress with criminals:
The ACLU have been campaigning against the use of Amazon’s Rekognition software by US law enforcement agencies. For their latest investigation, they used the software to compare photos of all sitting members of Congress against 2,500 mugshots. They found 28 members were falsely matched with mugshots. While the error-rate across the sample was 5%, it was 39% for non-white members.
  Amazon responds: Matt Wood (Amazon’s general manager of AI) writes in an Amazon blog post that the results are misleading, since the ACLU used the default confidence level of 80%, whereas Amazon recommends a setting of 99% for ‘important’ uses. (There is no suggestion that Amazon requires clients to use a higher threshold). He argues that the bias of the results is a reflection of bias in the mugshot database itself.
  Why this matters: Amazon’s response about the biased sample set is valid, but it points to precisely the problem the ACLU and others have highlighted. Mugshot and other criminal databases in the US reflect the racial bias in the US criminal justice system, which interacts disproportionately with people of colour. Without active efforts, tools that use these databases will inherit their biases, and could entrench them. We do not know if these agencies are following Amazon’s recommendation to use a 99% confidence threshold, but it seems unwise to allow these customers to use a considerably lower setting, given the potential harms from misidentification.
  Read more: Amazon’s Face Recognition Falsely Matched 28 Members of Congress With Mugshots (ACLU).
Read more: Amazon’s response (AWS blog).

Chinese company exports surveillance tools:
Chinese company CloudWalk Technology has entered a partnership with the Zimbabwean government to provide mass face recognition, in a country with a long history of human rights abuses. The most interesting feature of the contract is the agreement that CloudWalk will have access to a database intended to contain data on millions of Zimbabweans. Zimbabwe does not have legislation protecting biometric data, leaving citizens with few options to prevent either the surveillance program being implemented, or the export of their data. This large dataset may have significant value for CloudWalk in training the company’s systems on a broader racial mix.
  Why this matters: This story combines two major ethical concerns with AI-enabled surveillance. The world’s authoritarian regimes represent a huge potential market for these technologies, which could increase control over their citizens and have disastrous consequences for human rights. At the same time, as data governance in developed countries becomes more robust, firms are increasingly “offshoring” their activities to countries with lax regulation. This partnership is a worrying interaction of these issues, with an authoritarian government buying surveillance technology, and paying for it with their citizens’ data.
  Read more: Beijing’s Big Brother Tech Needs African Faces (Foreign Policy)

UK looks to strengthen monitoring of foreign investment in technology:
The UK has announced proposals to strengthen the government’s ability to review foreign takeovers that pose national security risks. While the measures cover all sectors, the government identifies “advanced technologies” and “military and dual-use technologies” as core focus areas, suggesting that AI will be high on the agenda. US lawmakers are currently considering proposals to strengthen CFIUS, the US government’s equivalent tool for controlling foreign investment.
  Why this matters: As governments realize the importance of retaining control over advanced technologies, it will be interesting to see how broad a scope the UK government takes, and whether these measures could become a means of blocking a wide range of investments in technology. It is noteworthy that the government takes a fairly wide definition of national security risks, not restricted to military or intelligence considerations and including risks from hostile parties gaining strategic leverage over the UK.
  Read more: National Security and Investment White paper.

FLI grants add $2m funding for research on robust and beneficial AI:
The Future of Life Institute has announced $2m in funding for research towards ensuring that artificial general intelligence (AGI) is beneficial for humanity. This is the second round of grants from Elon Musk’s $10m donation in 2015. The funding is more focussed on AI strategy and technical AI safety than the previous round, which included a diverse range of projects.
  Why this matters: AGI could be either humanity’s greatest invention, or its most destructive. The FLI funding will further support a community of researchers trying to ensure positive outcomes from AI. While the grant is substantial, it is worth remembering that the funding for this sort of research remains a minuscule proportion of AI investment more broadly.
  Read more: $2 Million to Keep AGI Beneficial and Robust (FLI)
 Read more: Research Priorities for Robust and Beneficial Artificial Intelligence (FLI)

Lost in translation:
Last week I summarized Germany’s AI report using Google Translate. A reader kindly pointed out that Charlotte Stix, Policy Officer at the Leverhulme Centre for the Future of Intelligence, has translated the document in full: Cornerstones for the German AI Strategy. (Another researcher doing prolific translation work is Jeffrey Ding, from the Future of Humanity Institute, whose ChinAI newsletter is a great resource to keep up-to-speed with AI in China.)

OpenAI Bits and Pieces:

OpenAI Scholars 2018:
Learn more about the first cohort of OpenAI Scholars and get a sense of what they’re working on.
  Read more: Scholars Class 2018 (OpenAI Blog).

Tech Tales:

The Sound of Thinking

It is said that many hundreds of years ago we almost went to the stars. Many people don’t believe this now, perhaps because it is too painful. But we have enough historical records preserved to know it happened: for a time, we had that chance. We were distracted, though. The world was heating up. Systems built on top of other systems over previous centuries constrained our thinking. As things got hotter and more chaotic, making spaceships became a more and more expensive proposition. Some rich people tried to colonize the moon but lacked the technology for it to be sustainable. In school we use ancient telescopes to study the wreckage of the base. We tell many stories about what went on in it, for we have no records. The moonbase, like other things around us in this world, is a relic from a time when we were capable of greater things.

It is the AIs that are perhaps the strangest things. These systems were built towards the end of what we refer to as the ‘technological high point’. Historical records show that they performed many great feats in their time – some machines helped the blind see, and others solved outstanding riddles in physics and mathematics and the other sciences. Other systems were used for war and surveillance, to great effect. But some systems – the longest lasting ones – simply watch the world. There are perhaps five of them left worldwide, and some of us guard them, though we are unsure of their purpose.

The AI I guard sits at the center of an ancient forest. Once a year it emits a radio broadcast that beams data out to all that can listen. Much of the data is mysterious to us but some of it is helpful – it knows the number of birds in the forest, for example, and has also helped us identify rivers, and deposits of rare minerals. 

The AI is housed in a large building which, if provided with a steady supply of water, is able to generate power sufficient to let the machine function. When components break, small doors in the side of the AI’s building open, revealing components sealed in vacuum bags, marked with directions in every possible language about how to replace them. We speak different languages now and one day it will be my job to show the person who comes after me how to replace different components. At current failure rates, I expect the AI to survive for several hundred years.

My AI sings, sometimes. After certain long, wet days, when the forest air is sweet, the machine will begin to make sounds, which sound like a combination of wind and stringed instruments and the staccato sounds of birds. I believe the machine is singing. After it starts to make sounds the birds of the forest respond – they start to sing and as they sing they mirror the rhythms of the computer with their own chorus.

I believe that the AI is learning how to communicate with the birds, perhaps having observed that people, angry and increasingly hopeless, are sliding down the technological gravity well and, given a few hundred thousand years, may evolve into something else entirely. Birds, by comparison, have been around for far longer than humans and never had the arrogance to try and escape their sphere. Perhaps the AI thinks they are a more promising type of life-form to communicate with: it sings, and they listen. I worry that when the AIs sang for us, all those years ago, we failed to listen.

Things that inspired this story: Writings of the Gawain Poet about ancient ruins found amid dark age England, J G Ballard, the Long Now Foundation, flora & fauna management.