ImportAI: #88: NATO designs a cyber-defense AI; object detection improves with YOLOv3; France unveils its national AI strategy

by Jack Clark

Fast object detector YOLO gets its third major release:
…Along with one of the most clearly written and reassuringly honest research papers of recent times. Seriously. Read it!…
YOLO (You Only Look Once) is a fast, free object detection system developed by researchers at the University of Washington. Its latest v3 update makes it marginally better by incorporating “good ideas from other people”. These include a residual network for feature extraction which attains reasonably high scores on ImageNet classification while being more efficient than current state-of-the-art systems, and a method inspired by feature pyramid networks that improves prediction of bounding boxes.
  Reassuringly honest: The YOLOv3 paper is probably the most approachable AI research paper I’ve read in recent years, and that’s mostly because it doesn’t take itself too seriously. Here’s the introduction: “Sometimes you just kinda phone it in for a year, you know? I didn’t do a whole lot of research this year. Spent a lot of time on Twitter. Played around with GANs a little. I had a little momentum left over from last year; I managed to make some improvements to YOLO. But, honestly, nothing like super interesting, just a bunch of small changes that make it better,” the researchers write. The paper also includes a “Things We Tried That Didn’t Work” section, which should save other researchers time.
  Why it matters: YOLO makes it easy for hobbyists to access near state-of-the-art object detectors that run very quickly on tiny computational budgets, making it easier for people to deploy systems onto real world hardware, like phones or embedded chips paired with webcams. The downside of systems like YOLO is that they’re so broadly useful that bad actors will use them as well; the researchers demonstrate awareness of this via a ‘What This All Means’ section: “What are we going to do with these detectors now that we have them?” A lot of the people doing this research are at Google and Facebook. I guess at least we know the technology is in good hands and definitely won’t be used to harvest your personal information and sell it to…. wait, you’re saying that’s exactly what it will be used for?? Oh. Well the other people heavily funding vision research are the military and they’ve never done anything horrible like killing lots of people with new technology oh wait…”
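  For a sense of how easy it is to run these detectors, here’s a minimal sketch (not from the paper) that loads a pretrained YOLOv3 model via OpenCV’s DNN module; it assumes you’ve downloaded yolov3.cfg and yolov3.weights from the official YOLO website, and the file names, test image, and confidence threshold are illustrative placeholders.

```python
# Minimal sketch: running a pretrained YOLOv3 model via OpenCV's DNN module.
# Assumes yolov3.cfg / yolov3.weights have been downloaded from the official
# YOLO website; file names and the 0.5 confidence threshold are illustrative.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]

image = cv2.imread("street.jpg")          # placeholder test image
h, w = image.shape[:2]

# YOLOv3 expects a 416x416 RGB input scaled to [0, 1].
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(output_layers)

for detection in np.vstack(outputs):
    scores = detection[5:]                # per-class scores follow the box fields
    class_id = int(np.argmax(scores))
    confidence = scores[class_id]
    if confidence > 0.5:
        cx, cy, bw, bh = detection[:4] * np.array([w, h, w, h])
        print(f"class {class_id} at ({cx:.0f}, {cy:.0f}) "
              f"size {bw:.0f}x{bh:.0f}, confidence {confidence:.2f}")
```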
  Read more: YOLOv3: An Incremental Improvement (PDF).
  More information on the official YOLO website here.

The military AI cometh: new reference architecture for MilSpec defense detailed by researchers:
…NATO researchers plot automated, AI-based cyber defense systems…
A NATO research group, led by the US Army Research Laboratory, has published a paper on a reference architecture for a cyber defense agent that uses AI to enhance its capabilities. The paper is worth reading because it provides a nuts-and-bolts perspective on how a lot of militaries around the world are viewing AI: AI systems let you automate more stuff; automation increases the speed with which you can take actions, and thereby gain strategic initiative against an opponent; so the goal of most technology integrations is to automate as many chunks of a process as possible to retain speed of response, and therefore initiative.
  “Artificial cyber hunters”: “In a conflict with a technically sophisticated adversary, NATO military tactical networks will operate in a heavily contested battlefield. Enemy software cyber agents—malware—will infiltrate friendly networks and attack friendly command, control, communications, computers, intelligence, surveillance, and reconnaissance (C4ISR) and computerized weapon systems. To fight them, NATO needs artificial cyber hunters—intelligent, autonomous, mobile agents specialized in active cyber defense,” the researchers write.
  How the agents work: The researchers propose agents that possess five main components: “sensing and world state identification”, “planning and action selection”, “collaboration and negotiation”, “action execution”, and “learning and knowledge improvement”. Each of these functions has a bunch of sub-systems to perform tasks like ingesting data from the agent’s actions, or communicating and collaborating with other agents.
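  To make the architecture concrete, here is a purely illustrative Python skeleton of an agent with those five components; the component names follow the paper, but the class, method bodies, and data structures are placeholders rather than anything specified in the reference architecture.

```python
# Illustrative sketch only: the paper's five-component agent architecture as a
# Python class skeleton. Method bodies and data structures are placeholders.
class CyberDefenseAgent:
    def __init__(self, sensors, peers):
        self.sensors = sensors      # host/network telemetry sources
        self.peers = peers          # other agents, for collaboration
        self.world_state = {}
        self.knowledge = {}

    def sense_and_identify_state(self):
        """Sensing and world state identification: ingest telemetry."""
        for sensor in self.sensors:
            self.world_state.update(sensor.read())

    def plan_and_select_action(self):
        """Planning and action selection: choose a defensive response."""
        if self.world_state.get("untrusted_code_loaded"):
            return "restart_subsystem"
        return "monitor"

    def collaborate_and_negotiate(self, alert):
        """Collaboration and negotiation: share findings with peer agents."""
        for peer in self.peers:
            peer.receive(alert)

    def execute_action(self, action):
        """Action execution: carry out the selected response."""
        print(f"executing: {action}")

    def learn(self, outcome):
        """Learning and knowledge improvement: update from observed outcomes."""
        self.knowledge[outcome["action"]] = outcome["result"]
```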
  Usage scenarios: These agents are designed to be modular and deployable across a variety of different form factors and usage scenarios, including multiple agents deployed throughout a vehicle’s weapons, navigation, and observation systems, as well as the laptops used by its human crew, all managed by a single “master agent”. Under this scenario, the NATO researchers detail a threat where the vehicle is compromised by a virus placed into it during maintenance; this virus is subsequently detected by one of the agents when it begins scanning other subsystems within the vehicle, causing the agents deployed on the vehicle to decrease trust in the ‘vehicle management system’ and to place the BMS (an in-vehicle system used to survey the surrounding territory) into an alert state. Next, one of the surveillance AI agents discovers that the enemy malware has loaded software directly into the BMS, causing the AI agent to automatically restart the BMS to reset it to a safe state.
  Why it matters: As systems like these move from reference architectures to functional blocks of code we’re going to see the nature of conflict change as systems become more reactive over shorter timescales, which will further condition the sorts of strategies people use in conflict. Luckily, technologies for offense are too crude and brittle and unpredictable to be explored by militaries any time soon, so most of this work will take place in the area of defense, for now.
  Read more: Initial Reference Architecture of an Intelligent Autonomous Agent for Cyber Defense (Arxiv).

Google researchers train agents to project themselves forward and to work backward from the goal:
…Agents perform better at long horizon tasks when they can also work backward from the goal…
When I try to solve a task I tend to do two things: I think of the steps I reckon I need to take to be able to complete it, and then I think of the end state and try to work my way backwards from there to where I am. Today, most AI agents just do the first thing, exploring (usually without a well-defined notion of the end state) until they stumble into correct behaviors. Now, researchers with Google Brain have proposed a somewhat limited approach to give agents the ability to work backwards as well. Their approach requires the agent to be provided with knowledge of the reward function and specifically the goal – that’s not going to be available in most systems, though it may hold for some software-based approaches. The agent is then able to use this information to project forward from its own state when considering its next actions, and also to look backward from its sense of the goal to help it perform better action selection. The approach works well on lengthy tasks requiring large amounts of exploration, like navigating in gridworlds or solving Towers of Hanoi problems. It’s not clear from this paper how far this technique can go as it is tested on small-scale toy domains.
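  Here is a minimal, hedged sketch of the forward-backward idea on a toy chain gridworld: alongside ordinary forward Q-learning from real experience, the agent uses its knowledge of the goal and the dynamics to imagine transitions backwards from the goal, which propagates value toward the start state much faster. The environment, chain length, and hyperparameters are illustrative, not the paper’s exact algorithm.

```python
# Hedged sketch of forward + backward learning on a toy 20-state chain.
import random

N_STATES, GOAL = 20, 19
Q = [[0.0, 0.0] for _ in range(N_STATES)]      # actions: 0 = left, 1 = right
alpha, gamma, eps = 0.5, 0.95, 0.1

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def update(s, a, r, s2):
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])

for episode in range(50):
    # Backward imagination: sweep from the known goal toward the start,
    # updating the states that would have led to the goal.
    for s_next in range(GOAL, 0, -1):
        prev = s_next - 1                      # a state from which "right" reaches s_next
        update(prev, 1, 1.0 if s_next == GOAL else 0.0, s_next)

    # Forward (real) experience with an epsilon-greedy policy and a step cap.
    s = 0
    for _ in range(200):
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        update(s, a, r, s2)
        s = s2
        if done:
            break

print("Greedy action from the start state:", "right" if Q[0][1] > Q[0][0] else "left")
```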
  Why it matters: To be conscious is to be trapped in a subjective view of time that governs everything we do. Integrating more of an appreciation of time as a specific contextual marker and using that to govern environment modelling seems like a prerequisite for the development of more advanced systems.
  Read more: Forward-Backward Reinforcement Learning (Arxiv).

AI researchers train agents to simulate their own worlds for superior performance:
…I am only as good as my own imaginings…
  Have you ever heard the story about the basketball test? Scientists split a group of people into three groups; one group was told not to play basketball for a couple of weeks, the second group was told to play basketball for an hour a day for two weeks, and the third group was told to think about playing basketball for an hour a day for two weeks, but not play it. Eventually, all three groups played basketball and the scientists discovered that the people that had spent a lot of time thinking about the game did meaningfully better than the group that hadn’t played it at all, though neither was as good as the group that practised regularly. This highlights something most people have a strong intuition about: our brains are simulation engines, and the more time we spend simulating a problem, the better chance we have of solving that problem in the real world. Now, researchers David Ha and Juergen Schmidhuber have sought to give AI agents this capability by training systems to develop a compressed representation of their environment, then having these agents train themselves within this imagined version of the environment to solve a task – in this case, driving a car around a race course, and solving a challenge in VizDoom.
   Significant caveat: Though the paper is interesting it may be pursuing a research path that doesn’t go that far according to the view of one academic, Shimon Whiteson, who tweeted out some thoughts about the paper a few days ago.
  Surprising transfer learning: For the VizDoom task the researchers found they were able to make the agent’s model of its Doom challenge more difficult by raising the temperature of the environment model, which essentially increases the randomness of its various latent variables. This means the agent had to contend with a more difficult version of the task, replete with more enemies, less predictable fireballs, and even the occasional random death. They found that agents trained in this harder simulation excelled at the simpler real task, suggesting that the underlying learned environment model was of sufficient fidelity to be a useful mental simulation.
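  For intuition, here’s a hedged sketch of temperature-controlled sampling from a Gaussian mixture, the mechanism used to make the learned (“dreamed”) environment more or less stochastic; the mixture parameters below are random placeholders, whereas in the paper they come from the trained environment model.

```python
# Hedged sketch: temperature-scaled sampling from a 1-D Gaussian mixture.
# Higher temperature flattens the mixture weights and widens each Gaussian,
# producing a noisier, harder-to-predict "dream" environment.
import numpy as np

def sample_mdn(log_pi, mu, log_sigma, temperature=1.0, rng=np.random):
    """Sample one value from a Gaussian mixture at a given temperature."""
    scaled = log_pi / temperature
    pi = np.exp(scaled - scaled.max())
    pi /= pi.sum()
    k = rng.choice(len(pi), p=pi)
    sigma = np.exp(log_sigma[k]) * np.sqrt(temperature)
    return rng.normal(mu[k], sigma)

rng = np.random.default_rng(0)
log_pi, mu, log_sigma = rng.normal(size=5), rng.normal(size=5), rng.normal(size=5)
for t in (0.5, 1.0, 1.5):
    samples = [sample_mdn(log_pi, mu, log_sigma, t, rng) for _ in range(1000)]
    print(f"temperature {t}: std of sampled next-state = {np.std(samples):.2f}")
```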
  Why it matters: “Imagination” is a somewhat loaded term in AI research, but it’s a valid thing to be interested in. Imagination is what lets humans explore the world around them effectively and imagination is what gives them a sufficiently vivid and unpredictable internal mental world to be able to have insights that lead to inventions. Therefore, it’s worth paying attention to systems like those described in this paper that strive to give AI agents access to a learned and rich representation of the world around them which they can then use to teach themselves. It’s also interesting as another way of applying data augmentation to an environment: simply expose an agent to the real environment enough that it can learn an internal representation of it, then throw computers at expanding and perturbing the internal world simulation to cover a greater distribution of (potentially) real world outcomes.
   Readability endorsement: The paper is very readable and very funny. I wish more papers were written to be consumed by a more general audience as I think it makes the scientific results ultimately accessible to a broader set of people.
  Read more: World Models (Arxiv).

Testing self-driving cars with toy vehicles in toy worlds:
…Putting neural networks to the (extremely limited) test…
Researchers with the Center for Complex Systems and Brain Sciences at Florida Atlantic University have used a toy racetrack, a DIY model car, and seven different neural network approaches to evaluate self-driving capabilities in a constrained environment. The research seeks to provide a cheap, repeatable benchmark developers can use to evaluate different learning systems against each other (whether this benchmark has any relevance for full-size self-driving cars is to be determined). They test seven types of neural network on the same platform: a feedforward network; a two-layer convolutional neural network; an LSTM; and implementations of AlexNet, VGG-16, Inception V3, and ResNet-26. Each network is tested on the course following training and is evaluated according to how many laps the car completes. They test the networks on three data types: color and grayscale single images, as well as a ‘gray framestack’, which is a set of images that occurred in a sequence. Most systems were able to complete the majority of the courses, which suggests the course is a little too easy. An AlexNet-based system attained perfect performance on one data input type (single color frame), and a ResNet attained the best performance when using a gray framestack.
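  For a sense of what the smaller end of this spectrum looks like, here’s an illustrative PyTorch sketch of a tiny end-to-end steering network of the kind such a benchmark pits against the bigger architectures: a grayscale camera frame in, steering and throttle commands out. The layer sizes and the 64x64 input are illustrative choices, not the architecture from the paper.

```python
# Hedged sketch: a tiny CNN mapping a grayscale frame to steering + throttle.
import torch
import torch.nn as nn

class TinySteeringNet(nn.Module):
    def __init__(self, n_outputs=2):            # e.g. steering angle, throttle
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),   # 64 -> 30
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),  # 30 -> 13
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),  # 13 -> 6
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 6 * 6, 128), nn.ReLU(),
            nn.Linear(128, n_outputs),
        )

    def forward(self, frames):                   # frames: (batch, 1, 64, 64)
        return self.head(self.features(frames))

model = TinySteeringNet()
dummy_batch = torch.randn(8, 1, 64, 64)          # eight fake grayscale frames
print(model(dummy_batch).shape)                  # torch.Size([8, 2])
```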
  Why it matters: This paper highlights just how little we know today about self-driving car systems and how poor our methods are for testing and evaluating different tactics. What would be really nice is if someone spent enough money to do a controlled test of actual self-driving cars on actual roads, though I expect that companies will make this difficult out of a desire to keep their IP secret.
  Read more: A Systematic Comparison of Deep Learning Architectures in an Autonomous Vehicle (Arxiv).

Separating one detected pedestrian from another with deep learning:
…A little feature engineering (via ‘AffineAlign’) goes a long way…
As the world starts to deploy large-scale AI surveillance tools researchers are busily working to deal with some of the shortcomings of the technology. One major issue for image classifiers has been object segmentation and disambiguation, for example: if I’m shown images of a crowd of people how can I specifically label each one of those people and keep track of each of them, without accidentally mis-labeling a person, or losing them in the crowd? New research from Tsinghua University, Tencent AI Lab, and Cardiff University attempts to solve this problem with “a brand new pose-based instance segmentation framework for humans which separates instances based on human pose rather than region proposal detection.” The proposed method introduces an ‘AffineAlign’ layer that aligns images based on human poses, which it uses within an otherwise typical computer vision pipeline. Their approach works by adding a bit more prior knowledge (specifically, knowledge of human poses) into a recognition pipeline, and using this to better identify and segment people in crowded photos.
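  To give a flavor of the alignment idea, here is a hedged sketch of estimating an affine transform that maps a person’s detected keypoints onto a canonical template pose and warping the image crop accordingly; the template, keypoints, and image below are made-up placeholders, and the paper’s actual AffineAlign layer operates inside the network pipeline rather than as a standalone preprocessing step like this.

```python
# Hedged sketch: aligning a detected person to a canonical pose template
# via an estimated affine transform. All coordinates here are placeholders.
import cv2
import numpy as np

# (x, y) keypoints for, say, shoulders and hips in a canonical template pose.
template = np.float32([[60, 40], [100, 40], [65, 110], [95, 110]])
# The same joints as detected for one person somewhere in a crowded image.
detected = np.float32([[230, 310], [280, 330], [225, 400], [270, 415]])

# Estimate a partial-affine transform mapping detected pose -> template pose.
matrix, _ = cv2.estimateAffinePartial2D(detected, template)

image = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for the real photo
aligned = cv2.warpAffine(image, matrix, (160, 160))
print(matrix)          # 2x3 affine matrix used to align this person's crop
print(aligned.shape)   # (160, 160, 3): an aligned crop to feed a seg head
```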
  Results: The approach attains comparable results to Mask R-CNN on the ‘COCOHUMAN’ dataset, and outperforms it on the ‘COCOHUMAN-OC’ dataset, which tests systems’ ability to disambiguate partially occluded humans.
   Why it matters: As AI surveillance systems grow in capability it’s likely that more organizations around the world will deploy such systems into the real world. China is at the forefront of doing this currently, so it’s worth tracking public research on the topic area from Chinese-linked researchers.
  Read more: Pose2Seg: Human Instance Segmentation Without Detection (Arxiv).

French leader Emmanuel Macron discusses France’s national AI strategy:
…Why AI has issues for democracy, why France wants to lead Europe in AI, and more…
Politicians are somewhat similar to hybrids of weathervanes and antennas; the job of a politician is to intuit the public mood before it starts to change and establish a rhetorical position that points in the appropriate direction. For that reason it’s been interesting to see more and more politicians, ranging from Canada’s Justin Trudeau to China’s Xi Jinping to, now, France’s Emmanuel Macron, taking meaningful positions on artificial intelligence; this suggests they’ve intuited that AI is going to become a galvanizing issue for the general public. Macron gives some of his thinking about the impact of AI in an interview with Wired. His thesis is that European countries need to pool resources and individually support AI to have a chance at becoming a significant enough power bloc in AI capabilities to avoid being crushed by the scale of the USA’s and China’s AI ecosystems. Highlights:
– AI “will disrupt all the different business models”, and France needs to lead in AI to retain agency over itself.
– Opening up data for general usage by AI systems is akin to opening up a Pandora’s Box: “The day we start to make such business out of this data is when a huge opportunity becomes a huge risk. It could totally dismantle our national cohesion and the way we live together. This leads me to the conclusion that this huge technological revolution is in fact a political revolution.”
– The USA and China are the two leaders in AI today.
– “AI could totally jeopardize democracy.”
– He is “dead against” the usage of lethal autonomous weapons where the machine makes the decision to kill a human.
– “My concern is that there is a disconnect between the speediness of innovation and some practices, and the time for digestion for a lot of people in our democracies.”
   Read more: Emmanuel Macron Talks To Wired About France’s AI Strategy (Wired).

France reveals its national AI strategy:
…New report by Fields Medal-winning politician published alongside Emmanuel Macron speech and press tour…
For the past year or so French mathematician and politician Cedric Villani has been working on a report for the government about what France’s strategy should be for artificial intelligence. He’s now published the report and it includes many significant recommendations meant to help France (and Europe as a whole) chart a course between the two major AI powers, the USA and China.
  Summary: Here’s a summary of what France’s AI strategy involves:
- Rethink data ownership to make it easier for governments to create large public datasets.
- Specialize in four sectors: healthcare, environment, transport-mobility, and defense & security.
- Revise public sector procurement so it’s easier for the state to buy products from smaller (and specifically European) companies.
- Create and fund interdisciplinary research projects.
- Create national computing infrastructure, including “a supercomputer designed specifically for AI usage and devoted to researchers”, along with a European-wide private cloud for AI research.
- Increase the competitiveness of public sector remuneration.
- Fund a public laboratory to study AI and its impact on labor markets, working in tandem with schemes to get companies to fund professional training for people whose lives are affected by private-sector innovation.
- Increase transparency and interpretability of AI systems to deal with problems of bias.
- Create a national AI ethics committee to provide strategic guidance to the government.
- Improve the diversity of AI companies.
  Read more: Summary of France’s AI strategy in English (PDF).

Berkeley researchers shrink neural networks with SqueezeNet-successor ‘SqueezeNext’:
…Want something eight times faster and cheaper than AlexNet?…
Berkeley researchers have published ‘SqueezeNext’, their latest attempt to distill the capabilities of very large neural networks into smaller models that can feasibly be deployed on devices with small memory and compute capabilities, like mobile phones. While much of the research into AI systems today is based around getting state-of-the-art results on specific datasets, SqueezeNext is part of a parallel track focused on making systems deployable. “A general trend of neural network design has been to find larger and deeper models to get better accuracy without considering the memory or power budget,” write the authors.
  How it works: SqueezeNext is efficient because of a few design strategies: low rank filters; a bottleneck filter to constrain the parameter count of the network; using a single fully connected layer following a bottleneck; weight and output stationary; and co-designing the network in tandem with a hardware simulator to maximize hardware usage efficiency.
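  To make those strategies concrete, here’s a hedged PyTorch sketch of a SqueezeNext-style block as I read the design list above: 1x1 bottleneck convolutions to cut channel counts, low-rank 3x1 and 1x3 filters in place of a full 3x3, a 1x1 expansion, and a residual connection. The channel ratios and layer ordering are illustrative rather than the exact configuration from the paper.

```python
# Hedged sketch of a SqueezeNext-style block; channel ratios are illustrative.
import torch
import torch.nn as nn

class SqueezeNextBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        mid = channels // 2
        def conv(c_in, c_out, k, pad=0):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, k, padding=pad, bias=False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )
        self.body = nn.Sequential(
            conv(channels, mid, 1),               # 1x1 bottleneck
            conv(mid, mid // 2, 1),               # second 1x1 squeeze
            conv(mid // 2, mid, (3, 1), (1, 0)),  # low-rank 3x1 filter
            conv(mid, mid, (1, 3), (0, 1)),       # low-rank 1x3 filter
            conv(mid, channels, 1),               # 1x1 expansion back up
        )

    def forward(self, x):
        return self.body(x) + x                   # residual connection

block = SqueezeNextBlock(64)
print(block(torch.randn(1, 64, 32, 32)).shape)    # torch.Size([1, 64, 32, 32])
```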
  Results: The resulting SqueezeNext network is a neural network with 112X fewer model parameters than those found in AlexNet, the model that was used to attain state-of-the-art image recognition results in 2012. They also develop a version of the network whose performance approaches that of VGG-19 (which did well in ImageNet 2014). The researchers also design an even more efficient network by carefully tuning model design in parallel with a hardware simulator, ultimately designing a model that is significantly faster and more energy efficient than a widely used compressed network called SqueezeNet.
  Why it matters: One of the things holding neural networks back from being deployed is their relatively large memory and computation requirements – traits that are likely to persist given the current trend of solving tasks by training ever larger and deeper models. Therefore, research into making these networks run efficiently broadens the number of venues neural nets can run in.
   Read more: SqueezeNext: Hardware-Aware Neural Network Design (Arxiv).

Tech Tales:

Metal Dogs Grow Old.

It’s not unusual, these days, to find rusting piles of drones next to piles of elephant skeletons. Nor is it unusual to see an old elephant make its way to a boneyard accompanied by a juddering, ancient drone, and to see both creature and machine set themselves down and subside at the same time. There have even been stories of drones falling out of the sky when one of the older birds in the flock dies. These are all unexpected consequences of a wildlife preservation program called PARENTAL UNIT. Starting in the early twenties, we began introducing small, quiet drones to vulnerable animal populations. The drones would learn to follow a specific group of creatures, say a family of elephants, or – later, after the technology improved – a flock of birds.

The machines would learn about these creatures and watch over them, patrolling the area around them as they slept and, upon finding the inevitable poachers, automatically raising alerts with local park rangers. Later, the drones were given some autonomous defense capabilities, so they could spray a noxious chemical onto the poachers that had the dual effect of drawing local predators to them and providing a testable biomarker that police could subsequently check people for at the borders of national parks.

A few years after starting the program the drone deaths started happening. Drones died all the time, and we modelled their failures as rigorously as any other piece of equipment. But drones started dying at specific times – the same time the oldest animal in the group they were watching died. We wondered about this for weeks, running endless simulations, and even pulling in some drones from the field and inspecting the weights in their models to see if any of their continual learning had led to any unpredictable behaviors. Could there be something about the union of the concept of death and the concept of the eldest in the herd that fried the drones’ brains, our scientists wondered? We had no answers. The deaths continued.

Something funny happened: after the initial rise in deaths they steadied out, with a few drones a week dying from standard hardware failures and one or two dying as a consequence of one of their creatures dying. So we settled into this quieter new life and, as we stopped trying to interfere, we noticed a further puzzling statistical trend: certain drones began living improbably long lifespans, calling to mind the Mars rovers Spirit and Opportunity, which miraculously exceeded their own designed lifespans. These drones were also the same machines that died when the eldest animals died. Several postgrads are currently exploring the relationship, if any, between these two phenomena. Now we celebrate these improbably long-lived machines, cheering them on as they buzz in for a new propeller, or update our software monitors with new footage from their cameras, typically hovering right above the creature they have taken charge of, watching them and learning something from them we can measure but cannot examine directly.

Things that inspired this story: Pets, drones, meta-learning, embodiment.