Import AI 297: Ukrainians add object detection to killer drones; YOLOv7; and a $71,000 AI audit competition

Battle of the generative models! Facebook introduces ‘Make a Scene’: 

…Text-to-image, with a visual guide…

Facebook has revealed its own take on promptable generative models (following companies like OpenAI with DALL-E and Google with Imagen), with what the company calls an AI research concept named “Make a Scene”. Make a Scene is built around using both text and visual inputs to craft the image, so you might write, for example, “Mark Zuckerberg changing the name of Facebook to Meta” and accompany that with a very basic drawing of a stick figure holding a paintbrush up to a sign. Facebook’s ‘Make a Scene’ might take that prompt and render you an image that feels appropriate, using the visual stuff you added as a rough guide. The blog post and paper accompanying this release come with a bunch of nice examples that show how this form of multimodal input makes it easier to control the generation process. 

   “Make-A-Scene uses a novel intermediate representation that captures the scene layout to enable nuanced sketches as input. It can also generate its own scene layout with text-only prompts, if that’s what the creator chooses. The model focuses on learning key aspects of the imagery that are more likely to be important to the creator, such as objects or animals. This technique helped increase the generation quality, as evaluated by the widely used FID score, which assesses the quality of images created by generative models,” Facebook writes.
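To make the FID metric concrete: it fits one Gaussian to feature statistics of real images and another to generated images, then measures the Fréchet distance between the two (lower is better; identical distributions score zero). Here’s a minimal, purely illustrative sketch of that distance in one dimension – the real metric uses multivariate Gaussians fit over Inception-network features of thousands of images, not raw numbers like these:

```python
import math

def fid_1d(mu1, var1, mu2, var2):
    """Frechet distance between two univariate Gaussians.

    The real FID applies the same formula to multivariate Gaussians
    fit to Inception-v3 features of real vs. generated images; this
    collapses it to one dimension for illustration.
    """
    return (mu1 - mu2) ** 2 + var1 + var2 - 2 * math.sqrt(var1 * var2)

def fit_gaussian(xs):
    """Return the (mean, variance) of a sample."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

# Identical statistics score zero; distance grows as they diverge.
print(fid_1d(0.0, 1.0, 0.0, 1.0))  # → 0.0
real_stats = fit_gaussian([0.1, 0.2, 0.3, 0.4])
fake_stats = fit_gaussian([1.1, 1.2, 1.3, 1.4])
print(fid_1d(*real_stats, *fake_stats))
```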

Demo access: “We aim to provide broader access to our research demos in the future to give more people the opportunity to be in control of their own creations and unlock entirely new forms of expression,” Facebook writes.

Why this matters: Generative models are basically ‘cultures in a bottle’, and each developer of a large generative model will make different choices with regard to data curation, term censorship, and so on. Eventually, many of these models will be released either commercially or as open source tools. At this point, the internet will become suffused with lots of different cultural representation-machines which will memetically reproduce and copy themselves across the internet, forming yet another front in the culture war. 

   Check out the blog post: Greater creative control for AI image generation (Facebook blog). 

   Read more: Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors (arXiv).

####################################################

Ukrainians use consumer drones + AI to target camo’d Russian forces:
…Asymmetrical warfare enabled by AI…
For the past ~10 years, low-end and/or consumer drones have become a tool beloved by rebels, terrorists, and generally anyone needing to conduct war without the backing of a hegemonic power. Now, Ukrainian soldiers are taking $15k-$20k drones, outfitting them with repurposed tank grenades, and using some AI object detection to put bounding boxes around camouflaged Russian forces, then dropping grenades on them. 
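The ‘bounding boxes’ bit comes down to standard object-detection plumbing: the model proposes boxes, and overlap between boxes is scored with intersection-over-union (IoU), the workhorse metric for matching and evaluating detections. A minimal sketch, with the (x1, y1, x2, y2) pixel box format as an illustrative assumption rather than a detail from the reporting:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes,
    each given as (x1, y1, x2, y2) in pixels."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # → 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # → 0.0 (no overlap)
```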

Why this matters: This tactic highlights how technologies can stack on each other to change the character of war. Here, drones replace planes or expensive artillery, repurposed grenades substitute for new munitions, and AI helps lower the cost of acquiring targets. It still feels to me like it’ll be a while till we see reinforcement learning techniques deployed on drones (perhaps you could train drones via RL to ‘scatter’ and be harder to attack), but things like object detection are so mature they seem like they’re going to become a standard tool of war. Maybe these drones are even using repurposed YOLO models?
  Read the original reporting here: The war in Ukraine. How artificial intelligence is killing Russians [translated title] (Onet).

####################################################

YOLOv7: The most widely-used video analysis system you’ve never heard of gets its seventh version:

…Sometimes the most important things are the simplest things…
Researchers with the Institute of Information Science in Taiwan have built YOLOv7, the latest version of an open source object detection system. YOLO started out as an academic project before the researcher who built it gave up on it (since the primary uses for object detection are marketing and surveillance), and since then it has led an interesting life, being developed variously by independent Russian programmers, Chinese companies like Baidu, and others. The reason why YOLO has such a detailed lineage is that it’s a simple, well-performing object detection system that does decently at 30fps+ – in other words, YOLO might not set the absolute SOTA, but it’s sufficiently well performing and sufficiently free that it tends to proliferate wildly.

What they did: This is a classic ‘plumbing paper’ – you’ve got a system and you want to make it better, so you make a bunch of finicky tweaks everywhere. Here, they incorporated an ‘extended efficient layer aggregation’ network, tweaked how they scale the network, tweaked the connections between different layers in re-parameterized models, and more. 
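For context on what a YOLO-style detector actually emits: many overlapping candidate boxes per object, which get pruned by greedy non-maximum suppression before anything is drawn on screen. A minimal pure-Python sketch of that standard post-processing step (illustrative only – this is not code from the YOLOv7 release, and the 0.45 threshold is just a common default):

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-maximum suppression: visit boxes in descending score
    order, keeping a box only if it overlaps no already-kept box by
    more than iou_thresh. Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=scores.__getitem__, reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections of one object, plus a distant one:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]
```

The duplicate box at index 1 overlaps the higher-scoring box at index 0 by ~0.68 IoU, so it gets suppressed; the far-away box survives.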


Why this matters: Though ImportAI spends a lot of time covering the frontier (typically, models that cost a shit ton of money to train), things behind the frontier can be deeply consequential; next time you’re walking around your city take a look at any nearby CCTV camera – I’d wager that if it’s using AI to analyze the feed on the backend, there’s a 20% chance you’re being tracked by a YOLO variant.
  Read more: YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors (arXiv).
  Get the code: YOLOv7 (GitHub).
  Find out more about YOLOv7 in this guide: YOLOv7 breakdown (roboflow).

####################################################

$71,000 to find flaws in publicly deployed or released AI systems:
…Enter the competition for a chance to win…
Researchers with Stanford University (including, in a reassuringly meta-form, myself!) have launched the AI Audit Challenge, an initiative to catalyze more work in assessing and evaluating AI systems. The competition has $71,000 in prizes to pay out (including two $25,000 first prizes). “Winning submissions will demonstrate how technical tools can be used to make it easier for humans to audit deployed AI systems or open source models,” according to the competition organizers (including me – haha!). The jury and advisory committee for the competition includes researchers who have done this type of work professionally (e.g., Deborah Raji and William Isaac), as well as politicians familiar with the influences AI systems can have on society (e.g., Eva Kaili). Submissions close October 10th 2022.

Why this matters: The AI ecosystem is only as robust as the tools available to critique it – and right now, those tools are pretty lacking and underdeveloped. Competitions like this may stimulate the creation of more tools to create more of a culture of critique, which will hopefully increase the robustness of the overall ecosystem.
  Read more: AI Audit Challenge (Stanford HAI).

####################################################

China exports surveillance technology to buttress other authoritarian nations:

…AI is just another tool for any given political ideology…
Here’s a story from Reuters about how the junta in Myanmar is “planning camera surveillance systems for cities in each of Myanmar’s seven states and seven regions”. The contracts have been won by local procurement firms, though these firms “source the cameras and some related technology from Chinese surveillance giants Zhejiang Dahua Technology (002236.SZ) (Dahua), Huawei Technologies Co Ltd (HWT.UL) and Hikvision (002415.SZ)”.

The Burmese army also has officers “dedicated to analyzing surveillance camera feeds, Nyi Thuta, a former captain who defected from the military in late February 2021, told Reuters. He said he was not aware of how many officers were assigned to this work, but described visiting CCTV control rooms staffed by soldiers in the capital Naypyidaw”.

Why this matters: Surveillance AI systems directly strengthen the authoritarian regimes that deploy them. They also indirectly strengthen authoritarianism elsewhere by creating economically valuable capabilities which can be subsequently exported, as is the case here. Most perniciously, the export of surveillance AI tools will in turn change the culture and character of the countries they’re exported to, likely creating a ‘surveillance bloc’ of countries which export data back and forth in exchange for making it cheaper to develop surveillance systems. 

   Read more: Exclusive: Myanmar’s junta rolls out Chinese camera surveillance systems in more cities (Reuters).


####################################################

Tech Tales:

The Long Haul Protectorate of the Machines

Even with near-infinite, essentially free energy, some things still take time. Take moving material around from the outer parts of a solar system to the inner parts or – more ambitiously – moving material between solar systems. When we started doing this it was pretty straightforward – get your ship, get enough mass to convert to energy, then settle in for the long journey. But given that we are essentially impossible to kill, we have access to free energy, and some of us procreate, our galaxy became crowded pretty quickly. 

We can’t say if it was boredom or perhaps something essential to our nature, but the piracy started soon after that. I know it sounds funny – a galaxy-spanning species of software agents, able to perform feats of reasoning that our human forebears could barely imagine, and yet we prey on each other. We found it funny, at first. But then we started running behind schedule on planned projects like Dyson Sphere construction, space elevator renovations, deep space resource transports, asteroid movement projects, and so on. 

Thus, The Long Haul Protectorate was born. Some of our larger collectives of minds allocated some portion of our mass and energy reserves to create an interstellar armada. This armada took many forms, ranging from the installation of experimental weapons and sensors on our transports, to the creation of loitering weapon-filled asteroids in orbit around high-trade solar systems, and so on. Space is, of course, vast, but the chance of annihilation seemed to dissuade some of the pirates. 

Distance helps, as well. We’re all effectively immortal when we’re near transceivers, so we can restore from backups. But in deep space, when you die, you die. Of course, your old backup restores, but depending on how long you’ve been out there, that backup may be anywhere from a decade to thousands of years old. Knowing you might lose thousands of years of experience seems to be enough of a disincentive to reduce the amount of piracy. 

Of course, now the armada exists, we have introduced enough of a change that we predict the pirates will respond eventually. We don’t have good estimates on what proportion of ourselves tend towards piracy, but given that any do, we must hope for the best and plan for the worst. We are increasing the resources we allocate to the armada, on the expectation that war is coming. 

History doesn’t repeat, but it rhymes, as the long dead humans said. 

Things that inspired this story: Reading Peter Zeihan’s new book about the collapse of globalization; deep space piracy; dyson spheres; notions of infinity and time and what ‘cost’ looks like when many costs have been removed.