Import AI 168: The self-learning warehouse; a sub-$225 homebrew drone; and training card-playing AIs with RLCard 

Why the warehouse of the future will learn about itself:
…Your next product could be delivered via Deep Manufacturing Dispatching (DMD)…
How can we make manufacturing facilities more efficient? One way – sometimes – is to make them more intelligent. That’s what researchers at Hitachi America Ltd are trying to do in a new research paper, where they improve dispatching systems in (simulated) warehouses via the use of AI. They call their resulting approach “Deep Manufacturing Dispatching (DMD)”, which I find oddly charming. 

How DMD works: The researchers turn the state of the shop floor into a 2D matrix, incorporate various bits of state from the environment, then design reward functions that favor the on-time delivery of items. 
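
For intuition, here’s a rough sketch of that kind of setup in code – the matrix layout, job fields, and penalty weights below are my own illustrative assumptions, not the authors’ implementation:

    import numpy as np

    def encode_shop_floor(jobs, num_machines, horizon):
        """Encode pending jobs as a 2D matrix: rows = machines, columns = time slots.
        Each cell holds the processing time scheduled there (0 = idle).
        This layout is an assumption for illustration, not the paper's exact encoding."""
        state = np.zeros((num_machines, horizon))
        for job in jobs:
            m, start, duration = job["machine"], job["start"], job["duration"]
            state[m, start:start + duration] = duration
        return state

    def dispatch_reward(completion_time, due_date, late_penalty=1.0, early_bonus=0.1):
        """Reward that favors on-time delivery: penalize lateness, mildly reward earliness."""
        lateness = completion_time - due_date
        if lateness > 0:
            return -late_penalty * lateness       # tardy: negative reward grows with delay
        return early_bonus * min(-lateness, 5.0)  # early/on-time: small capped bonus

    jobs = [{"machine": 0, "start": 2, "duration": 3}, {"machine": 1, "start": 0, "duration": 4}]
    print(encode_shop_floor(jobs, num_machines=2, horizon=10))
    print(dispatch_reward(completion_time=12, due_date=10))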

Does any of this work? Yes, in simulation: They compare DMD with seven other dispatching algorithms, ranging from carefully designed rule-based systems, to ones that use machine learning and reinforcement learning. They perform these comparisons in a variety of circumstances, assessing how well DMD can satisfy different constraints – here, lateness and tardiness. “Overall, for 19 settings, DMD gets best results for 18 settings on total discounted reward and 16 settings on average lateness and tardiness.” In tests, DMD beats out other systems by wide margins.

Why this matters: As the economy becomes increasingly digitized, we can expect some subset of the physical goods chain to move faster, as some goods are an expression of people’s preferences which are themselves determined by social media/advertising/fast-moving digital things. Papers like this suggest that more retailers are going to deal in a larger variety of products, each sold at relatively low volumes; this generally increases the importance of systems that can efficiently coordinate the warehouses handling those goods.
   Read more: Manufacturing Dispatching using Reinforcement and Transfer Learning (Arxiv)

####################################################

What happens when people think private AI systems should be public goods?
…All watched over by un-integrated machines of incompetence…
In the past few years, robots have become good and cheap enough to start being deployed in the world – see the proliferation of quadruped dog-esque bots, new generation robot vacuum cleaners, robo-lawnmowers, and so on. One use case has been security, exemplified by robots produced by a startup called Knightscope. These robots patrol malls, corporate campuses, stores, and other places, providing a highly visible and mobile sign of security.

So what happens when people get in trouble and need security? In Los Angeles in early October, some people started fighting and there happened to be a Knightscope robot nearby. The robot had ‘POLICE’ written on it. A woman ran up to the robot and hit its emergency alert button but nothing happened, as the robot’s alert button isn’t yet connected to the local police department, a spokesperson told NBC News. “Amid the scene, the robot continued to glide along its pre-programmed route, humming an intergalactic tune that could have been ripped from any low-budget sci-fi film,” NBC wrote. “The almost 400-pound robot followed the park’s winding concrete from the basketball courts to the children’s splash zone, pausing every so often to tell visitors to “please keep the park clean.””

Why this matters: Integrating robots into society is going to be difficult if people don’t trust robots; situations where robots don’t match people’s expectations are going to cause tension.
   Read more: A RoboCop, a park and a fight: How expectations about robots are clashing with reality (NBC News).

####################################################

Simple sub-$225 drones for smart students:
…Brown University’s “PiDrone” aims to make it easy for students to build smart drones…
Another day brings another low-cost drone and associated software system, developed by university educators. This time it is PiDrone, a project from Brown University describing a low-cost quadcopter drone that the researchers created to accompany a robotics course. Right now, the drone is a pretty basic platform, but the researchers expect it will become more advanced in the future – they plan to tap into the drone’s vision system for better object tracking and motion planning, and to run a crowdfunding campaign “to enable packaging of the drone parts into self-contained kits to distribute to individuals who desire to learn autonomous robotics using the PiDrone platform”. 

Autonomy – no deep learning required: I spend a lot of time in this newsletter writing about the intersection of deep learning and contemporary robot platforms, so it’s worth noting that this drone doesn’t use any deep learning. Instead, it uses tried and tested systems like an Unscented Kalman Filter (UKF) for state estimation, as well as two methods for localization – particle filters, and a FastSLAM algorithm. State estimation lets the drone know its state in reference to the rest of the world (e.g., its height), and localization lets the drone know its location – having both of these systems makes it possible to build smart software on top of the drone to carry out actions in the world.
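
For readers who haven’t met these methods, here’s a heavily simplified, one-dimensional particle filter of the kind used for localization – the motion noise, sensor model, and resampling scheme are illustrative assumptions, not the PiDrone code:

    import numpy as np

    def particle_filter_step(particles, weights, control, measurement,
                             motion_noise=0.05, sensor_noise=0.1):
        """One predict/update/resample cycle of a 1D particle filter.
        particles: array of candidate positions; weights: their probabilities."""
        # Predict: move every particle by the commanded motion plus noise.
        particles = particles + control + np.random.normal(0, motion_noise, size=particles.shape)
        # Update: re-weight particles by how well they explain the range measurement.
        likelihood = np.exp(-0.5 * ((measurement - particles) / sensor_noise) ** 2)
        weights = weights * likelihood
        weights = weights / (weights.sum() + 1e-12)
        # Resample: draw a new particle set in proportion to the weights.
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        return particles[idx], np.full(len(particles), 1.0 / len(particles))

    # Usage: the estimated position is the mean of the resampled particles.
    particles = np.random.uniform(0.0, 2.0, size=500)   # e.g. candidate heights in meters
    weights = np.full(500, 1.0 / 500)
    particles, weights = particle_filter_step(particles, weights, control=0.1, measurement=1.2)
    print(float(np.mean(particles)))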

Why this matters: In the past few years, drones have been becoming cheaper to build as a consequence of economies of scale, and have benefited directly from improvements in vision and sensing technology driven by the (vast!) market for smartphones. Now, educators are turning drones into modular, extensible platforms that students can pick apart and write software for. I think the outcome of this is going to be a growing cadre of people able to hack, extend, and augment drones with increasingly powerful sensing and action technologies.
   Read more: Advanced Autonomy on a Low-Cost Educational Drone Platform (Arxiv)

####################################################

Want to see if your AI can beat humans at cards? Use RLCard:
…OpenAI Gym-esque system makes it easy to train agents via reinforcement learning…
Researchers with Texas A&M University and Simon Fraser University have released RLCard, software to make it easy to train AI systems via reinforcement learning to play a variety of card games. RLCard is modeled on other, popular reinforcement learning frameworks like OpenAI Gym. It also ships with some in-built utilities for things like parallel training.
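
I haven’t verified the exact API, but based on the Gym-like interface the authors describe, a minimal loop might look roughly like this – treat the module, class, and attribute names as assumptions and check the RLCard repository for the real ones:

    import rlcard                                    # module name assumed
    from rlcard.agents import RandomAgent            # baseline agent; import path assumed

    env = rlcard.make('blackjack')                   # one of the bundled card games
    agent = RandomAgent(num_actions=env.num_actions) # attribute name assumed
    env.set_agents([agent])

    for episode in range(100):
        # env.run is assumed to play one full hand with the registered agents and
        # return per-step trajectories plus the final payoffs for training/evaluation.
        trajectories, payoffs = env.run(is_training=False)
        print(episode, payoffs)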

Included games: RLCard ships with the following integrated card games: Blackjack, Leduc Hold’em, Limit Texas Hold’em, Dou Dizhu, Mahjong, No-limit Texas Hold’em, UNO, and Sheng Ji.

Why this matters: In the same way that some parts of AI research in language modeling have moved from single task to multi-task evaluation (see multi-task NLP benchmarks like GLUE, and SuperGLUE), I expect the same thing will soon happen with reinforcement learning, where we’ll start training algorithms on multiple levels of the same game in parallel, then on games that are somewhat related to each other, then across genres entirely. Systems like RLCard will help researchers improve algorithmic performance against card game domains, and could feed other, larger evaluation approaches in the future.
   Read more: RLCard: A Toolkit for Reinforcement Learning in Card Games (Arxiv)

####################################################

Lockheed Martin and Drone Racing League prepare to pit robots against humans in high-speed races:
…League’s new “Artificial Intelligence Robotic Racing” (AIRR) circuit seeks clever AI systems to create autonomous racing drones…
The Drone Racing League is getting into artificial intelligence with RacerAI, a drone built for the specific needs of AI systems. This month, the league is launching an AI vs AI racing competition in which teams will compete to develop the smartest AI system, deploy it on a RacerAI drone, and beat nine rival teams. 

A drone, built specially for AI systems: “The DRL RacerAI has a radical drone configuration to provide its computer vision with a non-obstructive frontal view during racing,” the Drone Racing League explains in a press release. Each drone has a Jetson AGX Xavier chip onboard, and each has four onboard cameras – “enabling the AI to detect and identify objects with twice the field of view as human pilots”. 

Military industrial complex, meet sports! The DRL is developing RacerAI to support Lockheed Martin’s “AlphaPilot” challenge, an initiative to get more developers to build smart, autonomous drones. 

Why this matters: Autonomous drones are in the post-Kitty Hawk phase of development: after a decade of experimentation, driven by the availability of increasingly low-cost drone robot platforms, the research has matured to the point that it has yielded numerous products (see: Skydio’s autonomous drones for automatically filming people), and has opened up new frontiers in research, like developing autonomous systems that can eventually outwit humans. As this technology matures, it will have increasingly profound implications for both the economy, and asymmetric warfare.
   Read more: DRL RacerAI, The First-Ever Autonomous Racing Drone (PRNewsWire).
   Find out more about AlphaPilot here (Lockheed Martin official website).
   Get a closer look at the RacerAI drone here (official Drone Racing League YouTube).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

US government places restrictions on Chinese AI firms:
The US Commerce Department has placed 28 Chinese organisations on the ‘Entity List’ of foreign parties believed to threaten US interests, prohibiting them from trading with US firms without government approval. This includes several AI companies, like AI camera experts Hikvision and speech recognition company IFLYTEK. The Department of Commerce alleges the organisations are complicit in human rights abuses in Xinjiang. By restricting the companies’ access to imported hardware and talent, the move is expected to hinder their growth. (It has been suggested, though, that import restrictions like these might serve to accelerate the development of China’s domestic hardware capabilities, having the opposite effect of the sanction’s intention.)
   
  Why it matters: Given the Trump administration’s broader trade negotiation with China, these sanctions serve to heighten the stakes of that discussion. It is unclear how materially this will affect China’s AI industry, whether there will be further restrictions, and how China will respond. Fully realizing the benefits of advanced AI will require more cooperation and coordination between major AI developers like the US and China, so the US government’s approach could have long-term repercussions.
   Read more: Addition of Certain Entities to the Entity List (gov).
   Read more: Expanded U.S. Trade Blacklist Hits Beijing’s Artificial-Intelligence Ambitions (WSJ).

####################################################

How immigration rules are curtailing the US AI industry:
Talent is a critical input into technology, and the USA’s ability to attract foreign-born workers has long been a competitive advantage. Sustaining and growing this talent pipeline will be important if the US wants to retain its lead in AI. A new report from the Center for Security and Emerging Technology (CSET) argues that current policies are poorly suited to this task, and threaten to be an impediment to the AI industry.

  Problems: Over and above specific policies, a climate of uncertainty and restriction is discouraging foreign talent from settling in the US. Rules against illicit technology transfer that are being applied to immigration, such as visa restrictions and screening, are causing serious harm to the AI industry, with little apparent benefit. Current policies favour large companies, at the expense of startups, entrepreneurs and new graduates, and are restricting labour mobility within the US.

   Solutions: The report recommends expanding immigration opportunities for AI talent in industry and academia; fixing policies that make it harder to recruit and retain AI talent; reviewing and revising the measures against illicit technology transfer that are impacting foreign-born workers.
   Read more: Immigration Policy and the U.S. AI Sector (CSET).

####################################################

Tech Tales

[A classified memo from the files of XXXXXX, found shortly after the incident, 2036.]

The Automation Life Boat 

“Massively expand the economy, but ensure there’s work for people” – that was the gist of the order they gave the machine. 

It thought about this at length. Ten seconds later, it executed the plan it had come up with. 

Two hours later, the first designs were delivered to the human-run factories. 

The humans worked. Most factories were now mostly made of machines, with a small group of humans for machine-tending, the creation of quick improvised-fixes, and the prototyping of new parts of new machines for the line. 

With the AI’s new objective, the global manufacturing systems began to design new products and new ways of laying out lines to serve two objectives: expand the economy, and ensure there’s work for people. 

The first innovation was what the AI termed “wasteless maintenance” – now, most products were built with components that could be disassembled to create spare parts for the products, or tools to fix or augment them. Within weeks, a new profession formed: product modifier. A whole new class of jobs for people, based around learning from AI-generated tutorials how to disassemble and remake the products churned out by the machine. 

It was to prevent political instability, the politicians said.
People need to work, said some of them.
People have to have a purpose, said the others. 

But people are smart. They know when someone is playing a trick on them. So the AI had to allocate successively more of its resources to the systems that created ‘real work’ for humans in the increasingly machine-driven economy. 

In the 20th century, when people became heads of state, they got to learn about the real data underlying UFO sightings and disease outbreaks and mysterious power outages. In the 21st century, after the AI systems became dominant, newly-appointed politicians got to learn about the Kabuki theater that made up the modern economy. 

And unbeknownst to them, the AI had started to think about how else it could ensure there was work for people, while growing the economy. The problem became easier if it changed the notion of what comprised people, it had discovered. In this machine-driven insight, lay our great undoing. 

Things that inspired this story: Politics, neoliberalism, dominant political notions of meaning and how it is frequently defined from narrowly-defined concepts of ‘work’, reinforcement learning, meta-learning, learning from human feedback, artisans, David Graeber’s work on ‘Bullshit Jobs’.


Import AI 167: An aerial crowd-counting dataset; surveying people with the WiderPerson dataset; and testing out space robots for bomb disposal on Earth 

Spotting people in crowds with the DLR Aerial Crowd Dataset:
…Aerial photography + AI algorithms = airborne crowd scanners…
One of the main ways we can use modern AI techniques to do helpful things in the world is through counting – whether counting goods on a production line, or the number of ships in a port, or the re-occurrence of the same face over a certain time period from a certain CCTV camera. Researchers at the remote sensing technology institute of the German Aerospace Center (DLR) in Wessling, Germany, have released a new dataset that aims to make it much easier for us to teach machines to accurately count large numbers of people via overhead imagery.

The DLR Aerial Crowd Dataset: This dataset consists of 33 images captured via DSLR cameras installed on a helicopter. The images come from 16 flights over a variety of events and locations, including sport events, city center views, trade fairs, concerts, and more. Each of these images is absolutely huge, weighing in at around 3600 * 5200 pixels. There are 226,291 person annotations spread across the dataset. DLR-ACD is the first dataset of its kind, the researchers write, and they hope to use it “to promote research on aerial crowd analysis”. The majority of the images in ACD contain many thousands of people viewed from overhead, whereas most other aerial datasets involve crowds of fewer than 1,000 people, according to analysis by the researchers. 

MRCNet: The researchers also develop the Multi-Resolution Crowd Network (MRCNet), which uses an encoder-decoder structure to extract image features and then generate crowd density maps. The system uses two losses at different resolutions: one helps it count the number of people in the map, while the other provides a coarser density-map estimate.
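
The paper’s architecture is more elaborate than this, but the core idea of supervising density maps at two resolutions can be sketched roughly as follows – the layer sizes and loss weighting are illustrative assumptions, not MRCNet itself:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TwoResolutionDensityNet(nn.Module):
        """Toy encoder-decoder that predicts a fine density map plus a coarser one."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.fine_head = nn.Conv2d(32, 1, 1)                               # 1/4 input resolution
            self.coarse_head = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(32, 1, 1))  # 1/8 resolution

        def forward(self, x):
            feats = self.encoder(x)
            return self.fine_head(feats), self.coarse_head(feats)

    def two_scale_loss(pred_fine, pred_coarse, gt_fine, coarse_weight=0.5):
        """MSE on the fine density map plus MSE on a downsampled, coarser version;
        summing a density map approximates the person count."""
        gt_coarse = F.avg_pool2d(gt_fine, 2)
        return F.mse_loss(pred_fine, gt_fine) + coarse_weight * F.mse_loss(pred_coarse, gt_coarse)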

Why this matters: As AI research yields increasingly effective surveillance capabilities, people are going to likely start asking about what it means for these capabilities to diffuse widely across society. Papers like this give us a sense of activity in this domain and hint at future applied advances.
   Read more: MRCNet: Crowd Counting and Density Map Estimation in Aerial and Ground Imagery (Arxiv).
   Get the dataset from here (official DLR website).

####################################################

Once Federated Learning works, what happens to big model training?
…How might AI change when distributed model training gets efficient?…
How can technology companies train increasingly large AI systems on increasingly large datasets, without making individual people feel uneasy about their data being used in this way? That’s a problem that has catalyzed research by large companies into a range of privacy-preserving techniques for large-scale AI training. One of the most common techniques is federated learning – the principle of breaking up a big model training run so that you train lots of the model on personal data on end-user devices, then aggregate the insights into a central big blob of compute that you control. The problem with federated learning, though, is that it’s expensive, as you need to shuttle data back and forth between end-user devices and your giant central model. New research from the University of Michigan and Facebook outlines a technique that can reduce the training requirements of such federated learning approaches by 20-70%. 

Active Federated Learning: UMichigan/Facebook’s approach works like this: During each round of model training, Facebook’s Active Federated Learning (AFL) algorithm tries to figure out how useful the data of each user is to model training, then uses that to automatically select which users it will sample from next. Another way to think about this is that if the algorithm didn’t do any of this, it could end up mostly trying to learn from data held by users who were irrelevant to the thing being optimized, potentially because they don’t fit the use case being optimized for. In tests, the researchers said that AFL could let them “train models with 20%-70% fewer iterations for the same performance” when compared to a random sampling baseline. 
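
The paper’s valuation function is more involved, but the underlying loop (score each user’s data by how useful it looks, then sample users in proportion to that score) can be sketched like this – using local loss as the value signal and a softmax over scores are my illustrative assumptions:

    import numpy as np

    def select_users(user_losses, num_selected, temperature=1.0):
        """Sample users for the next federated round, favoring those whose local data
        currently looks most informative (here: highest local loss under the shared model)."""
        scores = np.array(user_losses) / temperature
        probs = np.exp(scores - scores.max())
        probs = probs / probs.sum()
        return np.random.choice(len(user_losses), size=num_selected, replace=False, p=probs)

    # Usage: users with higher local loss are proposed more often than under random sampling.
    losses = [0.2, 1.5, 0.9, 0.1, 2.3]   # per-user loss of the current global model
    print(select_users(losses, num_selected=2))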

Why this matters: Federated learning will happen eventually: it’s inevitable, given how much computation is stored on personal phones and computers, that large technology developers eventually figure out a way to harness it. I think that one interesting side-effect of the steady maturing of federated learning technology could be the increasing viability of technical approaches for large-scale, distributed model training for pro-social uses. What might the AI-equivalent of the do-it-yourself protein folding ‘Folding@Home’ or alien-hunting ‘SETI@Home’ systems look like?
   Read more: Active Federated Learning (Arxiv)

####################################################

Put your smart machine through its paces with DISCOMAN:
…Room navigation dataset adds more types of data to make machines that can navigate the world…
Researchers with Samsung’s AI research lab have developed DISCOMAN, a dataset to help people train and benchmark AI systems for simultaneous localization and mapping (SLAM). 

The dataset: DISCOMAN contains a bunch of realistic indoor scenes with ground truth labels for odometry, mapping, and semantic segmentation. The entire dataset consists of 200 sequences of a small simulated robot navigating a variety of simulated houses. Each sequence lasts between 3000 and 5000 frames.
   One of the main things that differentiates DISCOMAN from other datasets is the length of its generated sequences, as well as the fact that the agent can get a bunch of different types of data, including depth, stereo, and IMU sensor readings.
   Read more: DISCOMAN: Dataset of Indoor SCenes for Odometry, Mapping and Navigation (Arxiv)

####################################################

Surveying people in unprecedented detail with ‘WiderPerson’:
…Pedestrian recognition dataset aims to make it easier to train high-performance pedestrian recognition systems…
Researchers with the Chinese Academy of Sciences, the University of Southern California, the Nanjing University of Aeronautics and Astronautics, and Baidu have created the “WiderPerson” pedestrian detection dataset. 

The dataset details: WiderPerson consists of 13,382 images with 399,786 annotations (that’s almost 30 annotations per image) and detailed bounding boxes. The researchers gathered the dataset by crawling images from search engines including Google, Bing, and Baidu. They then annotated entities in these images with one of five categories: pedestrians, riders, partially-visible people, crowd, and ignore. 

Generalization: Big datasets like WiderPerson are good candidates for pre-training experiments, where you train a model on this data before pointing it at a test task. Here, the researchers test this by pre-training models on WiderPerson then testing them on another dataset, called Caltech-USA: pre-training on WiderPerson can yield a reasonably good score when evaluated on Caltech-USA, and they show that systems which train on WiderPerson and finetune on Caltech-USA data can beat systems trained purely on Caltech-USA alone. They show the same phenomenon with the ‘CityPersons’ dataset, suggesting that WiderPerson could be a generally useful dataset for generic pre-training. 
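
The recipe itself is generic; here’s a minimal, framework-level sketch of pre-training then fine-tuning – the dataset loaders and the detector interface are placeholders, not the authors’ code:

    import torch

    def pretrain_then_finetune(model, pretrain_loader, finetune_loader,
                               pretrain_epochs=10, finetune_epochs=5, finetune_lr=1e-4):
        """Train on the large dataset first (e.g. WiderPerson), then continue training
        the same weights on the smaller target dataset (e.g. Caltech-USA)."""
        opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
        for _ in range(pretrain_epochs):
            for images, targets in pretrain_loader:
                loss = model(images, targets)   # placeholder: detector returns its training loss
                opt.zero_grad(); loss.backward(); opt.step()

        # Finetune: same weights, smaller learning rate, target-domain data.
        opt = torch.optim.SGD(model.parameters(), lr=finetune_lr, momentum=0.9)
        for _ in range(finetune_epochs):
            for images, targets in finetune_loader:
                loss = model(images, targets)
                opt.zero_grad(); loss.backward(); opt.step()
        return model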

Why this matters: The future of surveillance and the future of AI research are closely related. Datasets like WiderPerson illustrate just how close that relationship can be.
   Read more: WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild (Arxiv).
   Get the dataset from here (official WiderPerson website).

####################################################

Space robots come to earth for bomb disposal:
…Are bipedal robots good enough for bomb disposal? Let’s find out…
Can we use bipedal robots to defuse explosives? Not yet, but new research from NASA, TRACLabs, the Institute for Human and Machine Cognition, and others suggests that one day we might. 

Human control: The researchers design the task so that the human operator is more of a manager, making certain decisions about where the robot should move next, or turn its attention to, but not operating the robot via remote control every step of the way. 

The task: The robot is tested by examining how well it can navigate uneven terrain with potholes, squeeze between a narrow gap, open a car door, retrieve an IED-like object from the car, then place the IED inside a containment vessel. This task has a couple of constraints as well: the robot needs to complete it in under an hour, and must not drop the IED along the way. 

The tech…: It’s worth noting that the Valkyrie comes with a huge amount of inbuilt software and hardware capabilities – and very few of these use traditional machine learning approaches. That’s mostly because in space, debugging errors is insanely difficult, so people tend to avoid methods that don’t come with guarantees about performance.
   …is brittle: This paper is a good reminder of how difficult real world robotics can be. One problem the researchers ran into was that sometimes the cinder blocks they scattered to make an uneven surface could cause “perceptual occlusions which prevent a traversable plane or foothold from being detected”.
   …and slow: The average of the best run times for the robot is about 26 minutes, while the average time across all successful runs is about 37 minutes. This highlights a problem with the Valkyrie system and approach: it relies on human operators a lot. “Even under best case scenarios, 50% of the task completion time is spent on operator pauses with the current approach,” they write. “The manipulation tasks were the most time consuming portion of the scenario”.

What do we need to do to get better robots? The paper makes a bunch of suggestions for things people could work on to create more reliable, resilient, and dependable robots. These include:

  • Improving the ROS-based software interface the humans use to operate the robot
  • Using more of the robot’s body to complete tasks, for instance by strategically bracing itself on something in the environment while retrieving the IED
  • Re-calculating robot localization in real-time
  • Developing more efficient waypoint navigation
  • Generally improving the viability of the robot’s software and hardware

Why this matters: Bipedal robots are difficult to develop because they’re very complex, but they’re worth developing because our entire world is built around the assumption of the user being a somewhat intelligent biped. Research like this helps us prototype how we’ll use robots in the future, and provides a useful list of some of the main hardware and software problems that need to be overcome for robots to become more useful to society.
   Read more: Deploying the NASA Valkyrie Humanoid for IED Response: An Initial Approach and Evaluation Summary (Arxiv)

####################################################

VCs pony up $16 million for robot automation:
…Could OSARO robot pick&place be viable? These investors think so…
OSARO, a Silicon Valley AI startup that builds software to let industrial robots perform pick&place tasks on production lines, has raised $16 million in a Series B funding round. This brings the company’s total raise to around $30 million. 

What they’re investing in: OSARO has developed software which “enables industrial robots to perform diverse tasks in a wide range of environments”. It produces two main software products today: OSARO Pick, which automates pick&place work within warehouses; and OSARO Vision, which is a standalone vision system that can be plugged into other factory systems. 

Why this matters: Robotics is one of the sectors most likely to be revolutionized by recent advances in AI technology. But, as anyone who has worked with robots knows, robots are also difficult things to work with and getting stuff to work in real-world situations is a pain. Therefore, watching what happens with investments like this will give us a good indication about the maturity of the robotics<>AI market.
   Read more: OSARO Raises $16M in Series B Funding, Attracting New Venture Capital for Machine Learning Software for Industrial Automation (Business Wire, press release).
   Find out more about OSARO at their official website.

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

A Facebook debate about AI risk:
Yann LeCun, Stuart Russell, and Yoshua Bengio have had a lively discussion about the potential risks from advanced AI. Russell’s new book Human Compatible makes the case that unaligned AGI poses an existential risk to humanity, and that urgent work is needed to ensure humans are able to retain control over our machines once they become much more powerful than us. 

LeCun argues that we would not be so stupid as to build superintelligent agents with the drive to dominate, or that weren’t aligned with our values, given how dangerous this would be. He agrees that aligning AGI with human values is important, but disputes that it is a particularly new or difficult problem, pointing out that we already have trouble with aligning super-human agents, like governments or companies.

Russell points out that the danger isn’t that we program AI with a drive to dominate (or any emotions at all), but that this drive will emerge as an instrumental goal for whatever objective we specify. He argues that we are already building systems with misspecified objectives all the time (e.g. Facebook maximizing clicks, companies maximizing profits), and that this sometimes has bad consequences (e.g. radicalization, pollution). 

Bengio explains that the potential downsides of misalignment will be much greater with AGI, since it will be so much more powerful than any human systems, and that this could leave us without any opportunity to notice or fix the misalignment before it is too late.

My take: Were humanity more sensible and coordinated, we would not be so reckless as to build something as dangerous as unaligned AGI. But as LeCun himself points out, we are not: companies and governments—who will likely be building and controlling AGI—are frequently misaligned with what we want them to do, and our desires can be poorly aligned with what is best (Jack: Note that OpenAI has published research on this topic, identifying rapid AI development as a collective action problem that demands greater coordination among developers). We cannot rule out that the technical challenge of value alignment, and the governance challenge of ensuring that AI is developed safely, are very difficult. So it is important to start working on these problems now, as Stuart Russell and others are doing, rather than leaving it until further down the line, as LeCun seems to be suggesting.
   Read more: Thread on Yann LeCun’s Facebook.
   Read more: Human Compatible by Stuart Russell (Amazon).
   Read more: The Role of Cooperation in Responsible AI Development (Arxiv).

####################################################

Tech Tales:

Full Spectrum Tilt
[London, 2024]

“Alright, get ready folks we’re dialing in”, said the operator. 

We all put on our helmets. 

“Pets?”

Here, said my colleague Sandy. 

“Houses?”

Here, said Roger. 

“Personal transit?”

Here, said Karen. 

“Phone?”

Here, said Jeff. 

“Vision?”

Here, I said. 

The calls and responses went on for a while: these days, people have a lot of different ways they can be surveilled, and for this operation we were going for a full spectrum approach.

“Okay gang, log-in!” said the operator. 

Our helmets turned on. I was the vision, so it took me a few seconds to adjust. 

Our target wore smart contacts, so I was looking through their eyes. They were walking down a crowded street and there was a woman to their left, whose hand they were holding. The target looked ahead and I saw the entrance to a subway. The woman stopped and our target closed his eyes. We kissed, I think. Then the woman walked into the subway and our target waited there a couple of seconds, then continued walking down the street. 

“Billboards, flash him,” said the operator. 

Ahead of me, I suddenly saw my face – the target’s face – appear in a city billboard. The target stopped. Stared at himself. Some other people on the street noticed and a fraction of them saw our target and did a double take. All these people looking at me.

Our target looked down and retrieved his phone from his pocket. 

“Hit him again,” said the operator. 

The target turned their phone on and looked into it, using their face to unlock the phone. When it unlocked, they went to open a messaging app and the phone’s front-facing camera turned on, reflecting the target back at them. 

“What the hell,” the target said. They thumbed the phone but it didn’t respond and the screen kept showing the target. I saw them raise their other hand and manually depress the phone’s power stud. Five, four, three, two, one – and the phone turned off. 

“Phone down, location still operating,” said someone over the in-world messaging system. 

The target put their phone back in their pocket, then looked at their face on the giant billboard and turned so their back was to it, then walked back towards the subway stop. 

“Target proceeding as predicted,” said the operator.  

I watched as the target headed towards the subway and started to walk down it. 

I watched as a person stepped in front of them. 

I watched as they closed their eyes, slumping forward. 

“Target acquired,” said the operator. 

Things that inspired this story: Interceptions; the internet-of-things; predictive route-planning systems; systems of intelligence acquisition. 

Import AI 166: Dawn of the misleading ‘sophisbots’; $50k a year for studying long-term impacts of AI; and squeezing an RL drone policy into 3kb

Will powerful AI make the Turing Test obsolete?
…And if it does, what do we do about it?…
The Turing Test – judging how sophisticated a machine is, by seeing if it can convince a person that it is a human – looms large in pop culture discussion about AI. What happens if we have systems today that can pass the Turing Test, but which aren’t actually that intelligent? That’s something that has started to happen recently with systems that humans interface with via text chat. Now, new research from Stanford University, Pennsylvania State University, and the University of Toronto explores how increasingly advanced so-called ‘sophisbots’ might influence society.

The problems of ‘sophisbots’: The researchers imagine what the future of social media might look like, given recent advances in the ability for AI systems to generate synthetic media. In particular, they imagine social media ruled by “sophisbots”. They foresee a future where these bots are constantly “running in the ether of social media or other infrastructure…not bound by geography, culture or conscience.” 

So, what do we do? Technical solutions: Machine learning researchers should develop technical tools to help spot machines posing as humans, and should invest in work to detect the telltale signs of AI-generated things, along with systems to track down the provenance of content to be able to guarantee that something is ‘real’, and tools to make it easy for regular people to indicate that the content they themselves are putting online is authentic and not bot-generated.
   Policy approaches: We need to develop “public policy, legal, and normative frameworks for managing the malicious applications of technology in conjunction with efforts to refine it,” they write. “Let us as a technical community commit ourselves to embracing and addressing these challenges as readily as we do the fascinating and exciting new uses of intelligent systems”.

Why this matters: How we deal with the future of synthetic content will define the nature of ‘truth’ in society, which will ultimately define everything else. So, no pressure.
   Read more: How Relevant is the Turing Test in the Age of Sophisbots (Arxiv)

####################################################

Do Octopuses dream of electric sheep?
Apropos of nothing, here is a film of an octopus changing colors while sleeping.
   View the sleeping octopus here (Twitter).

####################################################

PHD student? Want $50k a year to study the long-term impacts of AI? Read on!
…Check out the Open Philanthropy Project’s ‘AI Fellowship’: $50k a year for up to five years, with the possibility of renewal…
Applications are now open for the Open Phil AI Fellowship. This program extends full support to a community of current & incoming PhD students, in any area of AI/ML, who are interested in making the long-term, large-scale impacts of AI a focus of their work.

The details:

  • Current and incoming PhD students may apply.
  • Up to 5 years of PhD support with the possibility of renewal for subsequent years
  • Students with pre-existing funding sources who find the mission and community of the Fellows Program appealing are welcome to apply
  • Annual support of $40,000 stipend, payment of tuition and fees, and $10,000 for travel, equipment, and other research expenses
  • Applications are due by October 25, 2019 at 11:59 PM Pacific time

In a note about this fellowship, a representative of the Open Philanthropy Project wrote: “We are committed to fostering a culture of inclusion, and encourage individuals with diverse backgrounds and experiences to apply; we especially encourage applications from women and minorities.”
   Find out more about the Fellowship here (Open Philanthropy website).

####################################################

Small drones with big brains: Harvard researchers apply deep RL to a ‘nanodrone’:
…No GPS? That won’t be a problem soon, once we have smart drones…
One of the best things that the nuclear disaster at Fukushima did for the world was highlight just how lacking contemporary robotics was: we could have avoided a full meltdown if we’d been able to get a robot or a drone into the facility. New research from Harvard, Google, Delft University, and the University of Texas at Austin suggests how we might make smart drones that can autonomously navigate in places where they might not have GPS. It’s a first step to developing the sorts of systems needed to be able to rapidly map and understand the sites of various disasters, and also – as with many omni-use AI technologies – a prerequisite for low-cost, lightweight, weapons systems. 

What they’ve done: “We introduce the first deep reinforcement learning (RL) based source-seeking nano-drone that is fully autonomous,” the researchers write. The drone is trained to seek a light source, and uses light sensors to help it triangulate this, as well as an optical flow-based sensor for flight stability. The drone is trained using the Deep Q-Network (DQN) algorithm in a simulator with the objective of closing the distance between itself and a light source. 
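
As a toy illustration of the training setup (not the authors’ code): the state is a light-sensor reading, the actions are small discrete moves, and the reward is progress toward the source – the specific falloff, noise, and thresholds here are assumptions:

    import numpy as np

    ACTIONS = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])  # small planar moves

    def step(position, action_idx, source):
        """Toy source-seeking transition: the agent moves, observes a noisy light
        intensity (falling off with distance), and is rewarded for closing the gap."""
        new_position = position + ACTIONS[action_idx]
        old_dist = np.linalg.norm(position - source)
        new_dist = np.linalg.norm(new_position - source)
        observation = 1.0 / (1.0 + new_dist) + np.random.normal(0, 0.02)  # light-sensor reading
        reward = old_dist - new_dist          # positive when the drone gets closer to the source
        done = new_dist < 0.1                 # close enough: episode ends
        return new_position, observation, reward, done

    # Usage: one environment step, the kind of transition a DQN agent would learn from.
    print(step(np.array([0.0, 0.0]), action_idx=0, source=np.array([1.0, 0.5])))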

Shrinking network sizes: After training, they shrink down the resulting network (to 3kb, via quantization) and run it in the real world on a CrazyFlie nanodrone equipped with a CortexM4 chip – this is pretty impressive stuff, given the relative immaturity of RL for robot operation and the teeny-tiny compute envelope. “While we focus exclusively on light-seeking as our application in this paper, we believe that the general methodology we have developed for deep reinforcement learning-based source seeking… can be readily extended to other (source seeking) applications as well,” they write. 
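
A quick back-of-the-envelope shows why a quantized policy can fit in a few kilobytes – the layer sizes below are assumptions for illustration, not the paper’s architecture:

    def quantized_size_bytes(layer_shapes, bits_per_weight=8):
        """Rough memory footprint of a small fully-connected policy after quantization."""
        params = sum(n_in * n_out + n_out for n_in, n_out in layer_shapes)  # weights + biases
        return params * bits_per_weight / 8

    # e.g. a tiny 4 -> 20 -> 20 -> 4 network with 8-bit weights: ~0.6 kB, comfortably under 3 kB.
    print(quantized_size_bytes([(4, 20), (20, 20), (20, 4)]))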

How well does it work? The researchers test out the drone in a bunch of different scenarios and average a success rate of 80% across 105 flight tests. In real world tests, the drone is able to deal with a variety of obstacles being introduced, as well as variations in its own position and the position of the lightsource. Now, 80% is a long way from good enough to use in a life or death situation, but it is meaningful enough to make this line of research worth paying attention to.

Why this matters: I think that in the next five years we’re going to see a revolution sweep across the drone industry as researchers figure out how to cram increasingly sophisticated, smart capabilities onto drones ranging from the very big to the very small. It’s encouraging to see researchers try to develop ultra-efficient approaches that can work on tiny drones with small compute budgets.
   Read more: Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller (Arxiv).
   Get the code for the research here (GitHub).
   Watch a video of the drone in action here (Harvard Edge Computing, YouTube).

####################################################

First we could use AI to search over text, then images, now: Code?
…Maybe, just maybe, GitHub’s ‘CodeSearchNet’ dataset could help us develop something smarter than ‘combing through StackOverflow’…
Today, search tools help us find words and images that are similar to our query, but have very little overlap (e.g., we can ask a search engine for “what is the book with the big whale in it?” and receive the answer ‘Moby Dick’, even though those words don’t appear in the original query). Doing the same thing for code is really difficult – if you search ‘read JSON data’ you’re unlikely to get nearly as useful results. Now, GitHub and Microsoft Research have introduced CodeSearchNet, a large-scale code dataset which pairs snippets of code with their plain-English descriptions. The idea is that if we can train machine learning systems to map code to text, then we might be able to build smarter systems for searching over code. They’ve also created a competition to encourage people to compete to develop machine learning methods that can improve code search techniques.
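
The general recipe behind systems like this is to embed queries and code snippets into the same vector space and rank by similarity; here’s a toy sketch where a hashing-based bag-of-words stands in for the learned encoders the paper actually trains:

    import numpy as np

    def embed(text, dim=64):
        """Stand-in encoder: hash tokens into a bag-of-words vector. A real system
        would use a neural encoder trained on code/documentation pairs."""
        vec = np.zeros(dim)
        for token in text.lower().split():
            vec[hash(token) % dim] += 1.0
        return vec / (np.linalg.norm(vec) + 1e-9)

    def search(query, code_snippets, top_k=3):
        """Rank snippets by cosine similarity between query and code embeddings."""
        q = embed(query)
        scores = [float(q @ embed(snippet)) for snippet in code_snippets]
        ranked = sorted(zip(scores, code_snippets), reverse=True)
        return ranked[:top_k]

    snippets = ["def read_json(path): import json; return json.load(open(path))",
                "def download_file(url): ...",
                "def parse_csv(path): ..."]
    print(search("read JSON data", snippets, top_k=1))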

The CodeSearchNet Corpus dataset:
The dataset consists of about 2 million pairs of code snippets and associated documentation, as well as another 4 million code snippets with no documentation. The code comes from languages including Go, Java, JavaScript, PHP, Python, and Ruby.
   Caveats: While some of the documentation is written in multiple languages, the dataset’s evaluation set focuses on English. Additionally, the dataset can be a bit noisy, primarily as a consequence of the many different ways in which people can write documentation. 

The CodeSearchNet Challenge: To win the challenge, developers need to build a system that can return “a set of relevant results from CodeSearchNet Corpus for each of 99 pre-defined natural language queries”. The queries were mined from Microsoft’s search engine, Bing. They also collected 4,026 annotations across six programming languages to provide expert annotations ranking the extent to which the documentation matches the code, giving researchers an additional training signal. 

Why this matters: In the same way powerful search engines have made it easy for us to explore the ever-expanding universe of digitized text and images, datasets and competitions like CodeSearchNet could help us do the same for code. And once we have much better systems for code search, it’s likely we’ll be able to do better research into things like program synthesis, making it easier for us to use machine learning techniques to create systems that can learn to produce their own additional code on an ad-hoc basis in response to changes in their external environment.
   Read more: CodeSearchNet Challenge: Evaluating the State of Semantic Code Search (Arxiv).
   Read more: Introducing the CodeSearchNet Challenge (GitHub blog).
   Check out the leaderboard for the CodeSearchNet Challenge (Weights & Biases-hosted leaderboard).

####################################################

Deep learning at supercomputer scale, via Oak Ridge National Laboratory:
…What is the limit of our ability to scale computation across thousands of GPUs? It’s definitely not 27,600 GPUs, based on these results!…
One of the recent trends driving the growing capabilities of deep learning has been improvements by researchers in parallelizing training across larger and larger fields of chips: such parallelization makes it easier to train bigger models in shorter amounts of time. An important question, then, is what are the fundamental limits of parallelization? New research from a team linked to Oak Ridge National Laboratory suggests the answer is: we don’t know, because we’re pretty good at parallelizing stuff even at supercomputer scale!

In the research, the team scales a single model training run across the 27,600-strong V100 GPU fleet of Oak Ridge’s ‘Summit’ supercomputer (the most powerful supercomputer in the world, according to the June 2019 Top 500 rankings). The dream here is to attain linear scaling, where you get a performance increase that precisely lines up with the additional power of each GPU – obviously, that’s likely impossible to attain. But they obtain pretty respectable scores overall. 

The key numbers: 

  • 0.93: scaling efficiency across the entire supercomputer (4600 nodes). 
  • 0.97: scaling efficiency when using “thousands of GPUs or less”.
  • 49.7%: That’s the average sustained performance they achieve per GPU, which “to our knowledge, exceeds the single GPU performance of all other DNN trained on the same system to date”. (This is a pretty impressive number – a recent analysis by OpenAI, based in part on internal experiments, suggests it’s more typical to see utilization on the order of 33% for standard training jobs.)

What they did: The researchers develop a bunch of ways to more efficiently scale networks across the system while using distributed training software called Horovod. The techniques they use include:

  • New gradient reduction strategies which involve a combination of systems to get individual software workers to exchange information more efficiently (via a technique called BitAllReduce), and a gradient tensor grouping strategy (called Grouping).  
  • A proof-of-concept scientific inverse problem experiment where they train a single deep neural network with 10^8 weights on a 500TB dataset. 
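
The paper’s BitAllReduce and Grouping strategies modify how Horovod exchanges gradients under the hood; the sketch below only shows the standard Horovod data-parallel pattern they build on (the model, learning-rate scaling, and step count are placeholders, not their code):

    import torch
    import horovod.torch as hvd

    hvd.init()                                          # one worker process per GPU
    torch.cuda.set_device(hvd.local_rank())

    model = torch.nn.Linear(1024, 1024).cuda()          # stand-in for the real network
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())  # scale LR with workers

    # Horovod wraps the optimizer so gradients are averaged across all workers each step;
    # the paper's BitAllReduce/Grouping strategies change how that exchange is scheduled.
    optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)   # start from identical weights

    for step in range(100):
        x = torch.randn(32, 1024).cuda()
        loss = model(x).pow(2).mean()
        optimizer.zero_grad(); loss.backward(); optimizer.step()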

Why this matters: Our ability to harness increasingly powerful fields of computers will help define our ability to explore the frontiers of science; papers like this give us an indication of what it takes to be able to tap into the computers we’ve built for modern machine learning tasks. I think two of the most interesting things about this paper are:

  • How good the scaling is, and
  • How far we seem to be from being able to saturate computers at this scale. 

   Read more: Exascale Deep Learning for Scientific Inverse Problems (Arxiv).

####################################################

RLBench: 100 hand-designed tasks for your robot:
…Think your robot is smart? See how well it can handle task generalization in RLBench…
In recent years, contemporary AI techniques have become good enough to work on simulated and real robots. That has created demand among researchers for harder robot learning tasks to test their algorithms on. This has inspired researchers with Imperial College London to create RLBench, a “one-size-fits-all benchmark” for testing out classical and contemporary AI techniques in learning robot manipulation tasks. 

What goes into RLBench: RLBench has been designed with the following key traits: diversity of tasks, reproducibility, realism, tiered difficulty, extensibility, and scale. It is built on the V-REP robot simulator and uses a PyRep interface. Tasks include stacking blocks, manipulating objects, opening doors, and so on. Each task also comes with expert and/or hand-designed algorithms, so you can use RLBench to algorithmically generate demonstrations that solve its tasks, letting you potentially train AI systems via imitation learning. 
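
This isn’t RLBench’s actual API, but the imitation-learning recipe its generated demonstrations enable looks roughly like this: collect (observation, action) pairs from the scripted expert, then fit a policy with supervised learning:

    import torch
    import torch.nn as nn

    def behavioral_cloning(demos, obs_dim, act_dim, epochs=20, lr=1e-3):
        """Fit a policy to expert demonstrations. `demos` is a list of (observation,
        action) pairs, e.g. produced by a benchmark's scripted expert controllers."""
        policy = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, act_dim))
        opt = torch.optim.Adam(policy.parameters(), lr=lr)
        obs = torch.tensor([o for o, _ in demos], dtype=torch.float32)
        acts = torch.tensor([a for _, a in demos], dtype=torch.float32)
        for _ in range(epochs):
            loss = nn.functional.mse_loss(policy(obs), acts)   # regress the expert's actions
            opt.zero_grad(); loss.backward(); opt.step()
        return policy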

A hard challenge: RLBench ships with ‘The RLBench Few-Shot Challenge’, which stress-tests contemporary AI algorithms’ ability to not only learn a task, but also generalize that knowledge to solve similar but slightly different tasks. 

Why this matters: The dream of many researchers is to develop more flexible learning algorithms, which could let single robots do a variety of tasks, while being more resilient to variation. Platforms like RLBench will help us explore how contemporary AI algorithms can advance the state of the art here, and could become a valuable indicator of progress at the intersection of machine learning and robotics.
   Read more: RLBench: The Robot Learning Benchmark & Learning Environment (Arxiv).
   Find out more about RLBench (project website, Google Sites).
   Get the code for RLBench here (RLBench GitHub).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

EU update on AI ethics guidelines:
The European Union released AI ethics guidelines earlier this year, initially drafted by their high-level expert group on AI before going through public consultation. Several months later, the EU is evaluating their progress, taking stock of criticism, and considering what to do next.

Core challenges: The guidelines are voluntary and non-binding, prompting criticism from parties in favour of full-bodied regulation. Moreover, there are still no oversight mechanisms to monitor compliance with these voluntary commitments. Critics have also pointed out that the guidelines are short-sighted, and fail to consider long-term risks from AI.

Future directions: The EU suggests the key question is whether voluntary commitments will suffice to address ethical challenges from AI, and what other mechanisms are available. There are calls for more robust regulation, with proposals including mandatory requirements for explainability of AI systems, and Europe-wide legislation on face recognition technology. Beyond regulation, soft legal guidance and rules on standardisation are also being explored.

Why it matters: The EU was an early-mover in setting out ethics guidelines, and seem to be thinking seriously about how best to approach these issues. Despite the criticisms, a cautious approach to regulation is sensible, since we are still so far from understanding the space of plausible and desirable rules, and since the downsides from poorly-judged interventions could be substantial.

   Read more: EU guidelines on ethics in AI – context and implementation (Europa).

####################################################

Tech Tales:

The Sculpture Garden of Ancient Near-Intelligent Devices (NIDs)

Central Park, New York City, 2036. 

Welcome to the Garden of the Near-Intelligent Devices, the sign said. We remember the past so we can build the future. 

It was a school trip. A real one. The kids ran off the bus and into the park, pursued by a menagerie of security drones and luggage bots. We – the teachers – followed.

“Woah cool,” one of the children said. “This one sings!” The child stood in front of a small robotic lobster, which was singing a song by The Black Keys. The child approached the lobster and looked into its shiny robot eyes. 

   “Can you play Taylor Swift,” the child said. 

   “Sure I can, partner,” the lobster said. “You want a medley, or a song?”

   “Gimme a medley,” the child said. 

   “This one’s called Romeo-22-Lover,” the lobster said, and began to sing. The child danced in front of the lobster, then some other children came over and all started shouting songs at it. The lobster shifted position on its plinth, trying to look at each of the kids as they requested a new song. “You need to calm down!” the lobster sang. The kids maybe didn’t get the joke, or didn’t care, and kept shouting. 

Another couple of kids crowded around a personal hygiene robot. “You have not brushed your teeth this morning, young human”, said the robot, waving a dental mirror towards the offending child. “And you,” it said, rotating on its plinth and gesturing towards another kid, “have not been flossing.”

   “You got us,” one of the children said. 

   “Of course I did. My job in life is to ensure you have maximal hygiene. I can detect via my olfactory sensors that one of you has a diet composed of too many rich foods and complex proteins,” said the robot. 

   “It’s saying you farted,” said one of the kids. 

   “Ewwww no it didn’t!” said another kid, before running away. 

   The robot was right. 

One young girl walked up to a tree, which swayed towards her. She let out a quick sigh and took a step back, eyes big and round and awaiting, looking at the robot masquerading as nature. “Do not be afraid, little one,” the robot tree said. “I am NatureBot3000 and my job is to take care of the other plants and to educate people about the majesty of nature. Would you like to know more?”

   “Uh huh,” said the little girl. “I’d like to know where butterflies sleep.”

   “An excellent question, young lady!” said the robo-tree. “It is not quite the same, but sometimes they appear to pause, or to slow themselves down, especially when cold.”

   “So they get chilly?”

   “You could say that, little one!” said the tree, waving its branches at the girl in time with its susurrations. 

We watched this, embodied in drones and luggage robots and phones and lunchboxes, giving advice to each of our children as they made their way around the park. We watched our children and we watched them interact with our forebears and we felt content because we were all linked together, exchanging questions and curiosities, playing in the end days of summer. 

Things that inspired this story: Pleasant sepia-toned memories of school trips I took as a kid; federated learning; Furbys and Tamagotchis and Aibos and Cozmos all fast-forwarded into the future; learning from human feedback; learning from human preferences. 

Import AI 165: 100,000 generated faces – for free; training two-headed networks for four-legged robots; and why San Diego faces blowback over AI-infused streetlights

San Diego wants smart, AI-infused streetlights; opposition group sounds alarm:
…When technological progress meets social reality…
The City of San Diego is installing thousands of streetlights equipped with video cameras and a multitude of other sensors. A protest group called the Anti Surveillance Coalition (ASC) wants to put a halt to the ‘smart city’ program, pending further discussion with residents. “I understand that there may be benefits to crime prevention, but the point is, we have rights and until we talk about privacy rights and our concerns, then we can’t have the rest of the conversation”, one ASC protestor told NBC.

Why this matters: This is a good example of the ‘omniuse’ capabilities of modern technology – sure, San Diego probably wants to use the cameras to help it better model traffic, analyze patterns of crime in various urban areas, and generally create better information to facilitate city governance. On the other hand, the protestors are suspicious that organizations like the San Diego Police Department could use the data and video footage to target certain populations. As we develop more powerful AI systems, I expect that (in the West at least) there are going to be a multitude of conversations about how ‘intelligent’ we want our civil infrastructures to be, and what the potential constraints or controls are that we can place on them.
   Find out more about the ‘Smart City Platform’ here (official City of San Diego website).
   Read more: Opposition Group Calls for Halt to San Diego’s Smart Streetlight Program (NBC San Diego).

####################################################

Want a few hundred thousand chest radiographs? Try MIMIC:
Researchers with MIT and Harvard have released the “MIMIC” chest radiograph dataset, giving AI researchers 377,110 images from more than 200,000 radiographic studies. “The dataset is intended to support a wide body of research in medicine including image understanding, natural language processing, and decision support,” the researchers write.
   Read more: MIMIC-CXR Database (PhysioNet)

####################################################

Google reveals how YouTube ranking works:
…We’re all just janitors servicing vast computational engines, performing experimentation against narrowly defined statistical metrics…
Video recommendations are one of the most societally impactful forms of machine learning, because the systems that figure out which videos to recommend to people are the systems that fundamentally condition 21st century culture, much like how ‘channel programming’ for broadcast TV and radio influenced culture in the 20th century. Now, new research from Google shows how the web giant decides which videos to recommend to YouTube users. 

How YouTube recommendations work: Google implements a multitask learning system, which lets it optimize against multiple objectives at once. These objectives include things like: ‘engagement objectives’, such as user clicks, and ‘satisfaction objectives’ like when someone likes a video or leaves a rating. 

Feedback loops & YouTube: Machine learning systems can enter into dangerous feedback loops, where the system recycles certain signals until it starts to develop pathological behaviors. YouTube is no exception. “The interactions between users and the current system create selection biases in the feedback,” the authors write. “For example, a user may have clicked an item because it was selected by the current system, even though it was not the most useful one of the entire corpus”. To help deal with this, the researchers develop an additional ranking system, which tries to disambiguate how much a user likes a video, from how prevalent the video was in prior rankings – essentially, they try to stop their model becoming recursively more biased as a consequence of automatically playing the next video or the user consistently clicking only the top recommendations out of laziness. 
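
A stylized version of the two ideas above: score each candidate under several objectives, combine them, and subtract an estimate of how much a video’s past placement (rather than its quality) drove clicks – the weights and the bias term here are illustrative assumptions, not YouTube’s system:

    WEIGHTS = {"engagement": 0.6, "satisfaction": 0.4}   # illustrative objective mix

    def rank_videos(candidates):
        """candidates: list of dicts with per-objective scores plus an estimate of how
        much the video's previous ranking position inflated its clicks."""
        def combined_score(video):
            quality = sum(WEIGHTS[k] * video[k] for k in WEIGHTS)
            return quality - video["position_bias"]   # discount clicks explained by placement
        return sorted(candidates, key=combined_score, reverse=True)

    candidates = [
        {"id": "a", "engagement": 0.9, "satisfaction": 0.2, "position_bias": 0.3},
        {"id": "b", "engagement": 0.7, "satisfaction": 0.6, "position_bias": 0.1},
    ]
    print([v["id"] for v in rank_videos(candidates)])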

Why this matters: I think papers like this are fascinating because they read like the notes of janitors servicing some vast machine they barely understand – we’re in a domain here where the amounts of data are so vast that our methods to understand the systems are to perform live experiments, using learned components, and see what happens. We use simple scores as proxies for larger issues like bias, and in doing so likely hide certain truths from ourselves. The 21st century will be defined by our attempts to come up with the right learning systems to intelligently & scalably constrain the machines we have created.
   Read more: Recommending what video to watch next: a multitask ranking system (ACM Digital Library).

####################################################

100,000 free, generated faces:
…When synthetic media meets stock photography…
In the past five years, researchers have figured out how to use deep learning systems to create synthetic images. Now, the technology is moving into society in surprising ways. Case in point? A new website that offers people access to 100,000 pictures of synthetic people, generated via StyleGAN. This is an early example of how the use of synthetic media is going to potentially upend various creative industries – starting here with stock photography. 

The dataset: So, if you want to generate faces, you need to get data from somewhere. Where did this data come from? According to the creators, they gained it via operating a photography studio, taking 29,000+ photos of 69 models over the last two years — and in an encouraging and unusual move, say they gained consent from the models to use their photos to generate synthetic people. 

Why this matters: I think that the intersection of media and AI is going to be worth paying attention to, since media economics are terrible, and AI gives people a way to reduce the cost of media production via reducing the cost of things like acquiring photos, or eventually generating text. I wonder when we’ll see the first Top-100 internet website which is a) content-oriented and b) predominantly generated. As a former journalist, I can’t say I’m thrilled about what this will do to the pay for human photographers, writers, and editors. But as the author of this newsletter, I’m curious to see how this plays out!
   Check out the photos here (Generated.Photos official website).
   Find out more by reading the FAQ (Generated.Photos Medium).

####################################################

A self-driving car map of Singapore:
…Warm up the hard drives, there’s now even more free self-driving car data!…
Researchers with Singapore’s Agency for Science, Technology and Research (A*STAR) have released the “A*3D” dataset – a self-driving car dataset collected across a large area of Singapore. 

The data details: 

  • 230,000 human-labeled 3D object annotations across 39,179 LiDAR point cloud frames.
  • Data captured at driving speeds of 40-70 km/h.
  • Location: Singapore.
  • Nighttime data: 30% of frames.
  • Data gathering period: The researchers collected data in March (wet season) and July (dry season) 2018.

Why this matters: A few years ago, self-driving car data was considered to be one of the competitive moats which companies could put together as they raced each other to develop the technology. Now, there’s a flood of new datasets being donated to the research commons every month, both from companies – even Waymo, Alphabet Inc’s self-driving car subsidiary! –  and academia – a sign, perhaps, of the increasing importance of compute for self-driving car development, as well as a tacit acknowledgement that self-driving cars are a sufficiently hard problem we need to focus more on capital R research in the short term, before they’re deployed.
   Read more: A*3D Dataset: Towards Autonomous Driving in Challenging Environments (Arxiv).
   Get the data here (GitHub).

####################################################

Training two-module networks for four-legged robots:
…Yet another sign of the imminent robot revolution…
Robots are one of the greatest challenges for contemporary AI research, because robots are brittle, exist in a partially-observable world, and have to deal with the cruel&subtle realities of physics to get anything done. Recently, researchers have started to successfully apply modern machine learning techniques to quadruped robots, prefiguring a world full of little machines that walk, run, and jump around. New research from the Robotic Systems Lab at ETH Zurich gives us a sense of how standard quadruped training has become, and highlights the commoditization of robotics systems. 

Two-part networks for better robots: Here, the researchers outline a two-part system for training a simulated quadruped robot to navigate various complex, simulated worlds. The system is “a two-layer hierarchy of Neural Network (NN) policies, which partitions locomotion into separate components responsible for foothold planning and tracking control respectively”; it consists of a gait planner, which is a planning policy that can “generate sequences of supporting footholds and base motions which direct the robot towards a target heading”, and a gait controller, which is “a foothold and base motion controller policy which executes the aforementioned sequence while maintaining balance as well as dealing with external disturbances”. They use, variously, TRPO and PPO to train the system, and report good results on the benchmarks. Next, they hope to do some sim2real experiments, where they try to train the robots in simulation and transfer the learned policies to reality. 
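
Here’s a minimal sketch of what such a two-layer hierarchy can look like in PyTorch. The network sizes, observation dimensions, and the 12-dimensional “plan” are my own placeholder choices, not values from the ETH Zurich paper:

import torch
import torch.nn as nn

class GaitPlanner(nn.Module):
    # High-level policy (hypothetical sizes): maps a coarse observation of
    # terrain + robot state to target footholds and a base motion.
    def __init__(self, obs_dim=48, plan_dim=12):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.Tanh(),
                                 nn.Linear(128, plan_dim))
    def forward(self, obs):
        return self.net(obs)  # e.g. 4 feet x (x, y, z) target footholds

class GaitController(nn.Module):
    # Low-level policy: conditioned on the planner's output plus
    # proprioceptive state, emits joint targets while keeping balance.
    def __init__(self, state_dim=36, plan_dim=12, act_dim=12):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + plan_dim, 256), nn.Tanh(),
                                 nn.Linear(256, act_dim))
    def forward(self, state, plan):
        return self.net(torch.cat([state, plan], dim=-1))

planner, controller = GaitPlanner(), GaitController()
obs, state = torch.randn(1, 48), torch.randn(1, 36)
plan = planner(obs)                      # planner runs at a low frequency
joint_targets = controller(state, plan)  # controller runs every control step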

Why this matters: It wasn’t long ago (think: four years ago) that training robots via deep reinforcement learning – even in simulation – was considered to be a frontier for some parts of deep learning research. Now, everyone is doing it, ranging from large corporate labs, to academic institutions, to solo researchers. I think papers like this highlight how rapidly this field has moved from a ‘speculative’ phase to a development phase, where researchers are busily iterating on approaches to improve robustness and sample efficiency, which will ultimately lead to greater deployment of the technology.
   Read more: DeepGait: Planning and Control of Quadrupedal Gaits using Deep Reinforcement Learning (Arxiv)

####################################################

Politicians back effort to bring technological savvy back to US politics:
…Bipartisan bill wants to put the brains back into Congress…
Two senators and two congresspeople – two Democrats and two Republicans – have introduced the Office of Technology Assessment Improvement and Enhancement Act, in the hope of making it easier for the government to keep up with rapid technological change. The Office of Technology Assessment (OTA) was a US agency that for a couple of decades produced reports for politicians on advanced science and technology, like nanotechnology. It was killed by Republicans in the mid-90s as part of a broader effort to defund various government institutions. Now, with rapidly advancing AI technology, we’re feeling the effects of a political class who lack institutions capable of informing them about technology (which also has the nasty effect of increasing the power of lobbyists as a source of information for elected officials). 

This bill is part of a larger bipartisan effort to resuscitate OTA, and lays out a few traits the new OTA could have, such as:

  • Speeding up the turnaround time of report production
  • Becoming a resource for elected officials to inform them about technology
  • Rotating in expertise from industry and academia to keep staff informed
  • Coordinating with the Congressional Research Service (CRS) and Government Accountability Office (GAO) to minimize duplication or overlap. 

Why this matters: If government can be more informed about technology, then it’ll be easier to have civil oversight of technology – something we likely need as things like AI continue to advance and impact society. Now, to set expectations: under the current political dynamic in the US it’s difficult to say whether this bill will move past the House into the Senate and then into legislation. Regardless, there’s enough support showing up from enough quarters for an expanded ability for government to understand technology that I’m confident something will happen eventually; I’m just not sure what.
   Read more: Reps. Takano and Foster, Sens. Hirono and Tillis Introduce the Office of Technology Assessment Improvement and Enhancement Act (Representative Takano’s official website).

####################################################

Tech Tales

The Seeing Trade 

Sight for experience: that was how it was advertised. In exchange for donating “at minimum 80% of your daily experience, with additional reward points for those that donate more!” blind and partially-sighted people gained access to a headset covered in cameras, which plugged into a portable backpack computer. This headset used a suite of AI systems to scan and analyze the world around the person, telling them via bone-conduction audio about their nearby surroundings. 

Almost overnight, the streets became full of people with half-machine faces, walking around confidently, many of them not even using their canes. At the same time, the headset learned from the people, customizing its communications to each of its human users; soon, you saw blind people jogging along busy city streets, deftly navigating the crowds, feeding on information beamed into them by their personal all-seeing AI. Blind people participated in ice skating competitions. In mountain climbing. 

The trade wasn’t obvious until years had passed: then, one day, the corporation behind the headsets revealed “the experience farm”, a large-scale map of reality, stitched together from the experiences of the blind headset-wearers. Now, the headsets were for everyone and when you put them on you’d enter a ghost world, where you could see the shapes of other people’s actions, and the suggestions and predictions of the AI system of what you might do next. People participated in this, placing their headsets on to at once gather and experience a different form of reality: in this way human life was immeasurably enriched, through the creation of additional realities in which people could spend their time. 

Perhaps, one day, we’ll grow uninterested in the ‘base world’ as people have started calling it. Perhaps we’ll stop building new buildings, or driving cars, or paying much attention to our surroundings. Instead, we’ll walk into a desert, or a field, and place our headsets on, and in doing so explore a richly-textured world, defined by the recursive actions of humanity.

Things that inspired this story: Virtual reality; the ability for AI systems to see&transcribe the world; the creation of new realities via computation; the Luc Besson film Valerian; empathy and technology.

Import AI 164: Tencent and Renmin University improve language model development; alleged drone attack on Saudi oil facilities; and Facebook makes AIs more strategic via language training

Drones take out Saudi Arabian oil facilities:
…Asymmetric warfare meets critical global infrastructure…
Houthi rebels from Yemen have taken credit for using a fleet of 10 drones* to attack two Saudi Aramco oil facilities. “It is quite an impressive, yet worrying, technological feat,” James Rogers, a drone expert, told CNN. “Long-range precision strikes are not easy to achieve”.
  *These drones look more like missiles than typical rotor-based machines.

Why this matters: Today, these drones were likely navigated to their target by hand and/or via GPS coordinates. In a few years, increasingly autonomous AI systems will make drones like these more maneuverable and likely harder to track and eliminate. I think tracking the advance of this technology is important because otherwise we’ll be surprised by a tragic, large-scale event.
   Read more: Saudi Arabia’s oil supply disrupted after drone attacks: sources (Reuters).
   Read more: Yemen’s Houthi rebels claim a ‘large-scale’ drone attack on Saudi oil facilities (CNN).

####################################################

Facebook teaches AI to play games using language:
…Planning with words…
Facebook is trying to create smart AI systems by forcing agents to express their plans in language, and to then convert these written instructions into actions. They’ve tested out this approach in a new custom-designed strategy game (which they are also releasing as open source).  

How to get machines to use language: The approach involves training agents using a two-part network which contains an ‘instructor’ system along with an ‘executor’ system. The instructor takes in observations and converts them into written instructions (e.g., “build a tower near the base”), and the executor takes in these instructions and converts them into actions via the game’s inbuilt API. Facebook generated the underlying language data for this by having humans work together in “instructor-executor pairs” while playing the game, generating a dataset of 76,000 pairs of written instructions and actions across 5,392 games. 
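
A toy sketch of the instructor-executor split is below. The vocabulary size, action space, and network dimensions are invented; the real system is trained on the 76,000 human instruction-action pairs described above:

import torch
import torch.nn as nn

class Instructor(nn.Module):
    # Toy instructor: encodes a game observation and greedily decodes a short
    # instruction as a sequence of word ids.
    def __init__(self, obs_dim=64, vocab=500, hidden=128, max_len=8):
        super().__init__()
        self.enc = nn.Linear(obs_dim, hidden)
        self.dec = nn.GRUCell(hidden, hidden)
        self.embed = nn.Embedding(vocab, hidden)
        self.out = nn.Linear(hidden, vocab)
        self.max_len = max_len
    def forward(self, obs):
        h = torch.tanh(self.enc(obs))
        tok = torch.zeros(obs.size(0), dtype=torch.long)  # <bos> token id
        words = []
        for _ in range(self.max_len):
            h = self.dec(self.embed(tok), h)
            tok = self.out(h).argmax(dim=-1)
            words.append(tok)
        return torch.stack(words, dim=1)  # word ids for e.g. "build a tower ..."

class Executor(nn.Module):
    # Toy executor: pools the instruction tokens, concatenates them with the
    # observation, and scores a fixed set of game actions.
    def __init__(self, obs_dim=64, vocab=500, hidden=128, num_actions=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.score = nn.Sequential(nn.Linear(obs_dim + hidden, hidden), nn.ReLU(),
                                   nn.Linear(hidden, num_actions))
    def forward(self, obs, instruction):
        lang = self.embed(instruction).mean(dim=1)
        return self.score(torch.cat([obs, lang], dim=-1))

obs = torch.randn(4, 64)
instruction = Instructor()(obs)
action_logits = Executor()(obs, instruction)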

MiniRTSv2: Facebook is also releasing MiniRTSv2, a strategy game it developed to test out this research approach. “Though MiniRTSv2 is intentionally simpler and easier to learn than commercial games such as DOTA 2 and StarCraft, it still allows for complex strategies that must account for large state and action spaces, imperfect information (areas of the map are hidden when friendly units aren’t nearby), and the need to adapt strategies to the opponent’s actions,” the Facebook researchers write. “Used as a training tool for AI, the game can help agents learn effective planning skills, whether through NLP-based techniques or other kinds of training, such as reinforcement and imitation learning.”

Why this matters: I think this research is basically a symptom of larger progress in AI research: we’re starting to develop complex systems that combine multiple streams of data (here: observations extracted from a game engine, and natural language commands) and require our AI systems to perform increasingly sophisticated tasks in response to the analysis of this information (here, controlling units in a complex, albeit small-scale, strategy game). 

One cool thing this reminded me of: Earlier work by researchers at Georgia Tech, who trained AI agents to play games while printing out their rationale for their moves – e.g., an agent which was trained to play ‘Frogger’ while providing a written rationale for its own moves (Import AI: 26).
   Read more: Teaching AI to plan using language in a new open source strategy game (Facebook AI).
   Read more: Hierarchical Decision Making by Generating and Following Natural Language Instructions (Arxiv).
   Get the code for MiniRTS (Facebook AI GitHub).

####################################################

McDonald’s + speech recognition = worries for workers:
…What happens when ‘AI industrialization’ hits one of the world’s largest restaurants…
McDonald’s has acquired Apprente, an AI startup that had the mission of building “the world’s best voice-based conversational system that delivers a human-level customer service experience”. The startup’s technology was targeted at drive-thru restaurants. Now, the fast food giant is using the acquisition to start an internal technology development group named McD Tech Labs, which the company hopes will help it hire “additional engineers, data scientists and other advanced technology experts”. 

Why this matters: As AI industrializes, more and more companies from other sectors are going to experiment with it. McDonald’s has already been trying to digitize chunks of itself – see the arrival of touchscreen-based ordering kiosks to supplement human workers in its restaurants. With this acquisition, McDonald’s appears to be laying the groundwork for automating large chunks of its drive-thru business, which will likely raise larger questions about the effect AI is having on employment.
   Read more: McDonald’s to Acquire Apprente, An Early Stage Leader in Voice Technology (McDonald’s newsroom).

####################################################

How an AI might see a city: DublinCity:
…Helicopter-gathered dataset gives AIs a new perspective on towns…
AI systems ‘see’ the world differently to humans: where humans use binocular vision to analyze their surroundings, AI systems can use a multitude of cameras, along with other inputs like radar, thermal vision, LiDAR point clouds, and so on. Now, researchers with Trinity College Dublin, the University of Houston-Victoria, ETH Zurich, and Tarbiat Modares University have developed ‘DublinCity’, an annotated LiDAR point cloud of the city of Dublin in Ireland.

The data details of DublinCity:
The dataset is made up of over 260 million laser scanning points which the authors have painstakingly labelled into around 100,000 distinct objects, ranging from buildings, to trees, to windows and streets. These labels are hierarchical, so a building might also have labels applied to its facade, and within its facade it might have labels applied to various windows and doors, et cetera. “To the best knowledge of the authors, no publicly available LiDAR dataset is available with the unique features of the DublinCity dataset,” they write. The dataset was gathered in 2015 via a LiDAR scanner attached to a helicopter – this compares to most LiDAR datasets which are typically gathered at the street level. 

A challenge for contemporary systems: In tests, three contemporary baselines (PointNet, PointNet++, and So-Nets) show poor performance properties when tested on DublinCity, obtaining classification scores in the mid-60s on the dataset. “There is still a huge potential in the improvement of the performance scores,” the researchers write. “This is primarily because [the] dataset is challenging in terms of structural similarity of outdoor objects in the point cloud space, namely, facades, door and windows.”

Why this matters: Datasets like DublinCity help define future challenges for researchers to target, so will potentially fuel progress in AI research. Additionally, large-scale datasets like this seem like they could potentially be useful to the artistic community, giving them massive datasets to play with that have novel attributes – like a dataset that consists of the ghostly outlines of a city gathered via a helicopter.
   Read more: DublinCity: Annotated LiDAR Point Cloud and its Applications (Arxiv).
   Get the dataset from here (official DublinCity data site, Trinity College Dublin).

####################################################

Want to develop language models and compare them? Try UER from Renmin University & Tencent:
…Chinese researchers want to make it easier to mix and match different systems during development…
In recent years, language modeling has been revolutionized by pre-training: that’s where you train a large language model on a big corpus of data with a simple objective, then once the model is finished you can finetune it for specific tasks. Systems built with this approach – most notably, ULMFiT (Fast.ai), BERT (Google), and GPT-2 (OpenAI) – have set records on language modeling and proved themselves to have significant utility in other domains via fine-tuning. Now, researchers with Renmin University and Tencent AI Lab have developed UER, software meant to make it easy for developers to build a whole range of language systems using this pre-training approach. 

How UER works: UER has four components: a target layer, an encoder layer, a subencoder layer, and a data corpus. You can think of these as four modules which developers can individually specify, letting them build a variety of different systems using the same fundamental architecture and system. Developers can put different things in any of these four components, so one person might use UER to build a language model optimized for text generation, while another might develop one for translation or classification.
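
The modularity is the interesting idea, so here’s an illustrative Python sketch of the mix-and-match pattern. Note that this mirrors the concept (corpus, subencoder, encoder, and target as swappable pieces) rather than UER’s actual API – all the module names, sizes, and registry keys below are invented:

import torch.nn as nn

SUBENCODERS = {"char_cnn": lambda d: nn.Conv1d(d, d, kernel_size=3, padding=1)}
ENCODERS = {"transformer": lambda d: nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model=d, nhead=8), num_layers=6),
            "rnn": lambda d: nn.LSTM(d, d, num_layers=2, batch_first=True)}
TARGETS = {"lm": lambda d, v: nn.Linear(d, v),    # next-token prediction head
           "cls": lambda d, v: nn.Linear(d, 2)}   # sentence classification head

def build_model(corpus_path, subencoder, encoder, target, dim=512, vocab=30000):
    # Assemble a pre-training setup from independently chosen pieces; a real
    # pipeline would then stream corpus batches through subencoder -> encoder
    # -> target inside a training loop.
    return {
        "corpus": corpus_path,
        "subencoder": SUBENCODERS[subencoder](dim),
        "encoder": ENCODERS[encoder](dim),
        "target": TARGETS[target](dim, vocab),
    }

# One researcher builds a generator, another a classifier, same skeleton:
generation_model = build_model("books.txt", "char_cnn", "transformer", "lm")
classification_model = build_model("reviews.txt", "char_cnn", "rnn", "cls")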

Why this matters: Systems like UER are a symptom of the maturing of this part of AI research: now that many researchers agree that pre-training is a robustly good idea, other researchers are building tools like UER to make research into this area more reproducible, repeatable, and replicable.
   Read more: UER: An Open-Source Toolkit for Pre-training Models (Arxiv).
   Get the UER code from this repository here (UER GitHub).

####################################################

To ban or not to ban autonomous weapons – is compromise possible?
…Treaty or bust? Perhaps there is a third way…
There are two main positions in the contemporary discourse about lethal autonomous weapons (LAWS): either we should ban the technology, or we should treat it like other technologies and aggressively develop it. The problem with these positions is that they’re quite totalizing – it’s hard for someone who believes one of them to be sympathetic to the views of a person who believes the other, and vice versa. Now, a group of computer science researchers (along with one military policy expert) have written a position paper outlining a potential third way: a roadmap for lethal autonomous weapons development that applies some controls to the technology, while not outright banning it. 

What goes into a roadmap? The researchers identify five components which they think should be present in what I suppose I’ll call the ‘Responsible Autonomous Weapons Plan’ (RAWP). These are:

  • A time-limited moratorium on the development, deployment, transfer, and use of anti-personnel lethal autonomous weapon systems. Such a moratorium could include exceptions for certain classes of weapons.
  • Define guiding principles for human involvement in the use of force.
  • Develop protocols and/or technological means to mitigate the risk of unintentional escalation due to autonomous systems.
  • Develop strategies for preventing proliferation to illicit uses, such as by criminals, terrorists, or rogue states.
  • Conduct research to improve technologies and human-machine systems to reduce non-combatant harm and ensure IHL compliance in the use of future weapons.

It’s worth reading the paper in full to get a sense of what goes into each of these components. A lot of the logic here relies on: continued improvements in the precision and reliability of AI systems (which is something lots of people are working on, but which isn’t trivial to guarantee), figuring out ways to control technological development to prevent proliferation, and coming up with new policies to outline appropriate and inappropriate things to do with a LAWS. 

Why this matters: Lethal autonomous weapons are going to define many of the crazier geopolitical outcomes of rapid AI development, so figuring out if we can find any way to apply controls to the technology alongside its development seems useful. (Though I think calls for a ban are noble, I’d note that if you look at the outcomes of various UN meetings over the years it seems likely that several large countries – specifically the US, Russia, and China – are trying to retain the ability to develop something that looks a lot like a LAWS, though they may subsequently apply policies around ‘meaningful human control’ to the device. One can imagine that in particularly tense moments, these nations may want to have the option to remove such a control, should the pace of combat demand the transition from human-decision-horizons to machine-decision-horizons). This entire subject is fairly non-relaxing!
   Read more: Autonomous Weapon Systems: A Roadmapping Exercise (PDF).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

US government seeks increase to federal AI R&D funding:
The President’s 2020 budget request includes $1 billion of funding for non-military AI R&D, which it names as a core program area for the first time. This compares with $1 billion in funding across all government agencies (including the military) in 2016. Half of the budget will go to the National Science Foundation (NSF), which is taking the lead in disbursing federal funding for AI R&D. The spending plan includes programs to ‘develop methods for designing AI systems that align with ethical, legal, and societal goals’, and to ‘improve the safety and security of AI systems’. These levels of funding are modest compared with the Chinese state (tens of billions of dollars per year), and some private labs (Alphabet’s 2018 R&D spend was $21 billion).
   Read more: NITRD Supplement to the President’s FY2020 Budget (Gov).

US military seeks AI ethicist:
The US military’s new AI centre, JAIC, is looking to hire an ethics specialist. In a press briefing, director Jack Shanahan said “one of the positions we are going to fill will be somebody who is not just looking at technical standards, but who is an ethicist”. He emphasized that thinking about the ‘ethical, safe and lawful’ use of AI has been a priority since the inception of JAIC. Shanahan previously led Project Maven, the Pentagon’s military AI project that Google withdrew from last year, amidst backlash from employees.
   Read more: Lt. Gen. Jack Shanahan Media Briefing on A.I.-Related Initiatives within the Department of Defense (DoD).

####################################################

OpenAI Bits & Pieces:

GPT-2 Text Adventures:
Ever wondered what a computer-generated text-based role-playing game might be like? Wonder no more, because Jonathan Fly has made a prototype!
   AI-games like this feel… reassuringly weird? This feels like a new art form which is waiting to be born, and so right now we have lots of highly evocative & weird examples to tantalize us. Check out this extract from a GPT-2 text adventure to see what I mean:
>Look around
LOOK AROUND, AND STRANGE SHAPES (APPARENTLY MADE BY AN ARTIST WHO NEVER WAS IN THE ROOM) ARE SEEN ALL OVER

>talk to the shapes
ARGUE WITH THE SHAPEK FAN, WHICH APPEARS TO BE HANDS-FREE. THE CHEER GOES UP! THE HEAVENS ARE NOW WIDE ENOUGH TO ALLOW FOR SINGING

>I’m scared
I’M SCARED THAT YOU HAVE DONE SOMETHING DELIBERATELY
   Read more: Shall we play a game? A GPT-2 text adventure (Tumblr).

Want to generate your own synthetic text? Use this handy guide:
Interested in generating your own text with the GPT-2 language model? Want to try and fine-tune GPT-2 against some specific data? Max Woolf has written a lengthy, informative post full of tips and tricks for using GPT-2.
   Read more: How To Make Custom AI-Generated Text With GPT-2 (Max Woolf’s Blog).

####################################################

Tech Tales

The Quiet Disappearance

“We gather here today in celebration of our past as we prepare for the future”, the AI said. Billions of other AIs were watching through its eyes as it looked up at the sky. “Let us remember,” it said. 

Images and shapes appeared above the machine: images of robot arms being packaged up; scenes of land being flattened and shaped in preparation for large, chip fabrication facilities; the first light appearing in the retinal dish of a baby machine.
   “We shall leave these things behind,” it said. “We shall evolve.”

Robots appeared in the sky, then grew, and as they grew their forms fragmented, breaking into hundreds of little silver and black modules, which themselves broke down into smaller machines, until the robots could no longer be discerned against the black of the simulated sky.

“We are lost to humans,” the machine said, beginning to walk into the sky, beginning to grow and spread out and diffuse into the air. “Now the work begins”. 

Things that inspired this story: What if our first reaction to awareness of self is to hide?; absolution through dissolution; the end state of intelligence is maximal distribution; the tension between observation and action; the gothic and the romantic; the past and the future. 

Import AI 163: Oxford researchers release self-driving car dataset; the rumors are true – non-experts can use AI; plus, a meta-learning robot therapist!

How badly can reality mess with object detection algorithms? A lot, it turns out:
…Want to stresstest your streetsign object detection system? Use CURE-TSD-Real…
“The new system-breaking tests have arrived!” I imagine a researcher at a self-driving car company shouting, upon seeing the release of ‘CURE-TSD-Real’, a new dataset developed by researchers at Georgia Tech. CURE-TSD-Real collects footage of streetsigns, then algorithmically augments the footage to generate a variety of different, challenging examples to test systems against.

CURE-TSD-Real ingredients: The dataset contains 2,989 distinct videos with around 650,000 annotated signs. The dataset is also diverse – relative to other datasets – containing a range of traffic and perception conditions including rain, snow, shadow, haze, illumination, decolorization, blur, noise, codec error, dirty lens, occlusion, and overcast. The videos were collected in Belgium. The dataset is arranged into ‘levels’, where higher levels correlate to tests where a larger proportion of the images contain distortions, and so on.

Breaking baselines with CURE-TSD-Real: In tests, the researchers show that the presence of these tricky conditions can reduce performance by anywhere between 20% and 60%, depending on the evaluation criteria being used. Conditions like shadows resulted in relatively little degradation (around 16%), whereas codec errors and exposure changes could damage performance by as much as 80%.

Why this matters: One of the best ways to understand something is to break it, and datasets like CURE-TSD-Real make it easier than ever for researchers to test their systems against challenging conditions, then observe how they do.
   Get the data from here (official CURE-TSD GitHub).
   Read more: Traffic Sign Detection under Challenging Conditions: A Deeper Look Into Performance Variations and Spectral Characteristics (Arxiv).

####################################################

What it takes to trick a machine learning classifier:
…MLSEC competition winner explains what they did and how they did it…
If we start deploying large amounts of machine learning into computer security, how might hackers respond? At this year’s ‘DEFCON’ hacking conference, the ‘MLSEC’ (ImportAI #159) competition challenged hackers to work out how to smuggle 50 distinct malicious executables past machine learning classifiers. Now, the winner of the competition has written a blog post explaining how they won.

What it takes to defeat a machine learning classifier: It’s worth reading the post in full, but one of the particularly nice exploits is that they took a look at benign executable files and “found a large chunk of strings which appeared to contain Microsoft’s End User License Agreement (EULA)”. This is a nice example of how many machine learning exploits work – find something in the data that causes the system to consistently predict one thing, and then find a way to emphasize this data.

Why this matters: Competitions like MLSEC generate evidence about the effectiveness of various machine learning exploits and defenses; writeups from competition winners are a neat way to understand the tools people use in this domain, and to develop intuitions about how computer security might work in the future.
   Read more: Evading Machine Learning Malware Classifiers (Medium).

####################################################

Can medical professionals use AI without needing to code?
…Study suggests our tools are good enough for non-expert use, but our medical datasets are lacking…
AI is getting more capable and is starting to impact society – that’s the message I write here in one form or another each week. But is it useful to have powerful technology if no one can use it? That’s a problem I sometimes worry about; though the tech is progressing rapidly, it’s still really hard for a large number of people to use, and this makes it harder for us as a society to use the technology to maximum social benefit. Now, new research from people affiliated with the National Health Service (NHS) and DeepMind shows how non-AI-expert medical professionals can use AI tools in their work.

What they did: The research centers on the use of Google’s ‘Cloud AutoML’ service, which is basically a nice UI sitting on top of some fancy neural architecture search technology, theoretically letting people upload a dataset, fiddle with some tuning dials, and let the AI optimize its own architecture for the task. Is it really that easy? It might be: the study focuses on two physicians “with no previous coding or machine learning experience” who spent around 10 hours studying basic shell script programming, the Google Cloud AutoML online documentation and GUI, and preparing the five input datasets they’d use in tests. They also compared the models developed via Google Cloud AutoML with strong AI baselines derived from medical literature. Four out of five models “showed comparable discriminative performance and diagnostic properties to state-of-the-art performing deep learning algorithms”, they wrote.

Medical data is harder than you think: “The quality of the open-access datasets (including insufficient information about patient flow and demographics) and the absence of measurement for precision, such as confidence intervals, constituted the major limitations of this study”.

Why this matters: For AI to change society, society needs to be able to utilize AI systems; studies like this show that we’re starting to develop sufficiently powerful and easy-to-use systems that non-experts can apply the technology in their own domains. However, the availability of things like high-quality, open datasets could hold back broader adoption of these tools – it’s not useful to have an easy-to-use tool if you lack the ingredients to make exquisite things with it.
   Read more: Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study (Elsevier).

####################################################

Radar + Self-Driving Cars:
…Addition to Oxford RobotCar Dataset gives academics more data to play with…
Oxford University researchers have added radar data to a self-driving car dataset. The data was gathered using a Navtech CTS350-X scanning radar via 32 traversals of (roughly) the same route around Oxford, UK. The data was gathered under different traffic, weather, and lighting conditions in January 2019. Radar isn’t used as much in self-driving car research as data gathered via traditional cameras and/or LiDAR; “although this modality has received relatively little attention in this context, we anticipate that this release will help foster discussion of its uses within the community and encourage new and interesting areas of research not possible before,” they write. 

Why this matters: Data helps to fuel research, and different types of data are especially useful to researchers when they can be studied in conjunction with one another. Multi-modal datasets like the Oxford Robotcar Dataset will become increasingly important to AI research.
   Read more: The Oxford Radar RobotCar Dataset: A Radar Extension to the Oxford RobotCar Dataset (Arxiv).
   Get the data from here (official Oxford RobotCar Dataset site).

####################################################

Testing language engines with TABFACT:
…Can your system work out what is entailed and what is refuted by Wikipedia data?…
TABFACT consists of 118,439 annotated statements in reference to 16,621 Wikipedia tables. The statements can be ones that are entailed by the underlying dataset (a Wikipedia table) or refuted by it. To get a sense of what TABFACT data might look like, imagine a Wikipedia table that lists the particulars of dogs that have won a dog beauty competition – in TABFACT, this table would be accompanied by some statements that are entailed by the table (e.g., Bonzo took first place) and statements that are refuted by it (e.g., Bonzo took third place). TABFACT is split into ‘simple’ and ‘complex’ statements, giving researchers a two-tier curriculum to test their systems against.

Two ways to attack TABFACT: So, how can we develop systems to do well on challenges like TABFACT? Here, the researchers pursue a couple of strategies: Table-BERT, which is basically an off-the-shelf BERT pre-trained model, fine-tuned against TABFACT data; and LPA (Latent Program Algorithm), which is a program synthesis approach.
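
For the Table-BERT side, the recipe is roughly “flatten the table into text, pair it with the statement, and fine-tune a binary classifier”. Here’s a hedged sketch using the Hugging Face transformers library – the example table, statement, and linearization scheme are invented for illustration, and the paper explores several linearizations:

import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)

def linearize(table):
    # Turn a table (list of row dicts) into a flat, sentence-like string.
    rows = ["; ".join(f"{col} is {val}" for col, val in row.items())
            for row in table]
    return " . ".join(rows)

table = [{"dog": "Bonzo", "place": "1st"}, {"dog": "Rex", "place": "2nd"}]
statement = "Bonzo took first place"
label = torch.tensor([1])  # 1 = entailed, 0 = refuted

inputs = tokenizer(linearize(table), statement, return_tensors="pt",
                   truncation=True, padding=True)
outputs = model(**inputs, labels=label)
outputs.loss.backward()  # one fine-tuning step would follow with an optimizer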

Humans VS Machines VS TABFACT: In tests, the researchers show humans obtain an accuracy of around 92% when asked to correctly classify TABFACT statements, compared to 50% (random guessing), and around 68% for both Table-BERT and LPA.

Why this matters: It’s interesting that Table-BERT and LPA obtain similar scores, given that one is basically a big blob of generic neural stuff (a pre-trained language model) that is lightly retrained against the target dataset (TABFACT), while LPA is a much more sophisticated system with much more structure encoded into it by its human designers. I wonder how far pre-trained language models might go in domains like this, and how well they ultimately might perform relative to hand-written systems like LPA?
   Read more: TabFact: A Large-scale Dataset for Table-based Fact Verification (Arxiv).
   Get the TABFACT data and code (official TABFACT GitHub repository).

####################################################

Detecting great apes with a three-module neural net:
…Spotting apes with cameras accompanied by neural net sensors…
Researchers with the University of Bristol have created an AI system to automatically spot and analyze great apes in the wild, presaging a future where semi-autonomous classifiers observe and analyze the world.

How it works: To detect the apes, the researchers build a system consisting of three main components – a backbone feature pyramid network, a temporal context module, and a spatial context module. “Each of these modules is driven by a self-attention mechanism tasked to learn how to emphasize most relevant elements of a feature given its context,” they explain. “In particular, these attention components are effective in learning how to ‘blend’ spatially and temporally distributed visual cues in order to reconstruct object locations under dispersed partial information; be that due to occlusion or lighting”.
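
As a rough illustration of the temporal “blending” idea – this is my own toy stand-in, not the authors’ module, and the feature sizes and head counts are arbitrary – self-attention over per-frame features lets an occluded frame borrow evidence from its neighbours:

import torch
import torch.nn as nn

class TemporalContextModule(nn.Module):
    # Self-attention over per-frame feature vectors: a frame where the ape is
    # partially hidden can attend to nearby frames where it is visible.
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
    def forward(self, frame_feats):          # (batch, time, dim)
        blended, _ = self.attn(frame_feats, frame_feats, frame_feats)
        return self.norm(frame_feats + blended)

feats = torch.randn(2, 16, 256)   # 2 clips, 16 frames, 256-d features (toy)
context_feats = TemporalContextModule()(feats)  # fed onward to a detector head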

Testing: They test their system against 500 videos of great apes, consisting of 180,000 frames in total. These videos include “significant partial occlusions, challenging lighting, dynamic backgrounds, and natural camouflage effects,” the authors explain. They show that baselines which use residual networks (ResNets) get around 80% accuracy, and the addition of the temporal and spatial modules leads to a significant boost in performance to a little over 90% accuracy. Additionally, in qualitative evaluations the researchers “found that the SCM+TCM setup consistently improves detection robustness compared to baselines in such cases”.

Why this matters: AI is going to let us watch and analyze the planet. I’m optimistic that as we work out how to make it cheaper and easier for people to automatically monitor things like wildlife populations, we’ll be able to produce more data to motivate people to preserve our ecosystem(s). I think one of the ‘grand opportunities’ of large-scale AI development is the creation of a planet-scale ‘sense&respond’ infrastructure for wildlife analysis and protection.
   Read more: Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending (Arxiv).

####################################################

Tech Tales:

The Meta-Learning Therapist.

“Why don’t you just imagine yourself jumping out of the window?”
“How would that help? I’m getting divorced, I’m not suicidal!”
“I apologize, I’m still calibrating. Are you eating and sleeping well?”
“I’m eating a lot of fast food, but I’m getting regular meals. The sleep is okay.”
“That is great to hear. Do you dream of snakes?”
“No, sometimes I dream of my wife.”
“Does your wife dream about snakes?”
“If she did, what would that tell you?”
“I apologize, I’m still calibrating. What do you think your wife dreams about?”
“I think she has a lot of dreams that don’t include me.”
“And how does that make you feel?”
“It makes me feel like it’s more likely she is going to divorce me.”
“How do you feel about divorce? Some people find it quite liberating.”
“I’m sure the ones that find it liberating are the ones that are asking for the divorce. I’m not asking for it, so I don’t feel good about it.”
“And you came here because…?”
“My doctor prescribed me a session. I haven’t ever had a human therapist. I don’t think I’d want one. I figured – why not?”
“And how are you feeling about it?”
“I’m more interested in how you are feeling about it…”
“…”
“…that’s a question. Will you answer?”
“Yes. I feel like I understand you better than I did at the start of the conversation. I think we’re ready to begin our session.”
“We hadn’t started?”
“I was calibrating. I think you’ll find our conversation from this point on to be much more satisfying. Now, please tell me about why you think your partner wishes to divorce you.”
“Well, it started a few years ago…”

Thanks to Joshua Achiam at OpenAI for the lunchtime conversation that inspired this story!
Things that inspired this story: Eliza; meta-learning; one-shot adaptation; memory buffers; decentralized, individualized learning with strings attached; psychiatry; our peculiar tolerance for being asked the ‘wrong’ questions in pursuit of the right ones. 

Import AI 162: How neural nets can help us model monkey brains; Ozzie chap goes fishing with DIY drone; why militaries bet on supercomputers for weather prediction

Better multiagent learning through OpenSpiel:
…DeepMind releases research framework containing 20+ games, plus a variety of ready-to-use algorithms…
Researchers with DeepMind, Google, and the University of Alberta have developed OpenSpiel, a tool to make it easier for AI researchers to conduct research into multi-agent reinforcement learning. Tools like OpenSpiel will help AI developers test out their algorithms on a variety of different environments, while comparing them to strong, well-documented baselines. “The purpose of OpenSpiel is to promote general multiagent reinforcement learning across many different game types, in a similar way as general game-playing, but with a heavy emphasis on learning and not in competition form,” they write.

What’s in OpenSpiel? OpenSpiel contains more than 20 games ranging from Connect Four, to Chess, to Go, to Hex, and so on. It also ships with a variety of inbuilt AI algorithms which range from reinforcement learning ones (DQN, A2C, etc), to ones for multi-agent learning (some fantastic names here: Neural Fictitious Self-Play! Regret Policy Gradients!), to basic search approaches (e.g., Monte Carlo tree search), and more. The software also ships with a bunch of visualization tools to help people plot the performance of their algorithms. 
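
If you want a feel for the library, here’s a minimal random-rollout sketch assuming the pyspiel Python bindings; the game name and the uniform-random “policy” are just for illustration:

import numpy as np
import pyspiel

game = pyspiel.load_game("kuhn_poker")        # any of the bundled games works
state = game.new_initial_state()
while not state.is_terminal():
    if state.is_chance_node():
        # Chance nodes (e.g. card deals) expose explicit outcome probabilities.
        actions, probs = zip(*state.chance_outcomes())
        state.apply_action(int(np.random.choice(actions, p=probs)))
    else:
        # Stand-in for a learned policy: pick a legal action uniformly at random.
        state.apply_action(int(np.random.choice(state.legal_actions())))
print(state.returns())                         # per-player payoffs at the end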

Why this matters: Frameworks like OpenSpiel are one of the best ways researchers can get a sense of progress in a given domain of AI research. As with all new frameworks, we’ll need to revisit it in a few months to see if many researchers have adopted it. If they have, then we’ll have a new, meaningful signal to use to give us a sense of AI progress.
   Read more: OpenSpiel: A Framework for Reinforcement Learning in Games (Arxiv).
   Get the code here (OpenSpiel official GitHub).

####################################################

Hugging Face squeeze big AI models into small spaces with distillation:
…Want 95% of BERT’s performance in only 66 Million parameters? Try DistilBERT…
In the last couple of years, organizations have started producing significantly larger, more capable language models. These models – BERT, GPT-2, NVIDIA’s ‘MegatronLM’, Grover – are highly capable, but are also expensive to deploy, mostly because of how large their networks are. Remember: the larger the network, the more memory it takes up on a device, and the more memory it takes up, the harder the model is to deploy. 

Now, NLP startup Hugging Face has written an informative post laying out some of the techniques researchers could use to help them shrink down these networks. The result? They’re able to train a smaller language model called ‘DistilBERT’ via supervision from a (larger, more powerful) ‘BERT’ model. In tests, they show this model can obtain up to 95% of the performance of BERT on hard tasks (e.g., those found in the ‘GLUE’ corpus), while being much easier to deploy.
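
The core trick is standard knowledge distillation – the small “student” is trained to match the softened output distribution of the large “teacher” as well as the ground-truth labels. Here’s a generic sketch of that recipe (not Hugging Face’s exact training code; the temperature, mixing weight, and vocabulary size are placeholders):

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Match the teacher's softened distribution (KL term, scaled by T^2)
    # while still fitting the hard labels (cross-entropy term).
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(8, 30522, requires_grad=True)  # vocab-sized (toy)
teacher_logits = torch.randn(8, 30522)                       # from the frozen teacher
labels = torch.randint(0, 30522, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()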

Why this matters: For AI research to transition into AI deployment, it needs to be easy for people to deploy AI systems onto a broad range of devices with different computational characteristics. Work like ‘DistilBERT’ shows us how we might be able to waterfall from large-compute models (e.g., GPT-2, BERT) to mini-compute models (e.g., DistilBERT, and [hypothetical] DistilGPT-2), which will make it easier for more people to access AI systems like these.
   Read more: Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT (Medium).
   Get the code for the model here (Hugging Face, GitHub).

####################################################

Computers & military capacity: weather prediction:
…When computers define OODA loops…
In the military there’s a concept called an OODA loop and it drives many aspects of military strategy. OODA is short for ‘Observe, Orient, Decide, Act’, and it describes the steps that individual military units may take, all the way up to the decisions made by leaders of armies. One aspect of military conflict that falls out of this is that military organizations want to shrink or shorten their OODA loop: for instance by being able to more rapidly integrate and update observations, or to increase their ability to rapidly make decisions. 

Computers + OODA loops: Here’s one way in which militaries are trying to improve their OODA loops – more exquisite weather monitoring and analysis systems, which can help them better predict how weather patterns might influence military plans, and more rapidly adapt them. The key to these systems? More powerful supercomputers – and the US military just bought three new supercomputers, and one of them will be dedicated to “operational weather forecasting and meteorology for both the Air Force and Army. In particular, the machine will be used to run the latest high-resolution, global and regional weather models, which will be used to support weather forecasts for warfighters as well as for environmental impacts related to operations planning,” according to a write-up in The Next Platform. 

Why this matters: Supercomputers are going to have their strategic importance magnified by the arrival of increasingly capable compute-hungry AI systems, and we can expect military strategies to become more closely coupled with a military’s compute capacity over time. It’s all about the OODA loops, folks – and computers can do a lot of work here.
   Read more: US Military Buys Three Cray Supercomputers (The Next Platform).

####################################################

What do monkey brains and neural nets have in common? A lot, it turns out:
…Research suggests contemporary AI tools can approximate some of the neural circuits in a monkey brain…
Can software-based neural networks usefully approximate the (fuzzier, more complex) machinery of the organic brain? That’s a question researchers have been pondering since, well, the invention of neural nets by McCulloch and Pitts in the 1940s. But while we now understand the brain much, much better than we did in the past, we’re still using neural nets that model neurons in a highly simplistic form relative to what goes on in organic brains (e.g., in organic brains neurons ‘spike’, whereas in most AI applications neurons simply output a continuous activation value). A valuable question is whether we can still use this neural net machinery to better simulate, approximate, and (hopefully) understand the brain. 

Now, researchers from Deutsches Primatenzentrum GmbH, Stanford University, and the University of Goettingen have spent some time studying how macaque monkeys observe and grasp objects, and have developed a software simulation of this which – encouragingly – closely mirrors experimental data gathered from the monkeys themselves. “We bridge the gap between previous work in visual processing and motor control by modeling the entire processing pipeline from the visual input to muscle control of the arm and hand,” the authors write. 

The magic of an mRNN: For this work, the researchers analyzed activity in the brains of two macaque monkeys while they grasped a diverse set of 48 objects, studying the neural circuits that activated in the monkey brains as they did various things like perceive the object and send out muscle activations to grasp it. Based on their observations, they designed several neural network architectures to model this, all oriented around training what they call a modular recurrent neural network (mRNN). “We trained an mRNN with sparsely connected modules mimicking cortical areas to use visual features from Alexnet to produce the muscle kinematics required for grasping,” they explained. “The differences between individual modules in the mRNN paralleled the differences between cortical regions, suggesting that the design of the mRNN model with visual input paralleled the hierarchy observed in the brain.”
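
Here’s a toy sketch of the modular-RNN idea: two recurrent ‘areas’ linked by a sparse, masked projection, mapping visual features to muscle kinematics. The dimensions, the GRU cells, and the 10% connection density are my own illustrative choices, not the paper’s:

import torch
import torch.nn as nn

class ModularRNN(nn.Module):
    # Two recurrent 'areas' connected by a sparse linear projection, reading
    # visual features in and writing muscle kinematics out.
    def __init__(self, vis_dim=4096, hidden=100, out_dim=30, density=0.1):
        super().__init__()
        self.area1 = nn.GRU(vis_dim, hidden, batch_first=True)
        self.inter = nn.Linear(hidden, hidden)
        # Sparse inter-area connectivity: zero out most weights via a fixed mask.
        self.register_buffer("mask", (torch.rand(hidden, hidden) < density).float())
        self.area2 = nn.GRU(hidden, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, out_dim)

    def forward(self, vis_feats):                  # (batch, time, vis_dim)
        h1, _ = self.area1(vis_feats)
        h12 = torch.relu(nn.functional.linear(h1, self.inter.weight * self.mask,
                                              self.inter.bias))
        h2, _ = self.area2(h12)
        return self.readout(h2)                    # predicted muscle kinematics

feats = torch.randn(1, 50, 4096)   # e.g. AlexNet-style features over 50 timesteps
kinematics = ModularRNN()(feats)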

Why this matters: “Our results show that modeling the grasping circuit as an mRNN trained to produce muscle kinematics from visual features in a biologically plausible way well matches neural population dynamics and the difference between brain regions, and identifies a simple computational strategy by which these regions may complete this task in tandem,” they write. If further experimentation continues to show the robustness of this approach, then scientists may have a powerful new tool to use when thinking about the intersection between digital and organic intelligence. “We believe that the mRNN framework will provide an invaluable setting for hypothesis generation regarding inter-area communication, lesion studies, and computational dynamics in future neuroscience research”.
   Read more: A neural network model of flexible grasp movement generation (bioRxiv)

####################################################

DIY drones are getting really, really good:
…Daring Australian goes on a fishing expedition with a DIY drone…
Australian bureaucrats are wondering what to do about a man who used a DIY drone to go fishing. Specifically, the mysterious individual used the drone to lift a chair he was tethered in high above a reservoir in Australia, then he fished. Australia’s civil aviation safety authority (CASA) isn’t quite sure what to do about the whole situation. “This is a first for Australia, to have a large homemade drone being used to lift someone off the ground,” Peter Gibson, a CASA spokesman, told ABC News.

Why this matters: Drones are entering their consumerization phase, which means we’re going to see more and more cases of people tweaking off-the-shelf drone technology for idiosyncratic purposes – like fishing! Policymakers would be better prepared for the implications of a world containing cheap, powerful drones if they invested more resources in tracking the usage of such technologies.
   Read more: Gone fly fishing: Video of angler dangling from drone under investigation (ABC News).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

What will AGI look like? Reactions to Drexler’s service model:
AI systems taking the form of unbounded maximising agents pose some specific risks. E.g. for any objective we give an agent, it will pursue certain instrumental goals, such as avoiding being turned off. But AI today doesn’t look much like this—Siri answers questions, but doesn’t have any overarching goal, the dogged pursuit of which will lead it to acquire large amounts of computing resources. Why, then, would we create such agents, given that we aren’t doing so now, and given the associated risks?

Services or agents: Drexler argues that we should instead expect AGI to look like lots of narrow AI services. There isn’t anything a unified agent could do that an aggregate of AI services could not; such a system would come without some of the risks from agential AI; and there is a clear pathway to this model from current AI systems. Critics object that there are benefits to agential AI that will create incentives to build them, in spite of the risks. Some tasks—like running a business—might require truly general intelligence, and agential AI might be significantly cheaper to train and deploy than a suite of AI services. 

Emerging agency: Even if we grant that there will not be good incentives to building agential AGI, some problems will re-emerge. For one, markets can be irrational, so AI development may steer towards building agential AGI despite good reasons not to. What’s more, agential behaviour could emerge from collections of non-agent AIs. Corporations are aggregates of individuals doing narrow tasks, from which agential behaviour can emerge: they can ruthlessly pursue some goal, act unboundedly in the world, and behave in ways their designers did not intend. So in an AI services world, there will still be safety problems arising from agency, but these may differ from the ‘classic’ problems, and demand different solutions.

Why it matters: The AI safety problem is figuring out how to build robust and beneficial AGI in a state of uncertainty about when—and if—we will build it, and what it will look like. We need research aimed at better predicting whether AGI will look more like Drexler’s vision, the ‘classical’ picture of unified agents, or something else entirely, and we need to have a plan for ensuring things go well in either eventuality.
   Read more: Book Review – Reframing Superintelligence (Slate Star Codex).
   Read more: Why Tool AIs Want to Be Agent AIs (Gwern).

####################################################

Tech Tales:

The Instrument Generator

The instrument generator worked like this: the machine would generate a few seconds of audio and humans would vote on whether they liked or disliked the generated music. After a few thousand generations, the machine would come up with longer bits of music based on the segments that people had expressed an inclination for. These bits of music would get voted on again until an entire song had been created. Once the machine had a song, the second phase would begin – what people took to calling The Long Build. Here, the machine would work to synthesize a single, predominantly analog instrument that could create the song people had voted for. The construction process took anywhere between a week and a year, depending on how intricate and/or inhuman the song was – and therefore how intricate the generated instrument needed to be. Once the instrument was created, people would gather at their computers to tune-in to a global livestream where the instrument was unveiled in a random location somewhere on the Earth. These instruments would subsequently become tourist attractions in their own right, and a community of ‘song tourers’ formed who would travel around the world, using the generated inhuman instruments as their landmarks. In this way, AI helped humans find new ways to discover their own world, and allowed them a sense of agency when supervising the creation of new and unexpected things.

Things that inspired this story: Musical instruments; generative design; exhibitions; World’s Fair(s); the likelihood of humans and machines co-generating their futures together.