Import AI 147: Alibaba boosts Taobao performance with Transformer-based recommender system; learning how smart new language models like BERT are; and a $3,000 robot dog

by Jack Clark

Weapons of war, what are they good for?
…A bunch of things, but we need to come up with laws to regulate them…
Researchers with ASRC Federal, a company that supplies technology services to the government (with a particular emphasis on intelligence/defense) think advances in AI “will lead inevitably to a fully automated, always on [weapon] system”, and that we’ll need to build such weapons to be aware of human morals, ethics, and the fundamental unknowability of war.

In the paper, the researchers observe that: “The feedback loop between ever-increasing technical capability and the political awareness of the decreasing time window for reflective decision-making drives technical evolution towards always-on, automated, reflexive systems.” This analysis suggests that in the long-term we’re going to see increasingly automated systems being rolled out that will change the character of warfare.

More war, fewer humans: One of the effects of increasing the amount of automation deployed in warfare is to reduce the role humans play in war. “We believe that the role of humans in combat systems, barring regulation through treaty, will become more peripheral over time. As such, it is critical to ensure that our design decisions and the implementations of these designs incorporate the values that we wish to express as a national and global culture”.

Why this matters: This is a quick paper that lays out the concerns of AI+War from a community we don’t frequently hear from: people who work as direct suppliers of government technology. It’s also encouraging to see the concerns regarding the dual use of AI outlined by the researchers. “Determining how to thwart, for example, a terrorist organization turning a facial recognition model into a targeting system for exploding drones is certainly a prudent move,” they write.
  Read more: Integrating Artificial Intelligence into Weapon Systems (Arxiv).

#####################################################

Mapping the brain with the Algonauts project:
…What happens when biological and artificial intelligence researchers collaborate?…
Researchers with the Freie Universität Berlin, the Singapore University of Technology and Design, and MIT have proposed ‘The Algonauts Project’, an initiative to get biological and artificial intelligence researchers to work together to understand the brain. As part of the project, the researchers want to learn to build networks that “simulate how the brain sees and recognizes objects”, and are hosting a competition and workshop in 2019 to encourage work here. The inspiration for this competition is that, today, deep neural networks trained on object classification “are currently the model class best performing in predicting visual brain activity”.

The first challenge has two components:

  • Create machine learning models that predict activity in the early and late parts of the human visual hierarchy in the brain. Participants submit their models’ responses to a test image set, which are compared against held-out fMRI data. This part of the competition measures how well participants can model fine-grained spatial activity in the brain at a single point in time.
  • Create machine learning models that predict brain data from early and late stages of visual processing in the brain. Participants submit model responses to a test image set, which are compared against held-out magnetoencephalography (MEG) data with millisecond temporal resolution. This challenge assesses how well we can model sequences of activity in the brain over time (a toy version of this kind of model-brain comparison is sketched below).
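To make the comparison concrete, here is a minimal sketch of the representational similarity analysis (RSA) style of scoring used in challenges like this: build a dissimilarity matrix over images from a model’s features, build the same matrix from brain recordings, and correlate the two. The data shapes and the random stand-in “data” are illustrative assumptions, not the official evaluation code.

```python
# RSA-style sketch: compare a model's image representations to brain data
# via representational dissimilarity matrices (RDMs). All shapes and the
# random "data" below are placeholders, not the Algonauts datasets.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features):
    """Condensed RDM: pairwise (1 - Pearson r) between responses to each image.

    features: (n_images, n_units) array; each row is the response to one image.
    """
    return pdist(features, metric="correlation")

n_images = 100
model_features = np.random.randn(n_images, 512)    # e.g. activations from one CNN layer
brain_responses = np.random.randn(n_images, 2000)  # e.g. fMRI voxel responses per image

# Score is the rank correlation between the two RDMs: high means the model
# groups images in the same way the brain region does.
score, _ = spearmanr(rdm(model_features), rdm(brain_responses))
print(f"model-brain RDM correlation: {score:.3f}")
```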

Cautionary tale: The training datasets for the competition are small, consisting of a few hundred images paired with brain responses to those images, so participants may want to use additional data.

Future projects: Future challenges within the Algonauts project might “focus on action recognition or involve other sensory modalities such as audition or the tactile sense, or focus on other cognitive functions such as learning and memory”.

Why this matters: Cognitive science and AI seem likely to have a mutually reinforcing relationship, where progress in one domain helps the other. Competitions like those run by the Algonauts project will generate more activity at the intersection of the two fields, and hopefully push progress forward.
  Find out more about the first Algonauts challenge here (official competition website).
  Read more: The Algonauts Project: A Platform for Communication between the Sciences of Biological and Artificial Intelligence (Arxiv).

#####################################################

Alibaba improves Taobao e-commerce app with better recommendations:
…Chinese mega e-commerce company shows how a Transformer-based system can improve recommendations at scale…
Alibaba researchers have used a Transformer-based system to more efficiently recommend goods to users of Taobao, a massive Chinese e-commerce app. Their system, which they call the ‘user behavior sequence transformer’ (or “BST”), lets them take in a bunch of datapoints relating to a specific user, then predict what product to show the user next. The main technical work here is a matter of integrating a ‘Transformer’-based core into an existing predictive system used by Alibaba.
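For intuition about what such a system looks like, here is a hedged sketch of a BST-style ranker: embed the user’s recent item sequence plus the candidate item, run a Transformer encoder over the sequence, and feed the result to an MLP that outputs a click probability. Layer sizes, feature handling, and training details are illustrative guesses, not Alibaba’s production configuration.

```python
# Sketch of a BST-style click-through-rate model (PyTorch). Hyperparameters
# and the absence of side features (user/context embeddings) are simplifications.
import torch
import torch.nn as nn

class BSTRanker(nn.Module):
    def __init__(self, n_items, d_model=64, n_heads=2, seq_len=20):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, d_model)
        self.pos_emb = nn.Embedding(seq_len + 1, d_model)  # +1 slot for the candidate item
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.mlp = nn.Sequential(
            nn.Linear(d_model * (seq_len + 1), 256), nn.LeakyReLU(),
            nn.Linear(256, 1))

    def forward(self, behavior_seq, candidate_item):
        # behavior_seq: (batch, seq_len) item ids; candidate_item: (batch,) item id.
        seq = torch.cat([behavior_seq, candidate_item.unsqueeze(1)], dim=1)
        pos = torch.arange(seq.size(1), device=seq.device)
        h = self.item_emb(seq) + self.pos_emb(pos)    # item + position embeddings
        h = self.encoder(h)                           # self-attention over the sequence
        return torch.sigmoid(self.mlp(h.flatten(1)))  # predicted click probability

model = BSTRanker(n_items=10_000)
p_click = model(torch.randint(0, 10_000, (4, 20)), torch.randint(0, 10_000, (4,)))
```

Trained on click/no-click logs with a standard cross-entropy loss, a model like this replaces hand-built sequence features with attention over the user’s raw behavior history.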

Results: The researchers implemented the BST within Taobao, experimenting with using it to make millions of recommendations. In tests, the BST system led to a 7.57% improvement in online click-through rate – a commercially significant performance increase.

Why this matters: Recommender systems are one of the best examples of ‘the industrialization of AI’, and represent a litmus test for whether a particular technique works at scale. In the same way that it was a big deal a few years ago when Google and other companies started switching to deep learning-based approaches for aspects of speech and image recognition, it seems like a big deal now that relatively new AI systems, like the ‘Transformer’ component, are being integrated into at-scale, business-relevant applications. In general, it seems like the ‘time to market’ for new AI research is dramatically shorter than for other fields (including software-based ones). I think the implications of this are profound and underexplored.
  Read more: Behavior Sequence Transformer for E-commerce Recommendation in Alibaba (Arxiv).

#####################################################

Stanford researchers try to commoditize robots with ‘Doggo’:
…$3,000 quadruped robot meant to unlock robotics research…
Stanford researchers have created ‘Doggo’, a quadruped robot “that matches or exceeds common performance metrics of state-of-the-art legged robots”. So far, so normal. What makes this research a bit different is the focus on bringing down the cost of the robot – the researchers say people can build their own Doggo for less than $3,000. This is part of a broader trend of academics trying to create low-cost robotics systems, and follows Berkeley releasing its ‘BLUE’ robotic arm (Import AI 142) and Indian researchers developing a ~$1,000 quadruped named ‘Stoch’ (Import AI 128).

Doggo is a four-legged robot that can run, jump, and trot around the world. It can even – and, to be clear, I’m not making this up – use a “pronking” gait to get around. Pronking!

Cheap drives: Like most robots, the main costs inherent to Doggo lie in the things it uses to move around. In this case, that’s a quasi-direct drive (QDD), a type of drive that “increases torque output at the expense of control bandwidth, but maintains the ability to backdrive the motor which allows sensing of external forces based on motor current,” they write.
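The force-sensing trick is simple enough to show in a few lines: because the drive is backdrivable and the gear ratio is low, external torque at a joint can be estimated directly from measured motor current. The constants below are made-up illustrative values, not Doggo’s actual motor parameters.

```python
# Toy sketch of current-based force sensing in a quasi-direct drive.
# Both constants are assumed values for illustration only.
TORQUE_CONSTANT_NM_PER_A = 0.02  # motor torque constant Kt (assumed)
GEAR_RATIO = 3.0                 # low ratio keeps the joint backdrivable (assumed)

def joint_torque_from_current(motor_current_a: float) -> float:
    """Estimate torque at the joint output from measured motor current."""
    return motor_current_a * TORQUE_CONSTANT_NM_PER_A * GEAR_RATIO

# A current spike while the leg is commanded to hold still suggests an
# external force, e.g. the foot contacting the ground.
print(joint_torque_from_current(12.0))  # ~0.72 Nm with these assumed constants
```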

Dual-use: Right now, robots like Doggo are pretty benign – we’re at the very beginning of the creation of widely-available quadruped platforms for AI research, and my expectation is any hardware platform at this stage has a bunch of junky flaws that make most of them semi-reliable. But in a few years, once the hardware and software have matured, it seems likely that robots like this will be deployed more widely for a bunch of uses not predicted by their creators or today’s researchers, just as we’ve seen with drones. I wonder how we can better anticipate these kinds of risks, and what things we could measure to assess them: eg, cost to build is one metric; others could be the expertise needed to build the robot, and its ease of customization.

Why this matters: Robotics is on the cusp of a revolution, as techniques developed by the deep learning research community become increasingly tractable to run and deploy on robotic platforms. One of the things standing in the way of widespread robot deployment seems to me to be insufficient access to cheap robots for scientists to experiment on (eg, if you’re a machine learning researcher, it’s basically free in terms of time and cost to experiment on CIFAR-10 or even ImageNet, but you can’t trivially prototype algorithms against real robots, only simulated ones). Therefore, systems like Doggo seem to have a good chance of broadening access to this technology. Now, we just need to figure out the dual use challenges, and how we approach those in the future.
  Read more: Stanford Doggo: An Open-Source, Quasi-Direct-Drive Quadruped (Arxiv).

#####################################################

Big neural networks re-invent the work of a whole academic field:
…Emergent sophistication of language models shows surprising parallels to classical NLP pipelines…
As neural networks get ever-larger, the techniques people use to analyze them look more and more like those found in analyzing biological life. The latest? New research from Google and Brown University seeks to probe a large BERT model by analyzing layer activations in the network in response to a particular input. The new research speaks to the finicky, empirical experimentation required when trying to analyze the structures of trained networks, and highlights how sophisticated some AI components are becoming.

BERT vs NLP: Google’s ‘BERT’ is a Transformer-based neural network architecture that came out last year and has since, along with Fast.ai’s ULMFiT and OpenAI’s GPT-1 and GPT-2, defined a new direction in NLP research, as people throw out precisely constructed pipelines and systems for more generic, semi-supervised approaches like BERT. The result has been the proliferation of a multitude of language models that obtain state-of-the-art scores on collections of hard NLP tasks (eg: GLUE), along with systems capable of coherent text generation (eg: GPT-2).

Performance is nice, but explainability is better: In tests, the researchers find that, much like trained vision networks, the lower layers in a trained BERT model appear to perform more basic language tasks, while higher layers do more sophisticated things. “We observe a consistent trend across both of our metrics, with the tasks encoded in a natural progression: POS tags processed earliest, followed by constituents, dependencies, semantic roles, and coreference,” they write.
“That is, it appears that basic syntactic information appears earlier in the network, while high-level semantic information appears at higher layers.” The network isn’t dependent on the precise ordering of these tasks, though: “on individual examples the network can resolve out-of-order, using high-level information like predicate argument relations to help disambiguate low-level decisions like part-of-speech.”
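A stripped-down version of this kind of layer-wise probing, assuming the Hugging Face transformers library: pull hidden states from every BERT layer and fit a linear probe per layer to see at which depth a word-level task becomes decodable. The two toy sentences and binary noun/other labels below are placeholders for the paper’s much larger probing suite.

```python
# Layer-wise linear probing sketch: which BERT layer makes a toy
# word-level label linearly decodable? Data here is illustrative only.
import torch
from transformers import BertModel, BertTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentences = ["the dog chased the ball", "a cat sat on the mat"]
labels = [[0, 1, 0, 0, 1], [0, 1, 0, 0, 0, 1]]  # toy stand-in for POS: 1 = noun

with torch.no_grad():
    for layer in range(1, 13):  # layers 1..12; index 0 is the embedding layer
        feats, targets = [], []
        for sent, labs in zip(sentences, labels):
            enc = tokenizer(sent, return_tensors="pt")
            hidden = model(**enc).hidden_states[layer][0]  # (seq_len, 768)
            # Drop [CLS]/[SEP]; these toy sentences are one wordpiece per word.
            for vec, lab in zip(hidden[1:-1], labs):
                feats.append(vec.numpy())
                targets.append(lab)
        probe = LogisticRegression(max_iter=1000).fit(feats, targets)
        print(f"layer {layer:2d} probe accuracy: {probe.score(feats, targets):.2f}")
```

With real POS/constituency/coreference labels and held-out data, plotting probe accuracy by layer is one way to see the ‘pipeline’ ordering the authors describe.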

Why this matters: Research like this gives us a sense of how sophisticated large generative models are becoming, and indicates that we’ll need to invest in creating new analysis techniques to be able to easily probe the capabilities of ever-larger and more sophisticated systems. I can envisage a future where scientists have a kind of ‘big empiricism toolbox’ they turn to when analyzing networks, and we’ll also develop shared ‘evaluation methodologies’ for probing a bunch of different cognitive capabilities in such systems.
  Read more: BERT Rediscovers the Classical NLP Pipeline (Arxiv).

#####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net

San Francisco places moratorium on face recognition software:
San Francisco has become the first city in the US to (selectively and partially) ban the use of face recognition software by law enforcement agencies, until legally enforceable safeguards are put in place to protect civil liberties. The bill states that the technology’s purported benefits are currently outweighed by its potential negative impact on rights and racial justice. Before these products can be deployed in the future, the ordinance requires meaningful public input, and that the public and local government be empowered to oversee their use.

Why it matters: Face recognition technology is increasingly being deployed by law enforcement agencies worldwide, despite persistent concerns about harms. This ordinance seems sensible: it is not an indefinite ban, but rather sets out clear requirements that must be met before the technology can be rolled out, most notably in terms of accountability and oversight.
  Read more: Ordinance on surveillance technology (SFGov).
  Read more: San Francisco’s facial recognition technology ban, explained (Vox).


Market-based regulation for safe AI:
This paper (note from Jack – by Gillian Hadfield (Vector Institute / OpenAI) and Jack Clark (OpenAI / Import AI)) presents a model for market-based AI safety regulation. As AI systems become more advanced, it will become increasingly important to ensure that they are safe and robust. Public sector models of regulation could turn out to be ill-equipped to deal with these issues, due to a lack of resources and expertise, and slow reaction times. Likewise, a self-regulation model has pitfalls in terms of legitimacy and efficacy.

Regulatory markets: Regulatory markets are one model for addressing these problems: governments could create a market for private regulators, by compelling companies to pay for oversight by at least one regulator. Governments can then grant licenses to regulators, requiring them to achieve certain objectives, e.g. regulators of self-driving cars could be required to meet a target accident rate.

What are they good for: Ensuring that advanced AI is safe and robust will eventually require powerful regulatory systems. Harnessing market forces could be a promising way for governments to meet this challenge, by directing more talent and resources into the regulatory sector. Regulatory markets will have to be well designed and maintained to ensure they remain competitive and independent from the entities they regulate.
  Read more: Regulatory markets for AI safety (Clark & Hadfield).
  Note from Jack: This is also covered in the AI Alignment newsletter #55 (AI Alignment newsletter Mailchimp).


#####################################################

Tech Tales:
Battle of The Test Cases

Go in the room. Read the message. Score out of 10. Do this three times and you’re done.

Those were the instructions I got outside. Seemed simple. So I went in and there was a table and I sat at it and three cameras on the other side of the room looked at me, scanning my body, moving from side to side and up and down. Then the screen at the other end of the room turned on and a message appeared. Now I’m reading it.

Dear John,
We’re sorry for the pain you’ve been through recently – grief can be a draining, terrible thing to experience.

We know that you’re struggling to remember what they sounded like – that’s common. We know you’re not eating enough, and that your sleep is terrible – that’s also common. We are sorry.

We’re sorry you experienced it. We’re sorry you’re so young. If there’s anything we can do to help, it is to recommend one thing: “Regulition”, the new nutrient-dense meal system designed for people who need to eat so that they can remember.

Remember, eat and sleep regularly, and know that with time it’ll get better.

What the fuck, I say. The screen flicks off, then comes back with new text: Please rate the message you received out of 10, with 10 being “more likely to purchase the discussed product” and 0 being less likely to purchase it. I punch in 6 and say fuck you again for good measure. Oh well, $30. Next message.

Dear John,
Don’t you want to run away, sometimes? Just take your shoes and pack a light bag then head out. Don’t tell your landlord. Don’t tell your family. Board a train or a bus or a plane. Change it up. We’ve all wanted to do this.

The difference is: you can. You can go wherever you want to go. You can get up and grab your shoes and a light bag, open your door, and head out. You don’t need to tell anyone.

Just make sure you bring some food so you don’t get slowed down on the road. Why not a grab and go meal like “Regulition”? Something as fast and flexible as you? Chug it down and take off.

Ok, I say. Nice. I punch in 8.  

Dear John,
Maybe you’re not such a fuckup. Maybe the fact you’ve been bouncing from job to job and from online persona to online persona means you’re exploring, and soon you’re going to discover the one thing in this world you can do better than anybody else.

You haven’t been lost, John – you’ve been searching, and now that you know you’re searching, know this: one day the whole world is going to come into focus and you’ll understand how you need to line things up to succeed.

To do this, you’ll need your wits about you – so why not pick up some “Regulition” so you can focus on the search, rather than pointless activities like cooking your own food? Maybe it’s worth a few bucks a month to give yourself more time – after all, that might be all that’s standing between you and the end of your lifelong search.

Ok, I say. Ok. My hand hovers above the numbers.
What happens if I don’t press it? I say.
There’s a brief hiss of feedback as the speaker in the room turns on, then a voice says: “Payment only occurs upon full task completion, partial tasks will not be compensated.”
Ok, I say. Sure.

I think for a moment about the future: about waking up to emails that seem to know you, and messages from AI systems that reach into you and jumble up your insides. I see endless, convincing things, lining up in front of me, trying to persuade me to believe or want or need something. I see it all.

And because there’s probably a few thousand people like me, here, in this room, I press the button. Give it a 7.

Then it’s over: I sign out at an office where I am paid $30 and given a free case of Regulition and a coupon for cost-savings on your first yearly subscription. I take the transit home. Then I sit in the dark and think, my feet propped up on the case of “food” in front of me.

Things that inspired this story: Better language models for targeted synthetic text generation; e-marketing; Soylent and other nutrient-drinks; copywriting; user-testing; the inevitability of human A/B/C testing at scale; reinforcement learning; learning from human preferences.