Import AI: Issue 60: The no good, very bad world of AI & copyright, why chatbots need ensembles of systems, and Amazon adds robot arms to its repertoire

by Jack Clark

Welcome to Import AI, subscribe here.

AI education organization moves from TensorFlow&Keras to PyTorch, following 1,000 hours of evaluation:
…, an education organization that teaches people technical skills, like learning to program deep learning systems, via practical projects, will write all of its new courses in PyTorch, an AI programming framework developed by Facebook.
…They switched over from TF&Keras for a couple of reasons, including PyTorch’s accessibility as a programming language, expressiveness, and native support.
…”The focus of our second course is to allow students to be able to read and implement recent research papers. This is important because the range of deep learning applications studied so far has been extremely limited, in a few areas that the academic community happens to be interested in. Therefore, solving many real-world problems with deep learning requires an understanding of the underlying techniques in depth, and the ability to implement customised versions of them appropriate for your particular problem, and data. Because Pytorch allowed us, and our students, to use all of the flexibility and capability of regular python code to build and train neural networks, we were able to tackle a much wider range of problems,” they write.
…Read more here: Introducing Pytorch for

AI and Fair Use: The No Good, Very Bad, Possibly Ruinous, and/or Potentially Not-So-Bad World of AI & Copyright…
…You know your field is established when the legal scholars arrive…
…Data. It’s everywhere. Everyone uses it. Where does it come from? The fewer questions asked the better. That’s the essential problem facing modern AI practitioners: there are a few open source datasets that are kosher to use, then there’s a huge set of data that people use to train models which they may not have copyright permissions for. That’s why most startups and companies say astonishingly little about where they get their data (either it is generated by a strategic asset, or it may be of.. nebulous legal status). As AI/ML grows in economic impact, it’s fairly likely that this mass-scale usage of other people’s data could run directly into fair use laws as they relate to copyright.
…In a lengthy study author Benjamin Sobel, with Harvard’s Berkman Center, tries to analyze where AI intersects with Fair Use, and what that means for copyright and IP rights to synthetic creations.
…We already have a few at-scale systems that have been trained with a mixture of data sources, primarily user generated. Google, for instance, trained its ‘Smart Reply’ email-reply text generator on its corpus of hundreds of millions of emails, which is probably fine from a legal POV, but the fact it then augmented this language model with data gleaned from thousands of Romance novels is less legally clear, because it seemed to use the Romance novels explicitly because they have a regular, repetitive writing style, which helps it inject more emotion into its relatively un-nuanced emails, so to some extent it was targeting a specific creative product from the authors of the dataset. Similarly, Jukedeck, a startup, lets people create their own synthetic music via AI and even have the option to “Buy the Copyright” of the resulting track – even though it’s not clear what data Jukedeck has used and whether it’s even able to sell the Copyright to a user.
How does this get resolved? Two possible worlds. One is a legal ruling that usage of an individual’s data in AI/ML models isn’t fair use, and one is a world where the law goes the other way. Both worlds have problems.
World One: the generators of data used in datasets can now go after ML developers, and can claim statutory damages of at least $750 per infringed work (and up). When you consider that ML models typically involve millions to hundreds of millions of datapoints, a single unfavorable ruling re a group of users litigating fair use on a dataset, could ruin a company. This would potentially slow development of AI and ML.
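To make the scale of World One concrete, here is a back-of-the-envelope sketch. It assumes only the $750-per-work floor and the dataset sizes mentioned above; the function name is hypothetical and real damages would depend on the court and the number of works actually litigated.

```python
# Rough lower bound on statutory-damages exposure, using the $750 floor
# quoted above. Illustrative arithmetic only, not legal advice.
STATUTORY_MINIMUM_USD = 750  # per infringed work


def minimum_exposure(num_works: int) -> int:
    """Return the minimum total damages if every work is found infringed."""
    return num_works * STATUTORY_MINIMUM_USD


# Dataset sizes taken from the "millions to hundreds of millions" range.
for num_works in (1_000_000, 100_000_000):
    print(f"{num_works:,} works -> at least ${minimum_exposure(num_works):,}")
```

At the low end that is $750 million, and at the high end $75 billion, which is why a single unfavorable ruling could be ruinous.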
World Two: a landmark legal ruling recognizes AI/ML applications as being broadly fair use. What happens then is a free-for-all as the private sector hoovers up as much data (public and private) as possible, trying to train new models for economic gain. But no one gets paid and inequality continues to increase as a consequence of these ever-expanding ML-data moats being built by the companies, made possible by the legal ruling.
Neither world seems sensible: Alternative paths could include legally compelling companies to analyze what portions of their business benefit directly as a consequence of usage of AI/ML, then taxing those portions of the business to feed into author/artists funds to disperse funding to the creators of data. Another is to do a ground-up rethink of copyright law for the AI age, though the author does note this is a ‘moonshot’ idea.
…”The numerous challenges AI poses for the fair use doctrine are not, in themselves, reasons to despair. Machine learning will realize immense social and financial benefits. Its potency derives in large part from the creative work of real human beings. The fair use crisis is a crisis precisely because copyright’s exclusive rights may now afford these human beings leverage that they otherwise would lack. The fair use dilemma is a genuine dilemma, but it offers an opportunity to promote social equity by reasserting the purpose of copyright law: to foster the creation and dissemination of human expression by securing, to authors, the rights to the expressive value in their works,” he writes.
…Read more here: Artificial Intelligence’s Fair Use Crisis.

Open source: Training self-driving trucking AIs in Eurotruck Simulator:
…The new open source ‘Europilot’ project lets you re-purpose the gleefully technically specific game Eurotruck Simulator as a simulation environment for training agents to drive via reinforcement learning.
Train/Test: Europilot offers a couple of extra features to ease training and testing AIs on it, including being able to automatically output a numpy array from screen input at training time, and at test time creating a visible virtual onscreen joystick the network can use to control the vehicle.
Get the code here: Europilot (GitHub.)
Dream experiment: Can someone train a really large model over many tens of thousands of games then try to use domain randomization to create a policy that can generalize to the real world – at least for classification initially, then perhaps eventually movement as well?

Self-navigating, self-flying drone built with deep reinforcement learning:
…UK researchers have used a variety of deep Q-network (DQN) family algorithms to create a semi-autonomous quadcopter that can learn to navigate to a landmark and land on it, in simulation.
…The scientists use two networks to let their drones achieve their set goals: one network for landmark spotting, and another for vertical descent. The drone learns in a semi-supervised manner, figuring out how to use low-resolution pixel visual inputs to guide itself. The two distinct networks are daisy-chained together via special action triggers, so when the landmark-spotting network detects the landmark is directly beneath the drone, it hands off to the vertical descent network to land the machine. (It would be interesting to test this system on the reverse set of actions and see if its network generalizes, figuring out how to instead have the ‘land-in-view’ network hand off to the ‘fly to’ network, and make some tweaks to perhaps get the ‘fly to’ network to become ‘fly away’.)
Results: The dual-DQN-network system achieved marginally better scores than a human when trying to pilot drones to landmarks and land them, and attained far higher scores than a system consisting of one network trained in an end-to-end manner.
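The hand-off between the two networks can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the action names, the `HANDOFF` trigger, the frame layout and the two toy policies are all hypothetical stand-ins for the trained networks.

```python
from typing import Callable, Sequence

Frame = Sequence[float]           # stand-in for a low-resolution camera image
Policy = Callable[[Frame], str]   # maps a frame to an action name

HANDOFF = "trigger"  # hypothetical name for the special hand-off action


class TwoStageController:
    """Runs the landmark-spotting policy until it fires the trigger,
    then lets the descent policy control the drone from then on."""

    def __init__(self, spot_policy: Policy, descend_policy: Policy) -> None:
        self.spot_policy = spot_policy
        self.descend_policy = descend_policy
        self.stage = "spot"

    def act(self, frame: Frame) -> str:
        if self.stage == "spot":
            action = self.spot_policy(frame)
            if action != HANDOFF:
                return action
            self.stage = "descend"  # landmark is beneath us: switch networks
        return self.descend_policy(frame)


# Toy policies; here a frame is (horizontal offset, altitude).
def spot(frame: Frame) -> str:
    return HANDOFF if frame[0] == 0.0 else "forward"


def descend(frame: Frame) -> str:
    return "land" if frame[1] <= 0.0 else "down"
```

In use, `TwoStageController(spot, descend)` returns "forward" until the offset reaches zero, then "down" and finally "land".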
Components used: Double DQN, a tweaked version of prioritized experience replay called ‘partitioned buffer replay’, a (simulated) Parrot AR Drone 2.
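For readers unfamiliar with Double DQN, the sketch below shows the standard target computation: the online network picks the next action and the target network scores it. The array shapes, argument names and default discount are assumptions for illustration, and the paper's 'partitioned buffer replay' is not shown.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN regression targets for a batch of transitions.

    rewards, dones: shape (batch,); dones is 1.0 where the episode ended.
    q_online_next, q_target_next: shape (batch, num_actions), the two
    networks' Q-values for the next state.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # Action selection uses the online network...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...but the value of that action comes from the target network.
    next_values = q_target_next[np.arange(len(best_actions)), best_actions]
    # No bootstrapping from terminal states.
    return rewards + gamma * (1.0 - dones) * next_values
```

Separating selection from evaluation is what distinguishes this from the original DQN target, which takes a max over the target network alone and tends to overestimate values.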
…This is interesting research with a cool result but until I see stuff like this running on a physical drone I’ll be somewhat skeptical of the results – reality is hard and tends to introduce some unanticipated noise and/or disruptive element that the algorithm’s training process hasn’t accounted for and struggles to generalize to.
Read more here: Autonomous Quadcopter Landing using Deep Reinforcement Learning.

Facebook spins up AI lab in Montreal…
….Facebook AI Research is opening up its fourth lab worldwide. The new lab in Montreal (one of Canada/the world’s key hubs for deep learning and reinforcement learning) will sit alongside existing FAIR labs in Menlo Park, New York City, and Paris.
…The lab will be led by McGill University professor Joelle Pineau, who will work with several other scientists. In a speech Yann LeCun said most of FAIR’s other labs are between 30 and 50 people and he expects Montreal to grow to this number as well.
…Notable: Canadian PM Justin Trudeau gave a speech showing significant political support for AI. In a chat with Facebook execs he said he had gotten an A+ in a C++ class in college that required him to write a raytracer.
…Read more here: Expanding Facebook AI Research to Montreal.

Simulating populations of thousands to millions of simple proto-organic agents with reinforcement learning:
Raising the question: Who will be the first AI Zoologist, tasked with studying and cataloging the proclivities of synthetic, emergent creatures?…
…Researchers with University College London and Shanghai Jiao Tong University have carried out a large scale (up to a million entities) simulation of agents trained via reinforcement learning. They set their agents in a relatively simple grid world consisting of predators and prey, and the setting of the world led to agents that collaborate with one another gaining higher rewards over time. The result is that many of the species ratios (how many predators versus prey are alive at any one time) end up mapping fairly closely to what happens in real life, with the simulated world displaying the characteristics predicted by Lotka-Volterra dynamics equations used to explain phenomena in the natural world. This overlap is encouraging as it suggests systems like the above, when sufficiently scaled up, could let us simulate dynamic problems where more of the behaviors emerge through learning rather than programming.
A puzzle: The ultimate trick will be coming up with laws that map the impermeable synthetic creatures and their worlds to the real worlds as well, letting us analyze the difference between simulations and reality, I reckon. Having systems that can anticipate the ‘reality gap’ of AI algorithms versus reality would greatly enhance our understanding of the interplay of these distinct systems.
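For reference, the classic Lotka-Volterra model is a pair of equations with fixed interaction terms: prey grow at rate alpha and are eaten at rate beta, while predators grow by eating prey at rate delta and die at rate gamma. The sketch below integrates them with a simple Euler step. The parameter values and the function name are illustrative assumptions, not taken from the paper.

```python
def simulate_lotka_volterra(prey, predators, alpha=1.0, beta=0.1,
                            delta=0.075, gamma=1.5, dt=0.001, steps=20000):
    """Euler-integrate the Lotka-Volterra equations:

        d(prey)/dt      = alpha * prey - beta * prey * predators
        d(predators)/dt = delta * prey * predators - gamma * predators

    Returns the list of (prey, predators) populations at every step.
    """
    history = [(prey, predators)]
    for _ in range(steps):
        d_prey = (alpha * prey - beta * prey * predators) * dt
        d_pred = (delta * prey * predators - gamma * predators) * dt
        prey += d_prey
        predators += d_pred
        history.append((prey, predators))
    return history


# Start away from equilibrium so the populations cycle.
trajectory = simulate_lotka_volterra(prey=10.0, predators=5.0)
print("peak prey:", max(p for p, _ in trajectory))
print("peak predators:", max(q for _, q in trajectory))
```

The populations chase each other in cycles around the equilibrium point (gamma/delta prey, alpha/beta predators), and it is this kind of oscillating ratio that the learned agents are reported to reproduce.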
…”Even though the Lotka-Volterra models are based on a set of equations with fixed interaction terms, while our findings depend on intelligent agents driven by consistent learning process, the generalization of the resulting dynamics onto an AI population still leads us to imagine a general law that could unify the artificially created agents with the population we have studied in the natural sciences for long time,” they write.
…Read more here: An Empirical Study of AI Population Dynamics with Million-agent Reinforcement Learning.

Learning the art of conversation with reinforcement learning:
…Researchers from the Montreal Institute of Learning Algorithms (MILA) (including AI pioneer Yoshua Bengio) have published a research paper outlining ‘MILABOT’, their entry into Amazon’s ‘Alexa Prize’, a competition meant to stimulate activity in conversational agents.
…Since MILABOT is intended to be deployed into the most hostile environment any AI can face – open-ended conversational interactions with people with unbounded interests – it’s worth studying the system to get an idea of the needs of applied AI work, as opposed to pure research.
…The secret to MILABOT’s success (it was a semi-finalist, and managed to score reasonably highly in terms of user satisfaction, while also carrying out some of the longest conversations of the competition) appears to be the use of lots of different models, ensembled together. It then uses reinforcement learning to figure out during training how to select between different models to create better conversations.
Models used: 22(!), ranging from reasonably well understood ones (AliceBot, ElizaBot, InitiatorBot), to ones built using neural network technologies (eg, LSTMClassifierMSMarco, GRU Question Generator).
Components used: Over 200,000 labels generated via Mechanical Turk, 32 dedicated Tesla K80 GPUs.
What this means: To me this indicates that full-fledged open domain assistants are still a few (single digit) years away from being broad and un-brittle, but it does suggest that we’re entering an era in which we can fruitfully try to build these integrated, heavily learned systems. I also like the Franken-Architecture used by the researchers where they ensemble together many distinct systems, some of which are supervised or structured and some of which are learned.
Auspicious: In the paper the researchers note “Further, the system will continue to improve in perpetuity with additional data.” – this is not an exaggeration, it’s just how systems work that are able to iteratively learn over data, endlessly re-calibrating and enhancing their ability to distinguish between subtle things.
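The ensemble-plus-selection idea can be sketched in a few lines. Everything here is a hypothetical stand-in: the two toy response models, the `score` function (which in MILABOT's case would be a policy learned with reinforcement learning, not a hand-written heuristic) and all names are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

ResponseModel = Callable[[List[str]], str]
Scorer = Callable[[List[str], str], float]


def select_response(history: List[str],
                    models: Dict[str, ResponseModel],
                    score: Scorer) -> Tuple[str, str]:
    """Ask every model for a candidate reply, then return the
    (model name, reply) pair that the scorer rates highest."""
    candidates = {name: model(history) for name, model in models.items()}
    best = max(candidates, key=lambda name: score(history, candidates[name]))
    return best, candidates[best]


# Toy ensemble: one model parrots the user, one asks a follow-up question.
models = {
    "echo": lambda history: history[-1],
    "asker": lambda history: "Why do you say that?",
}


# Stand-in scorer that simply prefers questions.
def prefers_questions(history: List[str], reply: str) -> float:
    return 1.0 if reply.endswith("?") else 0.0


print(select_response(["I like robots."], models, prefers_questions))
```

The appeal of this structure is that new response models can be added to the dictionary without touching the selection logic, and only the scorer needs retraining.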
…Read more: A Deep Reinforcement Learning Chatbot.

Amazon’s robot empire grows with mechanical arms:
…Amazon has started deploying mechanical arms in its warehouses to help stack and place pallets of goods. The arms are made by an outside company.
…That’s part of a larger push by Amazon to add even more robots into its warehouse. Today, the company has over 100,000 of them, it says. Its Kiva system population alone has grown from 15,000 in 2014 to 30,000 in 2015 to 45,000 by Christmas of 2016.
…The story positions these robots as being additive for jobs, with new workers moving onto new roles, some of which include training or tending their robot replacements. That’s a cute narrative, but it doesn’t help much with the story of the wider economy, in which an ever smaller number of mega firms (like Amazon) out-compete and out-automate their rivals. Amazon’s workers may be fine working alongside robots, but I’d hazard a guess the company is destroying far more traditional jobs in the aggregate by virtue of its (much deserved) success.
…Read more here: As Amazon Pushes Forward with Robots, Workers Find New Roles.

OpenAI bits&pieces:

Learning to model other minds with LOLA:
…New research from OpenAI and the University of Oxford shows how to train agents in a way where they learn to account for the actions of others. This represents an (incredibly early, tested only in small-scale toy environments) step towards creating agents that model other minds as part of their learning process.
…Read more here: Learning to model other minds.

Tech Tales:

[2029: A government bunker, buried inside a mountain, somewhere hot and dry and high altitude in the United States of America. Lots of vending machines, many robots, thousands of computers, and a small group of human overseers.]

TIME: 0800.

Unaffiliated Systems Unknown Reactive Payload, or USURP, are a class of offensive, semi-autonomous cyber weapons created several years ago to semi-autonomously carry out large-scale area denial attacks in the digital theater. They are broad, un-targeted weapons designed as strategic deterrents, developed to fully take down infrastructure in targeted regions.

Each USURP carries a payload of between 10 and 100 zero day vulnerabilities classified at ‘kinetic-cyber’ or higher, along with automated attack and defense sub-processes trained via reinforcement learning. USURPs are designed so that the threat of their usage is sufficient to alter the actions of other actors – we have never taken credit for them but we’ve never denied them and suspect low-level leaks mean our adversaries are aware of them. We have never activated one.

In directive 347-2 we were tasked a week ago to deploy the codes to all USURPs deployed in the field so as to make various operational tweaks to them. We were able to contact all systems but one. The specific weapon in question is USURP742, a ‘NIGHTSHADE’ class device. We deployed USURP742 into REDACTED country REDACTED years ago. Its goal was to make its way into the central grid infrastructure of the nation, then deploy its payloads in the event of a conflict. Since deploying USURP742 the diplomatic situation with REDACTED has degraded further, so 742 remained active.

USURPs are designed to proactively shift the infrastructure they run on, so they perform low-level hacking attacks to spread into other data centers, regularly switching locations to frustrate detection and isolation processes. USURP742 was present in REDACTED locations in REDACTED at the time of Hurricane Marvyn (See report CLIMATE_SHOCKS appendix ‘EXTREME WEATHER’ entry ‘HM: 2029’). After Marvyn struck we remotely disabled USURP742’s copies in the region, but we weren’t able to reach one of them – USURP742-A. The weapon in question was cut off from the public internet due to a series of tree-falls and mudslides as a consequence of HM. During reconstruction efforts REDACTED militarized the data center USURP742-A resided in and turned it into a weapons development lab, cut off from other infrastructure.

0100: Received intelligence that fiber installation trucks had been deployed to the nearby area.
0232: Transport units associated with digital-intelligence agency REDACTED pull into the parking lot of the data center. REDACTED people get out and enter data center, equipped with Cat5 diagnostic servers running REDACTED.
0335: On-the-ground asset visually verifies team from REDACTED is attaching new equipment to servers in data center.
0730: Connection established between data center and public internet.
0731: Lights go out in the datacenter.
0732: Acquisition of digital identifier for USURP742-A. Attempted remote shut down failed.
0733: Detected rapid cycling of fans within the data center and power surges.
0736: Smoke sighted.
0738: Deployment of gas-based fire suppression system in data center.
0742: Detected USURP transmission to another data center. Unresponsive to hailing signals. 40% confident system has autonomously incorporated new viruses developed by REDACTED at the site into its programming, likely from Cat5 server running REDACTED.
0743: Cyber response teams from REDACTED notified of possible rogue USURP activation.
0745: Assemble a response portfolio for consideration by REDACTED ranging from cyber to physical kinetic.
0748: Commence shutdown of local ISPs in collaboration with ISPs REDACTED, REDACTED, REDACTED.

TIME: 0900.
INCIDENT STATUS: Active. Broadening.

0820: Detected shutdown of power stations REDACTED, REDACTED, and REDACTED. Also detected multiple hacking attacks on electronic health record systems.
0822: Further cyber assets are deployed.
0823: Connections severed at locations REDACTED in a distributed cyber perimeter around affected sites.
0824: Multiple DDOS attacks begin emanating from USURP-linked areas.
0825: Contingencies CLASSIFIED activated.
0826: Submarines #REDACTED, #REDACTED, #REDACTED arrive at inter-continental internet cables at REDACTED.
0827: Command given. Continent REDACTED isolated.
0830: Response team formed for amelioration of massive loss of electronic infrastructure in REDACTED region.