Import AI #99: Using AI to generate phishing URLs, evidence for how AI is influencing the economy, and using curiosity for self-imitation learning.

by Jack Clark

Auto-generating phishing URLs via AI components:
…AI is an omni-use technology, so the same techniques used to spot phishing URLs can also be used to generate phishing URLs…
Researchers with the Cyber Threat Analytics division of Cyxtera Technologies have written an analysis of how people might “use AI algorithms to bypass AI phishing detection systems” by creating their own system called DeepPhish.
  DeepPhish: DeepPhish works by taking in a list of fraudulent URLs that have worked successfully in the past, encoding these as a one-hot representation, then training a model to generate new synthetic URLs given a seed sentence. They found that DeepPhish could dramatically improve the chances of a fraudulent URL getting past automated phishing-detection systems, with DeepPhish URLs seeing a boost in effectiveness from 0.69% (no DeepPhish) to 20.90% (with DeepPhish).
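  What the encoding step might look like: here is a minimal sketch of character-level one-hot encoding for a URL. The alphabet, the helper names `one_hot_url` and `decode`, and the example URL are all illustrative assumptions rather than details taken from the DeepPhish paper, and the generator model itself is not reproduced here.

```python
import string

# Hypothetical character vocabulary; the paper's actual alphabet may differ.
ALPHABET = string.ascii_lowercase + string.digits + "-._~:/?#[]@!$&'()*+,;=%"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_url(url):
    """Encode a URL as a list of one-hot vectors, one per character."""
    vectors = []
    for ch in url.lower():
        if ch not in CHAR_TO_INDEX:
            raise ValueError(f"character {ch!r} is not in the vocabulary")
        vec = [0] * len(ALPHABET)
        vec[CHAR_TO_INDEX[ch]] = 1
        vectors.append(vec)
    return vectors


def decode(vectors):
    """Invert one_hot_url by reading off the position of each 1."""
    return "".join(ALPHABET[vec.index(1)] for vec in vectors)


encoded = one_hot_url("http://example.com/login")
print(len(encoded), len(encoded[0]))
print(decode(encoded))
```

  Each character becomes a vector with a single 1, so a sequence model can be trained to predict the next character's vector from the ones before it.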
  Security people always have the best names: DeepPhish isn’t the only AI “weapon” system recently developed by researchers, the authors note; other tools include Honey-Phish, SNAP_R, and Deep DGA.
  Why it matters: This research highlights how AI is an inherently omni-use technology, where the same basic components used to, for instance, train systems to spot potentially fraudulent URLs can also be used to generate plausible-seeming fraudulent URLs.
  Read more: DeepPhish: Simulating Malicious AI (PDF).

Curious about the future of reinforcement learning? Apply more curiosity!
…Self-Imitation Learning, aka: That was good, let’s try that again…
Self-Imitation Learning (SIL) works by having the agent exploit its replay buffer: it learns to repeat its own prior actions if they have generated reasonable returns previously and, crucially, only when those actions delivered larger returns than were expected. The authors combine SIL with Advantage Actor-Critic (A2C) and test the algorithm out on a variety of hard tasks, including the notoriously tough Atari exploration game Montezuma’s Revenge. They also report scores for games like Gravitar, Freeway, PrivateEye, Hero, and Frostbite: all areas where A2C+SIL beats A3C+ baselines. Overall, A2C+SIL gets a median score across all of Atari of 138.7%, compared to 96.1% for A2C.
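  The "only when returns beat expectations" rule can be sketched as a clipped advantage. The snippet below is a rough illustration in plain Python, assuming the loss form described in the SIL paper (a policy term weighted by max(R - V, 0), plus a value term that is half the square of the same quantity). The function name and the sample numbers are hypothetical, and the sketch leaves out the replay buffer and the A2C update entirely.

```python
def sil_losses(log_prob, value, ret):
    """Per-sample self-imitation losses for one stored transition.

    log_prob: log pi(a|s) of the stored action under the current policy
    value:    current value estimate V(s)
    ret:      discounted return R actually observed from that state
    """
    # Clipped advantage: zero unless the past return beat the estimate.
    advantage = max(ret - value, 0.0)
    policy_loss = -log_prob * advantage  # raise probability of good actions
    value_loss = 0.5 * advantage ** 2    # pull V(s) up towards the return
    return policy_loss, value_loss


# Toy numbers, made up for illustration.
better_than_expected = sil_losses(log_prob=-1.2, value=3.0, ret=5.0)
worse_than_expected = sil_losses(log_prob=-1.2, value=3.0, ret=2.0)
print(better_than_expected)
print(worse_than_expected)
```

  A transition whose return fell short of the value estimate contributes nothing to either loss, which is what keeps the agent from imitating its own mediocre behavior.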
  Robots: They also test a combination of PPO+SIL on simulated robotics tasks within OpenAI Gym and significantly boost performance relative to non-SIL baselines.
  Comparisons: At this stage it’s worth noting that many other algorithms and systems have come out since A2C with better performance on Atari, so I’m a little skeptical of the comparative metric here.
  Why it matters: We need to design AI algorithms that can explore their environment more intelligently. This work provides further evidence that developing more sophisticated exploration techniques can further boost performance. Though, as the report notes, such systems can still get stuck in poor local optima. “Our results suggest that there can be a certain learning stage where exploitation is more important than exploration or vice versa,” the authors write. “We believe that developing methods for balancing between exploration and exploitation in terms of collecting and learning from experiences is an important future research direction.”
  Read more: Self-Imitation Learning (Arxiv).

Yes, AI is beginning to influence the economy:
…New study by experienced economists suggests the symptoms of major economic changes as a consequence of AI are already here…
Jason Furman, former chairman of the Council of Economic Advisers and current professor at the Harvard Kennedy School, and Robert Seamans of the NYU Stern School of Business, have published a lengthy report on AI and the Economy. The report compiles information from a wide variety of sources, so it’s worth reading in full.
  Here are some of the facts the report cites as symptoms that AI is influencing the economy:
– 26X: Increase in AI-related mergers and acquisitions from 2015 to 2017. (Source: The Economist).
– 26%: Real reduction in ImageNet top-5 image recognition error rate from 2010 to 2017. (Source: the AI Index.)
– 9X: Increase in number of academic papers focused on AI from 1996 to now, compared to a 6X increase in computer science papers. (Source: the AI Index.)
– 40%: Real increase in venture capital investment in AI startups from 2013 to 2016 (Source: MGI Report).
– 83%: Probability a job paying around $20 per hour will be subject to automation (Source: CEA).
– 4%: Probability a job paying over $40 per hour will be subject to automation (Source: CEA).
  “Artificial intelligence has the potential to dramatically change the economy,” they write in the report conclusion. “Early research findings suggest that AI and robotics do indeed boost productivity growth, and that effects on labor are mixed. However, more empirical research is needed in order to confirm existing findings on the productivity benefits, better understand conditions under which AI and robotics substitute or complement for labor, and understand regional level outcomes.”
  Read more: AI and the Economy (SSRN).

US Republican politician writes op-ed on need for Washington to adopt AI:
…Op-ed from US politician Will Hurd calls for greater use of AI by federal government…
The US government should implement AI technologies to save money and cut the time it takes for it to provide services to citizens, says Will Hurd, chairman of the US Information Technology Subcommittee of the House Committee on Oversight and Government Reform.
  “While introducing AI into the government will save money through optimizing processes, it should also be deployed to eliminate waste, fraud, and abuse,” Hurd said. “Additionally, the government should invest in AI to improve the security of its citizens… it is in the interest of both our national and economic security that the United States not be left behind.”
  Read more: Washington Needs to Adopt AI Soon or We’ll Lose Millions (Fortune).
  Watch the hearing in which I testified on behalf of OpenAI and the AI Index (Official House website).

European Commission adds AI advisers to help it craft EU-wide AI strategy:
…52 experts will steer European AI alliance, advise the commission, draft ethics guidelines, and so on…
As part of Europe’s attempt to chart its path forward in an AI world, the European Commission has announced the members of a 52-strong “AI High Level Group” who will advise the Commission and other initiatives on AI strategy. Members include professors at a variety of European universities; representatives of industry, like Jean-Francois Gagne, the CEO of Element AI, SAP’s SVP of Machine Learning, and Francesca Rossi, who leads AI ethics initiatives at IBM and also sits on the board of the Partnership on AI; as well as members of the existential risk/AGI community like Jaan Tallinn, who was the founding engineer of Skype and Kazaa.
  Read more: High-Level Group on Artificial Intelligence (European Commission).

European researchers call for EU-wide AI coordination:
…CLAIRE letter asks academics to sign to support excellence in European AI…
Several hundred researchers have signed a letter in support of the Confederation of Laboratories for Artificial Intelligence Research in Europe (CLAIRE), an initiative to create a pan-EU network of AI laboratories that can work together and feed results into a central facility which will serve as a hub for scientific research and strategy.
  Signatories: Some of the people that have signed the letter so far include professors from across Europe, numerous members of the European Association for Artificial Intelligence (EurAI) and five former presidents of IJCAI (International Joint Conference on Artificial Intelligence).
  Not the only letter: This letter follows the launch of another one in May which called for the establishment of a European AI superlab and associated support infrastructure, named ‘Ellis’. (Import AI: #92).
  Why it matters: We’re seeing an increase in the number of grassroots attempts by researchers and AI practitioners to get governments or sets of governments to pay attention to and invest in AI. It’s mostly notable to me because it feels like the AI community is attempting to become a more intentional political actor, and joint letters like this represent a form of practice for future, more substantive engagements.
  Read more: CLAIRE (claire-ai.org).

When Good Measures go Bad: BLEU:
…When is an assessment metric not a useful assessment metric? When it’s used for different purposes…
A researcher with the University of Aberdeen has evaluated how good a metric BLEU (bilingual evaluation understudy) is for assessing the performance of natural language processing systems. They analyzed 284 distinct correlations between BLEU and gold-standard human evaluations across 34 papers and concluded that BLEU is useful for evaluating machine translation systems, but that its utility breaks down when it is used for other purposes, like the assessment of individual texts, scientific hypothesis testing, or the evaluation of things like natural language generation.
  Why it matters: AI research runs partially on metrics and metrics are usually defined by assessment techniques. It’s worth taking a step back and looking at widely-used metrics like BLEU to work out how meaningful they are as assessment methodologies, and to remember to use them within their appropriate domains.
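  What "correlation with human evaluation" means in practice: a metric is scored against human ratings of the same systems, and the correlation tells you how well it tracks them. Here is a small sketch using Pearson's r, one common correlation measure, written out by hand. The scores below are invented toy values, not data from the review, and the variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented toy values: one metric score and one human rating per system.
metric_scores = [0.21, 0.25, 0.30, 0.34, 0.41]
human_ratings = [2.9, 3.1, 3.0, 3.8, 4.2]

r = pearson(metric_scores, human_ratings)
print(f"Pearson r = {r:.2f}")
```

  A value near 1 would suggest the metric ranks systems much as humans do; a value near 0 would suggest it tells you little about human judgments in that setting.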
  Read more: A Structured Review of the Validity of BLEU (Computational Linguistics).

Neural networks can be more brain-like than you assume:
…PredNet experiments show correspondence between activations in PredNet and activations in Macaque brains…
How brain-like are neural networks? Not very. That’s because, at a basic component level, they’re based on a somewhat simplified ~1950s conception of how neurons work, so their biological fidelity is fairly low. But can neural networks, once trained to perform particular tasks, end up reflecting some of the functions and capabilities found in biological neural networks? The answer seems to be yes, based on several years of experiments in things as varied as analyzing pre-trained vision networks, verifying the emergence of ‘place cells’, and other experiments.
  Harvard and MIT researchers have analyzed PredNet, a neural network trained to perform next-frame prediction in video sequences, to understand how brain-like its behavior is. They find that when they expose the network to input, groups of its neurons fire with a response pattern (consisting of two distinct peaks) that is analogous to the firing patterns found in individual neurons within Macaque monkeys. Similarly, when analyzing a network trained on the self-driving KITTI dataset in terms of its spatial receptivity, they find that the artificial network displays similar dynamics to real ones (though with some variance and error). The same high level of overlap between the behavior of artificial and real neurons is roughly true of systems trained on sequence learning tasks.
  Less overlap: The areas where artificial and real neurons display less overlap seem to roughly correlate with intuitively harder tasks, like being able to deal with optical illusions, or with how the systems respond to different classes of object.
  Why it matters: We’re heading into a world where people are going to increasingly use trained analogues of real biological systems to better analyze and understand the behavior of both. PredNet provides an encouraging example that this line of experimentation can work. “We argue that the network is sufficient to produce these phenomena, and we note that explicit representation of prediction errors in units within the feedforward path of the PredNet provides a straightforward explanation for the transient nature of responses in visual cortex in response to static images,” the researchers write. “That a single, simple objective—prediction—can produce such a wide variety of observed neural phenomena underscores the idea that prediction may be a central organizing principle in the brain, and points toward fruitful directions for future study in both neuroscience and machine learning.”
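  A toy view of why prediction errors produce transient responses: the sketch below assumes the split-rectified error units described in the PredNet paper (positive and negative errors kept as separate rectified channels). The "copy the last frame" predictor is a deliberately crude stand-in for PredNet's recurrent convolutional layers, and every name and number here is hypothetical.

```python
def relu(x):
    return max(x, 0.0)


def error_units(actual, predicted):
    """Split prediction error into positive and negative rectified parts."""
    positive = [relu(a - p) for a, p in zip(actual, predicted)]
    negative = [relu(p - a) for a, p in zip(actual, predicted)]
    return positive + negative


def total_error_over_time(frames):
    """Summed error activity per frame for a 'predict the last frame' model."""
    prediction = [0.0] * len(frames[0])  # no expectation before frame one
    totals = []
    for frame in frames:
        totals.append(sum(error_units(frame, prediction)))
        prediction = list(frame)  # stand-in predictor: copy the last frame
    return totals


# The same made-up "image" shown four times in a row.
static_image = [0.2, 0.9, 0.5, 0.0]
responses = total_error_over_time([static_image] * 4)
print(responses)
```

  Error activity is high when the image first appears and falls away once the prediction catches up, which is the flavor of transient response the researchers describe for static images.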
  Read more: A neural network trained to predict future video frames mimics the critical properties of biological neuronal responses and perception (Arxiv).
  Read more: PredNet (CoxLab).

Unsupervised Meta-Learning: Learning how to learn without having to be told how to learn:
…The future will be unsupervised…
Researchers with the University of California at Berkeley have made meta-learning more tractable by reducing the amount of work a researcher needs to do to set up a meta-learning system. Their new ‘unsupervised meta-learning’ (UML) approach lets their meta-learning agent automatically acquire distributions of tasks which it can subsequently perform meta-learning over. This deals with one drawback of meta-learning, which is that it is typically down to the human designer to come up with a set of tasks for the algorithm to be trained on. They also show how to combine UML with other recently developed techniques like DIAYN (Diversity Is All You Need) for breaking environments down into collections of distinct tasks/states to train over.
  Results: UML systems beat basic RL baselines on simulated 2D navigation and locomotion tasks. They also tend to obtain performance roughly equivalent to systems built with human-designed, tuned reward functions, suggesting that UML can successfully explore the problem space enough to devise good reward signals for itself.
  Why it matters: Because the diversity of tasks we’d like AI to do is much larger than the number of tasks we can neatly specify via hand-written rules it’s crucial we develop methods that can rapidly acquire information from new environments and use this information to attack new problems. Meta-learning is one particularly promising approach to dealing with this problem, and by removing another one of its more expensive dependencies (a human-curated task distribution) UML may help push things forward. “An interesting direction to study in future work is the extension of unsupervised meta-learning to domains such as supervised classification, which might hold the promise of developing new unsupervised learning procedures powered by meta-learning,” the researchers write.
  Read more: Unsupervised Meta-Learning for Reinforcement Learning (Arxiv).

OpenAI Bits&Pieces:

Better language systems via unsupervised learning:
New OpenAI research shows how to pair unsupervised learning with supervised finetuning to create large, generalizable language models. This sort of result is interesting because it shows how deep learning components can end up displaying sophisticated capabilities, like being able to obtain high scores on Winograd schema tests, having only learned naively from large amounts of data rather than via specific hand-tuned rules.
  Read more: Improving Language Understanding with Unsupervised Learning (OpenAI Blog).

Tech Tales:

Special Edition: Guest short story by James Vincent, a nice chap who writes about AI. All credit to James, all blame to me, etc… jack@jack-clark.net.

Shunts and Bumps.

Reliable work, thought Andre, that was the thing. Ignore the long hours, freezing warehouses, and endless retakes. Ignore the feeling of being more mannequin than man when the director storms onto set, snatches the coffee cup out of your hand and replaces it with a bunch of flowers without even looking at you. Ignore it all. This was a job that paid, week after week, and all because computers had no imagination.

God bless their barren brains.

Earlier in the year, Rocky had explained it to him like this. “They’re dumb as shit, ok? Show them a potato 50 times and they’ll say it’s an orange. Show them it 5,000 times and they’ll say it’s a potato but pass out in shock if you turn it into fries. They just can’t extrapolate like humans can — they can’t think.” (Rocky, at this point, had been slopping her beer around the bar as if trying to short-circuit a crowd of invisible silicon dunces.) “They only know what you show them, and only then when you show them it enough times. Like a mirror … that gets a burned-in image of your face after you’ve looked at it every day for a year.”

For the self-driving business, realizing this inability to extrapolate had been a slow and painful process. “A bit of a car crash,” Rocky said. The first decade had been promising, with deep learning and cheap sensors putting basic autonomy in every other car on the road. Okay, so you weren’t technically allowed to take your hands off the wheel, and things only worked perfectly in perfect conditions: clearly painted road markings, calm highways, and good weather. But the message from the car companies was clear: we’re going to keep getting better, this fast, forever.

Except that didn’t happen. Instead, there was freak accident after freak accident. Self-driving cars kept crashing, killing passengers and bystanders. Sometimes it was a sensor glitch; the white side of a semi getting read as clear highway ahead. But more often it was just the mild chaos of life: a party balloon drifting into the road or a mattress falling off a truck. Moments where the world’s familiar objects are recombined into something new and surprising. Potatoes into fries.

The car companies assured us that the data they used to train their AI covered 99 percent of all possible miles you could travel, but as Rocky put it: “Who gives a fuck about 99 percent reliability when it’s life or death? An eight-year-old can drive 99 percent of the miles you can if you put her in a booster seat, but it’s those one percenters that matter.”

Enter: Andre and his ilk. The car companies had needed data to teach their AIs about all the weird and unexpected scenarios they might encounter on the road, and California was full of empty film lots and jobbing actors who could supply it. (The rise of the fakies hadn’t been kind to the film industry.) Every incident that an AI couldn’t extrapolate from simulations was mocked up in a warehouse, recorded from a dozen angles, and sold to car companies as 4D datasets. They in turn repackaged it for car owners as safety add-ons sold at $300 a pop. They called it DDLC: downloadable driving content. You bought packs depending on your level of risk aversion and disposable income. Dogs, Cats, And Other Furry Fiends was a bestseller. As was Outside The School Gates.

It was a nice little earner, Rocky said, and typical of the tech industry’s ability to “turn liability into profit.” She herself did prototyping at one of the higher-end self-driving outfits. “They’re obsessed with air filtration,” she’d told Andre, “Obsessed. They say it’s for biological attacks but I think it’s to handle all their meal-replacement-smoothie farts.” She’d also helped him find the new job. As was usually the case when the tech industry used cheap labor to paper over the cracks in its products, this stuff was hardly advertised. But, a few texts and a Skype audition later, and here he was.

“Ok, Andre, this time it’s the oranges going into the road. Technical says they can adjust the number in post but would prefer if we went through a few different velocities to get the physics right. So let’s do a nice gentle spill for the first take and work our way up from there, okay?”

Andre nodded and grabbed a crate. This week they were doing Market Mayhem: Fruits, Flowers, And Fine Food and he’d been chucking produce about all day. Before that he’d been pushing a cute wheeled cart around on the warehouse’s football field-sized loop of fake street. He was taking a break after the crate work, staring at a daisy pushing its way through the concrete (part of the set or unplanned realism?) when the producer approached him.

“Hey man, great work today — oops, got a little juice on ya there still — but great work, yeah. Listen, dumb question, but how would you like to earn some real money? I mean, who doesn’t, right? I see you, I know you’ve got ambitions. I got ‘em too. And I know you’ve gotta take time off for auditions, so what I’m talking about here is a little extra work for triple the money.”

Andre had been suspicious. “Triple the money? How? For what?”

“Well, the data we’ve been getting is good, you understand, but it’s not covering everything the car folks want. We’re filling in a lot of edge cases but they say there’s still some stuff there’s no data for. Shunts and bumps, you might say. You know, live ones… with people.”

And that was how Andre found himself, standing in the middle of a fake street in a freezing warehouse, dressed in one of those padded suits used to train attack dogs, staring down a mid-price sedan with no plates. Rocky had been against it, but the money had been too tempting to pass up. With that sort of cash he’d be able to take a few days off, hell, maybe even a week. Do some proper auditions. Actually learn the lines for once. And, the producer said, it was barely a crash. You probably wouldn’t even get bruised.

Andre gulped, sweating despite the cold air. He looked at the car a few hundred feet away. The bonnet was wrapped in some sort of striped, pressure sensitive tape, and the sides were knobbly with sensors. Was the driver wearing a helmet? That didn’t seem right. Andre looked over to the producer, but he was facing away from him, speaking quickly into a walkie-talkie. The producer pointed at something. A spotlight turned on overhead. Andre was illuminated. He tried to shout something but his tongue was too big in his mouth. Then he heard the textured whine of an electric motor, like a kazoo blowing through a mains outlet, and turned to see the sedan sprinting quietly towards him.

Regular work, he thought, that was the thing.

Things that inspired this story: critiques of deep learning; failures of self driving systems; and imitation learning.

Once again, the story above is from James Vincent. Find him on Twitter and let him know what you thought!