Import AI #87: Salesforce research shows the value of simplicity, Kindred’s repeatable robotics experiment, plus: think your AI understands physics? Run it on IntPhys and see what happens.

by Jack Clark

Chinese AI star says society must prepare for unprecedented job destruction:
…Kai-Fu Lee, venture capitalist and former AI researcher, discusses the impact of AI and why today’s techniques will have a huge impact on the world…
Today’s AI systems are going to influence the world’s economy so much that their uptake will lead to what looks in hindsight like another industrial revolution, says Chinese venture capitalist Kai-Fu Lee, in an interview with Edge. “We’re all going to face a very challenging next fifteen or twenty years, when half of the jobs are going to be replaced by machines. Humans have never seen this scale of massive job decimation. The industrial revolution took a lot longer,” he said.
   He also says that he worries deep learning might be a one-trick pony, in the sense that we can’t expect other similarly scaled breakthroughs to occur in the next few years, and we should adjust our notions of AI progress on this basis. “You cannot go ahead and predict that we’re going to have a breakthrough next year, and then the month after that, and then the day after that. That would be exponential. Exponential adoption of applications is, for now, happening. That’s great, but the idea of exponential inventions is a ridiculous concept. The people who make those claims and who claim singularity is ahead of us, I think that’s just based on absolutely no engineering reality,” he says.
  AI Haves and Have-Nots: Countries like China and the USA that have large populations and significant investments in AI stand to fare well in the new AI era, he says. “The countries that are not in good shape are the countries that have perhaps a large population, but no AI, no technologies, no Google, no Tencent, no Baidu, no Alibaba, no Facebook, no Amazon. These people will basically be data points to countries whose software is dominant in their country.”
  Read more: We Are Here To Create, A Conversation With Kai-Fu Lee (Edge).

AI practitioners grapple with the upcoming information apocalypse:
…And you thought DeepFakes was bad. Wait till DeepWar…
Members of the AI community are beginning to sound the alarm about the imminent arrival of stunningly good, stunningly easy-to-make synthetic images and videos. In a blog post, AI practitioners say that the increasing availability of data, combined with easily accessible AI infrastructure (cloud-rentable GPUs), is lowering the barrier to entry for people who want to make this stuff, and that ongoing progress in AI capabilities means the quality of this fake media is increasing over time.
  How can we deal with these information threats? We could look at how society already makes it hard to forge currencies: by making high-fidelity copies costly to produce while, in parallel, developing technologies to verify the authenticity of currency materials. Unfortunately, though this may help with some of the problems brought about by AI forgery, it doesn’t deal with the root problem: AI is predominantly embodied in software rather than hardware, so it’s going to be difficult to insert detectable (and non-spoofable) distinct visual/audio signatures into generated media, barring some kind of DRM-on-steroids. One solution could be to train AI classifiers on real and faked datasets from the same domain, providing classifiers that can spot faked media in the wild.
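  That classifier approach doesn’t require anything exotic. Here’s a minimal sketch in PyTorch, assuming a hypothetical folder layout with data/real and data/fake subdirectories, that fine-tunes a stock CNN as a binary real-versus-fake detector:

```python
# Minimal sketch of the "train a classifier to spot fakes" idea.
# Assumes two hypothetical image folders, data/real and data/fake,
# and fine-tunes a standard ResNet-18 as a binary detector.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# ImageFolder maps each subdirectory (real/, fake/) to a class label.
dataset = datasets.ImageFolder("data", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: real vs. fake

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```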
  Read more: Commoditisation of AI, digital forgery and the end of trust: how we can fix it.

Berkeley researchers use Soft Q-Learning to let robots compose solutions to tasks:
…Research reduces the time it takes to learn new behaviors on robots…
Berkeley researchers have figured out how to use soft Q-learning, a recently introduced variant of traditional Q-learning, to let robots learn more efficiently. They introduce a new trick: composing new Q-functions from existing learned policies. For example, they can train a robot to move its arm to a particular distribution of X positions, then to a particular distribution of Y positions, and then create a new policy that moves the arm to the intersection of the X and Y positions without ever having been trained on the combination. This sort of behavior is typically quite difficult to learn with a single policy, as it requires so much exploration that most algorithms will spend a long time trying and failing to succeed at the task.
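  The composition rule itself is strikingly simple: in the maximum-entropy setting the paper works in, Q-functions trained on separate tasks can be averaged and actions sampled from the resulting soft policy. Here’s a toy discrete-action sketch, with made-up Q-values standing in for learned Q-functions:

```python
# Toy sketch of composing soft Q-functions: average the Q-functions
# of the constituent tasks, then act via the Boltzmann (max-entropy)
# policy. The q_x/q_y arrays are made-up stand-ins for learned values.
import numpy as np

def soft_policy(q_values, temperature=1.0):
    """Boltzmann policy over a discrete action set."""
    logits = q_values / temperature
    probs = np.exp(logits - logits.max())  # subtract max for stability
    return probs / probs.sum()

q_x = np.array([2.0, 0.5, -1.0])   # Q-values for "reach X positions" task
q_y = np.array([-1.0, 0.7, 2.0])   # Q-values for "reach Y positions" task

# Compose: average the Q-functions, then sample from the soft policy.
q_combined = 0.5 * (q_x + q_y)
print(soft_policy(q_combined))     # favors actions good for both tasks
```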
  Real world: The researchers train real robots to succeed at tasks like reaching to a specific location and stacking Lego blocks. They also demonstrate the utility of combining policies by training a robot to avoid an obstacle near its arm, separately training it to stack Lego blocks, then combining the two policies so the robot can stack blocks while avoiding the obstacle, despite never having been trained on the combination.
  Why it matters: The past few years of AI progress have made us very good at developing systems that excel at individual capabilities. Being able to combine those capabilities in an ad hoc manner to generate new behaviors further extends what AI systems can do, making it possible to learn a distribution of atomic behaviors and then chain them together to succeed at far more complex tasks than those found in the training set.
  Read more: Composable Deep Reinforcement Learning for Robotic Manipulation (Arxiv).

Think your AI model has a good understanding of physics? Run it on IntPhys and prepare to be embarrassed:
…Testing AI systems in the same way we test infants and creatures…
Researchers from INRIA, Facebook, and CNRS have released IntPhys, a new way to evaluate AI systems’ ability to model the physical world around them using what the researchers call a ‘physical plausibility test’. IntPhys follows a recent trend in AI of testing systems on tougher problems that more closely map to the sorts of problems humans typically tackle (see AI2’s ‘ARC’ dataset for written reasoning, and DeepMind’s cognitive science-inspired ‘PsychLab’ environment).
  How it works: IntPhys presents AI systems with movies of scenes rendered in Unreal Engine 4 and challenges them to figure out whether one scene can lead to another, testing models’ ability to internalize fundamental concepts about the world like object permanence, causality, and so on. Systems compute a “plausibility score” for each of the scenes or scene combinations they are shown; these scores are then used to figure out whether the systems have learned the underlying dynamics of the world.
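  One plausible way to produce such a score (a sketch of the general idea, not the paper’s actual method) is to run a learned next-frame predictor over a video and treat low prediction error as high plausibility:

```python
# Sketch: turn a learned frame predictor into a "plausibility score".
# The predictor here is a hypothetical stand-in, not IntPhys's model.
import numpy as np

def plausibility_score(frames, predict_next):
    """Higher score = more physically plausible video.

    frames: array of shape (T, H, W); predict_next maps the frames
    seen so far to a predicted next frame.
    """
    errors = []
    for t in range(1, len(frames)):
        predicted = predict_next(frames[:t])
        errors.append(np.mean((predicted - frames[t]) ** 2))
    # A video whose frames are easy to predict gets a high score.
    return -float(np.mean(errors))

# Toy usage: a "predictor" that just repeats the last observed frame.
frames = np.random.rand(8, 16, 16)
score = plausibility_score(frames, predict_next=lambda past: past[-1])
```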
  The IntPhys Benchmark: v1 of IntPhys focuses on unsupervised learning, and its first tests target systems’ ability to understand object permanence. Future releases will add tests for things like shape constancy, spatio-temporal continuity, and so on. The initial release contains 15,000 videos of possible events, each around 7 seconds long at 15fps, totalling 21 hours of video. It also includes some additional information so you don’t have to solve the task in a purely unsupervised manner, including per-pixel depth data for each frame as well as object instance segmentation masks.
  Baseline Systems VERSUS Humans: The researchers provide two baselines for others to evaluate their systems against: a CNN encoder-decoder system and a conditional GAN. “Preliminary work with predictions at the pixel level revealed that our models failed at predicting convincing object motions, especially for small objects on a rich background. For this reason, we switched to computing predictions at a higher level, using object masks.” The researchers also tested humans on the benchmark, finding an average error rate of about 8 percent when the scene is fully visible and 25 percent when it contains partial occlusion. The neural network-based systems, by comparison, had error rates of 31 percent on visible scenes and 50 percent on partially occluded ones.
  What computers are up against: “At 2-4 months, infants are able to parse visual inputs in terms of permanent, solid and spatiotemporally continuous objects. At 6 months, they understand the notion of stability, support and causality. Between 8 and 10 months, they grasp the notions of gravity, inertia, and conservation of momentum in collision; between 10 and 12 months, shape constancy, and so on,” the researchers write.
  Why it matters: Tests like this give us a better way to measure the ability of AI systems to perform fundamental acts of reasoning, and as the researchers extend the benchmark with more challenging components we’ll get a better read on what these systems are actually capable of. As new components are added, “the prediction task will become more and more difficult and progressively reach the level of scene comprehension achieved by one-year-old humans,” they write.
  Competition: AI researchers can download the dataset and submit their system scores to an online leaderboard at the official IntPhys website here (IntPhys).
  Read more: IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning (Arxiv).

Kindred researchers explain how to make robots repeatable:
…Making the dream of repeatable robot experiments a reality…
Researchers with robot AI startup Kindred have published a paper on a little-discussed subject in AI: repeatable real-world robotics experiments. It’s a worthwhile primer on some of the tweaks people need to make to create robot development environments that are a) repeatable and b) effective.
  Regular robots: The researchers set up a reaching task using a Universal Robots ‘UR5’ robot arm and describe the architecture of the system. One key difference between simulated and real-world environments is the role of time: in simulation one typically executes all the learning and action updates synchronously, whereas on a real robot they have to happen asynchronously. “In real-world tasks, time marches on during each agent and environment-related computations. Therefore, the agent always operates on delayed sensorimotor information,” they explain.
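  A minimal sketch of that asynchronous pattern, with hypothetical read_sensors/apply_action stubs standing in for a real robot interface: a background thread keeps refreshing the observation while a fixed-rate control loop acts on whatever (slightly stale) data is available.

```python
# Sketch of the asynchronous pattern the paper describes: the world
# keeps moving while the agent computes, so the control loop acts on
# the most recent (slightly stale) observation at a fixed cycle time.
# read_sensors/apply_action are hypothetical robot-interface stubs.
import threading
import time

latest_obs = {"value": None}

def sensor_loop(read_sensors, period=0.008):
    """Continuously refresh the shared observation in the background."""
    while True:
        latest_obs["value"] = read_sensors()
        time.sleep(period)

def control_loop(policy, apply_action, cycle_time=0.04):
    """Fixed-rate action cycle: compute, act, then sleep the remainder."""
    while True:
        start = time.time()
        action = policy(latest_obs["value"])  # operates on delayed data
        apply_action(action)
        time.sleep(max(0.0, cycle_time - (time.time() - start)))

# Toy usage with stand-in stubs:
threading.Thread(
    target=sensor_loop, args=(lambda: time.time(),), daemon=True
).start()
# control_loop(policy=lambda obs: 0.0, apply_action=print)
```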
  Why it matters: It’s currently very difficult to track progress in real-world robotics due to the diversity of tasks and the lack of trustworthy testing regimes. Papers like this suggest a path forward, and I’d hope they encourage researchers to structure their experiments to be more repeatable and reliable. If we can do this, we’ll be able to develop better intuitions about the rate of progress in the field, which should help with forecasting trends in development – a critical thing to do, given how much robots are expected to influence employment in the regions where they are deployed.
  Read more here: Setting up a Reinforcement Learning Task with a Real-World Robot (Arxiv).

Salesforce researchers demonstrate the value of simplicity for language modelling:
…Well-tuned LSTM or QRNN-based systems shown to beat more complex systems…
Researchers with Salesforce have shown that well-tuned, basic AI components can attain better performance on tough language tasks than more sophisticated and, in many cases, more modern systems. Their research shows that RNN-based language models built from well-tuned, simple components like LSTMs or the Salesforce-invented QRNN beat more complex models like recurrent highway networks, hypernetworks, and systems found by neural architecture search. This result highlights that some recent progress in AI may to an extent be illusory: jumps in performance on certain datasets that were previously attributed to fundamentally new capabilities in new models are now shown to be within reach of simpler components that are tuned and tested comprehensively.
  Results: The researchers test their QRNN- and LSTM-based systems on the character-level Penn Treebank and enwik8 datasets and the word-level WikiText-103 dataset, beating state-of-the-art scores on Penn Treebank and enwik8 when measured by bits-per-character, and significantly outperforming the prior state of the art on perplexity on WikiText-103.
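  To make the shape of such a baseline concrete, here is a minimal word-level LSTM language model in PyTorch with two of the standard tuning tricks this line of work leans on (tied input/output embeddings, dropout); this is a sketch of the model family, not the paper’s exact AWD-LSTM/QRNN configuration:

```python
# Minimal word-level LSTM language model with weight tying and dropout,
# a sketch of the flavor of "simple, well-tuned" baseline described above.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=400, hidden_dim=400,
                 num_layers=3, dropout=0.4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers,
                            dropout=dropout, batch_first=True)
        self.drop = nn.Dropout(dropout)
        self.decoder = nn.Linear(hidden_dim, vocab_size)
        self.decoder.weight = self.embed.weight  # weight tying

    def forward(self, tokens, hidden=None):
        x = self.drop(self.embed(tokens))
        output, hidden = self.lstm(x, hidden)
        return self.decoder(self.drop(output)), hidden

model = LSTMLanguageModel(vocab_size=10000)
logits, _ = model(torch.randint(0, 10000, (8, 35)))  # (batch, seq_len)
```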
  Why it matters: This paper follows prior work showing that many existing AI components are more powerful than researchers suspected, such as research showing that fairly old systems like GANs and DCGANs can model data distributions more effectively than sophisticated successor systems. That’s not to say the subsequent inventions are pointless, but it should encourage researchers to devote more time to interrogating and tuning existing systems rather than reinventing the proverbial wheel. “Fast and well tuned baselines are an important part of our research community. Without such baselines, we lose our ability to accurately measure our progress over time. By extending an existing state-of-the-art word level language model based on LSTMs and QRNNs, we show that a well tuned baseline can achieve state-of-the-art results on both character-level (Penn Treebank, enwik8) and word-level (WikiText-103) datasets without relying on complex or specialized architectures,” they write.
  Read more: An Analysis of Neural Language Modeling at Multiple Scales (Arxiv).

Want to test how well your AI understands language and images? Try VQA 2.0
…New challenge arrives to test AI systems’ abilities to model language and images…
AI researchers who think they’ve developed models that can learn the relationship between language and images may want to submit to the third iteration of the Visual Question Answering Challenge, which prompts models to answer questions about the contents of images. Challengers will use v2.0 of the VQA dataset, which includes more written questions and ground-truth answers about images.
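  For a sense of what a basic entrant looks like, here’s a sketch of the classic VQA recipe: encode the image with a CNN, the question with an RNN, fuse the two, and classify over a fixed answer vocabulary. All dimensions and names here are illustrative, not the challenge’s reference model:

```python
# Sketch of the classic VQA recipe: CNN image features + LSTM question
# encoding, elementwise fusion, classification over a fixed answer set.
import torch
import torch.nn as nn
from torchvision import models

class SimpleVQA(nn.Module):
    def __init__(self, vocab_size=10000, num_answers=3000, dim=512):
        super().__init__()
        cnn = models.resnet18(pretrained=True)
        # Drop the final classification layer to get 512-d image features.
        self.image_encoder = nn.Sequential(*list(cnn.children())[:-1])
        self.image_proj = nn.Linear(512, dim)
        self.embed = nn.Embedding(vocab_size, 300)
        self.question_encoder = nn.LSTM(300, dim, batch_first=True)
        self.classifier = nn.Linear(dim, num_answers)

    def forward(self, image, question_tokens):
        img = self.image_proj(self.image_encoder(image).flatten(1))
        _, (h, _) = self.question_encoder(self.embed(question_tokens))
        fused = img * h[-1]               # elementwise fusion
        return self.classifier(fused)

model = SimpleVQA()
logits = model(torch.randn(2, 3, 224, 224),      # batch of 2 images
               torch.randint(0, 10000, (2, 12)))  # 12-token questions
```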
  Read more: VQA Challenge 2018 launched! (VisualQA.org).

Tech Tales:

Miscellaneous Letters Sent To The Info@ Address Of An AI Company

2023: I saw what you did with that robot so I know the truth. You can’t hide from me anymore I know exactly what you are. My family had a robot in it and the state took them away and told us they were being sent to prison but I know the truth they were going to take them apart and sell their body back to the aliens in exchange for the anti-climate change device. What you are doing with that robot tells me you are going to take it apart when it is done and sell it to the aliens as well. You CANNOT DO THIS. The robot is precious you need to preserve it or else I will be VERY ANGRY. You must listen to me we-

2025: So you think you’re special because you can get them to talk to each other in space now and learn things together well sure I can do that as well I regularly listen to satellites so I can tell you about FLUORIDE and about X74-B and about the SECRET UN MOONBASE and everything else but you don’t see me getting famous for these things in fact it is a burden it is a pain for me I have these headaches. Does your AI get sick as well?-

2027: Anything that speaks like a human but isn’t a human is a sin. You are sinners! You are pretending to be God. God will punish you. You cannot make the false humans. You cannot do this. I have been calling the police every day for a week about this ever since I saw your EVIL creation on FOX-25 and they say they are taking notes. They are onto you. I am going to find you. They are going to find you. I am calling the fire department to tell them about you. I am calling the military to tell them about you. I am calling the-

2030: My mother is in the hospital with a plate in her head I saw on the television you have an AI that can do psychology on other AIs can your AI help my mother? She has a plate in her head and needs some help and the doctors say they can’t do anything for her but they are liars. You can help her. Please can you make your AI look at her and diagnose what is wrong with her. She says the plate makes her have nightmares but I studied many religions for many years and believe she can be healed if she thinks about it more and if someone or something helps her think.

2031: Please you have to keep going I cannot be alone any more-

Things that inspired this story: Comments from strangers about AI, online conspiracy forums, bad subreddits, “Turing Tests”, skewed media portrayals of AI, the fact capitalism creates customers for false information which leads to media ecosystems that traffic in fictions painted as facts.