Import AI: 108: Learning language with fake sentences, Chinese researchers use RL to train prototype warehouse robots; and what the implications are of scaled-up Neural Architecture Search

by Jack Clark

Learning with Junk:
…Reversing sentences for better language models…
Sometimes a little bit of junk data can be useful: that’s the implication of new research from Stony Brook University, which shows that you can improve natural language processing systems by teaching them during training to distinguish between real and fake sentences.
Technique: “Given a large unlabeled corpus, for every original sentence, we add multiple fake sentences. The training task is then to take any given sentence as input and predict whether it is a real or fake sentence,” they write. “In particular, we propose to learn a sentence encoder by training a sequential model to solve the binary classification task of detecting whether a given input sentence is fake or real”.
  The researchers create fake sentences in two ways: WordShuffle, which shuffles the order of some of the words in a sentence; and WordDrop, which drops a random word from a sentence.
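  A rough sketch of how such a training set might be built, for the curious. This is not the authors' code: the function names, the single-swap reading of WordShuffle, and the `fakes_per_sentence` parameter are all assumptions made for illustration.

```python
import random

def word_shuffle(words, rng):
    """Swap two word positions (an assumed, minimal reading of WordShuffle)."""
    fake = list(words)
    if len(set(fake)) < 2:
        return fake  # every word is identical, so no reordering can differ
    while fake == list(words):
        i, j = rng.sample(range(len(fake)), 2)
        fake[i], fake[j] = fake[j], fake[i]
    return fake

def word_drop(words, rng):
    """Drop one randomly chosen word (WordDrop)."""
    if len(words) < 2:
        return list(words)  # nothing sensible to drop
    k = rng.randrange(len(words))
    return list(words[:k]) + list(words[k + 1:])

def make_examples(sentence, rng, fakes_per_sentence=2):
    """Return (text, label) pairs: label 1 for the real sentence, 0 for fakes."""
    words = sentence.split()
    examples = [(sentence, 1)]
    for _ in range(fakes_per_sentence):
        corrupt = rng.choice([word_shuffle, word_drop])
        examples.append((" ".join(corrupt(words, rng)), 0))
    return examples

if __name__ == "__main__":
    rng = random.Random(0)
    for text, label in make_examples("the cat sat on the mat", rng):
        print(label, text)
```

  A sentence encoder would then be trained as a binary classifier over pairs like these; the encoder itself is out of scope for this sketch.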
Evaluation: They evaluate these systems on tasks including sentiment classification, question answering, subjectivity, retrieval, and others. Systems trained with this approach display significantly higher scores than prior language modeling approaches (specifically, the FastSent and Skip-Thought techniques).
  Why this matters: Language modeling is one of the hardest tasks that contemporary AI is evaluated on. Typically, most of today’s systems fail to display much complexity in their learned models, likely due to the huge representational space of language, paired with the increased costs for getting different things wrong (it’s way easier to notice a sentence error or spelling error than to see how the value of one or two of the pixels in a large generated image are off). Systems and approaches like those described in this paper show how we can use data augmentation techniques and discriminative training approaches to create high-performing systems.
  Read more: Fake Sentence Detection as a Training Task for Sentence Encoding (Arxiv).

Reinforcement learning breaks out of the simulator with new Chinese research:
…Training robots via reinforcement learning to solve warehouse robot problems…
Researchers with the Department of Mechanical and Biomedical Engineering of City University of Hong Kong, China, along with Metoak Technology Co, and Fuzhou University’s College of Mathematics and Computer Science, have used reinforcement learning to train warehouse robots in simulation and transfer them to the real world. These are the same sorts of robots used by companies like Amazon and Walmart for automation of their own warehouses. The research has implications for how AI is going to revolutionize logistics and supply chains, as well as broadening the scope of capabilities of robots.
The researchers develop a system for their logistics robots based around what they call “sensor-level decentralized collision avoidance”. This “requires neither perfect sensing for neighboring agents and obstacles nor tedious offline parameter-tuning for adapting to different scenarios”. Each robot makes navigation decisions independently without any communication with others, and is trained in simulation via a multi-stage reinforcement learning scheme. The robots are able to perceive the world around them via a 2D laser scanner, and have full control over their translational and rotational velocity (think of them as autonomous dog-sized hockey pucks).
  Network architecture: They tweak and extend the Proximal Policy Optimization (PPO) algorithm to make it work in large-scale, parallel environments, then they train their robots using a two-stage training process: they first train 20 of them in a 2D randomized placement navigation scenario, where the robots need to learn basic movement and collision avoidance primitives. They then save the trained policy and use this to start a second training cycle, which trains 58 robots in a series of more complicated scenarios that involve different building dimensions, and so on.
  Mo’ AI, Mo Problems: Though the trained policies are useful and transfer into the world, they exhibit many of the idiosyncratic behaviors typical of AI systems, which will make them harder to deploy. “For instance, as a robot runs towards its goal through a wide-open space without other agents, the robot may approach the goal in a curved trajectory rather than in a straight line,” the researchers say. “We have also observed that a robot may wander around its goal rather than directly moving toward the goal, even though the robot is already in the close proximity of the target.” To get around this, the researchers design software to classify the type of scenario being faced by the robot, and then switch the robot between fully autonomous and PID-controlled modes according to the scenario. By using the hybrid system they create more efficient robots, because switching opportunistically into PID-control regimes leads to the robots typically taking straight line courses or moving and turning more precisely.
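  To make the hybrid idea concrete, here is a minimal sketch of the switching logic. Everything specific in it is an assumption rather than the paper's method: the `clear_radius` threshold, the rule that a goal nearer than the closest obstacle counts as a "simple" scenario, the gains, and the use of a plain proportional controller (only the P term of PID) as the classical fallback.

```python
import math

def choose_mode(laser_ranges, dist_to_goal, clear_radius=2.0):
    """Classify the scenario: 'pid' when the way looks clear, else 'rl'.

    Hypothetical rule: use the classical controller if nothing is within
    clear_radius metres, or if the goal is nearer than the nearest obstacle.
    """
    nearest = min(laser_ranges)
    if nearest >= clear_radius or dist_to_goal < nearest:
        return "pid"
    return "rl"

def p_control_velocity(pose, goal, k_lin=0.5, k_ang=1.5, v_max=1.0):
    """Proportional controller that drives straight at the goal."""
    x, y, heading = pose
    dx, dy = goal[0] - x, goal[1] - y
    err = math.atan2(dy, dx) - heading
    err = math.atan2(math.sin(err), math.cos(err))  # wrap to [-pi, pi]
    v = min(v_max, k_lin * math.hypot(dx, dy))      # translational velocity
    w = k_ang * err                                 # rotational velocity
    return v, w

def hybrid_step(laser_ranges, pose, goal, rl_policy):
    """Return (mode, (v, w)), delegating to the learned policy when needed."""
    dist = math.hypot(goal[0] - pose[0], goal[1] - pose[1])
    mode = choose_mode(laser_ranges, dist)
    if mode == "pid":
        return mode, p_control_velocity(pose, goal)
    return mode, rl_policy(laser_ranges, pose, goal)
```

  Here `rl_policy` stands in for the trained network: any callable that maps the laser scan, pose, and goal to a velocity pair would do for the purposes of the sketch.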
  Generalization: The researchers test their system’s generalization by evaluating it on scenarios with non-cooperative robots which don’t automatically help the other robots; with heterogeneous robots, so ones with different sizes and shapes; and in scenarios with larger numbers of robots than those controlled during simulation (100 versus 58). In tests, systems trained with both the RL and hybrid-RL system display far improved accuracy relative to supervised learning baselines; the systems are also flexible, able to get stuck less as you scale up the number of agents, and go through fewer collisions.
Real world: The researchers also successfully test out their approach on robots deployed in the real world. For this, they develop a robot platform that uses the Hokuyo URG-04LX-UG01 2D LiDAR, a Pozyx localization system based on Ultra-Wide Band (UWB) tech, and the NVIDIA Jetson TX1 for computing, and then they test this platform on a variety of different robot chassis including a Turtlebot, the ‘Igor’ robot from Hebi robotics, the Baidu Bear robot, and the Baidu shopping cart. They test their robots on simulated warehouse and office scenarios, including ones where robots need to shuttle between two transportation stations while avoiding pedestrian foot traffic; they also test the robots on tasks like following a person through a crowd, and around a platform. “Our future work would be how to incorporate our approach with classical mapping methods (e.g. SLAM) and global path planners (e.g. RRT and A∗ ) to achieve satisfactory performance for planning a safe trajectory through a dynamic environment,” they say.
  Why it matters: One of the legitimate criticisms of contemporary artificial intelligence is that though we’ve got a lot of known successes for supervised learning, we have relatively few examples of ways in which reinforcement learning-based systems are doing productive economic work in the world – though somewhat preliminary, research papers like this indicate that RL is becoming tractable on real world hardware, and that the same qualities of generalization and flexibility seen on RL-trained policies developed in simulation also appear to be present in reality. If this trend holds it will increase the rate at which we deploy AI technology like this into the world.
  Read more: Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios (Arxiv).

Watch James Mickens explain, in a humorous manner, why life is pointless and we’re all going to die:
…Famed oddball researcher gives terror-inducing rant at Usenix…
James Mickens is a sentient recurrent neural network sent from the future to improve the state of art and discourse about technology. He has recently talked about the intersection of AI and computer security. I’m not going to try and explain anything else about this just, please, watch it. Watch it right now.
Read/watch more: Q: Why Do Keynote Speakers Keep Suggesting That Improving Security Is Possible? A: Because Keynote Speakers Make Bad Life Decisions And Are Poor Role Models (Usenix).
  Bias alert: Mickens was one of the advisors on the ‘Assembly’ program that I attended @ Harvard and MIT earlier this year. I had a couple of interactions with him that led to me spending one evening hot-glueing cardboard together to make an ancient pyramid which I dutifully assembled, covered in glitter, photographed, and emailed him, apropos of nothing.

Neural Architecture Search: What is it good for?
…The answer: some things! But researchers are a bit nervous about the lack of theory…
Researchers with the Bosch Center for Artificial Intelligence and the University of Freiburg have written up a review of recent techniques relating to Neural Architecture Search, techniques for using machine learning to automate the design of neural networks. The review highlights how NAS has grown in recent years following an ImageNet-style validation of the approach in a paper from Google in 2017 (which used 800 GPUs), and has subsequently been made significantly more efficient and more high performing by other researchers. They also show how NAS – which originally started being used for tasks like image classification – is being used in a broadening set of domains, and that NAS systems are themselves becoming more sophisticated, evolving larger bits of systems, and starting to perform multi-objective optimization (like recent work from Google which showed how to use NAS techniques to evolve networks according to tradeoffs of concerns between performance and efficiency).
But, a problem: NAS is the most automated aspect of an empirical discipline that lacks much theory about why anything works. That means that NAS techniques are themselves vulnerable to the drawbacks of empirically-grounded science: poor experimental setup can lead to bad results, and scientists don’t have much in the way of theory to give them a reliable substrate on which to found their ideas. This means that lots of the known flaws with NAS-style approaches will need to be experimented on and tested to further our understanding of them, which will be expensive in terms of computational resources, and likely difficult due to the large number of moving parts coupled with the emergent properties of these systems. For example: “while approaches based on weight-sharing have substantially reduced the computational resources required for NAS (from thousands to a few GPU days), it is currently not well understood which biases they introduce into the search if the sampling distribution of architectures is optimized along with the one-shot model. For instance, an initial bias in exploring certain parts of the search space more than others might lead to the weights of the one-shot model being better adapted for these architectures, which in turn would reinforce the bias of the search to these parts of the search space,” they write.
  Why it matters: Techniques like NAS let us arbitrage computers for human brains for some aspects of AI design, potentially letting us alter more aspects of AI experimentation, and therefore further speed up the experimental loop. But we’ll need to run more experiments, or develop better theoretical analysis of such systems, to be able to deploy them more widely. “While NAS has achieved impressive performance, so far it provides little insights into why specific architectures work well and how similar the architectures derived in independent runs would be,” the researchers write.
  Read more: Neural Architecture Search: A Survey (Arxiv).

Q&A with Yoshua Bengio on how to build a successful research career and maintain your sanity (and that of your students) while doing so:
… Deep learning pioneer gives advice to eager young minds…
Artificial intelligence professor Yoshua Bengio, one of the pioneers of deep learning, has dispensed some advice about research, work-life balance, and academia versus industry, in an interview with Cifar news. Some highlights follow:
  On research: “One thing I would’ve done differently is not disperse myself in different directions, going for the idea of the day and forgetting about longer term challenges”.
  On management: Try to put people into positions where they get management experience earlier. “We shouldn’t underestimate the ability of younger people to do a better job than their elders as managers”.
  Create your own AI expert: Some people can become good researchers without much experience. “Find somebody who has the right background in math or physics and has dabbled in machine learning: these people can learn the skills very fast”.
  Make a nice lab environment: Make sure people hang out together and work in the lab the majority of the time.
Set your students free by “giving them freedom to collaborate and strike new projects outside of what you’ve suggested, even with other professors”.
  The secret to invention: “These ideas always come from somewhere hidden in our brain and we must cultivate our ability to give that idea-generation process enough time”. (In other words, work hard, but not too hard, and create enough time for your brain to just mess around with interesting ideas).
  Read more: Q&A with Yoshua Bengio (Cifar).

Chinese teams sweep Activity Recognition Challenge 2018:
…Video description and captioning next frontier…
Computer vision researchers have pitted their various systems against each other at correctly labeling activities carried out in video, as part of the ActivityNet 2018 Challenge. Systems were tested at their ability to label activities in videos, localize these activities, and provide accurate moment-by-moment captions for these activities. A team from Baidu won the first competition, a team from Shanghai Jiao Tong University won the second one, and a combined team from RUC and CMU won the third task. TK startup YH Technologies placed in the top three for each of these challenges as well. Additionally, organizations competed with each other on specific computer vision recognition tasks over specific datasets, and here Chinese companies and organizations led the leaderboards (including one case where a team from Tsinghua beat a team from DeepMind).
Why it matters: Activity recognition is one area of AI that has clear economic applications as well as clear surveillance ones – benchmarks like ActivityNet give us a better sense of progress within this domain, and I expect that in the future competitions like this may take on a nationalistic or competitive overtone.
Read more: The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary (Arxiv).

Tech Tales:

Funeral for a Robot

A collection of Haikus written by school children attending the funeral of a robot, somewhere in Asia Pacific, sometime in the mid 21st century.

Laid to rest at last
No backups, battery, comms
Rain thuds on coffin

Like a young shelled egg
All armor and clothing gone
Like a child, like me

If robot heaven, then
All free electricity
If robot hell: you pay
