Import AI 155: Mastering robots with the ‘DRIVE’ dataset; facial recognition for monkeys; and why AI development is a collective action problem.
by Jack Clark
Chinese company seeks smarter robots with ‘DRIVE’ dataset:
…Crowds? Check. Trashcan-sized robots? Check. A challenging navigation and mapping benchmark? Check…
Researchers with Chinese robot company Segway Robotics Inc have developed the ‘DRIVE’ dataset and benchmark, which is designed to help researchers develop smarter delivery robots.
The company did the research because it wants to encourage work in an area relevant to its business, and because of larger macroeconomic trends: “The online shopping and on-demand food delivery market in China has been growing at a rate of 30%-50% per year, leading to labor shortage and rising delivery cost,” the researchers write. “Delivery robots have the potential to solve the dilemma caused by the growing consumer demand and decreasing delivery workforce.”
Robots! Each Segway robot used to gather the dataset is equipped with a RealSense visual inertial sensor, two wheel encoders, and a Hokuyo 2D lidar.
The DRIVE dataset: The dataset consists of 100 movement sequences across five different indoor locations, and was collected by robots over the course of one year. It is designed to be extremely challenging, and incorporates the following confounding factors and traits:
- Cheap parts: The robots use commodity (aka cheap) inertial measurement units, whose noisier readings make accurate state estimation harder
- Busy: The gathered data includes scenes with many moving people and objects, which can break brittle AI systems
- Similar, similar: Some of the environments are superficially similar to each other, which could trigger misclassification. Some places in the environments lack texture or include numerous reflections and shadows, making it harder for robots to visually analyze their surroundings, and some environments have bumpy or rough surfaces.
- Hurry up and wait: Some of the sequences include long stretches in which the robot is stationary (which makes it difficult to estimate depth), while at other times the robots perform rapid rotations (which can lead to motion blur and wheels slipping on the ground).
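How such benchmarks are typically scored: the paper’s exact evaluation metrics aren’t reproduced here, but a common score for SLAM systems is absolute trajectory error (ATE), the root-mean-square distance between estimated and ground-truth robot positions. A minimal sketch, assuming the two trajectories are already aligned:

```python
# A minimal sketch (generic illustration; not necessarily the benchmark's
# exact metric) of absolute trajectory error (ATE): RMSE between estimated
# and ground-truth positions, assuming the trajectories are pre-aligned.
import numpy as np

def ate_rmse(estimated: np.ndarray, ground_truth: np.ndarray) -> float:
    """estimated, ground_truth: (n_poses, 3) arrays of xyz positions."""
    per_pose_error = np.linalg.norm(estimated - ground_truth, axis=1)
    return float(np.sqrt(np.mean(per_pose_error ** 2)))

ground_truth = np.cumsum(np.random.randn(100, 3) * 0.1, axis=0)  # toy path
estimated = ground_truth + np.random.randn(100, 3) * 0.05       # noisy estimate
print(ate_rmse(estimated, ground_truth))
```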
Why this matters: Datasets unlock AI progress, letting large numbers of people work together on shared challenges. Additionally, the creation of a dataset usually implies specific business and research priorities, so the arrival of things like the DRIVE Benchmark points to broader maturation in smart, mobile robots.
Read more: Segway DRIVE Benchmark: Place Recognition and SLAM Data Collected by A Fleet of Delivery Robots (Arxiv).
Find out more about the benchmark here (Segway DRIVE website).
####################################################
You’ve heard of face identification. What about Primate face identification?
…Towards a future where we automatically scan and surveil the world around us…
Researchers with the Indraprastha Institute of Information Technology Delhi and the Wildlife Institute of India have teamed up to develop a system capable of identifying monkeys in the wild, and have linked it to a crowd-sourced app that lets the “general public, professional monkey catchers and field biologists” contribute images of monkeys for training larger, smarter models.
Why do this? Monkeys are a bit of a nuisance in Indian urban and semi-urban environments, the researchers write, so they have designed the system to use data captured ‘in the wild’, helping people build systems to surveil and analyze primates in challenging contexts. “Typically, we expect the images to be captured in uncontrolled outdoor scenarios, leading to significant variations in facial pose and lighting”.
Datasets:
- Rhesus Macaque Dataset: 7679 images / 93 individuals.
- Chimpanzee Dataset: 7166 images / 90 individuals. Pictures span good-quality images from a zoo, as well as uncontrolled images from a national park.
Results: The system outperforms a variety of baselines and sets a new state of the art across four validation scores, typically via an absolute increase of more than 2 points, and sometimes by 6 points or more. The system is trained with a couple of different loss functions designed to capture smaller geometric features across faces, making the model more robust across multiple data distributions.
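The paper’s specific loss formulations aren’t reproduced here, but metric-learning objectives like the triplet loss are a standard way to achieve this kind of robustness; a minimal sketch:

```python
# A minimal sketch (a standard triplet loss, not the paper's exact
# formulation): pull embeddings of the same individual together and push
# embeddings of different individuals apart by at least a margin.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """anchor/positive: embeddings of the same monkey; negative: a different one."""
    d_pos = F.pairwise_distance(anchor, positive)  # same-individual distance
    d_neg = F.pairwise_distance(anchor, negative)  # cross-individual distance
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()

# Toy usage with random 128-dimensional embeddings:
a, p, n = (torch.randn(8, 128) for _ in range(3))
print(triplet_loss(a, p, n))
```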
Why this matters: This research is an indication of how as AI has matured we’ve started to see it being used as a kind of general-purpose utility, with researchers mixing and matching different techniques and datasets, making slight tweaks, and solving tasks for socially relevant applications. It’s particularly interesting to see this approach integrated with a crowd sourced app, pointing to a future where populations are able to collaboratively measure, analyze, and quantify the world around them.
Read more: Primate Face Identification in the Wild (Arxiv).
####################################################
What Recursion’s big dataset release means for drug discovery:
…RxRx1 dataset designed to encourage “machine learning on large biological datasets to impact drug discovery and development”…
Recursion Pharmaceuticals, a company that uses AI for drug discovery, has released RxRx1, a 296GB dataset consisting of 125,510 images across 1,108 classes. It’s an ImageNet-scale dataset, except instead of containing pictures of cats and dogs it contains pictures of human cells, to help scientists train AI systems to observe patterns across them and generate insights for drug development.
The challenges of biology: Biological datasets can be challenging for image recognition algorithms due to variation across cell samples, and other factors present during data sampling, such as temperature, humidity, reagent concentration and so on. RxRx1 contains data from 51 instances of the same experiment, which should help scientists develop algorithms that are robust to the changes across experiments, and are thus able to learn underlying patterns in the data.
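As an illustration of what such robustness work can look like (a simple baseline, not Recursion’s method): normalizing each image channel with statistics computed separately per experiment strips out some of the batch-to-batch variation.

```python
# A minimal sketch, assuming images arrive as (n_images, channels, H, W)
# float arrays grouped by experiment: per-experiment, per-channel
# normalization is a simple baseline for reducing batch effects across
# the 51 instances of the experiment.
import numpy as np

def normalize_per_experiment(images: np.ndarray) -> np.ndarray:
    """images: all images from ONE experiment, shape (n, channels, H, W)."""
    mean = images.mean(axis=(0, 2, 3), keepdims=True)       # per-channel mean
    std = images.std(axis=(0, 2, 3), keepdims=True) + 1e-8  # per-channel std
    return (images - mean) / std

batch = np.random.rand(16, 6, 512, 512).astype(np.float32)  # RxRx1 images are 6-channel
print(normalize_per_experiment(batch).shape)
```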
What parts of AI research could RxRx1 help with? Recursion has three main ideas:
- Generalization: The dataset is useful for refining techniques like transfer learning and domain adaptation.
- Context Modeling: Each RxRx1 image ships with detailed metadata, so researchers can experiment with this as an additional form of signal.
- Computer Vision: RxRx1 “presents a very different data distribution than is found in most publicly available imaging datasets,” Recursion writes. “These differences include the relative independence of many of the channels (unlike RGB images) and the fact that each example is one of a population of objects treated similarly as opposed to singletons.”
Why this matters: We’re entering an era where people will start to employ large-scale machine learning to revolutionize medicine; tracking usage of datasets like RxRx1 and the results of a planned NeurIPS 2019 competition will help give us a sense of progress here and what it might mean for medicine and drug design.
Read more: RxRx1 official website (RxRx.ai).
####################################################
Why AI could leave people with disabilities behind:
…Think bias is a problem now? Wait until systems are deployed more widely…
Researchers with Microsoft and the Human-Computer Interaction Institute at Carnegie Mellon University have outlined how people with disabilities could be left behind by AI advances. In a position paper, they argue that people with disabilities could have trouble accessing the benefits of AI systems due to issues of fairness and bias inherent to machine learning, and they propose a research agenda to help remedy these shortcomings. The agenda contains four key activities:
- Identify ways in which inclusion issues for people with disabilities could impact AI systems
- Test inclusion hypotheses to understand failure scenarios
- Create benchmark datasets to support replication and inclusion
- Develop new modeling, bias mitigation, and error measurement techniques
It’s all about representation: So, how might we expect AI systems to fail for people with disabilities? The authors survey current systems and provide some ideas. Spoiler alert: mostly, these systems will fail for people with disabilities because they will have been designed by people who are neither disabled nor educated about the needs of people with disabilities.
- Computer Vision: It’s likely that facial recognition will not work well for people with differences in facial features and expressions (eg, people with Down’s syndrome) not anticipated by system designers; facial recognition may also fail for blind people, who may have differences in eye anatomy or be wearing medical or cosmetic aids. For similar reasons, we can expect systems designed to recognize certain body types to fail for some people. Additionally, object/scene/text recognition systems are likely to break more frequently for poorly sighted people, as the pictures they take are very different from those taken by sighted people.
- Speech Systems: Speech recognition systems won’t work well for people who have speech disabilities; we may also need more granular metrics beyond things like Word Error Rate (a minimal sketch of how WER is computed follows this list) to best model how well systems work for different people. Similarly, speaker analysis systems will need to be trained on different datasets to accurately process speech from people with disabilities.
- Text Analysis: These systems will need to be designed to correct for errors that emerge under certain disabilities (for instance, dyslexia), and will need to account for people who write in different emotional registers than typical people.
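For reference, Word Error Rate, the speech metric mentioned above, is the word-level edit distance between a reference transcript and the system’s output, divided by the reference length. A minimal sketch:

```python
# A minimal sketch of Word Error Rate (WER), the standard speech-recognition
# metric the authors argue may be too coarse: edit distance between the
# reference and hypothesis word sequences, divided by the reference length.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance (substitutions, insertions, deletions).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("turn on the lights", "turn of the light"))  # 0.5
```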
Why this matters: AI is an accelerant and a magnifier of whatever context it is deployed in, due to the scale at which it operates, the number of automatic judgements it makes, and the increasingly comprehensive deployment of AI-based techniques across society. Therefore, if we don’t think very carefully about how AI may or may not ‘see’ or ‘understand’ certain types of people, we could harm people or cut them off from accessing its benefits. (On the – extremely minor – plus side, this research suggests that people with disabilities may be harder to surveil than other people, for now.)
Read more: Toward Fairness in AI for People with Disabilities: A Research Roadmap (Arxiv).
####################################################
‘Visus’ software provides quality assurance for model training:
…Now that models can design themselves, we need software to manage this…
Researchers with New York University have developed Visus, software that makes it easier for people to build models, evolve models, and manage the associated data processing pipelines needed to train them. It’s a tool that represents the broader industrialization of the AI community, and prefigures larger uses of ML across society.
What is Visus? The software gives AI developers a software interface that lets them define a problem, explore summaries of the input dataset, augment the data, and then explore and compare different models according to their performance scores and prediction outputs. The software is presented via a nicely designed user interface, making it more approachable than tools solely accessible via the command line.
What can it do? What can’t it do! Visus is ‘kitchen sink software’, in the sense that it contains a vast number of features for tasks like exploratory data analysis, problem specification, data augmentation, model generation and selection, and confirmatory data analysis.
Example use case: The researchers outline a hypothetical example where the New York City Department of Transportation uses Visus to figure out policies it could enact to reduce traffic fatalities. Here, they’d first use Visus to analyze a dataset about traffic collisions, then select a variable in the dataset (for instance, number of collisions) that they want to predict, then ask Visus to perform a model search (otherwise known as ‘AutoML’), in which it tries to find appropriate machine learning models for the objective. Once it comes up with models, the user can also augment the underlying dataset, then iterate on model design and selection again.
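Visus’s internals aren’t spelled out here, so the following is a hypothetical sketch of what the model-search step looks like, using scikit-learn: score several candidate models by cross-validation and keep the best one.

```python
# A hypothetical sketch of simple AutoML-style model search (illustrative
# only, not Visus's actual API): try several candidate regressors on a
# stand-in dataset and keep the one with the best cross-validated score.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a traffic-collisions table.
X, y = make_regression(n_samples=500, n_features=10, noise=0.5, random_state=0)

candidates = [Ridge(),
              RandomForestRegressor(random_state=0),
              GradientBoostingRegressor(random_state=0)]
best = max(candidates, key=lambda m: cross_val_score(m, X, y, cv=5).mean())
print(type(best).__name__)
```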
Why this matters: Systems like ‘Visus’ are part of the industrialization of AI: they take a bunch of incredibly complicated things, like data augmentation and model design and analysis, and port them into more user-friendly software packages that broaden the number of people able to use such systems. This is like shifting away from artisanal, individualized production to repeatable, system-based production. The outcome of the adoption of tools like Visus will be more people using more AI systems across society – which will further change society.
Read more: Visus: An Interactive System for Automatic Machine Learning Model Building and Curation (Arxiv).
####################################################
AI Policy with Matthew van der Merwe:
…Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net…
Collective action problems for safe AI:
In many industries, profit-seeking firms are incentivised to invest in product safety. This is generally because they have internalised the costs of safety failures via regulation, liability, and consumer behaviour. Consider the cost to a car manufacturer of a critical safety failure: they will have to recall the product, they will be liable to fines and litigation, and they will suffer reputational damage. AI firms are subject to these incentives, but the incentives appear to be weaker. AI products are difficult for manufacturers, consumers, and regulators to assess for safety; it is difficult to construct effective regulation; and many of the potential harms might be hard to internalise.
Competition: Another special feature of AI development is the possibility of discontinuous and/or very rapid progress. If firms believe this, they likely believe there are significant payoffs to the first firm to make a particular breakthrough or to ‘pull ahead’ of competitors. This increases the costs of investing in safety by increasing the expected benefits of faster development. This assumption may not hold true, which would make the situation more benign, but it is important to consider what this ‘worst case’ means for responsible development.
Cooperation: A simple model of this problem is a two-player game, where two firms face a decision to cooperate (maintain some level of investment in safety) or defect (fail to maintain this level). This allows us to see factors that can increase the likelihood of cooperation, by making it rational for each firm to do so: high trust that others will cooperate; shared upside from mutual cooperation; shared downside from mutual defection; smaller benefits to not reciprocating cooperation; and lower costs to unreciprocated cooperation.
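To make this concrete, here is a toy version of that game with illustrative payoff numbers (not taken from the paper); with these particular payoffs, defecting is each firm’s best response regardless of what the other does, which is exactly the failure mode the authors want to defuse:

```python
# A toy two-player safety game (payoff numbers are illustrative, not from
# the paper). Each firm Cooperates (maintains safety investment) or
# Defects; changing the payoffs changes whether cooperation is rational.
payoffs = {  # (my_action, their_action) -> my payoff
    ("C", "C"): 3,  # shared upside from mutual cooperation
    ("C", "D"): 0,  # cost of unreciprocated cooperation
    ("D", "C"): 4,  # benefit of not reciprocating
    ("D", "D"): 1,  # shared downside from mutual defection
}

def best_response(their_action: str) -> str:
    """A firm's best reply given what it expects the other firm to do."""
    return max(["C", "D"], key=lambda a: payoffs[(a, their_action)])

# With these payoffs, defection dominates: prints "D D".
print(best_response("C"), best_response("D"))
```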
Four strategies: This analysis can help identify strategies for increasing cooperation on responsible development: dispelling incorrect beliefs about responsible AI development; promoting inter-firm collaboration on projects; opening AI development to appropriate oversight and feedback; and creating stronger incentives for safe practices.
Read more: The Role of Cooperation in Responsible AI Development (arXiv).
Read more: Why Responsible AI Development Needs Cooperation on Safety (OpenAI Blog).
Further thoughts on the project from corresponding author Dr Amanda Askell (Twitter).
####################################################
Tech Tales:
Stop-Start Computing
And so after the Climate Accords and the Generational Crime Rulings and the loss of some 20% of the world’s land surface to a combination of heat and/or flooding, after all of this, society carried on. We were hot and we were sick and there were way too many of us, but we carried on.
We kept on moving little and big chunks of mass around on planet earth, and as we moved this stuff we mixed up the atmosphere and the underground and we changed our air and made it worse, but we carried on.
And all through this we used our computers. We used our phones to watch old movies of ‘the times before’. We listened to music from prior decades. We played games in which the planet was covered in forests, or where we were neanderthals playing with axes in a kind of wilderness, or ones where we rode out into space and managed vast interstellar armies. Our simulations and our software and our entertainment got better and better and so we used it more and more, and we carried on.
Everything has a breaking point. At some point computers started using so much energy that even with central planning and the imposition of controls, electrical utilities couldn’t keep up. Thirty percent of the electricity in some countries went to computers. In some smaller countries based around high-tech services, it was even higher. Data centers found themselves periodically running on backup generators – old salvaged WW2 diesel engines from submarines – and sometimes the power ran out entirely and these big computer cathedrals stood idle, mute blocks surrounded by farmland or forest or high-altitude steppes and deserts.
So after we hit our limit we created the coins as part of the Centrally Managed Sustainable Compute Initiative. We were meant to call them ‘compute tokens’ but everyone called them coins, and we were meant to call the computation power we exchanged these coins for the Shared Societal Computer but everyone just called it the timeshare.
So now here’s how it works:
- If you’re poor, you use a coin and you access lumps of computation and storage, rationed out according to the complex interplay of heat and consumption and climate.
- If you’re rich, you spend extra for Premium Compute Credits.
- If you’re ultrarich, you build yourself a powerplant or better yet something renewable – geothermal or wind or solar. Then you build your facility and you use that computation for yourself.
Private data centers will be outlawed soon, people say. There’s talk of using all of the compute left in the world to save the world – something about simulating the impossible complexity of the earth, and finding a way to carry on.
Things that inspired this story: The energy consumption of Bitcoin and large-scale AI models; climate change; inevitability.