Import AI 143: Predicting car accident risks by looking at the houses people live in; why data matters as much as compute; and using capsule networks to generate synthetic data

by Jack Clark

Predicting car accident risks from Google Street View images:
The surprising correspondences between different types of data…
Researchers with the University of Warsaw and Stanford University have shown how to use pictures of people’s houses to better predict the chances of that person getting into a car accident. (Import AI administrative note – standard warnings about ‘correlation does not imply causation’ apply).

For the project, the researchers analyzed 20,000 addresses of insurance company clients – a random sample of an insurer’s portfolio collected in Poland between January 2012 and December 2015. For each address, they collect an overhead Google satellite view and a Google Street View image of the property, and humans then annotate the images with labels relating to the type of property, age, condition, estimated wealth of its residents, along with the type and density of buildings in the neighborhood. They subsequently test these variables and find that five of the seven are significant with regard to the insurance prediction problem.

  “Despite the high volatility of data, adding our five simple variables to the insurer’s model improves its performance in 18 out of 20 resampling trials and the average improvement of the Gini coefficient is nearly 2 percentage points,” they write.
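
For readers unfamiliar with the metric: in risk scoring, the Gini coefficient of a model that ranks binary outcomes is commonly computed as 2 × AUC − 1. The sketch below assumes that definition (the paper may compute its Gini differently, for example from a Lorenz curve over claims), and the function names and toy data are hypothetical rather than taken from the paper.

```python
def auc(labels, scores):
    """Chance that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient under the 2 * AUC - 1 convention (an assumption here)."""
    return 2.0 * auc(labels, scores) - 1.0


# Toy data: 1 = policyholder had an accident, 0 = no accident.
had_accident = [0, 0, 1, 0, 1, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.2, 0.8, 0.3]
toy_gini = gini(had_accident, predicted_risk)
```

On this scale a model that ranks every accident above every non-accident scores 1, and a model that cannot separate them scores 0, so "nearly 2 percentage points" corresponds to a change of roughly 0.02.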

Ultimately, they show that – to a statistically significant extent – “features visible on a picture of a house can be predictive of car accident risk, independently from classically used variables such as age, or zip code”.

Why this matters: Studies like this speak to the power of large-scale data analysis, highlighting how data that is innocuous at the level of the individual can become significant when compared and contrasted with a vast amount of other data. The researchers acknowledge this, noting that:  “modern data collection and computational techniques, which allow for unprecedented exploitation of personal data, can outpace development of legislation and raise privacy threats”.
  Read more: Google Street View image of a house predicts car accident risk of its resident (Arxiv).

#####################################################

Your next pothole could be inspected via drone:
…Drones + NVIDIA cards + smart algorithms = automated robot inspectors…
Researchers with HKUST Robotics Institute have created a prototype drone system that can be used to automatically analyze a road surface. The project sees the researchers develop a dense stereo vision algorithm which the UAV uses to analyze the road surface. They’re able to use this algorithm to process road images on the drone in real-time, automatically identifying surface-area disparities.

Hardware: To accomplish this, they use a ZED stereo camera mounted on a DJI Matrice 100 drone, which itself has a JETSON TX2 GPU installed onboard for real-time processing.
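
As a rough illustration of why a dense disparity map is useful for road inspection: for a rectified stereo pair, depth is focal length times baseline divided by disparity, so a dip in the road shows up as a local change in disparity. The sketch below is not the researchers' algorithm; the camera constants, the median-based check, and all function names are assumptions made for illustration.

```python
import math
import statistics


def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair."""
    if disparity_px <= 0:
        return math.inf  # no valid match, or a point at infinity
    return focal_length_px * baseline_m / disparity_px


def flag_depth_outliers(depths_m, tolerance_m):
    """Indices whose depth differs from the median by more than tolerance_m."""
    finite = [d for d in depths_m if math.isfinite(d)]
    reference = statistics.median(finite)
    return [i for i, d in enumerate(depths_m) if abs(d - reference) > tolerance_m]


# Illustrative constants, not values from the paper.
FOCAL_LENGTH_PX = 700.0
BASELINE_M = 0.12

# One row of a disparity map from a downward-looking camera (toy values).
disparity_row = [42, 42, 40, 42, 42]
depths = [disparity_to_depth(d, FOCAL_LENGTH_PX, BASELINE_M) for d in disparity_row]
suspect_pixels = flag_depth_outliers(depths, tolerance_m=0.05)
```

Here the middle pixel has a smaller disparity, which means it is farther from the camera than its neighbours, so it is the one flagged as a possible surface defect.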

Why this matters: AI approaches make it cheap for robots to automatically sense and analyze aspects of the world, and experiments like this suggest that we’re rapidly approaching the era when we’ll start to automate various types of surveillance (both for civil and military purposes) via drones.
  Read more: Real-Time Dense Stereo Embedded in a UAV for Road Inspection (Arxiv).
  Get the datasets used in the experiment here (Rui Fan, HKUST, personal website).
  Check out a video of the drone here (Rui Fan, YouTube).

#####################################################

Train AI to watch over the world with the iWildCam dataset:
…Monitoring the planet with deep learning-based systems…
Researchers with the California Institute of Technology have published the iWildCam dataset to help people develop AI systems that can automatically analyze wildlife seen in camera traps spread across the American Southwest. They’ve also created a challenge based around the dataset, letting researchers compete in developing AI systems capable of automatically monitoring the world.

Testing generalization: “If we wish to build systems that are trained once to detect and classify animals, and then deployed to new locations without further training, we must measure the ability of machine learning and computer vision to generalize to new environments,” the researchers write.

Common nuisances: There are six problems relating to the data gathered from the traps: variable illumination, motion blur, size of the region of interest (eg, an animal might be small and far away from the camera), occlusion, camouflage, and perspective.

iWildCam: The images come from cameras installed across the American Southwest, consisting of 292,732 images spread between 143 locations. iWildCam is designed to capture the complexities of the datasets that human biologists need to deal with: “therefore the data is unbalanced in the number of images per location, distribution of species per location, and distribution of species overall”, they write.
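
One way to measure the kind of generalization the researchers describe is to hold out whole camera locations, so that no test image comes from a place seen during training. The sketch below is a generic illustration and not the challenge's official split; the record format, the "location" key, and the function name are hypothetical.

```python
import random


def split_by_location(records, test_fraction=0.2, seed=0):
    """Split records so that train and test share no camera location."""
    locations = sorted({r["location"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(locations)
    # Hold out at least one location, even for very small datasets.
    n_test = max(1, round(len(locations) * test_fraction))
    test_locations = set(locations[:n_test])
    train = [r for r in records if r["location"] not in test_locations]
    test = [r for r in records if r["location"] in test_locations]
    return train, test


# Toy records: five locations with an uneven number of images per location.
toy_records = [
    {"image": f"img_{loc}_{i}.jpg", "location": loc}
    for loc, count in [(1, 4), (2, 1), (3, 7), (4, 2), (5, 3)]
    for i in range(count)
]
train_set, test_set = split_by_location(toy_records)
```

Because the split is made over locations rather than images, an unbalanced dataset like this one can give a test set that is much larger or smaller than the nominal fraction, which is worth checking before training.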

Why this matters: Datasets like this – and AI systems built on top of them – will be fundamental to automating the observation and analysis of the world around us; given the increasingly chaotic circumstances of the world, it seems useful to be able to have machines automatically analyze changes in the environment for us.
   Read more: The iWildCam 2018 Challenge Dataset (Arxiv).
   Get the dataset: iWildCam 2019 challenge (GitHub).

#####################################################

Compute may matter, but so does data, says Max Welling:
…“The most fundamental lesson of ML is the bias-variance tradeoff”…
A few weeks ago Richard Sutton, one of the pioneers of reinforcement learning, wrote a post about the “bitter lesson” of AI research (Import AI #138), namely that it is better to focus on techniques which use huge amounts of computation and relatively simple algorithms. Now, Max Welling, a researcher with the University of Amsterdam, has written a response claiming that data may be just as important as compute.

  “The most fundamental lesson of ML is the bias-variance tradeoff: when you have sufficient data, you do not need to impose a lot of human generated inductive bias on your model,” he writes. “However, when you do not have sufficient data available you will need to use human-knowledge to fill the gaps.”
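
A small worked example can make the tradeoff concrete. The sketch below is not from Welling's post: it assumes the simplest possible setting, estimating a mean from noisy samples, and compares the plain sample mean with an estimator that leans on an imperfect human guess. The function name and all numbers are illustrative.

```python
def mse_of_shrunk_mean(true_mean, prior_guess, noise_std, n_samples, prior_weight):
    """Expected squared error of prior_weight * prior_guess + (1 - prior_weight) * sample_mean.

    Assumes i.i.d. samples with the given noise standard deviation.
    prior_weight = 0 recovers the plain sample mean (no inductive bias).
    """
    bias = prior_weight * (prior_guess - true_mean)
    variance = (1.0 - prior_weight) ** 2 * noise_std ** 2 / n_samples
    return bias ** 2 + variance


# The human prior (0.8) is close to, but not equal to, the truth (1.0).
settings = dict(true_mean=1.0, prior_guess=0.8, noise_std=1.0)

small_data_with_prior = mse_of_shrunk_mean(n_samples=5, prior_weight=0.5, **settings)
small_data_no_prior = mse_of_shrunk_mean(n_samples=5, prior_weight=0.0, **settings)
big_data_with_prior = mse_of_shrunk_mean(n_samples=1000, prior_weight=0.5, **settings)
big_data_no_prior = mse_of_shrunk_mean(n_samples=1000, prior_weight=0.0, **settings)
```

With five samples the biased estimator has the lower error because the prior cuts variance; with a thousand samples the variance is already tiny, so the prior's bias dominates and the unbiased estimator wins.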

Self-driving cars are a good example of a place where compute can’t solve most problems, and you need to invest in injecting stronger priors (eg, an understanding of the physics of the world) into your models, Welling says. He also suggests generative models could help fill in some of these gaps, especially when it comes to generalization.

Ultimately, Welling ends up somewhere between the ‘compute matters’ versus the ‘strong priors matter’ (eg, data) arguments. “I would say if we ever want to solve Artificial General Intelligence (AGI) then we will need model-based RL,” he writes. “We cannot answer the question of whether we need human designed models without talking about the availability of data.”

Why this matters: There’s an inherent tension in AI research between bets that revolve predominantly around compute and those that revolve around data. That’s likely because different bets encourage different research avenues and different specializations. I do worry about a world where people that do lots of ‘big compute’ experiments end up speaking a different language to those without, leading to different priors when approaching the question of how much computation matters.
  Read more: Do we still need models or just more data and compute? (Max Welling, PDF).

#####################################################

Want to train AI on something but don’t have much data? There’s a way!
…Using Capsule Networks to generate synthetic data…
Researchers with the University of Moratuwa want to be able to teach machines to recognize handwritten characters using very small amounts of data, so have implemented an approach based on Capsule Networks – a recently-proposed technique promoted by deep learning pioneer Geoff Hinton – that lets them learn to classify handwritten letters from as few as 200 examples.

The main way they achieve this is by synthetically augmenting these small datasets by using some of the idiosyncratic traits of capsule networks – namely, their ability to learn data representations that are more robust to transforms, as a consequence of their technical implementation of things like ‘routing by agreement‘. The researchers use these traits to directly manipulate the sorts of data representations being produced on exposure to the data to algorithmically generate handwritten letters that look similar to those in the training dataset, but are not identical; this generates additional data that the system can be trained on, without needing to collect more data from (expensive!) reality.

“By adding a controlled amount of noise to the instantiation parameters that represent the properties of an entity, we transform the entity to characterize actual variations that happen in reality. This results in a novel data generation technique, much more realistic than augmenting data with affine transformations,” they write. “The intuition behind our proposed perturbation algorithm is that by adding controlled random noise to the values of the instantiation vector, we can create new images, which are significantly different from the original images, effectively increasing the size of the training dataset”.
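
The perturbation idea can be sketched in a few lines. This is only an illustration of "adding controlled random noise to the values of the instantiation vector": the uniform noise, the noise bound, and the function names are assumptions, and the capsule network's decoder (which would turn each perturbed vector back into an image) is deliberately left out.

```python
import random


def perturb_instantiation(vector, max_noise, rng):
    """Return a copy of the vector with bounded uniform noise on each value."""
    return [v + rng.uniform(-max_noise, max_noise) for v in vector]


def generate_variants(vector, n_variants, max_noise=0.05, seed=0):
    """Produce several perturbed copies of one instantiation vector."""
    rng = random.Random(seed)
    return [perturb_instantiation(vector, max_noise, rng) for _ in range(n_variants)]


# A made-up 4-dimensional instantiation vector for one handwritten character.
original = [0.12, -0.30, 0.45, 0.08]
variants = generate_variants(original, n_variants=3)
```

In a full pipeline each variant would be passed through the trained decoder to render a new image; the noise bound controls how far the generated characters drift from the original sample.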

How well does it work? The researchers test their approach by evaluating how well their system, TextCaps, can learn to classify images when trained on full datasets and 200-sample-size datasets from EMNIST, MNIST and the much more visually complex Fashion MNIST; TextCaps is able to exceed state-of-the-art when trained on full data of three variants of EMNIST and gets close to this using just 200 samples, and approaches SOTA on MNIST and Fashion MNIST (though does very badly on Fashion MNIST when using just 200 samples, likely because of the complexity).

Why this matters: Approaches like this show how as we develop increasingly sophisticated AI systems we may be able to better deal with some of the limitations imposed on us by reality – like a lack of large, well-labeled datasets for many things we’d like to use AI on (for instance: learning to spot and classify numerous handwritten languages for which there are relatively few digitized examples). “We intend to extend this framework to images on the RGB space, and with higher resolution, such as images from ImageNet and COCO. Further, we intend to apply this framework on regionally localized languages by extracting training images from font files,” they write.
  Read more: TextCaps: Handwritten Character Recognition with Very Small Datasets (Arxiv).
  Read more: Understanding Hinton’s Capsule Networks (Medium).
  Read more: How Capsules Work (Medium).
  Read more: Understanding Dynamic Routing between Capsules (Capsule Networks explainer on GitHub).

#####################################################

Want to test language progress? Try out SuperGLUE:
…Step aside GLUE – you were too easy!…
Researchers with New York University have had to toss out a benchmark they developed last year and replace it with a harder one, due to the faster-than-expected progress in certain types of language modelling. The ‘SuperGLUE’ benchmark is a sequel to GLUE and has been designed to include significantly harder tasks than those which were in GLUE.

New tasks to frustrate your systems: Tasks in SuperGLUE include: CommitmentBank, where the goal is to judge how committed an author is to a specific clause within a sentence; the Choice of Plausible Alternatives (COPA), in which the goal is to pick the more likely sentence given two options; the Gendered Ambiguous Pronoun Coreference Task (GAP), where systems need to ‘determine the referent of an ambiguous pronoun’; the Multi-Sentence Reading Comprehension dataset, a true-false question-answering task; RTE, a textual-entailment task which was in GLUE 1.0; WIC, which challenges systems to do disambiguation; and the Winograd Schema Challenge, which is a reading comprehension task designed to specifically test for world modeling or the lack of it (eg, systems that think large objects can go inside small objects, and vice versa).

PyTorch toolkit: The researchers plan to release a toolkit based on PyTorch and software from AllenNLP which will include pretrained models like OpenAI GPT and Google BERT, as well as designs to enable rapid experimentation and prototyping. As with GLUE, there will be an online leaderboard that people can compete on.

Why this matters: Well-designed benchmarks are one of the best tools we have available to us to help judge AI progress, so when benchmarks are rapidly obviated via progress in the field it suggests that the field is developing quickly. The researchers believe SuperGLUE is sufficiently hard that it’ll take a while to solve, so think “there is plenty of space to test new creative approaches on a broad suite of difficult NLP tasks with SuperGLUE.”
  Read more: Introducing SuperGLUE: A New Hope Against Muppetkind (Medium).
  Read more: SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems (PDF).

#####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe has kindly offered to write some sections about AI & Policy for Import AI. I’m (lightly) editing them. All credit to Matthew, all blame to me, etc. Feedback: jack@jack-clark.net

European Commission releases pilot AI ethics guidelines:
Last year, the European Commission announced the formation of the High-Level Expert Group on AI, a core component of Europe’s AI strategy. The group released draft ethics guidelines in December (see Import AI #126), and embarked on a consultation process with stakeholders and member states. This month they released a new draft, and will be running a pilot program through 2019.

   Key requirements for trustworthy AI: The guidelines lay out 7 requirements: Human agency and oversight; Technical robustness and safety; Privacy and data governance; Transparency; Diversity, non-discrimination and fairness; Societal and environmental wellbeing; Accountability.

  International guidelines: The report makes clear the Commission’s ambition to play a leading role in developing internationally-agreed AI ethics guidelines.

  Why it matters: The foregrounding of AI safety (‘technical robustness and safety’ in the language of the guidelines) is good news. The previous draft revealed that long-term concerns had proved highly controversial amongst the experts, and asked specifically for consultation input on these issues. This latest draft suggests that the public and other stakeholders take these concerns seriously.
  Read more: Communication – Building Trust in Human Centric AI (EC).

Microsoft refuses to sell face recognition due to human rights concerns:
In a talk at Stanford, Microsoft President Brad Smith described recent deals Microsoft had declined due to ethical concerns. He revealed that the company refused to provide face recognition technology to a California law enforcement agency. Microsoft concluded the proposed roll-out would have disproportionately impacted women and ethnic minorities. The company also declined a deal with a foreign country to install face recognition across the nation’s capital, due to concerns that it would have suppressed freedom of assembly.
  Read more: Microsoft turned down facial-recognition sales on human rights concerns (Reuters)

#####################################################

Tech Tales:

Until Another Dream

I get up and I hunt down the things that are too happy or too sad and I take them out of the world. This is a civil-general world and by decree we cannot have extremes. So I take their broken shapes with me and I put them in a chest in the basement of my simulated castle. Then I take my headset off and I go to my nearby bar and the barman calls me “dreamkiller” as his way of being friendly.
What dreams did you kill today, dreamkiller?
You still dream about that butterfly with the face of a kitten you whacked?
Ever see any more of those sucking-face spiders?
What happened to the screaming paving slabs, anyway?
You get the picture.

The thing about today is everyone is online and online is full of so much money that it’s just like real life: most people don’t see the most extreme parts of it, and by a combination of market pressures and human preferences, some people get paid to either erase the extremes or hide them away.

After the bar I go home and I get into bed and my muscle memory has me pick up the headset and have it almost on my head before my conscious brain kicks in – what some psychologists call The Supervisor. “Do I really want to do this?” my supervisor asks me. “Why not go to bed?”

I don’t answer myself directly, instead I slide the headset on, turn it on, and go hunting. There have been reports of unspeakably cute birds carrying wicker baskets containing smaller baby birds in the south quadrant. Meanwhile up in the north there’s some kind of parasite that eats up the power sub-systems of the zones, projecting worms onto all the simulated telescreens.

My official job title is Reality Harmonizer and my barman calls me Dreamkiller and I don’t have a name for myself: this is my job and I do it not willingly, but because my own tastes and habits compel me to do it. I have begun to wonder if real-life murderers and murder-police are themselves people that take off their headsets at night and go to bars. I have begun to wonder whether they themselves find themselves in the middle of the night choosing between sleep and a kind of addictive duty. I believe the rules change when fairytales are real.

Things that inspired this story: MMOs; the details change but the roles are always the same; detectives; noir; feature-space.