Import AI: #91: European countries unite for AI grand plan; why the future of AI sensing is spatial; and testing language AI with GLUE.

by Jack Clark

Want bigger networks with lower variance? Physics to the rescue!
…Combining control theory and machine learning leads to good things…
Researchers with NNAISENSE, a European artificial intelligence startup, have published details on NAIS-Net (Non-Autonomous Input-Output Stable Network), a new type of neural network architecture that they say can be trained to depths ten to twenty times greater than other networks (e.g., Residual Networks, Highway Networks) while offering greater guarantees of stability.
  Physics + AI: The network design takes inspiration from control theory and physics, yielding a component that lets designers build systems which adapt better to varying types of input data and can therefore be trained closer to convergence on a given task. NAIS-Nets essentially shrink the size of the dartboard that the results of any given run will fall into once trained to completion, offering the potential for lower variability and therefore higher repeatability in network training.
  Scale: “NAIS-Nets can also be 10 to 20 times deeper than the original ResNet without increasing the total number of network parameters, and, by stacking several stable NAIS-Net blocks, models that implement pattern-dependent processing depth can be trained without requiring any normalization,” the researchers write.
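  A rough sketch: To make the idea concrete, here’s a hypothetical PyTorch rendering of a fully-connected NAIS-Net-style block, based on my reading of the paper rather than the authors’ code. The block repeatedly applies a residual-style update, re-injects the original block input at every step (the “non-autonomous” part), and constrains the state-transition matrix to be negative definite, which is where the stability guarantee comes from.

```python
# Hypothetical sketch of a NAIS-Net-style block; not the authors' reference code.
import torch
import torch.nn as nn

class NAISNetBlock(nn.Module):
    def __init__(self, dim, n_steps=5, h=0.1, eps=0.01):
        super().__init__()
        self.R = nn.Parameter(0.1 * torch.randn(dim, dim))  # parametrizes A = -R^T R - eps*I
        self.B = nn.Linear(dim, dim)  # input (skip) connection, applied at every step
        self.n_steps, self.h, self.eps = n_steps, h, eps

    def forward(self, u):
        # A is negative definite by construction; constraints of this kind are what
        # underpin the paper's input-output stability guarantees.
        A = -self.R.t() @ self.R - self.eps * torch.eye(self.R.shape[0], device=self.R.device)
        x = torch.zeros_like(u)
        for _ in range(self.n_steps):
            # Residual-style update that re-injects the *same* block input u at each
            # step, making the dynamics non-autonomous.
            x = x + self.h * torch.tanh(x @ A + self.B(u))
        return x
```

  Stacking several such blocks gives the pattern-dependent processing depth described above, since each block can unroll its internal iteration without adding new parameters.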
  Results: In tests on CIFAR-100 the researchers find that a NAIS-Net can roughly match the performance of a residual network but with significantly lower variance. The architecture hasn’t yet been tested on ImageNet, though, which is larger and closer to the gold standard for evaluating such models.
  Why it matters: One of the problems with current AI techniques is that we don’t really understand how they work at a deep and principled level; this shows up empirically in the fairly poor guarantees we can offer about variance, generalization, and performance tradeoffs during compression. Approaches like NAIS-Nets seem to reduce our uncertainty in some of these areas, suggesting we’re getting better at designing systems with a sufficiently rich mathematical justification that we can offer better guarantees about some of their performance parameters. This is further indication that we’re getting better at creating systems we can understand and make stronger prior claims about, which seems like a necessary foundation from which to build more elaborate systems in the future.
  Read more: NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations (Arxiv).

European countries join up to ensure the AI revolution doesn’t pass them by:
…the EU AI power bloc emerges as countries seek to avoid what happened with cloud computing…
Twenty-five European countries have signed a letter indicating intent to “join forces” on developing artificial intelligence. What the letter amounts to is a good-faith promise from each of the signatories that they will attempt to coordinate with each other as they carry out their respective national development programs.
  “Cooperation will focus on reinforcing European AI research centers, creating synergies in R&D&I funding schemes across Europe, and exchanging views on the impact of AI on society and the economy. Member States will engage in a continuous dialogue with the Commission, which will act as a facilitator,” according to a prepared quote from European Commissioners Andrus Ansip and Mariya Gabriel.
  Why it matters: Both China and the US have structural advantages for the development of AI as a consequence of their scale (hundreds of millions of people speaking and writing in the same language) as well as their ability to carry out well-funded national research initiatives. Individual European countries can’t match these assets or levels of investment, so they’ll need to band together; otherwise, much as happened with cloud computing, they’ll end up without any major companies and will therefore lack political and economic influence in the AI era.
  Read more: EU Member States sign up to cooperate on Artificial Intelligence (European Commission).

Why the future of AI is Spatial AI, and what this means for robots, drones, and anything that senses the world:
…What does the current landscape of simultaneous localization and mapping algorithms tell us about the future of how robots will see the world?…
SLAM researcher Andrew Davison has written a paper surveying the current simultaneous localization and mapping (SLAM) landscape and predicting how it will evolve based on contemporary algorithmic trends. For real-world AI systems to achieve much of their promise they will need what he terms ‘Spatial AI’: the suite of cognitive-like abilities that machines will need to perceive and categorize the world around themselves so that they can act effectively. This hypothetical Spatial AI system will, he hypothesizes, be central to future real-world AI because it “incrementally builds and maintains a generally useful, close to metric scene representation, in real-time and from primarily visual input, and with quantifiable performance metrics”, allowing people to develop much richer AI applications.
  The gap between today and Spatial AI: Today’s SLAM systems are being changed by the arrival of learned methods to accompany hand-written rules for key capabilities, particularly in systems that build maps of the surrounding environment. The Spatial AI systems of the future will likely incorporate many more learned capabilities, especially for resolving ambiguity or predicting changes in the world, and will need to do this across a variety of different chip architectures to maximize performance.
  A global map born from many ‘Spatial AIs’: Once the world has a few systems with this kind of Spatial AI capability, they will also likely pool their insights about the world into a single, globally shared map, constantly updated by all of the devices that rely on it. This means that once a system identifies where it is, it may not need to do as much on-device processing, since it can pull contextual information from the cloud.
  What might such a device look like? Multiple cameras and sensors whose form factor will change according to the goal; for instance, “a future household robot is likely to have navigation cameras which are centrally located on its body and specialized extra cameras, perhaps mounted on its wrists to aid manipulation.” These cameras will maintain a world model that provides the system with a continuously updated location context, along with semantic information about the world around it. The system will also constantly check new information against a forward predictive scene model to help it anticipate and respond to changes in its environment. Computationally, these systems will label the world around themselves, track themselves within it, map everything into the same space, and perform self-supervised learning to integrate new sensory inputs. Ultimately, if the world model becomes good enough, the system will only need to sample information from its sensors that differs from what it predicted, letting it further optimize its own perception for efficiency.
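  A toy sketch: the “only process what surprises you” loop described above might look something like the following Python, where the world_model object and its methods (render, segment, update, localize) are entirely hypothetical stand-ins rather than anything proposed in the paper.

```python
# Toy, hypothetical sketch of a prediction-driven Spatial AI sensing loop.
import numpy as np

def spatial_ai_step(world_model, pose, frame, threshold=0.1):
    predicted = world_model.render(pose)            # forward prediction of what the camera should see
    surprise = np.abs(frame - predicted)            # per-pixel deviation from the prediction
    novel_mask = surprise > threshold               # only these regions need full processing
    labels = world_model.segment(frame, novel_mask)       # semantic labelling restricted to novel regions
    world_model.update(pose, frame, novel_mask, labels)   # fold new information back into the shared map
    new_pose = world_model.localize(frame)          # relocalize against the (possibly cloud-hosted) map
    return new_pose, novel_mask.mean()              # fraction of the frame that was "surprising"
```

  The design intuition is that as the world model improves, the “surprising” fraction shrinks, and with it the per-frame compute the device needs to spend on perception.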
  Testing: One tough question this idea provokes is how to assess the performance of such Spatial AI systems. SLAM benchmarks tend to be overly narrow or restrictive, with some researchers preferring instead to make subjective, qualitative assessments of SLAM progress. Davison suggests using benchmarks like SlamBench, which measure performance in terms of accuracy and computational cost across a range of processor platforms. Benchmarking SLAM performance is also highly contingent on the platform the SLAM system is deployed on, so assessments for the same system deployed on a drone or a robot are going to be different. In the future, it would be good to assess performance via a variety of objectives within the same system, like segmenting objects, tracking changes in the environment, evaluating power usage, measuring relocalization robustness, and so on.
  Why it matters: Papers like this provide a holistic overview of a given AI area. SLAM capabilities are going to be crucial to the deployment of AI systems in the real world. It’s likely that many contemporary AI components are going to be used in the SLAM systems of the future and, much like in other parts of AI research, the future design of such systems is going to be increasingly specialized, learned, and deployed on heterogeneous compute substrates.
  Read more: FutureMapping: The Computational Structure of Spatial AI Systems (Arxiv).

Machine learning luminary points out one big problem that we need to focus on:
…While we’re all getting excited about game-playing robots, we’re neglecting to build the systems needed to manage, support, and learn from millions of these robots once they are deployed in the world…
Michael Jordan, the Michael Jordan of machine learning, believes that we must create a new engineering discipline to let us deal with the challenges and opportunities of AI. Though there have been many successes in recent years in areas of artificial intelligence linked to mimicking human intelligence, less attention has been paid to the creation of the support infrastructure and data-handling techniques needed to allow AI to truly benefit society, he argues. For instance, consider healthcare, where there’s a broad line of research into using AI to improve specific diagnostic abilities, but less of a research culture around the problem of knitting together the data from all of these separately deployed medical systems and then tracking and managing that data in a way that is sensitive to privacy concerns but allows us to learn from its aggregate flows. Similarly, though much attention has been directed at self-driving cars, less has been focused on the need to create a new type of system, akin to air traffic control, to effectively manage the coming fleets of autonomous vehicles, where coordination will yield massive efficiencies.
  “Whether or not we come to understand “intelligence” any time soon, we do have a major challenge on our hands in bringing together computers and humans in ways that enhance human life. While this challenge is viewed by some as subservient to the creation of “artificial intelligence,” it can also be viewed more prosaically — but with no less reverence — as the creation of a new branch of engineering,” he writes. “The principles needed to build planetary-scale inference-and-decision-making systems of this kind, blending computer science with statistics, and taking into account human utilities, were nowhere to be found in my education.”
  Read more: Artificial Intelligence – The Revolution Hasn’t Happened Yet (Arxiv).
  Things that make you go ‘hmmm’: Mr Jordan thanks Jeff Bezos for reading an earlier draft of the post. If there’s any company well-placed to build a global ‘intelligent infrastructure’ that dovetails into the physical world, it’s Amazon.

New ‘GLUE’ competition tests limits of generalization for language models:
…New language benchmark aims to test models properly on diverse datasets…
Researchers from NYU, the University of Washington, and DeepMind have released the General Language Understanding Evaluation (GLUE) benchmark and evaluation website. GLUE provides a way to check a single natural language understanding AI model across nine sentence- or sentence-pair tasks, including question answering, sentiment analysis, similarity assessment, and textual entailment. This gives researchers a principled way to check a model’s ability to generalize across a variety of different tasks. Generalization tends to be a good proxy for how scalable and effective a given AI technique is, so being able to measure it in a disciplined way within language should spur development and yield insights about the nature of the problem, much as the DAWNBench competition has shown how to tune supervised classification algorithms against performance-critical criteria.
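  A minimal sketch: scoring a model on a GLUE-style benchmark amounts to evaluating the same model on every constituent task and reporting an aggregate. The snippet below assumes a hypothetical evaluate_on_task helper and reports an unweighted macro-average, which is a simplification of GLUE’s actual per-task metrics.

```python
# Hypothetical sketch of GLUE-style multi-task scoring; evaluate_on_task is a
# stand-in for your own per-task evaluation code, returning a score in [0, 100].
GLUE_TASKS = ["CoLA", "SST-2", "MRPC", "QQP", "STS-B", "MNLI", "QNLI", "RTE", "WNLI"]

def glue_score(model, evaluate_on_task):
    per_task = {task: evaluate_on_task(model, task) for task in GLUE_TASKS}
    macro = sum(per_task.values()) / len(per_task)  # unweighted average across tasks
    return macro, per_task
```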
  Difficult test set: GLUE also incorporates a deliberately challenging test set which is “designed to highlight points of difficulty that are relevant to model development and training, such as the incorporation of world knowledge, or the handling of lexical entailments and negation”. That should also spur progress, as it will help researchers spot the surprisingly dumb ways in which their models break down.
  Results: The researchers also implemented baselines for the competition by using a BiLSTM and augmenting it with sub-systems for attention and two recent research inventions, ELMo and CoVe. No model generalized particularly well when compared to a strong baseline of training a separate model for each task. A heavily simplified sketch of what such a baseline looks like follows below.
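  Baseline sketch: this is my own simplification rather than the authors’ baseline code; it encodes each sentence with a bidirectional LSTM, max-pools over time, and classifies the combined features. The paper’s baselines additionally use attention and pretrained ELMo/CoVe representations, which are omitted here.

```python
# Simplified BiLSTM sentence-pair baseline; illustrative only, not the GLUE authors' code.
import torch
import torch.nn as nn

class BiLSTMPairBaseline(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden=512, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(4 * 2 * hidden, n_classes)  # features: [a; b; |a-b|; a*b]

    def encode(self, tokens):
        out, _ = self.encoder(self.embed(tokens))  # (batch, seq, 2*hidden)
        return out.max(dim=1).values               # max-pool over time

    def forward(self, sent_a, sent_b):
        a, b = self.encode(sent_a), self.encode(sent_b)
        feats = torch.cat([a, b, (a - b).abs(), a * b], dim=-1)
        return self.classifier(feats)
```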
  Why it matters: One repeated pattern in science is that shared evaluation criteria and competitions drive progress as they bring attention to previously unexplored problems. “When evaluating existing models on the main GLUE benchmark, we find that none are able to substantially outperform a relatively simple baseline of training a separate model for each constituent task. When evaluating these models on our diagnostic dataset, we find that they spectacularly fail on a wide range of linguistic phenomena. The question of how to design general purpose NLU models thus remains unanswered,” they write. GLUE should motivate further progress here.
  Read more: GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (PDF).
  Check out the GLUE competition website and leaderboard here.

OpenAI Bits & Pieces:

AI and Public Policy: Congressional Testimony:
  I testified in Congress this week at the House Oversight Committee Subcommittee on Information Technology’s hearing on artificial intelligence and public policy. I was joined by Dr Ben Buchanan of Harvard’s Belfer Center, Terah Lyons of the Partnership on AI, and Gary Shapiro of the Consumer Technology Association. In my written testimony, oral testimony, and responses to questions, I discussed the need for the AI community to work on better norms to ensure the technology achieves maximal benefit, discussed ways to better support the development of AI (fund science and make it easy for everyone to study AI in America), and talked about the importance of AI measurement and forecasting schemes to allow for better policymaking and to protect against ignorant regulation.
  Watch the testimony here.
  Read my written comments here (PDF).
Things that make you go hmmmmm: One of the congresspeople played an audio clip of HAL 9000 refusing to open the pod bay doors from 2001: A Space Odyssey to illustrate some points about AI interpretability.

Tech Tales:

The World is the Map.
[Fragment of writing picked up by Grand Project-class autonomous data intercept program. Year: 2062]

There were a lot of things we could have measured during the development of the Grand Project, but we settled on its own map of the world, and we think that explains many of the subsequent quirks and surprises in its rapid expansion. We covered the world in sensors and fed them into it, giving it a fused, continuous understanding of the heartbeat of things, ranging from solar panels, to localized wind counts, to pedestrian traffic on every street of every major metropolis, to the inputs and outputs of facial recognition algorithms run across billions of people, and more. We fed this data into the Grand Project super-system, which spanned the data centers of the world, representing an unprecedented combination of public-private partnerships – private petri dishes of capitalist enterprises, big lumps of state-directed investments, discontinuous capital agglomerations from unpredictable research innovations, and so on.

The Grand Project system grew in its understanding and in its ability to learn to model the world from these inputs, abstracting general rules into dreamlike hallucinations of not just what existed, but what could also be. And in this dreaming of its own versions of the world the system started to imagine how it might further attenuate its billions of input datastreams to allow it to focus on particular problems and manipulate their parameters and in doing so improve its ability to understand their rhythms and build rules for predicting how they will behave in the future.

We created the first data intercept program ten years ago to let us see into its own predictions of the world. We saw variations on the common things of the world, like streetlights that burned green, or roads with blue pavements and red dashes. But we also saw extreme things: power systems configured to route only to industrial areas, leading to residential areas being slowly taken over by nature and thereby reducing risks from extreme weather events. But the broad distribution of things we saw seemed to fit our own notion of good so well that we started to wonder if we should give it more power. What if we let it change the world for real? So now we debate whether to cross this bridge: shall we let it turn solar panels off and on to satisfy continental-scale power grids, or optimize shipping worldwide, or experiment with the sensing capabilities of every smartphone and telecommunications hub on the planet? Shall we let it optimize things not just for our benefit, but for its own?

Things that inspired this story: Ha and Schmidhuber’s “World Models”, Spatial AI, reinforcement learning, Jorge Luis Borges’ Tlön, Uqbar, Orbis Tertius.