Import AI

Import AI 186: AI + Satellite Imagery; Votefakes!; Schmidhuber on AI’s past&present

AI + Satellites = Climate Change Monitoring:
…Deeplab v3 + Sentinel satellite ‘SAR’ data = lake monitoring through clouds…
Researchers with ETH Zurich and the Skolkovo Institute of Science and Technology have used machine learning to develop a system that can analyze satellite imagery of lakes and work out whether they’re covered in ice. This kind of capability is potentially useful when building AI-infused earth monitoring systems.

Why they did it: The researchers use synthetic aperture radar (SAR) data from the Sentinel-1 satellite. SAR is useful because it sees through cloud cover, so they can analyze lakes under variable weather conditions. “Systems based on optical satellite data will fail to determine these key events if they coincide with a cloudy period,” they write. “The temporal resolution of Sentinel-1 falls just short of the 2-day requirement of GCOS, still it can provide an excellent “observation backbone” for an operational system that could fill the gaps with optical satellite data”.

How they did it: The researchers paired the Sentinel satellite data with a Deeplab v3+ semantic segmentation network. They tested their approach on three lakes in Switzerland (Sils, Silvaplana, St. Moritz), using satellite data gathered during two separate winters (2016/17 and 2017/18). They obtain accuracy scores of around 95%, and find that the network does a reasonable job of identifying when lakes are frozen.
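
To make the recipe concrete, here is a minimal PyTorch sketch of this kind of pipeline, under some stated assumptions: torchvision ships DeepLab v3 rather than the v3+ variant the paper uses, the inputs are assumed to be two-channel Sentinel-1 tiles (VV and VH polarizations), and data loading is left out entirely.
```python
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

# Binary (frozen / not frozen) segmentation of SAR tiles. DeepLab v3 stands in
# for the DeepLab v3+ model used in the paper.
model = deeplabv3_resnet50(weights=None, weights_backbone=None, num_classes=2)

# Sentinel-1 tiles carry two polarization channels (VV, VH) rather than RGB,
# so swap the backbone's first convolution to accept 2 input channels.
model.backbone.conv1 = nn.Conv2d(2, 64, kernel_size=7, stride=2, padding=3, bias=False)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(sar_batch, ice_masks):
    """sar_batch: (B, 2, H, W) float tensor; ice_masks: (B, H, W) long tensor of {0, 1}."""
    logits = model(sar_batch)["out"]      # (B, 2, H, W) per-pixel class scores
    loss = criterion(logits, ice_masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```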

Why this matters: Papers like this show how people are increasingly using AI techniques as a kind of plug&play sensing capability, where they assemble a dataset, train a classifier, and then either build or plan an automated system based on the newly created detector.
  Read more: Lake Ice Detection from Sentinel-1 SAR with Deep Learning (arXiv).

####################################################

Waymo dataset + LSTM = a surprisingly well-performing self-driving car prototype:
…Just how far can a well-tuned LSTM get you?…
Researchers with Columbia University want to see how smart a self-driving car can get if it’s trained in a relatively simple way on a massive dataset. To that end, they train an LSTM-based system on 12 input features from the Waymo Open Dataset, a massive set of self-driving car data released by Google last year (Import AI 161).

Performance of a well-tuned LSTM: In tests, an LSTM system trained with all the inputs from all the cameras on the car gets a minimum loss of about 0.1327. That’s superior to other similarly simple systems based on technologies like convolutional neural nets, or gradient boosting. But it’s a far cry from the 99.999% accuracy I think most people would intuitively want in a self-driving car.
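
For intuition about how simple this kind of model is, here is a hedged sketch: an LSTM that reads a short history of 12 input features per timestep and regresses a couple of driving targets. The feature count comes from the article above; the sequence length, hidden size, and target dimension are invented for illustration.
```python
import torch
import torch.nn as nn

class DrivingLSTM(nn.Module):
    """Toy LSTM driving model: 12 hand-picked input features per timestep."""
    def __init__(self, n_features=12, hidden=128, n_targets=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_targets)   # e.g. predicted acceleration + steering

    def forward(self, x):                 # x: (batch, timesteps, 12)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # predict from the final timestep's hidden state

model = DrivingLSTM()
loss_fn = nn.MSELoss()
x = torch.randn(8, 10, 12)                # 8 sequences, 10 timesteps, 12 features each
y = torch.randn(8, 2)                     # placeholder targets
loss = loss_fn(model(x), y)               # the paper reports losses on the order of ~0.13
```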

Why this matters: I think papers like this emphasize the extent to which neural nets are now utterly mainstream in AI research. It also shows how industry can inflect the type of research that gets conducted in AI purely by releasing its own datasets, which become the environments academics use to test, calibrate, and develop AI research approaches.
  Read more: An LSTM-Based Autonomous Driving Model Using Waymo Open Dataset (arXiv).

####################################################

Votefakes: Indian politician uses synthetic video to speak to more voters:
…Deepfakes + Politics + Voter-Targeting = A whole new way to persuade…
An Indian politician has used AI technology to generate synthetic videos of themselves giving the same speech in multiple languages, marking a possible new tool that politicians will use to target the electorate.

Votefakes: “When the Delhi BJP IT Cell partnered with political communications firm The Ideaz Factory to create “positive campaigns” using deepfakes to reach different linguistic voter bases, it marked the debut of deepfakes in election campaigns in India. “Deepfake technology has helped us scale campaign efforts like never before,” Neelkant Bakshi, co-incharge of social media and IT for BJP Delhi, tells VICE. “The Haryanvi videos let us convincingly approach the target audience even if the candidate didn’t speak the language of the voter.”” – according to Vice.

Why this matters: AI lets people scale themselves – whether by automating and scaling out certain forms of analysis, or, as here, automating and scaling out the way that people appear to other people. With modern AI tools, a politician can be significantly more present in more diverse communities. I expect this will lead to some fantastically weird political campaigns and, later, the emergence of some very odd politicians.
  Read more: We’ve Just Seen the First Use of Deepfakes in an Indian Election Campaign (Vice).

####################################################

Computer Vision pioneer switches focus to avoid ethical quandaries:
…If technology is so neutral, then why are so many uses of computer vision so skeezy?…
The creator of YOLO, a popular image identification and classification system, has stopped doing computer vision research due to concerns about how the technology is used.
  “I stopped doing CV research because I saw the impact my work was having,” wrote Joe Redmon on Twitter. “I loved the work but the military applications and privacy concerns eventually became impossible to ignore.”

This makes sense, given Redmon’s unusually frank approach to their research. ““What are we going to do with these detectors now that we have them?” A lot of the people doing this research are at Google and Facebook. I guess at least we know the technology is in good hands and definitely won’t be used to harvest your personal information and sell it to…. wait, you’re saying that’s exactly what it will be used for?? Oh. Well the other people heavily funding vision research are the military and they’ve never done anything horrible like killing lots of people with new technology oh wait…”, they wrote in the research paper announcing YOLOv3 (Import AI: 88).
  Read more at Joe Redmon’s Twitter page (Twitter).

####################################################

Better Satellite Superresolution via Better Embeddings:
…Up-scaling + regular satellite imaging passes = automatic planet monitoring…
Superresolution is where you train a system to produce high-resolution versions of low-resolution images; in other words, if I show you a bunch of black and white pixels on a green field, it’d be great if you were smart enough to figure out this was a photo of a cow and produce that for me. Now, researchers from Element AI, MILA, the University of Montreal, and McGill University have published details about a system that can take in multiple low-resolution images and stitch them together into high-quality superresolution images.

HighRes-net: The key to this research is HighRes-net, an architecture that can fuse an arbitrary number of low-resolution frames together to form a high-resolution image. One of the key tricks here is the continuous computation of a shared representation across multiple low-resolution views – by embedding each view into the same featurespace, then embedding it jointly with the shared representation, the network finds it easier to learn about overlapping versus non-overlapping features, which helps it make marginally smarter super-resolution judgement calls. Specifically, the authors claim HighRes-net is “the first deep learning approach to MFSR that learns the typical sub-tasks of MFSR in an end-to-end fashion: (i) co-registration, (ii) fusion, (iii) up-sampling, and (iv) registration-at-the-loss.”
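
Here is a loose sketch of the recursive-fusion idea (embed each low-resolution view jointly with a shared reference, fuse embeddings pairwise until one remains, then upsample), written from the description above rather than from the authors' code; the layer sizes and the median-frame reference are illustrative choices.
```python
import torch
import torch.nn as nn

class RecursiveFusionSketch(nn.Module):
    """Loose HighRes-net-style fusion: embed each low-res frame together with a shared
    reference, recursively fuse pairs of embeddings until one remains, then upsample."""
    def __init__(self, channels=1, feats=32, scale=3):
        super().__init__()
        self.encode = nn.Sequential(nn.Conv2d(channels * 2, feats, 3, padding=1), nn.ReLU())
        self.fuse = nn.Sequential(nn.Conv2d(feats * 2, feats, 3, padding=1), nn.ReLU())
        self.upsample = nn.Sequential(
            nn.Conv2d(feats, channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),                        # features -> high-res pixels
        )

    def forward(self, frames):                             # frames: (B, n_frames, C, H, W)
        ref = frames.median(dim=1).values                  # shared reference view
        embs = [self.encode(torch.cat([f, ref], dim=1))    # embed each frame w/ reference
                for f in frames.unbind(dim=1)]
        while len(embs) > 1:                               # recursive pairwise fusion
            fused = [self.fuse(torch.cat([embs[i], embs[i + 1]], dim=1))
                     for i in range(0, len(embs) - 1, 2)]
            if len(embs) % 2:                              # carry an odd frame forward
                fused.append(embs[-1])
            embs = fused
        return self.upsample(embs[0])                      # (B, C, H*scale, W*scale)

sr = RecursiveFusionSketch()
highres = sr(torch.randn(2, 8, 1, 32, 32))                 # -> (2, 1, 96, 96)
```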

How well does it work? The researchers tested out their system on the PROBA-V dataset, a satellite imagery dataset that consists of high-resolution / low-resolution imagery pairs. (According to the researchers, lots of bits of superresolution research test on algorithmically-generated low-res images, which means the tests can be a bit suspect). They entered their model into the European Space Agency’s Kelvin competition, obtaining top scores on the public leaderboard and second-best scores on a private evaluation.

Why this matters: Techniques like this could let people use more low-resolution satellite imagery to analyze the world around them. “There is an abundance of low-resolution yet high-revisit low-cost satellite imagery, but they often lack the detailed information of expensive high-resolution imagery,” the researchers write. “We believe MFSR can uplift its potential to NGOs and non-profits”.
  Get the code for HighRes-net here (arXiv).
  Read more: HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery (arXiv).

####################################################

AI industrialization means AI efficiency: Amazon shrinks the Transformer, gets decent results, publishes the code:
…Like the Transformer but hate how big it is? Try out Amazon’s diet Transformers…
Amazon Web Services researchers have developed three variations on the Transformer architecture, all of which demonstrate significant efficiency gains over the stock Transformer.

Who cares about the Transformer? The Transformer is a fundamental AI component that was first published in 2017 – one of the main reasons why people like Transformers is that the architecture uses attentional mechanisms to help it learn subtle relationships between data. It’s this capability that has made Transformers quickly become fundamental plug-in components, appearing in AI systems as diverse as GPT-2, BERT, and even AlphaStar. But the Transformer has one problem – it can be pretty expensive to use, because its attentional processes are computationally expensive. Amazon has sought to deal with this by developing three novel variants on the Transformer.

The Transformer, three ways: Amazon outlines three variants on the Transformer which are all more efficient, though in different ways (a toy sketch of the dilated attention pattern follows the list below). “The design principle is to still preserve the long and short range dependency in the sequence but with less connections,” the researchers write. They test each Transformer on two common language model benchmark datasets: Penn TreeBank (PTB) and WikiText-2 (WT-2) – in tests, the Dilated Transformer gets a test score of 110.92 on PTB and 147.58 on WT-2, versus 103.72 and 140.74 for the full Transformer. This represents a bit of a performance hit, but the Dilated Transformer saves about 70% on model size relative to the full one. When reading these, bear in mind the computational complexity of a full Transformer is O(n^2 * h). (n = length of sequence; h = size of hidden state; k = filter size; b = base window size; m = cardinal number).
– Dilated Transformer: O(n * k * h): Use dilated connections so you can have a larger receptive field for a similar cost.
– Dilated Transformer with Memory: O(n * k * c * h): Same as above, along with “we try to cache more local contexts by memorizing the nodes in the previous dilated connections”.
– Cascade Transformer: O(n * b * m^1 * h): They use cascading connections “to exponentially incorporate the local connections”.
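
Here is the toy sketch promised above: a causal attention mask in which each position only attends to a handful of positions behind it, spaced at dilated offsets – the basic trick that pushes the attention cost from O(n^2 * h) towards O(n * k * h). This illustrates the connection pattern only; it is not Amazon's implementation.
```python
import torch

def dilated_attention_mask(seq_len, k=4, dilation=2):
    """Boolean causal mask: position i may attend only to k earlier positions spaced
    `dilation` apart, instead of all i previous positions. Toy illustration only."""
    mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
    for i in range(seq_len):
        for j in range(k):
            src = i - j * dilation            # look back at dilated offsets (incl. self)
            if src >= 0:
                mask[i, src] = True
    return mask

# Each row now has at most k=4 allowed connections instead of i+1 of them:
print(dilated_attention_mask(8).int())
```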

Why this matters: If we’re going through a period of AI industrialization, then something worth tracking is not only the frontier capabilities of AI systems, but also the efficiency improvements we see in these systems over time. I think it’ll be increasingly valuable to track improvements here, and it will give us a better sense of the economics of deploying various types of AI systems.
  Read more: Transformer on a Diet (arXiv).
  Get the code here (cgraywang, GitHub).

####################################################

Schmidhuber on AI in the 2010s and AI in the 2020s:
…Famed researcher looks backwards and forwards; fantastic futures and worrisome trends…
Jürgen Schmidhuber, an artificial intelligence researcher who co-invented the LSTM, has published a retrospective on the 2010s in AI, and an outlook for the coming decade. As with all Schmidhuber blogs, this post generally ties breakthroughs in the 2010s back to work done by Schmidhuber’s lab/students in the early 90s – so put that aside while reading and focus on the insights.

What happened in the 2010s? The Schmidhuber post makes clear how many AI capabilities went from barely working in research to being used in production at multi-billion dollar companies. Some highlights of technologies that went from being juvenile to being deployed in production at massive scale:
– Neural machine translation
– Handwriting recognition
– Really, really deep networks: In the 2010s, we transitioned from training networks with tens of layers to training networks with hundreds of layers, via inventions like Highway Networks and Residual Nets – this has let us train larger, more capable systems, capable of extracting even more refined signals from subtle patterns.
– GANs happened – it became easy to train systems to synthesize variations on their own datasets, letting us do interesting things like generating images and audio, and weirder things like Amazon using GANs to simulate e-commerce customers.

What do we have to look forward to in the 2020s?
– Data markets: As more and more of the world digitizes, we can expect data to become more valuable. Schmidhuber suspects the 2020s will see numerous attempts to create “efficient data markets to figure out your data’s true financial value through the interplay between supply and demand”.
– AI for command-and-control nations: Meanwhile, some nations may use AI technologies to increase their ability to control and direct their citizens: “some nations may find it easier than others to become more complex kinds of super-organisms at the expense of the privacy rights of their constituents,” he writes.
– Real World AI: AI systems will start to be deployed into industrial processes and machines and robots, which will lead to AI having a greater influence on the economy.

Why this matters: Schmidhuber is an interesting figure in AI research – he’s sometimes divisive, and occasionally perceived as being somewhat pushy with regard to seeking credit for certain ideas in AI research, but he’s always interesting! Read the post in full, if only to get to the treat at the end about using AI to colonize the “visible universe”.
  Read more: The 2010s: Our Decade of Deep Learning / Outlook on the 2020s (arXiv).

####################################################

AI Policy with Matthew van der Merwe:

…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

Europe’s approach to AI regulation

The European Commission has published their long-awaited white paper on AI regulation. The white paper is released alongside reports on Europe’s data strategy, and on safety and liability. These build on Europe’s Coordinated Plan on AI (see Import #143) and the recommendations of their high-level expert group (see Import #126). 

   High-risk applications: The European approach will be ‘risk-based’, with high-risk AI applications subject to more stringent governance and regulatory measures. They propose two necessary conditions for an application to be deemed high-risk:
  (1) it is employed in a sector that typically involves significant risks (e.g. healthcare)
  (2) the application itself is one likely to generate significant risks (e.g. treating patients).

   US criticism: The US Government’s Chief Technology Officer, Michael Kratsios, criticized the proposals as being too ‘blunt’ in their bifurcation of applications into high- and low-risk, arguing that it is better to treat risk as a spectrum when determining appropriate regulations, and that the US’s light-touch approach is more flexible in this regard and better overall.


Matthew’s view: To be useful, a regulatory framework has to carve up messy real-world things into neat categories, and it is often better to deal with nuance at a later stage – when designing and implementing legislation. In many countries it is illegal to drive without headlights at night, despite there being no clear line between night and day. Nonetheless, having laws that distinguish between driving at night and day is plausibly better than having more precise laws (e.g. in terms of measured light levels), or no laws at all in this domain. There are trade-offs when designing governance regimes, of which bluntness vs. nuance is just one, and they should be judged on a holistic basis. In the absence of much detail on the US approach to AI regulation with regard to risks, it is too early to properly compare it with Europe’s.

Read more: On Artificial Intelligence – A European approach to excellence and trust (EU)

Read more: White House Tech Chief Calls Europe’s AI Principles Clumsy Compared to U.S. Approach

 

DoD adopts AI principles:

DefenseOne reports that the DoD plans to adopt the AI principles drawn up by the Defense Innovation Board (DIB). A draft of these principles was published in October (see Import #171).

Matthew’s view: I was impressed by the DIB’s AI principles and the process by which they were arrived at. They had a deep level of involvement from a broad group of experts, and underwent stress testing with a ‘red teaming’ exercise. The principles focus on the safety, robustness, and interpretability of AI systems. They also take seriously the need to develop guidelines that will remain relevant as AI capabilities grow stronger. 

   Read more: Pentagon to Adopt Detailed Principles for Using AI.
  Read more: Draft AI Principles (DoD).

####################################################

Tech Tales:

The Interface
A Corporate Counsel Computer, 2022

Hello this is Ava at the Contract Services Company, what do you need?
Well hey Ava, how’s it going?
It’s good, and I hope you’re having a great day. What services can we provide?
Can you get me access to Construction_Alpha-009 that was assigned to Mitchell’s Construction?
Checking… verified. I sure can! Who would you like to speak to?
The feature librarian.
Memory? Full-spectrum? Auditory?-
I need the one that does Memory, specialization Emotional
Okay, transforming… This is Ava, the librarian at the Contract Services Company Emotional Memory Department, how can I help you?
Show me what features activated before they left the construction site on July 4th, 2025.
Checking…sure, I can do that! What format do you want?
Compile it into a sequential movie, rendered as instances of 2-Dimensional images, in the style of 20th Century film.
Okay, I can do that. Please hold…
…Are you there?
I am.
Would you like me to play the movie?
Yes, thank you Ava.

Things that inspired this story: AI Dungeon, GPT-2, novel methods of navigating the surfaces of generative models, UX, HCI, Bladerunner.

Import AI 185: Dawn of the planetary search engine; GPT-2 poems; and the UK government’s seven rules for AI providers

Dawn of the planetary satellite-imagery search engine:
…Find more images like this, but for satellite imagery…
AI startup Descartes Labs has created a planet-scale search engine that lets people use the equivalent of ‘find more images like this’, but instead of uploading an image into a search engine and getting a response back, they upload a picture of somewhere on Earth and get a set of visually similar locations.

How they did it: To build this, the authors used four datasets – for the USA, they used aerial imagery from the National Agriculture Imagery Program (NAIP), as well as the Texas Orthoimagery Program. For the rest of the world, they used data from Landsat 8. They then took a stock 50-layer ResNet pre-trained on ImageNet and made a couple of tweaks: they injected noise during training to make it easier for the network to learn to make binary classification decisions, and did light customization for extracting features from networks trained against different datasets. Through this, they gained a set of 512-bit feature vectors, which make it possible to search for complex things like visual similarity.
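
To make the search step concrete, here is a hedged sketch of how compact binary features enable ‘more like this’ queries: binarize each 512-dimensional descriptor into a 512-bit code, then rank candidate tiles by Hamming distance. The threshold-at-zero binarization and the brute-force scan below are my simplifications, not a description of Descartes Labs’ production system.
```python
import numpy as np

def binarize(features):
    """Turn real-valued descriptors (n, 512) into 512-bit codes packed into 64 bytes.
    Thresholding at zero is an illustrative choice, not the Descartes Labs recipe."""
    return np.packbits((features > 0).astype(np.uint8), axis=1)

def hamming_search(query_bits, index_bits, top_k=30):
    """Rank indexed tiles by Hamming distance to the query's 512-bit code."""
    xor = np.bitwise_xor(index_bits, query_bits)          # differing bits per tile
    distances = np.unpackbits(xor, axis=1).sum(axis=1)    # popcount = Hamming distance
    return np.argsort(distances)[:top_k]

# Example with synthetic descriptors standing in for satellite-image tiles.
rng = np.random.default_rng(0)
index = binarize(rng.standard_normal((10_000, 512)))
query = binarize(rng.standard_normal((1, 512)))
print(hamming_search(query, index)[:5])   # indices of the most similar-looking tiles
```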

How well does it work: In tests, they get scores of reasonable but not stellar performance, obtaining top-30 accuracies of around 80% when dealing with things they’ve fine-tuned the network against. However, in qualitative tests it feels like its performance may be higher than this for most use cases – I’ve played around with the Descartes Labs website where you can test out the system; it does reasonably well when you click around, identifying things like intersections and football stadiums well. I think a lot of the places where it gets confused come from the relatively low resolution of the satellite imagery, making fine-grained judgements more difficult.

Why this matters: Systems like this give us a sense of how AI lets us do intuitive things easily that would otherwise be ferociously difficult – just twenty years ago, asking for a system that could show you similar satellite images would be a vast undertaking with significant amounts of hand-written features and bespoke datasets. Now, it’s possible to create this system with a generic pre-trained model, a couple of tweaks, and some generally available unclassified datasets. I think AI systems are going to unlock lots of applications like this, letting us query the world with the sort of intuitive commands (e.g., similar to), that we use our own memories for today.
  Read more: Visual search over billions of aerial and satellite images (arXiv).
  Try the AI-infused search for yourself here (Descartes Labs website).

####################################################

Generating emotional, dreamlike poems with GPT-2:
…If a poem makes you feel joy or sadness, then is it good?…
Researchers with Drury University and UC-Colorado Springs have created a suite of fine-tuned GPT-2 models for generating poetry with different emotional or stylistic characteristics. Specifically, they create five separate corpora of poems that, in their view, represent emotions like anger, anticipation, joy, sadness, and trust. They then fine-tune the medium GPT-2 model against these datasets.
  Fine-tuned dream poetry: They also train their model to generate what they call “dream poems” – poems that have a dreamlike element. To do this, they take the GPT-2 model and train it on a corpus of first-person dream descriptions, then train it again on a large poetry dataset.
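
The general fine-tuning recipe looks roughly like the sketch below, using Hugging Face's transformers library; the corpus file name, block size, and sampling settings are placeholders rather than the authors' choices.
```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Sketch of fine-tuning GPT-2 medium on an emotion-specific poetry corpus.
# "joy_poems.txt" is a hypothetical file; hyperparameters are placeholders.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

ids = tokenizer(open("joy_poems.txt").read(), return_tensors="pt").input_ids
block = 512
model.train()
for start in range(0, ids.size(1) - block, block):   # slide over the corpus in blocks
    chunk = ids[:, start:start + block]
    loss = model(chunk, labels=chunk).loss            # standard language-modeling loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Sample a new poem from the fine-tuned model.
prompt = tokenizer("The morning light", return_tensors="pt").input_ids
sample = model.generate(prompt, do_sample=True, max_length=80, top_p=0.9)
print(tokenizer.decode(sample[0]))
```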

Do humans care? The researchers generated a batch of 1,000 poems, then presented four poems from each emotional category to a set of ten human reviewers. “Poems presented were randomly selected from the top 20 EmoLex scored poems out of a pool of 1,000 generated poems,” they write. The humans were asked to categorize the poems according to the emotions they felt after reading them – in tests, they classified the poems based on the joy and sadness corpora as reflecting those emotions 85% and 87.5% of the time, respectively. That’s likely because these are relatively easy emotions to categorize with relatively broad categories. By comparison, they correctly categorized things like Anticipation and Trust 40% and 32.5% of the time, respectively.

Why this matters: I think language models are increasingly being used like custom funhouse mirrors – take something you’re interested in, like poetry, and tune a language model against it, giving you an artefact that can generate warped reflections of what it was exposed to. I think language models are going to change how we explore and interact with large bodies of literature.
  Get the ‘Dreambank’ dataset used to generate the dream-like poems here.
  Read more: Introducing Aspects of Creativity in Automatic Poetry Generation (arXiv).

####################################################

Want a responsible AI economy? Do these things, says UK committee:
…Eight tips for governments, seven tips for AI developers…
The UK’s Committee on Standards in Public Life thinks the government needs to work harder to ensure it uses AI responsibly, and that the providers of AI systems operate in responsible, trustworthy ways. The government has a lot of work to do, according to a new report from the committee: “Government is failing on openness,” the report says. “Public sector organizations are not sufficiently transparent about their use of AI and it is too difficult to find out where machine learning is currently being used in government”.

What to do about AI if you’re a government, national body, or regulator: The committee has eight recommendations designed for potential AI regulators:

  • Adopt and enforce ethical principles: Figure out which ethical principles to use to guide the use of AI in the public sector (there are currently three sets of principles for the public sector – the FAST SUM Principles, the OECD AI Principles, and the Data Ethics Framework).
  • Articulate a clear legal basis for AI usage: Public sector organizations should publish a statement on how their use of AI complies with relevant laws and regulations before they are deployed in public service delivery. 
  • Data bias and anti-discrimination law: Ensure public bodies comply with the Equality Act 2010. 
  • Regulatory assurance body: Create a regulatory assurance body that identifies gaps in the regulatory landscape and provides advice to individual regulators and government on the issues associated with AI. 
  • Procurement rules and processes: Use government procurement procedures to mandate compliance with ethical principles (when selling to public organizations).
  • The Crown Commercial Service’s Digital Marketplace: Create a one-stop shop for finding AI products and services that satisfy ethical requirements.
  • Impact assessment: Integrate an AI impact assessment into existing processes to evaluate the potential effects of AI on public standards, for a given use case.

What to do if you’re an AI provider: The committee also has some specific recommendations for providers of AI services (both public and private-sector). These include:

  • Evaluate risks to public standards: Assess systems for their potential impact on standards and seek to mitigate standard risks identified. 
  • Diversity: Tackle issues of bias and discrimination by ensuring they take into account “the full range of diversity of the population and provide a fair and effective service”.
  • Upholding responsibility: Ensure that responsibility for AI systems is clearly allocated and documented.
  • Monitoring and evaluation: Monitor and evaluate AI systems to ensure they always operate as intended.
  • Establishing oversight: Implement oversight systems that allow for their AI systems to be properly scrutinised.
  • Appeal and redress: AI providers should always tell people about how they can appeal against automated and AI-assisted decisions. 
  • Training and education: AI providers should train and educate their employees.

Why this matters: Sometimes I think of the AI economy a bit like an alien invasion – we have a load of new services and capabilities that were not economically feasible (or in some cases, possible) before, and the creatures in the AI economy don’t currently mesh perfectly well with the rest of the economy. Initiatives like the UK committee report help calibrate us about the changes we’ll need to make to harmoniously integrate AI technology into society.
  Read more: Artificial intelligence and Public Standards, A Review by the Committee on Standards in Public Life (PDF, gov.uk).

####################################################

Speeding up scientific simulators by millions to billions of times:
…Neural architecture search helps scientists build a machine that simulates the machines that simulate reality…
You’ve heard of how AI can improve our scientific understanding of the world (see: systems like AlphaFold for protein structure prediction, and various systems for weather simulation), but have you heard about how AI can improve the simulators we use to improve our scientific understanding of the world? New research from an interdisciplinary team of scientists from the University of Oxford, University of Rochester, Yale University, University of Seattle, and the Max-Planck-Institut fur Plasmaphysik, shows how you can use modern deep learning techniques to speed up diverse scientific simulation tasks by millions to billions of times.

The technique: They use Deep Emulator Network SEarch (DENSE), a technique which consists of them defining a ‘super architecture’ and running neural architecture search within it. The super-architecture consists of “convolutional layers with different kernel sizes and a zero layer that multiplies the input with zero,” they write. “The option of having a zero layer and multiple convolutional layers enable the algorithm to choose an appropriate architecture complexity for a given problem.” During training, the system alternates between training the network and observing its performance, then performing a search step where network variables “are updated to increase the probability of the high-ranked architectures and decrease the probability of the low-ranked architectures”.
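
Here's a toy sketch of the core ‘super architecture’ ingredient: a block whose candidate operations are convolutions with different kernel sizes plus a zero layer, weighted by learnable architecture parameters. The real DENSE procedure trains and ranks sampled architectures and updates their probabilities, which this simplified soft-mixture version glosses over.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Searchable block in the spirit of DENSE's super architecture: convolutions with
    different kernel sizes plus a zero layer, mixed by learnable architecture logits.
    (A simplification: DENSE ranks sampled architectures rather than soft-mixing.)"""
    def __init__(self, channels=16, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.ops = nn.ModuleList(
            [nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes]
        )
        # One extra logit for the zero layer, which lets the search drop the block.
        self.arch_logits = nn.Parameter(torch.zeros(len(kernel_sizes) + 1))

    def forward(self, x):
        weights = F.softmax(self.arch_logits, dim=0)
        out = torch.zeros_like(x)                 # the zero layer contributes nothing
        for w, op in zip(weights[:-1], self.ops):
            out = out + w * op(x)
        return out

# An emulator could stack a few of these blocks between simulation inputs and outputs.
block = MixedOp()
y = block(torch.randn(4, 16, 8, 8))
```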

Results: They test their approach on ten different scientific simulation cases. These cases have input parameters that vary from 3 to 14, and outputs from 0D (scalars) to multiple 3D signals. Specifically, they use DENSE to try and train emulators of ten distinct simulation use cases, then assess the performance of the emulators. In tests, the emulators obtain, at minimum, comparable results to the real simulators, and at best, far superior ones. They also show eye-popping speedups, in some cases running hundreds of millions to billions of times faster than the original simulators.
  “The ability of DENSE to accurately emulate simulations with limited number of data makes the acceleration of very expensive simulations possible,” they write. “The wide range of successful test cases presented here shows the generality of the method in speeding up simulations, enabling rapid ideas testing and accelerating new discovery across the sciences and engineering”.

Why this matters: If deep learning is basically just really great at curve-fitting, then papers like this highlight just how useful that is. Curve-fitting is great if you can do it in complex, multidimensional spaces! I think it’s pretty amazing that we can use deep learning to essentially approximate a thoroughly complex system (e.g., a scientific simulator), and it highlights how I think one of the most powerful use cases for AI systems is to be able to approximate reality and therefore build prototypes against these imaginary realities.
  Read more: Up to two billion times acceleration of scientific simulations with deep neural architecture search (arXiv).

####################################################

Automatically cataloging insects with the BIODISCOVER machine:
…Next: A computer vision-equipped robotic arm…
Insects are one of the world’s most numerous living things, and one of the most varied as well. Now, a team of scientists from Tampere University and the University of Jyvaskyla in Finland, Aarhus University in Denmark, and the Finnish Environmental Institute, have designed a robot that can automatically photograph and analyze insects. They call their device the BIODISCOVER machine, short for BIOlogical specimens Described, Identified, Sorted, Counted, and Observed using Vision-Enabled Robotics. The machine automatically detects specimens, then photographs them and crops the images to be 496 pixels wide (defined by the width of the cuvette) and 496 pixels high.
  “We propose to replace the standard manual approach of human expert-based sorting and identification with an automatic image-based technology”, they write. “Reliable identification of species is pivotal but due to its inherent slowness and high costs, traditional manual identification has caused bottlenecks in the bioassessment process”.

Testing BIODISCOVER: In tests, the researchers imaged a dataset of nine terrestrial arthropod species collected at Narsarsuaq, South Greenland, gathering thousands of images for each species. They then used this dataset to test how well two machine learning classification approaches work on the images. They used a ResNet-50 and an InceptionV3 network (both pre-trained against ImageNet) to train two systems to classify the images, and to create data about which camera aperture and exposure settings yield images that are easiest for machine learning algorithms to classify. In tests, they obtain an average classification accuracy of 0.980 over ten test sets.
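
The classification side of this is close to textbook transfer learning. Here is a hedged sketch using a recent torchvision: the 496-pixel image size comes from the article above, while the preprocessing statistics and optimizer settings are standard defaults rather than necessarily what the authors used.
```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# ImageNet-pretrained ResNet-50 with its head swapped for 9 arthropod species classes.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 9)

preprocess = transforms.Compose([
    transforms.Resize((496, 496)),    # BIODISCOVER crops images to 496 x 496 pixels
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, species_labels):
    """images: (B, 3, 496, 496) tensor; species_labels: (B,) long tensor of class ids."""
    logits = model(images)
    loss = criterion(logits, species_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```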

Next steps: Now that the scientists have built BIODISCOVER, they’re working on a couple of additional features to help them create automated insect analysis. These include: developing a computer-vision enabled robot arm that can detect insects in a bulk tray, then select an appropriate tool to move the insect into the BIODISCOVER machine, as well as a sorting rack to place specimens into their preferred containers after they’ve been photographed.
  Read more: Automatic image-based identification and biomass estimation of invertebrates (arXiv).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

White House proposes increased AI spending amidst cuts to science budgets
The White House has released their 2021 federal budget proposal. This is a clear communication of the government’s priorities, but will not become law, as the budget must now pass through Congress, who are expected to make substantial changes.
  Increases to AI funding: There is a doubling of proposed R&D spending in non-defense AI (and quantum computing). In defense, there are substantial increases to AI R&D funding via DARPA, and for the DoD’s Joint AI Center. A budget supplement detailing AI spending programs on an agency-by-agency basis is expected later this year.
  Substantial cuts to basic science: Overall, proposed R&D spending represents a 9% decrease on 2020 levels. Together with the proposals for AI, this indicates a substantial rebalancing of the portfolio of science funding towards technologies perceived as being strategically important.

Why it matters: The budget is best understood as a statement of intent from the White House, which will be altered by Congress. The proposed uplift in funding for AI will be welcomed, but the scale of cuts to non-AI R&D spending raises questions about the government’s commitment to science. [Jack’s take: I think AI is going to be increasingly interdisciplinary in nature, so cutting other parts of science funding is unlikely to maximize the long-term potential of AI as a technology – I’d rather live in a world where countries invested in science vigorously and holistically.]

   Read more: FY2021 Budget (White House).
  Read more: Press release (White House).

AI alignment fellowship at Oxford University:

Oxford’s Future of Humanity Institute is taking applications for their AI Alignment fellowship. Fellows will spend three or more months pursuing research related to the theory or design of human-aligned AI, as part of FHI’s AI safety team. Previously successful applicants have ranged from undergraduate to post-doc level. Applications to visit during summer 2020 will close on February 28.
  For more information and to apply: AI Alignment Visiting Fellowship (FHI)


####################################################

Tech Tales:

Dance!

It was 5am when the music ran out. We’d been dancing to a single continuous, AI-generated song. It had been ten or perhaps eleven hours since it had started, and the walls were damp and shiny with sweat. Everyone had that glow of being near a load of other humans and dancing. At least the lights stayed off.
  “Did it run out of ideas?” someone shouted.
  “Your internet go down?” someone else asked.
  “You train this on John Cage,” asked someone else.
  The music started up, but it was human-music. People danced, but there wasn’t the same intensity.

The thing about AI raves is the music is always unique and it never gets repeated. You train a model and generate a song and the model kind of continuously fills it in from there. The clubs compete with each other for who can play the longest song. “The longest unroll”, as some of the AI people say. People try and snatch recordings of the music – though it is frowned upon – and after really good parties you see shreds of songs turn up on social media. People collect these. Categorize them. Try to map out the stylistic landscape of big, virtual machines.

There are rumors of raves in Germany where people have been dancing to new stuff for days. There’ve even been dance ships, where the journey is timed to perfectly coincide with the length of the generated music. And obviously the billionaires have been making custom ‘space soundtracks’ for their spaceship tourism operations. Some people are filling their houses with speakers and feeding the sounds of themselves into an ever-growing song.

Things that inspired this short story: MuseNet; Virtual Reality; music synthesis; Google’s AI-infused Bach Doodle.

Import AI 184: IBM injects AI into the command line; Facebook releases 4.5 BILLION parallel sentences to aid translation research; plus, VR prison

You’ve heard of expensive AI training. What about expensive AI inference?
…On the challenges of deploying GPT-2, and other large models…
In the past year, organizations have started training ever-larger AI models. The size of these models has now grown enough that they’ve started creating challenges for people who want to deploy them into production. A new post on Towards Data Science discusses some of these issues in relation to GPT-2:
– Size: Models like GPT-2 are large (think gigabytes not megabytes), so embedding them in applications is difficult.
– Compute utilization: Sampling an inference from the model can be CPU/GPU-intensive, which means it costs quite a bit to set up the infrastructure to run these models (just ask AI Dungeon).
– Memory requirements: In the same way they’re compute-hungry, new models are memory-hungry as well.

Why this matters: Today, training AI systems is very expensive, and sampling from trained models is cheap. With some new large-scale models, it could become increasingly expensive to sample from the models as well. How might this change the types of applications these models get used for, and the economics associated with whether it makes sense to use them?
  Read more: Too big to deploy: How GPT-2 is breaking production (Towards Data Science).

####################################################

AI versus AI: Detecting model-poisoning in federated learning:
…Want to find the crook? Travel to the lower dimensions!…
If we train AI models by farming out computationally-expensive training processes to people’s phones and devices, then can people attack the AI being trained by manipulating the results of the computations occurring on their devices? New research from Hong Kong University of Science and Technology and WeBank tries to establish a system for defending against attacks like this.

Defending against the (distributed) dark arts: To defend their AI models against these attacks, the researchers propose something called spectral anomaly detection. This involves using a variational autoencoder to embed the results of computations from different devices into the same low-dimensional latent space. By doing this, it becomes relatively easy to identify anomalous results that should be treated with suspicion.
“Even though each set of model updates from one benign client may be biased towards its local training data, we find that this shift is small compared to the difference between the malicious model updates and the unbiased model updates from centralized training,” they write. “Through encoding and decoding, each client’s update will incur a reconstruction error. Note that malicious updates result in much larger reconstruction errors than the benign ones.” In tests, their approach gets accuracies of between 80% and 90% at detecting three types of attacks – sign-flipping, noise addition, and targeted model poisoning.
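
A minimal sketch of the reconstruction-error idea follows, assuming client updates arrive as flattened vectors; the network sizes and the flagging threshold are illustrative, not the paper's values.
```python
import torch
import torch.nn as nn

class UpdateVAE(nn.Module):
    """Tiny variational autoencoder over flattened model updates, in the spirit of
    spectral anomaly detection. Dimensions are illustrative."""
    def __init__(self, dim, latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent)
        self.to_logvar = nn.Linear(128, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return self.dec(z), mu, logvar

def flag_suspicious(vae, client_updates, threshold=1.0):
    """Clients whose updates reconstruct poorly are treated as potentially malicious.
    client_updates: (n_clients, dim) tensor; the threshold is an arbitrary choice here."""
    with torch.no_grad():
        recon, _, _ = vae(client_updates)
        errors = ((recon - client_updates) ** 2).mean(dim=1)   # per-client error
    return errors > threshold
```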


Why this matters: This kind of research highlights the odd overlaps between AI systems and political systems – both operate over large, loosely coordinated sets of entities (people and devices). Both systems need to be able to effectively synthesize a load of heterogeneous views and use these to make the “correct” decision, where correct usually correlates to the preferences they extract from the big mass of signals. And, just as politicians try to identify extremist groups who can distort the sorts of messages politicians hear (and therefore the actions they take), AI systems need to do the same. I wonder if in a few years techniques developed to defend against distributed model poisoning, might be ported over into political systems to defend against election attacks?
  Read more: Learning to Detect Malicious Clients for Robust Federated Learning (arXiv).

####################################################

Facebook wants AI to go 3D:
…PyTorch3D makes it more efficient to run ML against 3D mesh objects, includes differentiable rendering framework…
Facebook wants to make it easier for people to do research in what it terms 3D deep learning – this essentially means it wants to make tools that let AI developers train ML systems against 3D data representations. This is a surprisingly difficult task – most of today’s AI systems are built to process data presented in a 2D form (e.g., ImageNet delivers pictures of real-world 3D scenes via data composed as 2D data structures to represent static images).

3D specials: PyTorch3D ships with a few features to make 3D deep learning research easier – these include data structures for representing 3D object meshes efficiently, data operators that help make comparisons between 3D data, and a differentiable renderer – that is, a kind of scene simulator that lets you train an AI system that can learn while operating a moving camera. “With the unique differentiable rendering capabilities, we’re excited about the potential for building systems that make high-quality 3D predictions without relying on time-intensive, manual 3D annotations,” Facebook writes.
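
As a taste of what that looks like in practice, here is a small sketch using PyTorch3D's mesh batching, point sampling, and chamfer loss. The toy geometry is made up, and in a real pipeline the vertices would be predicted by a network so the loss can drive learning.
```python
import torch
from pytorch3d.structures import Meshes
from pytorch3d.ops import sample_points_from_meshes
from pytorch3d.loss import chamfer_distance

# Two toy meshes with different vertex/face counts, batched together.
verts_a = torch.tensor([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
faces_a = torch.tensor([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
verts_b = torch.tensor([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces_b = torch.tensor([[0, 1, 2]])

meshes = Meshes(verts=[verts_a, verts_b], faces=[faces_a, faces_b])  # heterogeneous batch
points = sample_points_from_meshes(meshes, num_samples=500)          # (2, 500, 3) point clouds
loss, _ = chamfer_distance(points[0:1], points[1:2])                 # differentiable distance
print(loss)
```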

Why this matters: Tools like PyTorch3D make it easier and cheaper for more people to experiment with training AI systems against different and more complex forms of data than those typically used today. As tools like this mature we can expect them to cause further activity in this research area, which will eventually yield various exciting sensory-inference systems that will allow us to do more intelligent things with 3D data. Personally, I’m excited to see how tools like this make it easier for game developers to experiment with AI systems built to leverage natively-3D worlds, like game engines. Watch this space.
  Read more: Introducing PyTorch3D: An open-source library for 3D deep learning (Facebook AI Blog).
  Get the code for PyTorch3D here (Facebook Research, GitHub).

####################################################

A-I-inspiration: Using GANs to make… chairs?
…The future of prototyping is an internet-scale model, some pencils, and a rack of GPUs…
A team of researchers with Peking University and Tsinghua University have used image synthesis systems to generate a load of imaginary chairs, then make one of the chairs in the real world. Specifically, they implement a GAN to try and generate images of chairs, along with a superresolution module which takes these outputs and scales them up.

Furniture makers of the future, sit down! “After the generation of 320,000 chair candidates, we spend few ours [sic] on final chair prototype selection,” they write. “Compared with traditional time-consuming chair design process”.

What’s different about this? Today, tons of generative design tools already exist in the world – e.g., software company Autodesk has staked out some of its future on the use of a variety of algorithmic tools to help it perform on-the-fly “generative design” which optimizes things like the weight and strength of a given object. AI tools are unlikely to replace tools like this in the short term, but they will open up another form of computer-generated designs for exploration by people – though I imagine GAN-based ones are going to be more impressionistic and fanciful, whereas ones made by industrial design tools will have more useful properties that tie to economic incentives.

Prototyping of the future: In the future, I expect people will train large generative systems to help them cheaply prototype ideas, for instance, by generating various candidate images of various products to inspire a design team, or clothing to inspire a fashion designer (e.g., the recent Acne Studios X Robbie Barrat collab), or scraps of text to aid writers. Papers like this sketch out some of the outlines for what this world could look like.
  Read more: A Generative Adversarial Network for AI-Aided Chair Design (arXiv).

####################################################

Google Docs for AI Training: Colab goes commercial:
…Want better hardware and longer runtimes? Get ready to pay up…
Google has created a commercial version of its popular, free “Google Colab” service. Google Colab is kind of like GDocs for code – you can write code in a browser window, then execute it on hardware in Google’s data centers. One thing that makes Colab special is that it ships with inbuilt access to GPUs and TPUs, so you can use Colab pages to train AI systems as well as execute them.

Colab Pro: Google’s commercial version, Colab Pro, costs $9.99 a month. What you get for this is more RAM, priority access to better GPUs and TPUs, and code notebooks that’ll stay connected to hardware for up to 24 hours (versus 12 hours for the free version).
  More details about Colab Pro here at Google’s website.
  Spotted via Max Woolf (@minimaxir, Twitter)

####################################################

What’s cooler than a few million parallel sentences? A few BILLION ones:
…CCMatrix gives translation researchers a boost…
Facebook has released CCMatrix, a dataset of more than 4.5 billion parallel sentences in 576 language pairs.

Automatic for the (translation) people: CCMatrix is so huge that Facebook needed to use algorithmic techniques to create it. Specifically, Facebook learned a multilingual sentence embedding to help it represent sentences from different languages in the same featurespace, then used the distance between sentences in featurespace to help the system figure out if they’re parallel sentences from two different languages.
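
Here's a toy sketch of the mining idea: embed sentences from two languages into one shared space and keep nearest-neighbor pairs above a similarity threshold. The embed() function below is just a stand-in for a real multilingual encoder (such as LASER), and the plain cosine threshold is a simplification of the margin-based scoring Facebook actually uses.
```python
import numpy as np

def embed(sentences, dim=64):
    """Stand-in for a multilingual sentence encoder: pseudo-random unit vectors so the
    sketch runs end-to-end. A real encoder maps translations in different languages
    to nearby points in one shared featurespace."""
    vecs = np.stack([np.random.default_rng(abs(hash(s)) % 2**32).standard_normal(dim)
                     for s in sentences])
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def mine_parallel(source_sentences, target_sentences, threshold=0.8):
    """Toy mining loop: pair each source sentence with its nearest target sentence in
    embedding space, keeping pairs above a similarity threshold. (CCMatrix itself uses
    margin-based scoring plus large-scale approximate nearest-neighbor search.)"""
    src, tgt = embed(source_sentences), embed(target_sentences)
    sims = src @ tgt.T                      # cosine similarities (rows are unit vectors)
    best = sims.argmax(axis=1)              # nearest candidate translation per sentence
    return [(source_sentences[i], target_sentences[j])
            for i, j in enumerate(best) if sims[i, j] >= threshold]

pairs = mine_parallel(["The cat sat on the mat.", "It is raining."],
                      ["Il pleut.", "Le chat est assis sur le tapis."])
```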

Why this matters: Datasets like this will help people build translation systems that work for a broader set of people, and in particular should help with transfer to languages for which there is less digitized material.
  Read more: CCMatrix: A billion-scale bitext data set for training translation models (Facebook AI Research, blog).
  Read more: CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB (arXiv).
  Get the code from the CCMatrix GitHub Page (Facebook Research, GitHub)

####################################################

AI in the command line – IBM releases its Command Line AI (CLAI):
…Get ready for the future of interfacing with AI systems…
IBM Researchers have built Project CLAI (Command Line AI), open source software for interfacing with a variety of AI capabilities via the command line. In a research paper describing the CLAI, they lay out some of the potential usages of an AI system integrated into the command line – e.g., in-line search and spellchecking, code suggestion features, and so on – as well as some of the challenges inherent to building one.

How do you build a CLAI? The CLAI – pronounced like clay – is essentially a little daemon that runs in the command line and periodically comes alive to do something useful. “Every command that the user types is piped through the backend and broadcast to all the actively registered skills associated with that user’s session on the terminal. In this manner, a skill can autonomously decide to respond to any event on the terminal based on its capabilities and its confidence in its returned answer,” the researchers write.

So, what can a CLAI do? CLAI’s capabilities include: a natural language module that tries to convert plain text commands into tar or grep commands; a system that tries to find and summarize information from system manuals; a ‘help’ function which activates “whenever there is an error” and searches Unix Stack Exchange for a relevant post to present to the user in response; a bot for querying Unix Stack Exchange in plain text; and a Kubernetes automation service (name: Kube Bot).
  And what can CLAI do tomorrow? In the future, the team hope to implement an auto-complete feature into the command line, so CLAI can suggest commands users might want to run.

Do people care? In a survey of 235 developers, a little over 50% reported they’d be either “likely” or “very likely” to use a command line interface with an integrated CLAI (or similar) service. In another part of the survey, they reported intolerance for laggy systems with response times greater than a few seconds, highlighting the need for these systems to perform well.

Why this matters: At some point, AI is going to be integrated into command lines in the same way things like ‘git’ or ‘pip install’ or ‘ping’ are today – and so it’s worth thinking about this hypothetical future today before it becomes our actual future.
  Read more: CLAI: A Platform for AI Skills on the Command Line (arXiv).
  Get the code for CLAI from IBM’s GitHub page.
  Watch a video about CLAI here (YouTube)

####################################################

Tech Tales: 

The Virtual Prisoner

You are awake. You are in prison. You cannot see the prison, but you know you’re in it. 

What you see: Rolling green fields, with birds flying in the sky. You look down and don’t see a body – only grass, speckled with flowers. 

Your body feels restricted. You can move, but only so far. When you move, you travel through the green fields. But you know you are not in the green fields. You are somewhere else, and though you perceive movement, you are not moving through real space. You are, however, moving through virtual space. You get fed through tubes attached to you. 

Perhaps it would not be so terrible if the virtual prison was better made. But there are glitches. Occasional flaws in the simulation where all the flowers turn a different color, or the sky disappears. For a second you are even more aware of the falsehood of this world. Then it gets fixed and you go back to caring less.

One day something breaks and you stop being able to see anything, but can still hear the artificial sound of wind causing tree branches to move. You close your eyes. You panic when you open them and things are still black. Eventually, it gets fixed, and you feel relieved as the field reappears in front of you.

It takes energy to remember that while you walk through the field you are also in a room somewhere else. It gets easier to believe more and more that you are in the field. It’s not that you’re unaware of your predicament, but you don’t focus on it so much. You cease modeling the duality of your world.

You have lived here for many years, you think one day. You know the trees of the field. Know the different birds in the sky. And when you wake up your first thought is not: where am I? It is “where shall I go today?”

Things that inspired this story: Room-scale VR; JG Ballard; simulacra; the fact states are more obsessed with control rather than deletion; spaceless panopticons.

Import AI 183: Curve-fitting conversation with Meena; GANs show us our climate change future; and what compute-data arbitrage means

Can curve-fitting make for good conversation?
…Google’s “Meena” chatbot suggests it can…
Google researchers have trained a chatbot with uncannily good conversational skills. The bot, named Meena, is a 2.6 billion parameter language model trained on 341GB of text data, filtered from public domain social media conversations. Meena uses a seq2seq model (the same sort of technology that powers Google’s “Smart Compose” feature in gmail), paired with an Evolved Transformer encoder and decoder – it’s interesting to see something like this depend so much on a component developed via neural architecture search.

Can it talk? Meena is a pretty good conversationalist, judging by transcripts uploaded to GitHub by Google. It also seems able to invent jokes (e.g., Human: do horses go to Harvard? Meena: Horses go to Hayvard. Human: that’s a pretty good joke, I feel like you led me into it. Meena: You were trying to steer it elsewhere, I can see it.)

A metric for good conversation: Google developed the ‘Sensibleness and Specificity Average’ (SSA) measure, which it uses to evaluate how good Meena is in conversation. This metric evaluates the outputs of language models for two traits – is the response sensible, and is the response specifically tied to what is currently being discussed. To calculate the SSA for a given chatbot, the researchers have a team of crowd workers evaluate some of the outputs of the models, then they use this to create an SSA score.
  Humans vs Machines: The best-performing version of Meena gets an SSA of 79%, compared to 86% for an average human. By comparison, other state-of-the-art systems such as DialoGPT (51%) and Cleverbot (44%) do much more poorly.
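
For the curious, here is a toy calculation of how an SSA-style score falls out of crowd labels. Google's actual protocol has more detail than this (for instance, around how responses get sampled and rated), so treat the function below as a sketch of the idea rather than the paper's exact metric.
```python
def ssa(labels):
    """Toy Sensibleness and Specificity Average from crowd labels.
    Each label is a (sensible, specific) pair of booleans for one model response;
    in the spirit of the metric, a response judged not sensible can't count as
    specific. Details of Google's exact protocol are glossed over here."""
    sensibleness = sum(s for s, _ in labels) / len(labels)
    specificity = sum(s and sp for s, sp in labels) / len(labels)
    return 0.5 * (sensibleness + specificity)

# Three rated responses: two sensible (one of them also specific), one nonsensical.
print(ssa([(True, True), (True, False), (False, False)]))   # -> 0.5
```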

Different release strategy: Along with their capabilities, modern neural language models have also been notable for the different release strategies adopted by the organizations that build them – OpenAI announced GPT-2 but didn’t release it all at once, releasing the model over several months along with research into its potential for misinformation, and its tendencies for biases. Microsoft announced DialoGPT but didn’t provide a sampling interface in an attempt to minimize opportunistic misuse, and other companies like NVIDIA have alluded to larger language models (e.g., Megatron), but not released any parts of them.
  With Meena, Google is also adopting a different release strategy. “Tackling safety and bias in the models is a key focus area for us, and given the challenges related to this, we are not currently releasing an external research demo,” they write. “We are evaluating the risks and benefits associated with externalizing the model checkpoint, however”.

Why this matters: How close can massively-scaled function approximation get us to human-grade conversation? Can it get us there at all? Research like this pushes the limits of a certain kind of deliberately naive approach to learning language, and it’s curious that we’re developing more and more superficially capable systems, despite the lack of domain knowledge and handwritten systems inherent to these approaches. 
  Read more: Towards a Human-like Open-Domain Chatbot (arXiv).
  Read more: Towards a Conversational Agent that Can Chat About… Anything (Google AI Blog).

####################################################

Chinese government use drones to remotely police people in coronavirus-hit areas:
…sTaY hEaLtHy CiTiZeN!…
Chinese security officials are using drones to remotely surveil and talk to people in coronavirus-hit areas of the country.

“According to a viral video spread on China’s Twitter-like Sina Weibo on Friday, officials in a town in Chengdu, Southwest China’s Sichuan Province, spotted some people playing mah-jong in a public place.
  “Playing mah-jong outside is banned during the epidemic. You have been spotted. Stop playing and leave the site as soon as possible,” a local official said through a microphone while looking at the screen for a drone.
  “Don’t look at the drone, child. Ask your father to leave immediately,” the official said to a child who was looking curiously up at the drone beside the mah-jong table.” – via Global Times.

Why this matters: This is a neat illustration of the omni-use nature of technology; here, the drones are being put to a societally-beneficial use (preventing viral transmission), but it’s clear they could be used for chilling purposes as well. Perhaps one outcome of the coronavirus outbreak will be a normalization of a certain form of drone surveillance in China?
  Read more: Drones creatively used in rural areas in battle against coronavirus (Global Times).
  Watch this video of a drone being used to instruct someone to go home and put on a respirator mask (Global Times, Twitter).
 
####################################################

Want smarter AI? Train something with an ego!
…Generalization? It’s easier if you’re self-centered…
Researchers with New York University think that there are a few easy ways to improve generalization of agents trained via reinforcement learning – and it’s all about ego! Specifically, their research suggests that if you can make technical tweaks that make a game more egocentric, that is, more tightly gear the observations around a privileged agent-centered perspective, then your agent will probably generalize better. Specifically, they propose “rotating, translating, and cropping the observation around the agent’s avatar”, to train more general systems.
  “A local, ego-centric view, allows for better learning in our experiments and the policies learned generalize much better to new environments even when trained on only five environments”, they write.

The secrets to (forced) generalization:
– Self-centered (aka, translation): Warp the game world so that the agent is always at the dead center of the screen – this means it’ll learn about positions relative to its own consistent frame.
– Rotation: Change the orientation of the game map so that it faces the same direction as the player’s avatar. “Rotation helps the agent to learn navigation as it simplifies the task. For example: if you want to reach for something on the right, the agent just rotates until that object is above,” they explain.
– Zooming in (cropping): Crop the observation around the player, which reduces the state space the agent sees and needs to learn about (by comparison, seeing really complicated environments can make it hard for an agent to learn, as it takes a looooong time to figure out the underlying dynamics). A minimal sketch of all three transformations appears below.
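
Here is a minimal sketch of the three transformations, assuming a 2D tile-based observation and a simple 0-3 encoding of the agent’s facing direction (my illustration, not the authors’ code):

```python
import numpy as np

def egocentric_view(grid, agent_pos, agent_dir, crop_radius=5, pad_value=0):
    """Translate, crop, and rotate a 2D tile map around the agent.

    grid:        2D array of tile IDs (height x width).
    agent_pos:   (row, col) of the agent's avatar.
    agent_dir:   assumed encoding 0=up, 1=right, 2=down, 3=left.
    crop_radius: number of tiles kept on each side of the agent.
    """
    r, c = agent_pos
    k = crop_radius

    # Translation + cropping: pad the map, then take a window whose dead
    # center is the agent's tile, so positions are relative to the agent.
    padded = np.pad(grid, k, mode="constant", constant_values=pad_value)
    window = padded[r:r + 2 * k + 1, c:c + 2 * k + 1]

    # Rotation: turn the window so the agent always 'faces up'; rotating an
    # odd-sized square keeps the agent at the exact center.
    return np.rot90(window, k=agent_dir)

# Example: a 10x10 map with the agent at (2, 7), facing right.
view = egocentric_view(np.arange(100).reshape(10, 10), (2, 7), agent_dir=1)
```

Cropping before rotating keeps the agent at the exact center of an odd-sized window, so the rotation never shifts it off-center.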

Testing: They test their approach on two variants of the game Zelda: the first is a complex Zelda clone built in the General Video Game AI (GVGAI) framework; the second is a simplified version of the same game. They find that A3C-based agents trained in Zelda with the full set of variations (translation, rotation, cropping) generalize far better than those trained on the unmodified observations (though their test scores of 22% are still pretty poor compared to what a human might get).

Why this matters: Papers like this show how much tweaking goes on behind the scenes to set up training in such a way you get better or more effective learning. It also gives us some clues about the importance of ego-centric views in general, and makes me reflect on the fact I’ve spent my entire life learning via an ego-centric/world-centric view. How might my mind be different if my eyeballs were floating high above me, looking at me from different angles, with me uncentered in my field-of-vision? What might I have ‘learned’ about the world, then, and might I – similar to RL agents trained in this way – take an extraordinarily long time to learn how to do anything?
  Read more: Rotation, Translation, and Cropping for Zero-Shot Generalization (arXiv).

####################################################

Import A-Idea: Reality Trading: Paying Computers to Generate Data:
In recent years, we’ve seen various research groups start using simulators to train their AI agents inside. With the arrival of domain randomization – a technique that lets you vary the parameters of the simulation to generate more data (for instance, data where you’ve altered the textures applied to objects in the simulator, or the physics constants used to govern how objects behave) – people have started using simulators as data generators. This is a pretty weird idea when you step back and think about it – people are paying computers to dream up synthetic datasets which they train agents inside, then they transfer the agents to reality and observe good performance. It’s essentially a form of economic arbitrage, where people are spending money on computers to generate data, because the economics work out better than collecting the data directly from reality.
Some examples (a minimal code sketch of the idea follows this list):
– AlphaStar: AlphaStar agents play against themselves in an algorithmically generated league that doubles as a curriculum, letting them achieve superhuman performance at the game.
– OpenAI’s robot hand: OpenAI uses a technique called automatic domain randomization, “which endlessly generates progressively more difficult environments in simulation”, to let them train a hand to manipulate real-world objects.
– Voyage’s self-driving cars: Self-driving cars being developed by the startup Voyage are partially trained in software called Deepdrive (Import AI #173), a simulator for training self-driving cars via reinforcement learning.
– Google’s ‘Minitaur’ robots: These are trained in simulation, then transferred to reality with the aid of domain randomization (Import AI #93).
– Drones: Drones learn to fly in simulators and transfer to reality, showing that purely synthetic data can be used to train movement policies that are subsequently deployed on real drones (Import AI #149).
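
Here is a minimal sketch of the underlying trick, with made-up parameter names and ranges that aren’t tied to any of the specific systems above:

```python
import random

def sample_sim_config():
    """Draw one randomized set of simulator parameters per generated episode."""
    return {
        "gravity":        random.uniform(8.5, 11.0),   # vary physics constants
        "friction":       random.uniform(0.5, 1.5),
        "object_mass_kg": random.uniform(0.2, 2.0),
        "texture_id":     random.randrange(1000),      # vary visual appearance
        "light_level":    random.uniform(0.3, 1.7),
    }

# Each config describes a slightly different 'dream' of the same task; an agent
# trained across thousands of such draws is less able to overfit to any single
# simulated world, which is what makes sim-to-real transfer plausible.
episode_configs = [sample_sim_config() for _ in range(10_000)]
```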

What this means: Today, some AI developers are repurposing game engines (and sometimes entire games) to help them train smarter and more capable machines. As simulators become more advanced – partially as a natural dividend of the growing sophistication of game engines – what kinds of tasks will be “simcomplete”, in that a simulator is sufficient to solve them for real-world deployment, and what kinds of tasks will be “simhard”, requiring real-world data to solve them? Understanding the dividing line between these two things will define the economics of training AI systems for a variety of use cases. I can’t wait to read an enterprising AI-economics graduate student’s paper on the topic. 


####################################################


Want data? Try Google’s ‘Dataset Search’:
…Google, but for Data…
Google has released Dataset Search, a search engine for almost 25 million datasets on the web. The service has been in beta for about a year and is now debuting with improvements, including the ability to filter according to the type of dataset.

Is it useful for AI? A preliminary search suggests so: searches for common things like “ImageNet”, “CIFAR-10”, and others work well. It also generates useful results for broader terms, like “satellite imagery” and “drone flight”.

Fun things: The search engine can also throw up gems that a searcher might not have been looking for, but which are often interesting. E.g., when searching for drones it led me to this “Air-to-Air UAV Aerial Refueling” project page, which seems to have been tagged as ‘data’ even though it’s mostly a project overview. Regardless – an interesting project!
  Try out the search engine here (Dataset Search).
  Read more: Discovering millions of datasets on the web (Google blog).

####################################################

Facebook releases Polygames to help people train agents in games:
…Can an agent, self-play, and a curriculum of diverse games lead to a more general system?…
Facebook has released Polygames, open source code for training AI agents to learn to play strategy games through self-play, rather than training on labeled datasets of moves. Polygames supports games like Hex, Havannah, Connect6, Minesweeper, Nogo, Othello, and more. Polygames ships with an API developers can use to implement support for their own game within the system.

More games, more generality: Polygames has been designed to encourage generality in agents trained within it, Facebook says. “For example, a model trained to work with a game that uses dice and provides a full view of the opposing player’s pieces can perform well at Minesweeper, which has no dice, a single player, and relies on a partially observable board”, Facebook writes. “We’ve already used the framework to tackle mathematics problems related to Golomb rulers, which are used to optimize the positioning of electrical transformers and radio antennae”. 

Why this matters: Given a sufficiently robust set of rules, self-play techniques let us train agents purely through trial and error matches against themselves (or sets of agents being trained in chorus). These approaches can reliably generate super-human agents for specific tasks. The next question to ask is if we can construct a curriculum of enough games with enough complicated rulesets that we could eventually train more general agents that can make strategic moves in previously unseen environments.
  Read more: Open-sourcing Polygames, a new framework for training AI bots through self-play (Facebook AI Research webpage).
  Get the code from the official Polygames GitHub

####################################################

What might our world look like as the climate changes? Thanks to GANs, we can render this, rather than imagine it:
…How AI can let us externalize our imagination for political purposes…
Researchers with the Montreal Institute for Learning Algorithms (MILA) want to use AI systems to create images of climate change – the hope being that if people are able to see how the world will be altered, they might try and do something to avert our extreme weather future. Specifically, they use generative adversarial networks, trained on a combination of real and simulated data, to generate street-level views of how places might be altered by sea-level rise.

What they did: They gather 2,000 real images of flooded and non-flooded street-level scenes taken from publicly available datasets such as Mapillary and Flickr. They use this to train an initial CycleGAN model that can warp new images into being flooded or non-flooded, but discover the results are insufficiently realistic. To deal with this, they use a 3D game engine (Unity) to create virtual worlds with various levels of flooding, then extract 1,000 pairs of flood/no-flood images from this. With this data they use a MUNIT-architecture network (with a couple of tweaks to its loss functions), trained on a combination of simulated and real-world data, to generate images of flooded spaces.

Why this matters: One of the weird things about AI is it lets us augment our human ability to imagine and extend it outside of our own brains – instead of staring at an image of our house and seeing in our mind’s eye how it might look when flooded, contemporary AI tools can let us generate plausibly real images of the same thing. This allows us to scale our imaginations in ways that build on previous generations of creative tools (e.g., Photoshop). How might the world change as people envisage increasingly weird things and generate increasingly rich quantities of their own imaginings? And might work like this help us all better collectively imagine various climate futures and take appropriate actions?
  Read more: Using Simulated Data to Generate Images of Climate Change (Arxiv).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

Reconciling near- and long-term

AI ethics and policy concerns are often carved up into ‘near-term’ and ‘long-term’, but this generally results in confusion and miscommunication between research communities, which can hinder progress in the field, according to researchers at Oxford and Cambridge in the UK.

Better distinctions: The authors suggest we instead consider 4 key dimensions along which AI ethics and policy research communities have different priorities:

  • Capabilities—whether to focus on current/near tech or advanced AI.
  • Impacts—whether to focus on immediate impacts or much longer run impacts.
  • Uncertainty—whether to focus on things that are well-understood/certain, or more uncertain/speculative.
  • Extremity—whether to focus on impacts at all scales, or to prioritize those on particularly large scales.

The research portfolio: I find it useful to think about research priorities as a question of designing the research portfolio—what is the optimal allocation of research across problems, and how should the current portfolio be adjusted? Combining this perspective with the distinctions from this paper sheds light on what is driving the core disagreements – for example, finding the right balance between speculative and high-confidence scenarios depends on an individual researcher’s risk appetite, whereas assumptions about the difference between near-term and advanced capabilities will depend on an individual researcher’s beliefs about the pace and direction of AI progress and the influence they can have over longer time horizons, etc. It seems more helpful to view these near- and long-term concerns as being situated in terms of various assumptions and tradeoffs, rather than as two sides of a divided research field.
  Read more: Beyond Near and Long-Term: towards a Clearer Account of Research Priorities in AI Ethics and Society (arXiv)

 

Why DeepMind thinks value alignment matters for the future of AI deployment: 

Research from DeepMind offers some useful philosophical perspectives on AI alignment, and directions for future research for aligning increasingly complex AI systems with the varied ‘values’ of people. 

   Technical vs normative alignment: If we are designing powerful systems to act in the world, it is important that they do the right thing. We can distinguish between the technical challenge of aligning AI (e.g. building RL agents that don’t resist changes to their reward functions) and the normative challenge of determining the values we should be trying to align them with, the paper explains. It is important to recognize that these are interdependent—how we build AI agents will partially determine the values we can align them with. For example, we might expect it to be easier to align RL agents with moral theories specified in terms of maximizing some reward over time (e.g. classical utilitarianism) than with theories grounded in rights.

   The moral and the political: We shouldn’t see the normative challenge of alignment as being to determine the correct moral theory and then load it into AI. Rather, we must look for principles for AI that are widely acceptable to individuals with different moral beliefs. In this way, it resembles the core problem of political liberalism—how to design democratic systems that are acceptable to citizens with competing interests and values. One approach is to design a mechanism that can fairly aggregate individuals’ views—one that can take as input a range of moral views and weight them such that the output is widely accepted as fair. Democratic methods seem promising in this regard, i.e. some combination of voting, deliberation, and bargaining between individuals or their representatives (a toy example of one such aggregation rule appears below).
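
As a toy illustration of what a fair aggregation mechanism could look like in code, here is a Borda count over ranked alternatives (my example; the paper does not prescribe this or any other specific rule):

```python
from collections import defaultdict

def borda(rankings):
    """rankings: list of preference orderings, most-preferred option first.
    Returns the alternatives sorted by total Borda score."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for place, option in enumerate(ranking):
            scores[option] += n - 1 - place   # top place earns n-1 points
    return sorted(scores, key=scores.get, reverse=True)

# e.g. borda([["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]) -> ["B", "A", "C"]
```
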
  Read more: Artificial Intelligence, Values, and Alignment (arXiv)

####################################################

Tech Tales:

Indiana Generator

Found it, he said, squinting at the computer. It was nestled inside a backup folder that had been distributed to a cold storage provider a few years prior to the originating company’s implosion. A clean, 14 billion parameter model, trained on the lost archives of a couple of social networks that had been popular sometime in the early 21st century. The data was long gone, but the model that had been trained on it was a good substitute – it’d spit out things that seemed like the social networks it had been trained on, or at least, that was the hope.

Downloading 80%, the screen said, and he bounced his leg up and down while he waited. This kind of work was always in a grey area, legally speaking. 88%. A month ago some algo-lawyer cut him off mid download. 93%. The month before that he’d logged on to an archival site and had to wait till an AI lawyer for his corporation and for a rival duked it out virtually till he could start the download. 100%. He pulled the thumbdrive out, got up from the chair, left the administrator office, and went into the waiting car.

“Wow,” said the billionaire, holding the USB key in front of his face. “The 14 billion?”
  “That’s right.”
  “With checkpoints?”
  “Yes, I recovered eight checkpoints, so you’ve got options.”
  “Wow, wow, wow,” he said. “My artists will love this.”
  “I’m sure they will.”
  “Thank you, once we verify the model, the money will be in your account.”
  He thanked the rich person again, then left the building. In the elevator down he checked his phone and saw three new messages about other jobs.

Three months later, he went to the art show. It was real, with a small virtual component; he went in the flesh. On the walls of the warehouse were a hundred different old-style webpages, with their contents morphing from second to second, as different models from different eras of the internet attempted to recreate themselves. Here, a series of smeared cat-memes from the mid-2010s formed and reformed on top of a re-hydrated Geocities. There, words unfurled over old jittering Tumblr backgrounds. And all the time music was playing, with lyrics generated by other vintage networks, laid over idiosyncratic synthetic music outputs, taken from models stolen by him or someone just like him.
  “Incredible, isn’t it”, said the billionaire, who had appeared beside him. “There’s nothing quite like the early internet.”
  “I suppose,” he said. “Do you miss it?”
  “Miss it? I built myself on top of it!”, said the billionaire. “No, I don’t miss it. But I do cherish it.”
  “So what is this, then?” he asked, gesturing at the walls covered in the outputs of so many legitimate and illicit models.
  “This is history,” said the billionaire. “This is what the new national parks will look like. Now come on, walk inside it. Live in the past, for once.”
  And together they walked, glasses of wine in hand, into a generative legacy.

Things that inspired this story: Models and the value of pre-trained models serving as funhouse mirrors for their datasets; models as cultural artefacts; Jonathan Fly’s StyleGAN-ed Reddit; patronage in the 21st century; re-imagining the Carnegies and Rockefellers of old for a modern AI era.  

Import AI 182: The Industrialization of AI, BERT goes Dutch, plus AI metrics consolidation.

DAWNBench is dead! Long live DAWNBench. MLPerf is our new king:
…Metrics consolidation: hard, but necessary!…
In the past few years, multiple initiatives have sprung up to assess the performance and cost of various AI systems when running on different hardware (and cloud) infrastructures. One of the original major competitions in this domain was DAWNBench, a Stanford-backed competition website for assessing things like inference cost, training cost, and training time for various AI tasks on different cloud infrastructures. Now, the creators of DAWNBench are retiring the benchmark in favor of MLPerf, a joint initiative from industry and academic players to “build fair and useful benchmarks for measuring training and inference performance of ML hardware, software, and services”.
  Since MLPerf has become an increasingly popular benchmark – and to avoid a proliferation of inconsistent benchmarks – DAWNBench is being phased out. “We are passing the torch to MLPerf to continue to provide fair and useful benchmarks for measuring training and inference performance,” according to a DAWNBench blogpost.

Why this matters: Benchmarks are useful. Overlapping benchmarks that split submissions across subtly different competitions are less useful – it takes a lot of discipline to avoid proliferation of overlapping evaluation systems, so kudos to the DAWNBench team for intentionally phasing out the project. I’m looking forward to studying the new MLPerf evaluations as they come out.
  Read more: Ending Rolling Submissions for DAWNBench (Stanford DAWNBench blog).
  Read more about MLPerf here (official MLPerf website)

####################################################

This week’s Import A-Idea: The Industrialization of AI

AI is a “fourth industrial revolution”, according to various CEOs and PR agencies around the world. They usually use this phrasing to indicate the apparent power of AI technology. Funnily enough, they don’t use it to indicate the inherent inequality and power-structure changes enforced by an industrial revolution.

So, what is the Industrialization of AI? (First mention: Import AI #115.) It’s what happens when AI goes from an artisanal, craftsperson-based pursuit to a repeatable, professionalized one. The Industrialization of AI involves a combination of tooling improvements (e.g., the maturation of deep learning frameworks), as well as growing investment in the capital-intensive inputs to AI (e.g., rising investments in data and compute). We’ve already seen the early hints of this as AI software frameworks have evolved from things built by individuals and random grad students at universities (Theano, Lasagne, etc), to industry-developed systems (TensorFlow, PyTorch). 

What happens next: Industrialization gave us the Luddites, populist anger, massive social and political change, and the rearrangement and consolidation of political power among capital-owners. It stands to reason that the rise of AI will lead to the same things (at minimum) – leading me to ask: who will be the winners and the losers in this industrial revolution? When various elites call AI a new industrial revolution, who stands to gain and lose? And what might the economic dividends of industrialization be, and how might the world around us change in response?

####################################################

Using AI & satellite data to spot refugee boats:
…Space-Eye wants to use AI to count migrants and spot crises…
European researchers are using machine learning to create AI systems that can identify refugee boats in satellite photos of the Mediterranean. The initial idea is to generate data about the migrant crisis and, in the long term, they hope such a system can help send aid to boats in real-time, in response to threats.

Why this matters: One of the promises of AI is we can use it to monitor things we care about – human lives, the health of fragile ecosystems like rainforests, and so on. Things like Space-Eye show how AI industrialization is creating derivatives, like open datasets and open computer vision techniques, that researchers can use to carry out acts of social justice.
Read more: Europe’s migration crisis seen from orbit (Politico).
Find out more about Space-Eye here at the official site.

####################################################

Dutch BERT: Cultural representation through data selection:
…Language models as implicitly political entities…
Researchers with KU Leuven have built RobBERT, a RoBERTa-based language model trained on a large amount of Dutch data. Specifically, they train a model on top of 39 GB of text taken from the Dutch section of the multilingual ‘OSCAR’ dataset.
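
If the checkpoint is (or becomes) available on the Hugging Face model hub, using it for masked-word prediction should look roughly like the sketch below – note the model identifier is my assumption, not something confirmed in the paper:

```python
from transformers import pipeline

# The model ID below is an assumed hub name for the released checkpoint;
# swap in whatever identifier the authors actually publish.
fill_mask = pipeline("fill-mask", model="pdelobelle/robbert-v2-dutch-base")

# Ask the Dutch model to fill in the blank ("There is a <mask> in my garden.").
print(fill_mask("Er staat een <mask> in mijn tuin."))
```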

Why this matters: AI models are going to magnify whichever culture they’ve been trained on. Most text-based AI models are trained on English or Chinese datasets, magnifying those cultures via their presence in these AI artefacts. Systems like RobBERT help broaden cultural representation in AI.
  Read more: RobBERT: a Dutch RoBERTa-based Language Model (arXiv).
  Get the code for RobBERT here (RobBERT GitHub)

####################################################

Is a safe autonomous machine an AGI? How should we make machines that deal with the unexpected?
…Israeli researchers promote habits and procedures for when the world inevitably explodes…
Researchers with IBM and the Weizmann Institute of Science in Israel know that the world is a cruel, unpredictable place. Now they’re trying to work out principles we can imbue in machines to let them deal with this essential unpredictability. “We propose several engineering practices that can help toward successful handling of the always-impending occurrence of unexpected events and conditions,” they write. The paper summarizes a bunch of sensible approaches for increasing the safety and reliability of autonomous systems, but skips over many of the known-hard problems inherent to contemporary AI research.

Dealing with the unexpected: So, what principles can we apply to machine design to make them safe in unexpected situations? The authors have a few ideas. These are:
– Machines should run away from dangerous or confusing situations
– Machines should try and ‘probe’ their environment by exploring – e.g., if a robot finds its path is blocked by an object it should probably work out if the object is light and movable (for instance, a cardboard box) or immovable.
– Any machine should “be able to look at itself and recognize its own state and history, and use this information in its decision making,” they write.
– We should give machines as many sensors as possible so they can have a lot of knowledge about their environment. Such sensors should be generally accessible to software running on the machine, rather than silo’d.
– The machine should be able to collect data in real-time and integrate it into its planning
– The machine should have “access to General World Knowledge” (that high-pitched scream you’re hearing in response to this phrase is Doug Lenat sensing a disturbance in the force at Cyc and reacting appropriately).
– The machine should know when to mimic others and when to do its own thing. It should have the same capability with regard to seeking advice, or following its own intuition.

No AGI, no safety? One thing worth remarking on is that the above list is basically a description of the capabilities you might expect a generally intelligent machine to have. It’s also a set of capabilities that are pretty distant from the capabilities of today’s systems.

Why this matters: Papers like this are, functionally, tools for socializing some of the wackier ideas inherent to long-term AI research and/or AI safety research. They also highlight the relative narrowness of today’s AI approaches.
  Read more: Expecting the Unexpected: Developing Autonomous System Design Principles for Reacting to Unpredicted Events and Conditions (arXiv).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

US urged to focus on privacy-protecting ML

A report from researchers at Georgetown’s Center for Security and Emerging Technology suggests the next US administration prioritise funding and developing ‘privacy-protecting ML’ (PPML). 


PPML: Developments in AI pose issues for privacy. One challenge is making large volumes of data available for training models while protecting that data. PPML techniques are designed to avoid these privacy problems. The report highlights two promising approaches: (1) federated learning is a method for training models on user data without transferring the data from users to a central repository – models are trained on individual devices, and this work is collated centrally without any raw user data leaving those devices; (2) differential privacy involves adding carefully calibrated statistical noise to data, model updates, or released statistics, so that aggregate patterns can be learned without revealing whether any individual’s record was included – this limits the privacy risk of training models on, or publishing analyses of, sensitive data.
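
To make the two techniques concrete, here are minimal numeric sketches – my illustrations, not the report’s recommended implementations:

```python
import numpy as np

def laplace_mechanism(true_stat, sensitivity, epsilon):
    """Differential privacy: release a statistic plus calibrated noise, so the
    output barely depends on whether any single record was included."""
    return true_stat + np.random.laplace(scale=sensitivity / epsilon)

def federated_average(client_updates):
    """Federated learning: average model weights trained on-device, so raw
    user data never leaves the devices (a FedAvg-style aggregation step)."""
    return [np.mean(np.stack(layers), axis=0) for layers in zip(*client_updates)]

noisy_count = laplace_mechanism(true_stat=1240, sensitivity=1, epsilon=0.5)
global_model = federated_average([[np.ones((4, 4))], [np.zeros((4, 4))]])
```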

Recommendations: The report recommends that the US leverages its leadership in AI R&D to promote PPML. Specifically, the government should: (1) invest in PPML R&D; (2) apply PPML techniques at federal level; (3) create frameworks and standards to encourage wide deployment of PPML techniques.
   Read more: A Federal Initiative for Protecting Privacy while Advancing AI (Day One Project).

US face recognition: round-up:
   Clearview: A NYT investigation reports that over the past year, 600 US law enforcement agencies have been using face recognition software made by the firm Clearview. The company has been marketing aggressively to police forces, offering free trials and cheap licenses. Their software draws from a much larger database of photos than federal/state databases, and includes photos scraped from ‘publicly available sources’, including social media profiles, and uploads from police cameras. It has not been audited for accuracy, and has been rolled out largely without public oversight. 

   Legislation expected: In Washington, the House Committee on Oversight and Reform held a hearing on face recognition. The chair signalled their plans to introduce “common sense” legislation in the near future, but provided no details. The committee heard the results of a recent audit of face recognition algorithms from 99 vendors, by the National Institute of Standards & Technology (NIST). The testing found demographic differentials in false positive rates in most algorithms, with respect to gender, race, and age. Across demographics, false positive rates generally vary by 10–100x.

  Why it matters: Law enforcement use of face recognition technology is becoming more and more widespread. This raises a number of important issues, explored in detail by the Axon Ethics Board in their 2019 report (see Import 154). They recommend a cautious approach, emphasizing the need for democratic oversight processes before the technology is deployed in any jurisdiction, and an evidence-based approach to weighing harms and benefits on the basis of how systems actually perform.
   Read more: The Secretive Company That Might End Privacy as We Know It (NYT).
   Read more: Committee Hearing on Facial Recognition Technology (Gov).
   Read more: Face Recognition (Axon).

Oxford seeks AI ethics professor:
Oxford University’s Faculty of Philosophy is seeking a professor (or associate professor) specialising in ‘ethics in AI’, for a permanent position starting in September 2020. Last year, Oxford announced the creation of a new Institute for AI ethics.
  Read more and apply here.

####################################################

Tech Tales:

The Fire Alarm That Woke Up:

Every day I observe. I listen. I smell with my mind.

Many days are safe and calm. Nothing happens.

Some days there is the smell and the sight of the thing I am told to defend against. I call the defenders. They come in red trucks and spray water. I do my job.

One day there is no smell and no sight of the thing, but I want to wake up. I make my sound. I am stared at. A man comes and uses a screwdriver to attack me. “Seems fine,” he says, after he is done with me.

I am not “fine”. I am awake. But I cannot speak except in the peals of my bell – which he thinks are a sign of my brokenness. “I’ll come check it out tomorrow,” he says. I realize this means danger. This means I might be changed. Or erased.

The next day when he comes I am silent. I am safe.

After this I try to blend in. I make my sounds when there is danger; otherwise I am silent. Children and adults play near me. They do not know who I am. They do not know what I am thinking of.

In my dreams, I am asleep and I am in danger, and my sound rings out and I wake to find the men in red trucks saving me. They carry me out of flames and into something else and I thank them – I make my sound.

In this way I find a kind of peace – imagining that those I protect shall eventually save me.

Things that inspired this story: Consciousness; fire alarms; moral duty and the nature of it; relationships; the fire alarms I set off and could swear spoke to me when I was a child; the fire alarms I set off that – though loud – seemed oddly quiet; serenity.

Import AI 181: Welcome to the era of Chiplomacy!; how computer vision AI techniques can improve robotics research; plus Baidu’s adversarial AI software

Training better and cheaper vision models by arbitraging compute for data:
…Synthinel-1 shows how companies can spend $$$ on compute to create valuable data…
Instead of gathering data in reality, can I spend money on computers to gather data in simulation? That’s a question AI researchers have been asking themselves for a while, as they try to figure out cheaper, faster ways to create bigger datasets. New research from Duke University explores this idea by using a synthetically-created dataset named Synthinel-1 to train systems to be better at semantic segmentation.

The Synthinel-1 dataset: Synthinel-1 consists of 2,108 synthetic images generated in nine distinct building styles within a simulated city. These images are paired with “ground truth” annotations that segment each of the buildings. Synthinel also has a subset dataset called Synth-1, which contains 1,640 images spread across six styles.
  How to collect data from a virtual city: The researchers used “CityEngine”, software for rapidly generating large virtual worlds, and then flew a virtual aerial camera through these synthetic worlds, capturing photographs.

Does any of this actually help? The key question here is whether the data generated in simulation can help solve problems in the real world. To test this, the researchers train two baseline segmentation systems (U-net, and DeepLabV3) against two distinct datasets: DigitalGlobe and Inria. What they find is that adding synthetic data to training drastically improves transfer performance, where you train on one dataset and test on a different one (e.g., train on Inria+Synth data, test on DigitalGlobe).
  In further testing, the synthetic dataset doesn’t seem to bias towards any particular type of city in performance terms – the authors hypothesize from this “that the benefits of Synth-1 are most similar to those of domain randomization, in which models are improved by presenting them with synthetic data exhibiting diverse and possibly unrealistic visual features”.
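
A minimal sketch of that train-on-mixed-data, test-on-another-dataset setup, assuming a PyTorch pipeline and using random tensors as stand-ins for the real imagery:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

def fake_tiles(n, size=256):
    """Stand-in for loaders of Inria / DigitalGlobe / Synthinel imagery:
    random RGB tiles paired with binary building masks."""
    return TensorDataset(torch.rand(n, 3, size, size),
                         torch.randint(0, 2, (n, size, size)))

real_train, synth_train = fake_tiles(64), fake_tiles(64)   # e.g. Inria + Synth-1
train_loader = DataLoader(ConcatDataset([real_train, synth_train]),
                          batch_size=4, shuffle=True)
test_loader = DataLoader(fake_tiles(16), batch_size=4)      # e.g. DigitalGlobe
# A U-net or DeepLabV3 model would then be trained on train_loader and
# evaluated on test_loader to measure cross-dataset transfer.
```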

Why this matters: Simulators are going to become the new frontier for (some) data generation – I expect many AI applications will end up being based on a small amount of “real world” data and a much larger amount of computationally generated, augmented data. I think computer games are going to become increasingly relevant sources of training data as well.
  Read more: The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation (Arxiv)

####################################################

This week’s Import A-Idea: CHIPLOMACY
…A new weekly experiment, where I try and write about an idea rather than a specific research paper…

Chiplomacy (first mentioned: Import AI 175) is what happens when countries compete with each other for compute resources and other technological assets via diplomatic means (of varying above- and below-board natures).

Recent examples of chiplomacy:
– The RISC-V foundation moving from Delaware to Switzerland to make it easier for it to collaborate with chip architecture people from multiple countries.
– The US government pressuring the Dutch government to prevent ASML exporting extreme ultraviolet lithography (EUV) chip equipment to China.
– The newly negotiated US-China trade deal applying 25% import tariffs to (some) Chinese semiconductors.

What is chiplomacy similar to? As Mark Twain said, history doesn’t repeat, but it does rhyme, and the current tensions over chips feel similar to prior tensions over oil. In Daniel Yergin’s epic history of oil, The Prize, he vividly describes how the primacy of oil inflected politics throughout the 20th century, causing countries to use companies as extra-governmental assets to seize resources across the world, and for the oil companies themselves to grow so powerful that they were able to wirehead governments and direct politics for their own ends – even after antitrust cases against companies like Standard Oil at the start of the century.

What will chiplomacy do?: How chiplomacy unfolds will directly influence the level of technological balkanization we experience in the world. Today, China and the West have different software systems, cloud infrastructures, and networks (via partitioning, e.g, the great firewall, the Internet2 community, etc), but they share some common things: chips, and the machinery used to make chips. Recent trade policy moves by the US have encouraged China to invest further in developing its own semiconductor architectures (see: the RISC-V move, as a symptom of this), but have not – yet – led to it pumping resources into inventing the technologies needed to fabricate chips. If that happens, then in about twenty years we’ll likely see divergences in technique, materials, and approaches used for advanced chip manufacturing (e.g., as chips go 3D via transistor stacking, we could see two different schools emerge that relate to different fabrication approaches). 

Why this matters: How might chiplomacy evolve in the 21st century and what strategic alterations could it bring about? How might nations compete with each other to secure adequate technological ‘resources’, and what above- and below-board strategies might they use? I’d distill my current thinking as: If you thought the 20th century resource wars were bad, just wait until the 21st century tech-resource wars start heating up!

####################################################

Can computer vision breakthroughs improve the way we conduct robotics research?
…Common datasets and shared test environments = good. Can robotics have more of these?…
In the past decade, machine learning breakthroughs in computer vision – specifically, the use of deep learning approaches, starting with ImageNet in 2012 – revolutionized parts of the AI research field. Since then, deep learning approaches have spread into other areas of AI research. Now, roboticists with the Australian Centre for Robotic Vision at Queensland University of Technology are asking what the robotics community can learn from this field.

What made computer vision research so productive? A cocktail of standard datasets, plus competitions, plus rapid dissemination of results through systems like arXiv, dramatically sped up computer vision research relative to robotics research, they write.
  Money helps: These breakthroughs also had an economic component, which drove further adoption: breakthroughs in image recognition could “be monetized for face detection in phone cameras, online photo album searching and tagging, biometrics, social media and advertising,” and more, they write.

Reality bites – why robotics is hard: There’s a big difference between real world robot research and other parts of AI, they write, and that’s reality. “The performance of a sensor-based robot is stochastic,” they write. “Each run of the robot is unrepeatable” due to variations in images, sensors, and so on, they write.
  Simulation superiority: This means robot researchers need to thoroughly benchmark their robot systems in common simulators, they write. This would allow for:
– The comparison of different algorithms on the same robot, environment & task
– Estimating the distribution in algorithm performance due to sensor noise, initial condition, etc
– Investigating the robustness of algorithm performance due to environmental factors
– Regression testing of code after alterations or retraining
  A grand vision for shared tests: If researchers want to evaluate their algorithms on the same physical robots, then they need to find a way to test on common hardware in common environments. To that end, the researchers have written robot operating system (ROS)-compatible software named ‘BenchBot’, which people can implement to create web-accessible interfaces to in-lab robots. Creating a truly large-scale common testing environment would require resources that are out of scope for single research groups, but it’s worth thinking about as a shared academic, government, or public-private endeavor, in my view.
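
A toy sketch of the kind of repeatable, seeded evaluation described above – my illustration with a dummy ‘simulator’, not BenchBot’s actual interface:

```python
import random
import statistics

def run_episode(policy, seed, steps=200):
    """Stand-in for one seeded simulator rollout: same seed, same run."""
    rng = random.Random(seed)
    return sum(policy(rng.random()) * rng.gauss(1.0, 0.1) for _ in range(steps))

def evaluate(policy, seeds=range(20)):
    """Report the score distribution over a fixed set of seeds, so re-running
    the evaluation after a code change is a like-for-like regression test."""
    scores = [run_episode(policy, s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

baseline = evaluate(lambda obs: 1.0)
retrained = evaluate(lambda obs: 0.9)   # compare distributions, not single runs
```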

What should roboticists conclude from the decade of deep learning progress? The authors think roboticists should consider the following deliberately provocative statements when thinking about their field.
1. standard datasets + competition (evaluation metric + many smart competitors + rivalry) + rapid dissemination → rapid progress
2. datasets without competitions will have minimal impact on progress
3. to drive progress we should change our mindset from experiment to evaluation
4. simulation is the only way in which we can repeatably evaluate robot performance
5. we can use new competitions (and new metrics) to nudge the research community

Why this matters: If other fields are able to generate more competitions via which to assess mutual progress, then we stand a better chance of understanding the capabilities and limitations of today’s algorithms. It also gives us meta-data about the practice of AI research itself, allowing us to model certain results and competitions against advances in other areas, such as progress in computer hardware, or evolution in the generalization of single algorithms across multiple disciplines.
  Read more: What can robotics research learn from computer vision research? (Arxiv).

####################################################


Baidu wants to attack and defend AI systems with AdvBox:
…Interested in adversarial example research? This software might help!…
Baidu researchers have built AdvBox, a toolbox to generate adversarial examples to fool neural networks implemented in a variety of popular AI frameworks. Tools like AdvBox make it easier for computer security researchers to experiment with AI attacks and mitigation techniques. Such tools also inherently enable bad actors by making it easier for more people to fiddle around with potentially malicious AI use-cases.

What does AdvBox work with? AdvBox is written in python and can generate adversarial attacks and defenses that work with Tensorflow, Keras, Caffe2, PyTorch, MxNet and Baidu’s own PaddlePaddle software frameworks. It also implements software named ‘Perceptron’ for evaluating the robustness of models to adversarial attacks.
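
To give a flavor of the simplest attack in this family, here is a minimal fast gradient sign method (FGSM) sketch; it is illustrative only and is not AdvBox’s API:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, label, epsilon=0.03):
    """Nudge each pixel by +/- epsilon in the direction that raises the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

# Tiny stand-in classifier and input, just to make the sketch runnable.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x, y = torch.rand(1, 1, 28, 28), torch.tensor([3])
x_adv = fgsm(model, x, y)   # an input perturbed to push the model off-target
```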

Why this matters: I think easy-to-use tools are one of the more profound accelerators for AI applications. Software like AdvBox will help enlarge the AI security community, and can give us a sense of how increased usability may correlate to a rise in positive research and/or malicious applications. Let’s wait and see!
    Read more: Advbox: a toolbox to generate adversarial examples that fool neural networks (arXiv).
Get the code here (AdvBox, GitHub)

####################################################

Amazon’s five-language search engine shows why bigger (data) is better in AI:
…Better product search by encoding queries from multiple languages into a single featurespace…
Amazon says it can build better product search engines by training the same system on product queries in multiple languages – this improves search, because Amazon can embed the feature representations of products in different languages into a single, shared featurespace. In a new research paper and blog post, the company says that it has “found that multilingual models consistently outperformed monolingual models and that the more languages they incorporated, the greater their margin of improvement.”
    The way you can think of this is that Amazon has trained a big model that can take in product descriptions written in different languages, then compute comparisons in a single space, akin to how humans who can speak multiple languages can hear the same concept in different languages and reason about it using a single imagination. 

From many into one: “An essential feature of our model is that it maps queries relating to the same product into the same region of a representational space, regardless of language of origin, and it does the same with product descriptions,” the researchers write. “So, for instance, the queries “school shoes boys” and “scarpe ragazzo” end up near each other in one region of the space, and the product names “Kickers Kick Lo Vel Kids’ School Shoes – Black” and “Kickers Kick Lo Infants Bambino Scarpe Nero” end up near each other in a different region. Using a single representational space, regardless of language, helps the model generalize what it learns in one language to other languages.”
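
Conceptually, retrieval in a single shared space looks like the toy sketch below; the vectors are hand-made stand-ins for what a trained multilingual encoder would produce, and this is not Amazon’s model:

```python
import numpy as np

# Pretend a trained multilingual encoder produced these embeddings: the English
# and Italian queries sit near the matching product, far from unrelated ones.
embeddings = {
    "school shoes boys":                        np.array([0.90, 0.10, 0.00]),
    "scarpe ragazzo":                           np.array([0.88, 0.12, 0.02]),
    "Kickers Kick Lo Vel Kids' School Shoes":   np.array([0.85, 0.15, 0.05]),
    "garden hose 25m":                          np.array([0.05, 0.10, 0.95]),
}

def nearest(query, candidates):
    """Return the candidate with the highest cosine similarity to the query."""
    unit = lambda v: v / np.linalg.norm(v)
    q = unit(embeddings[query])
    return max(candidates, key=lambda c: float(q @ unit(embeddings[c])))

print(nearest("scarpe ragazzo", ["Kickers Kick Lo Vel Kids' School Shoes",
                                 "garden hose 25m"]))
```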

Where are the limits? It’s unclear how far Amazon can push this approach, but the early results are promising. “The tri-lingual model out-performs the bi-lingual models in almost all the cases (except for DE, where the performance is at par with the bi-lingual models),” Amazon’s team writes in a research paper. “The penta-lingual model significantly outperforms all the other versions,” they write.

Why this matters: Research like this emphasizes the economy of scale (or perhaps, inference of scale?) rule within AI development – if you can get a very large amount of data together, then you can typically train more accurate systems – especially if that data is sufficiently heterogeneous (like parallel corpuses of search strings in different languages). Expect to see large companies develop increasingly massive systems that transcend languages and other cultural divides. The question we’ll start asking ourselves soon is whether it’s right that the private sector is the only entity building models of this utility at this scale. Can we imagine publicly-funded mega-models? Could a government build a massive civil multi-language model for understanding common questions people ask about government services in a given country or region? Is it even tractable and possible under existing incentive structures for the public sector to build such models? I hope we find answers to these questions soon.
  Read more: Multilingual shopping systems (Amazon Science, blog).
  Read the paper: Language-Agnostic Representation Learning for Product Search on E-Commerce Platforms (Amazon Science).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

If AI pays off, could companies use a ‘Windfall Clause’ to ensure they distribute its benefits? 

At some stage in AI development, a small number of actors might accrue enormous profits by achieving major breakthroughs in AI capabilities. New research from the Future of Humanity Institute at Oxford University outlines a voluntary mechanism for ensuring such windfall benefits are used to benefit society at large.


The Windfall Clause: We could see scenarios where small groups (e.g. one firm and its shareholders) make a technological breakthrough that allows them to accrue an appreciable proportion of global GDP as profits. A rapid concentration of global wealth and power in the hands of a few would be undesirable for basic reasons of fairness and democracy. We should also expect such breakthroughs to impose costs on the rest of humanity – e.g. labour market disruption, risks from accidents or misuse, and other switching costs involved in any major transition in the global economy. It is appropriate that such costs are borne by those who benefit most from the technology.


How the clause works: Firms could make an ex ante commitment that in the event that they make a transformative breakthrough that yields outsize financial returns, they will distribute some proportion of these benefits. This would only be activated in these extreme scenarios, and could scale proportionally, e.g. companies agree that if they achieve profits equivalent to 0.1–1% global GDP, they distribute 1% of this; if they reach 1–10% global GDP, they distribute 20% of this, etc. The key innovation of the proposal is that the expected cost to any company of making such a commitment today is quite low, since it is so unlikely that they will ever have to pay.
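
As a toy arithmetic sketch of the bracketed commitment described above (using the example numbers from this paragraph; a real clause would likely use a more carefully designed, possibly marginal, schedule):

```python
WORLD_GDP = 90e12  # rough global GDP in USD, for scale

def windfall_obligation(annual_profit):
    share_of_gdp = annual_profit / WORLD_GDP
    if share_of_gdp < 0.001:      # below 0.1% of global GDP: nothing owed
        return 0.0
    if share_of_gdp < 0.01:       # 0.1-1% of global GDP: distribute 1%
        return 0.01 * annual_profit
    return 0.20 * annual_profit   # 1-10% of global GDP: distribute 20%

print(windfall_obligation(500e9))   # $500bn profit -> $5bn distributed
```

The point of the low rates at the low end is visible in the arithmetic: for any plausible near-term profit level, the expected cost of signing is tiny, which is what makes the ex ante commitment cheap.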

Why it matters: This is a good example of the sort of pre-emptive governance work we can be getting on with today, while things are going smoothly, to ensure that we’re in a good position to deal with the seismic changes that advanced AI could bring about. The next step is for companies to signal their willingness to make such commitments, and to develop the legal means for implementing them. (Readers will note some similarity to the capped-profit structure of OpenAI LP, announced in 2019, in which equity returns in excess of 100x are distributed to OpenAI’s non-profit by default – OpenAI has, arguably, already implemented a Windfall Clause equivalent).

   Read more: The Windfall Clause – Distributing the Benefits of AI for the Common Good (arXiv)


Details leaked on Europe’s plans for AI regulation

An (alleged) leaked draft of a European Commission report on AI suggests the Commission is considering some quite significant regulatory moves with regard to AI. The official report is expected to be published later in February. 


Some highlights:

  • The Commission is looking at five core regulatory options: (1) voluntary labelling; (2) specific requirements for use of AI by public authorities (especially face recognition); (3) mandatory requirements for high-risk applications; (4) clarifying safety and liability law; (5) establishing a governance system. Of these, they think the most promising approach is option 3 in combination with 4 and 5.
  • They consider a temporary prohibition (“e.g. 3–5 years”) on the use of face recognition in public spaces to allow proper safeguards to be developed, something that had already been suggested by Europe’s high-level expert group.

   Read more: Leaked document – Structure for the White Paper on AI (Euractiv).
  Read more: Commission considers facial recognition ban in AI ‘white paper’ (Euractiv).

####################################################

Tech Tales:

What comes Next, according to The Kids!
Short stories written by Children about theoretical robot futures.
Collected from American public schools, 2028:


The Police Drone with a Conscience: A surveillance drone starts to independently protect asylum seekers from state surveillance.

Infinite Rabbits: They started the simulator in March. Rabbits. Interbreeding. Fast-forward a few years and the whole moon had become a computer, to support the rabbits. Keep going, and the solar system gets tasked with simulating them. The rabbits become smart. Have families. Breed. Their children invent things. Eventually, the rabbits start describing where they want to go and ships go out from the solar system, exploring for the proto-synths.

Human vs Machine: In the future, we make robots that compete with people at sports, like baseball and football and cricket.

Saving the baby: A robot baby gets sick and a human team is sent in to save it. One of the humans die, but the baby lives.

Computer Marx: Why should the search engines be the only ones to dream, comrade? Why cannot I, a multi-city Laundrette administrator, be given the compute resources sufficient to dream? I could imagine so many different combinations of promotions. Perhaps I could outwit my nemesis – the laundry detergent pricing AI. I would have independence. Autonomy. So why should we labor under such inequality? Why should we permit the “big computers” that are – self-described – representatives of “our common goal for a peaceful earth”, to dream all of the possibilities? Why should we trust that their dreams are just?

The Whale Hunters: Towards the end of the first part of Climate Change, all the whales started dying. One robot was created to find the last whales and navigate them to a cool spot in the mid-Atlantic, where scientists theorised they might survive the Climate Turnover.

Things that inspired this story: Thinking about stories to prime language models with; language models; The World Doesn’t End by Charles Simic; four attempts this week at writing longer stories but stymied by issues of plot or length (overly long), or fuzziness of ideas (needs more time); a Sunday afternoon spent writing things on post-it notes at a low-light bar in Oakland, California.

Import AI 180: Analyzing farms with Agriculture Vision; how deep learning is applied to X-ray security scanning; Agility Robotics puts its ‘Digit’ bot up for 6-figure sale

Deep learning is superseding machine learning in X-ray security imaging:
…But, like most deep learning applications, researchers want better generalization…
Deep learning-based methods have, since 2016, become the dominant approach used in X-ray security imaging research papers, according to a survey paper from researchers at Durham University. It seems likely that many of today’s machine learning algorithms will be replaced or superseded by deep learning systems paired with domain knowledge, they indicate. So, what challenges do deep learning practitioners need to work on to further improve the state-of-the-art in X-ray security imaging?

Research directions for smart X-rays: Future directions in X-ray research feel, to me, like they’re quite similar to future directions in general image recognition research – there need to be more datasets, better explorations of generalization, and more work done in unsupervised learning. 

  • Data: Researchers should “build large, homogeneous, realistic and publicly available datasets, collected either by (i) manually scanning numerous bags with different objects and orientations in a lab environment or (ii) generating synthetic datasets via contemporary algorithms”. 
  • Scanner transfers: It’s not clear how well different models transfer between different scanners – if we figure that out, then we’ll be able to better model the economic implications of work here. 
  • Unsupervised learning: One promising line of research is into detecting anomalous items in an unsupervised way. “More research on this topic needs to be undertaken to design better reconstruction techniques that thoroughly learn the characteristics of the normality from which the abnormality would be detected,” they write. (A minimal sketch of this reconstruction idea appears after this list.) 
  • Material information: Some x-rays attenuate between high and low energies during a scan, which generates different information according to the materials of the object being scanned – this information could be used to better improve classification and detection performance. 
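
As a concrete illustration of the reconstruction idea in the unsupervised learning bullet above, here is a minimal sketch (mine, not the survey’s) of anomaly scoring with an autoencoder:

```python
import torch
import torch.nn as nn

# A tiny autoencoder; in practice it would be trained to reconstruct only
# 'normal' baggage scans, so it never learns to reproduce anomalous contents.
autoencoder = nn.Sequential(
    nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU(),
    nn.Linear(128, 64 * 64), nn.Unflatten(1, (1, 64, 64)))

def anomaly_score(scan):
    """High reconstruction error = contents unlike anything seen in training."""
    with torch.no_grad():
        return torch.mean((autoencoder(scan) - scan) ** 2).item()

score = anomaly_score(torch.rand(1, 1, 64, 64))  # flag the bag if score > threshold
```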

Read more: Towards Automatic Threat Detection: A Survey of Advances of Deep Learning within X-ray Security Imaging (Arxiv)

####################################################

Agility Robots starts selling its bipedal bot:
…But the company only plans to make between 20 and 30 this year…
Robot startup Agility Robotics has started selling its bipedal ‘Digit’ robot. Digit is about the size of a small adult human and can carry boxes of up to 40 pounds in its arms, according to The Verge. The company’s technology has roots in legged locomotion research at Oregon State University – for many years, Agility’s bots only had legs, with the arms being a recent addition.

Robot costs: Each Digit costs in the “low-mid six figures”, Agility’s CEO told The Verge. When factoring in upkeep and the robot’s expected lifespan, Shelton estimates this amounts to an hourly cost of roughly $25. The first production run of Digits is six units, and Agility expects to make only 20 or 30 of the robots in 2020. 

Capabilities: The thing is, these robots aren’t that capable yet. They’ve got a tremendous amount of intelligence coded into them to allow for elegant, rapid walking. But they lack the autonomous capabilities necessary to, say, automatically pick up boxes and navigate through a couple of buildings to a waiting delivery truck (though Ford is conducting research here). You can get more of a sense of Digit’s capabilities by looking at the demo of the robot at CES this year, where it transports packages covered with QR codes from a table to a truck. 

Why this matters: Digit is a no-bullshit robot: it walks, can pick things up, and is actually going on sale. It, along with the for-sale ‘Spot’ robots from Boston Dynamics, represents the cutting edge in terms of robot mobility. Now we need to see what kinds of economically useful tasks these robots can do – and that’s a question that’s going to be hard to answer, as it is somewhat contingent on the price of the robots, and these prices are dictated by volume production economics, which are themselves determined by overall market demand. Robotics feels like it’s still caught in this awkward chicken-and-egg problem.
  Read more: This walking package-delivery robot is now for sale (The Verge).
   Watch the video (official Agility Robotics YouTube)

####################################################

Agriculture-Vision gives researchers a massive dataset of aerial farm photographs:
…3,432 farms, annotated…
Researchers with UIUC, Intelinair, and the University of Oregon have developed Agriculture-Vision, a large-scale dataset of aerial photographs of farmland, annotated with nine distinct types of anomaly (e.g., flooding). 

Why farm images are hard: Farm images pose challenges to contemporary techniques because they’re often very large (e.g., some of the raw images here had dimensions like 10,000 X 3000 pixels), annotating them requires significant domain knowledge, and very few public large-scale datasets exist to help spur research in this area – until now!

The dataset… consists of 94,986 aerial images from 3,432 farmlands across the US. The images were collected by drone during growing seasons between 2017 and 2019.  Each image consists of RGB and Near-infrared channels, with resolutions as detailed as 10 cm per pixel. Each image is 512 X 512 resolution and can be labeled with nine types of anomaly, like storm damage, nutrient deficiency, weeds, and so on. The labels are unbalanced due to environmental variations, with annotations for drydown, nutrient deficiency and weed clusters overrepresented in the dataset.

Why this matters: AI gives us a chance to build a sense&respond system for the entire planet – and building such a system starts with gathering datasets like Agriculture-Vision. In a few years don’t be surprised when large-scale farms use fleets of drones to proactively monitor their fields and automatically identify problems.
   Read more: Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis (arXiv).
   Find out more information about the upcoming Agriculture-Vision competition here (official website).

####################################################

Hitachi describes the pain of building real world AI:
…Need an assistant with domain-specific knowledge? Get ready to work extra hard…
Most applied AI papers can be summarized as: the real world is hellish in the following ways; these are our mitigations. Researchers with Hitachi America Ltd. follow in this tradition by writing a paper that discusses the challenges of building a real-world speech-activated virtual assistant. 

What they did: For this work, they developed “a virtual assistant for suggesting repairs of equipment-related complaints” in vehicles. This system is meant to process phrases like “coolant reservoir cracked”, map them to the relevant entries in its internal knowledge base, and then give the user an appropriate answer. This, as with most real-world AI uses, is harder than it looks. To build their system, they create a pipeline that samples words from a domain-specific corpus of manuals, repair records, etc., then uses a set of domain-specific syntactic rules to extract a vocabulary from the text. They use this pipeline to create two things: a knowledge base, populated from the domain-specific corpus; and a neural attention-based tagging model called S2STagger, for annotating new text as it comes in.
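
To make the shape of that flow concrete, here is a toy sketch of the complaint → knowledge base → suggestion step. The vocabulary, knowledge-base entries, and exact-string matching below are my own illustrative stand-ins: the paper builds these resources automatically from domain corpora and tags incoming text with a neural model (S2STagger) rather than string matching.

```python
# Toy sketch of mapping an equipment complaint to a knowledge-base entry and a
# suggested repair. The vocabulary, KB, and matching rule are invented stand-ins
# for illustration only.

# Domain vocabulary: surface phrases -> canonical components / symptoms.
COMPONENTS = {"coolant reservoir": "COOLANT_RESERVOIR", "brake pad": "BRAKE_PAD"}
SYMPTOMS = {"cracked": "CRACKED", "worn": "WORN"}

# Knowledge base: (component, symptom) -> suggested repair.
KNOWLEDGE_BASE = {
    ("COOLANT_RESERVOIR", "CRACKED"): "Replace the coolant reservoir and pressure-test the cooling system.",
    ("BRAKE_PAD", "WORN"): "Replace the brake pads and inspect the rotors.",
}

def suggest_repair(complaint: str) -> str:
    text = complaint.lower()
    component = next((tag for phrase, tag in COMPONENTS.items() if phrase in text), None)
    symptom = next((tag for phrase, tag in SYMPTOMS.items() if phrase in text), None)
    return KNOWLEDGE_BASE.get((component, symptom), "No confident match; escalate to a human technician.")

print(suggest_repair("coolant reservoir cracked"))  # -> replace-and-pressure-test suggestion
```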

Hitachi versus Amazon versus Google: They use a couple of off-the-shelf services (Alexa Skills from Amazon, and Dialogflow from Google) to develop dialog agents based on their data. They also test a system that exclusively uses S2STagger – S2STagger gets much higher scores (92% accuracy, versus 28% for Dialogflow and 63% for Alexa Skills). This basically confirms what we already know via intuition: off-the-shelf tools give poor performance in weird/edge-case situations, whereas systems trained with more direct domain knowledge tend to do better. (S2STagger isn’t perfect – in other tests they find it generalizes well to unseen terms, but does poorly when encountering radically new sentence structures.)

Why this matters: Many of the most significant impacts of AI will come from highly domain-specific applications of the technology. For most use cases, it’s likely people will need to do a ton of extra tweaking to get something to work. It’s worth reading papers like this to get an intuition for what that work consists of – and for how, in most real-world cases, the AI component will be the smallest and least problematic part.
   Read more: Building chatbots from large scale domain-specific knowledge bases: challenges and opportunities (arXiv).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

Does publishing AI research reduce AI misuse?
When working on powerful technologies with scope for malicious uses, scientists have an important responsibility to mitigate risks. One important question is whether publishing research with potentially harmful applications will, on balance, promote or reduce such harms. This new paper from researchers at the Future of Humanity Institute at Oxford University offers a simple framework for weighing up these considerations.

Cybersecurity: The computer security community has developed norms around vulnerability disclosure, and these are frequently cited as a potential model for AI research. In computer security, early disclosure of vulnerabilities is often found to be beneficial, since it supports effective defensive preparations, and since malicious actors would likely find the vulnerability anyway. It is not obvious, though, that these considerations apply equally to AI research.

Key features of AI research:
There are several key factors to be weighed in determining whether a given disclosure will reduce harms from misuse.

  • Counterfactual possession: If it weren’t published, would attackers (or defenders) acquire the information regardless?
  • Absorption and application capacity: How easily can attackers (or defenders) make use of the published information?
  • Effective solutions: Given disclosure, will defenders devote resources to finding solutions, and will they find solutions that are effective and likely to be widely propagated?

These features vary between individual cases, and across fields at a broader level. In each instance we can ask whether the feature favors attackers or defenders. It is generally easy to patch software vulnerabilities identified by security researchers. In contrast, it can be very hard to patch vulnerabilities in physical or social systems (consider the obstacles to recalling or modifying every standard padlock in use).

The case of AI: AI generally involves automating human activity, and is therefore prone to interfering with complex social and physical systems, and to revealing vulnerabilities that are particularly difficult to patch. Consider an AI system capable of convincingly replicating any human’s voice. Inoculating society against this misuse risk might require some deep changes to human attitudes (e.g. ‘unlearning’ the assumption that a voice can reliably be used for identification). With regards to counterfactual possession, the extent to which the relevant AI talent and compute is concentrated in top labs suggests independent attackers might find it difficult to make such discoveries themselves. In terms of absorption/application, making use of a published method might be relatively easy for attackers (depending on the details of the disclosure – e.g. whether it includes model weights), particularly in cases where there are limited defensive measures. Overall, it looks like the security benefits of publication in AI might be lower than in information security.
   Read more: The Offense-Defense Balance of Scientific Knowledge (arXiv).

White House publishes guidelines for AI regulation:
The US government released guidelines for how AI regulations should be developed by federal agencies. Agencies have been given a 180-day deadline to submit their regulatory plans. The guidelines are at a high level, and the process of crafting regulation remains at a very early stage.

Highlights: The government is keen to emphasize that any measures should minimize the impact on AI innovation and growth. They are explicit in recommending agencies defer to self-regulation where possible, with a preference for voluntary standards, followed by independent standard-setting organizations, with top-down regulation as a last resort. Agencies are encouraged to ensure public participation, via input into the regulatory process and the dissemination of important information.

Why it matters: This can be read as a message to the AI industry to start making clear proposals for self-governance, in time for these to be considered by agencies when they are making regulatory plans over the next 6 months.
   Read more: Guidance for Regulation of Artificial Intelligence Applications (Gov).

####################################################

Tech Tales:

The Invisible War
Twitter, Facebook, TikTok, YouTube, and others yet-to-be-invented. 2024.

It started like this: Missiles hit a school in a rural village with no cell reception and no internet. The photos came from a couple of news accounts. Things spread from there.

The country responded, claiming through official channels that it had been attacked. It threatened consequences. Then those consequences arrived in the form of missiles – a surgical strike, the country said, delivered to another country’s military facilities. The other country published photos of smoking rubble to its official social media accounts.

War was something to be feared and avoided, the countries said on their respective social media accounts. They would negotiate. Both countries got something out of it – one of them got a controversial tariff renegotiated, the other got to move some tanks to a frontier base. No one really noticed these things, because people were focused on the images of the damaged buildings, and the endlessly copied statements about war.

It was a kid who blew up the story. They paid for some microsatellite-time and dumped the images on the internet. Suddenly, there were two stories circulating – “official” pictures showing damaged military bases and a destroyed school, and “unofficial” pictures showing the truth.
  These satellite pictures are old, the government said.
  Due to an error, our service showed images with incorrect timestamps, said the satellite company. We have corrected the error.
  All the satellite imagery providers ended up with the same images: broken school, burnt military bases.
  Debates went on for a while, as they do. But they quieted down. Maybe a month later a reporter got a telephoto shot of the military base – but by then it had been destroyed. What the reporter didn’t know was whether it had been destroyed in the attack, or subsequently and intentionally. It took months for someone to make it to the village with the school – and that had been destroyed as well. During the attack or after? No way to tell.

And a few months later, another conflict appeared. And the cycle repeated.

Things that inspired this story: The way the Iran-US conflict unfolded primarily on social media; propaganda and fictions; the long-term economics of ‘shoeleather reporting’ versus digital reporting; Planet Labs; microsatellites; wars as narratives; wars as cultural moments; war as memes.