Import AI 180: Analyzing farms with Agriculture Vision; how deep learning is applied to X-ray security scanning; Agility Robots puts its ‘Digit’ bot up for 6-figure sale

Deep learning is superseding machine learning in X-ray security imaging:
…But, like most deep learning applications, researchers want better generalization…
Deep learning-based methods have, since 2016, become the dominant approach used in X-ray security imaging research papers, according to a survey paper from researchers at Durham University. It seems likely that many of today’s classical machine learning algorithms will be superseded by deep learning systems paired with domain knowledge, they indicate. So, what challenges do deep learning practitioners need to work on to further improve the state-of-the-art in X-ray security imaging?

Research directions for smart X-rays: Future directions in X-ray research feel, to me, like they’re quite similar to future directions in general image recognition research – there need to be more datasets, better explorations of generalization, and more work done in unsupervised learning. 

  • Data: Researchers should “build large, homogeneous, realistic and publicly available datasets, collected either by (i) manually scanning numerous bags with different objects and orientations in a lab environment or (ii) generating synthetic datasets via contemporary algorithms”. 
  • Scanner transfers: It’s not clear how well models transfer between different scanners – figuring that out would make it easier to model the economic implications of this work. 
  • Unsupervised learning: One promising line of research is into detecting anomalous items in an unsupervised way. “More research on this topic needs to be undertaken to design better reconstruction techniques that thoroughly learn the characteristics of the normality from which the abnormality would be detected,” they write. 
  • Material information: Dual-energy scanners measure X-ray attenuation at both high and low energies, which varies according to the material of the object being scanned – this information could be used to improve classification and detection performance. 
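The unsupervised direction above boils down to a simple recipe: learn to reconstruct only normal scans, then flag anything that reconstructs badly. Here is a minimal sketch of that idea, using PCA as a stand-in for the learned reconstruction model; the data, feature dimensions, and threshold logic are all illustrative assumptions, not anything from the survey:

```python
import numpy as np

def fit_normal_model(X, k=2):
    """Fit a low-rank reconstruction model (PCA) on scans of normal bags."""
    mu = X.mean(axis=0)
    # Top-k principal components via SVD of the centered data.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def anomaly_score(x, mu, components):
    """Reconstruction error: a large error means the item is unlike the training data."""
    z = (x - mu) @ components.T          # project into the 'normality' subspace
    recon = mu + z @ components          # reconstruct from that subspace
    return float(np.linalg.norm(x - recon))

rng = np.random.default_rng(0)
# 'Normal' scans live near a 2D subspace of a 10D feature space (a toy
# stand-in for features extracted from benign luggage images).
basis = rng.normal(size=(2, 10))
normal = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 10))
mu, comps = fit_normal_model(normal, k=2)

typical = normal[0]
weird = rng.normal(size=10) * 5          # off-subspace: a 'threat-like' outlier
assert anomaly_score(weird, mu, comps) > anomaly_score(typical, mu, comps)
```

In practice the PCA step would be replaced by an autoencoder or similar deep reconstruction network, but the scoring logic – reconstruction error as the anomaly signal – is the same.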

Read more: Towards Automatic Threat Detection: A Survey of Advances of Deep Learning within X-ray Security Imaging (Arxiv).

####################################################

Agility Robots starts selling its bipedal bot:
…But the company only plans to make between 20 and 30 this year…
Robot startup Agility Robotics has started selling its bipedal ‘Digit’ robot. Digit is about the size of a small adult human and can carry boxes of up to 40 pounds in its arms, according to The Verge. The company’s technology has roots in legged locomotion research at Oregon State University – for many years, Agility’s bots only had legs, with the arms being a recent addition.

Robot costs: Each Digit costs in the “low-mid six figures”, Agility’s CEO told The Verge. When factoring in upkeep and the robot’s expected lifespan, Shelton estimates this amounts to an hourly cost of roughly $25. The first production run of Digits is six units, and Agility expects to make only 20 or 30 of the robots in 2020.

Capabilities: The thing is, these robots aren’t that capable yet. They’ve got a tremendous amount of intelligence coded into them to allow for elegant, rapid walking. But they lack the autonomous capabilities necessary to, say, automatically pick up boxes and navigate through a couple of buildings to a waiting delivery truck (though Ford is conducting research here). You can get more of a sense of Digit’s capabilities by looking at the demo of the robot at CES this year, where it transports packages covered with QR codes from a table to a truck. 

Why this matters: Digit is a no-bullshit robot: it walks, can pick things up, and is actually going on sale. It, along with the for-sale ‘Spot’ robots from Boston Dynamics, represents the cutting edge of robot mobility. Now we need to see what kinds of economically-useful tasks these robots can do – and that’s a question that’s going to be hard to answer, as it is somewhat contingent on the price of the robots, and these prices are dictated by volume production economics, which are themselves determined by overall market demand. Robotics feels like it’s still caught in this awkward chicken-and-egg problem.
  Read more: This walking package-delivery robot is now for sale (The Verge).
   Watch the video (official Agility Robotics YouTube).

####################################################

Agriculture-Vision gives researchers a massive dataset of aerial farm photographs:
…3,432 farms, annotated…
Researchers with UIUC, Intelinair, and the University of Oregon have developed Agriculture-Vision, a large-scale dataset of aerial photographs of farmland, annotated with nine types of field anomaly (e.g., flooding). 

Why farm images are hard: Farm images pose challenges to contemporary techniques because they’re often very large (e.g., some of the raw images here had dimensions like 10,000 x 3,000 pixels), annotating them requires significant domain knowledge, and very few public large-scale datasets exist to help spur research in this area – until now!

The dataset… consists of 94,986 aerial images from 3,432 farmlands across the US. The images were collected by drone during growing seasons between 2017 and 2019. Each image consists of RGB and Near-infrared channels, with resolutions as detailed as 10 cm per pixel. Each image is 512 x 512 resolution and can be labeled with nine types of anomaly, like storm damage, nutrient deficiency, weeds, and so on. The labels are unbalanced due to environmental variations, with annotations for drydown, nutrient deficiency and weed clusters overrepresented in the dataset.
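To make the data layout concrete, here’s a toy sketch of how tiles like these might be represented in memory and how the label imbalance shows up when you tally annotations. The field names, label list, and counts below are my illustrative assumptions, not the paper’s actual API or statistics:

```python
import numpy as np
from collections import Counter

def make_tile(labels, rng):
    """A 512x512 tile with 4 channels (R, G, B, NIR) plus its anomaly labels."""
    return {"image": rng.random((512, 512, 4), dtype=np.float32),
            "labels": labels}

rng = np.random.default_rng(0)
# Toy corpus mimicking the reported imbalance: a few classes dominate.
tiles = ([make_tile(["drydown"], rng)] * 5 +
         [make_tile(["weed_cluster", "nutrient_deficiency"], rng)] * 3 +
         [make_tile(["storm_damage"], rng)] * 1)

# Tally annotations across the corpus; a tile can carry multiple labels.
counts = Counter(label for t in tiles for label in t["labels"])
print(counts.most_common(3))
```

Imbalance like this is why segmentation models trained on the dataset typically need class-weighted losses or resampling to avoid ignoring the rare anomaly types.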

Why this matters: AI gives us a chance to build a sense-and-respond system for the entire planet – and building such a system starts with gathering datasets like Agriculture-Vision. In a few years, don’t be surprised when large-scale farms use fleets of drones to proactively monitor their fields and automatically identify problems.
   Read more: Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis (Arxiv).
   Find out more information about the upcoming Agriculture Vision competition here (official website).

####################################################

Hitachi describes the pain of building real world AI:
…Need an assistant with domain-specific knowledge? Get ready to work extra hard…
Most applied AI papers can be summarized as: the real world is hellish in the following ways; these are our mitigations. Researchers with Hitachi America Ltd. follow in this tradition by writing a paper that discusses the challenges of building a real-world speech-activated virtual assistant. 

What they did: For this work, they developed “a virtual assistant for suggesting repairs of equipment-related complaints” in vehicles. This system is meant to process phrases like “coolant reservoir cracked”, map them to the relevant entries in its internal knowledge base, then give the user an appropriate answer. This, as with most real-world AI uses, is harder than it looks. To build their system, they create a pipeline that samples words from a domain-specific corpus of manuals, repair records, etc., then uses a set of domain-specific syntactic rules to extract a vocabulary from the text. They use this pipeline to create two things: a knowledge base, populated from the domain-specific corpus; and a neural-attention based tagging model called S2STagger, for annotating new text as it comes in.
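To give a feel for what “domain-specific syntactic rules” means in practice, here is a deliberately tiny sketch of rule-based term extraction feeding a knowledge base. The corpus, the rule, and the symptom list are all invented for illustration – this is the general shape of such a pipeline, not Hitachi’s actual system:

```python
import re
from collections import defaultdict

# Tiny stand-in for a domain corpus (repair records, manuals).
corpus = [
    "coolant reservoir cracked",
    "coolant reservoir leaking near cap",
    "brake pad worn on front left",
]

# Invented 'domain-specific syntactic rule': a component is the one or two
# words immediately preceding a known symptom word.
SYMPTOMS = {"cracked", "leaking", "worn"}

def extract(text):
    """Yield (component, symptom) pairs matched by the rule above."""
    tokens = re.findall(r"[a-z]+", text.lower())
    for i, tok in enumerate(tokens):
        if tok in SYMPTOMS and i >= 1:
            component = " ".join(tokens[max(0, i - 2):i])
            yield component, tok

# Build a toy knowledge base: component -> set of observed symptoms.
kb = defaultdict(set)
for record in corpus:
    for component, symptom in extract(record):
        kb[component].add(symptom)

print(sorted(kb["coolant reservoir"]))   # symptoms seen for this component
```

A real system replaces the hand-written rule with a learned tagger (that’s S2STagger’s job here), but the division of labor – extract domain vocabulary, populate a knowledge base, tag incoming utterances against it – is the same.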

Hitachi versus Amazon versus Google: They use a couple of off-the-shelf services (AlexaSkill from Amazon, and DiagFlow from Google) to develop dialog-agents, based on their data. They also test out a system that exclusively uses S2STagger – S2STagger gets much higher scores (92% accurate, versus 28% for DiagFlow and 63% for AlexaSkill). This basically demonstrates what we already know via intuition: off-the-shelf tools give poor performance in weird/edge-case situations, whereas systems trained with more direct domain knowledge tend to do better. (S2STagger isn’t perfect – in other tests they find it generalizes well with unseen terms, but does poorly when encountering radically new sentence structures). 

Why this matters: Many of the most significant impacts of AI will come from highly-domain-specific applications of the technology. For most use cases, it’s likely people will need to do a ton of extra tweaking to get something to work. It’s worth reading papers like this to get an intuition for what sort of work that consists of, and how for most real-world cases, the AI component will be the smallest and least problematic part.
   Read more: Building chatbots from large scale domain-specific knowledge bases: challenges and opportunities (Arxiv).

####################################################

AI Policy with Matthew van der Merwe:
…Matthew van der Merwe brings you views on AI and AI policy; I (lightly) edit them…

Does publishing AI research reduce AI misuse?
When working on powerful technologies with scope for malicious uses, scientists have an important responsibility to mitigate risks. One important question is whether publishing research with potentially harmful applications will, on balance, promote or reduce such harms. This new paper from researchers at the Future of Humanity Institute at Oxford University offers a simple framework for weighing considerations.

Cybersecurity: The computer security community has developed norms around vulnerability disclosure that are frequently cited as a potential model for AI. In computer security, early disclosure of vulnerabilities is often found to be beneficial, since it supports effective defensive preparations, and since malicious actors would likely find the vulnerability anyway. It is not obvious, though, that these considerations apply equally in AI research.

Key features of AI research:
There are several key factors to be weighed in determining whether a given disclosure will reduce harms from misuse.

  • Counterfactual possession: If it weren’t published, would attackers (or defenders) acquire the information regardless?
  • Absorption and application capacity: How easily can attackers (or defenders) make use of the published information?
  • Effective solutions: Given disclosure, will defenders devote resources to finding solutions, and will they find solutions that are effective and likely to be widely propagated?

These features will vary between cases, and at a broader field level. In each instance we can ask whether the feature favors attackers or defenders. It is generally easy to patch software vulnerabilities identified by cyber researchers. In contrast, it can be very hard to patch vulnerabilities in physical or social systems (consider the obstacles to recalling or modifying every standard padlock in use).

The case of AI: AI generally involves automating human activity, and is therefore prone to interfering in complex social and physical systems, and revealing vulnerabilities that are particularly difficult to patch. Consider an AI system capable of convincingly replicating any human’s voice. Inoculating society against this misuse risk might require some deep changes to human attitudes (e.g. ‘unlearning’ the assumption that a voice can be used reliably for identification). With regards to counterfactual possession, the extent to which the relevant AI talent and compute is concentrated in top labs suggests independent attackers might find it difficult to make discoveries. In terms of absorption/application, making use of a published method (depending on the details of the disclosure – e.g. if it includes model weights) might be relatively easy for attackers, particularly in cases where there are limited defensive measures. Overall, it looks like the security benefits of publication in AI might be lower than in information security.
   Read more: The Offense-Defense Balance of Scientific Knowledge (arXiv).

White House publishes guidelines for AI regulation:
The US government released guidelines for how AI regulations should be developed by federal agencies. Agencies have been given a 180-day deadline to submit their regulatory plans. The guidelines are at a high level, and the process of crafting regulation remains at a very early stage.

Highlights: The government is keen to emphasize that any measures should minimize the impact on AI innovation and growth. They are explicit in recommending agencies defer to self-regulation where possible, with a preference for voluntary standards, followed by independent standard-setting organizations, with top-down regulation as a last resort. Agencies are encouraged to ensure public participation, via input into the regulatory process and the dissemination of important information.

Why it matters: This can be read as a message to the AI industry to start making clear proposals for self-governance, in time for these to be considered by agencies when they are making regulatory plans over the next 6 months.
   Read more: Guidance for Regulation of Artificial Intelligence Applications (Gov).

####################################################

Tech Tales:

The Invisible War
Twitter, Facebook, TikTok, YouTube, and others yet-to-be-invented. 2024.

It started like this: Missiles hit a school in a rural village with no cell reception and no internet. The photos came from a couple of news accounts. Things spread from there.

The country responded, claiming through official channels that it had been attacked. It threatened consequences. Then those consequences arrived in the form of missiles – a surgical strike, the country said, delivered to another country’s military facilities. The other country published photos to its official social media accounts, showing pictures of smoking rubble.

War was something to be feared and avoided, the countries said on their respective social media accounts. They would negotiate. Both countries got something out of it – one of them got a controversial tariff renegotiated, the other got to move some tanks to a frontier base. No one really noticed these things, because people were focused on the images of the damaged buildings, and the endlessly copied statements about war.

It was a kid who blew up the story. They paid for some microsatellite-time and dumped the images on the internet. Suddenly, there were two stories circulating – “official” pictures showing damaged military bases and a destroyed school, and “unofficial” pictures showing the truth.
  These satellite pictures are old, the government said.
  Due to an error, our service showed images with incorrect timestamps, said the satellite company. We have corrected the error.
  All the satellite imagery providers ended up with the same images: broken school, burnt military bases.
  Debates went on for a while, as they do. But they quieted down. Maybe a month later a reporter got a telephoto of the military base – but it had been destroyed. What the reporter didn’t know was whether it had been destroyed in the attack, or subsequently and intentionally. It took months for someone to make it to the village with the school – and that had been destroyed as well. During the attack or after? No way to tell.

And a few months later, another conflict appeared. And the cycle repeated.

Things that inspired this story: The way the Iran-US conflict unfolded primarily on social media; propaganda and fictions; the long-term economics of ‘shoeleather reporting’ versus digital reporting; Planet Labs; microsatellites; wars as narratives; wars as cultural moments; war as memes.