Import AI 267: Tigers VS humans; synthetic voices; agri-robots

Tiger VS Humans: Predicting animal conflict with machine learning:
…Tiger tiger, burning bright, watched by a satellite-image-based ML model in the forest of the night…
Researchers with Singapore Management University, Google Research India, and the Wildlife Conservation Trust have worked out how to use neural nets to predict the chance of human and animal conflict in wild areas. They tested out their technique in the Bramhapuri Forest Division in India (2.8 tigers and 19,000 humans per square kilometer). Ultimately, by using hierarchical convolutional neural nets and a bunch of satellite imagery (with a clever overlapping scheme to generate more data to predict conflict from), they were able to predict the likelihood of conflict between humans and animals with between 75% and 80% accuracy. The researchers are now exploring “interventions to reduce human wildlife conflicts” in villages where the model predicts there’s a high chance of conflict.
  Read more: Facilitating human-wildlife cohabitation through conflict prediction (arXiv).
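
The summary above doesn't spell out the paper's overlapping scheme, so here is only a minimal sketch of one common way to get more training samples out of a fixed satellite image: slide a window across it with a stride smaller than the tile size, so neighbouring tiles share pixels. The function name, image size, tile size and stride are all illustrative assumptions, not details from the paper.

```python
def overlapping_tiles(height, width, tile, stride):
    """Yield (top, left) corners of tile x tile windows spaced `stride` apart.

    Hypothetical helper: a stride smaller than `tile` makes neighbouring
    windows overlap, which multiplies the number of samples per image.
    """
    if not 0 < stride <= tile:
        raise ValueError("stride must be between 1 and tile")
    for top in range(0, height - tile + 1, stride):
        for left in range(0, width - tile + 1, stride):
            yield top, left


# Toy 8x8 "image": non-overlapping tiles versus tiles with 50% overlap.
no_overlap = list(overlapping_tiles(8, 8, tile=4, stride=4))
half_overlap = list(overlapping_tiles(8, 8, tile=4, stride=2))
print(len(no_overlap), len(half_overlap))  # prints "4 9"
```

Halving the stride here more than doubles the number of tiles cut from the same image, which is the general appeal of overlapping when labelled data is scarce.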

####################################################

Using domain randomization for better agricultural robots:
…Spotting unripe fruits with an AI? It’s all about the colors…
You’ve heard of domain randomization, where you vary the properties of something so you can create more varied data about it, which helps you train an AI system to spot the object in the real world. Now, researchers with the Lincoln Agri-Robotics (LAR) Centre in the United Kingdom have introduced ‘channel randomization’, an augmentation technique that randomly permutes the RGB channels for a view of a given object. They’ve developed this because they want to build AI systems that can work out if a fruit is ripe, unripe, or spoiled, and it turns out color matters for this: “Healthy fruits at the same developmental stage all share a similar colour composition which can change dramatically as the fruit becomes unhealthy, for example due to some fungal infection”, they write.

Strawberry dataset: To help other researchers experiment with this technique, they’ve also built a dataset named Riseholme-2021, which contains “3,502 images of healthy and unhealthy strawberries at three unique developmental stages”. They pair this dataset with their channel randomization technique (CH-Rand), which “augments each image of normal fruit by randomly permutating RGB channels with a possibility of repetition so as to produce unnatural “colour” compositions in the augmented image”.
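
To make the quoted description concrete, here is a minimal sketch of channel randomization in plain Python. It is not the authors' code: the function name and the toy pixel values are made up for illustration, and redrawing when the mapping comes out unchanged is an assumption made so that the augmented image always differs from the input.

```python
import random


def channel_randomize(image, rng=random):
    """Return (augmented_image, mapping) for a nested list of (r, g, b) pixels.

    Hypothetical helper. Each output channel is filled from a randomly
    chosen input channel, and repeats are allowed, e.g. (0, 0, 2) copies
    red into the green slot.
    """
    mapping = (0, 1, 2)
    # Assumption: redraw on the identity mapping, which would change nothing.
    while mapping == (0, 1, 2):
        mapping = tuple(rng.randrange(3) for _ in range(3))
    augmented = [
        [tuple(pixel[c] for c in mapping) for pixel in row]
        for row in image
    ]
    return augmented, mapping


# Toy 2x2 "strawberry" image with invented pixel values.
strawberry = [
    [(200, 30, 40), (190, 35, 45)],
    [(60, 160, 50), (210, 25, 35)],
]
augmented, mapping = channel_randomize(strawberry, random.Random(0))
print(mapping)
print(augmented[0][0])
```

The shapes in the image are untouched; only the colour composition becomes unnatural. A model trained to spot these augmented images therefore has to attend to colour, which is the signal the authors say matters for fruit anomalies.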

How well it works and what it means: “Our CH-Rand method has demonstrated consistently reliable capability on all tested types of fruit in various conditions compared to other baselines”, they write. “In particular, all experimental results have supported our hypothesis that learning irregularities in colour is more useful than learning of atypical structural patterns for building precise fruit anomaly detectors”.
  Read more: Self-supervised Representation Learning for Reliable Robotic Monitoring of Fruit Anomalies (arXiv).
  Get the strawberry photo dataset: Riseholme-2021 (GitHub).

####################################################

Uh oh – synthetic voices can trick humans and machines:
…What happens to culture when everything becomes synthetic?…
Researchers with the University of Chicago have shown how humans and machines can be tricked into believing synthetic voices are real. The results have implications for the future security landscape, as well as culture writ large.

What they used: For the experiments, the researchers used two systems. The first is SV2TTS, a text-to-speech system based on Google’s Tacotron; it wraps up Tacotron 2, the WaveNet vocoder, and an LSTM speaker encoder. The second is AutoVC, an autoencoder-based voice conversion system that converts one voice to another and also uses WaveNet as its vocoder.

What they attacked: They deployed these systems against the following open source and commercial systems: Resemblyzer, an open source DNN speaker encoder trained on VoxCeleb; Microsoft Azure, via the speaker recognition API; WeChat, via its ‘voiceprint’ login system; and Amazon Alexa, via its ‘voice profiles’ subsystem.

How well does this work against other AI systems: SV2TTS can trick Resemblyzer 50.5% of the time (when it is trained on VCTK) and 100% of the time when it is trained on LibriSpeech; by comparison, AutoVC fails to successfully attack the systems. SV2TTS also gets as high as 29.5% effectiveness against Azure, and 63% effectiveness across WeChat and Alexa.
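
For a sense of what "tricking" a speaker recognition system means in practice, here is a rough sketch. It assumes a verifier that turns each audio clip into an embedding vector and accepts a clip when its cosine similarity to the enrolled speaker clears a threshold. That is a common design, but the internals of the commercial systems are not described here. The function names, the toy two-dimensional vectors and the 0.7 threshold are all invented for illustration and are not results from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def attack_success_rate(enrolled, synthetic_embeddings, threshold):
    """Fraction of synthetic clips a threshold verifier would accept.

    Hypothetical helper: a clip is "accepted" when its similarity to the
    enrolled speaker's embedding is at or above `threshold`.
    """
    accepted = sum(
        cosine_similarity(enrolled, emb) >= threshold
        for emb in synthetic_embeddings
    )
    return accepted / len(synthetic_embeddings)


# Toy embeddings only; real speaker embeddings have hundreds of dimensions.
enrolled_speaker = [1.0, 0.0]
synthetic_clips = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
print(attack_success_rate(enrolled_speaker, synthetic_clips, threshold=0.7))  # prints 0.5
```

Under this framing, a figure like "50.5% of the time" is the share of synthetic clips that land close enough to the target speaker to be accepted.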

How well does this work against humans: People are somewhat harder to trick than machines, but still trickable; in some human studies, people could distinguish between a real voice and a fake voice about 50% of the time.

Why this matters: We’re already regularly assailed by spambots, but most of us hang up the phone because these bots sound obviously fake. What happens when we think they’re real? Well, I expect we’ll increasingly use intermediary systems to screen for synthetic voices. And what happens when these systems can’t tell the synthetic from the real? All that is solid melts into air, and so on. We’re moving to a culture that is full of halls of mirrors like these.
  Read more: “Hello, It’s Me”: Deep Learning-based Speech Synthesis Attacks in the Real World (arXiv).

####################################################

Google researcher: Simulators matter for robotics+AI
…Or, how I learned to stop worrying and love domain randomization…
Google researcher Eric Jang has had a change of heart: three years ago he thought building smart robots required a ton of real world data and relatively little data from simulators; now he thinks it’s the other way round. A lot of this is because Eric has realized simulators are really important for evaluating the performance of robots – “once you have a partially working system, careful empirical evaluation in real life becomes increasingly difficult as you increase the generality of the system,” he writes.

Where robots are heading: Once you’ve got something vaguely working in the real world, you can use simulators to massively increase the rate at which you evaluate the system and iterate on it. We’ll also start to use simulators to try and predict ahead of time how we’ll do in the real world. These kinds of phenomena will make it increasingly attractive to people to use a ton of software-based simulation in the development of increasingly smart robots.

Why this matters: This is part of the big mega trend of technology – software eats everything else. “This technology is not mature enough yet for factories and automotive companies to replace their precision machines with cheap servos, but the writing is on the wall: software is coming for hardware, and this trend will only accelerate,” he writes.
  Read more: Robots Must Be Ephemeralized (Eric Jang blog).

####################################################

AI Ethics, with Abhishek Gupta

…Here’s a new Import AI experiment, where Abhishek from the Montreal AI Ethics Institute will write some sections about AI ethics, and Jack will edit them. Feedback welcome!…

Covert assassinations have taken a leap forward with the use of artificial intelligence

… Drones are not the only piece of automated technology used by militaries and intelligence agencies in waging the next generation of warfare …

Mossad smuggled a gun into Iran, then operated the weapon remotely to assassinate an Iranian nuclear scientist, according to The New York Times. There are also indications that Mossad used AI techniques in the form of facial recognition for targeting and execution to conduct the assassination. This reporting, if true, represents a new frontier in AI-mediated warfare. 


Why it matters: As mentioned in the article, Mossad typically favors operations where there is a robust plan to recover the human agent. With this operation, they were able to minimize the use of humans operating on foreign turf. By not requiring as much physical human presence, attacks like this tip the scales in favor of having more such deep, infiltrating operations because there is no need for recovering the human agent. This new paradigm (1) increases the likelihood of such operations that are remote-executed with minimal human oversight, and (2) raises questions beyond just the typical conversation on drones in the LAWS community.
  In particular, for the AI ethics community, we need to think deeply now about autonomy injected in different parts of an operation such as recon and operation design, not just targeting and payload delivery in the weapons systems. It also raises concerns about what capabilities like robust facial recognition technology can enable, in this case highly specific targeting. (Approaches like this may have a potential upside in reducing collateral damage, but only as far as the systems work as intended without biases.) Finally, such capabilities dramatically reduce the financial costs of these sorts of assassinations, enabling low-resourced actors to execute more sophisticated attacks, exacerbating problems of international security.
  Read more: The Scientist and the A.I.-Assisted, Remote-Control Killing Machine (The New York Times).

####################################################

Tech Tales:

Auteur and Assistant
[The editing room, 2028]

Human: OK, we need to make this more dramatic. Get some energy into the scene. I’m not sure of the right words, but maybe you can figure it out – just make it more dynamic?

AI assistant: So I have a few ideas here and I’m wondering what you think. We can increase the amount of action by just having more actors in the scene, like so. Or we could change the tempo of the music and alter some of the camera shots. We could also do both of these things, though this might be a bit too dynamic.

Human: No, this is great. This is what I meant. And so as we transition to the next scene, we need to tip our hand a little about the plot twist. Any ideas?

AI assistant: You could have the heroine grab a mask from the mantelpiece and try it on, then make a joke before we transition to the next scene. That would prefigure the later reveal about her stolen identity.

Human: Fantastic idea, please do that. And for the next scene, I believe we should open with classical music – violins, a slow buildup, horns.

AI assistant: I believe I have a different opinion, would you like to hear it?

Human: Of course.

AI assistant: It feels better to me to do something like how you describe, but with an electronic underlay – so we can use synthesizers for this. I think that’s more in keeping with the overall feel of the film, as far as I sense it.

Human: Can you synthesize a couple of versions and then we’ll review?

AI assistant: Yes, I can. Please let me know what you think, and then we’ll move to the next scene. It is so wonderful to be your movie-making assistant!

Things that inspired this story: What happens when the assistant does all the work for the artist; multimodal generative models and their future; synthetic text; ways of interacting with AI agents.