Inside the world of AI that forges beautiful art and terrifying deepfakes
Over the last three weeks, we laid out the basics of AI. To recap:
- Most AI advances and applications are based on a type of algorithm known as machine learning that finds and reapplies patterns in data.
- Deep learning, a powerful subset of machine learning, uses neural networks to find and amplify even the smallest patterns.
- Neural networks are layers of simple computational nodes that work together to analyze data, kind of like neurons in the human brain.
Now we get to the fun part. Using one neural network is really great for learning patterns; using two is really great for creating them. Welcome to the magical, terrifying world of generative adversarial networks, or GANs.
GANs are having a bit of a cultural moment. They are responsible for the first piece of AI-generated artwork sold at Christie’s, as well as the category of fake digital images and videos known as “deepfakes.”
Their secret lies in the way two neural networks work together, or rather, against each other. You start with a whole lot of training data and give each network a separate task. The first network, known as the generator, must produce artificial outputs, like handwriting, videos, or voices, that mimic the training examples. The second, known as the discriminator, then judges whether each output is real by comparing it with those same training examples.
Each time the discriminator successfully rejects the generator’s output, the generator goes back to try again. To borrow a metaphor from my colleague Martin Giles, the process “mimics the back-and-forth between a picture forger and an art detective who repeatedly try to outwit one another.” Eventually, the discriminator can’t tell the difference between the output and training examples. In other words, the mimicry is indistinguishable from reality.
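For readers who want to see the forger and the detective in code, here is a minimal sketch of that back-and-forth. It is a toy under loud assumptions, not how any production GAN is built: the "training data" is just numbers drawn around 4.0, the generator is a single line `a * z + b`, the discriminator is a single logistic unit, and every name and default value (`train_toy_gan`, `real_mean`, `lr`, and so on) is made up for illustration. It uses only Python's standard library.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function; returns a value in (0, 1)."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.05, real_mean=4.0, real_std=0.5, seed=0):
    """Alternate discriminator and generator updates on 1-D data.

    Generator:      G(z) = a * z + b          (turns noise z into a fake sample)
    Discriminator:  D(x) = sigmoid(w * x + c) (estimated probability x is real)
    All names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters (the forger)
    w, c = 0.0, 0.0  # discriminator parameters (the detective)

    for _ in range(steps):
        x_real = rng.gauss(real_mean, real_std)  # one training example
        z = rng.gauss(0.0, 1.0)                  # random noise input
        x_fake = a * z + b                       # the generator's forgery

        # Discriminator step: ascend log D(x_real) + log(1 - D(x_fake)).
        s_real = sigmoid(w * x_real + c)
        s_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - s_real) * x_real - s_fake * x_fake)
        c += lr * ((1.0 - s_real) - s_fake)

        # Generator step: ascend log D(x_fake), i.e. try to fool the detective.
        s_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - s_fake) * w  # derivative of log D(x_fake) w.r.t. x_fake
        a += lr * grad_x * z
        b += lr * grad_x

    return (a, b), (w, c)
```

Calling `train_toy_gan()` returns the final generator and discriminator parameters. Notice that the generator never looks at a real sample in this sketch; it only learns from the discriminator's verdict. A toy this small shows the alternating updates rather than guaranteeing a convincing result, and its parameters may keep oscillating instead of settling.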
You can see why a world with GANs is equal measures beautiful and ugly. On one hand, the ability to synthesize media and mimic other data patterns can be useful in photo editing, animation, and medicine (such as to improve the quality of medical images and to overcome the scarcity of patient data). It also brings us some joyful creations.
On the other hand, GANs can also be used in ethically objectionable and dangerous ways: to overlay celebrity faces on the bodies of porn stars, to make Barack Obama say whatever you want, or to forge someone’s fingerprint and other biometric data, an ability researchers at NYU and Michigan State recently showed in a paper.
Fortunately, GANs still have limitations that put some guardrails in place. They need quite a lot of computational power and narrowly scoped data to produce something truly believable. In order to produce a realistic image of a frog, for example, such a system needs hundreds of images of frogs from a particular species, preferably facing a similar direction. Without those specifications, you get some really wacky results, like creatures from your darkest nightmares.
(You should thank me for not showing you the spiders.)
But experts worry that we’ve only seen the tip of the iceberg. As the algorithms get more and more refined, glitchy videos and Picasso animals will become a thing of the past. As Hany Farid, a digital image forensics expert, once told me, we’re poorly prepared to solve this problem.
This originally appeared in our AI newsletter The Algorithm.