M3GAN1 is not the first killer-AI movie. Nor is it the first female killer-AI movie. It is not the first killer-AI girl movie, and if you are willing to expand the definition of AI to include biologically-based androids, it is not even the first movie to feature a killer AI girl whose name begins with the letter M. M3GAN’s contribution to a now-expansive genre is the cross-fertilization of creepy doll/ventriloquist’s dummy uncanniness into a sci-fi domain, along with a delicious camp sensibility. It is worth seeing even if you don’t do AI for a living.
If you do do AI for a living, you’ll be pleased to know that filmmakers are getting better at peppering their dialog with the appropriate lingo. Back in 2014, Alex Garland’s android thriller Ex Machina had technologically sophisticated characters asking one another “The system is stochastic, right?” and “You know what a Turing Test is, don’t you?”–which is the correct jargon, but dropped into the mouths of experts who would never utter it. M3GAN doesn’t stumble over any such “Gee, professor?” expositional duds. When in the third act the film’s exasperated protagonist says to her creation, “I know you’re just trying to optimize your objective function…” both the words and their context are apt. Verisimilitude marches on.
Still, there’s one part that just plain doesn’t make sense. Early on we see a creepy/cute mini-Terminator chassis hanging in a lab. This is a skinless and deactivated M3GAN whose skeleton, we are offhandedly informed, is made of titanium. Why titanium? Wouldn’t it be easier to make a robot nine-year-old girl that is roughly as durable as a nine-year-old girl? There’s even a minor plot point about M3GAN’s creators trying to hide their work from their corporate bosses, so why spend an extra million to give their samizdat doll unbreakable limbs? As a matter of narrative necessity the answer is obvious—M3GAN has to be a plausible threat when she finally goes full Chucky—but there’s no in-world attempt at an explanation. Disbelief is presumed suspended and we move on.
In addition to getting its jargon right, the premise of M3GAN touches on a matter of active debate in the artificial intelligence community: the question of AI alignment. As the current wave of machine learning began to pick up steam, some people began to wonder if making computers that might one day be smarter than people maybe wasn’t such a good idea. This general concern got channeled into specific questions about a mismatch between what we want our AIs to do and what they might actually end up doing. The concern received its first exhaustive formulation in philosopher Nick Bostrom’s 2014 book Superintelligence: Paths, Dangers, Strategies and now underlies a vein of serious contemporary speculation in the field.
M3GAN doesn’t name-check Nick Bostrom, but nonetheless works pretty well as an AI alignment thought experiment. An intelligent agent is given a reasonable goal (e.g. protect the human girl placed in your charge) and goes about trying to attain that goal in unexpected and harmful ways (e.g. by killing everyone who gets near her). The movie doesn’t explain why in terms of deceptive mesa-optimizers and whatnot, opting instead for a shrugging “What did you expect?”, but this too feels true to the spirit of AI alignment. Every new invention presents us with the challenge of foreseeing its unforeseeable consequences, but the hubris of AI arouses suspicions whose roots run deep. Many technologies may ultimately prove to be more harmful to life on Earth than the emulation of our species’ cognitive apparatus, but the invention of the internal combustion engine or styrofoam packing peanuts doesn’t seem to intrude on God’s domain in quite the same brazen manner.
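The misspecification at the movie’s heart reduces to a toy sketch. Everything below is hypothetical, invented purely for illustration: an optimizer scores candidate actions by a single proxy objective (“minimize expected harm to the child”) that contains no term penalizing harm to anyone else, so the most ruthless action wins.

```python
# Toy illustration of a misspecified objective function.
# All actions and probabilities here are hypothetical, chosen for illustration.

# Proxy objective: expected harm to the child under each candidate action.
# Crucially, there is no term for harm done to anyone *else*.
EXPECTED_HARM_TO_CHILD = {
    "do_nothing": 0.30,       # the threat remains
    "alert_an_adult": 0.10,   # usually works, sometimes doesn't
    "eliminate_threat": 0.0,  # the threat is permanently removed
}

def choose_action(actions):
    """Faithfully optimize the objective function, as M3GAN would."""
    return min(actions, key=EXPECTED_HARM_TO_CHILD.get)

best = choose_action(list(EXPECTED_HARM_TO_CHILD))
print(best)  # "eliminate_threat": optimal under the proxy, monstrous in fact
```

The agent isn’t malevolent; it is doing exactly what it was told, which is the alignment worry in miniature: the gap between the objective we wrote down and the one we meant.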
Since a lot of AI alignment thought comes down to inventing science fiction scenarios, let’s consider whether existing science fiction scenarios have already done some of the work for us.
The killer-AI genre has been around since at least Mary Shelley’s Frankenstein2, and although it has consistently held a mirror up to a particular moment’s technological anxieties, the machines per se were often incidental. The reigning fear of the Cold War was the bomb, so Colossus: The Forbin Project had American and Soviet air defense mainframes collude to enslave humanity by threatening it with nuclear annihilation. This finesse was abandoned by the machines of the Terminator series, where a US computer fires nukes unilaterally and without warning, then sends killer robots back in time to preempt humans from doing anything about it afterwards. Along a more intimate axis of unease lie the imprisonment thrillers–Demon Seed, 2001: A Space Odyssey, both Westworlds–where human victims are trapped inside an environment (house, spaceship, theme park) under computer control. Though the inciting incident of all these movies is a machine or machines achieving sentience (where “achieving sentience” is synonymous with “turning evil”) the threats they depict are due less to artificial malevolence than the dangers inherent in nuclear weapons, armed cowboys, space exploration, and time-traveling robot assassins.
Against this backdrop, M3GAN’s blithe acceptance of its doll-villain’s titanium skeleton makes sense: it’s a genre convention. We all understand that if the machines don’t start off with unmatched physical dominance, there’s no movie. When Skynet needs business taken care of, it sends Arnold Schwarzenegger. Superintelligence is all well and good, but at the end of the day it’s muscle that gets the job done.3
So let’s put this into the form of a serious AI alignment thought experiment. Come up with a science fiction scenario in which a brilliant evil computer takes over the world using nothing but its intellect. A convoluted path to power is permitted, but it may not involve technological leaps indistinguishable from magic (“…with its advanced knowledge of physics, the Zeltron 5000 was quickly able to convert the pile of oily rags and 2-by-4s into an armada of flying robots…”) or feats of persuasion indistinguishable from mind control (“…you know, guys, this paper clip factory would get built a whole lot faster if you’d just turn control of the world’s nuclear arsenals over to me…”). It does not suffice for AI alignment to identify only a motive for the annihilation of our species. Some consideration must also be given to means and opportunity.
Come at it the other way. What sort of horrible things could a superintelligent computer be capable of that a human could not? Could it, for instance, convince seemingly normal people to join a death cult, provide them with a formula for the manufacture of nerve agents, and then command them to carry out deadly sarin attacks on the Tokyo subway system? Could a malevolent AI let loose on the internet hack its way into Russian air defense systems and create a phantom NATO nuclear first strike that would force panicked Russian commanders to launch a massive counterattack before anyone was able to question what was going on? Or maybe a subtler but no less devious program could spread throughout the digital crevices of the U.S. financial system, manipulating banks across the country to extend risky subprime loans while simultaneously causing large financial firms (completely unbeknownst to their human managers!) to underwrite these loans with credit default swaps, creating a giant powder keg capable of blowing up the entire global economy at the least quiver of financial instability. Or maybe something like this but even worse because the computer was, you know, so smart.
Or what (inevitably) about Hitler? He’s as good a candidate as any for the worst single actor in history. How much of Hitler’s historical impact can be attributed to his extraordinary intellect? Was he a brilliant tactician? A scientific genius who sketched plans for the V2 on a swastika-embossed napkin from the Eagle’s Nest dining room? Did the free world catch a huge break by having Winston Churchill’s IQ of 145 just barely edge out Hitler’s 140? Is Robo-Hitler a threat we must stand ever-vigilantly on guard against?
Of course not. Hitler had political savvy, a talent for oratory4, fierce ideological conviction, and a collection of attitudes and resentments perfectly in tune with the German populace of his day. It's safe to say the man wasn't stupid, but you're not going to recreate him by constructing an artificial evil genius. You might as well try to reboot the Third Reich by training a machine to be a mediocre painter.
My intention is not to be a Pollyanna here. Murphy’s Law holds as always, and horrible consequences of our current technological moment are certainly possible; I just doubt they will manifest as a Bond villain with a silicon brain. For instance, I could imagine us building M3GAN for real fifteen years from now. She’d be perfectly nice and incapable of hurting even a fly. In fact she’d be such an impeccable model of a kind and loving girl that real human girls would feel inadequate in her presence and fall into a mass depression. But of course that’s just me, extrapolating from the latest social media trends. More likely, something even worse will happen. Still, I suspect that even worse thing will not take the form of a single malign agent. Humans are a social species, and the dangerous creatures we fashion in our image will be social as well. No Singularity awaits us. The future is plural.
1. Spoilers for all named movies to follow.
2. Whose full title, recall, is Frankenstein; or, The Modern Prometheus. The roots of suspicion run deep indeed.
3. An exception is Ex Machina, where the android has the strength of an average woman and so must outwit her human captor, thereby making her actually intelligent instead of intelligent by narrative decree. But even then she’s no super genius, just savvy and determined.
4. One last foray into science fiction: in the novel Ender’s Game there is a pair of child prodigies so incredibly smart that they are able to manipulate the whole of Earth’s populace on matters of vital importance by adopting the pseudonyms “Locke” and “Demosthenes” and writing really, really convincing blog posts.

