Susan Bennett says she is the voice of the original U.S. version of Siri on Apple's iPhone
Apple won't comment, but other sources -- including an audio forensic expert -- confirm this
Recordings from 2005 were used for Siri; hearing herself six years later was a surprise
How CNN's Jessica Ravitz, who had never used Siri, found Bennett is also shocking
For the past two years, she’s been a pocket and purse accessory to millions of Americans. She’s starred alongside Samuel L. Jackson and Zooey Deschanel. She’s provided weather forecasts and restaurant tips, been mocked as useless and answered absurd questions about what she’s wearing.
She is Siri, Apple’s voice-activated virtual “assistant” introduced to the masses with the iPhone 4S on October 4, 2011.
Behind this groundbreaking technology there is a real woman. While the ever-secretive Apple has never identified her, all signs indicate that the original voice of Siri in the United States is a voiceover actor who laid down recordings for a client eight years ago. She had no idea she’d someday be speaking to more than 100 million people through a not-yet-invented phone.
Her name is Susan Bennett and she lives in suburban Atlanta.
Apple won’t confirm it. But Bennett says she is Siri. Professionals who know her voice, have worked with her and represent her legally say she is Siri. And an audio-forensics expert with 30 years of experience has studied both voices and says he is “100%” certain the two are the same.
Bennett, who won’t divulge her age, fell into voice work by accident in the 1970s. Today, she can be heard worldwide. She speaks up in commercials and on countless phone systems. She spells out directions from GPS devices and addresses travelers in Delta airport terminals.
Until now, it’s been a career that’s afforded her anonymity.
But a new Apple mobile operating system, iOS 7, with new Siri voices means that Bennett’s reign as the American Siri is slowly coming to an end. At the same time, tech-news site The Verge posted a video last month, “How Siri found its voice,” that led some viewers to believe that Allison Dufty, the featured voiceover talent, was Siri. A horrified Dufty scrambled in response, writing on her website that she is “absolutely, positively NOT the voice of Siri,” but not before some bloggers had bought into the hype.
And there sat Bennett, holding onto her secret, laughing and watching it all. For so long she’d been goaded by others, including her son and husband, to come forward. Her Siri counterparts in the UK and Australia had revealed their identities, after all.
So why not her? It was her question to wrestle with, and finally she found her answer.
“I really had to weigh the importance of it for me personally. I wasn’t sure that I wanted that notoriety, and I also wasn’t sure where I stood legally. And so, consequently, I was very conservative about it for a long time,” she said. “And then this Verge video came out … And it seemed like everyone was clamoring to find out who the real voice behind Siri is, and so I thought, well, you know, what the heck? This is the time.”
The Siri surprise
The story of how Bennett became this iconic voice began in 2005. ScanSoft, a software company, was looking for a voice for a new project. It reached out to GM Voices, a suburban Atlanta company that had established a niche recording voices for automated voice technologies. Bennett, a trusted talent who had done lots of work with GM Voices, was one of the options presented. ScanSoft liked what it heard, and in June 2005 Bennett signed a contract offering her voice for recordings that would be used in a database to construct speech.
For four hours a day, every day, in July 2005, Bennett holed up in her home recording booth. Hour after hour, she read nonsensical phrases and sentences so that the “ubergeeks” – as she affectionately calls them; they leave her awestruck – could work their magic by pulling out vowels, consonants, syllables and diphthongs, and playing with her pitch and speed.
These snippets were then synthesized in a process called concatenation that builds words, sentences, paragraphs. And that is how voices like hers find their way into GPS and telephone systems.
“There are some people that just can read hour upon hour upon hour, and it’s not a problem. For me, I get extremely bored … So I just take breaks. That’s one of the reasons why Siri might sometimes sound like she has a bit of an attitude,” Bennett said with a laugh. “Those sounds might have been recorded the last 15 minutes of those four hours.”
But Bennett never knew exactly how her voice would be used. She assumed it would be employed in company phone systems, but beyond that didn’t think much about it. She was paid by the hour – she won’t say how much – and moved on to the next gig.
The surprise came in October 2011 after Apple released its iPhone 4S, the first to feature Siri. Bennett didn’t have the phone herself, but people who knew her voice did.
“A colleague e-mailed me [about Siri] and said, ‘Hey, we’ve been playing around with this new Apple phone. Isn’t this you?’”
Bennett went to her computer, pulled up Apple’s site and listened to video clips announcing Siri. The voice was unmistakably hers.
“Oh, I knew,” she said. “It’s obviously me. It’s my voice.”
It certainly does sound like Bennett. But proving who supplied the voice of Siri isn’t easy. It’s not like Steve Jobs sent Bennett a thank-you note, or a certificate to hang on her wall.
There are others who vouch for her. But the tech world – and specifically the text-to-speech, or TTS, space – is a complicated business, one that’s shrouded in secrecy and entangled in a web of nondisclosure agreements.
Bennett is not bound by such restrictions, which is why she’s talking. But the industry has a vested interest in keeping its voices anonymous.
“The companies are competing to create the best-sounding and functioning systems. Their concern is driving revenues,” said Marcus Graham, CEO of GM Voices. “Talking about the voice talent, from their perspective, is likely seen as a distraction.”
Bennett’s attorney, Steve Sidman, can’t breach attorney-client privilege to share documents and contracts, but since he began representing Bennett in 2012 he’s been intensely aware of her connection to Siri.
“I’ve engaged in substantial negotiations – multiple, months-long negotiations – with parties along the economic food chain, so to speak, that involved her rendering services as the voice of Siri,” he told CNN. “It’s as simple as that.”
And then there’s Graham, of GM Voices, a man who has built a career around providing voiceover talent for interactive voice technologies.
Graham won’t divulge details about any deals he made back in 2005. But he has worked with Bennett for 25 years, has recorded “literally millions of words with Susan” and has installed her voice with clients across the globe. He knows her voice as well as anyone, and he doesn’t hesitate when asked if she and Siri are the same.
“Most female voices are kind of thin, but she’s got a rich, full voice,” he said. “Yes, she’s the voice of Siri. … She’s definitely the voice.”
A ‘100% match’
In October 2005, a few months after Bennett made those recordings, ScanSoft bought Nuance Communications and took on its name. Nuance is the company widely accepted to have provided Apple with the technology behind Siri.
When CNN contacted Nuance to try to confirm Bennett’s identity as a voice of Siri, a Nuance spokeswoman said, “As a company, we don’t comment on Apple.”
Apple, too, declined to comment.
So CNN took the investigation one step further by hiring an audio forensics expert to compare Bennett’s voice with Siri’s.
Ed Primeau, of Rochester Hills, Michigan, has been doing this work for three decades. He’s testified in courts, analyzed “hundreds, if not thousands” of recordings and is a member of the American Board of Recorded Evidence. He spent four hours comparing our “known voice” – in this case Siri – with the unknown voice of Bennett.
“I believe, and I’ve lived this for 30 years, no two voices are the same,” he said, after finishing his analysis of the Siri voice and Bennett’s. “They are identical – a 100% match.”
To reach his conclusion Primeau created back-to-back comparison files, lifted and listened to consonants and reviewed deliveries. He took the hiss off the Siri sound, created in recording from a phone, and dropped it into Bennett’s file.
After studying Bennett’s normal speaking voice, he was about 70% certain of the match. But once he had audio of her saying the same words as Siri, he knew his work was done. Even so, he said he asked a colleague for a second opinion.
“I understand the importance of accuracy,” Primeau said. “Rest assured: It’s 100% Susan.”
How CNN got this story
This isn’t the sort of story I’d naturally go after. Technology is far from my beat. In fact, the first time I ever spoke to Siri was on my work phone – the kind that’s plugged into a wall jack and has a tangled cord attached to the handset.
Bennett was a voiceover artist I was interviewing for a CNN special project on the world’s busiest airport – Hartsfield-Jackson Atlanta International – scheduled to come out next month. I was tracking down the airport’s voices, and she, a voice of Delta terminals, was one of them.
In the course of our phone conversation, I asked her to rattle off some jobs she’s had over the years. She gave me a quick and general rundown and then added that she’s done a lot of IVR work.
“IVR?” I asked.
“Interactive voice response,” she answered. “The sort of thing you hear on a company’s phone system.”
For reasons I can’t explain – I was still struggling to understand my first iPhone – I blurted out, “Hey, are you Siri?”
She gasped. And then I gasped.
“Oh my God,” I said. “You’re totally Siri, aren’t you?”
What followed was a short, panicked flurry of non-denials and non-confirmations, and a promise from me that I wouldn’t do or say a thing.
That was months ago. About two weeks ago, after the confusion over the Verge video, Bennett reached out to me. She was ready to speak as herself and set the record straight.
‘My career as a machine’
As a child, Bennett’s favorite toy was a play phone-operator system, a big red block with a receiver and lines she could patch in to help imaginary callers make their connections.
Years later, while singing jingles, she was tapped to be the radio and TV voice of First National Bank’s “Tillie the All-Time Teller,” the first ATM. Though that was about 40 years ago, she can – and does – still break seamlessly into the high-pitched song.
“I began my career as a machine many years ago,” Bennett said. “I’m sure that you hear my voice at some point every day.”
But the way she is heard was a surprise even to her.
Music and singing had always been a part of Bennett’s life. At Brown University, she sang in a jazz band and also with another group at the Berklee School of Music. After graduating, she toured as a backup singer with Burt Bacharach and Roy Orbison. Today, she and husband Rick Hinkle – a guitarist, composer and sound engineer – still play in a band, mostly at private events.
She fell into voiceover work by chance in the 1970s when she walked into Atlanta’s Doppler Studios for a jingle job and the voiceover talent was a no-show. The studio owner looked around and said, “Susan, come over here. You don’t have an accent. Go ahead and read this.”
She did, and a new career path was born.
Bennett wasn’t always accent-free, though. She was born in Vermont and grew up all over New England. Her voice – dropped Rs and all – was “SNL”-skit ready. Can she imagine Siri as a New Englander? “Neva! Neva!”
A stint in upstate New York helped her lose the accent. By the time she arrived in Atlanta in 1972, with her first husband, former NHL player Curt Bennett of the Atlanta Flames, she was ready to fight off the Southern twang. She fell in love with Atlanta and, after that marriage ended, stayed.
Even though her voice can be heard everywhere, she’s enjoyed being out of the spotlight.
“You have a certain anonymity which can be very advantageous,” she said. “People don’t judge you by how you look … That’s been kind of freeing in a lot of ways.”
‘Part of history’
Bennett works in a sound-proof recording booth in her home, a tin of lozenges at the ready. Her voice is transmitted to the world, while she – if she so chooses – sits in her jammies, or more likely her Zumba clothes. Auditions are done by e-mail. She can grocery shop and go unrecognized.
It’s not as though her natural speaking voice, heard out of context in the produce aisle, sparks reactions.
So the idea of coming out as the voice of Siri was one she pushed aside. It probably wouldn’t have even occurred to her if not for the goading of others, including her 36-year-old son – whom she, and he, jokingly refer to as “Son of Siri.”
“Her voice has been everywhere throughout my life. I’d call my bank while I was in college in Colorado, and it was my mom telling me I had $4,” said Cameron Bennett, a photographer in Los Angeles.
He first found out she was the voice of Siri while watching an iPhone 4S commercial on TV. There, on the screen, was director Martin Scorsese talking to his mother. When Cameron bought the phone himself, she began barking at him through its GPS feature, prompting him to yell, “Mom, stop!”
“She’s part of history,” he said. “It was funny trying to explain to her how big it was. She uses her cell phone for 8% of what it can do.”
When Bennett upgraded her phone and first talked to … well, herself, she says she was a little horrified. It was weird, to say the least. But she was blown away, she said, to play a part in such a technological feat.
Being the voice of Siri, though, doesn’t mean she’s immune to the sorts of frustrations others sometimes have with the technology.
“But I never yell at her – very bad karma,” Bennett said. That said, she knows not everyone is as gracious: “Yes, I worry about how many times I get cursed every day.”
Now, though, with iOS 7 she is passing the telephonic torch to a new Siri. Bennett would be lying if she said she wasn’t a bit disappointed, but in her field of work she’s learned to expect evolution – and even revolution.
As technology improves, and the concatenation process becomes less robotic and more human, Bennett thinks anything will be possible.
“I really see a time when you’ll probably be able to put your own voice on your phone and have your own voice talk back to you,” she said. “Which I’m used to, but maybe you aren’t.”