RENEE MONTAGNE, host:
After more than a year of silence, Osama bin Laden's voice was heard again yesterday. The Arab television network, Al-Jazeera, broadcast an audiotape from the al-Qaeda leader warning that he is planning new attacks in the United States. The CIA authenticated the tape quickly. The agency never divulges how it goes about positively identifying the voice of bin Laden. NPR's Robert Smith spoke to forensic audio experts for a look inside the world of voiceprinting.
ROBERT SMITH reporting:
For those who have heard previous messages from Osama bin Laden, the voice did sound familiar.
(Soundbite of audiotape)
SMITH: He has a soft, high tone; slow and deliberate.
(Soundbite of audiotape)
SMITH: But audio experts say that verifying such a recording takes more than just recognizing the voice; it's a painstaking process. And that's why Steve Cain, the president of the company Forensic Tape Analysis, was amazed when the CIA authenticated bin Laden's voice four hours after the tape was released. That's much faster than they've done it before.
Mr. STEVE CAIN (President, Forensic Tape Analysis, Inc.): It's based probably on two issues: new equipment that we don't know about, or perhaps new procedures, new algorithms, new mathematical ways of analyzing a person's voice that are not known to the real forensic world.
SMITH: A spokesperson for the CIA would only reveal that they did a technical analysis of the voice. But audio experts say that the basic principles the CIA uses are likely the same ones that have driven the field of voice recognition since the 1940s: use of a spectrogram, a visual representation of the frequencies of human speech. Of course, there are some tricky elements when you're dealing with a tape like bin Laden's. Tom Owen, who runs an audio analysis firm, Owl Investigations, has verified bin Laden's voice before for news organizations. He says after listening to known samples of bin Laden's voice, he found it distinctive.
Mr. TOM OWEN (President and CEO, Owl Investigations, Inc.): He speaks in a very structured manner, and he speaks that way all the time. His manner of speaking is very concise and very pentamic. It has a meter to it, if you know what I mean by that.
SMITH: But those are also the kind of characteristics that can be imitated. Owen says the next level of analysis takes you deeper into how speech is structured.
Mr. OWEN: You can zero in on it and say, ah, you know, this kind of looks like him. Now, let's do some spectrograms, which people commonly refer to as voiceprint, to measure the rate of speech, the distance between the words, how many words are spoken in how many seconds, that sort of thing.
SMITH: The spectrogram looks like a long Rorschach inkblot. A single word becomes a multi-layered sketch of tones and overtones. You can see the frequencies and intensities of each vowel and consonant. Those, says Owen, are the types of things that are hard to fake, because they come from how you are built.
Mr. OWEN: The teeth, the lips, the soft palate, the upper palate, your gums, the structure of your jaw, all these things predicate how you're going to speak.
SMITH: So when it comes to a bin Laden tape, analysts are really peering deep into the mouth and throat of the al-Qaeda leader and comparing the unique characteristics to a spectrogram from a proven recording. In Owen's case, he used a 1998 ABC News interview with bin Laden. Standards in the industry say you have to find at least 20 direct correlations between your two samples to declare them to be the same person. A frustration for analyst Steve Cain is that the CIA says it has high confidence in its match, but never reveals the details.
Mr. CAIN: They don't publish their findings. They never show you any hard evidence as to what words match or did not match, in the, this recent bin Laden communication.
SMITH: Cain says that for voice analysis to be trusted by the public, the mystery has to be taken out of the process.
Robert Smith, NPR News, New York.
(Soundbite of music)
Transcript provided by NPR, Copyright NPR.