Eavesdropping with a camera and potted plants

Heather Kelly, CNN
A visual image of a houseplant might be all it takes to record your conversations, researchers say.
STORY HIGHLIGHTS
  • Researchers have found a way to re-create audio from silent videos
  • They translate the small vibrations in common objects into audio files
  • The technology could be used for law enforcement, space research and more

(CNN) -- Before you spill your deepest darkest secrets, or plans for world domination, look around you. Is there a gossipy potato chip bag or leafy green houseplant nearby picking up your conversation?

Researchers at MIT, Microsoft and Adobe have developed a way to turn regular objects into visual microphones. They have re-created audio from silent video recordings by analyzing the subtle movements of a leaf or empty Coke can that are created by soundwaves traveling through the air.

"We weren't sure at first that this was possible, since those vibrations are so subtle, so we took a loudspeaker and blasted some objects with sounds while filming them with a high-speed camera, and quickly realized that the signal was there, and that we could use our processing techniques to pick it up," said Michael Rubinstein of Microsoft Research, who worked on the project.

The technique builds on video-processing algorithms designed to analyze the tiniest movements in videos, which the researchers previously used to magnify those movements, much as a microscope does.

In one experiment, they played the notes to "Mary Had a Little Lamb" on a loudspeaker near a potted plant. They ran a custom processing algorithm on a high-speed video of the plant, shot without audio, to detect movements not visible to the naked eye. They then translated those movements into a sound file, which re-created the song.
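To make the idea of turning movement into sound concrete, here is a minimal Python sketch. It is not the researchers' algorithm: it skips the hard part (measuring motion from pixels) and assumes a hypothetical list of per-frame displacements is already available. The frame rate of 2,200 frames per second, the 330 Hz tone and the function names are all illustrative assumptions, not figures from the study.

```python
import math

def simulate_displacements(tone_hz, fps, n_frames, amplitude=0.01):
    """Hypothetical stand-in for motion measured from video: one tiny
    displacement value per frame, driven by a single pure tone."""
    return [amplitude * math.sin(2 * math.pi * tone_hz * i / fps)
            for i in range(n_frames)]

def dominant_frequency(signal, fps):
    """Return the strongest frequency (in Hz) in the signal, using a
    plain discrete Fourier transform over the positive-frequency bins."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove any constant offset
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n)
                 for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * i / n)
                 for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * fps / n  # convert bin index to Hz

# Assumed values for illustration only.
frames = simulate_displacements(tone_hz=330, fps=2200, n_frames=400)
print(dominant_frequency(frames, fps=2200))
```

Run as written, the script should report a frequency at or very near 330 Hz, the tone that was "played" at the simulated object. Real footage would be far noisier, and extracting the displacement values in the first place is where the actual research effort lies.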

In another experiment, they picked up spoken words from an empty potato chip bag recorded through soundproof glass. They even used the popular app Shazam to correctly identify a recovered version of Queen's "Under Pressure" based on the movements of a pair of earbuds.

"We spent a lot of time talking and yelling at objects ... . It was a fun project," said Rubinstein.

The technology is still in the proof-of-concept phase, but the potential practical uses are fascinating. Law enforcement immediately comes to mind. Detectives might use video cameras as an alternative to wiretaps or sound amplifiers. Investigators could mine soundless surveillance videos for a whole new layer of helpful information.

The researchers are thinking bigger than just fighting crime.

"Perhaps we could take a video of a concert hall or a recording studio and determine their acoustic properties by seeing how the sound propagates through them," said Rubinstein. "Maybe we could use this with telescopes to recover sounds across space, where sound cannot travel."

The technology is different from laser microphones, which require actively shining a laser light onto a scene in order to measure sound. The MIT researchers are starting with regular video shot in natural light -- what they call a passive technique.

The young technology does have limitations. The quality of the audio is better when the video captures more frames per second, though researchers also had some luck with information gathered from a regular camera.
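One general reason frame rate matters is the sampling theorem: a signal sampled a fixed number of times per second can only represent frequencies up to half that rate without distortion. The Python sketch below illustrates that textbook rule with assumed numbers (a 440 Hz tone, and 60 or 2,200 frames per second). It treats each frame as a single sample and does not describe how the researchers worked with regular cameras; the function names are hypothetical.

```python
def nyquist_limit(fps):
    """Highest frequency (Hz) that one-sample-per-frame video can
    represent without aliasing."""
    return fps / 2.0

def apparent_frequency(tone_hz, fps):
    """Frequency (Hz) at which a pure tone would show up when sampled
    once per frame; tones above the limit fold back to a lower value."""
    folded = tone_hz % fps
    return min(folded, fps - folded)

# Assumed values for illustration only.
for fps in (60, 2200):
    print(fps, nyquist_limit(fps), apparent_frequency(440, fps))
```

Under these assumptions, a 440 Hz tone sampled at 60 frames per second would masquerade as a much lower frequency, while at 2,200 frames per second it sits comfortably below the limit and is preserved.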

Not every object registers subtle sound well. The most successful visual eavesdroppers are light and rigid objects such as plastic bags, foam cups and tinfoil, according to Rubinstein. Water and plants are OK at picking up audio, but solid, heavy items like bricks are too massive to register average vibrations.

Next, the researchers want to explore what else they can do with the recovered information. They are already looking into new processing algorithms and techniques to make the data even more useful. Even when they can't re-create words or notes exactly, there is interesting data lurking in the video.

"We showed that we can determine pretty reliably the gender of a speaker from low-quality sound we managed to recover from a tissue box, where it was completely unclear what the person was saying," said Rubinstein.  

Re-creating audio from visual information could pose privacy concerns. But, at least for now, Rubinstein doesn't see these types of visual microphones posing any greater threat to privacy than the existing technology that's already out there.

"I don't think people need to start hiding their bags of chips just yet," he said.
