(Ars Technica) -- The next time you're low on cash and need to get a quick read on the public's feeling on politics or current events, consider sampling Twitter.
According to a new report out of Carnegie Mellon University's computer science department, sentiments expressed via the millions of daily tweets strongly correlate with well-established public opinion polls, such as the Index of Consumer Sentiment (ICS) and Gallup polls.
The data analysis methodology still needs some tweaking, but the researchers believe that Twitter posts could act as a "cheap, rapid means of gauging public opinion."
Assistant professor Noah Smith and his team collected 1 billion Twitter messages posted in 2008 and 2009 and analyzed them for topic (politics versus economy) and sentiment (positive or negative). They compared the consumer confidence tweets against ICS data from the same period as well as Gallup's Economic Confidence Index.
Tweets about President Obama were compared against Gallup's daily tracking polls from that time period, and tweets about the election were compared against 46 polls created by Pollster.
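To make the idea of sorting tweets by topic and sentiment concrete, here is a minimal sketch in Python. It is not the researchers' method, which the article does not describe in detail. It assumes a simple word-list approach: a tweet is on topic if it contains a keyword, and it counts as positive or negative if it contains a word from the matching list. The word lists, the function name daily_sentiment_ratio, and the (day, text) input format are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy word lists (an assumption for illustration only);
# a real study would rely on a much larger sentiment lexicon.
POSITIVE = {"good", "great", "hope", "better", "strong"}
NEGATIVE = {"bad", "worse", "fear", "weak", "lost"}


def daily_sentiment_ratio(tweets, topic_keywords):
    """Return {day: positive_count / negative_count} for on-topic tweets.

    `tweets` is an iterable of (day, text) pairs. A tweet counts as
    positive if it contains any positive word and as negative if it
    contains any negative word, so one tweet may count as both.
    Days with no negative on-topic tweets are left out of the result
    to avoid dividing by zero.
    """
    pos = defaultdict(int)
    neg = defaultdict(int)
    for day, text in tweets:
        words = set(text.lower().split())
        if not words & topic_keywords:
            continue  # off topic, ignore
        if words & POSITIVE:
            pos[day] += 1
        if words & NEGATIVE:
            neg[day] += 1
    return {day: pos[day] / count for day, count in neg.items()}


sample = [
    ("2009-01-01", "jobs market looks good"),
    ("2009-01-01", "jobs are bad"),
    ("2009-01-01", "great news on jobs"),
    ("2009-01-02", "jobs fear"),
    ("2009-01-02", "nice weather good"),  # off topic, ignored
]
ratios = daily_sentiment_ratio(sample, {"jobs"})
```

A daily series like this one is the kind of signal that could then be lined up against poll numbers from the same dates. Matching on whole words is crude (it misses negation, sarcasm and misspellings), which is one reason a sketch like this would produce a noisy signal.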
The researchers found a strong correlation between opinions expressed on Twitter and the traditional polls on topics like Obama's job performance, the job market, and the economy. While the ICS and Gallup polls showed an 86 percent correlation with each other, Twitter showed a correlation of between 72 and 79 percent with the traditional polls.
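The article does not say which statistic lies behind these percentages. One common reading, assumed here, is a Pearson correlation coefficient expressed as a percentage, so that 0.72 is reported as 72 percent. The sketch below computes that coefficient for two equal-length series; the function name pearson and the sample numbers are made up for illustration and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily values, for illustration only.
twitter_signal = [1.0, 2.0, 3.0, 4.0]
poll_index = [10.0, 20.0, 30.0, 40.0]
r = pearson(twitter_signal, poll_index)
```

A value near 1 means the two series rise and fall together, a value near 0 means no linear relationship, and a value near -1 means they move in opposite directions.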
Still, there were some areas where the Twitter data didn't correlate particularly well. Twitter mentions of Obama did tend to correlate with his rising popularity during the run-up to the 2008 presidential election, but mentions of McCain also correlated with Obama's increasing popularity.
Smith and the team acknowledged that natural language processing would have to be improved before Twitter could be used to predict things like elections, and a number of other considerations should be taken into account when using tweets for analysis.
For example, should retweets or news headlines count in the data? Even with so much noise in the average Twitter stream, though, the researchers were pleased to have extracted a signal that appears to be useful.
"In this work, we treat polls as a gold standard. Of course, they are noisy indicators of the truth ... just like extracted textual signals," reads the report. "Future work should seek to understand how these different signals reflect public opinion either as a hidden variable, or as measured from more reliable sources like face-to-face interviews."
The paper will be presented later this month at the Association for the Advancement of Artificial Intelligence's International Conference on Weblogs and Social Media.
COPYRIGHT 2011 ARSTECHNICA.COM