In an article titled “Watch how word of Ebola exploded in America”, Time.com last week published data showing that 10.5 million tweets referenced Ebola between September 16 and October 6. Predictably, the conversation has surged since the first Ebola diagnosis inside the USA on September 30.
The article features a graph from @twitterReverb that captures this surge brilliantly, but it also highlights some of the limitations of relying solely on traditional social networks. Five tweets are highlighted in the timeline: two informative tweets from the Centers for Disease Control & Prevention (CDC) garnered barely over 1,000 retweets between them, while the drummer for Australian boy band 5 Seconds of Summer secured 38,000 retweets with his October 2 observation, “The Ebola virus is super scary, hopefully it can be contained!”
The World Health Organization (WHO) reported that 4,447 people had died as of October 15. For organizations charged with the deadly serious task of identifying early warning signs of threats to public health, social media can be part of the solution, but focusing exclusively on social data risks missing critical signals from other data sources.
In a recent article, Kalev Leetaru of Georgetown University argued that “[d]espite all of the attention and hype paid to social media as a sensor network over human society, mainstream media still plays a critical role as an information stream in many areas of the world.” He reports that the first international warning of the Ebola outbreak in West Africa did not come from Twitter or other social networks, but from an article in Xinhua’s French-language newswire titled “Guinée: une étrange fièvre fait 8 morts à Macenta”. The article reported on a March 13, 2014 press conference by the Guinea Department of Health in which officials announced eight deaths due to haemorrhagic fever and described symptoms similar to those of Lassa fever.
At DataSift we know that valuable insight can be mined from multiple types of Human Data – social data, data from news and blogs, message boards, discussion engines and so on. By aggregating over 2 billion interactions daily across these data types, and by enriching and normalising all that data, we enable our customers to search across immense volumes of Human Data using a single body of code.
What does this mean for organizations using the DataSift platform to look for early warning signs of public health threats?
Xinhua’s French-language article was received by the DataSift platform at 12:08 a.m. EST on March 14, 2014 via our partner LexisNexis, which provides us with a premium feed of licensed news content from more than 20,000 media outlets spanning every continent, including daily newspapers, consumer magazines, trade journals and TV transcripts. Six hours later, the story was reported on two Facebook pages delivered through our platform. The article was then referenced in several posts on a discussion forum, which were received by our platform on March 16 via our boards source. The first tweet referencing the story appeared later the same day and was retweeted four times over the following six hours. Volumes across all our data sources then subsided until WHO issued its alert on March 19.
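To make that timeline concrete, here is a minimal Python sketch of how one might measure each source's lag behind the earliest signal once interactions from different sources share a normalised timestamp. It is an illustration only, not DataSift platform code: the function names and the `(source, received_at)` record shape are assumptions, the Facebook time is taken as roughly six hours after the news article, and the March 16 times for the boards and Twitter entries are placeholders, since only the date and their order are known.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset, matching the EST time quoted in the post.
EST = timezone(timedelta(hours=-5))

# Hypothetical normalised records: (source, time received).
# Only the news timestamp is exact; the others are approximations or
# placeholders chosen to respect the order described in the post.
interactions = [
    ("twitter", datetime(2014, 3, 16, 18, 0, tzinfo=EST)),   # placeholder time
    ("news", datetime(2014, 3, 14, 0, 8, tzinfo=EST)),       # exact, from the post
    ("facebook", datetime(2014, 3, 14, 6, 8, tzinfo=EST)),   # about six hours later
    ("facebook", datetime(2014, 3, 14, 6, 8, tzinfo=EST)),   # second Facebook page
    ("boards", datetime(2014, 3, 16, 9, 0, tzinfo=EST)),     # placeholder time
]


def first_seen_by_source(items):
    """Return the earliest timestamp observed for each source."""
    first = {}
    for source, received_at in items:
        if source not in first or received_at < first[source]:
            first[source] = received_at
    return first


def lead_times(items):
    """Return (source, lag behind the earliest source) pairs, earliest first."""
    first = first_seen_by_source(items)
    earliest = min(first.values())
    lags = [(source, ts - earliest) for source, ts in first.items()]
    return sorted(lags, key=lambda pair: pair[1])


for source, lag in lead_times(interactions):
    print(f"{source}: first seen {lag} after the earliest signal")
```

A pipeline that ingested only the Twitter records here would see its first signal more than two days after the news feed did, which is the gap the timeline above describes.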
If your predictive analytics solution relies on only a handful of social networks and ignores the multitude of other human data sources, it is time to ask yourself this question: what signals could I be missing?