DataSift Announces Next Generation Historics

26th September 2013

For companies that analyze social data, history is important. We know because we currently process more than 1 trillion historical Tweets for our customers every week. History provides a perspective on how your brand presence is growing on social, how sentiment is changing, and what content works for your customers. But it’s important to realize that a historical view shouldn’t be limited to one network.

Exploring Historic Data In A Multi-Channel World

One of our recent data science projects is a perfect example. Our team found that usage on Facebook and Twitter peaks prior to a movie release, but on Instagram the peak comes after the release, as people start to share images of themselves at the event. Historic insights into Bitly click data and Tumblr posts add to the experience by providing access to rich, expansive sets of social data that allow companies to find these unique patterns and capitalize on them. Sounds like a world of possibility, right?

What we realized last week, while talking to our customers and prospects at Social Data Week, is that access to a rich data set is important, but when you’re researching social media data it can be daunting to balance quantity of data with quality of data.

Historics Preview: Faster Access to the Right Data

With so much noise, today’s businesses need more than a single ‘keyword’ to be able to find relevant content and conversations that relate to their business. This is especially important when you pay for the data you receive. It’s the right data that’s important. And it should be delivered to you as cleanly as possible.

That’s why we’re excited to introduce a new feature to help you get to better data, faster. Historic Preview is a new addition to our platform that enables you to quickly run a filter and find out exactly the quality and quantity of data you’ll get prior to launching your full search. Out of the box we provide five standard preview reports.

Our reports:

  • Basic:
    A broad overview of hourly volume of data, a word cloud of key terms, pie chart of data sources, plus the language breakdown and more.
  • Natural Language Processing Preview:
    Sentiment for the content and title plus entities (names of people, places and companies) found in the content and title.
  • Twitter Preview:
    Summary data for Tweets including word clouds for content and user description, hourly data volume for Tweets and Retweets, plus top 50 languages, mentions, timezones, and hashtags.
  • Links Preview:
    Hourly volume of links, Twitter card data, and Facebook OpenGraph data, plus keywords, language, and a word cloud of link titles.
  • Twitter Demographics:
    A wealth of anonymized demographic information including gender, age-ranges, location by city, state, and country, plus profession and likes & interests.

For example, we used DataSift’s Historic Preview APIs to analyze the launch of Dove’s Hidden Beauty Campaign, one of the most viral ads ever produced. We tracked it by looking for the YouTube video link being shared socially, which let us capture every share of the link without relying on people posting content with specific hashtags.


From here you can see a snapshot summary of the social conversations around the content, including:

  • A word cloud showing the most prominent words used to describe the content
  • Volume breakdowns for Tweets vs Retweets
  • Which Twitter accounts were most mentioned in the posts
  • Timezone, language, and hashtag summaries

By surfacing this, you can quickly answer questions like “Which users are generating the most engagement with this content?” or “Which hashtags should we advertise against to help promote this content further?”

Answering this kind of question simply isn’t possible when you are skimming across the surface of social data with simple searches. Looking deeper into the data is the key to your success.

APIs Make it Easy to Use

What’s more, our templates are just the start of what you can do.  All of the data is available through two simple API endpoints and there is no limit to the complexity of your filter or the number of previews you generate.  Choose one of these four aggregation functions to create any number of custom reports:

  • Frequency Distribution
    This analysis takes a threshold argument that controls the number of values returned in the output
  • Numeric statistical analysis
    The given target is analysed in hourly intervals, and for each interval you receive minimum, maximum, count, and sum values
  • Target Volume
    Returns how many times the specified target appears in the results, in hourly intervals
  • Word Count
    Returns a list of up to the 100 most frequently used words found in the specified target.
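
To make the four aggregations concrete, here is a minimal local sketch in Python of what each one computes, based only on the descriptions above. It does not call the DataSift API and is not our implementation: the sample records, field names (`created_at`, `lang`, `followers`, `content`) and function names are all hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict
import re

# Hypothetical sample interactions; the field names are illustrative only
# and are not real DataSift targets.
SAMPLE = [
    {"created_at": "2013-09-26T10:05:00", "lang": "en", "followers": 120,
     "content": "Love this video"},
    {"created_at": "2013-09-26T10:40:00", "lang": "en", "followers": 30,
     "content": "This video is great"},
    {"created_at": "2013-09-26T11:15:00", "lang": "es", "followers": 500,
     "content": "Great video"},
]


def hour_bucket(item):
    """Truncate an ISO timestamp to the hour, e.g. '2013-09-26T10'."""
    return item["created_at"][:13]


def frequency_distribution(items, target, threshold):
    """Most common values of `target`; `threshold` caps how many are returned."""
    counts = Counter(item[target] for item in items)
    return counts.most_common(threshold)


def numeric_stats(items, target):
    """Minimum, maximum, count, and sum of a numeric `target` per hour."""
    buckets = defaultdict(list)
    for item in items:
        buckets[hour_bucket(item)].append(item[target])
    return {
        hour: {"min": min(vals), "max": max(vals),
               "count": len(vals), "sum": sum(vals)}
        for hour, vals in sorted(buckets.items())
    }


def target_volume(items, target):
    """How many items contain `target`, counted per hour."""
    volume = Counter(
        hour_bucket(item) for item in items if item.get(target) is not None
    )
    return dict(sorted(volume.items()))


def word_count(items, target, limit=100):
    """Up to `limit` most frequently used words in a text `target`."""
    words = Counter()
    for item in items:
        words.update(re.findall(r"[a-z']+", item[target].lower()))
    return words.most_common(limit)
```

For instance, `frequency_distribution(SAMPLE, "lang", 1)` returns only the single most common language, while `numeric_stats(SAMPLE, "followers")` returns one summary per hour of sample data. The real preview reports run these aggregations on our platform against your filter's results; see the dev site for the actual request format.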

Check our dev site for further detail on how to get started.
