Last week Facebook and Datasift announced the release of topic data, a new way for brands to understand their audiences through their Facebook activity. Today, Francesco Dorazio of Pulsar, a researcher and designer with a social science and digital media background, discusses why Facebook topic data is a big deal for the marketing industry.
Facebook topic data, which can be tapped into through Datasift’s new product PYLON, analyzes what 1.39 billion people are talking about on the platform and provides aggregated and anonymized data about the topics they discuss and the demographics of the people engaging with each topic.
Now it might seem like this is just a new stream of data being injected into an existing research and planning framework. But don’t be mistaken: the release of Facebook topic data has the potential to change the game for the research, planning, advertising and social listening industries. Here are eight reasons why.
1. Representative data
For the last 15 years the excuse slowing down the adoption of social data in decision making has remained pretty much the same: it’s not representative of what the real audience thinks. The argument was based on three points: not everyone is online; only a minority of the people who are online express their opinions via social media; and we don’t know the demographics of the people on social media.
None of these arguments holds up any more now that Facebook has entered the social data scene.
Globally the platform has a massive 1.39 billion users, making it the biggest social network in the world. This scale means it’s less a case of ‘representative’ coverage and perhaps more accurate to think of Facebook data in terms of a census, near 100% of the total population. A 2014 study from Pew reports that in the US 71% of the online population uses Facebook and in the UK, a 2014 Ofcom report notes that Facebook remains the default social networking site for almost all UK adults who are online (66%).
And when it comes to audience composition, Facebook has the most age-diverse audience, with 80% of its users almost evenly split across three age groups between 18 and 54 years old. This makes Facebook data relevant to a huge range of brands and businesses.
2. A network of “real” people
People use each social media platform in many different ways, but it’s still possible to generalize prevalent use cases.
Because of the private nature of the platform, interaction on Facebook tends to be more about your friends and family and the interests you share with them.
Now, social media always involves various levels of staged behavior and requires the researcher to use their knowledge of psychology, society and culture to make sense of the data in context.
But because Facebook has been built as a network of real people with real names, most of whom the user also knows in real life, the online behaviors, opinions and actions of its users are likely to be more consistent with their offline personas.
3. Demographic context
One of the challenges of using social data in research and planning has always been uncertainty about the people behind the status updates: who are they? The data industry has found ways around this, using machine learning algorithms to infer a few basic demographic details based on the content people post and their profile bios. However this has always been inferential: a good guess, not certainty.
With Facebook topic data we can now access aggregated, granular demographic information on the audience. And because the platform is more about interacting with the people you know in real life, that self-reported demographic information is more likely to be accurate than on other platforms, and it’s certainly no less accurate than the answers people might give in a survey.
4. Highly structured data
Social media data is a mix of structured and unstructured data. The raw data we have been working on until now tends to be 80% unstructured and 20% structured across most of the social media platforms. This makes it interesting but also challenging to extract insights from it.
Facebook’s topic data is changing this radically.
To start with, their data is vastly more structured than the data from any other social platform. Over the last few years Facebook has put a lot of work into structuring their data in order to allow better exploration, discovery, recommendation and of course advertising. We gained a glimpse of this through the Graph Search feature – and although that feature’s now been deprecated, it doesn’t mean all the work has gone away. From the categorization of emotional states to the categorization of topics, the granularity of the demographics and behavioral profiles of their audience… Facebook knows a huge amount about who its users are and what they like.
Exploration aside, Facebook topic data offers the most precise model for data sampling and collection I have ever seen, with more than 70 ‘targets’ to sample data against. Moreover, these can be combined for pinpoint targeting – for example, you could search for stories around a specific product, generated by a specific demographic within a specific geographic area.
Further, once you’ve got your data you can filter it using more than 60 parameters to get exactly what you’re looking for.
This approach enables unprecedented precision in analyzing the data, advanced data exploration and manipulation, and solid cross-validation of any research hypothesis across multiple audience segments.
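To make the targets-plus-filters idea concrete, here is a minimal sketch of combining sampling targets into a pinpoint query and tallying the matching aggregates. The target names, record fields and numbers are all illustrative assumptions, not the real PYLON schema.

```python
# Hypothetical sketch: combining sampling "targets" in the spirit of the
# 70+ targets / 60+ filters described above. Field names are invented.

def matches(record, targets):
    """Return True if a record satisfies every target/value pair."""
    return all(record.get(field) == value for field, value in targets.items())

# A pinpoint query: stories about a product, from a demographic, in a region.
targets = {
    "topic": "running shoes",
    "age_bracket": "25-34",
    "region": "UK",
}

# Synthetic, already-aggregated records standing in for topic data.
records = [
    {"topic": "running shoes", "age_bracket": "25-34", "region": "UK", "stories": 420},
    {"topic": "running shoes", "age_bracket": "45-54", "region": "UK", "stories": 180},
    {"topic": "tennis rackets", "age_bracket": "25-34", "region": "UK", "stories": 95},
]

sample = [r for r in records if matches(r, targets)]
total_stories = sum(r["stories"] for r in sample)
```

The point of the design is that the same targets used to sample the data can be intersected freely, so each audience segment can be isolated without re-collecting anything.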
5. It’s not a firehose
Most of the coverage of the story has focused on the idea that Facebook is finally opening up its firehose to benefit marketers – which is a misapprehension. The reality is that Facebook is going for a very different model from the one adopted by all the other social platforms.
The firehose model is so called because it delivers a stream of interactions – either 100% or a sample of the content users create on the platform – plus the metadata associated with those posts. The topic data model is not a firehose, however, because it delivers neither the stream nor any single piece of content generated on Facebook. Instead it delivers analytics on those interactions and nothing more. This means you’ll see the topics, demographics and engagements – but not the status updates that underlie this data.
This makes it a lot less onerous to handle the data from a tech point of view, which will facilitate adoption.
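The difference between the two models can be sketched in a few lines: under the analytics-only model, raw posts stay inside the platform, and only aggregate counts per topic and demographic leave it. The post structure and field names below are invented for illustration.

```python
# Sketch of the analytics-only model: the stream of raw posts never leaves
# the platform; what leaves is aggregate counts keyed by topic and
# demographic. All fields here are hypothetical.
from collections import Counter

raw_posts = [
    {"text": "loving my new trail shoes", "topic": "running", "gender": "f"},
    {"text": "5k personal best this morning!", "topic": "running", "gender": "m"},
    {"text": "match point drama", "topic": "tennis", "gender": "f"},
]

# Only the (topic, demographic) keys survive aggregation; the text does not.
analytics = Counter((p["topic"], p["gender"]) for p in raw_posts)
```

A consumer of `analytics` can count engagement by segment but can never reconstruct a status update from it, which is what makes the data both lighter to handle and privacy-preserving.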
But it also introduces another key innovation: standards.
6. Analytics standards
Because the raw data (the actual posts) never leaves Facebook, all primary analysis on it must be done in-house, from topic extraction to demographics profiling.
This means that everyone receiving the data is working with the same set of analytics, calculated against the same set of targets, and divisible with the same set of filters.
Take topic extraction, for example: today every analyst uses a different algorithm to categorize the topics of social media activity, depending on which technology vendor they use. With Facebook topic data, every single analyst is looking at topics extracted using the same tokenization, NLP and machine learning pipeline.
This means Facebook is effectively introducing analytics standards and therefore enabling directly comparable results across industries, brands, agencies and technology vendors for the first time in the history of the social listening industry.
7. Designed for audience insights, not for influencer targeting
Facebook topic data won’t let you see any specific stories or specific profiles. In fact, if your query finds stories involving fewer than 100 users, you won’t be able to see any data at all. This might seem a limitation compared to how people have been using social data until now.
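The 100-user floor works like a simple redaction rule over the aggregates: any result backed by too few unique users is withheld entirely. A minimal sketch, with illustrative field names and numbers:

```python
# Sketch of the privacy floor described above: aggregates built from fewer
# than 100 unique users are withheld. Values are hypothetical.

AUDIENCE_FLOOR = 100

def redact(results):
    """Drop any aggregate backed by fewer than AUDIENCE_FLOOR unique users."""
    return {topic: stats for topic, stats in results.items()
            if stats["unique_users"] >= AUDIENCE_FLOOR}

results = {
    "running shoes": {"unique_users": 4210, "stories": 5300},
    "niche gadget":  {"unique_users": 37,   "stories": 41},   # withheld
}

visible = redact(results)
```

This is the same intuition behind minimum-cell-size rules in survey research: small cells are suppressed so no individual can be singled out.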
One of the key things brands and organizations use social media for is identifying influencers. With Facebook topic data, you can basically forget about that. But that’s a good thing.
I’m not the biggest fan of influencer marketing, and I believe social influence feeds on myriad micro-actions rather than a few gatekeepers (unless you’re enrolling celebrities, in which case you’re not doing influencer marketing anyway but media buying). But the key point here is a different one.
Social media has always been stronger at audience understanding than audience targeting. People don’t like to be targeted: the more they feel targeted the less freely they express themselves.
Privacy is crucial and it will be more important going forward as our daily actions produce more and more data streams. What Facebook topic data is doing is switching the focus of social media listening from audience targeting to audience understanding.
And the wealth of structured, aggregated, anonymized data it brings to the table is enough to make a solid case for it.
8. Quality, quantified
I’ve been banging this drum for a while now: social media is not quantitative data, but qualitative data on a quantitative scale. What I mean by this is that analyzing social media with a purely quantitative approach throws away a lot of the value that social data carries – value which lies in the qualities of what people say and how they say it.
This means that to make the most of social media data you need to enable exploration at the micro level (the single post) and at the macro level (the millions of posts over time).
But this also means enabling smarter ways of analyzing the quality of the data using discourse analysis, content analysis, semantic analysis, machine learning and deep learning approaches. The aim of these approaches is to try and quantify the qualities in the data in order to support observations on a mass scale.
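One way to picture "quantifying qualities" is classic content analysis: apply a qualitative codebook to each post, then count the codes at scale. The toy codebook and posts below are my own illustration; a real pipeline would use NLP and machine learning rather than keyword cues.

```python
# A sketch of quantifying qualities: code each post against a simple
# content-analysis codebook, then count codes across the corpus.
# Codebook, cues and posts are toy examples, not a real methodology.
from collections import Counter

CODEBOOK = {
    "recommendation": ["recommend", "you should try"],
    "complaint":      ["broke", "refund", "disappointed"],
}

def code_post(text):
    """Return the set of qualitative codes whose cues appear in the text."""
    lowered = text.lower()
    return {code for code, cues in CODEBOOK.items()
            if any(cue in lowered for cue in cues)}

posts = [
    "I'd recommend these to anyone training for a marathon",
    "Sole broke after two weeks, asking for a refund",
    "Nice colourway",
]

counts = Counter(code for p in posts for code in code_post(p))
```

The micro level (reading a single post and coding it) and the macro level (counting codes over millions of posts) are the two ends of the same operation.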
So whilst I think that not being able to look at the verbatim text of individual stories will undoubtedly remove some of the value that social data carries from the equation, the work Facebook is doing makes it easier to quantify those qualities, and will ultimately produce more comparable observations while also respecting users’ privacy.
So yes, Facebook topic data is not simply another stream of data being pumped into the social data industry: it’s a whole new model for extracting value from social data, a model designed to protect users’ privacy and at the same time to foster a more solid and replicable way of doing research with social media data.
Could, and should, this be the future of the social media listening industry?