Using Facebook Topic Data to Understand Car Purchasing Decisions

Richard Caudle
3rd February 2016

You might have seen our recent blog post where we discussed how an agency used Facebook topic data to understand how audiences make car purchasing decisions. In this post we’ll take a look at how the research was carried out using our platform.

Using Facebook topic data to understand behavior
It can be difficult to understand how members of an audience approach a large purchasing decision such as buying a car, which people spend months considering. What features do people look for? What brands do they consider? Anything we can learn about the decision factors is valuable insight that can drive product and marketing direction.

In this instance an agency worked on behalf of the US arm of a global automotive brand.

The agency used Facebook topic data to study their client’s target audience. Their investigation revealed:

  • which brands their client’s brand was being considered against.
  • which features were most important to English-speaking and Spanish-speaking audiences.
  • that their client’s brand was being considered for purchase less often than other brands, and was often being rejected by customers.

These findings meant that the agency could recommend changes of marketing approach to their client.

Let’s take a look at how the research was carried out.

Working with PYLON
Before we look at the detailed steps, here’s a quick reminder of how PYLON works in practice.

[Image: overview of the PYLON workflow]

You work with PYLON by:

  • Filtering the stream of data from Facebook down to the stories and engagements (such as likes and comments) you’d like to analyze. Filtered data is recorded into an index.
  • Classifying the data using your own custom rules to add extra metadata for your use case.
  • Analyzing the data you have recorded to the index.

You can learn more about the platform in our What is PYLON? guide. Now let’s look at these steps in the context of this specific use case.

Filtering stories and engagements
The first step of working with Facebook topic data is recording data from a target audience for your analysis.

Using the DataSift platform you can capture stories and engagements on stories by creating a filter in CSDL. The filter specifies what data you’d like to be recorded from the Facebook data source to your index for analysis. The rules in your filter operate against the values of targets (data fields) of the stories and engagements.

For this study the agency recorded stories and engagements from people discussing automotive topics who either mentioned key brands or shared links to brand websites.

For example, a filter like the following takes a similar approach for three well-known brands:


[Screenshot: example CSDL filter for three car brands]

Notice here we’re filtering to posts from users only (not posts created by brand pages). We’re making use of topics to restrict discussions to only those related to cars, and we’re making use of link domains to pick out people sharing links to key websites.

Based on this filter, if someone posted the following:

I love the look of the new BMW 3 series!

then the story, and any likes, comments or reshares on it, would be recorded.
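To make the filter logic concrete, here is a minimal sketch in plain Python (not CSDL). The `Story` class, the brand keywords, the link domains and the "Cars" topic name are all assumptions made for illustration; they are not the agency's real filter or the platform's actual targets.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration only.
BRAND_KEYWORDS = {"bmw", "audi", "ford"}
BRAND_DOMAINS = {"bmw.com", "audi.com", "ford.com"}
CAR_TOPIC_CATEGORY = "Cars"


@dataclass
class Story:
    author_type: str                               # "user" or "page"
    content: str
    topic_categories: set = field(default_factory=set)
    link_domains: set = field(default_factory=set)


def matches_filter(story: Story) -> bool:
    """Return True when a story would be recorded under the sketched rules."""
    if story.author_type != "user":                # skip posts created by brand pages
        return False
    words = {w.strip(".,!?").lower() for w in story.content.split()}
    # Car-related topic AND a brand mention...
    mentions_brand = (
        CAR_TOPIC_CATEGORY in story.topic_categories
        and bool(words & BRAND_KEYWORDS)
    )
    # ...OR a shared link to one of the brand websites.
    shares_brand_link = bool(story.link_domains & BRAND_DOMAINS)
    return mentions_brand or shares_brand_link


post = Story("user", "I love the look of the new BMW 3 series!", {"Cars"})
print(matches_filter(post))
```

In this sketch the example post passes because it comes from a user, carries a car-related topic and mentions one of the brand keywords.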

The agency started a recording using a filter similar to the example above to record data for their analysis.

Adding value through classification
Facebook topic data is already a rich data set, but you can add further value using classification rules. When you add classification rules to a filter, the platform records additional metadata for each story and engagement. You can use this additional metadata in your analysis.

In this case the agency was interested in identifying automotive brands, the car features discussed, conversations that mention multiple brands (signifying cross-shopping) and the intent expressed by authors.

Firstly, for any share-of-voice use case it is useful to add tags for each brand. For our example filter we could add:

[Screenshot: example brand tags]

We could also add tags that would pick out car features in different languages:

[Screenshot: example car feature tags in English and Spanish]

Notice here that the tags help us to group posts by feature and language, and normalize the data for analysis later.

When a story or engagement matches the filter conditions the classification rules are applied before the data is recorded to your index. So in this case if the content of a post reads:

I love the look of the new BMW 3 series, it would look awesome in black!

The story will be tagged with “BMW” and “Style” when it is stored to the index.
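The tagging step can be sketched in plain Python as a lookup of terms against the post content. The term lists and the tag names other than “BMW” and “Style” are assumptions for illustration; real classification rules would be written in CSDL with more complete sets of terms.

```python
import re

# Hypothetical term lists, for illustration only.
BRAND_TERMS = {
    "BMW": ["bmw"],
    "Audi": ["audi"],
    "Ford": ["ford"],
}
FEATURE_TERMS = {
    # (feature, language) -> terms that signal the feature in that language
    ("Style", "English"): ["look", "style", "design"],
    ("Style", "Spanish"): ["estilo", "diseño"],
    ("Mechanical", "English"): ["engine", "horsepower", "transmission"],
    ("Mechanical", "Spanish"): ["caballos", "transmisión"],
}


def tokenize(text):
    """Lower-case the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def classify(content):
    """Return the brand, feature and language tags that apply to a post."""
    words = tokenize(content)
    brands = sorted(b for b, terms in BRAND_TERMS.items() if words & set(terms))
    matched = [key for key, terms in FEATURE_TERMS.items() if words & set(terms)]
    return {
        "brand": brands,
        "feature": sorted({feature for feature, _ in matched}),
        "language": sorted({language for _, language in matched}),
    }


tags = classify("I love the look of the new BMW 3 series, it would look awesome in black!")
print(tags)
```

Mapping both English and Spanish terms onto the same feature name is what normalizes the data: later analysis can count “Style” once, whichever language the author used.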

We could also add tags to identify where authors discuss multiple brands, to find evidence of cross-shopping:

[Screenshot: example cross-shopping tags]
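One way to think about cross-shopping tags is as one tag per pair of brands mentioned in the same post. The sketch below shows that idea in plain Python; the brand list and the "A+B" tag naming are assumptions, not the agency's actual rules.

```python
import re
from itertools import combinations

BRANDS = ["Audi", "BMW", "Ford"]  # hypothetical brand list


def cross_shopping_tags(content):
    """Return one tag per pair of brands mentioned in the same post."""
    words = set(re.findall(r"\w+", content.lower()))
    mentioned = sorted(b for b in BRANDS if b.lower() in words)
    # Every unordered pair of mentioned brands becomes one tag.
    return [f"{a}+{b}" for a, b in combinations(mentioned, 2)]


print(cross_shopping_tags("Should I get the BMW or the Audi?"))
```

Sorting the brands before pairing them means “BMW or Audi” and “Audi or BMW” produce the same tag, so the counts are not split across two spellings of one comparison.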

Finally we could add tags to pick out intent and purchase stage:


[Screenshot: example intent and purchase stage tags]

These are simple illustrative example tags. The agency took a similar approach with more complete sets of terms to classify data as it was being recorded.
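As a rough sketch of intent tagging, the Python below matches a post against a few phrases per purchase stage. The stage names and phrases are invented for illustration; they are not the agency's term lists.

```python
# Hypothetical purchase stages and trigger phrases, for illustration only.
INTENT_PHRASES = {
    "Considering": ["thinking of buying", "test drive", "should i get"],
    "Purchased": ["just bought", "picked up my new"],
    "Owner": ["i own", "love my"],
    "Rejected": ["would never buy", "decided against"],
}


def intent_tags(content):
    """Return every purchase stage whose phrases appear in the post."""
    text = content.lower()
    return sorted(
        stage
        for stage, phrases in INTENT_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    )


print(intent_tags("Thinking of buying a Ford, I would never buy an Audi"))
```

A single post can carry more than one stage, as in the usage line above, so it is worth deciding up front whether that is acceptable for your analysis or whether one stage should take priority.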

Finding audience insights
Once you’ve recorded data to your index you can immediately perform initial analysis using analysis queries.

You can perform a time series analysis to see how an audience engaged over time. You can perform a frequency distribution analysis to quantify the engagement by segments of your audience. A more advanced form of analysis is the nested query, where you segment and quantify your audience by multiple dimensions.

You also have the option of using query filters to narrow down to a portion of your recorded data before performing analysis. So, for example, you could use the example tags above and filter to only stories and engagements relating to a single brand, such as BMW, before performing a time series analysis.
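To illustrate what these analysis types compute, here is a small local sketch in Python. It is not the platform's API: the records, field names and the `keep` filter argument are assumptions, and the real queries run against the recorded index rather than a list in memory.

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up recorded interactions, for illustration only.
RECORDS = [
    {"ts": 1454284800, "brand": "BMW", "feature": "Style"},       # 2016-02-01 UTC
    {"ts": 1454288400, "brand": "BMW", "feature": "Mechanical"},  # 2016-02-01 UTC
    {"ts": 1454371200, "brand": "Audi", "feature": "Style"},      # 2016-02-02 UTC
    {"ts": 1454374800, "brand": "BMW", "feature": "Style"},       # 2016-02-02 UTC
]


def keep_all(record):
    return True


def freq_dist(records, target, keep=keep_all):
    """Count interactions per value of `target`, after applying the filter."""
    return dict(Counter(r[target] for r in records if keep(r)))


def time_series(records, keep=keep_all):
    """Count interactions per UTC day, after applying the filter."""
    days = Counter(
        datetime.fromtimestamp(r["ts"], tz=timezone.utc).date().isoformat()
        for r in records
        if keep(r)
    )
    return dict(sorted(days.items()))


def only_bmw(record):
    """A query filter: keep one brand's interactions only."""
    return record["brand"] == "BMW"


print(freq_dist(RECORDS, "feature"))
print(time_series(RECORDS, keep=only_bmw))
```

The filter is applied before any counting happens, which mirrors how a query filter narrows the recorded data before the analysis is performed.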

With the data neatly classified and recorded into an index, the agency submitted analysis queries to investigate the recorded audience.

Investigating cross-shopping
Firstly the agency looked at which brands were being discussed at the same time. This can be considered an indication of cross-shopping (where two brands are being compared).

To perform this analysis the agency used a frequency distribution query analyzing the cross-shopping tags they had added to their filter.

[Screenshot: frequency distribution query on the cross-shopping tags]

Grouping the results by brand revealed the following:

[Chart: cross-shopping results grouped by brand]

Here I’ve removed the brand names, but you can see how, for instance, ‘Brand 2’ is clearly being discussed alongside ‘Brand E’.
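The grouping step can be sketched as follows, assuming the frequency distribution returns a count per cross-shopping tag and that tags are named "A+B". Both the tag format and the counts are made up for illustration.

```python
from collections import defaultdict


def group_by_brand(pair_counts):
    """Turn {"A+B": count} into {brand: {other_brand: count}}, largest first."""
    grouped = defaultdict(dict)
    for pair, count in pair_counts.items():
        first, second = pair.split("+")
        # A comparison counts for both brands in the pair.
        grouped[first][second] = count
        grouped[second][first] = count
    return {
        brand: dict(sorted(others.items(), key=lambda item: item[1], reverse=True))
        for brand, others in grouped.items()
    }


# Made-up counts, for illustration only.
pair_counts = {"Brand A+Brand B": 40, "Brand A+Brand C": 10, "Brand B+Brand C": 25}
for brand, others in group_by_brand(pair_counts).items():
    print(brand, others)
```

Reading one row of the result tells you which competitors a given brand is most often compared against.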

Understanding key features for audiences
Next the agency looked at which features were being discussed, split by the language in use.

Again the agency used frequency distribution queries, analyzing the feature tags they had added to their filter.

[Screenshot: frequency distribution query on the feature tags]

Note the filter argument for the query filters the dataset to just posts and engagements from females prior to the analysis being performed.

Plotting the results revealed the key features for the English-speaking audience:

[Chart: key features for the English-speaking audience]

And for the Spanish-speaking audience:

[Chart: key features for the Spanish-speaking audience]

For the English-speaking audience style was most important. For the Spanish-speaking audience mechanical specifications were most important.
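When two audiences differ in size, raw counts are hard to compare, so one option is to convert each language's feature counts into shares before ranking them. This is our own suggestion rather than a step the agency described, and the counts below are invented; only the ordering of the top feature echoes the finding above.

```python
def feature_shares(counts):
    """Map {language: {feature: count}} to {language: [(feature, share), ...]}."""
    shares = {}
    for language, by_feature in counts.items():
        total = sum(by_feature.values())
        if total == 0:
            shares[language] = []      # nothing recorded for this language
            continue
        ranked = sorted(by_feature.items(), key=lambda item: item[1], reverse=True)
        shares[language] = [(feature, round(n / total, 2)) for feature, n in ranked]
    return shares


# Made-up counts, for illustration only.
counts = {
    "English": {"Style": 50, "Mechanical": 30, "Safety": 20},
    "Spanish": {"Mechanical": 45, "Style": 30, "Safety": 25},
}
for language, ranked in feature_shares(counts).items():
    print(language, ranked)
```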

Studying intent
Finally the agency looked at the intent (or purchase stage) of discussions for each brand.

Again they made use of the tags they had added to their filter, performing a nested query in this case:


[Screenshot: nested query on the brand and intent tags]

Plotting the analysis results gave the following:

[Chart: intent (purchase stage) by brand]

You can see that the client’s brand was often talked about by owners, but was considered as a purchase choice relatively less often.
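A nested analysis of this kind boils down to counting by one dimension inside another. The sketch below does that locally in Python for brand and intent, then compares consideration with ownership. The records, the stage names and the ratio itself are assumptions for illustration, not the agency's figures or method.

```python
from collections import Counter, defaultdict


def nested_counts(records):
    """Count interactions per intent within each brand."""
    nested = defaultdict(Counter)
    for brand, intent in records:
        nested[brand][intent] += 1
    return {brand: dict(intents) for brand, intents in nested.items()}


def consideration_ratio(intents):
    """Considering interactions per Owner interaction (None if no owners)."""
    owners = intents.get("Owner", 0)
    if owners == 0:
        return None
    return intents.get("Considering", 0) / owners


# Made-up (brand, intent) pairs, for illustration only.
records = (
    [("Client", "Owner")] * 3
    + [("Client", "Considering")]
    + [("Rival", "Owner")]
    + [("Rival", "Considering")] * 2
)
for brand, intents in nested_counts(records).items():
    print(brand, intents, consideration_ratio(intents))
```

A low ratio for one brand next to a high ratio for another is the pattern described above: plenty of owner conversation, but comparatively little consideration.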

Learn more…
PYLON for Facebook Topic Data gives analysts access to a vast new audience to test their assumptions and to inform better decisions.

To learn more about the platform take a look at our What is PYLON? guide.
Also, keep an eye on this blog for more Facebook topic data use cases which we’ll be posting soon.
