In this blog post we dive into what exactly “Data-Driven Attribution” is and how it can help your business.

As a customer, how many purchases have you made after only one interaction with a business and its product? I’m guessing the number is somewhere close to zero.

Now, picture an industry-leading business with an amazing, engaging website, a plethora of digital media platforms, and multiple online and offline selling avenues. All of these channels aim to achieve one thing – conversions. Imagine you are a customer of this business in this scenario:

You see a Google Ads advertisement for a product and it’s something you’ve always wanted. You click on it and go to the website. You immediately fall in love with this product – but you don’t buy it just yet, maybe you should sleep on it. A week later you see a link on a social media platform for the same product and decide to return for another look. You look for your credit card, but you need to get to work right now. A few hours later on your lunch break, you get an email campaign for the same product again! – Perfect! You go directly to the website and make the purchase.

So you’ve made a purchase and the company has obviously done something right to make a sale, but what was it? Was it the Google Ads advertisement that started this customer journey? Was it your catchy social campaign? Your well-timed email campaign? Or something else entirely?

Whether your business aims to make sales or has other conversion goals, understanding which touch points on your customers’ journeys deserve credit for sales or conversions is hugely important. Otherwise, how do you know which assets in your marketing mix are doing the job right and where you should allocate your valuable time and creative resources? Assigning this credit is known as attribution modelling, and Google Analytics has some great tools to help, including a data-driven approach.

Attribution Modelling in Google Analytics

Google Analytics offers a number of default attribution models. Let’s recap some of the most common ones.

Last Interaction: This model gives all the credit for a conversion to the last channel in the customer’s journey.

First Interaction: This model attributes credit for the conversion entirely to the very first channel in the customer’s journey.

Linear: This model recognises all channels on the conversion pathway as important. So it shares credit evenly across all channels in the customer’s journey.

Time Decay: This model also shares credit across channels, but it recognises more recent interactions could be more important than touch points that occurred, say, a week ago.

There are more models you can use in Google Analytics and even a Model Comparison Tool which allows you to compare up to three models at once.
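
To make the four models above concrete, here is a minimal Python sketch that splits the credit for a single conversion across the touch points in a journey. It is an illustration only, not Google's implementation: the function name `attribute`, the `(channel, days_before_conversion)` path format, and the seven-day half-life used for Time Decay are all assumptions made for this example.

```python
from collections import defaultdict

def attribute(path, model="last", half_life_days=7.0):
    """Split one conversion's credit across the channels in `path`.

    `path` is a list of (channel, days_before_conversion) tuples in
    chronological order. Returns a dict of channel -> credit summing to 1.
    """
    if not path:
        return {}
    if model == "last":
        weights = [0.0] * (len(path) - 1) + [1.0]
    elif model == "first":
        weights = [1.0] + [0.0] * (len(path) - 1)
    elif model == "linear":
        weights = [1.0] * len(path)
    elif model == "time_decay":
        # Assumed decay rule: a touch loses half its weight
        # for every `half_life_days` before the conversion.
        weights = [0.5 ** (days / half_life_days) for _, days in path]
    else:
        raise ValueError(f"unknown model: {model}")

    total = sum(weights)
    credit = defaultdict(float)
    for (channel, _), weight in zip(path, weights):
        credit[channel] += weight / total
    return dict(credit)

# The journey from the story above: an ad a week ago, then social
# and email on the day of purchase (timings are approximate).
journey = [("Google Ads", 7.0), ("Social", 0.2), ("Email", 0.0)]
for name in ("last", "first", "linear", "time_decay"):
    print(name, attribute(journey, name))
```

Running this on the example journey shows how differently the same purchase can be credited depending on the rule you pick.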

This is a great start, but these models all have one thing in common – they are heuristics. In other words, you ultimately decide which model to use, and that logic is hard-wired into the report. While these methods are quite intuitive, they are also quite naive. How can we use a method that better reflects the real-life complexities of your online business?

Data-Driven Attribution Model

Google’s Data-Driven Attribution is a feature only available in Google Analytics 360, part of the Google Marketing Platform. Rather than using the position-based heuristics above, Data-Driven Attribution uses real data from your Google Analytics account to generate a custom model, driven by a more sophisticated algorithm.

How does Attribution work?

The more basic, position-based methods are only interested in the paths that led to a conversion. Google’s Data-Driven Attribution model analyses both converting and non-converting pathways. According to Google, it has two main steps:

  1. Analyse all available path data to develop a probabilistic model of how customer journeys progress on your site.
  2. Apply a sophisticated algorithm to this data to assign credit to particular marketing touch points.

The algorithm used in Data-Driven Attribution is based on a concept called the Shapley Value, which is from the field of cooperative game theory (more on this below). This method recognises the contribution a marketing touchpoint makes depends on where in the conversion pathway it occurs. By comparing many similar customer pathway sequences both with and without a given touch point included, a form of weighted contribution can be calculated. Put another way, intuitively this is like removing a particular marketing channel from a sequence of touch points in a customer’s journey and wondering what downstream impact it would have on conversions.
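
The "remove a channel and see what happens" intuition can be sketched in a few lines of Python. This is a simplified illustration, not Google's algorithm: the helper names `conversion_rate` and `removal_lift` are hypothetical, the observations are made up, and only exact sequence matches are compared.

```python
def conversion_rate(paths, sequence):
    """Share of observed journeys matching `sequence` exactly that converted."""
    outcomes = [converted for seq, converted in paths if seq == sequence]
    return sum(outcomes) / len(outcomes) if outcomes else None

def removal_lift(paths, sequence, channel):
    """Relative change in conversion rate from including `channel` in `sequence`."""
    sequence = tuple(sequence)
    without = tuple(c for c in sequence if c != channel)
    rate_with = conversion_rate(paths, sequence)
    rate_without = conversion_rate(paths, without)
    if rate_with is None or not rate_without:
        return None  # not enough data to make the comparison
    return rate_with / rate_without - 1.0

# Made-up observations of (path, converted?) covering both
# converting and non-converting journeys.
observed = (
    [(("Search", "Display", "Email"), True)] * 3
    + [(("Search", "Display", "Email"), False)] * 97
    + [(("Search", "Email"), True)] * 2
    + [(("Search", "Email"), False)] * 98
)
print(removal_lift(observed, ("Search", "Display", "Email"), "Display"))
```

In this toy data, journeys that include ‘Display’ convert at 3% and those without it at 2%, so the lift is roughly 50%. Note that the non-converting journeys are essential here: without them there would be no conversion rate to compare.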

But what is ‘the Shapley Value’?

Google’s Data-Driven Attribution is based on a method known as the Shapley Value, but what is this?

The Shapley Value was introduced in 1953 by Nobel Prize-winning mathematician Lloyd Shapley. It is a concept in Game Theory, specifically in systems where many factors need to ‘cooperate’ to achieve a given outcome. The Shapley Value is a way of allocating credit for the total outcome achieved among these many cooperating factors.

A simple analogy for building our intuition is that of a soccer game. If the striker scores the most goals, he or she will traditionally get all of the credit (this is effectively Last Interaction attribution, as the striker got the last touch before the ball went in the goal). But what if these goals were actually the result of a brilliant pass from the midfield? If the pass was so perfectly timed that anyone could have simply tapped the ball in, surely the credit must be given to the midfielder? If this is the case, the striker is not adding much ‘marginal value’. Would the team score as many goals if the brilliant midfielder was not playing?

Clearly, it is not just what factors are involved, but how they interact, and in what order that is truly important to understand how they contribute to the overall goal.

Going back to a digital analytics example, we can see a comparison of two pathways below. Including ‘Display’ at this point in the sequence increases the likelihood of a purchase by 50%. Therefore we can attribute this increase to ‘Display’, despite it being only one link in the sequence. To work out the full credit due to ‘Display’, more comparisons need to be made, with ‘Display’ occurring at different positions and working alongside different touch points.

To translate this: For any given marketing touch point (e.g. Display), calculate the payoff achieved by a pathway sequence that includes ‘Display’. Subtract from this the payoff achieved by the touch points preceding it; the difference is the marginal contribution of ‘Display’ in that sequence. Add these marginal contributions up across all permutations containing ‘Display’ and divide by the total number of possible pathway permutations. This gives us the weighted marginal contribution of a given marketing channel to overall conversions.
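
The recipe above can be sketched in Python using the textbook form of the Shapley Value. Treat this as an illustration under stated assumptions rather than the algorithm Google runs: the function name `shapley_values` is hypothetical, the payoff figures (conversions per 1,000 journeys) are made up, and the payoff here depends only on which channels are present, not on their exact order.

```python
from itertools import permutations

def shapley_values(channels, payoff):
    """Average each channel's marginal contribution over all orderings.

    `payoff` maps a frozenset of channels to the value that set achieves.
    """
    orderings = list(permutations(channels))
    totals = {channel: 0.0 for channel in channels}
    for order in orderings:
        seen = frozenset()
        for channel in order:
            with_channel = seen | {channel}
            # Marginal contribution: payoff with the channel added,
            # minus the payoff of the touch points that came before it.
            totals[channel] += payoff[with_channel] - payoff[seen]
            seen = with_channel
    return {channel: total / len(orderings) for channel, total in totals.items()}

# Made-up conversions per 1,000 journeys for each channel combination.
payoff = {
    frozenset(): 0,
    frozenset({"Search"}): 20,
    frozenset({"Display"}): 10,
    frozenset({"Email"}): 20,
    frozenset({"Search", "Display"}): 40,
    frozenset({"Search", "Email"}): 45,
    frozenset({"Display", "Email"}): 30,
    frozenset({"Search", "Display", "Email"}): 60,
}
credit = shapley_values(["Search", "Display", "Email"], payoff)
for channel, value in credit.items():
    print(f"{channel}: {value:.2f}")
```

A useful property to check is that the individual credits add up to the payoff of all channels working together (60 in this toy example), so no credit is lost or double counted.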

Interpreting Results

Model Explorer Report

To understand the results of Google’s Data-Driven Attribution model, we can use the ‘Model Explorer’ report, located at Conversions > Attribution.

This visualisation shows the overall weighting given for each channel at different locations in the conversion path. This can be really useful in understanding how to acquire customers and how to tailor your marketing activities to cultivate conversions along your funnels.

Download Results

If you would like to export the full model results, Google Analytics 360 allows you to do this with the ‘Download the full model’ button on the Model Explorer page. This is useful if you want to build on the results and incorporate them into your own analyses.

Model Comparison Tool

We mentioned above that the ‘Model Comparison’ tool is used to compare Google’s default attribution models. The good news is you can compare the Data-Driven Attribution Model alongside up to two other models.


There are some important factors to be aware of with Data-Driven Attribution.

Firstly, the results presented are refreshed by Google on a weekly basis and look back at the last 28 days of conversion history at the time the model is trained. The benefit here is that models will evolve as your online activity does. Also, the model will only look back to a maximum of 4 interactions within the 90 days prior to each conversion.
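
To picture what the look-back rule means for a single conversion, here is a small Python sketch that keeps only the touch points the rule would consider. It is an illustration of the limits described above, not Google's code: the function name `touches_in_scope` and the `(channel, timestamp)` format are assumptions for this example.

```python
from datetime import datetime, timedelta

def touches_in_scope(touches, conversion_time, max_touches=4, lookback_days=90):
    """Keep the most recent `max_touches` interactions inside the lookback window.

    `touches` is a list of (channel, timestamp) tuples in any order.
    """
    if max_touches <= 0:
        return []
    window_start = conversion_time - timedelta(days=lookback_days)
    eligible = [
        (channel, ts) for channel, ts in touches
        if window_start <= ts <= conversion_time
    ]
    eligible.sort(key=lambda touch: touch[1])  # oldest first
    return eligible[-max_touches:]

converted = datetime(2020, 6, 30, 12, 0)
touches = [
    ("Display", converted - timedelta(days=120)),   # outside the 90-day window
    ("Google Ads", converted - timedelta(days=40)),  # in window, but fifth most recent
    ("Social", converted - timedelta(days=20)),
    ("Direct", converted - timedelta(days=7)),
    ("Social", converted - timedelta(days=2)),
    ("Email", converted - timedelta(hours=1)),
]
in_scope = touches_in_scope(touches, converted)
print([channel for channel, _ in in_scope])
```

Here the ‘Display’ touch is dropped for being too old and the ‘Google Ads’ touch is dropped because four more recent interactions exist, which is worth keeping in mind for businesses with long sales cycles.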

While clicks and direct traffic are of course included, the model is also able to incorporate data from other products linked to Google Analytics such as Google Ads, the Google Display Network, and Campaign Manager.

Given this method learns from historical data, thresholds are imposed so that the results are meaningful. These thresholds set the minimum number of pathways and conversions needed to ensure there is enough data to train the model. At the time of writing these thresholds are:

  • 400 conversions per conversion type with a path length of 2+ interactions (i.e., 400 conversions for a specific goal or transaction, not a sum of 400 overall conversion types) AND
  • 10,000 paths in the selected reporting view (roughly equivalent to 10,000 users, although a single user may generate multiple paths)
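
If you want a quick sanity check before relying on the model, the two thresholds above can be expressed as a simple test. This is a sketch only: the function name `meets_thresholds` and its inputs are hypothetical, and the numbers are the ones quoted in this article at the time of writing.

```python
MIN_CONVERSIONS = 400  # per conversion type, with a path length of 2+ interactions
MIN_PATHS = 10_000     # paths in the selected reporting view

def meets_thresholds(conversion_path_lengths, total_paths):
    """Check one conversion type against the two minimums quoted above.

    `conversion_path_lengths` holds the number of interactions that
    preceded each conversion of this type.
    """
    multi_touch = sum(1 for length in conversion_path_lengths if length >= 2)
    return multi_touch >= MIN_CONVERSIONS and total_paths >= MIN_PATHS

# 500 single-touch conversions do not count towards the 400 required.
print(meets_thresholds([1] * 500 + [3] * 399, total_paths=20_000))
print(meets_thresholds([2] * 400, total_paths=10_000))
```

The check is run per conversion type, so a view can qualify for one goal and fall short on another.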


Famous statistician George Box is quoted as saying “All models are wrong but some are useful”. This is the key concept here. While it is impossible to tell exactly what has caused each and every customer to convert on our websites, by understanding the tools available to us we can learn something really useful and start making data-driven decisions.

Starting with basic attribution models, moving on to data-driven attribution, and hopefully speaking to us about how we can further tailor a model to suit your business needs are all critical steps in making sure you invest your time and money in the channels that are actually driving conversions for you.

At XPON we have a specialist team of data analysts and MarTech experts. If you have any further questions on attribution modelling or strategies you can take to give your data more meaning, get in touch with us.