
003 | The Past, Present and Future of Online Attribution


When it comes to measurement, the marketer’s goal has always been the same: to quantify the impact of marketing on commercial metrics in order to validate and optimise marketing strategy and channel mix.


Using data to drive decision making is what first attracted me to digital marketing: I could apply my mathematical background and creative problem solving to key business challenges. Almost twenty years ago I created a multi-touch, cross-channel attribution model for First Choice Holidays (part of the TUI Travel Group). Because it demanded more processing power than a standard Windows PC could offer, it had to run on an isolated machine overnight on a Thursday so that I could analyse the results with a bacon sandwich on a Friday morning.


Today online attribution faces its biggest challenges yet. Fundamental tools required to make connections between data points are going away to conform with new regulations and a privacy-conscious consumer. Before examining the future of digital marketing attribution within this landscape, I wanted to spend some time looking back at the journey that brought us to this point. Today’s challenges may be significant, but this past will demonstrate that the industry has been able to meet and overcome some sizable problems to get here. It is the same creative thinking, combined with technological innovation, that will enable marketers to continue to use attribution models to make the right decisions for their business.


It’s almost impossible to construct a fixed timeline that details the full history of attribution. Organisations of varying sizes developed new models and technologies at different times, and many of these were never publicly released. Therefore the below is told through the lens of my own exposures and experiences over a 20-year career in digital marketing. I’d love to be challenged, further educated or corrected by anyone with greater expertise in this field. I love to learn.


---


Sales Attribution
c.2000


With the rise of digital marketing in the early 2000s, effectiveness was initially judged by the number of times an ad was shown compared to the forecast. Advertisers would buy impressions at a certain cost per thousand (CPM = cost / impressions × 1,000) and measure whether or not the publisher could deliver the desired quantity.


In 2002 the growth of the pay-per-click model enabled clicks to become the dominant metric for understanding ad effectiveness. Advertisers now understood how many people had taken an action directly on their ad, the most common result being a visit to their website. These models introduced the metrics clickthrough rate (CTR = clicks / impressions) and cost per click (CPC = cost / clicks).


Analytics organisations such as Omniture and WebTrends led the way in further evolving this model. If a visit to a website could be recorded, then additional onsite activity could also be captured, including, crucially, visits to the sales confirmation page. The ability to connect these ‘converters’ with the marketing touchpoints they had used along the journey was the foundation of online sales attribution. A small data file deployed on a consumer’s computer, the cookie, allowed websites to identify them and facilitated the connection between the initial click and the action.


Organisations with the tools to make connections between sales and marketing activity gained access to new metrics, with cost per acquisition (CPA = marketing cost / acquisitions or actions) used by the majority. Organisations now understood, at a high level, how many of their sales or actions were driven by their digital marketing activity.
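
To make these definitions concrete, here’s a minimal sketch of the four foundational metrics in Python, using illustrative campaign figures:

```python
def cpm(cost, impressions):
    """Cost per 1,000 impressions."""
    return cost / impressions * 1000

def ctr(clicks, impressions):
    """Clickthrough rate: share of impressions that were clicked."""
    return clicks / impressions

def cpc(cost, clicks):
    """Cost per click."""
    return cost / clicks

def cpa(cost, acquisitions):
    """Cost per acquisition (a sale or other tracked action)."""
    return cost / acquisitions

# Illustrative campaign figures, not from any real account
cost, impressions, clicks, sales = 5000.0, 1_000_000, 12_000, 300
print(f"CPM: £{cpm(cost, impressions):.2f}")   # £5.00
print(f"CTR: {ctr(clicks, impressions):.2%}")  # 1.20%
print(f"CPC: £{cpc(cost, clicks):.2f}")        # £0.42
print(f"CPA: £{cpa(cost, sales):.2f}")         # £16.67
```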



Multichannel Attribution

c.2005


Recognising that sales and revenue could be directly associated with marketing activity was a powerful tool, but challenges soon emerged. As access to information increased, consumers would interact with an increasing number of marketing touchpoints before making a purchase. Allocating the full value of the sale to every single touchpoint led to a biased view of the digital marketing channel overall. I recall regularly viewing marketing analytics results showing that sales recorded through digital marketing were more than double the total sales transacted via the website on the day!


The last-click attribution model became the solution to these challenges. Essentially, the last known click on a tracked digital marketing advert from any publisher would receive the full credit for the sale made. Despite universal recognition of the limitations of this model, something I wrote about back in 2009, it became the industry standard for many years.


It was these limitations that led me to develop the cross-platform model used at TUI. This model distributed the value of the sale equally across every recorded touchpoint in the user’s online journey to purchase. The model was also adaptable depending on the business need: to drive the most efficient sales it could be weighted towards the last click, or to increase new leads towards the first click. Over the years, similar rules-based models have become available in the majority of online analytics and ad platforms.
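
As a rough illustration of how such a rules-based model distributes credit (my own simplified sketch, not the original TUI implementation), the split can be shifted towards the first or last touchpoint:

```python
def distribute_credit(touchpoints, sale_value, weight_to=None, share=0.5):
    """Spread sale_value across an ordered journey of touchpoints.

    weight_to: None for an equal split, or "first"/"last" to reserve
    `share` of the value for that endpoint (the rest split equally).
    """
    n = len(touchpoints)
    if weight_to is None or n == 1:
        return [(tp, sale_value / n) for tp in touchpoints]  # linear model
    idx = 0 if weight_to == "first" else n - 1
    rest = sale_value * (1 - share) / (n - 1)
    return [(tp, sale_value * share if i == idx else rest)
            for i, tp in enumerate(touchpoints)]

journey = ["display", "email", "paid_search"]
print(distribute_credit(journey, 100.0))                    # ~£33.33 each
print(distribute_credit(journey, 100.0, weight_to="last"))  # £25, £25, £50
```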



Cross-Platform / Cross-Device

c.2010


As the rise of smartphones accelerated during the 2010s, a new problem emerged. Marketing touchpoints increasingly occurred across multiple devices. These interactions could not be associated back to the same user, and as such the path to purchase became fragmented. This was further complicated as media interactions did not always occur via a traditional browser. A user could click on an advert within the Facebook app but later make a purchase via their desktop browser, yet the two events could not be connected.


New deterministic methods were required to try and match the user across their devices. Publishers and platforms would increasingly advocate for users to log in to their services, allowing them to identify users across their widening portfolio of devices. If a user was logged into their accounts on every device used throughout their journey, then conversions could be attributed accurately.


Probabilistic methods were also used to complement deterministic methods, or where no logged-in first-party data existed. With this approach, models and algorithms attempt to match users across their devices using a range of identifiers such as mobile advertising IDs, IP addresses, browser type or language settings.
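
A toy sketch of the probabilistic idea: score the likelihood that two device profiles belong to the same user from a handful of weak signals. The signals and weights below are arbitrary illustrations; production systems use trained models over far richer data:

```python
# Each signal shared between two devices adds weight to the match score.
# Weights are arbitrary illustrations, not a production model.
SIGNAL_WEIGHTS = {"ip_address": 0.4, "browser_family": 0.15,
                  "language": 0.15, "timezone": 0.1, "geo_city": 0.2}

def match_score(device_a: dict, device_b: dict) -> float:
    """Return a 0-1 score that two device profiles are the same user."""
    return sum(weight for signal, weight in SIGNAL_WEIGHTS.items()
               if device_a.get(signal) == device_b.get(signal))

phone  = {"ip_address": "81.2.69.1", "browser_family": "Safari",
          "language": "en-GB", "timezone": "Europe/London", "geo_city": "Leeds"}
laptop = {"ip_address": "81.2.69.1", "browser_family": "Chrome",
          "language": "en-GB", "timezone": "Europe/London", "geo_city": "Leeds"}

if match_score(phone, laptop) >= 0.7:  # threshold tuned to risk appetite
    print("Treat as the same user for attribution")
```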


These methods helped to provide a robust solution to cross-device attribution. 



Online to Offline 

c.2013


By this time, the majority of organisations were able to recognise, at a granular level, how online sales were driven by digital marketing activity. But in most verticals, the majority of sales still occurred in physical stores. Research online, purchase offline had become the most common route to buying products.


Organisations such as Billups started to leverage data from users who had opted in to location tracking on their smartphones in order to identify movement near physical stores. The smartphone had turned from a tool that complicated online attribution into one that could help connect digital activity to offline actions. In 2014 Google introduced the store visits metric into AdWords, providing marketers with metrics such as cost per visit (CPV = cost / store visits).


Organisations looking to go further could also understand the impact on in-store sales. Propositions emerged that combined publisher clickstream data with offline sales information, leveraging user email addresses as the primary key. It was now possible to connect digital marketing activity with total sales, regardless of channel.
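
Conceptually this is a join between two datasets on a shared key. A minimal sketch, assuming hypothetical click and till-receipt tables keyed on a hashed email address:

```python
import pandas as pd

# Hypothetical datasets: online ad clicks and in-store till receipts,
# both carrying a hashed customer email address as the join key.
clicks = pd.DataFrame({
    "email_hash": ["a1f3...", "b2c4...", "c9d1..."],
    "channel":    ["paid_search", "display", "paid_search"],
    "click_date": pd.to_datetime(["2013-05-01", "2013-05-02", "2013-05-03"]),
})
store_sales = pd.DataFrame({
    "email_hash": ["a1f3...", "c9d1..."],
    "sale_value": [120.00, 45.50],
    "sale_date":  pd.to_datetime(["2013-05-04", "2013-05-05"]),
})

# Connect digital touchpoints to offline sales on the shared key.
attributed = clicks.merge(store_sales, on="email_hash", how="inner")
print(attributed.groupby("channel")["sale_value"].sum())
```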



Data-Driven Attribution (DDA)

c. 2015 


Recognising the role of digital marketing activity in driving both online and offline sales was incredibly powerful for organisations at a high level. However, the models used to attribute value across marketing touchpoints had not evolved at the same pace. Although advertisers understood the limitations of rules-based attribution models, the outputs of these models had become embedded within their internal reporting frameworks. “We know it’s not ideal but it’s working for us” was a common pushback from advertisers that had driven online growth through paid media.


In the early 2010s, adtech platforms started to beta test new models that used historical data to analyse the significance of digital touchpoints in generating a conversion, and to attribute value accordingly. These sophisticated algorithms are able to consider factors such as the sequence of interactions, the type of interaction (view, click, impression) and the time between interactions, and then assign credit to each touchpoint based on its role in driving the conversion.
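
One way to build intuition for these models is the ‘removal effect’: how much conversion volume depends on a channel being present in the journey. The sketch below is a deliberate, path-level simplification of the Markov-chain approach real DDA systems use, run over illustrative data:

```python
# Observed journeys (ordered touchpoints) and whether each converted.
# Illustrative data only.
paths = [
    (("display", "email", "paid_search"), True),
    (("paid_search",),                    True),
    (("display", "social"),               False),
    (("email", "paid_search"),            True),
    (("social", "email"),                 False),
]

converted = [set(p) for p, did_convert in paths if did_convert]
total = len(converted)
channels = set().union(*(set(p) for p, _ in paths))

# Removal effect: share of conversions lost if journeys touching the
# channel could no longer convert (a path-level simplification of the
# Markov-chain method used by production DDA systems).
removal = {c: sum(1 for conv in converted if c in conv) / total
           for c in channels}
weight_sum = sum(removal.values())
credit = {c: removal[c] / weight_sum for c in channels}
print(credit)  # fractional credit per channel, summing to 1
```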


Today the majority of advertisers will have migrated to a data-driven attribution model, particularly for their biddable media purchasing. A key strength of DDA is that it can work in almost real-time and therefore can power bidding and buying through dynamic platforms.


---


THE BIGGEST CHALLENGE YET



Third-party cookies have served as the key enabler for attribution within digital media. Used to connect different marketing touchpoints with an eventual sale, they also enabled personalisation and retargeting, and helped websites remember preferences. However, cookies are also capable of capturing far more user data and feeding it back to platforms and publishers. Their uncontrolled usage, resulting in extensive tracking and profiling of users, was the key factor in the development of reformed data privacy regulations between 2016 and 2018. These regulations modernised existing data protection directives for the digital age, providing greater user rights and increased accountability for organisations using consumer data. Regulations such as the GDPR (Europe) set a high bar for data protection internationally.


The movement to a more privacy-focused digital world was not just led by regulation. Organisations evaluated their own positions on the responsible use of online data and set out their own principles and ideologies. It then became incumbent on them to develop technology solutions to ensure those principles were upheld. Apple made one of the first moves, introducing Intelligent Tracking Prevention (ITP) to its Safari browser in 2017. This prevented the ability to track individuals by blocking third-party cookies, fingerprinting and more. By the end of 2020, all iOS browsers were required to have ITP in place. By this point the mobile device had become the dominant product for accessing internet services, either via a browser or via an app. In 2020 Apple released iOS 14 with the inclusion of App Tracking Transparency (ATT). With this technology, every single app was required to prompt the user for permission in order to track their activity. Although the figure is increasing every year, only 34% of consumers opt in to this tracking.


In January 2020, Google also announced that Chrome would phase out support for third-party cookies and, after a few extensions, set the deadline for removal as the end of 2024. Chrome is the leading internet browser in the world with a global market share of 64% (StatCounter), so the ramifications are significant for publishers, website owners and marketers.


---


THE FUTURE 

c.2025…


The impact of these changes has ramifications throughout the online world. Advertisers will face reduced targeting accuracy leading to increased CPAs, they will not be able to retarget consumers to encourage them to take an action, and they will no longer be able to attribute actions across marketing touchpoints. Publishers may see lower revenues with far fewer signals with which to develop personalised ads, whilst technology platforms need to invest in developing or adopting new solutions.


As a result, the industry as a whole is working hard to develop privacy-centric solutions that allow online actions to be recorded and attributed back to marketing activity. Here is a non-exhaustive summary of potential solutions that are either being proposed or prototyped.


Probabilistic Attribution

In the short term, the new limitations on data collection will move the industry further towards probabilistic approaches. With a reduced volume of observable data available, machine-learning-based attribution models will need to become increasingly predictive when attributing the value of actions across marketing touchpoints.


Authenticated First Party Data

Solutions such as Enhanced Conversions from Google look to supplement existing conversion tags by sending hashed first-party conversion data (e.g. an email address) from a website to Google in a privacy-safe way. This data can then be matched against Google’s own data in order to record a conversion. This solution relies on the user being signed into their Google account when taking the action, with attribution only possible against Google properties.
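
Google’s documentation describes normalising the value and hashing it with SHA-256 before it leaves the site; a minimal sketch of that preparation step:

```python
import hashlib

def prepare_email_for_upload(email: str) -> str:
    """Normalise then SHA-256 hash an email address so that only the
    hash, never the raw value, is sent for privacy-safe matching."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

print(prepare_email_for_upload("  Jane.Doe@Example.com "))
# -> a 64-character hex digest, matched against the platform's own
#    hashes of signed-in users' email addresses
```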


Identity Graph Solutions

Identity graphs also use hashed user data. An identity graph is essentially a digital map that connects data points about individuals to create a unified view of the user, while ensuring anonymity and adhering to privacy regulations. Sophisticated algorithms match and link data points, which can also be blended with demographic data, with customers segmented based on shared characteristics.
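
Conceptually, building an identity graph is a connected-components problem: any two identifiers observed together are linked into one profile. A minimal sketch using union-find, with hypothetical identifiers:

```python
# Minimal identity graph: union-find over co-observed identifiers.
parent: dict[str, str] = {}

def find(x: str) -> str:
    """Resolve an identifier to its profile root (with path-halving)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def link(a: str, b: str) -> None:
    """Record that identifiers a and b were observed together."""
    parent[find(a)] = find(b)

# Hypothetical observations: a hashed email seen alongside device IDs.
link("email#a1f3", "device#phone-123")
link("email#a1f3", "device#laptop-456")
link("cookie#xyz", "device#laptop-456")

# All four identifiers now resolve to one unified profile.
ids = ["email#a1f3", "device#phone-123", "device#laptop-456", "cookie#xyz"]
assert len({find(i) for i in ids}) == 1
```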


Server to Server (S2S)

Some measurement experts talk about server-to-server (S2S) as a core method of attribution moving forwards. This method involves passing unique identifiers directly between advertisers, publishers and attribution platforms. These unique identifiers do not contain any personal information, and data remains secure without the need for cookies.
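
A sketch of the typical flow: the publisher issues an opaque click ID at ad-click time, the advertiser stores it against the session, and the advertiser’s server posts it back when a conversion occurs. The endpoint and parameter names below are invented for illustration:

```python
import urllib.parse
import urllib.request

def send_conversion_postback(click_id: str, value: float) -> int:
    """Report a conversion server-to-server using only the opaque
    click ID issued at ad-click time -- no cookies, no personal data.
    The endpoint and parameter names here are hypothetical."""
    params = urllib.parse.urlencode({
        "click_id": click_id,     # opaque identifier, no PII
        "value": f"{value:.2f}",
        "currency": "GBP",
    })
    url = f"https://attribution.example.com/postback?{params}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status  # 200 = conversion recorded

# Called from the advertiser's own server once a sale completes:
# send_conversion_postback("clk_8f2a91", 120.00)
```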


Clean Rooms

Another option could be ‘clean rooms’: secure, controlled environments where multiple parties can collaborate and analyse data. These rooms do not expose any user-specific data, so any attributed outputs will always be at an aggregate level. These solutions can be technically challenging to set up, creating a barrier for the majority of advertisers right now. However, they are privacy-safe by design.
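
A sketch of the aggregate-only principle: results computed inside the clean room are only released for cohorts above a minimum size, so no individual can be singled out. The column names and threshold are illustrative:

```python
import pandas as pd

MIN_COHORT_SIZE = 50  # illustrative privacy threshold

# Inside the clean room: advertiser conversions already joined to
# publisher exposure data. Only aggregates may leave the room.
joined = pd.DataFrame({
    "campaign":  ["summer", "summer", "winter", "winter", "winter"],
    "converted": [1, 0, 1, 1, 0],
})

report = (joined.groupby("campaign")
                .agg(users=("converted", "size"),
                     conversions=("converted", "sum")))

# Suppress any cohort too small to be safely released.
report = report[report["users"] >= MIN_COHORT_SIZE]
print(report)  # with this toy data, every cohort is suppressed
```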


Browser Based Attribution 

Some of the latest proposals to maintain online attribution involve leveraging the user's browser to record the marketing interaction (the source event) and then match it with a conversion. This data alone is later transmitted back to the adtech company and, in some cases, publishers. To ensure user identity remains private, some additional noise is added to the reported data. This means the solution may not be viable in cost-per-conversion scenarios or when real-time bidding decisions are required.
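
The noise in these proposals is typically calibrated for differential privacy. A toy sketch of adding Laplace noise to an aggregate conversion count before it is reported, which also illustrates why small counts (and therefore per-conversion costing) become unreliable:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample from a Laplace(0, scale) distribution via inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def noisy_conversions(true_count: int, epsilon: float = 1.0) -> float:
    """Report a conversion count with Laplace noise scaled to 1/epsilon,
    the standard differential-privacy mechanism for counting queries.
    Smaller epsilon = stronger privacy = noisier reports."""
    return true_count + laplace_noise(1.0 / epsilon)

# The advertiser sees roughly the right total, but no individual
# user's presence can be inferred from any single report.
print(noisy_conversions(1_000))  # e.g. 999.2 or 1001.7
```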


---

EVER-PRESENT MEASUREMENT SOLUTIONS


Some marketing measurement solutions have been used for decades but have become more sophisticated and intuitive thanks to advances in digital technology. Limitations on available online data will still impact these models, albeit to a lesser degree. As such, these methods will play an increasingly critical role in the marketing measurement toolkit moving forwards.


Econometric / Media Mix Modelling

Econometrics is an established means of testing media activity, be it online or offline. Historically, econometrics has required significant data collection: granular marketing, website and offline data, and even external factors like weather and economic trends, are fed into advanced statistical models that output insights into channel effectiveness, ROI and attribution.


Media mix modelling (MMM) tends to be a simpler approach that focuses specifically on marketing channels and predefined metrics as outputs. Data is fed into a statistical model (typically a multivariate regression model) with a view to understanding the impact of the marketing channels on sales. 
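
A bare-bones sketch of such a regression: fit weekly sales against channel spend and read the coefficients as each channel’s estimated return per pound. The data is constructed for illustration, and real MMMs add adstock, saturation and seasonality terms:

```python
import numpy as np

# Illustrative weekly data (GBP), constructed so the model recovers
# a base of ~100, ~£4 per £ of TV and ~£6 per £ of search.
tv     = np.array([ 50,  60,  55,  70,  65,  80,  75,  90])
search = np.array([ 30,  20,  35,  25,  40,  30,  45,  35])
sales  = np.array([480, 460, 530, 530, 600, 600, 670, 670])

# Multivariate linear regression: sales ~ base + b1*tv + b2*search
X = np.column_stack([np.ones_like(tv), tv, search])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
base, b_tv, b_search = coef
print(f"Base sales: {base:.0f}, TV: £{b_tv:.2f}/£, Search: £{b_search:.2f}/£")
```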


The great strength of econometric modelling is that it is one of the few techniques to incorporate offline marketing activity such as TV, outdoor and print into the model. As such, these models are often the default source of truth for agencies and brands who invest across these channels. From an online perspective, a weakness of econometrics can be a limited recognition of the nuances and granularity of online data. I remember a consultancy advising me to increase investment in branded search following the results of their econometric study across our channels, advice that overlooked the fact that branded search largely captures demand created elsewhere.


While contemporary advertising leverages a wealth of data streams feeding directly into econometric models, studies based on this data rarely translate into the frequent and timely insights essential for dynamic bid optimisation, a cornerstone of effective online advertising. This disconnect between readily available data and actionable knowledge hinders advertisers from fully capitalising on its potential.


Incrementality Studies

Incrementality studies are used to understand the significance of a media interaction in driving a conversion. The key question is whether the media actually drove the conversion or whether it would have occurred naturally.


Incrementality studies use a test-and-control methodology, often using location to divide users into groups. I was involved in one of the first geo-experiments in the UK, and the biggest challenge was creating these groups to ensure the experiment was as fair as possible. Today, the geo-X framework is well established and far simpler to deploy, with granular splits at a geo and a user level built into many advertising platforms.
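
The core readout arithmetic is straightforward; a sketch comparing conversion rates in test and control regions and deriving the incremental contribution (figures illustrative):

```python
# Geo-experiment readout: media live in test regions, dark in control.
# Figures are illustrative.
test_users,    test_conversions    = 100_000, 2_300
control_users, control_conversions = 100_000, 2_000

test_rate    = test_conversions / test_users
control_rate = control_conversions / control_users

incremental_conversions = (test_rate - control_rate) * test_users
lift = (test_rate - control_rate) / control_rate

print(f"Incremental conversions: {incremental_conversions:.0f}")  # 300
print(f"Lift vs control: {lift:.1%}")                             # 15.0%
```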


There remain some drawbacks. In order to achieve statistically significant results, and consequently the most valuable outputs, advertisers are encouraged to ensure the experiment is of sufficient size and runs for long enough to capture the necessary data. This means that either the test or the control group is exposed to theoretical conditions. For example, turning off marketing to users in the test group for a period of time could well lead to a drop in sales. As such, incrementality tests are most powerful when proof is required to support a hypothesis that is already predicted to be true.


---


CONCLUSIONS


At one point in my career I dreamed of a utopian measurement solution where every marketing and sales touchpoint could be tracked, sales and conversion data recorded appropriately, and automated bidding and buying working in tandem. I always looked to evolve the models used to ensure that new marketing touchpoints could be considered, in particular championing the inclusion of impressions and views into models previously focused purely on clicks.


It is clear today that this expectation is unlikely to materialise. If consumers do not want organisations to know the marketing interactions leading up to a purchase, then that preference should be respected, but it means an attribution model will never be complete.


Consent has become the bedrock of privacy-compliant data collection, which in turn serves as the enabler for online attribution. The industry collectively will need to work harder to communicate the benefits of consent to consumers. In my personal view, the web is a far poorer experience without data-driven advertising. Without data-driven placements, publishers are forced to find new means of recouping lost advertising revenues by increasing the volume and intrusiveness of adverts in a quest to obtain the all-important click.


Greater consent generates more first-party data, which can be fed into probabilistic, data-driven models. The more observable data that can be fed into these models, the more accurate they will become. Today, the majority of data-driven models are platform-specific, due to their ability to leverage the platform’s own first-party data. However, I expect more platform-agnostic solutions to materialise in the coming years.


2024 promises to be a pivotal year for online attribution as organisations look to adapt to a world without third-party cookies and make the necessary adjustments to their online strategy. I believe it will be the organisations that are able to rapidly develop a first-party data strategy, and combine this with the latest machine-learning-based models, that will reap the rewards.