October 4, 2024
Demystifying Universal IDs: Value Propositions, Myths, and Real-World Applications
Unlock the potential of universal IDs in a cookieless world. In our latest article, we break down the value, debunk myths, and explore real-world applications of universal IDs, offering insights into how advertisers can leverage this technology for more accurate, privacy-focused, and effective digital campaigns.
Introduction
The digital advertising ecosystem is undergoing a transformative shift as third-party cookies are being phased out. As the industry searches for sustainable alternatives, universal IDs have emerged as a promising solution for maintaining user identity and delivering personalized advertising. However, the implementation of these IDs is not without its challenges and misconceptions.
At PrimeAudience, we view available technologies as tools to help us achieve our clients’ goals. We don't have biases tied to business affiliations; instead, we strive to remain objective, evaluating each technology based on its utility for marketers and AdTech companies. This article aims to demystify universal IDs, explore their value propositions, debunk common myths, and discuss realistic use cases and challenges faced by advertisers and publishers. We also sought to include insights from renowned professionals in the AdTech industry who are deeply engaged in the world of identity, offering diverse perspectives on the topic.
What are universal IDs?
Universal IDs are unique identifiers used to recognize users across different websites, platforms, and devices in the absence of third-party cookies. Unlike domain-specific cookies, universal IDs aim to provide a consistent identity for users, enabling more accurate targeting, measurement, and personalization in digital advertising.
As privacy regulations tighten and browser policies evolve, the importance of universal IDs has grown. They offer a way to continue delivering personalized ads across domains and environments. However, their success depends on widespread adoption and, crucially, standardization or at least interoperability.
Understanding identification methods: probabilistic vs. deterministic approaches
To fully grasp how universal IDs function, it’s essential to understand the identification methods they employ. Universal IDs typically rely on two primary methods for user identification: probabilistic and deterministic approaches. However, our experience suggests that there are four distinct categories:
- Deterministic identification: Deterministic methods rely on data points considered stable over time, such as email addresses, login information, or phone numbers, to identify users with a high degree of accuracy. This approach is preferred for precise targeting and measurement, but requires access to reliable first-party data.
- Probabilistic identification: This method uses statistical modeling to infer user identity based on device characteristics, IP addresses, and behavioral data. While it allows for broader targeting across fragmented data sources, it is inherently less precise, making it more suited for situations where accuracy is not critical.
- Hybrid methods: Some IDs combine probabilistic and deterministic approaches to maximize both reach and accuracy. However, it's crucial for DSPs and other stakeholders to know which method was used to assign the ID at bidding time. This transparency is vital for making informed decisions about how to handle the data and optimize campaigns effectively. A prime example of doing this right is LiveRamp’s approach: a two-letter prefix added to the ID tells the buyer which type of identification was used to assign it.
- Telecom identifiers: One vendor stands apart from those using the methods above. UTIQ, a joint initiative of telecom providers, has developed a methodology that leverages mobile network data. By accessing mobile network identifiers, UTIQ claims to offer a more robust and accurate deterministic identification process. The IDs generated through the service can be used for the traditional digital marketing use cases (targeting, frequency and reach management, and attribution), and because they are generated from the telecom network, they can be treated as verified. The service is currently mobile-only, which limits its scale, but we learned from UTIQ’s CPO, Will Harmer, that it will soon be available at a household level through WiFi networks.
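To make the transparency point from the hybrid-methods item concrete, here is a minimal sketch of how a buyer might read a method prefix and adjust a bid. The prefixes `DT` and `PB`, the `IdMethod` enum, and the multiplier values are all invented for illustration; they are not LiveRamp's or any other vendor's actual scheme, which is defined in the vendor's own documentation.

```python
from enum import Enum


class IdMethod(Enum):
    DETERMINISTIC = "deterministic"
    PROBABILISTIC = "probabilistic"
    UNKNOWN = "unknown"


# Hypothetical prefixes, invented for this sketch; real vendors define their own.
PREFIX_TO_METHOD = {
    "DT": IdMethod.DETERMINISTIC,
    "PB": IdMethod.PROBABILISTIC,
}


def method_from_id(universal_id: str) -> IdMethod:
    """Return the identification method signalled by the ID's two-letter prefix."""
    return PREFIX_TO_METHOD.get(universal_id[:2], IdMethod.UNKNOWN)


def bid_multiplier(universal_id: str) -> float:
    """Example policy (an assumption): bid full value only on deterministic IDs."""
    method = method_from_id(universal_id)
    if method is IdMethod.DETERMINISTIC:
        return 1.0
    if method is IdMethod.PROBABILISTIC:
        return 0.6
    return 0.3  # unknown method: treat as the least reliable signal
```

The point of the sketch is only that a buyer can branch on the method when it is exposed; without such a signal, every ID has to be treated as equally reliable.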
The value and limitations of non-cookie IDs
Deterministic IDs, while offering superior accuracy compared to probabilistic methods, face significant challenges when it comes to scale.
- Value for premium publishers: Deterministic IDs are particularly valuable for premium publishers on the open web who have a substantial amount of logged-in traffic. These publishers can offer highly accurate audience data, which is attractive to advertisers looking for precise targeting.
- Challenges in scaling: However, the effectiveness of deterministic IDs is not universal. In many geographical markets, users may be less inclined to log in, either due to cultural preferences or a reluctance to share personal information. The success of deterministic IDs often hinges on the value the publisher provides or on how forcefully users are pushed to log in. This creates a disparity in effectiveness across markets, making deterministic IDs a powerful tool for some but not a one-size-fits-all solution. It is also worth keeping in mind that this method works for retargeting only if the user is also logged in on the advertiser's website, which makes the pool of addressable users even smaller.
- Cross-device capabilities: The biggest advantage of this methodology is that once a user logs in to a service, their behavior can later be linked across devices, giving a broader picture of their activity and thus improved targeting. Modern users are reachable across a range of devices: PCs, mobile, and CTV. This, however, requires vendors either to take an omnichannel approach or to form data partnerships with other companies with whom they share clients and/or users.
Probabilistic IDs use statistical modeling to infer user identity, but they come with significant limitations.
- Inconsistent frequency capping: The inferred identities often don't match consistently, making it hard to manage frequency capping effectively. In some cases, vendors might not assign an ID at all, returning a null value when they are not confident in their prediction. In that case, the ad-serving vendor needs a backup tactic to maintain the capping.
- Accuracy trade-offs: The lack of precision in probabilistic IDs can result in inaccurate targeting, leading to inefficiencies in campaign performance. For example, if an ID is assigned incorrectly during retargeting, a user may wonder why they are seeing an ad for products they never browsed. It is at the brand’s discretion whether to place such a risky bet.
- Better reach for branding: The possibility of going beyond logged-in traffic allows for broader user reach, making them useful for branding campaigns.
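The backup tactic mentioned under inconsistent frequency capping could look roughly like the following. This is a minimal sketch, not any vendor's implementation: the `FrequencyCapper` class, its key prefixes, and the decision to skip users who have no identifier at all are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Optional


class FrequencyCapper:
    """Counts impressions per user key and enforces a maximum per campaign."""

    def __init__(self, max_impressions: int) -> None:
        self.max_impressions = max_impressions
        # (campaign_id, user_key) -> impressions served so far
        self._counts = defaultdict(int)

    @staticmethod
    def user_key(universal_id: Optional[str], first_party_id: Optional[str]) -> Optional[str]:
        """Prefer the universal ID; fall back to a first-party ID when it is null."""
        if universal_id:
            return f"uid:{universal_id}"
        if first_party_id:
            return f"fp:{first_party_id}"
        return None  # no usable identifier at all

    def should_serve(
        self,
        campaign_id: str,
        universal_id: Optional[str],
        first_party_id: Optional[str],
    ) -> bool:
        key = self.user_key(universal_id, first_party_id)
        if key is None:
            return False  # conservative assumption: skip users we cannot cap
        if self._counts[(campaign_id, key)] >= self.max_impressions:
            return False
        self._counts[(campaign_id, key)] += 1
        return True


capper = FrequencyCapper(max_impressions=2)
# The ID vendor returned null, so capping falls back to the first-party ID.
print(capper.should_serve("campaign-1", None, "fp-123"))
```

Note that a user seen once with a universal ID and once without is counted under two different keys, which is exactly the inconsistency described above.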
Implementation of universal IDs: when and how are they assigned?
To effectively use universal IDs, the process begins with the assignment of an ID to a user as they interact with a website.
- On the buy-side: As a buyer, when a user visits your website, you need to call the ID provider's service to assign an identifier to that user. This ID is created based on whichever identification method—probabilistic, deterministic, hybrid, or telecom-based—is available and applicable. As the user browses your website, you continue to build up their profile by associating various data points with this assigned ID. This enables more personalized and accurate targeting down the line.
- On the sell-side: As a seller, you also assign an ID to the user as they engage with your content. However, instead of just building a profile, you pass this ID along to the bidstream during ad auctions. This ID helps potential buyers recognize the user if they have encountered the ID before, allowing them to make more informed bid decisions. Besides passing the raw ID into the bidstream, publishers might also opt to use DMP-like technology to build audiences for behavioral targeting of their own.
This raises an important question: Are all of these IDs visible in the bidstream? If so, how does this differ from the use of cookies, and how is user privacy being preserved in this system?
These concerns lead us to the next topic—encryption of IDs—a process designed to address these very issues by enhancing privacy protections in the bidstream.
The role of encryption: salting of IDs
To enhance privacy and security, many universal ID providers use techniques such as hashing, salting, and encryption. Hashing is a one-way process that converts data, such as an email address, into a fixed-length string of characters, obscuring the original information. For example, when an email like "name@example.com" is hashed using the MD5 algorithm, it’s transformed into a coded string like “564f0b682a023cc0e88e2674d9137b77”. However, a drawback of simple hashing is that the same input always produces the same hash, making it vulnerable to brute-force attacks. This is particularly risky with email addresses, which often use common name combinations and are therefore susceptible to dictionary attacks, where attackers hash various combinations of names and domains until they find matches. Salting addresses this vulnerability by adding a unique value to each hash request, ensuring that the same email address generates different hashes depending on who is asking for it, which significantly increases security.
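The difference between plain and salted hashing can be sketched in a few lines of Python. This is an illustration only: it uses SHA-256 from the standard library rather than MD5, the salt strings are made up, and simple concatenation stands in for whatever scheme a given ID provider actually uses.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so equivalent spellings hash identically."""
    return email.strip().lower()


def plain_hash(email: str) -> str:
    """Unsalted SHA-256: the same email always yields the same digest."""
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()


def salted_hash(email: str, salt: str) -> str:
    """Salted SHA-256: the digest depends on both the email and the salt."""
    payload = f"{salt}:{normalize_email(email)}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


email = "name@example.com"
for_partner_a = salted_hash(email, salt="partner-a-secret")  # hypothetical salt
for_partner_b = salted_hash(email, salt="partner-b-secret")  # hypothetical salt

print(plain_hash(email) == plain_hash(" Name@Example.com "))  # True
print(for_partner_a == for_partner_b)  # False
```

Because the unsalted digest is stable, anyone who can guess the email can recompute it; the salted digests differ per recipient, so a value observed by one party cannot be matched against another party's data without the salt.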
Here are some benefits of using these solutions with universal IDs.
- Preventing data scraping: One of the primary benefits of salting is that it prevents unauthorized scraping of data from the bidstream—a practice that was once common and posed significant privacy risks. By encrypting IDs, vendors can protect user information from being easily accessed or misused.
- Controlled decryption: To legitimately decrypt user IDs so that they can be used for advertising use cases, one needs access to so-called decryption keys, which make it possible to reverse the encryption process. These keys are rotated on a regular basis, meaning they are replaced to prevent theft or unauthorized access. Access to the keys and the rotation intervals are controlled by the ID vendor, which usually requires payment. However, this model raises concerns about the true extent of privacy protection, as it doesn’t eliminate the risk of data misuse but rather makes it more difficult and expensive.
Universal ID use cases: how do they REALLY handle them?
Now that we know how these IDs work and that they can provide privacy protection to the user, we need to better understand their use cases and how they actually solve them. There is a common perception that simply enabling universal IDs will directly and instantly boost revenue for publishers and solve all of their identity problems. However, the reality is more complex.
- Frequency capping: One potential benefit of universal IDs is the ability to manage frequency capping, that is, to ensure users aren’t overexposed to the same ads. However, this is not typically recommended with probabilistic IDs due to their lower accuracy, which can lead to either over-exposure or under-delivery of ads. You might want to rely on first-party cookies or publisher-provided IDs for this task instead; doing so will already increase the effectiveness of the ads but probably won’t match the performance of third-party cookies. Please note that the same limitations apply to publishers' IDs; thus, depending on implementation, capping might not always work across domains, even among sites owned by a single publisher. There are exceptions to this rule, such as Related Website Sets or publishers' custom technology for cross-domain identification, but these should be evaluated on a case-by-case basis.
- Audience profiling:
- Building audience profiles: The true value of a universal ID comes from building detailed audience profiles. Publishers can achieve this by investing in robust first-party data collection and analysis, but this requires significant effort and infrastructure. Once they possess reliable information about their users, they have to share it in a way that’s useful for the buy side. Universal IDs are one such way, but there are many others, such as first-party cookies, PPID (publisher-provided ID), PPS (publisher-provided signals), Secure Signals, and SDA (seller-defined audiences). A great advantage of universal IDs over these tools is that they work across domains, so marketers can link a publisher’s data with other data sources.
- Risks of combining solutions: Linking data, however, poses risks for publishers. Sharing audience data through open pipes like SDA in parallel with universal IDs can create vulnerabilities. If SDA data is linked to a decrypted ID or even a third-party cookie (where still possible), it could allow for first-party data scraping, undermining the very privacy protections that universal IDs are supposed to enhance. It may also lower the publisher's revenue from its inventory, as its data may be activated elsewhere.
- Retargeting challenges:
- Adoption on the buy-side: For retargeting to be effective, both the publisher and the advertiser must adopt the same universal ID. Even when IDs are correctly assigned, small and medium-sized ecommerce businesses may find it difficult to implement, use, and benefit from these IDs without a universally adopted market standard. The hurdle is higher still for deterministic IDs, as the user needs to log in to both platforms using the same email address. This can lead to small database overlaps even with above-average log-in rates. To illustrate with data, let’s take the perspective of an online store that wants to retarget its users. We assume that 50% of the store's users are logged in, but only 20% are logged in on the publisher's site. Furthermore, 80% of the store's users also visit that publisher’s site, and 40% of those users use the same email address on both platforms. This leads to a deterministic match rate of 3.2% of total ecommerce traffic (0.5 x 0.8 x 0.2 x 0.4). Only this small fraction of users can be identified accurately and retargeted with a deterministic ID, and the ratio might decrease further for publishers that are not market leaders. Note that we assumed an 80% traffic overlap in this example!
- Market fragmentation: The lack of standardization across different IDs leads to inconsistent retargeting efforts. Smaller players may struggle to adopt and manage multiple ID systems, resulting in less effective campaigns. This creates a significant dilemma for publishers: which ID providers should they bother onboarding, and how should they evaluate them? The answer seems to be “it depends”, and that’s the crux of the issue. Each publisher needs to answer these questions based on its specific circumstances, taking its first-party data strategy into account. The factors each publisher weighs can vary widely, contributing further to the fragmentation of the market.
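The match-rate arithmetic from the retargeting example can be reproduced with a short calculation. The function and parameter names below are ours, and the sketch keeps the example's simplifying assumption that the four rates are independent and can simply be multiplied.

```python
def deterministic_match_rate(
    store_login_rate: float,
    audience_overlap: float,
    publisher_login_rate: float,
    same_email_rate: float,
) -> float:
    """Share of the store's traffic that can be matched deterministically."""
    return store_login_rate * audience_overlap * publisher_login_rate * same_email_rate


# Figures from the example: 50% store log-ins, 80% overlap,
# 20% publisher log-ins, 40% using the same email on both sites.
rate = deterministic_match_rate(0.5, 0.8, 0.2, 0.4)
print(f"{rate:.1%}")  # 3.2%

# Halving the audience overlap halves the addressable share.
print(f"{deterministic_match_rate(0.5, 0.4, 0.2, 0.4):.1%}")  # 1.6%
```

Because the factors multiply, a modest drop in any one of them shrinks the addressable pool proportionally, which is why the result is so sensitive to the generous 80% overlap assumed in the example.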
Overall, while universal IDs provide tools that can enhance targeting and measurement, they do not inherently guarantee a boost in revenue. Success relies on a well-integrated approach that includes strong first-party data, standardized ID adoption, and a careful assessment of the limitations and challenges involved. Even with these measures in place, the benefits may not be substantial enough to justify the ongoing costs of implementing and maintaining such a solution.
Challenges and DSP responses to multiple probabilistic IDs
Given the aforementioned market fragmentation, a crucial issue that is often overlooked is the addition of more IDs into the bidstream. While this might generate some added value, it currently does not always outweigh the increased complexity and associated costs. The implementation of universal IDs introduces significant challenges for demand-side platforms (DSPs), especially when handling bid requests containing multiple IDs.
We asked Radek Szafranek, COO of Justtag Group and an AdTech veteran and consultant, for his insights on the matter:
“Choosing an identifier on which to activate and personalize the message to the user is not easy. There is already fragmentation and regionalization of solutions. An additional difficulty that cannot be overlooked is the readiness of the platforms to handle these identifiers.
To date, we have operated in a cookie sync-based model, where individual DSPs rely on their own identifiers. With the progressive paradigm shift, we are faced with a gradual multiplication of the number of IDs available in the bidrequest. It is not difficult to guess what this translates into—the difficulty of bidding increases exponentially.
Imagine that the DSP receives 500K QPS (requests per second) of traffic from integrated partners. For each request, the DSP should respond with a bid or nobid in a maximum of tens of milliseconds. When platforms relied on their own ID, this was easier and optimized since even the segmentation provided by the Providers was using this ID.
Now it has become very complicated. Segments delivered to the platform may contain various IDs, even regionally (rather than globally). These IDs are additionally clustered using Identity Resolution Graphs, and the bidding itself must take into account several IDs—potentially different depending on the Advertisers' campaign setups. All this leads to, among other things, the need for a list of supported IDs, their waterfall (prioritization), and probably larger-scale sampling of traffic (in order to be able to respond to a bid before time out). This will be followed by software changes and investment in infrastructure.
Looking at it from this perspective, we can recognize the enormity of the challenges that platforms face in order to adapt to the new realities. Probably for these reasons, among others, many of the ongoing tests and campaigns do not yet provide meaningful results and do not show what the post-third-party world will look like.”
To summarize:
- Frequency capping: DSPs must decide how to maintain frequency capping when multiple IDs are present. Each ID might relate to different information about the user, complicating efforts to manage exposure effectively.
- Scoring bid value: Determining how to score bid value separately for different IDs is another challenge. Each ID could provide varying levels of user data, leading to inconsistencies in how bids are valued and prioritized. It might also encourage the practice recently termed ID bridging, which can inflate bid prices without sufficient transparency for the buyer.
- Profile stitching: DSPs face the decision of whether to stitch together profiles associated with multiple IDs. This process can impact campaign performance, as stitching might improve targeting accuracy or introduce errors that degrade effectiveness.
- Maintaining user privacy: Managing user privacy becomes increasingly complex when multiple IDs are involved. DSPs must ensure that they comply with privacy regulations and do not inadvertently expose user data by combining or mismanaging different IDs. As a user, if I decide to opt out from tracking by one of many IDs, will my data still be removed if it was previously assigned based on profile-stitching methods?
- Geographical coverage and local standards: Adding to the complexity, different universal IDs have varying geographical coverage. In some regions, local standards and regulations for user identification are emerging, requiring DSPs to adapt their operations accordingly. This increases the hassle for DSPs, who must align with these standards while managing the challenges of multiple IDs. We can already observe the impact of these factors in the significantly lower penetration of deterministic IDs in Europe, a result of GDPR and ePrivacy regulations.
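The waterfall (prioritization) of supported IDs mentioned in the quote can be sketched as follows. The sketch assumes a bid request shaped like OpenRTB's `user.eids` array, where each entry has a `source` and a list of `uids`; the source names and the priority order are placeholders, not real providers or a recommended ranking.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical priority list: earlier sources are preferred. Names are placeholders.
ID_WATERFALL = (
    "deterministic-id.example",
    "telecom-id.example",
    "probabilistic-id.example",
)


def pick_id(
    bid_request: dict, waterfall: Sequence[str] = ID_WATERFALL
) -> Optional[Tuple[str, str]]:
    """Return (source, id) for the highest-priority supported ID, or None."""
    eids = bid_request.get("user", {}).get("eids", [])
    by_source = {}
    for entry in eids:
        uids = entry.get("uids") or []
        if uids and uids[0].get("id"):
            by_source[entry.get("source")] = uids[0]["id"]
    for source in waterfall:
        if source in by_source:
            return source, by_source[source]
    return None  # no supported ID present: bid without an ID or pass


request = {
    "user": {
        "eids": [
            {"source": "probabilistic-id.example", "uids": [{"id": "p-1"}]},
            {"source": "deterministic-id.example", "uids": [{"id": "d-1"}]},
        ]
    }
}
print(pick_id(request))
```

Selecting a single ID per request keeps the lookup cheap at high query volumes, at the cost of ignoring whatever the lower-priority IDs know about the user; stitching them together instead is the trade-off described under profile stitching.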
Taking the above into account, alternative technologies such as contextual targeting, Privacy Sandbox, and Seller-Defined Audiences offer advantages that, in some cases, reduce the burden of managing multiple ID systems.
Expert insight
To further enrich our discussion, we reached out to Mikołaj Twardowski—an expert in data-driven marketing and the Director of Analytics at Wirtualna Polska—for his insights on the matter:
“After years of delaying the blocking of cookies in Chrome, Google has decided to hand over the decision to disable tracking technologies to users, which initially caused euphoria and then a series of questions about what the future of the ecosystem will look like. The transition to a cookie-free world will depend on how Google implements the blocking mechanism, with estimates ranging from 50% to 90% of users blocking cookies in Chrome. In this context, one thing is certain: a significant portion of traffic will be cookieless and it is necessary to implement an alternative approach to digital marketing.
In this evolving landscape, there isn't a one-size-fits-all solution for advertising. It seems that a reasonable approach is to combine various available solutions, such as the use of universal identifiers (deterministic IDs are recommended due to their quality), the use of Privacy Sandbox tools and context-based strategies. Implementing such a multi-faceted approach will be complex but necessary.
To effectively navigate this change, it is recommended to start with established tactics such as universal identifier-based advertising in cookieless environments (e.g. Safari and Firefox). This strategy not only provides a valuable testing ground, but also leverages a significant traffic source, contributing to the effectiveness of advertising campaigns.
For those looking to explore deterministic identifiers in this changing landscape, ceeid.eu and the technology behind it offers a solid starting point.”
Conclusion
In conclusion, universal IDs represent a significant advancement in the quest to maintain user identity and deliver personalized advertising in a post-cookie world. They have the potential to satisfy use cases ranging from frequency capping and audience profiling to retargeting. However, the challenges associated with universal IDs, such as managing multiple IDs, ensuring data privacy, and assessing their utility, are often overlooked or not discussed enough. Many vendors in the market tend to gloss over these issues, with everyone pretending that everything is fine, much like in the story of "The Emperor's New Clothes," where everyone pretends to see the emperor’s new outfit, despite its non-existence, out of fear of being the only one to speak the truth. Similarly, the industry may be avoiding a frank discussion about the limitations and complexities of universal IDs, leaving some stakeholders unaware of the real challenges they face.