Programmatic advertising and Real-Time Bidding (RTB) form one of the most exciting areas in the tech industry. With a global network of servers processing billions of auctions per second and delivering targeted advertising to billions of users, the sector stands as the largest Machine Learning application in history.
In this initial episode of our two-part article, we dive into the evolution of Machine Learning within programmatic advertising. We’ll navigate the challenges that DSPs encounter today and examine how they harness Machine Learning to overcome these obstacles.
Early Days of RTB
The RTB boom started more than a decade ago. The first event that made waves was Google’s acquisition of the DSP Invite Media. As a thought leader in the industry, Google set the course for the advertising landscape: transitioning from keyword auctions, it foresaw the emergence of real-time auctions for hyper-targeted ads.
Amid this evolution, and looking specifically at Machine Learning, the systems of the time were more rudimentary and faced tighter computational limits. The most common algorithms for estimating a user’s Click-Through Rate (CTR) centered on decision trees or linear models such as logistic regression. The challenge was, and still is, producing accurate estimates for each user within milliseconds, all while processing massive data volumes.
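To make this concrete, here is a minimal, illustrative sketch of logistic-regression CTR inference using the hashing trick. The feature names, bucket count, and weights are assumptions for illustration, not taken from any production system; the point is that serving reduces to a sparse dot product plus a sigmoid, which is what makes millisecond latency budgets feasible.

```python
# Illustrative sketch: logistic-regression CTR inference with hashed features.
# All feature names and parameters are hypothetical.
import numpy as np

NUM_BUCKETS = 2 ** 20  # size of the hashed feature space (assumption)

def hash_features(bid_request: dict) -> list[int]:
    """Map raw categorical features (app, country, device, ...) to hashed indices."""
    return [hash(f"{k}={v}") % NUM_BUCKETS for k, v in bid_request.items()]

def predict_ctr(weights: np.ndarray, bias: float, bid_request: dict) -> float:
    """Inference is a sparse dot product and a sigmoid -- cheap enough for millisecond budgets."""
    score = bias + sum(weights[i] for i in hash_features(bid_request))
    return float(1.0 / (1.0 + np.exp(-score)))

# Weights would come from offline training on logged impressions and clicks;
# zeros here simply make the example runnable.
weights = np.zeros(NUM_BUCKETS)
print(predict_ctr(weights, bias=-4.0,
                  bid_request={"app": "news_app", "country": "DE", "os": "android"}))
```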
Gaining a competitive advantage meant drawing ideas from the well-established Recommender System community. The eCommerce sector and content platforms had become adept at leveraging Machine Learning for targeted recommendations and for understanding user behavior. This was reflected in the famous 2009 Netflix Prize, which offered a staggering $1 million reward for improving movie recommendations by at least 10%.
Data-Driven Success and Emerging Leaders
The user-centric approach permeated the RTB community, facilitated by far fewer privacy restrictions than today. Advertisers readily shared their user data with third-party services to improve the performance of their marketing campaigns and deliver personalized experiences. Numerous websites were packed with tracking pixels, sending user-related data to various entities.
The combination of abundant user data with strong recommender algorithms propelled the success of retargeting companies. In Europe, Criteo and Sociomantic swiftly rose to prominence.
However, using Machine Learning algorithms to target new users turned out to be much more difficult, and DSPs had a hard time challenging Google’s dominance in new-user targeting, where Google held a massive advantage thanks to substantially more user data. Third-party data providers also flourished, offering additional user data and audience segments and effectively helping DSPs optimize their targeting strategies.
In those days, most DSPs operated as complete black boxes with a business model based on arbitrage. Advertisers had little to no control over budget allocation, while the black-box setup fostered extensive cross-pollination of advertisers’ data. This resulted in pervasive mistrust towards media partners such as networks, DSPs, platforms, and other suppliers, casting a shadow on the industry.
Present Challenges and Their Implications
Since then, the industry has made a lot of progress and delivered remarkable innovations. Numerous facets have evolved. Today, we see three predominant challenges confronting DSPs when using large-scale Machine Learning for RTB, each with its own consequences: privacy protection, the advent of first-price auctions, and escalating computing power.
Reshaping Algorithms in a Privacy-Conscious World
The privacy protection mechanisms increasingly implemented today complicate user tracking. Audience building and user segmentation become more challenging, affecting most ad impressions. As a result, DSPs have to rethink their bid price optimization algorithms to cope with drastic changes in the available data.
Retargeting and all recommender-system-based approaches become much harder. Algorithms need to shift from user-centric approaches to contextual information and cohort-based methods, which can mean fundamentally changing the architecture of the Machine Learning stack. For instance, Apple’s SKAdNetwork (SKAN) framework on iOS removes impression-level event mapping, i.e. connecting an event such as an install to the impression that drove it. In other words, labeling training data, the bread and butter of every supervised algorithm, becomes a lot trickier, and ad tech platforms are now fundamentally revisiting their approach to supervised learning. DSPs therefore have to rethink the data pipelines that support Machine Learning and Data Science.
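As a rough illustration of the shift in inputs, the sketch below contrasts a privacy-restricted feature set (contextual signals only) with a legacy, user-centric one. Every field name and the privacy_restricted flag are hypothetical and do not reflect any specific platform’s schema.

```python
# Hypothetical sketch of moving feature construction away from user-centric signals.
def build_features(bid_request: dict, privacy_restricted: bool) -> dict:
    # Contextual signals: available regardless of user-level tracking.
    features = {
        "app_bundle": bid_request.get("app_bundle"),
        "ad_format": bid_request.get("ad_format"),
        "country": bid_request.get("country"),
        "device_type": bid_request.get("device_type"),
        "hour_of_day": bid_request.get("hour_of_day"),
    }
    if not privacy_restricted:
        # User-centric signals: increasingly unavailable (e.g. no device ID
        # without consent), so models can no longer rely on them by default.
        features["device_id"] = bid_request.get("device_id")
        features["audience_segment"] = bid_request.get("audience_segment")
    return features

request = {"app_bundle": "com.example.news", "ad_format": "banner",
           "country": "DE", "device_type": "phone", "hour_of_day": 20}
print(build_features(request, privacy_restricted=True))
```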
Precision and Dynamics with First-Price Auctions
Next, the rise of first-price auctions intensifies the pressure on DSPs to determine the right price. The safety net of second-price auctions has disappeared, demanding heightened accuracy from Machine Learning systems to avoid overspending.
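One common answer is bid shading: estimating the probability of winning at each candidate price and bidding where the expected surplus is highest. The sketch below illustrates the idea under a made-up logistic win-rate curve; in practice the win-rate model would be learned from auction logs, and the parameters here are pure assumptions.

```python
# Minimal bid-shading sketch for first-price auctions.
# The win-rate curve below is an illustrative stand-in, not a fitted model.
import numpy as np

def win_probability(bid, center: float = 2.0, steepness: float = 3.0):
    """Hypothetical market model: probability of winning as a function of the bid (CPM)."""
    return 1.0 / (1.0 + np.exp(-steepness * (bid - center)))

def shaded_bid(value: float, grid_step: float = 0.01) -> float:
    """Search for the bid that maximizes expected surplus = P(win) * (value - bid)."""
    candidates = np.arange(0.0, value, grid_step)
    surplus = win_probability(candidates) * (value - candidates)
    return float(candidates[np.argmax(surplus)])

# Example: an impression valued at a 4.00 CPM; the optimal first-price bid lands below that value.
print(shaded_bid(value=4.00))
```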
Today, DSPs have to track and monitor market dynamics with more comprehensive data analysis. Not only bids and won impressions are relevant; lost auctions have also become vital for tracking and analyzing market behavior. And because errors can be costly, Machine Learning algorithms need to be carefully monitored, which puts far more pressure on reliable Machine Learning infrastructure with a focus on health checks and model performance. Infamous examples of Machine Learning systems going rogue are Zillow’s pricing algorithms and Unity’s bad data ingestion, both of which led to massive losses for the companies.
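A minimal sketch of such a health check might compare predicted win rates against observed auction outcomes and raise an alert when they drift apart. The tolerance threshold and the per-campaign scope mentioned in the comments are illustrative assumptions, not a prescribed setup.

```python
# Illustrative health check: flag miscalibration between predicted win rates
# and observed auction outcomes. The threshold is a made-up example value.
import numpy as np

def calibration_alert(predicted_win_probs: np.ndarray,
                      observed_wins: np.ndarray,
                      tolerance: float = 0.05) -> bool:
    """Return True if the average predicted win rate drifts from the observed one."""
    gap = abs(predicted_win_probs.mean() - observed_wins.mean())
    return gap > tolerance

# Example on logged auctions (1 = won, 0 = lost); in production this would run
# continuously, e.g. per campaign or exchange, as part of the monitoring pipeline.
preds = np.array([0.12, 0.30, 0.08, 0.55, 0.20])
outcomes = np.array([0, 1, 0, 1, 0])
if calibration_alert(preds, outcomes):
    print("Win-rate calibration drift detected -- investigate or retrain the model.")
```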
Complex Algorithms and Expanding Computing Resources
Lastly, a notable shift has happened in recent years, driven by increasing computational resources. Cloud computing opens up access to massive numbers of servers, while GPUs have become mainstream tools in the Machine Learning community. New libraries and tools simplify resource utilization for Machine Learning Researchers and Data Scientists. This shift entails moving away from linear models towards non-linear models such as Deep Neural Networks and Gradient Boosting Decision Trees. These algorithms excel at uncovering complex patterns within vast amounts of data, although they come at a cost.
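The sketch below illustrates why such models pay off: on synthetic data whose click signal depends on a feature interaction, a gradient-boosted tree model picks up a pattern that a plain linear model cannot. The dataset, model choices, and hyperparameters are purely illustrative.

```python
# Illustrative comparison of a linear model vs. a gradient-boosted tree model
# on synthetic CTR-like data with a non-linear ground truth.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 20))
# Clicks depend on a feature interaction that a linear model cannot capture.
logits = 1.5 * X[:, 0] * X[:, 1] - 1.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("gradient boosting", HistGradientBoostingClassifier())]:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```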
In response, Machine Learning teams must cultivate knowledge in cloud computing and adopt tools traditionally rooted in the DevOps domain. Accommodating such complex algorithms requires both the theoretical foundation and practical experience with the libraries needed to implement them at scale. Consequently, Machine Learning Engineers and Data Scientists need a much broader skill set, catering to the evolving technology.
This surge in skills is mirrored on the advertisers’ side as well. With internal Data Science teams becoming commonplace, companies better understand their data, goals, and requirements for successful advertising campaigns. This leads to closer communication and collaboration between DSPs’ Machine Learning teams and their clients.
Stay tuned for the upcoming episode of this article, where we dive deep into how Kayzen tackles the challenges discussed here through the lens of transparency, control, and performance.