FLoC

Evaluation of Cohort Algorithms for the FLoC API

Last year, the Google Chrome team posted an explainer for Federated Learning of Cohorts (FLoC), a privacy-preserving API designed to support interest-based advertising. The central idea of the FLoC proposal is that users are assigned to cohorts whose members have similar browsing behavior, and that the cohort identifier can serve as a privacy-first replacement for pseudonymous identifiers when serving relevant ads.

Chrome laid out the framework and principles that govern FLoCs and solicited industry engagement on an implementation. The usefulness of FLoC depends on the quality of the clustering algorithm employed. If the clusters are homogeneous and coherent, we expect interest-based audience targeting based on cohorts to be comparable to targeting based on third-party cookies. If the clusters are diverse, we expect to see limited utility.

Google Research, in collaboration with the Google Ads team, performed an initial evaluation of several standard clustering algorithms that abide by these core FLoC principles, to better understand the privacy-utility trade-offs for interest-based advertising. These algorithms are expected to create homogeneous user cohorts that preserve anonymity when applied to any dataset representing users and their interests. To test this, we ran our evaluations both on Google's interest-based advertising data and on publicly available datasets. The public datasets include movies rated by users and over a million songs listened to by users. On each dataset, the algorithms created cohorts whose members were significantly more alike than those produced by a baseline algorithm that randomly assigns users to cohorts.
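To make the comparison against a random baseline concrete, here is a minimal sketch in Python. It is not the method from the whitepaper: the synthetic interest vectors, the `toy_cohort` assignment rule, and the choice of average pairwise cosine similarity as the homogeneity measure are all assumptions made for illustration. The sketch also reports the smallest cohort size as a rough stand-in for an anonymity level.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_intra_cohort_similarity(vectors, cohort_ids):
    """Average pairwise cosine similarity over users that share a cohort."""
    cohorts = {}
    for vec, cid in zip(vectors, cohort_ids):
        cohorts.setdefault(cid, []).append(vec)
    total, pairs = 0.0, 0
    for members in cohorts.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                total += cosine(members[i], members[j])
                pairs += 1
    return total / pairs if pairs else 0.0


def min_cohort_size(cohort_ids):
    """Size of the smallest cohort (a rough proxy for the anonymity level)."""
    counts = {}
    for cid in cohort_ids:
        counts[cid] = counts.get(cid, 0) + 1
    return min(counts.values())


def make_users(rng, n_groups=4, users_per_group=50):
    """Synthetic users (an assumption): each group favors two of 2*n_groups topics."""
    users = []
    for g in range(n_groups):
        for _ in range(users_per_group):
            vec = [rng.uniform(0.0, 0.2) for _ in range(2 * n_groups)]
            vec[2 * g] += 1.0
            vec[2 * g + 1] += 1.0
            users.append(vec)
    return users


def toy_cohort(vec):
    """Hypothetical stand-in for a clustering algorithm: pick the strongest topic pair."""
    pair_sums = [vec[i] + vec[i + 1] for i in range(0, len(vec), 2)]
    return pair_sums.index(max(pair_sums))


def demo(seed=0):
    rng = random.Random(seed)
    users = make_users(rng)
    clustered = [toy_cohort(u) for u in users]
    baseline = [rng.randrange(4) for _ in users]  # random cohort assignment
    return {
        "clustered_similarity": mean_intra_cohort_similarity(users, clustered),
        "random_similarity": mean_intra_cohort_similarity(users, baseline),
        "clustered_min_cohort_size": min_cohort_size(clustered),
    }


if __name__ == "__main__":
    print(demo())
```

On this toy data the interest-based assignment yields a higher intra-cohort similarity than the random baseline, which is the shape of the comparison described above. Real evaluations would use actual user data and the metrics defined in the whitepaper.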

Overall, our initial evaluations demonstrate that FLoCs can provide a strong signal for interest-based audience targeting use cases (e.g., demographics, affinity, in-market) even at high anonymity levels. This work is just a first step in evaluating the effectiveness of FLoCs. The results are encouraging, and there is still ample room for further innovation and experimentation with the FLoC signal, such as pairing FLoC IDs with contextual signals to derive interests.

We plan to continue these efforts across other Privacy Sandbox APIs and to evaluate on additional third-party (3P) datasets. We hope that the ideas, algorithms, and results presented here can spark a greater discussion across the industry on the Federated Learning of Cohorts (FLoC) API and the broader concepts of the Privacy Sandbox.

More details are available in the linked whitepaper.

UPDATE 1: For the proprietary dataset evaluation, we did not use Chrome sync data. We used data collected from the Display Network through our publisher ad serving tags and advertiser conversion pings. We hope other DSPs can use a similar setup to evaluate results.

UPDATE 2: Published another clustering algorithm, k Random Centers.