
Tyler
Interesting. The court document says, “Google will not be required to share granular, query-level data with advertisers or provide them with more access to such data. Nor will it have to restore an ‘exact match’ keyword bidding option.” More black box search data.
And for the search data it has to share with competitors, “Google will have to make available to Qualified Competitors certain search index and user-interaction data, though not ads data…”
Qualified Competitor is likely not OpenAI. (p 103)
I’ll have to read again, but…
(Starting on page 129) Google has to share specific datasets with Qualified Competitors at a cost. On pages 145-146, Google has to do a one-time dump of its Search Index data: the DocID for each web page, the URL mapping, when Google first saw it, when Google last crawled it, spam scores, and whether it’s mobile or desktop.
Court rejected sharing popularity and quality signals aka ranking signals (pp 143-144). Court said they’re “largely a product of engineering and innovation” and Google shouldn’t have to give those up.
For User-side Data (pages 157-158), Google has to share Glue data (what people click on, what they hover over, how long they stay on pages) and RankEmbed training data a couple of times over the 10 years. Google doesn’t have to share the models and signals built from the data, but it does need to share click patterns.
Knowledge Graph was rejected (pages 149-151). The court said Google’s Knowledge Graph wasn’t built from the user data Google got through its illegal distribution deals, so competitors don’t get it. Funny because I actually think this is Google’s best asset.
Google has to anonymize all this data for privacy (p. 164). But when you anonymize data to protect users, you destroy most of its value. The court references when Google had to share data in Europe under the Digital Markets Act: the privacy filters removed 99% of searches from the dataset (p. 161).
So if I’m reading this right…
The court orders Google to share user click data to help competitors compete, then admits privacy filters will remove 99% of it. Google gets to keep its data monopoly and look cooperative, and nothing changes. Ha!