So if Amazon sues Google, claiming that it is being disadvantaged in search rankings, a court should be able to force Google to log all search activity, even when users delete it?
Maybe you misunderstood. The data is required to be retained, but there is no requirement to make it accessible to the opposition. OpenAI already has this data and presumably mines it themselves.
Courts generally require far more data to be retained than shared, even if this ask is much more lopsided.
If Amazon sues Google, a legal obligation to preserve all evidence reasonably related to the subject of the suit attaches immediately when Google becomes aware of the suit, and, yes, if there is a dispute about the extent of that obligation and/or Google's actual or planned compliance with it, the court can issue an order relating to it.
>At Google's scale, what would be the hosting costs of this I wonder. Very expensive after a certain point, I would guess.
Which would be chump change[0] compared to the costs of an actual trial with multiple lawyers/law firms, expert witnesses and the infrastructure to support the legal team before, during and after trial.
> It can be just anonymised search history in this case.
Depending on the exact issues in the case, a court might allow that (more likely, it would allow only turning over anonymized data in discovery, if the issues were such that there was no clear need for more), but generally the obligation to preserve evidence does not include the right to edit evidence or replace it with reduced-information substitutes.
We found out that one was a bad idea back in 2006, when AOL thought "what could the harm be?" about turning over anonymised search queries to researchers.
That sounds impossible to do well enough without being accused of tampering with evidence.
Just erasing the userid isn’t enough to actually anonymize the data, and if you scrubbed location data and entities out of the logs you might have violated the court order.
Though it might be in our best interests as a society, we should probably be honest about the risks of this tradeoff; anonymization isn’t some magic wand.
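To make the "erasing the userid isn't enough" point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the log entries, the directory, and the function names `pseudonymize` and `candidates` are invented, and real re-identification uses far richer signals than two matching words. It only shows that once queries stay linked under an opaque token, the query text itself can point back to a person.

```python
from collections import defaultdict

def pseudonymize(log):
    """Replace each user ID with an opaque token, keeping per-user linkage."""
    tokens = {}
    released = []
    for user_id, query in log:
        token = tokens.setdefault(user_id, f"anon-{len(tokens) + 1}")
        released.append((token, query))
    return released

def candidates(released, directory):
    """For each token, list directory entries whose surname and town
    both appear somewhere in that token's combined queries."""
    words = defaultdict(set)
    for token, query in released:
        words[token].update(query.lower().split())
    return {
        token: [p["name"] for p in directory
                if p["surname"] in seen and p["town"] in seen]
        for token, seen in words.items()
    }

# Hypothetical toy data, invented for this example.
LOG = [
    ("alice@example.com", "plumbers in springfield"),
    ("alice@example.com", "smith family reunion ideas"),
    ("bob@example.com", "weather tomorrow"),
]
DIRECTORY = [
    {"name": "A. Smith", "surname": "smith", "town": "springfield"},
    {"name": "B. Jones", "surname": "jones", "town": "shelbyville"},
]

released = pseudonymize(LOG)
matches = candidates(released, DIRECTORY)
```

The released log contains no email addresses, yet `matches` still narrows one token down to a single directory entry. Scrubbing the place and surname terms out of the queries would block this match, but that is exactly the kind of editing that could run afoul of a preservation order.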