
Because you can set multiple breakpoints with Anthropic's approach, whereas with OpenAI you only get implicit breakpoints where previously sent prompts ended.

For example: a user sends a large number of tokens, like a file, plus a question, and then changes the question.
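
A rough sketch of how that case looks with Anthropic's explicit breakpoints (Anthropic Python SDK; the model name, token counts and helper are illustrative, and caching only applies above a minimum prompt size): mark the file content block with cache_control so a second call with a different question can still read the cached file tokens.

    import anthropic

    client = anthropic.Anthropic()

    def ask(file_text: str, question: str):
        return client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": [
                    # The large, stable part of the prompt: end it with a cache breakpoint.
                    {"type": "text", "text": file_text,
                     "cache_control": {"type": "ephemeral"}},
                    # The part that changes between calls sits after the breakpoint.
                    {"type": "text", "text": question},
                ],
            }],
        )

    file_text = "<contents of a large file>"  # placeholder
    ask(file_text, "Summarise this file.")            # call #1 writes the cache for the file
    ask(file_text, "List the functions it defines.")  # call #2 reads it; only the new question is uncached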



I thought OpenAI would still handle this case? Their cache would work up to the end of the file, and you would then pay for uncached tokens for the user's question. Have I misunderstood how their caching works?


Not if call #1 is the file + the question and call #2 is the file + a different question, no.

If call #1 is the file, call #2 is the file + the question, and call #3 is the file + a different question, then yes.

And consider that "the file" can just as easily be a lengthy chat history, especially once the cache TTL has elapsed.
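
A minimal sketch of that call ordering with the OpenAI Python SDK (model name and placeholder file are illustrative; OpenAI's prefix caching is automatic, and usage.prompt_tokens_details.cached_tokens is one way to see how much of the prompt was served from cache):

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        # How much of this prompt was served from the prefix cache.
        print("cached prompt tokens:", response.usage.prompt_tokens_details.cached_tokens)
        return response

    file_text = "<a large file, well past the caching minimum>"  # placeholder

    ask(file_text)                                    # call #1: the file alone seeds the prefix
    ask(file_text + "\n\nWhat does this file do?")    # call #2: the file prefix should be cached
    ask(file_text + "\n\nAre there any bugs here?")   # call #3: same cached prefix, new question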


I vibe-coded up a quick UI for exploring this: https://tools.simonwillison.net/prompt-caching

As far as I can tell it will indeed reuse the cache up to the point where the prompts diverge, so this works:

Prompt A + B + C - uncached

Prompt A + B + D - uses cache for A + B

Prompt A + E - uses cache for A
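
Here's a toy model of that behaviour (not any provider's actual implementation, and character-based rather than token-based, ignoring block-size rounding): the cache hit covers the longest shared prefix, and everything after the first divergence is uncached.

    def shared_prefix_len(previous_prompt: str, new_prompt: str) -> int:
        """How far the new prompt matches a previously seen prompt."""
        n = 0
        for a, b in zip(previous_prompt, new_prompt):
            if a != b:
                break
            n += 1
        return n

    A, B = "A" * 2000, "B" * 2000
    C, D, E = "C question", "D question", "E question"

    print(shared_prefix_len(A + B + C, A + B + D))  # covers A + B
    print(shared_prefix_len(A + B + C, A + E))      # covers only A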



