
miven on May 19, 2024 \| on: Llama3 implemented from scratch

I don't get your point. How is what you're suggesting here different from a few papers we already have on KV cache pruning methods, like [1]?

[1] https://arxiv.org/abs/2305.15805