
I don't get your point. How is what you're suggesting different from the existing papers on KV cache pruning, such as [1]?

[1] https://arxiv.org/abs/2305.15805


