tarruda on Feb 23, 2024 | on: Phind-70B: Closing the code quality gap with GPT-4...
This video [1] shows someone running it at a 4-bit quant in 48 GB of VRAM. I suspect you need 4x that to run at full f16 precision, or approx. 3 H100s.

[1] https://www.youtube.com/watch?v=dJ69gY0qRbg
jxy on Feb 23, 2024
Yeah, 4-bit would take 35 GB at least; 16-bit would be 140 GB. I'm more interested in how Phind is serving it, but I guess that's their trade secret.
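
A minimal back-of-the-envelope sketch (not from the thread) of where those numbers come from, assuming a 70B-parameter model and counting weight memory only, with KV cache and activation overhead ignored:

    # Rough VRAM estimate for model weights alone (sketch, weights only).
    def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
        """Approximate memory in GB needed just to hold the weights."""
        bytes_total = num_params * bits_per_param / 8
        return bytes_total / 1e9

    if __name__ == "__main__":
        params = 70e9  # assumed parameter count for a 70B model
        for bits, label in [(16, "fp16"), (8, "int8"), (4, "4-bit quant")]:
            print(f"{label:>12}: ~{weight_memory_gb(params, bits):.0f} GB")
        # fp16        -> ~140 GB
        # int8        -> ~70 GB
        # 4-bit quant -> ~35 GB (leaves headroom for KV cache in 48 GB of VRAM)

The fp16 figure of ~140 GB is for weights alone; serving also needs memory for the KV cache and runtime overhead, which is why multiple 80 GB H100s come up in the discussion.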