
A 4060 Ti with 16GB of memory costs around $700-800 over here. Can I connect two of these to get 32GB of memory and run the bigger models (13B stock or 30B quantized)?
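For context, what I have in mind is letting a framework shard the weights across both cards. A rough sketch with Hugging Face transformers + accelerate (the model name is just an example, and I'm assuming device_map="auto" will split the layers across both GPUs):

    # Sketch: shard one model across all visible GPUs via accelerate.
    # Requires `pip install transformers accelerate`; model is an example.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "meta-llama/Llama-2-13b-hf"  # example 13B checkpoint

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16,  # ~26GB of fp16 weights over 2x16GB cards
        device_map="auto",          # places layers on cuda:0 and cuda:1
    )

    inputs = tokenizer("Hello", return_tensors="pt").to("cuda:0")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))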


One additional caveat worth considering: LLM inference is typically memory-bandwidth bound, and the 4060 Ti is infamous for how deeply Nvidia cut its memory system. The 3060 Ti had a 256-bit bus capable of 448GB/s of memory bandwidth; the 4060 Ti has a paltry 128-bit bus that can only do 288GB/s, about 64% of the previous generation.

For comparison, an RTX 3090 has 935GB/s of memory bandwidth. As the other person mentioned, that would likely be a much better card if you can find a used one for a reasonable price... but that's just my opinion.
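To put rough numbers on that: during single-stream decoding, every generated token has to stream essentially all of the weights through the memory bus, so bandwidth divided by model size gives a back-of-envelope ceiling on tokens/s (a sketch, assuming a 13B model in fp16 at ~26GB):

    # Back-of-envelope decode ceiling: tokens/s <~ bandwidth / model size,
    # since each generated token streams the full weights from memory.
    def max_tokens_per_s(bandwidth_gb_s, model_gb):
        return bandwidth_gb_s / model_gb

    model_gb = 26.0  # 13B params at fp16, ~2 bytes each
    for name, bw in [("4060 Ti", 288), ("3060 Ti", 448), ("RTX 3090", 935)]:
        print(f"{name}: ~{max_tokens_per_s(bw, model_gb):.0f} tokens/s ceiling")

That works out to roughly 11, 17, and 36 tokens/s respectively, which is why the bandwidth cut matters more than the extra VRAM might suggest.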


Thanks for the info!


A 3090 with 24GB of VRAM is $650 used on Facebook Marketplace where I live. I think that would be a better card, as you can use NVLink.

Alternatively, the 4000 series cards dropped NVLink and rely on PCI Express lanes for inter-GPU communication, so to get the full benefit of linking them as a consumer you’ll want a Threadripper (or maybe even an EPYC) CPU/motherboard combo with enough PCIe lanes.
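If you do end up with two cards, PyTorch can tell you whether they can actually talk peer-to-peer (over PCIe, or over NVLink on 3090s). A quick sketch, assuming a box with two or more NVIDIA GPUs and a CUDA build of PyTorch:

    # Check GPU-to-GPU peer access (PCIe P2P or NVLink).
    import torch

    n = torch.cuda.device_count()
    print(f"{n} CUDA device(s) visible")
    for i in range(n):
        for j in range(n):
            if i != j:
                ok = torch.cuda.can_device_access_peer(i, j)
                print(f"GPU {i} -> GPU {j}: peer access {'yes' if ok else 'no'}")

`nvidia-smi topo -m` will show the same information at the system level.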



