For basic usage, you can get away with a small graphics card or no graphics card at all (although it will be very slow).
The general rule of thumb: take the model size in billions of parameters (7B, 13B, 34B, 70B) and multiply it by 0.5 or 0.625. If the result (in GB) is smaller than the combined amount of system RAM and VRAM in your machine, you can run the model at 4-bit or 5-bit quantization respectively.
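Those multipliers aren't magic numbers: they're just bits-per-weight divided by 8 (4/8 = 0.5, 5/8 = 0.625). A quick sketch of the rule of thumb in Python (the function name is my own, and it ignores the small extra overhead for context/KV cache):

```python
def estimated_memory_gb(params_billion: float, bits: int) -> float:
    # Rough rule of thumb: billions of parameters times bytes-per-weight.
    # At 4-bit quantization each weight takes 0.5 bytes; at 5-bit, 0.625.
    return params_billion * bits / 8

# 7B at 4-bit needs roughly 3.5 GB; 70B at 5-bit roughly 43.75 GB.
print(estimated_memory_gb(7, 4))   # 3.5
print(estimated_memory_gb(70, 5))  # 43.75
```

So a machine with 16 GB RAM and 8 GB VRAM (24 GB combined) can comfortably fit a 34B model at 4-bit (~17 GB) but not a 70B (~35 GB).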
A jacked-up PC can do really well, and there is much fun to be had there.
...but you'd struggle to get close to even GPT-3.5, let alone 4, for generic tasks.
For custom tunes... sure, custom fine-tunes will beat generic OpenAI models. But that's a bit like pitting custom-tuned cars against street-legal manufacturer cars. It's an apples-to-oranges comparison.