Perhaps I should clarify: when I say SOTA GPU, I mean an RTX 3060 (midrange), which has 12 GB of VRAM and is a good starting point for climbing into the LLM market. I have been playing with LLMs for months now, and for long stretches had no access to a GPU due to daily scheduled rolling blackouts in our country.

Even so, I am able to produce insane results locally with open source efforts on my RTX 3060, and now I am starting to feel confident enough to take this to the next level, either by using the cloud (computerender.com for images) or something like vast.ai to run my inference (or even training, if I spend more time learning). If that goes well, I will feel confident taking the next step: getting an actual SOTA GPU. But that will only happen once I am sufficiently confident that the investment will be worthwhile. Regardless, apologies for suggesting the RTX 3060 is SOTA, but to me, in a third-world country, being able to run Vicuna-13B entirely on my 3060 at reasonable inference rates is revolutionary.


