
This is an excellent article that does a good job of detailing the factors involved here. But while it suggests several ways to reduce the cost of training models, I'm left with one big question at the end.

How much does it ultimately cost to train a model of this size, and is it feasible to do without VC funding (and cloud credits)?



Author here. Thanks for your comments!

In general - this is expensive stuff. Training big, accurate models just requires a lot of compute, and there is a "barrier to entry" wrt costs, even if you're able to get those costs down. I think it's similar to startups not really being able to get into the aerospace industry unless they raise lots of funding (e.g., Boom Supersonic).

Practically speaking though, for startups without funding or access to cloud credits, my advice would be to just train the best model you can with the compute resources you have available. Try to close your first customer with an "MVP" model. Even if your model is not good enough for most customers, you can close one, get some incremental revenue, and keep iterating.

When we first started (2017), I trained models that were ~1/10 the size of our current models on a few K80s on AWS. These models were much worse than our models today, but they helped us make incremental progress to get to where we are now.



