
As a bootstrapper I camped all night outside of Best Buy to get some 3090s.

Other tips not mentioned in the article:

1. Tune your hyperparameters on a subset of the data.

2. Validate new methods with smaller models on public datasets.

3. Fine-tune existing models instead of training from scratch (either public models or your previously trained ones).
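
For tip 3, here's a minimal sketch of fine-tuning a public pretrained model in PyTorch (just an illustration, assuming torchvision is available; my_dataloader and num_classes are placeholders for your own data and task):

    import torch
    import torch.nn as nn
    from torchvision import models

    # Start from a public pretrained backbone instead of a random init.
    model = models.resnet18(pretrained=True)

    # Freeze the backbone so only the new head is trained at first.
    for param in model.parameters():
        param.requires_grad = False

    # Swap in a classification head for your own task (num_classes is a placeholder).
    num_classes = 10
    model.fc = nn.Linear(model.fc.in_features, num_classes)

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    # my_dataloader stands in for your own DataLoader of (images, labels) batches.
    for images, labels in my_dataloader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

Once the head converges you can unfreeze the later blocks and keep training with a lower learning rate.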



Great hacks, although you have to be aware of the trade-offs:

1. If you choose the wrong subset, you'll end up at a non-optimal local minimum.

2. You still risk dead ends when scaling the model back up, and it lengthens the time it takes to find that out.

3. A lot of public models are trained on noisy or inaccurate datasets, so beware.

Overall you have to start somewhere though, and your points are still valid.


1. The small subset is there to test that your training pipeline works and converges to near-zero loss (see the sketch after this list).

2. Sure, but for most new hacks like mixup, RandAugment, etc. the results usually transfer over. The problem with deep learning is that many new results don't replicate, so it's good to have a way to quickly validate things.

3. The lower-level features are usually pretty data-agnostic and transfer well to new tasks.
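
For point 1, a minimal sketch of that sanity check (PyTorch; model and full_dataset are placeholders): overfit a tiny, fixed subset and confirm the loss actually drops to roughly zero. If the model can't memorize 64 examples, something in the pipeline is broken.

    import torch
    from torch.utils.data import DataLoader, Subset

    # Tiny, fixed subset of the training data (full_dataset is a placeholder).
    tiny = Subset(full_dataset, list(range(64)))
    loader = DataLoader(tiny, batch_size=64, shuffle=True)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = torch.nn.CrossEntropyLoss()

    # A working pipeline should drive this loss toward 0 within a few hundred epochs.
    for epoch in range(200):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        print(epoch, loss.item())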


1. Gradient descent almost always finds a non-optimal local minimum (it is not guaranteed to find a global minimum).


Isn’t the current best practice to train highly over-parametrized models to zero training error? That’d be a global optimum, no?

Unless we’re talking about the optimum of the test error.


If you find a zero of a non-negative function, I would call that a global minimum, yes.
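
Spelled out, that's just the definition of a global minimum applied to a non-negative loss:

    L(\theta) \ge 0 \;\; \forall \theta
    \quad \text{and} \quad
    L(\theta^{*}) = 0
    \;\; \Longrightarrow \;\;
    L(\theta^{*}) = \min_{\theta} L(\theta)

That's a global minimum of the training loss, which says nothing about test error.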


Yeah, but depending on the data you might get even worse results; selecting a subset that is actually representative is really important.


Would a random sample be representative? Statistically this seems to be the case for any large N. In fact it's not clear to me that any other sample would be more representative.


Many public datasets have skewed classes, so if you sample purely at random you won't get a good result. And N might not be big enough anyway.
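
One common fix is to draw the subset stratified by class rather than purely at random, so rare classes keep their proportions. A minimal sketch with scikit-learn (assuming y holds the class labels of the full dataset; names are placeholders):

    import numpy as np
    from sklearn.model_selection import train_test_split

    # y is the array of class labels for the full dataset (placeholder).
    indices = np.arange(len(y))

    # Keep 10% of the data while preserving the class proportions.
    subset_idx, _ = train_test_split(
        indices, train_size=0.1, stratify=y, random_state=0
    )

It doesn't fix the imbalance itself, but at least the subset's class distribution matches the full dataset's.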



