To optimize for fast nearest neighbors, I chose 256 dims. Notably, this actually hurt some of the pre-training classification losses pretty severely compared to 2k dims, so it definitely has a quality cost.
The site uses cosine distance. The code itself actually implements Euclidean distance, but I decided at the last minute to normalize the vectors, out of fear that a few unusually small vectors would show up as neighbors for an abnormal number of examples.
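Normalizing actually makes the worry moot in a stronger sense: on unit vectors, squared Euclidean distance equals 2 minus twice the cosine similarity, so the two metrics produce identical neighbor rankings. A quick NumPy sketch (the shapes and data here are illustrative, not the real index):

    import numpy as np

    rng = np.random.default_rng(0)
    vecs = rng.normal(size=(1000, 256))                  # stand-in 256-dim embeddings
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)  # L2-normalize, as the site does

    query = vecs[0]
    euclidean_sq = np.sum((vecs - query) ** 2, axis=1)   # what the code computes
    cosine = 1.0 - vecs @ query                          # cosine distance on unit vectors

    # ||a - b||^2 = 2 - 2*cos(a, b) on unit vectors, so the rankings are identical
    # and unusually small vectors can no longer dominate the neighbor lists.
    assert np.allclose(euclidean_sq, 2.0 * cosine)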
You definitely highlighted a shortcoming of the feature vector model in this case. Indeed it's quite a small model trained on a single Mac for about a week, so it's not very "smart".
I'd expect this is a problem that could be solved by using larger off-the-shelf models for image similarity. For this project, I thought it would be cooler to train the model end-to-end myself, but doing so definitely has its costs.
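As a hedged sketch of that off-the-shelf route (one way it could look, not something the site does): strip the classifier head off a pretrained torchvision ResNet-50 and use the pooled features as image embeddings.

    import torch
    from torchvision.models import resnet50, ResNet50_Weights

    weights = ResNet50_Weights.DEFAULT
    model = resnet50(weights=weights)
    model.fc = torch.nn.Identity()  # drop the ImageNet head; keep the 2048-dim features
    model.eval()

    preprocess = weights.transforms()

    def embed(image):
        # image: a PIL.Image of a product photo; returns a unit-norm feature vector.
        x = preprocess(image).unsqueeze(0)
        with torch.no_grad():
            z = model(x).squeeze(0)
        return z / z.norm()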
I think it would be a useful feature. To keep this a fun project, I didn't use CLIP; I only wanted to use models I trained myself on a single Mac. To make the site more genuinely useful, though, text search would help a lot.
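If I did add it, CLIP would be the natural route, since it embeds text and images into the same space. A hedged sketch using the Hugging Face transformers API (the file names and the wiring into a product index are hypothetical):

    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    # Embed the product images once, offline (hypothetical file names).
    images = [Image.open(p) for p in ["shoe.jpg", "lamp.jpg"]]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        image_emb = model.get_image_features(**inputs)
    image_emb /= image_emb.norm(dim=-1, keepdim=True)

    # Embed a free-text query at request time and rank by cosine similarity.
    text_inputs = processor(text=["red leather boots"], return_tensors="pt", padding=True)
    with torch.no_grad():
        text_emb = model.get_text_features(**text_inputs)
    text_emb /= text_emb.norm(dim=-1, keepdim=True)

    scores = (text_emb @ image_emb.T).squeeze(0)
    print(scores.argsort(descending=True))  # best-matching products first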
Yup, it's a small model I trained on my Mac mini! The model itself just classifies product attributes like keywords, price, retailer, etc. The features it learns along the way are then used as the embeddings.
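In sketch form, the setup is something like this (PyTorch; the layer sizes and heads are illustrative guesses, not the actual architecture): a shared trunk feeds the per-attribute classification heads, and after training the trunk output is kept as the embedding.

    import torch
    import torch.nn as nn

    class ProductModel(nn.Module):
        def __init__(self, in_dim=2048, emb_dim=256, n_keywords=1000, n_retailers=50):
            super().__init__()
            self.trunk = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU())
            # One head per product attribute the model is trained to classify.
            self.keyword_head = nn.Linear(emb_dim, n_keywords)
            self.retailer_head = nn.Linear(emb_dim, n_retailers)
            self.price_head = nn.Linear(emb_dim, 1)  # price as a regression target

        def forward(self, x):
            z = self.trunk(x)  # the shared features
            return self.keyword_head(z), self.retailer_head(z), self.price_head(z)

        def embed(self, x):
            # After training, discard the heads and keep the trunk features,
            # normalized so they can be compared with cosine distance.
            with torch.no_grad():
                z = self.trunk(x)
            return z / z.norm(dim=-1, keepdim=True)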
People making quick social media posts while out on a casual walk, on websites that don't make it easy to edit posts, who aren't expecting to be nitpicked about it.
Overall, it's something I've seen very often on social media and in less technical articles about LLMs. OpenAI would fall into the "almost" category.
It's okay to say that you mistyped or whatever while taking a casual walk outside, on websites that don't make it easy to edit posts and where you aren't expecting to be nitpicked. Throwing in that everyone uses them interchangeably, however, is just profoundly wrong on every level.
I wasn't nitpicking. It is a HUGE distinction, and I pointed it out specifically because people pick up on terminology: someone who doesn't know better will go forward and drop in the fancier-sounding "hyperparameter," not realizing that it makes them look like they don't know what they're talking about. As I said in the other post, no one who knows anything uses them interchangeably. It is just completely wrong.
Again, I've heard and used "model hyperparameter" in place of "model parameter," and vice versa, because not every human interaction is a paper on arXiv and the terms are obviously very similar. What matters in the end is context (as demonstrated by the other commenters who followed my intended meaning), and society will not crumble if someone uses either term loosely in casual conversation. No one intentionally uses the wrong term, but as another comment jokingly put it, "when you get really deep into model training, it can seem like there are a billion hyperparameters you have to worry about."
I appreciate being corrected, but you're the one who asked for my opinion based on my extensive time in AI; you can choose to believe it or not.