
> Using components to recreate image

Well, it seems to me that the recreated image is in the training dataset; otherwise the recreation seems too accurate. I'd be interested in how it handles recreating new but similar images.



Scroll down to where it says:

   It even works for dresses that were not in the training set:   
Yes, that first dress is in the training data, but she did do reconstructions on dresses not used to build the PCA basis. They look decent, but not as good as that first example. And, as she points out, they can't reproduce patterns that weren't in the initial data very well, and can't reproduce accessories that weren't in the initial data at all.


Good suggestion! Added a section to the post about this (and gave you credit for the suggestion at the bottom of the article).


Nice! These look plausible, but it's still surprising how accurate they are.

Note that since this is a linear approach, the choice of colorspace can have a large impact on the results. You didn't mention the colorspace; my best guess is that you used sRGB. Maybe you could try a linear colorspace too.

Edit: According to the source, you use PIL.Image.getdata(), which according to the docs returns "pixel values"; the docs then give an RGB->XYZ conversion example that is only valid for a linear colorspace. That suggests the values returned are already linear RGB, but a lot of software messes these things up, so I'm not 100% sure.
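If the pixel values turn out to be sRGB-encoded, linearizing them before running PCA is a one-liner. This is a minimal sketch of the standard sRGB inverse companding curve (linear segment below 0.04045, gamma 2.4 above), applied to 8-bit values:

```python
import numpy as np

def srgb_to_linear(rgb):
    """Convert 8-bit sRGB values to linear RGB in [0, 1].

    Uses the standard sRGB decoding curve: a linear segment
    near black, and a 2.4-exponent curve elsewhere.
    """
    c = np.asarray(rgb, dtype=np.float64) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

# Mid-gray in sRGB (128) is noticeably darker in linear light (~0.22):
print(srgb_to_linear([0, 128, 255]))
```

Averaging and linear combinations (which is all PCA does) only behave photometrically correctly on the linearized values.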


Likewise with the "predictions" of the author's likes/dislikes. Testing how the model performs on an independent data set (or at least with cross-validation [1]) would be much more interesting.

[1] https://en.wikipedia.org/wiki/Cross-validation_(statistics)
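For reference, cross-validating a classifier like this is a few lines with scikit-learn. The features and labels below are synthetic stand-ins (the blog's actual PCA features and like/dislike ratings aren't reproduced here), just to show the mechanics:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data: rows are per-dress PCA component scores,
# labels are like/dislike driven mostly by the first component.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# 5-fold cross-validation: each fold is scored on data the model
# was not fit to, giving a less optimistic accuracy estimate than
# scoring on the training set itself.
scores = cross_val_score(LogisticRegression(), X, y, cv=5)
print(scores.mean())
```

Comparing this mean to the training-set accuracy is a quick check for overfitting.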


The other thing I wondered about the predictions: she apparently rated all of the dresses, and the top/bottom matched the ratings. Fair enough. But what about the residuals, the misclassified ones: the ones where the logistic regression predicted a high or low score and her rating was actually the opposite? That might be interesting to look at.


That's there. Search for:

> The misclassifications are interesting too

One problem seems to be that it concluded she'd dislike anything the exact opposite color from her favorite shade of red. A common flaw in linear models.
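A toy sketch of why this happens: a linear score on (centered) pixel values is an inner product, so a color on the exact opposite side of the mean gets the same score with the sign flipped. The weights and colors below are made up for illustration:

```python
import numpy as np

# Hypothetical linear weights favoring red over green/blue,
# with pixel values centered about a mid-gray mean (as PCA would do).
w = np.array([1.0, -0.5, -0.5])
mean = np.array([0.5, 0.5, 0.5])

def score(color):
    return w @ (np.asarray(color) - mean)

red = [0.9, 0.2, 0.2]
cyan = [0.1, 0.8, 0.8]   # red's complement, reflected about the mean

# Equal magnitude, opposite sign: a strong "like" for red forces
# an equally strong "dislike" for its complement.
print(score(red), score(cyan))
```

A nonlinear model (or nonlinear features) wouldn't be forced into this symmetry.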


The blog post seems to be getting modified at this moment. When I first saw it, it didn't have anything about the misclassifications, but that has been added now.


I've only glanced at her code, but it looks like[1] the predictions are from held-out data.

EDIT: All of the data was used in forming the PCA basis, but that isn't (necessarily) an error, depending on the use-case. And the logistic regression model was evaluated on held-out data.

[1] https://github.com/graceavery/Eigenstyle/blob/master/visuals...


This is actually pretty easy: you just walk the top few basis vectors in the low-d subspace and reproject back into the original space. It sounds complicated, but it's really not.
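Concretely, that walk is a few lines with scikit-learn's PCA (the random matrix here stands in for the flattened dress images):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: each row is a flattened image.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))

pca = PCA(n_components=5).fit(X)

# "Walk" the first basis vector: step along it in the 5-d subspace,
# then reproject each step back into the original 64-d space.
steps = np.linspace(-3.0, 3.0, 7)
walk_low_d = np.zeros((len(steps), 5))
walk_low_d[:, 0] = steps                    # vary only component 0
images = pca.inverse_transform(walk_low_d)  # one reconstruction per step

print(images.shape)  # (7, 64)
```

The middle step (all-zero coordinates) reconstructs to the mean image; the others show how that one component deforms it.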



