
Question: is there any difference between the highest-variance direction PCA finds and the line that linear regression would fit?


If I recall correctly, yes, there generally will be. Linear regression minimises the vertical distance from each point to the regression line, whereas PCA minimises the orthogonal distance from each point to the line.


Linear regression measures an "error" for every data point. Visually, the error is the vertical difference between the data point and the regression line/plane. In contrast, PCA measures the distance from the data point to the PCA axis along the perpendicular to that axis. That perpendicular distance is the residual left over after projecting the point onto the axis; the "projection" itself is where the point lands on the axis.
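
To make the difference concrete, here is a rough NumPy sketch on made-up 2-D data (the slope of 0.5, the noise level and the seed are arbitrary assumptions, not anything from the thread). It fits both lines through the centroid and compares the two kinds of squared error.

```python
import numpy as np

# Synthetic data: assumed for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(scale=0.5, size=500)

# Centre both variables so each fitted line passes through the origin.
xc = x - x.mean()
yc = y - y.mean()

# Ordinary least squares of y on x: minimises vertical squared distances.
ols_slope = (xc @ yc) / (xc @ xc)

# PCA: the first principal axis is the eigenvector of the covariance
# matrix with the largest eigenvalue (eigh returns eigenvalues ascending).
cov = np.cov(np.vstack([xc, yc]))
eigvals, eigvecs = np.linalg.eigh(cov)
pc1 = eigvecs[:, -1]
pca_slope = pc1[1] / pc1[0]


def vertical_sse(slope):
    """Sum of squared vertical distances to the line y = slope * x."""
    return np.sum((yc - slope * xc) ** 2)


def orthogonal_sse(slope):
    """Sum of squared perpendicular distances to the line y = slope * x."""
    return np.sum((yc - slope * xc) ** 2) / (1.0 + slope ** 2)


print("OLS slope:", ols_slope, "PCA slope:", pca_slope)
print("vertical SSE   (OLS, PCA):", vertical_sse(ols_slope), vertical_sse(pca_slope))
print("orthogonal SSE (OLS, PCA):", orthogonal_sse(ols_slope), orthogonal_sse(pca_slope))
```

Each line should win on its own criterion: the regression line has the smaller vertical error, and the PCA axis has the smaller perpendicular error. The PCA slope also comes out steeper than the regression slope of y on x.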

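There is something known as orthogonal regression (total least squares), which uses the same measure as PCA. Unfortunately, it doesn't work well when the variables are in incompatible units or on very different scales: a perpendicular distance mixes the units of both axes, so the fitted line changes when you rescale one of the variables.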
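
One way to see the problem with incompatible variables is to change the units of x and refit. The sketch below is only an illustration on made-up data; the helper names `ols_slope` and `tls_slope` and the factor of 1000 (think metres to millimetres) are assumptions of mine. The ordinary regression slope converts back to the same value, while the orthogonal fit does not.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (vertical errors)."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / (xc @ xc)


def tls_slope(x, y):
    """Slope of the orthogonal (total least squares) line, i.e. the first PCA axis."""
    cov = np.cov(np.vstack([x, y]))
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    v = eigvecs[:, -1]
    return v[1] / v[0]


# Synthetic data: assumed for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(scale=0.5, size=500)

scale = 1000.0       # e.g. metres -> millimetres
x_mm = x * scale

# Fit in the rescaled units, then convert the slope back to the original units.
ols_back = ols_slope(x_mm, y) * scale
tls_back = tls_slope(x_mm, y) * scale

print("OLS:", ols_slope(x, y), "->", ols_back)   # unchanged by the unit change
print("TLS:", tls_slope(x, y), "->", tls_back)   # depends on the units chosen
```

A common workaround is to standardise each variable before the orthogonal fit, though that is itself a choice of scale rather than a way to avoid one.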



