If you had a really great model, you might be able to make some money with it--find underpriced (per the model) houses, buy them, and then try to sell them at the model's price.
Real estate markets are rarely liquid. Such models would have to take that into account, of course, and that isn't something I'd think is straightforward at all.
I think it's hard to account for because of several things: real estate isn't fungible, the purchase date can have little or no relationship to the price, and the means of purchase typically carry other time-based constraints that are also not necessarily related to price (although they can be). Furthermore, the liquidity in a given market may change fairly quickly and without any other apparent reason.
I haven't tried to develop a model for this, so this is just a gut feel for how difficult the feature engineering needed to capture this kind of market behavior might be. And that "gut feel" is mainly informed by my experience purchasing and selling property (for personal and rental use).
I'm not convinced that risk as used in this context contributes to price significantly in the residential real estate market. People in this market don't (typically) view transactions as investments, and they don't view their homes as assets. Lenders do, so the availability of mortgage funds may be linked to price and liquidity, but I suspect that relationship is tenuous at best.
I imagine it's tricky because value per square foot varies quite a bit in adjoining neighborhoods. And sometimes within a neighborhood. And neighborhoods aren't necessarily well defined geographically. And, some neighborhoods go long periods of time without any houses being listed or sold.
> value per square foot varies quite a bit in adjoining neighborhoods. And sometimes within a neighborhood
The quality or rather "reputation" of the assigned public school is probably the best predictor of this variance. The problem is that the school's reputation can diverge from the official grades, so this is difficult to quantify.
Sure, but there are other drivers that would cause different value per square foot. A house with 12-foot ceilings, granite countertops, stone floors, wainscoting, built-ins, high-end appliances, tile roof, and so on could be 100 yards from a house with the exact same square footage and none of that. It trips up the Zestimates around here, especially in areas with larger lots, where bulldozing the old plain ranch house and replacing it with a higher-end house is common. The per-square-foot values are very different.
Edit: also, near power lines, near water tower, on cul de sac, next door to elementary school, etc.
That might be a good project actually. Try to predict past house prices from historical data. A retrospective analysis would be fun since you know what the "right answer" is and can adjust your model accordingly.
It seems this and predicting the stock market have some similarities, but I think real estate is less volatile.
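A rough sketch of what that retrospective test could look like: fit on older sales, then score against later sales whose prices are already known. Everything here is an assumption for illustration. The sales figures are made up, and a one-feature least-squares line on square footage stands in for whatever model you'd actually use.

```python
from statistics import mean

# Hypothetical past sales: (year_sold, square_feet, sale_price).
# These numbers are invented for illustration, not real data.
SALES = [
    (2010, 1200, 180_000),
    (2010, 1800, 262_000),
    (2011, 1500, 228_000),
    (2011, 2200, 318_000),
    (2012, 1000, 158_000),
    (2012, 2000, 296_000),
    (2013, 1600, 246_000),
    (2013, 2400, 352_000),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def backtest(sales, cutoff_year):
    """Fit on sales before cutoff_year, score on sales from cutoff_year on."""
    train = [s for s in sales if s[0] < cutoff_year]
    test = [s for s in sales if s[0] >= cutoff_year]
    a, b = fit_line([s[1] for s in train], [s[2] for s in train])
    # Absolute percentage error of each held-out prediction.
    errors = [abs((a + b * sqft) - price) / price for _, sqft, price in test]
    return mean(errors)


mape = backtest(SALES, cutoff_year=2013)
print(f"mean absolute percentage error on held-out sales: {mape:.1%}")
```

Splitting by date rather than at random matters here: a random split would let the model see sales from the same period it is being scored on, which makes it look better than it would in practice.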
>... buy them, and then try to sell them at the model's price.
Given the low liquidity and the costs of marketing, buying, and selling a property, this strategy would require you to find really high-margin houses, and those are snatched up, marketed, and sold within a week.
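To put a rough number on "high margin", here is a back-of-the-envelope sketch. All the cost figures (2% buying costs, 6% selling costs, $1,500 a month to hold, six months to resell) are assumptions picked for illustration, not market data, and the function names are hypothetical.

```python
def flip_profit(purchase_price, sale_price, buy_cost_rate=0.02,
                sell_cost_rate=0.06, monthly_holding_cost=1_500,
                months_held=6):
    """Profit from buying and reselling, under assumed transaction costs."""
    proceeds = sale_price * (1 - sell_cost_rate)      # after selling costs
    outlay = purchase_price * (1 + buy_cost_rate)     # price plus buying costs
    holding = monthly_holding_cost * months_held      # taxes, insurance, etc.
    return proceeds - outlay - holding


def break_even_purchase_price(sale_price, buy_cost_rate=0.02,
                              sell_cost_rate=0.06, monthly_holding_cost=1_500,
                              months_held=6):
    """Highest purchase price at which flip_profit is not negative."""
    net = sale_price * (1 - sell_cost_rate) - monthly_holding_cost * months_held
    return net / (1 + buy_cost_rate)


model_price = 300_000  # hypothetical price the model says the house is worth
max_buy = break_even_purchase_price(model_price)
discount = 1 - max_buy / model_price
print(f"break-even purchase price: {max_buy:,.0f}")
print(f"required discount to model price: {discount:.1%}")
```

With these assumed costs the house has to be bought roughly 11% below the model's price just to break even, and that is before allowing for any error in the model itself.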
They buy houses from owners at a fixed price and then need to turn a profit on it. Therefore they need to have a good idea of what the eventual selling price will be.
There is a Kaggle competition right now to predict house prices in Russia. If you can do it better than anyone else, you can win tens of thousands of dollars.
a) https://www.kaggle.com/c/house-prices-advanced-regression-te...
b) http://www.bis.org/statistics/pp_detailed.htm
c) https://archive.ics.uci.edu/ml/datasets/housing