If you want to try it, there are a few data sets kicking around, including a) ht...

sidlls · on May 10, 2017

Real estate markets are rarely liquid. Such models would have to take that into account, of course, and that isn't something I'd think is straightforward at all.

stupidhn · on May 11, 2017

>Such models would have to take that into account

I believe the liquidity of the asset is inherently taken into account in the price at which the exchange is made.

sidlls · on May 11, 2017

Why do you believe that, out of curiosity?

The reason I think it isn't straightforward to account for is based on several things: it isn't fungible, the purchase date can have little or no relationship to the price, and the means of purchase typically carry other time-based constraints that are also not necessarily related to price (although they can be). Furthermore, the liquidity in a given market may change fairly quickly and without any (apparent) other reason.

I haven't tried to develop a model for this, so this is just sort of a gut feel for how difficult feature engineering to account for how a market behaves in this respect might be. And that "gut feel" is mainly informed by my experience purchasing and selling property (for personal and rental use).

stupidhn · on May 11, 2017

>Why do you believe that, out of curiosity?

Because the liquidity of an asset is a form of risk.

>The reason I think it isn't straightforward

I don't think it's straightforward either, but that only means an individual's valuation might be way off.

sidlls · on May 11, 2017

I'm not convinced that risk as used in this context contributes to price significantly in the residential real estate market. People in this market don't (typically) view transactions as investments and they don't view the home as an asset. Lenders do, so the availability of mortgage funds may be linked to price and liquidity, but I suspect that relationship is tenuous at best.

rockinghigh · on May 10, 2017

The house would have to be mispriced by more than the transaction costs (5-10%).
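
A rough sketch of that threshold, for anyone who wants to plug in numbers. The function names, the example prices, and the assumption that costs are a flat fraction of the eventual sale price are all made up for illustration:

```python
def net_margin(list_price: float, model_price: float, cost_rate: float) -> float:
    """Profit as a fraction of the purchase price, after transaction costs.

    Assumes (hypothetically) that all round-trip costs are a flat
    fraction `cost_rate` of the price the house eventually sells for.
    """
    proceeds = model_price * (1.0 - cost_rate)
    return (proceeds - list_price) / list_price


def is_worth_buying(list_price: float, model_price: float, cost_rate: float = 0.10) -> bool:
    """True only if the model's price beats the list price by more than the costs."""
    return net_margin(list_price, model_price, cost_rate) > 0.0


# A house listed at 300k that the model values at 320k looks underpriced,
# but at a 10% cost rate the trade still loses money.
small_gap = net_margin(300_000, 320_000, 0.10)
large_gap = net_margin(300_000, 360_000, 0.10)
```

With these made-up numbers, a 6.7% apparent discount turns into a loss, while a 20% one leaves a positive margin.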

tyingq · on May 10, 2017

I imagine it's tricky because value per square foot varies quite a bit in adjoining neighborhoods. And sometimes within a neighborhood. And neighborhoods aren't necessarily well defined geographically. And, some neighborhoods go long periods of time without any houses being listed or sold.

So the whole idea of "comparable sales" is shaky.

natoliniak · on May 10, 2017

> value per square foot varies quite a bit in adjoining neighborhoods. And sometimes within a neighborhood

The quality or rather "reputation" of the assigned public school is probably the best predictor of this variance. The problem is that the school's reputation can diverge from the official grades, so this is difficult to quantify.

tyingq · on May 10, 2017

Sure, but there are other drivers that would cause different value per square foot. A house with 12 foot ceilings, granite countertops, stone floors, wainscoting, built-ins, high-end appliances, tile roof, and so on... could be 100 yards from a house with the exact same square footage, and none of that. It trips up the Zestimates around here. Especially in areas with larger lots, where bulldozing the old plain ranch house and replacing it with a higher-end house is common. The per-square-foot values are very different.

Edit: also, near power lines, near water tower, on cul de sac, next door to elementary school, etc.

beamatronic · on May 10, 2017

Common deal-breakers around here are:

- being on a T intersection
- recent deaths in the home
- located under a high-tension power line

I don't think these attributes are accounted for in any MLS, Zillow, or Redfin database schema.

_m8fo · on May 10, 2017

That might be a good project actually. Try to predict previous house prices from historical data. A retrospective analysis would be fun since you know what the "right answer" is and can adjust your model accordingly.

It seems this and predicting the stock market have some similarities, but I think real estate is less volatile.

azernik · on May 10, 2017

Make sure to separate your training and validation data sets! :-)
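
For a retrospective analysis like this, one way to do it (just a sketch) is to split by sale date, so the model is only ever scored on sales that happened after everything it was fit on. The records, field names, cutoff date, and the median price-per-square-foot "model" below are all hypothetical stand-ins:

```python
from datetime import date
from statistics import median

# Hypothetical sales records; a real project would load historical data instead.
sales = [
    {"sold_on": date(2015, 3, 1), "sqft": 1000, "price": 200_000},
    {"sold_on": date(2015, 9, 15), "sqft": 1500, "price": 330_000},
    {"sold_on": date(2016, 2, 10), "sqft": 2000, "price": 400_000},
    {"sold_on": date(2016, 11, 5), "sqft": 1200, "price": 270_000},
    {"sold_on": date(2017, 1, 20), "sqft": 1800, "price": 380_000},
]


def split_by_date(records, cutoff):
    """Train on sales before the cutoff, validate on sales on or after it."""
    train = [r for r in records if r["sold_on"] < cutoff]
    valid = [r for r in records if r["sold_on"] >= cutoff]
    return train, valid


def fit_price_per_sqft(train):
    """Deliberately naive baseline: the median price per square foot."""
    return median(r["price"] / r["sqft"] for r in train)


def mean_abs_pct_error(rate, records):
    """Average of |predicted - actual| / actual over the given records."""
    errors = [abs(rate * r["sqft"] - r["price"]) / r["price"] for r in records]
    return sum(errors) / len(errors)


train, valid = split_by_date(sales, date(2016, 6, 1))
rate = fit_price_per_sqft(train)         # fit on the training sales only
error = mean_abs_pct_error(rate, valid)  # scored on later, unseen sales
```

A random split would leak future prices into training, which makes a retrospective test look better than the model would have done at the time.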

GabrielBen · on May 10, 2017

>... buy them, and then try to sell them at the model's price.

Given the illiquidity and the costs of marketing, buying, and selling a property, this strategy would require you to find really high-margin houses, and those are snatched up, marketed, and sold within a week.

coredog64 · on May 11, 2017

Isn't that similar to the model for Opendoor?

They buy houses from owners at a fixed price and then need to turn a profit on it. Therefore they need to have a good idea of what the eventual selling price will be.

Houshalter · on May 11, 2017

There is a Kaggle competition right now to predict house prices in Russia. If you can do it better than anyone else, you can win tens of thousands of dollars.