>When their accuracy is touted it is usually based on the prediction the day before the election.

That's not true in Nate's case, at least. One of his biggest wins was debunking some of the BS "narratives" that pundits tried to spin during the election, long before election day; the one about "Romney's momentum" in the final weeks is the most memorable.

Linzer also did a postmortem where he looked at, among other things, the accuracy and predictive power of the model at different points in the election [1].

>Is that really useful?

An order of magnitude more so than traditional punditry, not least because it's honest about what it can and can't tell you, namely "if the election were held today, this is the most likely outcome". No more, no less.

>More importantly you can get the same predictive power by just directly using recent polling data.

You do realize that's exactly what these guys do, right? The problem is, which polling data do you use? They provide a scientific answer to that question. Instead of cherry picking, they use it all, and weight it based on past accuracy and other factors.
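To make that concrete, here's a rough sketch of that weighting step. The field names, accuracy scores, and decay constant are invented for illustration; the real models fold in more factors than this:

    from dataclasses import dataclass

    @dataclass
    class Poll:
        pollster: str
        days_old: int        # days since the field period ended
        sample_size: int
        dem_share: float     # two-party Democratic share, 0..1

    # Hypothetical per-pollster accuracy scores (higher = historically
    # closer to the actual result); these numbers are made up.
    ACCURACY = {"PollsterA": 0.9, "PollsterB": 0.6}

    def weighted_estimate(polls, half_life_days=14.0):
        # Each poll's weight combines recency (exponential decay),
        # the pollster's track record, and sample size.
        num = den = 0.0
        for p in polls:
            recency = 0.5 ** (p.days_old / half_life_days)
            w = recency * ACCURACY.get(p.pollster, 0.5) * p.sample_size ** 0.5
            num += w * p.dem_share
            den += w
        return num / den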

[1]: http://votamatic.org/evaluating-the-forecasting-model/



I didn't communicate my point effectively. Let me try again.

I believe strongly in quantitative analysis of election data. I place zero value on punditry, especially from the mainstream media. What I don't believe is that some of these complicated models, Silver's in particular, are meaningfully superior to predicting the election by trivially applying recent polling data, perhaps augmented with a simple weighted combination of polls based on recency or sample size.

I take as a given that Silver's model is better than punditry. I am skeptical that it is better than the trivial model which any undergraduate stats student would cook up.

> The problem is, which polling data do you use? They provide a scientific answer to that question. Instead of cherry picking, they use it all, and weight it based on past accuracy and other factors.

What I dispute is whether the "other factors", which are the secret sauce that allows Silver to give the impression he has a uniquely predictive model, have any real value.

Thanks for the Linzer link. I'll have to take time to read it carefully, but at first glance it again shows one of the things I take issue with: If you are going to claim that a model is accurate, you should be asking, compared to what? How can you justify a complex model if you aren't even going to try to show that it is better than some trivial baseline?
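To sketch the comparison I have in mind (all numbers below are made up; you'd plug in each model's final state-level forecasts and the actual two-party results):

    actual = {"OH": 0.51, "FL": 0.497, "VA": 0.52}  # actual two-party Dem share
    model  = {"OH": 0.52, "FL": 0.490, "VA": 0.53}  # complex model's forecast
    naive  = {"OH": 0.50, "FL": 0.505, "VA": 0.52}  # trivial poll-average baseline

    def mean_abs_error(forecast):
        return sum(abs(forecast[s] - actual[s]) for s in actual) / len(actual)

    def states_called_correctly(forecast):
        return sum((forecast[s] > 0.5) == (actual[s] > 0.5) for s in actual)

    for name, f in [("model", model), ("naive", naive)]:
        print(name, mean_abs_error(f), states_called_correctly(f))

If the complex model doesn't beat the naive one on error and on states called, the secret sauce isn't earning its keep.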


> I take as a given that Silver's model is better than punditry. I am skeptical that it is better than the trivial model which any undergraduate stats student would cook up.

This seems quite easy to test. Has anyone done so yet?


I haven't seen it.


What would the trivial/baseline model be, just a straight average of all state polls for the last X days?

That would be interesting to know, but also easy to find out.


I wouldn't say there is a canonical baseline model, but your example would be a reasonable place to start. A weighted average by sample size is a straightforward modification. Just using the most recent reliable poll would also be interesting.

I think most of the extra value you could add would be from analyzing the poll data to get a sense of which polls were unreliable.
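For concreteness, those three baselines might look something like this; the Poll record and window length are just stand-ins:

    import statistics
    from collections import namedtuple

    # Stand-in record for a single state poll; field names are assumptions.
    Poll = namedtuple("Poll", "days_old sample_size dem_share")

    def straight_average(polls, window_days=14):
        # Baseline 1: unweighted mean of all polls in the window.
        return statistics.mean(
            p.dem_share for p in polls if p.days_old <= window_days)

    def sample_size_weighted(polls, window_days=14):
        # Baseline 2: mean weighted by sample size.
        recent = [p for p in polls if p.days_old <= window_days]
        n = sum(p.sample_size for p in recent)
        return sum(p.dem_share * p.sample_size for p in recent) / n

    def most_recent(polls):
        # Baseline 3: just take the single most recent poll.
        return min(polls, key=lambda p: p.days_old).dem_share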



