Hacker News

The only correct answer is "A: mail them" if you want to stay statistically relevant.

You cannot ignore the group that didn't answer the questionnaire, as they will most likely expose some of the behavior that you are researching (i.e. about life etc), and might have a huge impact on your results.

So, the statistical result you currently have (based on 90 of the 120 students) will most likely be biased, and is therefore invalid. (25% missing input that might heavily impact your outcome most likely makes your results useless.)

Thus, the only way to make it statistically relevant is getting more answers from the no-show group.

B. If you start over and do the same thing, you will most likely get similar results and similar no-shows, so that will not be a good solution.

D. As explained before, ignoring the no-shows results in a potentially biased outcome.

E. If your initial sample is most likely biased, adding another 30 subjects will not fix that initial bias.
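The bias being debated here is easy to demonstrate with a toy simulation. Everything below is illustrative: the trait, the response mechanism, and the numbers are assumptions for the sketch, not data from any actual survey.

```python
import random

random.seed(42)

# Hypothetical setup: 120 students, each with some trait the survey
# measures (say, weekly hours of free time), where the trait itself
# affects whether a student returns the questionnaire.
N = 120
students = [random.gauss(10, 3) for _ in range(N)]


def responds(trait):
    # Assumed mechanism: students with a high trait value (>= 12)
    # rarely return the form; the others usually do.
    p = 0.95 if trait < 12 else 0.3
    return random.random() < p


responders = [t for t in students if responds(t)]

true_mean = sum(students) / N
observed_mean = sum(responders) / len(responders)

print(f"responders:    {len(responders)}/{N}")
print(f"true mean:     {true_mean:.2f}")
print(f"observed mean: {observed_mean:.2f}")
# Because high-trait students are under-represented among responders,
# the observed mean systematically underestimates the true mean, and
# adding 30 fresh subjects drawn the same way would not fix that.
```

Under this assumed mechanism, the estimate from responders alone is biased low no matter how many responders you collect, which is the point options D and E miss.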



The behavior that the non-responders are exposing is that they don't like answering surveys. Factor that in how you will, but if you beg, incentivize, or demand that they answer, you are no more likely to get honest answers.

I remember kids in school who would just answer "A" to every question because they didn't like surveys and didn't care.


> the only way to make it statistically relevant is getting more answers from the no-show group.

You can't get answers from the no-show group, because they don't respond! You are just creating three groups instead of two: those who responded on the first try, those who responded on the second try, and those who don't respond at all.


The fact that you have to nag them will introduce bias by itself. Now 25% of your sample will be biased by the nag. It’s just a question of whether that’s worth it compared to not including them at all.

Anyway, this should be an open-ended question.


> You cannot ignore the group that didn't answer the questionnaire, as they will most likely expose some of the behavior that you are researching (i.e. about life etc), and might have a huge impact on your results.

This is completely paradoxical.

You are saying that using the data from the 90 would be jumping to conclusions, because you would probably be ignoring data that wouldn't match those 90.

But making this claim IS jumping to conclusions, because you are making an assumption (the 30 have something in common explaining why they didn't fill the form).


See: non-response bias [1]

Parent comment should've said "could" instead of "will most likely", but their point is correct.

[1] https://en.wikipedia.org/wiki/Participation_bias


The article you link to establishes the reality of participation bias, but it does not exactly endorse option A. It does say "In e-mail surveys those who didn't answer can also systematically be phoned and a small number of survey questions can be asked. If their answers don't differ significantly from those who answered the survey, there might be no non-response bias. This technique is sometimes called non-response follow-up."

This is not, however, the same as option A, which (as far as it goes) commingles responses from those who respond to the second prompt with those from the first, potentially concentrating a non-response bias in those who don't respond to either prompting. Furthermore, neither option A nor the above quote offers a remedy if evidence of non-response bias is found.
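The quoted "non-response follow-up" technique amounts to a two-sample comparison, which can be sketched in a few lines. The answer values, group sizes, and the two-standard-error screen below are illustrative assumptions, not a prescription from the article.

```python
import math
import statistics

# Illustrative data: answers from the 90 mail responders and from a
# hypothetical phone follow-up of 15 of the 30 non-responders.
mail_answers = [8, 9, 10, 11, 9, 10, 12, 8, 10, 9] * 9  # 90 values
phone_answers = [12, 14, 13, 11, 15, 13, 12, 14, 13, 12, 14, 13, 15, 12, 13]

diff = statistics.mean(phone_answers) - statistics.mean(mail_answers)

# Rough standard error of the difference in means (crude two-sample check).
se = math.sqrt(
    statistics.variance(mail_answers) / len(mail_answers)
    + statistics.variance(phone_answers) / len(phone_answers)
)

print(f"difference in means: {diff:.2f} (se {se:.2f})")
if abs(diff) > 2 * se:
    print("follow-up answers differ notably -> possible non-response bias")
```

Note that this only detects the bias among non-responders who can be reached by phone; as the comment above points out, it offers no remedy once the bias is found.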


It's really not. The 30 might have something in common, which calls into question any findings that exclude them. This doesn't rely on any unreasonable assumptions.


Option A is absolutely the worst choice. The non-responders have already selected themselves into a non-random set, and making a request to them alters the entire experiment.



