What statistical knowledge and training do I need to evaluate the legitimacy of scientific papers, esp. in the medical field?
Whenever I read about any scientific claim, I ignore the press coverage and go straight to the original paper cited (if there is one; often it is misquoted). I then read the abstract and the testing methodology. If I can spot any issues in the methodology, I usually stop reading, e.g. small sample size, an obvious confounding variable, blatant correlation/causation errors. But even if all seems well, that still doesn't tell me whether the study's claims match the test results or whether the threshold parameters make sense.
Given a basic stats background, how can I obtain a deeper, intuitive understanding of things like p-values (which seem to be outdated anyway), sample sizes, and so on? Thanks, HN.
About p-values: they aren't exactly outdated, but they are the subject of a pretty fierce controversy. Many, if not most, scientists still use them when doing statistical analyses because they are simple to apply and to understand and provide a quick metric for measuring the significance of a study's results.
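To make that "quick metric" concrete, here is a minimal sketch in Python (using numpy and scipy; the numbers are invented purely for illustration) of the kind of p-value calculation that sits behind many study results:

```python
# Minimal sketch: a p-value from a two-sample t-test on made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treatment = rng.normal(loc=5.3, scale=1.0, size=30)  # hypothetical treatment group
control = rng.normal(loc=5.0, scale=1.0, size=30)    # hypothetical control group

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value says: data this extreme would be unlikely if the two
# groups really had the same mean. It says nothing about the size of the
# effect or how spread out the data are.
```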
Other scientists say that they are too simple and don't convey important information about the actual data (such as the spread). Apparently, there are more modern statistical procedures that do a better job than p-values do. (I'm not a statistician though, so don't ask me what these procedures are...) Also, p-values are all too often subjected to "p-hacking": massaging the data until you get a statistically significant result (p <= 0.05); the toy simulation below shows how easy that is.

In fact, the very concept of significance is problematic. Originally, the p <= 0.05/0.01/0.005 significance limits were just approximate guidelines to help scientists interpret their data. Nowadays, they are often treated as definite boundaries of "truth". ("If my data gives a p-value of 0.049, the result is significant, therefore my hypothesis must be true. If p=0.051, it is not significant, therefore my hypothesis must be wrong - or I must tweak my data until I get p=0.05.") This is obviously nonsense, yet a surprisingly common attitude (though not always as extreme as in my example).
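As a toy illustration of why p-hacking works, here is a short Python simulation (pure noise on both sides, numbers chosen only for the example): test enough unrelated variables and one of them will look "significant" by luck alone.

```python
# Toy p-hacking demo: correlate 20 pure-noise variables with a
# pure-noise outcome and report the best-looking p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 50
outcome = rng.normal(size=n)

p_values = []
for _ in range(20):
    noise_variable = rng.normal(size=n)             # completely unrelated to outcome
    r, p = stats.pearsonr(noise_variable, outcome)  # correlation test
    p_values.append(p)

print(f"smallest p-value out of 20 tests: {min(p_values):.3f}")
# With 20 independent tests at the 0.05 level, the chance of at least one
# "significant" result from noise alone is about 1 - 0.95**20, roughly 64%.
```

Run it a few times with different seeds and you will regularly see a p-value under 0.05, even though nothing real is going on; a paper that only reports the winning variable looks perfectly "significant".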
As far as I'm concerned, the real problem is not p-values as such, but perhaps a lack of statistical understanding among many scientists. (Coupled with the pressure exerted by journals that only want to publish "significant" results and so indirectly encourage p-hacking.)