# Statistical issues

Here is a collection of statistical issues and misunderstandings you will often encounter in empirical research. My plan is to add more examples in the future.

| Issue | Description | Source |
| --- | --- | --- |
| Absence of evidence fallacy | A lack of evidence for a finding should not be interpreted as evidence of its absence. | Altman and Bland (1995) |
| Berkson’s paradox | When conditioning on a variable creates a spurious correlation (i.e., collider bias, the result of conditioning on a collider). | Berkson (1946) |
| Cronbach’s alpha | People often misunderstand the coefficient. There is no particular level of alpha that is desired or adequate. | Hoekstra et al. (2018) |
| Garbage can regression | Adding too many independent variables to your regression model (i.e., a kitchen-sink approach). | Achen (2004) |
| Garden of forking paths | When researchers conduct multiple analyses but only end up reporting a subset of these (data-dependent analysis). | Gelman and Loken (2014) |
| Moderation vs. mediation | A moderator is a variable that affects the direction and/or strength of the relation between two variables. This is not the same as mediation. | Baron and Kenny (1986) |
| Multivariate vs. multivariable | A multivariate model is a model with multiple dependent variables; a multivariable model has multiple independent variables. | Mustillo et al. (2018) |
| p-value as a probability | The p-value is not the probability that the null hypothesis is true. | Greenland et al. (2016) |
| Prosecutor’s fallacy | Incorrectly assuming that Pr(A \| B) = Pr(B \| A). | Westreich et al. (2014) |
| Simpson’s paradox | A trend in the data can disappear or reverse when looking at subgroups in the data. | Simpson (1951) |
| Spurious correlation | When two variables correlate but are not causally related. | Simon (1954) |
| Statistical power | The importance of having sufficient data to estimate the effect size of interest. | Cohen (1992) |
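
To make one of these issues concrete, here is a minimal simulation sketch of Berkson’s paradox. It is only an illustration under assumed settings: the two variables are drawn as independent standard normals, and the selection rule (keep a case only when the sum exceeds a threshold) is a hypothetical stand-in for any collider you might condition on. The function names, sample size, threshold, and seed are all my own choices.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_berkson(n=20000, threshold=1.0, seed=1):
    """Return (correlation in full sample, correlation in selected sample).

    x and y are generated independently, so any correlation in the
    selected sample is created by the selection rule alone.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]

    # Hypothetical collider: a case is observed only if x + y is large.
    selected = [(a, b) for a, b in zip(x, y) if a + b > threshold]
    sel_x = [a for a, _ in selected]
    sel_y = [b for _, b in selected]

    return pearson(x, y), pearson(sel_x, sel_y)


r_all, r_selected = simulate_berkson()
print(f"Full sample correlation:     {r_all:.3f}")
print(f"Selected sample correlation: {r_selected:.3f}")
```

In the full sample the correlation should be close to zero, while in the selected sample it should be clearly negative, even though the two variables were generated independently. The size of the induced correlation depends on the assumed threshold.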