Statistical issues – Erik Gahner Larsen

Here is a collection of statistical issues and misunderstandings you will often encounter in empirical research. My plan is to add more examples in the future.

Issue	Description	Source
Absence of evidence fallacy	A lack of evidence for an effect should not be interpreted as evidence that the effect is absent.	Altman and Bland (1995)
Berkson’s paradox	When conditioning on a variable creates a spurious correlation (i.e., collider bias, the conditioning on a collider).	Berkson (1946)
Cronbach’s alpha	People often misunderstand the coefficient. There is no particular level of alpha that is universally desired or adequate.	Hoekstra et al. (2018)
Garbage can regression	Adding too many independent variables to your regression model (i.e. a kitchen-sink approach).	Achen (2004)
Garden of forking paths	When researchers conduct multiple analyses but only end up reporting a subset of these (data-dependent analysis).	Gelman and Loken (2014)
Moderation vs. Mediation	A moderator is a variable that affects the direction and/or strength of the relation between two variables. This is not the same as a mediator, which is a variable through which one variable affects another.	Baron and Kenny (1986)
Multivariate vs. Multivariable	A multivariate model is a model with multiple dependent variables. A multivariable model is a model with multiple independent variables.	Mustillo et al. (2018)
p-value as a probability	The p-value is not the probability that the null hypothesis is true.	Greenland et al. (2016)
Prosecutor’s fallacy	Incorrectly assuming that Pr(A|B) = Pr(B|A).	Westreich et al. (2014)
Simpson’s paradox	A trend in the data can disappear or reverse when looking at subgroups in the data.	Simpson (1951)
Spurious correlation	When two variables correlate but are not causally related.	Simon (1954)
Statistical power	The importance of having sufficient data to estimate the effect size of interest.	Cohen (1992)
Table 2 fallacy	Interpreting the exposure and confounder coefficients from a single model as if they were all the same type of effect estimate.	Westreich and Greenland (2013)
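
Two of the issues in the table, Simpson’s paradox and the prosecutor’s fallacy, are easy to check with a bit of arithmetic. The Python sketch below is only an illustration: all counts and probabilities are hypothetical numbers made up for the example, and the names (`counts`, `subgroup_rates`, `pooled_rates`, `posterior`) are mine and do not come from any of the sources above. It uses exact fractions so the comparisons are not affected by rounding.

```python
from fractions import Fraction

# Hypothetical (successes, trials) counts, invented for illustration only.
# Treatment A is mostly used on hard cases, treatment B mostly on easy ones.
counts = {
    "A": {"easy": (18, 20), "hard": (48, 80)},
    "B": {"easy": (68, 80), "hard": (10, 20)},
}


def subgroup_rates(counts):
    """Success rate for each treatment within each subgroup."""
    return {
        treatment: {group: Fraction(s, n) for group, (s, n) in groups.items()}
        for treatment, groups in counts.items()
    }


def pooled_rates(counts):
    """Success rate for each treatment when the subgroups are pooled."""
    pooled = {}
    for treatment, groups in counts.items():
        successes = sum(s for s, _ in groups.values())
        trials = sum(n for _, n in groups.values())
        pooled[treatment] = Fraction(successes, trials)
    return pooled


def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: Pr(H|E) from Pr(H), Pr(E|H) and Pr(E|not H)."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))


# Simpson's paradox: A does better within each subgroup, B does better pooled.
by_group = subgroup_rates(counts)
pooled = pooled_rates(counts)
for treatment in counts:
    rates = {group: float(r) for group, r in by_group[treatment].items()}
    print(treatment, rates, "pooled:", float(pooled[treatment]))

# Prosecutor's fallacy, with hypothetical numbers: 1 guilty person among
# 10,000, and a test that matches an innocent person 1 time in 1,000 and
# always matches the guilty person.
p_match_given_innocent = Fraction(1, 1000)
p_innocent_given_match = posterior(
    prior=Fraction(9999, 10000),        # Pr(innocent)
    p_e_given_h=p_match_given_innocent,  # Pr(match | innocent)
    p_e_given_not_h=Fraction(1),         # Pr(match | guilty)
)
print("Pr(match | innocent) =", float(p_match_given_innocent))
print("Pr(innocent | match) =", float(p_innocent_given_match))
```

With these made-up numbers, the ranking of the two treatments flips depending on whether you look within subgroups or at the pooled data, and Pr(match | innocent) is far from Pr(innocent | match). Which comparison is the relevant one in a real study depends on the causal structure, not on the arithmetic.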

Recommended readings