In a previous post, I argued that people should not use pie charts. In this post I am going to make a similar case for word clouds. In short, I will argue that word clouds provide ‘foggy’ insights (pun intended). Specifically, I will discuss a word cloud that got a lot of attention in the […]
Category: statistics
Potpourri: Statistics #85
– Introduction to Data Science: Data Analysis and Prediction Algorithms with R – K-Nearest Neighbor (KNN) Explained – The Ultimate Guide to Deploying a Shiny App on AWS – Controlling for “X”? – The Bias Variance Tradeoff – Double Descent: A Visual Introduction, A Mathematical Explanation – Decision Trees – The Importance of Data Splitting […]
At what age are people considered old? #2
In 2018, I wrote a post about a data visualisation showing that the age you consider ‘old’ is 8 times the square root of your age. So if you are 25 you will consider 40 to be old. In brief, my main problem with the data visualisation was the low N (~200), i.e., the small […]
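To make the arithmetic in the excerpt above concrete, here is a minimal sketch of the rule as the excerpt states it (8 × √age). The function name `old_age_threshold` is hypothetical and only used for illustration; it is not taken from the original post or the visualisation.

```python
import math


def old_age_threshold(age: float) -> float:
    """Return the age considered 'old' under the rule 8 * sqrt(age).

    Hypothetical helper: it only restates the rule quoted in the excerpt.
    """
    return 8 * math.sqrt(age)


# The example from the excerpt: a 25-year-old considers 40 to be old.
print(old_age_threshold(25))  # 40.0
```

Under this rule the threshold equals your own age at 64, since 8 × √64 = 64.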
How to improve your figures #9: Use colourblind-friendly colours
Colours are often used in data visualisations to emphasise variation, ease interpretation and make them aesthetically pleasing. However, we do not all see colours in the same way and you should take this fact into account when you design figures. Specifically, as a start, you should make sure that everybody can actually see the variation […]
The first summer day of the year
On Twitter, DMI shared a figure with information on when, in each year from 2005 to 2022, we had the first summer day of the year (a maximum temperature above 25 °C). The figure can be seen here: I was by no means impressed by this figure and would even call it problematic. In general, I have a hard time with figures that convey […]
Potpourri: Statistics #84
– Introduktion til R – Improving Your Statistical Inferences – Dataviz Inspiration – A guide to modeling outcomes that have lots of zeros with Bayesian hurdle lognormal and hurdle Gaussian regression models – The Effects of Regularization and Data Augmentation are Class Dependent – latentcor: An R Package for estimating latent correlations from mixed data […]
Experiments and societal challenges
The randomised controlled trial is seen as the gold standard within the social sciences. I do not disagree. I love well-executed experiments with strong causal identification. If two studies, one experimental and one non-experimental, differ in their conclusions, there is no doubt which study I, all else equal, will side with. Most importantly, I often […]
70% of meetings keep employees from doing productive work
How can we measure whether a meeting is productive or not? And if we can, what is the percentage of meetings that keep employees from doing productive work? 10%? 15%? 80%? Did you know that 70% of meetings keep employees from doing productive work? Boro baba. In this post I will show that there is […]
Potpourri: Statistics #83
– A detailed guide to colors in data vis style guides – parlscot: An R package to download Scottish Parliamentary data – France 2022: How to predict an election – peacesciencer: Tools and Data for Quantitative Peace Science – Left-Right Placements of GB Westminster Constituencies in 2021 – Effects of Causes and Causes of Effects […]
Book Review: Covid by Numbers
Most people have their own personal stories to tell about the COVID-19 pandemic. The first encounter with the new virus, the experience of lockdowns (or lack thereof), (not) getting vaccinated, etc. We all have our own unique view on life during the pandemic. However, at the core of the pandemic was data. Statistics, numbers, graphs, […]