1470. D’ya Like DAGs? A Survey on Structure Learning and Causal Discovery
1471. Custom Discrete Color Scales for ggplot2
1472. A New Coefficient of Correlation
1473. Ten simple rules for teaching yourself R
1474. The case for variable labels in R
1475. Fooled by beautiful data: Visualization aesthetics bias trust in science, news, and social […]

# Category: statistics

### How to improve your figures #11: Do not repeat information

Do not repeat information in your visualisations. I repeat: Do not repeat information in your visualisations. Space is limited when you visualise data, and you need to make sure that every pixel is worth the space. If you repeat information in a figure, you are most likely using certain defaults and not repeating information consciously […]

### Uber and irresistibly interesting data

The other day, the Guardian reported that Uber paid Alan Krueger $100,000 for a study that was positive towards Uber. That the study was positive towards Uber should come as no surprise. You do not pay a lot of money for a study showing that your business is bad for the world. […]

### The liar paradox in self-reported survey data

The liar paradox is the logical paradox in the statement “I am lying”. If you are telling the truth about lying, are you then indeed lying? I have been thinking about this paradox and how it might also be relevant for survey research, with implications for our understanding of measurement error and the interpretation of […]

### Potpourri: Statistics #87

1420. Statistical Control Requires Causal Justification
1421. Graphic Design with ggplot2: How to Create Engaging and Complex Visualizations in R
1422. ggplot2 Wizardry: My Favorite Tricks and Secrets for Beautiful Plots in R
1423. Deep Learning of Potential Outcomes
1424. Six tips for better spreadsheets
1425. Principal Component Analysis (PCA) from Scratch
1426. US Election […]

### Introduction to R #2

In 2016, I taught a master's-level course at the University of Southern Denmark (Syddansk Universitet), where I recommended that the students learn R. R was not part of the course, but the students persuaded me to hold a workshop in which I introduced them to R. For that occasion, I prepared some teaching material in Danish that could make it as […]

### Tips and tricks for ggplot2

Here is a collection of 50+ tips and tricks for ggplot2. Do read them as suggestions, or rules of thumb, rather than principles and ideas that will generalise across contexts and presentation formats. Most of the tips and tricks are based on material that is also available via my GitHub repository awesome-ggplot2. Here we go: […]

### Potpourri: Statistics #86

1366. The Probability and Statistics Cookbook
1367. A Complete Introduction to R for Data Science
1368. One year in vis
1369. 10 Tips for Using Geolocation and Open Source Data to Fuel Investigations
1370. Using the terra R package to view, download and analyze Google Earth Engine Images
1371. Converting Between Currencies Using priceR
1372. […]

### How to improve your figures #10: Do not use word clouds

In a previous post, I argued that people should not use pie charts. In this post I am going to make a similar case for word clouds. In short, I will argue that word clouds provide ‘foggy’ insights (pun intended). Specifically, I will discuss a word cloud that got a lot of attention in the […]

### Potpourri: Statistics #85

1309. Introduction to Data Science: Data Analysis and Prediction Algorithms with R
1310. K-Nearest Neighbor (KNN) Explained
1311. The Ultimate Guide to Deploying a Shiny App on AWS
1312. Controlling for “X”?
1313. The Bias Variance Tradeoff
1314. Double Descent: A Visual Introduction, A Mathematical Explanation
1315. Decision Trees
1316. The Importance of Data Splitting […]