– Keep It Together: Using the tidyverse for machine learning
– Learn to purrr
– Mastering Shiny
– A Comprehensive List of Handy R Packages
– The challenges of using machine learning to identify gender in images
– How is polling done around the world?
– How to Get Better at Embracing Unknowns
– Drawing maps in R
– Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics
– Visualizing Locke and Mill: a tidytext analysis
– Tutorial: Cleaning UK Office for National Statistics data in R
– Transitioning into the tidyverse: part 1, part 2
– Your Friendly Guide to Colors in Data Visualisation
– Optimising your R code – a guided example
– Learning data visualization
– Reference Collection to push back against “Common Statistical Myths”
– mutate_all(), select_if(), summarise_at()… what’s the deal with scoped verbs?!
– Tools for Exploring and Comparing Data Frames
– Tom’s Cookbook for Better Viz
– Themes to Improve Your ggplot Figures
– Lesser Known R Features
– What Statistics Can and Can’t Tell Us About Ourselves
– A Graphical Introduction to tidyr’s pivot_*()
– n() cool #dplyr things
– Bayesian Linear Mixed Models: Random Intercepts, Slopes, and Missing Data
– Prepping data for #rstats #tidyverse and a priori planning
– NYT-style urban heat island maps
Different political science journals have different article formats with different word/page limits. Consequently, whenever you want to submit an article to a journal, the first thing to look up is the exact word limit.
In order to get a sense of the different article formats and word limits in political science journals, I have created an overview. The overview shows word limits for long articles, short articles and review essays/articles.
The overview currently consists of 65 journals, and I will most likely add more journals (and more features) in the future. Do reach out on Twitter or drop me an email if you have any feedback or if there is a specific journal of relevance to political scientists that I should add to the overview.
Lastly, the overview is sorted by impact factor (obtained with the excellent scholar package in R).
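As a minimal sketch of what such a sort can look like in base R: the column names, word limits, and impact factors below are purely illustrative placeholders, not the overview's actual data (the real impact factors come from the scholar package).

```r
# Hypothetical journal overview; the names, limits, and impact factors
# below are illustrative placeholders, not the overview's actual data.
overview <- data.frame(
  journal       = c("Journal A", "Journal B", "Journal C"),
  word_limit    = c(12000, 8000, 10000),
  impact_factor = c(2.1, 4.4, 3.0)
)

# Sort the rows by impact factor, highest first.
overview <- overview[order(-overview$impact_factor), ]
overview$journal
# "Journal B" "Journal C" "Journal A"
```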
1. Associations with cancer risk or benefits have been claimed for most food ingredients (Schoenfeld and Ioannidis 2013)
2. People in non-English speaking countries with subtitled TV are better at English than people in countries with dubbed television (Micola et al. 2019)
3. Walking speed is a function of city size in that pedestrians move more quickly in big cities than in small towns (Walmsley and Lewis 1989)
4. Littered cigarette filters reduce growth and alter short-term primary productivity of terrestrial plants (Green et al. 2019)
5. In soccer penalty kicks, goalkeepers almost always jump right or left (the optimal strategy is to stay in the goal’s center) (Bar-Eli et al. 2007)
6. Credit card payments increase unhealthy food purchases (Thomas et al. 2011)
7. Autocracies systematically build more new skyscrapers than democracies (Gjerløw and Knutsen 2019)
8. Bacteria persist more efficiently on laminated restaurant menus as compared to paper menus (Sirsat et al. 2013)
9. People view their own perceptions and beliefs as objective reflections of reality but others’ as distorted by bias (Pronin 2008)
10. The price of champagne falls before New Year’s Eve due to the entry of a large share of new consumers (Bayot and Caminade 2014)
11. In prison, inmates cooperate in Prisoner’s Dilemma (Khadjavi and Lange 2013)
12. The World Cup in soccer increases state aggression (Bertoli 2017)
13. Vaccines are not associated with autism (Taylor et al. 2014)
14. Open office spaces make workers rely more on email while decreasing face-to-face interaction (Bernstein and Turban 2018)
15. GDP data can be systematically manipulated for political ends (Wallace 2016)
16. In used-car transactions, there is left-digit bias in the processing of odometer values, i.e. people focus on the number’s leftmost digits, with implications for the sale price (Lacetera et al. 2012)
17. Thanksgiving dinners attended by residents from opposing political party precincts are 30 to 50 minutes shorter than same-party dinners (Chen and Rohla 2018)
18. Bronze medalists tend to be happier than silver medalists (Medvec et al. 1995)
19. Warming oceans are killing coral reefs (Hughes et al. 2018)
20. National trust levels are negatively associated with the length of countries’ constitutions (Bjørnskov and Voigt 2014)
21. In an experiment, increased sexual frequency did not lead to increased happiness (Loewenstein et al. 2015)
22. WTO membership is likely to have no causal effect on domestic corruption overall (if anything, it is likely to increase corrupt practices, particularly among firms that are government owned) (Choudhury 2019)
23. Walking is good for creative thinking (Oppezzo and Schwartz 2014)
In the current version, you will find an introduction to the basics of R (e.g. how to import and manipulate data), how to collect political data (primary and secondary data), how to visualise data, and a brief introduction to OLS regression.
The material relies – for the most part – on tools within the tidyverse (such as the ggplot2 package). In future versions, you will find additional techniques to analyse data, more on scoped verbs, functional programming tools, etc. Any suggestions, feedback, and comments are more than welcome.
(This is old news for people following me on Twitter.)
Boris Johnson writes in The Telegraph that since we could get to the Moon, we should be able to get out of the EU: “They went to the Moon 50 years ago. Surely today we can solve the logistical issues of the Irish border”.
I sympathise with the sentiment in the argument. A lot of smart people – including a lot of social scientists – are working on solving complicated and difficult social issues, and it should be possible to solve the issue at hand. And why not expect this when scientists can solve complicated issues in the natural sciences? After all, it’s not rocket science.
However, there are good reasons why we cannot simply solve social issues that – intuitively – should be easy to solve. In brief, social science is much more complicated than many of the problems we deal with in the natural sciences. We simply believe that we understand complex social phenomena when the truth is that we are not good at understanding and/or predicting such phenomena. Accordingly, when social scientists say that social science is not rocket science, we envy the simplicity of rocket science.
Duncan J. Watts describes this clearly in his book, Everything Is Obvious: How Common Sense Fails Us: “Well, I’m no rocket scientist, and I have immense respect for the people who can land a machine the size of a small car on another planet. But the sad fact is that we’re actually much better at planning the flight path of an interplanetary rocket than we are at managing the economy, merging two corporations, or even predicting how many copies of a book will sell.”
The problem is that Boris Johnson assumes that the ontological parsimony of the natural sciences applies just as easily to the social sciences. There are specific reasons why this perspective works only for the natural sciences and not for the social sciences. Seva Gunitsky (2019), for example, describes why ontological parsimony works in the natural sciences: “The scientific version of ontological parsimony, most often associated with theoretical physics and mathematics (but sometimes imported into social science), argues that reality itself is governed by parsimonious physical laws. The fundamental physical nature of matter itself, at least at the subatomic level, possesses a symmetry that abets and even demands parsimonious explanations. Parsimonious theories that take advantage of this symmetry are appealing not just because they are elegant, but because they are more likely to be true.”
However, we cannot draw on the simplicity of the natural sciences to infer the potential to identify and suggest solutions to social issues. In the social sciences, we do not have the luxury of studying parsimonious physical laws. On the contrary, the social world is much more complicated. Acknowledging this is important if we are to actually understand and solve social issues – including the logistical issues of the Irish border.
A new study, published in Science Advances, questions the validity of the Big Five personality traits outside of Western, educated, industrialized, rich and democratic (WEIRD) populations.
I was interviewed by New Scientist in order to give my take on the implications of the study. The article is available online.
Robert Klemmensen, Michael Baggesen Klitgaard and I have a new article in the May issue of the European Political Science Review. The article is titled ‘Bailout or bust? Government evaluations in the wake of a bailout‘. Here is the abstract:
Governments are often punished for negative events such as economic downturns and financial shocks. However, governments can address such shocks with salient policy responses that might mitigate public punishment. We use three high-quality nationally representative surveys collected around a key event in the history of the Dutch economy, namely the outbreak of the financial crisis in 2008, to examine how voters responded to a salient government bailout. The results illustrate that governments can get substantial credit for pursuing a bailout in the midst of a financial crisis. Future research should take salient policy responses into account to fully understand the public response to the outbreak of financial and economic crises.
– Hands-on Machine Learning with R
– The Truth About Linear Regression
– Data Viz Book Reviews
– Make Multi-point “dumbbell” Plots in ggplot2
– City Intelligence Data Design Guidelines
– shinyApp(), runApp(), shinyAppDir(), and a fourth option
– Reordering and facetting for ggplot2
– R Docker tutorial
– S4: a short guide for the perplexed
– Introducing trendyy
– R you ready to make charts?
– Ten more random useful things in R you may not know about
– Testing statistical software
– Tidy Data Science Workshop
– Using tidyverse tools with Pew Research Center survey data in R
– A Gentle Introduction to tidymodels
– Changing Glyph in legend in ggplot2
– Practical Data Science: an introduction to the PeerJ collection
Following my recent post on the number of opinion polls during the election campaign, I have talked to Fagbladet Journalisten about the quality of the coverage of opinion polls.
Among other things, I make the following point about opinion polls: “Opinion polls are not useless in themselves. They serve a purpose. But there is a natural limit to how much information you can actually get out of them.”
The article can be found here.
Before the election, Lisbeth Knudsen offered the following prophecy: “When the election is called, we will once again be flooded with opinion polls.”
Were we once again flooded with opinion polls? That is a difficult question to answer, as some would prefer a complete absence of opinion polls, making any number greater than zero too many.
We can, however, look at how many polls we got during this campaign and compare it with previous campaigns. Four years ago, I helped Politiken examine the number of polls in a historical context, which showed that we have been getting more polls over time.
Accordingly, a simple linear extrapolation would tell us to expect a sea of polls during this campaign if we were to top previous campaigns – and be flooded with polls.
Here I look at the three most recent general elections for which I have complete data: 2011, 2015 and 2019. Figure 1 shows the days covering the respective campaigns in 2011, 2015 and 2019, as well as the number of polls on each day.
Figure 1: The number of opinion polls during the election campaigns
The figure shows that in 2011 and 2015, there were days in the campaign with around 7-8 daily polls. We did not see this in 2019. In general, there were far more polls in 2011 and 2015 than in 2019. There is thus – relatively speaking – no evidence that we were flooded with polls in 2019.
Part of the explanation may be that the 2019 campaign was long. It would be excessive to have more than five daily polls over the course of a month. However, this cannot be the only explanation, as several media outlets announced before the campaign that they would not conduct many polls.
In 2011, we saw 114 polls during the campaign (5.7 polls per day), in 2015 we saw 129 polls (5.9 polls per day), and in 2019 we got a (relatively) modest 74 polls (2.5 polls per day).
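The daily averages can be reproduced with a few lines of R. Note that the campaign lengths used below are back-calculated approximations from the reported totals and daily averages, not official campaign lengths.

```r
# Polls per campaign; the day counts are approximations,
# back-calculated from the totals and daily averages reported above.
polls <- data.frame(
  year  = c(2011, 2015, 2019),
  total = c(114, 129, 74),
  days  = c(20, 22, 30)
)
polls$per_day <- round(polls$total / polls$days, 1)
polls$per_day
# 5.7 5.9 2.5
```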
It was especially YouGov and Voxmeter that, with close to daily polls, contributed the majority of the polls available during the campaign. The other polling firms conducted polls only a few times during the campaign, all of which adds up to an average of 2.5 daily polls.
As noted, what constitutes many or few polls can always be debated, but there is nothing to suggest that we were flooded with opinion polls during this campaign.