– Supervised vs. Unsupervised Learning: Exploring Brexit with PLS and PCA
– Sentiment Analysis of 5 popular romantic comedies
– How to justify your alpha: step by step
– Comprehensive list of color palettes available in R
– Visualizing Outliers
– Generating codebooks in R
– Introducing geofacet
– Causal Inference: The Mixtape
– Prime Hints For Running A Data Project In R
– Introducing Raincloud Plots!
– What to consider when creating choropleth maps
– Statistical vignette of the day as a teaching tool
– Recreate – Sankey flow chart
– Data: Sharing Is Caring
– ggplot2: How Geoms & Aesthetics ≈ Whipped Cream
– Animated population pyramids in R
There is a lot of new and interesting academic research coming out every day. Working papers, book chapters (you can usually ignore these), journal articles, books etc. So, how to stay up to date on all this new research? Here are my personal recommendations.
First and most importantly: Twitter. This is by far the easiest way to keep yourself updated. You don’t need to (re)tweet or in any other way engage in the conversations on Twitter, but you should at least have an account and follow your favourite scholars [1].
Luckily, it is impossible not to hear about new research from a person if you follow that person on Twitter. Furthermore, people are usually good at tweeting about interesting research similar to their own interests (which hopefully will overlap with your interests).
That also brings us to the challenge of using Twitter: information overload. The more people you follow on Twitter, the more difficult it is to ensure that you notice the tweets relevant to you. It is very easy to follow new people on Twitter. Good Twitter use is not about following as many researchers as possible but about optimizing the signal-to-noise ratio, i.e. seeing more relevant tweets and fewer irrelevant tweets.
I recommend that you do a mental cluster analysis and create (private) lists of people connected within their respective domains. For example, you can create lists of academics within different fields/topics (U.S. political scientists, European political scientists, open science, R, economists, psychologists, sociologists etc.).
While there is an overlap between the different lists, they can structure your Twitter use and make it easier to stay up to date on what is going on compared to one major feed with everybody, especially if you are offline or busy not being on Twitter for multiple days and eventually have to catch up. You can read more about lists on Twitter here.
Second, Google Scholar. An important feature of Google Scholar is that you can follow researchers, articles and keywords (so-called email alerts). If you follow a researcher on Google Scholar, you will get an email notification when that person has new research. You can also follow citations to that person, i.e. get email notifications about new research that cites work by the person.
Within any scientific subfield there is usually a review piece or two that everybody cites. It is a good idea to sign up for notifications for those articles so you get an email when new work cites them. Last, if you work with specific concepts, it is a good idea to follow the relevant keywords as well.
Third, journal RSS feeds. This was my main method for years, basically getting notifications about the most recent issue of a journal and/or articles available in advance/FirstView. I still follow the journals but it is getting less useful for three reasons. First, there is a long delay, so you have often seen the work months (if not years) in advance of the actual publication (especially if you use the two methods above). Second, there is an overlap with the above methods, so if anything relevant is coming out, you can be sure that it will reach your Twitter feed. Third, going back to the signal-to-noise ratio, the more generic journals you follow, the more irrelevant research will end up in your feed.
These are just a few of the ways in which you can find new research (again, my recommendations). If you want another example of how you can find new research in line with your interests, see this tweet from John B. Holbein (he usually tweets a lot of interesting political science research).
[1] If they are not on Twitter you should reconsider whether they are in fact your favourites.
Or, why is Rwanda doing better than Denmark?
In this post I outline basic methodological problems with The Global Gender Gap Report (the GGGR). The GGGR is developed by the World Economic Forum (WEF) and “benchmarks 144 countries on their progress towards gender parity across four thematic dimensions.”
Benchmarking 144 very different countries on their gender parity is a challenging task. Sadly, the report from the World Economic Forum is not doing a great job accommodating the challenges. The issues in the report are severe and the rankings should not be taken seriously. In short, the country rankings in the GGGR are misleading at best and completely meaningless at worst.
I will look at the most recent report from 2017 and illustrate some interrelated problems. There are other issues with the report, but below I touch upon some of the most important. For some of the other issues in the report, see my (and others') comments in this article (sorry, it is in Danish).
The GGGR measures the relative gaps between women and men across four thematic dimensions: health, education, economy and politics. Across the four dimensions, 13 out of the 14 variables are ratios.
For the subindex Health and Survival, the variables are 1) sex ratio at birth and 2) female healthy life expectancy (also as a ratio relative to the male value). This subindex will help us understand one of the main problems with the report, namely that it is not tapping into any meaningful gender gaps. Specifically, we will look at healthy life expectancy. This is a measure of “Average number of years that a person can expect to live in full health, calculated by taking into account years lived in less than full health due to disease and/or injury.”
Since men are doing exceptionally badly on the healthy life expectancy variable in Rwanda (with a value of 52.3), Rwanda gets a very good score on this variable, which contributes to its overall rank as number 4 in the Global Gender Gap Report. Figure 1 shows the 15 countries doing best on the gender parity list (notice Rwanda as number 4). The blue lines indicate the size of the gender gap.
Figure 1: Gender gap rankings, top 15 countries
The report is partially aware of this issue, as it states: “the Index is constructed to rank countries on their gender gaps not on their development level.” (p. 4). However, this is a serious problem, as developed countries are doing much better in terms of the gender gap in health and survival, yet this does not show in the rankings (on the contrary, countries are punished for this, cf. below).
In other words, the first key problem is that the index is not necessarily measuring progress towards gender parity.
The report argues that the “Index rewards countries that reach the point where outcomes for women equal those for men, but it neither rewards nor penalizes cases in which women are outperforming men in particular indicators in some countries.” (p. 5) However, this is simply not correct for the measure on healthy life expectancy.
If we take Rwanda in 2017 as an example, the healthy life expectancy for women is 60.8 whereas it is 52.3 for men (a difference of 8.5 years). This is a big gender gap but is rewarded by the Index as women are outperforming men (remember that Rwanda is number 1 on the subindex). If we then look at Denmark in 2017, the value for women is 72.3 and 70.0 for men (a difference of 2.3 years). This is punished by the Index with a rank of 104 to Denmark.
In other words, while the gender gap is obviously smaller in Denmark (2.3 years) than in Rwanda (8.5 years), Rwanda is getting a much better ranking on this specific variable (103 places better!). This contributes to a better overall ranking, as the Index rewards the gender gap in Rwanda (which is placed 4th overall) and punishes Denmark (which is placed 14th overall). Consequently, we cannot say anything about the overall gender gap in Rwanda or Denmark (or any other country, for that matter) by looking at the Index.
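To make the comparison concrete, here is a minimal Python sketch that computes both the absolute gap and the female-to-male ratio from the 2017 values quoted above. The function name and data layout are my own illustration, and the sketch assumes that a higher ratio is rewarded, as described in this post; it is not the report's own scoring code.

```python
# Healthy life expectancy in 2017 (years), as quoted in this post.
hle_2017 = {
    "Rwanda": {"women": 60.8, "men": 52.3},
    "Denmark": {"women": 72.3, "men": 70.0},
}


def gap_and_ratio(women, men):
    """Return the absolute gap (years) and the female-to-male ratio."""
    return round(women - men, 1), round(women / men, 2)


# Illustrative helper output: country -> (gap in years, ratio)
results = {
    country: gap_and_ratio(values["women"], values["men"])
    for country, values in hle_2017.items()
}

for country, (gap, ratio) in results.items():
    print(f"{country}: gap = {gap} years, female-to-male ratio = {ratio}")
```

Rwanda has both the larger absolute gap and the larger ratio (about 1.16 against about 1.03 for Denmark), so a measure that rewards a higher ratio will favour Rwanda here.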
When we combine these issues, we see that the Index, all else equal, directly rewards countries with low development. To illustrate this, let us compare Rwanda and Denmark in 2016. In Denmark, the gender gap in healthy life expectancy was 2 years, resulting in a female-to-male ratio of 1.03 (71 years/69 years). In Rwanda the gender gap was also 2 years, resulting in a female-to-male ratio of 1.04 (57 years/55 years).
As the Index is rewarding a greater ratio, lower development values are rewarded (i.e. lower healthy life expectancy). Consequently, since the gender gap was the same in Denmark and Rwanda in 2016, but Rwanda had a lower healthy life expectancy, Rwanda performed better on the Index (13 places better than Denmark). This problem becomes more and more serious when the overall level of development decreases and the gender gap increases.
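The mechanism can be sketched in a few lines of Python: hold the absolute gap fixed at 2 years and lower the male value, and the ratio rises. The first two levels are the 2016 values for Denmark and Rwanda quoted above; the third level (40 years) is a purely hypothetical value added for illustration, and the function name is my own.

```python
def female_to_male_ratio(men, gap):
    """Ratio of female to male healthy life expectancy for a given gap."""
    return (men + gap) / men


gap = 2  # same absolute gender gap (years) in every case

# Male healthy life expectancy: Denmark 2016, Rwanda 2016, hypothetical case.
male_levels = [69, 55, 40]

ratios = [round(female_to_male_ratio(men, gap), 2) for men in male_levels]

for men, ratio in zip(male_levels, ratios):
    print(f"men = {men}, women = {men + gap}, ratio = {ratio}")
```

With an identical gap, the ratio grows as the overall level falls, which is why a ratio-based measure favours the country with the lower healthy life expectancy.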
To show the implication of this, Figure 2 presents a list of the countries with the best ranking (number 1) in 2017 on health and survival. There is an interesting absence of developed Western countries. (But do note that even Syria is doing a top notch job in the GGGR when it comes to health and survival!)
Figure 2: Gender gap in health and survival, best countries
The nature of the problems makes it difficult to make comparisons between countries and use the rankings to say anything meaningful about what is going on in the individual countries over time. Accordingly, it is a bad measure for any meaningful policy discussion.
The World Economic Forum writes in the report: “The Global Gender Gap Index was first introduced by the World Economic Forum in 2006 as a framework for capturing the magnitude of gender-based disparities and tracking their progress over time.” (page vii)
However, the problem is that we cannot say anything about progress over time when we look at the Index! From 2016 to 2017, Rwanda went from being number 100 to number 1 in healthy life expectancy despite an increase in the gender gap.
Gender parity is an important topic and I am sure the World Economic Forum is doing a great job pushing this agenda and turning it into an even more salient issue. However, in the current setup with these measures, I see no reason to take the ranking seriously. Future reports will have to take the aspects discussed above into account before we might be able to compare gender parity across different countries.
Together with 87 other scientists, I am a co-author of a new paper in Nature Human Behaviour. The paper is titled Justify your alpha and the abstract is as follows:
In response to recommendations to redefine statistical significance to P ≤ 0.005, we propose that researchers should transparently report and justify all choices they make when designing a study, including the alpha level.
– Fundamentals of Data Visualization
– Analyze Survey Data for Free: Step by Step Instructions to Explore Public Microdata from an Easy to Type Website
– Webcast: Machine Learning and Econometrics
– Step-by-step introduction to quantitative text analysis using quanteda
– Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data (article based on two blog posts)
– Scraping Wikipedia Tables from Lists for Visualisation
– Visualizing Incomplete and Missing Data
My article, ‘Welfare Retrenchments and Government Support: Evidence from a Natural Experiment’, is now published in the European Sociological Review (vol. 34, no. 1). The abstract sums up the content of the article:
A large body of literature has provided mixed results on the impact of welfare retrenchments on government support. This article examines whether the impact of welfare retrenchments can be explained by proximity, i.e. whether or not the retrenched policy is related to people’s everyday lives. To overcome limitations in previous studies, the empirical approach utilizes a natural experiment with data from the European Social Survey collected concurrently with a salient retrenchment reform of the education grant system in Denmark. The results confirm that people proximate to a welfare policy react substantially stronger to retrenchment reforms than the general public. Robustness and placebo tests further show that the results are not caused by non-personal proximities or satisfaction levels not related to the reform and the government. In sum, the findings speak to a growing body of literature interested in the impact of government policies on mass public.
– Data organization in spreadsheets
– Statistical Computing for Scientists and Engineers
– Data science terminology
– Using Big Data to Solve Economic and Social Problems
– Decoding error messages in R
– Data Visualization for Political Scientists
– No. 1 — Introduction, and Averaging Polls of the 2016 U.S. Presidential Election
One of the last news stories of the past year was that a clear majority of Danes wanted the Queen to abdicate. The basis for this was an opinion poll conducted by the polling firm Wilke for Avisen.dk, which was picked up by various other news media.
There are several good reasons to be critical of this poll, which is why I was happy to comment on it for TjekDet.
What I did not know when I commented on the poll was that 21 percent of the respondents answered ‘don’t know’, and that these answers were excluded from the poll results. This only makes the points of criticism more relevant. Likewise, it is amusing to see how a misleading headline can be defended with the claim that one is “doing journalism”.