Erik Gahner Larsen

At what age are people considered old?

At what age are people described as being old? I saw the figure below getting a lot of attention on Twitter with the description: ‘As a kid my dad told me “The age you consider ‘old’ is the square-root of your age times 10”. At 9 you think 30 is old, at 16 you think 40, etc. Turns out he was wrong. It’s the square-root of your age times 8.’

That is a great fit, i.e. the blue and the orange lines overlap almost perfectly, but is it true? A follow-up tweet explains that the figure shows the answers from ~200 people. That is not a lot.

I always tell my students that they should never go out and collect low-quality data if they can download high-quality secondary data for free. Luckily, the European Social Survey (round 4) provides data on this exact question for more than 50,000 respondents. Using this data, I replicated the figure:

In this figure we do not see as great a fit as in the other figure. For 18-year-olds the average answer is 57 years. For 25-year-olds the average answer is 60. This is far from the estimates we get from the square root of the respondent’s age times 8.
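To make the mismatch concrete, here is a minimal sketch in Python comparing the two square-root rules with the ESS averages mentioned above (the values for ages 18 and 25 are the approximate averages from my figure):

```python
import math

# Approximate average answers from the ESS (round 4) figure above
ess_average_answer = {18: 57, 25: 60}

for age, observed in ess_average_answer.items():
    rule_10 = math.sqrt(age) * 10  # the dad's original rule
    rule_8 = math.sqrt(age) * 8    # the "corrected" rule from the tweet
    print(f"Age {age}: sqrt(age)*10 = {rule_10:.1f}, "
          f"sqrt(age)*8 = {rule_8:.1f}, ESS average = {observed}")

# Age 18: sqrt(age)*10 = 42.4, sqrt(age)*8 = 33.9, ESS average = 57
# Age 25: sqrt(age)*10 = 50.0, sqrt(age)*8 = 40.0, ESS average = 60
```

Neither rule comes close to the averages in the larger sample.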

The lesson? Do not overfit your model (especially not to your N≈200 sample).

Potpourri: Statistics #46

Why you should not trust the Facebook experiment

Recently, there has been a lot of focus on the implications of using Facebook. One study, “The Facebook Experiment: Quitting Facebook Leads to Higher Levels of Well-Being”, argues that people who leave Facebook feel better about their lives. Matthew Yglesias talks about the study in this clip from Vox:

The study also got some attention back in 2016 when it was published (see e.g. The Guardian). This is not surprising, as the study presents experimental evidence that people who were randomly assigned to not using Facebook reported higher well-being on a series of outcomes.

The only problem is that the study is fundamentally flawed.

The study finds that people who did not use Facebook for a week reported significantly higher levels of life satisfaction. The design relied on pre- and post-test measures from a control and a treatment group, where the treatment group did not use Facebook for a week. The problem – and the reason we should not believe the results – is that the people who took part in the study were aware of the purpose of the experiment and signed up with the aim of not using Facebook! In short, this biases the results and thereby has implications for the inferences made in the study. Specifically, we are unable to conclude whether the differences between the treatment and the control group are due to an effect of quitting Facebook or are an artifactual effect.

First, when respondents are aware of the purpose of the study, we face serious challenges with experimenter demand effects. People assigned to the treatment group will know that they are expected to show positive reactions to the treatment. In other words, there might not be a causal effect of not being on Facebook for a week, but simply an effect induced by the design of the study.

An example of the information available to the respondents prior to the experiment can be found in the nationwide coverage. The article (sorry – it’s in Danish) informs the reader that the researchers expect that using Facebook will have a negative impact on well-being.

Second, when people know what the experiment is about and sign up with the aim of not using Facebook, we should expect serious attrition bias, i.e. that people who are not assigned to their preferred treatment will drop out of the experiment. In other words, attrition bias arises when the loss of respondents is systematically correlated with the experimental conditions. This is also what we find in this case: people who got the information that they should continue to use Facebook dropped out of the study.

Figure 1 shows the number of subjects in each group before and after the randomisation in the Facebook experiment. In short, there was nontrivial attrition bias, i.e. people assigned to the control group were more likely to drop out of the study.

Figure 1: Attrition across conditions

The dashed line indicates the attrition bias. We can see that the control group is substantially smaller than the treatment group.
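To see why this matters, here is a minimal simulation (all numbers are hypothetical; the direction and size of the bias depend on who drops out). Even with zero true effect of quitting Facebook, dropout that is correlated with assignment leaves a control group that is both smaller and systematically different from the treatment group:

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical sign-ups: most want to quit Facebook, and (by assumption)
# those who want to quit are currently less satisfied with their lives.
people = []
for _ in range(10_000):
    wants_to_quit = random.random() < 0.7
    satisfaction = random.gauss(7, 1) - (1.5 if wants_to_quit else 0.0)
    people.append((wants_to_quit, satisfaction))

treatment, control = [], []
for wants_to_quit, satisfaction in people:
    if random.random() < 0.5:
        treatment.append(satisfaction)  # assigned to quit: everyone stays
    elif wants_to_quit and random.random() < 0.5:
        continue  # assigned to keep using Facebook against preference: drops out
    else:
        control.append(satisfaction)

# The true effect of quitting is zero by construction, yet the groups
# differ in both size and mean satisfaction after attrition.
print(f"n: {len(treatment)} (treatment) vs {len(control)} (control)")
print(f"mean: {mean(treatment):.2f} (treatment) vs {mean(control):.2f} (control)")
```

The point is not the direction of the difference but that, after differential attrition, any comparison between the groups mixes the effect of the treatment with the effect of who chose to stay.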

Third, when people sign up to an experiment with a specific purpose (i.e. not using Facebook), they will be less likely to comply with their assigned treatment status. This is also what we see in the study. Specifically, as is described in the paper: “in the control group, the participants’ Facebook use declined during the experiment from a level of 1 hour daily use before the experiment to a level of 45 minutes of daily Facebook use during the week of the experiment.” (p. 663)
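The quoted figures also allow a hedged back-of-the-envelope calculation. If we treat daily Facebook minutes as the ‘dose’ and assume, as the design intends, full abstention in the treatment group, the control group’s reduced use shrinks the realised contrast between the groups and attenuates whatever true effect there might be:

```python
# Intended contrast: control at 60 minutes/day vs treatment at 0 minutes/day
intended_contrast = 60 - 0
# Realised contrast, given the compliance figures quoted above (p. 663):
# the control group only used Facebook 45 minutes/day during the experiment
realised_contrast = 45 - 0
print(f"Realised contrast: {realised_contrast / intended_contrast:.0%} of intended")
# => 75%: a true per-minute effect would be diluted by about a quarter
```

This is a simplification (the effect of Facebook use need not be linear in minutes), but it illustrates that non-compliance alone muddies the interpretation of the group comparison.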

These issues are problematic, and I see no reason to believe any of the effects reported in the paper. When people sign up for an experiment with a preference for not being on Facebook, we cannot draw inferences beyond this sample and say anything about whether people in general will be more or less happy if they stop using Facebook.

Potpourri: Statistics #45

How I find new research

There is a lot of new and interesting academic research coming out every day. Working papers, book chapters (you can usually ignore these), journal articles, books etc. So, how do you stay up to date on all this new research? Here are my personal recommendations.

First and most importantly: Twitter. This is by far the easiest way to keep yourself updated. You don’t need to (re)tweet or in any other way engage in the conversations on Twitter, but you should at least have an account and follow your favourite scholars[1].

Luckily, it is impossible not to hear about new research from a person if you follow that person on Twitter. Furthermore, people are usually good at tweeting about interesting research similar to their own interests (which hopefully will overlap with your interests).

That also brings us to the challenge of using Twitter: information overload. The more people you follow on Twitter, the more difficult it is to ensure that you notice the tweets relevant to you. It is very easy to follow new people on Twitter. Good Twitter use is not about following as many researchers as possible but about optimizing the signal-to-noise ratio, i.e. seeing more relevant tweets and fewer irrelevant ones.

I can recommend that you do a mental cluster analysis and create (private) lists of people connected within their respective domains. For example, you can create lists of academics within different fields/topics (U.S. political scientists, European political scientists, open science, R, economists, psychologists, sociologists etc.).

While there is an overlap between the different lists, they can structure your Twitter use and make it easier to stay up to date on what is going on compared to one major feed with everybody, especially if you are offline or busy not being on Twitter for multiple days and eventually have to catch up. You can read more about lists on Twitter here.

Second, Google Scholar. An important feature of Google Scholar is that you can follow researchers, articles and keywords (so-called email alerts). If you follow a researcher on Google Scholar, you will get an email notification when the person has new research. You can also follow citations to that person, i.e. get email notifications about new research citing work by the person.

Within any scientific subfield there is usually a review piece or two that everybody cites. It is a good idea to sign up for notifications in relation to those articles so you get an email when there is new work citing them. Last, if you work with specific concepts, it is a good idea to follow such keywords as well.

Third, journal RSS feeds. This was my main method for years, basically getting notifications about the most recent issue of a journal and/or articles available in advance/FirstView. I still follow the journals, but this is getting less useful for three reasons. First, there is a heavy delay, so you will often have seen the work months (if not years) before the actual publication (especially if you use the two methods above). Second, there is an overlap with the above methods, so if anything relevant is coming out, you can be sure that it will reach your Twitter feed. Third, going back to the signal-to-noise ratio, the more generic journals you follow, the more irrelevant research will end up in your feed.

These are just a few of the ways in which you can find new research (again, my personal recommendations). If you want another example of how you can find new research in line with your interests, see this tweet from John B. Holbein (he usually tweets a lot of interesting political science research).

  1. If they are not on Twitter, you should reconsider whether they are in fact your favourites.

Problems with The Global Gender Gap Report

Or, why is Rwanda doing better than Denmark?

In this post I outline basic methodological problems with The Global Gender Gap Report (the GGGR). The GGGR is developed by the World Economic Forum (WEF) and “benchmarks 144 countries on their progress towards gender parity across four thematic dimensions.”

Benchmarking 144 very different countries on their gender parity is a challenging task. Sadly, the report from the World Economic Forum is not doing a great job accommodating the challenges. The issues in the report are severe and the rankings should not be taken seriously. In short, the country rankings in the GGGR are misleading at best and completely meaningless at worst.

I will look at the most recent report from 2017 and illustrate some interrelated problems. There are other issues with the report, but below I touch upon some of the most important. For some of the other issues, see my (and others’) comments in this article (sorry, it is in Danish).

The GGGR measures the relative gaps between women and men across four thematic dimensions: health, education, economy and politics. Across these four dimensions, 13 out of the 14 variables are ratios.

For the subindex Health and Survival, the variables are 1) sex ratio at birth and 2) female healthy life expectancy (also as a ratio relative to the male value). This subindex will help us understand one of the main problems with the report, namely that it is not tapping into any meaningful gender gaps. Specifically, we will look at healthy life expectancy. This is a measure of “Average number of years that a person can expect to live in full health, calculated by taking into account years lived in less than full health due to disease and/or injury.”

Since men are doing exceptionally badly on the healthy life expectancy variable in Rwanda (with a value of 52.3), Rwanda gets a very good score on this variable, and this affects its overall rank as number 4 in the Global Gender Gap Report. Figure 1 shows the top 15 countries on the gender parity list (notice Rwanda as number 4). The blue lines indicate the size of the gender gap.

Figure 1: Gender gap rankings, top 15 countries

The report is partially aware of this issue, as they write: “the Index is constructed to rank countries on their gender gaps not on their development level.” (p. 4) However, this is a serious problem: developed countries are doing much better in terms of the gender gap in health and survival, but this is not reflected in the rankings (on the contrary, countries are punished for it, cf. below).

In other words, the first key problem is that the index is not necessarily measuring progress towards gender parity.

The report argues that the “Index rewards countries that reach the point where outcomes for women equal those for men, but it neither rewards nor penalizes cases in which women are outperforming men in particular indicators in some countries.” (p. 5) However, this is simply not correct for the measure on healthy life expectancy.

If we take Rwanda in 2017 as an example, the healthy life expectancy for women is 60.8, whereas it is 52.3 for men (a difference of 8.5 years). This is a big gender gap, but it is rewarded by the Index because women are outperforming men (remember that Rwanda is number 1 on the subindex). If we then look at Denmark in 2017, the value for women is 72.3 and 70.0 for men (a difference of 2.3 years). This is punished by the Index with a rank of 104 for Denmark.

In other words, while the gender gap is obviously smaller in Denmark (2.3 years) than in Rwanda (8.5 years), Rwanda gets a much better ranking on this specific variable (103 places better!). This leads to a better overall ranking, as the Index rewards the gender gap in Rwanda (contributing to its overall rank of 4) and punishes Denmark with an overall rank of 14. Consequently, we cannot say anything about the overall gender gap in Rwanda and/or Denmark (or any other country, for that matter) by looking at the Index.

When we combine these issues, we see that the Index – all else equal – directly rewards countries with low development. To illustrate this, let us compare Rwanda and Denmark in 2016. In Denmark, the gender gap in healthy life expectancy was 2 years, resulting in a female-to-male ratio of 1.03 (71 years/69 years). In Rwanda, the gender gap was also 2 years, resulting in a female-to-male ratio of 1.04 (57 years/55 years).
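Since all the inputs are quoted above, the mechanics are easy to reproduce. Here is a minimal sketch in Python using the report’s female-to-male ratio construction (the rounding is mine):

```python
# Healthy life expectancy figures quoted above (from the 2016 and 2017 reports)
hle = {
    "Denmark 2016": {"female": 71.0, "male": 69.0},
    "Rwanda 2016":  {"female": 57.0, "male": 55.0},
    "Denmark 2017": {"female": 72.3, "male": 70.0},
    "Rwanda 2017":  {"female": 60.8, "male": 52.3},
}

for country, v in hle.items():
    gap = v["female"] - v["male"]
    ratio = v["female"] / v["male"]  # the report's female-to-male ratio
    print(f"{country}: gap = {gap:.1f} years, ratio = {ratio:.3f}")

# Denmark 2016: gap = 2.0 years, ratio = 1.029
# Rwanda 2016:  gap = 2.0 years, ratio = 1.036  <- same gap, higher ratio
# Denmark 2017: gap = 2.3 years, ratio = 1.033
# Rwanda 2017:  gap = 8.5 years, ratio = 1.163  <- larger gap, much higher ratio
```

With an identical absolute gap, the country with the lower life expectancy ends up with the higher (i.e. better-scoring) ratio.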

As the Index rewards a greater ratio, lower development values (i.e. lower healthy life expectancy) are rewarded. Consequently, since the gender gap was the same in Denmark and Rwanda in 2016, but Rwanda had a lower life expectancy, Rwanda performed better on the Index (13 places better than Denmark). This problem becomes more and more serious as the overall level of development decreases and the gender gap increases.

To show the implication of this, Figure 2 presents a list of the countries with the best ranking (number 1) in 2017 on health and survival. There is an interesting absence of developed Western countries. (But do note that even Syria is doing a top-notch job in the GGGR when it comes to health and survival!)

Figure 2: Gender gap in health and survival, best countries

The nature of these problems makes it difficult to compare countries and to use the rankings to say anything meaningful about what is going on in individual countries over time. Accordingly, the Index is a bad measure for any meaningful policy discussion.

The World Economic Forum writes in the report: “The Global Gender Gap Index was first introduced by the World Economic Forum in 2006 as a framework for capturing the magnitude of gender-based disparities and tracking their progress over time.” (page vii)

However, the problem is that we cannot say anything about progress over time when we look at the Index! From 2016 to 2017, Rwanda went from being number 100 to number 1 in healthy life expectancy despite an increase in the gender gap.

Gender parity is an important topic, and I am sure the World Economic Forum is doing a great job pushing this agenda and turning it into an even more salient issue. However, with the current setup and measures, I see no reason to take the rankings seriously. Future reports will have to take the aspects discussed above into account before we can compare gender parity across countries.

New paper in Nature Human Behaviour: Justify your alpha

Together with 87 other scientists, I am a co-author on a new paper in Nature Human Behaviour. The paper is titled Justify your alpha and the abstract is as follows:

In response to recommendations to redefine statistical significance to P ≤ 0.005, we propose that researchers should transparently report and justify all choices they make when designing a study, including the alpha level.

The paper can be found here and more information on the context for the project can be found here.

Potpourri: Statistics #44

New article in European Sociological Review: Welfare Retrenchments and Government Support

My article, ‘Welfare Retrenchments and Government Support: Evidence from a Natural Experiment’, is now published in the European Sociological Review (vol. 34, no. 1). The abstract sums up the content of the article:

A large body of literature has provided mixed results on the impact of welfare retrenchments on government support. This article examines whether the impact of welfare retrenchments can be explained by proximity, i.e. whether or not the retrenched policy is related to people’s everyday lives. To overcome limitations in previous studies, the empirical approach utilizes a natural experiment with data from the European Social Survey collected concurrently with a salient retrenchment reform of the education grant system in Denmark. The results confirm that people proximate to a welfare policy react substantially stronger to retrenchment reforms than the general public. Robustness and placebo tests further show that the results are not caused by non-personal proximities or satisfaction levels not related to the reform and the government. In sum, the findings speak to a growing body of literature interested in the impact of government policies on mass public.

The article is available as open access here. The replication material can be found at the Harvard Dataverse and at GitHub.