Erik Gahner Larsen

Potpourri: Statistics #45

How I find new research

There is a lot of new and interesting academic research coming out every day: working papers, book chapters (you can usually ignore these), journal articles, books, etc. So how do you stay up to date on all this new research? Here are my personal recommendations.

First and most importantly: Twitter. This is by far the easiest way to keep yourself updated. You don’t need to (re)tweet or in any other way engage in the conversations on Twitter, but you should at least have an account and follow your favourite scholars [1].

Luckily, it is impossible not to hear about new research from a person if you follow that person on Twitter. Furthermore, people are usually good at tweeting about interesting research similar to their own interests (which hopefully will overlap with your interests).

That also brings us to the challenge of using Twitter: information overload. The more people you follow on Twitter, the more difficult it is to ensure that you notice the tweets relevant to you. It is very easy to follow new people on Twitter. Good Twitter use is not about following as many researchers as possible but about optimizing the signal-to-noise ratio, i.e. seeing more relevant tweets and fewer irrelevant tweets.

I recommend that you do a mental cluster analysis and create (private) lists of people connected within their respective domains. For example, you can create lists of academics within different fields/topics (U.S. political scientists, European political scientists, open science, R, economists, psychologists, sociologists etc.).

While there is an overlap between the different lists, they can structure your Twitter use and make it easier to stay up to date compared to one major feed with everybody, especially if you have been offline or away from Twitter for multiple days and eventually have to catch up. You can read more about lists on Twitter here.

Second, Google Scholar. An important feature of Google Scholar is that you can follow researchers, articles and keywords (so-called email alerts). If you follow a researcher on Google Scholar, you will get an email notification when the person has new research. You can also follow citations to that person, i.e. get email notifications about new research that cites work by the person.

Within any scientific subfield there is usually a review piece or two that everybody cites. It is a good idea to sign up for notifications for those articles so you get an email when there is new work that cites them. Last, if you work with specific concepts, it is a good idea to follow such keywords as well.

Third, journal RSS feeds. This was my main method for years, basically getting notifications about the most recent issue of a journal and/or articles available in advance/FirstView. I still follow the journals but it is getting less useful for three reasons. First, there is a heavy delay, so you have often seen the work months (if not years) in advance of the actual publication (especially if you use the two methods above). Second, there is an overlap with the above methods, so if anything relevant is coming out, you can be sure that it will reach your Twitter feed. Third, going back to the signal-to-noise ratio, the more generic journals you follow, the more irrelevant research will end up in your feed.

These are just a few of the ways in which you can find new research (again, my recommendations). If you want another example of how you can find new research in line with your interests, see this tweet from John B. Holbein (he usually tweets a lot of interesting political science research).

  1. If they are not on Twitter you should reconsider whether they are in fact your favourites.

Problems with The Global Gender Gap Report

Or, why is Rwanda doing better than Denmark?

In this post I outline basic methodological problems with The Global Gender Gap Report (the GGGR). The GGGR is developed by the World Economic Forum (WEF) and “benchmarks 144 countries on their progress towards gender parity across four thematic dimensions.”

Benchmarking 144 very different countries on their gender parity is a challenging task. Sadly, the report from the World Economic Forum does not do a great job of handling these challenges. The issues in the report are severe and the rankings should not be taken seriously. In short, the country rankings in the GGGR are misleading at best and completely meaningless at worst.

I will look at the most recent report from 2017 and illustrate some interrelated problems. There are other issues with the report but below I touch upon some of the most important. For some of the other issues in the report, see my (and others’) comments in this article (sorry, it is in Danish).

The GGGR measures the relative gaps between women and men across four thematic dimensions: health, education, economy and politics. Across the four dimensions, 13 out of the 14 variables are ratios.

For the subindex Health and Survival, the variables are 1) sex ratio at birth and 2) female healthy life expectancy (also as a ratio relative to the male value). This subindex will help us understand one of the main problems with the report, namely that it is not tapping into any meaningful gender gaps. Specifically, we will look at healthy life expectancy. This is a measure of “Average number of years that a person can expect to live in full health, calculated by taking into account years lived in less than full health due to disease and/or injury.”

Since men are doing exceptionally badly on the healthy life expectancy variable in Rwanda (with a value of 52.3), Rwanda gets a very good score on this variable, and this affects its overall rank as number 4 in the Global Gender Gap Report. Figure 1 shows the top 15 countries doing best on the gender parity list (notice Rwanda as number 4). The blue lines indicate the size of the gender gap.

Figure 1: Gender gap rankings, top 15 countries

The report is partially aware of this issue, as they write: “the Index is constructed to rank countries on their gender gaps not on their development level.” (p. 4). However, this is a serious problem as developed countries are doing much better in terms of the gender gap in health and survival, but this is not to be seen in the rankings (on the contrary, countries are punished for this, cf. below).

In other words, the first key problem is that the index is not necessarily measuring progress towards gender parity.

The report argues that the “Index rewards countries that reach the point where outcomes for women equal those for men, but it neither rewards nor penalizes cases in which women are outperforming men in particular indicators in some countries.” (p. 5) However, this is simply not correct for the measure on healthy life expectancy.

If we take Rwanda in 2017 as an example, the healthy life expectancy for women is 60.8 whereas it is 52.3 for men (a difference of 8.5 years). This is a big gender gap but is rewarded by the Index as women are outperforming men (remember that Rwanda is number 1 on the subindex). If we then look at Denmark in 2017, the value is 72.3 for women and 70.0 for men (a difference of 2.3 years). This is punished by the Index, which gives Denmark a rank of 104.

In other words, while the gender gap is obviously smaller in Denmark (2.3 years) than in Rwanda (8.5 years), Rwanda is getting a much better ranking on the specific variable (103 places better!). This leads to a better overall ranking as the Index rewards the gender gap in Rwanda (leading to an overall placement of 4 in the ranking system) and punishes Denmark with an overall rank of 14. Consequently, we cannot say anything about the overall gender gap in Rwanda and/or Denmark (or any other country for that matter) by looking at the Index.

When we combine these issues, we see that the Index, all else equal, directly rewards countries with low development. To illustrate this, let us compare Rwanda and Denmark in 2016. In Denmark, the gender gap in healthy life expectancy was 2 years, resulting in a female-to-male ratio of 1.03 (71 years/69 years). In Rwanda the gender gap was also 2 years, resulting in a female-to-male ratio of 1.04 (57 years/55 years).

As the Index rewards a greater ratio, lower development values (i.e. lower healthy life expectancy) are rewarded. Consequently, since the gender gap was the same in Denmark and Rwanda in 2016 but Rwanda had a lower healthy life expectancy, Rwanda performed better on the Index (13 places better than Denmark). This problem becomes more serious as the overall level of development decreases and the gender gap increases.
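The arithmetic behind this can be checked with a short sketch. It uses the 2016 healthy life expectancy values quoted above; the function name `female_to_male_ratio` and the list of male values in the loop are illustrative assumptions and not part of the report's own methodology.

```python
def female_to_male_ratio(female: float, male: float) -> float:
    """Return the female-to-male ratio for a given variable."""
    return female / male


# 2016 healthy life expectancy values quoted in the text (years).
denmark = female_to_male_ratio(71, 69)
rwanda = female_to_male_ratio(57, 55)
print(f"Denmark: {denmark:.2f}")
print(f"Rwanda:  {rwanda:.2f}")

# HYPOTHETICAL male values: keep the absolute gap fixed at 2 years and
# lower the overall level to see how the ratio responds.
gap = 2
male_values = [75, 65, 55, 45]
ratios = [female_to_male_ratio(male + gap, male) for male in male_values]
for male, ratio in zip(male_values, ratios):
    print(f"male = {male}, female = {male + gap}, ratio = {ratio:.3f}")
```

With the absolute gap held constant, the ratio grows as the level falls, which is why a ratio-based variable can favour a country with lower healthy life expectancy.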

To show the implication of this, Figure 2 presents a list of the countries with the best ranking (number 1) in 2017 on health and survival. There is an interesting absence of developed Western countries. (But do note that even Syria is doing a top-notch job in the GGGR when it comes to health and survival!)

Figure 2: Gender gap in health and survival, best countries

The nature of the problems makes it difficult to make comparisons between countries and use the rankings to say anything meaningful about what is going on in the individual countries over time. Accordingly, it is a bad measure for any meaningful policy discussion.

The World Economic Forum writes in the report: “The Global Gender Gap Index was first introduced by the World Economic Forum in 2006 as a framework for capturing the magnitude of gender-based disparities and tracking their progress over time.” (page vii)

However, the problem is that we cannot say anything about progress over time when we look at the Index! From 2016 to 2017, Rwanda went from being number 100 to number 1 in healthy life expectancy despite an increase in the gender gap.

Gender parity is an important topic and I am sure the World Economic Forum is doing a great job pushing this agenda and turning it into an even more salient issue. However, in the current setup with these measures, I see no reason to take the ranking seriously. Future reports will have to take the aspects discussed above into account before we might be able to compare gender parity across different countries.

New paper in Nature Human Behaviour: Justify your alpha

Together with 87 other scientists I am a co-author of a new paper in Nature Human Behaviour. The paper is titled Justify your alpha and the abstract is as follows:

In response to recommendations to redefine statistical significance to P ≤ 0.005, we propose that researchers should transparently report and justify all choices they make when designing a study, including the alpha level.

The paper can be found here and more information on the context for the project can be found here.

Potpourri: Statistics #44

New article in European Sociological Review: Welfare Retrenchments and Government Support

My article, ‘Welfare Retrenchments and Government Support: Evidence from a Natural Experiment’, is now published in the European Sociological Review (vol. 34, no. 1). The abstract sums up the content of the article:

A large body of literature has provided mixed results on the impact of welfare retrenchments on government support. This article examines whether the impact of welfare retrenchments can be explained by proximity, i.e. whether or not the retrenched policy is related to people’s everyday lives. To overcome limitations in previous studies, the empirical approach utilizes a natural experiment with data from the European Social Survey collected concurrently with a salient retrenchment reform of the education grant system in Denmark. The results confirm that people proximate to a welfare policy react substantially stronger to retrenchment reforms than the general public. Robustness and placebo tests further show that the results are not caused by non-personal proximities or satisfaction levels not related to the reform and the government. In sum, the findings speak to a growing body of literature interested in the impact of government policies on mass public.

The article is available as open access here. The replication material can be found at the Harvard Dataverse and at GitHub.

Should the Queen abdicate?

One of the last news stories of the past year was that a clear majority of Danes wanted the Queen to abdicate. The story was based on an opinion poll conducted by the polling firm Wilke for a news outlet, which was then picked up by various other news media.

There are several good reasons to be critical of this poll, which is why I was happy to comment on it for TjekDet.

What I did not know when I commented on the poll was that 21 percent of the respondents answered ‘don’t know’ and were excluded from the reported results. This only makes the criticisms more relevant. It is also amusing to see how a misleading headline can be defended by saying that one is “doing journalism”.
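To see why excluding the ‘don’t know’ answers matters, here is a minimal sketch of the rebasing arithmetic. Only the 21 percent figure comes from the poll as described above; the function name `share_of_all` and the 60 percent figure are hypothetical and used purely for illustration.

```python
DONT_KNOW_SHARE = 0.21  # share answering 'don't know', as stated in the text


def share_of_all(share_among_decided: float,
                 dont_know: float = DONT_KNOW_SHARE) -> float:
    """Convert a share among respondents with an opinion
    into a share of all respondents."""
    return share_among_decided * (1 - dont_know)


# Share among respondents with an opinion that corresponds to exactly
# 50 percent of ALL respondents once 'don't know' answers are counted.
majority_threshold = 0.5 / (1 - DONT_KNOW_SHARE)
print(f"Needed among those with an opinion: {majority_threshold:.1%}")

# HYPOTHETICAL reported figure (not taken from the poll).
hypothetical_reported = 0.60
print(f"Share of all respondents: {share_of_all(hypothetical_reported):.1%}")
```

Under these assumptions, a result reported as a majority among those with an opinion only corresponds to a majority of all respondents if it exceeds the threshold printed first.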

Should the media report methodological information when covering opinion polls?

News articles about opinion polls often tell stories rooted in random noise, make absurd interpretations based on misleading question wording, “forget” to disclose who paid for the polls, and so on.

I have therefore argued time and again that the media should report methodological details, as these are crucial for assessing how good the coverage of a poll is. In other words, when methodological information is missing, we are unable to assess the quality of a poll.

In 2011 I decided, together with a good friend, to collect a large number of media articles and examine how good the media were at reporting methodological details. The motivation was the limited systematic knowledge about this in a Danish context, but also a frustration with what we saw as the media's inadequate coverage of methodological information.

Based on previous studies, we chose to focus on specific aspects, including whether the question wording, the sample size and the statistical uncertainty were reported. Overall, the results confirmed our expectations and were published in Tidsskriftet Politik.

Although I find methodological information relevant in most contexts, I am not an uncritical proponent of simply reporting as much methodological information as possible. In this post I will therefore do what I can to reduce the relevance of our aforementioned study. Or at least touch upon some of the caveats that are important to keep in mind.

First, space in news articles is limited. AAPOR, for example, operates with more than 10 methodological details that should be reported, and there will be cases where space does not allow for that many. Space constraints are less of a concern for online articles, but one should nonetheless be aware that there are simply natural limits to how long stories about polls can be.

Second, not all methodological information is equally relevant. What is relevant in one context can be close to irrelevant in another. If a poll is about vote choice, for example, the exact question wording is usually not decisive, whereas the wording in a poll about attitudes towards a specific political issue is highly relevant, and in many cases decisive for the answers one gets.

Third, reporting many methodological details can lead the reader to remember less from an article about a poll. It can therefore make sense to recommend that journalists do not strive to report a double-digit number of methodological details, but instead consider which details are relevant in the given context.

Fourth, it is not a given that methodological information helps readers understand polls. A reader may well remember what the statistical uncertainty of a poll is, but this does not mean that the reader understands exactly what the statistical uncertainty is and how it should be interpreted. Methodological information can therefore often not stand alone. Some information may be necessary, but it is rarely sufficient.

Fifth, other factors affect how readers judge the credibility of polls. We should therefore not consider methodological information in isolation. An American study shows that citizens are more inclined to find a poll reliable if it is in line with their own political convictions, and reporting methodological information makes no difference in this regard. The gains from reporting methodological information are thus probably smaller than we have assumed.

All of this means that the reporting of methodological information cannot stand alone. It is more important to focus on whether the narratives journalists construct are consistent with the polls being reported than to count how many methodological aspects are reported. There can thus be cases where two or three methodological details are all that is needed, and where extra details are added at the expense of other information and the reader's experience.

When we conducted our study in 2011, we found numerous examples of news stories with an explicit discrepancy between the methodological details and the article itself. Example 1: “However, all shifts are within the poll's statistical uncertainty of 2.8 percent.” Example 2: “But even though S goes from 25 pct. of the votes yesterday to 26.5 in today's poll, one should note that the movement is within the statistical uncertainty that the sample survey operates within.” Example 3: “The movements are within the statistical uncertainty.”

In these cases the statistical uncertainty was mentioned, but the coverage was not good. We can care as much as we like about how many methodological details are reported, but if we ultimately do not have coverage that takes them seriously, we have far greater problems. These are issues that Yosef Bhatti and Rasmus Tue Pedersen address in their study of how polls are reported in relation to statistical uncertainty.
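As a rough illustration of what “within the statistical uncertainty” means for the move from 25 to 26.5 percent in Example 2, the sketch below computes a 95 percent margin of error. The sample size of 1,000 is an assumption (the quoted articles do not report it), as are simple random sampling, the normal approximation and the function names.

```python
import math

Z_95 = 1.96  # critical value for a 95% confidence level (normal approximation)


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Margin of error for one proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


def margin_of_error_change(p1: float, n1: int, p2: float, n2: int,
                           z: float = Z_95) -> float:
    """Margin of error for the difference between two independent polls."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


n = 1000  # ASSUMPTION: sample size is not reported in the quoted articles
before, after = 0.25, 0.265  # shares from Example 2

change = after - before
moe_single = margin_of_error(before, n)
moe_change = margin_of_error_change(before, n, after, n)
within_noise = abs(change) <= moe_change

print(f"Margin of error, single poll: {moe_single:.1%}")
print(f"Margin of error, change:      {moe_change:.1%}")
print(f"Change of {change:.1%} within the uncertainty: {within_noise}")
```

Note that the uncertainty of a change between two polls is larger than the uncertainty of either poll on its own, so a headline built on such a change needs an even bigger movement to be justified.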

My view is that journalists are in most cases not qualified to assess which details are relevant. It is therefore often arbitrary whether methodological information is reported, and if so which. As a rule of thumb, the media should report methodological information, but more information is not always better, and in the worst case it shifts the focus away from which methodological details are important and how they are used.

Did the local elections rub off on support for the parties?

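In the wake of the local elections, several articles have claimed that the local elections have rubbed off on the parties' national support. We have seen stories such as Kommunalvalgssejr smitter af på Mette Frederiksen (“Local election victory rubs off on Mette Frederiksen”), Dansk Folkepartis lokale lussing giver ekko på landsplan (“The Danish People's Party's local slap echoes at the national level”) and Dansk Folkeparti taber terræn – S i medvind (“The Danish People's Party loses ground, S has the wind at its back”).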

What these articles have in common is that they all rely on individual polls (the first two are both based on numbers from Megafon). I therefore decided to take a closer look at the polls from the polling firms for which we have a poll both before and after the local elections. This is the case for five firms: Greens, Epinion, Gallup, Voxmeter and Megafon.

Figure 1 shows the difference in support for the respective parties in the polls from the different firms before and after the local elections. The x-axis shows the difference between this year's local elections (KV17) and the previous local elections (KV13). The y-axis shows the difference between the poll before and the poll after the local elections (where positive values indicate gains).

Figure 1: Support before and after the local elections

What can we say based on these numbers? The most striking feature is that most changes from one poll to the next are small and not statistically significant. There are individual polls that give the Social Democrats a large gain, but here the following caveats should be noted: the pre-election poll from Greens is from early October (almost two months before the local elections), and the other one is from Megafon.

By and large, there are no strong signs that the local elections led to large changes in the parties' national support from one poll to the next. Most changes are small and simply call for keeping the statistical uncertainty in mind. This is of course not the same as saying that the local elections (and the coverage of them) cannot matter in the longer run, but one should be very careful about concluding that the local elections have rubbed off on the parties' national support.