“This letter closes the question”

Here is the beginning of an abstract from a recent study published in the American Political Science Review:

Larger protests are more likely to lead to policy changes than small ones are, but whether or not attendance estimates provided in news or generated from social media are biased is an open question. This letter closes the question: news and geolocated social media data generate accurate estimates of protest size variation.

I disagree from the bottom of my heart. The question is not closed. I don’t understand why researchers need to mislead in the abstract in order to make the research stand out.

Let me make a guess. It’s because the editor, reviewers and authors all know that this can be one of the studies people will cite uncritically (whether the protest size data they use are reliable or not). “We thank Reviewer 1 for paying attention to this issue. We have consulted the literature and we now provide a reference to a study showing that social media data accurately measure protest size variation”. I mean, it makes sense if you read the title of the paper: “News and Geolocated Social Media Accurately Measure Protest Size Variation”. My prediction is that we will see a lot of research where news and social media data will be used to measure protest size with no further validation (beyond a reference to this study).

Is that a problem? The research is strong and I have no reason to be concerned with the quality of the work (I also admire the work of some of the researchers mentioned in the ‘Acknowledgement’, so I assume it’s good). I believe there is a problem, however, in how researchers try to sell their findings as closing a debate that has only just started.

The study in question looks at one salient protest type (Women’s March protest) in one country (the US). The (peaceful) Women’s March protest is not necessarily representative of most protests we see around the world (or representative of the sociodemographic composition of protest participants in general) – and the researchers know that. Actually, they are explicit about this towards the end of the paper: “Since this paper’s validation has only been tested on one event, the scope to which it holds remains to be tested. The results probably hold in other wealthy democracies, though for now that claim remains an assumption.” Why not add that to the abstract? Seriously. “The results probably hold in other wealthy democracies, though for now that claim remains an assumption”.

If that claim remains an assumption, why then close the question in the abstract? I am tired of how researchers are pandering to the intellectual laziness of their colleagues. Why not be honest and just say that the study adds evidence to the matter? And why do I bother writing a blog post about a single paragraph in a paper? Because I believe we should care about how we communicate research findings – and in particular what can be learned from a single study.

25 interesting facts #5

101. Claude Monet’s late works were the result of cataracts and not conscious experimentation with a more expressionistic style (Gruener 2015)

102. There are about 40,000 virus species in mammals, including ~10,000 viruses with zoonotic potential (Carlson et al. 2019)

103. People mispredict the time course of their own creativity (Lucas and Nordgren 2020)

104. Attractive individuals are more likely to identify as conservative (Peterson and Palmer 2017)

105. Political leaders who experienced economic hardship in their youth increase social welfare spending when in power (Han and Han 2020)

106. Digital technology explains less than 0.4% of the variation in well-being among adolescents (Orben and Przybylski 2019)

107. Exposure to elite discourse about fake news leads to less accurate identification of real news (Van Duyn and Collier 2019)

108. The arrival of children creates a long-run gender gap in earnings (Kleven et al. 2019)

109. Proximity to fast food has significant deleterious effects on children’s weight outcomes (Han et al. 2020)

110. Conservatives are more likely to express concern about climate change when framed as a national security issue (Motta et al. 2020)

111. Access to an uncensored internet does not induce people to acquire politically sensitive information (Chen and Yang 2019)

112. Voluntary environmental action induces positive reactions among citizens towards individual firms (Kolcava et al. 2020)

113. More films with US presidents are released during Democratic administrations than during Republican administrations (Uscinski 2009)

114. There is no effect of oxytocin on trusting behaviour (Declerck et al. 2020)

115. Lacking household sanitation is associated with women’s risk of non-partner sexual violence (Jadhav et al. 2016)

116. Voters evaluate female party leaders more positively than male party leaders (Bridgewater and Nagel 2020)

117. Bronze medalists tend to be happier than silver medalists (Husted et al. 1995; Hedgcock et al. 2020)

118. Blood alcohol concentration predicts utilitarian responses in moral dilemmas (Duke and Bègue 2015)

119. Adoption of copyrights within Italy increased the number and the quality of operas (Giorcelli and Moser 2020)

120. When consuming Coca Cola and Pepsi, people are influenced more by the label than the actual taste (Woolfolk et al. 1983)

121. In Spain, the more foreign players in the club winning, the more favourable the views of immigrants in the club’s region (Lago and Lago-Peñas 2020)

122. The presence of African American soldiers in the U.K. during World War II reduced anti-minority prejudice (Schindler and Westcott 2020)

123. In New York and London, individuals in high-income occupations are concentrating in the city centre (van Ham et al. 2020)

124. In speed dating, male selectivity is invariant to group size, while female selectivity is strongly increasing in group size (Fisman et al. 2006)

125. People who are unaware of the price do not derive more enjoyment from more expensive wine (Goldstein et al. 2008)

‘A Diamond is Forever’ and Other Fairy Tales

One of the most popular studies on the Social Science Research Network (SSRN) is a study published in Economic Inquiry in 2015. The abstract has been viewed more than half a million times and the study has been downloaded ~80,000 times. These are impressive numbers. The study is in the top 10 for all-time downloads on SSRN.

Why do people share and like this study? Because there is a certain irony to the main finding, namely that the more money people spend on an engagement ring, the less likely it is that their marriage will last. Unsurprisingly, this study was also picked up by the popular press (e.g. here, here and here).

The title of the study is ‘“A diamond is forever” and other fairy tales: The relationship between wedding expenses and marriage duration’. Here is the full abstract of the study:

In this study, we evaluate the association between wedding spending and marriage duration using data from a survey of more than 3,000 ever-married persons in the United States. Controlling for a number of demographic and relationship characteristics, we find evidence that marriage duration is inversely associated with spending on the engagement ring and wedding ceremony.

With the impressive number of downloads of the study, I am surprised that I haven’t seen more critical reflections on the study. I decided to take a closer look at the study and I am now even more surprised by all the attention that the study has received. Let’s try to understand what the study is about.

The basic idea of the study is to examine whether people with a more expensive engagement ring are more likely to get divorced (the outcome of the study is marital dissolution). I applaud the authors for taking on such a difficult task, as this is one of the questions we cannot study in a randomised experiment. Specifically, we cannot sample a lot of people who are about to get married and assign half of the sample to a cheap engagement ring and the other half to an expensive engagement ring.

More importantly, when we study a question like this, we need to be very much concerned about whether people who are divorced and have an expensive engagement ring are more likely to take part in a survey for money than people who are still married with an expensive engagement ring.

The sample composition is, from what I can see, one of the primary limitations of the study. Specifically, the study uses a sample of people from MTurk, which is not a nationally representative sample. On the contrary, the study relies on a lot of participants who were offered $0.50–$0.75 to answer approximately 40 questions. This sample is not sufficient to say anything meaningful about the impact of an expensive engagement ring on marriage duration. Importantly, the authors are aware of this limitation on page 1920 in the study: “Of course, for our purposes, a large national probability sample would be preferable.” Of course.

There is a much more severe problem with the study: even if we assume that the design is strong, the results give no reason to believe that people with a more expensive engagement ring are more likely to get divorced.

In brief, the authors conduct a survival analysis (a Cox proportional hazards model). Take a look at this description from the paper (page 1922) and see if you can get a sense of what some of the issues are:

In bivariate regressions, having no engagement ring and having wedding expenses below $1,000 are each associated with increases in the hazard of divorce, while spending $8,000 or more on an engagement ring is associated with a decrease in the hazard of divorce. This appears to be consistent with the relationship between wedding expenses and marriage duration posited by wedding industry advertising. However, the picture changes in multivariate regressions. Spending between $2,000 and $4,000 on an engagement ring is significantly associated with an increase in the hazard of divorce among the sample of men. Specifically, in the sample of men, spending between $2,000 and $4,000 on an engagement ring is associated with a 1.3 times greater hazard of divorce as compared with spending between $500 and $2,000.

There is a lot of stuff going on in the text above. The first problem is that the bivariate models actually suggest the opposite of what the authors argue in the paper. The consistent finding in the bivariate models is that people who got engaged without an engagement ring are significantly more likely to be divorced. And people with an expensive engagement ring are more likely to stay married.

So what about the multiple regressions? There are two issues here. First, and least problematic, we are dealing with a generous portion of regression soup. Here is a list of the covariates in the model: age, marriage age, gender, race/ethnicity, education, employment, household income, region of residence, religious attendance, respondent-spouse differences (age difference, race difference, education difference), children with spouse, knew spouse very well, length of time dated before proposal, feelings and attitudes at time of proposal, had a honeymoon, wedding attendance, and total wedding expenses. One problem here is that some of these variables can induce post-treatment bias, and I see no reason to be much more confident in the impact of an expensive engagement ring in the multiple regressions compared to the bivariate regression.

Second, even if we assume that none of the above is problematic, there is no pattern in the multiple regressions suggesting that people with a more expensive engagement ring are more likely to get a divorce. Take a look at the regression coefficients for engagement ring expenses in Table 2 (to keep it simple, I do not show all the other variables in the models):

It is a little difficult to understand the coefficients as the reference category is “$500 to $2,000”. In the bivariate model we can see, as stated above, that the more expensive the proposer’s engagement ring, the more likely that the respondent is still married. If we look at the multiple models for all respondents, we see no significant effects. In other words, no indication at all that “marriage duration is inversely associated with spending on the engagement ring” (as claimed in the abstract).

Let’s take another look at what the authors say they find with the table above in mind: “in the sample of men, spending between $2,000 and $4,000 on an engagement ring is associated with a 1.3 times greater hazard of divorce as compared with spending between $500 and $2,000”. In the sample of men. Of course! Between $2,000 and $4,000. Of course! There is one statistically significant regression coefficient and nine statistically non-significant ones (not counting the reference category and the coefficients for “Don’t know”). Again, if anything, the models suggest that having spent more money on an engagement ring correlates with a longer marriage – not shorter.
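To put one significant coefficient out of ten in perspective, here is a minimal sketch of how often at least one of ten tests would come out significant by chance alone. It assumes a 5% significance level and, unrealistically, that the ten tests are independent (coefficients from the same models are not); the function name is made up for illustration and nothing here comes from the paper itself.

```python
def prob_at_least_one_significant(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more p < alpha when every null hypothesis is true.

    Assumes independent tests, which is a simplification for
    coefficients estimated in the same regression models.
    """
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    # Ten ring-spending coefficients, as counted in the text above.
    print(f"{prob_at_least_one_significant(10):.2f}")
```

Under these assumptions the chance is about 40%, so a single significant coefficient among ten is weak evidence of anything.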

Marriage duration is not inversely associated with spending on the engagement ring. This is just yet another social science fairy tale.

Potpourri: Statistics #68

Rain, Rain, Go away: 137 potential exclusion-restriction violations for studies using weather as an instrumental variable
Awesome R Learning Resources
A Quick Guide for Journalists to the Use and Reporting of Opinion Polls
Mapping congressional roll calls
Fancy Times and Scales with COVID data
Colors via clrs.cc in R
grstyle: Customizing Stata graphs made easy
American political data & R
Best-Practice Recommendations for Defining, Identifying, and Handling Outliers
Likelihood Ratios: A Tutorial
Data Science related quotes
cleanplots: Stata graphics scheme
PLSC 31101: Computational Tools for Social Science
Covid-19: The global crisis — in data
Working with Large Spatial Data in R
PCA tidyverse style
The many Flavours of Missing Values
Introducing RStudio and R Markdown
Tools for Analyzing R Code the Tidy Way
Dive into dplyr (tutorial #1)
The Good, the Bad and the Ugly: how (not) to visualize data
Programatically Generating PDF Reports with the Tidyverse
Building an animation step-by-step with gganimate
“package ‘foo’ is not available” – What to do when R tells you it can’t install a package

Why are more respondents not necessarily better? #2

In 2011 I wrote a trivial post about how more respondents are not necessarily better in an opinion poll, and are often worse, as a very large number can be a sign that the poll is not representative. If you see a poll with 100,000 respondents, you can be fairly certain that there are problems.

Over the past few days I have been thinking about another reason why more respondents do not in and of themselves make a poll better, even if the poll is representative to begin with. The explanation is that a poll can include many respondents who say that they do not know who they will vote for, that they will not vote, and so on, and these respondents are not used to calculate the party shares in the poll.

Let us assume, for example, that we have two polls. The first poll has 1,000 respondents, 10% of whom do not state a party. The second poll has 1,200 respondents, 30% of whom do not state a party. The second poll has, on the face of it, the more impressive sample, but when we look at how many respondents actually state a party, it has the smaller sample (840 compared with 900). The point is not to judge which of the polls is better, only that the sample size you see in a poll, and therefore the statistical uncertainty, is not always accurate.

The statistical uncertainty in a poll is, as is well known, affected by the sample size. The more respondents, the less uncertainty. You can calculate the statistical uncertainty for a party yourself here.

The latest Voxmeter poll serves as a good example. Here is the description of the poll:

Voxmeter’s political opinion poll, published on 8 November 2020, is based on telephone interviews with 1,028 representatively selected persons aged 18+ and was conducted from 2 November 2020 to 7 November 2020.
In the table, Voxmeter covers only those persons who know who they would vote for if a general election were held tomorrow.
Approx. 25.6% of those asked were at that time “in doubt about party choice”, “will not vote”, “will cast a blank vote” or “are not eligible to vote”.
The figures in parentheses indicate the difference in percentage points from the general election on 5 June 2019. The maximum uncertainty for the individual parties in this poll is +/-2.8 percentage points.

The poll interviewed 1,028 Danes about their party choice and gives the Social Democrats (Socialdemokratiet) 32.5% of the votes. Here is the interesting part: the 32.5% is not a figure calculated on the basis of 1,028 representatively selected respondents. It is calculated from a considerably lower number. We can see in the poll that approx. 25.6% of those asked were in doubt about their party choice, would not vote, would cast a blank vote or are not eligible to vote. 25.6% of 1,028 is ~263, which means that 263 respondents are not included in the sample used to calculate the 32.5% for the Social Democrats. The sample used to calculate this share is ~765.

This is important when we assess the statistical uncertainty. The statistical uncertainty for a party with 32.5% of the votes with a sample of 1,028 is, as reported, around 2.8 percentage points. But the statistical uncertainty for a party with 32.5% of the votes, when the figure is calculated from 765 respondents, is 3.3 percentage points.
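The arithmetic can be sketched in a few lines. This is a minimal sketch, assuming a simple random sample and the usual normal approximation for a 95% interval; I do not know how Voxmeter computes its reported figure (weighting and rounding may differ), and the function names are my own.

```python
from math import sqrt

Z_95 = 1.96  # normal critical value for a 95% interval (assumption)


def effective_n(n_interviewed: int, share_without_party: float) -> int:
    """Respondents left once those without a party choice are removed."""
    return round(n_interviewed * (1 - share_without_party))


def margin_of_error(share: float, n: int, z: float = Z_95) -> float:
    """Half-width of the interval for a proportion, as a fraction."""
    return z * sqrt(share * (1 - share) / n)


if __name__ == "__main__":
    n_total = 1028
    n_party = effective_n(n_total, 0.256)  # 25.6% state no party choice
    for label, n in [("all interviewed", n_total), ("stated a party", n_party)]:
        moe = 100 * margin_of_error(0.325, n)
        print(f"{label}: n = {n}, +/-{moe:.1f} percentage points")
```

With these assumptions the margin for 32.5% is roughly 2.9 percentage points with 1,028 respondents and 3.3 with 765. The same functions reproduce the two-poll example: 1,000 respondents with 10% undecided leaves 900, while 1,200 with 30% undecided leaves 840.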

I am not using Voxmeter as an example because they have problems. On the contrary. Together with Megafon, which reports the share of respondents who are “in doubt about party choice or do not wish to vote”, they are among the few polling firms where it is possible to find information like the above in their latest polls. It is problematic that reporting such information is the exception rather than the norm.

Looking at the polls from Voxmeter and Megafon in 2020, we can see substantial variation between the firms in how large a share of their respondents do not state a party choice.

There can be different reasons for these differences. First, it is possible that Megafon does more to press a party choice out of its respondents (and, for example, does more to prevent respondents from choosing to cast a blank vote). Second, it may be due to methodological differences in how respondents are selected and interviewed (by telephone or online). The most interesting thing, however, is how little variation there is at Megafon, where the share of respondents without a party choice is close to constant. This seems very odd, as there should be uncertainty around these estimates as well. The variation we find at Voxmeter looks considerably better.

Based on the above, we can see that there will be differences in how large the statistical uncertainty is for the different parties, depending on how large a share of the respondents do not have a party choice. The larger this group is, the larger the discrepancy will be between the reported statistical uncertainty and the real statistical uncertainty for the reported support for a party.

This reminds me of a post I wrote during the 2015 general election campaign, when the media reported the support for the red bloc and the blue bloc respectively, but used the maximum uncertainty for the individual parties to say something about whether there was a difference between the red and the blue bloc. Here one can also say that this will be a bigger problem 1) the closer the red and blue blocs are to 50% and 2) the smaller the largest party in a poll is (when we talk about the maximum statistical uncertainty).

Fortunately, it is a norm that the media report something as basic as the number of respondents in a poll (in our study of general elections from 2005 to 2011, we found that ~94% of the articles on opinion polls conveyed this information). It is a problem, however, that it is not possible to find information about how large the real sample in a poll is. Voxmeter and Megafon therefore deserve great praise for sharing such numbers. One can only hope that the other polling firms will, in time, also begin to publish these numbers if they want their polls to be taken seriously.

25 guidelines for improving psychological research

I was rereading the paper ‘The New Statistics: Why and How’ published in Psychological Science the other day. It’s a great paper and I can highly recommend reading it. If you are busy (and I guess you are), make sure to at least read the 25 guidelines for improving psychological research (in Table 1). Here are the guidelines:

  1. Promote research integrity: (a) a public research literature that is complete and trustworthy and (b) ethical practice, including full and accurate reporting of research.
  2. Understand, discuss, and help other researchers appreciate the challenges of (a) complete reporting, (b) avoiding selection and bias in data analysis, and (c) replicating studies.
  3. Make sure that any study worth doing properly is reported, with full details.
  4. Make clear the status of any result—whether it deserves the confidence that arises from a fully prespecified study or is to some extent speculative.
  5. Carry out replication studies that can improve precision and test robustness, and studies that provide converging perspectives and investigate alternative explanations.
  6. Build a cumulative quantitative discipline.
  7. Whenever possible, adopt estimation thinking and avoid dichotomous thinking.
  8. Remember that obtained results are one possibility from an infinite sequence.
  9. Do not trust any p value.
  10. Whenever possible, avoid using statistical significance or p values; simply omit any mention of null-hypothesis significance testing (NHST).
  11. Move beyond NHST and use the most appropriate methods, whether estimation or other approaches.
  12. Use knowledgeable judgment in context to interpret observed effect sizes (ESs).
  13. Interpret your single confidence interval (CI), but bear in mind the dance. Your 95% CI just might be one of the 5% that miss.
  14. Prefer 95% CIs to SE bars. Routinely report 95% CIs, and use error bars to depict them in figures.
  15. If your ES of interest is a difference, use the CI on that difference for interpretation. Only in the case of independence can the separate CIs inform interpretation.
  16. Consider interpreting ESs and CIs for preselected comparisons as an effective way to analyze results from randomized control trials and other multiway designs.
  17. When appropriate, use the CIs on correlations and proportions, and their differences, for interpretation.
  18. Use small- or large-scale meta-analysis whenever that helps build a cumulative discipline.
  19. Use a random-effects model for meta-analysis and, when possible, investigate potential moderators.
  20. Publish results so as to facilitate their inclusion in future meta-analyses.
  21. Make every effort to increase the informativeness of planned research.
  22. If using NHST, consider and perhaps calculate power to guide planning.
  23. Beware of any power statement that does not state an ES; do not use post hoc power.
  24. Use a precision-for-planning analysis whenever that may be helpful.
  25. Adopt an estimation perspective when considering issues of research integrity.
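Guideline 13 (“bear in mind the dance”) is easier to appreciate with a small simulation. Below is a minimal sketch, not taken from the paper: it draws repeated samples from a normal population with a known standard deviation, builds a 95% CI for the mean in each replication, and counts how often the interval contains the true mean. The population values, sample size, number of replications and seed are all arbitrary assumptions.

```python
import random
import statistics


def ci_coverage(true_mean=0.0, sigma=1.0, n=30, replications=2000, seed=1):
    """Share of 95% CIs (known sigma, z = 1.96) that contain the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        # The interval is sample_mean +/- half_width.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / replications


if __name__ == "__main__":
    print(f"coverage: {ci_coverage():.3f}")
```

The share should land close to 0.95, which also means that roughly one interval in twenty misses the true value, and in a real study you never know whether yours is one of them.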

I do not agree with all recommendations (e.g. number 10), but there are a lot of great points in the paper.

Last, the paper also formulates an eight-step strategy for how to conduct research with integrity: 1) Formulate research questions in estimation terms. 2) Identify the ESs that will best answer the research questions. 3) Declare full details of the intended procedure and data analysis. 4) After running the study, calculate point estimates and CIs for the chosen ESs. 5) Make one or more figures, including CIs. 6) Interpret the ESs and CIs. 7) Use meta-analytic thinking throughout. 8) Report.

Observations related to COVID-19 #2

This is a continuation of my previous post with observations related to COVID-19.

Rewatching. Fewer films are being made and released during the pandemic. Several films that should have been out by now have been postponed to 2021, and many projects have been put on hold. This also means that there is simply more time to watch other films (provided that the time one spends watching films is constant). Over the past few months I have had the opportunity to rewatch, among others, Citizen Kane, Pulp Fiction, No Country for Old Men, The Truman Show, Goodfellas, Do the Right Thing, The Usual Suspects, Mad Max: Fury Road and Chinatown. All of these films can be recommended.

Music. Music is one of the few exceptions when it comes to new culture, as making it is a completely different (physical) process from making a film or a TV series. A number of good albums have come out in 2020, and I find it hard to imagine that the year would have looked fundamentally different musically had there been no pandemic. The other day I saw a headline about a study claiming that people have listened to older music during the pandemic (the idea being that people have become more nostalgic in their music consumption). I have not read the study and therefore cannot say whether I believe the results. I can only note that I have not noticed myself listening to more old music than usual (or at least not music that could be described as a trip down memory lane). On the contrary, I have been busy keeping up with the new albums that have come out. My preliminary top 10 for this year consists of Jay Electronica’s A Written Testimony, Killah Priest’s Rocket to Nebula, Nine Inch Nails’ Ghosts V: Together, Roger Eno and Brian Eno’s Mixing Colours, R.A.P. Ferreira’s (aka Milo) Purple Moonlight Pages, Curren$y and Harry Fraud’s The Outrunners, Parker’s Suite for Max Brown, Romare’s Home, Blu & Exile’s Miles and MIKE’s Weight of the World.

Expenses. It has become easier to make a budget of one’s fixed expenses. Although there are certain expenses associated with working from home, I generally spend very little money when I stay at home and only go to Tesco once in a while. Likewise, HM Revenue and Customs has given a small tax relief to those who are now forced to work from home. In that way, the pandemic has been good for my finances. Again, my concern is not my own personal finances but the economy in general, and especially those who work in industries that are hit hard by the pandemic in the short term.

Climate. It is no surprise that climate change has not been at the top of the agenda during 2020. There even seems to be a consensus that COVID-19 must be under control before we deal with the climate again. This is a clear mistake, which I will write a longer post about on a later occasion. It is no secret that climate change is already here (the atmosphere is warmer, the glaciers are melting and sea levels have risen). It therefore seems strange that there is not more political focus on climate change than is the case. In addition, COVID-19 has in many ways increased my scepticism about how we can solve many of the global challenges we are facing these years. We cannot self-isolate our way out of the climate crisis. We cannot simply increase the distance between each other, work from home and bet on a vaccine. I fear that we will only see serious public (and thereby political) interest when it is too late (that is, at a point where the price will be considerably higher than if far greater action were taken now).

Commuting. Very few people need to commute much in 2020, and there are indications that we will commute a good deal less in the future. I have never liked commuting, which is why I try to live relatively close to where I work (preferably somewhere between 15 and 30 minutes on foot or by bike). Research has also shown that people who commute a lot are less likely to take part in social activities and have poorer health (see also this video). Putnam describes, among other things, in Bowling Alone how ten extra minutes of commuting time reduces the propensity to take part in local activities by 10 percent. This is not to say, however, that there cannot be benefits to commuting. I read this study, which shows that commuting can also function as a role transition, for example from parent to colleague, so that the commuting time works as a buffer between different roles. Years ago I had an idea to look closer at the political implications of commuting. Specifically, my idea was to link register data with survey data and use the Google Maps API to calculate the daily travel time between work and home and examine the political implications of commuting (I made a script that could easily calculate the travel time between home and work, but I never got any further with the project). This is, however, one of the topics on which I look forward to seeing more research in relation to COVID-19.

Office. A long period without access to the office has opened my eyes to some of the aspects of the office that I have missed. I am not thinking of the social aspect of seeing my colleagues, which I knew from day 1 that I would miss. I am thinking specifically of the more trivial aspects, including cycling to the office in good weather (speaking of commuting). Or the fact that it feels more natural to go down and get a coffee at Starbucks than when one is just sitting and working at home (speaking of expenses). In other words, the pandemic has made it easier for me to appreciate the trivial aspects of working that are anything but appreciated under normal circumstances.

Podcasts. As with music, the pandemic has not made much of a difference for podcasts, and if it has, it has probably only led to even more podcasts (despite the fact that people commute less and therefore perhaps consume podcasts to a lesser extent). The podcasts I primarily listen to can be divided into the following categories: culture (e.g. Han Duo, Filmnørdens Hjørne, Pop Culture Happy Hour, Sardonicast, Stream and chill, Soundvenue Streamer), politics and economics (Conversations with Tyler, The Ezra Klein Show, EconTalk), narrative podcasts (e.g. Hardcore History, WeCrashed: The Rise and Fall of WeWork, The Dropout and Revisionist History), psychology (e.g. Two Psychologists Four Beers, Very Bad Wizards, The Black Goat and Everything Hertz) and various other Danish ones (e.g. DataSnak, Her Går Det Godt, Programmet om Ingenting and Kortsluttet).

The cinema. It was a pleasure to be at the cinema again (don’t worry: plenty of distance, face masks, Test and Trace etc.). The trailers shown before the film had not been updated with new release dates, so the dates still said April and May. It made me ponder how different 2020 would have looked had we not had COVID-19. These were different counterfactual considerations from those at the beginning of the pandemic, now that most of 2020 has passed. As with many other semi-social activities, it is hard not to be at least a little enthusiastic about how civilised social interactions are now. This applies not only to social distancing, but also to something as basic as not talking in the middle of a film (granted, just an anecdotal observation, but I hope it is a general phenomenon).

Time horizon. COVID-19 is a permanent virus in society. Even if we should, quite improbably, succeed in reaching a situation where no human being on the face of the earth has COVID-19 any longer, it would still be found among animals. The question is not when COVID-19 is a thing of the past, but when we will quite naturally be in a context we can again call ‘normal’, where the virus does not set the framework and the restrictions. I see no reason to be optimistic about when we can return to a ‘normal’ world. If it is soon, I will happily be wrong.

Airports. My first-hand impression of airports during COVID-19 is that there is more space, better Wi-Fi and fewer children (again an anecdotal observation). All in all, airports seem more pleasant. That said, I do not have quite the same motivation to be there early, as I do not find the same calm in sitting and working in the airport for several hours before I board. Airports as a social space where I can get a lot of work done seem, in other words, to be a thing of the past.

Travel. I do not miss travelling, but I miss having travelled.

Evidence. It is said that the more certain someone is about COVID-19, the more critical you should be. At the beginning of the pandemic, there was very little agreement on what the best strategy was for reducing the spread of infection. Among other things, there was talk that it was better to try to achieve herd immunity than to shut down society. In any case, there was a need for political decisions on which measures should (not) be introduced. A good example has been the different approaches in Sweden and Denmark. We now have relatively clear evidence that Trump handled it terribly (not a big surprise) – and so did the British government (not a big surprise either). We now also have evidence that COVID-19 is primarily airborne (see also this fine article). What is interesting here is to see how different countries begin to adopt (relatively) identical measures against COVID-19, the more we know about what works and what does not. Or more specifically: it has become a little easier to see which decisions are purely political. In this context, it is also interesting to see how people try to politicise measures in Denmark that there ought to be broad agreement on, and how people experience Denmark seen from abroad (for example Christian Bennike in Brussels and Kristian Leth in New York).

The US. The pandemic has in many ways hit the US hard. It has been exciting to follow American politics in 2020. It has likewise been a year that has shown that there is nothing exceptional about the US (if anyone should still be in doubt). When I moved to the US in 2014, I got one piece of advice regarding the healthcare system: try by all means to make sure you never need to use it. This advice was not meant as a joke. It is no surprise that a society like the US is not well geared for a virus like COVID-19, and I do not miss living there. Trump has not done anything good for the US, but as with many other problems in the US, he is symptomatic of these problems, which is why it is hard to imagine that the US would have done fine under any circumstances.

Hygiene. Many of the hygiene measures we see are primarily there to create a (false) sense of security. Nevertheless, it is fantastic to see people take something as simple as washing their hands more seriously. In many ways, I still enjoy the extra attention given to hygiene, and especially to social distancing. Normally I have a minor cold once a year, but this has not been the case over the past year or so (knock on wood). I very much hope that people will take hygiene seriously in the longer term.

Restaurants. It has been fantastic to be able to eat out again (after the restaurants had been closed for a long period). The British government’s “Eat Out to Help Out” was interesting, and my sense is that it actually got more people to eat out (no, I do not think it caused a second wave – and yes, I have read a really poor working paper that claims otherwise). What I do miss, research-wise, is an article or a study that can say something sensible about the El Farol Bar problem in relation to this.

Games. Culture is escapism, and games are the form of culture that most quickly makes me forget time and place. It is therefore no surprise that I have also found more time to immerse myself in various games. Worth mentioning here are Mini Metro, Euro Truck Simulator 2 (which I have spent a bit of time on, on and off, since 2014) and Firewatch. Some of the games that have been popular lately are games where repetition and work are an important element, and there is probably something extra satisfying about optimising work in a time of great uncertainty (see also this video essay). It has also been interesting to see how different sports communities have dealt with COVID-19. For example, it has been relatively easy for the chess community to continue its tournaments (especially due to specific initiatives from Magnus Carlsen), while it has been strange to watch sport without spectators. Even in a sport like tennis, where it is normally (relatively) quiet, it feels strange to watch the Rolex Paris Masters (taking place these days) with no reactions from the stands.

Memory. Many will have difficulty remembering what they actually spent their time on during 2020. There are good reasons for this. This is also one of the reasons why I write posts like these. I likewise appreciate others’ attempts to blog about life during COVID-19, for example Brady Haran’s Quarantine Diary Index.

Assorted links

My previous post with assorted links from various topics was published in 2012. Except for stuff related to statistics, I haven’t kept track of some of the interesting articles and blog posts I’ve read over the years (I wish I had), so here is an attempt at getting started again.

1. “Reality” is constructed by your brain. Here’s what that means, and why it matters.
2. 50 Reasons Why Everyone Should Want More Walkable Streets
3. The Scientific 7-Minute Workout
4. Mental Health Tips feat. Anxiety Wolf
5. Hiroshima
6. Awesome Mental Health
7. So You Want to Learn Physics…
8. A beginner’s guide to modern art jazz
9. Life is short
10. The Open-Office Trap
11. What Should You Do with Your Life? Directions and Advice
12. I Thought I Would Have Accomplished a Lot More Today and Also by the Time I Was Thirty-Five
13. Big data: A big mistake?
14. Tidy Data Rulebook
15. What Really Happened to Malaysia’s Missing Airplane
16. How to write in plain English
17. A Guide to Climate Change
18. What can a technologist do about climate change?
19. Climate crisis
20. Pollution
21. Data science is different now
22. Data Science: Reality Doesn’t Meet Expectations
23. Programmer Competency Matrix
24. The Tyler Ransom Production Function
25. Guide to securing and improving privacy on macOS
26. The economics of the Tour de France
27. Why Do Tennis Crowds Have to Be So Quiet?
28. How Nespresso’s coffee revolution got ground down
29. Tea Reviews
30. Refrigerator safety study: Case study analysis

Honesty may still pay off in politics

A new paper in Proceedings of the National Academy of Sciences finds that politicians who are averse to lying have lower reelection rates. If true, this finding has substantial implications for whether politicians with ambitions of getting (re)elected should lie or not. Accordingly, I found it extra relevant to read this manuscript carefully (in contrast to what the reviewers did, if they understand basic statistics) – and I am glad I did.

The study offered some mayors in Spain a personalised report with the results of a survey. The authors measured variation in truth-telling by only letting mayors obtain the report if they reported heads in a coin flip. The interesting finding in the paper is that there is a correlation between lying and getting reelected.
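To see how a coin flip can say anything about dishonesty at the aggregate level, consider a back-of-the-envelope sketch. This is my own illustration and not the authors’ method: it assumes a fair coin and that nobody who flips heads reports tails, and the 75% figure is made up for the example rather than taken from the paper.

```python
def estimated_liar_share(reported_heads_share: float) -> float:
    """Estimate the share of tails-flippers who reported heads.

    Assumptions (mine, for illustration): the coin is fair, and nobody
    who flips heads reports tails. Then the reported share of heads is
    0.5 + 0.5 * q, where q is the share of tails-flippers who lie,
    so q = 2 * reported_heads_share - 1.
    """
    if not 0.0 <= reported_heads_share <= 1.0:
        raise ValueError("reported_heads_share must be between 0 and 1")
    # A reported share below 0.5 gives no sign of lying; floor at zero.
    return max(0.0, 2.0 * reported_heads_share - 1.0)


# Hypothetical input: 75% of respondents report heads.
print(estimated_liar_share(0.75))  # prints 0.5
```

Note that this only works for groups. For any individual mayor who reported heads, we cannot know whether the coin actually landed on heads.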

Table 2 in the paper reports the finding and builds up different models to look into the robustness of the result. The variable of interest is ‘Reported heads’ and as you can see, the coefficient for this variable is significant in all models. However, we find a serious red flag in the table:

Specifically, in the fourth model, when the authors only look at the mayors who are running for reelection (the sample of interest), the model includes an interaction term with ‘Reported heads’ as well. This made me think that something weird must be going on. Why not report the effect of reporting heads on getting reelected for the mayors actually running for reelection? Why hide this test in a model with a specific set of covariates?

When looking at the data, I find no empirical support for the conclusion made in the article. Specifically, there is no statistically significant effect of reporting heads on getting reelected when we consider whether the mayor is actually running for reelection.

In the table below I present four parsimonious OLS regression models showing that lying politicians running for reelection are not more likely to get reelected. Model 1 reproduces the statistically significant finding in Model 1 of the paper (as you can see in the output from Table 2 above). Model 2 estimates the same model with the sample restricted to mayors who actually ran for reelection. In this model, there is no statistically significant effect of reporting heads. Model 3 includes running for reelection as a covariate in the full sample and shows, similar to Model 2, a statistically non-significant effect of reporting heads.

| Predictors | (1) Reproduction | (2) Reelection sample | (3) Reelection covariate | (4) Interaction |
| --- | --- | --- | --- | --- |
| Reported heads | 0.08 (0.04) | 0.06 (0.04) | 0.05 (0.03) | 0.07 (0.05) |
| Ran for reelection | | | 0.78 (0.04) | 0.78 (0.04) |
| Margin 2015 | | | | 0.47 (0.15) |
| Reported heads × margin 2015 | | | | -0.10 (0.19) |
| Observations | 758 | 627 | 757 | 754 |
| R2 / R2 adjusted | 0.006 / 0.005 | 0.004 / 0.003 | 0.383 / 0.381 | 0.399 / 0.396 |

The reason the authors find a statistically significant result (p < .1) for the limited sample (in their Model 4 in Table 2) is the inclusion of an interaction term between reporting heads and the competitiveness of the election (substantially changing the interpretation of the coefficient for reporting heads). If this interaction term is not included, there is no significant effect of reporting heads. When controlling for whether the mayor is running for reelection in the full sample with the interaction, reporting heads is statistically non-significant. Accordingly, this statement in the article is incorrect: “As a supplementary analysis, we restrict the sample to those mayors who reran for election and show that the relationship between dishonesty and reelection holds for this subsample.”
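To illustrate why the interaction term changes the interpretation, here is a minimal sketch using the rounded Model 4 estimates from my table above. With the interaction in the model, the coefficient for reporting heads is the estimated difference only when the 2015 margin is zero; at any other margin, the interaction coefficient times the margin has to be added. I assume here that the margin is coded as a proportion, the margin values are hypothetical, and the calculation says nothing about uncertainty (that would require the coefficient covariances, which the table does not show).

```python
# Rounded point estimates from Model 4 in the table above.
B_REPORTED_HEADS = 0.07   # coefficient for 'Reported heads'
B_INTERACTION = -0.10     # coefficient for 'Reported heads x margin 2015'


def conditional_effect(margin: float) -> float:
    """Estimated difference for reporting heads at a given 2015 margin.

    In a model with an interaction, the 'Reported heads' coefficient is
    the estimate at margin = 0 only; elsewhere the interaction
    coefficient times the margin is added to it.
    """
    return B_REPORTED_HEADS + B_INTERACTION * margin


# Hypothetical margins, assuming the variable is coded as a proportion.
for margin in (0.0, 0.1, 0.5):
    print(f"margin = {margin:.1f}: estimate = {conditional_effect(margin):+.3f}")
```

In other words, the coefficient for reporting heads in a model with the interaction is not comparable to the coefficient in a model without it, because the former describes mayors with a margin of exactly zero.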

There is simply no evidence that mayors actually running for reelection are more likely to get reelected if they are lying. While the authors present some bleak news for democracy in Spain, the data provides less pessimistic news. Honesty may not pay off in politics, but there is so far no compelling evidence that lying is a winning strategy.

Why are the reviewers not able to find such issues in the paper? Because reviewers are people too, and we are all more likely to believe that politicians are bad. In other words, a finding providing empirical support that politicians are more likely to get reelected if they lie sounds valid. A new study, for example, shows that one reason people have low trust in politicians is that they disproportionately remember stories about politicians behaving badly.

I should note that I found it important to notify the journal about the issues with the paper. However, as they wrote back to me, the issue I have identified, showing that there is no empirical support for a key conclusion in the paper, “does not contribute substantially to the discussion of the original article and therefore has declined to accept it for publication.” It’s great to see PNAS continue to be so on-brand.