‘A Diamond is Forever’ and Other Fairy Tales

One of the most popular studies on the Social Science Research Network (SSRN) is a study published in Economic Inquiry in 2015. The abstract has been viewed more than half a million times and the study has been downloaded ~80,000 times. These are impressive numbers. The study is in the top 10 on SSRN by all-time number of downloads.

Why do people share and like this study? Because there is a certain irony to the main finding, namely that the more money people spend on an engagement ring, the less likely it is that their marriage will last. Unsurprisingly, this study was also picked up by the popular press (e.g. here, here and here).

The title of the study is ‘“A diamond is forever” and other fairy tales: The relationship between wedding expenses and marriage duration’. Here is the full abstract of the study:

In this study, we evaluate the association between wedding spending and marriage duration using data from a survey of more than 3,000 ever-married persons in the United States. Controlling for a number of demographic and relationship characteristics, we find evidence that marriage duration is inversely associated with spending on the engagement ring and wedding ceremony.

With the impressive number of downloads of the study, I am surprised that I haven’t seen more critical reflections on the study. I decided to take a closer look at the study and I am now even more surprised by all the attention that the study has received. Let’s try to understand what the study is about.

The basic idea of the study is to examine whether people with a more expensive engagement ring are more likely to get divorced (the outcome of the study is marital dissolution). I applaud the authors for taking on such a difficult task, as this is one of the questions we cannot study in a randomised experiment. Specifically, we cannot sample a lot of people who are about to get married and assign half of the sample to a cheap engagement ring and the other half to an expensive engagement ring.

More importantly, when we study a question like this, we need to be very concerned about whether people who are divorced and have an expensive engagement ring are more likely to take part in a survey for money than people who are still married with an expensive engagement ring.
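To make the concern about differential participation concrete, here is a minimal sketch. Every number in it is made up for illustration and none comes from the study: I assume a population where ring price and divorce are unrelated by construction, and I assume that only divorced people with an expensive ring are more likely to answer a paid survey.

```python
# Hypothetical population: ring price and divorce are unrelated by construction.
# All counts and response rates are invented for illustration only.
population = {
    ("cheap", "married"): 3000,
    ("cheap", "divorced"): 1000,
    ("expensive", "married"): 3000,
    ("expensive", "divorced"): 1000,
}

# Assumed probability of answering a paid survey, by group.
# Only divorced people with an expensive ring respond more often.
response_rate = {
    ("cheap", "married"): 0.10,
    ("cheap", "divorced"): 0.10,
    ("expensive", "married"): 0.10,
    ("expensive", "divorced"): 0.20,
}


def divorce_share(counts, ring):
    """Share of divorced people among everyone with the given ring type."""
    divorced = counts[(ring, "divorced")]
    married = counts[(ring, "married")]
    return divorced / (divorced + married)


# Expected number of respondents in each group.
sample = {group: n * response_rate[group] for group, n in population.items()}

for ring in ("cheap", "expensive"):
    print(
        ring,
        "population:", round(divorce_share(population, ring), 2),
        "sample:", round(divorce_share(sample, ring), 2),
    )
```

In this made-up population the divorce share is identical for both ring types, but among the expected respondents it is higher for the expensive ring. The survey would show an association that does not exist in the population. The direction and size of such a bias depend entirely on the assumed response rates, which is exactly why an opt-in sample is hard to interpret.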

The sample composition is, from what I can see, one of the primary limitations of the study. Specifically, the study uses a sample of people from mTurk, which is not a nationally representative sample. On the contrary, the study relies on participants who were offered $0.50–$0.75 to answer approximately 40 questions. This sample is not sufficient to say anything meaningful about the impact of an expensive engagement ring on marriage duration. Importantly, the authors acknowledge this limitation on page 1920 of the study: “Of course, for our purposes, a large national probability sample would be preferable.” Of course.

There is a much more severe problem with the study: even if we assume that the design is strong, the results give no reason to believe that people with a more expensive engagement ring are more likely to get divorced.

In brief, the authors conduct a survival analysis (a Cox proportional hazards model). Take a look at this description from the paper (page 1922) and see if you can get a sense of what some of the issues are:

In bivariate regressions, having no engagement ring and having wedding expenses below $1,000 are each associated with increases in the hazard of divorce, while spending $8,000 or more on an engagement ring is associated with a decrease in the hazard of divorce. This appears to be consistent with the relationship between wedding expenses and marriage duration posited by wedding industry advertising. However, the picture changes in multivariate regressions. Spending between $2,000 and $4,000 on an engagement ring is significantly associated with an increase in the hazard of divorce among the sample of men. Specifically, in the sample of men, spending between $2,000 and $4,000 on an engagement ring is associated with a 1.3 times greater hazard of divorce as compared with spending between $500 and $2,000.

There is a lot going on in the text above. The first problem is that the bivariate models actually suggest the opposite of what the authors argue in the paper. The consistent finding in the bivariate models is that people who got engaged without an engagement ring are significantly more likely to be divorced, and people with an expensive engagement ring are more likely to stay married.

So what about the multiple regressions? There are two issues here. First, and least problematic, we are dealing with a strong portion of regression soup. Here is a list of the covariates in the model: age, marriage age, gender, race/ethnicity, education, employment, household income, region of residence, religious attendance, respondent-spouse differences (age difference, race difference, education difference), children with spouse, knew spouse very well, length of time dated before proposal, feelings and attitudes at time of proposal, had a honeymoon, wedding attendance, and total wedding expenses. One problem here is that some of these variables can induce post-treatment bias and I see no reason to be much more confident in the impact of an expensive engagement ring in the multiple regressions compared to the bivariate regression.

Second, even if we assume that nothing of the above is problematic, there is no pattern in the multiple regressions suggesting that people with a more expensive engagement ring are more likely to get a divorce. Take a look at the regression coefficients for engagement ring expenses in Table 2 (to keep it simple, I do not show all the other variables in the models):

It is a little difficult to understand the coefficients as the reference category is “$500 to $2,000”. In the bivariate model we can see, as stated above, that the more expensive the proposer’s engagement ring, the more likely that the respondent is still married. If we look at the multiple models for all respondents, we see no significant effects. In other words, no indication at all that “marriage duration is inversely associated with spending on the engagement ring” (as pointed out in the abstract).
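For readers who are not used to hazard ratios, here is a small sketch of how a figure like the 1.3 quoted from the paper relates to a Cox coefficient and to survival relative to the reference category. The 80% baseline is a hypothetical value I picked for illustration; it is not taken from the study.

```python
import math

# Hazard ratio quoted in the paper for men spending $2,000 to $4,000,
# relative to the reference category ($500 to $2,000).
hazard_ratio = 1.3

# A Cox model reports exp(coefficient) as the hazard ratio,
# so the coefficient itself is the natural log of the ratio.
coefficient = math.log(hazard_ratio)


def survival_with_ratio(baseline_survival, ratio):
    """Under proportional hazards, S_group(t) = S_reference(t) ** ratio."""
    return baseline_survival ** ratio


# Hypothetical assumption: 80% of reference-category marriages are intact at time t.
baseline = 0.80

print("coefficient:", round(coefficient, 3))
print("implied survival:", round(survival_with_ratio(baseline, hazard_ratio), 3))
```

The reference category has a hazard ratio of 1 by definition, a ratio above 1 means a higher hazard of divorce than the reference category, and a ratio below 1 means a lower hazard. Under the assumed 80% baseline, a ratio of 1.3 would correspond to roughly 75% of marriages still intact at the same point in time.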

Let’s take another look at what the authors say they find with the table above in mind: “in the sample of men, spending between $2,000 and $4,000 on an engagement ring is associated with a 1.3 times greater hazard of divorce as compared with spending between $500 and $2,000”. In the sample of men. Of course! Between $2,000 and $4,000. Of course! There is one statistically significant regression coefficient and nine statistically non-significant ones (not counting the reference category and the coefficients for “Don’t know”). Again, if anything, the models suggest that having spent more money on an engagement ring correlates with a longer marriage, not a shorter one.
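To see why one significant coefficient out of ten is not much to build on, here is a back-of-the-envelope sketch. It assumes a 5% significance threshold and independent tests; both are my simplifying assumptions (coefficients from the same models are not independent), so treat the result as a rough guide rather than an exact figure.

```python
alpha = 0.05   # assumed significance threshold
n_tests = 10   # the ten ring-spending coefficients counted above

# Probability of at least one "significant" coefficient if every true
# effect is zero and the tests are independent (a simplification).
p_at_least_one = 1 - (1 - alpha) ** n_tests

# Expected number of false positives under the same assumptions.
expected_false_positives = alpha * n_tests

print("P(at least one significant):", round(p_at_least_one, 3))
print("Expected false positives:", expected_false_positives)
```

Under these assumptions, the chance of seeing at least one significant coefficient by luck alone is about 40%.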

Marriage duration is not inversely associated with spending on the engagement ring. This is just yet another social science fairy tale.

Social science research during COVID-19

What a time to be alive. The coronavirus pandemic is a global problem and social scientists use this unique opportunity to write novel papers with the novel COVID-19 as the case (or the context of the study). There is not a single day without a new study saying something about either social distancing or the social behaviour of mass publics in relation to COVID-19.

I have already seen a lot of good papers with direct relevance for the COVID-19 crisis. However, interestingly, some of the best research you will see these days relies on data collected prior to the outbreak of COVID-19. The best example of this is this working paper on the potential implications of vote-by-mail programs (using data from 1996 to 2018).

Similarly, I am convinced that we will see a lot of interesting research coming out over the next few years by political scientists on the different policy responses across the world (Denmark and Sweden will serve as a great most similar systems design) and the social and political behaviour of citizens during the crisis. In other words, we do and will see great social science research coming out of this crisis.

That being said, we are in the midst of a global crisis with a lot of uncertainty and bad incentives for researchers. This is not to say that social scientists shouldn’t consider studying COVID-19 or in other ways contribute to our understanding of the crisis. However, the bad incentives researchers face, combined with the uncertainty and rapid development of the crisis, make me critical of how much social scientists can contribute to handling the crisis (at least in the short term).

To illustrate, let us take a look at a paper interested in how people estimate the spread of the virus. Version 1 of the paper went online on March 8. The study talked about “widespread misperceptions” and argued that people overestimated the severity of the virus. I made it clear when the study came out that I was not convinced by it. This was before there were a lot of daily deaths in the US, but I found it weird that the researchers talked about the importance of not overestimating the spread of the virus at this stage (especially when taking the uncertainty about the spread of the virus into account).

Interestingly, version 2 of the paper came out on March 19. Now the paper did not talk about the problem of overestimating the virus at all. Instead, the paper had changed the framing substantially. However, this version has so far received less attention than the first version. It is interesting that the paper changed its focus from ‘overestimating the severity of the crisis’ (version 1) to ‘successful containment’ (version 2) once the researchers themselves (economists, of course) acknowledged the severity of the crisis.

The paper in question has several limitations, but my criticism is not related to the findings or specific methodological choices. Instead, I am critical of the extent to which social scientists can (and should) bring important research into the world these days (taking the high level of uncertainty into account). Again, I am not saying that we will not see good social science research coming out of the COVID-19 pandemic. However, for consumers of social science research, I suggest that you remain skeptical towards all of the papers coming out these days (independent of whether they are peer-reviewed or not).

Another important aspect to keep in mind is data quality. Data is not cheap and a lot of the research we see coming out these days will rely on data that we know is less than ideal. We will see a lot of small samples (most of them collected on mTurk) and surveys of a questionable quality. However, the challenge is not only with self-reported data in surveys.

I have already seen multiple papers using Google search data to study topics such as racial prejudice, economic anxiety and religiosity in relation to the COVID-19 pandemic. We might learn a lot from this body of research but we should be aware of a lot of the limitations with such data (see this blog post for more info). My concern is that there are very specific limitations to what kind of inferences we can make using such time-series data in the time of COVID-19.

More generally, the COVID-19 pandemic is not a good case for a lot of social science questions. This article describes the challenge: “The best natural experiments usually look at similar groups of people where one group experiences a very specific change”. The problem with the COVID-19 pandemic is that everybody is affected (“treated”) by the pandemic and we cannot make sensible counterfactual claims on the impact of specific initiatives these days. For example, if we are interested in the impact of lockdowns, such lockdowns will not be exogenous to the crisis and other relevant factors.

I am not the only one to be concerned about the quality of the research coming out these days and the potential implications. Anne Scheel describes it best here: “My point is not that all of this research is pointless or harmful — some of it may have a genuine positive impact. But I do feel that our concern about the extremely unusual and serious situation we’re in leads us to overlook the potential costs of conducting and consuming research in emergency mode.” Also, I agree 100% with the points made by Stuart Ritchie here.

Social science will contribute a lot of interesting research related to the COVID-19 global pandemic. However, we should remain skeptical towards a lot of the work coming out these days.