Uber and irresistibly interesting data

The other day, the Guardian reported that Uber paid Alan Krueger $100,000 for a study that was positive towards Uber. That the study was positive towards Uber should come as no surprise. You do not pay a lot of money for a study showing that your business is bad for the world.

Various people have already discussed ethical issues related to the study and the payment, especially on Twitter. There are many issues to dissect, but I believe one of the most interesting, yet overlooked, aspects is related to a now-deleted tweet from Justin Wolfers:

Wolfers's point is that it was not about the money but about the fact that the data was ‘irresistibly interesting’. (I know, there is a certain irony in those words coming from an economist.) The argument is that Alan Krueger could not care less about $100,000 – he is all about the irresistibly interesting data. Not the money.

Here is what I believe is missing from the discussion: The data is anything but interesting. You get $100,000 because the data is not irresistibly interesting. The data was actually resistibly uninteresting. If I got access to the data, I doubt I would be tempted to even write and publish a scientific study. That is, unless I got a lot of money.

The paper using the data is Hall and Krueger (2018). I have read the paper and it is more of a descriptive report than a scientific study on, say, the impact of being an Uber driver. When you read the paper, there is nothing about the data that is ‘irresistibly interesting’. On the contrary, the survey data is pretty basic and cannot be used to say a lot about the questions Hall and Krueger set out to answer or speculate about.

To understand the limitations of the data, I recommend that you read Berg and Johnston (2019). They provide a comprehensive overview of the many issues with the data used in the study, and raise four criticisms of the data and conclusions in particular:

First, there are methodological concerns, including a poorly constructed survey and a flawed analysis of job-related costs. Second, the authors present an incomplete portrayal of the situation of Uber drivers. Third, the authors make unsubstantiated claims that parrot Uber’s corporate narrative about the virtues of the company’s business model. These claims are not grounded in the authors’ research and are at odds with a growing body of literature, not cited by the authors, that presents a more critical analysis of the working conditions of Uber drivers. And fourth, the authors provide an incomplete labor market analysis that fails to account for the impact of transport network companies (TNCs), such as Uber and Lyft, on taxi and other for-hire vehicle (FHV) drivers, despite the paper’s recurring comparisons of Uber vs. Taxi.

I guess you can expect to get some “unsubstantiated claims that parrot Uber’s corporate narrative about the virtues of the company’s business model” for $100,000. I find the four criticisms of the paper in question nuanced and fair.

Let us briefly outline the relevant issues with the data as discussed by Berg and Johnston:

Non-response bias. More satisfied drivers are more likely to participate in the survey. It is no surprise that the Uber drivers participating in the study are, generally speaking, happy working for Uber. Berg and Johnston describe the specific challenge with non-response in this setting: “shortly after the 2014 survey results were released, Newsweek published an article questioning the validity of the survey results, in part, based on interviews with Uber drivers who said they would not have responded to the survey had they been selected, with one driver indicating fear of retribution from the company”. In other words, people who were not happy with Uber were less likely to participate.
Missing questions and issues with question wording. Important questions are missing from the survey, such as the (average) number of hours a person drives for Uber in a typical week. Several of the conclusions in the paper are not backed up by the data because the relevant questions were not asked. In addition, there are problems with how the questions are worded, such as double-barreled questions (questions that ask about two things at once) and leading questions (questions framed in such a way that they are more likely to return data that will be positive towards Uber and the business model).
Reference category for earnings. Uber drivers are compared with taxi drivers who are employees. This is not a good reference category (unless you want to overestimate how much Uber drivers earn). Furthermore, Hall and Krueger's calculations understate Uber drivers’ expenses. As Berg and Johnston write: “First, the authors make no mention of the self-employment taxes paid by independent contractors. Second, they fail to account for additional licensing costs in highly regulated markets. Third, variation in cities’ transportation infrastructure suggests that the national mileage estimate used in their analysis is not appropriate. Fourth, the use of national American Automobile Association (AAA) data as a proxy for vehicle ownership costs underestimates costs in the BSG cities; variation based on driver demographics is also insufficiently addressed. And fifth, Uber has implemented multiple fare decreases since 2014 that most certainly affect the associated costs and earnings of Uber drivers; existing research suggests these changes have resulted in higher hourly expenses.”
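
To make the non-response point above concrete, here is a minimal sketch of how selective participation can inflate measured satisfaction. All numbers are made up for illustration; they are not taken from the Uber survey, Hall and Krueger, or Berg and Johnston. I assume a hypothetical population with satisfaction evenly spread from 1 (very unhappy) to 5 (very happy), and hypothetical response rates that rise with satisfaction.

```python
# Hypothetical share of all drivers at each satisfaction level
# (1 = very unhappy, 5 = very happy). Illustrative numbers only.
population_share = {1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2, 5: 0.2}

# Assumed probability of answering the survey at each level:
# happier drivers are assumed to be more willing to respond.
response_prob = {1: 0.05, 2: 0.10, 3: 0.15, 4: 0.20, 5: 0.25}


def mean_satisfaction(share):
    """Weighted mean satisfaction, given (possibly unnormalised) shares per level."""
    total = sum(share.values())
    return sum(level * weight for level, weight in share.items()) / total


# Expected composition of the respondents: population share times response rate.
respondent_share = {
    level: population_share[level] * response_prob[level]
    for level in population_share
}

true_mean = mean_satisfaction(population_share)
observed_mean = mean_satisfaction(respondent_share)

print(f"Mean satisfaction, all drivers:      {true_mean:.2f}")
print(f"Mean satisfaction, respondents only: {observed_mean:.2f}")
```

Under these assumptions the average across all drivers is 3.0, but the average among respondents is about 3.67. Nothing about the drivers has changed; the survey simply hears more from the satisfied ones.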

In sum, there are several limitations of the data and models. Here is what Berg and Johnston conclude about the findings and why we should care about all of this:

Hall and Krueger’s article has been cited in committee hearings of the U.S. Congress, at a Federal Trade Commission workshop on the sharing economy, on the California State Treasurer’s website (as part of “peer-reviewed” work), and likely in other policy venues. The regulatory questions are not settled, and articles published in scientific journals can skew policymakers’ opinions.

Yet the article by Hall and Krueger, and the survey it is based on, are fraught with methodological problems—sample bias, leading questions, incomplete reporting of findings, flawed earnings calculations, unsubstantiated claims, and outdated data. These limitations do not restrain the authors from asserting their findings confidently, nor has it restrained the company from using these findings in support of its position in political and regulatory debates. The authors advance corporate claims of flexibility, extoll the benefits of driver ratings, and champion the “be your own boss” narrative without offering evidence to support their claims or to refute the growing body of literature that is critical of the on-demand labor practices of Uber and other similar companies.

These are some very important points, and it is weird to see Justin Wolfers trying to convince people that the data was simply ‘irresistibly interesting’.