New book: Reporting Public Opinion

I am happy to announce the publication of a new book, ‘Reporting Public Opinion: How the Media Turns Boring Polls into Biased News’, co-authored with Zoltán Fazekas. The book is about how and why news reporting on opinion polls is dominated by change. Specifically, journalists are more likely to pick opinion polls that show changes, even when such changes are within the margin of error, and to highlight such changes in the reporting – and the public, pundits and politicians are more likely to respond to and share such polls.

Here is the puzzle we address throughout the various chapters: how can most opinion polls show a lot of stability over short periods of time while the reporting of opinion polls is dominated by change?

Even for the most hardcore followers of politics, opinion polls are quite boring in and of themselves. In most cases they show nothing new. When we take the margin of error into account, a new opinion poll will most likely show no statistically significant shift for any of the political parties of interest. And when there is a large change, it is most likely a statistical fluke we should be cautious about. I have over the years written countless posts about such opinion polls being covered in the Danish media.
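To illustrate the point about the margin of error, here is a minimal sketch (not from the book) of the usual back-of-the-envelope calculation for a single party's poll share, assuming a simple random sample and a 95% confidence level:

```python
import math

def margin_of_error(share: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a poll share from a simple random sample."""
    return z * math.sqrt(share * (1 - share) / n)

# A party polling at 25% in a poll with 1,000 respondents
share, n = 0.25, 1000
moe = margin_of_error(share, n)
print(f"Margin of error: +/- {moe:.1%}")  # roughly +/- 2.7 percentage points

# A shift from 25% to 27% in the next poll is well within this margin,
# yet such a change is exactly the kind of movement that makes the headline.
```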

The book is our attempt to provide a unified framework to better understand these dynamics in a systematic manner. In the first chapter of the book, we introduce the theoretical puzzle and outline the main limitation of existing studies on the topic, namely that they tend to focus on one specific stage in the coverage, such as whether methodological details are reported or not. To fully understand how opinion polls are covered and consumed in contemporary democracies, we argue that we need to combine different literatures on opinion polls and examine how a strong preference for change can explain biases in how opinion polls travel through several stages, from their initial collection to how they reach the public.

In the second chapter, we further develop a framework that focuses on the temporal dimension of how opinion polls are brought to the public via the media. This chapter serves as an introduction to the four stages that opinion polls have to go through in our framework. Specifically, we show how each stage – or activity – will lead to polls showing greater changes getting more attention. This is illustrated below:

Next, throughout Chapters 3, 4, and 5, we cover the stages of opinion polls in greater detail and collectively show how opinion polls are turned into specific news stories. In Chapter 3, we focus on the selection of opinion polls. That is, we investigate what can explain whether journalists decide to cover an opinion poll or not. In Chapter 4, we turn to the content of the reporting of opinion polls, which covers the news articles dedicated to the opinion polls that journalists have decided to report on. In doing this, we show how the selection and reporting of opinion polls are shaped by a similar preference for change. Notably, when introducing the idea of change, we give extensive consideration to how we can best measure change and what the availability of these change measures means for the selection and reporting.

In Chapter 5, we analyse the next natural stage in the life of opinion polls: how politicians, experts and the public respond to them and to the stories written about them. Essentially, we delve into the implications of how these opinion polls are selected and covered. Here, we show that both elites and the broader public have a strong preference for engaging with (responding to or sharing) opinion polls that show greater changes or support a well-defined change narrative. Interestingly, we find that opinion polls showing greater changes are much more likely to go viral on Twitter.

In Chapter 6, we turn our attention to alternatives to the reporting of opinion polls. Here, we discuss how no opinion polls at all, poll aggregators, social media, and vox pops can be seen as alternatives to opinion polls, and in particular what their strengths and limitations are. The ambition here is not to force the reader to decide whether opinion polls are good or bad, but rather to understand how alternatives to opinion polls can mitigate or amplify the biases introduced in the previous chapters.

Last, in Chapter 7, we conclude by considering how the media might report on opinion polls, weighing the trade-offs between what the polls often show and what journalists wish they showed. Specifically, we first discuss the implications of the findings for how we understand the political coverage of opinion polls today and then turn to the most important questions to be answered in future work.

The book is the product of years of work on the topic of how opinion polls are reported in the media. However, while the topic should be of interest to most people with an interest in politics and opinion polls, this is an academic book and I should emphasise that it might be a tough read for a non-academic audience.

You can buy the book at Waterstones, Bookshop, Springer, Blackwell’s and Palgrave.

A problem with survey data when studying social media

We cannot understand modern politics without studying social media. Politicians as well as ordinary citizens rely on social media to discuss and consume political content. One of the data sources researchers rely on to study behaviour on social media is survey data. However, survey data come with specific challenges in this context. Here, I will illustrate one such challenge: even if you rely on a representative sample to study social media behaviour, there is no guarantee that you can use this sample to make meaningful inferences about social media users.

To understand this, we need to understand that there is a difference between the sample size you have and the sample size you end up using in your statistical models. If you have interviewed 1,000 citizens, but only 100 of these actually use social media, how much can we then actually say based on this data?

Research from the UK shows that users of Twitter and Facebook are not representative of the general population (see also this paper). However, there are even more potential problems with using survey data to study behaviour on social media. Specifically, we know that the “effective sample” is not necessarily similar to the actual sample. That is, just because you have a specific sample, you cannot expect that estimates obtained from a regression will apply to the population that the actual sample is representative of (see this great paper for more information).
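To make the effective sample idea concrete, here is a minimal sketch with simulated data (the variable names are made up for the illustration): when a covariate soaks up most of the variation in the variable of interest, the observations that still vary after conditioning carry almost all of the weight in the regression, so the effective sample can look very different from the nominal one.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000

# Simulated survey: young respondents vary in social media use, older ones barely do
young = rng.binomial(1, 0.5, n)
social_media_use = np.where(young == 1,
                            rng.binomial(1, 0.5, n),    # plenty of variation
                            rng.binomial(1, 0.03, n))   # almost none

# Regression weights: conditional variance of the variable of interest
# after partialling out the covariate (the logic behind the effective sample)
X = sm.add_constant(young)
residuals = sm.OLS(social_media_use, X).fit().resid
weights = residuals ** 2

share_of_weight_young = weights[young == 1].sum() / weights.sum()
print(f"Young respondents: {young.mean():.0%} of the nominal sample, "
      f"but {share_of_weight_young:.0%} of the effective sample")
```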

I was thinking about this issue when I read a paper titled “Ties, Likes, and Tweets: Using Strong and Weak Ties to Explain Differences in Protest Participation Across Facebook and Twitter Use”. You can find the paper here. There are several issues with the paper, but I will focus on one particular issue here, namely the small sample we end up looking at in the manuscript and its implications.

The paper examines whether people have strong and weak ties on Facebook and Twitter and how that matters for their participation in protest activities. Specifically, the paper argues that different types of social ties matter on Facebook and Twitter. The paper expects, in two hypotheses, that strong ties matter more for Facebook use in relation to protest behaviour whereas weak ties matter more for Twitter use in relation to protest behaviour. This is also what the paper (supposedly) finds empirical support for. Here is the main result presented in Table 1 in the paper:

That’s a lot of numbers. Very impressive. And look at that sample size… 995! But here is the problem: while the paper relies on a representative survey with 1,000 respondents, only 164 of these respondents use Facebook and Twitter. You could have had a sample size of 100,000, but if only 164 of those used Facebook and Twitter, how much should we believe that the findings generalise to the full sample?

Out of the 164 respondents using Facebook and Twitter, only 125 have weak or strong ties. And only 66 of the respondents have variation in ties within the respective social media platform (i.e. not the same weak or strong ties on Facebook or Twitter). Only 18 respondents in the sample have different ties across the two social media platforms (i.e. not the same weak or strong ties on Facebook and Twitter). Here is a figure showing how we end up having variation on the relevant variables for only 2% of the sample:

This means that when we enter a regression framework where we begin to control for all of the aforementioned variables, we will be putting a lot of emphasis on very few cases.
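To spell out the arithmetic behind that figure, here is a rough sketch using the numbers reported above (not the paper's replication files):

```python
# Rough funnel from the survey to the respondents the hypotheses actually hinge on
nominal_sample = 1000     # respondents in the survey
both_platforms = 164      # use both Facebook and Twitter
any_ties = 125            # of those, have weak or strong ties
within_variation = 66     # variation in ties within a platform
across_variation = 18     # different ties across the two platforms

for label, count in [("Use both Facebook and Twitter", both_platforms),
                     ("Have weak or strong ties", any_ties),
                     ("Variation in ties within a platform", within_variation),
                     ("Different ties across platforms", across_variation)]:
    print(f"{label}: {count} respondents ({count / nominal_sample:.0%} of the sample)")
```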

Why do we care about this? Because the results are weak (and definitely not strong). Even minor adjustments to the analysis will make these results throw in the towel and beg for mercy. However, this is not the impression you get when you read the paper, in particular given how confident the authors are that the results are representative: “To make the results more representative of the population, all analyses were conducted using a post-stratification weight (although the results are virtually the same when using unweighted data).”
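A claim like that is easy to check when the data are available. Here is a minimal sketch of the comparison, assuming a data frame with hypothetical column names (including a ‘poststrat_weight’ column) and using OLS purely for illustration:

```python
import pandas as pd
import statsmodels.formula.api as smf

def compare_weighted_unweighted(df: pd.DataFrame) -> None:
    """Fit the same model with and without post-stratification weights."""
    formula = ("protest ~ strong_tie_facebook + weak_tie_facebook + "
               "strong_tie_twitter + weak_tie_twitter")
    unweighted = smf.ols(formula, data=df).fit()
    weighted = smf.wls(formula, data=df, weights=df["poststrat_weight"]).fit()
    # If the two tables tell different substantive stories, the results are
    # not 'virtually the same'.
    print(unweighted.summary())
    print(weighted.summary())
```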

I informed the authors that their findings are not virtually the same when using unweighted data, and that the coefficient for ‘Strong-tie Twitter use’ is actually for ‘Weak-tie Facebook use’ and vice versa. Based on this, the authors issued a corrigendum to the article, stating that: “On Table 1, the study reports regression coefficients for variables in the study. Due to a clerical error, the coefficients for two variables, strong-tie Twitter use and weak-tie Facebook use, are flipped. In Figure 1, however, the same coefficients are correctly displayed. A corrected table appears below. The authors apologize for the confusion this error may have caused.”

Notice how there is nothing about the fact that the results do not hold up when looking at the unweighted data. Interestingly, while not addressing the issue in the text of the corrigendum, the new Table 1 looks nothing like the old Table 1 (i.e. the table presented above). Here is the new table:

You will see that this table is corrected in a weird manner. What happened to Model 1? In the new table, we only see two different versions of Model 2. Notice here that, for the unweighted data, neither strong nor weak ties on Twitter have a statistically significant effect. Only the two coefficients for ties on Facebook are statistically significant. The same results? No. Virtually the same results? Also no.

Why do the authors say that the results are virtually the same? Because they conduct statistical tests to see whether the coefficients are different across the two models and find no statistically significant differences. This is a very conservative threshold and the coefficients would need to change a lot before they would no longer be “virtually” the same.
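I do not know exactly which test they used, but the standard z-test for the equality of two regression coefficients (ignoring, for simplicity, the covariance between the two estimates, which here come from the same respondents) takes the form:

$$ z = \frac{b_1 - b_2}{\sqrt{SE_{b_1}^2 + SE_{b_2}^2}} $$

With standard errors of the size we see in a subsample this small, the coefficients can move around quite a bit before z crosses any conventional threshold, which is exactly why “no significant difference between the models” is a weak standard for “virtually the same”.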

However, take a look at the results in the model and see whether they are in line with the key “finding” in the paper. The relevant heuristic here is the following question: Would the authors still have made the same interpretation, i.e. that weak ties matter more than strong ties on Twitter, if only presented with Model 2 using unweighted data? I find that unlikely, especially as the coefficient for weak ties on Facebook is statistically significant in this model.

While there is something predictable about the response from the authors, I do find it interesting that they acknowledge the relevance of reporting the results using unweighted data. Kudos for the transparency, I guess.

What can we learn from this? There are some methodological recommendations for other researchers who actually care about these issues. First, improve the (effective) sample size. Remember that 1,000 observations might not be 1,000 observations once you are done clicking on fancy buttons in SPSS. This is even more relevant when there is a lot of measurement error. One study, for example, showed that self-reported Facebook usage correlates at best .4 with Facebook logs of user activity.
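To see why a correlation of .4 matters, here is a rough, simulated illustration of classical measurement error (not the study's data): if the observed measure correlates .4 with true usage, a regression on the observed measure recovers only a small fraction of the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

true_use = rng.normal(size=n)
# Add enough noise that the observed measure correlates roughly .4 with true usage
observed_use = true_use + rng.normal(scale=np.sqrt(1 / 0.4**2 - 1), size=n)
outcome = 1.0 * true_use + rng.normal(size=n)   # true slope = 1

print(np.corrcoef(true_use, observed_use)[0, 1])   # about 0.4
print(np.polyfit(observed_use, outcome, 1)[0])     # about 0.16, not 1
# Classical measurement error attenuates the slope towards zero:
# beta_observed is roughly reliability * beta_true, with reliability = 0.4**2 here.
```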

Second, we should care about better sampling (ideally primarily studying social media users). There is little value in having a representative sample if the findings barely apply to that representative sample (or the population of interest). I doubt we have learned anything about the relevance of social media from looking at this observational data from a non-representative survey in Chile with limited variation on the key variables of interest.

Third, while we know a lot about social media, there is still a lot to be understood, and I would like to see researchers deal with “simpler” hypotheses before turning to complex ideas about how strong and weak ties work across different social media platforms. Sure, it is an interesting idea, and I am convinced the authors will spend more time celebrating and promoting their h-index than taking my concerns into account. However, I am – again – not convinced that we have learned a lot about how social media matter upon reading this paper.

There are many challenges with survey data when studying social media, and I am not against using such data at all. Most of my research relies on survey data, and I believe we can use such data to say a lot about social behaviour, including on social media. However, there are particular problems that we should be aware of, including what sample we are actually looking at and how that shapes the results we get out of our statistical models.

“This letter closes the question”

Here is the beginning of an abstract from a recent study published in the American Political Science Review:

Larger protests are more likely to lead to policy changes than small ones are, but whether or not attendance estimates provided in news or generated from social media are biased is an open question. This letter closes the question: news and geolocated social media data generate accurate estimates of protest size variation.

I disagree from the bottom of my heart. The question is not closed. I don’t understand why researchers need to mislead in the abstract in order to make the research stand out.

Let me make a guess. It’s because the editor, reviewers and authors all know that this can be one of the studies people will cite uncritically (when they use unreliable or reliable protest size data). “We thank Reviewer 1 for paying attention to this issue. We have consulted the literature and we now provide a reference to a study showing that social media data accurately measure protest size variation“. I mean, it makes sense if you read the title of the paper: “News and Geolocated Social Media Accurately Measure Protest Size Variation”. My prediction is that we will see a lot of research where news and social media data will be used to measure protest size with no further validation (beyond a reference to this study).

Is that a problem? The research is strong and I have no reason to be concerned with the quality of the work (I also admire the work of some of the researchers mentioned in the ‘Acknowledgement’, so I assume it’s good). I believe there is a problem, however, in how researchers try to sell their findings as closing a debate that has merely started.

The study in question looks at one salient protest type (the Women’s March) in one country (the US). The (peaceful) Women’s March protest is not necessarily representative of most protests we see around the world (or of the sociodemographic composition of protest participants in general) – and the researchers know that. Actually, they are explicit about this towards the end of the paper: “Since this paper’s validation has only been tested on one event, the scope to which it holds remains to be tested. The results probably hold in other wealthy democracies, though for now that claim remains an assumption.” Why not add that to the abstract? Seriously. “The results probably hold in other wealthy democracies, though for now that claim remains an assumption”.

If that claim remains an assumption, why then close the question in the abstract? I am tired of how researchers are pandering to the intellectual laziness of their colleagues. Why not be honest and just say that the study adds evidence to the matter? And why do I bother writing a blog post about a single paragraph in a paper? Because I believe we should care about how we communicate research findings – and in particular what can be learned from a single study.