Noise

The world is full of noise. This is not a novel insight. Luckily, the power of statistical models to predict human behaviour is limited, and if a model is able to predict all relevant variation in an outcome of interest, we should be concerned about overfitting and other potential problems. There are good reasons why R² is rarely anywhere near 1 in regression outputs and why regularization in machine learning is such a hot topic.

Noise is part of the error term in our statistical models. A good model does not try to explain everything about the world (and thereby leave no noise at all). Unsurprisingly, within the domain of behavioural economics, the idea of noise as a core component of human behaviour has been known for decades. Humans are not (fully predictable) machines, partially because there are limits to how we process information. The concept of bounded rationality, introduced by Herbert Simon, is one good example of how we can understand the existence of noise.

The psychologists Daniel Kahneman and Amos Tversky improved our understanding of human behaviour by convincingly demonstrating that a lot of what goes into the error term, i.e., what we treat as noise, is not in fact noise but predictable behaviour rooted in cognitive biases. Here is how the economist Richard Thaler describes this historical development in his book, Misbehaving: The Making of Behavioural Economics:

I believe many economists ignored [Herbert] Simon because it was too easy to brush aside bounded rationality as a “true but unimportant” concept. Economists were fine with the idea that their models were imprecise and that the predictions of those models would contain error. In the statistical models used by economists, this is handled simply by adding what is called an “error” term to the equation. Suppose you try to predict the height that a child will reach at adulthood using the height of both parents as predictors. This model will do a decent job since tall parents tend to have tall children, but the model will not be perfectly accurate, which is what the error term is meant to capture. And as long as the errors are random—that is, the model’s predictions are too high or too low with equal frequency—then all is well. The errors cancel each other out. This was economists’ reasoning to justify why the errors produced by bounded rationality could safely be ignored. Back to the fully rational model! Kahneman and Tversky were waving a big red flag that said these errors were not random. Ask people whether there are more gun deaths caused by homicide or suicide in the U.S., and most will guess homicide, but in fact there are almost twice as many gun deaths by suicide than homicides. This is a predictable error.

Over the years, thousands of studies have tried to remove noise from the models and give the predictable errors unique names, often in the form of specific cognitive biases. Today, behavioural economics is a thriving research agenda uncovering too many predictable errors (biases) to cover in a blog post (or a book for that matter!). In his book from 2011, Thinking, Fast and Slow, Daniel Kahneman summarised a lot of the research within the field (alas, a lot of this research relied on noisy data and, accordingly, does not hold up well).

Three years prior to the publication of Thinking, Fast and Slow, the book Nudge: Improving Decisions about Health, Wealth, and Happiness, by Richard Thaler and Cass Sunstein, brought behavioural economics to the policy-makers’ toolbox. The basic principle is that traditional economic models of human behaviour will not be sufficient if we are to reach policy objectives. In other words, there is too much noise generated by traditional models that behavioural economics can help us understand – and even take advantage of if we are to improve decisions about … health, wealth, and happiness.

However, despite decades of research on minimising the error term, i.e., shedding light on the predictable errors in human judgments and decisions, the fact remains that there is still a lot of noise in human judgments. But maybe we focus too much on what we can predict and too little on noise? This is the topic of the new book, Noise: A Flaw in Human Judgment, by Daniel Kahneman, Olivier Sibony and Cass Sunstein. In other words, the book is not trying to simply understand biases, but this thing called noise. Or, as the authors introduce the focus of the book: “The topic of bias has been discussed in thousands of scientific articles and dozens of popular books, few of which even mention the issue of noise. This book is our attempt to redress the balance.”

Is the book successful in redressing this balance? Maybe, but I am not convinced. Here is my main issue with Noise: the book is trying to both argue that noise is rarely acknowledged and we know very little about noise (and, for that reason, we need new concepts to study noise) and summarise decades of research on noise (including research done by the authors themselves decades ago). The book is both trying to set a new research agenda with actionable insights (what Sunstein did with Nudge) and introduce the reader to decades of research (what Kahneman did with Thinking, Fast and Slow). You can’t have your cake and eat it too.

In reading the book you will learn about different types of noise, in particular system noise, level noise, pattern noise and occasion noise. Despite the graphical presentation of the sum of squares in various chapters, and especially in chapter 17 on the components of noise, it is difficult to follow the reasoning throughout the book. I am left with the feeling that some additional conceptual work is needed for the concepts to fully work, and after reading the book I still find their added value limited. Also, at one point the book states that system noise can be broken into three components (level, pattern, and occasion noise), whereas the authors later write “System noise can be broken down into level noise and pattern noise.” I believe the authors could have done a much better job by providing more conceptual clarity, for example with tables and additional useful examples to better demonstrate and discuss the usefulness and relevance of the concepts.
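To make the two-component version concrete, here is a minimal sketch in Python of one way to read it: a sum-of-squares decomposition on a judges-by-cases table. The data, the variable names and the effect sizes are all made up for illustration, and this is not the authors' code. With a single judgment per judge and case, occasion noise cannot be separated out and ends up inside the pattern component, which may be one reason both the two-part and the three-part descriptions appear.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: rows are judges, columns are cases (e.g. sentences in months).
n_judges, n_cases = 5, 8
case_effect = rng.normal(24, 6, size=(1, n_cases))        # some cases call for more
judge_effect = rng.normal(0, 3, size=(n_judges, 1))       # some judges are harsher
interaction = rng.normal(0, 2, size=(n_judges, n_cases))  # judge-by-case quirks
judgments = case_effect + judge_effect + interaction

grand_mean = judgments.mean()
judge_means = judgments.mean(axis=1, keepdims=True)
case_means = judgments.mean(axis=0, keepdims=True)

# System noise: between-judge variance within each case, averaged over cases.
system_noise_sq = judgments.var(axis=0).mean()

# Level noise: variance of the judges' average levels.
level_noise_sq = judge_means.var()

# Pattern noise: what is left after removing judge and case averages.
residual = judgments - judge_means - case_means + grand_mean
pattern_noise_sq = (residual ** 2).mean()

print(system_noise_sq, level_noise_sq + pattern_noise_sq)
```

The two printed numbers agree because the cross term between judge levels and residuals sums to zero. Note that this uses population variances (NumPy's default `ddof=0`); with `ddof=1` the identity only holds approximately.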

In particular, I would like to see the authors use existing concepts in the scientific literature instead of pretending that they are the first to come up with concepts to think about reliability. I can understand if the authors don’t want to rely on too many concepts, but an Appendix listing the relevant and related concepts would be great. Instead, the book begins by introducing the well-known distinction between bias (validity) and noise (reliability) without mentioning reliability. (Surprisingly, the book is doing what it can not to mention the word reliability more than necessary.) Here the core distinction is also introduced: bias is systematic deviation and noise is random scatter.
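The distinction between systematic deviation and random scatter corresponds to the standard decomposition of mean squared error into squared bias plus variance. A minimal sketch in Python, assuming a known true value and simulated judgments (the numbers are hypothetical and chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

true_value = 100.0  # hypothetical quantity being judged
# Hypothetical judgments: too high by 5 on average (bias), scattered with sd 10 (noise).
judgments = true_value + 5.0 + rng.normal(0, 10, size=1000)

errors = judgments - true_value
bias = errors.mean()         # systematic deviation
noise_sq = errors.var()      # random scatter around the average error
mse = (errors ** 2).mean()   # overall error

print(mse, bias ** 2 + noise_sq)
```

The two printed values coincide, which is the sense in which bias and noise can be treated as separate contributions to overall error.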

The first chapter introduces a key theme of the book, namely that “wherever there is judgment, there is noise—and more of it than you think”. The authors then set out to understand noise in various domains, such as medicine, child custody decisions, forecasts, asylum decisions, personnel decisions, bail decisions, forensic science, and patent grants. However, the book is not interested in random scatter but in explaining the systematic components of noise. That is, it actually focuses more on biases than on noise. Here is how chapter 1 begins:

Suppose that someone has been convicted of a crime—shoplifting, possession of heroin, assault, or armed robbery. What is the sentence likely to be?
The answer should not depend on the particular judge to whom the case happens to be assigned, on whether it is hot or cold outside, or on whether a local sports team won the day before.

This could have been the introduction to a lot of recent pop psychology books with one-word book titles. Again, the problem is that we are not really interested in noise here, but rather in showing that there is less noise (by adding variables on judge characteristics, outdoor temperature, sport results, etc.). Accordingly, the book is more about showing that what we believe is noise is not actually noise. This is the noise paradox. Once you can explain random scatter, it is no longer random. Once we can explain noise, there is less noise. What the book is doing with a lot of examples is to say that there is less noise than we think there is (whereas the authors interpret this as there being more noise than we think). In sum, there are contradictory forces at play. On the one hand, the authors want to explain noise (i.e., say that we can understand and predict noise). On the other hand, the authors want to say that there is more noise than we expect (cf. “wherever there is judgment, there is noise—and more of it than you think”).
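The paradox can be illustrated with a small simulation. The sketch below, in Python, assumes a made-up data-generating process in which a sentence depends on outdoor temperature plus unexplained scatter; the coefficients and the helper `residual_variance` are hypothetical and only serve the illustration. Once temperature enters the model, the variance left in the error term shrinks.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

temperature = rng.normal(20, 8, size=n)          # hypothetical covariate
unexplained = rng.normal(0, 4, size=n)           # scatter we cannot explain
sentence = 30 + 0.3 * temperature + unexplained  # assumed data-generating process


def residual_variance(X, y):
    """Variance of the OLS residuals for design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (y - X @ coef).var()


intercept = np.ones((n, 1))
var_without = residual_variance(intercept, sentence)
var_with = residual_variance(np.column_stack([intercept, temperature]), sentence)

print(var_without, var_with)
```

What looked like noise in the intercept-only model is partly a predictable pattern in the second model. The more such variables a study finds, the smaller the residual becomes, which points to less noise rather than more.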

This focus on showing how “noise” matters, or how small and irrelevant factors matter, also brings some low-quality research into the book (see, for example, here and here). To be fair, the authors also rely on a lot of high-quality research, but they use low-quality research to make some extreme statements. Accordingly, the book is worst when it provides conclusions such as “If you are suffering political persecution in your home country and want asylum elsewhere, you should hope and maybe even pray that your hearing falls on a cool day.” Again, this has more to do with focusing on noise in data than providing actionable insights on how to deal with noise. This is also how you end up with coverage about the book such as “Why You Shouldn’t Buy Bitcoin When You’re Hungry”. In my view, such articles add more noise than signal to our understanding of human behaviour. And the book is not doing a good job of redressing the balance.

The problem with the definition of noise in the book (or rather the lack thereof) is that every difference we can imagine can be understood as noise. Anywhere there is variation, there is noise. For example, because the countries of the world did not all respond to the COVID-19 crisis in the same manner, we can conclude that there is noise. Or, as the authors formulate it: “For another exercise in counterfactual thinking, consider how different countries and regions responded to the COVID-19 crisis. Even when the virus hit them roughly at the same time and in a similar manner, there were wide differences in responses. This variation provides clear evidence of noise in different countries’ decision making.” This is not the best way to think about noise, nor a good example of noise in decision making. (Also, apparently, there is so much noise in the world, yet here is the cognitive bias that makes you panic about a pandemic.)

It is old news that people do not respond to treatments (events, news, pandemics, Sundays, hunger, books, etc.) in a similar way and it is not surprising that countries did not all respond with the exact same policies to the COVID-19 pandemic. Even scientific studies differ in the results they find and we now have several meta-analytic techniques we can rely on to understand how effects vary over different persons, locations, treatments, and outcomes. However, if Noise is the first book you ever read, you will have the impression that very little work has been done to understand such differences in treatment effects.

The statement that noise can be any variation also means that the book is dealing with different topics that should not necessarily be put together if we are to reduce the noise in our thinking. The authors write that noise “is variability in judgments that should be identical.” This means that noise is both treatment heterogeneity (“If every respondent makes the same mistake, there is no noise.”) and measurement error (“Can we agree on an anchor case that will serve as a reference point on the scale?”). However, the authors do not convincingly show why we should use the same concept to consider both treatment heterogeneity and measurement error. On the contrary, it seems like the authors rely on a catch-all definition of noise solely for the purpose of bringing different points and arguments together within the same book. This also makes the book very difficult to review. There is simply too much going on.

Overall, I would have liked to see the authors focus more systematically on how little we can actually explain (instead of how much we can explain). For all the examples they provide throughout the book, to what extent are we actually able to predict behaviour? At some point the authors argue that correlations of about .2 are quite common in human affairs, but I find such interpretations too optimistic based on what we know from the replication crisis within psychology (and related fields). Speaking of correlations, I have also seen some discussions of how the book overlooks that the absence of a correlation is not evidence of the absence of causation (Rachael Meager made this point on Twitter, and see the discussion on Andrew Gelman’s blog where Daniel Kahneman also responds).
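One textbook way in which a zero correlation can coexist with causation is a nonlinear relationship. The sketch below is a generic Python illustration with made-up data, not necessarily the example raised in the discussions mentioned above: y is fully determined by x, yet the linear correlation is essentially zero.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 201)  # values symmetric around zero
y = x ** 2                       # y is fully determined by x

r = np.corrcoef(x, y)[0, 1]
print(round(abs(r), 6))
```

Effects that differ in sign across subgroups can produce the same pattern: the average association is close to zero even though the treatment matters for everyone.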

And speaking of “speaking of”. Each chapter ends with a few “Speaking of …” bullet points to capture the key points relevant to the chapter in question. However, these are often symptomatic of the fact that the general insights provided throughout the chapters are trivial at best and confusing at worst. Consider these randomly selected examples: “To fight noise, they first have to admit that it exists.”, “Wherever there is judgment, there is noise—and that includes reading fingerprints”, “Before we start discussing this decision, let’s designate a decision observer” and “We have kept good decision hygiene in this decision process; chances are the decision is as good as it can be”.

Ironically, the book is best in part 6 when the authors discuss the seven objections to reducing or eliminating noise: 1) it is not always cheap to reduce noise, 2) reducing noise can introduce bias, 3) eliminating noise can reduce the feeling of respect and dignity, 4) noise is a prerequisite for moral and political evolution, 5) noise makes it difficult for people to exploit a system, 6) noise introduces uncertainty that can be used in combination with biases to improve social outcomes, and 7) reducing noise might lead to less creativity. These are not original insights but there are some good discussions and it is great to see the book devote attention to the benefits of noise as well, especially to provide balance to the naive optimism provided throughout the book when it comes to noise-reducing strategies such as algorithms.

Overall, if you care about noise (as you should), I would not recommend this book. Instead, if you do want a book with noise in the title, consider Nate Silver’s The Signal and the Noise (which also focuses on the work by Philip E. Tetlock, provides references to Moneyball, etc.) and Nassim Nicholas Taleb’s Fooled by Randomness. The latter provides a much better introduction to noise, uncertainty, and randomness. Recent books that deal with some of the same topics and ideas are Dark Data: Why What You Don’t Know Matters (I provided a few thoughts on the book here) and Radical Uncertainty: Decision-Making Beyond the Numbers.

Philip E. Tetlock describes Noise as “A masterful achievement and a landmark in the field of psychology” and Cass Sunstein said in an interview that: “Unlike bias, noise isn’t intuitive, which is why we think we’ve discovered a new continent”. This might very well be the case but why not acknowledge previous research on noise? If you are truly discovering a new continent, you need to provide a stronger case for this continent being new. For example, why not mention the work on noise traders in financial markets (inspired by Fischer Black, and also described in Thaler’s book Misbehaving)? As long as the book does not deal with all of this research, Andrew Gelman is correct in his comment: “But then I realized that Sunstein kinda is like Columbus, in that he’s an ignorant guy who sails off to a faraway land, a country that’s already full of people, and then he goes back and declares he’s discovered the place.”

This is not to say that there isn’t a lot of new land to discover. There is a lot of potential in a book about noise. Noise is all around us and we are not good at dealing with noise. How, for example, do we use different strategies to deal with different types of noise? And what about when people do not agree with each other about what is noise and what is not? (We might call this “noise in noise”.) What is noise for some people can be a signal for other people. In opinion polls, there is a lot of noise that journalists use to create engaging narratives – but political scientists (at least those not yet turned pundits) see nothing but random error. (We used such noise to study how small differences within the margin of error affected the coverage of opinion polls.)
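For readers less familiar with polls, the margin of error mentioned above can be approximated with the usual formula for a proportion. A minimal sketch in Python, assuming a simple random sample; the sample size, the party shares and the function name `margin_of_error` are hypothetical:

```python
import math


def margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(share * (1 - share) / n)


n = 1000                        # hypothetical sample size
moe = margin_of_error(0.25, n)  # a party polling at 25 percent
change = 0.26 - 0.25            # hypothetical shift between two polls
within_noise = abs(change) < moe

print(round(moe, 3), within_noise)
```

A one-point change is well inside the roughly 2.7-point margin for a single poll here, and the margin for the difference between two independent polls is larger still, so a headline about such a change is a narrative built on random error.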

Noise is a book closer to being an appendix to Thinking, Fast and Slow and Nudge than an original piece of work that improves our understanding of human behaviour, and in particular the role of noise. Again, there is a lot of potential in providing a book-length examination of noise in human judgment and decision making. And while there are definitely good things to say about Noise, it is a lost opportunity to provide such an examination. In sum, this is a fairly lengthy book with a low signal-to-noise ratio for people interested in the scientific study of noise/cognitive biases/measurement error/treatment heterogeneity/reliability.

There is a lot of noise in Noise.