Behavioral economics has identified dozens of cognitive biases that stop us from acting ‘rationally’. But instead of building up a messier and messier picture of human behavior, we need a new model.
From the time of Aristotle through to the 1500s, the dominant model of the universe had the sun, planets, and stars orbiting around the Earth.
This simple model, however, did not match what could be seen in the skies. Venus appears only in the evening or morning sky; it never crosses the night sky as we would expect if it were orbiting the Earth. Jupiter moves across the night sky but will abruptly turn around and go back the other way.
To deal with these ‘anomalies’, Greek astronomers developed a model in which each planet moves on two circles. A large circle called the deferent is centered on the Earth, providing the classic geocentric orbit. A smaller circle, called the epicycle, has its center on the rim of the deferent. The planet travels around the epicycle while the epicycle’s center travels around the deferent. This combination of two motions allowed planets to shift back and forth across the sky.
But epicycles were still not enough to describe what could be observed. Earth needed to be offset from the center of the deferent to generate the uneven length of seasons. The deferent had to rotate at varying speeds to capture the observed planetary orbits. And so on. The result was a complicated pattern of deviations and fixes to this model of the sun, planets, and stars orbiting around the Earth.
Instead of this model of deviations and epicycles, what about an alternative model? What about a model where the Earth and the planets travel in elliptical orbits around the sun?
By adopting this new model of the solar system, a large collection of deviations was shaped into a coherent model. The retrograde movements of the planets were given a simple explanation. The act of prediction became easier as a model that otherwise allowed astronomers to muddle through became more closely linked to the reality it was trying to describe.
Behavioral economics today is famous for its increasingly large collection of deviations from rationality, or, as they are often called, ‘biases’. While these deviations are useful in applied work, it is time to shift our focus from collecting departures from a model of rationality that we know is not true. Rather, we need to develop new theories of human decision-making to progress behavioral economics as a science. We need heliocentrism.
The dominant model of human decision-making across many disciplines, including my own, economics, is the rational-actor model. People make decisions based on their preferences and the constraints that they face. Implicitly or explicitly, they are assumed to have the computational power to calculate the best decision and the willpower to carry it out. It’s a fiction but a useful one.
As has become broadly known through the growth of behavioral economics, there are many deviations from this model. (I am going to use the term behavioral economics throughout this article as a shorthand for the field that undoubtedly extends beyond economics to social psychology, behavioral science, and more.) This list of deviations has grown to the extent that if you visit the Wikipedia page ‘List of Cognitive Biases’ you will now see in excess of 200 biases and ‘effects’. These range from the classics described in the seminal papers of Amos Tversky and Daniel Kahneman through to the obscure.
We are still at the collection-of-deviations stage. There are not 200 human biases. There are 200 deviations from the wrong model.
Why we study biases
The collection of deviations in astronomy did have its uses. Absent the knowledge of heliocentric orbits, astronomers still made workable predictions of astronomical phenomena. Ptolemy’s treatise on the motions of the stars and planets, Almagest, was used for more than a millennium.
The collection of biases also has practical applications. Today’s highest-profile behavioral economics stories and publications involve applied problems, be that boosting gym attendance, vaccination rates, organ donation, retirement savings, or tax return submission. Develop an intervention based on potential biases leading to the (often assumed) suboptimal behavior, test, and publish. This program of work has had some success.
But there is something unsatisfying about this being the frontier of behavioral economics as a science. Dig into many of these applications and you see a philosophy of ‘grab a bunch of ideas and see which ones work’. There is no theoretical framework to guide the selection of interventions, but rather a potpourri of empirical phenomena to pan through.
Selecting the right interventions is not trivial. Suppose you are studying a person deciding on their retirement savings plans. You want to help them make a better decision (assuming you can define it). So which biases could lead them to err? Will they be loss averse? Present biased? Regret averse? Ambiguity averse? Overconfident? Will they neglect the base rate? Are they hungry? From a predictive point of view, you have a range of countervailing biases that you need to disentangle. From a diagnostic point of view, you have an explanation no matter what decision they make. And if you can explain everything, you explain nothing.
This problem has led to the development of megastudies, whereby large numbers of interventions are trialed in a single domain. For example, a recent megastudy on gym attendance trialed 53 interventions to increase gym attendance against a control. These interventions included social norms: ‘Research from 2016 found that 73% of surveyed Americans exercised at least three times per week. This has increased from 71% in 2015’. They tested combinations of micro-incentives, whereby people were given Amazon credit for attending the gym. Some incentives were loss-framed in that the experimental participants were told that they were given a certain number of points and would lose them if they did not attend. The largest effect was generated in the intervention group where incentives were provided for returning to the gym after a missed workout. By testing many interventions in a common context, the megastudy provides a method to filter which are more effective.
There is clearly a need for studies of this type. When health experts, behavioral practitioners, and laypeople predicted the results of the megastudy interventions on gym attendance, there was no relationship between their predictions and the results. In a more recent megastudy on vaccine take-up, behavioral scientists were similarly unable to predict the results. If you can’t predict, you need to test. Surprisingly, laypeople were able to predict which vaccine interventions were more effective. Common sense, at least in this application, provided a better predictive tool than the list of biases and interesting effects known to the researchers.
Outside of applied work, the lack of a theoretical framework hampers the progress of behavioral economics as a science. Primarily, it means you don’t understand what it is that you are observing. Further, many disciplines have suffered from what is now called the replication crisis, for which psychology is the poster child. If your body of knowledge is a list of unconnected phenomena rather than a theoretical framework, you lose the ability to filter experimental results by whether they are surprising and represent a departure from theory. The rational-actor model might have once provided that foundation, but the departures have become so plentiful that there is no longer any discipline to their accumulation. Rather than experiments that allow us to distinguish between competing theories, we have experiments searching for effects.
The collection of empirical phenomena can provide a building block for theory. The observed deviations from the geocentric model of the solar system supported the development of the heliocentric model. In genetics, the theory of particulate inheritance provided an explanation for the reappearance of inherited traits in later generations and the maintenance of phenotypic variation over time. Deviations from classical mechanics when objects are near the speed of light or of subatomic size provided the foundations for relativity and quantum mechanics. It is now time for those human biases that we consider to be robust deviations to serve a similar role.
The need for new models
Scan the published economics literature and you will find relatively little work to develop new theoretical frameworks to encompass the range of biases and effects. The highest-profile publications tend to be applied work, such as the megastudies described above and collaborations with industry and government. The best minds have settled into a role closer to that of technicians or engineers.
Why is this the case?
One possibility is that economics has assembled an array of extensions to the rational-actor model that explain most of the observed economically important behavior. We don’t need another unifying theory beyond the rational-actor model we have. Our preferences can relate to any good, service, or outcome. We can have preferences over outcomes for others. Prospect theory can be used to examine choice under uncertainty. There are many models that capture how we discount over time. We can incorporate emotions. And so on.
But that proliferation of models is a problem parallel to the accumulation of biases. A proliferation of models with slight tweaks to the rational-actor model shows the flexibility and power of that model, but also its major flaw. Economics journals are full of models of decision-making designed to capture a particular phenomenon. But rarely are these models systematically tested, rejected or improved, or ultimately integrated into a common theoretical framework. And if you can find a model to explain everything, again, you explain nothing.
It is possible that the rational-actor model is as good as it gets. Richard Thaler takes this view, with behavioral economics set to become a multitude of theories and variations on those theories. He points to the eclectic collection of findings and theories in psychology as the pattern that behavioral economics will follow. But it is not clear that psychology is the right discipline to copy. Psychology was possibly the science hardest hit by the replication crisis. The weakness of much psychology research as a foundation for applied work was also brutally displayed in the early days of the Covid-19 pandemic.
Accordingly, it is not yet time to throw our hands in the air. While we are unlikely to develop a theoretical framework of human decision-making as clean as the heliocentric model, we can and should try to do better than we currently do. As scientists we want to understand more about the world. We need a theoretical backbone to guide experimental work. And a stronger theoretical framework could translate into new applications and better-directed applied work.
An alternative way of thinking about bias
If we need a new effort, what is the path from here? I believe that the most prospective approaches will have four features.
The first is a weakened focus on the concept of bias. The point of decision-making is not to minimize bias. It is to minimize error, of which bias is one component. In some environments, a biased decision-making tool will deliver the lowest error. For example, statisticians and computer scientists often use a class of procedures called regularization to generate simpler models. The procedure deliberately adds bias to reduce the error due to overfitting.
A human decision-making example is the gaze heuristic, a tool that people use to catch balls. The heuristic is simply to move so that you maintain the ball at a constant angle of gaze. This will take you to where the ball will land. The gaze heuristic results in a strange pattern of movement. You might back away from the ball as it rises and then move back in as it falls. If the ball is hit up and to the side of you, you will move to the ball in a curve. The nonlinear path you follow to catch the ball might be considered a ‘bias’, but it also performs well despite its extreme simplicity.
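A quick way to see why gaze-based rules work comes from the standard geometric analysis of ball catching (Chapman’s analysis; the simulation below is my own sketch with illustrative numbers). For a fielder standing at the landing spot of a parabolic ball, the tangent of the gaze angle rises at a perfectly constant rate, so a runner who keeps the ball’s apparent motion regular ends up exactly where it lands:

```python
import math

# Ball launched from the origin with illustrative velocities;
# no air resistance, so the flight path is a parabola.
vx, vy, g = 10.0, 15.0, 9.8
T = 2 * vy / g              # time of flight
x_land = vx * T             # landing spot

def tan_gaze(t):
    # Tangent of the gaze angle seen by a fielder standing at x_land.
    height = vy * t - 0.5 * g * t ** 2
    horizontal = x_land - vx * t
    return height / horizontal

# Sample away from the endpoints to avoid dividing by zero at landing.
ts = [T * (0.1 + 0.8 * i / 100) for i in range(101)]
tans = [tan_gaze(t) for t in ts]

# Successive differences are constant: tan(angle) grows exactly
# linearly in time when viewed from the landing spot.
diffs = [b - a for a, b in zip(tans, tans[1:])]
spread = max(diffs) - min(diffs)
print(spread)  # ~0: the gaze geometry is linear only at the landing spot
```

The heuristic exploits this geometry without requiring the fielder to compute anything about velocity, launch angle, or wind.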
Second, the human mind is a computationally constrained resource. Even if optimization is the best approach – and it often isn’t – the best we can usually do is approximate optimization. The decision-making rule needs to be feasible with the mind we have. For instance, the gaze heuristic is computationally tractable. Calculating the precise landing place from the velocity of the ball, angle of flight, and wind is not.
Another example relates to the availability heuristic, by which people judge the probability of an outcome based on the ease with which it comes to mind. The unbiased way to make a decision under uncertainty is to sum the utility of all possible outcomes weighted by their probability. This, however, is typically computationally intractable where there are many possible outcomes. In that case, a more tractable approach is to instead sample a limited number of possible outcomes, which comes with the cost of possibly not including rare but extreme outcomes. Falk Lieder, Ming Hsu, and Tom Griffiths showed that the ‘rational’ solution to this computational constraint is to over-sample extreme outcomes. That is, you should apply something like the availability heuristic by calling those more extreme (easily accessible) outcomes to mind. The result is a biased estimate, but one that is optimal given the finite computational resources at hand.
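The flavor of that result can be sketched with a toy model (my own illustration, not the authors’ actual setup): estimate an expected utility from a handful of samples, either drawn in proportion to outcome probabilities or with extreme outcomes deliberately over-sampled and then re-weighted:

```python
import random

random.seed(1)

# Toy lottery: a rare catastrophic outcome plus two common mild ones.
utilities = [-100.0, 1.0, 2.0]
probs = [0.01, 0.50, 0.49]
true_eu = sum(u * p for u, p in zip(utilities, probs))  # 0.48

# Extreme-outcome sampler: sample in proportion to probability x |utility|,
# then re-weight so the estimate still targets the true expectation.
weights = [p * abs(u) for u, p in zip(utilities, probs)]
z = sum(weights)
q = [w / z for w in weights]

def estimate(n, sampler_probs, reweight):
    idxs = random.choices(range(3), weights=sampler_probs, k=n)
    if reweight:
        return sum(utilities[i] * probs[i] / q[i] for i in idxs) / n
    return sum(utilities[i] for i in idxs) / n

def mse(sampler_probs, reweight, trials=3000, n=10):
    errs = [(estimate(n, sampler_probs, reweight) - true_eu) ** 2
            for _ in range(trials)]
    return sum(errs) / trials

plain_mse = mse(probs, reweight=False)   # usually misses the rare event
extreme_mse = mse(q, reweight=True)      # over-samples it, then corrects
print(plain_mse, extreme_mse)
```

With only ten samples per decision, the proportional sampler usually misses the catastrophe entirely and its estimates swing wildly; over-sampling extreme outcomes produces a far lower error, which is the sense in which an availability-like rule is ‘rational’ under computational constraints.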
The third feature is that the outcome of a decision is the combination of the decision-making tool and the environment in which it is used. The polymath Herbert Simon described rationality as being shaped by the two blades of a pair of scissors. One blade represents the structure of the environment, the other the computational tool kit of the decision maker. You need to examine both the tool and the environment to understand the nature of the decision that has been made.
Lieder and friends’ explanation of the availability heuristic is also relevant to this point. The evolutionary environment in which our mental toolbox developed was predominantly an environment with more limited information flow than today. What is ‘available’ to us has changed markedly. Modern media is full of extreme events. In such an environment, over-sampling extreme events may lead to too much bias.
If you approach a decision-making problem with these three features, you can still see ‘biases’. But you also build a better understanding of the basis of the bias. Instead of noting someone has made a poor decision, you might note what decision-making tool they are using, why it was appropriate (or not) for the task to which they applied it, in what alternative environments that decision rule might be effective (tools might only be right on average), and whether an alternative heuristic or decision rule might be superior for this particular problem.
Finally, any successful heliocentric approach to modeling behavior will have a fourth feature: It will be multidisciplinary. It won’t involve economics picking up a couple of random pieces of psychology. We will find insight across the sciences. I am going to highlight two fields that I believe are particularly good candidates: evolutionary biology and computer science.
Human minds are the product of evolution, shaped by millions of years of natural selection. Any theory of human behavior must be consistent with evolutionary theory.
Humans are also cultural creatures. (I am using culture as a broad term that includes technology, norms, and institutions.) Our evolved traits and culture interact with and shape one another.
What does an evolutionary approach tell us about the human mind?
For a start, it tells us something about our objectives. All your ancestors, without fail, managed to survive to reproductive age and reproduce. This does not mean that we assess every action by whether it aids survival or reproduction. Instead, evolution shapes proximate mechanisms that lead to that ultimate goal. For example, we crave the sweet and fatty foods that increased survival in ancestral times.
The shaping of proximate rather than ultimate mechanisms has some interesting consequences. In particular, our evolved traits and preferences were shaped in times different from today. Our taste for food was shaped at a time when calories were scarce – all of history until at least the past century. Today, most people in developed countries are effectively calorie unconstrained. Similarly, our evolved desire for sex may not lead to offspring in a world of effective contraception, and few of us pursue offspring maximization options such as sperm donation. This backfiring of our evolved traits and preferences in the modern environment is known as mismatch. In some environments the decision-making tool works. In others it doesn’t. This makes Simon’s scissors such an important frame.
Mismatch is a prospective frame for rethinking bias. Mental tools shaped in one environment may fail in a new context. Experiencing loss likely has a different consequence in a subsistence environment versus the welfare state. Our intuitions around whether doing something yourself is worthwhile may be inappropriate in an economy with a deep division of labor. Our tools for filtering information in small bands may not function as well in a world of social media.
Understanding objectives is important for both theoretical and applied work. A theoretical decision-making framework that misattributes someone’s objectives is built on sand: you cannot correctly specify the objective function. In applied work, misunderstanding someone’s objectives is an easy way to assume someone is making a poor decision when they’re not. We often assume the objective: maximizing wealth or income, improving health, paying tax on time, and so on. Is this the objective actually held?
Let me provide one example involving signaling, a core concept in evolutionary biology (and with some history in economics). People signal their traits to potential mates, competitors, and coalition partners, be that their intelligence, health, conscientiousness, kindness, or resources. Yet, the interests of the signaler and the receiver may not be aligned. People lie.
When should someone trust a communication? Signals can be considered trustworthy if they impose a cost (a handicap) on the bearer that only someone possessing that trait can bear. In the animal kingdom, the classic example is the peacock’s tail. Only a high-quality peacock can bear the cost. For humans, we have equivalent signals such as conspicuous consumption and risky behavior.
Many costly signals are inherently wasteful. Money, time, or other resources are burnt. And wasteful acts are the types of things that we often call irrational. A fancy car may be a logical choice if you are seeking to signal wealth, despite the harm it does to your retirement savings. Do you need help to overcome your error in not saving for retirement, or an alternative way to signal your wealth to your intended audience? You can only understand this if you understand the objective.
The benefit of understanding evolutionary objectives is richer than simply understanding the functional reason for a decision. It might enable you to understand the patterns of when a particular decision tool works or not. You can gain insight into what circumstances might evoke the behavior.
For example, Sarah Brosnan and friends researched the endowment effect in chimpanzees. The endowment effect is the phenomenon where individuals tend to place a higher value on an item they possess than they would if they did not already own that same item. If an individual is given an item and then asked if they will trade it for a second item, the endowment effect leads us to predict that they will be less likely to acquire that second item than if they had instead been presented with a simple choice between those two. Brosnan and friends found that chimps exhibited an endowment effect when presented with choices between two foods (peanut butter and juice). However, the researchers also found that the endowment effect was not present when less evolutionarily salient objects (toys) were traded.
Brosnan and friends further explored this context-specific behavior when they provided chimps with tools that could be used to access food. When the tool the chimp possessed could be used to obtain one food and the tool available by trade could be used to obtain a different food, an endowment effect was present. There was no endowment effect for the tools when food was unavailable. Taken together, the presence of the endowment effect across species suggests that it may have had adaptive value in the environments in which it developed. Further, the context-specific nature of the effect led the researchers to propose a hypothesis that the endowment effect evolved to ‘maximize outcomes during inherently risky exchange interactions’.
I am not providing these examples to support an argument that we should simply lift evolutionary ideas and take them as explaining human behavior. Rather, evolutionary biology can be a source of specific, testable predictions about behavior. We can assess those predictions against known phenomena, use them to generate new hypotheses, and test whether those hypotheses hold. That understanding in turn becomes material for bringing seemingly disparate phenomena and biases into a new framework of decision-making.
Another field that may help build a new model of human behavior is computer science and, in particular, the development of decision-making and learning algorithms.
Computer science and evolution face a similar challenge of shaping a constrained computational resource to learn and make decisions. And despite the marked difference between the biological and electronic substrates, there is a possibility that evolution and computer science will tend to converge on similar solutions to the same fundamental problem.
This means that where an effective learning or decision-making algorithm is developed, we can ask whether there is a human counterpart. Successful algorithms can be repurposed into hypotheses about how humans make decisions. Sometimes this won’t work, as (most) computer scientists are not seeking to replicate human brains. But results to date suggest this approach has some potential.
One example involves a process called temporal difference, or TD, learning. If you are training an artificial agent to achieve a goal that requires multiple actions (e.g., winning a game of chess requires many moves), providing feedback only when the goal is achieved will rarely lead to successful training. The sparsity of the feedback will lead to slow learning, if the agent learns at all. (Imagine providing chess coaching to a person simply by telling them whether they have won.) TD learning is one method that has been developed to provide feedback before the final goal is reached (e.g., in chess, counting captured pieces as worth a number of points) to enable faster, and often more accurate, learning.
To illustrate the mechanism by which TD learning works, imagine again that we are training an artificial agent to play chess. As the agent makes each move, the agent forms an expectation as to its probability of winning the game. That assessment allows it to compare the strength of different moves. As the game progresses, the changing board position may lead the agent to update its belief as to whether it will win or not. The key to TD learning is that each time the agent updates its expectations, there is a learning opportunity. If the agent takes the change in expectation as a shift toward the truth – the expectation should become more accurate the closer the event – the agent can learn even before it has experienced the final outcome.
In our chess example, a change in the estimated probability of winning is evidence that the estimate of winning at the previous move could be improved. The agent learns from the change in expectations to get a better sense of the strength of that previous board position. It doesn’t need to wait until the end of the game for that learning opportunity.
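The update itself is tiny. Here is a minimal TD(0) sketch on a classic toy problem (my illustration, not an example from the text): a five-state random walk that ends with reward 1 on the right and 0 on the left. At each step, the value of the previous state is nudged toward the reward received plus the value of the new state, so the agent learns from the change in its expectations rather than waiting for the end:

```python
import random

random.seed(0)

# Five non-terminal states in a chain; stepping off the right end pays 1,
# off the left end pays 0. The true value of state i is (i + 1) / 6.
N = 5
values = [0.5] * N          # initial guesses
alpha = 0.05                # learning rate

for _ in range(5000):
    state = 2               # each walk starts in the middle
    while True:
        next_state = state + random.choice([-1, 1])
        if next_state < 0:          # fell off the left: reward 0
            values[state] += alpha * (0.0 - values[state])
            break
        if next_state >= N:         # fell off the right: reward 1
            values[state] += alpha * (1.0 - values[state])
            break
        # TD(0): move the estimate toward the value of the next state.
        values[state] += alpha * (values[next_state] - values[state])
        state = next_state

true_values = [(i + 1) / 6 for i in range(N)]
print([round(v, 2) for v in values])  # approaches [0.17, 0.33, 0.5, 0.67, 0.83]
```

Every non-terminal update uses only a change in expectations; the terminal reward propagates backward through the chain over many walks.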
The implementation of TD learning by computer scientists found early applied success, most famously TD-Gammon, a backgammon program strong enough to challenge solid human players.
TD learning then led to a breakthrough in the human domain. As described by Brian Christian in The Alignment Problem, the cross-fertilization occurred when Peter Dayan, a researcher who had worked on the development of the TD learning algorithm, commenced work with a group of neuroscientists. He and his new colleagues at the Salk Institute realized that the human mind also learns from temporal differences. In particular, dopamine appears to carry the prediction-error signal at the heart of TD learning (at least in this simplified version of the story).
This finding has had implications for research into happiness and hedonic adaptation and how these in turn affect behavior. If our mind uses a TD learning algorithm, it is not the level of the outcome that causes the positive feelings associated with success but prediction errors arising from exceeding expectations. This leads to a possible explanation for the centrality of reference points to Kahneman and Tversky’s prospect theory, whereby our utility is not a function of absolute levels but rather changes. Reference dependence becomes a feature of our learning process rather than a bias – or bug. This algorithm also provides a source of hypotheses about what the reference point should be.
Another prospective thread for bringing insight from computer science comes from a technique originating in psychology that is now a core part of reinforcement learning: reward shaping. In reinforcement learning, feedback is provided to agents in the form of ‘rewards’, such as a point for winning a game. The agent learns what actions will maximize its rewards. But, as noted above, granting a reward to an agent only when it reaches its final goal, such as winning a game of chess, will rarely lead to successful training.
A chess-playing agent won’t learn as it will never stumble on the full combination of moves required to win. It needs some guidance along the way in the form of a reward structure for making progress. The development of this reward structure is the task of reward shaping.
The challenge of reward shaping is that the computer scientist needs to shape the proximate rewards in a way that the ultimate objective is still achieved. There are many well-known examples of algorithms finding ways to hack the reward structure, maximizing their rewards without achieving the objective desired by their developers. For example, one tic-tac-toe algorithm learned to place its move far off the board, winning when its opponent’s memory crashed in response. With this framing, you can see the parallel with evolution and mismatch. Evolution ultimately rewards survival and reproduction, but we don’t receive a reward only at the moment we produce offspring. Evolution has given us proximate objectives that lead to that ultimate outcome, with rewards along the way for doing things that tended to (over our evolutionary past) lead to reproductive success.
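One well-studied answer to this reward-hacking problem is potential-based shaping, a known construction from the reinforcement-learning literature (due to Ng, Harada, and Russell; the trajectory and numbers below are made up for illustration). Each step’s reward is augmented by the change in a ‘potential’ over states, r + γΦ(s′) − Φ(s). Because the added terms telescope, every trajectory’s shaped return differs from its original return only by the potentials at the endpoints, so the ranking of policies, and hence the ultimate objective, is untouched:

```python
# Potential-based reward shaping, demonstrated on an arbitrary
# made-up trajectory (all numbers here are illustrative).
gamma = 0.9

# Hypothetical trajectory: states visited and the reward for each step.
states = [0, 1, 3, 2, 4]
rewards = [0.0, -1.0, 0.5, 10.0]

# Any potential function over states works; these values are arbitrary.
potential = {0: 1.0, 1: 2.0, 2: -1.0, 3: 5.0, 4: 3.0}

def discounted_return(rs):
    return sum(r * gamma ** t for t, r in enumerate(rs))

# Shaped reward for transition s -> s': r + gamma * phi(s') - phi(s).
shaped = [r + gamma * potential[s2] - potential[s1]
          for r, s1, s2 in zip(rewards, states, states[1:])]

original = discounted_return(rewards)
adjusted = discounted_return(shaped)

# The shaping terms telescope: the difference between the two returns
# depends only on the potentials at the start and end states.
endpoint_term = gamma ** len(rewards) * potential[states[-1]] - potential[states[0]]
print(abs(adjusted - original - endpoint_term))  # ~0 for any trajectory
```

The guarantee holds for any potential function and any trajectory, which is what makes this form of shaping safe: denser proximate rewards without a corrupted ultimate objective.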
Again, this parallel points to computer science as a source of hypotheses for understanding what drives our actions. What types of reward structures are effective in training algorithms? Are these reward structures reflected in humans? What do those reward structures tell us about our objectives and how we seek to achieve them?
I will close with a belated defense of the rational-actor model.
Evolution is ruthlessly rational. We should not expect evolution to produce error-strewn decision-making tools. Similarly, computer scientists seek to make rational use of the resources at hand to develop the best learning and decision-making tools they can. In that light, it is likely that what we will learn from evolution, computer science, and other fields will contain features of the rational-actor model.
However, modifications to the rational-actor model will emerge from the fact that its current conception typically involves poorly specified or incorrectly assumed objectives, a conception of rationality focused on bias rather than error, and an inadequate consideration of the constrained computational resources that we have at hand. The rational-actor model is not bad, but, like those astronomers grappling with epicycles on epicycles, we can and should try to do better.