Nash equilibrium and computation

In Beyond Nash Equilibrium: Solution Concepts for the 21st Century, Joseph Halpern cites three problems with the idea of Nash equilibrium that are inspired by computer science. These (and here I’m roughly quoting) are that Nash equilibria do not deal with “faulty” or “unexpected” behavior, they do not deal with computational concerns, and they assume that players have common knowledge of the structure of the game. I think the first and third can be roughly summed up as “Nash players should be all-knowing rationalists, but that is not always a useful assumption”.

The problem I find most immediately interesting is the need to take computation into account. The first example he gives is:

You are given an n-bit number x. You can guess whether it is prime, or play safe and say nothing. If you guess right, you get $10; if you guess wrong, you lose $10; if you play safe, you get $1. There is only one Nash equilibrium in this 1-player game: giving the right answer. But if n is large, this is almost certainly not what people will do.

This is what I would call a problem that relies on pseudoperfect agents: ones that know what a prime number is and how to test whether a number is prime, but not, immediately, whether a given number is prime. A more typical layperson could simply rely on how common primes are among numbers of that size. And of course, the cost of computing the primality of a number, in both physical and opportunity costs, needs to be included in the final outcome.
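To make the layperson's shortcut concrete, here is a minimal sketch of the expected payoffs. It rests on assumptions that are mine, not Halpern's: x is drawn uniformly at random from the n-bit numbers, the chance that it is prime is approximated by the prime number theorem as roughly 1/(n ln 2), and the cost of actually testing primality is a single hypothetical dollar figure called compute_cost. The function names are also my own.

```python
import math


def prime_probability(n_bits: int) -> float:
    """Rough chance that a random n-bit number is prime.

    Assumption: prime number theorem density 1/ln(x) with x ~ 2**n_bits.
    Only meaningful for large n_bits; clamped to 1.0 for tiny inputs.
    """
    return min(1.0, 1.0 / (n_bits * math.log(2)))


def expected_payoffs(n_bits: int, compute_cost=None) -> dict:
    """Expected dollars for each way of playing the one-player game."""
    p = prime_probability(n_bits)
    payoffs = {
        "play safe": 1.0,
        "guess prime": 10 * p - 10 * (1 - p),
        "guess composite": 10 * (1 - p) - 10 * p,
    }
    if compute_cost is not None:
        # Hypothetical: test primality first, then always answer correctly.
        payoffs["compute, then answer"] = 10.0 - compute_cost
    return payoffs


def best_action(n_bits: int, compute_cost=None) -> str:
    """Action with the highest expected payoff under the assumptions above."""
    payoffs = expected_payoffs(n_bits, compute_cost)
    return max(payoffs, key=payoffs.get)
```

Under these assumptions, a player who never computes anything should not play safe for large n: blindly guessing "composite" is right almost every time, so the interesting comparison is between that guess and paying to compute the answer.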

But really: the computational complexity of a given situation adds implicit costs to each strategy, and those costs do need to be taken into account. How often does that actually happen, though?

The young and the restless

Elderly Chinese men playing chess

It struck me recently that one of the key differences between economists and neuroscientists studying decision-making is their interest in dynamics.  Economists seem more interested in explaining how behavior operates (or should operate) on average whereas neuroscientists would like to explain trial-to-trial variability.  Decisions are rarely made just once in a lifetime, but are instead made repeatedly.  Any behaviorist would instantly tell you that this means that there will be a learning component, something that I hardly see in the economic decision-making literature (feel free to correct me if this is wrong).

In many of these repeated decisions, people are not simply making a decision in a vacuum but are responding to the actions of others.  The decision must then balance their prior beliefs, the results of recent decisions, and their predictions of how other people will act.  All of this can be incorporated into a reinforcement learning (RL) paradigm, where the expected value of any action is a combination of classical RL – where every payoff suggests future payoffs, and every loss suggests future losses – and a ‘mentalizing’ component that predicts how the opponent is likely to act, and how the opponent will react.  By fitting the responses of different brain regions to this type of model, one can get a sense of what each region is (kind of) doing.  One region that instantly pops out is the medial prefrontal cortex (mPFC): activity in this region is highly correlated with the prediction of other people’s behavior.
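A minimal sketch of that kind of hybrid model might look like the following. This is an illustration under my own assumptions, not the model fitted in either paper cited below: the learning rates alpha and eta, the mixing weight w, and all function names are hypothetical, and the mentalizing part is reduced to simply tracking how often the opponent picks each action.

```python
def update_values(q, action, payoff, alpha=0.2):
    """Classical RL: move the chosen action's value toward the payoff received."""
    q = dict(q)
    q[action] += alpha * (payoff - q[action])
    return q


def update_beliefs(beliefs, opponent_action, eta=0.2):
    """Mentalizing (simplified): shift predicted opponent probabilities
    toward the action the opponent was just observed to take."""
    return {
        a: p + eta * ((1.0 if a == opponent_action else 0.0) - p)
        for a, p in beliefs.items()
    }


def belief_values(beliefs, payoff_fn, my_actions):
    """Expected payoff of each of my actions given the predicted opponent play."""
    return {
        a: sum(p * payoff_fn(a, o) for o, p in beliefs.items())
        for a in my_actions
    }


def combined_values(q, beliefs, payoff_fn, w):
    """Blend the two sources of value; w is the weight on the mentalizing part."""
    bv = belief_values(beliefs, payoff_fn, q.keys())
    return {a: (1 - w) * q[a] + w * bv[a] for a in q}
```

With w = 0 the agent ignores the opponent entirely and learns from its own payoffs alone; with w = 1 it acts only on its prediction of the other player. Fitting a weight like w to each person's choices is one way to ask how much that person takes others into account.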

I once took a behavioral economics class in which the professor pointed out that deviations from rational behavior are only important if they translate to something in aggregate.  In other words, who cares if just a few people have abnormal mPFC function?  In a large population you won’t notice them.  But in fact there is a very large group of people with degraded mPFC: the elderly.  Thirteen percent of the US population is over the age of 65, and this group is known to have significant loss of volume in mPFC.  The prediction, then, would be that older individuals would be less inclined to take into account the behavior of other individuals when making decisions.

To test how they will act, we can use an experimental game called the “Patent Race”.  In this game, two players are selected from a pool to compete for a prize.  They are each given either a large five-credit or a small four-credit endowment, and are asked to “invest” some portion of it.  They then get to keep whatever is left over, and the person who “invested” the most wins ten extra credits.
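The payoff rule can be written down in a few lines. This is a sketch of the rules as described above, with two assumptions of mine: investments are whole credits between zero and the player's endowment, and when both players invest the same amount nobody wins the prize (the description above does not say what happens in a tie). The function name is hypothetical.

```python
def patent_race_payoffs(invest_a, invest_b, endow_a=5, endow_b=4, prize=10):
    """Return (payoff_a, payoff_b) for one round of the Patent Race."""
    if not (0 <= invest_a <= endow_a and 0 <= invest_b <= endow_b):
        raise ValueError("investment must lie between 0 and the endowment")

    # Each player keeps whatever they did not invest.
    pay_a = endow_a - invest_a
    pay_b = endow_b - invest_b

    # The strictly larger investment wins the prize.
    if invest_a > invest_b:
        pay_a += prize
    elif invest_b > invest_a:
        pay_b += prize
    # Assumption: on a tie, neither player receives the prize.

    return pay_a, pay_b
```

For example, if the five-credit player invests 2 and the four-credit player invests 3, the first keeps 3 credits and the second keeps 1 and wins the prize, for payoffs of 3 and 11.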

Cumulative distribution plots of how influential other individuals' behavior is in determining one's own behavior.  Blue represents young adults and purple-dashed represents the elderly.

There does exist a Nash equilibrium for this game, and young adults play the Nash equilibrium exactly.  Older adults, on the other hand, play a significantly different strategy.  What is more interesting, though, is that half of the elderly adults behaved as if they did not care at all about the strategy of the other player.  In other words, they made decisions using a pure reinforcement learning strategy in which they cared only about payoffs, not about how the other player was going to act.  In contrast, no young adults played like this: they all took into account the strategy that the other player would use.

References

Hampton, A., Bossaerts, P., & O’Doherty, J. (2008). Neural correlates of mentalizing-related computations during strategic interactions in humans. Proceedings of the National Academy of Sciences, 105 (18), 6741-6746. DOI: 10.1073/pnas.0711099105

Zhu, L., Walsh, D., & Hsu, M. (2012). Neuroeconomic Measures of Social Decision-Making Across the Lifespan. Frontiers in Neuroscience, 6. DOI: 10.3389/fnins.2012.00128
