In the classic game-theory scenario known as the Prisoner’s Dilemma, two prisoners held in isolation from each other are each offered a deal: if one confesses and the accomplice remains silent, the charges against the confessor are dropped in exchange for testimony against the other. If both confess, both can get early parole. If both remain silent, both are convicted of a lesser charge.

The puzzle originated in the Rand Corporation's investigations into applying game theory to global nuclear strategy. In general terms, if both players cooperate with each other, they both receive a payoff, but if one cooperates and the other does not, the cooperating player receives the smallest possible payoff and the defecting player the largest. If neither player cooperates, they both receive a payoff, but it is less than what they would gain if both had cooperated. It pays to cooperate, but it can pay even more to be selfish.


In the standard payoff matrix for the game, C and D stand for “cooperate” and “defect”. R is the “reward” payoff that each player receives if both cooperate. P is the “punishment” that each receives if both defect. T is the “temptation” that a player receives as sole defector and S is the “sucker” payoff that a player receives as sole cooperator. Credit: Stanford Encyclopedia of Philosophy
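The payoff labels above can be made concrete with a small sketch. The numeric values below are purely illustrative assumptions (they do not come from the article or the paper); any values ordered T > R > P > S give the same picture. The helper name `best_response` is likewise hypothetical.

```python
# Illustrative payoff values (an assumption, not taken from the article);
# any numbers with T > R > P > S describe a Prisoner's Dilemma.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, their_move)] -> my payoff
PAYOFF = {
    ("C", "C"): R,  # both cooperate: reward
    ("C", "D"): S,  # I cooperate alone: sucker payoff
    ("D", "C"): T,  # I defect alone: temptation
    ("D", "D"): P,  # both defect: punishment
}


def best_response(their_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, their_move)])


for their_move in "CD":
    print(f"If the other player plays {their_move}, "
          f"my best reply is {best_response(their_move)}")

# Yet both players would do better cooperating than both defecting.
print("Mutual cooperation beats mutual defection:",
      PAYOFF[("C", "C")] > PAYOFF[("D", "D")])
```

With these numbers, defecting is the best reply to either move, even though mutual cooperation pays each player more than mutual defection. That tension is the dilemma.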

That's a one-off situation. What about over time? Under narrow parameters, it is easy to see why cooperation and generosity have evolved in nature: generous strategies succeed over the long term. But adding more flexibility to the game can allow selfish strategies to be more successful, which muddies the waters a little.

In an upcoming PNAS paper, the authors examined an iterated, evolutionary version of the Prisoner’s Dilemma, in which a population of players matches up against one another repeatedly. The most successful players “reproduce” more and pass along their winning strategies to the next generation. The researchers found that, in such a scenario, cooperative and even forgiving strategies won out, in part because “cheaters” couldn’t win against themselves. Then they added a twist: not only could players alter their strategy (whether or not they cooperate), but they could also vary the payoffs they receive for cooperating.
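The remark that “cheaters” cannot win against themselves can be illustrated with a minimal sketch of a repeated game. This is not the authors' model: the payoff values, the two strategies (`tit_for_tat` and `always_defect`), and the round count are all assumptions chosen for illustration, and there is no reproduction step here.

```python
# Illustrative payoffs (an assumption), ordered T > R > P > S.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect on every round, regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds=100):
    """Play a repeated game and return the total payoff of each player."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each player sees only the other's past moves
        b = strategy_b(moves_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (300, 300)
print(play(always_defect, always_defect))  # (100, 100)
print(play(tit_for_tat, always_defect))    # (99, 104)
```

In this toy setting the defector narrowly beats a cooperator head to head, but two defectors paired together earn far less than two cooperators. In a population where high scorers reproduce, that gap is one intuition for why cooperative strategies can spread.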

This more accurately reflects the balancing of risk and reward that occurs in nature, where organisms decide not only how often they cooperate but also the extent to which they cooperate. Initially, cooperative strategies found success, but the population of players then reached a tipping point after which defection became the predominant strategy.

In a second analysis, they allowed the payoffs to vary outside the order set by the Prisoner’s Dilemma. Instead of unilateral defection winning the greatest reward, for example, mutual cooperation could reap the greatest payoff, the situation described by a game known as Stag Hunt. Or mutual defection could generate the lowest possible reward, as described by the game theory model known as Snowdrift or Hawk-Dove.
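The difference between these games comes down to how the four payoffs are ranked. The sketch below classifies a set of payoffs using common textbook orderings; the exact inequalities (strict orderings, with T, R, P and S as temptation, reward, punishment and sucker payoffs) and the function name `classify_game` are assumptions for illustration and may differ from the definitions used in the paper.

```python
def classify_game(T, R, P, S):
    """Name the two-player game implied by a payoff ordering.

    Uses common textbook conventions (an assumption):
      Prisoner's Dilemma: T > R > P > S  (sole defection pays most)
      Stag Hunt:          R > T > P > S  (mutual cooperation pays most)
      Snowdrift:          T > R > S > P  (mutual defection pays least)
    """
    if T > R > P > S:
        return "Prisoner's Dilemma"
    if R > T > P > S:
        return "Stag Hunt"
    if T > R > S > P:
        return "Snowdrift"
    return "other"


# Hypothetical payoff sets, chosen only to show each ordering.
print(classify_game(T=5, R=3, P=1, S=0))  # Prisoner's Dilemma
print(classify_game(T=3, R=5, P=1, S=0))  # Stag Hunt
print(classify_game(T=5, R=3, P=0, S=1))  # Snowdrift
```

Seen this way, a population whose payoffs are free to evolve can drift from one ordering to another, which is what it means to end up playing a different game.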

What they found was that, again, there was an initial collapse in cooperative strategies. But, as the population continued to play and evolve, players also altered the payoffs so that they were playing a different game, either Snowdrift or Stag Hunt.

“So we see complicated dynamics when we allow the full range of payoffs to evolve,” said Joshua B. Plotkin, a professor in the University of Pennsylvania Department of Biology. “One of the interesting results is that the Prisoner’s Dilemma game itself is unstable and is replaced by other games. It is as if evolution would like to avoid the dilemma altogether.”

Plotkin and co-author Alexander J. Stewart say their new conception of how strategies and payoffs co-evolve in populations is ripe for testing, with the marine bacteria Vibrionaceae as a potential model. In these bacterial populations, the researchers noted, individuals cooperate by sharing a protein they extrude that allows them to metabolize iron. But the bacteria can possess mutations that alter whether they produce the protein and how much they generate (that is, whether and how much they cooperate), as well as mutations that affect how efficiently they can take up the protein (their payoff). They believe a “natural experiment” using these or other microbes could put their theory to the test, to see exactly when and how selfishness can pay off.

“After this study, we end up with a less sunny view of the evolution of cooperation,” Stewart said. “But it rings true that it’s not the case that evolution always tends towards happily ever after.”

Source: University of Pennsylvania