Game theory models the behaviors that emerge in situations of conflict, and predicts how rational individuals driven by selfish motivations compete and cooperate. The predicted outcomes can be disappointing. The Prisoner's Dilemma (PD) is a prototypical model in game theory used to demonstrate that the interaction of two rational individuals, each attempting to optimize their own gains, can lead to outcomes that are bad for both participants. So we observe the counterintuitive result of rational individuals knowingly taking decisions that can be predicted to lead to poor results.

Countless authors have argued that this game-theoretical result is wrong. They challenge the assumption that human beings can be modeled as narrowly rational and selfish individuals. Altruism, evolutionarily driven cooperative behaviors, and mystical concepts like super-rationality are put forward as missing features in game theory.

Interestingly, when PD-type results emerge in multiplayer games, such criticism tends to be absent. Silence descends over the 'super-rationality preachers'. The very same people who argue against the outcomes of PD do seem to recognize their own behaviors in the outcomes of games that are arguably closer to the real situations of conflict and cooperation arising in our societies. Yet both the multiplayer versions and the two-player PD make identical counterintuitive predictions: rational players seek an equilibrium of outcomes that can leave every participant worse off.

What do you do when you recognize your own selfish behaviors in the non-cooperative outcomes of a model, and yet you believe humans capable of behaviors transcending selfishness? Right. You scratch your head and announce you have stumbled on a genuine paradox. So while the two-player game PD is labelled a dilemma, a multiplayer game yielding a similar outcome is labelled a paradox.

Welcome to Braess' paradox, the non-paradoxical multiplayer version of PD.

Each morning at rush hour a total of 600 commuters drive their cars from point A to point B. All drivers are rational individuals eager to minimize their own travel time. The road sections AD, CB and CD are so capacious that the travel time on them is independent of the number of cars. The sections AD and CB always take 10 minutes, and the short stretch CD takes no more than 3 minutes. The bridges, however, cause real bottlenecks, and the time it takes to traverse AC or DB varies in proportion to the number of cars taking that route. If N is the number of cars passing a bridge at rush hour, then the time to cross the section with this bridge is N/100 minutes.

Given all these figures, each morning each individual driver decides on the route to take from A to B. The outcome of all deliberations is a repetitive, treadmill-like behavior. Each morning all 600 commuters crowd the route ACDB and patiently wait for the traffic jams at both bridges to clear. The net result is a total travel time of 600/100 + 3 + 600/100 = 15 minutes for each of them.

Does this make sense?

At this stage you may want to pause and consider the route options. If you were one of the 600 commuters, would you join the 599 others in following route ACDB?

Of course you would.

Neither you nor any of the other drivers is tempted to take an alternative route like ACB or ADB. These routes would take you 600/100 + 10 = 16 minutes, a full minute longer than the preferred route ACDB.
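To check this arithmetic, here is a minimal Python sketch of the network as described above. The function name `route_time` and its load arguments are labels chosen for illustration only; times are in minutes.

```python
# Travel times (in minutes) for the example network described in the text.
# n_ac and n_db are the numbers of cars crossing bridge sections AC and DB.
def route_time(route, n_ac, n_db):
    ac = n_ac / 100      # bridge section AC: N/100 minutes
    db = n_db / 100      # bridge section DB: N/100 minutes
    ad = cb = 10         # wide sections AD and CB: always 10 minutes
    cd = 3               # short connector CD: 3 minutes
    return {"ACDB": ac + cd + db, "ACB": ac + cb, "ADB": ad + db}[route]

# Everyone, including you, takes ACDB: both bridges carry 600 cars.
stay = route_time("ACDB", 600, 600)

# You alone switch to ACB: you still cross AC together with all 600 cars.
switch_acb = route_time("ACB", 600, 599)

# You alone switch to ADB: you still cross DB together with all 600 cars.
switch_adb = route_time("ADB", 599, 600)

print(stay, switch_acb, switch_adb)  # 15.0 16.0 16.0
```

Switching alone never helps, because the bridge you still have to cross is just as crowded as before, and you trade the 3-minute connector for a 10-minute road.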

So each morning you and 599 other commuters travel along route ACDB and patiently queue up at both bridges. Everyone is happy, until one day it is announced that the next day the road stretch CD will be closed for maintenance work. This announcement is the talk of the day. There is no doubt in anyone's mind that this planned closure will create havoc. Were section AD or CB closed, it would have no impact, as these are idle roads. But section CD is used by each and every commuter. Clearly a case of poorly planned maintenance: the closure of such a busy section should never coincide with rush hour traffic!

The next morning all 600 commuters enter their cars with a deep sigh, expecting the worst. Each of them randomly selects between the equivalent routes ACB and ADB. The result is that the 600 cars split roughly 50:50 over both routes, and that both bridges carry some 300 cars. Much to everyone's surprise all cars reach point B in no more than 300/100 + 10 = 13 minutes. Two minutes faster than the route ACDB preferred by all drivers.
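The closed-road situation can be sketched the same way. This is only an illustration under the figures given above; `closed_times` is a hypothetical helper name, and the 400:200 split is an invented example of an uneven morning.

```python
# Sketch of the network with section CD closed; times in minutes.
def closed_times(n_acb, n_adb):
    """Return (time via ACB, time via ADB) for the given cars per route."""
    return n_acb / 100 + 10, 10 + n_adb / 100

# The 50:50 split from the text: both routes take 13 minutes.
even = closed_times(300, 300)
print(even)    # (13.0, 13.0)

# An uneven split makes the crowded route slower than the other one,
# so drivers on it have a reason to switch the next day.
uneven = closed_times(400, 200)
print(uneven)  # (14.0, 12.0)
```

With CD closed, the even split is the only situation in which nobody can gain by switching, and it happens to be faster for everyone than the all-ACDB routine.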

How can this be? If a group of rational individuals each optimize their own results, how can it be that all of them are better off when their individual choices are being restricted? How can it be that people knowingly make choices that can be predicted to lead to outcomes that are bad for everyone?

Asking these questions is admitting to the wishful thinking that competitive optimization should lead to an optimum. Such is not the case: competitive optimization leads to an equilibrium, not to an optimum. If you think about this, it will become clear there is nothing paradoxical about this assertion. Rather, it would be highly surprising if competitive optimization invariably led to optimum outcomes.

A question to test your understanding of the situation: what do you think will happen the day section CD gets opened again? Would all players avoid the section CD and stick to the 50:50 split over routes ACB and ADB, a choice better for all of them?

If all the others did that, that would be great. It would give you the opportunity to follow route ACDB and arrive at B in a record time of about 9 minutes (300/100 + 3 + 301/100 minutes to be precise). But of course all other rational drivers will reason the same way. So you will find yourself with 599 others again spending 15 minutes on the route ACDB, without any of you being tempted to select an alternative, yet all of you hoping for that damn attractive shortcut between C and D to be closed again.
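The slide back to the old routine can be played out in a few lines of Python. This is a rough sketch, not a traffic model: it assumes drivers switch one at a time and only when switching shortens their own trip, and the names `times` and `counts` are illustrative. Times are kept in hundredths of a minute so the arithmetic stays in integers.

```python
# Travel time per route, in hundredths of a minute, for given cars per route.
def times(c):
    ac = c["ACB"] + c["ACDB"]    # cars on bridge section AC
    db = c["ADB"] + c["ACDB"]    # cars on bridge section DB
    return {"ACB": ac + 1000, "ADB": 1000 + db, "ACDB": ac + 300 + db}

# The first driver to use the reopened shortcut: 299 others on ACB, 300 on ADB.
first_mover = times({"ACB": 299, "ADB": 300, "ACDB": 1})["ACDB"] / 100
print(first_mover)  # 9.01

# Start from the 50:50 split and let drivers switch to ACDB one at a time,
# as long as the switch makes their own trip shorter.
counts = {"ACB": 300, "ADB": 300, "ACDB": 0}
while counts["ACB"] + counts["ADB"] > 0:
    src = max(("ACB", "ADB"), key=counts.get)  # a driver on the fuller outer route
    moved = dict(counts)
    moved[src] -= 1
    moved["ACDB"] += 1
    if times(moved)["ACDB"] >= times(counts)[src]:
        break                                  # switching no longer pays off
    counts = moved

print(counts, times(counts)["ACDB"] / 100)
# {'ACB': 0, 'ADB': 0, 'ACDB': 600} 15.0
```

The loop never hits the `break`: at every step the next driver still gains by switching, right up to the point where all 600 are back on ACDB at 15 minutes each.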

## Comments


A selfish rational individual will by definition not forfeit a sure gain. There is really nothing more to say about the whole thing. Those who sympathize with Hofstadter's idea that selfish and "superrational" beings would forfeit a certain gain in the game PD, seem to have a distorted version of this game in mind. I advise them to read this write up.

If you were programming 600 robots to drive these bridges, trying to optimize for time, and supposing the robots could not communicate with each other, you would obviously make each robot choose either ACB or ADB with 50% chance. If you were programming two robots to play Prisoner's Dilemma, trying to minimize total jail time, you would obviously program each robot to Cooperate. For each such problem, there is an obvious optimal solution, and we could define "super-rational" to mean exactly that solution, with no inconsistencies. Note that you don't need one "programmer" to make the super-rational solution work. E.g. in the 600-robots case, you could have 600 different programmers - as long as each programmer knows that the other programmers are super-rational (and knows that they know, etc), a globally-optimal solution is possible. Of course this does *not* work if you know that the other players are rational (and not "super-rational") players, and of course being "super-rational" is not rational under the game-theoretic definition of rational, but that does not mean that we can't have a well-defined meaning for "super-rational".

*randomly select between ACB and ADB*", wouldn't you?

Remember, the task is to minimize the individual travel time, irrespective of what this means to the travel time of all other commuters. (If it helps, imagine that during rush hour you have to bring a deadly wounded relative to the hospital located at point B, knowing that each minute counts in terms of survival probability. Now which route do you choose?) A program that would use the simple algorithm "*always select ACDB*" would beat your "superrational" algorithm hands-down.

Ok, let's try this: Imagine you are creating an algorithm to make a choice (either in PD or in the bridge situation), and somehow know with 100% certainty that all other participants in the game will use exactly the same algorithm to make their choice. Obviously, in that case your PD algorithm would be "always Cooperate" and your bridge algorithm would be "50% chance of ACB and 50% chance of ADB". With this one extra added assumption (that everyone uses the same algorithm as you), these cooperative choices actually become rational. (Any other choice of algorithm affects everyone else's choice of algorithm by our assumption, and thus ultimately hurts you). We could define "super-rational" to mean exactly the algorithm that is optimal given this assumption. Of course, the assumption does not hold in most situations, so in most cases the "super-rational" choice is actually *not* the rational choice, but my point is that "super-rationality" could still be well-defined (even if useless).

People routinely will shift their driving patterns based on time of day, and expectation of delays. Construction, accidents, etc are all variables that the drivers must keep in mind, so the most "optimum" solution for any of them is predictability.

Of course, in your example, you indicate that there is no variability in the travel time regardless of the route chosen and yet you conclude that no one would explore those alternatives. Based on what?

In fact, one of the most obvious motivators for many drivers is to keep moving, rather than to stand still. As a result, many drivers choose routes that may actually be longer [and take more time] if they can move rather than remain stopped. Of course, this hinges on how much experience the driver has with a particular route, because often jumping off can cause worse problems than remaining where you are.

I don't have a problem with a basic thought experiment about systems, but it becomes a strawman when you presume that no one would use an alternate route and therefore doesn't behave rationally.

In real life, the decision would more likely be that route ACDB may be a bit slower, but it is consistent, while the other routes may be faster [at certain times or circumstances] but may incur higher delays if there is anything wrong. So, again, the rational choice is to take a bit longer if it is more reliable.

You may think I'm being nit-picky, but how does this scenario resemble anything if you can constrain the behaviors without justification? I also realize that I'm imposing conditions that were never mentioned, but then, the present model doesn't represent reality, especially not in presuming that no one would choose the alternate route. As I said, if the purpose is to merely show how random distributions can vary the outcomes, that's fine, but making statements about the rationality of the players, doesn't make sense, since you selected their behaviors.

I also can't think of anyone that considers the Prisoner's Dilemma to be a comprehensive example of any behavior without considering the iterative versions. Any single encounter would be subject to bad outcomes as described, but repeated behaviors ... I don't think so.

*"A selfish rational individual will by definition not forfeit a sure gain."*

Your statement doesn't reflect reality, since people routinely forfeit an advance [even if very slight] to promote cooperation. This is readily observed in traffic jams when people are trying to enter a road or how individuals take turns when several flows converge.

It seems like this is some attempt to presume that humans only behave selfishly [versus simple self-interest], instead of recognizing that the most successful strategies are cooperative and that given iterative occurrences, that's precisely how people will behave.

The latter category attempts to gain insights into "what really matters". Game-theoretical models tend to be in this category. Game theory claims that humans do have rational and selfish characteristics, and that this is **all you need** *to understand a lot about the character of economic behaviors and (depending on the game settings) the emergence of behaviors such as cooperation, retaliation, etc.*

One should not expect cooperation in a PD setting such as Braess' paradox. Adding the features that you suggest (participants optimising other quantities than solely travel time) will not help in getting to 'an optimal that works best for all'. As long as participants individually compete for the same resources, PD-type situations are bound to emerge. However, once you introduce opportunities for retaliation in PD settings (such as is realized by the iterated PD algorithm) cooperation can emerge. And needless to say that games like stag-hunt are much more interesting and much more relevant when it comes to modeling cooperation behaviors.

All of these are insights obtained with the caricatures studied by game theorists. A remarkable feat. Having said this, I agree that humans are more complex and more interesting than 'robots who rationally optimize their own gains' (I have blogged about this before, and I am planning a next blog post on this subject). Yet it is remarkable how much the 'selfish rational robot caricature' presented by game theory can explain about human behaviors.

I know of no other caricature that comes anywhere near in explaining your behaviors and mine.

In the first place, you're presuming a constant travel time, so that predictability and reliability are removed. OK, if those are the conditions, then where do you get to presume that no one has discovered an alternative route and consequently optimized their travel time accordingly?

That's where the incongruity comes in, because even after they've "discovered" a more optimal route, you conclude that they will revert back to their old patterns. Where is that a prediction of game theory? That's where the additional parameters come into play, because you are assuming that travel time is the only utility value. If so, then we need an explanation as to why it's not exploited. However, if we look at the additional parameters, such as predictability, then we may well find that they have optimized their route based on their utility value.

Whether you intended it or not, the underlying thesis seems to be that humans behave less than optimally, however you have determined what their behavior will be and made it irrational. I would argue that if you randomly chose any number of actual commuters, you would likely find that they have optimized their travel time according to their own knowledge of conditions and attempting to optimize the reliability and predictability of the trip, of which travel time is the result.

*"even after they've "discovered" a more optimal route, you conclude that they will revert back to their old patterns. Where is that a prediction of game theory?"*

But the whole point is: that 'optimal' route is optimal (yielding fastest individual travel time from A to B) only when section CD is closed off. Try it: with section CD open it is **always** *better to take route ACDB. Please don't make the error of optimizing others' travel time. Optimize your own. The assumption (and only assumption) is that you are a rational individual in a hurry. Again, if it helps imagine your beloved one is in the car, deadly wounded. Can you get her the quickest to the hospital at point B? Do the math and be honest: road section CD is open again: what route do you choose?*

And yes, this is a game theory prediction. You are not arguing with me but with Mr. Nash. The combination of single-player strategies "all choose route ACDB" is the single Nash equilibrium for this game.

As a result, there is no better choice, since [presumably] everyone will be "forced" to take the same route.

If the CD connector is closed, then the travel times are equal and a more uniform distribution doesn't yield any better outcomes [not actually true, but for the 50/50 case it is].

However, I still fail to see how this is a Nash equilibrium, since possessing knowledge about the traffic on the CD connector could affect your decision. It is only compelled if we constrain the system. In the same way that the CD connector being closed, is only comparable if traffic is evenly distributed, it really suggests that anything less than a 50/50 split will result in better times for some and potentially worse times for others. Again, if knowledge is available regarding the route choices made by the other drivers, then that could influence your decision.

*"If the CD connector is open, then it is always the fastest route, regardless of the traffic on the bridge, therefore anyone that is seeking to optimize travel time must take that connector. [..] However, I still fail to see how this is a Nash equilibrium, since possessing knowledge about the traffic on the CD connector could affect your decision.."*

You agree with me that the choice ACDB is the dominant strategy, and next you ask why it would be the Nash equilibrium. That makes no sense. A dominant strategy (meaning: "under no circumstance is there a better choice") must be the Nash equilibrium (meaning: "post the fact, nobody regrets his/her choice").
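The dominance claim is easy to verify by brute force. The sketch below is an illustration under the figures of the blog post, with `my_times` as a hypothetical helper: it enumerates every way the 599 other drivers could spread over the three routes and checks whether ACDB is strictly the fastest choice for the remaining driver. Times are in hundredths of a minute.

```python
# Your own travel time per route, given how the 599 other drivers are spread.
def my_times(others_acb, others_adb, others_acdb):
    ac = others_acb + others_acdb + 1   # load on bridge AC if you cross it too
    db = others_adb + others_acdb + 1   # load on bridge DB if you cross it too
    return {"ACB": ac + 1000, "ADB": 1000 + db, "ACDB": ac + 300 + db}

dominant = True
for acb in range(600):                  # others on ACB: 0..599
    for adb in range(600 - acb):        # others on ADB: 0..(599 - acb)
        t = my_times(acb, adb, 599 - acb - adb)
        # ACDB must beat both alternatives, whatever the others do.
        if t["ACDB"] >= min(t["ACB"], t["ADB"]):
            dominant = False

print(dominant)  # True
```

A bridge can never carry more than 600 cars, so the 3-minute connector plus a bridge crossing (at most 9 minutes) always beats a 10-minute road. That is all the dominance argument amounts to.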

As a result, everyone will ultimately pursue the CD connector route leading back to the condition of all traveling ACDB.

BTW ... yes, please delete the other thread.

All the same, I appreciated reading it. I enjoy these kinds of exercises, so thanks! :-)

But... it is not so bad

http://theory.stanford.edu/~tim/papers/routing.pdf

http://theory.stanford.edu/~tim/papers/indep.pdf

It continues to amaze me how many people still sympathize with the super-rationality 'fix' to PD. In this blog post I therefore wanted to focus on the fact that in PD settings one should expect an equilibrium of behaviors and not the emergence of optimum behaviors.

I agree that the 'greedy optimization' equilibrium realized by selfish rational individuals is fairly robust (never wildly away from optimum performance). Rational suckers aren't that bad, actually!
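For the network of this post, "not that bad" can be given a number. The following sketch is a brute-force check under the stated figures only (600 cars, three routes, the travel times from the post); `total_time` is an illustrative name. It compares the total travel time at the all-ACDB equilibrium with the smallest total over every possible split of the cars.

```python
# Total travel time of all cars, in hundredths of a minute, for a given split.
def total_time(n_acb, n_adb, n_acdb):
    ac = n_acb + n_acdb                 # cars on bridge section AC
    db = n_adb + n_acdb                 # cars on bridge section DB
    return (n_acb * (ac + 1000)
            + n_adb * (1000 + db)
            + n_acdb * (ac + 300 + db))

equilibrium = total_time(0, 0, 600)     # everyone on ACDB
even_split = total_time(300, 300, 0)    # the 50:50 split with CD unused

# Smallest total over all splits, as (total, n_acb, n_adb, n_acdb).
best = min(
    (total_time(a, b, 600 - a - b), a, b, 600 - a - b)
    for a in range(601)
    for b in range(601 - a)
)

print(equilibrium, even_split, best)
# 900000 780000 (775000, 250, 250, 100)
print(round(equilibrium / best[0], 3))  # 1.161
```

Under these assumptions the equilibrium costs about 16% more total travel time than the best possible split. That best split sends 100 cars over CD and averages about 12.9 minutes per car, slightly below the 13 minutes of the 50:50 split, although the 100 cars on CD would then travel faster than the rest, so it is not an equilibrium either.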

Having wasted countless hours in rush hour traffic, I've observed that there is NO rationality among motorists. As a group they behave like a slightly compressible fluid and traffic follows the laws of fluid dynamics.

In a civilized country, the obvious result will be that the county will install speed bumps or even close down CD for non-residential traffic.

Actually, the Nash equilibrium examples are the foundation of the theory of Economic Development: People require good institutions to escape such a bad equilibrium.

The universal solution of the original PD is also practiced in time honored ways by humans: Kill those who snitch. The relevant institution is organized crime.

That graphic doesn't make any sense. Clearly, the rational choice is to misinform the other 599 drivers that both bridges are closed, forcing them to take the 23 minute ADCB detour. According to the graphic, that leaves you with AC for 1/100 of a minute, the CD stretch for 3 minutes, and DB for another 1/100 of a minute. Total travelling time 181.2 seconds.

Frankly, I doubt that CD even takes 3m for one car. After all, it's only 1/1.7x the length of AC, it should take 0.35294118 of a second. Revised total travelling time 1.55294118 seconds.