The Prisoner's Dilemma

Jon Orwant

April 5, 1998. FBI agents break down your door and hustle you downtown for interrogation. Seems you've been using illicit cryptography to exchange information about - well, they're not exactly sure, because you used cryptography, but they know it must be sinister because, hey, you used cryptography. And they know who you were talking to: the FBI's packet sniffers (and subpoenaed router logs) revealed that you were communicating with your friend a few miles away. They've arrested him too.

You're going to jail. The question is, for how long?

The FBI doesn't have enough evidence to convict you of the new crime, Encoding In The First Degree, which carries a penalty of five years in jail. But they can convict you of Second Degree Encoding, and that's two long years in the overcrowded Dubuque Minimum Security Federal Penitentiary for Computer Abusers.

They offer you a deal: if you agree to confess, and testify against your friend, they'll let you go free. They offer your friend the same deal. But if you both decide to testify against one another, you each get four years in jail. You must both decide what to do in solitude, without communicating. (That's why they split you up.)

You can either Testify (T) or Hold Out (H), and your friend can do the same. The outcomes are shown in this payoff matrix:

Your Friend

You T H

T You get 4 years
Friend gets 4 years You get 0 years
Friend gets 5 years

H You get 5 years
Friend gets 0 years You get 2 years
Friend gets 2 years

What should you do? You might think to yourself, "If I testify, I'll get either four years or zero years. And if I hold out, I'll get either five years or two years. I have no idea what my friend will do, and I can't talk to him. Maybe I should assume that he'll choose at random, in which case I'm better off testifying."

If your friend thinks the same way, you'll both testify and get four years each. That's unfortunate, since the outcome with the fewest number of man-years in jail occurs when you both hold out.

This problem is called the Prisoner's Dilemma, and it's the foundation for a mathematical discipline called game theory. It's been used to represent scenarios in gambling, genetics, business, and thermonuclear war, by using different payoffs: money, offspring, more money, and death.

The Iterated Prisoner's Dilemma

There's not a whole lot to say about the "one-shot" Prisoner's Dilemma: either you should testify or you shouldn't, and you can construct compelling arguments for both decisions.

Here's when things get interesting: forget about the prison terms and think of the payoff matrix as abstract "points". Now pit you and your friend against one another in a series of matches, say 100. Your goal is to minimize the number of points you accrue during the 100 matches.

Now there's a little bit of communication between you and your friend: for any given match, you can consider all of the previous matches before making your decision. If your friend held out on all the previous matches, you might think that he'll remain consistent, and hold out again. But maybe that's just what he wants you to think, so that he can testify, adding five points to your score and none to his. (Remember, he's a criminal too!)

Here's a simple always-hold-out strategy:

sub nice_guy {
    return "H";
}

Here's a strategy that chooses at random:

sub random {
    return "H" if rand() < 0.5;
    return "T";
}

Here's parting_shot(), which holds out on the first 99 matches, and testifies on the last (100th) match. The history of your choices is stored in the array reference $my_choices_ref (which becomes the array @my_choices). parting_shot() only uses that array to determine when the 100th match has been reached.

sub parting_shot {
  my ($my_ref,$friend_ref) = @_;
  my (@my_choices) = @$my_ref;

  if (@my_choices == 99) {
      return "T"
  } else { return "H" }
}

Here's a strategy called tit-for-tat, which says, "I'll hold out on the first match, after which I'll choose whatever you chose last time."

sub tit_for_tat {
    my ($my_ref, $friend_ref) = @_;
    my (@friend_choices) = @$friend_ref;

    return "H" unless @friend_choices;
    return $friend_choices[$#friend_choices];
}

Tit-for-tat variants usually perform well in Prisoner's Dilemma contests. Random strategies aren't so bad either. Of course, that all depends on which other strategies participate. There's no single best strategy - just the best strategy for a given match.

The Three-way Prisoner's Dilemma

Let's add another level of complexity: a three-person Prisoner's Dilemma, in which three strategies compete simultaneously. Here's the payoff matrix (actually a payoff cube):

(Friend 1, Friend 2)

You (T, T) (T, H) (H, T) (T, H)

T You: 4
Friend 1: 4
Friend 2: 4 You: 1
Friend 1: 1
Friend 2: 7 You: 1
Friend 1: 7
Friend 2: 1 You: 0
Friend 1: 5
Friend 2: 5

H You: 7
Friend 1: 1
Friend 2: 1 You: 5
Friend 1: 0
Friend 2: 5 You: 5
Friend 1: 5
Friend 2: 0 You: 2
Friend 1: 2
Friend 2: 2

When one person testifies and the other two hold out, the fink gets off scot-free, and the holdouts get five years each. When two people testify, they get one year each, and the holdout gets seven.

As before, the only communication between the prisoners is through their actions on previous matches. Here's a sample strategy for a three-way contest:

# Testify only if both friends testified 
# during the last match.

sub fool_me_twice {
  my ($my_ref, $friend1_ref, $friend2_ref) = @_;  
  my (@friend1_choices) = @$friend1_ref;
  my (@friend2_choices) = @$friend2_ref;

  if ($friend1_choices[$#friend1_choices] eq "T" &&
      $friend2_choices[$#friend2_choices] eq "T") {       
           return "T";
  } else { return "H" }
}

The Prisoner's Dilemma Programming Contest

We invite you to design your own three-way strategy, to be pitted against the rest of the Perl community.

Every strategy should be a single subroutine. During a match, the subroutine will be passed three array references (as with fool_me_twice() above): the first will contain an array of your past decisions, and the second and third will contain the decision arrays for friend1 and friend2 respectively.

Your subroutine must always return either "H" or "T".
Your subroutine will play every other subroutine at least once, in a series of exactly 100 matches.
The winning strategy will be the one with the lowest cumulative score over all matches.
Your entry must have comments before the subroutine with your name, e-mail address, and a truthful explanation of the strategy.
The random number generator will be initialized before every series of 100 matches, should you care to use it.
One entry per person.
Mail entries to perl-journal-contest@perl.com. Entries must be received by midnight EST on May 1, 1996.
The best strategies will be announced in the Summer (June) issue of TPJ, and on our Web site, and on comp.lang.perl.misc.

Good luck!

__END__

TABLE OF CONTENTS

Your Friend
You		T	H
	T	You get 4 years Friend gets 4 years	You get 0 years Friend gets 5 years
	H	You get 5 years Friend gets 0 years	You get 2 years Friend gets 2 years

(Friend 1, Friend 2)
You		(T, T)	(T, H)	(H, T)	(T, H)
	T	You: 4 Friend 1: 4 Friend 2: 4	You: 1 Friend 1: 1 Friend 2: 7	You: 1 Friend 1: 7 Friend 2: 1	You: 0 Friend 1: 5 Friend 2: 5
	H	You: 7 Friend 1: 1 Friend 2: 1	You: 5 Friend 1: 0 Friend 2: 5	You: 5 Friend 1: 5 Friend 2: 0	You: 2 Friend 1: 2 Friend 2: 2