Can subgame perfect equilibrium threats foster cooperation? An experimental test of finite-horizon folk theorems.
Angelova, Vera; Bruttel, Lisa V.; Güth, Werner; et al.
I. INTRODUCTION
In this paper, we study the effect of equilibrium punishment
threats on cooperation in a finitely repeated prisoners' dilemma
(PD) game. To this end, we extend the standard PD stage game, with its
strategies "cooperate" and "defect," by an
additional strategy, as in the studies by Schwartz, Young, and Zvinakis
(2000) and Feinberg and Snyder (2002). Mutual play of this strategy
constitutes a second, payoff-dominated equilibrium in the stage game.
Given this extension and an adequate choice of payoff parameters we
prove a folk theorem in the spirit of Benoit and Krishna (1985),
according to which cooperative subgame perfect outcomes are possible
despite the finite horizon of interaction. Without the extension, in
contrast, backward induction predicts universal defection to be the
unique subgame perfect equilibrium in the finitely repeated game. Given
these theoretical arguments, we expect cooperation rates in the extended
game to be higher than in the standard game.
In addition to introducing equilibrium punishment, we also vary its
strategic stability. Depending on whether the additional strategy is
weakly dominated or undominated, the additional stage game equilibrium
is either weak or strict. While the folk theorem predicts higher
cooperation rates for both extended games, the refinement concepts of
strictly perfect and proper equilibrium predict higher cooperation
rates only for the strict case. We test these competing theoretical
predictions experimentally to answer the following research question. In
a finitely repeated PD game, is it sufficient to introduce an
equilibrium punishment
threat to increase cooperation or is it necessary for the punishment
threat to be strictly self-enforcing?
Subjects in our experiment play the supergames repeatedly, similar
to Selten and Stoecker (1986) and Bereby-Meyer and Roth (2006). In a
between-subjects design we explore six distinct treatments. Treatments
differ in the type of stage game (either standard PD, or weak game or
strict game) and the time horizon of interaction with the same partner
(either long or short).

ABBREVIATIONS
EPD: Extended Prisoners' Dilemma
ORSEE: Online Recruitment System for Economic Experiments
PD: Prisoners' Dilemma

doi: 10.1111/j.1465-7295.2011.00421.x

VERA ANGELOVA, LISA V. BRUTTEL, WERNER GÜTH and ULRICH KAMECKE *

* We gratefully acknowledge the helpful comments of two anonymous
referees.

Angelova: Research Fellow, Max Planck Institute of Economics,
Strategic Interaction Group, Kahlaische Str. 10, 07745 Jena, Germany.
Phone +49 (0)3641 686 637, Fax +49 (0)3641 686 667, E-mail
popova@econ.mpg.de
Bruttel: Assistant Professor for Behavioral Economics, Department of
Economics, University of Konstanz, Box 131, 78457 Konstanz, Germany.
Phone +49 (0)7531 88 3214, Fax +49 (0)7531 88 2145, E-mail
lisa.bruttel@uni-konstanz.de
Güth: Director of the Strategic Interaction Group, Max Planck
Institute of Economics, Kahlaische Str. 10, 07745 Jena, Germany. Phone
+49 (0)3641 686 620, Fax +49 (0)3641 686 667, E-mail gueth@econ.mpg.de
Kamecke: Professor for Competition Policy, Department of Business
and Economics, Humboldt-University Berlin, Spandauer Str. 1, 10099
Berlin. Phone +49 (0)30 2093 5895, Fax +49 (0)30 2093 5787, E-mail
kamecke@wiwi.hu-berlin.de

In the long horizon treatments, subjects
are rematched once after 16 rounds of play so that the incentives
to cooperate are large. In the short horizon treatments, subjects play
eight supergames with four rounds each. The horizon variation serves as
a robustness check.
Our results support the prediction of the proper equilibrium. The
strict extension generates a significant increase in cooperation rates,
while behavior in the weak extension is indistinguishable from behavior
in the standard PD game. These results are stable across horizons.
Our paper relates to the literature on the effect of punishment on
cooperation. Generally, voluntary cooperation can be enhanced by
nonequilibrium punishment or equilibrium punishment. In both cases
punishment is costly both to the punishing and punished party. In
studies with nonequilibrium punishment, as described by Fehr and Gächter
(2000) or Ostrom, Walker, and Gardner (1992), parties can impose damages
on each other after interacting, but they would never do so if they were
rational selfish agents. Nevertheless, these damages seem effective in
increasing cooperation as long as they hurt the punisher less than the
punished. (1) With equilibrium punishment, parties do not have to cause
damages to each other: the mere presence of the punishment possibility
(i.e., the threat) suffices to stabilize cooperation. The difference
between the two types of punishment is that equilibrium punishment can
increase cooperation without dropping the assumption of common
opportunism (i.e., that individuals care only for monetary payoffs),
while for explaining nonequilibrium punishment one has to give up this
assumption. We focus on equilibrium punishment. Here, the question is
whether punishment is a best reply to others' behavior and whether
it suffices to discourage deviation from equilibrium play.
The remainder of the paper is organized as follows. In Section II,
we describe the stage games and provide theoretical and behavioral
predictions about behavior in the repeated games. The experimental
protocol is described in Section III. Section IV presents the main
findings and Section V concludes.
II. MODEL ANALYSIS
A. The Stage Games
Let i = 1, 2 denote the players in the one-shot game. In the
baseline treatment PD each player has two actions, C
("cooperate") and D ("defect"). In the two other
treatments, they additionally have action A ("avoid"). (2) A
pair of actions is denoted by a = ([a.sub.1], [a.sub.2]), where the
action of player 1 is listed first and that of player 2 second. We
distinguish three symmetric payoff matrices of the one-shot interaction:
a standard PD game, an extended PD game with a strict additional
equilibrium ([EPD.sub.s]), and an extended PD game with a weak
additional equilibrium ([EPD.sub.w]).
PD (payoffs listed as: payoff to player 1, payoff to player 2):

             i = 2
i = 1      C        D
  C     18, 18    0, 21
  D     21, 0     9, 9

[EPD.sub.s]:

             i = 2
i = 1      C        D        A
  C     18, 18    0, 21    0, 0
  D     21, 0     9, 9     0, 0
  A     0, 0      0, 0     3, 3

and [EPD.sub.w]:

             i = 2
i = 1      C        D        A
  C     18, 18    0, 21    3, 3
  D     21, 0     9, 9     3, 3
  A     3, 3      3, 3     3, 3
The only pure strategy (3) equilibria of the extended stage games
are (D, D) and (A, A). (4) However, (A, A) is an equilibrium in weakly
dominated actions in [EPD.sub.w], whereas it is strict and therefore in
undominated actions in [EPD.sub.s]. (5)
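These stage game properties are easy to verify mechanically. A minimal Python sketch (the payoff dictionaries and helper names are ours, not part of the paper) checks that (D, D) and (A, A) are the pure equilibria of both extended games, that (A, A) is strict only in [EPD.sub.s], and that A is weakly dominated only in [EPD.sub.w]:

```python
from itertools import product

ACTIONS = ["C", "D", "A"]

# Payoffs as (row player, column player); both games are symmetric.
EPD_s = {("C", "C"): (18, 18), ("C", "D"): (0, 21), ("C", "A"): (0, 0),
         ("D", "C"): (21, 0),  ("D", "D"): (9, 9),  ("D", "A"): (0, 0),
         ("A", "C"): (0, 0),   ("A", "D"): (0, 0),  ("A", "A"): (3, 3)}
EPD_w = {("C", "C"): (18, 18), ("C", "D"): (0, 21), ("C", "A"): (3, 3),
         ("D", "C"): (21, 0),  ("D", "D"): (9, 9),  ("D", "A"): (3, 3),
         ("A", "C"): (3, 3),   ("A", "D"): (3, 3),  ("A", "A"): (3, 3)}

def pure_equilibria(game, strict=False):
    """All pure-strategy equilibria; with strict=True, only strict ones."""
    better = (lambda x, y: x > y) if strict else (lambda x, y: x >= y)
    eqs = []
    for a1, a2 in product(ACTIONS, repeat=2):
        ok1 = all(better(game[(a1, a2)][0], game[(d, a2)][0]) for d in ACTIONS if d != a1)
        ok2 = all(better(game[(a1, a2)][1], game[(a1, d)][1]) for d in ACTIONS if d != a2)
        if ok1 and ok2:
            eqs.append((a1, a2))
    return eqs

def weakly_dominated(game, a):
    """Is row action a weakly dominated by another pure row action?"""
    return any(all(game[(b, c)][0] >= game[(a, c)][0] for c in ACTIONS) and
               any(game[(b, c)][0] > game[(a, c)][0] for c in ACTIONS)
               for b in ACTIONS if b != a)

assert pure_equilibria(EPD_s) == pure_equilibria(EPD_w) == [("D", "D"), ("A", "A")]
assert pure_equilibria(EPD_s, strict=True) == [("D", "D"), ("A", "A")]
assert pure_equilibria(EPD_w, strict=True) == [("D", "D")]
assert weakly_dominated(EPD_w, "A") and not weakly_dominated(EPD_s, "A")
```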
Correspondingly, we experimentally distinguish between
W-treatments, where subjects play [EPD.sub.w], and S-treatments, where
they play [EPD.sub.s] repeatedly with the same partner. S-treatments
feature situations where the alternative payoff of 3 can only be
obtained by both players coordinating on "avoid." W-treatments
capture situations where 3 is the conflict payoff resulting when at
least one party uses "avoid," that is, where the
"avoid" outcome does not require coordination. A common
feature of both repeated games is that the threat to continue with the
payoff (3, 3) instead of (9, 9) can discourage myopically profitable
deviations from mutual cooperation.
B. Subgame Perfect Equilibrium Outcomes in the Repeated Games
Let T [greater than or equal to] 2 denote the number of rounds of
repeated play of either [EPD.sub.w] or [EPD.sub.s]. In each game, we
observe histories [h.sub.t] [member of] [H.sub.t] up to round t (a
vector of length 2 x (t - 1) which, assuming appropriate information feedback
between rounds, specifies all previous actions of the two players). A
behavioral strategy profile a : H [right arrow] {C, D, A} x {C, D, A}
specifies actions a([h.sub.t]) = ([a.sub.1], [a.sub.2])([h.sub.t]) for
all histories [h.sub.t] of all rounds t.
Constant play of actions (A, A) or (D, D) is obviously a subgame
perfect equilibrium outcome of both repeated games, [EPD.sub.s] and
[EPD.sub.w]. Another subgame perfect equilibrium is the (grim) strategy
constellation ([a.sup.grim], [a.sup.grim]) for T-supergames with
(1) [a.sup.grim]([h.sub.t]) = C if t < T and [h.sub.t] contains only outcomes (C, C); D if t = T and [h.sub.t] contains only outcomes (C, C); and A otherwise.
That is, players cooperate in all rounds except for the last one
and defect in the last round as long as both cooperate. Otherwise they
switch to A and keep it up until the end of the game. The proof is
straightforward. After playing (C, C) in all rounds [tau] < T, it
does not pay to deviate unilaterally from ([a.sup.grim], [a.sup.grim])
in the last round t = T as (D, D) is a strict equilibrium of the
one-shot interaction. Similarly, deviating from ([a.sup.grim],
[a.sup.grim]) in round t < T after some violation of "(C, C) for
all "t < T" does not pay as constant play of (A, A) is a
subgame perfect equilibrium of the supergame. Deviating unilaterally
from ([a.sup.grim], [a.sup.grim]) in round t after "(C, C) for all
[tau] < t" does not pay because the highest amount a player can
gain from such a deviation is an additional payoff of 3 which will
result in a periodic payoff of 3 rather than 18 or 9 in all later
rounds. Even in case of t = T - 1, the additional gain of 3 in round t =
T - 1 would cost 6 in round T. (6) Thus, ([a.sup.grim], [a.sup.grim]) is
a subgame perfect equilibrium of the T [greater than or equal to] 2
supergame with stage game [EPD.sub.w] and [EPD.sub.s], respectively.
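The deviation arithmetic in this proof can be spelled out numerically; the following is a small sketch under the payoff parameters above (function names are ours):

```python
def grim_path_total(T):
    # on the equilibrium path: (C, C) pays 18 in rounds 1..T-1, (D, D) pays 9 in round T
    return 18 * (T - 1) + 9

def deviation_total(T, t):
    # unilateral defection in round t < T: 21 once, then (A, A) pays 3 in every later round
    return 18 * (t - 1) + 21 + 3 * (T - t)

# Compliance beats deviation in every round t < T, for both horizons used in
# the experiment. At t = T - 1 the margin is exactly the 21 - 18 = 3 gained
# versus the 9 - 3 = 6 lost in round T.
for T in (4, 16):
    assert all(grim_path_total(T) > deviation_total(T, t) for t in range(1, T))
assert grim_path_total(16) - deviation_total(16, 15) == 3
```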
In the first section of Appendix 1, we show that the same argument
justifies a large number of equilibrium outcomes. In rounds preceding
outcome (D, D), the equilibrium payoffs are only restricted by two
conditions: feasibility and individual rationality of payoffs (i.e.,
with r = T - t rounds still to be played, continuation payoffs must
guarantee each player the maximin payoff, that is, at least r x (3, 3)
in total). This condition imposes a restriction only
on the occurrence of outcomes (0, 0), (21, 0), and (0, 21), so that we
can conclude:
Folk Theorem-Like Result: For T [right arrow] [infinity] the set of
average payoffs in a subgame perfect equilibrium of the finite
supergames with T (< [infinity]) commonly known rounds of play and
stage game [EPD.sub.w] or [EPD.sub.s] converges to a dense set of
individually rational attainable average payoffs:
(2) [omega] = {([[pi].sub.1], [[pi].sub.2]) | ([[pi].sub.1], [[pi].sub.2]) [greater than or equal to] (3, 3) and ([[pi].sub.1], [[pi].sub.2]) [member of] conv((3, 3); (21, 0); (18, 18); (0, 21))}
In particular, the subgame perfect equilibrium strategies
([a.sup.grim], [a.sup.grim]) predict that both players will
"cooperate" in all rounds except in the last round when they
both "defect" so that the average periodic payoffs converge to
(18, 18) as T [right arrow] [infinity].
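This convergence can be illustrated directly (a minimal sketch; the helper is ours):

```python
def average_grim_payoff(T):
    # (C, C) in T-1 rounds and (D, D) once yields an average of 18 - 9/T per round
    return (18 * (T - 1) + 9) / T

assert average_grim_payoff(4) == 15.75     # short horizon of the experiment
assert average_grim_payoff(16) == 17.4375  # long horizon of the experiment
# the average per-round payoff approaches 18 as T grows
assert abs(average_grim_payoff(10**6) - 18) < 1e-4
```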
C. Refining Rationality
To equilibrate cooperation, players have to condition their
equilibrium continuation on the history of the game, that is, they must
threaten to continue the game with (A, A) instead of (D, D) after some
"unwarranted" experience. Such reactions, used to prove the
folk theorem, are eliminated by a simple dominance argument in the last
round of [EPD.sub.w] but not [EPD.sub.s]. Using this dominance argument
iteratively eliminates all but the defective strategies in [EPD.sub.w],
because all strategies employing C or A after some history [h.sub.t] are
dominated by a strategy using [a.sub.i]([h.sub.t]) = D when all
strategies satisfy [a.sub.i]([h.sub.[tau]]) = D for all [tau] > t.
However, the result of iterated elimination of dominated strategies
may depend on the order in which the strategies are eliminated. (7) We
can nevertheless use a backward elimination argument to prove two
alternative results for [EPD.sub.w] which do not hold for [EPD.sub.s].
In the second section of Appendix 1, we show that "always
defect" is the unique strictly perfect and the unique proper
equilibrium (8) of [EPD.sub.w]. These game theoretic arguments suggest
less cooperation in [EPD.sub.w] than in [EPD.sub.s].
Strong Rationality Refinement: The set of strictly perfect
equilibrium outcomes as well as the set of proper equilibrium outcomes
satisfy the "folk theorem-like result" above in a repeated
[EPD.sub.s], while they both contain only defective play [a.sub.i]
([h.sub.t]) = D for all [h.sub.t] [member of] [H.sub.t] in a repeated
[EPD.sub.w].
D. Behavioral Predictions
Theoretically we established two results. According to the folk
theorem, both extended games will trigger higher cooperation than the
standard PD game. According to the strong rationality refinement, only
the strict game will increase cooperation rates. In this section we
provide an intuitive explanation for the latter result. After that we
discuss our behavioral expectation about the usage of the additional
action A.
Exploiting the other by playing D can have more serious
consequences for the exploiter in the strict than in the weak game. If
the other chooses to punish the exploiter by playing A, the punishment
in the weak game leads to a minimal payoff of 3, regardless of whether
both manage to coordinate on mutual play of A or not. In the strict
game, however, obtaining the payoff of 3 requires coordination.
Unilateral use of A leads to a payoff of 0 to both. Hence, provoking a
punishment and failing to coordinate in the strict game is more costly
than in the weak game for the defector but also for the punisher. If
subjects realize the former but not the latter, defections will be less
frequent in the strict game. If subjects realize the latter, defections
will be less frequent in the weak game. Which reasoning will prevail?
Most probably the former because the latter requires higher levels of
deliberation (Nagel 1995).
Moreover, in the strict game it appears to be much harder to
coordinate on mutual cooperation once the partner plays A to punish
defection. This is because choosing C unilaterally while the other is
still playing A is more costly in the strict than in the weak game. And
this is an implication of the fact that A is self-enforcing in the
strict but not in the weak game. Anticipating these extra costs,
subjects in the strict game are more likely to abstain from defecting in
the first place.
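The cost comparison in the last two paragraphs reduces to simple payoff arithmetic (a sketch; the dictionary names are ours):

```python
# payoff to a player who switches back to C while the partner still plays A
c_against_a = {"strict": 0, "weak": 3}
# payoff from staying with the punishment action A against A
a_against_a = 3

# forgone payoff from unilaterally abandoning the punishment phase:
# 3 in the strict game, nothing in the weak game
cost_of_returning_to_c = {g: a_against_a - p for g, p in c_against_a.items()}
assert cost_of_returning_to_c == {"strict": 3, "weak": 0}
```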
Theoretically, A is expected to be used only off the equilibrium
path. Therefore, we expect subjects to choose A rarely. Without actually
being able to observe A, we can still measure its effectiveness as the
difference in cooperation rates between each extended game and the
standard PD. In the rare cases, in which we expect to see A, we can only
speculate that its purpose will be to punish defection. It is worth
exploring the reaction to such sanction.
III. EXPERIMENTAL DESIGN AND PROCEDURES
In a between-subjects design, we study the six treatments 16PD,
16W, 16S, 4PD, 4W, and 4S, which differ as to
* whether the stage game is PD, [EPD.sub.w] (W), or [EPD.sub.s] (S)
and
* whether the number T of rounds is T = 16 or T = 4.
Subjects play 32 rounds of either PD or [EPD.sub.w], or
[EPD.sub.s]. In 16PD, 16W, and 16S they play two supergames of 16 rounds
each, with two different partners. In 4PD, 4W, and 4S they play eight
successive supergames with four rounds each. (9) We use a random
strangers design: subjects form matching groups of four; each subject is
matched with another subject from her/his own matching group for the
next supergame. We guarantee that no subject faces the same other
subject in two successive supergames. We performed two sessions per
treatment with 32 subjects per session. As one matching group of four
subjects corresponds to one independent observation, we obtained 16
independent observations per treatment. In total, we recruited 384
undergraduate students from the University of Jena, using the online
recruitment system for economic experiments (ORSEE) (Greiner 2004). On
average, subjects earned 8.80 euros and spent 1 hour (15 minutes thereof
on the instructive part) in the laboratory of the Max Planck Institute
of Economics in Jena, Germany.
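The random strangers matching within groups of four can be sketched as follows (an illustrative reconstruction, not the software actually used in the laboratory):

```python
import random

def rematch(group, previous_pairs):
    """Split a matching group of four subjects into two pairs, avoiding the
    pairing used in the previous supergame. With four subjects there are only
    three possible pairings, so excluding one always leaves a valid draw."""
    while True:
        ids = group[:]
        random.shuffle(ids)
        pairs = {frozenset(ids[0:2]), frozenset(ids[2:4])}
        if not pairs & previous_pairs:
            return pairs

# example: after pairing (1, 2) and (3, 4), nobody meets the same partner again
prev = {frozenset({1, 2}), frozenset({3, 4})}
new_pairs = rematch([1, 2, 3, 4], prev)
assert not new_pairs & prev
```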
Upon arrival in the laboratory, subjects were randomly assigned to
a cubicle, where they individually read the instructions. (10) After the
instructions were also read aloud, subjects were able to familiarize
themselves with the experiment during two or three (depending on the
treatment) simulated test rounds. (11) Subsequently, they answered a
questionnaire so that we could check their understanding of the game
rules. After that, they participated in the computerized (12)
experiment. During the experiment eye contact was not possible. Although
subjects saw each other at the entrance to the lab, there was no way for
them to guess with which person(s) from the crowd of 32 students they
would be matched later on. Most subjects already had some previous
experience with other experiments: more than 90% of the subjects had
previously participated in at least one experiment.
We introduced two different time horizons into our experiment.
Backward induction predicts the same cooperation rates in the PD game
independent of the number of repetitions. However, previous experiments
demonstrate that in long repeated PD games subjects play more
cooperatively than in short ones (Dal Bó 2005). Any difference between
the simple and the extended PD game and between the two extended games
might disappear when varying the horizon. The supergames with the
shorter time horizon therefore provide a robustness check for our
results.
[FIGURE 1 OMITTED]
IV. RESULTS
A. Cooperation Rates across Treatments
Average cooperation rates (13) by treatment are depicted in Figure
1. For both the long and the short horizon cooperation rates increase
from PD over W toward S. However, only the strict game triggers
significantly more cooperation compared to the standard PD, while the
difference between the weak game and the standard PD is not
statistically significant. Apparently, multiplicity of equilibria alone
does not lead to more cooperation. Across horizons, for all game types
(PD, W, S) cooperation is significantly higher for the longer horizon,
confirming the results of Dal Bó (2005). The results of all pairwise
comparisons between treatments with respect to means using a Wilcoxon
rank-sum test are summarized in Table 1.
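The Wilcoxon rank-sum comparisons operate on one average cooperation rate per matching group (16 independent values per treatment). The statistic itself can be sketched in a few lines (our own minimal implementation, shown on made-up toy numbers rather than the actual data):

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U (equivalent to the Wilcoxon rank-sum statistic up to a
    constant shift): the number of pairs with x_i > y_j, counting ties as 1/2.
    In practice one compares U (or the rank sum) against tabulated critical
    values or a normal approximation."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

# toy example: every value in the first sample exceeds every value in the second,
# so U attains its maximum of len(x) * len(y) = 6
assert mann_whitney_u([0.8, 0.7, 0.6], [0.3, 0.2]) == 6.0
```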
Figure 2 captures the evolution of cooperation over time in
supergames with 16 rounds.
[FIGURE 2 OMITTED]
In round 17, subjects were assigned to a new partner (note the usual
end-game effect). The figure shows that the results above hold not only
at the aggregate level but also for most rounds. When performing
Wilcoxon rank-sum tests based on pairwise comparisons between treatments
for each round, in almost all rounds cooperation rates in the strict
game with the long horizon lie above those in the other two games
(Appendix Table A 1). Cooperation rates in 16PD and 16W are never
significantly different. In the short games, results are very similar,
except for the fact that here we observe a significant difference
between the strict and the other two games even in the last round of
interaction with the same partner (see Figure 3 and Appendix Table A 1).
Result 1 An additional strict equilibrium significantly increases
cooperation rates compared to the prisoner's dilemma game.
In our view, this is not a very surprising result. Although
participants may not reason in the way presupposed by the subgame perfect
equilibrium outcomes stated by the folk theorem, they qualitatively seem
to understand the preventive effect of the A-option. Thus, what is far
more surprising is that this prevention only "works" when (A,
A) is a strict equilibrium of the base game.
Result 2 An additional weak equilibrium does not increase
cooperation rates compared to the prisoner's dilemma game.
Result 1 is in line with the theoretical prediction of the
"strong rationality refinement," that is, the additional
threat only increases the willingness to cooperate if it is strictly
self-enforcing. The surprising result that an additional weak
equilibrium of the stage game does not change behavior compared to the
PD game further suggests that a punishment option A whose mutual use
does not qualify as an equilibrium at all would not be very
"preventive" either.
Table 2 reveals two sources of higher cooperation rates in the
treatments with an additional strict equilibrium via transition
probabilities between the different strategy profiles. On the right-hand
side, one can see the transition frequency from outcome (D, D) in round
t to each outcome (except for the outcomes including action A) in t + 1.
Given that outcome (D, D) is prominent (compare the left-hand side of
Table 2), reactions to it are important. First, (D, D) becomes less
likely to be followed by (D, D) when passing from PD over W to S.
Second, the percentage of players willing to unilaterally cooperate
after a mutual defection is highest in the strict treatments.
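The transition frequencies in Table 2 are conditional relative frequencies of the next round's outcome given the current one; a minimal sketch of the computation (function name and toy sequence are ours):

```python
from collections import Counter

def transition_distribution(outcomes, source=("D", "D")):
    """Relative frequency of each round-(t+1) outcome following `source` in round t."""
    follow = Counter(nxt for cur, nxt in zip(outcomes, outcomes[1:]) if cur == source)
    total = sum(follow.values())
    return {o: n / total for o, n in follow.items()}

# toy sequence: (D, D) occurs three times with a successor, followed once each
# by (D, D), (C, D), and (C, C)
seq = [("C", "C"), ("D", "D"), ("D", "D"), ("C", "D"), ("D", "D"), ("C", "C")]
dist = transition_distribution(seq)
assert dist[("D", "D")] == dist[("C", "D")] == dist[("C", "C")] == 1 / 3
```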
B. Actual Use of the Additional Action A
The main purpose of action A is to discourage deviations from
cooperation, as the additional action is predicted to be used only off the
equilibrium path. (14) Hence, its actual use should be rare or at
least become rare with experience. A's relative frequency is less
than 3% in both the strict and the weak games (independent of the
horizon), supporting the idea that action A is not supposed to be used.
Table 3 illustrates when players use action A and how they react
when their partner has used A in the previous round. (15) A is mostly
selected after the other has played D in the previous round, that is,
after outcomes (C, D), (D, D), or (A, D). Some subjects choose A after
having played D themselves, while the other has played C, that is, after
an outcome (D, C), probably as a response to an expected punishment by
the other player. Option A seems to be an (actual and expected) punishing
device.
Do "punished" subjects become more cooperative in the
next round? In fact, only a few start cooperating. Most subjects
continue playing D. It seems the use of action A is ineffective in
coordinating subjects on (C, C) in the subsequent round.
It is worth mentioning that the frequencies observed in Table 3 are
due to just a few subjects who repeatedly use A. For example, in 16S
only 10 out of 64 subjects employ A, and in 16W this number is 18.
Finally, the theoretical difference between [EPD.sub.s] and
[EPD.sub.w] is based on the robustness of the stage game equilibrium (A,
A). The frequency of this equilibrium is, however, very low. (A, A)
appears once in 16S and twice in 16W. So (A, A) seems to be a
coincidence rather than a systematic choice. The results are again very
similar for the supergames with short horizon.
Result 3 The additional action A is seldom used, mainly to punish
defection in the previous round. Subsequent to this punishment, however,
only a few partners switch to C.
V. CONCLUSION
Folk theorem results do not require an infinite horizon. Multiple
equilibria in the stage game suffice for subgame perfect equilibrium
cooperation in all rounds except for the commonly known last round of
interaction. Refining subgame perfection allows us to distinguish between
games with an additional strict versus weak equilibrium of the stage
game. Experimentally we show that only the strict additional equilibrium
has a measurable effect on cooperation rates. This result is robust to
variations in the time horizon of interaction.
We justified our behavioral prediction of more cooperation in the
strict than in the weak game with the intuition that defecting in the
strict game may be more costly for the defector than defecting in the
weak game. Further we argued that players at least partly ignore that
for the same reason punishing in the strict game may be more costly also
for the punisher and, thus, less credible. An experimental analysis of
the intermediate game with the bimatrix representation below may sharpen our
understanding of which of these arguments is the relevant one. In that
game, A is undominated for player 1 and dominated for player 2. If
punishment in the strict game is perceived as more costly only for the
defector and therefore more credible, then in the intermediate game
player 1 will defect less often than player 2. If, to the contrary,
punishment in the strict game is perceived as costly for both the
defector and the punisher and therefore less credible, player 1 will
defect more often than player 2. We leave this for future research.
[FIGURE 3 OMITTED]
The intermediate game:

             i = 2
i = 1      C        D        A
  C     18, 18    0, 21    0, 0
  D     21, 0     9, 9     0, 0
  A     3, 3      3, 3     3, 3
Finally, let us return to the discussion started in Section I about
punishment as a cooperation enhancing device. Our control experiment PD
confirms previous findings that there is voluntary cooperation even when
punishment cannot be equilibrated. (16) However, when punishment can be
equilibrated by an additional strict equilibrium in the base game,
voluntary cooperation increases significantly. Our general conclusion is
that being able to equilibrate people's reciprocity inclination definitely strengthens their willingness to reciprocate and their
anticipation of others' willingness to reciprocate. Both types of
punishment, equilibrium and nonequilibrium, rely on reciprocity
incentives. However, incentives seem stronger when they are equilibrated
by an additional strict equilibrium.
APPENDIX 1: EQUILIBRIUM PREDICTIONS
The Folk Theorem
Let us give up symmetry and explore the payoff space which can be
justified by any pure strategy subgame perfect equilibrium as it is
usually performed when establishing folk theorems.
Suppose [EPD.sub.w] is repeated T times so that the number of
rounds left to be played is r = T, ..., 1. In each round both players
i = 1, 2 select a stage game action [a.sub.i] [member of] {C, D, A}.
In the last round (r = 1), there are two pure subgame equilibrium
strategies, (D, D) and (A, A), with additional subgame payoffs (9, 9)
and (3, 3), respectively.
In the second last round (r = 2), the continuations are sufficient
to discourage deviation from cooperation so that (C, C) is supported as
equilibrium outcome with continuation (D, D) and the threat to select
(A, A) instead after a deviation. For the same reason, the (less
interesting) actions (D, A) and (A, D) can occur in equilibrium. Thus,
the set of payoffs [[PI].sub.2] = {(27, 27), (18, 18), (12, 12), (6, 6)} can be
realized in the remaining two rounds by the corresponding subgame
equilibrium strategies. In this round, the coordination failures (C, D) and
(D, C) and the actions (C, A) and (A, C) are never chosen in
equilibrium, because the corresponding potential gains of 9 and 15 after
a deviation cannot be compensated by equilibrium retribution in the last
round.
One round earlier (r = 3), this is no longer valid, because the
threat to continue with (18, 18), (12, 12), or (6, 6) instead of (27,
27), or with (6, 6) instead of (18, 18) discourages deviations.
Similarly, the remaining asymmetric actions (C, A) and (A, C) can be
stabilized in this round if the players continue with the equilibrium
payoff (6, 6) instead of (27, 27). Thus, all nine action combinations
may occur in an equilibrium in rounds r [greater than or equal to] 3 if
the feasible continuations are restricted as described earlier. The
resulting asymmetric and symmetric additional subgame payoffs are
[[PI].sub.3] = {(48, 27), (27, 48), (39, 18), (18, 39), (45, 45),
(36, 36), (30, 30)}
[union] {(27, 27), (21, 21), (15, 15), (9, 9)}.
The resulting restrictions on the action combinations can be
summarized as follows: (17)
THEOREM 1 A combination of actions ([a.sub.1], [a.sub.2]) in round
r of the repeated weak extended prisoners' dilemma (EPD) game is
compatible with a subgame perfect equilibrium if and only if r [greater
than or equal to] 3, or r = 2 and ([a.sub.1], [a.sub.2]) [member of]
{(C, C), (D, D), (D, A), (A, D), (A, A)}, or r = 1 and ([a.sub.1],
[a.sub.2]) [member of] {(D, D), (A, A)}.
Also in earlier rounds (r = 4, 5 ...), all nine action combinations
are allowed by the equilibrium strategies. The corresponding set of
equilibrium payoffs is generated by adding the potential stage payoffs
{(18, 18), (21, 0), (0, 21), (9, 9), (3, 3)} to the set of equilibrium
payoffs in the following round whenever the difference to the strongest
potential punishment is sufficient to deter deviations. To construct the
equilibrium payoffs in round r, we therefore have to identify the
subgame equilibrium payoffs in round r - 1 which allow us to punish both
players with at least 3 (to reach [18, 18] in round r), which allow us
to punish the second player with at least 9 (to reach [21, 0] in round
r), and which allow us to punish the first player with at least 9 (to
reach [0, 21] in round r). These three sets are defined by
[[PI].sup.1.sub.r-1] = {[pi] [member of] [[PI].sub.r-1] | there are [pi]', [pi]'' [member of] [[PI].sub.r-1] with [[pi].sub.1] - [[pi]'.sub.1] [greater than or equal to] 3 and [[pi].sub.2] - [[pi]''.sub.2] [greater than or equal to] 3},
[[PI].sup.2.sub.r-1] = {[pi] [member of] [[PI].sub.r-1] | there is [pi]' [member of] [[PI].sub.r-1] with [[pi].sub.2] - [[pi]'.sub.2] [greater than or equal to] 9},
[[PI].sup.3.sub.r-1] = {[pi] [member of] [[PI].sub.r-1] | there is [pi]' [member of] [[PI].sub.r-1] with [[pi].sub.1] - [[pi]'.sub.1] [greater than or equal to] 9}.
In order to obtain the set of equilibrium payoffs in round r we
finally add the corresponding equilibrium continuation payoffs as
follows:
[[PI].sub.r] = ([[PI].sub.r-1] + {(9, 9), (3, 3)}) [union]
([[PI].sup.1.sub.r-1] + {(18, 18)})
[union] ([[PI].sup.2.sub.r-1] + {(21,0)}) [union]
([[PI].sup.3.sub.r-1] + {(0, 21)}).
For r = 4 (after eliminating double elements) this yields the set
of possible additional subgame payoffs
[[PI].sub.4] = {(57, 36), (51, 30), (36, 57), (30, 51), (48, 27),
(27, 48), (36, 36)}
[union] {(30, 30), (24, 24), (18, 18), (12, 12)}
[union] {(63, 63), (54, 54), (48, 48), (45, 45), (39, 39), (33, 33)}
[union] {(69, 27), (60, 18), (66, 45), (57, 36), (51, 30), (48, 27), (42, 21)}
[union] {(27, 69), (18, 60), (45, 66), (36, 57), (30, 51), (27, 48), (21, 42)}.
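The recursion above is easy to implement; the following sketch (helper names are ours) reproduces the sets [[PI].sub.2] and [[PI].sub.3] computed earlier:

```python
def next_payoffs(P):
    """One step of the recursion for the repeated EPD_w: the equilibrium
    payoff set with one extra round, given the equilibrium payoff set P
    for the remaining rounds."""
    def punishable(p, i, gain):
        # some continuation in P punishes player i by at least the deviation gain
        return any(p[i] - q[i] >= gain for q in P)

    out = set()
    for p in P:
        # stage equilibria (D, D) and (A, A) need no threat
        out.add((p[0] + 9, p[1] + 9))
        out.add((p[0] + 3, p[1] + 3))
        # (C, C): either deviator gains 21 - 18 = 3
        if punishable(p, 0, 3) and punishable(p, 1, 3):
            out.add((p[0] + 18, p[1] + 18))
        # (D, C): player 2 gains 9 - 0 = 9 by deviating to D
        if punishable(p, 1, 9):
            out.add((p[0] + 21, p[1]))
        # (C, D): player 1 gains 9 - 0 = 9 by deviating to D
        if punishable(p, 0, 9):
            out.add((p[0], p[1] + 21))
    return out

P1 = {(9, 9), (3, 3)}    # r = 1
P2 = next_payoffs(P1)    # r = 2
P3 = next_payoffs(P2)    # r = 3
assert P2 == {(27, 27), (18, 18), (12, 12), (6, 6)}
assert P3 == {(48, 27), (27, 48), (39, 18), (18, 39), (45, 45), (36, 36),
              (30, 30), (27, 27), (21, 21), (15, 15), (9, 9)}
```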
It is straightforward that a folk theorem holds for the resulting
average equilibrium outcomes:
THEOREM 2 The set of average equilibrium payoffs [[PI].sub.r]/r
converges to a dense set on the individually rational attainable average
payoffs in
[omega] = {([[pi].sub.1], [[pi].sub.2]) | ([[pi].sub.1],
[[pi].sub.2]) [greater than or equal to] (3, 3) and ([[pi].sub.1],
[[pi].sub.2])
[member of] conv ((3, 3); (21,0); (18, 18); (0, 21))}.
To show this limit result, we approximate every interior point of
[omega] as a rational convex combination [alpha] x (3, 3) + [beta] x (21, 0) + [gamma] x (18, 18) or
[alpha] x (3, 3) + [beta] x (0, 21) + [gamma] x (18, 18). Let us choose T so that
[alpha]T [member of] N, [alpha]T [greater than or equal to] 4, [beta]T [member of] N, and [gamma]T [member of] N,
and let the players use ([alpha]T - 1) times (3, 3), [beta]T times (21, 0) or (0, 21),
respectively, [gamma]T times (18, 18), and (9, 9) once. This payoff scheme
can be realized as an equilibrium outcome
because it ends with equilibrium payoffs and because individual
rationality ([[pi].sub.1], [[pi].sub.2]) [greater than or equal to] (3,
3) restricts the use of (21,0) sufficiently to allow the necessary
threat for equilibrating such behavior.
In the strict extended prisoners' dilemma game [EPD.sub.s],
the situation is more complicated as the additional payoff (0, 0) is
more difficult to reach in an equilibrium. However, as this payoff is
below the maximin stage game payoff, (18) this additional cell is only
of limited relevance. There is no difference in the last round as the
set of equilibrium payoffs coincides with those in the weak game. In the
second last round (r = 2), the stage payoffs (0, 0) can be added to (9,
9) and (18, 18), because the threat to move on with (3, 3) instead is
sufficient to stabilize the corresponding actions. Thus,
[[PI].sub.2,strict] = {(27, 27), (18, 18), (12, 12), (9, 9), (6, 6)}
contains one more element than [[PI].sub.2,weak]. This trick allows us to
fill the gaps between the average payoffs faster, but it does not change
the limit result in Theorem 2 as it never affects the smallest average
payoff (3, 3). The equilibrium actions are a little more restricted than
above:
THEOREM 3 A combination of actions ([a.sub.1], [a.sub.2]) in round
r of the repeated strict extended prisoners' dilemma game is
compatible with a subgame perfect equilibrium if and only if r [greater
than or equal to] 4, or r = 3 and ([a.sub.1], [a.sub.2]) [not member of]
{(A,C),(C,A)}, or r = 2 and ([a.sub.1], [a.sub.2]) [member of] {(C, C),
(D, D), (A, A)}, or r = 1 and ([a.sub.1], [a.sub.2]) [member of] {(C,
C), (D, D)}.
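The composition of [[PI].sub.2,strict] described above can be reproduced with a few lines of bookkeeping (our own sketch; the pairing of second-last-round stage payoffs with the last-round equilibrium payoff that enforces them follows the argument in the text):

```python
# Each supportable second-last-round stage payoff, paired with the
# last-round equilibrium payoff used to enforce it. The (18, 18) and
# (0, 0) cells need the (9, 9)-vs-(3, 3) continuation wedge of 6, which
# exceeds the maximal one-shot deviation gain of 21 - 18 = 3.
supportable = [
    ((18, 18), (9, 9)),  # cooperation backed by the punishment threat
    ((9, 9),   (9, 9)),  # defection followed by defection
    ((3, 3),   (9, 9)),  # avoidance followed by defection
    ((0, 0),   (9, 9)),  # the extra cell available only in EPD_s
    ((3, 3),   (3, 3)),  # avoidance followed by avoidance
]
pi2_strict = sorted({(s1 + e1, s2 + e2) for (s1, s2), (e1, e2) in supportable},
                    reverse=True)
print(pi2_strict)  # [(27, 27), (18, 18), (12, 12), (9, 9), (6, 6)]
```

Dropping the (0, 0) row reproduces [[PI].sub.2,weak], which lacks the total (9, 9).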
Properness as a Refinement
In this section, we need additional notation. By [[alpha].sub.i] we
denote mixed behavioral strategies with probabilities [[alpha].sub.ia]
(h) for the three possible actions a [member of] {C, D, A} after history
h [member of] H. The normal form pure (mixed) strategies are denoted by
s ([sigma]); [[sigma].sup.[epsilon]] is a sequence of completely mixed
strategies with [epsilon] [right arrow] 0 and with
[[sigma].sup.[epsilon].sub.i]([s.sub.i]) [greater than or equal to]
[epsilon] for every pure strategy [s.sub.i];
([sigma] | [s.sub.i]) is the strategy vector in which the (mixed)
strategy [[sigma].sub.i] is replaced by the (pure) strategy [s.sub.i],
[pi] ([sigma]) = ([[pi].sub.1] ([sigma]), [[pi].sub.2] ([sigma])) is the
expected payoff realized if the strategies [sigma] are played, and the
payoff [[pi].sup.h.sub.i] ([sigma]) is the expected payoff of a mixed
strategy conditional on the fact that history h has been reached (which
is always well defined for completely mixed strategies).
The folk theorem holds in [EPD.sub.s] also if we restrict attention
to strictly perfect equilibria, that is, to equilibria which
are stable with respect to every perceivable tremble. In [EPD.sub.w] the
situation is drastically different. There the simple argument holds only
for the defective equilibrium in which the players select the strategies
[a.sub.i] ([h.sub.t]) = D for all t = 1 ... T. All other equilibria can
be excluded by backward induction in the form of the repeated elimination
of dominated strategies sketched above: let players tremble such that
future mistakes matter much less than present ones. From the requirement
that (agent normal form) perfect equilibria must not use dominated
behavioral strategies, it follows that players will not use dominated
actions in the last round. So assume that the players select only
behavioral strategies [a.sub.i] ([h.sub.t]) = D for all t > [??],
then [a.sub.i] ([h.sub.[??]]) = D is the unique best reply after any
[h.sub.[??]] [member of] [H.sub.[??]] to a completely mixed strategy,
because potential gains from future trembles are negligible compared to
the present gains. Thus, we can conclude that this tremble structure
justifies only the defective action as equilibrium behavior:
THEOREM 4 Defective play [a.sub.i] ([h.sub.t]) = D for all
[h.sub.t] [member of] H is the unique strictly perfect equilibrium in
the agent normal form of [EPD.sub.w].
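The dominance argument behind Theorem 4 rests on a stage-game fact that is easy to verify: in the weak game "avoid" is weakly dominated by "defect," while in the strict game it is not. A minimal check (our own sketch; own payoffs u[own][other] are taken from the instructions table, with C, D, A denoting cooperate, defect, avoid):

```python
# stage-game payoff tables u[own_action][other_action]
weak   = {"C": {"C": 18, "D": 0, "A": 3},
          "D": {"C": 21, "D": 9, "A": 3},
          "A": {"C": 3,  "D": 3, "A": 3}}
strict = {"C": {"C": 18, "D": 0, "A": 0},
          "D": {"C": 21, "D": 9, "A": 0},
          "A": {"C": 0,  "D": 0, "A": 3}}

def weakly_dominates(u, a, b):
    """True if action a weakly dominates action b in payoff table u."""
    at_least = all(u[a][x] >= u[b][x] for x in u)
    better_somewhere = any(u[a][x] > u[b][x] for x in u)
    return at_least and better_somewhere

assert weakly_dominates(weak, "D", "A")        # A is dominated in EPD_w
assert not weakly_dominates(strict, "D", "A")  # A is undominated in EPD_s
```

In the strict game, A earns 3 against A while D earns 0 there, which is precisely what makes the avoid equilibrium strict and immune to the elimination argument.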
The concept of strict perfection is not generally accepted in the
literature, because it requires so much stability that it does not
always exist (van Damme 1987, 29). We therefore use the concept of
proper equilibrium (Myerson 1978) to defend the uniqueness of defective
play in [EPD.sub.w]. To impose rationality in decision nodes never
reached, we restrict our attention to normal form proper equilibria in
behavioral strategies which are approximated by [epsilon]-proper
equilibria as proposed by van Damme (1987, 119). So we can prove
THEOREM 5 Defective play [a.sub.i]([h.sub.t]) = D for all [h.sub.t]
[member of] H is the unique strategy which can be approximated by the
normal form strategies of an [epsilon]-proper equilibrium of
[EPD.sub.w].
We show that the behavioral strategies concentrate on an
"always defect" continuation, that is,
[[alpha].sup.[epsilon].sub.ia](h) [less than or equal to] [epsilon] x
[[alpha].sup.[epsilon].sub.iD](h) for both actions a [member of]
{C, A} and for all histories h [member of] H, so that "always
defect" is the unique proper equilibrium which can be approximated
by corresponding induced [epsilon]-proper behavioral equilibrium
strategies.
We proceed by backward induction. Suppose the claim is true for all
histories of length [rho] < r. Take an arbitrary history [h.sub.r]
[member of] [H.sub.r] of length T - r and assume that the outcomes in
round r are not concentrated on (D, D), that is, the limit distribution
induced by [[alpha].sup.[epsilon]]([h.sub.r]) on the nine states puts a
positive limit probability on some state ([a.sub.1], [a.sub.2]) [not
equal to] (D, D) as [epsilon] [right arrow] 0. Let [D.sub.i]([h.sub.r])
be the set of player i's defective continuations of [h.sub.r], that
is, the pure strategies which follow [h.sub.r] and select defect (D) in
all subsequent decision nodes. Our induction assumption implies that
both players concentrate on defective continuations [d.sub.i] [member
of] [D.sub.i] ([h.sub.r], ([a'.sub.1], [a'.sub.2])), no matter
which state ([a'.sub.1], [a'.sub.2]) is realized after
[h.sub.r]. So if ([a.sub.1], [a.sub.2]) [not equal to] (A, A), (at
least) one of the players can improve his payoff by a deviation to a
[d.sub.i] [member of] [D.sub.i] ([h.sub.r]): if [a.sub.j] [not equal to]
A, player i [not equal to] j receives a larger payoff after [h.sub.r],
while the outcomes in the rest of the game are (9, 9) with probability
[(1 - [epsilon]).sup.T-r], which tends to 1 as [epsilon] [right arrow]
0. Thus, the expected outcome difference in the remaining rounds becomes
negligible in comparison to the gain after [h.sub.r].
So let ([a.sub.1], [a.sub.2]) = (A, A) and compare player 1's
payoff [[pi].sup.h.sub.1] ([[sigma].sup.[epsilon]] | [d.sub.1]) after a
defective continuation [d.sub.1] [member of] [D.sub.1] ([h.sub.r]) and
player 1's payoff [[pi].sup.h.sub.1] ([[sigma].sup.[epsilon]] |
[[??].sub.1]) after any pure strategy [[??].sub.1], which continues with
at least one later deviation from D after some history ([h.sub.r], (A,
A) ...). Both payoffs contain the same constant value which is realized
up to [h.sub.r]. In round r player 1 receives more than 3 if player 2
trembles to C or D, while ([[sigma].sup.[epsilon]] | [[??].sub.1]) leads
to a payoff of 3. In later rounds player 1 may gain or lose from further
trembles if he continues with ([[sigma].sup.[epsilon]] | [d.sub.1]),
while ([[sigma].sup.[epsilon]] | [[??].sub.1]) generates a loss of at
least 6 against player 2's regular strategy at least once. Thus, we
get [[pi].sup.h.sub.1]([[sigma].sup.[epsilon]] | [d.sub.1]) -
[[pi].sup.h.sub.1]([[sigma].sup.[epsilon]] | [[??].sub.1]) [greater than
or equal to] 6 x (1 - [epsilon]) - [epsilon] x 21(T - r) > 0 for
[epsilon] sufficiently small, so that the requirement of [epsilon]-proper
trembles implies that player 1's trembles satisfy
[[sigma].sup.[epsilon].sub.1] ([[??].sub.1]) [less than or equal to]
[epsilon] x [[sigma].sup.[epsilon].sub.1] ([d.sub.1]).
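A quick numeric sanity check of this bound (our own sketch; the horizon T - r = 16 is an illustrative choice matching the long-horizon supergame):

```python
# Lower bound on the payoff advantage of the defective continuation:
# a sure gain of at least 6 against the regular strategy, minus at most
# 21 per remaining round lost to trembles of probability ~eps.
def lower_bound(eps, rounds_left):
    return 6 * (1 - eps) - 21 * rounds_left * eps

rounds_left = 16  # illustrative long-horizon value of T - r
print(lower_bound(0.01, rounds_left))  # positive: 5.94 - 3.36 = 2.58
print(lower_bound(0.2, rounds_left))   # negative: eps not yet small enough
# positivity threshold: eps < 6 / (6 + 21 * rounds_left)
```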
Finally, we use this condition to compare player 2's defective
continuations [d.sub.2] [member of] [D.sub.2] ([h.sub.r]) with the pure
strategy [[??].sub.2] [member of] [D.sub.2] ([h.sub.r], (A, A)), on
which he is supposed to concentrate. The defective continuation
[d.sub.2] yields a higher payoff than [[??].sub.2]
([[pi].sup.h.sub.2]([[sigma].sup.[epsilon]] | [d.sub.2]) >
[[pi].sup.h.sub.2]([[sigma].sup.[epsilon]] | [[??].sub.2])) if player 1
trembles after [h.sub.r], and it may give less (or more) if player 1
trembles in later rounds. The condition [[sigma].sup.[epsilon].sub.1]
([[??].sub.1]) [less than or equal to] [epsilon] x
[[sigma].sup.[epsilon].sub.1] ([d.sub.1]) implies that later trembles
are much less likely; the resulting expected present gains dominate
potential future losses for small [epsilon]. This implies that
strategies which put a positive weight on the action [a.sub.i]
([h.sub.r]) = A cannot be best replies to [sigma] and are therefore, by
van Damme's (1987, 30) Lemma 2.3.2, incompatible with an
[epsilon]-proper equilibrium.
APPENDIX 2: INSTRUCTIONS
Welcome and thank you for participating in this experiment. Please
read the instructions carefully. From now on we ask you to remain seated
and to stop communicating with other participants. If you have any
questions, please raise your hand. We will come to your place and answer
your questions in private. It is very important that you follow these
rules. Any violation will lead to your exclusion from the experiment and
any payment.
The instructions are identical for all participants.
You will participate in the following sub-experiment two [eight] (19)
times. Every sub-experiment consists of 16 [four] rounds. Within the
same sub-experiment you will be interacting with the same participant.
Whenever a sub-experiment is finished, the other participant will be
replaced. [It is possible that you interact with a participant you have
already interacted with. However, it is impossible that you interact
with the same other participant for two consecutive sub-experiments.]
In each round, you and the other participant will be simultaneously
asked to choose one of three {two} (20) alternatives A, B, or C (21) {A
or B}. Depending on your own decision and the decision of the other
participant, your earnings are given by the following table. (22,23)
For example, if you choose A while the other participant chooses B,
you will earn 0 ECU and the other will earn 21 ECU. If you choose B and
the other chooses A, you will receive 21 ECU and the other 0 ECU. At the
end of each round, you will be informed about
* your own decision
* the decision of the other participant
* your earnings from the current round
* your total earnings from the current sub-experiment.
Your earnings from the two [eight] sub-experiments will be added up
and paid to you in cash at the end of the experiment. The exchange rate
is 66 ECU per 1 euro. Additionally, you will receive a show-up fee of
2.50 euros.
After reading these instructions, you can familiarize yourself with
the experiment during three {two} test rounds. The test rounds are not
relevant for your earnings. Then you will be asked to answer some
control questions. After this, the experiment will start. In the end we
will ask you to fill in a brief questionnaire.
(1.) Ahlert, Cruger, and Guth (2001) show that equal damages for
punishers and punished do not work.
(2.) In the experimental instructions we used a neutral frame for
the strategies: "cooperate" was called "A,"
"defect" "B" and "avoid" "C."
(3.) In [EPD.sub.s] a mixed strategy equilibrium exists according
to which both players use "defect" with probability 1/4 and
"avoid" with probability 3/4.
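This mixed equilibrium is easy to verify numerically (our own sketch; own payoffs are taken from the instructions table for the strict game, with C, D, A for cooperate, defect, avoid): against the mixture, "defect" and "avoid" both earn 9/4 while "cooperate" earns 0, so the mixture is a best reply to itself.

```python
from fractions import Fraction as F

# own payoff u[own][other] in the strict game EPD_s
u = {"C": {"C": 18, "D": 0, "A": 0},
     "D": {"C": 21, "D": 9, "A": 0},
     "A": {"C": 0,  "D": 0, "A": 3}}
sigma = {"C": F(0), "D": F(1, 4), "A": F(3, 4)}  # the claimed equilibrium mix

def expected(action):
    """Expected stage payoff of a pure action against the mixture sigma."""
    return sum(sigma[other] * u[action][other] for other in u)

# D and A are exactly indifferent and beat C, so any mix over {D, A} is a
# best reply to sigma: the symmetric profile (sigma, sigma) is an equilibrium
assert expected("D") == expected("A") == F(9, 4)
assert expected("C") < expected("D")
```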
(4.) Of course, the only pure strategy equilibrium of the PD game
is (D, D).
(5.) Feinberg and Snyder (2002) use a workhorse that is very
similar to our [EPD.sub.w]. However, the focus of their paper
substantially differs from ours: they study the effect of imperfect information about the actions of the other player on collusion in
repeated duopoly markets, while we are interested in comparisons between
standard and extended PD games that allow for equilibrium punishment.
(6.) In our experiment and the corresponding theoretical analysis,
we concentrate on supergames in which the cumulative payoffs of all T
rounds are paid after the last round. However, our arguments are equally
valid if the players discount their payoffs with some sufficiently large [delta] < 1. The cooperative outcome, for example, is obtained,
whenever [delta] > 1/2.
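One way to see where the 1/2 comes from (our own sketch, assuming the deviation is deterred by one round of the (3, 3)-instead-of-(9, 9) continuation; the paper's finite-horizon construction may differ in detail): defecting against cooperation gains 21 - 18 = 3 today and costs 9 - 3 = 6 in the following round, discounted by [delta].

```python
# A defection from (C, C) is profitable iff the immediate gain exceeds
# the discounted one-round punishment wedge: 3 > delta * 6.
def deviation_profitable(delta):
    return (21 - 18) > delta * (9 - 3)

assert deviation_profitable(0.4)       # 3 > 2.4: the threat is too weak
assert not deviation_profitable(0.6)   # 3 < 3.6: cooperation is sustainable
# indifference exactly at delta = 3/6 = 1/2, matching the footnote
```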
(7.) Even though our backward elimination employs only
"nice" weak dominance in the sense of Marx and Swinkels
(1997), we cannot apply their result, because their condition of
"Transference of Decision Maker Indifference" does not hold
for our repeated game.
(8.) Strict perfection (Okada 1981) and properness (Myerson 1978)
are refinements of trembling hand perfectness (Selten 1975). Strict
perfection requires stability with respect to every tremble, while (the
weaker) properness imposes a payoff dependent structure on the trembles.
Both question the idea of uniform trembles (Harsanyi and Selten 1988).
In Appendix 1 we use both, trembling hand perfectness and properness.
(9.) Imposing the same total number of rounds across treatments
keeps the stakes, as well as the effects of fatigue, constant across
treatments. We preferred this over controlling for the number of
repetitions of supergames across treatments. Nevertheless, we can
compare learning between treatments by focusing on the first repetition
only.
(10.) For a translation of the instructions from the German, see
Appendix 2.
(11.) In the test rounds subjects did not interact with another
subject but with the computer.
(12.) We used the software z-Tree (Fischbacher 2007).
(13.) Results of the following analysis do not substantially change
when we consider medians instead of means. Therefore, we do not discuss
them separately.
(14.) However, the folk theorem shows that there are equilibria in
which the players use action A earlier on the equilibrium path.
(15.) The last actions are, of course, triggered by second last
ones, which suggests also assessing conditioning on earlier than last
behavior. Our attempts to explore such conditioning were, however,
inconclusive.
(16.) Here we abstract from incomplete information that is not
experimentally induced (Kreps et al. 1982) and from weaker notions of
rationality (Radner 1980).
(17.) Of course, not all actions can be combined into an equilibrium
history, so that there are further restrictions on the set of
equilibrium strategies.
(18.) If one chooses, for example, D with probability 1/4 and A with
probability 3/4, one is sure to receive an expected payoff of at least
9/4 regardless of what the other does.
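The guarantee claimed in this footnote can be computed directly (our own check; own payoffs from the instructions table for the strict game, with C, D, A for cooperate, defect, avoid):

```python
from fractions import Fraction as F

# own payoff u[own][other] in the strict game EPD_s
u = {"C": {"C": 18, "D": 0, "A": 0},
     "D": {"C": 21, "D": 9, "A": 0},
     "A": {"C": 0,  "D": 0, "A": 3}}
mix = {"C": F(0), "D": F(1, 4), "A": F(3, 4)}  # the footnote's mixture

# worst-case expected payoff of the mixture over the other's pure actions
guarantee = min(sum(mix[a] * u[a][b] for a in u) for b in u)
print(guarantee)  # 9/4 (= 2.25), attained against both D and A
```

Since 9/4 > 0, the additional payoff (0, 0) indeed lies below this guaranteed stage payoff, as claimed in the text.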
(19.) In square brackets: short horizon.
(20.) In curly brackets: PD treatments.
(21.) Here, "A" corresponds to "cooperate,"
"B" to "defect," and "C" to
"avoid."
(22.) In round brackets: weak game. The table for the PD game
consists only of the cells that include choices A and B.
(23.) ECU, experimental currency units.
My Decision   Decision of the Other   My Earnings   Earnings of the Other
              Participant             in ECU        Participant in ECU
A             A                       18            18
A             B                        0            21
A             C                        0 (3)         0 (3)
B             A                       21             0
B             B                        9             9
B             C                        0 (3)         0 (3)
C             A                        0 (3)         0 (3)
C             B                        0 (3)         0 (3)
C             C                        3             3
TABLE A1
Round-wise Wilcoxon Rank-Sum p Values for Means
Long 16S versus 16S versus 16W versus
Horizon Round 16W 16PD 16PD
1 0.09 0.01 0.41@
2 0.14@ 0.01 0.39@
3 0.04 0.02 0.69@
4 0.02 0.01 0.77@
5 0.06 0.44@ 0.27@
6 0.16@ 0.07 0.57@
7 0.05 0.04 0.77@
8 0.03 0.01 0.69@
9 0.04 0.00 0.30@
10 0.02 0.06 0.82@
11 0.02 0.01 0.80@
12 0.04 0.02 0.59@
13 0.03 0.03 0.84@
14 0.03 0.03 0.89@
15 0.19@ 0.19@ 0.92@
16 0.36@ 0.86@ 0.43@
Short 4S versus 4S versus 4W versus
Horizon Round 4W 4PD 4PD
1 0.09 0.05 0.56@
2 0.08 0.02 0.40@
3 0.13@ 0.17@ 0.64@
4 0.01 0.05 0.77@
Notes: Wilcoxon rank-sum p values; nonsignificant values are
marked with @; test conducted by round; null hypothesis:
two independent samples are from populations with the same
distribution; 16 independent observations per treatment.
REFERENCES
Ahlert, M., A. Cruger, and W. Guth. "How Paulus Becomes
Saulus. An Experimental Study of Equal Punishment Games." Homo
Oeconomicus, 18, 2001, 303-18.
Benoit, J. P., and V. Krishna. "Finitely Repeated Games."
Econometrica, 53(4), 1985, 905-22.
Bereby-Meyer, Y., and A. E. Roth. "The Speed of Learning in
Noisy Games: Partial Reinforcement and the Sustainability of
Cooperation." American Economic Review, 96(4), 2006, 1029-42.
Bruttel, L. V., W. Guth, and U. Kamecke. Forthcoming.
"Finitely Repeated Prisoners' Dilemma Experiments without a
Commonly Known End." International Journal of Game Theory, DOI:
10.1007/s00182-011-0272-z.
Dal Bo, P. "Cooperation under the Shadow of the Future:
Experimental Evidence from Infinitely Repeated Games." American
Economic Review, 95(5), 2005, 1591-604.
Fehr, E., and S. Gachter. "Cooperation and Punishment in
Public Goods Experiments." American Economic Review, 90(4), 2000,
980-94.
Feinberg, R., and C. Snyder. "Collusion with Secret Price
Cuts: An Experimental Investigation." Economics Bulletin, 3(6),
2002, 1-11.
Fischbacher, U. "z-Tree: Zurich Toolbox for Ready-Made
Economic Experiments." Experimental Economics, 10(2), 2007, 171-78.
Greiner, B. "An Online Recruitment System for Economic
Experiments," in Forschung und Wissenschaftliches Rechnen 2003,
GWDG Bericht 63, edited by K. Kremer and V. Macho. Goettingen: Ges. fuer
Wiss. Datenverarbeitung, 2004, 79-93.
Harsanyi, J. C., and R. Selten. A General Theory of Equilibrium
Selection in Games. Cambridge, MA: MIT Press, 1988.
Kreps, D. M., P. Milgrom, J. Roberts, and R. Wilson. "Rational
Cooperation in the Finitely Repeated Prisoners' Dilemma."
Journal of Economic Theory, 27, 1982, 245-52.
Marx, L. M., and J. M. Swinkels. "Order Independence for
Iterated Weak Dominance." Games and Economic Behavior, 18, 1997,
219-45.
Myerson, R. B. "Refinements of the Nash Equilibrium Concept."
International Journal of Game Theory, 7(2), 1978, 73-80.
Nagel, R. "Unraveling in Guessing Games: An Experimental
Study." American Economic Review, 85(5), 1995, 1313-26.
Okada, A. "On Stability of Perfect Equilibrium Points."
International Journal of Game Theory, 10(2), 1981, 67-73.
Ostrom, E., J. Walker, and R. Gardner. "Covenants With and
Without a Sword: Self-Governance Is Possible." American Political
Science Review, 86(2), 1992, 404-17.
Radner, R. "Collusive Behavior in Noncooperative
Epsilon-equilibria of Oligopolies with Long but Finite Lives."
Journal of Economic Theory, 22(2), 1980, 136-54.
Schwartz, S., R. A. Young, and K. Zvinakis. "Reputation
without Repeated Interaction: A Role for Public Disclosures."
Review of Accounting Studies, 5, 2000, 351-75.
Selten, R. "Reexamination of the Perfectness Concept for
Equilibrium Points in Extensive Games." International Journal of
Game Theory, 4(1), 1975, 25-55.
Selten, R., and R. Stoecker. "End Behavior in Sequences of
Finite Prisoner's Dilemma Supergames." Journal of Economic
Behavior and Organization, 7, 1986, 47-70.
van Damme, E. Stability and Perfection of Nash Equilibria.
Berlin/Heidelberg: Springer-Verlag, 1987.
TABLE 1
Wilcoxon Rank-Sum p Values for Means; Null Hypothesis: Two Independent
Samples Are from Populations with the Same Distribution; 16
Independent Observations per Treatment
Treatment 16S versus 16W 16S versus 16PD 16W versus 16PD
p values .036 .030 .706
Treatment 4S versus 4W 4S versus 4PD 4W versus 4PD
p values .065 .048 .650
Treatment 16PD versus 4PD 16S versus 4S 16W versus 4W
p values .004 .001 .002
TABLE 2
Relative Frequencies of Outcomes and Transition Frequencies from
Outcome (D, D) in Round t to Outcomes (D, D), (C, D), (D, C), (C, C)
in Round t + 1
Average
16PD (%) 16W (%) 16S (%)
Frequency of (D, D) 32 26 13
(C, D) or (D, C) 16 13 12
(C, C) 52 55 74
After (D, D)
16PD (%) 16W (%) 16S (%)
Frequency of (D, D) 84 79 77
(C, D) or (D, C) 14 11 18
(C, C) 2 1 1
Notes: The numbers for each treatment do not sum up to 100% because
outcomes including A were excluded. To abstract from the usual
endgame effect, we do not count outcomes in rounds 16 and 32, as well
as reactions to (D, D) in rounds 16 and 32. Only reactions to (D, D)
within an interaction with the same partner are analyzed, that is,
reactions to (D, D) in round 17 were not counted.
TABLE 3
Absolute Frequency of Decision A after a
Given Outcome in the Previous Round and
Absolute Frequency of Decisions C, D, or A
after Outcomes (C, A), (D, A), or (A, A) by
Treatment; Subjects per Treatment = 64,
Observations per Treatment = 2,048
Absolute Frequency of A-choices after ... 16S 16W
Own C and other's C 0 0
Own C and other's D 9 9
Own C and other's A 0 0
Own D and other's C 3 6
Own D and other's D 5 23
Own D and other's A 1 1
Own A and other's C 0 0
Own A and other's D 5 16
Own A and other's A 0 0
Reaction after the Other Has Played A 16S 16W
C after own C and other's A 4 4
C after own D and other's A 1 5
C after own A and other's A 1 3
D after own C and other's A 1 1
D after own D and other's A 9 38
D after own A and other's A 1 1
A after own C and other's A 0 0
A after own D and other's A 1 1
A after own A and other's A 0 0