Learning to Forecast Price
Hugh Kelley and Daniel Friedman
I. INTRODUCTION
In recent years economists have begun to investigate how people
might learn equilibrium behavior. Microeconomists following Binmore
(1987) and Fudenberg and Kreps (1988) consider learning models with
roots in Cournot (1838) and Brown (1951). Numerous laboratory studies
test and refine the microeconomists' learning models; see Camerer
(1998) for a recent survey. There is also a separate theoretical
macroeconomics literature on learning following Marcet and Sargent
(1989a, 1989b, 1989c) and Sargent (1994); see Evans and Honkapohja
(1997) for a recent survey. The focus is on how people might learn to
forecast relevant prices and whether the learning process permits
convergence to rational expectations equilibrium. We are not aware of
any laboratory work intended to test and refine the learning models
favored by macroeconomists. (1) The current study is intended to fill
that gap.
We gather laboratory evidence on the most basic questions a
macroeconomist might ask about learning: Can people learn to forecast
prices rationally? If there are obstacles to learning, are they
transient or innate characteristics of human behavior? What sorts of
environments reduce or enlarge those obstacles? Additional questions
might be asked about the effects of learning observable in the usual
macroeconomic and financial field data and about forecasting in a
self-referential macroeconomic setting. Our work does not address such
questions directly, but it does lay a foundation for later
investigations of these additional questions.
Available evidence on the basic questions is rather disquieting. An
extensive cognitive psychology literature, following Kahneman, Slovic,
and Tversky (1982), finds that human forecasts are bedeviled by many
systematic biases, such as the anchoring and adjustment heuristic, the
availability and representativeness heuristics, base rate neglect, and
confirmatory and hindsight biases; see Rabin (1998) and Camerer (1998)
for recent surveys. There is also a small experimental economics
literature on forecasting prices and rational expectations that reaches
generally negative conclusions. Garner (1982) presents 12 subjects over
44 periods with a continuous forecasting task that implicitly requires
the estimation of seven coefficients in a third-order autoregressive
linear stochastic model. He rejects stronger versions of rational
expectations but finds some predictive power in weaker versions.
Williams (1987) finds autocorrelated and adaptive forecast errors by
traders in simple asset markets. However, the true data-generating
process in this task is not stationary and is unknown even to the
experimenter, which makes it difficult to identify individually rational
behavior. Dwyer et al. (1993) test subjects' forecasts of an
exogenous random walk. They find excess forecast variance but no
systematic positive or negative forecast bias for this nonstationary
task.
A possible objection to both strands of the empirical literature is
that neither provides good opportunities for learning. Most of the
cognitive studies frame the tasks in ways that do not immediately engage
subjects' forecasting experience, offer no salient reward, or
provide little feedback that would allow subjects to improve
performance. The three economics articles just cited have relatively few
trials with complicated or nonstationary processes. Our study, by
contrast, presents laboratory subjects with a moderately difficult
forecasting task in several stationary learning environments.
We examine human learning in an individual choice task called
Orange Juice Futures price forecasting (OJF). The OJF task has a form
and complexity similar to the forecasting tasks in macroeconomists'
models: Subjects must implicitly learn the coefficients of two
independent variables in a linear stochastic process. The task is based
on the observation of Roll (1984) that the price of Florida orange juice
futures depends systematically on only two exogenous variables: the
local weather hazard and the competing supply from Brazil. The
laboratory experiment consists of many independent trials in which human
subjects forecast the OJF price after observing values of the two
variables. After each trial the subject receives feedback in the form of
the "actual" price generated from the linear stochastic model
using the observed values of the two variables. We report results for 99
subjects, each forecasting in 480 trials. Several treatments that may
affect the learning environment are varied across subjects, such as the
noise amplitude and the relative impact of the two variables.
We are interested in two aspects of learning: consistency and
speed. Roughly speaking, learning is consistent to the extent that
subjects eventually respond correctly to the exogenous variables, and
learning is speedy to the extent that subjects settle quickly into a
systematic pattern of response to the variables. To measure learning
speed and consistency, we introduce a rolling regression (or sequential
least squares) technique inspired by Marcet and Sargent (1989a, 1989b,
1989c). The technique gives us trial-by-trial estimates of
each subject's implicit coefficient values, or responsiveness to the two
exogenous variables. We deem learning to be consistent if these
estimates converge by the last trial to the objective values, and say
that there is under- (or over-) response if the absolute values of the
coefficients are below (or above) the objective values. We measure
learning speed by comparing an individual subject's path of
coefficient estimates and cumulative squared forecast errors to a
Bayesian (or Marcet-Sargent [M-S]) ideal forecast.
The OJF task is a continuous analogue of the discrete response
Medical Diagnosis (MD) task studied intensively by psychologists, such
as Gluck and Bower (1988) and more recently by Kitzis et al. (1998). (2)
The older psychological literature, from Thorndike (1898) on, emphasizes
reinforcement learning in binary tasks--actions that do well now are
"reinforced" and chosen more frequently in the future. Naive
reinforcement models do not extend naturally to our OJF task because it
is not clear what reinforcement means in the context of continuous
stimuli (weather and supply information) and continuous response (price
forecast). The MD literature considers more sophisticated models of
error-driven learning, including neural network or connectionist models
and generalized discrete Bayesian models. The most striking finding of
Kitzis et al. (1998) is that a generalized Bayesian model (a cousin to
our rolling regressions) outperforms alternative psychological models in
the version of the MD task closest to the present OJF task. The MD
results encourage us to pursue rolling regression techniques in the OJF
task.
Section II describes our experiment. Section III presents the
results. The main conclusions include (1) learning is quite consistent
in that most subjects' coefficient estimates converge closely to
the objective values, but there is a slight general tendency toward
overresponse. (2) Typically learning is noticeably slower than the M-S
ideal. Among the more striking treatment effects are a general tendency
(3) toward overresponse in the High Noise treatment and (4) toward
underresponse in the Asymmetric treatment. Section IV discusses
the results and proposes extensions of our work. Appendix A reproduces
the instructions to subjects, and Appendix B documents the
identification of unresponsive subjects. Other articles that rely on our
data include Kelley and Friedman (forthcoming), which briefly summarizes
the recent MD results together with preliminary OJF results, and Kelley
(1998), which reports additional OJF results.
II. LABORATORY PROCEDURES
We induce the following linear stochastic relationship of price p
to contemporaneous values of two exogenous variables, $x_1$ and $x_2$:

(1) $p_t = a_1 x_{1,t} + a_2 x_{2,t} + e_t$.
Subjects are told that p refers to the local orange juice futures
price relative to its normal level. They are also told that $x_1$
refers to the local weather hazard, which could potentially destroy part
of the domestic orange production, and that $x_2$ refers to the
competing supply of oranges from Brazil. The realized price $p_t$ in
trial t depends on the realized value of $x_{1,t} \in [0, 100]$
and its coefficient $a_1$ (approximately 0.4 in the baseline
treatment), and on $x_{2,t} \in [0, 100]$ and its coefficient
$a_2$ (approximately -0.4 in the baseline treatment). The
coefficient signs reflect the economic reality that loss of domestic
crops tends to increase price and that increased foreign supply tends to
decrease price. The noise term e reflects the unpredictability of prices
in field markets. Its value $e_t$ is drawn independently each trial
from the uniform distribution on [-v, v], where the (maximum) noise
amplitude v is a treatment variable (approximately 8 in the
baseline treatment).
Subjects are instructed on the general nature of the task but are
not specifically told the functional form or the coefficient values.
Subjects are told that the experiment is a learning experience in which
the goal is to learn the relationship between information (weather and
competing supply) and the price of OJF. The instructions (Appendix A)
state in nontechnical language that the relationship is stable but
subject to random events that are independent across trials. Treatments
described in the next subsection are held constant for each subject and
are varied across subjects.
Subject Pool
We tested 99 undergraduates from the University of California at
Santa Cruz, most of them from the pool of psychology students who need
to fulfill a class requirement. Salient cash payments were offered in
one treatment, described below.
Apparatus
The experiment uses a graphics computer program written in C++, run
on Power Mac 7500/100 computers with full color monitors. Subjects in
four sound-dampened isolated testing rooms view controlled events on the
monitor screen and respond via clicking the mouse on various icons on
the display. See Figure 1 for examples of screen displays. This setup
was chosen to minimize boredom and to eliminate the possibility of peer
pressure.
Stimuli
The realized values for weather $x_{1,t}$ and supply $x_{2,t}$
are independently drawn each period from the uniform distribution on
(0, 100), so the variables are orthogonal. The noise term is
independently drawn each period from a different uniform distribution,
U(-v, v). The realized values then are combined using equation (1) and
the chosen parameter values ($a_1$, $a_2$, and v) to produce a
480-trial sequence of prices. The same sequence of realized values and
prices is used for all subjects in any given treatment condition.
Method
Each trial begins with the graphical presentation of the weather
and supply values using two thermometer icons (labeled weather hazard
and Brazilian supply) on the left side of the monitor display as in
Figure 1 (top). Each thermometer is partially filled in red to indicate
the realized value. Except in the No History treatment described below, the
subject could also access (by clicking on the Previous Cases icon
labeled C in Figure 1, top) the history of prices in previous trials
with similar weather and supply levels, as in Figure 1 (middle). (3)
Subjects enter their forecast each period by moving slide B in
Figure 1 (top) up or down within the possible price range. After the
price prediction is entered and confirmed, a blue line appears on the
slide bar to indicate the actual price in that trial as in Figure 1
(bottom). Except in the No Score treatment, the score box then appears
as in Figure 1 (bottom). (4) After viewing the score box (if present)
the subject advances to the next trial via a mouse click.
Each subject completes 480 self-paced trials. The session is broken
into three blocks of 160 trials, and subjects are permitted five-minute
breaks between blocks. Subjects generally finish in less than the
allotted two hours.
Treatments
We vary the learning environment using five alternatives to the
baseline treatment. Actual participants in financial markets face a wide
variety of conditions in terms of the availability of useful historical
information, the immediacy and accuracy of information on current
conditions, and quality of feedback on investment decisions. We would
like to know something about how such conditions affect the quality of
human forecasts. Also, as mentioned in the introduction, most of the
existing laboratory data uses unpaid subjects who produce poor
forecasts. By using both paid and unpaid subjects, we can see whether
this treatment seems pivotal. The treatments described after the
baseline are ordered in increasing anticipated difficulty for making
accurate price forecasts.
Baseline. The baseline treatment provides parameter values of
$a_1 = 0.417$, $a_2 = -0.417$, and v = 8.33. The history and
score boxes appear as described. We regard the baseline treatment as
resembling favorable
information conditions in the field, when investors have good access to
historical and contemporary information and immediate feedback. If
subjects don't learn well in this environment, then either the task
exceeds their cognitive abilities or they have insufficient motivation.
Paid. This treatment differs from baseline only in that subjects
are paid according to their final scores. Each subject receives a $5
show-up fee covering the first 30,000 points of final cumulative score.
(Actual final scores always exceeded 30,000, with the top scores over
37,000.) Subjects also receive an additional $1 for each 700 points
scored above 30,000. The median payment was about $15 with top payments
about $16.50. Subjects are told the payment procedures on arrival. This
is a priori the most favorable environment for learning because the
financial incentive seems sufficient to elicit subjects' serious
effort. The a priori prediction of most experimental economists is
faster and more consistent learning than in the baseline; most
psychologists would predict no effect.
No Score. This treatment differs from baseline only in that
subjects do not have access to the Results or Score icon and box.
The omission of this feedback may degrade the learning environment in
that subjects no longer have a direct measure of their relative
performance. (5) Of course, subjects still have access to all
information that is directly useful in forecasting price, so the effect
might be small.
No History. This treatment differs from baseline only in that
subjects do not have access to the Previous Cases or History icon and
box. A priori we expect that our handy summary of relevant historical
information enables subjects to learn more rapidly. It is hard for most
people to remember and organize hundreds of previous observations. Of
course, the trial-by-trial outcomes still are all observable in this
treatment, so an ideal observer would not be affected. Therefore we
again have two competing hypotheses for this treatment: no effect, or
slower and perhaps less consistent learning.
Asymmetric. The only difference from baseline is that the
coefficient values are $a_1 = 0.250$ and $a_2 = -0.583$. Thus,
the weather and the competing supply stimuli no longer have equal (or
symmetric) impact on OJF price. A priori it is not clear whether this
treatment creates a more difficult learning environment than No History.
It would have no effect on a subject who learned each coefficient
independently, but would reduce the learning speed and consistency of a
subject with symmetric priors. Psychological studies for binary tasks
with asymmetrically salient stimuli suggest additional hypotheses. If
the overshadowing effect of Kahneman et al. (1982) were present, subjects
would tend to overrespond to the more heavily weighted stimulus and
ignore the less important one: the larger stimulus overshadows the smaller.
If this effect extends to our continuous task, we would see a bias
toward overresponding to the more important news [x.sub.2] and
underresponding to the less important news [x.sub.1].
High Noise. The final treatment almost doubles the noise amplitude
to v = 14.3, and the coefficient values $a_i$ are scaled to
±0.357, as described below. All other features are as in the Baseline
treatment. We expect High Noise to slow down learning appreciably. Even
an ideal M-S learner would take longer to reach a given degree of
precision in estimating coefficients. One of the characteristics of the
M-S least squares estimator that produces this effect is that the
subjective estimates have a standard error that is decreasing in the
sample size (T) but increasing in the standard deviation of the error
term ($\sigma_e$); see Greene (1993), section 5.6.1. Humans may
have additional difficulties because they tend to have difficulty
separating random from systematic variability (e.g., Rabin 1998; Brehmer
1980). The effect is important in field applications, because some
macroeconomic and financial variables are quite noisy (i.e., the
nonsystematic component has large amplitude relative to the systematic
component) and others are not very noisy.
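To make the mechanism explicit, the variance of the no-intercept least squares estimator takes the textbook form

$$\widehat{\operatorname{Var}}(\hat a) = \hat\sigma_e^2 \,(X'X)^{-1},$$

where X is the T x 2 matrix of stimulus realizations. With i.i.d. regressors, $X'X$ grows linearly in the window length T, so each standard error shrinks like $\sigma_e/\sqrt{T}$; doubling the noise amplitude therefore roughly quadruples the number of trials needed to reach a given precision.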
The general pattern one might anticipate is reduced forecast
consistency and speed as we move down the list of treatments from Paid
to High Noise. That is, we should see on average less accurate final
estimates of the objective weights $a_i$ and slower convergence. The
final scores should also decline, because these scores reflect forecast
errors accumulated over all trials and therefore proxy for learning
speed. Finally, the fraction of subjects displaying significant
deviations from consistent forecasts should increase.
Before analyzing the data, we had no idea how good the forecasts
would be. Possibly the representativeness heuristic and base rate
neglect would lead to massive and persistent overestimates of both
coefficients, or the anchoring and adjustment heuristic would produce
initial estimates close to zero that converged toward the true values
very slowly. Possibly overshadowing would be important in the Asymmetric
treatment. Before looking at the results, we need to explain how the
data are prepared for analysis.
Data Processing
The Baseline values of $a_i$ are scaled as follows. Begin with
unscaled values $a^*_1 = 0.5$ and $a^*_2 = -0.5$. Given
noise amplitude $v^*$, equation (1) implies that the unscaled price
ranges from $p^* = 0.5(0) - 0.5(100) - v^* = -(50 + v^*)$
to $p^* = 0.5(100) - 0.5(0) + v^* = 50 + v^*$. To fit in the
screen's range (-50, 50), we display the scaled price
$p = 50p^*/(50 + v^*)$. The scaled coefficients therefore are
$a_i = 50a^*_i/(50 + v^*)$ and the scaled noise amplitude is
$v = 50v^*/(50 + v^*)$. For the Baseline noise value $v^* = 10$
we have $v = 8.33$ and $a_i = 0.833a^*_i = \pm 0.417$. The scaled
coefficients used in the High Noise ($v^* = 20$) and Asymmetric
treatments are derived in a similar fashion.
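The mapping is easy to verify numerically. A quick sketch follows; note that the Asymmetric unscaled values $a^*_i = (0.30, -0.70)$ are our inference from the reported scaled coefficients, not values stated in the text.

```python
def scale_parameters(a1_star, a2_star, v_star):
    """Scale (a*_1, a*_2, v*) by 50/(50 + v*) so that the displayed
    price p = 50 p*/(50 + v*) fits the screen range (-50, 50)."""
    k = 50.0 / (50.0 + v_star)
    return k * a1_star, k * a2_star, k * v_star

print(scale_parameters(0.5, -0.5, 10))   # Baseline:   (0.417, -0.417, 8.33)
print(scale_parameters(0.5, -0.5, 20))   # High Noise: (0.357, -0.357, 14.3)
print(scale_parameters(0.3, -0.7, 10))   # Asymmetric: (0.250, -0.583, 8.33)
```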
For a given subsequence of trials $(p_t, x_{1,t}, x_{2,t})$,
$t = t_0, \ldots, T$, we define the ideal Bayesian (or least
squares, or M-S) learner by regressing $p_t$ on the independent
variables $x_{1,t}$ and $x_{2,t}$ via ordinary least squares (OLS).
The regression over this subsequence of trials yields coefficient
estimates $a_{1,T}$ and $a_{2,T}$. The subsequences we consider
consist of trials 1 to 160, 2 to 161, ..., 321 to 480. Thus we obtain
learning curves $a_{1,T}$ and $a_{2,T}$ for $T = 160, 161, \ldots, 480$,
which can be interpreted as ideal subjective estimates of the objective
values $a_1$ and $a_2$. We refer to these as the M-S learning
curves.
We use similar rolling regressions for human subjects. An actual
subject may think of the task in various idiosyncratic ways--for
example, he may believe that prices are serially correlated or that
price is a nonlinear deterministic function of the exogenous variables,
despite instructions to the contrary. Nevertheless, the analyst can
summarize the subject's beliefs by seeing how he responds to the
current stimuli $x_{i,t}$, and can summarize the learning process by
seeing how the subject's response changes with experience. Our
approach therefore is to reconstruct implicit beliefs using equation (1)
and subjects' actual responses.
The reconstruction proceeds as follows. Take the subject's
actual forecast $c_t$ in trial t as the dependent variable, and run
rolling regressions as before on the realized values $x_{i,t}$, using
a moving window of 160 consecutive trials with the last trial T ranging
from 160 to 480. Consistent and speedy learning is indicated by rapid
convergence of the coefficient estimates $\alpha_{i,T}$ (as T increases) to
the objective values $a_i$. Obstacles to learning are suggested by
slow convergence, convergence to some other value (which represents
over- or underresponse), or divergence of the coefficient estimates.
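The rolling-regression machinery can be sketched in a few lines, reusing the simulated sequence from the earlier snippet. Regressing the realized price on the stimuli gives the M-S learning curves; substituting a subject's forecasts $c_t$ as the dependent variable gives that subject's reconstructed implicit coefficients. Names are ours.

```python
import numpy as np

def rolling_learning_curves(y, x1, x2, window=160):
    """No-intercept OLS of y on (x1, x2) over a moving window of
    `window` consecutive trials, one fit per window end T.

    y = realized prices p_t   -> M-S ideal learning curves;
    y = subject forecasts c_t -> reconstructed implicit coefficients.
    """
    X = np.column_stack([x1, x2])
    coefs = []
    for T in range(window, len(y) + 1):
        # least squares fit on the most recent `window` trials ending at T
        beta, *_ = np.linalg.lstsq(X[T - window:T], y[T - window:T],
                                   rcond=None)
        coefs.append(beta)
    return np.array(coefs)  # rows: (a_{1,T}, a_{2,T}), T = window..len(y)

ms_curves = rolling_learning_curves(p, x1, x2)  # ideal M-S learner
```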
Some details may be worth noting briefly. (1) In all the results
reported below, the intercept coefficient $a_0$ is constrained to
its objective value of zero. Excluding the intercept doesn't affect
our main results, but it does reduce clutter and improve statistical
efficiency. (2) In preliminary work we considered stretchable windows of
data running from t = 1 to T, to capture fully the evidence available to
the subject (or M-S ideal learner) in trial T. However, the entire
learning curve then reflects the subject's initial response pattern
as well as the recent response pattern. We concluded that learning
curves would be more informative when estimated from a moving window
that includes only the most recent responses. Of course, the recent
responses already incorporate everything the subject has learned since
the beginning of the session. (3) Lengthening a (nonstretchable) moving
window reduces standard errors in the coefficient estimates but also
reduces the weight on the most recent responses. After a cursory investigation of preliminary data, we settled on length 160 as a
reasonable compromise. (4) We use OLS in the spirit of Marcet and
Sargent. Because the error term $e_t$ is uniformly distributed
rather than normal, maximum likelihood (ML) estimation methods could in
theory give better coefficient estimates. (6) We
checked and found that all ML estimates were insignificantly different
from the OLS estimates.
III. RESULTS
The data analysis provides various comparisons between human
subjects' forecasts and the ideal Bayesian (or M-S) forecasts. The
first step is to get a qualitative impression of the overall comparison
and an impression of how the learning environment treatments affect the
subjects' performance. Then we proceed to formal statistical tests
of the various hypotheses presented earlier.
Figure 2 presents a sample of learning curves in each treatment.
Each panel of the figure shows the objective coefficient values as a
horizontal dotted line and shows the ideal M-S learning curves as thin,
continuous lines. The rolling regressions that generate the M-S curves
seem to capture the price data quite well; typical $r^2$ values ranged
from 0.91 for the first 160-trial window of data to 0.93 for the last
window. We were pleased to see that M-S learning is consistent and quite
rapid--indeed, it is virtually complete within the first 160 trials, as
indicated by closeness of the dotted and continuous lines in every
panel. The gap between the lines typically is about one standard error
of the M-S coefficient estimate.
The heavy continuous lines in each panel of Figure 2 represent the
learning curves for the highest scoring subject or the subject with the
median score in each treatment. The width of the line roughly
represents a one-standard-deviation band around the coefficient
estimate. The rolling regressions again had typical $r^2$ values above
0.90. The first two panels show moderate but persistent overresponse to
current weather and supply information, with implicit coefficient
estimates lying closer to ±0.45 than to ±0.42 for both
subjects in the Baseline treatment. The next two panels suggest that the
top-scoring Paid subject is right on target, but the median scorer tends
to underrespond slightly. Overresponse seems strongest with the
top-scoring subject in the No History treatment and the two subjects
shown in the High Noise treatment. The two subjects shown in the
Asymmetric treatment appear to underrespond in most trials.
To conserve space we do not show the learning curves for the other
87 subjects. Suffice it to say that subjects sometimes overrespond,
sometimes underrespond, but typically are fairly close to the objective
values. Subjects seem to update more slowly than the M-S ideal learner.
The prediction that performance deteriorates as we go down the list of
treatments is not contradicted by visual impressions of the learning
curves. But neither is it strongly supported; individual variability
within each treatment makes it difficult to see the treatment effects
clearly. The rest of this section seeks answers more systematically
using statistical tools.
Distribution of Scores
Figure 3 shows the distribution of the scores earned by subjects in
each treatment. Recall that score is a proxy for learning speed, which
is expected to decline as we move down the list of treatments from Paid
to High Noise. Recall also that the M-S ideal learner would earn the
same score in all treatments except High Noise, where the larger error
variance would lower the score.
The figure shows that forecasts often are quite good. In most
treatments the highest score is close to 38,000, only a bit below the
M-S ideal. The modal score and the median score usually are not very far
behind. Mean scores are usually lower because the lowest scores are much
lower, sometimes below 34,000. The mean scores indeed have the expected
ranking. Paid is highest, followed by Baseline, then No Score, No
History, and Asymmetric treatments. The main surprise is High Noise,
where the mean score is a bit higher than in Asymmetric. For comparison,
we calculated scores in the Baseline treatment for two sorts of zero
intelligence agents or nonlearners. An agent who always forecasted zero
(the optimal uninformed forecast) would score 34,326 and an agent who
always used last period's price as the forecast would earn 30,647.
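These benchmarks can be checked by direct simulation. The sketch below assumes the quadratic scoring rule of note 4 (with the forecast error scaled by 100, as implied by the stated maximum and minimum scores) and reuses the simulate_ojf sketch from Section II; exact totals depend on the realized 480-trial sequence, but they land near the reported values.

```python
import numpy as np

def score(p, c, A=80.0, B=280.0):
    """Quadratic scoring rule of note 4, forecast error scaled by 100."""
    return A - B * ((p - c) / 100.0) ** 2

x1, x2, p = simulate_ojf(seed=1)                 # a baseline-like sequence
zero = score(p, 0.0).sum()                       # always forecast zero
lagged = score(p[1:], p[:-1]).sum() + score(p[0], 0.0)  # last period's price
print(round(zero), round(lagged))  # roughly 34,200 and 30,000
```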
Closer examination of the raw data raises questions about the
motivation of the subjects with lowest scores. We found that these
subjects generally stopped responding to the weather and Brazil supply
information at some point during the session. Subjects who don't
care about performance but seek only to finish quickly can do so by just
clicking the OK icons in every trial, leaving the price forecast at the
default value c = 0. We identified such behavior in 9 of the 99
subjects, whose scores are flagged by asterisks in Figure 3. (7) We now
face a methodological issue. In general we do not recommend the ex post
exclusion of unresponsive subjects from analyses. However, unthinking
responses of c = 0 will bias coefficient estimates toward 0, so it is
potentially important for subsequent data analysis to identify such
behavior. Our solution is to report results both for the full sample and
for a reduced sample that excludes the unresponsive subjects, and to
carefully document our exclusion procedures in Appendix B. These
complications are the price we pay for maintaining comparability to the
psychological literature by using unpaid subjects in most treatments.
Fortunately, none of our conclusions are reversed when we move from the
full to the reduced sample; some results are sharpened.
We are now ready to report statistical tests of treatment effects.
Given the problems with outliers, it is appropriate to use a robust (but
possibly less powerful) nonparametric test. The standard Wilcoxon test fails to reject the null hypothesis of no difference in median scores
between Paid and Baseline (p value = 0.15). Again relative to Baseline,
the same test detects no significant impact of the No Score (p = 0.71)
and No History (p = 0.56) treatments. The corresponding null hypotheses
are rejected, however, confirming the research hypotheses that scores are
significantly lower in the Asymmetric (p value = 0.002) and High Noise
(p = 0.002) treatments. We conclude that humans indeed learn more slowly
in these environments. (8)
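The test itself is the standard two-sample Wilcoxon rank-sum test. A sketch follows; the score vectors here are made up purely for illustration (the real inputs would be one final score per subject in each treatment).

```python
import numpy as np
from scipy.stats import ranksums

# Hypothetical final scores, one per subject, for two treatments.
baseline_scores = np.array([37050, 36800, 36420, 35990, 35310, 34100])
asymmetric_scores = np.array([36100, 35500, 35050, 34480, 33900, 33050])

stat, p_value = ranksums(baseline_scores, asymmetric_scores)
print(f"rank-sum z = {stat:.2f}, two-sided p = {p_value:.3f}")
```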
Distribution of Coefficient Estimates
We now consider the key question of consistency: Do humans
eventually learn the objectively correct response to weather and supply
news? Recall that the coefficient estimates $\alpha_i$ from the
regression equation explaining individual forecasts $c_t$,

(2) $c_t = \alpha_1 x_{1,t} + \alpha_2 x_{2,t} + e_t$,

represent the individual's response, and the weights $a_i$ used in
the data-generating process in equation (1) represent the objectively
correct response. Learning is consistent to the extent that the final
estimates, computed over trials T - 159 to T at the last trial (T =
480), coincide with the true values $a_i$. Figure 4 shows by
treatment the distribution across subjects of both coefficient
estimates in the last trial.
Overall, the subjects seem to have it about right: The estimates
center near the objective value and most of the estimates are not far
away. Moreover, most of the outlying estimates are spurious
underresponses from the nine unresponsive subjects (denoted with
asterisks). The figure also suggests some treatment effects. There
may be a slight bias toward overresponse in the High Noise treatment and
toward underresponse to the more important stimulus (Supply) in the
Asymmetric treatment. The distributions appear to be tighter for the
Paid subjects, as predicted in the motivation hypothesis favored by
experimental economists. As predicted in the main treatment hypotheses,
the dispersion appears to increase slightly as we move to the more
challenging learning environments indicated in panels C-F, especially E
(Asymmetric) and F (High Noise).
These impressions are not reliable for two reasons. First, some
treatments have more subjects than others so it is difficult for the eye
to properly compare the distributions behind the histograms. Second,
estimated standard errors are about 0.02 for the High Noise treatment and
0.01 for the other treatments. The histogram bins in the figure have
width 0.10 or 5-10 standard errors, so the classification is a bit
coarse.
Table 1 rectifies these shortcomings. It classifies a final (T =
480) coefficient estimate as objectively correct if its central 95%
confidence interval contains the corresponding final value from the M-S
simulation. (9) The estimate is classified as over- (or under-) response
if the confidence interval lies entirely outside (or entirely within)
the interval from zero to the (M-S) objective value. Overall, a
plurality of estimates (71 of them) are classified as objectively
correct, and there are roughly equal numbers of overresponses (59) and
underresponses (48 plus 18 questionables).
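The classification rule can be implemented in a few lines; the following sketch (names and thresholds ours) mirrors the definitions just given, with confidence intervals that straddle zero falling into the questionable category.

```python
def classify(est, se, ms_value, z=1.96):
    """Classify a final coefficient estimate against the M-S value."""
    sign = 1.0 if ms_value >= 0 else -1.0
    # Work with the absolute response; flip the CI for negative targets.
    lo, hi = sorted((sign * (est - z * se), sign * (est + z * se)))
    target = abs(ms_value)
    if lo <= target <= hi:
        return "correct"        # 95% CI contains the M-S value
    if lo > target:
        return "over"           # CI entirely beyond the M-S value
    if 0 < lo and hi < target:
        return "under"          # CI entirely inside (0, M-S value)
    return "questionable"       # CI straddles zero or has the wrong sign

print(classify(0.45, 0.01, 0.417))    # -> 'over'
print(classify(-0.40, 0.02, -0.417))  # -> 'correct'
```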
The main imbalances arise in the last two treatments. Underresponse
to the more important variable (as in Figure 4) and overresponse to the
other variable are quite prevalent in the Asymmetric treatment. In the
High Noise treatment, a majority of the nonquestionable estimates for
both coefficients are classified as overresponse and none is classified
as underresponse.
Formal statistical tests of the main treatment hypotheses are
reported in the last column of the table. The entries are Wilcoxon p
values for each of the two coefficients in each treatment for the full
sample (and in parentheses, for the reduced sample that excludes the
nine unresponsive subjects). In three cases the tests reject (at the
conventional p = 0.05 level in the reduced sample) the null hypothesis
that the estimates center at the objective value $a_i$, in favor of
the following one-sided alternatives. There is significant underresponse
to the Supply variable in the Asymmetric treatment (p = 0.00), and
significant overresponse to both variables in the High Noise treatment
(p = 0.02, 0.00). There is also marginally significant overresponse to
the Supply variable in the Baseline treatment (p = 0.08). The other
cases of apparent under- and overresponse do not produce significant
results in this conservative test.
Table 1 also reports behavior observed halfway through the session,
at T = 240. Recall from Figure 2 the impression that moderate but
shrinking overresponse is quite typical at this point. The table shows
that overresponse at the halfway point indeed is somewhat more prevalent
than at the end of the session, especially in the Paid and High Noise
treatments.
Summary and Interpretation
Several general conclusions emerge from the data analysis. First
and foremost, our subjects learn rather quickly to produce surprisingly
consistent forecasts. We see little evidence in our forecasting task of
anchoring and adjustment (systematic underresponse) or of the
representativeness heuristic or base rate neglect (systematic
overresponse). Typical subjects in early trials often overrespond or
underrespond somewhat to the two news sources ($x_1$ = weather and
$x_2$ = supply), but by the end of the experiment they have it
about right.
Of course, human subjects do not learn as fast as an ideal Bayesian
(or M-S econometrician). At the halfway point (T = 240) of the
experiment, the coefficient estimates $\alpha_1$ and $\alpha_2$ indicate a
slight tendency toward overresponse. Table 1 and sample learning curves
in Figure 2 show that this tendency almost disappears by the end (T =
480) of the experiment. The magnitude of the lag is indicated by the
subjects' scores, which average 3%-8% lower than the ideal.
Some clues as to how human forecast performance varies with the
learning environment can be gleaned from the impact of the laboratory
treatments. First, increasing subject motivation by offering higher cash
payments for more accurate forecasts has a surprisingly modest impact in
our experiment. Unlike most other treatments, there are no subjects with
questionable motivation in the Paid treatment. Also, the distribution of
coefficient estimates seems tightest in the Paid treatment, consistent
with significant findings in other experiments (Smith and Walker, 1993).
However, the difference from Baseline turns out not to be significant in
our data according to standard parametric and non-parametric tests. Of
course, our sample size is not large because the effect of salient
payments was a secondary concern in our experimental design.
Surprisingly, neither the No Score treatment nor the No History
treatment significantly impaired the subjects' scores or accuracy
of the estimated coefficients. It seems that our subjects were able to
keep track of some sort of summary statistic that made the score and the
history summary almost redundant. The Asymmetric treatment, however,
significantly lowered scores and pushed subjects significantly toward
underresponse to the more important information and (insignificantly)
toward overresponse to the less important information. Asymmetry is
typical in field environments, so the finding is potentially important.
It is the reverse of the overshadowing effect documented by
psychologists in other tasks; see Busemeyer (1993) for another example
of reverse overshadowing. The High Noise treatment had the strongest
impact: Significantly lower scores and significant overresponse to both
information variables. The implication seems to be that learning is
slower and more biased when markets are volatile. One might conjecture
that such inefficient learning contributes to market volatility and
partially explains the clustered volatility documented in many financial
markets (see Kelley, 1999).
IV. DISCUSSION AND FUTURE WORK
As we noted in the introduction, existing literature from cognitive
psychology indicates that humans typically make very irrational choices
in simple laboratory tasks. In sharp contrast, our human subjects (with
some modest exceptions) rather quickly learn highly rational behavior in
a nontrivial forecasting task. What accounts for the divergent results?
In some ways our experiment makes it difficult for subjects to be
rational. The task is challenging in that the target variable, price, is
stochastic and contingent on two independent variables. Another
challenging aspect of our experiment is that we used psychology pool
subjects, unpaid in most treatments. Irrational behavior exhibited by
such subjects in some tasks disappears when subjects drawn from other
pools are offered salient payments (Friedman and Sunder, 1994). With the
exception of 9 of 99 subjects whose motivation was questionable, our
nonpaid subjects behaved quite rationally and appear to be just as
motivated as the paid participants.
But in other ways our experiment gives rationality its best shot.
The basic task allows subjects to learn over a relatively long sequence
of 480 trials in a stationary environment. Our laboratory setup
encourages subjects to draw on relevant intuitions about price
determination and avoids features that might suggest inappropriate
heuristics. The visual interface encourages rapid and unbiased
processing of information and feedback. If anything, the interface
biases subjects toward underresponse, because the default response is 0
and the subject must move the slide up or down from that point. The
reduced sample used in some of the data analysis screened out the most
egregious cases of default response, but perhaps some slight bias
remains. (10) Arguably our setup is more representative of economically
important field environments than some of the setups used in
laboratory studies that find irrational behavior.
The rational behavior is fairly robust. Performance was not
significantly impaired in the No Score and No History treatments, which
eliminated useful feedback. Even in the Asymmetric and High Noise
treatments, performance was still quite good. Kelley (1998) reports
several additional robustness checks that reinforce this conclusion.
(11)
An important extension of the work presented here, especially from
the macroeconomics point of view, is to introduce self-referential price
determination. Marcet and Sargent (1989a, 1989b, 1989c) study several
linear stochastic models where traders' forecasts affect the actual
price observed each period. They derive conditions on traders'
learning processes (rolling regressions in essence) that ensure
convergence of actual price to rational expectations equilibrium. It
seems feasible to implement such economies in the laboratory and (given
some stronger assumptions than needed in the present article) to extract
estimates of subjects' learning processes. We conjecture that the
empirical models introduced herein will continue to do well in a more
complex self-referential setting. Equally important, the findings here
should be examined in field settings. The observable implications of the
modest departures from rationality we observed in the Asymmetric and
High Noise treatments must be derived and tested on suitable finance
and macroeconomic data. Kelley (1998) begins this task.
We see two main lessons in the present results. First, people can
learn to make quite good forecasts. Second, some slight but systematic
biases remain. In particular, even after 480 trials, subjects still
tended to overrespond to news in the High Noise environment. Slight
individual biases might interact to produce economically important
market biases (Akerlof and Yellen, 1985; DeLong et al., 1990; Kelley,
1998). More theoretical and empirical work is needed to understand fully
the effects in major financial markets, and more experimental work is
needed to understand learning in self-referential, non-stationary
environments.
APPENDIX A: INSTRUCTIONS TO SUBJECTS
ORANGE JUICE FUTURES EXPERIMENT
(Revised 5/98)
GENERAL INFORMATION
In this experiment you will be asked to use information to make
predictions. You will look at information on competing supply levels and
on weather hazard and will predict orange juice futures prices. Orange
juice price determination in this experiment is fictitious but basically
similar to real life. Your job is similar to that of an investor who
must use imperfect information to predict futures prices.
In this experiment, new information arrives each period (or harvest
season) on (1) the weather hazard for the local orange crop and (2) the
supply of oranges in the main competing region, Brazil (see label A in
Figure 1, top). Each piece of information can take on a value from 0 to
100. A value of 0 for weather hazard means that there will be no loss of
local production due to inclement weather, and a value of 100 means
likely massive damage to the local crop. Similarly, a value of 0 for
supply means a very small Brazilian production and a value of 100 means
the largest possible Brazilian crop.
Each period after viewing the information on weather and supply,
you will enter your price prediction. Prices are measured within the
range -100 (all the way DOWN, or 100 cents below the normal level) to
+100 (all the way UP or 100 cents above the normal level). For example,
sliding the box (see Figure 1, top) to the topmost UP position indicates
that you believe that the current supply and weather conditions will
result in a price 100 cents above the normal price. Likewise sliding the
best guess box to the bottommost DOWN position indicates that you
believe the current crop conditions imply a price 100 cents below the
normal price. Moving the box halfway up (halfway down) between the
middle and top (bottom) predicts a price 50 cents above (below)
the normal level. Leaving the best guess box at its original position
predicts exactly the normal price level.
READING CHARTS
Each period (or harvest season), you should first look at the
information chart. You may be able to get useful additional information
by clicking on the Previous Cases box. If it is present (see Figure 1,
top) it will be under the chart symbols. When you click that box, a
window will appear in the lower left corner of the screen (see Figure 1,
middle). The first column of the window lists the current information on
competing supply and/or weather hazard. The second column lists the
number of times so far in the experiment you have seen similar supply
and weather conditions, i.e., within plus or minus 10. For example, in
Figure 1, middle, in all previous periods a weather hazard between 0 and
17 has occurred 1 time, and a supply between 59 and 79 has occurred 3
times. The third column gives the average price in these similar
conditions. For example (see Figure 1, middle), the current
harvest's low weather hazard (7) was associated with a price 35
cents below normal, and the somewhat high competing supply (69) was
associated with a price 18 cents below normal. Click OK to leave the
Previous Cases window.
After you have considered the relevant information, you enter your
forecast by clicking the slide box and moving it to your chosen location
on the ruler. After you have made your prediction the UP or DOWN box
will be darkened if you predict a price different from the normal level,
otherwise they will both remain light. Click on OK to submit your
forecast. You will then be told the actual price that period. A blue bar
will appear on the ruler to indicate the actual price (see Figure 1,
bottom). You may then be given a numerical score for your prediction
this harvest and a cumulative score for all harvests to date (see Figure
1, bottom). You will then get the information chart for the next period.
Your goal is to predict as accurately as possible each period.
There will be many periods for you to predict. Work at your own pace.
The experiment should take less than 2 hours. We ask that you do not
take notes.
SCORING
Your score is the profit an investor makes when acting on your
price prediction. Each harvest you earn points based on your prediction
(between -100 and 100) and the actual price that harvest. Profit is
higher the more accurate your forecast (see Figure 1, bottom). For
example, if the actual price turns out to be 70 cents above normal, then
your score is highest if your prediction was +70, a bit lower if your
prediction was +60 or +80, and much lower if you predicted 0 or below.
USEFUL FACTS ABOUT PRICES IN THIS EXPERIMENT
You should not expect your forecasts to be exactly correct each
period. The same supply and weather conditions can sometimes lead to a
price increase and sometimes to a price decrease relative to the normal
level. But if you properly use the average effects of weather and
competing supply, your forecasts will usually be fairly accurate.
Each harvest period researchers collect available information about
market conditions affecting orange juice. The information is distilled
into the charts you see. The charts always record the available
information correctly. The two pieces of information are independent in
the sense that, for example, a high local weather hazard does not
indicate a high or low Brazilian supply.
Each piece of information tends to be associated with higher or
lower prices, but there is never certainty. An expert who completely
understands the effects of competing supply and weather hazards
typically earns much higher profits than a novice, but even the expert
can't predict perfectly each period.
Feel free to ask the experimenter about anything in these
instructions or in the experiment that is unclear to you.
APPENDIX B: IDENTITY OF UNRESPONSIVE SUBJECTS
The reduced sample omits 9 of the 99 subjects. The omitted nine usually
earned the lowest scores in their particular treatment group. The
criterion for omission was whether the subject actually responded to the
stimuli or instead entered the default response of 0
(corresponding to a normal price forecast, or no expected price change)
for many consecutive trials. Here are the specifics.
Subject #   Score      Treatment    Characteristics
10          32638.82   Baseline     Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Second lowest score.
20          33753.02   High Noise   Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Lowest score in group.
29          33071.24   Asymmetric   Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Second lowest score.
30          36294.49   High Noise   Completely stopped responding early in experiment. Sixth lowest score in group.
34          31426.81   Asymmetric   Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Lowest score.
40          35851.27   High Noise   Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Third lowest score.
61          35558.64   High Noise   Completely stopped responding; overresponse that moves to underresponse. Second lowest score.
74          32271.70   Baseline     Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Lowest score.
89          35965.60   High Noise   Virtually all responses are default ($c_t$ = 0) for 50 to 200 consecutive trials. Fourth lowest score.
[FIGURE 1 OMITTED]
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
[FIGURE 4 OMITTED]
TABLE 1
Frequency of Correct Response

                                  T = 240            T = 480
Condition    |a_1|  |a_2|  #Ss  Over  On  Under   Over  On  Under     Wilcoxon p: a_1, a_2
Baseline      .40    .40   17    16    8    10     11   14   9(-4)    0.49, 0.96 (0.98, 0.08)
Paid          .40    .40   22    27    6    11     15   13   16       0.67, 0.89
No Score      .40    .40   13    10    7     9      7   13   6        0.89, 0.74
No History    .40    .40   12     8    7     9      6    9   9        0.47, 0.47
Asymmetric    .24    .57   15     2   10    18      3    9   18(-4)   0.08, 0.00 (0.22, 0.00)
High Noise    .33    .33   20    28    6     6     17   13   10(-10)  0.81, 0.39 (0.02, 0.00)
Overall                    99    91   44    63     59   71   68(-20)

Notes: Each subject contributes two coefficient estimates. Parenthesized entries refer to the reduced sample that excludes unresponsive subjects.
(1.) We hasten to add that several important laboratory
investigations have been inspired by other strands of macroeconomic
theory. For example, Van Huyck et al. (1997) and related work study
equilibrium convergence in coordination games; Marimon and Sunder (1993)
and related work study sunspot equilibria in overlapping generations economies. We will discuss three laboratory studies of rational
expectations equilibrium.
(2.) Our OJF study differs in two other respects from the MD
studies by other investigators. Our rolling regressions are fit trial by
trial, utilizing only information the subject has actually seen.
Typically the MD studies first train their models and then provide an
in-sample fit. Also, the MD studies fit their models to aggregate
behavior averaged across groups of subjects. The OJF model fits are to
individual subjects, to investigate heterogeneity that might affect
macroeconomic aggregates.
(3.) The history box numerically displays the current realization
of both variables, the number of previous trials for each variable whose
realization is within ten of the current realization, and the average
realized price in those previous trials. The box remains on the screen
until the subject clicks on the OK icon.
(4.) The score box displays the subject's score on the current
trial and the cumulative score through the current trial. Each trial the
score is calculated from the continuous price forecast c and the
realized price p using the quadratic scoring rule $S(p, c) = A -
B[(p - c)/100]^2$, with A = 80 and B = 280. Thus the maximum score (for a
perfect forecast) is A = 80 points and the minimum is A - B = -200 points.
See Friedman and Massaro (1998) for a recent discussion of this scoring
rule. The box also displays the "expert" score of a forecaster
with nothing left to learn, that is, the score earned by forecasting
$a_1 x_{1,t} + a_2 x_{2,t}$ in trial t, using objective
values of $a_i$. Subjects, of course, do not observe the expert
forecast, just the expert score.
(5.) A referee asked an interesting question: What is the relation
between score and market discipline? There is indeed a close relation.
As can be seen from the quadratic scoring rule definition, the expected
score declines linearly in the forecast error variance. But it is well
known that the position size (short or long), and hence expected profit,
also declines linearly in forecast error variance for an investor with
constant absolute risk aversion.
(6.) Normal errors are unbounded, making them impractical in our
laboratory task. Truncated normal errors are practical but are less
convenient for our purposes and fail to eliminate the potential
econometric problem.
(7.) None of these subjects appeared in the Paid treatment, but
five of the nine appeared in High Noise. Perhaps subjects are more
likely to become frustrated in this difficult learning environment.
(8.) Interpreting the High Noise test result is complicated by the
fact noted earlier that ideal learners also learn more slowly in the
High Noise environment. An eyeball examination of Figure 3 suggests that
most High Noise scores lie a bit farther away from the M-S ideal as
compared with subjects in the alternative treatments, but this
difference is not significant in either sample.
(9.) Note that this redefinition uses the available sample
information rather than the unavailable population information to
define the objective value. The redefinition is a bit
conservative in that the original (population) definition differs by
about 0.013 and would tend to shift the classifications very slightly
toward overresponse.
(10.) The most questionable remaining subject is 009 in the No
History treatment. He made very erratic choices until late in the
session, spent no more time making choices than the screened subjects
(about half as long as most remaining subjects), and earned almost as
low a score as screened subjects. He was not screened out of the reduced
sample because he entered mainly non-default responses, but his
motivation is also questionable and his coefficient estimates indicate
dramatic underresponse. Indeed, the relevant test would indicate
marginally significant overresponse (to the second variable in the No
History treatment, p = 0.08) if this subject were screened out of the
sample.
(11.) Specifications designed to capture prior beliefs and
non-linear responses detected some transient effects in many subjects,
but for the most part these effects disappeared by the final trial.
Tests allowing a nonzero intercept term $a_0$ reached different conclusions
only for the Asymmetric treatment, where the marginally significant
overresponse to the less important news disappeared. Responses remained
fairly rational even in a treatment featuring a structural break.
REFERENCES
Akerlof, G., and J. Yellen. "Can Small Deviations from
Rationality Make Significant Differences to Economic Equilibria?"
American Economic Review, 75(4), 1985.
Binmore, K. "Modeling Rational Players, Part I."
Economics and Philosophy, 3, 1987, 179-214.
Brehmer, B. "In One Word: Not from Experience."
Organizational Behavior and Human Performance, 1, 1980, 110-28.
Brown, G. "Iterated Solution of Games by Fictitious
Play." Activity Analysis of Production and Allocation, New York:
Wiley, 1951.
Busemeyer, J. R., I. J. Myung, and M. A. McDaniel. "Cue
Competition Effects: Theoretical Implications for Adaptive Network
Learning Models." Psychological Science, 4, 1993, 196-202.
Camerer, Colin. "Experiments on Game Theory." Draft
manuscript, Caltech Division of Humanities and Social Sciences, 1998.
Cournot, A. Recherches sur les Principes Mathématiques de la
Théorie des Richesses. 1838. English edition: Researches into the
Mathematical Principles of the Theory of Wealth, translated by N. Bacon.
New York: Macmillan, 1897.
DeLong, J. B., A. Shleifer, L. Summers, and R. J. Waldmann.
"Noise Trader Risk in Financial Markets." Journal of Political
Economy, 98(4), 1990, 703-38.
Dwyer, G., Jr., A. Williams, R. Battalio, and T. Mason. "Tests
of Rational Expectations in a Stark Setting." Economic Journal,
103, 1993, 586-601.
Evans, George W., and Seppo Honkapohja. "Learning
Dynamics," in Handbook of Macroeconomics, edited by J. Taylor and
M. Woodford. New York: Elsevier, 1997.
Friedman, D., and D. Massaro. "Understanding Variability in
Binary and Continuous Choice." Psychonomic Bulletin and Review,
5(3), 1998, 370-89.
Friedman, D., and S. Sunder. "Experimental Methods: A Primer
for Economists." Cambridge: Cambridge University Press, 1994.
Fudenberg, D., and D. Kreps. "A Theory of Learning,
Experimentation, and Equilibrium in Games." Mimeo, MIT, 1988.
Garner, A. "Experimental Evidence on the Rationality of
Intuitive Forecasters." Research in Experimental Economics, 2,
1982, 113-28.
Gluck, M. A., and G. H. Bower. "From Conditioning to Category
Learning: An Adaptive Network Model." Journal of Experimental
Psychology: General, 117, 1988, 225-44.
Greene, W. H. Econometric Analysis. New York: Macmillan, 1993.
Hull, C. Principles of Behavior. New York: Appleton-Century-Crofts,
1943.
Kahneman, D., P. Slovic, and A. Tversky (eds.). Judgment under
Uncertainty: Heuristics and Biases. Cambridge: Cambridge University
Press, 1982.
Kelley, H. "Learning to Forecast Price in the Laboratory and
in the Field." Ph.D. thesis, chapter 1, Economics Department,
University of California Santa Cruz, 1998.
-----. "Behavioral Country Fund Discounts: Experimental and
Field Evidence of Bounded Rationality." Ph.D. thesis, chapter 2, 3,
Economics Department, University of California Santa Cruz, 1999.
Kelley, H., and D. Friedman. "Learning to Forecast
Rationally," in Handbook of Experimental Economics Results, edited
by C. Plott and V. Smith, forthcoming.
Kitzis, S., H. Kelley, E. Berg, D. Massaro, and D. Friedman.
"Broadening the Tests of Learning Models." Journal of
Mathematical Psychology, 42(2-3), 1998, 327-55.
Marcet, A., and T. Sargent. "Convergence of Least Squares
Learning Mechanisms in Self-Referential Linear Stochastic Models."
Journal of Economic Theory, 48, 1989a, 337-68.
-----. "Convergence of Least Squares Learning in Environments
with Hidden State Variables and Private Information." Journal of
Political Economy, 97, 1989b, 1306-22.
-----. "Least Squares and the Dynamics of
Hyperinflation," in Chaos, Complexity, and Sunspots, edited by W.
Barnett, J. Geweke, and K. Shell. Cambridge: Cambridge University Press,
1989c.
Marimon, Ramon, and Shyam Sunder. "Indeterminacy of Equilibria
in a Hyperinflationary World: Experimental Evidence." Econometrica,
61, 1993, 1073-107.
Rabin, M. "Psychology and Economics." Journal of Economic
Literature, 36, 1998, 11-46.
Roll, R. "Orange Juice and Weather." American Economic
Review, 74, 1984, 861-80.
Sargent, T. Bounded Rationality in Macroeconomics. Oxford:
Clarendon Press, 1994.
Smith, V., and J. Walker. "Monetary Rewards and Decision Cost
in Experimental Economics." Economic Inquiry, 31, 1993, 245-61.
Thorndike, E. L. "Animal Intelligence: An Experimental Study of
the Associative Processes in Animals." Psychological Monographs, 2,
1898.
Van Huyck, J., R. Battalio, and F. Rankin. "On the Origin of
Convention: Evidence From Coordination Games." Economic Journal,
107(442), 1997, 56-97.
Williams, A. W. "The Formation of Price Forecasts in
Experimental Markets." Journal of Money, Credit and Banking, 19,
1987, 1-18.
RELATED ARTICLE: ABBREVIATIONS
MD: Medical Diagnosis
ML: Maximum Likelihood
M-S: Marcet-Sargent
OJF: Orange Juice Futures
OLS: Ordinary Least Squares
HUGH KELLEY and DANIEL FRIEDMAN *
* This work is supported by NSF grants SBR 9310347 and SBR 9617917.
It benefited from the comments of Jules Leichter, Dominic Massaro,
Rachel Croson, Vai-Lam Mui, and especially Arlington Williams, as well
as participants at the Economics Science Association and Public Choice
Society meetings at Tucson and New Orleans. The exposition benefited
considerably from the thoughtful suggestions of two anonymous referees
and editor William Nielson.
Kelley: Assistant Professor, Department of Economics, Indiana
University Bloomington, Bloomington, IN 47405. Phone 1-812-855-7928, Fax
1-812-855-3736, E-mail hukelley@indiana.edu
Friedman: Professor, Department of Economics, University of
California, Santa Cruz, CA 95060. Phone 1-831-459-4981, Fax
1-831-459-5077, E-mail dan@cats.ucsc.edu