Expectation Formation in Finance and Macroeconomics:
A Review of New Experimental Evidence1
Te Bao, Cars Hommes, and Jiaoying Pei
Abstract: This paper reviews the recent development and new findings of the literature on learning to
forecast experiments (LtFEs). In general, the stylized finding in the typical LtFEs, namely the rapid
convergence to the rational expectations equilibrium in negative feedback markets and persistent
bubbles and crashes in positive feedback markets, is a robust result against several deviations from the
baseline design (e.g., number of subjects in each market, price prediction versus quantity decision,
short term versus long term predictions, predicting price or returns). Recent studies also find a high
level of consistency between findings from forecasting data from the laboratory and the field, and
forecasting accuracy crucially depends on the complexity of the task.
Keywords: Learning to Forecast Experiment; Experimental Finance; Rational Expectations; Bubbles
and Crashes; Behavioral Finance
JEL Classification: C10, D17, D84, E52, G12, G17, G40
1
Te Bao, School of Social Sciences and NTU-WeBank JRC, Nanyang Technological University Singapore,
Email: baote@ntu.edu.sg; Cars Hommes, CeNDEF, School of Economics, University of Amsterdam and Bank
of Canada, Email: c.h.hommes@uva.nl; Jiaoying Pei, School of Social Sciences, Nanyang Technological
University Singapore. Email: peij0003@e.ntu.edu.sg. Te Bao thanks the financial support from the AcRF Tier 1
Grant RG69/19 from Ministry of Education of Singapore, NTU-WeBank Joint Research Center (NWJ-2019-001,
NWJ-2020-003) and the National Science Foundation of China (No. 71803201, 71773013, and 71873149).
1. Introduction
Expectation formation plays a central role in modern finance and macroeconomic modelling. Since
the seminal works by Muth (1961) and Lucas (1972), the rational expectations hypothesis (REH) has
become the standard approach to model expectation formation. However, due to the lack of high-quality
observational data on agents’ expectation formation and the difficulty of “testing joint
hypotheses,” it is usually difficult to set up a clean test of the REH using empirical data from the
field.
In recent years, Learning-to-Forecast Experiments (LtFEs), an experimental design that dates back to
Marimon and Sunder (1993, 1994) and Marimon, Spear, and Sunder (1993), have been widely used by
experimental economists to study expectation formation in financial markets and macroeconomies.
The key feature of an LtFE is that the subjects of the experiment play the role of professional
forecasters (Hommes, 2011, 2013b, 2020). Their only task is to submit their expectation of an
economic variable, e.g., the market price, the inflation rate, or the output gap. After collecting
individual expectations, the conditionally optimal quantity decisions (e.g., trading, investing, and saving)
are calculated by a computer algorithm and then determine the realization of the variables on which
the subjects made their forecasts. The learning to forecast approach is usually contrasted with the
alternative learning-to-optimize experiment design (LtOEs, Duffy, 2010, 2014, Arifovic and Duffy,
2018), where the subjects simply make their choice decisions. Depending on the context, the choice
decision may refer to various quantity or trading decisions, e.g., the consumption or saving decision in
an intertemporal choice problem for a household in, e.g., Lei and Noussair (2002), the supply quantity
decision for a firm in, e.g., Bao et al. (2013), and a bid or ask made by a trader in a double auction
market in, e.g., Smith et al. (1988). Because the LtFE design elicits and incentivizes the individual
expectations directly, the subjects should have stronger incentives to form rational expectations.
The market can display positive feedback or negative feedback. Asset markets are considered
positive feedback systems, where the realized market price increases when individual price forecasts
increase. A classical cobweb framework describing a supply-driven commodity market with a
production lag, on the other hand, exhibits negative feedback: a higher expected price leads
to increased production and, thus, a lower realized market price (Hommes, 2013, 2020). A general
conclusion from the LtFE literature is that agents can learn the rational expectations equilibrium
when the market is a negative feedback system (e.g., Hommes et al., 2000). Yet, agents fail to learn the
rational expectations equilibrium when the market is a positive feedback system (e.g., Hommes et al.,
2005, 2008). There have been several comprehensive surveys on this literature (e.g., Hommes,
2011, 2013a, 2013b, 2014, 2020, Assenza et al., 2014). In this paper, we focus on the relatively new
development in this literature, i.e., studies published in the 2010s, to summarize the recent trend and
discuss the possible future directions of the research in this field. The new designs and research
questions of papers surveyed in this paper mainly fall into the following categories:
1) Experiments that compare the learning to forecast and the learning to optimize design (e.g.,
Bao et al., 2013, 2017, Mirdamadi and Petersen, 2018, Giamattei et al., 2020).
2) Traditionally, like other market experiments, the market size of an LtFE is 6-10 participants. In
recent years, researchers have started to run large-scale learning to forecast experiments (e.g., Bao,
Hennequin, Hommes and Massaro, 2019, Hommes, Kopányi-Peuker and Sonnemans, 2020)
to test if the results from relatively small-scale experiments are robust in larger experimental
markets.
3) A typical LtFE usually lasts for 50 periods, and the predictions are made one period or two
periods ahead. In recent years, researchers have started to run LtFEs with longer horizons and
investigate the role of long-run predictions (Colasante et al., 2018; Evans et al., 2019).
4) Traditionally, LtFEs on asset markets elicit beliefs on asset prices. Recently, some studies
compare the cases where agents form expectations on prices versus returns (Glaser et al.,
2019, Hanaki et al., 2020).
5) Traditionally, LtFEs mainly study questions related to asset pricing. Recent LtFEs pay more
attention to monetary economics and the role of monetary policy in asset markets (e.g.,
Arifovic and Petersen, 2017, Arifovic et al., 2019, Assenza et al., 2019, Bao and Zong, 2019,
Hommes et al., 2019a, 2019b, Mauersberger, 2019, Ahrens et al., 2020).
6) Papers trying to combine laboratory and computational experiments (e.g., Hommes et al.,
2017, Anufriev and Hommes, 2012, Bao et al., 2012, Anufriev et al., 2016, 2018, 2019).
7) Studies that try to compare data on expectation formation from the lab and from the field
(e.g., Landier et al., 2019, Cornand and Hubert, 2020).
8) Studies on how the complexity of the decision influences the forecasting behavior (e.g.,
Mirdamadi and Petersen, 2018, Anufriev et al., 2019, Arifovic et al., 2019, Bao and Duffy,
2019, He and Kucinskas, 2019).
In the rest of the paper, we will first go through the basic setup of a learning to forecast experiment in
Section 2. After that, we will list the main results that answer the above questions according to the
recent literature in Section 3. Finally, we draw a short conclusion and discussion based on the
development of the literature in Section 4.
2. Basic Setup of a Learning to Forecast Experiment
2.1 Experimental Design
A baseline learning to forecast experiment is usually a market experiment with 6-10 subjects in each
market. This type of experiment usually employs a between-subject design. One market serves as one
independent observation. The subjects make their forecast on one or two economic variables, e.g.,
price of a product/financial asset, inflation rate, GDP output gap, etc. To provide the right incentive to
do their best in making an accurate forecast, their payoff is a decreasing function of their prediction
error. Some studies make the subjects’ payoff a quadratic loss function of their prediction error, while
others put prediction error in the denominator of the payoff function.
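The two payoff families described above can be sketched as follows. This is a minimal illustration only: the maximum of 100 points, the penalty weight, the floor at zero, and the `1 + |error|` denominator are hypothetical choices and are not taken from any particular study.

```python
def quadratic_loss_payoff(forecast, realized, max_points=100.0, penalty=1.0):
    """Payoff that falls with the squared prediction error (floored at zero).

    max_points and penalty are hypothetical illustration values.
    """
    error = forecast - realized
    return max(0.0, max_points - penalty * error ** 2)


def inverse_error_payoff(forecast, realized, max_points=100.0):
    """Payoff with the absolute prediction error in the denominator.

    The '1 +' term is an assumption that keeps the payoff finite at zero error.
    """
    error = abs(forecast - realized)
    return max_points / (1.0 + error)
```

Both functions are decreasing in the absolute prediction error, which is the property that gives subjects an incentive to forecast accurately. The quadratic form penalizes large errors heavily and reaches zero, whereas the inverse form stays positive for any error.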
In a learning to forecast experiment, the subjects usually play the role of a professional
consultant/forecaster/analyst of a firm. Their expectations are fed into the decision problem of the firm in
determining their output/trading/investment decisions. Other things equal, a more accurate prediction
is associated with a higher profit of the firm, and better compensation to the subject.
A learning to forecast experiment is usually a multi-period experiment. The subjects need to predict
the economic variable for 40-65 consecutive periods. In each period, their information set usually
includes the history of their own past predictions and the realization of the economic variable. They
usually do not know the data generating process (DGP) of the economic variable, as most market
participants do not know the DGP of GDP, stock prices or the inflation rate in real life. The
experiment usually uses the simultaneous decision setting, which means that subjects do not observe
others’ expectations in the same period, or even after the realization of the
economic variable is revealed. In a way, a learning to forecast experiment differs from market
experiments with strategic substitutes and complements (Fehr and Tyran, 2005, 2008) in that it is not
a game between the subject and other players as his/her opponents, but a game between the subject
and “the market.” Thus, a subject in a learning to forecast experiment is usually considered a price
taker, who gives little consideration to his/her market power in decision making.
A key research question of the learning to forecast experiment literature is: when people do not
start from the rational expectations equilibrium and do not have the knowledge about the specification
of the DGP of the economy, can they learn to play rational expectations over time? Stated differently,
can learning lead the market to converge to its rational expectations equilibrium (REE)? According to
the rational expectations hypothesis (REH), this should be the case. Instead of assuming that every
agent has full information about the economy, the theoretical prediction by REH is that people should
be able to learn the REE as long as they have the incentive to search for information and try to form
an accurate forecast. Under the REH, their prediction errors should also be uncorrelated across individuals.
2.2 Price Dynamics and Individual Expectations
Figure 1: Price dynamics in negative (left panel) and positive feedback (right panel) markets in the
learning to forecast experiment by Bao, Hommes, Sonnemans, and Tuinstra (2012).
Figure 1 shows the aggregate price dynamics in a typical learning to forecast experiment. While the
markets with negative feedback usually converge to the REE (dashed line) within five periods after
the experiment starts, or after the experimental economy experiences a large exogenous shock,
markets with positive feedback usually fail to converge to the REE and exhibit prolonged oscillations
and deviations from the underlying REE/fundamentals.
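As a rough illustration of why the sign of the feedback matters, the sketch below iterates a linear price rule, price = REE + slope × (forecast − REE), under a simple adaptive forecasting rule. The linear form, the slopes of ±0.95, the REE of 60, the starting forecast of 30, and the adaptive weight of 0.5 are all hypothetical choices for illustration and are not the parameters of the experiment in Figure 1.

```python
def simulate(slope, ree=60.0, initial_forecast=30.0, weight=0.5, periods=50):
    """Iterate price_t = ree + slope * (forecast_t - ree) with adaptive forecasts.

    All parameter values are hypothetical illustration values.
    """
    forecast = initial_forecast
    prices = []
    for _ in range(periods):
        price = ree + slope * (forecast - ree)
        prices.append(price)
        # Adaptive expectations: move the forecast part of the way
        # toward the most recently realized price.
        forecast = forecast + weight * (price - forecast)
    return prices


negative_feedback = simulate(slope=-0.95)
positive_feedback = simulate(slope=0.95)
```

Under these assumptions, the forecast's distance from the REE shrinks each period by the factor 1 − weight + weight × slope, which is 0.025 under negative feedback and 0.975 under positive feedback. The sketch therefore captures only the difference in convergence speed. It does not reproduce the oscillations seen in positive feedback markets, which are associated with trend-following forecasts that this simple rule omits.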
Figure 2: Simulated fractions of individuals using different forecasting strategies in a typical negative
feedback market (left panel) and a positive feedback market (right panel) in the learning to forecast
experiment by Bao, Hommes, Sonnemans, and Tuinstra (2012).
To better understand the individual expectation formation in the experimental markets of learning to
forecast experiments, researchers use different methodologies to categorize the forecasting behavior
by individual subjects in these markets. One important behavioral model used in this literature is the
heuristic switching model (HSM) by Anufriev and Hommes (2012). The basic setup of an HSM is
that in each period, the subjects choose from a menu of forecasting heuristics. They can observe the
history of the forecasting accuracy of each heuristic, and the key assumption of the model is that the
heuristics that perform better in the recent past are assigned with higher evolutionary fitness, and
hence attract more followers in the next period. There are typically four forecasting strategies in an
HSM: an adaptive expectations rule, a weak trend following rule (or a contrarian rule), a strong trend
extrapolation rule, and an anchoring and adjustment rule (Tversky and Kahneman, 1974). As shown
by Figure 2, individuals usually follow adaptive or contrarian expectations in negative feedback
markets, and the strong trend-following rule or the anchoring and adjustment rule in positive feedback
markets. In negative feedback markets, they are hence able to converge to the REE using adaptive
expectations, especially when the markets are E-stable (Evans and Honkapohja, 1999, 2003, 2009). Yet,
they are usually unable to learn the RE equilibrium in markets with positive feedback, because riding
on a common trend leads to violation of “uncorrelated prediction errors” across individuals.
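The switching mechanism described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the rule coefficients (0.65, 0.4, 1.3), the fitness memory, the intensity of choice, and the inertia parameter are hypothetical values. The logit form used to turn fitness into shares is likewise an assumption of this sketch.

```python
import math


def heuristic_forecasts(prices, last_adaptive_forecast):
    """Forecasts from four illustrative rules; needs at least two past prices."""
    p1, p2 = prices[-1], prices[-2]
    # Anchor: average of the sample mean of past prices and the last price.
    anchor = 0.5 * (sum(prices) / len(prices) + p1)
    return {
        "adaptive": 0.65 * p1 + 0.35 * last_adaptive_forecast,
        "weak_trend": p1 + 0.4 * (p1 - p2),
        "strong_trend": p1 + 1.3 * (p1 - p2),
        "anchor_adjust": anchor + (p1 - p2),
    }


def update_fitness(fitness, forecasts, realized, memory=0.7):
    """Fitness = minus the squared forecast error plus discounted past fitness."""
    return {h: -(realized - forecasts[h]) ** 2 + memory * fitness[h]
            for h in fitness}


def update_fractions(fractions, fitness, intensity=0.4, inertia=0.9):
    """Shift shares toward better-performing rules (logit choice with inertia)."""
    best = max(fitness.values())  # subtracted only for numerical stability
    weights = {h: math.exp(intensity * (u - best)) for h, u in fitness.items()}
    total = sum(weights.values())
    return {h: inertia * fractions[h] + (1.0 - inertia) * weights[h] / total
            for h in fractions}
```

For example, with past prices of 50 and 55 and a realized price of 61, the strong trend rule has the smallest error in this sketch, so its share rises above the initial 25% while the share of the adaptive rule falls.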
More recently, Bao and Hommes (2019) study the price dynamics in experimental housing markets as
a “hybrid” of positive and negative feedback systems. The housing market is a production market, and
hence a negative feedback system for the builders, and an asset market, and hence a positive feedback
market for the speculators. The result of the experiment shows that the market price tends to be more
stable when the “strength” of negative feedback, i.e., the slope of the supply function, is larger. The
result provides supportive evidence that, other things equal, housing markets with larger supply
elasticity should experience fewer bubbles and crashes. These results also show that overall weak
positive feedback leads to a stable market, while strong positive feedback creates bubbles and crashes.
3. Stylized Results from Recent Literature
In this section, we review the results of some recent studies (most of them conducted or published
after 2010) in the LtFE literature. We do not attempt to cover all details of the design and results of all
studies, but try to highlight the main conclusions and supporting evidence, as Palan (2013) did for the
literature on asset bubbles in continuous double auction markets a la Smith et al. (1988).
3.1 LtFE versus LtOE
Observation 1: The convergence to REE is not more likely or faster when the subjects submit quantity
decisions instead of making price forecasts. Rather under quantity decisions convergence may be
slower, and bubbles and crashes are robust.
Support: Since the beginning of the learning to forecast experiment literature, there have been
questions about the comparability between the results from LtFEs and learning to optimize
experiments (LtOEs), where subjects make quantity decisions directly. Though there have been some
LtOEs that also elicit price forecasts (e.g., Cheung et al., 2014, Cohn et al., 2015, Haruvy et al., 2007,
Hanaki et al., 2018), the price forecast in those experiments is more like a by-product of the
experiment: it does not enter the DGP of the market price, and hence plays a minimal role in the
experiment as opposed to expectation formation in LtFEs.
To our knowledge, Bao et al. (2013) is the first experiment that sets up comparable LtFE and LtOE
treatments, as well as a combination of the two. The underlying model is the same cobweb economy
model in all treatments, where the subjects play the role of advisors of competing companies
producing consumer products. The good is an ordinary good, so that demand is a downward-sloping
function of the price. The authors impose a quadratic cost function of production.
In the LtFE treatment, the subjects submit their price forecast in each period. The price is then
determined by the average price forecast, and the subjects are paid according to their forecasting
accuracy. In the LtOE treatment, the subjects submit their production quantity directly. The market
price is then determined by the total supply quantity, and subjects are paid according to the
profitability of this quantity decision. In a third treatment, they combine the two: the subjects submit
both a price forecast and a production quantity. Then the market price is determined by the total
supply quantity as in the LtOE treatment, and subjects receive their payoff half from the forecasting
task and half from the quantity decision task.
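The link between the LtFE and LtOE treatments in a cobweb economy can be sketched as follows. The sketch assumes a linear demand function and a quadratic cost function of the form q²/(2s), so that the profit-maximizing quantity is linear in the price forecast. The functional forms and all parameter values (intercept 100, demand slope 10, supply slope 1, six firms) are hypothetical and are not those of the experiment.

```python
def market_price(quantities, intercept=100.0, demand_slope=10.0):
    """Price that clears the assumed linear demand D(p) = intercept - demand_slope * p."""
    return (intercept - sum(quantities)) / demand_slope


def optimal_quantity(price_forecast, supply_slope=1.0):
    """Profit-maximizing output for the assumed cost q**2 / (2 * supply_slope)."""
    return supply_slope * price_forecast


def ltfe_price(price_forecasts):
    """LtFE: the computer converts each forecast into its optimal quantity."""
    return market_price([optimal_quantity(f) for f in price_forecasts])


def ltoe_price(quantities):
    """LtOE: subjects choose quantities directly."""
    return market_price(quantities)


# With six firms the REE price in this sketch is intercept / (demand_slope + 6 * supply_slope).
ree_price = 100.0 / (10.0 + 6 * 1.0)
```

In this sketch the two treatments share the same price equation and differ only in who performs the optimization step. In the LtFE case, a higher average forecast raises total supply and lowers the realized price, which is the negative feedback property of the cobweb model.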
The result of Bao et al. (2013) shows that convergence is the fastest in LtFE and slowest in the
combination of LtFE and LtOE. The authors further find that most subjects use adaptive rules to
forecast prices. Given their price forecast, subjects fail to provide the conditionally optimal quantity in
the treatment with both forecasting and optimizing tasks. The results suggest that LtFE indeed
provides an “upper bound” of how well the rational expectations hypothesis works in markets.
Unlike Bao et al. (2013), Bao et al. (2017) study the expectation formation and price dynamics in
positive feedback markets where subjects play the role of advisors for investment companies. In Bao
et al. (2017), the company will buy more assets if the subject’s prediction of the future asset price is
higher. The authors also design three treatments: LtFE, LtOE, and a third one called Mixed, where the
subject does both forecasting and quantity decision (on trading) tasks. To avoid potential
hedging, the subjects in the Mixed treatment receive their payment based on their performance in the
forecasting and trading task with 50:50 probability, instead of 50:50 weight.
Figure 3: The asset price in a typical market in the LtFE (top left panel), LtOE (top right panel), and
Mixed (bottom panel) in Bao et al. (2017).
Figure 3 presents the asset price dynamics in a typical market in the LtFE, LtOE, and Mixed in Bao et
al. (2017). None of the markets converges to the REE, but the price deviation
and the magnitude of fluctuation are much larger in the LtOE and Mixed treatments than in the LtFE
treatment.
Besides, the authors provide an empirical micro foundation for the observed differences across the three
treatments. They estimate individual forecasting and trading rules and find significant differences
across treatments. In the LtFE treatment, individual forecasting behavior is more cautious. Subjects
use a more conservative anchor (a weighted average of last observed price and last forecast) in their
trend-following rules. In contrast, in the Mixed treatment, almost all weight is given to the last
observed price, leading to a more aggressive trend-following forecasting rule. Individual trading
behavior of most subjects can be characterized by extrapolation of past and/or expected returns, and
the return extrapolation coefficients are higher in the LtOE and Mixed treatments.
The result from Arifovic et al. (2019) supports the aforementioned fast convergence in LtFE in a
complex nonlinear overlapping generations (OLG) framework. For both the LtFE and LtOE the OLG
economy converges to simple equilibria, a steady state, or a 2-cycle. Subjects in the LtFE design may
converge to a 2-cycle, while price predictions in LtOE fail to do so even after the initial
oscillations. The authors plot the cumulative distribution of individual decision times and the length
of instructions and report a significantly higher cognitive load in LtOE than LtFE. In sum, they
suggest the possibility that it is the strategic uncertainty or difference in cognitive load between the
two designs that leads to the observed differences in outcomes.
Giamattei et al. (2020) find that if subjects are asked to provide a price forecast on a double auction
market a la Smith et al. (1988), paying for the accuracy of the forecast tends to enlarge the mispricing
and market instability. The reason may be that the incentive distracts the subjects from
tracking the fundamental value while trading.
3.2 Large Scale LtFEs
Observation 2: Bubbles and crashes also occur in large experimental LtFE asset markets.
Support: Most standard LtFEs use a market size of 6 participants. Some people may wonder if the
results of this design are robust when the group size becomes larger. In particular, supporters of the
rational expectations hypothesis may claim that RE works the best with a large economy populated by
millions of people, and a large sample size may be a necessary and sufficient condition for “wisdom
of crowds” to work.
In response to this question, a few recent LtFEs employ a large-scale design, i.e., they increase the
market size from 6 to 20-30, or even 100. These studies usually show that bubbles and crashes still
occur in these large markets as they do in smaller markets.
Bao et al. (2020) study the price dynamics and individual expectations in LtFE markets. Each solid
line in Figure 4 represents one market in the experiment. The experimental setup is the same as in Hommes et al.
(2008), except that the market size increases from 6 to 21-32. The unique REE of the market price is
60 (dashed line). The results show that, similar to the markets in Hommes et al. (2008), 6 out of 7
markets show persistent divergence from the REE, and the peak of the price cycle can be as high as
almost 1000. Thus, the findings in Hommes et al. (2008) are robust when the market size increases
from 6 to 20-30.
Figure 4: Price Dynamics in Bao et al. (2020).
The price dynamics in the seven markets from Bao et al. (2020) are shown in Figure 4. As the figure
shows, the price dynamics follow the same pattern as in Hommes et al. (2008), and there is no
evidence that a larger group size reduces the size or likelihood of bubbles.
Hommes et al. (2020) extend the size of the large experimental asset market further to around 100
subjects (between 92 and 104) in each market. The unique REE of the asset price in this experiment is
66. The average asset price is 139.38 for the Large groups and 153.41 for the Small groups (with six
subjects each as in a standard LtFE). While the overvaluation seems smaller in large markets, it is still
far from zero, and large bubbles occur in 3 out of 6 large markets. Besides, the authors also examine
the effect of news announcements when the market is overheated and find that it can help to bring
down the asset prices substantially.
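To put the reported averages in perspective, the short calculation below expresses each average price as a relative deviation from the REE price of 66. The relative-deviation measure is chosen here purely for illustration and is not necessarily the statistic used by the authors.

```python
REE_PRICE = 66.0

# Average asset prices as reported in the text above.
average_price = {"large": 139.38, "small": 153.41}

# Relative overvaluation: (average price - REE) / REE.
relative_overvaluation = {
    group: (price - REE_PRICE) / REE_PRICE
    for group, price in average_price.items()
}
```

On this measure the average price is roughly 111% above the REE in the large groups and roughly 132% above it in the small groups, so overvaluation is somewhat lower in large markets but remains substantial.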
3.3 Time Horizon
Observation 3: Markets populated with more long-run forecasters are more likely to converge to the
REE. Long-run forecasters’ forecasts are better described by adaptive learning, while short-run
forecasters are usually trend-extrapolators.
Support: Evans et al. (2019) run a learning to forecast experiment where subjects play the role of
agents with CRRA utility functions and solve a consumption-based asset pricing problem a la Lucas
(1978). In this setting, a boundedly rational agent model by Branch et al. (2012) proposes that when
agents engage in “T-period ahead optimal learning,” the asset price will converge to its REE faster when
T is larger. The authors design four treatments, where the market is populated by 0%, 30%, 50%, and
100% of subjects who make ten-period-ahead forecasts (while the rest are subjects who make
one-period-ahead forecasts as in standard LtFEs). The result shows that short-horizon markets are prone to
persistent deviations from rational expectations (RE). By contrast, markets populated by even a
modest fraction of long-horizon forecasters exhibit convergence towards the REE. Long-horizon
forecasts are well-described by adaptive learning, which leads to convergence and stabilization, while
short-horizon forecasters are usually users of destabilizing trend-following strategies.
Parallel to the paper mentioned above, Anufriev et al. (2020) examine how long-run expectations
influence market stability. Different from Evans et al. (2019), their experimental setting is the
standard one following Brock and Hommes (1998) and Hommes et al. (2005, 2008). In this study,
long-run expectation means the subjects make two-period-ahead or three-period-ahead
expectations, and there is no treatment with a mixture of short-run and long-run forecasters, i.e., all
subjects face the same forecasting time horizon in each market. The authors also introduce an initial
history of past prices at the beginning of the experiment. That is, instead of seeing no past prices, the
subjects can observe a long history of asset prices from markets in previous asset pricing LtFEs. Like
Evans et al. (2019), the result of this paper shows that long-run expectations tend to help stabilization
and convergence to REE. All markets that start with the history of converging prices tend to stay
stable. For the markets that start with the history of oscillating price dynamics, the price tends to be
more stable when the subjects make long-run instead of short-run expectations.
Besides these two papers, some studies elicit long-run expectations in addition to short-run expectations,
e.g., Colasante et al. (2018, 2020). But since the long-run expectations in those experiments do not
enter the DGP of the realized asset prices, they play a lesser role in the experiment and tend to
generate a smaller impact.
Observation 4: Increasing the length, i.e., the number of periods, and time pressure can help the
markets to converge to the REE.