Signal extraction and rational inattention.
Luo, Yu Lei ; Young, Eric R.
I. INTRODUCTION
The objective of this paper is to compare two theories of
information frictions. In the first theory, agents observe the state of
the world with exogenously imposed noise; we refer to this environment
as the signal extraction (SE) model. (1) The second theory permits
agents to design the distribution of noise terms by focusing limited
attention on certain variables at the expense of others; following Sims
(2003) we refer to this model as the rational inattention (RI) model.
(2) In both models, agents respond to changes in the true underlying
state slowly because it takes time for them to learn exactly what the
new state is. The key difference is that the noise distribution under RI
is an outcome of optimal choice, and thus can adapt to changing
circumstances in the economy, whereas under SE the noise distribution is
technologically fixed. Our goal is to explore this crucial distinction.
We explore the distinction between SE and RI within a
linear-quadratic-Gaussian (LQG) setting. We first study a univariate
case for which the variance-covariance matrix of the noise shocks can be
solved in closed form. First, we note that (at least within the LQG
framework) any variance-covariance structure for the noise shocks
derived from the RI problem can be exogenously imposed in the SE
environment, so there is a sense in which the two environments are
observationally equivalent.
The next result we find is that if the variance of the noise itself
is fixed, we can use a policy experiment to distinguish SE from RI.
Suppose that the variance of an exogenous shock is scaled up because of
a change in policy. In the SE problem where the distribution of noise is
exogenous, an increase in the variance of the shock will lead to a
different solution for the conditional variance and Kalman gain;
consequently, the change in policy will lead to a change in the
model's dynamic behavior and the agent's payoffs. In contrast,
in the RI problem, if channel capacity is fixed, a change in the
variance of the exogenous shock will lead to the same change in the
conditional variance of the state and the variance of the noise, but
will not affect the Kalman gain. That is, inattentive agents with fixed
capacity will behave as if they face noise shocks whose nature changes
systematically as the dynamic properties of the economy change with
policy. Furthermore, we show that if we assume that the marginal cost of
information is fixed, capacity will be elastic with respect to a change
in policy; consequently, the Kalman gain in this case will also adjust
in response to the policy change. The losses of agents due to imperfect
information depend on the value of the Kalman gain. Therefore, in
general, SE and RI imply different policy recommendations.
In contrast, we also find that in the univariate case, if the ratio
of the variance of the exogenous shock to that of the noise (i.e., the
signal-to-noise ratio, SNR) is fixed, the SE and RI problems are
observationally equivalent in the sense that they lead to the same
dynamics of the model economy when the ratio of the conditional variance
to that of the noise in the SE problem equals a particular function of
the channel capacity in the RI problem. Our results do not depend on the
presence or absence of correlation between fundamental and noise shocks.
Specifically, in the presence of the correlation, a change in the
variance of the exogenous shock does not change the dynamic behavior of
the model in the fixed capacity case, whereas it changes the
model's dynamics in the fixed information-processing cost case in
which both the variance of noise and the Kalman gain are affected by the
interactions between the correlation, the variance of the fundamental
shock, and the conditional variance.
We then move on to study the multivariate case. In this case, given
channel capacity the conditional variance-covariance matrix can be
obtained by solving a semidefinite programming problem in which the
inattentive agent minimizes the expected loss due to
information-processing constraints. After computing the optimal
steady-state conditional variance-covariance matrix, we can recover the
variance-covariance matrix of the noise vector and then determine the
Kalman gain. In this case, we show that SE and RI generate different
dynamics after a change in the variance of the exogenous shock even if
the SNR is fixed.
However, when modeling the multivariate SE problem, it is difficult
to specify the process of the vector of noises ex ante without prior
knowledge about the states. Ad hoc assumptions on the nature of the
noise might be inconsistent with the underlying efficiency conditions
(equalization of the marginal utility of additional capacity across
variables). (3) Therefore, RI provides a micro-founded method for
specifying the stochastic properties of the noises. It is worth noting
that in the multivariate RI problem, the agent's preference, budget
constraint, and information-processing constraints jointly determine the
values of the conditional variance of the state, the variance of the
noise, and the Kalman gain, whereas in the multivariate SE problem given
the variance of the noise, the propagation equation updating the
conditional variance based on the budget constraint is used to determine
the conditional variance and then the Kalman gain.
Finally, we present two canonical examples to illustrate our
results. First, we study a permanent income model to explore the
different consumption and asset pricing implications of SE and RI. We
argue that the RI model with fixed information-processing cost does a
better job at replicating the different consumption behavior in emerging
and developed small open economies (SOEs), and thus helps us distinguish
RI from SE. (4) We also discuss a simplified model of price setting in
the vein of Mackowiak and Wiederholt (2009) and argue that optimal
attention allocation between aggregate and firm-specific shocks can also
help distinguish RI from SE. In that model, Paciello and Wiederholt (2013)
show that optimal monetary policy under SE and RI differs precisely
because the monetary authority can manipulate the attention decision; in
other words, it matters whether agents live in an SE or RI world.
The remainder of the paper is organized as follows. Section II
examines optimal decisions and economic dynamics in an LQG setting with
SE. Section III presents the RI version of the model and compares
different implications of RI and SE for dynamics and policy within the
LQG setting. Section IV presents applications to models of permanent
income and price setting. Section V briefly discusses the different
implications of SE and RI for consumption in a continuous-time setting.
Section VI concludes.
II. SIGNAL EXTRACTION IN AN LQG MODEL
A. Full-Information Rational Expectations LQG Model
Consider the following LQG model:
(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
subject to
(2) [s.sub.t+1] = A[s.sub.t] + B[c.sub.t] + [[epsilon].sub.t+1],
with [s.sub.0] known and given, where [beta] < 1 is the discount
factor, [s.sub.t] is a (n x 1) state vector, [c.sub.t] is a (k x
1) control vector, [[epsilon].sub.t+1] is an iid (n x 1) vector of
Gaussian random variables with mean 0 and covariance matrix [OMEGA], and
[E.sub.t] [*] denotes the mathematical expectation of a random variable
given information processed at t. We assume that Q, R, and W are such
that the objective function is jointly concave in [s.sub.t] and
[c.sub.t], and the usual conditions required for the optimal policy to
exist are satisfied.
When the agent can fully observe the state [s.sub.t], the model is
a standard linear-quadratic regulator problem. Solving the corresponding
Bellman equation
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
yields the decision rule
(3) [c.sup.*.sub.t] = - F[s.sub.t],
and the Riccati equation is
(4) P = Q + [F.sup.T] RF - 2[F.sup.T] W +[beta]([A.sup.T] -
[F.sup.T] [B.sup.T])P(A - BF),
where
(5) F = [(R + [beta][B.sup.T] PB).sup.-1] (W + [beta][B.sup.T] PA).
Iterating on the matrix Riccati equation (4) uniquely determines P,
as the equation defines a contraction mapping. Using P, we can determine
F in the optimal policy (5).
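A minimal numerical sketch of this iteration, for the scalar case only. Because objective (1) is not reproduced here, the example assumes a quadratic loss with positive weights Q and R; the parameter values and the function name `solve_scalar_riccati` are hypothetical and serve only to illustrate Equations (4) and (5).

```python
def solve_scalar_riccati(A, B, Q, R, W, beta, tol=1e-12, max_iter=10_000):
    """Iterate on scalar versions of Equations (4) and (5) until P settles."""
    P = Q  # arbitrary starting guess
    for _ in range(max_iter):
        F = (W + beta * B * P * A) / (R + beta * B * B * P)                 # Equation (5)
        P_next = Q + F * F * R - 2.0 * F * W + beta * (A - B * F) ** 2 * P  # Equation (4)
        if abs(P_next - P) < tol:
            P = P_next
            break
        P = P_next
    else:
        raise RuntimeError("Riccati iteration did not converge")
    F = (W + beta * B * P * A) / (R + beta * B * B * P)  # policy implied by the final P
    return P, F


# Hypothetical parameter values, chosen only for illustration.
P, F = solve_scalar_riccati(A=0.9, B=1.0, Q=1.0, R=1.0, W=0.0, beta=0.95)
print(f"P = {P:.6f}, F = {F:.6f}, closed-loop coefficient A - BF = {0.9 - F:.6f}")
```

With these values the closed-loop coefficient A - BF lies inside the unit interval, so the controlled state is stable.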
B. Signal Extraction with Exogenous Noises
Following the SE literature (e.g., Lucas 1972; Morris and Shin
2002; Muth 1960), we now assume that the agent cannot observe the true
state [s.sub.t] perfectly and only observes the noisy signal
[s.sup.*.sub.t] = [s.sub.t] + [[xi].sub.t] when making decisions. Here
[[xi].sub.t] is a (n x 1) vector of noises. The agent then estimates the
state using a standard Kalman filtering equation. In the standard SE
problem, the stochastic property of the noise [[xi].sub.t] is given
exogenously. Specifically, assume that [[xi].sub.t] is an iid Gaussian
innovation with mean 0 and variance-covariance matrix [LAMBDA]. (5) We
point out here that the agent may not have perfect information even
about the endogenous part of the state vector [s.sub.t].
Under the LQG assumption, the certainty equivalence principle holds
when the agent cannot observe [s.sub.t] perfectly, so the decision rule
under imperfect information can be written as
(6) [c.sup.*.sub.t] = -F[[??].sub.t],
where [[??].sub.t] = E [[s.sub.t]|[I.sub.t]] is the perceived state
and [I.sub.t] = {[s.sup.*.sub.t], [s.sup.*.sub.t-1], ...,
[s.sup.*.sub.0]} is the information set including perceived signals
until time t.
Furthermore, we assume that in the steady state, the true state
follows a normal distribution after observing the noisy signals
[s.sub.t]|[I.sub.t] ~ N (E [[s.sub.t]|[I.sub.t]], [[SIGMA].sub.t]),
where [[SIGMA].sub.t] = [E.sub.t][([s.sub.t] - [[??].sub.t])[([s.sub.t]
- [[??].sub.t]).sup.T]] is the conditional variance-covariance matrix,
and the Kalman filtering equation governs the behavior of [[??].sub.t]
(7) [[??].sub.t+1] = (1 - [theta]) (A[[??].sub.t] + B[c.sub.t]) +
[theta][s.sup.*.sub.t+1],
where [theta] is the steady-state Kalman gain to be determined. (6)
Following the standard procedure in the Kalman filter literature, we
have the steady state updating equation for [SIGMA] and Kalman gain
[theta]:
(8) [SIGMA] = (I - [theta]) (A[SIGMA][A.sup.T] + [OMEGA])
[(I - [theta]).sup.T] + [theta][LAMBDA][[theta].sup.T]
and
(9) [theta] = ([OMEGA] + A[SIGMA][A.sup.T]) [([OMEGA] +
A[SIGMA][A.sup.T] + [LAMBDA]).sup.-1].
After computing ([theta], [SIGMA]) by iteration, we can obtain a
complete characterization of the dynamic system. The key assumption in
the SE problem is that the variance-covariance matrix of the noise,
[LAMBDA], is given. Given this [LAMBDA], Equations (8) and (9) jointly
determine the steady state ([theta], [SIGMA]). It is straightforward to
show that we have the following alternative equations for computing the
Kalman gain and the conditional variance-covariance matrix, ([theta],
[SIGMA]):
(10) [[LAMBDA].sup.-1] = [[SIGMA].sup.-1] - [[PSI].sup.-1]
and
(11) [theta] = [SIGMA][[LAMBDA].sup.-1],
where [PSI] = A[SIGMA][A.sup.T] + [OMEGA] is the steady-state
conditional variance of the state prior to observing the new signal.
After obtaining the Kalman gain, Equations (6), (7), and (11) completely
characterize the model's dynamic behavior.
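The fixed point of Equations (8) and (9) can be computed by simple iteration. The sketch below does this for the scalar case and then checks that the result satisfies the alternative forms (10) and (11). The function name and the parameter values are hypothetical, chosen only for illustration.

```python
def se_steady_state(A, Omega, Lam, tol=1e-13, max_iter=100_000):
    """Iterate scalar Equations (8) and (9) to the steady state."""
    Sigma = Omega  # arbitrary positive starting guess
    for _ in range(max_iter):
        Psi = A * A * Sigma + Omega                              # prior variance
        theta = Psi / (Psi + Lam)                                # Equation (9)
        Sigma_next = (1 - theta) ** 2 * Psi + theta ** 2 * Lam   # Equation (8)
        converged = abs(Sigma_next - Sigma) < tol
        Sigma = Sigma_next
        if converged:
            break
    else:
        raise RuntimeError("Kalman iteration did not converge")
    Psi = A * A * Sigma + Omega
    return Psi / (Psi + Lam), Sigma, Psi


# Hypothetical values: unit-root state, noise variance half the shock variance.
Lam = 0.5
theta, Sigma, Psi = se_steady_state(A=1.0, Omega=1.0, Lam=Lam)
print("Equation (10) residual:", 1 / Lam - (1 / Sigma - 1 / Psi))
print("Equation (11) residual:", theta - Sigma / Lam)
```

Both residuals are zero up to numerical tolerance, which is the sense in which (10) and (11) are an alternative characterization of the same steady state.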
III. RATIONAL INATTENTION IN THE LQG MODEL
Following Sims (2003), we introduce RI into the LQG model proposed
in Section II.A by assuming agents face information-processing constraints
and have only finite Shannon channel capacity to observe the state of
the world. Specifically, we use the concept of entropy from information
theory to characterize the uncertainty about a random variable; the
reduction in entropy is thus a natural measure of information flow.
Formally, entropy is defined as the expectation of the negative of the
(natural) log of the density function, -E [ln (f (X))]. For example, the
entropy of a discrete distribution with equal weight on two points is
simply -E [ln (f (X))] = -0.5 ln (0.5) - 0.5 ln (0.5) = 0.69, and the
unit of information contained in this distribution is 0.69
"nats." (7) In this case, an agent can remove all uncertainty
about X if the capacity devoted to monitoring X is [kappa] = 0.69 nats.
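The two-point example can be checked directly. The helper name `entropy_nats` is hypothetical; it simply evaluates -E [ln (f (X))] for a discrete distribution.

```python
import math


def entropy_nats(probabilities):
    """Shannon entropy -E[ln f(X)] of a discrete distribution, in nats."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


two_point = entropy_nats([0.5, 0.5])  # equals ln 2
print(round(two_point, 2))  # 0.69
```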
With finite capacity [kappa] [member of] (0, [infinity]), a variable s
following a continuous distribution cannot be observed without error and
thus the information set at time t + 1, [I.sub.t+1], is generated by the
entire history of noisy signals [{[s.sup.*.sub.j]}.sup.t+1.sub.j=0].
Following the literature, we assume the noisy signal takes the additive
form [s.sup.*.sub.t+1] = [s.sub.t+1] + [[xi].sub.t+1], where
[[xi].sub.t+1] is the endogenous noise caused by finite capacity. We
further assume that [[xi].sub.t+1] is an iid idiosyncratic shock and is
independent of the fundamental shock. Note that the reason that the
RI-induced noise is idiosyncratic is that the endogenous noise arises
from the consumer's own internal information-processing constraint.
Agents with finite capacity will choose a new signal [s.sup.*.sub.t+1]
[member of] [I.sub.t+1] = {[s.sup.*.sub.1], [s.sup.*.sub.2], ...,
[s.sup.*.sub.t+1]} that reduces the uncertainty of the state variable
[s.sub.t+1] as much as possible. Formally, this idea can be described by
the information constraint
(12) H ([s.sub.t+1]|[I.sub.t]) - H ([s.sub.t+1] | [I.sub.t+1])
[less than or equal to] [kappa],
where [kappa] is the agent's information channel capacity,
H ([s.sub.t+1]|[I.sub.t]) denotes the entropy of the state prior to
observing the new signal at t + 1, and H ([s.sub.t+1]|[I.sub.t+1]) is the
entropy after observing the new signal. The capacity [kappa] imposes an upper bound
on the amount of information--that is, the change in the entropy--that
can be transmitted in any given period. Finally, following the
literature, we suppose that the prior distribution of [s.sub.t+1] is
Gaussian.
Under the LQG setting, as has been shown in Sims (2003, 2006), the
true state under RI also follows a normal distribution
[s.sub.t]|[I.sub.t] ~ N (E [[s.sub.t]|[I.sub.t]], [[SIGMA].sub.t]),
where [[SIGMA].sub.t] = [E.sub.t][([s.sub.t] -
[[??].sub.t])[([s.sub.t] - [[??].sub.t]).sup.T]]. In addition, given that
the noisy signal takes the additive form [s.sup.*.sub.t+1] =
[s.sub.t+1] + [[xi].sub.t+1], the noise [[xi].sub.t+1] should also be
Gaussian. It is worth noting that the Gaussianity of the posterior
distribution of the true state and of the noise is optimally determined by
the LQG structure. Gaussianity is often assumed as a matter of convenience
in SE models with exogenous noises, and RI can rationalize this
assumption. In addition, in the steady state the agent observes an
additive noisy signal: [s.sup.*.sub.t] = [s.sub.t] + [[xi].sub.t]. Note that in the
RI problem we also have the usual formula for updating the conditional
variance-covariance matrix of a Gaussian distribution [SIGMA] in steady
state:
(13) [[LAMBDA].sup.-1] = [[SIGMA].sup.-1] - [[PSI].sup.-1].
Using these expressions, the Kalman gain [theta] can also be
written as
(14) [theta] = [SIGMA][[LAMBDA].sup.-1].
A. The Univariate Case
The key difference between SE and RI is that under RI the agent
faces the information-processing constraint
(15) -ln ([absolute value of [[SIGMA].sub.t+1]]) + ln ([absolute
value of [A.sup.T][[SIGMA].sub.t]A + [OMEGA]]) [less than or equal
to] 2[kappa].
As more information about the state is better in single-agent
models, this constraint will be binding. (8) Considering the univariate
state case n = 1, (15) fully determines the value of the steady-state
conditional variance [SIGMA]:
(16) [SIGMA] = [OMEGA]/[exp (2[kappa]) - [A.sup.2]],
which means that [SIGMA] is determined by the variance of the
exogenous shock ([OMEGA]) and the exogenously given capacity ([kappa]).
(9) Given this [SIGMA] we can use (13) to recover the variance of the
endogenous noise ([LAMBDA]):
(17) [LAMBDA] = [([[SIGMA].sup.-1] - [[PSI].sup.-1]).sup.-1],
where [PSI] = [A.sup.2][SIGMA] + [OMEGA], and use (14) to find
the Kalman gain ([theta]):
(18) [theta] = [SIGMA][[LAMBDA].sup.-1] = 1 -
[SIGMA][[PSI].sup.-1],
which reduces to [theta] = 1 - 1/exp(2[kappa]) using Equations
(14), (16), and (17). Note that Equations (17) and (18) also hold in the
SE problem. To compare the RI and SE problems in the univariate case, we
first consider the following two cases.
Case 1. Assume that [LAMBDA] is fixed exogenously in the SE case.
In Case 1, it is clear that in the SE problem given [LAMBDA] and
[OMEGA], we can compute [SIGMA] by solving the nonlinear equation (10).
After obtaining [SIGMA], we can use Equation (11) to determine the
Kalman gain [theta]; thus, in this sense SE and RI have the same
implications.
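The equivalence described in Case 1 can be illustrated numerically. The sketch below computes ([SIGMA], [LAMBDA], [theta]) from Equations (16)-(18) for a given capacity, then feeds the implied [LAMBDA] into an SE problem solved by iterating on Equation (10), and recovers the same gain. Function names and parameter values are hypothetical.

```python
import math


def ri_univariate(A, Omega, kappa):
    """Closed-form RI solution from Equations (16), (17), and (18)."""
    growth = math.exp(2.0 * kappa)
    if growth <= A * A:
        raise ValueError("capacity too low: need exp(2*kappa) > A**2")
    Sigma = Omega / (growth - A * A)        # Equation (16)
    Psi = A * A * Sigma + Omega
    Lam = 1.0 / (1.0 / Sigma - 1.0 / Psi)   # Equation (17)
    theta = Sigma / Lam                     # Equation (18)
    return Sigma, Lam, theta


def se_gain_given_noise(A, Omega, Lam, n_iter=2000):
    """SE gain for an exogenous noise variance Lam, via Equations (10)-(11)."""
    Sigma = Omega  # arbitrary positive starting guess
    for _ in range(n_iter):
        Psi = A * A * Sigma + Omega
        Sigma = 1.0 / (1.0 / Psi + 1.0 / Lam)   # Equation (10), rearranged
    return Sigma / Lam                          # Equation (11)


# Hypothetical values for illustration.
A, Omega, kappa = 1.02, 1.0, 0.5
Sigma, Lam, theta_ri = ri_univariate(A, Omega, kappa)
theta_se = se_gain_given_noise(A, Omega, Lam)
print(theta_ri, theta_se, 1.0 - math.exp(-2.0 * kappa))
```

All three printed numbers coincide up to numerical tolerance: the RI gain, the SE gain under the RI-implied noise variance, and 1 - 1/exp(2[kappa]).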
We now discuss how to use a policy experiment to distinguish RI
from SE. Suppose that the variance of the exogenous shock, [OMEGA], is
scaled up because of a change in policy. In the SE problem with fixed
[LAMBDA], Equations (17) and (18) imply that an increase in [OMEGA] will
generally lead to a different solution for [SIGMA] and [theta];
consequently, the change in policy will lead to a change in the
model's dynamics. Because [SIGMA] is a nonlinear function of
[OMEGA], the effect of changes in [OMEGA] on [SIGMA] could be
complicated. In the next section, we will explore this relationship
using some numerical examples in a permanent income model. In contrast,
in the RI problem, if [kappa] is fixed, Equations (16), (17), and (18)
imply that a change in [OMEGA] will lead to the same change in [SIGMA],
[PSI], and [LAMBDA], but has no impact on [theta]. In other words,
agents with fixed capacity will behave as if facing noise whose nature
changes systematically as the dynamic properties of the economy change,
that is, the change in policy does not change the model's dynamics.
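A small numerical version of this policy experiment, under hypothetical parameter values: [OMEGA] is doubled, with [LAMBDA] held fixed in the SE problem and [kappa] held fixed in the RI problem.

```python
import math


def se_solution(A, Omega, Lam, n_iter=5000):
    """SE steady state (Sigma, theta) for an exogenous noise variance Lam."""
    Sigma = Omega  # arbitrary positive starting guess
    for _ in range(n_iter):
        Psi = A * A * Sigma + Omega
        Sigma = 1.0 / (1.0 / Psi + 1.0 / Lam)   # Equation (10), rearranged
    return Sigma, Sigma / Lam                   # theta from Equation (11)


def ri_solution(A, Omega, kappa):
    """RI steady state (Sigma, Lam, theta) for a fixed capacity kappa."""
    Sigma = Omega / (math.exp(2.0 * kappa) - A * A)   # Equation (16)
    Psi = A * A * Sigma + Omega
    Lam = 1.0 / (1.0 / Sigma - 1.0 / Psi)             # Equation (17)
    return Sigma, Lam, Sigma / Lam                    # theta from Equation (18)


# Hypothetical values: unit-root state, fixed SE noise variance, fixed RI capacity.
A, kappa, Lam_fixed = 1.0, 0.5, 1.0
for Omega in (1.0, 2.0):
    _, theta_se = se_solution(A, Omega, Lam_fixed)
    Sigma_ri, Lam_ri, theta_ri = ri_solution(A, Omega, kappa)
    print(f"Omega={Omega}: SE theta={theta_se:.4f} | "
          f"RI Sigma={Sigma_ri:.4f}, Lam={Lam_ri:.4f}, theta={theta_ri:.4f}")
```

The SE gain moves with [OMEGA], while under RI both [SIGMA] and [LAMBDA] double and the gain stays put.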
Case 2. Assume that the signal-to-noise ratio (SNR), [OMEGA][[LAMBDA].sup.-1], is
fixed exogenously in the SE case.
Note that Equation (17) can be rewritten as
(19) [OMEGA][[LAMBDA].sup.-1] = [OMEGA][[SIGMA].sup.-1] -
[[[A.sup.2][([OMEGA][[SIGMA].sup.-1]).sup.-1] + 1].sup.-1].
In Case 2, since the SNR is fixed, (19) can be used to solve for
[OMEGA][[SIGMA].sup.-1]. Given the SNR and [OMEGA][[SIGMA].sup.-1], we
can compute
(20) [SIGMA][[LAMBDA].sup.-1] = ([SIGMA][[OMEGA].sup.-1])
([OMEGA][[LAMBDA].sup.-1]).
Consider the same experiment in which [OMEGA] is scaled up. A fixed SNR
means that the exogenous noise must also be scaled up so that
[OMEGA][[LAMBDA].sup.-1] stays at the same level; consequently,
Equation (19) leads to the same solution for [OMEGA][[SIGMA].sup.-1]
and Equation (20) leads to the same [SIGMA][[LAMBDA].sup.-1]. The following
proposition summarizes the main conclusion in this case:
PROPOSITION 1. In Case 2 (i.e., the SNR is fixed), the SE and RI
problems are observationally equivalent in the sense that they lead to
the same dynamics if [kappa] is fixed and [SIGMA] [[LAMBDA].sup.-1] in
the SE problem is equal to 1 - 1 / exp (2[kappa]) in the RI problem.
Proof. The proof is straightforward by comparing Equations (18) and
(20).
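Proposition 1 can be illustrated with a short computation. The sketch fixes the SNR in the SE problem, scales [OMEGA], confirms that the SE gain does not move, and then backs out the capacity [kappa] at which the RI gain 1 - 1/exp(2[kappa]) matches it. Function names and parameter values are hypothetical.

```python
import math


def se_gain_fixed_snr(A, Omega, snr, n_iter=5000):
    """SE gain when the noise variance is tied to Omega by a fixed SNR."""
    Lam = Omega / snr   # fixed SNR: Omega / Lam = snr
    Sigma = Omega       # arbitrary positive starting guess
    for _ in range(n_iter):
        Psi = A * A * Sigma + Omega
        Sigma = 1.0 / (1.0 / Psi + 1.0 / Lam)   # Equation (10), rearranged
    return Sigma / Lam                          # Equation (11)


def ri_gain(kappa):
    """RI gain under fixed capacity: theta = 1 - 1/exp(2*kappa)."""
    return 1.0 - math.exp(-2.0 * kappa)


def matching_capacity(theta):
    """Capacity at which the RI gain equals a given SE gain."""
    return -0.5 * math.log(1.0 - theta)


# Hypothetical values for illustration.
A, snr = 1.0, 1.0
gains = [se_gain_fixed_snr(A, Omega, snr) for Omega in (1.0, 3.0, 10.0)]
kappa = matching_capacity(gains[0])
print(gains, kappa, ri_gain(kappa))
```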
In the above analysis, for simplicity we assume that [kappa]
remains unchanged when [OMEGA] is altered by the government policy.
However, if an increase in [OMEGA] leads to higher marginal welfare
losses due to imperfect observations, some capacity may be reallocated
from other sources to reduce the welfare losses due to low capacity. In
this case, [theta] will change accordingly as it is completely
determined by capacity [kappa]; consequently, the dynamic behavior of
the model will also change in response to the change in [OMEGA]. We will
further explore this issue in the next section and the permanent income
model in Section IV.
Alternative Way to Model Limited Information-Processing Capacity.
As argued by Sims (2010), instead of using fixed finite channel capacity
to model limited information-processing ability, it is also reasonable
to assume that the marginal cost of information processing is constant.
That is, the Lagrange multiplier on Equation (15) is constant. (10) In
the univariate case, if the decision rule under full information is
[c.sup.*.sub.t] = H[s.sub.t] and the objective of the agent with finite
capacity is to minimize [[summation].sup.[infinity].sub.t=0] [[beta].sup.t]
[([c.sub.t] - [c.sup.*.sub.t]).sup.2], the optimization problem reduces
to
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [[SIGMA].sub.t] is the conditional variance at t, [lambda] is
the Lagrange multiplier corresponding to Equation (15), and we impose
the restriction that [beta]A = 1 and H = A - 1 for simplicity. Solving
this problem yields the optimal steady-state conditional variance:
(21) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
It is straightforward to show that as [lambda] goes to 0,
[SIGMA] = 0; and as [lambda] goes to [infinity], [SIGMA] =
[infinity]. Comparing Equations (21) with (16), it is clear that the two
modeling strategies are observationally equivalent in the sense that
they lead to the same conditional variance if the following equality
holds:
(22) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
In this case, the Kalman gain is
(23) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
It is obvious that [kappa] converges to its lower limit
[[kappa].bar] = ln (A) as [lambda] goes to [infinity]; and it converges
to [infinity] as [lambda] goes to 0. (11) In other words, using the RI
modeling strategy, the agent is allowed to adjust the optimal level of
capacity in such a way that the marginal cost of information processing
for the problem at hand remains constant. Note that this result is
consistent with the concept of "elastic" capacity proposed in
Kahneman (1973).
Furthermore, it is clear from Equation (22) that if the cost of
information processing ([lambda]) is fixed, an increase in fundamental
uncertainty ([OMEGA]) will lead to higher capacity ([kappa]) devoted to
monitoring the evolution of the state. We now consider the same policy
experiment discussed above: [OMEGA] is scaled up because of a change in
policy. If we adopt the assumption that [lambda] is fixed, Equation (21) means
that [SIGMA] changes less than proportionally because [partial
derivative] ln [SIGMA]/[partial derivative] ln [OMEGA] < 1. Note that in the
fixed [kappa] case, [partial derivative] ln [SIGMA]/[partial derivative] ln [OMEGA] = 1. Consequently, a change in [OMEGA] will change
[theta] and the model's dynamics if the inattentive agent is facing
fixed marginal cost of information. Therefore, different ways of
modeling RI may lead to different policy implications. (12)
Extension to Correlated Shocks and Noises. In the above analysis,
we assumed that the exogenous fundamental shock and noise are
uncorrelated. We now discuss how correlated shocks and noises affect the
implications of SE and RI for the model's dynamic behavior. In real
systems, we may observe correlated shocks and noises. For example, if
the system is an airplane and winds are buffeting the plane, the random
gusts of wind affect both the process (the airplane dynamics) and the
measurement (the sensed wind speed) if people use an anemometer to
measure wind speed as an input to the Kalman filter. See Simon
(2006) for a discussion. To use an economic example, suppose there is a
correlation between levels and volatility, and also that producing
national income data is costly; then a transition from low to high
volatility would be associated with low output levels and, at the
margin, less accurate measurements.
It is straightforward to introduce correlated shocks and noises
into the SE problem. Specifically, we consider the case in which the
process shock ([epsilon]) and the noise ([xi]) are correlated as
follows:
corr([[epsilon].sub.t+1], [[xi].sub.t+1]) = [rho],
cov ([[epsilon].sub.t+1], [[xi].sub.t+1]) = [GAMMA] = [rho][square
root of [OMEGA]][square root of [LAMBDA]],
where [rho] is the correlation coefficient between
[[epsilon].sub.t+1] and [[xi].sub.t+1], [OMEGA] = var
[[[epsilon].sub.t+1]] and [LAMBDA] = var [[[xi].sub.t+1]]. Under SE,
[LAMBDA] is given exogenously and the correlation just introduces
another exogenous stochastic dimension on the noise. As shown in Simon
(2006), in this case the optimal Kalman gain can be written as
(24) [theta] = ([PSI] + [GAMMA])[([PSI] + [LAMBDA] +
2[GAMMA]).sup.-1],
and the updating formula for the conditional variance is
(25) [SIGMA] = [PSI] - [([PSI] + [GAMMA]).sup.2] [([PSI] +
[LAMBDA] + 2[GAMMA]).sup.-1],
where [PSI] = [OMEGA] + [A.sup.2][SIGMA]. Just like the case
without the correlation, given [LAMBDA] and [GAMMA], Equations (24) and (25) jointly
determine the steady state ([theta], [SIGMA]).
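A minimal sketch of how the steady state with correlated shock and noise can be computed in the scalar SE problem, by iterating on Equation (25) and then evaluating Equation (24). The function name and parameter values are hypothetical; setting rho to zero recovers the uncorrelated case.

```python
import math


def se_correlated(A, Omega, Lam, rho, n_iter=5000):
    """Steady-state (theta, Sigma) from Equations (24) and (25), scalar case."""
    Gamma = rho * math.sqrt(Omega) * math.sqrt(Lam)   # covariance of shock and noise
    Sigma = Omega                                     # arbitrary positive starting guess
    for _ in range(n_iter):
        Psi = Omega + A * A * Sigma
        Sigma = Psi - (Psi + Gamma) ** 2 / (Psi + Lam + 2.0 * Gamma)   # Equation (25)
    Psi = Omega + A * A * Sigma
    theta = (Psi + Gamma) / (Psi + Lam + 2.0 * Gamma)                  # Equation (24)
    return theta, Sigma


# Hypothetical values for illustration.
for rho in (0.0, 0.3):
    theta, Sigma = se_correlated(A=1.0, Omega=1.0, Lam=1.0, rho=rho)
    print(f"rho={rho}: theta={theta:.4f}, Sigma={Sigma:.4f}")
```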
In the RI problem, the correlation generalizes the assumption in
Sims (2003) on the uncorrelated RI-induced noise. In the steady state,
Equation (25) can be rewritten as the following quadratic equation in
terms of [square root of [LAMBDA]]: [[[rho].sup.2][OMEGA] - ([PSI] -
[SIGMA])][LAMBDA] + 2[rho][SIGMA][square root of [OMEGA]][square root
of [LAMBDA]] + [SIGMA][PSI] = 0, which can be solved for
(26) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
It is clear from (26) that if [kappa] is fixed, the change in
[OMEGA] will lead to the same change in [SIGMA], [PSI], and [LAMBDA],
but has no effect on the Kalman gain [theta] =
[SIGMA][[LAMBDA].sup.-1]. That is, the presence of correlated noise
does not change the dynamic behavior of the model.
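The scale invariance under fixed capacity can be checked numerically by solving the quadratic in the square root of [LAMBDA] directly. Since Equation (26) is not reproduced here, the sketch makes its own assumption about the root: it takes the positive one, which is the only admissible root when the leading coefficient is negative (for instance when A is at least 1 and the correlation is below 1 in absolute value). Function names and parameter values are hypothetical.

```python
import math


def ri_correlated(A, Omega, kappa, rho):
    """RI steady state with correlated noise, scalar case, fixed capacity."""
    Sigma = Omega / (math.exp(2.0 * kappa) - A * A)   # Equation (16)
    Psi = Omega + A * A * Sigma
    # Quadratic a*x**2 + b*x + c = 0 in x = sqrt(Lam), from Equation (25).
    a = rho * rho * Omega - (Psi - Sigma)
    b = 2.0 * rho * Sigma * math.sqrt(Omega)
    c = Sigma * Psi
    if a >= 0.0:
        raise ValueError("sketch assumes a negative leading coefficient")
    x = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)   # positive root (assumption)
    Lam = x * x
    Gamma = rho * math.sqrt(Omega) * math.sqrt(Lam)
    gain = (Psi + Gamma) / (Psi + Lam + 2.0 * Gamma)        # Equation (24)
    return Sigma, Lam, gain, Sigma / Lam


# Hypothetical values: Omega is scaled by a factor of 4 with kappa held fixed.
for Omega in (1.0, 4.0):
    Sigma, Lam, gain, ratio = ri_correlated(A=1.0, Omega=Omega, kappa=0.5, rho=0.3)
    print(f"Omega={Omega}: Sigma={Sigma:.4f}, Lam={Lam:.4f}, "
          f"gain (24)={gain:.4f}, Sigma/Lam={ratio:.4f}")
```

[SIGMA] and [LAMBDA] scale one for one with [OMEGA], so both the gain evaluated from Equation (24) and the ratio [SIGMA][[LAMBDA].sup.-1] are unchanged.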
Consider the RI problem with a fixed information-processing cost
([lambda]). From Equations (21) and (26), it is clear that in the
presence of correlated noise ([rho] > 0), [SIGMA] changes less than
proportionally when [OMEGA] changes because [partial derivative]
ln [SIGMA]/[partial derivative] ln [OMEGA] < 1, and consequently, the
change in [OMEGA] will also change [theta] because [LAMBDA] depends on the
interactions between [OMEGA] and [SIGMA].
B. The Multivariate Case
In the multivariate RI problem, it is more difficult to determine
the steady-state conditional variance-covariance matrix [SIGMA] because
it cannot be computed analytically. Here we follow Sims (2003) and
calculate the expected loss due to imperfect observations under RI.
Specifically, we assume that the value functions under full information
and imperfect information can be written as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
respectively. (13) We can compute the optimal [SIGMA] by minimizing
the expected welfare loss due to RI,
(27) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
subject to information-processing constraints. Note that to solve
this problem numerically, we need to use a two-stage procedure. (14)
First, under the LQG assumption, the certainty equivalence principle
applies and the decision rule under imperfect information,
(28) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
is independent of [SIGMA] or [LAMBDA]. We then use this decision
rule to determine [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
which depends on [SIGMA] and [LAMBDA]. Applying the welfare criterion
proposed in Equation (27), we can solve for optimal steady-state [SIGMA]
and [LAMBDA].
Solving the problem posed in Equation (27) is equivalent to solving
the semidefinite programming problem
(29) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
subject to
(30) -ln ([absolute value of [SIGMA]]) + ln ([absolute value of
[A.sup.T][SIGMA]A + [OMEGA]]) [less than or equal to]
2[kappa],
(31) [A.sup.T][SIGMA]A + [OMEGA] [greater than or equal to]
[SIGMA],
where Z = [F.sup.T]RF - [F.sup.T] W + [beta]([F.sup.T] [B.sup.T]
PBF + [F.sup.T] [B.sup.T] PA + [A.sup.T] PBF) (see Appendix SI,
Supporting Information, for the derivation). If the
positive-definiteness constraint on [A.sup.T] [SIGMA] A + [OMEGA] -
[SIGMA], Equation (31), does not bind, the first-order condition for [SIGMA]
can be written as
(32) [[SIGMA].sup.-1] = [(G[SIGMA][G.sup.T] +
[G.sub.0]).sup.-1] - Z/[lambda],
where G = [([A.sup.T]).sup.-1] A and [G.sub.0] =
[([A.sup.T]).sup.-1] [OMEGA][A.sup.-1]. We can then use standard methods
to solve Equation (32). When applied to a permanent income model in the
next section, we first solve this equation and then check whether in
fact Equation (31) is satisfied by the optimal solution of [SIGMA]. If
so, the problem is solved. (15)
After computing the optimal steady-state [SIGMA], we can then use
Equation (13) to determine the steady-state [LAMBDA] and Equation (14)
to determine the Kalman gain [theta]. Therefore, the key difference
between SE and RI is that in the SE problem we need to specify the
process of the noise first, whereas in the RI problem we need to first
specify the value of channel capacity that determines the steady-state
conditional variance of the state by solving the semidefinite
programming problem proposed in Equation (29) subject to Equations (30)
and (31). (16) Theoretically, it is clear that after solving an RI
problem, we can always reconstruct an SE problem using the resulting
endogenous noise due to RI as the input, and the two models are
observationally equivalent in this sense. However, it is difficult to
specify the process of the vector of noises ex ante when modeling the
multivariate SE problem. (17)
When modeling the multivariate RI problem we only need to set a
value for channel capacity and then compute optimal conditional
variance-covariance matrices of the state and the variance-covariance
matrices of the noise vector by solving the constrained semidefinite
minimization problem (29). Therefore, in the multivariate RI problem,
the agent's preference, budget constraint, and
information-processing constraints jointly determine the values of
[SIGMA], [LAMBDA], and [theta], whereas in the multivariate SE problem
given [LAMBDA], Equation (13) that is used to determine [SIGMA] and [theta]
only depends on the budget constraint. If the noise in SE is specified
exogenously, it may violate the optimality conditions for RI; for
example, Melosi (2009) showed that a particular estimated SE model does
not equate the marginal utility of attention across states, implying
that the variance-covariance matrix of the noise would not be consistent
with any channel capacity. Of course, obtaining the marginal utility of
attention requires solving the RI problem, so it will be difficult to
specify ex ante an SE problem consistent with RI. (18)
We now consider the different policy effects of RI and SE in the
multivariate case. We first assume that initially the SE and RI problems
have the same Kalman gain that generates the same dynamic behavior.
Suppose that the variance-covariance matrix of the exogenous shock,
[OMEGA], is scaled up because of a change in policy. (19) In the SE
problem with fixed [LAMBDA], Equations (17) and (18) imply that a change
of [OMEGA] will lead to a different solution for [SIGMA] and [theta],
that is, the change in policy will lead to a change in the model's
dynamics. In contrast, in the multivariate RI problem, as shown in
Equations (29)-(31), a change in [OMEGA] will have complicated effects
on [SIGMA], [LAMBDA], and [theta]. In other words, in the multivariate
case a change in policy will affect the model's behavior in both SE
and RI problems. (Note that in the univariate RI case with fixed capacity, the change in policy
does not change the model's dynamics.)
We next consider the effects of RI and SE in Case 2 (i.e., the SNR,
[OMEGA][[LAMBDA].sup.-1], is fixed in the SE problem). As before, we
assume that initially the SE and RI problems have the same Kalman gain.
To illustrate how a change in [OMEGA] affects the Kalman gain in RI and
SE problems, we multiply both sides of Equation (17) by [SIGMA]:
(33) [SIGMA][[LAMBDA].sup.-1] = I - [[A[SIGMA][A.sup.T]
[[SIGMA].sup.-1] +
([OMEGA][[LAMBDA].sup.-1])([LAMBDA][[SIGMA].sup.-1])].sup.-1],
where I is the identity matrix and we use the fact that
[OMEGA][[SIGMA].sup.-1] = ([OMEGA][[LAMBDA].sup.-1])
([LAMBDA][[SIGMA].sup.-1]). In Case 2, the policy has the same impact on
[OMEGA] and [LAMBDA] to keep the SNR fixed. Equation (33) clearly shows
that if the policy changes [SIGMA] and then
A[SIGMA][A.sup.T][[SIGMA].sup.-1], it will affect 0 =
[summation][[LAMBDA].sup.-1] even in Case 2. Multiplying [OMEGA] on both
sides of (17) gives
(34) [OMEGA][[LAMBDA].sup.-1] = [OMEGA][[SIGMA].sup.-1] -
[(A[SIGMA][A.sup.T][[OMEGA].sup.-1] + I).sup.-1],
which means that a change in [OMEGA] will lead to a different [SIGMA]
given that [OMEGA][[LAMBDA].sup.-1] is fixed. Note that in the univariate
case, A[SIGMA][A.sup.T][[SIGMA].sup.-1] = [A.sup.2], which means
that the policy has no impact on [theta], and the SE and RI problems
cannot be distinguished by the policy in Case 2, in which the SNR,
[OMEGA][[LAMBDA].sup.-1], is fixed.
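The algebra behind Equations (33) and (34) can be checked numerically.
The sketch below assumes that Equation (17) takes the form
Lambda^-1 = Sigma^-1 - (A Sigma A' + Omega)^-1, which is the form
consistent with Equation (47); the matrices are random test inputs, not
calibrated values, and all function and variable names are illustrative.

```python
import numpy as np

def random_spd(rng, n):
    """Random symmetric positive-definite matrix (test input only)."""
    m = rng.normal(size=(n, n))
    return m @ m.T + n * np.eye(n)

def identity_gaps(A, Sigma, Omega):
    """Largest absolute gap between the two sides of Eqs. (33) and (34)."""
    inv = np.linalg.inv
    I = np.eye(A.shape[0])
    Psi = A @ Sigma @ A.T + Omega
    Lambda_inv = inv(Sigma) - inv(Psi)   # ASSUMED form of Eq. (17)
    Lambda = inv(Lambda_inv)
    theta = Sigma @ Lambda_inv           # Kalman gain
    snr = Omega @ Lambda_inv             # signal-to-noise ratio
    rhs33 = I - inv(A @ Sigma @ A.T @ inv(Sigma) + snr @ (Lambda @ inv(Sigma)))
    rhs34 = Omega @ inv(Sigma) - inv(A @ Sigma @ A.T @ inv(Omega) + I)
    return np.abs(theta - rhs33).max(), np.abs(snr - rhs34).max()

rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(n, n))
Sigma = random_spd(rng, n)
# Adding Sigma keeps A Sigma A' + Omega - Sigma positive definite,
# so that the implied noise covariance Lambda is well defined.
Omega = Sigma + random_spd(rng, n)
gap33, gap34 = identity_gaps(A, Sigma, Omega)
print(gap33, gap34)  # both gaps should be at rounding-error level
```

Both identities are purely algebraic, so the check says nothing about
which variables a policy change moves; it only confirms that (33) and
(34) are equivalent restatements of the assumed Equation (17).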
IV. APPLICATIONS TO MODELS OF PERMANENT INCOME AND PRICE SETTING
In this section, we consider the effects of SE and RI for
consumption dynamics and their policy and welfare implications in an
otherwise standard permanent income model. As in the previous section we
first consider applications to the univariate case and then discuss
applications to the multivariate case.
We are not the first to examine this model. Sims (2003) examined
how RI affects consumption dynamics when the agent has only limited
capacity for processing information. Luo (2008) showed that the RI
permanent income model can be solved explicitly even if the income process is
not iid, and then examined how RI can resolve the well-known excess
smoothness and excess sensitivity puzzles; that model admits a reduction
to a single state variable. (20) To explore the attention allocation
problem explicitly, we also consider a simplified price-setting model
used in Woodford (2001) and Mackowiak and Wiederholt (2009), and argue
that this model can also help distinguish RI from SE.
A. The Univariate Permanent Income Model
Optimal Consumption. Following Luo (2008), we have the subsequent
univariate version of the standard permanent income model (Flavin 1981;
Hall 1978) in which households solve the dynamic consumption-savings
problem
(35) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
subject to
(36) [s.sub.t+1] = R[s.sub.t] - [c.sub.t] - [[zeta].sub.t+1],
where u([c.sub.t]) = - 1/2 [([bar.c] - [c.sub.t]).sup.2] is the
period utility function, [bar.c] > 0 is the bliss point, [c.sub.t] is
consumption,
(37) [s.sub.t] = [w.sub.t] + [1/R] [[infinity] summation over
(j=0)] [R.sup.-j] [E.sub.t] [[y.sub.t+j]]
is permanent income (the expected present value of lifetime
resources), consisting of financial wealth ([w.sub.t]) plus human wealth
(i.e., the expected discounted present value of current and future labor
income, y),
(38) [[zeta].sub.t+1] = 1/R [[infinity] summation over (j=t+1)]
[(1/R).sup.j-(t+1)] ([E.sub.t+1] - [E.sub.t]) [[y.sub.j]]
is the time (t + 1) innovation to permanent income with mean 0 and
variance [[omega].sup.2.sub.[zeta]], [w.sub.t] is cash-on-hand (or
market resources), [y.sub.t] is a general income process with Gaussian
white noise innovations, [beta] is the discount factor, and R is the
constant gross interest rate at which the consumer can borrow and lend
freely. (21) We assume y follows an AR(1) process with persistence
coefficient [rho] [member of] [0, 1], [y.sub.t+1] = [rho][y.sub.t] +
[[epsilon].sub.t+1], where [[epsilon].sub.t+1] ~ N (0, [[omega].sup.2]),
so that [s.sub.t] = [w.sub.t] + [y.sub.t]/ (R - [rho]) and
[[zeta].sub.t+1] = [[epsilon].sub.t+1]/ (R - [rho]). (22) For the rest of
the paper we will
restrict attention to points where [c.sub.t] < [bar.c], so that
utility is increasing and concave; following the literature we impose
the restriction that [beta]R = 1, because it implies a stationary path
for consumption. This specification follows that in Hall (1978) and
Flavin (1981) and implies that optimal consumption is determined solely
by permanent income:
(39) [c.sub.t] = (R-1)[s.sub.t].
Within this LQG setting, the certainty equivalence principle holds
and introducing SE or RI leads to the following new consumption function:
(40) [c.sub.t] = (R - 1)[[hat.s].sub.t],
where [[hat.s].sub.t] = [E.sub.t] [[s.sub.t]] is the perceived state
and is governed by the following Kalman filtering equation
(41) [[hat.s].sub.t+1] = (1 - [theta]) (R[[hat.s].sub.t] - [c.sub.t]) +
[theta] ([s.sub.t+1] + [[xi].sub.t+1]),
where [theta] is the Kalman gain, given [s.sub.0] ~ N
([[hat.s].sub.0], [[SIGMA].sub.0]). As shown in Luo (2008), combining
Equations (36), (40), and (41) yields the following expression for the
change in consumption:
(42) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where L is the lag operator. We require (1 - [theta])[R.sup.2] <
1, the model equivalent of the stabilizability condition stated before
(this condition implies (1 - [theta])R < 1 since R > 1). This
MA([infinity]) process shows that the dynamic behavior of the model is
strongly influenced by the Kalman gain [theta]. Using the explicit
expression for consumption growth (42), we can compute the key
stochastic properties of the consumption process: the volatility of
consumption growth, the persistence of consumption growth, and the
correlation between consumption growth and income shocks. (23) All these
moments depend on the Kalman gain. In other words, SE and RI lead to
different consumption processes if and only if the resulting [theta]
differs.
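The role of the Kalman gain can be illustrated by iterating the state
equation (36), the consumption rule (40), and the filter (41) directly.
The sketch below is illustrative only: it switches the observation noise
off, feeds in a single innovation that raises the permanent-income state
by one unit at date 0, and uses hypothetical names throughout. Under
these assumptions the response of consumption growth j periods later
works out to theta (R - 1) ((1 - theta) R)^j, which the usage lines
compare with the recursion.

```python
def consumption_growth_irf(theta, R=1.02, horizon=12):
    """Consumption-growth response to a unit innovation in the state s.

    Iterates Eqs. (36), (40) and (41) with the noise term set to zero.
    'shock' is the realized change in s caused by the innovation.
    """
    s_prev = s_hat_prev = c_prev = 0.0
    growth = []
    for t in range(horizon):
        shock = 1.0 if t == 0 else 0.0
        s = R * s_prev - c_prev + shock                    # true state
        s_hat = (1.0 - theta) * (R * s_hat_prev - c_prev) + theta * s
        c = (R - 1.0) * s_hat                              # Eq. (40)
        growth.append(c - c_prev)
        s_prev, s_hat_prev, c_prev = s, s_hat, c
    return growth

def closed_form_irf(theta, R=1.02, horizon=12):
    """theta (R - 1) ((1 - theta) R)^j for j = 0, ..., horizon - 1."""
    return [theta * (R - 1.0) * ((1.0 - theta) * R) ** j for j in range(horizon)]

for theta in (0.62, 0.73, 1.0):
    irf = consumption_growth_irf(theta)
    print(theta, [round(x, 5) for x in irf[:4]])
```

With theta = 1 the whole adjustment occurs on impact, as in the
full-information model; a smaller gain spreads the same shock over many
periods.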
It is worth noting that the representative-agent RI model can be
regarded as an extreme case of an RI model with a continuum of agents.
For example, consider two PIH models: one with a representative consumer
and one with a continuum of consumers endowed with identical channel
capacity. While the effect of RI on the reaction of consumption to
income shocks is the same in the two PIH environments, the volatility of
aggregate consumption depends on the specifics of the aggregative
mechanism. As argued in Sims (2003), although the randomness in an
agent's response to aggregate shocks will be idiosyncratic because
it arises from the agent's own information-processing channel,
there is likely a significant common component. Therefore, the common
term of the noise lies between 0 and the part of the RI-induced noise.
We assume that [[xi].sub.t] consists of two independent
noises: [[xi].sub.t] = [[bar.[xi]].sub.t] + [[xi].sup.i.sub.t], where
[[bar.[xi]].sub.t] = [E.sup.i] [[[xi].sub.t]] and [[xi].sup.i.sub.t] are
the common and idiosyncratic components of the error generated by
[[zeta].sub.t], respectively. A single parameter, [lambda] =
[var ([[bar.[xi]].sub.t])/var ([[xi].sub.t])] [member of] [0, 1], can be
used to measure the common source of coded information on the aggregate
component (or the relative importance of [[bar.[xi]].sub.t] vs.
[[xi].sub.t]). (Sims 2003 argued that the presence of the common noise
can help fit the model with the VAR evidence better.) Given the existing
theories on this issue, we cannot pin down the value of [lambda].
Therefore, the [lambda] = 1 case can be viewed as a special case of the
model with a continuum of consumers that is equivalent to the
representative-agent model, in which we do not need to discuss the
aggregation issue.
Policy Implications under SE and RI. In this univariate permanent
income model, substituting A = R into Equation (19) gives
(43) [[omega].sup.2.sub.[zeta]][[LAMBDA].sup.-1] =
[[omega].sup.2.sub.[zeta]][[SIGMA].sup.-1] -
[([R.sup.2][([[omega].sup.2.sub.[zeta]][[SIGMA].sup.-1]).sup.-1] + 1).sup.-1],
where [[omega].sup.2.sub.[zeta]][[SIGMA].sup.-1] =
([[omega].sup.2.sub.[zeta]][[LAMBDA].sup.-1])([LAMBDA][[SIGMA].sup.-1]).
Denoting [theta] = [SIGMA][[LAMBDA].sup.-1] and [pi] =
[[omega].sup.2.sub.[zeta]][[LAMBDA].sup.-1], we obtain the
following relation between [pi] and [theta]:
(44) [pi] = [theta] (1/[1-[theta]] - [R.sup.2])
Solving for [theta] yields
(45) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where we omit the negative root of [theta] as both [SIGMA] and
[LAMBDA] must be positive. Figure 1 illustrates the relationship between
[pi] and [theta] given R = 1.02 and [pi] [member of] [0.1, 10]. It
clearly shows that [theta] is an increasing function of [pi], the
signal-to-noise ratio. Also, as [pi] [right arrow] [infinity], [theta]
[right arrow] 1.
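Because the closed form in Equation (45) is not reproduced here, the
following sketch shows how a positive root can be recovered from
Equation (44) alone. It is a reconstruction under the assumption that
(45) is simply the positive solution of (44), not a transcription of the
original expression.

```latex
% Multiply (44) by (1 - \theta) and collect terms in \theta:
\pi(1-\theta) = \theta - R^{2}\theta(1-\theta)
\quad\Longrightarrow\quad
R^{2}\theta^{2} + \left(1 + \pi - R^{2}\right)\theta - \pi = 0 .

% The product of the two roots is -\pi / R^{2} < 0, so exactly one root
% is positive:
\theta = \frac{-\left(1+\pi-R^{2}\right)
         + \sqrt{\left(1+\pi-R^{2}\right)^{2} + 4R^{2}\pi}}{2R^{2}} .
```

With R = 1.02 this root gives [theta] of approximately 0.73 at [pi] = 2
and approximately 0.62 at [pi] = 1, which agrees with the values
reported in the policy experiment in this section.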
[FIGURE 1 OMITTED]
In the RI version of the permanent income model, we have
(46) [SIGMA] = [OMEGA]/[exp(2[kappa]) - [R.sup.2]],
(47) [LAMBDA] = [([[SIGMA].sup.-1] - [[PSI].sup.-1]).sup.-1],
where [PSI] = [R.sup.2][SIGMA] + [OMEGA]. Using (46) and (47),
the Kalman filter gain under RI can be written as
(48) [theta] = [SIGMA][[LAMBDA].sup.-1] = 1 - 1/exp(2[kappa]).
Comparing Equations (45) with (48), it is clear that the
signal-to-noise ratio ([pi]) and the level of channel capacity ([kappa])
have one-to-one correspondence. Figure 2 shows the relationship between
[kappa] and [pi] when the SE and RI problems are observationally
equivalent in the sense that they lead to the same consumption dynamics
governed by the Kalman gain [theta], that is, [pi] = (1 - exp
(-2[kappa])) (exp (2[kappa]) - [R.sup.2]). This result is consistent
with the general conclusion we obtained using Case 2 in the previous
section.
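The mapping between [kappa] and [pi] stated above follows in one step
from Equations (44) and (48). The lines below spell out that step as a
sketch of the intermediate algebra.

```latex
\theta = 1 - e^{-2\kappa}
\;\Longrightarrow\;
\frac{1}{1-\theta} = e^{2\kappa},
\qquad\text{so}\qquad
\pi = \theta\left(\frac{1}{1-\theta} - R^{2}\right)
    = \left(1 - e^{-2\kappa}\right)\left(e^{2\kappa} - R^{2}\right).
```

The implied [pi] is positive only when exp(2[kappa]) > [R.sup.2], the
same condition that keeps the conditional variance in Equation (46)
positive.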
[FIGURE 2 OMITTED]
Using the same expression for [theta], (45), we can examine how
Case 1 can be used to distinguish SE and RI when implementing a change
in government policy. Specifically, in the SE problem, we assume that
before the government implements stabilization policies, the SNR [pi] =
[[omega].sup.2.sub.[zeta]]/[LAMBDA] = 2. In this case, [theta] = 0.73.
After the government implements these policies, the variance of the
shock to permanent income will be reduced from
[[omega].sup.2.sub.[zeta]] to 0.5 [[omega].sup.2.sub.[zeta]]. As
[LAMBDA] is fixed in Case 1, [pi] will fall from 2 to 1; consequently,
[theta] = 0.62. We now assume that the RI and SE problems are
observationally equivalent in the sense that they lead to the same
[theta] = 0.73 before implementing the stabilization policies. After
implementing these policies, [[omega].sup.2.sub.[zeta]] will be scaled
down to 0.5 [[omega].sup.2.sub.[zeta]], and the RI theory predicts that
both [SIGMA] and [LAMBDA] will be scaled down to 0.5[SIGMA] and 0.5[LAMBDA],
respectively. (24) Consequently, the Kalman filter gain, [theta] =
[SIGMA][[LAMBDA].sup.-1] = 0.73, remains unchanged. In other words,
stabilization policy has different implications for consumption dynamics
in the SE and RI models.
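The policy experiment can be reproduced with a few lines of code. The
sketch below is illustrative: it solves Equation (44) for [theta] as the
positive root of the implied quadratic (an assumption about the form of
Equation (45)), uses Equations (46)-(48) for the RI case with [kappa]
measured in nats, and normalizes the pre-policy shock variance to one (a
hypothetical choice that does not affect the gains). Function names are
our own.

```python
import math

def theta_from_pi(pi, R=1.02):
    """Positive root of R^2*theta^2 + (1 + pi - R^2)*theta - pi = 0,
    i.e. Eq. (44) rearranged as a quadratic in theta."""
    b = 1.0 + pi - R**2
    return (-b + math.sqrt(b * b + 4.0 * R**2 * pi)) / (2.0 * R**2)

def ri_steady_state(omega2, kappa, R=1.02):
    """Sigma, Lambda and theta implied by Eqs. (46)-(48); kappa in nats."""
    sigma = omega2 / (math.exp(2.0 * kappa) - R**2)
    psi = R**2 * sigma + omega2
    lam = 1.0 / (1.0 / sigma - 1.0 / psi)
    return sigma, lam, sigma / lam

R = 1.02
# SE, Case 1: Lambda is fixed, so halving the shock variance halves pi.
theta_se_before = theta_from_pi(2.0, R)
theta_se_after = theta_from_pi(1.0, R)

# RI: choose the capacity that reproduces the pre-policy gain (Eq. 48 inverted).
kappa = -0.5 * math.log(1.0 - theta_se_before)
omega2 = 1.0  # hypothetical normalization of the shock variance
sigma0, lam0, theta_ri_before = ri_steady_state(omega2, kappa, R)
sigma1, lam1, theta_ri_after = ri_steady_state(0.5 * omega2, kappa, R)

print(f"SE: theta {theta_se_before:.2f} -> {theta_se_after:.2f}")
print(f"RI: theta {theta_ri_before:.2f} -> {theta_ri_after:.2f}")
print(f"RI: Sigma ratio {sigma1 / sigma0:.2f}, Lambda ratio {lam1 / lam0:.2f}")
```

Under SE the gain falls because [LAMBDA] stays put while the shock
variance shrinks; under RI both [SIGMA] and [LAMBDA] shrink in
proportion, so their ratio is unchanged.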
Alternatively, if we assume that the cost of information processing
([lambda]) is fixed, the optimal conditional variance equals
(49) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Comparing Equations (49) with (46), it is clear that the two
modeling strategies are observationally equivalent in the sense that
they lead to the same conditional variance if the following equality
holds:
(50) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
In this case, the Kalman gain is
(51) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
After implementing the stabilization policies,
[[omega].sup.2.sub.[zeta]] is scaled down to 0.5
[[omega].sup.2.sub.[zeta]] (i.e., the economy switches
to a more stable environment) and the fixed [lambda] theory predicts
that the Kalman filter gain, [theta] = [SIGMA][[LAMBDA].sup.-1], is
reduced. For example, before the government implements stabilization
policies, we have [lambda]/[[omega].sup.2.sub.[zeta]] = 0.000135 and
[theta] = 0.79. After the policy, we can easily calculate that [theta] =
0.68 using Equation (51). Figure 3 plots the different implications of
SE and RI for consumption dynamics after implementing the stabilization
policy: consumption growth falls more (less) under SE than RI when
[kappa] is fixed (when [lambda] is fixed), since the Kalman gain
decreases (increases). The intuition behind this result is simple. In
the fixed [lambda] case some capacity will be reallocated to other
sources because a reduction in macroeconomic uncertainty leads to
smaller welfare losses due to RI.
[FIGURE 3 OMITTED]
Implications for Consumption Volatility in Emerging and Developed
Countries. Aguiar and Gopinath (2007) consider two groups of small
economies (emerging and developed economies), and find that consumption
is more volatile than income at business cycle frequencies for emerging
markets, as compared to a ratio of less than one for developed markets.
They show that a small open economy (SOE) real business cycle (RBC)
model driven primarily by shocks to trend growth can explain this
regularity about the relative volatility of consumption to income in
emerging markets well.
However, using long time-series data covering 1900-2005, Garcia-Cicco,
Pancrazi, and Uribe (2010) estimate an RBC model driven by the same
shocks considered in Aguiar and Gopinath (2007), and find that the model
does a poor job explaining the observed business cycle fluctuations in
Argentina and Mexico along a number of dimensions including the relative
volatility of consumption to income. In this section, we briefly show
that the RI model with a fixed information-processing cost can explain
the observed difference in consumption volatility in emerging and
developed countries.
The PIH model presented in Section A can be regarded as an SOE
model in which the constant interest rate is given exogenously and there
is a continuum of consumers in the model economy. Using Equation (42)
and assuming that all idiosyncratic errors are canceled out after
aggregation, the relative volatility of consumption growth to income
growth can be written as
(52) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where sd(*) denotes standard deviation. It is straightforward to
show that the relative consumption volatility is decreasing with the
degree of imperfect state observations because [partial
derivative][mu]/[partial derivative][theta] < 0. As shown in Luo,
Nie, and Young (2012), if there is no imperfect-state-observation
assumption (i.e., [theta]=1), the model cannot generate the empirical
relative consumption volatility. For example, if R = 1.04, the full
information model predicts that [mu] in emerging and developed economies
would be 0.28 and 0.24, respectively. In contrast, in the data, the
corresponding [mu] values are 1.35 and 0.98, respectively. (25) In the
RI model with fixed capacity, [theta] is uniquely determined by fixed
capacity [kappa] and thus has no impact on the cross-country comparison
if emerging and developed countries have the same average amount of
channel capacity. In contrast, if we adopt the fixed
information-processing cost assumption, Equations (51) and (52) can have
the potential to generate the observed difference in consumption
volatility in emerging and developed countries because [theta] is an
increasing function of income uncertainty and income uncertainty in
emerging countries is much higher than that in developed countries (sd
([DELTA]y)/mean(y) is 3.82 in emerging countries, while it is 2.07 in
developed countries). Intuitively, in developed countries consumers pay
less attention to macroeconomic conditions because the fundamental
uncertainty is low; consequently, the aggregate consumption process in
these countries is more stable relative to the income process. (26)
General Equilibrium Asset Pricing Implications. The PIH model
presented in Section A is usually regarded as a partial equilibrium
model. However, as noted in Hansen (1987) and Cochrane (2005), it is not
a partial equilibrium result--it is a general equilibrium model with a
linear production technology and an exogenous income process. Given the
expression of optimal consumption in terms of the state variables
derived from the PIH model with imperfect-state-observation, we can
price assets by treating the process of aggregate consumption that
solves the model as though it was an endowment process. In this setup,
equilibrium prices are shadow prices that leave the agent content with
that endowment process.
In the model setting specified in Section A, w can be regarded as
capital. R can be regarded as the return on the linear technology and is
not yet the interest rate (the equilibrium rate of return on one-period
claims to consumption). As proposed in Cochrane (2005) and used in Luo
and Young (2010b), after finding optimal consumption as in Equation
(40), we can price one-period claims using this equilibrium consumption
stream. Denoting the risk-free rate by [R.sup.f], we have the following
Euler equation:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [E.sub.t] [*] is the consumer's expectation operator
conditional on his/her processed information at time t. We can now use
the basic pricing equation, p = E[mx], (27) to compute the price of the
stream of aggregate consumption (treated as the stream of endowments) as
(28)
(53) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where [m.sub.t,t+j] [equivalent to] [[beta].sup.j]
[u'([c.sub.t+j])/u'([c.sub.t])] is the stochastic discount
factor, and [XI] [equivalent to] [[summation].sup.[infinity].sub.j=1]
([[beta].sup.j] [var.sub.t] [[c.sub.t+j]]) =
[(2-[theta])R/(1-[R.sup.2](1-[theta]))] [[omega].sup.2.sub.[zeta]].
Denoting the risk-neutral component by [p.sup.rn.sub.t] and the
risk-adjusted component by [p.sup.ra.sub.t], we have
(54) [p.sup.rn.sub.t] = [1/(R-1)][c.sub.t]
and
(55) [p.sup.ra.sub.t] = [1/([bar.c] - [c.sub.t])]
[(2-[theta])R/(1-[R.sup.2](1-[theta]))][[omega].sup.2.sub.[zeta]].
Equation (53) yields the following implications. The first term in
Equation (53) is the risk-neutral component denoted by [p.sup.rn.sub.t].
This term can be regarded as the value of a perpetuity paying [c.sub.t].
The second term is the risk-adjusted component, [p.sup.ra.sub.t]; it
lowers the asset price relative to the risk-neutral level because
[c.sub.t] [less than or equal to] [bar.c] and it is decreasing with the
degree of attention ([theta]).
From Equations (53) and (55), it is clear that the Kalman gain
([theta]) also plays a key role in determining the general equilibrium
asset prices under both SE and RI. Consider the same policy experiment
we discussed in the last section. If the economy switches to a more
stable environment as a result of stabilization policy (i.e.,
[[omega].sup.2.sub.[zeta]] is scaled down to 0.5
[[omega].sup.2.sub.[zeta]]), the fixed [lambda] assumption predicts that
the Kalman filter gain, [theta] = [SIGMA][[LAMBDA].sup.-1], is
reduced. Given that [lambda]/[[omega].sup.2.sub.[zeta]] = 0.000135 and
[theta] = 0.79 before the policy, we can easily calculate that [theta] =
0.68 after the policy using Equation
(51). Equation (55) clearly shows that asset price falls less (more)
under SE than RI when [kappa] is fixed (when [lambda] is fixed), since
the Kalman gain decreases (increases).
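The dependence of the two price components on the Kalman gain can be
tabulated directly from Equations (54) and (55). The sketch below is
illustrative: the values of consumption, the bliss point, and the shock
variance are hypothetical, the shock variance is held fixed across the
two gains so that only the effect of [theta] is visible, and the
function name is our own.

```python
def price_components(c, c_bar, theta, omega2_zeta, R=1.02):
    """Risk-neutral and risk-adjusted components from Eqs. (54)-(55)."""
    if not c < c_bar:
        raise ValueError("requires c < c_bar (below the bliss point)")
    if not (1.0 - theta) * R**2 < 1.0:
        raise ValueError("requires (1 - theta) * R^2 < 1")
    p_rn = c / (R - 1.0)                                   # Eq. (54)
    scale = (2.0 - theta) * R / (1.0 - R**2 * (1.0 - theta))
    p_ra = scale * omega2_zeta / (c_bar - c)               # Eq. (55)
    return p_rn, p_ra

# Hypothetical inputs, chosen only for illustration.
c, c_bar, omega2_zeta = 1.0, 2.0, 0.01
for theta in (0.79, 0.68):
    p_rn, p_ra = price_components(c, c_bar, theta, omega2_zeta)
    print(f"theta={theta:.2f}: p_rn={p_rn:.2f}, p_ra={p_ra:.5f}")
```

For a given shock variance, the lower gain yields the larger
risk-adjusted component, in line with the statement that this component
is decreasing in the degree of attention. In the policy experiment the
shock variance changes as well, so the net effect on the price combines
both channels.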
B. The Multivariate Permanent Income Model
In this section, we solve for optimal steady-state [SIGMA] and
[LAMBDA] in a parametric multivariate RI permanent income model and then
illustrate the differences between RI and SE problems. This example is
similar to that discussed in Sims (2003) and considers multiple income
shocks with different stochastic properties. Specifically, we assume
that the original budget constraint is as follows
(56) [w.sub.t+1] = R[w.sub.t] - [c.sub.t] + [y.sub.t+1],
where [w.sub.t] is the amount of cash-on-hand, and the income
process [y.sub.t] has two persistent components (x and z) and one
transitory component ([[epsilon].sub.y,t]):
[y.sub.t] = [bar.y] + [x.sub.t] + [z.sub.t] + [[epsilon].sub.y,t],
[x.sub.t] = 0.99[x.sub.t-1] + [[epsilon].sub.x,t], [z.sub.t] =
0.95[z.sub.t-1] + [[epsilon].sub.z,t],
with
(57) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [x.sub.t] is the most persistent and smooth component and
[[epsilon].sub.y,t] is the most transitory and volatile component. For
the quadratic utility function u([c.sub.t]) = - 1/2 [([c.sub.t] -
[bar.c]).sup.2], using the first welfare criterion (27) provided in
Section B and setting [beta] = 0.95, we can compute that
(58) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
when capacity [kappa] = 2.2 bits, which can be used to compute the
variance of the noise [LAMBDA] using [[LAMBDA].sup.-1] =
[[SIGMA].sup.-1] - [[PSI].sup.-1], and then compute the Kalman gain
according to [theta] = [SIGMA][[LAMBDA].sup.-1]. It is clear from
Equation (58) that owing to the low capacity devoted to monitoring the
state, the post-observation variances (i.e., the conditional variances)
of both the x and z components are greater than the corresponding
innovation variances in Equation (57). More importantly, the conditional
variance of the slow-moving x component is 18 times larger than its
corresponding innovation variance, whereas that of the fast-moving z
component is only 2 times larger than its innovation variance. (29) The
intuition behind this result is that the optimizing agent devotes much
less capacity to monitoring the slow-moving component, which leads to
greater effects on the conditional variance term. Figure 4 plots the
impulse responses of consumption to the income shocks and noises. It
shows that consumption reacts to the income shocks gradually and with
delay, and reacts to the corresponding noises promptly. In addition, we
can see that the response of consumption to the slow-moving x component
is much more damped than that to the fast-moving z component. It is also
worth noting that as the agent only cares about the trace of Z[SIGMA]
and the symmetric matrix Z is negative semidefinite, the agent with low
capacity will choose to make the post-observations of the states be
negatively correlated. This correlation conserves capacity by permitting
some information about each state to be transmitted using a single nat.
[FIGURE 4 OMITTED]
When we relax the information-processing capacity and increase
[kappa] to 2.8 nats, the conditional covariance matrix becomes
(59) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Comparing Equations (58) with (59), we can see that relaxing
information-processing capacity has the largest impact on the
conditional variance of the endogenous state variable w: the
post-observation variance of w is reduced to about half the initial
value. The intuition behind this result is that the endogenous variable
plays the most important role in the welfare losses due to RI. To see
this clearly, the matrix Z is displayed here:
(60) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
While w per unit has less of an effect on welfare, it is
proportionally much larger than either of the other two state variables.
It is also clear that as the information constraint is relaxed the agent
chooses to allocate more capacity to monitoring the slow-moving
component x than to monitoring the z component.
Note that in the RI problem, (58) is optimal in the sense that it
minimizes the expected welfare losses due to finite
information-processing capacity by allocating fixed capacity optimally
across different elements in the state vector. In contrast, in the SE
problem, [LAMBDA] must be specified first and then [SIGMA] and [theta]
can be computed. However, it is difficult to specify [LAMBDA] without
prior knowledge about the states. Ad hoc assumptions on [LAMBDA] might be
inconsistent with the underlying efficiency conditions. Therefore, RI
could provide a useful way to specify the stochastic properties of the
noises by solving the agent's optimization problem subject to
information constraints. As we have noted previously, Melosi (2009)
presents an application of this idea; he notes that a particular
estimated model shows that the marginal utility of information is not
equated across variables and is thus inconsistent with RI (i.e.,
inconsistent with any value for [kappa]).
C. Attention Allocation in a Price-Setting Model
In the multivariate permanent income model we discussed in the
previous section, the difference between the RI and SE problems comes
from the optimal allocation of capacity, and the consumption reactions
to different income shocks do depend on optimal attention allocation.
However, in this setting, the optimal attention/capacity allocation
cannot be addressed explicitly because there exists an endogenous state
variable and attention allocation between the endogenous state variable
(w) and exogenous state variables (persistent components in y) is
nonsymmetric and complicated. In contrast, as has been shown in
Mackowiak and Wiederholt (2009), the attention allocation problem
between the exogenously given aggregate and idiosyncratic shocks has an
explicit solution. In this section, we present a simplified version of
their price-setting model.
Consider a firm price-setting model with monopolistic competition
in which firms continuously observe noisy signals about two random
variables: one aggregate variable and one firm-specific variable. Under
the full-information assumption, the profit-maximizing price (in logs)
of firm i, [p.sup.f.sub.i], can be written as
(61) [p.sup.f.sub.i] = p + [[alpha].sub.x] x + [[alpha].sub.z]
[z.sub.i],
where p is the log of the aggregate price level, x is the log of
aggregate output, [z.sub.i] is an idiosyncratic demand shock, and
[[alpha].sub.x] and [[alpha].sub.z] are coefficients that depend on
structural parameters in the profit function (the superscript f in the
price function indicates full information). All the variables on the RHS
are
assumed to be normally distributed. Since the sum of two normal
variables is also normal, we can summarize the aggregate condition as y
= p + [[alpha].sub.x] x such that
(62) [p.sup.f.sub.i] = y + [[alpha].sub.z] [z.sub.i],
where y and [z.sub.i] are assumed to be Gaussian variables with mean 0 and
variances [[sigma].sup.2.sub.y] and [[sigma].sup.2.sub.z], respectively.
Under RI, the typical firm cannot observe y and [z.sub.i] perfectly, so
all it can observe are noisy signals [y.sup.*] on the aggregate state
and [z.sup.*] on the idiosyncratic state:
(63) [y.sup.*] = y + [[xi].sub.y] and [z.sup.*] = [z.sub.i] +
[[xi].sub.z],
where the noises due to RI, [[xi].sub.y] and [[xi].sub.z] are
Gaussian variables with mean 0 and variances [[omega].sup.2.sub.y] and
[[omega].sup.2.sub.z], respectively. They are assumed to be independent
of each other and uncorrelated across firms. (30) Given the
observed independent noisy signals, the optimal price can be written as
(64) [p.sup.*.sub.i] = E [y|[y.sup.*]] + [[alpha].sub.z] E
[[z.sub.i] |[z.sup.*]].
Because [p.sup.*.sub.i] is different from the full-information
solution [p.sup.f.sub.i], the firm suffers profit losses from RI. The
profit loss function of the firm can be written as
(65) [DELTA][pi] = [gamma]/2[([p.sup.f.sub.i] -
[p.sup.*.sub.i]).sup.2],
where [gamma] > 0. Given Equations (64) and (65), the attention
allocation problem can be reduced to
(66) min var [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
subject to the information-processing constraint (IPC):
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
which can be reduced to [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE
IN ASCII], where [[sigma].sup.2.sub.y|[y.sup.*]] and
[[sigma].sup.2.sub.z|[z.sup.*]] are the posterior
conditional variances of y and [z.sub.i], respectively, and we use the
facts that (1) for a quadratic objective function and Gaussian state
variables, it is optimal to choose the joint density of the states and
noisy signals to be also Gaussian and (2) the aggregate and
idiosyncratic variables are uncorrelated. After using the IPC to
substitute out [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], we
obtain the optimal solution for [MATHEMATICAL EXPRESSION NOT
REPRODUCIBLE IN ASCII]:
(67) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
which gives the ratio of posterior to prior precision of beliefs
about the aggregate condition under the optimal attention allocation.
(31) Equation (67) provides several important implications for the
optimal attention allocation to the aggregate condition. First, greater
values of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] mean that
the firm pays more attention to the aggregate condition. Second, given
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is increasing with
the relative importance of the prior variances of the aggregate and
firm-specific conditions. As the firm-specific shock is 10 times more
volatile than the aggregate shock as calibrated in Mackowiak and
Wiederholt (2009), the firm optimally pays much more attention to the
firm-specific shock.
Similarly, we can obtain optimal [MATHEMATICAL EXPRESSION NOT
REPRODUCIBLE IN ASCII]
(68) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
Finally, using Equations (67) and (68), we can easily recover the
variances of the noises, [[omega].sup.2.sub.y] and
[[omega].sup.2.sub.z],
(69) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
(70) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
In the corresponding SE problem in which [[omega].sup.2.sub.y] and
[[omega].sup.2.sub.z] are given exogenously and arbitrarily, using
Equations (69) and (70), we can recover the posterior conditional
variances, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], which
may not be optimal in the sense that they do not lead to the minimum
welfare loss, var ([p.sup.f.sub.i] | [y.sup.*], [z.sup.*]). The argument
in this model is similar to that in the permanent income model with an
endogenous state variable in the previous section. (32)
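The attention allocation problem can be sketched numerically. Because
the expressions in Equations (66)-(70) are not reproduced here, the code
below rests on explicit assumptions: the objective is the posterior
variance of y plus the squared coefficient on [z.sub.i] times the
posterior variance of [z.sub.i]; the IPC is the standard Gaussian form
0.5 ln(prior/posterior variance) summed over the two states, bounded by
[kappa] in nats; and noise variances are backed out from the usual
Gaussian updating formula. All names and the numerical inputs are
hypothetical.

```python
import math

def allocate_attention(sigma2_y, sigma2_z, alpha_z, kappa):
    """Posterior variances (v_y, v_z) minimizing v_y + alpha_z^2 * v_z
    subject to 0.5*ln(sigma2_y/v_y) + 0.5*ln(sigma2_z/v_z) <= kappa,
    with v_y <= sigma2_y and v_z <= sigma2_z (ASSUMED problem form)."""
    if alpha_z == 0.0 or kappa < 0.0:
        raise ValueError("requires alpha_z != 0 and kappa >= 0")
    a = abs(alpha_z)
    root = math.sqrt(sigma2_y * sigma2_z * math.exp(-2.0 * kappa))
    v_y, v_z = a * root, root / a        # interior: v_y = alpha_z^2 * v_z
    if v_y > sigma2_y:                   # corner: all capacity goes to z
        v_y, v_z = sigma2_y, sigma2_z * math.exp(-2.0 * kappa)
    elif v_z > sigma2_z:                 # corner: all capacity goes to y
        v_y, v_z = sigma2_y * math.exp(-2.0 * kappa), sigma2_z
    return v_y, v_z

def noise_variance(prior, posterior):
    """Signal-noise variance that turns 'prior' into 'posterior'."""
    if posterior >= prior:
        return math.inf                  # an uninformative signal
    return 1.0 / (1.0 / posterior - 1.0 / prior)

# Hypothetical inputs: the firm-specific shock has ten times the standard
# deviation of the aggregate shock.
sigma2_y, sigma2_z, alpha_z, kappa = 1.0, 100.0, 1.0, 3.0
v_y, v_z = allocate_attention(sigma2_y, sigma2_z, alpha_z, kappa)
kappa_y = 0.5 * math.log(sigma2_y / v_y)
kappa_z = 0.5 * math.log(sigma2_z / v_z)
print(f"capacity on aggregate: {kappa_y:.3f}, on firm-specific: {kappa_z:.3f}")
print(f"noise variances: {noise_variance(sigma2_y, v_y):.3f}, "
      f"{noise_variance(sigma2_z, v_z):.3f}")
```

In the SE counterpart one would fix the two noise variances in advance
and compute the posterior variances from them; unless the chosen pair
happens to coincide with the output of the allocation step, the implied
loss exceeds the minimum.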
We conclude this section with a point regarding the importance of
distinguishing between SE and RI. In the context of this model, Paciello
and Wiederholt (2013) show how the distinction between SE and RI matters for
optimal policy. As under RI the variance-covariance matrix of the noises
will shift with monetary policy, the central bank will use that fact to
improve allocations--specifically, they find that the divine coincidence
of monetary policy (no trade-off between inflation and output
stabilization) holds under RI but not under SE. Of course, this result
does not help us distinguish between the two theories, but rather
provides a motive for doing so.
V. DISCUSSION ON THE CONTINUOUS-TIME SETTING
So far we have focused on the discrete-time setting in which agents
are assumed to have imperfect state information. In this section we
discuss the different implications of SE and RI in the continuous-time
setting. For simplicity, we use the permanent income model as an example
to illustrate the key different implications of the two information
frictions. In the continuous-time version of the permanent income model,
the typical consumer is assumed to maximize the following lifetime
utility:
(71) [E.sub.0] [[[integral].sup.[infinity].sub.t=0]
exp(-[beta]t)u([c.sub.t])dt],
subject to the following budget constraint:
(72) d[w.sub.t] = (r[w.sub.t] + [y.sub.t] - [c.sub.t]) dt,
where u([c.sub.t]) = - [([bar.c] - [c.sub.t]).sup.2] /2, and all
the variables here have the same definitions as those in the last
section. Labor income ([y.sub.t]) is assumed to follow a continuous-time
AR(1) (Ornstein-Uhlenbeck) process:
(73) d[y.sub.t] = [rho] ([mu]/[rho] - [y.sub.t]) dt +
[sigma]d[[zeta].sub.t],
where the steady-state income is [bar.y] = [mu]/[rho], [rho] > 0
governs the speed of convergence to the steady state, and
[[zeta].sub.t] is a standard Brownian motion.
Just like the discrete-time model, to simplify the model, we adopt
the same state-space reduction approach and define a new state [s.sub.t]
= [w.sub.t] + [y.sub.t] / (r + [rho]). Consequently, the new state
transition equation can be written as:
(74) d[s.sub.t] = (r[s.sub.t] - [c.sub.t]) dt +
[[sigma].sub.s]d[[zeta].sub.t],
where [[sigma].sub.s] = [sigma]/ (r + [rho]). In the
full-information case, it is straightforward to show that the
consumption function is:
(75) [c.sub.t] = r[s.sub.t].
We assume that the consumer cannot observe the state perfectly and
only observes a noisy signal:
(76) [s.sup.*.sub.t] = [s.sub.t] + [[xi].sub.t],
where [[xi].sub.t] is a standard Brownian motion with mean 0 and
variance [LAMBDA], and is independent of [[zeta].sub.t]. To solve the
model, we rewrite Equation (76) in the following differential form:
d[s.sup.*.sub.t] = d[s.sub.t] + d[[xi].sub.t].
In this continuous-time LQG setting, the separation principle holds
and the consumption function can now be written as
(77) [c.sub.t] = r[[hat.s].sub.t],
and the perceived state [[hat.s].sub.t] = [E.sub.t] [[s.sub.t]] is
governed by the following Kalman filtering equation:
(78) d[[hat.s].sub.t] = (r[[hat.s].sub.t] - [c.sub.t]) dt +
[theta]d[[eta].sub.t],
where
(79) d[[eta].sub.t] = r ([s.sub.t] - [[hat.s].sub.t]) dt +
[[sigma].sub.s]d[[zeta].sub.t] + d[[xi].sub.t],
with mean E [d[[eta].sub.t]] = 0 and var(d[[eta].sub.t]) =
([[sigma].sup.2.sub.s] + [LAMBDA])dt, [theta] = [SIGMA][[LAMBDA].sup.-1]
is the steady-state Kalman gain, and [SIGMA] (=
[([[sigma].sup.-2.sub.s] + [[LAMBDA].sup.-1]).sup.-1] > 0) is the
conditional variance of [s.sub.t]. Note
that [[eta].sub.t] is a Brownian motion with mean 0. Although the
Brownian variable, [[xi].sub.t], is not observable, the innovation
process, [[eta].sub.t], is observable because it is derived from
observable processes (i.e., d[s.sup.*.sub.t] and (r[[hat.s].sub.t] -
[c.sub.t])dt). In this case, the path of the conditional expectation,
[[hat.s].sub.t], is generated by the path of the innovation process,
[[eta].sub.t]. In summary, given [LAMBDA], Equations (77) and (78)
completely characterize the dynamics of the continuous-time PIH model.
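To see how the SE filter reacts when the shock variance changes while [LAMBDA] is held fixed, the expressions for [SIGMA] and [theta] stated after Equation (79) can be evaluated directly. The sketch below is a minimal illustration; the function name se_filter and the numerical values are hypothetical and are not taken from the paper.

```python
def se_filter(sigma_s2, lam):
    """Conditional variance and Kalman gain for the SE case.

    Uses the expressions stated after Equation (79):
        Sigma = (sigma_s^-2 + Lambda^-1)^-1,  theta = Sigma / Lambda.
    sigma_s2 is the variance of the state innovation and lam is the
    exogenously fixed noise variance; both values are illustrative.
    """
    Sigma = 1.0 / (1.0 / sigma_s2 + 1.0 / lam)
    theta = Sigma / lam
    return Sigma, theta


if __name__ == "__main__":
    lam = 1.0  # noise variance held fixed, as in the SE environment
    for sigma_s2 in (1.0, 4.0):  # scale up the shock variance
        Sigma, theta = se_filter(sigma_s2, lam)
        print(sigma_s2, Sigma, theta)
```

With [LAMBDA] fixed, raising the shock variance raises both [SIGMA] and [theta], so the filter puts more weight on the innovation. This is the sense in which a change in shock volatility alters the dynamics under SE.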
We now move on to discuss the RI case in which we assume that the
consumer only has finite channel capacity when processing relevant
information about the state. As emphasized in Sims (1998) and discussed
in Kasa (2006) and Reis (2010), the observation equation, (76), is not
suitable to model RI owing to finite capacity because this specification
means that in any finite interval, arbitrarily large amounts of
information can be passed through the consumer's channel.
Specifically, Sims (1998) shows that when the true state (s) and the
noise ([xi]) are generated by nicely behaved stochastic differential
equations driven by Wiener processes, then if the noise has at least as
many derivatives as the true state, noisy observations can transmit
infinite amounts of information in finite time, while if the noise has
fewer derivatives than the true state, noisy observations over a finite
interval can transmit only finite amounts of information. Moscarini
(2004) solves this problem by assuming that the agent samples the
process at discrete intervals. (33) Kasa (2006) proposes an alternative
way to solve this problem. He assumes that the observation equation is
written as:
(80) d[s.sup.*.sub.t] = [s.sub.t]dt + d[[xi].sub.t],
where the current signal affects the instantaneous rate of change
in the noisy observation. In this specification, the Kalman filtering
equation can be written as
(81) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where
(82) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
with mean E[d[[??].sub.t]] = 0 and var (d[[??].sub.t]) =
[LAMBDA]dt. Here [[??].sub.t] is also an observable Brownian motion with
mean 0. It is clear from Equations (79) and (82) that the two
specifications of imperfect information observations, (76) and (80),
lead to different consumption behavior.
In this RI case, of course, the variance of the noise, [LAMBDA], is
determined endogenously by the following continuous-time version of the
information-processing constraint specified in Section A:
(83) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is the
joint probability measure on s and [s.sup.*], [P.sub.s] and
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] are the marginal
probability measures of s and [s.sup.*], respectively, and [kappa] is
finite capacity. As shown in Duncan (1970) and Reis (2010), Equation
(83) reduces to [SIGMA][[LAMBDA].sup.-1] [less than or equal to]
[kappa], which determines the value of [LAMBDA]. Comparing Equation
(79) with Equation (82), it is clear that we must distinguish the two noisy
signal specifications when modeling RI due to finite capacity; that is,
it is not the case that all SE environments can be derived from an
underlying RI problem.
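The role of the capacity constraint can be illustrated numerically. The sketch below assumes that the constraint [SIGMA][[LAMBDA].sup.-1] [less than or equal to] [kappa] binds and, as a further assumption not stated in the text, that [SIGMA] satisfies the standard steady-state Kalman-Bucy condition for the observation equation (80), namely 2r[SIGMA] + [[sigma].sup.2.sub.s] - [[SIGMA].sup.2]/[LAMBDA] = 0. The function name ri_steady_state and all parameter values are hypothetical.

```python
def ri_steady_state(sigma_s2, r, kappa):
    """Steady-state (Sigma, Lambda, theta) when Sigma / Lambda = kappa binds.

    Assumption (not from the paper): Sigma solves the textbook
    Kalman-Bucy steady-state condition for ds* = s dt + dxi,
        2 r Sigma + sigma_s^2 - Sigma^2 / Lambda = 0.
    Substituting Sigma / Lambda = kappa gives
        Sigma = sigma_s^2 / (kappa - 2 r).
    """
    if kappa <= 2.0 * r:
        raise ValueError("need kappa > 2r for a finite steady-state variance")
    Sigma = sigma_s2 / (kappa - 2.0 * r)
    Lam = Sigma / kappa    # endogenous noise variance
    theta = Sigma / Lam    # gain, equal to kappa by construction
    return Sigma, Lam, theta


if __name__ == "__main__":
    for sigma_s2 in (1.0, 4.0):  # scale up the shock variance
        print(sigma_s2, ri_steady_state(sigma_s2, r=0.02, kappa=0.5))
```

Under these assumptions the endogenous noise variance [LAMBDA] scales with the shock variance while the gain stays at [kappa], in contrast to the SE case in which [LAMBDA] is fixed and the gain moves with the shock variance.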
VI. CONCLUSIONS
In this paper we have explored the implications of two
informational friction theories, SE and RI, for economic behavior,
policy, and welfare within the LQG setting. First, we showed that if the
variance of the noise itself is fixed exogenously, the two theories can
be distinguished as they lead to different dynamics and welfare after
implementing government policies. Second, we showed that if the SNR is
fixed in the SE problem and the channel capacity is fixed in the RI
problem, SE and RI are observationally equivalent in the sense that they
lead to the same dynamics even after implementing policies in the
univariate case, whereas they generate different policy and welfare
implications in the multivariate case. Furthermore, in the multivariate
case we showed that under RI the agent's preference, budget
constraint, and information-processing constraints jointly determine the
stochastic properties of the post-observation variance and endogenous
noise; hence, RI provides a micro-founded way to specify the nature of
the Kalman gain that governs the model's dynamics.
Distinguishing between SE and RI has policy relevance beyond the
simple examples we consider here. As we noted earlier, Paciello and
Wiederholt (2013) study optimal monetary policy in a model that nests
both SE and RI. They find that the policy can differ across the two
types of models--the key is that under RI the policymaker can affect the
attention allocation between different shocks. In particular, they find
that the divine coincidence of no tradeoff between stabilizing prices
and output holds under RI, but not under SE.
Extending our results to compare environments outside the LQG setup
will be challenging. Solving the SE problem is straightforward, although
computation of the conditional expectations may be difficult depending
on the distributional assumptions. (34) The RI problem is difficult to
solve, however, because the optimal joint distribution of states and
controls is typically not of a known form and not easy to approximate;
the optimality of discrete solutions discussed in Matejka and Sims
(2010) makes it difficult to characterize the distribution in terms of a
small number of parameters. It seems therefore unlikely that SE and RI
environments will generally be observationally equivalent.
ABBREVIATIONS
LQG: Linear-Quadratic-Gaussian
RBC: Real Business Cycle
RI: Rational Inattention
SE: Signal Extraction
SNR: Signal-to-Noise Ratio
SOE: Small Open Economy
doi:10.1111/ecin.12073
REFERENCES
Adam, K. "Optimal Monetary Policy in the Presence of Imperfect
Common Knowledge." Journal of Monetary Economics, 54(2), 2005,
267-301.
Aguiar, M., and G. Gopinath. "Emerging Market Business Cycles:
The Cycle Is the Trend." Journal of Political Economy, 115(1),
2007, 69-102.
Angeletos, G.-M., and J. La'O. "Noisy Business
Cycles." NBER Macroeconomics Annual 2009, 24, 2010, 319-78.
Cochrane, J. Asset Pricing. Princeton, NJ: Princeton University
Press, 2005.
Duncan, T. E. "On the Calculation of Mutual Information."
SIAM Journal of Applied Mathematics, 19, 1970, 215-20.
Flavin, M. A. "The Adjustment of Consumption to Changing
Expectations About Future Income." Journal of Political Economy,
89(5), 1981, 974-1009.
Garcia-Cicco, J., R. Pancrazi, and M. Uribe. "Real Business
Cycles in Emerging Countries?" American Economic Review, 100(5),
2010, 2510-31.
Hall, R. E. "Stochastic Implications of the Life
Cycle-Permanent Income Hypothesis: Theory and Evidence." Journal of
Political Economy, 86(6), 1978, 971-87.
Hansen, L. P. "Calculating Asset Prices in Three Example
Economies," in Advances in Econometrics, Fifth World Congress,
edited by T. F. Bewley. Cambridge: Cambridge University Press, 1987.
Hansen, L. P., and T. J. Sargent. Robustness. Princeton, NJ:
Princeton University Press, 2007.
Kahneman, D. Attention and Effort. Englewood Cliffs, NJ: Prentice
Hall Press, 1973.
Kasa, K. "Robustness and Information Processing." Review
of Economic Dynamics, 9(1), 2006, 1-33.
Kim, J., E. Ko, and T. Yun. "The Role of Bounded Rationality
in Macro-Finance Affine Term-Structure Models." Unpublished, 2012.
Lucas, R. E., Jr. "Expectations and the Neutrality of
Money." Journal of Economic Theory, 4(1), 1972, 103-24.
Luo, Y. "Consumption Dynamics under Information Processing
Constraints." Review of Economic Dynamics, 11(2), 2008, 366-85.
--. "Rational Inattention, Long-run Consumption Risk, and
Portfolio Choice." Review of Economic Dynamics, 13(4), 2010,
843-60.
Luo, Y., and E. R. Young. "Risk-sensitive Consumption and
Savings under Rational Inattention." American Economic Journal:
Macroeconomics, 2(4), 2010a, 281-325.
--. "Asset Pricing under Information-Processing
Constraints." Economics Letters, 107(1), 2010b, 26-29.
Luo, Y., J. Nie, and E. R. Young. "Robustness,
Information-Processing Constraints, and the Current Account in Small
Open Economies." Journal of International Economics, 88(1), 2012,
104-20.
Mackowiak, B., and M. Wiederholt. "Optimal Sticky Prices under
Rational Inattention." American Economic Review, 99(3), 2009,
769-803.
Matejka, F., and C. A. Sims. "Discrete Actions in
Information-Constrained Tracking Problems." Unpublished, 2010.
Melosi, L. "A Likelihood Analysis of Models with Information
Frictions." Penn Institute for Economic Research Working Paper
09-009, 2009.
Menkulasi, J. "Rational Inattention and Changes in
Macroeconomic Volatility." Unpublished, University of Maryland,
2010.
Morris, S., and H. Song Shin. "The Social Value of Public
Information." American Economic Review, 92(5), 2002, 1521-34.
Moscarini, G. "Limited Information Capacity as a Source of
Inertia." Journal of Economic Dynamics and Control, 28(10), 2004,
2003-35.
Muth, J. F. "Optimal Properties of Exponentially Weighted
Forecasts." Journal of the American Statistical Association,
55(290), 1960, 299-306.
Paciello, L., and M. Wiederholt. "Exogenous Information,
Endogenous Information, and Optimal Monetary Policy." Review of
Economic Studies, 2013. doi: 10.1093/restud/rdt024
Reis, R. "When Should Policymakers Make Announcements?"
Unpublished, 2010.
Sargent, T. J. "The Observational Equivalence of Natural and
Unnatural Rate Theories of Macroeconomics." Journal of Political
Economy, 84(3), 1976, 631-40.
--. "Equilibrium with Signal Extraction from Endogenous
Variables." Journal of Economic Dynamics and Control, 15(2), 1991,
245-73.
Simon, D. Optimal State Estimation: Kalman, H-infinity, and
Nonlinear Approaches. Hoboken, NJ: John Wiley & Sons, 2006.
Sims, C. A. "Stickiness." Carnegie-Rochester Conference
Series on Public Policy, 49, 1998, 317-56.
--. "Implications of Rational Inattention." Journal of
Monetary Economics, 50(3), 2003, 665-90.
--. "Rational Inattention: Beyond the Linear-Quadratic
Case." American Economic Review, 96(2), 2006, 158-63.
--. "Rational Inattention and Monetary Economics," in
Handbook of Monetary Economics, Vol. 3, Chapter 4, edited by B. M.
Friedman and M. Woodford. Amsterdam: Elsevier, 2010, 155-81.
Townsend, R. M. "Forecasting the Forecasts of Others."
Journal of Political Economy, 91(4), 1983, 546-88.
Van Nieuwerburgh, S., and L. Veldkamp. "Information
Acquisition and Under-Diversification." Review of Economic Studies,
77(2), 2010, 779-805.
Veldkamp, L. Information Choice in Macroeconomics and Finance.
Princeton, NJ: Princeton University Press, 2011.
Wang, N. "Precautionary Saving and Partially Observed
Income." Journal of Monetary Economics, 51(8), 2004, 1645-81.
Woodford, M. "Imperfect Common Knowledge and the Effects of
Monetary Policy," in Knowledge, Information, and Expectations in
Modern Macroeconomics: In Honor of Edmund S. Phelps, edited by P.
Aghion, R. Frydman, J. Stiglitz, and M. Woodford. Princeton, NJ:
Princeton University Press, 2001.
(1.) Muth (1960) first applied the SE structure to an economic
model (namely the permanent income model of Friedman 1957); later SE
problems were extended to models with multiple agents by Townsend (1983)
and Sargent (1991). More recent studies include Morris and Shin (2002),
Wang (2004), and Angeletos and La'O (2010).
(2.) Luo (2008) was an early contribution to the RI literature;
other contributions include Adam (2005), Kasa (2006), Luo (2010), Luo
and Young (2010a), Mackowiak and Wiederholt (2009), Melosi (2009), Reis
(2010), Van Nieuwerburgh and Veldkamp (2010), Paciello and Wiederholt
(2013), and Kim, Ko, and Yun (2012).
(3.) See Melosi (2009) for an empirical investigation of this
issue.
(4.) Aguiar and Gopinath (2007) document that the relative
volatility of consumption growth to income growth in emerging countries
is significantly greater than that in developed countries.
(5.) Our quadratic objective function encompasses the standard
tracking objective of minimizing the squared difference of the control
from the target.
(6.) Muth (1960) showed that the exponentially weighted average of
past observations of a random walk plus a noise process is optimal in
the sense that it minimizes the mean squared forecast error.
(7.) For alternative bases for the logarithm, the unit of
information differs; with log base 2 the unit of information is the
"bit" and with base 10 it is a "dit" or a
"hartley."
(8.) By "better" we mean that conditional on draws by
nature for the true state, the expected utility of the agent increases
if information about that state is improved.
(9.) Note that here we need to impose the restriction exp
(2[kappa]) - [A.sup.2] > 0. If this condition fails, the state is not
stabilizable and the unconditional variance diverges. Obviously we
cannot directly impose conditions on A, as it is chosen by the agents in
the model; it is also clear that we can, for a given model, easily find
sufficient conditions that guarantee the restriction holds.
(10.) Formally, the assumption is that [kappa] is a choice variable
and the utility cost function is [mu][kappa] for some constant [mu].
(11.) We require here that H [not equal to] 0; that is, the state
must be detectable. If the state is not detectable there is no point in
allocating attention to monitoring it.
(12.) Note that these two different ways to model RI are very
similar to the constraint and multiplier preferences adopted by Hansen
and Sargent (2007) to model aversion to model misspecification. They
also established the observational equivalence between the two
preferences within the LQG setting. Luo and Young (2010a) extend this
equivalence to RI settings.
(13.) See also Mackowiak and Wiederholt (2009).
(14.) Sims (2010) also applied this principle to solve a tracking
problem with information constraints.
(15.) If this constraint does not bind, in principle we can apply
the logic of the reverse water-filling problem to solve for [SIGMA].
(16.) Note that the basic idea of solving the multivariate RI
problem is the same as that in the univariate model and thus the key
difference between SE and RI problems remains unchanged.
(17.) This problem will be particularly difficult for non-LQG
problems, as the distribution of the noise shocks will generally be
impossible to specify analytically.
(18.) The robustness of the result in Melosi (2009) is unknown, but
it clearly depends on model assumptions to some degree.
(19.) That is, all elements in the variance-covariance matrix are
scaled up.
(20.) The excess smoothness puzzle states that consumption responds
too little to permanent changes in income. The excess sensitivity puzzle
states that current consumption responds to changes in income that were
anticipated in earlier periods.
(21.) Note that in this case, the flow budget constraint is
[w.sub.t+1] = R[w.sub.t] - [c.sub.t] + [y.sub.t].
(22.) Here we assume that disposable income follows a persistent
trend-stationary AR(1) process; Luo, Nie, and Young (2012) estimated the
process for small open economies. It is well known that given the length
and structure of the data on real income, it is difficult to distinguish
persistent trend-stationary AR(1), unit root, and difference-stationary
(DS) processes for real income. We focus on the AR(1) case in this
paper.
(23.) See Luo (2008) for a discussion on the effects of RI on
consumption dynamics.
(24.) A proof is straightforward from Expressions (46) and (47).
(25.) See Table 1 in Luo, Nie, and Young (2012) for the estimated
income processes in both emerging and developed countries.
(26.) It is worth noting that this prediction can also be used to
distinguish SE and RI by examining the time-series behavior of an
economy across "regime changes," such as the observed U.S.
Great Moderation in which the volatility of output dropped after 1984.
This rationale was also used in Sargent (1976) to discuss the
observational equivalence between Classical and Keynesian models.
(27.) Note that we know E [mx] after solving the PIH model given
the state variables and can use them to determine the asset price p.
(28.) For the details of the derivation, see Luo and Young (2010b).
(29.) Alternatively, we can also see that the conditional variance
of the x component is about three times smaller than its corresponding
unconditional variance (0.4523), whereas that of the z component is
about five times smaller than its corresponding unconditional variance
(2.7692).
(30.) Mackowiak and Wiederholt (2009) show that independent signals
are optimal.
(31.) Note that this ratio must be greater than 1, as otherwise the
posterior variance would be higher than the prior variance, which means
that the firms forget. If this ratio is greater than exp (2[kappa]), it
violates the information-processing constraint.
(32.) Menkulasi (2010) presents a model of the Great Moderation
under RI and shows that, under RI, a smaller decline in shock
volatility is needed to replicate the observed decline in GDP
volatility, but does not provide evidence on the actual volatility
declines.
(33.) Moscarini (2004) argues that although the state transition
and predictions (actions) happen continuously, observations only occur
at discrete intervals.
(34.) See Veldkamp (2011) for a textbook treatment on information
choice.
YU LEI LUO and ERIC R. YOUNG *
* We thank James MacGee (the Editor) and two anonymous referees for
many constructive comments and suggestions, and Tom Sargent and Chris
Sims for helpful discussions. We are also grateful for useful
suggestions and comments from Shin-ichi Fukuda, Ken Kasa, Yang Lu, Jun
Nie, Yi-Chang Tsai, Gaowang Wang, Jenny Xu, Tao Zhu, and seminar and
conference participants at the Hong Kong University of Science and
Technology, the University of Tokyo, and Fudan Conference on Economic
Dynamics. Luo thanks the General Research Fund (GRF) in Hong Kong
(#HKU749711) and the HKU seed funding program for basic research for
financial support. Young thanks the Bankard Fund for Political Economy
at the University of Virginia for financial support. Part of this work
was conducted while Luo was visiting the UC Davis Economics Department
and the Kansas City Fed, whose hospitality is greatly appreciated. All
errors are the responsibility of the authors.
Luo: School of Economics and Finance, The University of Hong Kong,
Hong Kong, Hong Kong. Phone 852 2859 1042, Fax 852 2548 1152. E-mail
yluo@econ.hku.hk
Young: Department of Economics, University of Virginia,
Charlottesville, VA 22904. Phone 434 924 3811, Fax 434 982 2904. E-mail
ey2d@virginia.edu