Article information

Title: Signal extraction and rational inattention.
Authors: Luo, Yu Lei ; Young, Eric R.
Journal: Economic Inquiry
Print ISSN: 0095-2583
Publication year: 2014
Issue: April
Language: English
Publisher: Western Economic Association International
Abstract: The objective of this paper is to compare two theories of information frictions. In the first theory, agents observe the state of the world with exogenously imposed noise; we refer to this environment as the signal extraction (SE) model. (1) The second theory permits agents to design the distribution of noise terms by focusing limited attention on certain variables at the expense of others; following Sims (2003) we refer to this model as the rational inattention (RI) model. (2) In both models, agents respond to changes in the true underlying state slowly because it takes time for them to learn exactly what the new state is. The key difference is that the noise distribution under RI is an outcome of optimal choice, and thus can adapt to changing circumstances in the economy, whereas under SE the noise distribution is technologically fixed. Our goal is to explore this crucial distinction.
Keywords: Attention;Attention (Psychology)

I. INTRODUCTION

The objective of this paper is to compare two theories of information frictions. In the first theory, agents observe the state of the world with exogenously imposed noise; we refer to this environment as the signal extraction (SE) model. (1) The second theory permits agents to design the distribution of noise terms by focusing limited attention on certain variables at the expense of others; following Sims (2003) we refer to this model as the rational inattention (RI) model. (2) In both models, agents respond to changes in the true underlying state slowly because it takes time for them to learn exactly what the new state is. The key difference is that the noise distribution under RI is an outcome of optimal choice, and thus can adapt to changing circumstances in the economy, whereas under SE the noise distribution is technologically fixed. Our goal is to explore this crucial distinction.

We explore the distinction between SE and RI within a linear-quadratic-Gaussian (LQG) setting. We first study a univariate case for which the variance-covariance matrix of the noise shocks can be solved in closed form. First, we note that (at least within the LQG framework) any variance-covariance structure for the noise shocks derived from the RI problem can be exogenously imposed in the SE environment, so there is a sense in which the two environments are observationally equivalent.

The next result we find is that if the variance of the noise itself is fixed, we can use a policy experiment to distinguish SE from RI. Suppose that the variance of an exogenous shock is scaled up because of a change in policy. In the SE problem where the distribution of noise is exogenous, an increase in the variance of the shock will lead to a different solution for the conditional variance and Kalman gain; consequently, the change in policy will lead to a change in the model's dynamic behavior and the agent's payoffs. In contrast, in the RI problem, if channel capacity is fixed, a change in the variance of the exogenous shock will lead to the same change in the conditional variance of the state and the variance of the noise, but will not affect the Kalman gain. That is, inattentive agents with fixed capacity will behave as if they face noise shocks whose nature changes systematically as the dynamic properties of the economy change with policy. Furthermore, we show that if we assume that the marginal cost of information is fixed, capacity will be elastic with respect to a change in policy; consequently, the Kalman gain in this case will also adjust in response to the policy change. The losses of agents due to imperfect information depend on the value of the Kalman gain. Therefore, in general, SE and RI imply different policy recommendations.

In contrast, we also find that in the univariate case, if the ratio of the variance of the exogenous shock to that of the noise (i.e., the signal-to-noise ratio, SNR) is fixed, the SE and RI problems are observationally equivalent in the sense that they lead to the same dynamics of the model economy when the ratio of the conditional variance to that of the noise in the SE problem equals a particular function of the channel capacity in the RI problem. Our results do not depend on the presence or absence of correlation between fundamental and noise shocks. Specifically, in the presence of the correlation, a change in the variance of the exogenous shock does not change the dynamic behavior of the model in the fixed capacity case, whereas it changes the model's dynamics in the fixed information-processing cost case in which both the variance of noise and the Kalman gain are affected by the interactions between the correlation, the variance of the fundamental shock, and the conditional variance.

We then move on to study the multivariate case. In this case, given channel capacity the conditional variance-covariance matrix can be obtained by solving a semidefinite programming problem in which the inattentive agent minimizes the expected loss due to information-processing constraints. After computing the optimal steady-state conditional variance-covariance matrix, we can recover the variance-covariance matrix of the noise vector and then determine the Kalman gain. In this case, we show that SE and RI generate different dynamics after a change in the variance of the exogenous shock even if the SNR is fixed.

However, when modeling the multivariate SE problem, it is difficult to specify the process of the vector of noises ex ante without prior knowledge about the states. Ad hoc assumptions on the nature of the noise might be inconsistent with the underlying efficiency conditions (equalization of the marginal utility of additional capacity across variables). (3) Therefore, RI provides a micro-founded method for specifying the stochastic properties of the noises. It is worth noting that in the multivariate RI problem, the agent's preference, budget constraint, and information-processing constraints jointly determine the values of the conditional variance of the state, the variance of the noise, and the Kalman gain, whereas in the multivariate SE problem given the variance of the noise, the propagation equation updating the conditional variance based on the budget constraint is used to determine the conditional variance and then the Kalman gain.

Finally, we present two canonical examples to illustrate our results. First, we study a permanent income model to explore the different consumption and asset pricing implications of SE and RI. We argue that the RI model with fixed information-processing cost does a better job at replicating the different consumption behavior in emerging and developed small open economies (SOEs), and thus helps us distinguish RI from SE. (4) We also discuss a simplified model of price setting in the vein of Mackowiak and Wiederholt (2009) and argue that optimal attention allocation between the aggregate and firm-specific shocks can also help distinguish RI from SE. In that model, Paciello and Wiederholt (2013) show that optimal monetary policy under SE and RI differs precisely because the monetary authority can manipulate the attention decision; in other words, it matters whether agents live in an SE or RI world.

The remainder of the paper is organized as follows. Section II examines optimal decisions and economic dynamics in an LQG setting with SE. Section III presents the RI version of the model and compares different implications of RI and SE for dynamics and policy within the LQG setting. Section IV presents applications to models of permanent income and price setting. Section V briefly discusses the different implications of SE and RI for consumption in a continuous-time setting. Section VI concludes.

II. SIGNAL EXTRACTION IN AN LQG MODEL

A. Full-Information Rational Expectations LQG Model

Consider the following LQG model:

(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

subject to

(2) [s.sub.t+1] = A[s.sub.t] + B[c.sub.t] + [[epsilon].sub.t+1],

with [s.sub.0] known and given, where [beta] < 1 is the discount factor, [s.sub.t] is an (n x 1) state vector, [c.sub.t] is a (k x 1) control vector, [[epsilon].sub.t+1] is an iid (n x 1) vector of Gaussian random variables with mean 0 and covariance matrix [OMEGA], and [E.sub.t][·] denotes the mathematical expectation of a random variable given information processed at t. We assume that Q, R, and W are such that the objective function is jointly concave in [s.sub.t] and [c.sub.t], and the usual conditions required for the optimal policy to exist are satisfied.

When the agent can fully observe the state [s.sub.t], the model is a standard linear-quadratic regulator problem. Solving the corresponding Bellman equation

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

yields the decision rule

(3) [c.sup.*.sub.t] = - F[s.sub.t],

and the Riccati equation is

(4) P = Q + [F.sup.T] RF - 2[F.sup.T] W +[beta]([A.sup.T] - [F.sup.T] [B.sup.T])P(A - BF),

where

(5) F = [(R + [beta][B.sup.T] PB).sup.-1] (W + [beta][B.sup.T] PA).

Iterating on the matrix Riccati equation (4) uniquely determines P, as the equation defines a contraction mapping. Given P, Equation (5) determines the matrix F that appears in the optimal policy (3).
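The iteration on Equations (4) and (5) can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function name `solve_lq_regulator`, the starting value P = 0, and the scalar parameter values are all assumptions made for the example. The cross term 2[F.sup.T]W is written in symmetric form, which does not change the quadratic form.

```python
import numpy as np


def solve_lq_regulator(A, B, Q, R, W, beta, tol=1e-12, max_iter=100_000):
    """Iterate on the Riccati equation (4), using (5) for F, until P settles."""

    def policy(P):
        # Equation (5): F = (R + beta B'PB)^{-1} (W + beta B'PA)
        return np.linalg.solve(R + beta * B.T @ P @ B, W + beta * B.T @ P @ A)

    P = np.zeros_like(Q, dtype=float)
    for _ in range(max_iter):
        F = policy(P)
        closed_loop = A - B @ F
        # Equation (4). The cross term 2 F'W is written as F'W + W'F, which
        # gives the same quadratic form s'Ps and keeps P symmetric.
        P_next = (Q + F.T @ R @ F - F.T @ W - W.T @ F
                  + beta * closed_loop.T @ P @ closed_loop)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, policy(P_next)
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")


# Hypothetical scalar example; every number below is an illustrative assumption.
beta = 0.95
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
W = np.array([[0.0]])

P, F = solve_lq_regulator(A, B, Q, R, W, beta)
print("P =", P[0, 0], "F =", F[0, 0])
```

In this scalar example the fixed point can also be found by hand, since (4) and (5) reduce to P = 1 + [beta]P/(1 + [beta]P), which gives a simple check on the iteration.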

B. Signal Extraction with Exogenous Noises

Following the SE literature (e.g., Lucas 1972; Morris and Shin 2002; Muth 1960), we now assume that the agent cannot observe the true state [s.sub.t] perfectly and only observes the noisy signal [s.sup.*.sub.t] = [s.sub.t] + [[xi].sub.t] when making decisions. Here [[xi].sub.t] is a (n x 1) vector of noises. The agent then estimates the state using a standard Kalman filtering equation. In the standard SE problem, the stochastic property of the noise [[xi].sub.t] is given exogenously. Specifically, assume that [[xi].sub.t] is an iid Gaussian innovation with mean 0 and variance-covariance matrix [LAMBDA]. (5) We point out here that the agent may not have perfect information even about the endogenous part of the state vector [s.sub.t].

Under the LQG assumption, the certainty equivalence principle holds when the agent cannot observe [s.sub.t] perfectly, so the decision rule under imperfect information can be written as

(6) [c.sup.*.sub.t] = -F[[??].sub.t],

where [[??].sub.t] = E [[s.sub.t]|[I.sub.t]] is the perceived state and [I.sub.t] = {[s.sup.*.sub.t], [s.sup.*.sub.t-1], ..., [s.sup.*.sub.0]} is the information set including perceived signals until time t.

Furthermore, we assume that in the steady state, the true state follows a normal distribution after observing the noisy signals, [s.sub.t]|[I.sub.t] ~ N(E[[s.sub.t]|[I.sub.t]], [[SIGMA].sub.t]), where [[SIGMA].sub.t] = [E.sub.t][([s.sub.t] - [[??].sub.t])[([s.sub.t] - [[??].sub.t]).sup.T]] is the conditional variance-covariance matrix, and the Kalman filtering equation governs the behavior of [[??].sub.t]:

(7) [[??].sub.t+1] = (I - [theta]) (A[[??].sub.t] + B[c.sub.t]) + [theta][s.sup.*.sub.t+1],

where [theta] is the steady-state Kalman gain to be determined. (6) Following the standard procedure in the Kalman filter literature, we have the steady state updating equation for [SIGMA] and Kalman gain [theta]:

(8) [SIGMA] = (I - [theta]) (A[SIGMA][A.sup.T] + [OMEGA]) [(I - [theta]).sup.T] + [theta][LAMBDA][[theta].sup.T]

and

(9) [theta] = ([OMEGA] + A[SIGMA][A.sup.T]) [([OMEGA] + A[SIGMA][A.sup.T] + [LAMBDA]).sup.-1].

After computing ([theta], [SIGMA]) by iteration, we can obtain a complete characterization of the dynamic system. The key assumption in the SE problem is that the variance-covariance matrix of the noise, [LAMBDA], is given. Given this [LAMBDA], Equations (8) and (9) jointly determine the steady state ([theta], [SIGMA]). It is straightforward to show that we have the following alternative equations for computing the Kalman gain and the conditional variance-covariance matrix, ([theta], [SIGMA]):

(10) [[LAMBDA].sup.-1] = [[SIGMA].sup.-1] - [[PSI].sup.-1]

and

(11) [theta] = [SIGMA][[LAMBDA].sup.-1],

where [PSI] = A[SIGMA][A.sup.T] + [OMEGA] is the steady-state conditional variance of the state prior to observing the new signal. After obtaining the Kalman gain, Equations (6), (7), and (11) completely characterize the model's dynamic behavior.
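As a rough numerical check of this subsection, the sketch below iterates on Equations (8) and (9) for a given [LAMBDA] and then confirms that the resulting steady state also satisfies the alternative Equations (10) and (11). The function name `steady_state_filter` and the 2 x 2 matrices are hypothetical choices for illustration only.

```python
import numpy as np


def steady_state_filter(A, Omega, Lam, tol=1e-12, max_iter=100_000):
    """Iterate on (8)-(9) to find the steady-state (theta, Sigma) under SE."""
    I = np.eye(A.shape[0])
    Sigma = Omega.copy()
    for _ in range(max_iter):
        Psi = A @ Sigma @ A.T + Omega                    # prior variance
        theta = Psi @ np.linalg.inv(Psi + Lam)           # Equation (9)
        Sigma_next = ((I - theta) @ Psi @ (I - theta).T
                      + theta @ Lam @ theta.T)           # Equation (8)
        if np.max(np.abs(Sigma_next - Sigma)) < tol:
            Sigma = Sigma_next
            break
        Sigma = Sigma_next
    else:
        raise RuntimeError("filter iteration did not converge")
    Psi = A @ Sigma @ A.T + Omega
    theta = Psi @ np.linalg.inv(Psi + Lam)
    return theta, Sigma, Psi


# Hypothetical inputs; the noise covariance Lam is imposed exogenously (SE).
A = np.array([[0.9, 0.1], [0.0, 0.7]])
Omega = np.array([[1.0, 0.2], [0.2, 0.5]])
Lam = np.array([[0.8, 0.0], [0.0, 1.5]])

theta, Sigma, Psi = steady_state_filter(A, Omega, Lam)

# Equations (10) and (11) evaluated at the computed steady state.
gap_10 = np.linalg.inv(Lam) - (np.linalg.inv(Sigma) - np.linalg.inv(Psi))
gap_11 = theta - Sigma @ np.linalg.inv(Lam)
print("max |gap (10)| =", np.max(np.abs(gap_10)))
print("max |gap (11)| =", np.max(np.abs(gap_11)))
```

Both gaps should be zero up to numerical tolerance, which is the sense in which (10) and (11) are an alternative characterization of the same steady state.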

III. RATIONAL INATTENTION IN THE LQG MODEL

Following Sims (2003), we introduce RI into the LQG model proposed in Section II.A by assuming agents face information-processing constraints and have only finite Shannon channel capacity to observe the state of the world. Specifically, we use the concept of entropy from information theory to characterize the uncertainty about a random variable; the reduction in entropy is thus a natural measure of information flow. Formally, entropy is defined as the expectation of the negative of the (natural) log of the density function, -E [ln (f (X))]. For example, the entropy of a discrete distribution with equal weight on two points is simply -E [ln (f (X))] = -0.5 ln (0.5) - 0.5 ln (0.5) = 0.69, and the unit of information contained in this distribution is 0.69 "nats." (7) In this case, an agent can remove all uncertainty about X if the capacity devoted to monitoring X is [kappa] = 0.69 nats.
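The two-point entropy example can be reproduced in a few lines. This is only an illustrative sketch; the helper name `entropy_nats` is an assumption.

```python
import math


def entropy_nats(probs):
    """Entropy -E[ln f(X)] of a discrete distribution, measured in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Equal weight on two points, as in the text: the entropy equals ln(2).
h_two_point = entropy_nats([0.5, 0.5])
print(round(h_two_point, 2))
```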

With finite capacity [kappa] [member of] (0, [infinity]), a variable s following a continuous distribution cannot be observed without error and thus the information set at time t + 1, [I.sub.t+1], is generated by the entire history of noisy signals [{[s.sup.*.sub.j]}.sup.t+1.sub.j=0]. Following the literature, we assume the noisy signal takes the additive form [s.sup.*.sub.t+1] = [s.sub.t+1] + [[xi].sub.t+1], where [[xi].sub.t+1] is the endogenous noise caused by finite capacity. We further assume that [[xi].sub.t+1] is an iid idiosyncratic shock and is independent of the fundamental shock. The RI-induced noise is idiosyncratic because it arises from the agent's own internal information-processing constraint. Agents with finite capacity will choose a new signal [s.sup.*.sub.t+1] [member of] [I.sub.t+1] = {[s.sup.*.sub.1], [s.sup.*.sub.2], ..., [s.sup.*.sub.t+1]} that reduces the uncertainty of the state variable [s.sub.t+1] as much as possible. Formally, this idea can be described by the information constraint

(12) H ([s.sub.t+1]|[I.sub.t]) - H ([s.sub.t+1] | [I.sub.t+1]) [less than or equal to] [kappa],

where [kappa] is the agent's information channel capacity, H ([s.sub.t+1]|[I.sub.t]) denotes the entropy of the state prior to observing the new signal at t + 1, and H([s.sub.t+1]|[I.sub.t+1]) is the entropy after observing the new signal. [kappa] imposes an upper bound on the amount of information--that is, the change in the entropy--that can be transmitted in any given period. Finally, following the literature, we suppose that the prior distribution of [s.sub.t+1] is Gaussian.

Under the LQG setting, as has been shown in Sims (2003, 2006), the true state under RI also follows a normal distribution [s.sub.t]|[I.sub.t] ~ N (E [[s.sub.t]|[I.sub.t]], [[SIGMA].sub.t]), where [[SIGMA].sub.t] = [E.sub.t][([s.sub.t] - [[??].sub.t])[([s.sub.t] - [[??].sub.t]).sup.T]]. In addition, given that the noisy signal takes the additive form [s.sup.*.sub.t+1] = [s.sub.t+1] + [[xi].sub.t+1], the noise [[xi].sub.t+1] should also be Gaussian. It is worth noting that the Gaussianity of the posterior distribution of the true state and of the noise is optimally determined by the LQG structure. This result is often assumed as a matter of convenience in SE models with exogenous noises, and RI can rationalize this assumption. In addition, in the steady state the agent observes an additive noisy signal: [s.sup.*.sub.t] = [s.sub.t] + [[xi].sub.t]. Note that in the RI problem we also have the usual formula for updating the conditional variance-covariance matrix of a Gaussian distribution [SIGMA] in steady state:

(13) [[LAMBDA].sup.-1] = [[SIGMA].sup.-1] - [[PSI].sup.-1].

Using these expressions, the Kalman gain [theta] can also be written as

(14) [theta] = [SIGMA][[LAMBDA].sup.-1].

A. The Univariate Case

The key difference between SE and RI is that under RI the agent faces the information-processing constraint

(15) -ln([absolute value of [[SIGMA].sub.t+1]]) + ln([absolute value of [A.sup.T][[SIGMA].sub.t]A + [OMEGA]]) [less than or equal to] 2[kappa].

As more information about the state is better in single-agent models, this constraint will be binding. (8) Considering the univariate state case n = 1, (15) fully determines the value of the steady-state conditional variance [SIGMA]:

(16) [SIGMA] = [OMEGA]/[exp(2[kappa]) - [A.sup.2]],

which means that [SIGMA] is determined by the variance of the exogenous shock ([OMEGA]) and the exogenously given capacity ([kappa]). (9) Given this [SIGMA] we can use (13) to recover the variance of the endogenous noise ([LAMBDA]):

(17) [LAMBDA] = [([[SIGMA].sup.-1] - [[PSI].sup.-1]).sup.-1],

where [PSI] = [A.sup.2][SIGMA] + [OMEGA], and use (14) to find the Kalman gain ([theta]):

(18) [theta] = [SIGMA][[LAMBDA].sup.-1] = 1 - [SIGMA][[PSI].sup.-1],

which reduces to [theta] = 1 - 1/exp(2[kappa]) using Equations (14), (16), and (17). Note that Equations (17) and (18) also hold in the SE problem. To compare the RI and SE problems in the univariate case, we first consider the following two cases.
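Equations (16) to (18) can be evaluated directly. The sketch below, with hypothetical parameter values and a hypothetical function name `ri_univariate`, computes the univariate RI steady state and confirms that the Kalman gain depends only on [kappa], while [SIGMA] and [LAMBDA] scale one-for-one with [OMEGA].

```python
import math


def ri_univariate(A, Omega, kappa):
    """Steady state of the univariate RI problem with fixed capacity kappa."""
    if math.exp(2 * kappa) <= A ** 2:
        raise ValueError("capacity too low: need exp(2*kappa) > A**2")
    Sigma = Omega / (math.exp(2 * kappa) - A ** 2)   # Equation (16)
    Psi = A ** 2 * Sigma + Omega                     # prior variance
    Lam = 1.0 / (1.0 / Sigma - 1.0 / Psi)            # Equation (17)
    theta = Sigma / Lam                              # Equation (18)
    return Sigma, Psi, Lam, theta


# Illustrative assumptions: A = 1.02, kappa = 0.5, and Omega is quadrupled.
Sigma_lo, Psi_lo, Lam_lo, theta_lo = ri_univariate(A=1.02, Omega=1.0, kappa=0.5)
Sigma_hi, Psi_hi, Lam_hi, theta_hi = ri_univariate(A=1.02, Omega=4.0, kappa=0.5)

print("theta (Omega = 1):", theta_lo)
print("theta (Omega = 4):", theta_hi)
print("Sigma ratio:", Sigma_hi / Sigma_lo, "Lambda ratio:", Lam_hi / Lam_lo)
```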

Case 1. Assume that [LAMBDA] is fixed exogenously in the SE case.

In Case 1, it is clear that in the SE problem given [LAMBDA] and [OMEGA], we can compute [SIGMA] by solving the nonlinear equation (10). After obtaining [SIGMA], we can use Equation (11) to determine the Kalman gain [theta]; thus, in this sense SE and RI have the same implications.

We now discuss how to use a policy experiment to distinguish RI from SE. Suppose that the variance of the exogenous shock, [OMEGA], is scaled up because of a change in policy. In the SE problem with fixed [LAMBDA], Equations (17) and (18) imply that an increase in [OMEGA] will generally lead to a different solution for [SIGMA] and [theta]; consequently, the change in policy will lead to a change in the model's dynamics. Because [SIGMA] is a nonlinear function of [OMEGA], the effect of changes in [OMEGA] on [SIGMA] could be complicated. In the next section, we will explore this relationship using some numerical examples in a permanent income model. In contrast, in the RI problem, if [kappa] is fixed, Equations (16), (17), and (18) imply that a change in [OMEGA] will lead to the same change in [SIGMA], [PSI], and [LAMBDA], but has no impact on [theta]. In other words, agents with fixed capacity will behave as if facing noise whose nature changes systematically as the dynamic properties of the economy change, that is, the change in policy does not change the model's dynamics.
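The SE side of this policy experiment can be illustrated with a small calculation. In the univariate case, substituting [PSI] = [A.sup.2][SIGMA] + [OMEGA] into Equation (10) gives a quadratic in [SIGMA], whose positive root is the steady-state conditional variance. The function name `se_univariate` and the numbers below are assumptions for illustration, not values from the paper.

```python
import math


def se_univariate(A, Omega, Lam):
    """Steady state of the univariate SE problem with exogenous noise Lam.

    Equation (10) with Psi = A^2 Sigma + Omega rearranges to
    A^2 Sigma^2 + (Omega + Lam - A^2 Lam) Sigma - Omega Lam = 0.
    """
    b = Omega + Lam - A ** 2 * Lam
    Sigma = (-b + math.sqrt(b * b + 4 * A ** 2 * Omega * Lam)) / (2 * A ** 2)
    theta = Sigma / Lam                              # Equation (11)
    return Sigma, theta


# Hypothetical experiment: Omega rises from 1 to 4 while Lam stays at 1.
A, Lam = 1.02, 1.0
Sigma_before, theta_before = se_univariate(A, 1.0, Lam)
Sigma_after, theta_after = se_univariate(A, 4.0, Lam)

print("theta before:", theta_before)
print("theta after: ", theta_after)
```

With the noise variance held fixed, the gain rises with [OMEGA], so the dynamics change. Under RI with fixed [kappa], Equation (18) would leave the gain at 1 - 1/exp(2[kappa]) in the same experiment.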

Case 2. Assume that the signal-to-noise ratio (SNR), [OMEGA][[LAMBDA].sup.-1], is fixed exogenously in the SE case.

Note that Equation (17) can be rewritten as

(19) [OMEGA][[LAMBDA].sup.-1] = [OMEGA][[SIGMA].sup.-1] - [[[A.sup.2][([OMEGA][[SIGMA].sup.-1]).sup.-1] + 1].sup.-1].

In Case 2, since the SNR is fixed, (19) can be used to solve for [OMEGA][[SIGMA].sup.-1]. Given the SNR and [OMEGA][[SIGMA].sup.-1], we can compute

(20) [SIGMA][[LAMBDA].sup.-1] = ([SIGMA][[OMEGA].sup.-1]) ([OMEGA][[LAMBDA].sup.-1]).

Consider the same experiment in which [OMEGA] is scaled up. A fixed SNR means that the exogenous noise must also be scaled up so that [OMEGA][[LAMBDA].sup.-1] stays at the same level; consequently, Equation (19) leads to the same solution for [OMEGA][[SIGMA].sup.-1] and Equation (20) leads to the same [SIGMA][[LAMBDA].sup.-1]. The following proposition summarizes the main conclusion in this case:

PROPOSITION 1. In Case 2 (i.e., the SNR is fixed), the SE and RI problems are observationally equivalent in the sense that they lead to the same dynamics if [kappa] is fixed and [SIGMA] [[LAMBDA].sup.-1] in the SE problem is equal to 1 - 1 / exp (2[kappa]) in the RI problem.

Proof. The proof is straightforward by comparing Equations (18) and (20).
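Proposition 1 can be illustrated numerically. Writing x = [OMEGA][[SIGMA].sup.-1], Equation (19) becomes a quadratic in x for a given SNR; Equation (20) then gives the gain, and inverting [theta] = 1 - 1/exp(2[kappa]) gives the capacity that makes the RI problem match. The function names and parameter values are hypothetical.

```python
import math


def se_fixed_snr(A, snr):
    """Solve (19) for x = Omega/Sigma given the SNR, then apply (20).

    (19) reads snr = x - 1/(A^2/x + 1), i.e.
    x^2 + (A^2 - 1 - snr) x - snr A^2 = 0; the positive root is used.
    """
    b = A ** 2 - 1.0 - snr
    x = (-b + math.sqrt(b * b + 4 * snr * A ** 2)) / 2
    theta = snr / x        # (20): Sigma/Lam = (Sigma/Omega) * (Omega/Lam)
    return x, theta


def implied_capacity(theta):
    """Capacity kappa solving theta = 1 - 1/exp(2*kappa)."""
    return -0.5 * math.log(1.0 - theta)


# Illustrative assumptions: A = 1.02 and an SNR of 2.
A, snr = 1.02, 2.0
x, theta_se = se_fixed_snr(A, snr)
kappa = implied_capacity(theta_se)

# Under RI, (16) gives Omega/Sigma = exp(2*kappa) - A^2, which should equal x.
x_ri = math.exp(2 * kappa) - A ** 2
print("theta:", theta_se, "kappa:", kappa, "x (SE):", x, "x (RI):", x_ri)
```

Neither x nor the gain depends on the level of [OMEGA] once the SNR is fixed, which is why scaling [OMEGA] cannot separate the two models in this case.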

In the above analysis, for simplicity we assume that [kappa] remains unchanged when [OMEGA] is altered by the government policy. However, if an increase in [OMEGA] leads to higher marginal welfare losses due to imperfect observations, some capacity may be reallocated from other sources to reduce the welfare losses due to low capacity. In this case, [theta] will change accordingly as it is completely determined by capacity [kappa]; consequently, the dynamic behavior of the model will also change in response to the change in [OMEGA]. We will further explore this issue in the next section and the permanent income model in Section IV.

Alternative Way to Model Limited Information-Processing Capacity. As argued by Sims (2010), instead of using fixed finite channel capacity to model limited information-processing ability, it is also reasonable to assume that the marginal cost of information processing is constant. That is, the Lagrange multiplier on Equation (15) is constant. (10) In the univariate case, if the decision rule under full information is [c.sup.*.sub.t] = H[s.sub.t] and the objective of the agent with finite capacity is to minimize [[summation].sup.[infinity].sub.t=0] [[beta].sup.t] [([c.sub.t] - [c.sup.*.sub.t]).sup.2], the optimization problem reduces to

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where [[SIGMA].sub.t] is the conditional variance at t, [lambda] is the Lagrange multiplier corresponding to Equation (15), and we impose the restriction that [beta]A = 1 and H = A - 1 for simplicity. Solving this problem yields the optimal steady-state conditional variance:

(21) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

It is straightforward to show that as [lambda] goes to 0, [SIGMA] = 0; and as [lambda] goes to [infinity], [SIGMA] = [infinity]. Comparing Equations (21) with (16), it is clear that the two modeling strategies are observationally equivalent in the sense that they lead to the same conditional variance if the following equality holds:

(22) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

In this case, the Kalman gain is

(23) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

It is obvious that [kappa] converges to its lower limit [[kappa].bar] = ln (A) as [lambda] goes to [infinity]; and it converges to [infinity] as [lambda] goes to 0. (11) In other words, using the RI modeling strategy, the agent is allowed to adjust the optimal level of capacity in such a way that the marginal cost of information processing for the problem at hand remains constant. Note that this result is consistent with the concept of "elastic" capacity proposed in Kahneman (1973).

Furthermore, it is clear from Equation (22) that if the cost of information processing ([lambda]) is fixed, an increase in fundamental uncertainty ([OMEGA]) will lead to higher capacity ([kappa]) devoted to monitoring the evolution of the state. We now consider the same policy experiment discussed above: [OMEGA] is scaled up because of a change in policy. If we adopt the assumption that [lambda] is fixed, Equation (21) means that [SIGMA] changes less than proportionally because [partial derivative] ln [SIGMA]/[partial derivative] ln [OMEGA] < 1. Note that in the fixed [kappa] case, [partial derivative] ln [SIGMA]/[partial derivative] ln [OMEGA] = 1. Consequently, a change in [OMEGA] will change [theta] and the model's dynamics if the inattentive agent is facing fixed marginal cost of information. Therefore, different ways of modeling RI may lead to different policy implications. (12)

Extension to Correlated Shocks and Noises. In the above analysis, we assumed that the exogenous fundamental shock and noise are uncorrelated. We now discuss how correlated shocks and noises affect the implications of SE and RI for the model's dynamic behavior. In real systems, we may observe correlated shocks and noises. For example, if the system is an airplane and winds are buffeting the plane, the random gusts of wind affect both the process (the airplane dynamics) and the measurement (the sensed wind speed) if people use an anemometer to measure wind speed as an input to the Kalman filter. Please see Simon (2006) for a discussion. To use an economic example, suppose there is a correlation between levels and volatility, and also that producing national income data is costly; then a transition from low to high volatility would be associated with low output levels and, at the margin, less accurate measurements.

It is straightforward to introduce correlated shocks and noises into the SE problem. Specifically, we consider the case in which the process shock ([epsilon]) and the noise ([xi]) are correlated as follows:

corr([[epsilon].sub.t+1], [[xi].sub.t+1]) = [rho],

cov ([[epsilon].sub.t+1], [[xi].sub.t+1]) = [GAMMA] = [rho][square root of [OMEGA]][square root of [LAMBDA]],

where [rho] is the correlation coefficient between [[epsilon].sub.t+1] and [[xi].sub.t+1], [OMEGA] = var [[[epsilon].sub.t+1]] and [LAMBDA] = var [[[xi].sub.t+1]]. Under SE, [LAMBDA] is given exogenously and the correlation just introduces another exogenous stochastic dimension on the noise. As shown in Simon (2006), in this case the optimal Kalman gain can be written as

(24) [theta] = ([PSI] + [GAMMA])[([PSI] + [LAMBDA] + 2[GAMMA]).sup.-1],

and the updating formula for the conditional variance is

(25) [SIGMA] = [PSI] - [([PSI] + [GAMMA]).sup.2] [([PSI] + [LAMBDA] + 2[GAMMA]).sup.-1],

where [PSI] = [OMEGA] + [A.sup.2][SIGMA]. Just like the case without the correlation, given [LAMBDA] and [GAMMA], Equations (24) and (25) jointly determine the steady state ([theta], [SIGMA]).
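A minimal sketch of the correlated-noise SE steady state follows, obtained by iterating on Equation (25) and then evaluating Equation (24). The function name `se_correlated`, the simple fixed-point iteration, and the parameter values are assumptions for illustration. Two checks are included: with [rho] = 0 the gain collapses to [SIGMA][[LAMBDA].sup.-1] as in Equation (11), and scaling [OMEGA] and [LAMBDA] by the same factor leaves the gain unchanged because (24) and (25) are homogeneous of degree one in ([OMEGA], [LAMBDA], [SIGMA]).

```python
import math


def se_correlated(A, Omega, Lam, rho, tol=1e-12, max_iter=100_000):
    """Univariate SE steady state when shock and noise have correlation rho."""
    Gamma = rho * math.sqrt(Omega) * math.sqrt(Lam)      # covariance term
    Sigma = Omega
    for _ in range(max_iter):
        Psi = Omega + A ** 2 * Sigma
        Sigma_next = Psi - (Psi + Gamma) ** 2 / (Psi + Lam + 2 * Gamma)  # (25)
        if abs(Sigma_next - Sigma) < tol:
            Sigma = Sigma_next
            break
        Sigma = Sigma_next
    else:
        raise RuntimeError("iteration on (25) did not converge")
    Psi = Omega + A ** 2 * Sigma
    theta = (Psi + Gamma) / (Psi + Lam + 2 * Gamma)      # Equation (24)
    return theta, Sigma


# Hypothetical parameter values.
theta_0, Sigma_0 = se_correlated(A=0.9, Omega=1.0, Lam=1.0, rho=0.0)
theta_a, Sigma_a = se_correlated(A=0.9, Omega=1.0, Lam=1.0, rho=0.3)
theta_b, Sigma_b = se_correlated(A=0.9, Omega=3.0, Lam=3.0, rho=0.3)

print("rho = 0:   theta =", theta_0, " Sigma/Lam =", Sigma_0 / 1.0)
print("rho = 0.3: theta =", theta_a, " scaled:", theta_b)
```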

In the RI problem, the correlation generalizes the assumption in Sims (2003) on the uncorrelated RI-induced noise. In the steady state, Equation (25) can be rewritten as the following quadratic equation in terms of [square root of [LAMBDA]]: [[[rho].sup.2][OMEGA] - ([PSI] - [SIGMA])][LAMBDA] + 2[rho][SIGMA][square root of [OMEGA]][square root of [LAMBDA]] + [SIGMA][PSI] = 0, which can be solved for

(26) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

It is clear from (26) that if [kappa] is fixed, the change in [OMEGA] will lead to the same change in [SIGMA], [PSI], and [LAMBDA], but has no effect on the Kalman gain [theta] = [SIGMA][[LAMBDA].sup.-1]. That is, the presence of correlated noise does not change the dynamic behavior of the model.

Consider the RI problem with a fixed information-processing cost ([lambda]). From Equations (21) and (26), it is clear that in the presence of correlated noise ([rho] > 0), [SIGMA] changes less than proportionally when [OMEGA] changes because [partial derivative] ln [SIGMA]/[partial derivative] ln [OMEGA] < 1, and consequently, the change in [OMEGA] will also change [theta] because [LAMBDA] depends on the interactions between [OMEGA] and [SIGMA].

B. The Multivariate Case

In the multivariate RI problem, it is more difficult to determine the steady-state conditional variance-covariance matrix [SIGMA] because it cannot be computed analytically. Here we follow Sims (2003) and calculate the expected loss due to imperfect observations under RI. Specifically, we assume that the value functions under full information and imperfect information can be written as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

respectively. (13) We can compute the optimal [SIGMA] by minimizing the expected welfare loss due to RI,

(27) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

subject to information-processing constraints. Note that to solve this problem numerically, we need to use a two-stage procedure. (14) First, under the LQG assumption, the certainty equivalence principle applies and the decision rule under imperfect information,

(28) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

is independent of [SIGMA] or [LAMBDA]. We then use this decision rule to determine [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] which depends on [SIGMA] and [LAMBDA]. Applying the welfare criterion proposed in Equation (27), we can solve for optimal steady-state [SIGMA] and [LAMBDA].

Solving the problem posed in Equation (27) is equivalent to solving the semidefinite programming problem

(29) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

subject to

(30) -ln ([absolute value of [SIGMA]]) + ln ([absolute value of [A.sup.T][SIGMA]A + [OMEGA]]) [less than or equal to] 2[kappa],

(31) [A.sup.T][SIGMA]A + [OMEGA] [greater than or equal to] [SIGMA],

where Z = [F.sup.T]RF - [F.sup.T] W + [beta]([F.sup.T] [B.sup.T] PBF + [F.sup.T] [B.sup.T] PA + [A.sup.T] PBF) (see Appendix SI, Supporting Information, for the derivation). If the positive-definiteness constraint on [A.sup.T] [SIGMA] A + [OMEGA] - [SIGMA], Equation (31), does not bind, the first-order condition for [SIGMA] can be written as

(32) [[SIGMA].sup.-1] = [(G[SIGMA][G.sup.T] + [G.sub.0]).sup.-1] - Z/[lambda],

where G = [([A.sup.T]).sup.-1] A and [G.sub.0] = [([A.sup.T]).sup.-1] [OMEGA][A.sup.-1]. We can then use standard methods to solve Equation (32). When applied to a permanent income model in the next section, we first solve this equation and then check whether in fact Equation (31) is satisfied by the optimal solution of [SIGMA]. If so, the problem is solved. (15)

After computing the optimal steady-state [SIGMA], we can then use Equation (13) to determine the steady-state [LAMBDA] and Equation (14) to determine the Kalman gain [theta]. Therefore, the key difference between SE and RI is that in the SE problem we need to specify the process of the noise first, whereas in the RI problem we need to first specify the value of channel capacity that determines the steady-state conditional variance of the state by solving the semidefinite programming problem proposed in Equation (29) subject to Equations (30) and (31). (16) Theoretically, it is clear that after solving an RI problem, we can always reconstruct an SE problem using the resulting endogenous noise due to RI as the input, and the two models are observationally equivalent in this sense. However, it is difficult to specify the process of the vector of noises ex ante when modeling the multivariate SE problem. (17)

When modeling the multivariate RI problem we only need to set a value for channel capacity and then compute optimal conditional variance-covariance matrices of the state and the variance-covariance matrices of the noise vector by solving the constrained semidefinite minimization problem (29). Therefore, in the multivariate RI problem, the agent's preference, budget constraint, and information-processing constraints jointly determine the values of [SIGMA], [LAMBDA], and [theta], whereas in the multivariate SE problem given [LAMBDA], Equation (13) that is used to determine [SIGMA] and [theta] only depends on the budget constraint. If the noise in SE is specified exogenously, it may violate the optimality conditions for RI; for example, Melosi (2009) showed that a particular estimated SE model does not equate the marginal utility of attention across states, implying that the variance-covariance matrix of the noise would not be consistent with any channel capacity. Of course, obtaining the marginal utility of attention requires solving the RI problem, so it will be difficult to specify ex ante an SE problem consistent with RI. (18)

We now consider the different policy effects of RI and SE in the multivariate case. We first assume that initially the SE and RI problems have the same Kalman gain that generates the same dynamic behavior. Suppose that the variance-covariance matrix of the exogenous shock, [OMEGA], is scaled up because of a change in policy. (19) In the SE problem with fixed [LAMBDA], Equations (17) and (18) imply that a change of [OMEGA] will lead to a different solution for [SIGMA] and [theta], that is, the change in policy will lead to a change in the model's dynamics. In contrast, in the multivariate RI problem, as shown in Equations (29)-(31), a change in [OMEGA] will have complicated effects on [SIGMA], [LAMBDA], and [theta]. In other words, in the multivariate case a change in policy will affect the model's behavior in both SE and RI problems. (Note that in the univariate RI case with fixed capacity, the change in policy does not change the model's dynamics.)

We next consider the effects of RI and SE in Case 2 (i.e., the SNR, [OMEGA][[LAMBDA].sup.-1], is fixed in the SE problem). As before, we assume that initially the SE and RI problems have the same Kalman gain. To illustrate how a change in [OMEGA] affects the Kalman gain in RI and SE problems, we multiply [SIGMA] on both sides of Equation (17):

(33) [SIGMA][[LAMBDA].sup.-1] = I - [[A[SIGMA][A.sup.T] [[SIGMA].sup.-1] + ([OMEGA][[LAMBDA].sup.-1])([LAMBDA][[SIGMA].sup.-1])].sup.-1],

where I is the identity matrix and we use the fact that [OMEGA][[SIGMA].sup.-1] = ([OMEGA][[LAMBDA].sup.-1]) ([LAMBDA][[SIGMA].sup.-1]). In Case 2, the policy has the same impact on [OMEGA] and [LAMBDA] to keep the SNR fixed. Equation (33) clearly shows that if the policy changes [SIGMA] and then A[SIGMA][A.sup.T][[SIGMA].sup.-1], it will affect [theta] = [SIGMA][[LAMBDA].sup.-1] even in Case 2. Multiplying [OMEGA] on both sides of (17) gives

(34) [OMEGA][[LAMBDA].sup.-1] = [OMEGA][[SIGMA].sup.-1] - [(A[SIGMA][A.sup.T] [[OMEGA].sup.-1] + I).sup.-1],

which means that a change in [OMEGA] will lead to a different [SIGMA] given that [OMEGA][[LAMBDA].sup.-1] is fixed. Note that in the univariate case, A[SIGMA][A.sup.T][[SIGMA].sup.-1] = [A.sup.2], which means that the policy has no impact on [theta], and the SE and RI problems cannot be distinguished by the policy in Case 2, in which the SNR, [OMEGA][[LAMBDA].sup.-1], is fixed.

IV. APPLICATIONS TO MODELS OF PERMANENT INCOME AND PRICE SETTING

In this section, we consider the effects of SE and RI for consumption dynamics and their policy and welfare implications in an otherwise standard permanent income model. As in the previous section we first consider applications to the univariate case and then discuss applications to the multivariate case.

We are not the first to examine this model. Sims (2003) examined how RI affects consumption dynamics when the agent has only limited capacity for processing information. Luo (2008) showed that the RI permanent income model can be solved explicitly even if the income process is not iid, and then examined how RI can resolve the well-known excess smoothness and excess sensitivity puzzles; that model admits a reduction to a single state variable. (20) To explore the attention allocation problem explicitly, we also consider a simplified price-setting model used in Woodford (2001) and Mackowiak and Wiederholt (2009), and argue that this model can also help distinguish RI from SE.

A. The Univariate Permanent Income Model

Optimal Consumption. Following Luo (2008), we have the subsequent univariate version of the standard permanent income model (Flavin 1981; Hall 1978) in which households solve the dynamic consumption-savings problem

(35) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

subject to

(36) [s.sub.t+1] = R[s.sub.t] - [c.sub.t] - [[zeta].sub.t+1],

where u([c.sub.t]) = - 1/2 [([bar.c] - [c.sub.t]).sup.2] is the period utility function, [bar.c] > 0 is the bliss point, [c.sub.t] is consumption,

(37) [s.sub.t] = [w.sub.t] + [1/R] [[infinity] summation over (j=0)] [R.sup.-j] [E.sub.t] [[y.sub.t+j]]

is permanent income (the expected present value of lifetime resources), consisting of financial wealth ([w.sub.t]) plus human wealth (i.e., the expected discounted present value of current and future labor income, y),

(38) [[zeta].sub.t+1] = 1/R [[infinity] summation over (j=t+1)] [(1/R).sup.j-(t+1)] ([E.sub.t+1] - [E.sub.t]) [[y.sub.j]]

is the time (t + 1) innovation to permanent income with mean 0 and variance [[omega].sup.2.sub.[zeta]], [w.sub.t] is cash-on-hand (or market resources), [y.sub.t] is a general income process with Gaussian white noise innovations, [beta] is the discount factor, and R is the constant gross interest rate at which the consumer can borrow and lend freely. (21) We assume y follows an AR(1) process with persistence coefficient [rho] [member of] [0, 1], [y.sub.t+1] = [rho][y.sub.t] + [[epsilon].sub.t+1], where [[epsilon].sub.t+1] ~ N (0, [[omega].sup.2]), so that [s.sub.t] = [w.sub.t] + [y.sub.t]/ (R - [rho]) and [[zeta].sub.t+1] = [[epsilon].sub.t+1]/ (R - [rho]). (22) For the rest of the paper we will restrict attention to points where [c.sub.t] < [bar.c], so that utility is increasing and concave; following the literature we impose the restriction that [beta]R = 1, because it implies a stationary path for consumption. This specification follows that in Hall (1978) and Flavin (1981) and implies that optimal consumption is determined solely by permanent income:

(39) [c.sub.t] = (R-1)[s.sub.t].

Within this LQG setting, the certainty equivalence principle holds and introducing SE or RI leads to the following new consumption function:

(40) [c.sub.t] = (R - 1)[[??].sub.t],

where [[??].sub.t] = [E.sub.t] [[s.sub.t]] is the perceived state and is governed by the following Kalman filtering equation

(41) [[??].sub.t+1] = (1 - [theta]) (R[[??].sub.t] - [c.sub.t]) + [theta] ([s.sub.t+1] + [[xi].sub.t+1]),

where [theta] is the Kalman gain and [s.sub.0] ~ N ([[??].sub.0], [[SIGMA].sub.0]) is given. As shown in Luo (2008), combining Equations (36), (40), and (41) yields the following expression for the change in consumption:

(42) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where L is the lag operator. We require (1 - [theta])[R.sup.2] < 1, the model equivalent of the stabilizability condition stated before (this condition implies (1 - [theta])R < 1 since R > 1). This MA([infinity]) process shows that the dynamic behavior of the model is strongly influenced by the Kalman gain [theta]. Using the explicit expression for consumption growth (42), we can compute the key stochastic properties of the consumption process: the volatility of consumption growth, the persistence of consumption growth, and the correlation between consumption growth and income shocks. (23) All these moments depend on the Kalman gain. In other words, SE and RI lead to different consumption processes if and only if the resulting [theta] differs.
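The role of the Kalman gain can be illustrated with a small simulation of Equations (36), (40), and (41). The sketch below is only an illustration under stated assumptions: the function names, the default parameter values, and the seed are hypothetical, and the noise variance is backed out from [theta] using the standard scalar steady-state relations [PSI] = [R.sup.2][SIGMA] + [[omega].sup.2.sub.[zeta]], [theta] = 1 - [SIGMA]/[PSI], and [theta] = [SIGMA]/[LAMBDA].

```python
import math
import random


def steady_state_variances(theta, R, omega2):
    """Return (Sigma, Lambda) implied by a given Kalman gain theta.

    Assumes the scalar steady-state relations Psi = R^2 * Sigma + omega2,
    theta = 1 - Sigma / Psi and theta = Sigma / Lambda.
    """
    if not (1.0 - theta) * R**2 < 1.0:
        raise ValueError("need (1 - theta) * R^2 < 1")
    sigma = omega2 * (1.0 - theta) / (1.0 - R**2 * (1.0 - theta))
    lam = sigma / theta
    return sigma, lam


def simulate(theta, R=1.02, omega2=1.0, periods=100_000, seed=0):
    """Simulate Equations (36), (40) and (41).

    Returns the list of consumption changes and the list of perception
    errors s_t - s_hat_t. All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    _, lam = steady_state_variances(theta, R, omega2)
    s, s_hat = 0.0, 0.0
    dc, errors = [], []
    for _ in range(periods):
        c = (R - 1.0) * s_hat                        # Equation (40)
        zeta = rng.gauss(0.0, math.sqrt(omega2))     # income innovation
        xi = rng.gauss(0.0, math.sqrt(lam))          # observation noise
        s_next = R * s - c - zeta                    # Equation (36) as printed
        # Equation (41): update of the perceived state
        s_hat_next = (1.0 - theta) * (R * s_hat - c) + theta * (s_next + xi)
        dc.append((R - 1.0) * (s_hat_next - s_hat))
        errors.append(s_next - s_hat_next)
        s, s_hat = s_next, s_hat_next
    return dc, errors
```

In a long simulation the sample variance of the perception error should be close to the steady-state [SIGMA] returned by `steady_state_variances`, which gives a quick consistency check on the filter.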

It is worth noting that the representative-agent RI model can be regarded as an extreme case of an RI model with a continuum of agents. For example, consider two PIH models: one with a representative consumer and one with a continuum of consumers endowed with identical channel capacity. While the effect of RI on the reaction of consumption to income shocks is the same in the two PIH environments, the volatility of aggregate consumption depends on the specifics of the aggregation mechanism. As argued in Sims (2003), although the randomness in an agent's response to aggregate shocks will be idiosyncratic because it arises from the agent's own information-processing channel, there is likely a significant common component. Therefore, the common term of the noise lies between 0 and the RI-induced noise. We assume that [[xi].sub.t] consists of two independent noises: [[xi].sub.t] = [[bar.[xi]].sub.t] + [[xi].sup.i.sub.t], where [[bar.[xi]].sub.t] = [E.sup.i] [[[xi].sub.t]] and [[xi].sup.i.sub.t] are the common and idiosyncratic components of the error generated by [[zeta].sub.t], respectively. A single parameter, [lambda] = var([[bar.[xi]].sub.t])/var([[xi].sub.t]) [member of] [0, 1], can be used to measure the common source of coded information on the aggregate component (or the relative importance of [[bar.[xi]].sub.t] vs. [[xi].sub.t]). (Sims 2003 argued that the presence of the common noise can help fit the model with the VAR evidence better.) Given the existing theories on this issue, we cannot pin down the value of [lambda]. Therefore, the [lambda] = 1 case can be viewed as a special case of the model with a continuum of consumers which is equivalent to the representative-agent model in which we do not need to discuss the aggregation issue.

Policy Implications under SE and RI. In this univariate permanent income model, substituting A = R into Equation (19) gives

(43) [[omega].sup.2.sub.[zeta]] [[LAMBDA].sup.-1] = [[omega].sup.2.sub.[zeta]] [[SIGMA].sup.-1] - [[[R.sup.2] [([[omega].sup.2.sub.[zeta]] [[SIGMA].sup.-1]).sup.-1] + 1].sup.-1],

where [[omega].sup.2.sub.[zeta]] [[SIGMA].sup.-1] = ([[omega].sup.2.sub.[zeta]] [[LAMBDA].sup.-1]) ([LAMBDA] [[SIGMA].sup.-1]). Denoting [theta] = [SIGMA][[LAMBDA].sup.-1] and [pi] = [[omega].sup.2.sub.[zeta]] [[LAMBDA].sup.-1], we obtain the following relation between [pi] and [theta]:

(44) [pi] = [theta] (1/[1-[theta]] - [R.sup.2])

Solving for [theta] yields

(45) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where we omit the negative root of [theta] as both [SIGMA] and [LAMBDA] must be positive. Figure 1 illustrates the relationship between [pi] and [theta] given R = 1.02 and [pi] [member of] [0.1, 10]. It clearly shows that [theta] is an increasing function of [pi], the signal-to-noise ratio. Also, as [pi] [right arrow] [infinity], [theta] [right arrow] 1.

[FIGURE 1 OMITTED]

In the RI version of the permanent income model, we have

(46) [SIGMA] = [OMEGA]/(exp (2[kappa]) - [R.sup.2]),

(47) [LAMBDA] = [([[SIGMA].sup.-1] - [[PSI].sup.-1]).sup.-1],

where [PSI] = [R.sup.2][SIGMA] + [OMEGA]. Using (46) and (47), the Kalman filter gain under RI can be written as

(48) [theta] = [SIGMA][[LAMBDA].sup.-1] = 1 - 1/exp(2[kappa]).

Comparing Equations (45) and (48), it is clear that the signal-to-noise ratio ([pi]) and the level of channel capacity ([kappa]) are in one-to-one correspondence. Figure 2 shows the relationship between [kappa] and [pi] when the SE and RI problems are observationally equivalent in the sense that they lead to the same consumption dynamics governed by the Kalman gain [theta], that is, [pi] = (1 - exp (-2[kappa])) (exp (2[kappa]) - [R.sup.2]). This result is consistent with the general conclusion we obtained using Case 2 in the previous section.
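The correspondence between [kappa] and [pi] can be checked numerically. The following is a minimal sketch, assuming [kappa] is measured in nats (since Equation (48) uses the natural exponential); the function names are hypothetical. It evaluates the mapping stated above and confirms that it agrees with Equation (44) evaluated at the gain from Equation (48).

```python
import math


def gain_from_capacity(kappa):
    """Equation (48): theta = 1 - exp(-2 * kappa), with kappa in nats."""
    return 1.0 - math.exp(-2.0 * kappa)


def snr_from_gain(theta, R):
    """Equation (44): pi = theta * (1 / (1 - theta) - R^2)."""
    return theta * (1.0 / (1.0 - theta) - R**2)


def snr_from_capacity(kappa, R):
    """SNR pi under which SE reproduces the RI gain for capacity kappa."""
    if math.exp(2.0 * kappa) <= R**2:
        # Equation (46) requires exp(2 * kappa) > R^2 for a finite Sigma.
        raise ValueError("need exp(2 * kappa) > R^2")
    return (1.0 - math.exp(-2.0 * kappa)) * (math.exp(2.0 * kappa) - R**2)
```

For any admissible capacity, `snr_from_capacity(kappa, R)` and `snr_from_gain(gain_from_capacity(kappa), R)` return the same number up to rounding, and the mapping is increasing in [kappa].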

[FIGURE 2 OMITTED]

Using the same expression for [theta], (45), we can examine how Case 1 can be used to distinguish SE and RI when implementing a change in government policy. Specifically, in the SE problem, we assume that before the government implements stabilization policies, the SNR [pi] = [[omega].sup.2.sub.[zeta]]/[LAMBDA] = 2. In this case, [theta] = 0.73. After the government implements these policies, the variance of the shock to permanent income will be reduced from [[omega].sup.2.sub.[zeta]] to 0.5 [[omega].sup.2.sub.[zeta]]. As [LAMBDA] is fixed in Case 1, [pi] will fall from 2 to 1; consequently, [theta] = 0.62. We now assume that the RI and SE problems are observationally equivalent in the sense that they lead to the same [theta] = 0.73 before implementing the stabilization policies. After implementing these policies, [[omega].sup.2.sub.[zeta]] will be scaled down to 0.5 [[omega].sup.2.sub.[zeta]], and the RI theory predicts that both [SIGMA] and [LAMBDA] will be scaled down to 0.5[SIGMA] and 0.5[LAMBDA], respectively. (24) Consequently, the Kalman filter gain, [theta] = [SIGMA][[LAMBDA].sup.-1] = 0.73, remains unchanged. In other words, stabilization policy has different implications for consumption dynamics in the SE and RI models.
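The numbers in this experiment can be reproduced from Equation (44). Rearranging (44) gives the quadratic [R.sup.2][[theta].sup.2] + (1 + [pi] - [R.sup.2])[theta] - [pi] = 0, whose positive root is used below. This is a sketch under the assumption that R = 1.02, the value used for Figure 1; the function name is hypothetical.

```python
import math


def gain_from_snr(pi, R):
    """Positive root of R^2*theta^2 + (1 + pi - R^2)*theta - pi = 0,
    obtained by rearranging Equation (44)."""
    b = 1.0 + pi - R**2
    return (-b + math.sqrt(b * b + 4.0 * R**2 * pi)) / (2.0 * R**2)


R = 1.02                                  # assumed, as in Figure 1
theta_before = gain_from_snr(2.0, R)      # SNR before the policy
theta_se_after = gain_from_snr(1.0, R)    # SE: Lambda fixed, variance halved
theta_ri_after = theta_before             # RI: fixed kappa leaves (48) unchanged

print(round(theta_before, 2), round(theta_se_after, 2), round(theta_ri_after, 2))
# prints: 0.73 0.62 0.73
```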

Alternatively, if we assume that the cost of information processing ([lambda]) is fixed, the optimal conditional variance equals

(49) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Comparing Equations (49) with (46), it is clear that the two modeling strategies are observationally equivalent in the sense that they lead to the same conditional variance if the following equality holds:

(50) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

In this case, the Kalman gain is

(51) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

After implementing these policies, [[omega].sup.2.sub.[zeta]] is scaled down to 0.5 [[omega].sup.2.sub.[zeta]] (i.e., the economy switches to a more stable environment) and the fixed [lambda] theory predicts that the Kalman filter gain, [theta] = [SIGMA][[LAMBDA].sup.-1], is reduced. For example, before the government implements stabilization policies, we have [lambda]/[[omega].sup.2.sub.[zeta]] = 0.000135 and [theta] = 0.79. After the policy, we can easily calculate that [theta] = 0.68 using Equation (51). Figure 3 plots the different implications of SE and RI for consumption dynamics after implementing the stabilization policy: consumption growth falls more (less) under SE than RI when [kappa] is fixed (when [lambda] is fixed), since the Kalman gain decreases (increases). The intuition behind this result is simple. In the fixed [lambda] case some capacity will be reallocated to other sources because a reduction in macroeconomic uncertainty leads to smaller welfare losses due to RI.

[FIGURE 3 OMITTED]

Implications for Consumption Volatility in Emerging and Developed Countries. Aguiar and Gopinath (2007) consider two groups of small economies (emerging and developed economies), and find that consumption is more volatile than income at business cycle frequencies for emerging markets, as compared to a ratio of less than one for developed markets. They show that a SOE real business cycle (RBC) model driven primarily by shocks to trend growth can explain well this regularity about the relative volatility of consumption to income in emerging markets. However, by using a long time series data over 1900-2005, Garcia-Cicco, Pancrazi, and Uribe (2010) estimate an RBC model driven by the same shocks considered in Aguiar and Gopinath (2007), and find that the model does a poor job explaining the observed business cycle fluctuations in Argentina and Mexico along a number of dimensions including the relative volatility of consumption to income. In this section, we briefly show that the RI model with a fixed information-processing cost can explain the observed difference in consumption volatility in emerging and developed countries.

The PIH model presented in Section A can be regarded as an SOE model in which the constant interest rate is given exogenously and there is a continuum of consumers in the model economy. Using Equation (42) and assuming that all idiosyncratic errors are canceled out after aggregation, the relative volatility of consumption growth to income growth can be written as

(52) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where sd(*) denotes standard deviation. It is straightforward to show that the relative consumption volatility is decreasing with the degree of imperfect state observations because [partial derivative][mu]/[partial derivative][theta] < 0. As shown in Luo, Nie, and Young (2012), if there is no imperfect-state-observation assumption (i.e., [theta]=1), the model cannot generate the empirical relative consumption volatility. For example, if R = 1.04, the full information model predicts that [mu] in emerging and developed economies would be 0.28 and 0.24, respectively. In contrast, in the data, the corresponding [mu] values are 1.35 and 0.98, respectively. (25) In the RI model with fixed capacity, [theta] is uniquely determined by fixed capacity [kappa] and thus has no impact on the cross-country comparison if emerging and developed countries have the same average amount of channel capacity. In contrast, if we adopt the fixed information-processing cost assumption, Equations (51) and (52) have the potential to generate the observed difference in consumption volatility in emerging and developed countries because [theta] is an increasing function of income uncertainty and income uncertainty in emerging countries is much higher than that in developed countries (sd ([DELTA]y)/mean(y) is 3.82 in emerging countries, while it is 2.07 in developed countries). Intuitively, in developed countries consumers pay less attention to macroeconomic conditions because the fundamental uncertainty is low; consequently, the aggregate consumption process in these countries is more stable relative to the income process. (26)

General Equilibrium Asset Pricing Implications. The PIH model presented in Section A is usually regarded as a partial equilibrium model. However, as noted in Hansen (1987) and Cochrane (2005), it is not a partial equilibrium result--it is a general equilibrium model with a linear production technology and an exogenous income process. Given the expression of optimal consumption in terms of the state variables derived from the PIH model with imperfect-state-observation, we can price assets by treating the process of aggregate consumption that solves the model as though it was an endowment process. In this setup, equilibrium prices are shadow prices that leave the agent content with that endowment process.

In the model setting specified in Section A, w can be regarded as capital. R can be regarded as the return on the linear technology and is not yet the interest rate (the equilibrium rate of return on one-period claims to consumption). As proposed in Cochrane (2005) and used in Luo and Young (2010b), after finding optimal consumption as in Equation (40), we can price one-period claims using this equilibrium consumption stream. Denoting the risk-free rate by [R.sup.f], we have the following Euler equation:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where [E.sub.t] [*] is the consumer's expectation operator conditional on his/her processed information at time t. We can now use the basic pricing equation, p = E[mx], (27) to compute the price of the stream of aggregate consumption (treated as the stream of endowments) as (28)

(53) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [m.sub.t,t+j] [equivalent to] [[beta].sup.j] [u'([c.sub.t+j])/u'([c.sub.t])] is the stochastic discount factor, and [XI] [equivalent to] [[summation].sup.[infinity].sub.j=1] ([[beta].sup.j] [var.sub.t] [[c.sub.t+j]]) = [(2-[theta]R)/1-[R.sup.2](1-[theta])] [[omega].sup.2.sub.[zeta]]. Denoting the risk-neutral component by [p.sup.rn.sub.t] and the risk-adjusted component by [p.sup.ra.sub.t], we have

(54) [p.sup.rn.sub.t] = [1/(R - 1)][c.sub.t]

and

(55) [p.sup.ra.sub.t] = [1/[bar.c]- [c.sub.t]] [(2-[theta])R/1-[R.sup.2](1-[theta])][[omega].sup.2.sub.[zeta]].

Equation (53) yields the following implications. The first term in Equation (53) is the risk-neutral component denoted by [p.sup.rn.sub.t]. This term can be regarded as the value of a perpetuity paying [c.sub.t]. The second term is the risk-adjusted component, [p.sup.ra.sub.t]; it lowers the asset price relative to the risk-neutral level because [c.sub.t] [less than or equal to] [bar.c] and it is decreasing with the degree of attention ([theta]).

From Equations (53) and (55), it is clear that the Kalman gain ([theta]) also plays a key role in determining the general equilibrium asset prices under both SE and RI. Consider the same policy experiment we discussed in the last section. If the economy switches to a more stable environment as a result of stabilization policy (i.e., [[omega].sup.2.sub.[zeta]] is scaled down to 0.5 [[omega].sup.2.sub.[zeta]]), the fixed [lambda] assumption predicts that the Kalman filter gain, [theta] = [SIGMA][[LAMBDA].sup.-1], is reduced. Given that [lambda]/[[omega].sup.2.sub.[zeta]] = 0.000135 and [theta] = 0.79, we can easily calculate that [theta] = 0.68 using Equation (51). Equation (55) clearly shows that the asset price falls less (more) under SE than RI when [kappa] is fixed (when [lambda] is fixed), since the Kalman gain decreases (increases).

B. The Multivariate Permanent Income Model

In this section, we solve for optimal steady-state [SIGMA] and [LAMBDA] in a parametric multivariate RI permanent income model and then illustrate the differences between RI and SE problems. This example is similar to that discussed in Sims (2003) and considers multiple income shocks with different stochastic properties. Specifically, we assume that the original budget constraint is as follows

(56) [w.sub.t+1] = R[w.sub.t] - [c.sub.t] + [y.sub.t+1],

where [w.sub.t] is the amount of cash-on-hand, and the income process [y.sub.t] has two persistent components (x and z) and one transitory component ([[epsilon].sub.y,t])

[y.sub.t] = [bar.y] + [x.sub.t] + [z.sub.t] + [[epsilon].sub.y,t], [x.sub.t] = 0.99[x.sub.t-1] + [[epsilon].sub.x,t], [z.sub.t] = 0.95[z.sub.t-1] + [[epsilon].sub.z,t],

with

(57) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where [x.sub.t] is the most persistent and smooth component and [[epsilon].sub.y,t] is the most transitory and volatile component. For the quadratic utility function u([c.sub.t]) = - 1/2 [([c.sub.t] - [bar.c]).sup.2], using the first welfare criterion (27) provided in Section B and setting [beta] = 0.95, we can compute that

(58) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

when capacity [kappa] = 2.2 bits, which can be used to compute the variance of the noise [LAMBDA] using [[LAMBDA].sup.-1] = [[SIGMA].sup.-1] - [[psi].sup.-1], and then compute the Kalman gain according to [theta] = [SIGMA][[LAMBDA].sup.-1]. It is clear from Equation (58) that owing to the low capacity devoted to monitoring the state, the post-observation variances (i.e., the conditional variances) of both the x and z components are greater than the corresponding innovation variances in Equation (57). More importantly, the conditional variance of the slow-moving x component is 18 times larger than its corresponding innovation variance, whereas that of the fast-moving z component is only 2 times larger than its innovation variance. (29) The intuition behind this result is that the optimizing agent devotes much less capacity to monitoring the slow-moving component, which leads to greater effects on the conditional variance term. Figure 4 plots the impulse responses of consumption to the income shocks and noises. It shows that consumption reacts to the income shocks gradually and with delay, and reacts to the corresponding noises promptly. In addition, we can see that the response of consumption to the slow-moving x component is much more damped than that to the fast-moving z component. It is also worth noting that as the agent only cares about the trace of Z[SIGMA] and the symmetric matrix Z is negative semidefinite, the agent with low capacity will choose to make the post-observations of the states be negatively correlated. This correlation conserves capacity by permitting some information about each state to be transmitted using a single nat.
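The step from [SIGMA] to [LAMBDA] and [theta] described above can be sketched for a two-state case. Everything numeric below is hypothetical and is not taken from Equations (57) and (58); the helper names are also illustrative. The sketch assumes [PSI] = A[SIGMA][A.sup.T] + [OMEGA], the multivariate analogue of the univariate definition given earlier, and then applies [[LAMBDA].sup.-1] = [[SIGMA].sup.-1] - [[PSI].sup.-1] and [theta] = [SIGMA][[LAMBDA].sup.-1].

```python
def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine2(p, q, sign=1.0):
    """Elementwise p + sign * q for 2x2 matrices."""
    return [[p[i][j] + sign * q[i][j] for j in range(2)] for i in range(2)]


def transpose2(p):
    """Transpose of a 2x2 matrix."""
    return [[p[j][i] for j in range(2)] for i in range(2)]


def noise_and_gain(sigma, a_mat, omega):
    """Return (Psi, Lambda, theta) for a given conditional covariance Sigma."""
    psi = combine2(matmul2(matmul2(a_mat, sigma), transpose2(a_mat)), omega)
    lam_inv = combine2(inv2(sigma), inv2(psi), sign=-1.0)
    lam = inv2(lam_inv)
    theta = matmul2(sigma, lam_inv)
    return psi, lam, theta


# Hypothetical illustrative inputs, NOT the paper's calibration.
SIGMA = [[1.0, -0.2], [-0.2, 0.5]]
A = [[0.99, 0.0], [0.0, 0.95]]
OMEGA = [[0.5, 0.0], [0.0, 0.5]]
PSI, LAMBDA, THETA = noise_and_gain(SIGMA, A, OMEGA)
```

By construction the resulting gain also equals I - [SIGMA][[PSI].sup.-1], which provides a simple check on the computation.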

[FIGURE 4 OMITTED]

When we relax the information-processing capacity and increase [kappa] to 2.8 nats, the conditional covariance matrix becomes

(59) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Comparing Equations (58) with (59), we can see that relaxing information-processing capacity has the largest impact on the conditional variance of the endogenous state variable w: the post-observation variance of w is reduced to about half the initial value. The intuition behind this result is that the endogenous variable plays the most important role in the welfare losses due to RI. To see this clearly, the matrix Z is displayed here:

(60) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

While w per unit has less of an effect on welfare, it is proportionally much larger than either of the other two state variables. It is also clear that as the information constraint is relaxed the agent chooses to allocate more capacity to monitoring the slow-moving component x than to monitoring the z component.

Note that in the RI problem (58) is optimal in the sense that it minimizes the expected welfare losses due to finite information-processing capacity by allocating fixed capacity optimally across different elements in the state vector. In contrast, in the SE problem, [LAMBDA] must be specified first and then [SIGMA] and [theta] can be computed. However, it is difficult to specify [LAMBDA] without prior knowledge about the states. Ad hoc assumptions on [LAMBDA] might be inconsistent with the underlying efficiency conditions. Therefore, RI could provide a useful way to specify the stochastic properties of the noises by solving the agent's optimization problem subject to information constraints. As we have noted previously, Melosi (2009) presents an application of this idea; he notes that a particular estimated model shows that the marginal utility of information is not equated across variables and is thus inconsistent with RI (i.e., inconsistent with any value for [kappa]).

C. Attention Allocation in a Price-Setting Model

In the multivariate permanent income model we discussed in the previous section, the difference between the RI and SE problems comes from the optimal allocation of capacity, and the consumption reactions to different income shocks do depend on optimal attention allocation. However, in this setting, the optimal attention/capacity allocation cannot be addressed explicitly because there exists an endogenous state variable and attention allocation between the endogenous state variable (w) and exogenous state variables (persistent components in y) is nonsymmetric and complicated. In contrast, as has been shown in Mackowiak and Wiederholt (2009), the attention allocation problem between the exogenously given aggregate and idiosyncratic shocks has an explicit solution. In this section, we present a simplified version of their price-setting model.

Consider a firm price-setting model with monopolistic competition in which firms continuously observe noisy signals about two random variables: one aggregate variable and one firm-specific variable. Under the full-information assumption, the profit-maximizing price (in logs) of firm i, [p.sub.i], can be written as

(61) [p.sup.f.sub.i] = p + [[alpha].sub.x] x + [[alpha].sub.z] [z.sub.i],

where p is the log of the aggregate price level, x is the log of aggregate output, [z.sub.i] is an idiosyncratic demand shock, and [[alpha].sub.x] and [[alpha].sub.z] are coefficients that depend on structural parameters in the profit function (the superscript f in the price function indicates full information). All the variables on the RHS are assumed to be normally distributed. Since the sum of two normal variables is also normal, we can summarize the aggregate condition as y = p + [[alpha].sub.x] x such that

(62) [p.sup.f.sub.i] = y + [[alpha].sub.z] [z.sub.i],

where y and [z.sub.i] are assumed to be Gaussian variables with mean 0 and variances [[sigma].sup.2.sub.y] and [[sigma].sup.2.sub.z], respectively. Under RI, the typical firm cannot observe y and [z.sub.i] perfectly, so all it can observe are noisy signals [y.sup.*] on the aggregate state and [z.sup.*] on the idiosyncratic state:

(63) [y.sup.*] = y + [[xi].sub.y] and [z.sup.*] = [z.sub.i] + [[xi].sub.z],

where the noises due to RI, [[xi].sub.y] and [[xi].sub.z], are Gaussian variables with mean 0 and variances [[omega].sup.2.sub.y] and [[omega].sup.2.sub.z], respectively. They are assumed to be independent of each other and are also not correlated across firms. (30) Given the observed independent noisy signals, the optimal price can be written as

(64) [p.sup.*.sub.i] = E [y|[y.sup.*]] + [[alpha].sub.z] E [[z.sub.i] |[z.sup.*]].

Because [p.sup.*.sub.i] is different from the full-information solution [p.sup.f.sub.i], the firm suffers profit losses from RI. The profit loss function of the firm can be written as

(65) [DELTA][pi] = [gamma]/2[([p.sup.f.sub.i] - [p.sup.*.sub.i]).sup.2],

where [gamma] > 0. Given Equations (64) and (65), the attention allocation problem can be reduced to

(66) min var [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

subject to the information-processing constraint (IPC):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

which can be reduced to [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] where [[sigma].sup.2.sub.y|[y.sup.*]] and [[sigma].sup.2.sub.z|[z.sup.*]] are posterior conditional variances of y and [z.sub.i], respectively, and we use the facts that (1) for a quadratic objective function and Gaussian state variables, it is optimal to choose the joint density of the states and noisy signals to be also Gaussian and (2) the aggregate and idiosyncratic variables are uncorrelated. After using the IPC to substitute out [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], we obtain the optimal solution of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]:

(67) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

which gives the ratio of posterior to prior precision of beliefs about the aggregate condition under the optimal attention allocation. (31) Equation (67) provides several important implications for the optimal attention allocation to the aggregate condition. First, greater values of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] mean that the firm pays more attention to the aggregate condition. Second, given [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is increasing with the relative importance of the prior variances of the aggregate and firm-specific conditions. As the firm-specific shock is 10 times more volatile than the aggregate shock as calibrated in Mackowiak and Wiederholt (2009), the firm optimally pays much more attention to the firm-specific shock.

Similarly, we can obtain optimal [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(68) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Finally, using Equations (67) and (68), we can easily recover the variances of the noises, [[omega].sup.2.sub.y] and [[omega].sup.2.sub.z],

(69) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(70) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In the corresponding SE problem in which [[omega].sup.2.sub.y] and [[omega].sup.2.sub.z] are given exogenously and arbitrarily, using Equations (69) and (70), we can recover the posterior conditional variances, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], which may not be optimal in the sense that they do not lead to the minimum welfare loss, var ([p.sup.f.sub.i] | [y.sup.*], [z.sup.*]). The argument in this model is similar to that in the permanent income model with an endogenous state variable in the previous section. (32)
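The logic of the attention allocation problem can be sketched numerically. Because Equations (66) to (70) are not reproduced here, the sketch below rests on explicit assumptions that may differ from the paper's exact formulation: the objective is taken to be the posterior variance of y plus [[alpha].sup.2.sub.z] times the posterior variance of [z.sub.i], the IPC is taken to be 0.5 ln(prior/posterior) summed over the two variables and bounded by [kappa] in nats, and the noise variances are recovered from the standard Gaussian updating rule 1/posterior = 1/prior + 1/noise. All function names are hypothetical.

```python
import math


def allocate_attention(var_y, var_z, alpha_z, kappa):
    """Posterior variances (post_y, post_z) minimizing
    post_y + alpha_z^2 * post_z subject to the assumed constraint
    0.5*ln(var_y/post_y) + 0.5*ln(var_z/post_z) <= kappa (nats)."""
    if var_y <= 0.0 or var_z <= 0.0 or kappa < 0.0:
        raise ValueError("variances must be positive and kappa non-negative")
    a = abs(alpha_z)
    shrink = math.exp(-kappa)
    if a * math.sqrt(var_z) * shrink >= math.sqrt(var_y):
        # Corner: all capacity goes to the idiosyncratic variable.
        return var_y, var_z * shrink**2
    if math.sqrt(var_y) * shrink >= a * math.sqrt(var_z):
        # Corner: all capacity goes to the aggregate variable.
        return var_y * shrink**2, var_z
    # Interior: weighted posterior variances are equalized,
    # post_y = alpha_z^2 * post_z, and the constraint binds.
    scale = math.sqrt(var_y * var_z) * shrink
    return a * scale, scale / a


def noise_variance(prior_var, post_var):
    """Noise variance implied by 1/post = 1/prior + 1/noise."""
    if post_var >= prior_var:
        return math.inf  # no attention paid: the signal is uninformative
    return 1.0 / (1.0 / post_var - 1.0 / prior_var)
```

Under these assumptions, an SE specification that fixes the two noise variances arbitrarily will generally imply posterior variances that differ from the output of `allocate_attention`, and hence a larger value of the objective.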

We conclude this section with a point regarding the importance of distinguishing between SE and RI. In the context of this model, Paciello and Wiederholt (2013) show how the distinction for SE and RI matters for optimal policy. As under RI the variance-covariance matrix of the noises will shift with monetary policy, the central bank will use that fact to improve allocations--specifically, they find that the divine coincidence of monetary policy (no trade-off between inflation and output stabilization) holds under RI but not under SE. Of course, this result does not help us distinguish between the two theories, but rather provides a motive for doing so.

V. DISCUSSION ON THE CONTINUOUS-TIME SETTING

So far we have focused on the discrete-time setting in which agents are assumed to have imperfect state information. In this section we discuss the different implications of SE and RI in the continuous-time setting. For simplicity, we use the permanent income model as an example to illustrate the key different implications of the two information frictions. In the continuous-time version of the permanent income model, the typical consumer is assumed to maximize the following lifetime utility:

(71) [E.sub.0] [[[integral].sup.[infinity].sub.t=0] exp(-[beta]t)u([c.sub.t])dt],

subject to the following budget constraint:

(72) d[w.sub.t] = (r[w.sub.t] + [y.sub.t] - [c.sub.t]) dt,

where u([c.sub.t]) = - [([bar.c] - [c.sub.t]).sup.2] /2, and all the variables here have the same definitions as those in the last section. Labor income ([y.sub.t]) is assumed to follow a continuous-time AR(1) (Ornstein-Uhlenbeck) process:

(73) d[y.sub.t] = [rho] ([mu]/[rho] - [y.sub.t]) dt + [sigma]d[[zeta].sub.t],

where the steady-state income is [bar.y] = [mu]/[rho], [rho] > 0 governs the speed of convergence to the steady state, and [[zeta].sub.t] is a standard Brownian motion.

As in the discrete-time model, to simplify the model we adopt the same state-space reduction approach and define a new state [s.sub.t] = [w.sub.t] + [y.sub.t] / (r + [rho]). Consequently, the new state transition equation can be written as:

(74) d[s.sub.t] = (r[s.sub.t] - [c.sub.t]) dt + [[sigma].sub.s]d[[zeta].sub.t],

where [[sigma].sub.s] = [sigma]/ (r + [rho]). In the full-information case, it is straightforward to show that the consumption function is:

(75) [c.sub.t] = r[s.sub.t].
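The step from Equations (72) and (73) to (74) can be sketched as follows. This derivation is an illustration rather than the authors' own: it takes the asset term in the definition of [s.sub.t] to be the wealth [w.sub.t] from Equation (72) and, as a simplifying assumption, sets [mu] = 0 so that no constant appears in the drift.

```latex
% Assumptions (not from the paper): s_t = w_t + y_t/(r+\rho), and \mu = 0.
\begin{align*}
ds_t &= dw_t + \frac{1}{r+\rho}\,dy_t \\
     &= (r w_t + y_t - c_t)\,dt
        + \frac{1}{r+\rho}\bigl(-\rho\, y_t\,dt + \sigma\,d\zeta_t\bigr) \\
     &= \Bigl(r w_t + \frac{r}{r+\rho}\,y_t - c_t\Bigr)dt
        + \frac{\sigma}{r+\rho}\,d\zeta_t \\
     &= (r s_t - c_t)\,dt + \sigma_s\,d\zeta_t ,
     \qquad \sigma_s = \frac{\sigma}{r+\rho}.
\end{align*}
% With \mu \neq 0 the same steps leave an extra drift term \mu/(r+\rho)\,dt,
% which is absorbed if the constant \mu/(r(r+\rho)) is added to s_t.
```

Substituting the full-information rule [c.sub.t] = r[s.sub.t] into the last line removes the drift, so that d[s.sub.t] = [[sigma].sub.s]d[[zeta].sub.t] and consumption follows a martingale.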

We assume that the consumer cannot observe the state perfectly and only observes a noisy signal:

(76) [s.sup.*.sub.t] = [s.sub.t] + [[xi].sub.t],

where [[xi].sub.t] is a Brownian motion with mean 0 and variance [LAMBDA], and is independent of [[zeta].sub.t]. To solve the model, we rewrite Equation (76) in the following differential form:

d[s.sup.*.sub.t] = d[s.sub.t] + d[[xi].sub.t].

In this continuous-time LQG setting, the separation principle holds and the consumption function can now be written as

(77) [c.sub.t] = r[[??].sub.t],

and the perceived state [[??].sub.t] = [E.sub.t] [[s.sub.t]] is governed by the following Kalman filtering equation:

(78) d[[??].sub.t] = (r[[??].sub.t] - [c.sub.t]) dt + [theta]d[[eta].sub.t],

where

(79) d[[eta].sub.t] = r ([s.sub.t] - [[??].sub.t]) dt + [[sigma].sub.s]d[[zeta].sub.t] + d[[xi].sub.t],

with mean E[d[[eta].sub.t]] = 0 and var(d[[eta].sub.t]) = ([[sigma].sup.2.sub.s] + [LAMBDA])dt, [theta] = [SIGMA][[LAMBDA].sup.-1] is the steady-state Kalman gain, and [SIGMA] (= [([[sigma].sup.-2.sub.s] + [[LAMBDA].sup.-1]).sup.-1] > 0) is the conditional variance of [s.sub.t]. Note that [[eta].sub.t] is a Brownian motion with mean 0. Although the Brownian variable, [[xi].sub.t], is not observable, the innovation process, [[eta].sub.t], is observable because it is derived from observable processes (i.e., d[s.sup.*.sub.t] and (r[[??].sub.t] - [c.sub.t])dt). In this case, the path of the conditional expectation, [[??].sub.t], is generated by the path of the innovation process, [[eta].sub.t]. In summary, given [LAMBDA], Equations (77) and (78) completely characterize the dynamics of the continuous-time PIH model.
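The SE system in Equations (74) and (77)-(79) can be illustrated with a short simulation. The sketch below computes [SIGMA] and [theta] from the formulas stated in the text and then discretizes the equations with an Euler-Maruyama scheme. The function names, the discretization, and all parameter values are assumptions made here for illustration and are not the authors' implementation.

```python
import math
import random


def steady_state_gain(sigma_s, lam):
    """Return (Sigma, theta) from the formulas stated in the text.

    Sigma = (sigma_s^-2 + lam^-1)^-1 and theta = Sigma / lam.
    """
    Sigma = 1.0 / (1.0 / sigma_s ** 2 + 1.0 / lam)
    theta = Sigma / lam
    return Sigma, theta


def simulate_filter(r, sigma_s, lam, s0, s_hat0, horizon, dt, seed=0):
    """Euler-Maruyama discretization of Equations (74) and (77)-(79).

    Returns lists of the true state, the perceived state, and consumption.
    """
    rng = random.Random(seed)
    _, theta = steady_state_gain(sigma_s, lam)
    s, s_hat = s0, s_hat0
    states, perceived, consumption = [s], [s_hat], [r * s_hat]
    for _ in range(int(round(horizon / dt))):
        c = r * s_hat                                   # Equation (77)
        d_zeta = rng.gauss(0.0, math.sqrt(dt))          # fundamental shock
        d_xi = rng.gauss(0.0, math.sqrt(lam * dt))      # exogenous noise
        # Innovation, Equation (79), evaluated at the current s and s_hat.
        d_eta = r * (s - s_hat) * dt + sigma_s * d_zeta + d_xi
        s = s + (r * s - c) * dt + sigma_s * d_zeta             # Equation (74)
        s_hat = s_hat + (r * s_hat - c) * dt + theta * d_eta    # Equation (78)
        states.append(s)
        perceived.append(s_hat)
        consumption.append(r * s_hat)
    return states, perceived, consumption
```

Algebraically, the gain computed by `steady_state_gain` reduces to [theta] = [[sigma].sup.2.sub.s] / ([[sigma].sup.2.sub.s] + [LAMBDA]), so it depends on the parameters only through the signal-to-noise ratio [[sigma].sup.2.sub.s] / [LAMBDA].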

We now move on to discuss the RI case in which we assume that the consumer only has finite channel capacity when processing relevant information about the state. As emphasized in Sims (1998) and discussed in Kasa (2006) and Reis (2010), the observation equation, (76), is not suitable to model RI owing to finite capacity because this specification means that in any finite interval, arbitrarily large amounts of information can be passed through the consumer's channel. Specifically, Sims (1998) shows that when the true state (s) and the noise ([xi]) are generated by nicely behaved stochastic differential equations driven by Wiener processes, then if the noise has at least as many derivatives as the true state, noisy observations can transmit infinite amounts of information in finite time, while if the noise has fewer derivatives than the true state, noisy observations over a finite interval can transmit only finite amounts of information. Moscarini (2004) solves this problem by assuming that the agent samples the process at discrete intervals. (33) Kasa (2006) proposes an alternative way to solve this problem. He assumes that the observation equation is written as:

(80) d[s.sup.*.sub.t] = [s.sub.t]dt + d[[xi].sub.t],

where the current signal affects the instantaneous rate of change in the noisy observation. In this specification, the Kalman filtering equation can be written as

(81) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where

(82) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

with mean E[d[[??].sub.t]] = 0 and var (d[[??].sub.t]) = [LAMBDA]dt. Here [[??].sub.t] is also an observable Brownian motion with mean 0. It is clear from Equations (79) and (82) that the two specifications of imperfect information observations, (76) and (80), lead to different consumption behavior.

In this RI case, of course, the variance of the noise, [LAMBDA], is determined endogenously by the following continuous-time version of the information-processing constraint specified in Section A:

(83) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is the joint probability measure on s and [s.sup.*], [P.sub.s] and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] are the marginal probability measures of s and [s.sup.*], respectively, and [kappa] is the finite channel capacity. As shown in Duncan (1970) and Reis (2010), Equation (83) reduces to [SIGMA][[LAMBDA].sup.-1] [less than or equal to] [kappa], which determines the value of [LAMBDA]. Comparing Equation (79) with Equation (82), it is clear that we must distinguish the two noisy signal specifications when modeling RI due to finite capacity; that is, it is not the case that all SE environments can be derived from an underlying RI problem.
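How a capacity constraint can pin down [LAMBDA] may be illustrated numerically. The sketch below takes the reduced constraint [SIGMA][[LAMBDA].sup.-1] [less than or equal to] [kappa] from the text and assumes that it binds. To close the system it also assumes the standard steady-state Kalman-Bucy Riccati equation for the observation specification (80); that equation is introduced here for illustration only, because Equations (81)-(83) are not reproduced in this text. The function names and parameter values are likewise hypothetical.

```python
def noise_variance_from_capacity(Sigma, kappa):
    """Smallest Lambda satisfying Sigma / Lambda <= kappa (binding constraint)."""
    if kappa <= 0:
        raise ValueError("capacity kappa must be positive")
    return Sigma / kappa


def steady_state_under_capacity(r, sigma_s, kappa):
    """Solve for (Sigma, Lambda) under a binding capacity constraint.

    ASSUMPTION (not from the paper): the conditional variance satisfies the
    standard steady-state Kalman-Bucy Riccati equation
        0 = 2*r*Sigma + sigma_s**2 - Sigma**2 / Lambda.
    Substituting Sigma / Lambda = kappa gives Sigma = sigma_s**2 / (kappa - 2*r).
    """
    if kappa <= 2.0 * r:
        raise ValueError("need kappa > 2*r for a finite steady-state variance")
    Sigma = sigma_s ** 2 / (kappa - 2.0 * r)
    Lambda = noise_variance_from_capacity(Sigma, kappa)
    return Sigma, Lambda


if __name__ == "__main__":
    # Hypothetical values: doubling sigma_s scales Lambda by a factor of four.
    for sigma_s in (1.0, 2.0):
        print(sigma_s, steady_state_under_capacity(r=0.04, sigma_s=sigma_s, kappa=1.08))
```

Under these assumptions [LAMBDA] is proportional to [[sigma].sup.2.sub.s], so the noise variance moves with the volatility of the fundamental shock. This is one way to see the contrast with SE, where [LAMBDA] is fixed exogenously.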

VI. CONCLUSIONS

In this paper we have explored the implications of two informational friction theories, SE and RI, for economic behavior, policy, and welfare within the LQG setting. First, we showed that if the variance of the noise itself is fixed exogenously, the two theories can be distinguished as they lead to different dynamics and welfare after implementing government policies. Second, we showed that if the SNR is fixed in the SE problem and the channel capacity is fixed in the RI problem, SE and RI are observationally equivalent in the sense that they lead to the same dynamics even after implementing policies in the univariate case, whereas they generate different policy and welfare implications in the multivariate case. Furthermore, in the multivariate case we showed that under RI the agent's preference, budget constraint, and information-processing constraints jointly determine the stochastic properties of the post-observation variance and endogenous noise; hence, RI provides a micro-founded way to specify the nature of the Kalman gain that governs the model's dynamics.

Distinguishing between SE and RI has policy relevance beyond the simple examples we consider here. As we noted earlier, Paciello and Wiederholt (2013) study optimal monetary policy in a model that nests both SE and RI. They find that the policy can differ across the two types of models--the key is that under RI the policymaker can affect the attention allocation between different shocks. In particular, they find that the divine coincidence of no tradeoff between stabilizing prices and output holds under RI, but not under SE.

Extending our results to compare environments outside the LQG setup will be challenging. Solving the SE problem is straightforward, although computation of the conditional expectations may be difficult depending on the distributional assumptions. (34) The RI problem is difficult to solve, however, because the optimal joint distribution of states and controls is typically not of a known form and not easy to approximate; the optimality of discrete solutions discussed in Matejka and Sims (2010) makes it difficult to characterize the distribution in terms of a small number of parameters. It seems therefore unlikely that SE and RI environments will generally be observationally equivalent.

ABBREVIATIONS

LQG: Linear-Quadratic-Gaussian

RBC: Real Business Cycle

RI: Rational Inattention

SE: Signal Extraction

SNR: Signal-to-Noise Ratio

SOE: Small Open Economy

doi:10.1111/ecin.12073

REFERENCES

Adam, K. "Optimal Monetary Policy in the Presence of Imperfect Common Knowledge." Journal of Monetary Economics, 54(2), 2005, 267-301.

Aguiar, M., and G. Gopinath. "Emerging Market Business Cycles: The Cycle Is the Trend." Journal of Political Economy, 115(1), 2007, 69-102.

Angeletos, G.-M., and J. La'O. "Noisy Business Cycles." NBER Macroeconomics Annual 2009, 24, 2010, 319-78.

Cochrane, J. Asset Pricing. Princeton, NJ: Princeton University Press, 2005.

Duncan, T. E. "On the Calculation of Mutual Information." SIAM Journal on Applied Mathematics, 19, 1970, 215-20.

Flavin, M. A. "The Adjustment of Consumption to Changing Expectations About Future Income." Journal of Political Economy, 89(5), 1981, 974-1009.

Garcia-Cicco, J., R. Pancrazi, and M. Uribe. "Real Business Cycles in Emerging Countries?" American Economic Review, 100(5), 2010, 2510-31.

Hall, R. E. "Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evidence." Journal of Political Economy, 86(6), 1978, 971-87.

Hansen, L. P. "Calculating Asset Prices in Three Example Economies," in Advances in Econometrics, Fifth World Congress, edited by T. F. Bewley. Cambridge: Cambridge University Press, 1987.

Hansen, L. P., and T. J. Sargent. Robustness. Princeton, NJ: Princeton University Press, 2007.

Kahneman, D. Attention and Effort. Englewood Cliffs, NJ: Prentice Hall Press, 1973.

Kasa, K. "Robustness and Information Processing." Review of Economic Dynamics, 9(1), 2006, 1-33.

Kim, J., E. Ko, and T. Yun. "The Role of Bounded Rationality in Macro-Finance Affine Term-Structure Models." Unpublished, 2012.

Lucas, R. E., Jr. "Expectations and the Neutrality of Money." Journal of Economic Theory, 4(1), 1972, 103-24.

Luo, Y. "Consumption Dynamics under Information Processing Constraints." Review of Economic Dynamics, 11(2), 2008, 366-85.

--. "Rational Inattention, Long-run Consumption Risk, and Portfolio Choice." Review of Economic Dynamics, 13(4), 2010, 843-60.

Luo, Y., and E. R. Young. "Risk-sensitive Consumption and Savings under Rational Inattention." American Economic Journal: Macroeconomics, 2(4), 2010a, 281-325.

--. "Asset Pricing under Information-Processing Constraints." Economics Letters, 107(1), 2010b, 26-29.

Luo, Y., J. Nie, and E. R. Young. "Robustness, Information-Processing Constraints, and the Current Account in Small Open Economies." Journal of International Economics, 88(1), 2012, 104-20.

Mackowiak, B., and M. Wiederholt. "Optimal Sticky Prices under Rational Inattention." American Economic Review, 99(3), 2009, 769-803.

Matejka, F., and C. A. Sims. "Discrete Actions in Information-Constrained Tracking Problems." Unpublished, 2010.

Melosi, L. "A Likelihood Analysis of Models with Information Frictions." Penn Institute for Economic Research Working Paper 09-009, 2009.

Menkulasi, J. "Rational Inattention and Changes in Macroeconomic Volatility." Unpublished, University of Maryland, 2010.

Morris, S., and H. S. Shin. "The Social Value of Public Information." American Economic Review, 92(5), 2002, 1521-34.

Moscarini, G. "Limited Information Capacity as a Source of Inertia." Journal of Economic Dynamics and Control, 28(10), 2004, 2003-35.

Muth, J. F. "Optimal Properties of Exponentially Weighted Forecasts." Journal of the American Statistical Association, 55(290), 1960, 299-306.

Paciello, L., and M. Wiederholt. "Exogenous Information, Endogenous Information, and Optimal Monetary Policy." Review of Economic Studies, 2013. doi: 10.1093/restud/rdt024

Reis, R. "When Should Policymakers Make Announcements?" Unpublished, 2010.

Sargent, T. J. "The Observational Equivalence of Natural and Unnatural Rate Theories of Macroeconomics." Journal of Political Economy, 84(3), 1976, 631-40.

--. "Equilibrium with Signal Extraction from Endogenous Variables." Journal of Economic Dynamics and Control, 15(2), 1991, 245-73.

Simon, D. Optimal State Estimation: Kalman, H-infinity, and Nonlinear Approaches. Hoboken, NJ: John Wiley & Sons, 2006.

Sims, C. A. "Stickiness." Carnegie-Rochester Conference Series on Public Policy, 49, 1998, 317-56.

--. "Implications of Rational Inattention." Journal of Monetary Economics, 50(3), 2003, 665-90.

--. "Rational Inattention: Beyond the Linear-Quadratic Case." American Economic Review, 96(2), 2006, 158-63.

--. "Rational Inattention and Monetary Economics," in Handbook of Monetary Economics, Vol. 3, Chapter 4, edited by B. M. Friedman and M. Woodford. Amsterdam: Elsevier, 2010, 155-81.

Townsend, R. M. "Forecasting the Forecasts of Others." Journal of Political Economy, 91(4), 1983, 546-88.

Van Nieuwerburgh, S., and L. Veldkamp. "Information Acquisition and Under-Diversification." Review of Economic Studies, 77(2), 2010, 779-805.

Veldkamp, L. Information Choice in Macroeconomics and Finance. Princeton, NJ: Princeton University Press, 2011.

Wang, N. "Precautionary Saving and Partially Observed Income." Journal of Monetary Economics, 51(8), 2004, 1645-81.

Woodford, M. "Imperfect Common Knowledge and the Effects of Monetary Policy," in Knowledge, Information, and Expectations in Modern Macroeconomics: In Honor of Edmund S. Phelps, edited by P. Aghion, R. Frydman, J. Stiglitz, and M. Woodford. Princeton, NJ: Princeton University Press, 2001.

(1.) Muth (1960) first applied the SE structure to an economic model (namely the permanent income model of Friedman 1957); later SE problems were extended to models with multiple agents by Townsend (1983) and Sargent (1991). More recent studies include Morris and Shin (2002), Wang (2004), and Angeletos and La'O (2010).

(2.) Luo (2008) was an early contribution to the RI literature; other contributions include Adam (2005), Kasa (2006), Luo (2010), Luo and Young (2010a), Mackowiak and Wiederholt (2009), Melosi (2009), Reis (2010), Van Nieuwerburgh and Veldkamp (2010), Paciello and Wiederholt (2013), and Kim, Ko, and Yun (2012).

(3.) See Melosi (2009) for an empirical investigation of this issue.

(4.) Aguiar and Gopinath (2007) document that the relative volatility of consumption growth to income growth in emerging countries is significantly greater than that in developed countries.

(5.) Our quadratic objective function encompasses the standard tracking objective of minimizing the squared difference of the control from the target.

(6.) Muth (1960) showed that the exponentially weighted average of past observations of a random walk plus a noise process is optimal in the sense that it minimizes the mean squared forecast error.

(7.) For alternative bases for the logarithm, the unit of information differs; with log base 2 the unit of information is the "bit" and with base 10 it is a "dit" or a "hartley."

(8.) By "better" we mean that conditional on draws by nature for the true state, the expected utility of the agent increases if information about that state is improved.

(9.) Note that here we need to impose the restriction exp (2[kappa]) - [A.sup.2] > 0. If this condition fails, the state is not stabilizable and the unconditional variance diverges. Obviously we cannot directly impose conditions on A, as it is chosen by the agents in the model; it is also clear that we can, for a given model, easily find sufficient conditions that guarantee the restriction holds.

(10.) Formally, the assumption is that [kappa] is a choice variable and the utility cost function is [mu][kappa] for some constant [mu].

(11.) We require here that H [not equal to] 0; that is, the state must be detectable. If the state is not detectable there is no point in allocating attention to monitoring it.

(12.) Note that these two different ways to model RI are very similar to the constraint and multiplier preferences adopted by Hansen and Sargent (2007) to model aversion to model misspecification. They also established the observational equivalence between the two preferences within the LQG setting. Luo and Young (2010a) extend this equivalence to RI settings.

(13.) See also Mackowiak and Wiederholt (2009).

(14.) Sims (2010) also applied this principle to solve a tracking problem with information constraints.

(15.) If this constraint does not bind, in principle we can apply the logic of the reverse water-filling problem to solve for [SIGMA].

(16.) Note that the basic idea of solving the multivariate RI problem is the same as that in the univariate model and thus the key difference between SE and RI problems remains unchanged.

(17.) This problem will be particularly difficult for non-LQG problems, as the distribution of the noise shocks will generally be impossible to specify analytically.

(18.) The robustness of the result in Melosi (2009) is unknown, but it clearly depends on model assumptions to some degree.

(19.) That is, all elements in the variance-covariance matrix are scaled up.

(20.) The excess smoothness puzzle states that consumption responds too little to permanent changes in income. The excess sensitivity puzzle states that current consumption responds to changes in income that were anticipated in earlier periods.

(21.) Note that in this case, the flow budget constraint is [w.sub.t+1] = R[w.sub.t] - [c.sub.t] + [y.sub.t].

(22.) Here we assume that disposable income follows a persistent trend-stationary AR(1) process; Luo, Nie, and Young (2012) estimated the process for small-open economies. It is well known that given the length and structure of the data on real income, it is difficult to distinguish persistent trend-stationary AR(1), unit root, and difference-stationary (DS) processes for real income. We focus on the AR(1) case in this paper.

(23.) See Luo (2008) for a discussion on the effects of RI on consumption dynamics.

(24.) A proof is straightforward from Expressions (46) and (47).

(25.) See Table 1 in Luo, Nie, and Young (2012) for the estimated income processes in both emerging and developed countries.

(26.) It is worth noting that this prediction can also be used to distinguish SE and RI by examining the time-series behavior of an economy across "regime changes," such as the observed U.S. Great Moderation in which the volatility of output dropped after 1984. This rationale was also used in Sargent (1976) to discuss the observational equivalence between Classical and Keynesian models.

(27.) Note that we know E [mx] after solving the PIH model given the state variables and can use them to determine the asset price p.

(28.) For the details of the derivation, see Luo and Young (2010b).

(29.) Alternatively, we can also see that the conditional variance of the x component is about three times smaller than its corresponding unconditional variance (0.4523), whereas that of the z component is about five times smaller than its corresponding unconditional variance (2.7692).

(30.) Mackowiak and Wiederholt (2009) show that independent signals are optimal.

(31.) Note that this ratio must be greater than 1, as otherwise the posterior variance would be higher than the prior variance, which means that the firms forget. If this ratio is greater than exp (2[kappa]), it violates the information-processing constraint.

(32.) Menkulasi (2010) presents a model of the Great Moderation under RI and shows that RI reduces the decline in shock volatility required to replicate the observed decline in GDP volatility, but does not provide evidence on the actual volatility declines.

(33.) Moscarini (2004) argues that although the state transition and predictions (actions) happen continuously, observations only occur at discrete intervals.

(34.) See Veldkamp (2011) for a textbook treatment on information choice.

YU LEI LUO and ERIC R. YOUNG *

* We thank James MacGee (the Editor) and two anonymous referees for many constructive comments and suggestions, and Tom Sargent and Chris Sims for helpful discussions. We are also grateful for useful suggestions and comments from Shin-ichi Fukuda, Ken Kasa, Yang Lu, Jun Nie, Yi-Chang Tsai, Gaowang Wang, Jenny Xu, Tao Zhu, and seminar and conference participants at the Hong Kong University of Science and Technology, the University of Tokyo, and Fudan Conference on Economic Dynamics. Luo thanks the General Research Fund (GRF) in Hong Kong (#HKU749711) and the HKU seed funding program for basic research for financial support. Young thanks the Bankard Fund for Political Economy at the University of Virginia for financial support. Part of this work was conducted while Luo was visiting the UC Davis Economics Department and the Kansas City Fed, whose hospitality is greatly appreciated. All errors are the responsibility of the authors.

Luo: School of Economics and Finance, The University of Hong Kong, Hong Kong, Hong Kong. Phone 852 2859 1042, Fax 852 2548 1152. E-mail yluo@econ.hku.hk

Young: Department of Economics, University of Virginia, Charlottesville, VA 22904. Phone 434 924 3811, Fax 434 982 2904. E-mail ey2d@virginia.edu