Article information

Title: Hedonic imputation and the price index problem: an application to housing.
Authors: Hill, Robert J.; Melser, Daniel
Journal: Economic Inquiry
Print ISSN: 0095-2583
Publication year: 2008
Issue: October
Language: English
Publisher: Western Economic Association International
Abstract: Price indexes play a significant role in modern economies. The consumer price index (CPI), for example, is used to index various government payments, as a target for monetary policy and as a benchmark in wage negotiations. Our focus in this paper, however, is on price indexes at a more disaggregated level, in markets where it is hard to match products from one period or region to the next. Computers and housing are notable examples of such markets. As well as being important inputs into the CPI, price indexes for such goods are often useful in their own right. Price indexes for computers play a critical role in productivity measurement across market sectors, while house price indexes provide an important indication of the state of an economy.
Keywords: Consumer price indexes; Hedonic calculus; Monetary policy

I. INTRODUCTION

Price indexes play a significant role in modern economies. The consumer price index (CPI), for example, is used to index various government payments, as a target for monetary policy and as a benchmark in wage negotiations. Our focus in this paper, however, is on price indexes at a more disaggregated level, in markets where it is hard to match products from one period or region to the next. Computers and housing are notable examples of such markets. As well as being important inputs into the CPI, price indexes for such goods are often useful in their own right. Price indexes for computers play a critical role in productivity measurement across market sectors, while house price indexes provide an important indication of the state of an economy.

For the case of computers, the matching problem arises due to technological progress, which leads to the rapid evolution of products in the market, resulting in a short product cycle. For housing, the problem is that every house is different and that they tend to sell relatively infrequently. Hence, there is usually very little overlap in the houses sold from one period to the next and no overlap at all from one region to the next.

The fact that products can often not be matched across periods or regions poses a significant measurement problem in that it is therefore difficult to disentangle price differences from changes in the quality of products. In this paper, we focus primarily on the hedonic regression method for solving this problem. The hedonic method reduces the matching problem to one of comparing products on the basis of their characteristics. The "regression" aspect of hedonic regression refers to how the implicit prices for these characteristics are measured.

In the next section, we explain what is meant by the price index problem. Section III outlines more rigorously the measurement problem created by unmatched products. The hedonic imputation method is introduced in Section IV. Section V shows how the use of the hedonic imputation method complicates the price index problem. In addition to choosing between different formulas such as Fisher and Tornqvist, it is necessary to choose between different varieties of each formula. This is because index compilers have a certain amount of discretion over which prices are imputed. Possible solutions are considered in Section VI. We show that the choice of formula variety can affect the sensitivity of the results to omitted variables bias. The choice of price index formula (as opposed to variety) is also considered in this section. We show that this is intimately connected with the choice of functional form for the hedonic model. Section VII provides an empirical application of the issues raised. The case considered is the construction of house price indexes for three regions in Sydney over a 3-year period. Section VIII concludes the paper.

II. THE PRICE INDEX PROBLEM

Let [P.sub.js,kt] denote a bilateral price index comparison between region j in time period s and region k in time period t. The price and quantity data of commodity heading n for region k in period t are denoted, respectively, by [p.sup.n.sub.kt] and [q.sup.n.sub.kt]. Six important bilateral formulas are Paasche, Laspeyres, Fisher, geometric Paasche, geometric Laspeyres, and Tornqvist. These indexes are defined as follows:

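(1) Paasche: $P^{P}_{js,kt} = \frac{\sum_{n=1}^{N} p^{n}_{kt} q^{n}_{kt}}{\sum_{n=1}^{N} p^{n}_{js} q^{n}_{kt}} = \left[\sum_{n=1}^{N} w^{n}_{kt} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right)^{-1}\right]^{-1}$

(2) Laspeyres: $P^{L}_{js,kt} = \frac{\sum_{n=1}^{N} p^{n}_{kt} q^{n}_{js}}{\sum_{n=1}^{N} p^{n}_{js} q^{n}_{js}} = \sum_{n=1}^{N} w^{n}_{js} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right)$

(3) Fisher: $P^{F}_{js,kt} = \sqrt{P^{P}_{js,kt} \times P^{L}_{js,kt}}$

(4) Geometric Paasche: $P^{GP}_{js,kt} = \prod_{n=1}^{N} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right)^{w^{n}_{kt}}$

(5) Geometric Laspeyres: $P^{GL}_{js,kt} = \prod_{n=1}^{N} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right)^{w^{n}_{js}}$

(6) Tornqvist: $P^{T}_{js,kt} = \prod_{n=1}^{N} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right)^{(w^{n}_{js} + w^{n}_{kt})/2} = \sqrt{P^{GP}_{js,kt} \times P^{GL}_{js,kt}}$.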

Here, [w.sup.n.sub.kt] = [p.sup.n.sub.kt][q.sup.n.sub.kt]/[[summation].sup.N.sub.m=1] [p.sup.m.sub.kt][q.sup.m.sub.kt] denotes the expenditure share of product n in region-period kt.

These price index formulas all give the same answer if the price data satisfy the conditions for Hicks' aggregation theorem (Hicks 1946), that is, [p.sup.n.sub.kt] = [lambda][p.sup.n.sub.js] for all n. Under this scenario, all the price relatives [p.sup.n.sub.kt]/[p.sup.n.sub.js] take the same value [lambda]; hence, there is no substitution effect. In such cases, [P.sub.js,kt] = [lambda], irrespective of the choice of formula. However, when there is some variation in the price relatives across products, the formulas diverge from each other. This is what is meant by the price index problem. It is a problem that has attracted some of the greatest minds in the economic profession over the best part of two centuries, such as Marshall, Edgeworth, Fisher, and Samuelson. Fisher (1922), for example, considers in excess of 100 different formulas.
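The invariance under proportional prices, and the divergence otherwise, can be checked numerically. The following is a minimal sketch using the standard textbook forms of the six formulas; the function names and the three-product data set are hypothetical and chosen only for illustration.

```python
import math

def shares(prices, quantities):
    """Expenditure shares w^n = p^n q^n / sum_m p^m q^m."""
    total = sum(p * q for p, q in zip(prices, quantities))
    return [p * q / total for p, q in zip(prices, quantities)]

def bilateral_indexes(p_js, q_js, p_kt, q_kt):
    """Six bilateral indexes comparing region-period kt with base js."""
    w_js, w_kt = shares(p_js, q_js), shares(p_kt, q_kt)
    rel = [b / a for a, b in zip(p_js, p_kt)]  # price relatives p_kt / p_js
    laspeyres = sum(w * r for w, r in zip(w_js, rel))
    paasche = 1.0 / sum(w / r for w, r in zip(w_kt, rel))
    geo_l = math.exp(sum(w * math.log(r) for w, r in zip(w_js, rel)))
    geo_p = math.exp(sum(w * math.log(r) for w, r in zip(w_kt, rel)))
    return {
        "paasche": paasche,
        "laspeyres": laspeyres,
        "fisher": math.sqrt(paasche * laspeyres),
        "geo_paasche": geo_p,
        "geo_laspeyres": geo_l,
        "tornqvist": math.sqrt(geo_p * geo_l),
    }

# Hypothetical data for three products.
p_js = [2.0, 5.0, 10.0]
q_js = [30.0, 10.0, 4.0]
q_kt = [20.0, 14.0, 5.0]

# Case 1: every price rises by the same factor lambda = 1.2.
proportional = bilateral_indexes(p_js, q_js, [1.2 * p for p in p_js], q_kt)

# Case 2: price relatives differ across products (1.3, 1.1, 1.0).
divergent = bilateral_indexes(p_js, q_js, [2.6, 5.5, 10.0], q_kt)
```

In the first case all six values agree with lambda up to floating-point error; in the second they differ, which is the price index problem in miniature.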

The price index problem has been attacked from two main directions, usually referred to as the economic and the axiomatic approaches. The economic approach views quantities as utility maximizing responses to prices. This approach has culminated in the work of Diewert (1976), who proposed the concept of a superlative price index (a class of indexes that attain a second-order approximation to the underlying cost-of-living index [COLI]). Each index outlined above can be derived from a particular functional form for the cost or the utility function. Diewert's contribution was to show that some of the indexes are based upon more flexible representations of the cost function than others. The Fisher and Tornqvist indexes are superlative indexes as they allow for flexible substitution behavior. An alternative approach to justifying the form of index numbers is the axiomatic approach, which proposes a series of axioms that a price index should satisfy and then discriminates between indexes on the basis of their performance relative to these axioms (Balk 1995; Eichhorn and Voeller 1976). Fortunately, the axiomatic approach also tends to favor the Fisher and Tornqvist indexes as these usually emerge as best.

This literature, however, assumes that there is no matching problem. That is, it is assumed that all region-periods supply price and quantity data on the same list of commodity headings. Once this assumption is relaxed, the price index problem becomes more complex.

III. THEORETICAL FOUNDATIONS OF THE HEDONIC APPROACH

The problem posed by incompletely overlapping sets of products can be seen by outlining the conventional economic measurement framework. In terms of measuring price change between region j in period s and region k in period t, we want to estimate the cost for some representative consumer of obtaining a given level of utility under the two price and choice set regimes. Let the time periods be indexed by t = 1, ..., T; the set of regions by k = 1, ..., K; and the set of commodity headings by n = 1, ..., [N.sub.kt]. The price and quantity data of commodity heading n for region k in period t are denoted, respectively, by [p.sup.n.sub.kt] and [q.sup.n.sub.kt]. The COLI is defined as follows:

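$P^{*}_{js,kt} = \frac{C(p_{kt}, \bar{u})}{C(p_{js}, \bar{u})}$,

where $\bar{u}$ denotes the reference utility level and the cost function is defined below.

$C(p_{kt}, \bar{u}) = \min_{q} \left\{ \sum_{n=1}^{N_{kt}} p^{n}_{kt} q^{n} : U(q) \geq \bar{u} \right\}$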

The problem that arises frequently in practice is that the price vectors [p.sub.kt] and [p.sub.js] may not be comparable. For example, there might be some variety of computer that is available in region-period kt but not in region-period js. This makes the estimation of the COLI more complex.

A number of methods have been developed to tackle this problem. Hausman (1997, 1999) following Hicks (1940) suggested estimating the reservation price of the non-matched items. While this approach is conceptually appealing, it involves the estimation of demand systems and is econometrically and theoretically complex. Detailed data on both the prices paid and the quantities purchased by consumers are also required. An alternative approach suggested by Feenstra (1994) is to assume that the cost function takes the constant elasticity of substitution functional form in which case it is possible to derive the COLI exactly (see also Balk 1999; Nahm 1998). However, perhaps the most promising approach to dealing with hard-to-match products is hedonic regression. The hedonic approach dates back to Waugh (1928) and Court (1939). However, it was only with Griliches (1961) that interest in hedonics really took off (Schultze and Mackie 2002; Triplett 2004).

The conceptual basis of the hedonic approach, dating back to Lancaster (1966) and Rosen (1974), is that consumers' utility is derived from the characteristics of the goods and hence decisions also relate to these characteristics. At its most general, the hedonic approach reorients the measurement problem to one related to characteristics rather than to goods, which are bundles of characteristics.

At a conceptual level, there appear to be two main options for the application of hedonic techniques. First, we could completely reconstruct the consumers' optimization problem in terms of characteristics. That is, we could think of consumers minimizing the cost of obtaining a certain level of characteristics utility, given characteristics prices. Such an approach is at least implicit in the writing of Triplett (2004). For illustrative purposes, let us depict such a cost function, where [z.sup.c] denotes the characteristics c = 1, ..., C and [b.sup.c.sub.kt] the prices of these characteristics in region-period kt.

$C(b_{kt}, \bar{u}) = \min_{z} \left\{ \sum_{c=1}^{C} b^{c}_{kt} z^{c} : U(z) \geq \bar{u} \right\}$.

This may be termed a characteristics approach and gives rise to price indexes defined over the characteristics (rather than goods) prices and quantities.

A second approach is to use the hedonic hypothesis to construct a relationship between prices and characteristics and to apply this relationship in goods space. If the hedonic price relationship is denoted by [p.sup.*.sub.kt], then this can be thought of as enabling the extension of the cost function to those goods for which we do not have comparable prices in region-periods kt and js. We order the goods so that the [N.sub.js] models available in region-period js are as follows: n = 1, ..., [N.sub.js,kt] indexes the models available in both region-periods js and kt, while n = [N.sub.js,kt] + 1, ..., [N.sub.js] indexes the models available in region-period js but not in region-period kt.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This approach to hedonics is called the imputations approach as we are imputing missing prices.

The characteristics and imputations approaches are quite different. The imputations method extends the well-established goods space approach to price measurement. The characteristics approach, however, transforms the whole problem into characteristics space.

A significant drawback of the characteristics approach is that characteristics are not observed directly. The goods that are actually traded are tied bundles of characteristics. This means that, except in very rare circumstances, characteristics prices cannot be recorded directly from market transactions and must instead be estimated. This is in contrast to the imputations index, which requires estimation only where there is incomplete matching of goods. (1) For these reasons, we prefer the imputations approach over the characteristics method.

IV. THE HEDONIC IMPUTATION METHOD

When a product sold in region-period js is not sold in region-period kt (or vice versa), it is no longer possible to compute the bilateral formulas mentioned in Section II. The problem is not the presence of zero quantities per se but rather that when [q.sub.kt] = 0, the corresponding price [p.sub.kt] is not observed. Zero quantities can arise in a temporal context due to technological progress and market turnover leading to the emergence of new goods and the disappearance of existing goods. If a new improved model of computer is simply matched with the previous model, this will create an upward bias in the price index (Boskin et al. 1996). A quality adjustment must therefore be made. The hedonic imputation method is ideal for this task. It can be used to impute what the price of the new model would have been the period before it appeared and the price of the old model the period after it disappeared.

Housing is particularly problematic since every house is different. Hence, there may be very little overlap in the houses sold from one period to the next and zero overlap from one region to the next. In a temporal context, the repeat-sales method of Bailey, Muth, and Nourse (1963) has nevertheless been extensively used. Whether this introduces a bias depends on whether there are any inherent differences (that lead to differing price paths) between houses that sell frequently and those that do not. Even if such an index is free of bias, it will dramatically limit the range of house sales that are included in each bilateral comparison and therefore increase the variance and reduce the reliability of the comparison.

Hedonic methods seem to provide the only satisfactory way of computing price indexes when there is a significant mismatch of products across periods (regions). Three main classes of hedonic methods have been proposed in the literature. They go by various names. Here, we adopt the terminology of Triplett (2004) and refer to them as the hedonic imputation, time-dummy, and characteristics price index methods, respectively. Our focus in this paper is on the hedonic imputation method, which uses hedonic regressions to impute prices for any models that are missing in particular region-periods. Once all the models have been matched, standard price index formulas can then be used. If required, these bilateral price indexes (e.g., Fisher and Tornqvist) can then be multilateralized. The hedonic imputation method is used by the Bureau of Economic Analysis to construct price indexes for computers in the U.S. national accounts (Cartwright 1986; Dulberger 1989; Triplett 2004).

Our reasons for preferring the hedonic imputation method to the characteristics approach are outlined in the previous section. Our concern with the time-dummy method arises from the fact that it computes a single pooled regression equation for all the periods in the sample and derives the price indexes directly from the regression equation. This has the major disadvantage that it does not allow characteristic shadow prices to change over time, a drawback that has led to criticism (Berndt, Griliches, and Rappaport 1995; Pakes 2003; Schultze and Mackie 2002). (2) Also, it is difficult to update the results when new periods are added to the data set, since reestimation of the hedonic equation will change all the results. In other words, the time-dummy method violates temporal fixity (Hill 2004).

The hedonic imputation method runs a separate regression for each region-period in the comparison. The explanatory variables on the right-hand side of the equation are the characteristics of the product. For the case of computers, examples of relevant characteristics include RAM, hard drive capacity, and processor speed. For the case of housing, relevant characteristics include land area, number of bathrooms, and geospatial characteristics such as distance from the city center.

The choice of functional form for the hedonic model is an interesting question in its own right. Here, we focus attention on the linear and semilog models. These models differ only in the dependent variable, which in our case takes the form [p.sup.n.sub.kt] for the linear model and ln [p.sup.n.sub.kt] for the semilog model. The functional form for the semilog hedonic model is as follows:

(7) $\ln p^{n}_{kt} = \sum_{c=1}^{C} \beta_{c,kt} z^{n}_{c,kt} + \varepsilon^{n}_{kt}$.

Consider a product model n sold in region-period js. An imputed price for this same model in region-period kt is obtained as follows:

(8) $\hat{p}^{n}_{kt}(z^{n}_{js}) = \exp\left(\sum_{c=1}^{C} \hat{\beta}_{c,kt} z^{n}_{c,js}\right)$,

where $\hat{\beta}_{c,kt}$ denotes the estimator of $\beta_{c,kt}$. (3) We abstract here from issues of exactly how $\hat{\beta}_{c,kt}$ is computed but return to this in the empirical section. The important point to note at this stage is that when product n is unavailable in region-period kt, its price can be imputed and standard price index formulas can then be used.
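The two steps just described (one semilog regression per region-period, then prediction at another region-period's characteristics) can be sketched as follows. This is an illustration only, not the estimation procedure used in the paper: it assumes a single characteristic plus an intercept, ordinary least squares, and no retransformation adjustment when exponentiating the fitted log price. All names and data are hypothetical.

```python
import math

def fit_semilog(prices, z):
    """OLS of ln(price) on an intercept and one characteristic z.

    Returns (intercept, slope). Closed-form simple regression.
    """
    y = [math.log(p) for p in prices]
    n = len(y)
    z_bar, y_bar = sum(z) / n, sum(y) / n
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    slope = s_zy / s_zz
    return y_bar - slope * z_bar, slope

def impute_price(beta, z_value):
    """Imputed price exp(intercept + slope * z) for a model with characteristic z."""
    intercept, slope = beta
    return math.exp(intercept + slope * z_value)

# Hypothetical house sales: land area (hundreds of square metres), price ($1000).
area_js = [4.0, 5.5, 7.0, 8.0]
price_js = [420.0, 510.0, 640.0, 700.0]
area_kt = [3.5, 6.0, 6.5, 9.0]
price_kt = [450.0, 620.0, 660.0, 880.0]

# One separate regression per region-period.
beta_js = fit_semilog(price_js, area_js)
beta_kt = fit_semilog(price_kt, area_kt)

# The first js house did not sell in kt: impute its kt price from the kt
# regression evaluated at its js characteristics.
imputed_kt = impute_price(beta_kt, area_js[0])
# Its own-period imputation, needed later for double imputation.
imputed_js = impute_price(beta_js, area_js[0])
```

With more characteristics the same logic applies, but the regression would need a multivariate least-squares routine.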

V. ALTERNATIVE VARIETIES OF HEDONIC PRICE INDEXES

The use of the hedonic imputation method adds a new dimension to the price index problem. This is because we have some discretion as to which prices are imputed. If a product is unavailable in a particular region-period, we have no choice but to impute it. Even if the product is available, we may nevertheless still prefer to use an imputed price over the actual price. This might seem counterintuitive. However, it turns out that replacing real prices with imputations can sometimes reduce the omitted variables bias and help ensure that like is compared with like. These issues are explored further in the next section.

To illustrate how the hedonic imputation method complicates the price index problem, we focus first on the case of the Laspeyres price index. Four different varieties of the Laspeyres price index are obtained depending on how exactly the hedonic imputation method is implemented. For the case of L2 and L4, we order the [N.sub.js] models available in region-period js as follows: n = 1, ..., [N.sub.js,kt] indexes the models sold in both region-periods js and kt, while n = [N.sub.js,kt] + 1, ..., [N.sub.js] indexes the models sold in region-period js but not in region-period kt. (4)

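Laspeyres 1: $P^{L1}_{js,kt} = \sum_{n=1}^{N_{js}} w^{n}_{js} \left[\frac{\hat{p}^{n}_{kt}(z^{n}_{js})}{p^{n}_{js}}\right]$

Laspeyres 2: $P^{L2}_{js,kt} = \sum_{n=1}^{N_{js,kt}} w^{n}_{js} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right) + \sum_{n=N_{js,kt}+1}^{N_{js}} w^{n}_{js} \left[\frac{\hat{p}^{n}_{kt}(z^{n}_{js})}{p^{n}_{js}}\right]$

Laspeyres 3: $P^{L3}_{js,kt} = \sum_{n=1}^{N_{js}} w^{n}_{js} \left[\frac{\hat{p}^{n}_{kt}(z^{n}_{js})}{\hat{p}^{n}_{js}(z^{n}_{js})}\right]$

Laspeyres 4: $P^{L4}_{js,kt} = \sum_{n=1}^{N_{js,kt}} w^{n}_{js} \left(\frac{p^{n}_{kt}}{p^{n}_{js}}\right) + \sum_{n=N_{js,kt}+1}^{N_{js}} w^{n}_{js} \left[\frac{\hat{p}^{n}_{kt}(z^{n}_{js})}{\hat{p}^{n}_{js}(z^{n}_{js})}\right]$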

The expenditure weights are defined as follows:

$w^{n}_{js} = \frac{p^{n}_{js} q^{n}_{js}}{\sum_{m=1}^{N_{js}} p^{m}_{js} q^{m}_{js}}$.

Variety 2 uses the minimum number of imputations. The other three varieties (to differing extents) sometimes throw away real price observations and replace them with imputed prices.
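The four varieties can be compared side by side in a short sketch. The definitions coded here are read off the verbal description in the text and should be treated as an interpretation: L1 uses single imputation for every model, L2 imputes only the missing kt prices, L3 uses double imputation for every model, and L4 uses double imputation only for unmatched models. Function names and data are hypothetical.

```python
def laspeyres_varieties(p_js, q_js, p_kt, phat_js, phat_kt):
    """Return (L1, L2, L3, L4) over the models sold in region-period js.

    p_kt[n] is None when model n was not sold in region-period kt.
    phat_js[n] and phat_kt[n] are hedonic imputations for model n in js and
    kt, both evaluated at the model's js characteristics.
    """
    total = sum(p * q for p, q in zip(p_js, q_js))
    weights = [p * q / total for p, q in zip(p_js, q_js)]  # actual shares
    l1 = l2 = l3 = l4 = 0.0
    for w, p0, p1, h0, h1 in zip(weights, p_js, p_kt, phat_js, phat_kt):
        matched = p1 is not None
        l1 += w * h1 / p0                              # always single imputation
        l2 += w * (p1 if matched else h1) / p0         # impute only if missing
        l3 += w * h1 / h0                              # always double imputation
        l4 += w * (p1 / p0 if matched else h1 / h0)    # double only if missing
    return l1, l2, l3, l4

# Hypothetical data: three models sold in js, the third not sold in kt.
p_js = [100.0, 200.0, 300.0]
q_js = [1.0, 1.0, 1.0]
p_kt = [110.0, 210.0, None]
phat_js = [105.0, 190.0, 310.0]
phat_kt = [115.0, 220.0, 330.0]

l1, l2, l3, l4 = laspeyres_varieties(p_js, q_js, p_kt, phat_js, phat_kt)
```

On these numbers the four varieties give four different index values, even though they share the same formula and the same weights.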

Four more varieties of Laspeyres indexes are obtained if we allow expenditure weights to be imputed as well. That is, in each of the four equations mentioned above, the expenditure weight $w^{n}_{js}$ could be replaced by an imputed expenditure weight $\hat{w}^{n}_{js}$ that is calculated as follows:

$\hat{w}^{n}_{js} = \frac{\hat{p}^{n}_{js}(z^{n}_{js}) q^{n}_{js}}{\sum_{m=1}^{N_{js}} \hat{p}^{m}_{js}(z^{m}_{js}) q^{m}_{js}}$.

We refer to these variants on the four methods mentioned above as L1', L2', L3', and L4'. We argue in the next section that there is no clear benefit to replacing actual expenditure weights with imputations. Hence, L1', L2', L3', and L4' do not warrant serious consideration.

Different varieties of Paasche, Fisher, geometric Paasche, geometric Laspeyres, and Tornqvist can be derived in an analogous manner. Here, we illustrate this point for the first variety only:

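Paasche 1: $P^{P1}_{js,kt} = \left[\sum_{n=1}^{N_{kt}} w^{n}_{kt} \left(\frac{p^{n}_{kt}}{\hat{p}^{n}_{js}(z^{n}_{kt})}\right)^{-1}\right]^{-1}$

Fisher 1: $P^{F1}_{js,kt} = \sqrt{P^{P1}_{js,kt} \times P^{L1}_{js,kt}}$

Geometric Paasche 1: $P^{GP1}_{js,kt} = \prod_{n=1}^{N_{kt}} \left[\frac{p^{n}_{kt}}{\hat{p}^{n}_{js}(z^{n}_{kt})}\right]^{w^{n}_{kt}}$

Geometric Laspeyres 1: $P^{GL1}_{js,kt} = \prod_{n=1}^{N_{js}} \left[\frac{\hat{p}^{n}_{kt}(z^{n}_{js})}{p^{n}_{js}}\right]^{w^{n}_{js}}$

Tornqvist 1: $P^{T1}_{js,kt} = \sqrt{P^{GP1}_{js,kt} \times P^{GL1}_{js,kt}}$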

It should be noted that if we allow different varieties of Laspeyres and Paasche indexes to be combined, we obtain 64 rather than 8 different varieties of Fisher. By applying an analogous argument to geometric Laspeyres and geometric Paasche indexes, we also obtain 64 varieties of Tornqvist. However, it would be hard to justify on economic grounds combining unmatched pairs of Laspeyres and Paasche or geometric Laspeyres and geometric Paasche indexes. Hence, in practice, we can limit the range of varieties of Fisher and Tornqvist to eight (or four once we rule out imputing expenditure weights).

VI. CHOOSING A PRICE INDEX VARIETY AND FORMULA

A. Single versus Double Imputation

Which variety is best? To simplify matters, this question is addressed for the case of the Laspeyres index. The arguments carry forward equally well to other price index formulas such as Fisher and Tornqvist. Our focus is on minimizing omitted variables bias and generating results that are easy to interpret. Other criteria such as variance and sample selection bias minimization are not considered here (see Pakes 2003, for a discussion of these criteria).

The eight varieties of Laspeyres indexes differ in their treatment of the price relatives and expenditure shares. We consider the former first. A Laspeyres index only considers models sold in the base region-period. A distinction can be drawn between price relatives where the model is sold in only region-period js and price relatives where it is sold in both region-periods. Again, we consider the former first. Single imputation uses the price relatives $\hat{p}^{n}_{kt}(z^{n}_{js})/p^{n}_{js}$, while double imputation uses $\hat{p}^{n}_{kt}(z^{n}_{js})/\hat{p}^{n}_{js}(z^{n}_{js})$. There has been some debate in the literature on which approach is better. The discussion focuses primarily on the case of computers. Silver and Heravi (2001), Pakes (2003), and de Haan (2004a) all argue that double imputation may be preferable because price-determining variables may be omitted from the hedonic equation, which can bias the estimated price relative in the single imputation case.

To see how omitted variables affect the estimated price relatives, it is useful to work through the problem algebraically. Here, we focus on any potential bias introduced into the index by single and double imputation methods. While the variability of the index is also of concern, the problem of bias is often given higher priority in the production of official statistics. (5) Let us suppose that the hedonic researcher estimates a model with a set of characteristics, c = 1, ..., C. The model with estimated parameters [[??].sub.js,c] is shown below for the semilog specification:

(9) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

In fact, the true model reflects a set of additional characteristics, d = 1, ..., D, which are not available to the researcher. The true data-generating model of prices, with estimated parameters, is shown below:

(10) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Or alternatively the estimate of the log price from the complete model can be written as the sum of the actual value and the regression residual. That is,

(11) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Consider first the case where we use the actual prices in region-period js and the estimated prices in region-period kt (i.e., single imputation). This gives the following estimate of the log price relative: (6)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (12)

The alternative approach is to use predictions from the model in both periods (i.e., double imputation).

(13) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Our best estimate of the log price relative for a product n that sells in region-period js but not in kt is given by [ln [p.sup.n.sub.kt]([z.sup.n.sub.js]) - ln [p.sup.n.sub.js]]. Comparing the single and double imputation log price relatives from the misspecified model with this best estimate, we obtain the following errors.

Single imputation error:

(14) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

Double imputation error:

(15) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

These two equations show the errors from using single and double imputation methods. The size and sign of the error depend upon a number of factors, and one method does not always dominate the other. However, there are good reasons, based upon Equations (14) and (15), to prefer double imputation. When the omitted variables are reasonably stable across region-periods (which is generally the case at least in temporal comparisons), the terms [[summation].sup.C.sub.c = 1] ([[??].sub.js,c] - [[??].sub.js,c])[z.sup.n.sub.js,c] and [[summation].sup.C.sub.c = 1] ([[??].sub.kt,c] - [[??].sub.kt,c])[z.sup.n.sub.js,c] in Equation (15), representing the omitted variable bias in the included variables, will tend to partially offset each other. Similarly, the terms [[summation].sup.D.sub.d = 1] [[??].sub.kt,d][z.sup.n.sub.js,d] and [[summation].sup.D.sub.d = 1] [[??].sub.js,d][z.sup.n.sub.js,d] in Equation (15), reflecting the effects of the omitted variables on price, are likely to take similar values over region-periods and also partially offset each other. By contrast, in Equation (14), there are no pairs of partially offsetting errors in levels. Instead, we are left with the full level of each of the error components.

For example, consider the case of a house for which $\hat{p}^{n}_{js}(z_{js}) > p^{n}_{js}$. This means that either the buyer got a bargain or the house performs poorly on its omitted variables. Assuming that the latter is correct, it follows that $\hat{p}^{n}_{kt}(z_{js})$ will overestimate the true price of a house with characteristics vector $z_{js}$ in region-period kt. It follows that the price relative $\hat{p}^{n}_{kt}(z_{js})/p^{n}_{js}$ will have an upward bias. In contrast, the biases in $\hat{p}^{n}_{js}(z_{js})$ and $\hat{p}^{n}_{kt}(z_{js})$ will partially offset each other in the price relative $\hat{p}^{n}_{kt}(z_{js})/\hat{p}^{n}_{js}(z_{js})$, thus tending to generate a more accurate overall estimate.
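The house example can be made concrete with a stylized calculation. The sketch below is not the paper's derivation: it assumes a semilog price model with one observed characteristic z and one omitted characteristic u, assumes the effect of u on log price is the same in both region-periods, and assumes the researcher's coefficients on the included variables are estimated without error. All parameter values are hypothetical.

```python
import math

# Hypothetical "true" model: ln p = a_t + B * z + G * u, where u is an
# omitted quality variable that the researcher does not observe.
A_JS, A_KT = 6.00, 6.10   # period-specific intercepts
B, G = 0.10, 0.25         # effects of the included and omitted characteristic

def true_price(a, z, u):
    """Price generated by the full model, including the omitted variable."""
    return math.exp(a + B * z + G * u)

def imputed_price(a, z):
    """Researcher's imputation: the same model with u left out."""
    return math.exp(a + B * z)

# A house that performs poorly on the omitted variable (u < 0), so that its
# imputed js price exceeds its actual js price.
z, u = 5.0, -0.4
p_js = true_price(A_JS, z, u)

true_relative = true_price(A_KT, z, u) / p_js
single = imputed_price(A_KT, z) / p_js                    # single imputation
double = imputed_price(A_KT, z) / imputed_price(A_JS, z)  # double imputation
```

Under these assumptions the omitted-variable term cancels exactly in the double-imputed relative, while the single-imputed relative is too high by the factor exp(-G * u). With less stable omitted variables the cancellation would be only partial.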

The use of double imputation is particularly beneficial in cases such as housing where there is likely to be a serious omitted variables problem. This leads us to prefer Varieties 3 and 4 over Varieties 1 and 2. Some of the omitted variables in the housing context may be location specific, thus causing spatial correlation in house prices. If the hedonic model takes explicit account of spatial correlation, this should reduce the extent of the omitted variables problem (see, e.g., Anselin 1988; Basu and Thibodeau 1998; Dubin 1988), thus weakening slightly the case for using double imputation.

B. Zero versus Double Imputation for Repeat Observations

The difference between Varieties 3 and 4 is that Variety 4 only uses double imputation when a model is not available in both periods. Variety 3 by contrast always uses double imputation, even when there are no missing prices. A similar argument applies for Varieties 3' and 4'. As far as we are aware, the possibility of always imputing for repeat observations (i.e., Varieties 3 and 3') has not previously been considered in the literature. For the case of computers, this would be hard to justify since a particular model is the same irrespective of when it is sold. Housing, however, is another matter. There is no guarantee even for a repeat sale that we are comparing like with like. This is because the characteristics of a house may change over time due to renovations or the building of a new shopping center nearby, etc. The only way to be sure that like is compared with like is to double impute all houses (even repeat sales). Hence, for the case of housing, we are led to prefer Variety 3.

C. Imputation of Expenditure Weights

Varieties 1'-4' impute expenditure shares, while Varieties 1-4 do not. Given that expenditure shares are effectively used as measures of the importance of a product, it seems natural to weight products according to their actual rather than imputed importance. The difference between this case and the "prices" case is that for the latter, we are interested in a price relative, while for expenditure shares what we require is an estimate of the level. We have noted that levels can be misestimated when variables are omitted from the hedonic equation, which indicates that it is best to use actual rather than imputed expenditure shares. Hence, we prefer Variety 3 over Variety 3'.

Drawing all these results together, for the case of housing, we are led to favor Variety 3. This means that double imputation of price relatives is used even when both prices are available. Expenditure shares, however, are not imputed.

D. Choosing between Fisher and Tornqvist

The choice of formula should be between Fisher and Tornqvist since these formulas treat both region-periods symmetrically, are superlative, and have desirable economic and axiomatic properties (Balk 1995; Diewert 1976). From an economic and axiomatic perspective, neither Fisher nor Tornqvist clearly dominates the other. When used in conjunction with the hedonic imputation method, we must also consider the functional form of the hedonic regression.

We focus here on the case of housing. Our choice is between F3 and T3. For the semilog case, GL3 can be reexpressed as follows:

$P^{GL3}_{js,kt} = \prod_{n=1}^{N_{js}} \left[\frac{\hat{p}^{n}_{kt}(z^{n}_{js})}{\hat{p}^{n}_{js}(z^{n}_{js})}\right]^{w^{n}_{js}} = \exp\left[\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) \bar{z}_{c,js}\right]$,

where we have defined

$\bar{z}_{c,js} = \sum_{n=1}^{N_{js}} w^{n}_{js} z^{n}_{c,js}$.

Rearranging GP3 in a similar manner, we obtain that

$P^{GP3}_{js,kt} = \exp\left[\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) \bar{z}_{c,kt}\right]$.

Taking the geometric mean of GP3 and GL3, we obtain the following expression for T3:

$P^{T3}_{js,kt} = \exp\left[\frac{1}{2} \sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js})(\bar{z}_{c,js} + \bar{z}_{c,kt})\right]$.

The fact that T3 can be decomposed multiplicatively by its characteristics has advantages when interpreting the results. It means that the contribution of each characteristic to the overall price index can be easily discerned. That is, we can decompose the price index as follows:

[P.sup.T3.sub.js,kt] = [P.sup.1.sub.js,kt] x [P.sup.2.sub.js,kt] x ... x [P.sup.C.sub.js,kt],

where [P.sup.c.sub.js,kt] measures the multiplicative contribution of characteristic c to the differences in house prices between region-periods js and kt. Of particular interest is the ratio [P.sup.c.sub.js,kt] : [P.sup.T3.sub.js,kt]. If this ratio exceeds 1, it implies characteristic c is exerting upward pressure on the overall price index, while when less than 1, c is exerting downward pressure on the index. F3, by contrast, does not simplify in such an intuitively appealing way.
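A short sketch of the decomposition follows. It assumes that each factor is the exponential of the corresponding term in the T3 sum, that is, exp of one half times the change in the coefficient times the sum of the two share-weighted mean characteristics; the text implies this but does not write it out. The coefficient values and mean characteristics are hypothetical, with the intercept treated as a characteristic whose value is always 1.

```python
import math

def tornqvist_contributions(beta_js, beta_kt, zbar_js, zbar_kt):
    """Multiplicative contribution of each characteristic to T3."""
    return [
        math.exp(0.5 * (bk - bj) * (zj + zk))
        for bj, bk, zj, zk in zip(beta_js, beta_kt, zbar_js, zbar_kt)
    ]

# Hypothetical semilog coefficients: intercept, land area, bathrooms.
beta_js = [12.00, 0.08, 0.15]
beta_kt = [12.05, 0.09, 0.14]
# Hypothetical share-weighted mean characteristics in each region-period.
zbar_js = [1.0, 6.0, 1.8]
zbar_kt = [1.0, 6.4, 2.0]

contributions = tornqvist_contributions(beta_js, beta_kt, zbar_js, zbar_kt)
t3 = math.prod(contributions)  # the overall index is the product of the factors
```

In this example the bathroom coefficient falls between the two region-periods, so its factor is below 1, while the intercept and land-area factors are above 1.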

As has been noted earlier, it is also possible to start by defining price indexes in characteristics space. For example, Laspeyres and Paasche characteristics price indexes can be defined as follows (Diewert 2001; Dulberger 1989):

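Laspeyres 5: $P^{L5}_{js,kt} = \frac{\hat{p}_{kt}(\bar{z}_{js})}{\hat{p}_{js}(\bar{z}_{js})} = \exp\left[\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) \bar{z}_{c,js}\right]$.

Paasche 5: $P^{P5}_{js,kt} = \frac{\hat{p}_{kt}(\bar{z}_{kt})}{\hat{p}_{js}(\bar{z}_{kt})} = \exp\left[\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) \bar{z}_{c,kt}\right]$.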

L5 computes a hypothetical average model for region-period js and then compares the imputed price of this product in region-periods js and kt. P5 by contrast uses a hypothetical average model for region-period kt as its point of reference. Taking the geometric mean of L5 and P5, we obtain a Fisher characteristics price index, F5.

$$P^{F5}_{js,kt} = \exp\left[\frac{1}{2}\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js})(\bar{z}_{c,js} + \bar{z}_{c,kt})\right].$$

What is particularly interesting is that F5 on closer inspection can be seen to be the same as T3!
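The equivalence can be checked in a few lines. The following is a sketch under two assumptions that are not spelled out at this point in the text: that the semilog imputed price is $\hat{p}_{js}(z) = \exp(\sum_{c} \hat{\beta}_{c,js} z_{c})$, setting aside the retransformation issue noted in Footnote 3, and that the hypothetical average model of each region-period is the share-weighted mean characteristics vector $\bar{z}$ used for GL3 and GP3.

```latex
\begin{align*}
P^{L5}_{js,kt}
  &= \frac{\hat{p}_{kt}(\bar{z}_{js})}{\hat{p}_{js}(\bar{z}_{js})}
   = \exp\Big[\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js})\,\bar{z}_{c,js}\Big]
   = P^{GL3}_{js,kt}, \\
P^{P5}_{js,kt}
  &= \frac{\hat{p}_{kt}(\bar{z}_{kt})}{\hat{p}_{js}(\bar{z}_{kt})}
   = \exp\Big[\sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js})\,\bar{z}_{c,kt}\Big]
   = P^{GP3}_{js,kt}, \\
P^{F5}_{js,kt}
  &= \sqrt{P^{L5}_{js,kt}\, P^{P5}_{js,kt}}
   = \sqrt{P^{GL3}_{js,kt}\, P^{GP3}_{js,kt}}
   = P^{T3}_{js,kt}.
\end{align*}
```

Under these assumptions, the result follows because the semilog model is linear in characteristics after taking logs, so that a share-weighted geometric mean of imputed price relatives equals the imputed price relative of the share-weighted average house.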

T3, therefore, has attractive properties when the hedonic regression model takes the semilog form. The fact that it can be defined in either goods or characteristics space adds flexibility to the way the results can be interpreted. For example, T3 can be interpreted either as measuring the average of the ratios over the two region-periods of the imputed prices of each house or as the ratio of the imputed price of the average house. Which perspective is most useful may depend on the context. In addition, the representation in characteristics space implies that the price index is multiplicatively decomposable by characteristic, which allows the role played by each characteristic to be much more easily discerned. Hence, we favor T3 as our preferred hedonic imputation price index, when used in conjunction with the semilog model.

For the linear model, L3 can be reexpressed as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where we have defined

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The term $x^{n}_{c,js}$ represents the contribution of characteristic c to the total price of model n. The term $\bar{x}_{c,js}$ denotes the average contribution to total price, across all models, of characteristic c. In other words, L3 in this case reduces to another variety of Laspeyres characteristics price index, which we refer to as L6. (7)

$$P^{L6}_{js,kt} = \sum_{c=1}^{C} \bar{x}_{c,js}\left(\frac{\hat{\beta}_{c,kt}}{\hat{\beta}_{c,js}}\right).$$

The corresponding Paasche price index is

$$P^{P6}_{js,kt} = \left[\sum_{c=1}^{C} \bar{x}_{c,kt}\left(\frac{\hat{\beta}_{c,kt}}{\hat{\beta}_{c,js}}\right)^{-1}\right]^{-1}.$$

For the linear model, it can similarly be shown that P3 is equivalent to P6:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Taking the geometric mean of P6 and L6, we obtain a Fisher characteristics price index, F6, which by construction is equal to F3 for the linear model.

$$P^{F6}_{js,kt} = \sqrt{P^{L6}_{js,kt} \times P^{P6}_{js,kt}}.$$

Although F3 does not decompose multiplicatively by characteristic, it is still our method of choice for the linear model due to its dual representation in goods and characteristics space.
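The dual representation for the linear model can be checked numerically. The sketch below rests on assumptions, since the underlying expressions are not reproduced here: L3 is taken to be the share-weighted arithmetic mean of imputed price relatives, P3 the corresponding share-weighted harmonic mean, and the contribution of characteristic c to the imputed price of a house is taken to be its coefficient times its characteristic value divided by the imputed price. All function names and data values are hypothetical.

```python
import math


def imputed_price(beta, z):
    """Linear hedonic imputation: price = sum over c of beta_c * z_c."""
    return sum(b * zc for b, zc in zip(beta, z))


def goods_space_index(beta_ref, beta_other, houses, weights):
    """Share-weighted mean of imputed price relatives p_other(z) / p_ref(z)."""
    return sum(
        w * imputed_price(beta_other, z) / imputed_price(beta_ref, z)
        for w, z in zip(weights, houses)
    )


def characteristics_space_index(beta_ref, beta_other, houses, weights):
    """Sum over c of xbar_c * (beta_other_c / beta_ref_c)."""
    n_chars = len(beta_ref)
    xbar = [0.0] * n_chars
    for w, z in zip(weights, houses):
        total = imputed_price(beta_ref, z)
        for c in range(n_chars):
            # Share of characteristic c in the imputed price of this house.
            xbar[c] += w * beta_ref[c] * z[c] / total
    return sum(xbar[c] * beta_other[c] / beta_ref[c] for c in range(n_chars))


# Hypothetical data: intercept, an extra-bedroom dummy, sqrt of a distance.
beta_js = [400000.0, 80000.0, -30000.0]
beta_kt = [450000.0, 85000.0, -28000.0]
houses_js = [[1.0, 0.0, 1.5], [1.0, 1.0, 2.0], [1.0, 1.0, 0.5]]
weights_js = [0.28, 0.34, 0.38]   # expenditure shares, sum to 1
houses_kt = [[1.0, 0.0, 1.0], [1.0, 1.0, 3.0]]
weights_kt = [0.45, 0.55]         # expenditure shares, sum to 1

l3 = goods_space_index(beta_js, beta_kt, houses_js, weights_js)
l6 = characteristics_space_index(beta_js, beta_kt, houses_js, weights_js)
p3 = 1.0 / goods_space_index(beta_kt, beta_js, houses_kt, weights_kt)
p6 = 1.0 / characteristics_space_index(beta_kt, beta_js, houses_kt, weights_kt)
f3, f6 = math.sqrt(l3 * p3), math.sqrt(l6 * p6)

print(f"L3 = {l3:.6f}, L6 = {l6:.6f}")
print(f"P3 = {p3:.6f}, P6 = {p6:.6f}")
print(f"F3 = {f3:.6f}, F6 = {f6:.6f}")
```

The identity requires every base-period coefficient to be nonzero, since the characteristics-space form divides by it.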

VII. AN EMPIRICAL APPLICATION: THE CASE OF HOUSE PRICES IN SYDNEY

In this section, house price indexes are computed for three regions in Sydney over a period of 3 years. This application should be viewed as merely illustrative. It provides an opportunity to implement many of the methods discussed in the paper and assess their empirical significance. We are interested primarily in the spreads between different formulas and varieties rather than in the price levels themselves.

The data set was obtained from Australian Property Monitors. It covers about 20% of house sales in 128 postcodes for the years 2001, 2002, and 2003; each postcode contains at least 4,000 private residences. (8) We aggregate the postcodes into three broad regions, which are referred to as the central, northern, and western regions. The central region includes inner Sydney, eastern suburbs, inner west, and Sydney metro. The northern region includes the lower north shore, upper north shore, Mosman/Cremorne, Manly/Warringah, and northwest. The western region includes the western suburbs, Fairfield/Liverpool, Canterbury/Bankstown, St. George, Cronulla/Sutherland, and Campbelltown. The combination of three time periods and three regions yields a total of nine region-periods. Data on each of the characteristics are available in every region-period. Hence, we are not required to confront the problem of unmatched characteristics.

A total of 70 characteristics were used to estimate the hedonic equations of each region-period. These characteristics are listed in Table 1. The normalization adopted is for a three-bedroom house with two bathrooms in the first quarter of the year. Most of the characteristics are dummy variables. The exceptions are the geospatial characteristics, which are measured in kilometers. We consider four transformations of geospatial distances: linear, square, log, and square root. We prefer the square-root transformation since it generated higher $R^{2}$ coefficients and accords best with our intuition regarding the rate of diminishing returns with respect to distance. With regard to the dependent variable, we consider the linear and log transformations. The large sample sizes generate relatively tight confidence intervals for the Box-Cox test, which rejects both the log and the linear specifications, though the logarithmic specification lies closest to the estimated value. We prefer the logarithmic specification for this reason and also because it is likely to reduce the amount of heteroscedasticity in the data (Diewert 2003). The overall functional form, therefore, is a variant on semilog, with square roots of geospatial distances replacing actual distances. It follows that our preferred formula variety is T3.

The estimated parameter coefficients are shown in Table 2. Here, we use ordinary rather than weighted least squares. As each observation represents a single house, the use of weighting is unlikely to generate significantly different results. Note also that under the imputations approach, our primary aim is to obtain an estimate of each price, and in this case, weighting is not necessary. This is in contrast to the time-dummy method, where the average price change between periods is also simultaneously being estimated, so weighting becomes more important (see also Footnote 9). Most of the coefficients seem intuitively plausible, although for some characteristics (e.g., distance to nearest school), it is not clear whether the coefficient should be positive or negative. Most coefficients (those in bold) are significant at the 95% level. The t statistics were calculated using the method of White (1980) and hence are asymptotically robust to heteroscedasticity. The coefficients on the physical characteristics are generally of similar size (and of the same sign) across region-periods. For the geospatial characteristics, however, the signs sometimes vary across regions, although hardly ever across periods for a particular region. The greater variability in the geospatial coefficients across region-periods is probably attributable to multicollinearity.
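The estimation step for a single region-period can be sketched as follows. This is an illustration on simulated data, not the authors' code: it fits a semilog hedonic equation by ordinary least squares and computes heteroscedasticity-robust t statistics using the basic (HC0) form of the White (1980) covariance estimator. The function name `ols_white`, the three regressors, and the coefficient values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def ols_white(X, y):
    """OLS coefficients with White (1980) HC0 robust standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: X' diag(e^2) X.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Simulated region-period: intercept, a dwelling-type dummy, sqrt distance (km).
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    np.ones(n),
    rng.integers(0, 2, n),
    np.sqrt(rng.uniform(0.1, 25.0, n)),
])
true_beta = np.array([12.7, -0.45, -0.10])            # hypothetical values
noise = rng.normal(0.0, 0.1 + 0.05 * X[:, 2], n)      # heteroscedastic errors
log_price = X @ true_beta + noise

beta, se, t_stats = ols_white(X, log_price)
for name, b, s, t in zip(["INTERCEPT", "UNIT", "SQRT_DIST"], beta, se, t_stats):
    print(f"{name:10s} coef = {b:8.4f}  robust se = {s:.4f}  t = {t:8.2f}")
```

The point estimates are identical to those from ordinary least squares; only the standard errors, and hence the t statistics, change.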

The $R^{2}$ coefficients and information on the number of parameters and observations are shown in Table 3. The $R^{2}$ coefficients range between .69 and .77.

The resulting price indexes are shown in Table 4 (Panel a). To save space, we only consider bilateral comparisons, with the central region in 2001 as the base. Given that there are a total of nine region-periods, we thus obtain nine bilateral comparisons for each variety of each formula. We compare the variety and formula spread for Varieties 1, 1', 3, and 3' of Fisher and Tornqvist. (9) We do not consider Varieties 2, 2', 4, and 4' since in spatial comparisons, there is no difference between Varieties 1 and 2 or between Varieties 3 and 4. Excluding the base region-period, the average spread between the four varieties of Fisher indexes (i.e., the variety spread) is 0.8%, while for Tornqvist, it is 1.2%. (10) The average spread between the same varieties of Fisher and Tornqvist (i.e., the formula spread) is 1.1%. (11) In this sense, the choice of variety seems to matter as much as the choice between Fisher and Tornqvist. The same will not necessarily be true for other formulas. For example, in our data set, in a comparison of Laspeyres with Paasche, the results are much more sensitive to the choice of formula than to the choice of variety.
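The spreads quoted above can be recomputed from Panel a of Table 4 using the definitions in Footnotes 10 and 11. The following is a minimal sketch; the values are transcribed from Table 4 with the base column omitted, and the function names `variety_spread` and `formula_spread` are illustrative rather than the authors' own.

```python
from statistics import mean

# Panel (a) of Table 4 (semilog model), base column P_{c01,c01} omitted.
# Columns: c02, c03, n01, n02, n03, w01, w02, w03.
PANEL_A = {
    "F1":  [1.1843, 1.3181, 0.9938, 1.2671, 1.3524, 0.7846, 0.9623, 1.0955],
    "F3":  [1.1856, 1.3204, 0.9954, 1.2719, 1.3552, 0.7872, 0.9701, 1.0989],
    "F1'": [1.1928, 1.3291, 0.9906, 1.2773, 1.3584, 0.7926, 0.9673, 1.1097],
    "F3'": [1.1856, 1.3216, 0.9950, 1.2727, 1.3554, 0.7902, 0.9708, 1.1037],
    "T1":  [1.1862, 1.3143, 0.9983, 1.2509, 1.3480, 0.7967, 0.9760, 1.1041],
    "T3":  [1.1869, 1.3210, 1.0014, 1.2633, 1.3543, 0.8072, 0.9928, 1.1203],
    "T1'": [1.1866, 1.3210, 1.0033, 1.2686, 1.3582, 0.8110, 0.9978, 1.1278],
    "T3'": [1.1870, 1.3218, 1.0028, 1.2676, 1.3572, 0.8119, 0.9991, 1.1285],
}
VARIETIES = ["1", "3", "1'", "3'"]


def variety_spread(table, formula):
    """Average over region-periods of max/min across the four varieties, minus 1."""
    rows = [table[formula + v] for v in VARIETIES]
    return mean(max(col) / min(col) - 1.0 for col in zip(*rows))


def formula_spread(table):
    """Average over varieties and region-periods of max(F, T)/min(F, T), minus 1."""
    ratios = [
        max(f, t) / min(f, t) - 1.0
        for v in VARIETIES
        for f, t in zip(table["F" + v], table["T" + v])
    ]
    return mean(ratios)


fisher = variety_spread(PANEL_A, "F")
tornqvist = variety_spread(PANEL_A, "T")
between = formula_spread(PANEL_A)
print(f"Fisher variety spread:    {fisher:.1%}")
print(f"Tornqvist variety spread: {tornqvist:.1%}")
print(f"Fisher-Tornqvist spread:  {between:.1%}")
```

With the transcribed values, the three averages round to 0.8%, 1.2%, and 1.1%, in line with the figures reported in the text.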

Panel b of Table 4 replicates the results in Panel a of Table 4 for the linear model. That is, the dependent variable in the hedonic model underlying Panel b of Table 4 is $p^{n}_{kt}$ as opposed to $\ln p^{n}_{kt}$ in Panel a of Table 4. The explanatory variables on the right-hand side in both hedonic models are identical. The results in Panel b of Table 4 provide a robustness check on the relative magnitudes of the variety and formula spreads. The average variety spread in Panel b of Table 4 is 1.5% for Fisher and 1.3% for Tornqvist, while the average Fisher-Tornqvist formula spread is 4.8%. In this case, the choice of formula seems to be more important than the choice of variety. (12) The fact that the average spreads are larger in Panel b of Table 4 is a sign that the linear model does not fit the data as well as the semilog model. A second disadvantage of the linear model is that, on occasion, it generates negative price estimates. These observations were removed when we computed the price indexes in Panel b of Table 4.

A comparison of Panels a and b of Table 4 also sheds light on the sensitivity of the results to the choice of functional form for the hedonic model. The average spread between paired observations in Panels a and b of Table 4, excluding the base region-period, is 4.4%. This functional form spread is about the same as the Fisher-Tornqvist formula spread in Panel b of Table 4. It remains to be seen whether similar patterns are observed for other data sets.

VIII. CONCLUSIONS

We have shown how the hedonic imputation method adds a new dimension to the price index problem. In addition to choosing between different formulas, we must also choose a functional form for the hedonic model and decide when to replace actual with imputed prices. The semilog model has a natural affinity with the Tornqvist price index, and the linear model with Fisher. Hence, the choice of price index formula is largely determined by the choice of hedonic model, in much the same way as it is by the choice of functional form for the expenditure function under the economic approach. Essentially, this means that one must choose between hedonic model-price index pairs (e.g., semilog and Tornqvist vs. linear and Fisher).

This, however, is not the end of the story. It is also necessary to distinguish between eight different varieties of each price index formula depending on which prices are imputed. Our preferred variety, at least in the housing context, uses double imputation on the price relatives (even for repeat sales) but does not impute expenditure shares. We show algebraically that the error introduced into the index is likely to be smaller when double imputation is used because the biases tend to offset each other. Our application to house prices in Sydney reveals a difference of between 1% and 2% in the results across varieties of the same formula. Although this variety spread seems to be smaller than the formula and functional form spreads, it is still large enough to warrant attention.

ABBREVIATIONS

CPI: Consumer Price Index

COLI: Cost-of-Living Index

REFERENCES

Anselin, L. Spatial Econometrics: Methods and Models. Dordrecht: Kluwer, 1988.

Bailey, M. J., R. F. Muth, and H. O. Nourse. "A Regression Method for Real Estate Price Index Construction." Journal of the American Statistical Association, 58, 1963, 933-42.

Balk, B. M. "Axiomatic Price Index Theory: A Survey." International Statistical Review, 63, 1995, 69-93.

--. "On Curing the CPI's Substitution and New Goods Bias." Paper Presented at the Fifth Meeting of the Ottawa Group, Reykjavik, Iceland. 1999.

Basu, S., and T. G. Thibodeau. "Analysis of Spatial Correlation in House Prices." Journal of Real Estate Finance and Economics, 17, 1998, 61-85.

Berndt, E. R., Z. Griliches, and N. J. Rappaport. "Econometric Estimates of Price Indexes for Personal Computers in the 1990's." Journal of Econometrics, 68, 1995, 243-68.

Boskin, M. J., E. R. Dulberger, R. J. Gordon, Z. Griliches, and D. Jorgenson. Toward a More Accurate Measure of the Cost of Living, Final Report to the Senate Finance Committee from the Advisory Commission to Study the Consumer Price Index (unpublished), 1996.

Cartwright, D. W. "Improved Deflation of Purchases of Computers." Survey of Current Business, 66, 1986, 7-9.

Court, A. T. "Hedonic Price Indexes with Automotive Examples," in The Dynamics of Automobile Demand. New York: The General Motors Corporation, 1939, 99-117.

de Haan, J. "Direct and Indirect Time Dummy Approaches to Hedonic Price Measurement." Journal of Economic and Social Measurement, 29, 2004a, 427-43.

--. "Hedonic Regression: The Time Dummy Index as a Special Case of the Imputation Tornqvist Index." Paper Presented at the 8th Ottawa Group Meeting, Helsinki. 2004b.

Diewert, W. E. "Exact and Superlative Index Numbers." Journal of Econometrics, 4, 1976, 115-45.

--. "Hedonic Regressions: A Consumer Theory Approach." Discussion Paper 01-12, Department of Economics, University of British Columbia, 2001.

--. "Hedonic Regressions: A Review of Some Unresolved Issues." Mimeo, University of British Columbia, 2003.

--. "Elementary Indices," Chapter 20 in Consumer Price Index Manual: Theory and Practice, edited by T. P. Hill. Geneva: International Labour Organization, 2004, 355-72.

Dubin, R. A. "Estimation of Regression Coefficients in the Presence of Spatial Autocorrelated Error Terms." Review of Economics and Statistics, 70, 1988, 466-74.

Dulberger, E. R. "The Application of a Hedonic Model to a Quality-Adjusted Price Index for Computer Processors," in Technology and Capital Formation, edited by D. W. Jorgenson and R. Landau. Cambridge, MA: MIT Press, 1989, 37-75.

Eichhorn, W., and J. Voeller. Theory of the Price Index, Lecture Notes in Economics and Mathematical Systems, Vol. 140. Berlin: Springer-Verlag, 1976.

Feenstra, R. C. "New Product Varieties and the Measurement of International Prices." American Economic Review, 84, 1994, 157-77.

Fisher, I. The Making of Index Numbers. Boston: Houghton-Mifflin, 1922.

Griliches, Z. "Hedonic Price Indexes for Automobiles: An Econometric Analysis of Quality Change," in The Price Statistics of the Federal Government, edited by G. Stigler (chairman). Washington, DC: Government Printing Office, 1961, 173-96.

Hausman, J. "Valuation of New Goods under Perfect and Imperfect Competition," in The Economics of New Goods, edited by T. F. Bresnahan and R. J. Gordon. Chicago: University of Chicago Press, 1997, 209-37.

--. "Cellular Telephone, New Products, and the CPI." Journal of Business and Economic Statistics, 17, 1999, 188-94.

Hicks, J. R. "The Valuation of Social Income." Economica, 7, 1940, 105-24.

--. Value and Capital. 2nd ed. Oxford: Clarendon Press, 1946.

Hill, R. J. "Constructing Price Indexes Across Countries and Time: The Case of the European Union." American Economic Review, 94, 2004, 1379-410.

Kennedy, P. E. "Estimation with Correctly Interpreted Dummy Variables in Semilogarithmic Equations." American Economic Review, 71, 1981, 801.

Lancaster, K. J. "A New Approach to Consumer Theory." Journal of Political Economy, 74, 1966, 132-57.

Nahm, D. "Incorporating the Effect of New and Disappearing Products into the Cost of Living: An Alternative Index Number Formula." Seoul Journal of Economics, 11, 1998, 261-82.

Pakes, A. "A Reconsideration of Hedonic Price Indexes with an Application to PC's." American Economic Review, 93, 2003, 1578-96.

Rosen, S. "Hedonic Prices and Implicit Markets: Product Differentiation in Pure Competition." Journal of Political Economy, 82, 1974, 34-55.

Schultze, C. S., and C. Mackie. "At What Price? Conceptualizing and Measuring Cost-of-Living and Price Indexes," in Panel on Conceptual, Measurement, and Other Statistical Issues in Developing Cost-of-Living Indexes. Committee on National Statistics, Division of Behavioral and Social Sciences and Education. Washington, DC, National Academy Press, 2002.

Silver, M., and S. Heravi. "Quality Adjustment, Sample Rotation and CPI Practice: An Experiment." Presented at the Sixth Meeting of the International Working Group on Price Indices, Canberra, Australia, April 2-6, 2001.

Triplett, J. E. "Hedonic Methods in Statistical Agency Environments: An Intellectual Biopsy," in Fifty Years of Economic Measurement, edited by E. R. Berndt and J. E. Triplett. Chicago: Chicago University Press and NBER, 1990, 207-33.

--. Handbook on Hedonic Indexes and Quality Adjustments in Price Indexes: Special Application to Information Technology Products. STI Working Paper 2004/9. Paris: Directorate for Science, Technology and Industry, Organisation for Economic Co-operation and Development, 2004.

van Garderen, K. J., and C. Shah. "Exact Interpretation of Dummy Variables in Semilogarithmic Equations." Econometrics Journal, 5, 2002, 149-59.

Waugh, F. V. "Quality Factors Influencing Vegetable Prices." Journal of Farm Economics, 10, 1928, 185-96.

White, H. "A Heteroscedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroscedasticity." Econometrica, 48, 1980, 817-38.

(1.) Nevertheless, in certain circumstances, one may prefer to use imputed prices over actual prices when constructing an imputations index (see Section VI).

(2.) It has been shown by de Haan (2004b) that the time-dummy method can be viewed as a restricted form of the imputations method where implicit characteristics prices are held fixed and only some prices imputed.

(3.) When the semilog model is used, moving from $\ln \hat{p}^{n}_{kt}(z^{n}_{js})$ to $\hat{p}^{n}_{kt}(z^{n}_{js})$ is not straightforward. In particular, simply taking the exponent will generate a biased estimate. This problem is addressed in Kennedy (1981) and van Garderen and Shah (2002); we do not discuss it further as it is not of direct relevance to the problem addressed in this paper.

(4.) By construction, a Laspeyres index only considers the models sold in region-period js.

(5.) See, for example, the Boskin Commission (1996) and the Schultze Report to the National Research Council (2002).

(6.) In constructing the index, we take the exponent of the log price relative. The error in the price relative will be a monotonic (i.e., exponential) transformation of the error in the log price relatives, so we can consider the bias in terms of either (also see Footnote 3 for further details).

(7.) An index similar to L6 has been used since 1968 by the Census Bureau in the United States to compute a price index for new single-family homes (Triplett 1990).

(8.) The limited coverage of house sales may bias the price-level results. However, it is rather less likely to impact on the observed formula and variety spreads and hence is not a major concern for us here.

(9.) Housing is an unusual case for price index construction in two respects. First, in almost all cases $q^{n}_{js} = 1$. That is, it is rare to find two houses sold in a particular region-period js that have identical characteristics. In this case, Paasche and Laspeyres end up resembling elementary indexes of the Dutot type (see Diewert 2004 for a discussion of elementary indexes and their properties). Second, a case can be made for giving equal weight to all house sales, rather than weighting by expenditure shares. The usual justification for weighting price indexes by expenditure shares is so that the basket of goods and services will be representative for the average household. Housing is unusual, however, in that most households only buy one house, and the distribution of house prices is significantly skewed to the right. Most households therefore are not buying from the right tail. Hence, it may be preferable to give all houses equal weight in the index (i.e., to set $w_{js} = 1/N_{js}$) since this will provide an index that is more representative for the median household. However, we do not pursue this path here.

(10.) For any given region-period, the Fisher spread, for example, is calculated as follows: max(F1, F3, F1', F3')/min(F1, F3, F1', F3'). The average spread is then obtained by averaging this result across all region-periods (excluding the base region-period).

(11.) For example, the formula spread for Variety 1 for a particular region-period is calculated as follows: max(F1, T1)/min(F1, T1). These formula spreads are then averaged across all varieties and region-periods (excluding the base region-period).

(12.) Again, this finding is even more clear-cut in a comparison of Laspeyres with Paasche.

ROBERT J. HILL and DANIEL MELSER *

* Funding from the Australian Research Council Discovery Grant DP0667209 and Linkage Grant LP0347618 in collaboration with the Australian Bureau of Statistics is gratefully acknowledged. We also thank Australian Property Monitors for supplying the data.

Hill: Professor, School of Economics, University of New South Wales, Sydney, NSW 2052, Australia. Phone 61-2-93853076, Fax 61-2-93136337, E-mail r.hill@unsw.edu.au

Melser: Lecturer, Department of Economics, Monash University, Caulfield Campus, Melbourne, Vic 3145, Australia. Phone 61-3-99052478, Fax 61-3-99055476, E-mail danielmelser@buseco.monash.edu.au

TABLE 1 Characteristics Used in the Hedonic Model

Physical Characteristics

C1 INTERCEPT
C2 UNIT
C3 TERRACE
C4 SEMI
C5 COTTAGE
C6 TOWNHOUSE
C7 DUPLEX
C8 VILLA
C9 1 BEDROOM
C10 2 BEDROOMS
C11 4 BEDROOMS
C12 5 BEDROOMS
C13 6+ BEDROOMS
C14 1 BATHROOM
C15 3 BATHROOMS
C16 4+ BATHROOMS
C17 AREA
C18 AREA SQUARED
C19 EXTRA ROOM
C20 AIR CONDITION
C21 ALARM SYSTEM
C22 BRICK CONSTR.
C23 COURTYARD
C24 DECK BALCONY
C25 ENSUITE
C26 FREESTANDING
C27 FIRE PLACE
C28 GARDEN
C29 GYM
C30 SECURE PARKING
C31 POLISHED FLOOR
C32 SWIMMING POOL
C33 REAR LANE ACC.
C34 SEPARATE DINING
C35 STRATA
C36 TENNIS COURT
C37 TIMBER FLOOR
C38 TOP FLOOR
C39 WALK IN WARD
C40 WEATHERBOARD
C41 CITY VIEWS
C42 HARBOR VIEWS
C43 WATERFRONT
C44 QUARTER 2
C45 QUARTER 3
C46 QUARTER 4

Geospatial Characteristics
Square Root of Distance To

C47 COMBINED SCHOOL
C48 GOLF CLUB
C49 GENERAL HOSPITAL
C50 PUB. HOSPITAL WITH EMERG.
C51 PRIV. HOSPITAL WITH EMERG.
C52 HEALTH SERVICE (GENERAL)
C53 HIGH SCHOOL
C54 NATIONAL PARK
C55 PARK OR RESERVE
C56 PRE-SCHOOL
C57 PRIMARY SCHOOL
C58 RACECOURSE
C59 RAILWAY STATION
C60 RIFLE RANGE
C61 SHOPPING CENTER--REGIONAL
C62 SHOPPING CENTER--SUB-REG.
C63 SHOPPING CENTER--LOCAL
C64 SWIMMING CENTER
C65 SPECIAL SCHOOL
C66 SYDNEY AIRPORT
C67 TECH./BUS/TRADE COLLEGE
C68 UNIVERSITY
C69 BANKSTOWN AIRPORT
C70 BEACH

TABLE 2 Coefficient Estimates for Each Region-Period for Semilog Model

 c01 c02 c03 n01 n02

C1 12.684 12.731 13.016 14.315 15.265
C2 -0.445 -0.456 -0.467 -0.489 -0.510
C3 -0.150 -0.127 -0.116 -0.048 -0.125
C4 -0.128 -0.097 -0.096 -0.161 -0.201
C5 -0.089 -0.017 -0.047 -0.041 -0.062
C6 -0.326 -0.349 -0.311 -0.291 -0.334
C7 -0.193 -0.157 0.033 -0.071 -0.157
C8 -0.071 -0.274 -0.152 -0.228 -0.214
C9 -0.500 -0.521 -0.556 -0.540 -0.582
C10 -0.194 -0.207 -0.221 -0.186 -0.206
C11 0.149 0.122 0.099 0.135 0.132
C12 0.189 0.208 0.198 0.236 0.170
C13 0.264 0.264 0.096 0.186 0.290
C14 -0.106 -0.096 -0.104 -0.024 -0.020
C15 0.082 0.147 0.167 0.075 0.097
C16 0.269 0.285 0.298 0.206 0.281
C17 0.000 0.000 0.000 0.000 0.000
C18 0.000 0.000 0.000 0.000 0.000
C19 0.048 0.063 0.064 0.059 0.064
C20 0.139 0.126 0.096 0.093 0.044
C21 -0.156 0.026 0.058 0.134 0.076
C22 0.015 -0.042 -0.014 -0.143 0.013
C23 -0.018 -0.028 -0.014 -0.006 0.021
C24 0.000 -0.004 -0.012 -0.016 -0.019
C25 0.067 0.091 0.077 0.043 0.065
C26 -0.025 -0.037 0.021 -0.038 0.019
C27 0.046 0.035 0.027 0.017 0.040
C28 0.027 0.012 0.034 0.030 0.037
C29 0.131 0.020 0.106 0.099 0.137
C30 0.116 0.101 0.103 0.028 0.025
C31 0.048 -0.003 -0.009 -0.053 -0.011
C32 0.112 0.113 0.075 0.092 0.065
C33 0.021 0.019 0.014 0.744 -0.713
C34 -0.042 -0.017 -0.025 -0.052 -0.032
C35 -0.049 -0.049 -0.103 0.003 0.006
C36 0.187 0.083 0.130 0.287 0.273
C37 -0.011 -0.006 -0.007 -0.008 0.018
C38 0.397 0.060 0.083 0.265 -0.016
C39 0.075 0.009 -0.006 0.031 0.061
C40 -0.113 -0.046 -0.079 0.149 0.107
C41 0.018 -0.080 -0.035 0.109 -0.028
C42 0.128 0.150 0.149 0.177 0.183
C43 0.390 0.360 0.418 0.381 0.323
C44 0.050 0.073 0.030 0.050 0.058
C45 0.108 0.088 0.097 0.108 0.103
C46 0.117 0.102 0.083 0.137 0.125
C47 0.219 0.236 0.267 0.450 0.311
C48 -0.204 -0.181 -0.252 -0.332 -0.435
C49 -0.118 -0.166 -0.171 0.058 0.085
C50 0.039 0.036 0.070 -0.009 0.022
C51 -0.029 -0.021 0.001 -0.090 -0.110
C52 -0.157 -0.108 -0.098 -0.024 0.041
C53 -0.165 -0.157 -0.235 -0.306 -0.341
C54 0.122 0.147 0.176 0.008 0.015
C55 0.095 0.119 0.064 0.072 0.070
C56 -0.063 -0.079 -0.095 0.029 0.055
C57 -0.076 -0.023 -0.043 0.032 0.029
C58 -0.010 0.032 -0.009 0.089 0.039
C59 0.062 0.087 0.098 0.025 0.015
C60 -0.049 -0.074 -0.062 -0.310 -0.306
C61 0.121 0.098 0.063 -0.162 -0.154
C62 0.123 0.108 0.157 0.086 0.102
C63 -0.142 -0.134 -0.184 0.034 0.027
C64 0.109 0.063 0.134 0.038 0.013
C65 0.062 0.045 0.060 0.065 0.072
C66 -0.011 -0.008 0.030 -0.095 -0.087
C67 0.117 0.134 0.139 -0.051 -0.066
C68 0.317 0.276 0.264 -0.243 -0.197
C69 0.158 -0.128 -0.152 0.094 0.068
C70 -0.040 0.000 -0.036 0.246 0.265

 n03 w01 w02 w03

C1 15.821 12.761 13.089 13.198
C2 -0.606 -0.437 -0.489 -0.522
C3 -0.121 -0.107 -0.098 -0.046
C4 -0.211 -0.112 -0.170 -0.149
C5 -0.066 -0.022 -0.038 0.011
C6 -0.424 -0.224 -0.228 -0.314
C7 -0.191 -0.065 -0.090 -0.106
C8 -0.349 -0.256 -0.304 -0.297
C9 -0.558 -0.364 -0.375 -0.310
C10 -0.157 -0.122 -0.115 -0.084
C11 0.102 0.136 0.133 0.100
C12 0.186 0.298 0.221 0.186
C13 0.235 0.279 0.272 0.378
C14 -0.046 -0.045 -0.071 -0.050
C15 0.081 0.094 0.088 0.089
C16 0.189 0.266 0.082 0.274
C17 0.000 0.000 0.000 0.000
C18 0.000 0.000 0.000 0.000
C19 0.027 0.036 0.050 0.039
C20 0.060 0.072 0.068 0.043
C21 0.077 0.142 0.020 0.029
C22 -0.014 -0.050 0.018 0.052
C23 0.010 0.037 0.023 -0.016
C24 -0.028 -0.007 0.023 0.016
C25 0.057 0.096 0.057 0.094
C26 -0.082 -0.058 -0.075 -0.084
C27 0.034 0.041 0.030 0.031
C28 0.020 0.016 0.013 0.024
C29 0.175 0.155 -0.056 -0.052
C30 0.038 0.035 0.021 0.019
C31 -0.009 0.015 0.007 0.007
C32 0.059 0.078 0.082 0.082
C33 -0.512 0.075 -0.046 0.008
C34 0.009 -0.025 -0.027 -0.019
C35 -0.124 0.050 -0.048 -0.058
C36 0.217 0.396 0.286 0.268
C37 -0.027 0.026 0.001 0.004
C38 0.047 0.501 0.073 0.007
C39 0.084 0.010 0.047 0.080
C40 0.065 -0.086 0.086 -0.015
C41 -0.038 0.143 0.055 0.007
C42 0.168 0.291 0.249 0.230
C43 0.383 0.538 0.476 0.461
C44 0.003 0.051 0.071 0.044
C45 0.065 0.104 0.130 0.094
C46 0.071 0.145 0.159 0.117
C47 0.185 0.031 -0.008 -0.014
C48 -0.428 -0.119 -0.160 -0.182
C49 0.083 0.060 0.064 0.055
C50 0.000 -0.018 -0.008 -0.003
C51 -0.069 -0.039 -0.034 -0.026
C52 0.001 0.028 0.033 0.016
C53 -0.280 -0.035 -0.029 -0.038
C54 0.035 -0.078 -0.062 -0.037
C55 0.035 0.002 0.031 0.039
C56 0.016 0.024 0.062 0.060
C57 0.063 0.054 0.054 0.085
C58 0.057 0.021 0.027 0.020
C59 0.044 0.034 0.058 0.052
C60 -0.291 0.035 0.055 0.043
C61 -0.194 0.169 0.137 0.157
C62 0.053 0.111 0.125 0.117
C63 0.035 -0.124 -0.124 -0.097
C64 0.018 0.108 0.076 0.085
C65 0.023 -0.010 -0.006 0.027
C66 -0.109 0.125 0.084 0.092
C67 -0.055 -0.009 -0.064 -0.055
C68 -0.135 -0.058 -0.021 0.001
C69 0.117 0.122 0.146 0.125
C70 0.250 -0.188 -0.214 -0.208

Notes: Numbers in bold are significant at the 95% level. c01, central
region in 2001; c02, central region in 2002; c03, central region in
2003; n01, northern region in 2001; n02, northern region in 2002; n03,
northern region in 2003; w01, western region in 2001; w02, western
region in 2002; w03, western region in 2003.

TABLE 3 Regression Statistics (semilog model)

       R SQ.    R SQ. ADJ.   NO. PAR.   NO. OBSERVATIONS   F STATISTIC

c01    0.7423   0.7388       70         5,169              212.8
c02    0.7739   0.7707       70         4,979              243.5
c03    0.7685   0.7646       70         4,101              194.0
n01    0.7441   0.7408       70         5,440              226.3
n02    0.7442   0.7397       70         3,983              165.0
n03    0.7186   0.7128       70         3,432              124.4
w01    0.6997   0.6957       70         5,311              176.9
w02    0.7383   0.7339       70         4,221              169.7
w03    0.7418   0.7378       70         4,518              185.2

Notes: c01, central region in 2001; c02, central region in 2002; c03,
central region in 2003; n01, northern region in 2001; n02, northern
region in 2002; n03, northern region in 2003; w01, western region in
2001; w02, western region in 2002; w03, western region in 2003; R SQ.,
R squared; R SQ. ADJ., R squared adjusted; NO. PAR., number of
parameters.

TABLE 4
Temporal and Spatial Price Indexes for Regions in Sydney

       P_{c01,c01}  P_{c01,c02}  P_{c01,c03}  P_{c01,n01}  P_{c01,n02}  P_{c01,n03}  P_{c01,w01}  P_{c01,w02}  P_{c01,w03}

(a) Semilog model
 F1    1.0000       1.1843       1.3181       0.9938       1.2671       1.3524       0.7846       0.9623       1.0955
 F3    1.0000       1.1856       1.3204       0.9954       1.2719       1.3552       0.7872       0.9701       1.0989
 F1'   1.0000       1.1928       1.3291       0.9906       1.2773       1.3584       0.7926       0.9673       1.1097
 F3'   1.0000       1.1856       1.3216       0.9950       1.2727       1.3554       0.7902       0.9708       1.1037
 T1    1.0000       1.1862       1.3143       0.9983       1.2509       1.3480       0.7967       0.9760       1.1041
 T3    1.0000       1.1869       1.3210       1.0014       1.2633       1.3543       0.8072       0.9928       1.1203
 T1'   1.0000       1.1866       1.3210       1.0033       1.2686       1.3582       0.8110       0.9978       1.1278
 T3'   1.0000       1.1870       1.3218       1.0028       1.2676       1.3572       0.8119       0.9991       1.1285
(b) Linear model
 F1    1.0000       1.1930       1.2595       1.0162       1.2320       1.3846       0.7057       0.8557       0.9762
 F3    1.0000       1.1795       1.2466       0.9838       1.2328       1.3885       0.7164       0.8670       0.9806
 F1'   1.0000       1.1974       1.2707       1.0150       1.2393       1.3897       0.7108       0.8660       0.9864
 F3'   1.0000       1.1934       1.2606       1.0187       1.2350       1.3887       0.7050       0.8546       0.9760
 T1    1.0000       1.1877       1.2510       1.0228       1.2270       1.3953       0.7883       0.9467       1.0707
 T3    1.0000       1.1839       1.2562       1.0272       1.2380       1.4053       0.8016       0.9637       1.0887
 T1'   1.0000       1.1895       1.2579       1.0253       1.2394       1.4042       0.8055       0.9702       1.0957
 T3'   1.0000       1.1910       1.2538       1.0241       1.2319       1.3979       0.7901       0.9496       1.0757

Notes: c01, central region in 2001; c02, central region in 2002; c03,
central region in 2003; n01, northern region in 2001; n02, northern
region in 2002; n03, northern region in 2003; w01, western region in
2001; w02, western region in 2002; w03, western region in 2003.