Hedonic imputation and the price index problem: an application to housing.
Hill, Robert J. ; Melser, Daniel
I. INTRODUCTION
Price indexes play a significant role in modern economies. The
consumer price index (CPI), for example, is used to index various
government payments, as a target for monetary policy and as a benchmark
in wage negotiations. Our focus in this paper, however, is on price
indexes at a more disaggregated level, in markets where it is hard to
match products from one period or region to the next. Computers and
housing are notable examples of such markets. As well as being important
inputs into the CPI, price indexes for such goods are often useful in
their own right. Price indexes for computers play a critical role in
productivity measurement across market sectors, while house price
indexes provide an important indication of the state of an economy.
For the case of computers, the matching problem arises due to
technological progress, which leads to the rapid evolution of products
in the market, resulting in a short product cycle. For housing, the
problem is that every house is different and that they tend to sell
relatively infrequently. Hence, there is usually very little overlap in
the houses sold from one period to the next and no overlap at all from
one region to the next.
The fact that products can often not be matched across periods or
regions poses a significant measurement problem in that it is therefore
difficult to disentangle price differences from changes in the quality
of products. In this paper, we focus primarily on the hedonic regression method for solving this problem. The hedonic method reduces the matching
problem to one of comparing products on the basis of their
characteristics. The "regression" aspect of hedonic regression
refers to how the implicit prices for these characteristics are
measured.
In the next section, we explain what is meant by the price index
problem. Section III outlines more rigorously the measurement problem
created by unmatched products. The hedonic imputation method is
introduced in Section IV. Section V shows how the use of the hedonic
imputation method complicates the price index problem. In addition to
choosing between different formulas such as Fisher and Tornqvist, it is
necessary to choose between different varieties of each formula. This is
because index compilers have a certain amount of discretion over which
prices are imputed. Possible solutions are considered in Section VI. We
show that the choice of formula variety can affect the sensitivity of
the results to omitted variables bias. The choice of price index formula
(as opposed to variety) is also considered in this section. We show that
this is intimately connected with the choice of functional form for the
hedonic model. Section VII provides an empirical application of the
issues raised. The case considered is the construction of house price
indexes for three regions in Sydney over a 3-yr period. Section VIII
concludes the paper.
II. THE PRICE INDEX PROBLEM
Let [P.sub.js,kt] denote a bilateral price index comparison between
region j in time period s and region k in time period t. The price and
quantity data of commodity heading n for region k in period t are
denoted, respectively, by [p.sup.n.sub.kt] and [q.sup.n.sub.kt]. Six
important bilateral formulas are: Paasche, Laspeyres, Fisher, geometric
Paasche, geometric Laspeyres, and Tornqvist. These indexes are defined
as follows:
(1) Paasche: $P^{P}_{js,kt} = \frac{\sum_{n=1}^{N} p^{n}_{kt} q^{n}_{kt}}{\sum_{n=1}^{N} p^{n}_{js} q^{n}_{kt}}$
(2) Laspeyres: $P^{L}_{js,kt} = \frac{\sum_{n=1}^{N} p^{n}_{kt} q^{n}_{js}}{\sum_{n=1}^{N} p^{n}_{js} q^{n}_{js}}$
(3) Fisher: $P^{F}_{js,kt} = \sqrt{P^{P}_{js,kt} \times P^{L}_{js,kt}}$
(4) Geometric Paasche: $P^{GP}_{js,kt} = \prod_{n=1}^{N} \left( p^{n}_{kt} / p^{n}_{js} \right)^{w^{n}_{kt}}$
(5) Geometric Laspeyres: $P^{GL}_{js,kt} = \prod_{n=1}^{N} \left( p^{n}_{kt} / p^{n}_{js} \right)^{w^{n}_{js}}$
(6) Tornqvist: $P^{T}_{js,kt} = \prod_{n=1}^{N} \left( p^{n}_{kt} / p^{n}_{js} \right)^{(w^{n}_{js} + w^{n}_{kt})/2}$.
Here, [w.sup.n.sub.kt] =
[p.sup.n.sub.kt][q.sup.n.sub.kt]/[[summation].sup.N.sub.m=1]
[p.sup.m.sub.kt][q.sup.m.sub.kt] denotes the expenditure share of
product n in region-period kt.
These price index formulas all give the same answer if the price
data satisfy the conditions for Hicks' aggregation theorem (Hicks 1946), that is, [p.sup.n.sub.kt] = [lambda][p.sup.n.sub.js] [for all] n.
Under this scenario, all the price relatives
[p.sup.n.sub.kt]/[p.sup.n.sub.js] take the same value [lambda]; hence,
there is no substitution effect. In such cases, [P.sub.js,kt] =
[lambda], irrespective of the choice of formula. However, when there is
some variation in the price relatives across products, the formulas
diverge from each other. This is what is meant by the price index
problem. It is a problem that has attracted some of the greatest minds
in the economics profession over the best part of two centuries, such as
Marshall, Edgeworth, Fisher, and Samuelson. Fisher (1922), for example,
considers in excess of 100 different formulas.
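The six formulas and the Hicks condition can be illustrated with a minimal numerical sketch. It assumes fully matched products, takes js as the base and kt as the comparison region-period, and uses made-up prices and quantities; the function name `bilateral_indexes` is hypothetical and not part of the paper.

```python
import numpy as np

def bilateral_indexes(p_js, q_js, p_kt, q_kt):
    """Six bilateral indexes for fully matched products (base js, comparison kt)."""
    p_js, q_js, p_kt, q_kt = (np.asarray(a, dtype=float)
                              for a in (p_js, q_js, p_kt, q_kt))
    w_js = p_js * q_js / np.sum(p_js * q_js)   # expenditure shares in js
    w_kt = p_kt * q_kt / np.sum(p_kt * q_kt)   # expenditure shares in kt
    rel = p_kt / p_js                          # price relatives
    paasche = np.sum(p_kt * q_kt) / np.sum(p_js * q_kt)
    laspeyres = np.sum(p_kt * q_js) / np.sum(p_js * q_js)
    geo_paasche = np.prod(rel ** w_kt)
    geo_laspeyres = np.prod(rel ** w_js)
    return {
        "paasche": paasche,
        "laspeyres": laspeyres,
        "fisher": np.sqrt(paasche * laspeyres),
        "geo_paasche": geo_paasche,
        "geo_laspeyres": geo_laspeyres,
        "tornqvist": np.sqrt(geo_paasche * geo_laspeyres),
    }

# Illustrative data only.
p0 = np.array([2.0, 5.0, 10.0])
q0 = np.array([30.0, 10.0, 4.0])

# Hicks case: every price scales by lambda = 1.1, so all formulas agree.
hicks = bilateral_indexes(p0, q0, 1.1 * p0, np.array([25.0, 12.0, 5.0]))

# Uneven price relatives: the formulas diverge (the price index problem).
uneven = bilateral_indexes(p0, q0, np.array([2.6, 5.0, 9.0]),
                           np.array([22.0, 11.0, 5.0]))

for name in hicks:
    print(f"{name:14s} hicks={hicks[name]:.4f} uneven={uneven[name]:.4f}")
```

In the proportional case every formula returns lambda regardless of the quantities, while in the second case the spread between Laspeyres and Paasche reflects the variation in price relatives.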
The price index problem has been attacked from two main directions,
usually referred to as the economic and the axiomatic approaches. The
economic approach views quantities as utility maximizing responses to
prices. This approach has culminated in the work of Diewert (1976), who
proposed the concept of a superlative price index (a class of indexes
that attain a second-order approximation to the underlying
cost-of-living index [COLI]). Each index outlined above can be derived
from a particular functional form for the cost or the utility function.
Diewert's contribution was to show that some of the indexes are
based upon more flexible representations of the cost function than
others. The Fisher and Tornqvist indexes are superlative, as they
allow for flexible substitution behavior. An alternative approach to
justifying the form of index numbers is the axiomatic approach, which
proposes a series of axioms that a price index should satisfy and then
discriminates between them on the basis of their performance relative to
these axioms (Balk 1995; Eichhorn and Voeller 1976). Fortunately, the
axiomatic approach also tends to favor the Fisher and Tornqvist indexes
as these usually emerge as best.
This literature, however, assumes that there is no matching
problem. That is, it is assumed that all region-periods supply price and
quantity data on the same list of commodity headings. Once this
assumption is relaxed, the price index problem becomes more complex.
III. THEORETICAL FOUNDATIONS OF THE HEDONIC APPROACH
The problem posed by incompletely overlapping sets of products can
be seen by outlining the conventional economic measurement framework. In
terms of measuring price change between region j in period s and region
k in period t, we want to estimate the cost for some representative
consumer of obtaining a given level of utility under the two price and
choice set regimes. Let the time periods be indexed by t = 1, ..., T;
the set of regions by k = 1, ..., K; and the set of commodity headings
by n = 1, ..., [N.sub.kt]. The price and quantity data of commodity
heading n for region k in period t are denoted, respectively, by
[p.sup.n.sub.kt] and [q.sup.n.sub.kt]. The COLI is defined as follows:
[P.sup.*.sub.js,kt] = C([p.sub.kt],[??])/C([p.sub.js],[??])
where the cost function is defined below.
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
The problem that arises frequently in practice is that the price
vectors [p.sub.kt] and [p.sub.js] may not be comparable. For example,
there might be some variety of computer that is available in
region-period kt but not in region-period js. This makes the estimation
of the COLI more complex.
A number of methods have been developed to tackle this problem.
Hausman (1997, 1999) following Hicks (1940) suggested estimating the
reservation price of the non-matched items. While this approach is
conceptually appealing, it involves the estimation of demand systems and
is econometrically and theoretically complex. Detailed data on both the
prices paid and the quantities purchased by consumers are also required.
An alternative approach suggested by Feenstra (1994) is to assume that
the cost function takes the constant elasticity of substitution functional form in which case it is possible to derive the COLI exactly
(see also Balk 1999; Nahm 1998). However, perhaps the most promising
approach to dealing with hard-to-match products is hedonic regression.
The hedonic approach dates back to Waugh (1928) and Court (1939).
However, it was only with Griliches (1961) that interest in hedonics really took off (Schultze and Mackie 2002; Triplett 2004).
The conceptual basis of the hedonic approach, dating back to
Lancaster (1966) and Rosen (1974), is that consumers' utility is
derived from the characteristics of the goods and hence decisions also
relate to these characteristics. At its most general, the hedonic
approach reorients the measurement problem to one related to
characteristics rather than to goods, which are bundles of
characteristics.
At a conceptual level, there appear to be two main options for the
application of hedonic techniques. First, we could completely
reconstruct the consumers' optimization problem in terms of
characteristics. That is, we could think of consumers minimizing the
cost of obtaining a certain level of characteristics utility, given
characteristics prices. Such an approach is at least implicit in the
writing of Triplett (2004). For illustrative purposes, let us depict such a cost function, where [z.sup.c] denotes the characteristics c = 1,
..., C and [b.sup.c.sub.kt] the prices of these characteristics in
region-period kt.
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
This may be termed a characteristics approach and gives rise to
price indexes defined over the characteristics (rather than goods)
prices and quantities.
A second approach is to use the hedonic hypothesis to construct a
relationship between prices and characteristics and to apply this
relationship in goods space. If the hedonic price relationship is
denoted by [p.sup.*.sub.kt], then this can be thought of as enabling the
extension of the cost function to those goods for which we do not have
comparable prices in region-periods kt and js. We order the goods so
that the [N.sub.js] models available in region-period js are as follows:
n = 1, ..., [N.sub.js,kt] indexes the models available in both
region-periods js and kt, while n = [N.sub.js,kt] + 1, ..., [N.sub.js]
indexes the models available in region-period js but not in
region-period kt.
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
This approach to hedonics is called the imputations approach as we
are imputing missing prices.
The characteristics and imputations approaches are quite different.
The imputations method extends the well-established goods space approach
to price measurement. The characteristics approach, however, transforms
the whole problem into characteristics space.
A significant drawback of the characteristics approach is that
characteristics are not observed directly. The goods that are actually
traded are tied bundles of characteristics. This means that except in
very rare circumstances, we do not observe characteristics prices
directly; hence, characteristics prices must always be estimated
rather than recorded directly from market transactions. This is in
contrast to the imputations index that requires estimation only in the
case where there is incomplete matching of goods. (1) For these reasons,
we prefer the imputations approach over the characteristics method.
IV. THE HEDONIC IMPUTATION METHOD
When a product sold in region-period js is not sold in
region-period kt (or vice versa), it is no longer possible to compute the bilateral formulas mentioned in Section II. The problem is not the
presence of zero quantities per se but rather that when [q.sub.kt] = 0,
the corresponding price [p.sub.kt] is not observed. Zero quantities can
arise in a temporal context due to technological progress and market
turnover leading to the emergence of new goods and the disappearance of
existing goods. If a new improved model of computer is simply matched
with the previous model, this will create an upward bias in the price
index (Boskin et al. 1996). A quality adjustment must therefore be made.
The hedonic imputation method is ideal for this task. It can be used to
impute what the price of the new model would have been the period before
it appeared and the price of the old model the period after it
disappeared.
Housing is particularly problematic since every house is different.
Hence, there may be very little overlap in the houses sold from one
period to the next and zero overlap from one region to the next. In a
temporal context, the repeat-sales method of Bailey, Muth, and Nourse
(1963) has nevertheless been extensively used. Whether this introduces a
bias depends on whether there are any inherent differences (that lead to
differing price paths) between houses that sell frequently and those
that do not. Even if such an index is free of bias, it will dramatically
limit the range of house sales that are included in each bilateral
comparison and therefore increase the variance and reduce the
reliability of the comparison.
Hedonic methods seem to provide the only satisfactory way of
computing price indexes when there is a significant mismatch of products
across periods (regions). Three main classes of hedonic methods have
been proposed in the literature. They go by various names. Here, we
adopt the terminology of Triplett (2004) and refer to them as the
hedonic imputation, time-dummy, and characteristics price index methods,
respectively. Our focus in this paper is on the hedonic imputation
method, which uses hedonic regressions to impute prices for any models
that are missing in particular region-periods. Once all the models have
been matched, standard price index formulas can then be used. If
required, these bilateral price indexes (e.g., Fisher and Tornqvist) can
then be multilateralized. The hedonic imputation method is used by the
Bureau of Economic Analysis to construct price indexes for computers in
the U.S. national accounts (Cartwright 1986; Dulberger 1989; Triplett
2004).
Our reasons for preferring the hedonic imputation method to the
characteristics approach are outlined in the previous section. Our
concern with the time-dummy method arises from the fact that it computes
a single pooled regression equation for all the periods in the sample
and derives the price indexes directly from the regression equation.
This has the major disadvantage that it does not allow characteristic
shadow prices to change over time, a drawback that has led to criticism
(Berndt, Griliches, and Rappaport 1995; Pakes 2003; Schultze and Mackie
2002). (2) Also, it is difficult to update the results when new periods
are added to the data set, since reestimation of the hedonic equation
will change all the results. In other words, the time-dummy method
violates temporal fixity (Hill 2004).
The hedonic imputation method runs a separate regression for each
region-period in the comparison. The explanatory variables on the
right-hand side of the equation are the characteristics of the product.
For the case of computers, examples of relevant characteristics include
RAM, hard drive capacity, and processor speed. For the case of housing,
relevant characteristics include land area, number of bathrooms, and
geospatial characteristics such as distance from the city center.
The choice of functional form for the hedonic model is an
interesting question in its own right. Here, we focus attention on the
linear and semilog models. These models differ only in the dependent
variable, which in our case takes the form [p.sup.n.sub.kt] for the
linear model and ln [p.sup.n.sub.kt] for the semilog model. The
functional form for the semilog hedonic model is as follows:
(7) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Consider a product model n sold in region-period js. An imputed
price for this same model in region-period kt is obtained as follows:
(8) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where [[??].sub.c,kt] denotes the estimator of [[beta].sub.c,kt].
(3) We abstract here from issues of exactly how [[??].sub.c,kt] is
computed but return to this in the empirical section. The important
point to note at this stage is that when product n is unavailable in
region-period kt, it can be imputed and then standard price index
formulas used.
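The imputation step can be sketched in a few lines. The sketch assumes a semilog model of the form ln p = beta_0 + sum_c beta_c z_c + error, estimated by ordinary least squares separately for each region-period; the intercept, the omission of any retransformation correction, the synthetic data, and the helper names `fit_semilog` and `impute_price` are all assumptions made for illustration, not the authors' estimation procedure.

```python
import numpy as np

def fit_semilog(z, p):
    """OLS of ln(price) on an intercept and the characteristics matrix z."""
    X = np.column_stack([np.ones(len(z)), z])
    beta, *_ = np.linalg.lstsq(X, np.log(p), rcond=None)
    return beta

def impute_price(beta, z):
    """Imputed price exp(beta_0 + sum_c beta_c * z_c) for each row of z."""
    X = np.column_stack([np.ones(len(z)), z])
    return np.exp(X @ beta)

rng = np.random.default_rng(0)
n = 200
# Hypothetical characteristics: land area (hundreds of square meters), bathrooms.
z_js = np.column_stack([rng.uniform(3, 9, n), rng.integers(1, 4, n)])
z_kt = np.column_stack([rng.uniform(3, 9, n), rng.integers(1, 4, n)])
# Synthetic prices generated from an assumed semilog relationship.
p_js = np.exp(12.0 + 0.10 * z_js[:, 0] + 0.15 * z_js[:, 1] + rng.normal(0, 0.05, n))
p_kt = np.exp(12.1 + 0.11 * z_kt[:, 0] + 0.15 * z_kt[:, 1] + rng.normal(0, 0.05, n))

# One regression per region-period.
beta_js = fit_semilog(z_js, p_js)
beta_kt = fit_semilog(z_kt, p_kt)

# Price each house sold in js would have fetched in kt, given its characteristics.
p_hat_kt_of_js = impute_price(beta_kt, z_js)
```

Once `p_hat_kt_of_js` is available, the houses sold in js have a price in both region-periods, and any standard bilateral formula can be applied.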
V. ALTERNATIVE VARIETIES OF HEDONIC PRICE INDEXES
The use of the hedonic imputation method adds a new dimension to
the price index problem. This is because we have some discretion as to
which prices are imputed. If a product is unavailable in a particular
region-period, we have no choice but to impute it. Even if the product
is available, we may nevertheless still prefer to use an imputed price
over the actual price. This might seem counterintuitive. However, it
turns out that replacing real prices with imputations can sometimes
reduce the omitted variables bias and help ensure that like is compared
with like. These issues are explored further in the next section.
To illustrate how the hedonic imputation method complicates the
price index problem, we focus first on the case of the Laspeyres price
index. Four different varieties of the Laspeyres price index are
obtained depending on how exactly the hedonic imputation method is
implemented. For the case of L2 and L4, we order the [N.sub.js] models
available in region-period js as follows: n = 1, ..., [N.sub.js,kt]
indexes the models sold in both region-periods js and kt, while n =
[N.sub.js,kt] + 1, ..., [N.sub.js] indexes the models sold in
region-period js but not in region-period kt. (4)
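Laspeyres 1: $P^{L1}_{js,kt} = \sum_{n=1}^{N_{js}} w^{n}_{js} \left[ \hat{p}^{n}_{kt}(z^{n}_{js}) / p^{n}_{js} \right]$
Laspeyres 2: $P^{L2}_{js,kt} = \sum_{n=1}^{N_{js,kt}} w^{n}_{js} \left( p^{n}_{kt} / p^{n}_{js} \right) + \sum_{n=N_{js,kt}+1}^{N_{js}} w^{n}_{js} \left[ \hat{p}^{n}_{kt}(z^{n}_{js}) / p^{n}_{js} \right]$
Laspeyres 3: $P^{L3}_{js,kt} = \sum_{n=1}^{N_{js}} w^{n}_{js} \left[ \hat{p}^{n}_{kt}(z^{n}_{js}) / \hat{p}^{n}_{js}(z^{n}_{js}) \right]$
Laspeyres 4: $P^{L4}_{js,kt} = \sum_{n=1}^{N_{js,kt}} w^{n}_{js} \left( p^{n}_{kt} / p^{n}_{js} \right) + \sum_{n=N_{js,kt}+1}^{N_{js}} w^{n}_{js} \left[ \hat{p}^{n}_{kt}(z^{n}_{js}) / \hat{p}^{n}_{js}(z^{n}_{js}) \right]$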
The expenditure weights are defined as follows:
[w.sup.n.sub.js] = [p.sup.n.sub.js][q.sup.n.sub.js]/[[N.sub.js].summation
over (m = 1)] [p.sup.m.sub.js][q.sup.m.sub.js].
Variety 2 uses the minimum number of imputations. The other three
varieties (to differing extents) sometimes throw away real price
observations and replace them with imputed prices.
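A small sketch makes the difference between the four varieties concrete. It assumes Variety 1 always uses single imputation, Variety 2 uses single imputation only for unmatched models, Variety 3 always uses double imputation, and Variety 4 uses double imputation only for unmatched models, with actual expenditure weights throughout. The function name and all numbers are hypothetical.

```python
import numpy as np

def laspeyres_varieties(p_js, q_js, p_kt, matched, p_hat_js, p_hat_kt):
    """Four Laspeyres varieties over the models sold in js.

    p_js, q_js : actual prices and quantities in js
    p_kt       : actual prices in kt (np.nan where the model is not sold in kt)
    matched    : boolean mask, True where the model is sold in both js and kt
    p_hat_js   : imputed js prices of the js models
    p_hat_kt   : imputed kt prices of the js models
    """
    w = p_js * q_js / np.sum(p_js * q_js)          # actual expenditure weights
    single = p_hat_kt / p_js                       # imputed kt over actual js
    double = p_hat_kt / p_hat_js                   # imputed kt over imputed js
    actual = np.where(matched, p_kt, np.nan) / p_js  # only defined when matched
    return {
        "L1": float(np.sum(w * single)),
        "L2": float(np.sum(w * np.where(matched, actual, single))),
        "L3": float(np.sum(w * double)),
        "L4": float(np.sum(w * np.where(matched, actual, double))),
    }

# Four houses sold in js (quantity 1 each); the last one is not sold in kt.
p_js = np.array([100.0, 200.0, 300.0, 400.0])
q_js = np.ones(4)
p_kt = np.array([110.0, 210.0, 330.0, np.nan])
matched = np.array([True, True, True, False])
p_hat_js = np.array([105.0, 190.0, 310.0, 380.0])
p_hat_kt = np.array([112.0, 205.0, 335.0, 420.0])

varieties = laspeyres_varieties(p_js, q_js, p_kt, matched, p_hat_js, p_hat_kt)
print(varieties)
```

Varieties 2 and 4 agree on the three matched houses and differ only in how the unmatched fourth house enters the index.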
Four more varieties of Laspeyres indexes are obtained if we allow
expenditure weights to be imputed as well. That is, in each of the four
equations mentioned above, the expenditure weight [w.sup.n.sub.js]
could be replaced by an imputed expenditure weight [[??].sup.n.sub.js]
that is calculated as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
We refer to these variants on the four methods mentioned above as
L1', L2', L3', and L4'. We argue in the next section
that there is no clear benefit to replacing actual expenditure weights
with imputations. Hence, L1', L2', L3', and L4' do
not warrant serious consideration.
Different varieties of Paasche, Fisher, geometric Paasche,
geometric Laspeyres, and Tornqvist can be derived in an analogous
manner. Here, we illustrate this point for the first variety only:
Paasche 1 : [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
Fisher 1 : [P.sup.F1.sub.js,kt] = [square root of
[P.sup.P1.sub.js,kt] x [P.sup.L1.sub.js,kt]]
Geometric Paasche 1 : [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN
ASCII]
Geometric Laspeyres 1 : [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE
IN ASCII]
Tornqvist 1 : [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
It should be noted that if we allow different varieties of
Laspeyres and Paasche indexes to be combined, we obtain 64 rather than 8
different varieties of Fisher. By applying an analogous argument to
geometric Laspeyres and geometric Paasche indexes, we also obtain 64
varieties of Tornqvist. However, it would be hard to justify on economic
grounds combining unmatched pairs of Laspeyres and Paasche or geometric
Laspeyres and geometric Paasche indexes. Hence, in practice, we can
limit the range of varieties of Fisher and Tornqvist to eight (or four
once we rule out imputing expenditure weights).
VI. CHOOSING A PRICE INDEX VARIETY AND FORMULA
A. Single versus Double Imputation
Which variety is best? To simplify matters, this question is
addressed for the case of the Laspeyres index. The arguments carry
forward equally well to other price index formulas such as Fisher and
Tornqvist. Our focus is on minimizing omitted variables bias and
generating results that are easy to interpret. Other criteria such as
variance and sample selection bias minimization are not considered here
(see Pakes 2003, for a discussion of these criteria).
The eight varieties of Laspeyres indexes differ in their treatment
of the price relatives and expenditure shares. We consider the former
first. A Laspeyres index only considers models sold in the base
region-period. A distinction can be drawn between price relatives where
the model is sold in only region-period js and price relatives where it
is sold in both region-periods. Again, we consider the former first.
Single imputation uses the price relatives
[[??].sup.n.sub.kt]([z.sup.n.sub.js])/[p.sup.n.sub.js], while double
imputation uses [[??].sup.n.sub.kt]([z.sup.n.sub.js])/[[??].sup.n.sub.js] ([z.sup.n.sub.js]). There has been some debate in the literature on
which approach is best. The discussion focuses primarily on the case of
computers. Silver and Heravi (2001), Pakes (2003) and de Haan (2004a)
all argue that double imputation may be preferable because of the
problem that price-determining variables may be omitted from the hedonic
equation, which can bias the estimated price relative in the single
imputation case.
To see how omitted variables affect the estimated price relatives,
it is useful to work through the problem algebraically. Here, we focus
on any potential bias introduced into the index by single and double
imputation methods. While the variability of the index is also of
concern, the problem of bias is often given higher priority in the
production of official statistics. (5) Let us suppose that the hedonic
researcher estimates a model with a set of characteristics, c = 1, ...,
C. The model with estimated parameters [[??].sub.js,c] is shown below
for the semilog specification:
(9) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
In fact, the true model reflects a set of additional
characteristics, d = 1, ..., D, which are not available to the
researcher. The true data-generating model of prices, with estimated
parameters, is shown below:
(10) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Or alternatively the estimate of the log price from the complete
model can be written as the sum of the actual value and the regression
residual. That is,
(11) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Consider first the case where we use the actual prices in
region-period js and the estimated prices in region-period kt (i.e.,
single imputation). This gives the following estimate of the log price
relative: (6)
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (12)
The alternative approach is to use predictions from the model in
both periods (i.e., double imputation).
(13) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
Our best estimate of the log price relative for a product n that
sells in region-period js but not in kt is given by [ln
[p.sup.n.sub.kt]([z.sup.n.sub.js]) - ln [p.sup.n.sub.js]]. Comparing the
single and double imputation log price relatives from the misspecified
model with this best estimate, we obtain the following errors.
Single imputation error:
(14) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
Double imputation error:
(15) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
These two equations show the errors from using double and single
imputation methods. The size and sign of the error depend upon a number
of factors, and one method does not always dominate the other. However,
there are good reasons, based upon Equations (14) and (15), to prefer
double imputation. When the omitted variables are reasonably stable
across region-periods (which is generally the case at least in temporal
comparisons), the terms [[summation].sup.C.sub.c = 1] ([[??].sub.js,c] -
[[??].sub.js,c])[z.sup.n.sub.js,c] and [[summation].sup.C.sub.c = 1]
([[??].sub.kt,c] - [[??].sub.kt,c])[z.sup.n.sub.js,c] in Equation (15),
representing the omitted variable bias in the included variables, will
tend to partially offset each other. Similarly, the terms
[[summation].sup.D.sub.d = 1] [[??].sub.kt,d][z.sup.n.sub.js,d] and
[[summation].sup.D.sub.d = 1] [[??].sub.js,d][z.sup.n.sub.js,d] in
Equation (15), reflecting the effects of the omitted variables on price,
are likely to take similar values over region-periods and also partially
offset each other. By contrast, in Equation (14), there are no pairs of
partially offsetting errors in levels. Instead, we are left with the
full level of each of the error components.
For example, consider the case of a house for which
[[??].sup.n.sub.js] ([z.sub.js]) > [p.sup.n.sub.js]. This means that
either the buyer got a bargain or the house performs poorly on its
omitted variables. Assuming that the latter is correct, it follows that
[[??].sup.n.sub.kt]([z.sub.js]) will overestimate the true price of a
house with characteristics vector [z.sub.js] in region-period kt. It
follows that the price relative
[[??].sup.n.sub.kt]([z.sub.js])/[p.sup.n.sub.js] will have an upward
bias. In contrast, the biases in [[??].sup.n.sub.js]([z.sub.js]) and
[[??].sup.n.sub.kt]([z.sub.js]) will partially offset each other in the
price relative [[??].sup.n.sub.kt]([z.sub.js])/[[??].sup.n.sub.js]([z.sub.js]), thus tending to generate a more accurate overall estimate.
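The house example can be worked through numerically. This is a stylized sketch, not the paper's Equations (14) and (15): it assumes the fitted models capture the included characteristics exactly and that the omitted characteristics shift the log price of this house by the same amount in both region-periods. All values are hypothetical.

```python
import math

# Log price predicted from the included characteristics only (hypothetical values).
fitted_log_js = math.log(500_000)
fitted_log_kt = math.log(550_000)

# Effect of the omitted characteristics on this house's log price. A negative
# value means the house performs poorly on them, so the imputed js price
# exceeds the actual js price. Assumed identical in js and kt.
omitted_effect = -0.10

actual_log_js = fitted_log_js + omitted_effect   # observed sale price in js
true_log_kt = fitted_log_kt + omitted_effect     # unobserved price in kt

true_rel = true_log_kt - actual_log_js           # target log price relative
single = fitted_log_kt - actual_log_js           # imputed kt over actual js
double = fitted_log_kt - fitted_log_js           # imputed kt over imputed js

single_error = single - true_rel   # equals -omitted_effect: upward bias
double_error = double - true_rel   # the two misestimates cancel

print(f"single imputation error: {single_error:+.4f}")
print(f"double imputation error: {double_error:+.4f}")
```

Under these assumptions the single imputation relative carries the full omitted-variable effect, while it drops out of the double imputation relative. If the omitted effect differed between js and kt, the cancellation would be only partial.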
The use of double imputation is particularly beneficial in cases
such as housing where there is likely to be a serious omitted variables
problem. This leads us to prefer Varieties 3 and 4 over Varieties 1 and
2. Some of the omitted variables in the housing context may be location
specific, thus causing spatial correlation in house prices. If the
hedonic model takes explicit account of spatial correlation, this should
reduce the extent of the omitted variables problem (see, e.g., Anselin
1988; Basu and Thibodeau 1998; Dubin 1988), thus weakening slightly the
case for using double imputation.
B. Zero versus Double Imputation for Repeat Observations
The difference between Varieties 3 and 4 is that Variety 4 only
uses double imputation when a model is not available in both periods.
Variety 3 by contrast always uses double imputation, even when there are
no missing prices. A similar argument applies for Varieties 3' and
4'. As far as we are aware, the possibility of always imputing for
repeat observations (i.e., Varieties 3 and 3') has not previously
been considered in the literature. For the case of computers, this would
be hard to justify since a particular model is the same irrespective of
when it is sold. Housing, however, is another matter. There is no
guarantee even for a repeat sale that we are comparing like with like.
This is because the characteristics of a house may change over time due
to renovations or the building of a new shopping center nearby, etc. The
only way to be sure that like is compared with like is to double impute
all houses (even repeat sales). Hence, for the case of housing, we are
led to prefer Variety 3.
C. Imputation of Expenditure Weights
Varieties 1'-4' impute expenditure shares, while
Varieties 1-4 do not. Given that expenditure shares are effectively used
as measures of the importance of a product, it seems natural to weight
products according to their actual rather than imputed importance. The
difference between this case and the "prices" case is that for
the latter, we are interested in a price relative, while for expenditure
shares what we require is an estimate of the level. We have noted that
we can misestimate levels when variables are omitted from the hedonic
equation indicating that it is best to use actual rather than imputed
expenditure shares. Hence, we prefer Variety 3 over Variety 3'.
Drawing all these results together, for the case of housing, we are
led to favor Variety 3. This means that double imputation of price
relatives is used even when both prices are available. Expenditure
shares, however, are not imputed.
D. Choosing between Fisher and Tornqvist
The choice of formula should be between Fisher and Tornqvist since
these formulas treat both region-periods symmetrically, are superlative,
and have desirable economic and axiomatic properties (Balk 1995; Diewert
1976). From an economic and axiomatic perspective, neither Fisher nor
Tornqvist clearly dominates the other. When used in conjunction with the
hedonic imputation method, we must also consider the functional form of
the hedonic regression.
We focus here on the case of housing. Our choice is between F3 and
T3. For the semilog case, GL3 can be reexpressed as follows:
$P^{GL3}_{js,kt} = \prod_{n=1}^{N_{js}} \left[ \hat{p}^{n}_{kt}(z^{n}_{js}) / \hat{p}^{n}_{js}(z^{n}_{js}) \right]^{w^{n}_{js}} = \exp\left[ \sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) \bar{z}_{c,js} \right]$,
where we have defined
$\bar{z}_{c,js} = \sum_{n=1}^{N_{js}} w^{n}_{js} z^{n}_{c,js}$.
Rearranging GP3 in a similar manner, we obtain that
$P^{GP3}_{js,kt} = \exp\left[ \sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) \bar{z}_{c,kt} \right]$.
Taking the geometric mean of GP3 and GL3, we obtain the following
expression for T3:
$P^{T3}_{js,kt} = \exp\left[ \frac{1}{2} \sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) (\bar{z}_{c,js} + \bar{z}_{c,kt}) \right]$.
The fact that T3 can be decomposed multiplicatively by its
characteristics has advantages when interpreting the results. It means
that the contribution of each characteristic to the overall price index
can be easily discerned. That is, we can decompose the price index as
follows:
[P.sup.T3.sub.js,kt] = [P.sup.1.sub.js,kt] x [P.sup.2.sub.js,kt] x
... x [P.sup.C.sub.js,kt],
where [P.sup.c.sub.js,kt] measures the multiplicative contribution
of characteristic c to the differences in house prices between
region-periods js and kt. Of particular interest is the ratio
[P.sup.c.sub.js,kt] : [P.sup.T3.sub.js,kt]. If this ratio exceeds 1, it
implies characteristic c is exerting upward pressure on the overall
price index, while when less than 1, c is exerting downward pressure on
the index. F3, by contrast, does not simplify in such an intuitively
appealing way.
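The multiplicative decomposition can be illustrated as follows. The sketch assumes the contribution of characteristic c is the exponential of half the coefficient change times the sum of the two weighted average characteristic levels, and it treats the intercept as a characteristic fixed at 1. The coefficients and averages are hypothetical.

```python
import numpy as np

def t3_contributions(beta_js, beta_kt, zbar_js, zbar_kt):
    """Per-characteristic factors whose product is the T3 index (semilog model)."""
    return np.exp(0.5 * (beta_kt - beta_js) * (zbar_js + zbar_kt))

# Order: intercept, land area, bathrooms (hypothetical semilog coefficients).
beta_js = np.array([12.0, 0.10, 0.15])
beta_kt = np.array([12.1, 0.11, 0.14])
# Expenditure-weighted average characteristics; the intercept "level" is 1.
zbar_js = np.array([1.0, 6.0, 2.0])
zbar_kt = np.array([1.0, 6.5, 2.2])

contrib = t3_contributions(beta_js, beta_kt, zbar_js, zbar_kt)
t3 = float(np.prod(contrib))

for name, c in zip(["intercept", "land area", "bathrooms"], contrib):
    print(f"{name:10s} factor = {c:.4f}")
print(f"T3 = {t3:.4f}")
```

In this made-up example the land area factor is above 1 and the bathrooms factor is below 1, so the two characteristics pull the index in opposite directions.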
As has been noted earlier, it is also possible to start by defining
price indexes in characteristics space. For example, Laspeyres and
Paasche characteristics price indexes can be defined as follows (Diewert
2001; Dulberger 1989):
Laspeyres 5: $P^{L5}_{js,kt} = \hat{p}_{kt}(\bar{z}_{js}) / \hat{p}_{js}(\bar{z}_{js})$.
Paasche 5: $P^{P5}_{js,kt} = \hat{p}_{kt}(\bar{z}_{kt}) / \hat{p}_{js}(\bar{z}_{kt})$.
L5 computes a hypothetical average model for region-period js and
then compares the imputed price of this product in region-periods js and
kt. P5 by contrast uses a hypothetical average model for region-period
kt as its point of reference. Taking the geometric mean of L5 and P5, we
obtain a Fisher characteristics price index, F5.
$P^{F5}_{js,kt} = \exp\left[ \frac{1}{2} \sum_{c=1}^{C} (\hat{\beta}_{c,kt} - \hat{\beta}_{c,js}) (\bar{z}_{c,js} + \bar{z}_{c,kt}) \right]$.
What is particularly interesting is that F5 on closer inspection
can be seen to be the same as T3!
T3, therefore, has attractive properties when the hedonic
regression model takes the semilog form. The fact that it can be defined
in either goods or characteristics space adds flexibility to the way the
results can be interpreted. For example, T3 can be interpreted either as
measuring the average of the ratios over the two region-periods of the
imputed prices of each house or as the ratio of the imputed price of the
average house. Which perspective is most useful may depend on the
context. In addition, the representation in characteristics space
implies that the price index is multiplicatively decomposable by
characteristic, which allows the role played by each characteristic to
be much more easily discerned. Hence, we favor T3 as our preferred
hedonic imputation price index, when used in conjunction with the
semilog model.
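The equivalence between the goods-space and characteristics-space representations can be checked numerically. The sketch below is a minimal illustration on randomly generated data. It assumes that T3 is the share-weighted geometric mean of doubly imputed price relatives, averaged over the two region-periods, and that the average house is the share-weighted mean of characteristics; the weights, coefficients, and sample sizes are all hypothetical.

```python
import math
import random

random.seed(0)
C = 4  # number of characteristics; the first one is the intercept


def make_houses(n):
    """Random characteristics vectors with a leading intercept term."""
    return [[1.0] + [random.uniform(0.0, 5.0) for _ in range(C - 1)] for _ in range(n)]


def make_shares(n):
    """Random positive weights normalized to sum to one (stand-ins for shares)."""
    raw = [random.uniform(0.5, 1.5) for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def log_price(beta, z):
    """Imputed log price of a house with characteristics z (semilog model)."""
    return sum(b * x for b, x in zip(beta, z))


beta_js = [12.6, 0.15, -0.04, 0.02]  # hypothetical coefficients for js
beta_kt = [12.8, 0.13, -0.03, 0.05]  # hypothetical coefficients for kt
z_js, z_kt = make_houses(50), make_houses(60)
w_js, w_kt = make_shares(50), make_shares(60)

# Goods space: weighted geometric mean of imputed price relatives, house by house.
log_t3 = 0.5 * sum(
    w * (log_price(beta_kt, z) - log_price(beta_js, z)) for w, z in zip(w_js, z_js)
)
log_t3 += 0.5 * sum(
    w * (log_price(beta_kt, z) - log_price(beta_js, z)) for w, z in zip(w_kt, z_kt)
)
t3_goods = math.exp(log_t3)

# Characteristics space: price the share-weighted average house of each region-period.
zbar_js = [sum(w * z[c] for w, z in zip(w_js, z_js)) for c in range(C)]
zbar_kt = [sum(w * z[c] for w, z in zip(w_kt, z_kt)) for c in range(C)]
f5_chars = math.exp(
    0.5 * sum((beta_kt[c] - beta_js[c]) * (zbar_js[c] + zbar_kt[c]) for c in range(C))
)

print(f"T3 (goods space)           = {t3_goods:.10f}")
print(f"F5 (characteristics space) = {f5_chars:.10f}")
```

The two numbers agree up to floating-point error because the log of the imputed price is linear in the characteristics, so averaging log price relatives across houses is the same as evaluating the log price relative at the average house.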
For the linear model, L3 can be reexpressed as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where we have defined
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
The term [x.sup.n.sub.c,js] represents the contribution of
characteristic c to the total price of model n. The term
[[bar.x].sub.c,js] denotes the average contribution to total price,
across all models, of characteristic c. In other words, L3 in this case
reduces to another variety of Laspeyres characteristics price index,
which we refer to as L6. (7)
\[ P^{L6}_{js,kt} = \sum_{c=1}^{C} \bar{x}_{c,js} \left( \frac{\hat{\beta}_{c,kt}}{\hat{\beta}_{c,js}} \right). \]
The corresponding Paasche price index is
\[ P^{P6}_{js,kt} = \left[ \sum_{c=1}^{C} \bar{x}_{c,kt} \left( \frac{\hat{\beta}_{c,kt}}{\hat{\beta}_{c,js}} \right)^{-1} \right]^{-1}. \]
For the linear model, it can similarly be shown that P3 is
equivalent to P6:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Taking the geometric mean of P6 and L6, we obtain a Fisher
characteristics price index, F6, which by construction is equal to F3
for the linear model.
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
Although F3 does not decompose multiplicatively by characteristic,
it is still our method of choice for the linear model due to its dual
representation in goods and characteristics space.
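The dual representation for the linear model can also be verified on artificial data. The sketch below assumes that the contribution of characteristic c to the price of house n is the coefficient times the characteristic divided by the imputed price, and that houses receive equal weight; both are simplifying assumptions made for illustration, and all coefficients are hypothetical.

```python
import math
import random

random.seed(1)
C = 3  # number of characteristics; the first is the intercept


def make_houses(n):
    """Random positive characteristics vectors with a leading intercept term."""
    return [[1.0] + [random.uniform(1.0, 5.0) for _ in range(C - 1)] for _ in range(n)]


def price(beta, z):
    """Imputed price of a house with characteristics z under the linear model."""
    return sum(b * x for b, x in zip(beta, z))


def average_contributions(beta, houses, weights):
    """Weighted average share of each characteristic in the imputed price."""
    xbar = [0.0] * C
    for w, z in zip(weights, houses):
        p = price(beta, z)
        for c in range(C):
            xbar[c] += w * beta[c] * z[c] / p
    return xbar


beta_js = [100.0, 40.0, 15.0]  # hypothetical coefficients, region-period js
beta_kt = [110.0, 46.0, 14.0]  # hypothetical coefficients, region-period kt
z_js, z_kt = make_houses(40), make_houses(55)
w_js = [1.0 / len(z_js)] * len(z_js)  # equal weights, assumed for simplicity
w_kt = [1.0 / len(z_kt)] * len(z_kt)

# Goods space: average the imputed price relatives house by house.
l3 = sum(w * price(beta_kt, z) / price(beta_js, z) for w, z in zip(w_js, z_js))
p3 = 1.0 / sum(w * price(beta_js, z) / price(beta_kt, z) for w, z in zip(w_kt, z_kt))

# Characteristics space: average the coefficient relatives, weighted by contributions.
xbar_js = average_contributions(beta_js, z_js, w_js)
xbar_kt = average_contributions(beta_kt, z_kt, w_kt)
l6 = sum(x * bk / bj for x, bk, bj in zip(xbar_js, beta_kt, beta_js))
p6 = 1.0 / sum(x * bj / bk for x, bk, bj in zip(xbar_kt, beta_kt, beta_js))

f3, f6 = math.sqrt(l3 * p3), math.sqrt(l6 * p6)
print(f"L3={l3:.6f} L6={l6:.6f}  P3={p3:.6f} P6={p6:.6f}  F3={f3:.6f} F6={f6:.6f}")
```

The average contributions sum to one within each region-period, so L6 and P6 are, respectively, weighted arithmetic and harmonic means of the coefficient relatives.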
VII. AN EMPIRICAL APPLICATION: THE CASE OF HOUSE PRICES IN SYDNEY
In this section, house price indexes are computed for three regions
in Sydney over a period of 3 years. This application should be viewed as
merely illustrative. It provides an opportunity to implement many of the
methods discussed in the paper and assess their empirical significance.
We are interested primarily in the spreads between different formulas
and varieties rather than in the price levels themselves.
The data set was obtained from Australian Property Monitors. It
covers about 20% of house sales in the years 2001, 2002, and 2003 for 128
postcodes, each of which contains at least 4,000 private residences.
(8) We aggregate the postcodes into three broad regions, which are
referred to as the central, northern, and western regions. The central
region includes inner Sydney, eastern suburbs, inner west, and Sydney
metro. The northern region includes the lower north shore, upper north
shore, Mosman/Cremorne, Manly/Warringah, and northwest. The western region
includes the western suburbs, Fairfield/Liverpool, Canterbury/Bankstown,
St. George, Cronulla/Sutherland, and Campbelltown. The
combination of three time periods and three regions yields a total of
nine region-periods. Data on each of the characteristics are available
in every region-period. Hence, we are not required to confront the
problem of unmatched characteristics.
A total of 70 characteristics were used to estimate the hedonic
equations of each region-period. These characteristics are listed in
Table 1. The normalization adopted is for a three bedroom house with two
bathrooms in the first quarter of the year. Most of the characteristics
are dummy variables. The exceptions are the geospatial characteristics,
which are measured in kilometers. We consider four transformations of
geospatial distances: linear, square, log, and square root. We prefer
the square-root transformation since it generated higher [R.sup.2]
coefficients and accords best with our intuition regarding the rate of
diminishing returns with respect to distance. With regard to the
dependent variable, we consider the linear and log transformations. The
large sample sizes generate relatively tight confidence intervals for
the Box-Cox test, which rejects both the log and the linear
specifications, though the logarithmic specification lies closest to
the estimated value. We prefer the logarithmic specification for this reason
and also because it is likely to reduce the amount of heteroscedasticity
in the data (Diewert 2003). The overall functional form, therefore, is a
variant on semilog, with square roots of geospatial distances replacing
actual distances. It follows that our preferred formula variety is T3.
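The comparison of the log and linear specifications for the dependent variable can be sketched with a Box-Cox profile likelihood. The example below is a minimal illustration on simulated data, not a reproduction of our test: it assumes NumPy is available, uses two invented regressors (bedrooms and the square root of a distance), and evaluates the concentrated log-likelihood on a grid of transformation parameters, where 0 corresponds to the log model and 1 to the linear model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Simulated regressors (hypothetical): bedrooms and a distance in kilometers.
bedrooms = rng.integers(1, 6, size=n).astype(float)
dist_km = rng.uniform(0.1, 25.0, size=n)
X = np.column_stack([np.ones(n), bedrooms, np.sqrt(dist_km)])

# Prices are generated from a semilog model, so lambda = 0 is the true value.
log_price = X @ np.array([12.5, 0.15, -0.10]) + rng.normal(0.0, 0.3, size=n)
price = np.exp(log_price)


def boxcox_profile_loglik(y, X, lam):
    """Concentrated log-likelihood of a Box-Cox regression for a given lambda."""
    y_t = np.log(y) if abs(lam) < 1e-12 else (y**lam - 1.0) / lam
    coef, *_ = np.linalg.lstsq(X, y_t, rcond=None)
    resid = y_t - X @ coef
    sigma2 = resid @ resid / len(y)
    # The second term is the log Jacobian of the transformation.
    return -0.5 * len(y) * np.log(sigma2) + (lam - 1.0) * np.log(y).sum()


grid = np.round(np.linspace(-0.5, 1.5, 21), 2)
loglik = np.array([boxcox_profile_loglik(price, X, lam) for lam in grid])
best_lambda = float(grid[np.argmax(loglik)])

ll_log = boxcox_profile_loglik(price, X, 0.0)     # semilog specification
ll_linear = boxcox_profile_loglik(price, X, 1.0)  # linear specification
print(f"best lambda on grid: {best_lambda:+.2f}")
print(f"log-likelihood: log = {ll_log:.1f}, linear = {ll_linear:.1f}")
```

With large samples, the likelihood is sharply peaked, which is why a test of this kind can reject both the log and the linear specifications even when one of them lies close to the estimated value.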
The estimated parameter coefficients are shown in Table 2. Here, we
use ordinary rather than weighted least squares. As each observation
represents a single house, the use of weighting is unlikely to generate
significantly different results. Note also that under the imputations
approach, our primary aim is to obtain an estimate of each price, and in
this case, weighting is not necessary. This is in contrast to the
time-dummy method where the average price change between periods is also
simultaneously being estimated so weighting becomes more important (see
also Footnote 9). Most of the coefficients seem intuitively
plausible, although for some characteristics (e.g., distance to nearest
school), it is not clear whether the coefficient should be positive or
negative. Most coefficients (those in bold) are significant at the 95%
level. The t statistics were calculated using the method of White (1980)
and hence are asymptotically robust to heteroscedasticity. The
coefficients on the physical characteristics are generally of similar
size (and of the same sign) across region-periods. For the geospatial
characteristics, however, the signs sometimes vary across regions,
although hardly ever across periods for a particular region. The greater
variability in the geospatial coefficients across region-periods is
probably attributable to multicollinearity.
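The heteroscedasticity-robust t statistics can be sketched as follows. This is a minimal illustration of the White (1980) covariance estimator (often labeled HC0) on simulated data, assuming NumPy is available; the regressors, coefficients, and error structure are invented, and no finite-sample correction is applied.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 3000

# Simulated design (hypothetical): intercept, bedrooms, square root of a distance.
bedrooms = rng.integers(1, 6, size=n).astype(float)
sqrt_dist = np.sqrt(rng.uniform(0.1, 25.0, size=n))
X = np.column_stack([np.ones(n), bedrooms, sqrt_dist])

# Heteroscedastic errors: the noise standard deviation grows with distance.
errors = rng.normal(0.0, 0.1 + 0.05 * sqrt_dist, size=n)
true_beta = np.array([12.5, 0.15, -0.10])
y = X @ true_beta + errors  # y plays the role of the log price


def ols_white(X, y):
    """OLS coefficients with White (HC0) standard errors and t statistics."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = (X * resid[:, None] ** 2).T @ X  # X' diag(e^2) X
    cov = xtx_inv @ meat @ xtx_inv          # sandwich estimator
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


beta_hat, robust_se, t_stats = ols_white(X, y)
for name, b, s, t in zip(["intercept", "bedrooms", "sqrt_dist"], beta_hat, robust_se, t_stats):
    flag = "significant at 95%" if abs(t) > 1.96 else "not significant"
    print(f"{name:10s} coef = {b:8.4f}  robust se = {s:.4f}  t = {t:8.2f}  ({flag})")
```

The estimator leaves the coefficient estimates unchanged and only replaces the usual covariance matrix, so significance is judged on standard errors that remain valid asymptotically when the error variance differs across houses.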
The [R.sup.2] coefficients and information on the number of
parameters and observations are shown in Table 3. The [R.sup.2]
coefficients range between .69 and .77.
The resulting price indexes are shown in Table 4 (Panel a). To save
on space, we only consider bilateral comparisons, with the central
region in 2001 as the base. Given that there are a total of nine
region-periods, we thus obtain nine bilateral comparisons for each
variety of each formula. We compare the variety and formula spread for
Varieties 1, 1', 3, and 3' of Fisher and Tornqvist. (9) We do
not consider Varieties 2, 2', 4, and 4' since in spatial
comparisons, there is no difference between Varieties 1 and 2 or between
Varieties 3 and 4. Excluding the base region-period, the average spread
between the four varieties of Fisher indexes (i.e., variety spread) is
0.8%, while for Tornqvist, it is 1.2%. (10) The average spread between
the same varieties of Fisher and Tornqvist (i.e., the formula spread) is
1.1%. (11) In this sense, the choice of variety seems to matter as much
as the choice between Fisher and Tornqvist. The same will not
necessarily be true for other formulas. For example, in our data set, in
a comparison of Laspeyres with Paasche, the results are much more
sensitive to the choice of formula than to the choice of variety.
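The variety and formula spreads can be recomputed directly from Panel a of Table 4 using the definitions in Footnotes 10 and 11. The sketch below is one way of doing so; the function and variable names are our own for this illustration, and the spread is expressed as the max/min ratio minus one, in percent.

```python
# Table 4, Panel a (semilog model). The base region-period c01 is omitted.
PERIODS = ["c02", "c03", "n01", "n02", "n03", "w01", "w02", "w03"]
PANEL_A = {
    "F1":  [1.1843, 1.3181, 0.9938, 1.2671, 1.3524, 0.7846, 0.9623, 1.0955],
    "F3":  [1.1856, 1.3204, 0.9954, 1.2719, 1.3552, 0.7872, 0.9701, 1.0989],
    "F1'": [1.1928, 1.3291, 0.9906, 1.2773, 1.3584, 0.7926, 0.9673, 1.1097],
    "F3'": [1.1856, 1.3216, 0.9950, 1.2727, 1.3554, 0.7902, 0.9708, 1.1037],
    "T1":  [1.1862, 1.3143, 0.9983, 1.2509, 1.3480, 0.7967, 0.9760, 1.1041],
    "T3":  [1.1869, 1.3210, 1.0014, 1.2633, 1.3543, 0.8072, 0.9928, 1.1203],
    "T1'": [1.1866, 1.3210, 1.0033, 1.2686, 1.3582, 0.8110, 0.9978, 1.1278],
    "T3'": [1.1870, 1.3218, 1.0028, 1.2676, 1.3572, 0.8119, 0.9991, 1.1285],
}
VARIETIES = ["1", "3", "1'", "3'"]


def variety_spread(table, formula):
    """Average over region-periods of max/min across varieties, in percent."""
    rows = [table[formula + v] for v in VARIETIES]
    spreads = [max(col) / min(col) - 1.0 for col in zip(*rows)]
    return 100.0 * sum(spreads) / len(spreads)


def formula_spread(table):
    """Average over varieties and region-periods of max(F, T)/min(F, T), in percent."""
    spreads = []
    for v in VARIETIES:
        for f, t in zip(table["F" + v], table["T" + v]):
            spreads.append(max(f, t) / min(f, t) - 1.0)
    return 100.0 * sum(spreads) / len(spreads)


fisher_variety = variety_spread(PANEL_A, "F")
tornqvist_variety = variety_spread(PANEL_A, "T")
fisher_tornqvist = formula_spread(PANEL_A)

print(f"Fisher variety spread:    {fisher_variety:.2f}%")
print(f"Tornqvist variety spread: {tornqvist_variety:.2f}%")
print(f"Formula spread:           {fisher_tornqvist:.2f}%")
```

Rounded to one decimal place, the three averages come out at 0.8%, 1.2%, and 1.1%. The same two functions can be applied to the linear-model figures by substituting the Panel b entries.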
Panel b of Table 4 replicates the results in Panel a of Table 4 for
the linear model. That is, the dependent variable in the hedonic model
underlying Panel b of Table 4 is [p.sup.n.sub.kt] as opposed to ln
[p.sup.n.sub.kt] in Panel a of Table 4. The explanatory variables on the
right-hand side in both hedonic models are identical. The results in
Panel b of Table 4 provide a robustness check on the relative magnitudes
of the variety and formula spreads. The average variety spread in
Panel b of Table 4 is 1.5% for Fisher and 1.3% for Tornqvist, while the
average Fisher-Tornqvist formula spread is 4.8%. In this case, the
choice of formula seems to be more important than the choice of variety.
(12) The fact that the average spreads are larger in Panel b of Table 4 is
a sign that the linear model does not fit the data as well as the
semilog model. A second disadvantage of the linear model is that, on
occasion, it generates negative price estimates. These observations were
removed when we computed the price indexes in Panel b of Table 4.
A comparison of Panels a and b of Table 4 also sheds light on the
sensitivity of the results to the choice of functional form for the
hedonic model. The average spread between paired observations in Panels
a and b of Table 4, excluding the base region-period, is 4.4%. This
functional form spread is about the same as the Fisher-Tornqvist formula
spread in Panel b of Table 4. It remains to be seen whether similar
patterns are observed for other data sets.
VIII. CONCLUSIONS
We have shown how the hedonic imputation method adds a new
dimension to the price index problem. In addition to choosing between
different formulas, we must also choose a functional form for the
hedonic model and decide when to replace actual with imputed prices. The
semilog model has a natural affinity with the Tornqvist price index, and
the linear model with Fisher. Hence, the choice of price index formula
is largely determined by the choice of hedonic model, in much the same
way as it is by the choice of functional form for the expenditure
function under the economic approach. Essentially, this means that one
must choose between hedonic model-price index pairs (e.g., semilog and
Tornqvist vs. linear and Fisher).
This, however, is not the end of the story. It is also necessary to
distinguish between eight different varieties of each price index
formula depending on which prices are imputed. Our preferred variety, at
least in the housing context, uses double imputation on the price
relatives (even for repeat sales) but does not impute expenditure
shares. We show algebraically that the error introduced into the index
is likely to be smaller when double imputation is used because the
biases tend to offset each other. Our application to house prices in
Sydney reveals a difference of between 1% and 2% in the results across
varieties of the same formula. Although this variety spread seems to be
smaller than the formula and functional form spreads, it is still large
enough to warrant attention.
ABBREVIATIONS
CPI: Consumer Price Index
COLI: Cost-of-Living Index
REFERENCES
Anselin, L. Spatial Econometrics: Methods and Models.
Dordrecht: Kluwer, 1988.
Bailey, M. J., R. F. Muth, and H. O. Nourse. "A Regression
Method for Real Estate Price Index Construction." Journal of the
American Statistical Association, 58, 1963, 933-42.
Balk, B. M. "Axiomatic Price Index Theory: A Survey."
International Statistical Review, 63, 1995, 69-93.
--. "On Curing the CPI's Substitution and New Goods
Bias." Paper Presented at the Fifth Meeting of the Ottawa Group,
Reykjavik, Iceland. 1999.
Basu, S., and T. G. Thibodeau. "Analysis of Spatial
Correlation in House Prices." Journal of Real Estate Finance and
Economics, 17, 1998, 61-85.
Berndt, E. R., Z. Griliches, and N. J. Rappaport. "Econometric Estimates of Price Indexes for Personal Computers in the
1990's." Journal of Econometrics, 68, 1995, 243-68.
Boskin, M. J., E. R. Dulberger, R. J. Gordon, Z. Griliches, and D.
Jorgenson. Toward a More Accurate Measure of the Cost of Living, Final
Report to the Senate Finance Committee from the Advisory Commission to
Study the Consumer Price Index (unpublished) 1996.
Cartwright, D. W. "Improved Deflation of Purchases of
Computers." Survey of Current Business, 66, 1986, 7-9.
Court, A. T. "Hedonic Price Indexes with Automotive
Examples," in The Dynamics of Automobile Demand. New York: The
General Motors Corporation, 1939, 99-117.
de Haan, J. "Direct and Indirect Time Dummy Approaches to
Hedonic Price Measurement." Journal of Economic and Social
Measurement, 29, 2004a, 427-43.
--. "Hedonic Regression: The Time Dummy Index as a Special
Case of the Imputation Tornqvist Index." Paper Presented at the 8th
Ottawa Group Meeting, Helsinki. 2004b.
Diewert, W. E. "Exact and Superlative Index Numbers."
Journal of Econometrics, 4, 1976, 115-45.
--. "Hedonic Regressions: A Consumer Theory Approach."
Discussion Paper 01-12, Department of Economics, University of British
Columbia, 2001.
--. "Hedonic Regressions: A Review of Some Unresolved Issues." Mimeo, University of British Columbia, 2003.
--. "Elementary Indices," Chapter 20 in Consumer
Price Index Manual: Theory and Practice, edited by T. P. Hill.
Geneva: International Labour Organization, 2004, 355-72.
Dubin, R. A. "Estimation of Regression Coefficients in the
Presence of Spatial Autocorrelated Error Terms." Review of
Economics and Statistics, 70, 1988, 466-74.
Dulberger, E. R. "The Application of a Hedonic Model to a
Quality-Adjusted Price Index for Computer Processors," in
Technology and Capital Formation, edited by D. W. Jorgenson and R.
Landau. Cambridge, MA: MIT Press, 1989, 37-75.
Eichhorn, W., and J. Voeller. Theory of the Price Index, Lecture
Notes in Economics and Mathematical Systems, Vol. 140. Berlin:
Springer-Verlag, 1976.
Feenstra, R. C. "New Product Varieties and the Measurement of
International Prices." American Economic Review, 84, 1994, 157-77.
Fisher, I. The Making of Index Numbers. Boston: Houghton-Mifflin,
1922.
Griliches, Z. "Hedonic Price Indexes for Automobiles: An
Econometric Analysis of Quality Change," in The Price Statistics of
the Federal Government, edited by G. Stigler (chairman). Washington, DC:
Government Printing Office, 1961, 173-96.
Hausman, J. "Valuation of New Goods under Perfect and
Imperfect Competition," in The Economics of New Goods, edited by T.
F. Bresnahan and R. J. Gordon.
Chicago: University of Chicago Press, 1997, 209-37.
--. "Cellular Telephone, New Products, and the CPI."
Journal of Business and Economic Statistics, 17, 1999, 188-94.
Hicks, J. R. "The Valuation of Social Income." Economica,
7, 1940, 105-24.
--. Value and Capital. 2nd ed. Oxford: Clarendon Press, 1946.
Hill, R. J. "Constructing Price Indexes Across Countries and
Time: The Case of the European Union." American Economic Review,
94, 2004, 1379-410.
Kennedy, P. E. "Estimation with Correctly Interpreted Dummy
Variables in Semilogarithmic Equations." American Economic Review,
71, 1981, 801.
Lancaster, K. J. "A New Approach to Consumer Theory."
Journal of Political Economy, 74, 1966, 132-57.
Nahm, D. "Incorporating the Effect of New and Disappearing
Products into the Cost of Living: An Alternative Index Number
Formula." Seoul Journal of Economics, 11, 1998, 261-82.
Pakes, A. "A Reconsideration of Hedonic Price Indexes with an
Application to PC's." American Economic Review, 93, 2003,
1578-96.
Rosen, S. "Hedonic Prices and Implicit Markets: Product
Differentiation in Pure Competition." Journal of Political Economy,
82, 1974, 34-55.
Schultze, C. S., and C. Mackie. "At What Price?
Conceptualizing and Measuring Cost-of-Living and Price Indexes," in
Panel on Conceptual, Measurement, and Other Statistical Issues in
Developing Cost-of-Living Indexes. Committee on National Statistics,
Division of Behavioral and Social Sciences and Education. Washington,
DC, National Academy Press, 2002.
Silver, M., and S. Heravi. "Quality Adjustment, Sample
Rotation and CPI Practice: An Experiment." Presented at the Sixth
Meeting of the International Working Group on Price Indices, Canberra,
Australia, April 2-6, 2001.
Triplett, J. E. "Hedonic Methods in Statistical Agency
Environments: An Intellectual Biopsy," in Fifty Years of Economic
Measurement, edited by E. R.
Berndt and J. E. Triplett. Chicago: Chicago University Press and
NBER, 1990, 207-33.
--. Handbook on Hedonic Indexes and Quality Adjustments in Price
Indexes: Special Application to Information Technology Products. STI Working Paper 2004/9. Paris: Directorate for Science, Technology and
Industry, Organisation for Economic Co-operation and Development, 2004.
van Garderen, K. J., and C. Shah. "Exact Interpretation of
Dummy Variables in Semilogarithmic Equations." Econometrics
Journal, 5, 2002, 149-59.
Waugh, F. V. "Quality Factors Influencing Vegetable
Prices." Journal of Farm Economics, 10, 1928, 185-96.
White, H. "A Heteroscedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroscedasticity." Econometrica,
48, 1980, 817-38.
(1.) Nevertheless, in certain circumstances, one may prefer to use
imputed prices over actual prices when constructing an imputations
index (see Section VI).
(2.) It has been shown by de Haan (2004b) that the time-dummy
method can be viewed as a restricted form of the imputations method
where implicit characteristics prices are held fixed and only some
prices imputed.
(3.) When the semilog model is used, moving from
$\ln \hat{p}^{n}_{kt}(z^{n}_{js})$ to $\hat{p}^{n}_{kt}(z^{n}_{js})$
is not straightforward. In particular, simply exponentiating the
predicted log price will generate a biased estimate. This problem is
addressed in Kennedy (1981) and van Garderen and Shah (2002). We do not
discuss this issue further as it is not of direct relevance to the problem
addressed in this paper.
(4.) By construction, a Laspeyres index only considers the models
sold in region-period js.
(5.) See, for example, the Boskin Commission (1996) and the
Schultze Report to the National Research Council (2002).
(6.) In constructing the index, we take the exponent of the log
price relative. The error in the price relative will be a monotonic (i.e., exponential) transformation of the error in the log price
relatives, so we can consider the bias in terms of either (also see
Footnote 3 for further details).
(7.) An index similar to L6 has been used since 1968 by the Census
Bureau in the United States to compute a price index for new
single-family homes (Triplett 1990).
(8.) The limited coverage of house sales may bias the price-level
results. However, it is rather less likely to impact on the observed
formula and variety spreads and hence is not a major concern for us
here.
(9.) Housing is an unusual case for price index construction in two
respects. First, in almost all cases [q.sup.n.sub.js] = 1. That is, it is
rare to find two houses sold in a particular region-period js that have
identical characteristics. In this case, Paasche and Laspeyres end up
resembling elementary indexes of the Dutot type (see Diewert 2004, for a
discussion of elementary indexes and their properties). Second, a case
can be made for giving equal weight to all house sales, rather than
weighting by expenditure shares. The usual justification for weighting
price indexes by expenditure shares is so that the basket of goods and
services will be representative for the average household. Housing is
unusual, however, in that most households only buy one house, and the
distribution of house prices is significantly skewed to the right. Most
households therefore are not buying from the right tail. Hence, it may
be preferable to give all houses equal weight in the index (i.e., to set
[w.sub.js] = 1/[N.sub.js]) since this will provide an index that is more
representative for the median household. However, we do not pursue this
path here.
(10.) For any particular region-period, the Fisher spread,
for example, is calculated as follows: max(F1, F3, F1',
F3')/min(F1, F3, F1', F3'). The average spread is then
obtained by averaging this result across all region-periods (excluding
the base region-period).
(11.) For example, the formula spread for Variety 1 for a
particular region-period is calculated as follows: max(F1, T1)/min(F1,
T1). These formula spreads are then averaged across all varieties and
region-periods (excluding the base region-period).
(12.) Again, this finding is even more clear-cut in a comparison of
Laspeyres with Paasche.
ROBERT J. HILL and DANIEL MELSER *
* Funding from the Australian Research Council Discovery Grant
DP0667209 and Linkage Grant LP0347618 in collaboration with the
Australian Bureau of Statistics is gratefully acknowledged. We also
thank Australian Property Monitors for supplying the data.
Hill: Professor, School of Economics, University of New South
Wales, Sydney, NSW 2052, Australia. Phone 61-2-93853076, Fax
61-2-93136337, E-mail r.hill@unsw.edu.au
Melser: Lecturer, Department of Economics, Monash University,
Caulfield Campus, Melbourne, Vic 3145, Australia. Phone 61-3-99052478,
Fax 61-3-99055476, E-mail danielmelser@buseco.monash.edu.au
TABLE 1 Characteristics Used in the Hedonic Model
Physical Characteristics
C1 INTERCEPT
C2 UNIT
C3 TERRACE
C4 SEMI
C5 COTTAGE
C6 TOWNHOUSE
C7 DUPLEX
C8 VILLA
C9 1 BEDROOM
C10 2 BEDROOMS
C11 4 BEDROOMS
C12 5 BEDROOMS
C13 6+ BEDROOMS
C14 1 BATHROOM
C15 3 BATHROOMS
C16 4+ BATHROOMS
C17 AREA
C18 AREA SQUARED
C19 EXTRA ROOM
C20 AIR CONDITION
C21 ALARM SYSTEM
C22 BRICK CONSTR.
C23 COURTYARD
C24 DECK BALCONY
C25 ENSUITE
C26 FREESTANDING
C27 FIRE PLACE
C28 GARDEN
C29 GYM
C30 SECURE PARKING
C31 POLISHED FLOOR
C32 SWIMMING POOL
C33 REAR LANE ACC.
C34 SEPARATE DINING
C35 STRATA
C36 TENNIS COURT
C37 TIMBER FLOOR
C38 TOP FLOOR
C39 WALK IN WARD
C40 WEATHERBOARD
C41 CITY VIEWS
C42 HARBOR VIEWS
C43 WATERFRONT
C44 QUARTER 2
C45 QUARTER 3
C46 QUARTER 4
Geospatial Characteristics
Square Root of Distance To
C47 COMBINED SCHOOL
C48 GOLF CLUB
C49 GENERAL HOSPITAL
C50 PUB. HOSPITAL WITH EMERG.
C51 PRIV. HOSPITAL WITH EMERG.
C52 HEALTH SERVICE (GENERAL)
C53 HIGH SCHOOL
C54 NATIONAL PARK
C55 PARK OR RESERVE
C56 PRE-SCHOOL
C57 PRIMARY SCHOOL
C58 RACECOURSE
C59 RAILWAY STATION
C60 RIFLE RANGE
C61 SHOPPING CENTER--REGIONAL
C62 SHOPPING CENTER--SUB-REG.
C63 SHOPPING CENTER--LOCAL
C64 SWIMMING CENTER
C65 SPECIAL SCHOOL
C66 SYDNEY AIRPORT
C67 TECH./BUS/TRADE COLLEGE
C68 UNIVERSITY
C69 BANKSTOWN AIRPORT
C70 BEACH
TABLE 2 Coefficient Estimates for Each Region-Period for Semilog Model
c01 c02 c03 n01 n02
C1 12.684 12.731 13.016 14.315 15.265
C2 -0.445 -0.456 -0.467 -0.489 -0.510
C3 -0.150 -0.127 -0.116 -0.048 -0.125
C4 -0.128 -0.097 -0.096 -0.161 -0.201
C5 -0.089 -0.017 -0.047 -0.041 -0.062
C6 -0.326 -0.349 -0.311 -0.291 -0.334
C7 -0.193 -0.157 0.033 -0.071 -0.157
C8 -0.071 -0.274 -0.152 -0.228 -0.214
C9 -0.500 -0.521 -0.556 -0.540 -0.582
C10 -0.194 -0.207 -0.221 -0.186 -0.206
C11 0.149 0.122 0.099 0.135 0.132
C12 0.189 0.208 0.198 0.236 0.170
C13 0.264 0.264 0.096 0.186 0.290
C14 -0.106 -0.096 -0.104 -0.024 -0.020
C15 0.082 0.147 0.167 0.075 0.097
C16 0.269 0.285 0.298 0.206 0.281
C17 0.000 0.000 0.000 0.000 0.000
C18 0.000 0.000 0.000 0.000 0.000
C19 0.048 0.063 0.064 0.059 0.064
C20 0.139 0.126 0.096 0.093 0.044
C21 -0.156 0.026 0.058 0.134 0.076
C22 0.015 -0.042 -0.014 -0.143 0.013
C23 -0.018 -0.028 -0.014 -0.006 0.021
C24 0.000 -0.004 -0.012 -0.016 -0.019
C25 0.067 0.091 0.077 0.043 0.065
C26 -0.025 -0.037 0.021 -0.038 0.019
C27 0.046 0.035 0.027 0.017 0.040
C28 0.027 0.012 0.034 0.030 0.037
C29 0.131 0.020 0.106 0.099 0.137
C30 0.116 0.101 0.103 0.028 0.025
C31 0.048 -0.003 -0.009 -0.053 -0.011
C32 0.112 0.113 0.075 0.092 0.065
C33 0.021 0.019 0.014 0.744 -0.713
C34 -0.042 -0.017 -0.025 -0.052 -0.032
C35 -0.049 -0.049 -0.103 0.003 0.006
C36 0.187 0.083 0.130 0.287 0.273
C37 -0.011 -0.006 -0.007 -0.008 0.018
C38 0.397 0.060 0.083 0.265 -0.016
C39 0.075 0.009 -0.006 0.031 0.061
C40 -0.113 -0.046 -0.079 0.149 0.107
C41 0.018 -0.080 -0.035 0.109 -0.028
C42 0.128 0.150 0.149 0.177 0.183
C43 0.390 0.360 0.418 0.381 0.323
C44 0.050 0.073 0.030 0.050 0.058
C45 0.108 0.088 0.097 0.108 0.103
C46 0.117 0.102 0.083 0.137 0.125
C47 0.219 0.236 0.267 0.450 0.311
C48 -0.204 -0.181 -0.252 -0.332 -0.435
C49 -0.118 -0.166 -0.171 0.058 0.085
C50 0.039 0.036 0.070 -0.009 0.022
C51 -0.029 -0.021 0.001 -0.090 -0.110
C52 -0.157 -0.108 -0.098 -0.024 0.041
C53 -0.165 -0.157 -0.235 -0.306 -0.341
C54 0.122 0.147 0.176 0.008 0.015
C55 0.095 0.119 0.064 0.072 0.070
C56 -0.063 -0.079 -0.095 0.029 0.055
C57 -0.076 -0.023 -0.043 0.032 0.029
C58 -0.010 0.032 -0.009 0.089 0.039
C59 0.062 0.087 0.098 0.025 0.015
C60 -0.049 -0.074 -0.062 -0.310 -0.306
C61 0.121 0.098 0.063 -0.162 -0.154
C62 0.123 0.108 0.157 0.086 0.102
C63 -0.142 -0.134 -0.184 0.034 0.027
C64 0.109 0.063 0.134 0.038 0.013
C65 0.062 0.045 0.060 0.065 0.072
C66 -0.011 -0.008 0.030 -0.095 -0.087
C67 0.117 0.134 0.139 -0.051 -0.066
C68 0.317 0.276 0.264 -0.243 -0.197
C69 0.158 -0.128 -0.152 0.094 0.068
C70 -0.040 0.000 -0.036 0.246 0.265
n03 w01 w02 w03
C1 15.821 12.761 13.089 13.198
C2 -0.606 -0.437 -0.489 -0.522
C3 -0.121 -0.107 -0.098 -0.046
C4 -0.211 -0.112 -0.170 -0.149
C5 -0.066 -0.022 -0.038 0.011
C6 -0.424 -0.224 -0.228 -0.314
C7 -0.191 -0.065 -0.090 -0.106
C8 -0.349 -0.256 -0.304 -0.297
C9 -0.558 -0.364 -0.375 -0.310
C10 -0.157 -0.122 -0.115 -0.084
C11 0.102 0.136 0.133 0.100
C12 0.186 0.298 0.221 0.186
C13 0.235 0.279 0.272 0.378
C14 -0.046 -0.045 -0.071 -0.050
C15 0.081 0.094 0.088 0.089
C16 0.189 0.266 0.082 0.274
C17 0.000 0.000 0.000 0.000
C18 0.000 0.000 0.000 0.000
C19 0.027 0.036 0.050 0.039
C20 0.060 0.072 0.068 0.043
C21 0.077 0.142 0.020 0.029
C22 -0.014 -0.050 0.018 0.052
C23 0.010 0.037 0.023 -0.016
C24 -0.028 -0.007 0.023 0.016
C25 0.057 0.096 0.057 0.094
C26 -0.082 -0.058 -0.075 -0.084
C27 0.034 0.041 0.030 0.031
C28 0.020 0.016 0.013 0.024
C29 0.175 0.155 -0.056 -0.052
C30 0.038 0.035 0.021 0.019
C31 -0.009 0.015 0.007 0.007
C32 0.059 0.078 0.082 0.082
C33 -0.512 0.075 -0.046 0.008
C34 0.009 -0.025 -0.027 -0.019
C35 -0.124 0.050 -0.048 -0.058
C36 0.217 0.396 0.286 0.268
C37 -0.027 0.026 0.001 0.004
C38 0.047 0.501 0.073 0.007
C39 0.084 0.010 0.047 0.080
C40 0.065 -0.086 0.086 -0.015
C41 -0.038 0.143 0.055 0.007
C42 0.168 0.291 0.249 0.230
C43 0.383 0.538 0.476 0.461
C44 0.003 0.051 0.071 0.044
C45 0.065 0.104 0.130 0.094
C46 0.071 0.145 0.159 0.117
C47 0.185 0.031 -0.008 -0.014
C48 -0.428 -0.119 -0.160 -0.182
C49 0.083 0.060 0.064 0.055
C50 0.000 -0.018 -0.008 -0.003
C51 -0.069 -0.039 -0.034 -0.026
C52 0.001 0.028 0.033 0.016
C53 -0.280 -0.035 -0.029 -0.038
C54 0.035 -0.078 -0.062 -0.037
C55 0.035 0.002 0.031 0.039
C56 0.016 0.024 0.062 0.060
C57 0.063 0.054 0.054 0.085
C58 0.057 0.021 0.027 0.020
C59 0.044 0.034 0.058 0.052
C60 -0.291 0.035 0.055 0.043
C61 -0.194 0.169 0.137 0.157
C62 0.053 0.111 0.125 0.117
C63 0.035 -0.124 -0.124 -0.097
C64 0.018 0.108 0.076 0.085
C65 0.023 -0.010 -0.006 0.027
C66 -0.109 0.125 0.084 0.092
C67 -0.055 -0.009 -0.064 -0.055
C68 -0.135 -0.058 -0.021 0.001
C69 0.117 0.122 0.146 0.125
C70 0.250 -0.188 -0.214 -0.208
Notes: Numbers in bold are significant at the 95% level. c01, central
region in 2001; c02, central region in 2002; c03, central region in
2003; n01, northern region in 2001; n02, northern region in 2002; n03,
northern region in 2003; w01, western region in 2001; w02, western
region in 2002; w03, western region in 2003.
TABLE 3 Regression Statistics (semilog model)
R SQ. R SQ. ADJ. NO. PAR. NO. OBSERVATIONS F STATISTIC
c01 0.7423 0.7388 70 5,169 212.8
c02 0.7739 0.7707 70 4,979 243.5
c03 0.7685 0.7646 70 4,101 194.0
n01 0.7441 0.7408 70 5,440 226.3
n02 0.7442 0.7397 70 3,983 165.0
n03 0.7186 0.7128 70 3,432 124.4
w01 0.6997 0.6957 70 5,311 176.9
w02 0.7383 0.7339 70 4,221 169.7
w03 0.7418 0.7378 70 4,518 185.2
Notes: c01, central region in 2001; c02, central region in 2002; c03,
central region in 2003; n01, northern region in 2001; n02, northern
region in 2002; n03, northern region in 2003; w01, western region in
2001; w02, western region in 2002; w03, western region in 2003; R SQ.,
R squared; R SQ. ADJ., R squared adjusted; NO. PAR., number of
parameters.
TABLE 4
Temporal and Spatial Price Indexes for Regions in Sydney
[P.sub.c01,c01] [P.sub.c01,c02] [P.sub.c01,c03]
(a) Semilog model
F1 1.0000 1.1843 1.3181
F3 1.0000 1.1856 1.3204
F1' 1.0000 1.1928 1.3291
F3' 1.0000 1.1856 1.3216
T1 1.0000 1.1862 1.3143
T3 1.0000 1.1869 1.3210
T1' 1.0000 1.1866 1.3210
T3' 1.0000 1.1870 1.3218
(b) Linear model
F1 1.0000 1.1930 1.2595
F3 1.0000 1.1795 1.2466
F1' 1.0000 1.1974 1.2707
F3' 1.0000 1.1934 1.2606
T1 1.0000 1.1877 1.2510
T3 1.0000 1.1839 1.2562
T1' 1.0000 1.1895 1.2579
T3' 1.0000 1.1910 1.2538
[P.sub.c01,n01] [P.sub.c01,n02] [P.sub.c01,n03]
(a) Semilog model
F1 0.9938 1.2671 1.3524
F3 0.9954 1.2719 1.3552
F1' 0.9906 1.2773 1.3584
F3' 0.9950 1.2727 1.3554
T1 0.9983 1.2509 1.3480
T3 1.0014 1.2633 1.3543
T1' 1.0033 1.2686 1.3582
T3' 1.0028 1.2676 1.3572
(b) Linear model
F1 1.0162 1.2320 1.3846
F3 0.9838 1.2328 1.3885
F1' 1.0150 1.2393 1.3897
F3' 1.0187 1.2350 1.3887
T1 1.0228 1.2270 1.3953
T3 1.0272 1.2380 1.4053
T1' 1.0253 1.2394 1.4042
T3' 1.0241 1.2319 1.3979
[P.sub.c01,w01] [P.sub.c01,w02] [P.sub.c01,w03]
(a) Semilog model
F1 0.7846 0.9623 1.0955
F3 0.7872 0.9701 1.0989
F1' 0.7926 0.9673 1.1097
F3' 0.7902 0.9708 1.1037
T1 0.7967 0.9760 1.1041
T3 0.8072 0.9928 1.1203
T1' 0.8110 0.9978 1.1278
T3' 0.8119 0.9991 1.1285
(b) Linear model
F1 0.7057 0.8557 0.9762
F3 0.7164 0.8670 0.9806
F1' 0.7108 0.8660 0.9864
F3' 0.7050 0.8546 0.9760
T1 0.7883 0.9467 1.0707
T3 0.8016 0.9637 1.0887
T1' 0.8055 0.9702 1.0957
T3' 0.7901 0.9496 1.0757
Notes: c01, central region in 2001; c02, central region in 2002; c03,
central region in 2003; n01, northern region in 2001; n02, northern
region in 2002; n03, northern region in 2003; w01, western region in
2001; w02, western region in 2002; w03, western region in 2003.