Article information

Title: A hazard function approach to modeling consumer search.
Author: Choong, Peggy
Journal: Academy of Marketing Studies Journal
Print ISSN: 1095-6298
Year of publication: 2003
Issue: July
Language: English
Publisher: The DreamCatchers Group, LLC
Abstract: This study recognizes the stochastic element in the consumer search process and develops a stochastic model of search termination that incorporates the effects of time elapsed since commencing search, individual and product characteristics and unobserved heterogeneity.
Keywords: Heterogeneous catalysis; Stochastic analysis

A hazard function approach to modeling consumer search.

Choong, Peggy

ABSTRACT

This study recognizes the stochastic element in the consumer search process and develops a stochastic model of search termination that incorporates the effects of time elapsed since commencing search, individual and product characteristics and unobserved heterogeneity.

Results indicate substantial duration dependence. Perceived benefits of search, size of the evoked set and the quality of past experience are found to be important determinants of the hazard function. This study highlights the importance of accounting for unobserved heterogeneity and the sensitivity of the parameter estimates to the specification of its distribution.

INTRODUCTION

What causes a customer to terminate the search process and purchase? This is a pervasive question faced by numerous marketing managers. Many studies have documented the effects of consumer characteristics on the extent of information search for durable products as well as patterns of search across information sources (Punj & Staelin, 1983; Kiel & Layton, 1981; Furse, Punj & Stewart, 1984; Beatty & Smith, 1987; Srinivasan & Ratchford, 1991; Putsis & Srinivasan, 1994). However, to date, there has been little significant work documenting the termination of consumer search and final purchase. In addition, contemporary marketing literature on the extent of search does not explicitly model the fact that the duration of search is a stochastic process. Thus, a consumer may terminate search early after chancing on a good deal early in the search process, or may be unlucky and not obtain an acceptable offer until late in the search.

A complete model of search behavior would account for this stochastic element. Another shortcoming of the literature on search is that it fails to model the possible effects of unobserved heterogeneity. The most common method of accounting for observed heterogeneity is to include consumer, retailer and product characteristics in the model, and to estimate how the measured extent of search varies with these variables. However, given the difficulty of determining and measuring these characteristics, there are likely to be many variables that affect search that are unmeasured. A complete model of search behavior would explicitly model the effect of this unobserved heterogeneity, and failure to do so may contaminate the estimates of the included variables (Heckman & Singer, 1984).

This study focuses on the rate of search termination and its determinants, using a stochastic model of search within the framework of a conditional hazard function. The probability of search termination is modeled as a function of the duration of search, measured consumer characteristics and unobserved factors.

HAZARD FUNCTION

Hazard function models have been used extensively in the economics and statistics literature, especially in research on job search, employment and unemployment (Jones, 1988; Lancaster, 1985; Flinn & Heckman, 1982). They have also been used in marketing to study inter-purchase timing (Jain & Vilcassim, 1991) but never in the area of search. Since the hazard function can be thought of as the rate at which an event occurs, it is well suited to modeling search termination.

For the purpose of this study, let the random variable T be the time a consumer spends searching for external information before purchasing an automobile. Duration T takes values in the interval $[0, \infty)$. The hazard function $\lambda(t)$ can, therefore, be defined as:

(1.) $\lambda(t) = \lim_{\Delta t \to 0} \frac{\Pr(t < T < t + \Delta t \mid T > t)}{\Delta t}$

Equation 1 indicates that the hazard function simply specifies the instantaneous rate of search terminating at time t, given that the consumer is still searching at t. In other words, conditional on the consumer not having purchased, the hazard function measures the likelihood of search ending at time t.
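The limit in Equation 1 can be checked numerically. The following is a minimal Python sketch, assuming a Weibull duration purely for illustration; the shape and scale values and all function names are hypothetical and are not taken from the study. It compares the finite-difference version of Equation 1 with the closed-form ratio of density to survival function.

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = Pr(T > t) for a Weibull duration (illustrative assumption)."""
    return math.exp(-((t / scale) ** shape))

def hazard_closed_form(t, shape, scale):
    """Weibull hazard f(t) / S(t) = (shape / scale) * (t / scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def hazard_from_definition(t, shape, scale, dt=1e-6):
    """Approximate Equation 1: Pr(t < T < t + dt | T > t) / dt for a small dt."""
    s_t = weibull_survival(t, shape, scale)
    s_next = weibull_survival(t + dt, shape, scale)
    return (s_t - s_next) / (s_t * dt)

# Hypothetical parameter values, chosen only to exercise the functions.
approx = hazard_from_definition(2.0, shape=1.5, scale=3.0)
exact = hazard_closed_form(2.0, shape=1.5, scale=3.0)
print(approx, exact)
```

The two values agree to several decimal places, which is the sense in which the hazard is an instantaneous conditional rate rather than a probability.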

The hazard function is a convenient method of organizing, testing and interpreting data in cases where conditional probabilities are theoretically or intuitively appealing. The basic requirements of the hazard function are non-negativity and finiteness. This makes it less stringent than the requirements of probability distributions, which are required not only to be non-negative but also to sum or integrate to unity.

Since there are likely to be individual differences in the rate of terminating search, how individual characteristics enter into the hazard model needs to be specified. Also, since identifying all relevant characteristics is difficult, if not impossible, unobserved or unmeasurable heterogeneity needs to be taken into account. Accordingly, the conditional hazard, conditioned on a vector of consumer characteristics X and on unmeasured heterogeneity $\theta$, is specified as (Flinn & Heckman, 1982; Heckman & Singer, 1984):

(2.) $\lambda(t \mid X, \theta) = \lambda_0(t)\, \phi(X, \beta)\, \psi(\theta)$,

where $\lambda_0(t)$ is the baseline hazard corresponding to $\phi = \psi = 1$; $\beta$ is a vector of parameters corresponding to the consumer characteristics X; $\theta$ is unobserved heterogeneity. The observed and unobserved heterogeneity act multiplicatively on the hazard function and in effect serve to shift the hazard from its baseline.

For the specification of the measure of covariates, the commonly used form is adopted:

(3.) $\phi(X, \beta) = \exp(X\beta)$

Since the expression exp(.) is always positive, the hazard function is automatically non-negative and finite for all X and $\beta$. Following Heckman and Singer (1984), the unobserved heterogeneity is specified as follows:

(4.) $\psi(\theta) = \exp(c\theta)$

where $\theta$ is the individual heterogeneity that remains constant within each spell and c is the associated coefficient.
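The multiplicative structure of Equations 2 to 4 can be sketched in a few lines of Python. This is an illustration only: the function name, the constant baseline and every numeric value are hypothetical, and the baseline is passed in as a function because its form is specified separately.

```python
import math

def conditional_hazard(t, x, beta, theta, c, baseline):
    """Equation 2 with Equations 3 and 4 substituted:
    baseline(t) * exp(x . beta) * exp(c * theta)."""
    x_beta = sum(xi * bi for xi, bi in zip(x, beta))
    return baseline(t) * math.exp(x_beta) * math.exp(c * theta)

def constant_baseline(t):
    """Hypothetical constant baseline hazard, used only for this example."""
    return 0.1

# Hypothetical covariates and parameters.
x, beta, c = [1.0, 0.5], [0.2, -0.4], 2.0
low = conditional_hazard(3.0, x, beta, theta=0.0, c=c, baseline=constant_baseline)
high = conditional_hazard(3.0, x, beta, theta=1.0, c=c, baseline=constant_baseline)
print(low, high, high / low)
```

Because the heterogeneity term enters multiplicatively, moving $\theta$ from 0 to 1 scales the hazard by exp(c) at every t, whatever the covariate values are.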

Finally, the baseline hazard is parameterized in as general a form as possible. To this end, the Box-Cox formulation is adopted because the most commonly used probability distributions are nested within this general form (Cox, 1972).

(5.) $\lambda_0(t) = \exp\left[\gamma_0 + \sum_{j=1}^{J} \gamma_j \frac{t^{\epsilon_j} - 1}{\epsilon_j}\right]$

The baseline hazard captures the time elapsed since embarking on search, with t denoting the duration of search so far. Here again, the expression exp(.) ensures the non-negativity of the baseline hazard and, hence, the hazard. Two commonly used distributions in studies on duration, namely the Weibull and Erlang-2, are used in this study. These are nested in the Box-Cox formulation, and statistical tests can, therefore, be performed to test their suitability. Table 1 lists the restrictions on the parameters in Equation (5.) and the resulting probability distributions.
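How the special cases nest in the Box-Cox form can be seen in a short Python sketch. It assumes only Equation (5.) and the standard limit of $(t^{\epsilon} - 1)/\epsilon$ as $\epsilon \to 0$, which is ln t; the function names and parameter values are hypothetical.

```python
import math

def box_cox_term(t, eps, tol=1e-8):
    """(t**eps - 1) / eps, using the limiting value ln(t) when eps is (near) zero."""
    if abs(eps) < tol:
        return math.log(t)
    return (t ** eps - 1.0) / eps

def baseline_hazard(t, gamma0, gammas, epsilons):
    """Equation (5.): exp(gamma0 + sum_j gamma_j * box_cox_term(t, eps_j))."""
    total = gamma0 + sum(g * box_cox_term(t, e) for g, e in zip(gammas, epsilons))
    return math.exp(total)

# Weibull case: a single term with eps_1 = 0 gives exp(gamma0 + gamma1 * ln t).
weibull = baseline_hazard(2.0, gamma0=-1.0, gammas=[0.5], epsilons=[0.0])
# Exponential case: all gamma_j are zero, so the hazard is the constant exp(gamma0).
exponential = baseline_hazard(2.0, gamma0=-1.0, gammas=[], epsilons=[])
print(weibull, exponential)
```

Treating the zero exponent as a limit rather than a division keeps the Weibull case well defined and continuous with nearby values of the exponent.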

ESTIMATION

Defining $Y = (\gamma_0, \gamma_1, \beta, c)$, the method of maximum likelihood is used to estimate Y. The likelihood function of Y for individual i, conditional on $\theta$, is therefore given by:

(6.) $L_i(Y \mid \theta) = f(t_i \mid \theta)$

Substituting for $f(t \mid \theta)$ in the above equation and assuming the covariates remain constant during the search, we obtain the following conditional likelihood function:

(7.) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By integrating over the distribution of $\theta$, the nuisance term $\theta$ is eliminated. Therefore, the unconditional likelihood function $L_i(Y)$ is given by:

(8.) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Parameter estimates are obtained by maximizing the likelihood function across all N individuals in the sample:

(9.) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
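Equations (7.) to (9.) are marked as not reproducible in this copy. The block below is a hedged reconstruction, not the author's original typesetting. It assumes only the standard identity that a duration density equals the hazard times the survival function, together with Equations (2.) to (6.) and the surrounding text; $G(\theta)$ denotes the distribution of the unobserved heterogeneity, and s is a dummy variable of integration introduced here.

```latex
\begin{align}
L_i(Y \mid \theta) &= \lambda(t_i \mid X_i, \theta)\,
    \exp\!\left[-\int_0^{t_i} \lambda(s \mid X_i, \theta)\, ds\right]
    && \text{(cf. Equation 7)} \\
L_i(Y) &= \int L_i(Y \mid \theta)\, dG(\theta)
    && \text{(cf. Equation 8)} \\
L(Y) &= \prod_{i=1}^{N} L_i(Y)
    && \text{(cf. Equation 9)}
\end{align}
```

In practice the product is usually maximized in logarithmic form, as a sum of $\ln L_i(Y)$ over the N individuals.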

The specification of $G(\theta)$ is approached in two ways. First, following past research, a standard normal distribution is adopted (Massy, Montgomery & Morrison, 1970). Second, a non-parametric approach is used in which the structural parameters and the distribution of the unobserved covariates of the model are jointly estimated. Results for the two approaches will be compared.

Estimates of the parameters are obtained through an iterative maximum likelihood procedure using the CTM (Continuous Time Model) program developed by George Yates at the National Opinion Research Center (NORC), Chicago, Illinois. Non-parametric estimation of the unobserved heterogeneity requires the joint estimation of Y and the cluster points $\theta_1, \ldots, \theta_s$. Estimates obtained through the iterative likelihood procedure are consistent (Amemiya, 1985). To ensure that global optimality is obtained, the iterative program is applied to different sets of starting values. When the final estimates are nearly identical, global optimality is concluded to have been achieved (Yi, Honore & Walker, 1987).
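To make the non-parametric approach concrete, the following Python sketch evaluates a log-likelihood in which $\theta$ follows a discrete distribution over a few mass points. It is not the CTM program. It assumes the Weibull baseline $\exp(\gamma_0 + \gamma_1 \ln t)$ with $\gamma_1 > -1$ so that the integrated hazard has a closed form, and the function names, toy durations and parameter values are all hypothetical.

```python
import math

def conditional_density(t, x, beta, gamma0, gamma1, c, theta):
    """f(t | X, theta) = hazard * survival under a Weibull baseline (gamma1 > -1)."""
    shift = math.exp(sum(xi * bi for xi, bi in zip(x, beta)) + c * theta)
    hazard = math.exp(gamma0 + gamma1 * math.log(t)) * shift
    # Integral of exp(gamma0) * s**gamma1 from 0 to t, times the proportional shift.
    cum_hazard = math.exp(gamma0) * t ** (gamma1 + 1.0) / (gamma1 + 1.0) * shift
    return hazard * math.exp(-cum_hazard)

def log_likelihood(durations, covariates, beta, gamma0, gamma1, c, points, probs):
    """Sum over individuals of the log of the density mixed over the mass points."""
    total = 0.0
    for t, x in zip(durations, covariates):
        mixed = sum(p * conditional_density(t, x, beta, gamma0, gamma1, c, th)
                    for th, p in zip(points, probs))
        total += math.log(mixed)
    return total

# Toy check: with gamma1 = 0 and c = 0 the model reduces to an exponential
# duration with rate exp(gamma0), whatever the mass points are.
toy_ll = log_likelihood(
    durations=[1.0, 2.0], covariates=[[0.0], [0.0]],
    beta=[0.0], gamma0=math.log(0.5), gamma1=0.0, c=0.0,
    points=[0.0, 1.0], probs=[0.3, 0.7],
)
print(toy_ll)
```

An estimation routine would maximize this function over the structural parameters, the mass points and their probabilities, restarting from several starting values as described above; that optimization step is left out of the sketch.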

DESCRIPTION OF COVARIATES

Search is defined as the effort directed toward the acquisition of marketer and non-marketer dominated information from the external environment. It begins when need triggers the serious consideration of a purchase and ends with the actual purchase transaction (Beatty & Smith, 1987; Srinivasan & Ratchford, 1991). The hazard rate is modeled as the dependent variable. In essence, the hazard can be thought of as the rate of terminating search and is inversely related to duration.

Based on a general knowledge of the automobile market and on the current marketing literature on search behavior, motivational determinants of search that would influence the hazard function are identified. The variables included in the study are consumer, product and demographic factors. Variable names and the hypothesized signs of their effects on the hazard are given in parentheses.

(1.) Amount of Experience (AMOUNT, +) is defined as the number of new automobiles purchased in the last ten years. Consumers who have experience in buying cars are likely to develop simplifying procedures that reduce the amount of time required to reach a decision (Alba & Hutchinson, 1987; Johnson & Russo, 1984; Furse, Punj & Stewart, 1984). By being more efficient, the amount of time spent by the consumer in searching for external information is reduced. Duration of search is, therefore, reduced, and it is posited that the amount of experience is positively related to the hazard function.

(2.) Perceived Risk (RISK, -) is a measure of the consumer's belief of the chance of incurring a physical, financial, performance and convenience loss (Peter & Ryan, 1976; Peter & Tarpey, 1975; Srinivasan & Ratchford, 1991). The higher a consumer's perceived risk of making a wrong choice, the greater is the duration of search. Hence, perceived risk is negatively related to the hazard function.

(3.) Evoked Set (EVOKE, -) is the number of models included in the individual's consideration set. A larger set would require more extensive information search as opposed to a smaller set, thereby leading to an extended duration of search. Evoked set is, therefore, hypothesized to be negatively related to the hazard function.

(4.) Perceived Benefits of search (BENFT, -) is a measure of a consumer's perception of potential gains from search. For example, the consumer may benefit in the form of obtaining a better price or a more satisfactory model. A greater perception of the benefits of search would drive the consumer toward more extensive search. Perceived benefits are, therefore, hypothesized to be negatively related to the hazard function.

(5.) Interest (INTRST, -) in the product class would result in more time spent collecting information (Maheswaran & Sternthal, 1990). The rate of terminating search is thereby reduced. Interest is, therefore, hypothesized to be negatively related to the hazard function.

(6.) Knowledge (KNO, +) is the knowledge and understanding that an individual has of a product within a particular product class. It enables the person to process information more efficiently by excluding irrelevant information (Bettman & Park, 1980; Johnson & Russo, 1984; Beatty & Smith, 1987; Urbany, Dickson & Wilkie, 1989; Brucks & Schurr, 1990). Duration of search is thereby shortened, and product class knowledge is, therefore, posited to be positively related to the hazard function.

(7.) Positive experience (EXPER, +) with the product reflects the quality of past experience with the previous car and the dealer or manufacturer. A positive experience builds feelings of trust and confidence toward the manufacturer and/or dealer and impacts positively on decision making in that product category. Positive experience manifests itself in simplified decision processes often based on simplistic rules (such as purchasing the same brand of car or buying from the regular dealer). This is similar to what Bettman and Zins (1977) refer to as "preprocessed choice." Therefore, we expect greater amounts of positive experience to be accompanied by shorter durations of search. In other words, positive experience is positively related to the hazard function.

(8.) Price (PRICE, -). This is defined as the net price after taxes. Consumers tend to spend more time searching for items of higher value (Kiel & Layton, 1981). The higher the price of the automobile, the more extended the duration of search. Price is posited to be negatively related to the hazard function.

(9.) Discount (DISCOUNT, +). This is the combined total manufacturer and dealer discounts. Discounts act as incentives to purchase. Larger discounts would encourage consumers to terminate search and complete the purchase transaction. Therefore, large discounts are associated with higher rates of terminating search, and the covariate is hypothesized to be positively related to the hazard function.

(10.) Age (AGE, +) reflects the life stage of an individual. Hempel (1969) and Srinivasan and Ratchford (1991) have shown that older individuals tend to engage in less search. In other words, their duration of search is shorter. Hence, age is hypothesized to be positively related to the hazard function.

(11.) Education (EDU, -) is used as a proxy measure of an individual's ability to collect, process and use external information (Newman & Staelin, 1972; Ratchford & Srinivasan, 1993). More educated consumers tend to engage in extended search, thereby leading to longer durations of search. Education is, therefore, hypothesized to be negatively related to the hazard function.

While all attempts have been made to adequately measure and include variables that might account for heterogeneity, it is expected that there remain some factors which are unaccounted for or unmeasurable. The heterogeneity factor, c, captures these unexplained effects and leaves the estimated parameters unbiased.

DATA

The data set used in this study is a subset of a data set obtained through a mail survey of people who registered new cars in a northeastern Standard Metropolitan Statistical Area (SMSA). The questionnaires elicited responses from the person mainly responsible for buying the new car. After eliminating all cases with any missing data, 1024 usable cases remained, representing a response rate of 46%. These were employed in the analysis.

The measure of time spent searching in this data set is the sum of self-reported time spent in the search process on the following categories: talking to friends/relatives, reading books/magazine articles, reading/listening to ads, reading about car ratings in magazines, reading automobile brochures/pamphlets, driving to/from dealers, looking around showrooms, talking to salespersons, test driving cars.

DISCUSSION

Equation (9.) is estimated using the iterative maximum likelihood procedure. While several commonly used distributions for the hazard function were estimated, the Weibull hazard gave the best results. Results for this model are displayed in Table 2. This table reports results for three different specifications of the unobserved heterogeneity factor, namely a specification that does not account for unobserved heterogeneity, another that assumes standard normality, and finally a non-parametric specification that represents heterogeneity in terms of a discrete distribution of mass points.

In estimating the non-parametric specification, the end points of the interval over which the support points are estimated are fixed at 0 and 1, and other support points between these are determined in estimation. Also, the probability mass associated with each point is estimated. In estimation, support points are added one at a time until two points become clustered at approximately the same location. In the analysis, five support points are required to adequately estimate the underlying probability distribution. The estimated support points are 0, 0.33, 0.55, 0.75 and 1.00 with associated probabilities of 0.0264, 0.1392, 0.3194, 0.3924 and 0.1226 respectively.
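The reported support points and probabilities define a discrete distribution whose basic properties can be verified directly. The short Python sketch below uses only the values quoted in the preceding paragraph; the variable names are hypothetical, and the mean and variance are descriptive summaries that the study itself does not report.

```python
# Estimated support points and probability masses quoted in the text.
support = [0.0, 0.33, 0.55, 0.75, 1.00]
probs = [0.0264, 0.1392, 0.3194, 0.3924, 0.1226]

# A valid discrete distribution must have masses that sum to one.
total_mass = sum(probs)

# Mean and variance of the estimated heterogeneity distribution.
mean = sum(p * th for th, p in zip(support, probs))
variance = sum(p * (th - mean) ** 2 for th, p in zip(support, probs))

print(total_mass, mean, variance)
```

The masses sum to one, and most of the probability sits on the upper three points, so the estimated distribution is concentrated toward the upper end of the unit interval.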

Effects of Time on Search Termination

Looking across the columns in Table 2, the duration term (the coefficient of ln t) is seen to be significant at the 0.001 level. It takes on the values of 0.25, 0.54 and 3.36 under the no heterogeneity, standard normal and non-parametric specifications respectively.

The hazard is positively related to ln t in Table 2, implying that the longer the time elapsed while searching, the greater is the likelihood of terminating search. For the non-parametric heterogeneity case, the estimated coefficient of ln t exceeds one, implying that the second derivative of the hazard with respect to time is positive, which means that the hazard increases at an increasing rate.
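The link between the size of the ln t coefficient and the shape of the hazard follows from differentiating the Weibull baseline (row 2 of Table 1). The derivation below is a worked check under that baseline, with $\gamma_1$ denoting the coefficient of ln t and t > 0.

```latex
\begin{align}
\lambda_0(t) &= \exp(\gamma_0 + \gamma_1 \ln t) = e^{\gamma_0}\, t^{\gamma_1} \\
\frac{d\lambda_0}{dt} &= \gamma_1\, e^{\gamma_0}\, t^{\gamma_1 - 1} > 0
    \quad \text{when } \gamma_1 > 0 \\
\frac{d^2\lambda_0}{dt^2} &= \gamma_1 (\gamma_1 - 1)\, e^{\gamma_0}\, t^{\gamma_1 - 2} > 0
    \quad \text{when } \gamma_1 > 1
\end{align}
```

A coefficient between zero and one therefore gives a hazard that rises at a decreasing rate, while a coefficient above one gives a hazard that rises at an increasing rate.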

Effects of Covariates on Search Termination

The covariates with the strongest effects on the duration of search are perceived benefits of search and size of the evoked set, both of which tend to lower the hazard and lengthen the search. As expected, both amount and type of experience are associated with an increased hazard, and hence a shorter duration of search. The effect of the covariate interest is significant at the 0.001 level. Interest in a certain product class encourages more external search for information. Past research indicates that knowledgeable consumers experience pleasure in collecting and processing information (Maheswaran & Sternthal, 1990). The results are similar to those of Srinivasan and Ratchford (1991), who have shown that the interest a consumer has in a certain product class is a major motivator of search. It follows, then, that greater interest leads to a lower probability of terminating search, or lower hazard values.
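Because covariates enter the hazard through exp(X[beta]), each coefficient can be read as a multiplicative shift: a one-unit increase in a covariate multiplies the hazard by the exponential of its coefficient. The Python sketch below applies this to four column (a) coefficients from Table 2. The measurement units of the covariates are not stated in this copy, so the multipliers are illustrative of direction and relative size only; the names are hypothetical.

```python
import math

# Column (a) coefficients (no heterogeneity) copied from Table 2.
coefficients = {"EVOKE": -2.459, "BENFT": -3.588, "EXPER": 1.225, "AMOUNT": 1.085}

def hazard_multiplier(beta, delta_x=1.0):
    """Factor by which the hazard changes when a covariate rises by delta_x."""
    return math.exp(beta * delta_x)

multipliers = {name: hazard_multiplier(b) for name, b in coefficients.items()}
for name, factor in multipliers.items():
    print(name, round(factor, 3))
```

Multipliers below one correspond to a lower rate of terminating search and hence longer search; multipliers above one correspond to shorter search.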

While these results are in general agreement with past studies of search effort for automobiles, our study has the advantage of controlling for changes in the hazard through time, and for unmeasured heterogeneity. The estimated effect of several of the covariates changes considerably when heterogeneity is taken into account, indicating that it is important to control for unmeasured heterogeneity when studying the determinants of search.

The heterogeneity factor, c, shown in Table 2 is significant at the 0.001 level in both the standard normal and non-parametric specifications, thereby rejecting the null hypothesis of no heterogeneity. This implies that unobserved heterogeneity has a positive impact on the hazard function and if unaccounted for will contaminate the parameter estimates (Heckman & Singer, 1984). While the more flexible non-parametric model yields a higher log likelihood than the model with normal heterogeneity, the two models are not nested, and no formal significance test for their difference was run.
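The no-heterogeneity model is the special case c = 0 of the standard normal specification, so those two models are nested and their log likelihoods can be compared. The Python sketch below is arithmetic on the Table 2 values and is not a test reported in the study. It assumes a single restriction and the usual chi-square approximation with one degree of freedom, for which the upper tail probability equals erfc(sqrt(x / 2)); the function name is hypothetical.

```python
import math

def lr_test_one_restriction(neg_ll_restricted, neg_ll_unrestricted):
    """Likelihood ratio statistic and chi-square(1) p-value from negative log likelihoods."""
    stat = 2.0 * (neg_ll_restricted - neg_ll_unrestricted)
    # For one degree of freedom, Pr(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Negative log likelihoods from Table 2: columns (a) and (b).
stat, p_value = lr_test_one_restriction(1624.39, 1617.64)
print(stat, p_value)
```

Under these assumptions the comparison points in the same direction as the significance reported for the heterogeneity factor.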

CONCLUSION

This study attempted to model the stochastic nature of search. One of its contributions is to provide a framework within which three distinct effects on the hazard function can be examined. They are the effects of time, the influence of observed product and consumer motivational factors and the significance of unobserved or unmeasured heterogeneity.

The results show significant duration dependence and point to duration as a major determinant of the rate of terminating search. The estimated effect of time elapsed since commencing search is biased when unobserved heterogeneity is not taken into account.

Another important finding relates to the magnitude and nature of the unobserved heterogeneity. This component is found to be highly significant and exerts substantial impact on the parameter estimates. The results highlight the importance of accounting for unobserved heterogeneity and the sensitivity of parameter estimates to the specification of its distribution. Problems associated with the assumption of standard normality for the unobserved heterogeneity are also presented.

Covariates that exert the largest impact on the hazard function are found to be the perceived benefits of search and the size of the evoked set. Price and amount of discount are also found to be significant. Of the demographic characteristics, age is found to be positively related to the hazard while education is not significant.

Due to the nature of the data, this study restricts itself to a single spell. Assuming that consumers build up an inventory of knowledge and experiences, which impact on future actions and choices, it would be interesting to build and estimate a model incorporating several spells.

REFERENCES

Alba, J. B. & Hutchinson, J. W. (1987). Dimensions of consumer expertise. Journal of Consumer Research, 13, 411-454.

Amemiya, T. (1985). Advanced econometrics. Cambridge, MA: Harvard University Press.

Bayus, B. L. (1991). The consumer durable replacement buyer. Journal of Marketing, 55, 42-51.

Beatty, S. E. & Smith, S. M. (1987). External search effort: An investigation across several product categories. Journal of Consumer Research, 14, 83-95.

Bettman, J. & Park, C. W. (1980). Effects of prior knowledge and experience and phase of the choice process on consumer decision processes: A protocol analysis. Journal of Consumer Research, 7, 234-248.

Bettman, J. & Zins, M. (1977). Constructive processes in consumer choice. Journal of Consumer Research, 4, 75-85.

Brucks, M. & Schurr, P. (1990). The effects of bargainable attributes and attribute range knowledge on consumer choice processes. Journal of Consumer Research, 4, 409-419.

Bucklin, L. P. (1969). Consumer search role: Enactment and market efficiency. Journal of Business, 42, 416-438.

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, 34, 187-200.

Flinn, C. & Heckman, J. (1982). Models for the analysis of labor force dynamics. Advances in Econometrics, 1, 35-95.

Furse, D. H., Punj, G. N. & Stewart, D. W. (1984). A typology of individual search strategies among purchasers of new automobiles. Journal of Consumer Research, 10, 417-427.

Heckman, J. & Singer, B. (1984). A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica, 52, 271-320.

Jain, D. C. & Vilcassim, N. J. (1991). Investigating household purchase timing decisions: A conditional hazard function approach. Marketing Science, 10, 1-13.

Johnson, E. C. & Russo, J. E. (1984). Product familiarity and learning new information. Journal of Consumer Research, 11(June), 542-550.

Jones, S. (1988). The relationship between unemployment spells and reservation wages as a test of search theory. The Quarterly Journal of Economics, 743-765.

Kiel, G. C. & Layton, R. A. (1981). Dimensions of consumer information seeking. Journal of Consumer Research, 8, 233-239.

Lancaster, T. (1985). Simultaneous equations models in applied search theory. Journal of Econometrics, 28, 113-126.

Lancaster, T. (1990). The Econometric Analysis of Transition Data. New York: Cambridge University Press.

Maheswaran, D. & Sternthal, B. (1990). The effects of knowledge, motivation and type of message on ad processing. Journal of Consumer Research, 17(1), 66-73.

Marmorstein, H., Grewal, D. & Fishe, R. (1992). The value of time spent in price-comparison shopping: Survey and experimental evidence. Journal of Consumer Research, 19, 52-61.

Massy, W. F., Montgomery, D. B. & Morrison, D. G. (1970). Stochastic models of buying behavior. Cambridge, MA: MIT Press.

Newman, J. W. & Staelin, R. (1972). Prepurchase information seeking for new cars and major household appliances. Journal of Marketing Research, 9, 249-257.

Peter, P. J. & Ryan, M. J. (1976). An investigation of perceived risk at the brand level. Journal of Marketing Research, 13, 186-188.

Peter, P. J. & Tarpey, Sr., L. X. (1975). Comparative analysis of three consumer decision strategies. Journal of Consumer Research, 2, 29-37.

Punj, G. N. & Staelin, R. (1983). A model of consumer search behavior for new automobiles. Journal of Consumer Research, 9, 366-380.

Putsis, W. & Srinivasan, N. (1994). Buying or just browsing? The duration of purchase deliberation. Journal of Marketing Research, 31, 393-402.

Ratchford, B. T. & Srinivasan, N. (1993). An empirical investigation of returns to search. Marketing Science, 12, 73-87.

Srinivasan, N. & Ratchford, B. T. (1991). An empirical test of a model of external search for automobiles. Journal of Consumer Research, 18, 233-241.

Urbany, J., Dickson, P. & Wilkie, W. (1989). Buyer uncertainty and information search. Journal of Consumer Research, 16, 208-215.

Vilcassim, N. J. & Jain, D. C. (1991). Modeling purchase-timing and brand-switching behavior incorporating explanatory variables and unobserved heterogeneity. Journal of Marketing Research, 28, 29-41.

Yi, K. M., Honore, B. & Walker, J. (1987). Program for the estimation and testing of continuous time multi-state multi-spell models, user's manual, program version 50. Chicago, IL: National Opinion Research Center.

Peggy Choong, Niagara University

Table 1: Restrictions on Parameters in Equation (5.) and the Resulting Probability Distribution

Restrictions                                                  Baseline Hazard                                                                           Corresponding Probability Distribution

1. $\gamma_k = 0$, $k \geq 1$                                 $\exp(\gamma_0)$ = constant                                                               Exponential

2. $\epsilon_1 = 0$; $\gamma_k = 0$, $k \geq 2$               $\exp(\gamma_0 + \gamma_1 \ln t)$                                                         Weibull

3. $\epsilon_1 = 1$; $\epsilon_2 \to 0$; $\epsilon_3 = 2$;    $\exp[(\gamma_0 - \gamma_1 - \gamma_1^2/2) + \gamma_1 t + \ln t + (\gamma_1^2/2) t^2]$    Approximately Erlang-2
   $\gamma_1 < 0$; $\gamma_2 = 1$; $\gamma_3 = \gamma_1^2$;
   $\gamma_k = 0$, $k \geq 4$

Table 2: Parameter Estimates

Variables                   (a) No Heterogeneity    (b) Standard Normal     (c) Non-Parametric

Intercept                   4.570 (++++)            22.798 (++++)           9.052 (++++)
                            (0.307)                 (0.744)                 (0.528)
ln t                        0.249 (++)              4.426 (++++)            3.356 (++++)
                            (0.029)                 (0.167)                 (0.179)
KNO (+)                     0.885 (++)              4.921 (++++)            3.688 (++++)
                            (0.383)                 (0.579)                 (0.621)
EXPER (+)                   1.225 (++++)            1.206 (++)              6.261 (++++)
                            (0.336)                 (0.509)                 (0.601)
AMOUNT (+)                  1.085 (++++)            0.461                   2.695 (++++)
                            (0.309)                 (0.490)                 (0.554)
RISK (-)                    -0.453                  -2.219 (++++)           -1.279 (++)
                            (0.272)                 (0.478)                 (0.566)
EVOKE (-)                   -2.459 (++++)           -13.913 (++++)          -11.752 (++++)
                            (0.329)                 (0.687)                 (0.675)
BENFT (-)                   -3.588 (++++)           -16.371 (++++)          -14.438 (++++)
                            (0.305)                 (0.684)                 (0.782)
INTRST (-)                  -0.830 (++++)           -2.558 (++++)           -1.705 (++++)
                            (0.181)                 (0.235)                 (0.376)
PRICE (-)                   -1.259 (+++)            -1.054 (+)              -1.437 (+++)
                            (0.392)                 (0.548)                 (0.596)
DISCOUNT (+)                0.964 (++)              1.412 (++)              3.009 (++++)
                            (0.427)                 (0.589)                 (0.651)
AGE (+)                     1.150 (++++)            7.199 (++++)            3.440 (++++)
                            (0.325)                 (0.527)                 (0.596)
EDU (-)                     0.618 (+)               2.892 (++++)            0.865
                            (0.339)                 (0.493)                 (0.524)
HETEROGENEITY FACTOR (c)    --                      3.887 (++++)            14.305 (++++)
                                                    (0.128)                 (0.642)
Negative Log Likelihood     1624.39                 1617.64                 1606.0

Standard errors are in parentheses.

(++++) Significant at the p = 0.001 level;

(+++) Significant at the p = 0.02 level;

(++) Significant at the p = 0.05 level;

(+) Significant at the p = 0.1 level.