Article information

Title: Interpreting the regression coefficient in semilogarithmic functions: a note.
Authors: Krautmann, Anthony C.; Ciecka, James
Journal: Indian Journal of Economics and Business
Print ISSN: 0972-5784
Year of publication: 2006
Issue: June
Language: English
Publisher: Indian Journal of Economics and Business
Abstract: Prior studies have warned us against interpreting a coefficient in semilogarithmic models as being the proportional change in the dependent variable associated with a unit change in the independent variable. In this paper, we argue that this critique regarding the regression coefficients is itself misplaced, and show that such coefficients can reasonably be interpreted as the proportional change in the dependent variable.
Keywords: Coefficients; Regression analysis

Concern about the proper interpretation of a regression coefficient in a semilogarithmic functional form was originally expressed by Halvorsen and Palmquist (1980), who warned against interpreting the coefficient of a dummy variable as being the proportional change in the dependent variable. Halvorsen and Palmquist (hereafter, H&P) argued that the proportionate change in the dependent variable, g, is related to the regression coefficient, [alpha], on a dummy variable by the equation g = exp([alpha]) - 1.

Given the popularity of the semilogarithmic functional form, especially in the estimation of Mincer-type wage equations, it is important to fully understand the meaning of regression coefficients. A survey of studies that estimated semilogarithmic models shows that H&P's interpretation of the regression coefficient continues to be widely held (Kennedy, 1981; Thornton and Innes, 1989; Lassibille, 1998; Asgary, et al., 1997; Benson, et al., 1998; Curington, et al., 1997; Dor and Farley, 1996; Malpezzi, et al., 1998; Cancio, et al., 1996; Macdonald and Cavalluzzo, 1996; Boulding and Purohit, 1996; Levy and Miller, 1996; Curington, 1994; Even and Macpherson, 1993; Rummery, 1992; Baimbridge, 1998; Baimbridge and Whyman, 1997) (1).

In this note, we argue that H&P's interpretation of the regression coefficient is misleading. We contend, on the contrary, that it is quite reasonable to interpret [alpha] as the proportional change in the dependent variable. This conclusion arises from the recognition that g = ([Y.sub.1] - [Y.sub.0])/Y necessarily entails a reference point, given by the value of the dependent variable in the denominator. In H&P, the authors use Y = [Y.sub.0] in the denominator of g (i.e., where [Y.sub.0] is the value of Y when the dummy variable equals zero). We argue that there is no logical reason for using [Y.sub.0] over [Y.sub.1] (where [Y.sub.1] is the value of Y when the characteristic is present). How one defines g is important, for it ends up determining whether the regression coefficient overestimates or underestimates g. When using [Y.sub.0] as the reference point, the regression coefficient does indeed underestimate g (especially as the coefficient deviates further from zero). But using [Y.sub.1] as the reference point flips this conclusion around: the regression coefficient overestimates g.

To see this, consider the following semilogarithmic function:

ln Y = [delta] + [beta]X + [alpha]D ...(1)

where X is a continuous variable, and D is a dummy variable representing some qualitative characteristic. One notion of the proportional change, and that which was used by H&P, is defined relative to [Y.sub.0], given by:

[g.sub.0] = ([Y.sub.1] - [Y.sub.0])/[Y.sub.0] ...(2)

But an equally plausible notion of the proportional change can be defined relative to [Y.sub.1] instead of [Y.sub.0]. Let [g.sub.1] be the definition of the proportional change in Y when using [Y.sub.1] as the reference point, given by:

[g.sub.1] = ([Y.sub.1] - [Y.sub.0])/[Y.sub.1] ...(3)

Because neither reference point is more appropriate than the other, we face the same ambiguity we confront when teaching Principles students about elasticities. The typical solution to this ambiguity problem is obtained by calculating an arc elasticity, where the percentage change is taken relative to the average of the "beginning" and "ending" values. This proposed approach is even more relevant in regression models where a qualitative characteristic is being measured using a dummy variable. Given the arbitrary assignment of one and zero in defining a dummy variable, one cannot argue that the proportional change is any more relevant in comparison to [Y.sub.0] (when the dummy is equal to zero) than [Y.sub.1] (when it is equal to one). Thus, there exists another equally plausible definition of the proportional change, one based on the point of reference being the average of [Y.sub.0] and [Y.sub.1], given by:

[g.sub.2] = ([Y.sub.1] - [Y.sub.0])/[bar.Y] ...(4)

where [bar.Y] = ([Y.sub.0] + [Y.sub.1])/2
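As a hedged numerical illustration of definitions (2) through (4), the sketch below computes the proportional change in Y against each of the three reference points. The function name `proportional_changes` and the earnings figures are hypothetical, chosen only for illustration; they do not come from the paper.

```python
def proportional_changes(y0, y1):
    """Return (g0, g1, g2) for levels y0 (D = 0) and y1 (D = 1)."""
    diff = y1 - y0
    y_bar = (y0 + y1) / 2.0          # average of the two levels, as in (4)
    return diff / y0, diff / y1, diff / y_bar

# Hypothetical levels, not taken from the paper.
g0, g1, g2 = proportional_changes(40000.0, 50000.0)
print(f"g0 = {g0:.3f}, g1 = {g1:.3f}, g2 = {g2:.3f}")
```

With these figures the three measures are 0.250, 0.200, and 0.222, so the size of the reported gap depends on the reference point even before any regression is run.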

Using the semilogarithmic function given in (1), these three alternative definitions of g are related to the regression coefficient [alpha] in the following manner:

[g.sub.0] = exp ([alpha]) - 1 ...(2')

[g.sub.1] = (exp([alpha]) - 1)/exp([alpha]) ...(3')

[g.sub.2] = 2(exp([alpha]) - 1)/(exp([alpha]) + 1) ...(4')
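The step from (1) to these expressions can be made explicit. Holding X fixed, write Y_0 and Y_1 for the values of Y at D = 0 and D = 1, as in the text, with \alpha standing for [alpha]. The derivation below is a sketch of that algebra; the factor of 2 in the last line comes from the average in the denominator of (4).

```latex
\begin{align*}
\ln Y_1 - \ln Y_0 &= \alpha \quad\Longrightarrow\quad \frac{Y_1}{Y_0} = e^{\alpha}, \\
g_0 &= \frac{Y_1 - Y_0}{Y_0} = e^{\alpha} - 1, \\
g_1 &= \frac{Y_1 - Y_0}{Y_1} = 1 - e^{-\alpha} = \frac{e^{\alpha} - 1}{e^{\alpha}}, \\
g_2 &= \frac{Y_1 - Y_0}{(Y_0 + Y_1)/2} = \frac{2\,(e^{\alpha} - 1)}{e^{\alpha} + 1}.
\end{align*}
```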

In Table One below are the theoretical values for [g.sub.0], [g.sub.1], and [g.sub.2] for a wide range of values of the regression coefficient [alpha].
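The entries of such a table can be recomputed directly from the formulas. The sketch below is one way to do so, assuming [g.sub.2] is measured against the average of [Y.sub.0] and [Y.sub.1] as in (4); the function name `g_values` and the chosen grid of coefficients are illustrative.

```python
import math

def g_values(alpha):
    """Return (g0, g1, g2) implied by a dummy coefficient alpha."""
    ratio = math.exp(alpha)                    # Y1 / Y0 implied by equation (1)
    g0 = ratio - 1.0                           # reference point Y0
    g1 = (ratio - 1.0) / ratio                 # reference point Y1
    g2 = 2.0 * (ratio - 1.0) / (ratio + 1.0)   # reference point: average of Y0 and Y1
    return g0, g1, g2

print(f"{'alpha':>6} {'g0':>7} {'g1':>7} {'g2':>7}")
for alpha in (1.00, 0.50, 0.10, 0.00, -0.10, -0.50, -1.00):
    g0, g1, g2 = g_values(alpha)
    print(f"{alpha:6.2f} {g0:7.3f} {g1:7.3f} {g2:7.3f}")
```

Rounding to three decimals, as the loop does, allows a line-by-line comparison with the tabulated values.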

As Table One shows, [alpha] is bounded by [g.sub.0] and [g.sub.1], and the divergence of [alpha] from g worsens as [alpha] deviates from zero. Since neither [g.sub.0] nor [g.sub.1] provides an unambiguously correct standard against which to compare [alpha], the definition given by (4) is equally appealing. As seen in Table One, when one compares [alpha] to [g.sub.2], the differences are very small, even for relatively large values of [alpha]. To illustrate, consider a value of [alpha] = 0.50, leading the analyst to infer a 50 percent difference in Y due to the existence of the qualitative characteristic. The definition of the proportional change using [Y.sub.0] as the reference point would suggest the true proportional change is about 65% (i.e., [g.sub.0] = 0.649), a nearly 25 percent understatement arising from using the regression coefficient. Yet when this proportional change is defined relative to [bar.Y], we get a value of [g.sub.2] = 0.490, meaning the regression coefficient overstates g by only about 2 percent. While [alpha] is not exactly the same as [g.sub.2], this numerical example suggests that the misinterpretation problem raised by H&P is severely overstated (even when [alpha] is relatively large).
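One way to see why [g.sub.2] tracks [alpha] so closely is to expand each measure around [alpha] = 0. This expansion is not part of the original note; it is a standard Taylor series, and it assumes [g.sub.2] is taken relative to the average of [Y.sub.0] and [Y.sub.1], in which case it equals 2 tanh([alpha]/2).

```latex
\begin{align*}
g_0 &= e^{\alpha} - 1 = \alpha + \tfrac{\alpha^2}{2} + \tfrac{\alpha^3}{6} + \cdots, \\
g_1 &= 1 - e^{-\alpha} = \alpha - \tfrac{\alpha^2}{2} + \tfrac{\alpha^3}{6} - \cdots, \\
g_2 &= 2\tanh\!\left(\tfrac{\alpha}{2}\right) = \alpha - \tfrac{\alpha^3}{12} + \tfrac{\alpha^5}{120} - \cdots.
\end{align*}
```

The gap between [alpha] and either [g.sub.0] or [g.sub.1] is of second order in [alpha], while the gap between [alpha] and [g.sub.2] is of third order. At [alpha] = 0.50 the leading term, [alpha] cubed over 12, is about 0.010, which matches the difference between 0.50 and 0.490.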

To illustrate the appeal of using [g.sub.2] as the standard, consider the following gender wage gap example. Suppose Y denotes earnings and D is a dummy variable equal to 1 for males and 0 for females. If [alpha] = 0.50 in (1), then we would infer that males earn 50 percent more than females after controlling for other variables affecting earnings. If the claim were made that [g.sub.0] is the correct measure of the percentage change in Y, then H&P would argue that males actually earn 64.9% more than females; that is, [alpha] underestimates the true proportional change. Since the assignment of zeroes and ones is arbitrary when defining a dummy variable, suppose the zero-one value assignments had been reversed. Then (1) becomes

ln Y = ([delta] + [alpha]) + [beta]X - [alpha]D ...(5)

where D now equals 1 for females and 0 for males (2). In this case, the regression coefficient on D in (5) will be -[alpha] = -0.50, implying that [g.sub.0] = -0.393, or -39.3 percent. Thus, while the analyst would infer that females earn 50 percent less than males, H&P would propose that the correct proportion is actually 39.3 percent less. Ignoring negative signs, this suggests that [alpha] overestimates the 39.3% lower earnings of females. Hence, using [g.sub.0] as the standard, the regression coefficient either underestimates or overestimates the impact on Y depending on the arbitrary assignment of zeroes and ones in defining D. This inconsistency is eliminated when [g.sub.2] is used as the standard: here [g.sub.2] = -0.490, or minus 49 percent. Since D's coefficient in (5) is -0.50, the analyst would infer that females earn 50% less than males, when the "correct" answer is 49% lower earnings. This is exactly the same conclusion we would have reached had D been defined equal to 1 if male and 0 if female, as in equation (1). In that case, [alpha] = 0.50 (i.e., a 50% higher salary for males) when the proportion is [g.sub.2] = 0.49.
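The effect of reversing the dummy coding can be checked numerically. The sketch below is an illustration under the same formulas, with [g.sub.2] taken relative to the average of [Y.sub.0] and [Y.sub.1]; the helper names `g0` and `g2` are hypothetical. It evaluates each measure at [alpha] = 0.50 and at -0.50 and reports whether only the sign changes.

```python
import math

def g0(alpha):
    """Proportional change relative to Y0 (the D = 0 group)."""
    return math.exp(alpha) - 1.0

def g2(alpha):
    """Proportional change relative to the average of Y0 and Y1."""
    return 2.0 * (math.exp(alpha) - 1.0) / (math.exp(alpha) + 1.0)

alpha = 0.50
for name, g in (("g0", g0), ("g2", g2)):
    original, recoded = g(alpha), g(-alpha)
    # A measure is invariant to the coding of D if recoding changes only its sign.
    same_size = math.isclose(abs(original), abs(recoded))
    print(f"{name}: D=1 for males {original:+.3f}, "
          f"D=1 for females {recoded:+.3f}, same magnitude: {same_size}")
```

Only the measure based on the average reference point gives the same magnitude under both codings, which is the consistency property the example in the text relies on.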

While the journals surveyed above suggest that many economists accept the interpretation proposed by H&P, we find a marked contrast when it comes to the discussion appearing in many of the major econometrics textbooks. We found that nearly all texts continue to interpret the coefficient as the proportional change (see Greene, 2000; Studenmund, 2001; Gujarati, 1995; Wooldridge, 2000; Berndt, 1991). Given the discussion above, it is obvious that these textbook authors are technically correct only if we accept the definition of a proportional change given by [g.sub.2].

In sum, misinterpretation arguments regarding regression coefficients in semilogarithmic functions are themselves misplaced given the lack of a clear reference point for calculating changes in the dependent variable. We propose an equally appealing definition of the proportional change, one which resembles the method often used in calculating arc elasticities. Using that definition, we show that the regression coefficient in a semilogarithmic function is extremely close to the theoretical value of the proportional change. This leads us to suggest that there is no need to transform the regression coefficients as prescribed by H&P in order to get the correct proportional changes.

REFERENCES

Asgary, N., P. Gregory, and M. Mokhtari (1997), "Money Demand and Quantity Constraints: Evidence from the Soviet Interview Project", Economic Inquiry, 35: 365-77.

Baimbridge, M. (1998), "Academic and Private Sector Salaries: Chalk and Cheese?", Applied Economics Letters, 5: 211-14.

-- and P. Whyman (1997), "Demand for Religion in the British Isles", Applied Economics Letters, 4: 79-82.

Benson, E., J. Hansen, A. Schwartz, and G. Smersh (1998), "Pricing Residential Amenities: The Value of View", Journal of Real Estate Finance and Economics, 16: 1, 55-73.

Berndt, E. (1991), "Analyzing the Determinants of Wages and Measuring Wage Discrimination: Dummy Variables in Regression Models", in The Practice of Econometrics, Reading, MA: Addison-Wesley Publishing (Page 173).

Boulding, W. and D. Purohit (1996), "The Price of Safety", Journal of Consumer Research, 23: 12-15.

Cancio, A., T. Evans, and D. Maume (1996), "Reconsidering the Declining Significance of Race: Racial Differences in Early Career Wages", American Sociological Review, 61: 541-56.

Curington, W. (1994), "Compensation for Permanent Impairment and the Duration of Work Absence", Journal of Human Resources, 29: 3, 888-910.

--, A. Farmer, and W. Allen (1997), "Retroactive Benefits in Income Replacement Programs: Results from a Modified Natural Experiment", Southern Economic Journal, 64: 1, 255-67.

Dor, A. and D. Farley (1996), "Payment Sources and the Cost of Hospital Care: Evidence from a Multiproduct Cost Function with Multiple Payers", Journal of Health Economics, 15: 1-21.

Even, W. and D. Macpherson (1993), "The Decline of Private-Sector Unionism and the Gender Wage Gap", Journal of Human Resources, 28: 2, 279-95.

Greene, William (2000), Econometric Analysis (4th edition), New Jersey: Prentice-Hall, Inc. (Page 215).

Gujarati, Damodar (1995), Basic Econometrics (3rd edition), New York: McGraw-Hill (Page 169).

Halvorsen, Robert and Raymond Palmquist (1980), "The Interpretation of Dummy Variables in Semilogarithmic Equations", American Economic Review, 70: 3, 474-75.

Kennedy, P. (1981), "Estimation with Correctly Interpreted Dummy Variables in Semilogarithmic Equations", American Economic Review, 71: 4, 801.

Lassibille, G. (1998), "Wage Gaps Between the Public and Private Sectors in Spain", Economics of Education Review, 17: 1, 83-92.

Levy, D. and T. Miller (1996), "Hospital Rate Regulations, Fee Schedules, and Workers' Compensation Medical Payments", Journal of Risk and Insurance, 63: 1, 35-47.

Malpezzi, S., G. Chun, and R. Green (1998), "New Place-to-place Housing Price Indexes for U.S. Metropolitan Areas, and their Determinants", Real Estate Economics, 26: 2, 235-51.

Macdonald, J. and L. Cavalluzzo (1996), "Railroad Deregulation: Pricing Reforms, Shipper Responses, and the Effects on Labor", Industrial and Labor Relations Review, 50: 1, 80-91.

Rummery, S. (1992), "The Contribution of Intermittent Labour Force Participation to the Gender Wage Differential", Economic Record, 68: 202, 351-64.

Studenmund, A. H. (2001), Using Econometrics: A Practical Guide (4th edition), Boston: Addison Wesley Longman (Page 209).

Thornton, R. and J. Innes (1989), "Interpreting Semilogarithmic Regression Coefficients in Labor Research", Journal of Labor Research, 10: 4, 443-47.

Wooldridge, Jeffrey (2000), Introductory Econometrics: A Modern Approach, Australia: South-Western College Publishing, p. 184.

Table 1

[alpha]    [g.sub.0]    [g.sub.1]    [g.sub.2]

  1.00       1.718        0.632        0.924
  0.75       1.117        0.528        0.717
  0.50       0.649        0.393        0.490
  0.25       0.284        0.221        0.249
  0.10       0.105        0.095        0.100
  0.05       0.051        0.049        0.050
  0          0            0            0
 -0.05      -0.049       -0.051       -0.050
 -0.10      -0.095       -0.105       -0.100
 -0.25      -0.221       -0.284       -0.249
 -0.50      -0.393       -0.649       -0.490
 -0.75      -0.528       -1.117       -0.717
 -1.00      -0.632       -1.718       -0.924