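Mathematics
Econometrics
Estimator
Population
Statistics
Regression analysis
Residual
Linear regression
Regression
Instrumental variable
Applied mathematics
Algorithm
Sociology
Demography
Authors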
Kwok Leung Tsui, Nicholas P. Jewell, Changbao Wu
Identifier
DOI:10.1080/01621459.1988.10478664
Abstract
A description is given of a new method of estimating the regression parameters in the linear regression model from data where the dependent variable is subject to truncation. The residual distribution is allowed to be unspecified. The method is iterative and involves estimation of the residual distribution under the truncated sampling scheme. The technique can be interpreted as an iterative bias adjustment of the observations in order to correct the regression relationship in the sampled population to match that of the model. A simulation study compares the performance of various estimators, including one suggested by Bhattacharya, Chernoff, and Yang (1983). This truncation regression problem arises in many contexts of scientific and social research. In economics, Tobin (1958) analyzed household expenditure on durable goods using a regression model that took account of the fact that the expenditure is always nonnegative. A more general situation was studied by Hausman and Wise (1976, 1977) in connection with negative income-tax experiments. Another example concerning the schooling and earnings of low achievers was studied by Hansen, Weisbrod, and Scanlon (1970). There is also a controversy in astronomy involving Hubble's law and Segal's chronometric theory (Nicoll and Segal 1982; Turner 1979). Both theories predict a straight line relating the negative log of luminosity and the log of velocity as measured by red shift for celestial objects. The problem is complicated by the fact that objects of low luminosity are not visible, and hence all data relating to them are unobserved. Holgate (1965) described a biological example. A truncated linear regression model is defined as $y = x^T \beta + e$, where $x$ is a vector of covariates, $\beta$ is the vector of parameters of interest, and $e$ is independent of $x$ with mean 0 and cumulative distribution $F$. The datum $(x, y)$ is observed only if $y \le y_0$. The truncation point $y_0$ is known.
Based on $n$ independent observations $(x_i, y_i)$ with $y_i \le y_0$, it is desired to estimate $\beta$ and $F$. Note that this differs from the censored regression model, where data $(x, y)$ with $y > y_0$ are observed but with the $y$ value set to $y_0$. The procedures described in the article are easily extended to truncation from below and the situation where the truncation points vary across observations. It is straightforward to see that the ordinary least squares estimate of $\beta$ is inconsistent. A common method of dealing with this problem is to assume that the error distribution $F$ is Gaussian and proceed with standard parametric methods. In many applications this assumption may not be reasonable. Hence there is interest in developing nonparametric methods of estimation that do not rely on assumptions about $F$. In this article a new approach for estimating $\beta$ is introduced. The method allows the error distribution $F$ to be arbitrary and is general enough to handle multiple linear regression. The rank-based method of Bhattacharya et al. (1983), which was designed for simple linear regression, is compared with the proposed method using a simulation study. The new approach appears to give estimators with good bias and efficiency properties in a wide variety of situations.
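
The abstract's remark that ordinary least squares is inconsistent under truncation can be illustrated with a small simulation. The sketch below is not the iterative method of the article. It only generates data from the truncated model and fits OLS to the retained observations. The Gaussian errors, the uniform covariate, the parameter values, and the helper names `simulate_truncated` and `ols` are all assumptions made for illustration.

```python
import numpy as np


def simulate_truncated(n, beta, y0, rng):
    """Draw n pairs from y = beta[0] + beta[1]*x + e and keep those with y <= y0.

    Assumed setup (not from the article): x ~ Uniform(0, 2), e ~ N(0, 1).
    Pairs with y > y0 are dropped entirely, which is truncation; under
    censoring they would be kept with y replaced by y0.
    """
    x = rng.uniform(0.0, 2.0, size=n)
    e = rng.normal(0.0, 1.0, size=n)
    y = beta[0] + beta[1] * x + e
    keep = y <= y0
    return x[keep], y[keep]


def ols(x, y):
    """Ordinary least squares fit of y on an intercept and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n_drawn = 200_000
beta_true = np.array([0.0, 1.0])   # illustrative intercept and slope
y0 = 1.0                           # known truncation point

x_obs, y_obs = simulate_truncated(n_drawn, beta_true, y0, rng)
beta_ols = ols(x_obs, y_obs)

print("observations retained:", x_obs.size, "of", n_drawn)
print("true slope:", beta_true[1], "OLS slope on truncated sample:", beta_ols[1])
```

Under these assumptions the fitted slope falls well below the true slope even with a very large sample, because observations with large $x$ survive truncation only when their errors are strongly negative. Increasing the sample size does not remove this gap, which is the sense in which the estimate is inconsistent.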