Combined Stochastic and Rule-based Approach to Improve Regression Models with Mismeasured Monotonic Covariates Without Side Information
ZEW Discussion Paper No. 11-013 // 2011
The education variable in the IAB datasets suffers from problems such as missing and misclassified values. These data problems do not occur randomly, but are highly associated with other variables in the dataset. The issue has become increasingly important in recent years, and it severely influences empirical findings. The education variable should represent a person's highest formal degree. People can only attain degrees over time but cannot lose them. This monotonicity imposes restrictions on the admissible sequences of values, and violations of these restrictions reveal inconsistencies in the data. Recently, this problem has been addressed in the literature by using correction rules. The newly developed procedure utilises these rules to identify misclassified values and replace them with missing values. We derive a new estimator for this kind of data correction based on an EM-based estimator for incomplete data. The estimation results of this new procedure are unbiased and consistent under the classical MAR assumption. We apply this new estimator to a set of Mincer-type wage regressions, estimated separately for each year from 1993 to 2003, in order to observe changes in the impact of the educational degrees on the wage. These coefficient estimates clearly show that the quality of education is more important than the number of years of education. We also find a rising wage differential between the different educational degrees over time. This indicates that the educational expansion of this decade does not exceed the demand for high-skilled workers. Thus, we did not find any evidence that would suggest an inflation of formal education or a devaluation of degrees.
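The rule-based step described above (flag values that contradict the "degrees cannot be lost" property and treat them as missing) can be illustrated with a small sketch. This is not the paper's actual set of correction rules, which the abstract does not spell out. It assumes a hypothetical ordinal coding of degrees as integers, `None` for missing values, and a deliberately conservative rule in which both observations of any decreasing pair are set to missing. The function name `blank_inconsistent` is invented for this illustration.

```python
from typing import List, Optional


def blank_inconsistent(degrees: List[Optional[int]]) -> List[Optional[int]]:
    """Set to None every value that takes part in a decrease over time.

    `degrees` is one person's education history in chronological order,
    coded as ordinal integers (an assumed coding), with None for missing.
    If a later value is lower than an earlier one, at least one of the two
    must be misclassified; this sketch does not guess which, so it blanks both.
    """
    n = len(degrees)
    flagged = [False] * n
    for i in range(n):
        if degrees[i] is None:
            continue
        for j in range(i + 1, n):
            later = degrees[j]
            if later is not None and later < degrees[i]:
                # A degree appears to have been "lost": flag both observations.
                flagged[i] = True
                flagged[j] = True
    return [None if flagged[k] else degrees[k] for k in range(n)]


# Example history: the value 3 in the third spell is contradicted by the 1 after it.
history = [1, 1, 3, 1, 3, 3]
cleaned = blank_inconsistent(history)  # [1, 1, None, None, 3, 3]
```

The blanked entries would then be handled by an estimator for incomplete data, such as the EM-based estimator the abstract refers to, rather than being imputed by the rule itself.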
Dlugosz, Stephan (2011), Combined Stochastic and Rule-based Approach to Improve Regression Models with Mismeasured Monotonic Covariates Without Side Information, ZEW Discussion Paper No. 11-013, Mannheim.