Prediction of Latent Variables in a Mixture of Structural Equation Models, with an Application to the Discrepancy Between Survey and Register Data
Published Jun 13, 2008
The authors study the prediction of latent variables in a finite mixture of linear structural equation models. The latent variables can be viewed as well-defined variables measured with error or as theoretical constructs that cannot be measured objectively, but for which proxies are observed. The finite mixture component may serve different purposes: it can denote an unobserved segmentation into subpopulations, such as market segments, or it can be used as a nonparametric way to estimate an unknown distribution. In the first interpretation, it forms an additional discrete latent variable in an otherwise continuous latent variable model. Different criteria can be employed to derive “optimal” predictors of the latent variables, leading to a taxonomy of possible predictors. The authors derive the theoretical properties of these predictors. Special attention is given to mixtures that include components with degenerate distributions. They then apply the theory to the optimal estimation of individual earnings when two independent observations are available: one from survey data and one from register data. The discrete components of the model represent observations with or without measurement error, and with either a correct match or a mismatch between the two data sources.
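
As a rough illustration of how a predictor can combine mixture components, including a degenerate one, the sketch below uses a deliberately simplified setup that is not the authors' model. It assumes a single observation y of a latent variable xi ~ N(mu, tau2): with probability p_exact the observation is error-free (y = xi, a degenerate error distribution), and otherwise y = xi + e with e ~ N(0, sigma2) independent of xi. The posterior-mean predictor E[xi | y] is then a weighted average of the component-wise predictors, with the posterior component probabilities as weights. The function name and all parameter names (predict_latent, mu, tau2, sigma2, p_exact) are hypothetical and chosen only for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def predict_latent(y, mu, tau2, sigma2, p_exact):
    """Posterior-mean predictor of xi given y in a two-component mixture.

    Assumed (illustrative) model, not taken from the paper:
      xi ~ N(mu, tau2)
      component "exact" (prob p_exact):     y = xi
      component "noisy" (prob 1 - p_exact): y = xi + e, e ~ N(0, sigma2)
    Returns the predictor and the posterior probability of the exact component.
    """
    # Marginal density of y under each component.
    f_exact = normal_pdf(y, mu, tau2)
    f_noisy = normal_pdf(y, mu, tau2 + sigma2)

    # Posterior probability that y is error-free (Bayes' rule).
    w_exact = p_exact * f_exact / (p_exact * f_exact + (1 - p_exact) * f_noisy)

    # Component-wise predictors: y itself if error-free, otherwise the
    # usual linear shrinkage of y toward the prior mean mu.
    pred_exact = y
    pred_noisy = mu + tau2 / (tau2 + sigma2) * (y - mu)

    return w_exact * pred_exact + (1 - w_exact) * pred_noisy, w_exact


if __name__ == "__main__":
    # Made-up parameter values, for illustration only.
    prediction, weight = predict_latent(y=11.5, mu=10.0, tau2=1.0, sigma2=0.5, p_exact=0.6)
    print(f"posterior P(exact) = {weight:.3f}, predicted xi = {prediction:.3f}")
```

In this sketch the predictor always lies between the shrinkage predictor and the raw observation, and it moves toward the raw observation as the posterior probability of an error-free measurement grows. A setting with two data sources and possible mismatches, as in the application described above, would need more components, but the same weighting logic would apply.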