An introduction to estimation with choice-based sample data

by James Hosek

Choice-based sampling selects members of a population conditional on the choices they have made. Because the sampling rate is allowed to depend on choice, certain outcomes can be sampled at higher or lower rates than they occur in the population, which helps ensure enough observations for an empirical analysis of choice. With appropriate weights, samples formed on the basis of choice can produce consistent estimates of the population making a particular choice. This paper defines the weights, demonstrates their role in estimating the population subgroup making a particular choice, and gives a theorem on the use of weights in maximum likelihood estimation of a choice equation. This theorem greatly enhances the usefulness of choice-based samples in empirical work. In a special case of the theorem, the conditional logit model, consistent estimates of all parameters except the intercept terms can be obtained from "raw" choice-based data, without weighting.
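
The two ideas in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the weight for each choice is assumed to be the common ratio of population share to sample share (the abstract does not state the paper's own formula), the logit case is shown for a binary choice with one covariate, and all names and numbers (choice_weights, sample_prob, the shares and sampling rates) are hypothetical.

```python
import math

def choice_weights(pop_shares, sample_shares):
    """Assumed weight per choice: population share divided by sample share."""
    return {c: pop_shares[c] / sample_shares[c] for c in pop_shares}

def weighted_mean(values, choices, weights):
    """Average of `values`, with each observation weighted by its choice."""
    w = [weights[c] for c in choices]
    return sum(wi * v for wi, v in zip(w, values)) / sum(w)

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_prob(a, b, x, rate1, rate0):
    """P(y=1 | x, sampled) by Bayes' rule, when units choosing 1 are sampled
    at rate1 and units choosing 0 at rate0, and the population follows a
    binary logit with intercept a and slope b."""
    p = logistic(a + b * x)
    return rate1 * p / (rate1 * p + rate0 * (1.0 - p))

# Hypothetical setting: 10% of the population chooses "yes",
# but the sample is built to be half "yes" and half "no".
pop_shares = {"yes": 0.1, "no": 0.9}
sample_shares = {"yes": 0.5, "no": 0.5}
weights = choice_weights(pop_shares, sample_shares)

# Weighting the oversampled "yes" group back down recovers the population share.
choices = ["yes", "yes", "no", "no"]
chose_yes = [1, 1, 0, 0]
estimated_share = weighted_mean(chose_yes, choices, weights)

# In the logit case, sampling on choice only shifts the intercept by
# log(rate1 / rate0); the slope b is unchanged.
a, b, x = -2.0, 0.7, 1.5
rate1, rate0 = 0.5, 0.05
unweighted = sample_prob(a, b, x, rate1, rate0)
shifted = logistic(a + math.log(rate1 / rate0) + b * x)
```

Here estimated_share comes back to the assumed population share for "yes", and unweighted equals shifted, which is why slope parameters of a logit model can be estimated from unweighted choice-based data while the intercept cannot.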

This report is part of the RAND Corporation paper series. The paper was a product of the RAND Corporation from 1948 to 2003 that captured speeches, memorials, and derivative research, usually prepared on authors' own time and meant to be the scholarly or scientific contribution of individual authors to their professional fields. Papers were less formal than reports and did not require rigorous peer review.

The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.