As with all surveys based on random samples, the composition of the unweighted sample differs from the composition of the population. RAND constructs sampling weights to correct for this sampling error and to make the sample as representative of the population of interest as possible. The benchmark distributions against which the ALP is weighted are derived from the Current Population Survey (CPS). This choice follows common practice in surveys of consumers, such as the Health and Retirement Study (HRS). Three weighting methods have been implemented for the ALP: cell-based post-stratification, logistic regression, and raking. After some experimentation, raking was found to give the best results of the three. Raking allows finer categorizations of the variables of interest (in particular, age) than cell-based post-stratification does, while still matching the benchmark distribution of each weighting variable exactly.
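For comparison with raking, cell-based post-stratification can be sketched as follows: each respondent receives the ratio of the benchmark share to the sample share of his or her cell. This is a minimal illustration, not RAND's implementation; the function name `poststratify`, the cell labels, and the benchmark shares are hypothetical and are not CPS figures.

```python
from collections import Counter


def poststratify(sample_cells, population_shares):
    """Return one weight per respondent.

    sample_cells: list with the cell label of each respondent.
    population_shares: dict mapping cell label -> benchmark share (sums to 1).
    Each weight is (benchmark share of the cell) / (sample share of the cell).
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    weights = []
    for cell in sample_cells:
        sample_share = counts[cell] / n
        weights.append(population_shares[cell] / sample_share)
    return weights


# Hypothetical sample of five respondents in gender x age cells.
sample_cells = [
    ("male", "18-54"),
    ("male", "18-54"),
    ("male", "55+"),
    ("female", "18-54"),
    ("female", "55+"),
]

# Made-up benchmark shares for illustration only (not CPS figures).
population_shares = {
    ("male", "18-54"): 0.3,
    ("male", "55+"): 0.2,
    ("female", "18-54"): 0.3,
    ("female", "55+"): 0.2,
}

weights = poststratify(sample_cells, population_shares)
```

Because every cell needs enough respondents to yield a stable ratio, the number of cells that this approach can support is limited, which is consistent with the preference for raking stated above.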
The weighting variables are constructed as interactions with gender or with the number of household members, as described below. As a result, the distributions of age, race/ethnicity, and education are matched separately for males and females, and the distribution of household income is matched separately by number of household members. Distributions are matched exactly only within the interactions listed: income, for example, is not matched exactly by gender because it is not interacted with gender, and the remaining variables are not matched exactly by number of household members. Smaller surveys at times require collapsing some of these cells for the raking algorithm to converge. A few surveys also weight on additional variables, such as working status or health insurance status, at the request of the researcher commissioning the survey. The standard set of variables whose distributions are matched exactly is:
- Gender x age, with 10 categories: (1) male, 18-32; (2) male, 33-43; (3) male, 44-54; (4) male, 55-64; (5) male, 65+. Categories (6)-(10) are the same as (1)-(5), except that they are for females instead of males.
- Gender x race/ethnicity, with 6 categories: (1) male, non-Hispanic white; (2) male, non-Hispanic African American; (3) male, Hispanic and other; (4) female, non-Hispanic white; (5) female, non-Hispanic African American; (6) female, Hispanic and other.
- Gender x education, with 6 categories: (1) male, high school or less; (2) male, some college or associate's degree; (3) male, bachelor's degree or more; (4) female, high school or less; (5) female, some college or associate's degree; (6) female, bachelor's degree or more.
- Number of household members x (household) income, with 12 categories: (1) household with one individual, <$25,000; (2) household with one individual, $25,000-$49,999; (3) household with one individual, $50,000-$74,999; (4) household with one individual, $75,000+. Categories (5)-(8) are the same as (1)-(4), but for households with two individuals. Categories (9)-(12) are the same as (1)-(4), but for households with more than two individuals.
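The raking step itself can be illustrated with a short sketch of iterative proportional fitting: the weights are rescaled to match one variable's benchmark distribution, then the next, and the cycle repeats until no adjustment changes the weights appreciably. This is a minimal sketch under stated assumptions, not RAND's implementation. The function name `rake`, the convergence tolerance, the two simplified toy variables (`gender` and `income`), and the target shares are all hypothetical; an actual application would use the interacted variables listed above with CPS-derived benchmarks.

```python
def rake(respondents, margins, max_iter=100, tol=1e-8):
    """Rake weights so that each variable's weighted shares match its targets.

    respondents: list of dicts mapping variable name -> category.
    margins: dict mapping variable name -> {category: target share}.
    Returns a list of weights that sums to len(respondents).
    """
    n = len(respondents)
    weights = [1.0] * n

    # Every target category must be observed, otherwise it cannot be matched.
    for var, targets in margins.items():
        observed = {r[var] for r in respondents}
        missing = set(targets) - observed
        if missing:
            raise ValueError(f"no respondents in {var} categories: {missing}")

    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = {cat: 0.0 for cat in targets}
            for r, w in zip(respondents, weights):
                totals[r[var]] += w
            # Rescale so that the weighted totals equal the target totals.
            for i, r in enumerate(respondents):
                factor = targets[r[var]] * n / totals[r[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:
            return weights
    raise RuntimeError("raking did not converge; consider collapsing cells")


def weighted_share(respondents, weights, var, cat):
    """Weighted share of respondents whose variable `var` equals `cat`."""
    total = sum(weights)
    return sum(w for r, w in zip(respondents, weights) if r[var] == cat) / total


# Toy sample with two simplified variables (illustration only).
respondents = [
    {"gender": "male", "income": "<$50,000"},
    {"gender": "male", "income": "<$50,000"},
    {"gender": "male", "income": "<$50,000"},
    {"gender": "male", "income": "$50,000+"},
    {"gender": "female", "income": "<$50,000"},
    {"gender": "female", "income": "$50,000+"},
]

# Made-up target shares (not CPS figures).
margins = {
    "gender": {"male": 0.5, "female": 0.5},
    "income": {"<$50,000": 0.6, "$50,000+": 0.4},
}

weights = rake(respondents, margins)
male_share = weighted_share(respondents, weights, "gender", "male")
low_income_share = weighted_share(respondents, weights, "income", "<$50,000")
```

In this sketch the check for unobserved categories and the error raised on non-convergence correspond to the situations in which cells would need to be collapsed. Only the marginal distribution of each listed variable is matched; a joint distribution is matched only if it is entered as its own interacted variable, as with the gender and household-size interactions above.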