2020 California Neighborhoods Count

Validation of U.S. Census Population Counts and Housing Characteristic Estimates Within California

by Lane F. Burgette, Beverly A. Weidmer, Robert Bozick, Aaron Kofner, Michael Tzen, Jennie E. Brand, Hiram Beltran-Sanchez, Regina A. Shih

Research Questions

  1. How accurate are the 2020 Census population totals for a subset of census blocks across California?
  2. What kinds of adaptations may explain differences between the 2020 Census and CNC-based estimates of population and housing unit totals for small areas (i.e., census blocks)?

The U.S. Constitution mandates that the federal government count all persons living in the United States every ten years. The census is critical to states because its results are used to reapportion seats in the U.S. House of Representatives; guide redistricting; and form the basis for allocating federal funds, such as those used for schools, health services, child care, highways, and emergency services.

In response to long-standing concerns about the accuracy of census data and about a possible undercount, a group of researchers conducted the California Neighborhoods Count (CNC) — the first-ever independent, survey-based enumeration to directly evaluate the accuracy of the U.S. Census Bureau's population totals for a subset of California census blocks.

The CNC was designed to produce estimates that parallel the 2020 Census population and housing unit totals at the census block level. It used the same items as the census, along with enhanced data collection strategies and an exploration of imputation methods. Although the CNC was intended to largely replicate census data collection processes, there were a few methodological differences. For example, much of the address canvassing for the 2020 Census was done in-office, whereas the CNC team undertook a complete in-person address-listing operation that included interviews with residents and door-to-door verification of each structure.

In this report, the researchers detail their methodology and present the enumeration results. They compare the 2020 Census counts with the CNC estimates, describe limitations of their data collection effort, and offer considerations for future data collection.

Key Findings

In general, the CNC population estimates are broadly similar to the 2020 Census numbers, and this study does not offer evidence that invalidates the 2020 Census

  • The CNC estimated around 3 percent fewer individuals (51,812) than the census counted (53,295).
  • Because of low CNC survey response rates and a lack of available additional high-quality administrative data to help with imputation, it was not possible to determine whether subgroups within the blocks surveyed were undercounted or overcounted.

The CNC's in-person address canvassing may have contributed to a higher count of housing units than the census count

  • The CNC identified 23,929 housing units, while the 2020 Census identified 22,668.
  • Despite advancements in geospatial imaging software, as well as many other approaches used by the U.S. Census Bureau to assess coverage and validate addresses, in-field address verification might yield a more complete accounting of inhabited housing units than partially conducting address canvassing with in-office approaches.
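
The size of the two gaps reported above can be checked with simple arithmetic. The sketch below uses only the four totals quoted in these findings; the function name `relative_difference` is an illustrative choice, and the report may have computed its rounded percentages differently.

```python
def relative_difference(estimate: float, reference: float) -> float:
    """Signed difference of `estimate` from `reference`, as a fraction of `reference`."""
    return (estimate - reference) / reference


# Totals quoted in the key findings above.
census_population, cnc_population = 53_295, 51_812
census_housing_units, cnc_housing_units = 22_668, 23_929

# The census count is treated as the reference in both comparisons.
population_gap = relative_difference(cnc_population, census_population)
housing_gap = relative_difference(cnc_housing_units, census_housing_units)

print(f"Population:    {cnc_population - census_population:+,} ({population_gap:+.1%})")
print(f"Housing units: {cnc_housing_units - census_housing_units:+,} ({housing_gap:+.1%})")
```

Relative to the census, the CNC population estimate is lower by 1,483 people (about 2.8 percent), while the CNC housing unit count is higher by 1,261 units (about 5.6 percent).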

Bayesian Improved Surname Geocoding (BISG) is a potent methodological tool for supplementing survey responses that are missing information on race/ethnicity

  • This method formalizes the observation that knowing a person's name provides some information about what that person's self-identified race/ethnicity might be.
  • BISG allowed the researchers to use administrative records that provide names but not race/ethnicity information to improve their imputations of race/ethnicity.
  • Given the relatively high rates of missing data in many blocks, the resulting estimates were sensitive to the statistical model used to impute the missing data.
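
The idea behind BISG can be illustrated with a small Bayes' rule calculation. The sketch below follows one common formulation, in which the probability of each group given a surname is multiplied by the probability of living in a given block given that group, and the result is then normalized. It is not the researchers' implementation: the function name, the group labels, and every probability are hypothetical values chosen only for illustration.

```python
def bisg_posterior(p_group_given_surname: dict, p_block_given_group: dict) -> dict:
    """Combine surname-based and location-based information with Bayes' rule.

    posterior(group) is proportional to
        P(group | surname) * P(block | group)
    """
    unnormalized = {
        group: p_group_given_surname[group] * p_block_given_group[group]
        for group in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("No group has positive probability for this surname and block.")
    return {group: weight / total for group, weight in unnormalized.items()}


# Hypothetical inputs (not from the report).
# Share of people with a given surname who self-identify with each group:
surname_probs = {"group_a": 0.60, "group_b": 0.30, "group_c": 0.10}
# Share of each group's total population that lives in the block of interest:
block_probs = {"group_a": 0.001, "group_b": 0.004, "group_c": 0.001}

posterior = bisg_posterior(surname_probs, block_probs)
for group, probability in posterior.items():
    print(f"{group}: {probability:.3f}")
```

In this made-up example the surname alone favors group_a, but the block's composition shifts the most probable group to group_b. This is the sense in which a name provides some, but not all, of the information used in the imputation.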

Table of Contents

  • Chapter One

    Introduction

  • Chapter Two

    Sampling and Survey Methods

  • Chapter Three

    Response Rates, Imputation, and Adjustment Strategies

  • Chapter Four

    Comparison of 2020 Census Counts with CNC Responses

  • Chapter Five

    Summary and Conclusion

  • Appendix A

    CNC Block Observation Form

  • Appendix B

    CNC Address-Canvassing Form Final Codebook

  • Appendix C

    CNC Survey and Enumeration Form

  • Appendix D

    CNC Short-Form Survey

  • Appendix E

    CNC Study Brochure

  • Appendix F

    Data Elements Used to Impute Missing Data

Research conducted by

The study was sponsored by California Complete Count — Census 2020 and conducted in the Community Health and Environmental Policy Program within RAND Social and Economic Well-Being.

This report is part of the RAND Corporation Research report series. RAND reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND reports undergo rigorous peer review to ensure high standards for research quality and objectivity.

The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.