Data Alert for RAND HRS, Version J (June 2010)

We released Version J of the RAND HRS dataset at the beginning of March 2010. Since then, with the help of other users, we have discovered some inconsistencies in the data. Some of these have been corrected, while others will be addressed in future releases.

The RAND HRS data files, including the codebook, will be updated on the HRS web site in the data download area. However, for Correction #3 (described below), we have created a series of fix files, which can be used to update the affected variables on Version J of the RAND HRS dataset that you may have already downloaded. These fix files are smaller to download, but you will need to merge the changes into your existing file. Sample programs for such a merge are provided in SAS, Stata, and SPSS.

We are sorry for any inconvenience this may have caused. Please let us know if you have any questions by emailing RANDHRSHelp@rand.org.

The issues/corrections addressed here include:

  1. Different sample sizes in the RAND HRS and Tracker 2008 datasets
  2. Attaching value labels to Stata SE version of the RAND HRS
  3. Corrections to “Whether and age when started to receive Social Security” variables

Different sample sizes in the RAND HRS and Tracker 2008 datasets

The RAND HRS dataset has a total sample size of 30,548, while the Tracker 2008 dataset (V1.0, December 2009) has a total sample size of 31,022. The table below provides a summary of the differences:

IN_RANDHRS    IN_TRACKER     Frequency 
---------------------------------------
    No           Yes             475
    No           Yes               1
    Yes           No               2
    Yes          Yes          30,546   

The first row (N = 475) represents cases that are in the Tracker 2008 dataset but not in the RAND HRS dataset. These are people for whom either a core interview was never obtained or an exit interview (i.e., a proxy interview about the deceased) was conducted. Most are probably spouses who never responded rather than exit-interview cases, as only about 75 of the 475 ever have an exit interview.

The second row (N = 1) also represents cases in the Tracker 2008 dataset, but not in the RAND HRS dataset. This particular individual, however, is an “HRS-AHEAD Overlap” case (see the RAND HRS Documentation for an explanation of the RAOVRLAP variable). Specifically, this person (HHIDPN: 204940020) was married to 204940010 in 1992, and the household was turned over to AHEAD. However, 204940020 never responded, and was left on Tracker as an HRS case (020582020). In other words, on the RAND HRS dataset, this person can be identified by their AHEAD ID (204940020), whereas on the Tracker dataset, they are identified by their HRS ID (020582020).

The third row (N = 2) represents cases that are in the RAND HRS dataset but not in the Tracker 2008 dataset. One of the individuals (HHIDPN: 204940020) is described in the paragraph above. Again, this is an “HRS-AHEAD Overlap” case, which appears in the RAND HRS dataset as 204940020, but in the Tracker 2008 dataset as 020582020. The other person (HHIDPN: 22965041) should probably be dropped, based on the following statement in the Tracker 2008 documentation:

“In the course of reviewing data, it was discovered that one line, HHID 022965 and PN 041, was indeed never a qualified respondent. Also is has been determined that the interview obtained from HHID 022965 and PN 040 cannot be verified and should not have been taken. The line for PN 041 has been removed from the tracker file and the wave specific variables have been recoded for PN 040 to reflect non-response.”

The 22965041 case does appear on the RAND-Enhanced Fat File for 2002 and the RAND HRS dataset, and will thus be removed in future releases. Moreover, according to HRS, the spouse’s (HHIDPN: 22965040) interview for 2008 should never have been taken. This case does appear on the RAND-Enhanced Fat File for 2008 and the RAND HRS dataset, and will thus be removed in future releases.
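
A comparison like the one summarized in the table above could be sketched in Stata roughly as follows. This is only an illustration on toy data, not the program used to produce the table: the IDs are invented, the indicator names in_randhrs and in_tracker are hypothetical, and it assumes the Tracker file identifies people by separate character HHID and PN fields (as in the quotation above) that must be combined into a numeric hhidpn key. It also assumes Stata 11 or later for the merge 1:1 syntax.

```stata
* Toy stand-in for the RAND HRS file (invented IDs, not real HRS cases)
clear
input long hhidpn
10001010
10001020
10002010
end
generate byte in_randhrs = 1
tempfile rand
save `rand'

* Toy stand-in for the Tracker file, keyed by character HHID and PN
clear
input str6 hhid str3 pn
"010001" "010"
"010002" "010"
"010003" "010"
"010003" "020"
end
* Build a numeric key comparable to hhidpn in the RAND HRS file
generate long hhidpn = real(hhid + pn)
generate byte in_tracker = 1

* Match on the person identifier, keeping unmatched cases from both files
merge 1:1 hhidpn using `rand'
replace in_randhrs = 0 if missing(in_randhrs)
replace in_tracker = 0 if missing(in_tracker)

* Cross-tabulate membership in the two files
tabulate in_randhrs in_tracker
```

Note that a comparison by ID alone would count an overlap case such as the one described above twice, once under each of its IDs, which is why that case appears in both the second and third rows of the table.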

Attaching value labels to Stata SE version of the RAND HRS

The value labels (e.g., for RAGENDER, 1 = “1. Male”, 2 = “2. Female”) were inadvertently left off the Stata SE version of the RAND HRS dataset. Users who downloaded this version of the dataset should go to the HRS data download page and download the updated version of the Stata SE zip package (randJstataSE.zip).

Alternatively, the labels can be added using Stata commands such as the following (shown here for RAGENDER):

label define gender 1 "1. Male" 2 "2. Female"
label values ragender gender

Then re-save your file.
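
To confirm that a value label is actually attached after running commands like those above, a check along the following lines may help. It is a minimal sketch on toy data; the three-row dataset is invented, and the label name gender simply follows the example above.

```stata
* Toy data standing in for the RAND HRS file (invented values)
clear
input byte ragender
1
2
2
end

* Define and attach the value label, as in the commands above
label define gender 1 "1. Male" 2 "2. Female"
label values ragender gender

* Look up which value label, if any, is attached to the variable
local attached : value label ragender
display "ragender uses value label: `attached'"

* The tabulation should now show the labelled categories
tabulate ragender
```

If the local macro comes back empty, no value label is attached to the variable.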

Corrections to “Whether and age when started to receive Social Security” variables

When updating the following variables, we inadvertently failed to incorporate the relevant data from HRS 2008:

Respondent variables:

RASSRECV whether Respondent receives Social Security
RASSAGEM age in months when Respondent first received SS income
RASSAGEB age in years when Respondent first received SS income

Spouse variables:

SASSRECV whether Spouse receives Social Security
SASSAGEM age in months when Spouse first received SS income
SASSAGEB age in years when Spouse first received SS income

The problem affected RASSRECV for 918 cases, RASSAGEM and RASSAGEB for 934 cases, SASSRECV for 723 cases, and SASSAGEM and SASSAGEB for 730 cases.

The following tables list the relevant descriptive statistics for the affected variables both before and after the corrections:

Before Corrections:

Variable       N        Mean     Std Dev     Minimum     Maximum
-----------------------------------------------------------------
RASSAGEM   13218     733.276      72.338     109.000    1081.000
SASSAGEM    9406     738.837      65.041     229.000    1081.000
RASSAGEB   13218      61.120       6.028       9.100      90.100
SASSAGEB    9406      61.584       5.420      19.100      90.100
RASSRECV   30548       0.705       0.456       0.000       1.000
SASSRECV   22841       0.631       0.482       0.000       1.000

RASSRECV      Frequency
-----------------------
0.no               9023
1.yes             21525

SASSRECV      Frequency
-----------------------
.U=Unmar           7707
0.no               8424
1.yes             14417

After Corrections:

Variable       N        Mean     Std Dev     Minimum     Maximum
-----------------------------------------------------------------
RASSAGEM   13923     734.103      71.844     109.000    1081.000
SASSAGEM    9874     739.418      64.646     229.000    1081.000
RASSAGEB   13923      61.189       5.987       9.100      90.100
SASSAGEB    9874      61.632       5.387      19.100      90.100
RASSRECV   30548       0.735       0.442       0.000       1.000
SASSRECV   22841       0.663       0.473       0.000       1.000

RASSRECV      Frequency
-----------------------
0.no               8105
1.yes             22443

SASSRECV      Frequency
-----------------------
.U=Unmar           7707
0.no               7701
1.yes             15140

To update these variables, users may choose to either download the newest version of the RAND HRS dataset from the HRS data download page, or use the fix files we have provided, which are described below. If you re-download the entire file, you do not need to use the fix files.
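
One way to tell whether a given copy of the file already contains the corrections is to reproduce a few of the statistics above and compare them with the before and after tables. The sketch below runs on a small invented dataset so that it is self-contained; with the real data, the toy block would be replaced by loading your own file (for example, use rndhrs_j, clear, assuming that is the name and location of your copy).

```stata
* Toy stand-in for the file in memory (invented values).
* With the real data, replace this block with: use rndhrs_j, clear
clear
input byte rassrecv float rassagem
1 744
1 780
0 .
0 .
1 .
end

* Descriptive statistics to compare with the before/after tables
summarize rassagem
tabulate rassrecv, missing

* Count of "1.yes" cases; the tables above give 21525 before and 22443 after
count if rassrecv == 1
display "RASSRECV = 1 for " r(N) " cases"
```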

Fix files for download

These data are encrypted. To decrypt them, use the passphrase provided in the “unlock_cd.txt” file found on the HRS data download page under RAND Contributions. Note that you will need WinZip V10 or higher to unzip the file. You can download WinZip from www.winzip.com.

There are separate zip files for SAS, SPSS, and Stata SE or Intercooled, called rndfix_j1SAS.zip, rndfix_j1SPSS.zip, and rndfix_j1STATASE.zip or rndfix_j1STATAI.zip, respectively. Each zip file includes:

  • this document (Data Alert VerJ June2010.doc)
  • means and frequencies showing values before and after the corrections (rndfix_j1tables.txt)
  • a sample program for updating your existing rndhrs_j file. You may need to adjust the file locations in the code to match your system. The program saves your current rndhrs_j file as rndhrs_j1, then updates it with the corrected variables.
  • a data file with the corrected variables. These files are encrypted.
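
The sample programs in the zip packages are the ones to use for the actual update. Purely as an illustration of the kind of merge involved, a Stata version might look something like the sketch below. Everything in it is an assumption made for the example: the data are invented, only two of the six affected variables are shown, the output name rndhrs_j1_toy is hypothetical, the files are assumed to match one-to-one on hhidpn, and the merge 1:1 syntax requires Stata 11 or later.

```stata
* Toy copy of the existing file (invented values, two variables only)
clear
input long hhidpn byte rassrecv float rassagem
10001010 1 744
10001020 0 .
10002010 0 .
end
tempfile current
save `current'

* Toy fix file holding corrected values (invented)
clear
input long hhidpn byte rassrecv float rassagem
10001020 1 750
10002010 0 .
end
tempfile fix
save `fix'

* Overwrite the affected variables with values from the fix file
use `current', clear
merge 1:1 hhidpn using `fix', update replace
tabulate _merge
drop _merge

* Save under a new name so the original file stays untouched
save rndhrs_j1_toy, replace
```

With the update replace options, a missing value in the fix file does not overwrite a non-missing value in the existing file. If a correction ever needed to set a value to missing, a different approach would be required, such as dropping the affected variables before merging in a fix file that covers every case.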

The following lists the details for each zip file:
