IFLS Help Data Usage Notes, Tips and FAQs
IFLS Data Updates
For a listing of post-public release updates to the IFLS databases, see the IFLS1, IFLS1-RR, IFLS2, IFLS3 and IFLS4 Updates sections below.
Utilities
For tips, utilities and answers to frequently asked questions about using the IFLS databases, see the General Tips, Stata Utilities and SAS Utilities sections below.
If you have suggestions for data updates or utilities, please send email to ifls-supp@rand.org.
Return to IFLS page.
GENERAL TIPS and answers to FAQs
INDEX
- Tracking HHs and Individuals
- Using PIDLINK: Linking individuals across the waves
- Special codes
- Meaning of variables that end with X
- Identifying geographic location of HH
- Linking HH and community data
- Can I visit IFLS respondents, facilities or communities?
- Are there price indices for IFLS1 and IFLS2?
TRACKING HHs and INDIVIDUALS
HTRACK AND PTRACK
The TRACKING FILES, HTRACK and PTRACK, are provided to facilitate using the longitudinal dimensions of the survey. All variables included in these files are drawn from interviews conducted in IFLS1 (1993) and IFLS2 (1997/8).
HTRACK: HOUSEHOLD TRACKING FILE
HTRACK contains a list of all HOUSEHOLDS that were interviewed in IFLS1; they are identified by the 1993 HOUSEHOLD identifier, HHID93. It is a 7 digit string variable. The first 3 digits are the enumeration area in which the household resided in 1993 and the next two digits are a household sequence number within that enumeration area which uniquely identifies the household. The last two digits are always '00'. (The first 5 digits of HHID93 are the same as the last 5 digits of CASE, the HH identifier variable in the original IFLS1 release.)
In IFLS2, households are identified by HHID97. If an IFLS household was found intact in 1997, it was assigned the same identifier in 1993 and 1997; in this case HHID93=HHID97. If the household had split up between 1993 and 1997, then when the first respondent from that household was re-contacted in IFLS2, that respondent's household was designated the 'original' household and the 1993 household identifier was assigned to it. Each additional household that was spawned by that HHID93 was given a new HHID97. The first 5 digits of the new HHID97 are identical to the first 5 digits of HHID93 (and, therefore, all new households in 1997 that are spawned by one 1993 household share the same first 5 digits in their HHIDs). The last two digits of HHID97 are 1 (in column 6) and then a sequence number starting at 1 (in the 7th column); these digits tell us this is a split-off household. Thus the last two digits of HHID97 are '00' for the first household found in 1997, '11' for the first split-off, '12' for the second split-off and so on.
For example, say HHID93 is 2071900. This is household 19 from enumeration area 207. The household split into 3 households between 1993 and 1997. The three households are assigned HHID97 2071900 (for the first HH relocated), 2071911 (for the first split-off HH that was relocated) and 2071912 (for the second split-off HH that was relocated).
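The identifier layout described above can be unpacked mechanically. The following is a minimal Python sketch and is not part of the IFLS utilities: the function name `parse_hhid` and the returned field names are illustrative assumptions, and the sketch assumes the identifier is held as a 7-character string.

```python
def parse_hhid(hhid):
    """Split a 7-character IFLS household ID into the parts described above."""
    hhid = str(hhid)
    if len(hhid) != 7 or not hhid.isdigit():
        raise ValueError(f"expected a 7-digit string, got {hhid!r}")
    suffix = hhid[5:7]
    return {
        "ea": hhid[0:3],                    # 1993 enumeration area
        "hh_seq": hhid[3:5],                # household sequence number within the EA
        "suffix": suffix,                   # '00' = original HH; '11', '12', ... = split-offs
        "is_split_off": suffix != "00",
        "origin_hhid93": hhid[0:5] + "00",  # the 1993 household this one descends from
    }


# The three 1997 households from the example above.
for hhid97 in ["2071900", "2071911", "2071912"]:
    print(hhid97, parse_hhid(hhid97))
```

Applied to HHID97, `origin_hhid93` gives the 1993 household that a split-off household came from. The enumeration area digits describe the 1993 household only and say nothing about where a household lived in 1997.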
HH RECONTACT
There were 7,730 households in the target sample for IFLS1. Of those, interviews were completed with 7,224 households. These households are included in HTRACK. For information on the 506 households that were listed but never interviewed, see Book K in IFLS1.
IFLS2 sought to re-interview all 7,224 IFLS1 households. Around 6% of the target households were not interviewed. The results of our attempts to re-interview all households are summarised in RESULT97.
In addition to the approximately 6,750 'ORIGIN' households that were interviewed in IFLS1 and IFLS2, over 850 'SPLIT OFF' households were interviewed in IFLS2. These are households in which a TARGET respondent who had moved out of an IFLS1 household was interviewed. There are slightly over 7,600 households in IFLS2 that completed a household roster. These households, in combination with the households that were not found in IFLS2, make up the 8,116 households in HTRACK.
In 1997, we discovered 9 of the IFLS1 households had combined with another IFLS1 household. The original household members were interviewed in the new household.
In a small number of households, it was determined that all the members of the household had died by 1997. These were typically one or two member households in 1993 and the members in the 1993 household were typically relatively old. There were, however, a small number of households in which 1993 household members were still alive in 1997 but the household was treated as if all members had died. These cases arose because the TARGET individuals in the household had died by 1997 and the interviewers mistakenly thought they did not need to track the remaining members who had moved away.
PTRACK: PERSON TRACKING FILE
PTRACK is a person-level file that tracks all IFLS respondents across waves of the survey. PID93 is a two digit sequence number identifying each individual within a household. The combination of HHID93 and PID93 uniquely identifies every respondent in IFLS1. It may be used to link records within IFLS1. If an IFLS1 respondent was found in the original household, HHID97=HHID93 and PID97=PID93. All new respondents in IFLS2 are assigned PID97 that begins after the highest PID93 for that household. In split-off households, PID97 was assigned starting at 01 for the household head. The combination of HHID97 and PID97 uniquely identifies every respondent in IFLS2. It may be used to link records within IFLS2.
HHID93 and PID93 or HHID97 and PID97 should NOT be used to link respondents across waves of IFLS.
PIDLINK: LINKING RESPONDENTS ACROSS WAVES
Several individuals have moved across households between IFLS1 and IFLS2. In order to link records for a particular individual across waves of the IFLS, use PIDLINK. It is a unique person-level identifier which is the same in IFLS1 and IFLS2 for a particular individual. PIDLINK is a string variable comprising 9 digits. For a respondent in IFLS1 and IFLS2, PIDLINK is made up of the first 5 digits of HHID93, followed by 00 (denoting an original household member) and then the two-digit PID93, the person identifier in IFLS1. Because the last two digits of HHID93 are always '00', this is the same as HHID93 followed by PID93. If the respondent has moved from his or her original household, PIDLINK will retain the information necessary to identify the original household. For a new respondent in IFLS2, PIDLINK is made up of HHID97 and PID97.
PTRACK contains one record for every respondent. Some respondents were interviewed in both IFLS1 and IFLS2, some were interviewed only in IFLS1 and some were interviewed only in IFLS2. Note that in BK_AR1, a respondent may appear more than once in a roster since all 1993 household members are listed in the 'ORIGIN' roster. A respondent who has moved out will be designated thus (AR01A_97=3). If that respondent has been found in a new household, then AR01A_97 will equal 4. Since this respondent is found in 2 different households in 1993 and 1997, HHID93 and HHID97 are different. PID93 and PID97 will also be different in general. PIDLINK, however, remains constant.
Continuing the example of HHID93=2071900, there were 5 members in the household in 1993. In 1997, persons 1, 2 and 4 were still there but persons 3 and 5 had moved out. Person 3 was found in HHID97=2071911 and person 5 was found in HHID97=2071912.
The PTRACK records for this household are as follows:
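| PIDLINK | HHID93 | PID93 | HHID97 | PID97 | Notes |
|---|---|---|---|---|---|
| 207190001 | 2071900 | 1 | 2071900 | 1 | Original HH |
| 207190002 | 2071900 | 2 | 2071900 | 2 | Original HH |
| 207190003 | 2071900 | 3 | 2071911 | 3 | Original HH member; split off and found elsewhere |
| 207190004 | 2071900 | 4 | 2071900 | 4 | Original HH |
| 207190005 | 2071900 | 5 | 2071912 | 2 | Original HH member; split off and found elsewhere |
| 207191101 | | | 2071911 | 1 | First split off (207190003 is in this HH) |
| 207191102 | | | 2071911 | 2 | First split off |
| 207191104 | | | 2071911 | 4 | First split off |
| 207191201 | | | 2071912 | 1 | Second split off (207190005 is in this HH) |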
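In the example records, each PIDLINK is the 7-digit household identifier followed by the person number padded to two digits: HHID93 and PID93 for people who were in the 1993 household, HHID97 and PID97 for people first seen in 1997. The Python sketch below reproduces the example PIDLINK values under that reading. The helper names `make_pidlink` and `pidlink_for` are hypothetical, and in practice you should use the PIDLINK variable supplied in PTRACK rather than rebuild it.

```python
def make_pidlink(hhid, pid):
    """Join a 7-character household ID and a person number into a 9-character ID."""
    hhid = str(hhid)
    if len(hhid) != 7:
        raise ValueError(f"expected a 7-character household ID, got {hhid!r}")
    return hhid + f"{int(pid):02d}"


def pidlink_for(hhid93, pid93, hhid97, pid97):
    """Use the 1993 identifiers when they exist, otherwise the 1997 identifiers."""
    if hhid93 is not None:
        return make_pidlink(hhid93, pid93)
    return make_pidlink(hhid97, pid97)


# Rows copied from the example: (HHID93, PID93, HHID97, PID97).
# None marks a respondent who was not in the 1993 household.
rows = [
    ("2071900", 1, "2071900", 1),
    ("2071900", 2, "2071900", 2),
    ("2071900", 3, "2071911", 3),
    ("2071900", 4, "2071900", 4),
    ("2071900", 5, "2071912", 2),
    (None, None, "2071911", 1),
    (None, None, "2071911", 2),
    (None, None, "2071911", 4),
    (None, None, "2071912", 1),
]

pidlinks = [pidlink_for(*row) for row in rows]
print(pidlinks)
```

Persons 3 and 5 keep a PIDLINK built from their 1993 identifiers even though HHID97 and PID97 have changed, which is why PIDLINK, and not the household and person identifiers, is the key to use across waves.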
SPECIAL CODES
The following values are reserved and have a special meaning:
| Numeric | Alphanumeric | Meaning |
|---|---|---|
| 5 | V | Top coded/out of range |
| 6 | W | Not applicable |
| 7 | X | Refused to answer |
| 8 | Y | Don't know |
| 9 | Z | Missing |
Numeric special values are preceded by as many 9s as necessary to fill the field and yield an unambiguous value. For example if a field is 4 digits wide, 9998 indicates the respondent did not know the answer.
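The padding rule can be expressed as a small check. This is a Python sketch under the rule as stated above, not an official IFLS utility: the function name `special_meaning` is an assumption, and the field width must be taken from the codebook for each variable.

```python
SPECIAL_MEANINGS = {
    "5": "Top coded/out of range",
    "6": "Not applicable",
    "7": "Refused to answer",
    "8": "Don't know",
    "9": "Missing",
}


def special_meaning(value, width):
    """Return the special-code meaning of a numeric value in a field that is
    `width` digits wide, or None if the value is not a special code."""
    text = str(int(value))
    if len(text) != width:
        return None                      # does not fill the field
    if text[:-1] != "9" * (width - 1):
        return None                      # leading digits are not all 9s
    return SPECIAL_MEANINGS.get(text[-1])


print(special_meaning(9998, 4))   # the 4-digit example from the text
print(special_meaning(1998, 4))   # an ordinary value
```

Whether a particular variable actually uses these codes should be confirmed in the codebook before recoding.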
"X" VARIABLE CONVENTION
Since special values that are embedded in continuous variables are tedious to deal with, in many cases a continuous variable, VAR, say, is accompanied by another variable, VARX, which contains the special codes. If a valid value of VAR is recorded, VARX is set to 1. The other values of VARX provide information about why there is not a valid value.
In some cases, VARX contains information about the unit VAR is recorded in. This is common, for example, when dealing with distances, times, frequencies and so on.
In many cases, VARX does not appear in the questionnaire because the question was not asked of the respondent in this way. The variables have been created ex post to assist users with the data. In general, if you are interested in VAR you should always check to see if there is an associated VARX and use the variables in combination.
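As a sketch of using the two variables in combination, the Python fragment below keeps VAR only where the companion VARX equals 1. The function name `usable_value`, the sample records and the non-1 VARX codes are all made up for illustration. The sketch applies only to X variables that carry special codes, not to those that record units.

```python
def usable_value(var, varx):
    """Return VAR only when its companion VARX flags a valid value (VARX == 1)."""
    return var if varx == 1 else None


# Hypothetical records: the amounts and the non-1 VARX codes are illustrative only.
records = [
    {"var": 1200, "varx": 1},
    {"var": None, "varx": 8},
    {"var": 450, "varx": 1},
    {"var": None, "varx": 7},
]

valid = [usable_value(r["var"], r["varx"]) for r in records]
valid = [v for v in valid if v is not None]

print(len(valid), "valid values; mean =", sum(valid) / len(valid))
```

Filtering on VARX first means the special codes never enter a mean or a regression as if they were real values.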
VARIABLE "VERSION" indicates DATASET VERSION
The variable VERSION identifies the release version of these data; it will be updated with each revision of the data and can be used to confirm that you are using the most recent version of the data. If you send questions to ifls-supp, please tell us the data set version that you are using.
In SAS:
    data _null_;
      set lib.bk_cov;
      file 'version.not';
      if _n_=1 then put version;
In STATA:
    use bk_cov
    list version in 1/1
LOCATION OF HOUSEHOLD
The enumeration area in 1993 (digits 1-3) in the HHID is not the location of residence of the household in 1997 (unless the household has not moved) and should not be used to determine geographic location of the household. 1993 location was built into the 1993 HH identifier, CASE. It is not built into HHID93 or HHID97.
Location information is recorded in module BK_SC in each wave of the survey. A summary is included in HTRACK. Location in 1993 is recorded in SC01_93 through SC05_93; the 1993 location codes are based on the 1993 BPS codes. Some of these codes have been changed by BPS (because the community boundaries have been redefined, for example). The 1998 BPS codes for the location of each of our respondents are recorded in the revised kabupaten code, SC02_93R, and the revised kecamatan code, SC03_93R. (There are no revised province codes.) The 1997 location of the respondents is recorded in SC01_97 through SC05_97. These locations use the 1998 BPS codes and so may be directly compared with the revised 1993 codes, SC02_93R and SC03_93R.
MOVER97 is intended to summarise the location of the respondent in 1997, relative to the location in 1993. It is defined only for those respondents interviewed in both waves of the survey.
LINKING HH AND COMMUNITY LEVEL DATA
COMMID is the variable that should be used to link household survey data with the community and facility data. COMMID93 identifies the community of residence of the respondent in 1993. COMMID97 is the 1997 community of residence. COMMID93 will be the same as COMMID97 if the respondent has not moved between the waves of the survey.
CAN I VISIT RESPONDENTS, FACILITIES OR COMMUNITIES?
No, the names, addresses, locations and neighborhoods of all IFLS respondents and facilities are strictly confidential. When respondents participate in the survey, they are given an assurance that their answers are confidential and that their identity will not be revealed to anyone other than through an anonymous code.
The IFLS data are placed in the public domain to support research analyses. As a user of the IFLS public use files, you are expected to respect the anonymity of all our respondents. This means that you will make no attempt to identify any individual, household, family, service provider or community other than in terms of the anonymous codes used in the IFLS.
We take protection of the confidentiality of our respondents very seriously. However, we recognise that for some research questions, it may be necessary to know more about a respondent, facility or neighborhood than is available in the public use files. In such an instance, please send email to ifls-supp@rand.org briefly explaining what research question you plan to address, why you need the identifying information and what you will do with that information. If your request does not violate our Human Subjects Protection rules, we will describe the process that you have to go through in order to obtain permission for the information to be released to you.
ARE THERE PRICE INDICES FOR IFLS1 AND IFLS2?
General CPI by province, 1993 to 1997 (base year 1986=100).
The source is the Central Bureau of Statistics (BPS) in Indonesia. Contact BPS (www.bps.go.id) for other indices that are available.
Prices are collected in the province capitals only. There are 22 cities included in the series from 1993 to 1997.
Thomas, Frankenberg, Beegle and Teruel discuss some of the problems associated with the BPS prices and, in particular, the fact that they are only available for urban areas.
CPI (1986=100) by province code (provcode) and year:

| provcode | 1993 | 1994 | 1995 | 1996 | 1997 |
|---|---|---|---|---|---|
| 11 | 147.1 | 161.5 | 176 | 192.1 | 204.9 |
| 12 | 156.1 | 171.3 | 185.4 | 198.9 | 216.2 |
| 13 | 141.6 | 154.8 | 168.3 | 182.4 | 195.7 |
| 14 | 148.2 | 162.7 | 179.7 | 192.3 | 200.6 |
| 15 | 158.3 | 172.7 | 187.5 | 202.1 | 212.2 |
| 16 | 143.6 | 156.1 | 170.2 | 184.5 | 195.8 |
| 17 | 139.2 | 153.6 | 169.7 | 180.2 | 189.6 |
| 18 | 152.8 | 165.7 | 179.8 | 196.2 | 208.2 |
| 31 | 155.7 | 171.7 | 189.8 | 207.9 | 223 |
| 32 | 149.4 | 164 | 179.3 | 190.7 | 203.2 |
| 33 | 150.8 | 165 | 175.7 | 190.6 | 198.9 |
| 34 | 152.5 | 167.7 | 182.1 | 199.6 | 205.7 |
| 35 | 157.7 | 173.7 | 188.1 | 204.4 | 218.1 |
| 51 | 169.1 | 185.9 | 199.7 | 211.2 | 217.8 |
| 52 | 160.3 | 175 | 191 | 208 | 221.1 |
| 53 | 147.5 | 161 | 171.5 | 183.3 | 196.7 |
| 54 | 156.4 | 165.6 | 177.8 | 191.4 | 202.1 |
| 61 | 158.5 | 171.8 | 187.6 | 202.4 | 214 |
| 62 | 152.3 | 164.6 | 174.3 | 190.3 | 201.2 |
| 63 | 167.2 | 180.6 | 196.5 | 213.8 | 220.7 |
| 64 | 165.1 | 178.1 | 193.7 | 212.1 | 220.6 |
| 71 | 144.6 | 159.2 | 175.1 | 197.4 | 205.2 |
| 72 | 146 | 159.3 | 171.9 | 186.8 | 198.6 |
| 73 | 142.3 | 155.9 | 169 | 184.4 | 192.8 |
| 74 | 157.3 | 175.3 | 191.9 | 205.8 | 216.4 |
| 81 | 204.2 | 220.9 | 236.9 | 257.2 | 272.9 |
| 82 | 150.3 | 163 | 180.6 | 193.2 | 206.3 |
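One common use of such an index is to express amounts reported in one year in the prices of another. The Python sketch below shows the arithmetic for a few of the values listed above. The function name `to_base_year_prices` and the choice of 1993 as the comparison year are assumptions for illustration, and the caveat about urban-only prices still applies.

```python
# CPI (1986=100) for two province codes, copied from the figures above.
CPI = {
    (31, 1993): 155.7, (31, 1997): 223.0,
    (12, 1993): 156.1, (12, 1997): 216.2,
}


def to_base_year_prices(amount, provcode, year, base_year=1993):
    """Express a nominal amount observed in `year` in `base_year` prices,
    using the CPI for the same province in both years."""
    return amount * CPI[(provcode, base_year)] / CPI[(provcode, year)]


# A nominal amount of 100,000 in province code 31 in 1997, in 1993 prices.
real_1993 = to_base_year_prices(100000, 31, 1997)
print(round(real_1993, 1))
```

The ratio of the two index values is all that matters here, so the 1986 base year drops out of the calculation.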
STATA UTILITIES
SAS UTILITIES
- Importing IFLS2 EXPORT files in SAS6
- Using FORMAT libraries provided with IFLS2
- Reading ASCII (RAW) data files into SAS
IFLS2 EXPORT FILES
SAS export files in IFLS2 are grouped in modules. The files were created using PROC COPY. The following program created the HH level data files. An example showing how to read the export files is provided at the end of the code.
    *-------------------------------------------------------*;
    * EXPORT sas datasets for IFLS2 HH
    *-------------------------------------------------------*;
    libname lib v612 "LOCATION OF LIBRARY OF SAS DATASETS";
    libname library v612 "LOCATION OF FORMATS FOR SAS DATASETS";
    libname fmt xport "hh97fmt.xpt";
    libname bk xport "hh97bk.xpt";
    libname b1 xport "hh97b1.xpt";
    libname b2 xport "hh97b2.xpt";
    libname b3 xport "hh97b3.xpt";
    libname b4 xport "hh97b4.xpt";
    libname b5 xport "hh97b5.xpt";

    * convert formats into common structure so can be exported;
    proc format library=library cntlout=lib.hhfmts;
    proc copy in=lib out=fmt;
      select hhfmts;

    * copy each book of modules into a single export file;
    proc copy in=lib out=bk;
      select htrack ptrack bk_cov bk_sc bk_ar0 bk_ar1 bk_krk ;
    proc copy in=lib out=b1;
      select b1_cov b1_ks0 b1_ks1 b1_ks2 b1_ks3 b1_ks4 b1_pp1 ;
    proc copy in=lib out=b2;
      select b2_cov b2_kr b2_ut1 b2_ut2 b2_nt1 b2_nt2 b2_hr1 b2_hr2 b2_hi b2_ge ;
    proc copy in=lib out=b3;
      select b3a_cov b3a_dl1 b3a_dl2 b3a_dl3 b3a_dl4 b3a_dlr1 b3a_dlr2
             b3a_hr0 b3a_hr1 b3a_hr2 b3a_hi b3a_kw1 b3a_kw2 b3a_kw3
             b3a_pk1 b3a_pk2 b3a_pk3 b3a_br
             b3b_cov b3b_km b3b_kk b3b_ak b3b_ma1 b3b_ma2 b3b_ps
             b3b_rj1 b3b_rj2 b3b_rn1 b3b_rn2 b3b_pm1 b3b_pm2 b3b_pm3
             b3b_ba0 b3b_ba1 b3b_ba2 b3b_ba3 b3b_ba4 b3b_ba5 b3b_ba6
             b3p_cov b3p_kw1 b3p_dl1 b3p_dl3 b3p_dl4 b3p_pm1 b3p_pm2
             b3p_km b3p_kk b3p_ma b3p_rj1 b3p_rj2 b3p_rn1 b3p_rn2 b3p_br
             b3p_ch0 b3p_ch1 b3p_cx
             b3p_ba0 b3p_ba1 b3p_ba2 b3p_ba3 b3p_ba4 b3p_ba5 b3p_ba6 ;
    proc copy in=lib out=b4;
      select b4_cov b4_kw1 b4_kw2 b4_br b4_ba6 b4_bx6 b4_bf
             b4_ch0 b4_ch1 b4_cx1 b4_cx2 ;
    proc copy in=lib out=b5;
      select b5_cov b5_dla1 b5_dla2 b5_dla3 b5_maa0 b5_maa1 b5_psa
             b5_rja0 b5_rja1 b5_rja2 b5_rja3 b5_rna1 b5_rna2 ;

    * to import use, for example:;
    proc copy in=bk out=lib;
    * this will select all files from module bk and place the;
    * sas datasets in sas library given by ddname=lib;
The export file containing all SAS datasets was created in the same way without the SELECT statement. You may use the SELECT statement when you import the data sets. See PROC COPY in the SAS manual.
SAS FORMAT LIBRARIES
The example above includes code to convert the FORMAT LIBRARY into a structure that allows it to be exported. The FORMAT library stores all "value labels" (or format assignments). If you want to use those value labels, you should make them accessible to SAS by using a LIBNAME statement to point the libref LIBRARY to the directory in which they are stored on your computer system.
libname LIBRARY "your directory";
If you do not want to use the formats, you may override them in several ways.
Using options nofmterr statement and not referencing the FORMAT library
The statement
    *disable value label error messages;
    options nofmterr;
in your program will disable value label error messages. Make sure that you do not have a LIBNAME statement pointing to the format library. SAS will fail to find the format or value label assigned to a variable and, because the nofmterr option is set, SAS will move on to the next statement in your program.
Unformatting all variables in a dataset
In the dataset, bk_ar1, for example, you may unformat all variables with the following code:
    *unformat all variables in dataset;
    data bk_ar1;
      set bk_ar1;
      format _all_ ;
The keyword _all_ refers to all variables in the dataset; giving no format after it tells SAS to revert to the default (null) format for all the variables. See the FORMAT statement in the SAS manual.
READING ASCII (RAW) DATA FILES
A SAS program to read the IFLS2 HH data files is stored with the zip file containing the data. The program is also available here.
If you wish to make permanent SAS datasets, you will need to set up a LIBNAME statement and give the datasets that are created a two-level name.
For example:
    libname PERMDATA "your_directory";
    data PERMDATA.htrack;
      infile intrk pad lrecl=141;
      etc.
In this case, you will need to make a permanent FORMAT library which you access when you load the data. To do this, set up a LIBNAME for the format LIBRARY:
libname LIBRARY "your directory";
and amend the PROC FORMAT statement at the top of the read file:
PROC FORMAT LIBRARY=LIBRARY
When you load the data, ensure the libname LIBRARY is at the top of your program. This will ensure the format library is accessible to the data.
If you do not want to have variables formatted (or assigned value labels), delete (or comment out) the format assignment statements in the read program at the end of the input statement for each dataset.
For example:
    /* COMMENT THESE VALUE LABEL ASSIGNMENTS OUT...
    format RESULT93 RES_DONE.;
    format RES93BK RES_DONE.;
    format RES93B1 RES_DONE.;
    ...
    format MOVER97 MOVER.;
    */
See also the help page on using FORMAT libraries with IFLS2.
IFLS1 Updates
All updates to IFLS1 have been applied to IFLS1-RR. You are encouraged to use those data. For a list of updates applied to IFLS1, see the fixes files provided with the original release and the updates listed in the FLS Newsletters.
IFLS1-RR Updates
No updates have been made since the data were made public.
Expenditure computations
For those interested in how the expenditure variables in the IFLS1-RR subfile "expend2" were generated, the programs that created those variables are available. They are SAS programs and are not supported by RAND. As noted in the IFLS1-RR documentation, the "expend2" file was created by another project and was given to us to share with other users.
Download SAS expenditure programs.
IFLS2 Updates
We encourage you to use the most recent version of the public use data. The current version of the HH data is 28MAY2000.
You can tell which version you are using by printing an observation of the variable VERSION which is contained in all public use files. If you are not using the most recent version, we encourage you to update your data files. You will need to re-register to return to the download page. When you re-register, you need only give us your name and email address. (Only one entry will be stored in the user database.)
IFLS3 Updates
The CFS-mini files were updated on Oct 7, 2005. The id variable MKID00 was added to them.
In March 2008, the file B3A_TK3 was updated to correct a problem with TK31AA, TH41A, TK32B, and TK42B. For those who changed jobs, the codes for those items had not been properly merged on—all jobs across the years for a respondent had the same industry and occupation codes. This has now been corrected and the data shows industry/occupation changes over time as reported in B3A_TK3.
On Sept 1, 2009, the HH Book 1 files were updated to correct for a problem with missing HHID00 values. The problem occurred in an update of the data done after 2005 so earlier versions of the Book 1 files would not have this particular problem.
When the HH Book 1 files were updated, a few other identifier issues were updated in the IFLS3 household data. This 2009 updated version of the IFLS3 household data also includes the HHID and PIDLINK changes based on the 2007 field work. These were applied to both the 2007 survey and the 2000 survey. Therefore some users might notice changes to a few records if they re-download the 2000 data and compare it to earlier versions. In particular, for ptrack, we dropped a number of duplicate pidlinks (we dropped one record from each pair) and consolidated the responses to the subsequent questions.
IFLS4 Updates
The B3A_TK2 data file was revised on June 30, 2009 to correct for a problem with the occupation codes. The old occ2007 has been replaced with occ07tk2 (code for job in TK20a) and occ07tk3 (code for job in TK20b). The codebook for book B3a data was revised as well.
The B5_RJA2 data file was revised on July 2, 2009 to correct for a problem with identifiers. The HHID07 and PID07 variables had been accidentally omitted and were added when the file was updated.
The IFLS4 data was updated on Sept 25, 2009 to correct some identifier variable problems uncovered by users. Some of the files affected were BK_AR1, B3A_MG1, B3A_KW1, and B3A_KW3. In addition, a problem of a few missing records in B2_COV was corrected. Users may wish to re-download the full set of IFLS4 data because of the corrections to PIDLINK.
The B3A_PK2 file was updated on Oct 18, 2009 to correct for a problem with the variable PK18. PK18, which was previously blank, now has values for everyone who answered PK18.
The following variables in the files listed below were updated on Oct 24, 2009 to correct a problem where values were inadvertently missing:
| File | Variable |
|---|---|
| b3a_pk2 | pk18 |
| b3a_mg2 | mg36 |
| b3b_ak1 | ak05 |
| bp_ak1 | ak05 |
| b1_cr1 | cr15 |
| b1_cr2 | cr24 |
| b2_bh | bh26 |
| b2_nd1 | nd01 |
| b4_ba6 | ba90 |
| b4_cx2 | cx26 |
| b5_fma | fma01a |
| bp_tf | tftype |
The BK_SC file was updated on Oct 24, 2009. The variables HHID07_9, HHID07, and "X" in BK_SC have been renamed so that merging should be less confusing for the user.
The household codebook files were updated on Oct 24, 2009.
The English and Indonesian CF questionnaires for School are now included with the CF documentation as of Oct 24, 2009.
The IFLS4 Comfas data was updated on Nov 18, 2009. The revisions include the following:
- The bk1_c1, bk1_c3, and bk1_c4 files have been removed and their contents have been restructured and attached to the main bk1 data file. They no longer exist as separate modules.
- CP variables have been added to all files (usually to the "main" book, e.g. bk1, bk2, etc).
- The minikamades module is now included with the rest of the CF data in the cf07_all data zip files. These are the MKD files. In the near future there will be a separate link to the minikamades data itself.
- Some variable labels have been corrected for inaccuracies. For example, the labels for SC04, SC06, SC09-SC12 in the school file have been updated.
- Comfas file codebooks are now available in the cf07_all_doc zip file. In the near future, there will be a separate link to just the comfas codebooks themselves.