Insights into heaping from retrospective breastfeeding data
Recall bias is a pervasive problem in the analysis of retrospective data (Shryock et al., 1973; Ewbank, 1981). It is a recurrent concern in the literature on the determinants of breastfeeding duration, its trend over time, and the effect of breastfeeding on infant mortality (e.g., Knodel and Debavalya, 1980; Diamond, McDonald, and Shah, 1986). Much of the concern is prompted by "heaping": the pronounced spikes at 12, 24, and 36 months in retrospective breastfeeding data. Previous research on the nature and effects of recall bias has been hampered by the lack of data that would make it possible to distinguish heaping in the recall process from heaping in the true behavior, or to infer the true behavior underlying the heaped responses. In the absence of such data, previous analyses have either made strong (and, as this paper will show, incorrect) assumptions or analyzed only current-status data. This paper exploits unique data from the Malaysian Family Life Surveys (MFLS) to reexamine the nature, correlates, and consequences of heaping and other forms of recall bias for the analysis of breastfeeding durations with hazard models. Together, MFLS-1 in 1976 and MFLS-2 in 1988 record retrospective breastfeeding durations for over 11,000 infants, including two responses (separated by 12 years) for over 3,000 of them.