Fertility histories: with and without restrictions--an analysis of PLM data.
Khan, Zubeda
During the last decade, a large number of countries participated in
the World Fertility Survey but few of them collected fertility histories
that were not partially restricted. In a majority of the cases
information on the duration of breast-feeding and contraceptive use was
restricted to the last closed and the open intervals only. These
restrictions on the fertility histories have raised many questions about
the possibility of sample selection bias in the results. A number of
researchers in the developed countries have used these surveys for
analyzing the effects of breastfeeding and contraception on the length
of birth intervals. They have acknowledged the possibility of a bias in
the results and have taken measures to minimize these potential biases.
In this paper we will first discuss the ways in which restricted
histories produce a biased sample of births. Later we will evaluate the
effects of the restrictions by using the fertility data from the
Population Labour Force and Migration (PLM) Survey. These data contain
detailed reproductive histories of 9,416 currently married women having
38,746 children selected from 11,000 households sampled in the PLM
survey.
There are two distinct issues in this regard. The first is the
extent to which the selection of the last closed and open interval leads
to biased estimates of the duration of breast-feeding and the levels of
contraceptive use. The second is whether such restrictions bias the
findings regarding the structure of relationships between the variables
of interest.
The selective nature of using last closed and open birth intervals
can best be seen by considering cohorts of birth intervals begun various
years preceding the interview. Figure 1 shows the percentage of births
in each year that began the last closed or open birth intervals.
[FIGURE 1 OMITTED]
Two facts are obvious:
(1) The production of births is very small during the initial year
before the survey but it becomes quite substantial prior to 5 years
before the date of the survey; and
(2) The presumption that the last closed and the open interval
represent very recent experience is wrong.
It is evident from Figure 1 that the last closed or open intervals
reach far back in time: 14 percent of these intervals were initiated 10
years before the survey and 9 percent 19 years before. It would clearly
be a mistake to base estimates of current breast-feeding or
contraception on data that are restricted to the last closed and open
intervals only.
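The kind of tabulation shown in Figure 1 can be sketched as follows.
This is a minimal illustration under stated assumptions, not the
author's procedure: the toy birth histories, the function name
share_last_closed_or_open, and the use of whole years before the
interview are all invented for the example.

```python
from collections import defaultdict


def share_last_closed_or_open(histories):
    """Percent of births, by years before the interview, that begin the
    last closed or the open interval.

    histories: dict mapping a woman id to a list of her births, each
    given as whole years before the interview (hypothetical format).
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for births in histories.values():
        # Ascending years-ago puts the most recent birth first.
        ordered = sorted(births)
        for rank, years_ago in enumerate(ordered):
            totals[years_ago] += 1
            # rank 0 begins the open interval, rank 1 the last closed one.
            if rank < 2:
                selected[years_ago] += 1
    return {y: 100.0 * selected[y] / totals[y] for y in sorted(totals)}


# Toy histories, invented purely for illustration (not PLM data).
toy = {
    "w1": [1, 4, 7, 10],
    "w2": [3, 10],
    "w3": [4, 6, 9],
}
shares = share_last_closed_or_open(toy)
```

In the toy data, the birth that woman w2 had 10 years before the
interview still begins her last closed interval, which is the sense in
which these intervals can reach far back in time.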
We now turn to bias in the estimated relations. Because our
dependent variable is binary, the least-squares approach and other
standard econometric procedures yield biased results. Thus, the
logistic model is applied.
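For reference, the standard logistic model for a binary outcome can be
written as below. The notation is an assumption made for illustration
(y_i for the binary outcome of interval i, x_ik for its k-th
explanatory variable); the paper does not spell out its exact
specification.

```latex
\Pr(y_i = 1 \mid x_i)
  = \frac{\exp\left(\beta_0 + \sum_k \beta_k x_{ik}\right)}
         {1 + \exp\left(\beta_0 + \sum_k \beta_k x_{ik}\right)},
\qquad
\log \frac{\Pr(y_i = 1 \mid x_i)}{1 - \Pr(y_i = 1 \mid x_i)}
  = \beta_0 + \sum_k \beta_k x_{ik}
```

The second form shows that each beta is an effect on the log-odds, and
unlike a least-squares fit the predicted probability stays between 0
and 1.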
We are interested in relating differences in fertility to
intermediate and socioeconomic variables like contraception,
breast-feeding, abortion, infant mortality, age at first birth,
education, place of residence, and son preference. This last variable is
of particular interest because some studies suggest that women who do
not have at least one son may intentionally curtail breast-feeding in
order to hasten the birth of the next child.
The intervals that began less than 2 years before the survey were not
included in the analysis because these would not have had sufficient
time to be closed. To analyse birth history completely, we have chosen
the 2-12 year period prior to the survey as our final model for
analysis. Later, the durations before the interview were varied to see
whether there was any systematic variation associated with decreasing
selectivity. We restricted the universe to birth intervals begun 2-6
years, 2-5 years and 2-4 years preceding the interview. We estimated
these models with and without WFS restrictions. The original results,
i.e. the estimates based on intervals begun 2-12 years before the
interview and without WFS restrictions, were compared with other sets of
estimates. Since the structure of the process differs with parity, birth
intervals 2, 3 and 4-8 were examined separately.
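The selection of the analysis universe described above might be
sketched as follows. The record layout (the keys years_before, order
and last_closed_or_open), the function name select_intervals, and the
inclusive treatment of the window boundaries are all assumptions made
for the example; the toy records are invented and are not PLM data.

```python
# Birth interval orders examined separately in the text.
PARITY_GROUPS = {"2": range(2, 3), "3": range(3, 4), "4-8": range(4, 9)}


def select_intervals(intervals, min_years, max_years, parity_group,
                     wfs_restricted):
    """Keep intervals begun min_years..max_years before the interview.

    intervals: list of dicts with hypothetical keys 'years_before'
    (years between the start of the interval and the interview),
    'order' (birth interval order) and 'last_closed_or_open' (bool).
    With wfs_restricted=True only last closed or open intervals remain.
    """
    orders = PARITY_GROUPS[parity_group]
    return [
        iv for iv in intervals
        if min_years <= iv["years_before"] <= max_years
        and iv["order"] in orders
        and (iv["last_closed_or_open"] or not wfs_restricted)
    ]


# Toy records, invented purely for illustration.
toy = [
    {"years_before": 3.0, "order": 2, "last_closed_or_open": True},
    {"years_before": 8.5, "order": 2, "last_closed_or_open": False},
    {"years_before": 1.0, "order": 2, "last_closed_or_open": True},
    {"years_before": 5.5, "order": 5, "last_closed_or_open": False},
]
full = select_intervals(toy, 2, 12, "2", wfs_restricted=False)
restricted = select_intervals(toy, 2, 12, "2", wfs_restricted=True)
short = select_intervals(toy, 2, 6, "2", wfs_restricted=False)
```

Shortening the window and applying the WFS restriction are kept as
separate arguments so that each of the compared universes is one call.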
The approach used a large number of logistic regression estimates.
There are:
3 sets of birth order intervals (2, 3 and 4-8);
4 sets of segment months (17-22, 23-28, 29-34 and 35-40);
4 sets of time periods (2-12, 2-6, 2-5 and 2-4) years; and
2 sets of restrictions (with and without WFS restrictions).
This yields 96 logistic regression runs. (1)
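The count of runs follows from crossing the four dimensions listed
above, which can be checked with a short sketch. The variable names are
hypothetical, and reading the third segment as 29-34 months is an
assumption.

```python
from itertools import product

birth_orders = ["2", "3", "4-8"]
segments_months = ["17-22", "23-28", "29-34", "35-40"]
periods_years = ["2-12", "2-6", "2-5", "2-4"]
restrictions = ["WFS restriction", "no restriction"]

# One logistic regression per combination of the four dimensions.
runs = list(product(birth_orders, segments_months, periods_years,
                    restrictions))
n_runs = len(runs)  # 3 * 4 * 4 * 2
```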
Taking the estimates based on intervals begun 2-12 years before the
interview as a comparison point, we then established a confidence
interval around the betas from this unrestricted model that is equal to
plus or minus twice the standard error of the betas. We examined the
corresponding betas from the other models to see whether they fell
within this interval or not.
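The comparison rule can be sketched as follows. The function names and
the coefficient values are invented for illustration and are not
estimates from the PLM data; only the rule itself (plus or minus twice
the standard error of the reference beta) comes from the text.

```python
def within_reference_interval(beta, ref_beta, ref_se, width=2.0):
    """True if beta lies within ref_beta +/- width * ref_se."""
    lower = ref_beta - width * ref_se
    upper = ref_beta + width * ref_se
    return lower <= beta <= upper


def percent_within(reference, other):
    """Percent of shared coefficients falling inside the reference band.

    reference: {variable: (beta, standard_error)} from the 2-12 year
    comparison model; other: {variable: beta} from another model.
    """
    names = [name for name in reference if name in other]
    hits = sum(
        within_reference_interval(other[name], *reference[name])
        for name in names
    )
    return 100.0 * hits / len(names)


# Hypothetical coefficients, for illustration only.
reference = {
    "education": (-0.30, 0.10),
    "son_preference": (0.20, 0.05),
    "breastfeeding": (-0.50, 0.08),
}
other = {"education": -0.45, "son_preference": 0.35, "breastfeeding": -0.58}
share = percent_within(reference, other)
```

Here two of the three hypothetical betas fall inside their reference
intervals, so the share is two thirds, expressed as a percentage.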
Table 1 summarises the results. We have taken socio-economic and
intermediate variables separately as well as combined in the four time
periods under study.
One thing is obvious: the longer the period preceding the survey in
the unrestricted sample, the higher the proportion of unbiased
results. In this table 85 to 96 percent of the results are unbiased. But
when we restrict the fertility history to the last closed and open
intervals, the results are biased, with only 25 to 57 percent of the
betas lying within the confidence interval; the rest of the results are
significantly different. The estimates for the levels of the duration of
breastfeeding and contraceptive use are biased up to 75 percent.
(1) The detailed results are available from the author on request.
ZUBEDA KHAN is Research Demographer at the Pakistan
Institute of Development Economics, Islamabad. This is an abridged
version of the paper presented at the Fifth Annual General Meeting of
the Pakistan Society of Development Economists.
Table 1
Percentage of Betas Falling within the Confidence Interval for Betas
in the 2-12 Years Unrestricted Model, by Type of Restrictions, Type of
Variables, and Number of Years Preceding the Survey *

                              Number of Years Preceding the Survey
Type of Variables
and Restrictions              2-12      2-6      2-5      2-4

Socio-economic Variables
  WFS Restriction               65       48       50       25
  No Restriction               100       96       85       81
Intermediate Variables
  WFS Restriction               82       57       49       29
  No Restriction               100       94       94       78
All Variables
  WFS Restriction               74       53       49       27
  No Restriction               100       95       90       80

* Confidence interval equal to plus or minus twice the standard
error of beta for analogous no-restriction 2-12 model.
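The gap between the restricted and unrestricted rows of Table 1 can be
tabulated with a short sketch. The figures are copied from the table;
the dictionary layout and the variable names are assumptions made for
the example.

```python
periods = ["2-12", "2-6", "2-5", "2-4"]

# Percent of betas inside the reference interval, from Table 1.
table1 = {
    "Socio-economic": {"WFS": [65, 48, 50, 25], "None": [100, 96, 85, 81]},
    "Intermediate": {"WFS": [82, 57, 49, 29], "None": [100, 94, 94, 78]},
    "All": {"WFS": [74, 53, 49, 27], "None": [100, 95, 90, 80]},
}

# Percentage-point gap (no restriction minus WFS restriction) per period.
gaps = {
    group: [none - wfs for wfs, none in zip(rows["WFS"], rows["None"])]
    for group, rows in table1.items()
}

for group, values in gaps.items():
    print(group, dict(zip(periods, values)))
```

For all variables combined, the gap is 26 percentage points in the
2-12 year window and 53 points in the 2-4 year window.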