Tutorial: Linking NLSY79 Mothers and Their Children

Objective:  Our goal is to link NLSY79 mothers with their children.  This tutorial explains the general logic to link mothers and children of any age covered in the Children of the NLSY79.  The tutorial then gives a specific example of using data on mothers and young adult daughters by creating two variables:  (1) whether the mother had a first birth prior to age 18, and (2) whether the daughter had a first birth prior to age 18.  This allows one to examine intergenerational correlations in teenage childbearing.

Throughout the tutorial, information that applies to linking all mothers and children is presented in blue/bold text. Information specific to the mother/daughter example is in regular black text.

Knowledge Assumed:  This tutorial assumes that you already know how to use the NLS Investigator to create a tag set that saves your variables and to extract data.  If you need assistance with the NLS Investigator before starting this tutorial, please contact NLS User Services.

Background Reading: To understand how to link mother and child/young adult files, see the NLSY79 Child and Young Adult Users Guide section on Linking Children, Young Adults and Mothers and Appendix E and Appendix F for sample SPSS and SAS programs. In the NLSY79 User's Guide, see the sections on Age and Fertility.   

Preview of Steps

  1. Find and extract the respondent IDs, age at first birth, and other needed variables in the Child/Young Adult data set.
  2. Find and extract comparable variables in the NLSY79 data.
  3. Merge NLSY79 and Child/Young Adult data files and create new fertility variables.
  4. Review the statistics output from the sample program.

 

Step 1: Find and extract the respondent IDs, age at first birth, and other needed variables in the Child/Young Adult data set

First we'll need to use the NLS Web Investigator to find the Child/Young Adult variables.

  1. Let's start by finding the mother and child/young adult IDs.  Choose to select variables by reference number and pick "C000" and then submit.  C00001.00 is the respondent ID from the child/young adult file, and C00002.00 is the mother ID.  Tag these two variables.  (Note that the respondent ID, C00001.00, is a comprehensive ID variable, created for all children regardless of age or young adult status.  For this example, one could also use the Young Adult ID, Y00001.00, which will have the same value as C00001.00, but exists only for those children who participate at age 15 or older.)
  2. Now let's find age at first birth. Search on the "YA Fertility and Relationship Data Created" Area of Interest and Survey Year = 2006 (or the most recent survey year available). Y12111.00 is the age at first birth as of the most recent interview the young adult completed.  Tag Y12111.00.
  3. We need a gender variable, since we are only looking at young adult females. We also need to know if the respondent was at least 18 at her last young adult interview. Choose to select variables by the "YA Common Key Variables" Area of Interest. Scroll down and select gender (Y06774.00), most recent young adult interview year (Y12051.00), and age at each young adult interview (Y19485.00, Y16727.00, Y14343.00, Y11924.00, Y09748.00, Y06776.00, Y03424.00).
  4. Now let's run an extract to create the data set and corresponding SAS/SPSS/STATA program.

Click here to view the variables selected in step 1.

 

Step 2: Find and extract comparable variables in the NLSY79 data set

Next, we'll need to use the NLS Web Investigator to find the NLSY79 variables.

  1. Let's start by finding the NLSY79 respondent ID. Choose to select variables by reference number and pick "R000" and then submit. R00001.00 is the respondent ID. Tag this variable. Note that R00001.00 = C00002.00 in the child/young adult data set.
  2. Next, search on the Word in Title "Birth," Search Variable Title "Age," and the "Fertility and Relationship History/Created" Area of Interest. From the fairly long list, we can find the age at first birth created variables from 1982 forward (R08988.40, R11468.32, R15220.39, R18927.39, R22598.39, R24480.39, R28778.00, R30768.44, R34079.00, R36590.49, R40094.49, R44449.00, R50877.00, R51730.00, R64866.00, R70144.00, R77120.00, R85045.00, T09962.00). Note that we will need this variable for each year because if the respondent misses an interview, it is not created for that interview year.
  3. Searching on Word in Title "Age" and the "Key Variables" Area of Interest will give us the list of created variables for age at the interview date, which we need from 1982 forward (R08983.10, R11451.10, R15203.10, R18910.10, R22581.10, R24455.10, R28713.00, R30750.00, R34017.00, R36571.00, R40076.00, R44187.00, R50817.00, R51670.00, R64798.00, R70075.00, R77048.00, R84972.00, T09890.00).
  4. Now let's run an extract to create the data set and corresponding SAS/SPSS/STATA program. 
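
The note in item 2 about needing the variable from every survey year can be illustrated with a short sketch. The Python below is not part of the tutorial's SAS/SPSS/STATA programs. It assumes the extract has been loaded as one dictionary per respondent keyed by reference number, and that negative values mark missing or non-interview codes (an assumption to verify against the codebook). The helper name `latest_valid` and the sample respondent are hypothetical.

```python
# The first two age-at-first-birth reference numbers from the list above,
# ordered from earlier to later survey year (only two shown for brevity).
AGE_FIRST_BIRTH_VARS = ["R08988.40", "R11468.32"]


def latest_valid(row, var_names):
    """Return the value from the latest survey year that is not missing.

    Assumption: negative values are missing or non-interview codes.
    Returns None when every listed year is missing.
    """
    for name in reversed(var_names):
        value = row.get(name)
        if value is not None and value >= 0:
            return value
    return None


# Hypothetical respondent who missed the later interview (coded -5 here).
row = {"R00001.00": 1, "R08988.40": 17, "R11468.32": -5}
print(latest_valid(row, AGE_FIRST_BIRTH_VARS))  # prints 17
```

Scanning from the latest year backward means a skipped interview falls through to the most recent year in which the variable was actually created.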

Click here to view the variables selected in step 2.

 

Step 3: Merge NLSY79 and Child/Young Adult data files and calculate needed fertility variables

Now that we have our two data sets from Steps 1 and 2, we're ready to merge them and start programming our variables. The logic of this is as follows:

  1. We'll start by merging the two data sets. We will merge NLSY79 mother characteristics in with the Child/Young Adult data set.
  2. Next, we want to code whether the mother had a birth prior to age 18. We'll only create this variable for mothers who are interviewed after they turn 18. We'll calculate the age at last interview and the year of last interview from 1982 forward. Then we'll use this information to code the teen birth variable.
  3. We'll do something similar with the young adults. First we'll restrict our data to female young adults, and construct variables for age and year of last interview. Then we'll code the teen birth variable for young adults who are interviewed after they turn 18.
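
The merge in item 1 can be sketched as follows. This is a minimal Python illustration, not the tutorial's own program. It assumes each extract is a list of dictionaries keyed by reference number, and it keeps every child record whether or not a mother record is found, which is one reasonable reading of "merge mother characteristics in with the Child/Young Adult data set". The function name and the sample rows are hypothetical.

```python
def merge_mothers_onto_children(children, mothers):
    """Attach each child's mother record, matching C00002.00 to R00001.00.

    Every child row is kept (a left join onto the child file), and
    several children may share the same mother record.
    """
    mothers_by_id = {m["R00001.00"]: m for m in mothers}
    merged = []
    for child in children:
        # An empty dict is used when no mother record matches.
        mother = mothers_by_id.get(child["C00002.00"], {})
        merged.append({**mother, **child})
    return merged


# Hypothetical rows: two daughters of mother 2, plus the mother's record.
children = [
    {"C00001.00": 201, "C00002.00": 2},
    {"C00001.00": 202, "C00002.00": 2},
]
mothers = [{"R00001.00": 2, "R08988.40": 17}]

for row in merge_mothers_onto_children(children, mothers):
    print(row)
```

Because the result has one row per child, the mother's values are repeated for each of her children.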

Program code is available in SAS, SPSS, and STATA.
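
For readers who want to see the coding rule from items 2 and 3 without opening those programs, here is a hedged Python sketch. It is one plausible reading of the rule, not a transcription of the SAS/SPSS/STATA code: the variable is created only when the respondent was at least 18 at her last interview, and a missing (negative or absent) age at first birth is treated as "no birth reported". Both points are assumptions, and the function name `code_teen_birth` is hypothetical.

```python
def code_teen_birth(age_first_birth, age_last_interview):
    """Return 1 if the first birth was before age 18, 0 if not,
    and None when the variable should not be created.

    Assumptions: a None or under-18 age at last interview means the
    respondent was not observed through age 18; a None or negative
    age at first birth means no birth was reported.
    """
    if age_last_interview is None or age_last_interview < 18:
        return None  # not interviewed at 18 or older, leave uncoded
    if age_first_birth is not None and 0 <= age_first_birth < 18:
        return 1
    return 0


print(code_teen_birth(16, 25))  # prints 1
print(code_teen_birth(22, 30))  # prints 0
print(code_teen_birth(-4, 19))  # prints 0
print(code_teen_birth(16, 17))  # prints None
```

The same function can be applied to the mother's values and to the daughter's values, giving counterparts of m_teenbirth and y_teenbirth under these assumptions.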

 

Step 4: Statistics

Final Statistics from Program: Data through 2006 survey

m_teenbirth (mean = .249, N = 3147)
y_teenbirth (mean = .137, N = 2419; smaller sample size because this variable is created only for those at least 18)
y_year_lint (mean = 2006, N = 3147)
y_age_lint (mean = 21.3, N = 3147)
m_year_lint (mean = 2005, N = 3147)
m_age_lint (mean = 44.6, N = 3147)
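
The difference in N between variables arises because cases where a variable was not created are left out of both the mean and the count. A minimal Python sketch of that calculation follows. The helper name `mean_and_n` and the four sample values are hypothetical and are not drawn from the NLSY79 data.

```python
def mean_and_n(values):
    """Return (mean, N) over the non-missing values; None marks missing."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None, 0
    return sum(valid) / len(valid), len(valid)


# Hypothetical teen-birth codes; the None case was under 18 at last interview.
y_teenbirth = [1, 0, None, 0]
mean, n = mean_and_n(y_teenbirth)
print(f"mean = {mean:.3f}, N = {n}")  # prints mean = 0.333, N = 3
```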

Next Step: This tutorial focuses on linking mothers and their young adult daughters. One can use similar techniques to link other characteristics of mothers with characteristics of their children.