Intergenerational Linking of NLSY79 Mothers and Their Young Adult Daughters

SAS code for Step 3

Part A


*sort both data sets by mother id, then merge them;
data child;
set x;
*copy the mother id into momid so the merge key has the same name as in the mom data set;
momid = C0000200;
run;
proc sort data=child; by momid; run;

data mom;
set y;
*copy the NLSY79 respondent id into momid so the merge key has the same name as in the child data set;
momid = R0000100;
run;
proc sort data=mom; by momid; run;

data childmom;
merge child mom; by momid;
*drop NLSY79 respondents with no children in the child data set;
*the resulting data set has 11469 observations using data through 2006;
if C0000100 ne .;

Part B

*create variables for age and year of last interview for mom;
*note that we name these variables to start with an "m" to denote that these are variables for the mother;
if R0898310 gt 0 then do;
m_age_lint = R0898310;
m_year_lint = 1982;
end;
*repeat for all intervening years;
*note that we redefine these variables each time "age at interview" is reported to find the age at last interview;

if T0989000 gt 0 then do;
m_age_lint = T0989000;
m_year_lint = 2006;
end;
*create variable indicating that mom had a teen birth;
*note that we define this variable only for women ages 18 and over;
if m_age_lint ge 18 then do;
*age at 1st birth is positive and under 18, so teen birth;
if (m_year_lint = 1982 and R0898840 gt 0 and R0898840 lt 18) then m_teenbirth = 1;
*age at 1st birth is 18 or greater, so no teen birth;
if (m_year_lint = 1982 and R0898840 ge 18) then m_teenbirth = 0;
*never gave birth, so no teen birth;
if (m_year_lint = 1982 and R0898840 = -998) then m_teenbirth = 0;
*repeat for each survey year, which lets us define these variables using data reported at the last interview;

if (m_year_lint = 2006 and T0996200 gt 0 and T0996200 lt 18) then m_teenbirth = 1;
if (m_year_lint = 2006 and T0996200 ge 18) then m_teenbirth = 0;
if (m_year_lint = 2006 and T0996200 = -998) then m_teenbirth = 0;
end;

*using data through 2006;
*m_teenbirth (mean = .201, N = 11463);
*m_year_lint (mean = 2002, N = 11465);
*m_age_lint (mean = 41.4, N = 11465);

Part C

*restrict sample to female young adults, drops 8322 observations using data through 2006;
*sample size is now 3147;
if Y0677400 = 2;
*create variables for age and year of last interview for young adult;
*note that we name these variables to start with "y" to denote that these are variables for the young adult;
if Y1205100 gt 0 then y_year_lint = Y1205100;
if y_year_lint = 1994 then y_age_lint = Y0342400;
*repeat for each survey year, which lets us define these variables using data reported at the last interview;

if y_year_lint = 2006 then y_age_lint = Y1948500;
*create variable indicating that female young adults who are 18 or over had a teen birth;
if y_age_lint ge 18 then do;
if (Y1211100 gt 0 and Y1211100 lt 18) then y_teenbirth = 1;
if (Y1211100 ge 18) then y_teenbirth = 0;
if (Y1211100 = -998) then y_teenbirth = 0;
end;
run;

*Final Statistics from Program:  Data through 2006 survey;
*m_teenbirth (mean = .249, N = 3147);
*y_teenbirth (mean = .137, N = 2419), smaller sample size because it is created only for those at least 18;
*y_year_lint (mean = 2006, N = 3147);
*y_age_lint (mean = 21.3, N = 3147);
*m_year_lint (mean = 2005, N = 3147);
*m_age_lint (mean = 44.6, N = 3147);
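
*optional check, not part of the original program;
*a minimal sketch of one way to reproduce the final statistics listed above, assuming the data set is still named childmom and holds the Part C sample;
*proc means reports the number of nonmissing values and the mean of each constructed variable;
proc means data=childmom n mean;
var m_teenbirth y_teenbirth y_year_lint y_age_lint m_year_lint m_age_lint;
run;
*cross-tabulate the mother and daughter teen birth indicators;
*by default proc freq leaves out daughters whose y_teenbirth is missing, that is, those not yet 18 at last interview;
proc freq data=childmom;
tables m_teenbirth*y_teenbirth;
run;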