
Linking Roster Items Across Rounds in the NLSY97

Tutorial objective and prerequisites

Objective

The goal is to link household roster items across survey rounds using unique household member ID codes provided in the NLSY97 data set. This tutorial explains how to find the same member of the respondent's household across different survey rounds in the NLSY97.

Knowledge assumed

This tutorial assumes that you already know how to use the NLS Investigator to create a tagset that saves your variables and to extract data. If you need assistance with the NLS Investigator before starting this tutorial, please review the Investigator User Guide or contact NLS User Services.

Background reading

To understand how to find the same household member across different survey rounds, you should first know how rosters are created. A detailed discussion is provided in the Types of Variables section of the NLSY97 Users Guide. Additional information about the household roster can be found in the Household Composition section of the guide.

Example: NLSY97 household roster

Preview of steps

  1. Step 1: Find the household roster variables
  2. Step 2: Extract selected roster data
  3. Step 3: Compare Unique ID (UID) codes across rounds

Additional information at the end of the tutorial provides some basic guidance on using the same concepts to investigate other rosters in the NLSY97.

Step 1: Find the household roster variables

The household roster is basically a list of all the people in the respondent's household. It can be pictured as a grid or table, where each person in the household gets a line on the roster which contains their information. For example, a simple round 1 roster might look like this:

Line number  Household member name  Age  Sex  Highest Grade Completed  Relationship to Respondent
1            John                   17   M    11                       [Respondent]
2            Steve                  48   M    16                       Father
3            Mary                   46   F    16                       Mother
4            Susan                  11   F    6                        Sister

In round 1, all household roster variables labeled HH Member 01 (or similar terms like Person 01, Line 01, etc.) have information about John, variables labeled HH Member 02 have information about his father Steve, and so on.

Rosters are used not only to organize data in one round but also to link information across rounds. For example, Susan's gender won't change in round 2, but her highest grade completed probably will. So, researchers might need to look at the gender code from one round and the highest grade completed from another round. This requires linking items on a roster across survey rounds using a unique identification number, or UID.
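To see why line numbers alone cannot link people across rounds, consider a minimal Python sketch of the example roster. The UID values (101-104), the round 2 ordering, and Susan's round 2 grade are invented for illustration; they are not actual NLSY97 codes.

```python
# Round 1 roster from the table above, keyed by line number.
# The UID values are made up for illustration only.
round1 = {
    1: {"uid": 101, "name": "John", "highgrade": 11},
    2: {"uid": 102, "name": "Steve", "highgrade": 16},
    3: {"uid": 103, "name": "Mary", "highgrade": 16},
    4: {"uid": 104, "name": "Susan", "highgrade": 6},
}

# Hypothetical round 2 roster: the respondent is not listed, so
# everyone else sits on a different line than in round 1.
round2 = {
    1: {"uid": 102, "highgrade": 16},
    2: {"uid": 103, "highgrade": 16},
    3: {"uid": 104, "highgrade": 7},
}


def line_for_uid(roster, uid):
    """Return the line number holding uid, or 0 if it is absent."""
    for line, person in roster.items():
        if person["uid"] == uid:
            return line
    return 0


susan_uid = round1[4]["uid"]
susan_line_r2 = line_for_uid(round2, susan_uid)
print(susan_line_r2)                       # 3
print(round2[susan_line_r2]["highgrade"])  # 7
```

In this made-up household, Susan is on line 4 in round 1 but line 3 in round 2, so only the UID identifies her reliably.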

The following steps will show you how to find the same household member across rounds using the UID. The first step is to use NLS Investigator to find the household roster variables.

  1. Start with round 1. The household roster variables for round 1 have question names starting with HHI2. In NLS Investigator, select the Question Name (search text) search. Type HHI2 into the search box and submit. Now, you will see a long list of variables that begin with HHI2. If you scroll down through the list you'll see various characteristics of household members, such as gender, race/ethnicity, marital status, age, etc.
  2. Tag all the variables that describe characteristics that you are interested in. For this example, please tag highest grade completed (HHI2_HIGHGRADE) for all household members.
  3. Next, tag the unique ID variables (HHI2_UID) for all household members.
  4. You can use the same process to find the corresponding variables from round 2. The round 2 roster variables have question names starting with HHI_, so you will search for HHI_ in the NLS Investigator. Make sure you tag the HHI_UID variable for all household members, along with HHI_HIGHGRADE for our example.

Note that searching for question name=HHI_ will return the household roster variables from round 2 and all subsequent rounds, since they all have the same question name. If you only want round 2, you can do a combined search in which question name=HHI_ and survey year=1998. This will give you a much shorter list to pick from.

Reference Number  Question Name      Variable Title                      Year
R10994.00         HHI2_HIGHGRADE.01  HHI2_HIGHGRADE (ROS ITEM) L1 1997   1997
R10995.00         HHI2_HIGHGRADE.02  HHI2_HIGHGRADE (ROS ITEM) L2 1997   1997
R10996.00         HHI2_HIGHGRADE.03  HHI2_HIGHGRADE (ROS ITEM) L3 1997   1997
R10997.00         HHI2_HIGHGRADE.04  HHI2_HIGHGRADE (ROS ITEM) L4 1997   1997
R10998.00         HHI2_HIGHGRADE.05  HHI2_HIGHGRADE (ROS ITEM) L5 1997   1997
R10999.00         HHI2_HIGHGRADE.06  HHI2_HIGHGRADE (ROS ITEM) L6 1997   1997
R11000.00         HHI2_HIGHGRADE.07  HHI2_HIGHGRADE (ROS ITEM) L7 1997   1997
R11001.00         HHI2_HIGHGRADE.08  HHI2_HIGHGRADE (ROS ITEM) L8 1997   1997
R11002.00         HHI2_HIGHGRADE.09  HHI2_HIGHGRADE (ROS ITEM) L9 1997   1997
R11003.00         HHI2_HIGHGRADE.10  HHI2_HIGHGRADE (ROS ITEM) L10 1997  1997
R11004.00         HHI2_HIGHGRADE.11  HHI2_HIGHGRADE (ROS ITEM) L11 1997  1997
R11005.00         HHI2_HIGHGRADE.12  HHI2_HIGHGRADE (ROS ITEM) L12 1997  1997
R11006.00         HHI2_HIGHGRADE.13  HHI2_HIGHGRADE (ROS ITEM) L13 1997  1997
R11007.00         HHI2_HIGHGRADE.14  HHI2_HIGHGRADE (ROS ITEM) L14 1997  1997
R11008.00         HHI2_HIGHGRADE.15  HHI2_HIGHGRADE (ROS ITEM) L15 1997  1997
R11009.00         HHI2_HIGHGRADE.16  HHI2_HIGHGRADE (ROS ITEM) L16 1997  1997
R11621.00         HHI2_UID.01        HHI2_UID (ROS ITEM) L1 1997         1997
R11622.00         HHI2_UID.02        HHI2_UID (ROS ITEM) L2 1997         1997
R11623.00         HHI2_UID.03        HHI2_UID (ROS ITEM) L3 1997         1997
R11624.00         HHI2_UID.04        HHI2_UID (ROS ITEM) L4 1997         1997
R11625.00         HHI2_UID.05        HHI2_UID (ROS ITEM) L5 1997         1997
R11626.00         HHI2_UID.06        HHI2_UID (ROS ITEM) L6 1997         1997
R11627.00         HHI2_UID.07        HHI2_UID (ROS ITEM) L7 1997         1997
R11628.00         HHI2_UID.08        HHI2_UID (ROS ITEM) L8 1997         1997
R11629.00         HHI2_UID.09        HHI2_UID (ROS ITEM) L9 1997         1997
R11630.00         HHI2_UID.10        HHI2_UID (ROS ITEM) L10 1997        1997
R11631.00         HHI2_UID.11        HHI2_UID (ROS ITEM) L11 1997        1997
R11632.00         HHI2_UID.12        HHI2_UID (ROS ITEM) L12 1997        1997
R11633.00         HHI2_UID.13        HHI2_UID (ROS ITEM) L13 1997        1997
R11634.00         HHI2_UID.14        HHI2_UID (ROS ITEM) L14 1997        1997
R11635.00         HHI2_UID.15        HHI2_UID (ROS ITEM) L15 1997        1997
R11636.00         HHI2_UID.16        HHI2_UID (ROS ITEM) L16 1997        1997
R11636.01         HHI2_UID.17        HHI2_UID (ROS ITEM) L17 1997        1997
R24079.00         HHI_HIGHGRADE.01   HHI HIGHGRADE (ROS ITEM) L1 1998    1998
R24080.00         HHI_HIGHGRADE.02   HHI HIGHGRADE (ROS ITEM) L2 1998    1998
R24081.00         HHI_HIGHGRADE.03   HHI HIGHGRADE (ROS ITEM) L3 1998    1998
R24082.00         HHI_HIGHGRADE.04   HHI HIGHGRADE (ROS ITEM) L4 1998    1998
R24083.00         HHI_HIGHGRADE.05   HHI HIGHGRADE (ROS ITEM) L5 1998    1998
R24084.00         HHI_HIGHGRADE.06   HHI HIGHGRADE (ROS ITEM) L6 1998    1998
R24085.00         HHI_HIGHGRADE.07   HHI HIGHGRADE (ROS ITEM) L7 1998    1998
R24086.00         HHI_HIGHGRADE.08   HHI HIGHGRADE (ROS ITEM) L8 1998    1998
R24087.00         HHI_HIGHGRADE.09   HHI HIGHGRADE (ROS ITEM) L9 1998    1998
R24088.00         HHI_HIGHGRADE.10   HHI HIGHGRADE (ROS ITEM) L10 1998   1998
R24089.00         HHI_HIGHGRADE.11   HHI HIGHGRADE (ROS ITEM) L11 1998   1998
R24090.00         HHI_HIGHGRADE.12   HHI HIGHGRADE (ROS ITEM) L12 1998   1998
R24091.00         HHI_HIGHGRADE.13   HHI HIGHGRADE (ROS ITEM) L13 1998   1998
R24092.00         HHI_HIGHGRADE.14   HHI HIGHGRADE (ROS ITEM) L14 1998   1998
R24093.00         HHI_UID.01         HHI UNIQUE ID (ROS ITEM) L1 1998    1998
R24094.00         HHI_UID.02         HHI UNIQUE ID (ROS ITEM) L2 1998    1998
R24095.00         HHI_UID.03         HHI UNIQUE ID (ROS ITEM) L3 1998    1998
R24096.00         HHI_UID.04         HHI UNIQUE ID (ROS ITEM) L4 1998    1998
R24097.00         HHI_UID.05         HHI UNIQUE ID (ROS ITEM) L5 1998    1998
R24098.00         HHI_UID.06         HHI UNIQUE ID (ROS ITEM) L6 1998    1998
R24099.00         HHI_UID.07         HHI UNIQUE ID (ROS ITEM) L7 1998    1998
R24100.00         HHI_UID.08         HHI UNIQUE ID (ROS ITEM) L8 1998    1998
R24101.00         HHI_UID.09         HHI UNIQUE ID (ROS ITEM) L9 1998    1998
R24102.00         HHI_UID.10         HHI UNIQUE ID (ROS ITEM) L10 1998   1998
R24103.00         HHI_UID.11         HHI UNIQUE ID (ROS ITEM) L11 1998   1998
R24104.00         HHI_UID.12         HHI UNIQUE ID (ROS ITEM) L12 1998   1998
R24105.00         HHI_UID.13         HHI UNIQUE ID (ROS ITEM) L13 1998   1998
R24106.00         HHI_UID.14         HHI UNIQUE ID (ROS ITEM) L14 1998   1998


Step 2: Extract selected roster data

In Step 1, you created a tagset with the Unique ID (UID) and Highest Grade Completed for the respondent's household members in rounds 1 and 2. In this step, you will run the extract process to create a data set and corresponding SAS/SPSS/STATA/R programs.

  1. Click on the Save/Download Tab in the NLS Investigator.
  2. Choose either the Basic Download Tab or the Advanced Download Tab.
    • Basic downloads include: Tagset, SAS/SPSS/STATA files, Codebook, and Comma-delimited datafile.
    • Advanced downloads include: Tagset, SAS/SPSS/STATA/R, Codebook, Short Description file, Comma-delimited datafile, and the ability to create frequency tables or apply universe restrictors.
    • To review the download process, visit the Save/Download Tab section of the Investigator User Guide.
  3. After you have chosen the Basic or Advanced options, assign a filename and click the download button to process your variable request.
  4. Once your request has been processed, the files will be available in the Manage Downloads Tab for you to access.

Importing SAS/SPSS/STATA/R files

Instructions for loading files into your statistics software can be found in the Importing Data section of the Investigator User Guide and in our video How to Import NLS Data into Statistical Software.

Step 3: Compare Unique ID (UID) codes across rounds

Now that you have your data set, you are ready to start comparing UID codes. The logic of this is as follows:

  1. Start by looking at the second household member in round 1 (since the first member is the respondent, who does not appear on the round 2 roster). The unique identification code for this person is found in the variable HHI2_UID.02.
  2. Next, look at each UID variable in round 2, that is HHI_UID.01-HHI_UID.14, and see if one has a value that matches the number in HHI2_UID.02. If there is a match, then that particular entry on the round 2 roster is the same person as the second entry on the round 1 roster.
  3. Record the line number on the round 2 roster in a new variable called position. There will be a position variable for each line number (2-17) on the round 1 household roster, so our position variables will be numbered position2-position17.

The sample programming code below, in SAS, SPSS, and STATA, finds round 1 household member 2 on the round 2 household roster.

SAS
           position2 = 0;
           if R1uid2 = R2uid1 then position2 = 1;
           if R1uid2 = R2uid2 then position2 = 2;
           if R1uid2 = R2uid3 then position2 = 3;
           if R1uid2 = R2uid4 then position2 = 4;
           if R1uid2 = R2uid5 then position2 = 5;
           if R1uid2 = R2uid6 then position2 = 6;
           if R1uid2 = R2uid7 then position2 = 7;
           if R1uid2 = R2uid8 then position2 = 8;
           if R1uid2 = R2uid9 then position2 = 9;
           if R1uid2 = R2uid10 then position2 = 10;
           if R1uid2 = R2uid11 then position2 = 11;
           if R1uid2 = R2uid12 then position2 = 12;
           if R1uid2 = R2uid13 then position2 = 13;
           if R1uid2 = R2uid14 then position2 = 14;

Note that more experienced SAS programmers can use arrays to achieve the same result.
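The array idea amounts to looping over the round 2 UID slots instead of writing one statement per slot. The following is a minimal sketch of that loop in Python rather than SAS. The function name find_position and the sample UID values are hypothetical, and the rule that negative values are missing-data codes which should never count as a match is an assumption added here (the chained code above does not include that guard).

```python
def find_position(target_uid, round2_uids):
    """Return the 1-based round 2 line whose UID equals target_uid, else 0.

    Assumption: negative values are missing-data codes, so a negative
    target never matches anything.
    """
    if target_uid < 0:
        return 0
    position = 0
    for line, uid in enumerate(round2_uids, start=1):
        if uid == target_uid:
            position = line  # like the chained IFs, a later match overwrites
    return position


# Hypothetical data: 14 round 2 slots, three filled and eleven empty (-4).
r2_uids = [103, 104, 102] + [-4] * 11

print(find_position(102, r2_uids))  # 3
print(find_position(105, r2_uids))  # 0
print(find_position(-4, r2_uids))   # 0
```

Without the guard, an empty round 1 line and an empty round 2 line that carry the same negative code would be reported as a match, so it is worth checking how empty roster lines are coded in your extract.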

SPSS
           compute position2 = 0.
           if (R1uid2 = R2uid1) position2 = 1.
           if (R1uid2 = R2uid2) position2 = 2.
           if (R1uid2 = R2uid3) position2 = 3.
           if (R1uid2 = R2uid4) position2 = 4.
           if (R1uid2 = R2uid5) position2 = 5.
           if (R1uid2 = R2uid6) position2 = 6.
           if (R1uid2 = R2uid7) position2 = 7.
           if (R1uid2 = R2uid8) position2 = 8.
           if (R1uid2 = R2uid9) position2 = 9.
           if (R1uid2 = R2uid10) position2 = 10.
           if (R1uid2 = R2uid11) position2 = 11.
           if (R1uid2 = R2uid12) position2 = 12.
           if (R1uid2 = R2uid13) position2 = 13.
           if (R1uid2 = R2uid14) position2 = 14.
           execute.

STATA
           gen position2 = 0
           replace position2 = 1 if R1uid2 == R2uid1
           replace position2 = 2 if R1uid2 == R2uid2
           replace position2 = 3 if R1uid2 == R2uid3
           replace position2 = 4 if R1uid2 == R2uid4
           replace position2 = 5 if R1uid2 == R2uid5
           replace position2 = 6 if R1uid2 == R2uid6
           replace position2 = 7 if R1uid2 == R2uid7
           replace position2 = 8 if R1uid2 == R2uid8
           replace position2 = 9 if R1uid2 == R2uid9
           replace position2 = 10 if R1uid2 == R2uid10
           replace position2 = 11 if R1uid2 == R2uid11
           replace position2 = 12 if R1uid2 == R2uid12
           replace position2 = 13 if R1uid2 == R2uid13
           replace position2 = 14 if R1uid2 == R2uid14

The sample code adjusts the question names as follows:

  • HHI2_UID.02 (round 1 UID) will be shown as R1uid2
  • HHI_UID.01 (round 2 UID) will be shown as R2uid1
  • position2 will be the newly created variable corresponding to household member 2 in round 1
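The renaming convention in the list above can be generated rather than typed by hand. This is a rough Python sketch; the helper short_name is hypothetical, and it assumes you are starting from question names as they appear in the variable list. Your extract may label its columns differently (for example, by reference number), in which case the mapping would need to start from those labels instead.

```python
def short_name(question_name):
    """Convert e.g. 'HHI2_UID.02' to 'R1uid2' and 'HHI_UID.01' to 'R2uid1'."""
    stem, line = question_name.split(".")
    prefixes = {"HHI2_UID": "R1uid", "HHI_UID": "R2uid"}
    return f"{prefixes[stem]}{int(line)}"


# Round 1 has 17 UID variables and round 2 has 14, per the variable list.
round1_names = [f"HHI2_UID.{n:02d}" for n in range(1, 18)]
round2_names = [f"HHI_UID.{n:02d}" for n in range(1, 15)]

rename = {q: short_name(q) for q in round1_names + round2_names}

print(rename["HHI2_UID.02"])  # R1uid2
print(rename["HHI_UID.01"])   # R2uid1
```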

Suppose after you run your program you find that position2 = 9. You can then look at HHI2_HIGHGRADE.02 in round 1 and HHI_HIGHGRADE.09 in round 2 to see if that person completed an additional year of schooling.

  • If position2 = 0, either the person no longer lives in the respondent's household in round 2, or the respondent did not complete a round 2 interview (in which case all variables for that round have a value of -5).
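Once a position is known, the grade comparison can be sketched as a small lookup. The Python below is illustrative only: the short names R1grade2 and R2grade9 (standing in for HHI2_HIGHGRADE.02 and HHI_HIGHGRADE.09) and the sample values are hypothetical, and it assumes that grade codes are numeric and that negative values mean missing. Check the codebook for any non-negative special codes before subtracting.

```python
def grade_change(record, r1_line, position):
    """Return round 2 minus round 1 highest grade, or None if not comparable."""
    if position == 0:
        return None  # person not found on the round 2 roster
    g1 = record[f"R1grade{r1_line}"]
    g2 = record[f"R2grade{position}"]
    if g1 < 0 or g2 < 0:
        return None  # assumption: negative codes mean missing data
    return g2 - g1


# Hypothetical record: member 2 in round 1 sits on line 9 in round 2.
record = {"R1grade2": 11, "R2grade9": 12}

print(grade_change(record, 2, 9))  # 1
print(grade_change(record, 2, 0))  # None
```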

Use the same code to find the position of round 1 household members #3-17 in round 2. Simply start by creating a new position variable for each member (position3, position4, etc.). Then substitute R1uid3, R1uid4, etc. for R1uid2 in the original code.
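Repeating the block for members 3-17 is the same comparison inside a second loop. A minimal Python sketch of that nesting follows, using the R1uid/R2uid short names from the sample code. The function name all_positions and the sample values are hypothetical, and skipping negative (missing) UIDs is an assumption added here.

```python
def all_positions(record, r1_lines=range(2, 18), r2_lines=range(1, 15)):
    """Return {'position2': ..., ..., 'position17': ...} for one respondent."""
    positions = {}
    for i in r1_lines:
        target = record[f"R1uid{i}"]
        position = 0
        if target >= 0:  # assumption: negative codes are missing data
            for j in r2_lines:
                if record[f"R2uid{j}"] == target:
                    position = j
        positions[f"position{i}"] = position
    return positions


# Hypothetical respondent: every slot starts empty (-4), then a few are filled.
record = {f"R1uid{i}": -4 for i in range(1, 18)}
record.update({f"R2uid{j}": -4 for j in range(1, 15)})
record.update({"R1uid2": 102, "R1uid3": 103, "R1uid4": 104})
record.update({"R2uid1": 103, "R2uid2": 104})

positions = all_positions(record)
print(positions["position2"])  # 0
print(positions["position3"])  # 1
print(positions["position4"])  # 2
```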

Additional information

Rosters are used in a number of other sections of the NLSY97 data

The techniques described here for using the household roster can be applied to these other rosters as well. Figure 4 in the Types of Variables: Raw, Symbols, Rosters & Created section of the NLSY97 guide shows what rosters are available in each round; the question name can be used to find the various items in the NLS Investigator.

Special note about UID codes and rosters

The household, nonresident, biochild, bioadoptchild, partners, otherparents, and cumpartners rosters are interlinked. People related to/living with the respondent will have the same UID code on all seven rosters. This makes it possible to get information about the same person from more than one roster. For example, assume in our code above that position2 = 0, meaning that the person left the respondent's household between rounds 1 and 2. If this person was related to the respondent, he or she will appear on the nonresident roster in round 2 and can be identified using similar programming code (R2NRuid1 = NONHHI_UID.01 from round 2):

SAS
               NRposition2 = 0;
               if R1uid2 = R2NRuid1 then NRposition2 = 1;
               if R1uid2 = R2NRuid2 then NRposition2 = 2;
               if R1uid2 = R2NRuid3 then NRposition2 = 3;
               /* and so on through R2NRuid22 */
            
SPSS
               compute NRposition2 = 0.
               if (R1uid2 = R2NRuid1) NRposition2 = 1.
               if (R1uid2 = R2NRuid2) NRposition2 = 2.
               if (R1uid2 = R2NRuid3) NRposition2 = 3.
               * and so on through R2NRuid22.
            
STATA
               gen NRposition2 = 0
               replace NRposition2 = 1 if R1uid2 == R2NRuid1
               replace NRposition2 = 2 if R1uid2 == R2NRuid2
               replace NRposition2 = 3 if R1uid2 == R2NRuid3
               * and so on through R2NRuid22
            

The same programming logic applies to household members who may also appear on the biochild/bioadoptchild roster or the various partners rosters.
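Because the UID is shared across the interlinked rosters, the search can fall through from one roster to the next. The Python sketch below tries the household roster first and then the nonresident roster. The function name locate, the roster labels, and the sample UIDs are hypothetical; treating negative values as missing is an assumption.

```python
def locate(target_uid, household_uids, nonresident_uids):
    """Return (roster_name, line) for target_uid, or (None, 0) if not found."""
    if target_uid < 0:  # assumption: negative codes are missing data
        return (None, 0)
    rosters = (("household", household_uids), ("nonresident", nonresident_uids))
    for name, uids in rosters:
        for line, uid in enumerate(uids, start=1):
            if uid == target_uid:
                return (name, line)
    return (None, 0)


# Hypothetical round 2 data: 14 household slots and 22 nonresident slots.
hh_uids = [103, 104] + [-4] * 12
nr_uids = [102] + [-4] * 21

print(locate(103, hh_uids, nr_uids))  # ('household', 1)
print(locate(102, hh_uids, nr_uids))  # ('nonresident', 1)
print(locate(999, hh_uids, nr_uids))  # (None, 0)
```

Additional rosters could be appended to the rosters tuple in the same way, provided their UID variables are extracted and renamed consistently.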