Tutorial: Matching Cohabiting Partners to Their Characteristics in the NLSY97

Objective:  To link the respondent's first cohabiting partner to the partner's characteristics using information from the created event history arrays, the household and non-resident rosters, and the partner roster. In particular, this tutorial shows you how to find the age of the respondent's first cohabiting partner in the NLSY97 data set.

Knowledge Assumed:  This tutorial assumes that you already know:

  • how to use the NLS Web Investigator to create a tag set that saves your variables and to extract data. If you need assistance with the NLS Web Investigator before starting this tutorial you should see "How to use the NLS Investigator."
  • how to use rosters to link individuals to their characteristics. To learn more about using rosters, see the tutorial on "Linking Roster Items across Rounds in the NLSY97."

Background Reading:

  1. Understand event history arrays. To find the age of the respondent's first cohabiting partner, you should first understand the event history arrays that have been created to help you use the NLSY97 data. The Marital and Marriage-Like Relationships section of the user's guide covers the event history variables that describe marital status. We'll use the monthly array MAR_PARTNER_LINK to find the id of the first partner.

  2. Understand Household Composition, including marriage-like relationships. You will also need to know what characteristics are collected for spouses and partners and where those pieces of information are collected. Cohabitation variables can be found in the marriage and cohabitation section of the questionnaire, as well as in the household rosters. Consequently, characteristics of spouses and partners could be located in one of three parts of the survey, depending on when, relative to his/her interview, the respondent lived with the partner.
    1. For partners with whom the respondent lived prior to the Round 1 interview, the partner's characteristics are found on the non-resident roster.
    2. For partners with whom the respondent is living at the date of the interview, the partner's characteristics are found on the household roster.
    3. For partners with whom the respondent lived between interviews, the partner's characteristics are collected as part of the marriage/cohabitation section of the questionnaire.

    For more information, read the Household Composition and Marital and Marriage-Like Relationships sections of the NLSY97 user's guide.

  3. Understand the ID variables. There are two id variables for partners. Partners who appear on the non-resident or household rosters have valid values for both id variables: (1) id (round and loop in the marriage section when reported) and (2) uid (unique id), which lets the user identify that person across rosters and rounds. Partners with whom the respondent lived between rounds may have a valid value only for the id variable and not for the uid.
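
As a rough illustration of the id variable, the sketch below (in Python rather than the SAS used in this tutorial) splits a partner id into a round number and a loop number. The encoding shown, round times 100 plus loop, is an assumption made for illustration only; check the codebook for the actual layout before relying on it. The function name split_partner_id is hypothetical.

```python
def split_partner_id(partner_id):
    """Split a partner id into (round, loop).

    ASSUMPTION: the id is encoded as round * 100 + loop
    (for example, 302 would mean round 3, loop 2).
    Verify this against the NLSY97 codebook.
    """
    if partner_id is None or partner_id <= 0:
        # Non-positive values are treated here as "no valid partner id".
        return None
    return divmod(partner_id, 100)


print(split_partner_id(302))   # (3, 2)
print(split_partner_id(1001))  # (10, 1)
print(split_partner_id(-4))    # None
```

Returning None for non-positive values keeps missing ids from being mistaken for a real round and loop.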

Preview of Steps

  1. Find and tag the relevant variables.
    1. Tag age and ID from Household Roster
    2. Tag age and ID from non-resident Roster
    3. Tag age and ID from partner Roster
    4. Tag MAR_PARTNER_LINK event history array variables
  2. Extract selected variables.
  3. Find partner id for the 1st partner.
  4. Split 1st partner id into round number and loop number.
  5. Locate partner on partner roster collected in marriage and cohabitation section and link partner to his/her age.
  6. For partners who were in the household use the unique id to find the partner on the household roster and link partner to his/her age.
  7. For non-resident partners from prior to the date of the NLSY97 Round 1 interview, get age from the non-resident roster in Round 1.
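
Steps 5 through 7 amount to checking up to three sources for the partner's age. The Python sketch below shows one possible way to organize that lookup; it is not the tutorial's SAS program. The data layout (dictionaries keyed by round and loop, or by uid), the function names, and the rule that negative values mean "not available" are all assumptions for illustration.

```python
def is_valid(value):
    # ASSUMPTION: None or any negative value means "not available".
    return value is not None and value >= 0


def find_partner_age(key, partner_roster, household_rosters, nonresident_roster):
    """Return (age, source) for the partner identified by key = (round, loop).

    partner_roster:     {(round, loop): {"age": ..., "uid": ...}}  (hypothetical layout)
    household_rosters:  {round: {uid: age}}                        (hypothetical layout)
    nonresident_roster: {uid: age}                                 (hypothetical layout)
    """
    entry = partner_roster.get(key, {})
    # Step 5: age collected in the marriage and cohabitation section.
    if is_valid(entry.get("age")):
        return entry["age"], "partner roster"
    uid = entry.get("uid")
    if is_valid(uid):
        # Step 6: look for the uid on each round's household roster.
        for rnd in sorted(household_rosters):
            age = household_rosters[rnd].get(uid)
            if is_valid(age):
                return age, "household roster, round %d" % rnd
        # Step 7: fall back to the Round 1 non-resident roster.
        age = nonresident_roster.get(uid)
        if is_valid(age):
            return age, "non-resident roster"
    return None, "not found"


# Made-up example data, not real NLSY97 values.
partner_roster = {
    (3, 1): {"age": 22, "uid": -4},
    (2, 1): {"age": -4, "uid": 105},
    (1, 1): {"age": -4, "uid": 201},
}
household_rosters = {2: {105: 24}}
nonresident_roster = {201: 19}

for key in [(3, 1), (2, 1), (1, 1), (9, 9)]:
    print(key, find_partner_age(key, partner_roster, household_rosters, nonresident_roster))
```

The function returns the source along with the age because the rosters record age at different points in time (for example, at the interview date versus when the couple started living together), so ages from different sources are not directly interchangeable.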

Review the output of the program

Review additional information about this tutorial

Step 1: Use NLS Web Investigator to find and tag the relevant variables

First, be aware you'll want to pull data from all rounds of the survey because the respondents could have first cohabitated in any of these rounds.

Second, in this example, I require that respondents were interviewed in Round 10, which lets us use only the event history data created in Round 10. When working on your own projects, you can decide your own criteria. For example, must the sample members have been interviewed after a certain age?
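
A sample restriction like this one could be sketched as follows in Python. The field name r10_status is hypothetical and stands in for any Round 10 variable; the sketch also assumes that a value of -5 marks a round in which the respondent was not interviewed, which you should confirm in the codebook.

```python
NON_INTERVIEW = -5  # ASSUMPTION: code marking a round the respondent missed


def interviewed_in_round10(respondent):
    # "r10_status" is a hypothetical field standing in for a Round 10 variable.
    return respondent["r10_status"] != NON_INTERVIEW


# Made-up records, not real NLSY97 data.
respondents = [
    {"id": 1, "r10_status": 1},
    {"id": 2, "r10_status": -5},
    {"id": 3, "r10_status": -4},
]

sample = [r for r in respondents if interviewed_in_round10(r)]
print([r["id"] for r in sample])  # [1, 3]
```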

  1. Let's start with the variables from the household roster. You will need both the unique ids (UIDs) and ages of household members. In Round 1, the variables on the household roster begin with "HHI2". First, search on Question Name (pick from list) starts with "HHI2_U" and you'll get the unique ids for all household members in Round 1. Next, if you search on Question Name (pick from list) starts with "HHI2_A" and Word in Title (enter search term) contains "age", you will get the ages of all household members in Round 1.

    Next, you'll need the unique ids and age from the household rosters for Rounds 2 and higher. If you search using Question Name (pick from list) starts with "HHI_UI", you'll get all the variables that contain the unique id numbers for the household members in Rounds 2 and higher. Next, if you change the filter to Question Name (pick from list) starts with "HHI_AG", the list of variables that result contains the ages and the estimated ages of the household members. For this tutorial, we are not going to use the information in the estimated ages variables (asked only when the respondent can't report the age of the household member), so search one more time keeping the previous filter and including an additional one where Word in Title (enter search term) does NOT contain "estimated".

    To see the list of variables tagged from the household rosters, click here.

  2. Next, you'll pull the relevant variables from the non-resident roster. To find the unique ids of members of the non-resident roster, search with Question Name (pick from list) starts with "NONHHI", Word in Title (enter search term) contains "unique", and Year=1997. Tag these variables. Next, to get the ages of the members of the non-resident roster, keep the filters for NONHHI and 1997, and add a new filter where Word in Title (enter search term) contains "age". Click here to see the list of variables.

  3. Next, you'll tag the partner characteristics from the partner rosters. You'll need to tag both id and uid variables. The ids (round and loop on the partner roster) match the values in the MAR_PARTNER_LINK array. The uids match the household roster variables. To find id and unique id (UID) on the partner roster, search with Question Name (pick from list) starts with "PARTNE" and Word in Title (enter search term) contains "id". This will bring up partner ids and partner uids. You'll want to tag all of these variables. Next, to find age from the partner roster, search with Question Name (pick from list) starts with "YMAR-3" and Word in Title (enter search term) contains "age". This brings up more variables than only the age of the partners. If you refine your search by keeping the filters and adding Word in Title (enter search term) contains "start", you'll be closer to only those variables that you want. This second search brings up the age of the partner and the estimated age of the partner when the couple started living together. We can further limit our search by adding a filter of Word in Title (enter search term) does NOT contain "estimated". Click here to review the list of variables.

    Now we have all the age and id variables for the partners.

  4. Last, you'll need the variables that tell you with whom the respondent first cohabitated. The MAR_PARTNER_LINK monthly arrays provide the id of the partner in each month that the respondent is cohabitating. The partner id lets you link to partner roster variables. To find these, search Question Name (enter search term) contains "MAR_PARTNER_LINK". Tag the variables from April 1994 through May 2007. To see the list of selected variables, click here.
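
Once the monthly array is extracted, finding the first partner comes down to scanning the months in order and stopping at the first valid partner id. The Python sketch below illustrates that idea under two assumptions: the values are ordered from April 1994 onward, and only positive values are partner ids (zero, negative, or missing values mean no partner or no data). The function names and sample values are hypothetical.

```python
def month_label(index):
    """Convert a 0-based position in the array to (year, month).

    Position 0 is April 1994, matching the first month tagged above.
    """
    months_since_jan_1994 = 3 + index
    return 1994 + months_since_jan_1994 // 12, months_since_jan_1994 % 12 + 1


def first_partner_id(monthly_links):
    """Return (index, partner_id) for the first month with a partner, else None.

    ASSUMPTION: only positive values are partner ids.
    """
    for index, value in enumerate(monthly_links):
        if value is not None and value > 0:
            return index, value
    return None


# Made-up monthly values, not real NLSY97 data.
links = [-4, 0, 0, 201, 201, 0, 302]
index, partner = first_partner_id(links)
print(partner, month_label(index))  # 201 (1994, 7)
```

Under these assumptions, April 1994 through May 2007 covers 158 monthly positions, so the last position (157) maps to May 2007.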

Step 2: Extract selected variables

In Step 1, you created a tag set of variables from the household rosters across rounds, the Round 1 non-resident roster, the partner rosters, and the Round 10 MAR_PARTNER_LINK array. You now have partner ids, ages, and information about when the respondent first cohabitated and with whom. Next, run an extract to create a data set and corresponding SAS/SPSS/Stata programs. This tutorial uses SAS programming code, but the same logic applies to other statistical software packages. The variable names used in the program examples are based on question names from the NLSY97; in most cases they are altered to reflect the survey year and to be readable by the software. My rename statement is available here.
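
To give a sense of what such renaming involves, the Python sketch below turns a question name and survey year into a software-friendly variable name. It is not the rename statement mentioned above; the naming rule (replace characters other than letters and digits with underscores, then append the year) and the function name are assumptions, and the question names in the example are illustrative.

```python
import re


def to_variable_name(question_name, year):
    """Build a variable name from a question name and a survey year.

    ASSUMPTION: runs of characters other than letters and digits become a
    single underscore, and the year is appended as a suffix.
    """
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", question_name).strip("_")
    return "%s_%d" % (cleaned, year)


# Illustrative question names only.
print(to_variable_name("HHI2_UID.01", 1997))  # HHI2_UID_01_1997
print(to_variable_name("NONHHI_AGE.03", 1997))  # NONHHI_AGE_03_1997
```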

Note to Stata users: We are currently working on an up-to-date version of the programming steps for this tutorial using Stata.
