Matching Cohabiting Partners to Their Characteristics in the NLSY97

Tutorial objective and prerequisites

Objective

The goal is to link the respondent's first cohabiting partner to the partner's characteristics using information from the created event history arrays, the household and non-resident rosters, and the partner roster.

Knowledge assumed

This tutorial assumes that you already know how to use the NLS Investigator to create a tagset that saves your variables and to extract data. You will also need to know how to use rosters to link individuals to their characteristics. To learn more about using rosters, see the Linking Roster Items across Rounds in the NLSY97 tutorial. If you need assistance with the NLS Investigator before starting this tutorial, please review the Investigator User Guide or contact NLS User Services.

Background reading

  • Understand event history arrays. To find the respondent's 1st cohabiting partner's age, you should first understand the event history arrays that have been created to help you use the NLSY97 data. The Marital and Marriage-Like Relationships section of the user's guide describes the event history variables that describe marital status. We'll use the monthly array MAR_PARTNER_LINK to find the id of the 1st partner.
  • Understand Household Composition, including marriage-like relationships. You will also need to know what characteristics are collected for spouses and partners and where those pieces of information are collected. Cohabitation variables can be found in the marriage and cohabitation section of the questionnaire, as well as in the household rosters. Consequently, characteristics of spouses and partners could be located in one of three parts of the survey, depending on when, relative to his/her interview, the respondent lived with the partner.
    • For partners with whom the respondent lived prior to the Round 1 interview, the partner's characteristics are found on the non-resident roster.
    • For partners with whom the respondent is living at the date of the interview, the partner's characteristics are found on the household roster.
    • For partners with whom the respondent lived between interviews, the partner's characteristics are collected as part of the marriage/cohabitation section of the questionnaire.

    For more information, read the Household Composition and Marital and Marriage-Like Relationships sections of the NLSY97 user's guide.

  • Understand the ID variables. There are two id variables for partners. Partners who appear on the non-resident or household rosters have valid values for both id variables:
    • id (round and loop in the marriage section when reported).
    • uid (unique id), which lets the user identify that person across rosters and rounds. Those partners with whom the respondent lived between rounds may have a valid value only for the id variable and not for the uid.

Example: Find the age of the respondent's first cohabiting partner in the NLSY97

Preview of steps

  1. Step 1: Find and tag the relevant variables
    1. Tag age and ID from Household Roster
    2. Tag age and ID from non-resident Roster
    3. Tag age and ID from partner Roster
    4. Tag MAR_PARTNER_LINK event history array variables
  2. Step 2: Extract selected variables
  3. Step 3: Find partner id for the 1st partner
  4. Step 4: Split 1st partner id into round number and loop number
  5. Step 5: Locate partner on partner roster and link to his/her age
  6. Step 6: For resident partners, locate partner on household roster and link to his/her age
  7. Step 7: For non-resident partners (prior to Round 1), get age from the non-resident roster in Round 1

The Additional information section provides the example's output and suggestions for extending the tutorial.

Step 1: Find and tag the relevant variables

First, be aware you will want to pull data from all rounds of the survey because the respondents could have first cohabitated in any of these rounds.

Second, in this example, we will require that respondents were interviewed in Round 10. That lets us use only the event history data created in Round 10. When working on your own projects, you can choose different criteria. For example, must the sample members have been interviewed after a certain age?

  1. Tag age and ID from Household Roster
    • Start with the variables from the household roster. You will need both the unique ids (UIDs) and ages of household members. In Round 1, the variables on the household roster begin with HHI2.
    • Search on Question Name (pick from list) starts with HHI2_U and you will get the unique id's for all household members in Round 1. If you search on Question Name (pick from list) starts with HHI2_A and Word in Title (enter search term) contains age, you will get the age of all household members in Round 1.
    • Next, you will need the unique ids and age from the household rosters for Rounds 2 and higher. If you search using Question Name (pick from list) starts with HHI_UI, you will get all the variables that contain the unique id numbers for the household members in Rounds 2 and higher. If you change the filter to Question Name (pick from list) starts with HHI_AG, the list of variables that result contains the ages and the estimated ages of the household members.
    • For this tutorial, we are not going to use the information in the estimated ages variables (asked only when the respondent can't report the age of the household member), so search one more time keeping the previous filter and including an additional one where Word in Title (enter search term) does NOT contain estimated.
  2. Tag age and ID from non-resident Roster
    • Pull the relevant variables from the non-resident roster. To find the unique ids of members of the non-resident roster, search with Question Name (pick from list) starts with NONHHI, Word in Title (enter search term) contains unique, and Year=1997. Tag these variables.
    • Next, to get the ages of the members of the non-resident roster, keep the filters for NONHHI and 1997, and add a new filter where Word in Title (enter search term) contains age.
  3. Tag age and ID from partner Roster
    • Tag the partner characteristics from the partner rosters. You will need to tag both id and uid variables. The ids (round and loop on the partner roster) match to the values on the MAR_PARTNER_LINK array. The uids match to the household roster variables. To find id and unique id (UID) on the partner roster, search with Question Name (pick from list) starts with PARTNE and Word in Title (enter search term) contains id. This will bring up partner ids and partner uids. Tag all of these variables.
    • Next, to find age from the partner roster, search with Question Name (pick from list) starts with YMAR-3 and Word in Title (enter search term) contains age. This brings up more variables than only the age of the partners.
    • Note: if you refine your search, by keeping the filters, and adding Word in Title (enter search term) contains start, you will be closer to only those variables that you want. This second search brings up the age of the partner and the estimated age of the partner when the couple started living together. You can further limit the search by adding a filter of Word in Title (enter search term) does NOT contain estimated. Now you have all the age and id variables for the partners.
  4. Tag MAR_PARTNER_LINK event history array variables
    • Last, you will need the variables that tell you with whom the respondent first cohabitated. The MAR_PARTNER_LINK monthly arrays provide the id of the partner in each month that the respondent is cohabitating. The partner id lets you link to partner roster variables. To find these, search Question Name (enter search term) contains MAR_PARTNER_LINK and tag the variables from April 1994 through May 2007.

Step 2: Extract selected variables

In Step 1, a variable tagset was created from the household rosters across rounds, the Round 1 non-resident roster, the partner rosters, and the Round 10 MAR_PARTNER_LINK array. Now you have partner ids, ages, and information about when the respondent first cohabitated and with whom.

Run an extract to create a data set and corresponding SAS/SPSS/Stata/R programs.

Using the NLS Investigator

To create a tagset of specific variables and then extract the data set, use the Save / Download Tab in the NLS Investigator.

Note that this tutorial uses SAS programming code, but the same logic applies to other statistical software packages. Also, the variables used in the program examples are based on question names from the NLSY97; in most cases they are altered to reflect the survey year and to be readable by the software.

Step 3: Find partner id for the first partner

To find the id for the first partner, scan the MAR_PARTNER_LINK array for the first element that contains a partner id. For instance, for respondent id=7525, the first partner id (501) is found in MAR_PARTNER_LINK_02_01 (February 2001).

In Round 10, the MAR_PARTNER_LINK array is available for the months April 1994 to May 2007.

Using SAS, the sample code provided below shows how to find the id of the first partner. The code first creates an array with all the MAR_PARTNER_LINK variables. It then loops through the array until SAS finds the first element in which the array contains a partner id. When the array first contains a value for partner id, the value of MAR_PARTNER_LINK is the id for the respondent's first partner. The sample creates two new variables:

  1. firstpid which contains the value of the 1st partner id.
  2. month1cohab which indicates the number of the element in array when the respondent first cohabitated.
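
The scan described above can be sketched in Python for readers who are not using SAS. This is only an illustration of the logic, not the tutorial's SAS code. The function name find_first_partner is hypothetical, and the rule that a positive value marks a partner id (with zero or negative values meaning no partner or a missing code) is an assumption you should check against the codebook.

```python
from typing import Optional, Sequence, Tuple


def find_first_partner(links: Sequence[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (firstpid, month1cohab) for one respondent.

    `links` holds the respondent's MAR_PARTNER_LINK values in calendar order.
    Assumption: a positive value is a partner id; anything else means
    no partner that month or a missing-data code.
    month1cohab is the 1-based element number, mirroring SAS array indexing.
    """
    for position, value in enumerate(links, start=1):
        if value > 0:
            return value, position
    return None, None  # respondent never cohabited in the observed months


# Made-up respondent: no partner for three months, then partner 501.
firstpid, month1cohab = find_first_partner([-4, -4, -4, 501, 501, 602])
print(firstpid, month1cohab)
```

Returning as soon as the first id is found keeps later partners (602 in the made-up data) from overwriting the first one.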

Step 4: Split 1st partner id into round number and loop number

The values of the MAR_PARTNER_LINK variables are the round followed by the partner number within that round. For instance, for sample member 7525, the MAR_PARTNER_LINK value in February of 2001 is 501, indicating that this partner is first listed in Round 5. You will need to look in the partner roster in this round. To do this more efficiently, split the partner id into its two components: round and number within that round (withinrnd).
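
As a rough Python illustration of this split (not the tutorial's SAS code), integer division can separate the two parts. The sketch assumes the partner number occupies the last two digits of the id, which is consistent with the 501 example but should be verified against your data. The function name split_partner_id is hypothetical.

```python
from typing import Tuple


def split_partner_id(firstpid: int) -> Tuple[int, int]:
    """Split a partner id into (round, withinrnd).

    Assumption: the last two digits are the partner number within the
    round and the leading digits are the round number.
    """
    rnd, withinrnd = divmod(firstpid, 100)
    return rnd, withinrnd


rnd, withinrnd = split_partner_id(501)
print(rnd, withinrnd)
```

Under the same assumption, an id of 1002 would split into Round 10, partner 2.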

Step 5: Locate partner on partner roster and link to his/her age

Using SAS, define three arrays:

  1. pid lists the partners' ids and is made up of the variables that indicate the partner's id. A match between the value of firstpid and a value in this array is needed to find the partner's age.
  2. puid lists the partners' unique ids and is made up of the variables that provide the unique id of the respondent's partners.
  3. page lists the partners' ages and is made up of the variables that provide the age of the respondent's partners, for partners who were not living in the respondent's household at the time of the interview.

All three arrays have 40 elements; that is, 10 rows (one for each round of the data) and 4 columns (4 is the maximum number of partners that any respondent reports in a round). In some rounds, all respondents report fewer than 4 partners. In that case, there is no variable for the 4th partner's age or unique id. This sample uses blank as a placeholder variable to fill out the array in these cases.

The sample code below shows how to loop through the arrays and find the 1st partner. The 1st partner will show up for the first time in the row of the pid array that corresponds to the variable round created in Step 4. Search over the elements of this row to find the partner id that is equal to firstpid. At this same location in the puid array, you will find the 1st partner's unique id, and at the same location in the page array, you will find the 1st partner's age. Create two new variables (see Step 5 code below):

  1. partuid which is the 1st partner's unique id.
  2. partage which is the 1st partner's age (again, for partners not living in the household at the date of the interview).
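
The lookup can be sketched in Python with three parallel 10 x 4 grids. This is an illustration of the logic only, not the tutorial's SAS code. The helper names (empty_grid, lookup_partner), the use of None as the stand-in for the blank placeholder, and all data values are made up for the example.

```python
from typing import List, Optional, Tuple

ROUNDS, MAX_PARTNERS = 10, 4  # array shape described in the tutorial

Grid = List[List[Optional[int]]]


def empty_grid() -> Grid:
    """Build a 10 x 4 grid of None, standing in for the `blank` placeholder."""
    return [[None] * MAX_PARTNERS for _ in range(ROUNDS)]


def lookup_partner(firstpid: int, rnd: int,
                   pid: Grid, puid: Grid, page: Grid
                   ) -> Tuple[Optional[int], Optional[int]]:
    """Return (partuid, partage) from the row for round `rnd`.

    The same (row, column) position is read in all three grids.
    """
    row = rnd - 1  # rounds are 1-based, Python lists are 0-based
    for col in range(MAX_PARTNERS):
        if pid[row][col] == firstpid:
            return puid[row][col], page[row][col]
    return None, None  # partner id not found in that round


pid, puid, page = empty_grid(), empty_grid(), empty_grid()
pid[4][0], puid[4][0], page[4][0] = 501, 5001, 22  # made-up Round 5 entry
partuid, partage = lookup_partner(501, 5, pid, puid, page)
print(partuid, partage)
```

A partner may be found with a unique id but no age in the page grid; in that case the age has to come from one of the other rosters.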

Step 6: For resident partners, locate partner on household roster and link to his/her age

Thus far, you have found the age of the respondent's first partner for partners with whom the respondent lived between rounds, but who were no longer living with the respondent at the date of the interview. The characteristics of partners with whom the respondents were living at the time of the interview are available on the household roster for that round.

You will use the partner's unique id to identify him or her on the household roster. To do this, create the hhuid array of household unique ids and a matching hhage array of household members' ages. There are up to 17 household members in some Round 1 households. That is the largest household size across rounds, so give both arrays 10 rows (one for each round) by 17 columns (largest household size).

The sample code below uses the round variable that you created in Step 4 to find the round in which the partner is first reported by the respondent. Then check the household unique ids for that round until you find the one that is the same as the partner's unique id. At that same location in the hhage array, you will find the 1st partner's age.
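
A minimal Python sketch of this household-roster lookup follows. It illustrates the logic rather than reproducing the tutorial's SAS code. The function name household_age and the data values are hypothetical, and None stands in for empty roster slots.

```python
from typing import List, Optional

ROUNDS, MAX_HH_SIZE = 10, 17  # array shape described in the tutorial

Grid = List[List[Optional[int]]]


def household_age(partuid: Optional[int], rnd: int,
                  hhuid: Grid, hhage: Grid) -> Optional[int]:
    """Return the partner's age from the household roster for round `rnd`.

    Returns None when the partner has no unique id or is not on that
    round's household roster.
    """
    if partuid is None:
        return None
    row = rnd - 1  # rounds are 1-based, Python lists are 0-based
    for col in range(MAX_HH_SIZE):
        if hhuid[row][col] == partuid:
            return hhage[row][col]
    return None


hhuid = [[None] * MAX_HH_SIZE for _ in range(ROUNDS)]
hhage = [[None] * MAX_HH_SIZE for _ in range(ROUNDS)]
hhuid[4][2], hhage[4][2] = 5001, 23  # made-up Round 5 household member
partage = household_age(5001, 5, hhuid, hhage)
print(partage)
```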

Step 7: For non-resident partners (prior to Round 1), get age from the non-resident roster in Round 1

There is one last possible location to find the age of the respondent's first partner and that is on the Round 1 non-resident roster. The structure is very similar to the partner roster or the household roster from previous steps.

Define two arrays for this step:

  1. nruid which contains the unique ids of all key non-residents that get rostered onto the Round 1 non-resident roster.
  2. nrage which contains the age of those on the non-resident roster.

Because we are only checking the Round 1 non-resident roster, the arrays have only one row. The longest non-resident roster in Round 1 has 23 people on it, thus we have 23 variables in the NLSY97 for unique id and 23 for age (and other characteristics) of the members of the non-resident roster. Consequently, the arrays used here have 23 elements.

If the variable for the 1st partner's unique id (partuid) has a value between 200 and 300, then that partner is on the Round 1 non-resident roster. The first line of the sample code below checks this condition and only executes the lines that loop through the elements in the array if the 1st partner is on the Round 1 non-resident roster. The sample code then checks to find the 1st partner's unique id, and if found, fills in the variable partage with the corresponding age in the array nrage.
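
The check and lookup can be sketched in Python as follows. This illustrates the logic only and is not the tutorial's SAS code. The function name nonresident_age and the roster values are made up, and treating "between 200 and 300" as inclusive at both ends is an assumption.

```python
from typing import Optional, Sequence

ROSTER_SIZE = 23  # longest Round 1 non-resident roster, per the tutorial


def nonresident_age(partuid: Optional[int],
                    nruid: Sequence[Optional[int]],
                    nrage: Sequence[Optional[int]]) -> Optional[int]:
    """Return the partner's age from the Round 1 non-resident roster.

    Assumption: unique ids from 200 to 300 (inclusive) belong to that roster.
    """
    if partuid is None or not (200 <= partuid <= 300):
        return None  # partner is not on the Round 1 non-resident roster
    for uid, age in zip(nruid, nrage):
        if uid == partuid:
            return age
    return None


# Made-up roster with three people; remaining slots are empty.
nruid = [201, 202, 203] + [None] * (ROSTER_SIZE - 3)
nrage = [45, 19, 17] + [None] * (ROSTER_SIZE - 3)
partage = nonresident_age(202, nruid, nrage)
print(partage)
```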

Additional information

Output

You have created a variable that tells you the age of respondent's first partner: partage.

Partage is defined for 4499 observations, has a mean value of 21.6059124, and a standard deviation of 4.2393980.

Extensions

What might be more relevant is the difference in age between the respondent and his or her first partner. This requires a few more steps but is easy to figure out.

  1. The respondent's month and year of birth are collected as part of the survey, KEY!BDATE_M and KEY!BDATE_Y.
  2. For respondents whose partner's age is reported as of the date that the couple began living together, you can calculate the respondent's age at that point: (168 + month1cohab) - ((KEY!BDATE_Y - 1980)*12 + KEY!BDATE_M) is the respondent's age in months at that point. Note that:
    • 168+month1cohab will give you the month of first cohabitation from the event history variables.
    • A created variable is available called: CV_FIRST_COHAB_MONTH, but the value recorded in this variable will not necessarily match the event history.
    • You would need to convert this age to whole years (divide by 12 and take the integer part), since the partner's age is reported in years.
  3. For respondents whose partner's age is reported as of the interview date, you can use the respondent's age at the interview. The difference between the respondent's age and the 1st partner's age can now be calculated.
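
The age calculation in item 2 can be sketched in Python. This is a hedged illustration: it reads the formula as (168 + month1cohab) minus the respondent's birth month counted from 1980, and it defines the age difference as partner minus respondent, which is a choice made here for the example. The function names, the plain argument names standing in for KEY!BDATE_Y and KEY!BDATE_M, and the sample values are all hypothetical.

```python
def respondent_age_years(month1cohab: int, bdate_y: int, bdate_m: int) -> int:
    """Respondent's age in whole years at first cohabitation.

    bdate_y and bdate_m stand in for KEY!BDATE_Y and KEY!BDATE_M.
    """
    cohab_month = 168 + month1cohab               # month of first cohabitation
    birth_month = (bdate_y - 1980) * 12 + bdate_m  # birth month on the same scale
    age_in_months = cohab_month - birth_month
    return age_in_months // 12                     # whole years, to match partage


def age_gap(partage: int, month1cohab: int, bdate_y: int, bdate_m: int) -> int:
    """Partner's age minus respondent's age (sign convention chosen here)."""
    return partage - respondent_age_years(month1cohab, bdate_y, bdate_m)


# Made-up respondent: born March 1982, month1cohab of 86, partner aged 21.
gap = age_gap(21, 86, 1982, 3)
print(gap)
```

Before relying on this, confirm that month1cohab counts months from the same starting point the formula assumes for your extract.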

Special notes

  • Before Round 9, cohabitation was collected only for opposite-sex couples. Beginning in Round 9, the NLSY97 began collecting spells of cohabitation including those with same-sex partners.
  • Timing of the age of the 1st partner differs for partners currently living in the household and those who are no longer living in the household. For partners currently living in the household, the respondent reports their age in the household section at the date of the interview. For partners no longer living with the respondent, the respondent reports their age when the couple started living together.
  • The variable CV_FIRST_COHAB_MONTH will not necessarily match the information in the event history arrays. Different rules are applied when constructing the created variable versus the event history array.