National Longitudinal Survey of Youth 1997 (NLSY97)

Appendix 5: Income and Assets Variable Creation

This appendix presents SAS programs for the creation of income and assets variables. Unless otherwise noted, the programs presented are for round 20. In cases where a variable has been discontinued, the program for the final round in which it was created is presented. Except for minor hand edits to account for inconsistencies in the raw data, variable creation code remains generally constant from round to round. However, users who need code for a specific round not included here may contact NLS User Services.

Variables created

  • CV_INCOME_FAMILY
  • CV_HH_POV_RATIO

This program creates the total family income variable and the household poverty ratio. The total family income variable (coded as groshhIY in the programming code) includes total annual cash receipts before taxes from all sources. The program then creates a ratio comparing the family's total income to the federal poverty threshold for the household, which depends on the number of household residents and the number of members under age 18. (For rounds 1-7, the questionnaire asked for the income of all household members rather than all family members living in the household. The total income variable for those rounds, CV_INCOME_GROSS_YR, therefore reflects total household income.)
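As a rough illustration of the ratio described above, the following Python sketch divides family income by a threshold looked up from household size and the number of members under 18. It is not the SAS program itself. The function name, the dictionary layout, and every dollar amount are hypothetical placeholders, and any scaling or topcoding applied to the released variable is not shown.

```python
from typing import Optional

# PLACEHOLDER thresholds keyed by (household size, members under 18).
# These dollar amounts are invented for illustration only; they are NOT
# official federal poverty thresholds.
HYPOTHETICAL_THRESHOLDS = {
    (1, 0): 10_000,
    (2, 0): 13_000,
    (2, 1): 13_500,
    (3, 1): 16_000,
    (4, 2): 20_000,
}


def poverty_ratio(family_income: Optional[float],
                  hh_size: int,
                  n_under_18: int) -> Optional[float]:
    """Return income divided by the matching threshold.

    Returns None when income is missing or no threshold is defined
    for the household composition.
    """
    threshold = HYPOTHETICAL_THRESHOLDS.get((hh_size, n_under_18))
    if family_income is None or threshold is None:
        return None
    return family_income / threshold
```

A family of four with two children and an income of 30,000 would have a ratio of 1.5 under these placeholder thresholds.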

Researchers should note that, like many income and asset variables in the data set, these two variables are topcoded to protect respondent privacy. More information about topcoding is available in the Income, Assets & Program Participation section.

Open the Household Income program file

Variables created

  • CVC_GOVNT_PRG_EVER--number of months cash or transfer payments were ever received from government programs
  • CVC_GOVNT_PRG_YR.80 - CVC_GOVNT_PRG_YR.xx--number of months cash or transfer payments were received from government programs during the year
  • CVC_AMT_GOVNT_PRG_YR.80 - CVC_AMT_GOVNT_PRG_YR.xx--total amount received from other government programs during the year

This program creates several variables describing the respondent's participation in government programs for the economically disadvantaged. During the interview, respondents report amounts received and months of participation in Aid to Families with Dependent Children (AFDC); food stamps; and Women, Infants, and Children (WIC). There is also an "other assistance" question to capture information about any other government program from which respondents may have received assistance.

The program first creates month-by-month participation arrays for four categories of assistance (AFDC, food stamps, WIC, and other programs). These month-by-month variables constitute part of the event history array for program participation; see Appendix 7: Continuous Month Scheme and Crosswalk for more information. After all the arrays are created, the program uses the array data to create the summary variables listed above.
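The two-step approach described above (a monthly array first, summaries second) can be sketched in Python as follows. This is an illustrative outline, not the SAS code. It assumes that the array starts in January 1980 and that receipt is reported as inclusive (year, month) start and end dates; all function names are hypothetical.

```python
START_YEAR = 1980  # assumption: first calendar year covered by the array


def month_index(year, month):
    """Zero-based position of a calendar month in the continuous array."""
    return (year - START_YEAR) * 12 + (month - 1)


def build_monthly_array(spells, end_year):
    """Flag each month of receipt with 1.

    spells: list of ((start_year, start_month), (end_year, end_month)),
    with both endpoints inclusive.
    """
    n_months = (end_year - START_YEAR + 1) * 12
    received = [0] * n_months
    for (sy, sm), (ey, em) in spells:
        for i in range(month_index(sy, sm), month_index(ey, em) + 1):
            if 0 <= i < n_months:  # ignore months outside the array
                received[i] = 1
    return received


def months_by_year(received):
    """Number of months of receipt in each calendar year."""
    n_years = len(received) // 12
    return {START_YEAR + y: sum(received[y * 12:(y + 1) * 12])
            for y in range(n_years)}


def months_ever(received):
    """Total number of months of receipt across the whole array."""
    return sum(received)
```

Building the array once and deriving every summary from it keeps the yearly and ever-received counts consistent with each other.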

Worker's Compensation. In early survey rounds, comparable variables were created for receipt of worker's compensation. These variables included:

  • CV_AMT_WC_YR (1997-2001)--dollar amount of the worker's compensation received during each year
  • CV_WC_WKS (1997)--number of weeks the respondent ever received worker's compensation
  • CV_WC_EVER (1998-99)--number of months the respondent ever received worker's compensation
  • CV_WC_YR (1997-1999)--number of months the respondent received worker's compensation during each year

Because worker's compensation can be reported either as a lump sum received on a single day or as payments over the period of time that the respondent was not working, the number of weeks/number of months summary variables were discontinued after round 3. The total benefits received in each calendar year were calculated through round 5. Since that time, respondents have simply been asked their total income from worker's compensation in each year and so no created variable is needed.

The logic and structure of the SAS code for the creation of the early-round worker's compensation variables are the same as those shown below for the other program participation variables. Because that code is embedded in the full program participation variable creation code for those rounds, it is not reproduced here. Users who need exact code should contact NLS User Services.

Open the Participation in Government Programs program file

Variables created

  • CV_UI_EVER
  • CV_UI_YR.80-CV_UI_YR.xx
  • CV_AMT_UI_YR.80-CV_AMT_UI_YR.xx
  • CV_UI_SPELLS_YR.80-CV_UI_SPELLS_YR.xx

This program creates several variables describing the respondent's receipt of unemployment compensation:

  1. CV_UI_YR.XX - indicates the number of months in any given year (from 1980 to most recent interview year) that R received Unemployment Compensation.
  2. CV_UI_EVER - indicates the total number of months (from 1980 to most recent interview year) that R received Unemployment Compensation.
  3. CV_AMT_UI_YR.XX - indicates the amount of Unemployment Compensation received in any given year (from 1980 to most recent interview year) by R.
  4. CV_UI_SPELLS_YR.XX - indicates the number of spells of Unemployment Compensation receipt that started in any given year (from 1980 to most recent interview year).

The program first creates a month-by-month participation array for unemployment compensation. These month-by-month variables constitute part of the event history array for program participation; see Appendix 7: Continuous Month Scheme and Crosswalk for more information. After the array is created, the program merges data to create the summary variables.
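One way to count spells by the year in which they start, given a month-by-month array, is sketched below in Python. It is a hedged illustration rather than the SAS logic: it assumes the array is a list of 0/1 flags beginning in January 1980 and treats any run of consecutive flagged months as a single spell. The function name is hypothetical.

```python
def spells_started_by_year(received, start_year=1980):
    """Count runs of consecutive 1s, attributed to the year each run begins.

    received: list of 0/1 monthly flags, index 0 = January of start_year
    (an assumption made for this sketch).
    """
    counts = {}
    previous = 0
    for i, flag in enumerate(received):
        year = start_year + i // 12
        counts.setdefault(year, 0)
        # A spell starts when a flagged month follows an unflagged one.
        if flag and not previous:
            counts[year] += 1
        previous = flag
    return counts
```

Under this definition, a spell that runs from November of one year through February of the next is counted once, in the year it began.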

Open the Unemployment Compensation program file

These variables were last created in round 12. The age 20 assets section was discontinued after that round as NLSY97 respondents had aged out of the section.

Variables created

  • CVC_HH_NET_WORTH_20 (total net worth)
  • CVC_HOUSE_VALUE_20 (value of owned housing)
  • CVC_HOUSE_DEBT_20 (amount of housing debt)
  • CVC_HOUSE_TYPE_20 (type of housing owned)
  • CVC_ASSETS_FINANCIAL_20 (value of financial assets)
  • CVC_ASSETS_NONFINANCIAL_20 (value of non-financial assets, excluding housing)
  • CVC_ASSETS_DEBTS_20 (amount of debt, excluding housing)
  • CVC_ASSETS_RND_20 (round in which assets data were collected)

This program first creates a set of asset variables for those respondents who went through the age 20 assets section in round 12. The variables used in this section are listed in a separate file (due to the length of the list).

The program then combines the round 12 variables with the corresponding variables from rounds 3-11, creating a set of "collapsed" variables that includes age 20 assets information for all respondents, regardless of the round in which the data were collected. These variables are also included for all respondents, even if they were not interviewed in round 12. The individual round created variables are not available in the data set; only the collapsed variables are included.
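The collapsing step can be pictured with the short Python sketch below, which selects the one round in which a respondent completed the section and records both the value and the round. It assumes that rounds without data are represented as None and that at most one round holds a value per respondent. The actual data use their own missing-value codes, and the function name is hypothetical.

```python
from typing import Dict, Optional, Tuple


def collapse_across_rounds(
    values_by_round: Dict[int, Optional[float]]
) -> Tuple[Optional[float], Optional[int]]:
    """Return (value, round) for the earliest round with a non-missing value.

    Returns (None, None) when the respondent never completed the section.
    """
    for rnd in sorted(values_by_round):
        value = values_by_round[rnd]
        if value is not None:
            return value, rnd
    return None, None
```

The second element of the result plays the role of a round indicator such as CVC_ASSETS_RND_20.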

Researchers should note that, like many income and asset variables in the data set, these variables are topcoded to protect respondent privacy. More information about topcoding is available in the Income, Assets & Program Participation section.

Open the Over 20 Assets program file

Variables created

  • CVC_HH_NET_WORTH_25 (total net worth)
  • CVC_HOUSE_VALUE_25 (value of owned housing)
  • CVC_HOUSE_DEBT_25 (amount of housing debt)
  • CVC_HOUSE_TYPE_25 (type of housing owned)
  • CVC_ASSETS_FINANCE_25 (value of financial assets)
  • CVC_ASSETS_NONFINANCE_25 (value of non-financial assets, excluding housing)
  • CVC_ASSETS_DEBT_25 (amount of debt, excluding housing)
  • CVC_ASSETS_RND_25 (round in which assets data were collected)

Round 9 initiated the age 25 assets section. Like the age 20 assets section, this module is administered in the first interview after the respondent's 25th birthday. These variables report the respondent's assets at age 25. As with the age 20 assets section, the variables are presented as collapsed variables which contain data from all survey rounds in which the section was administered. (See the description of the Over 20 Assets variables above for more information.)

Researchers should note that, like many income and asset variables in the data set, these variables are topcoded to protect respondent privacy. More information about topcoding is available in the Income, Assets & Program Participation section.

Open the Over 25 Assets program file

Variables created

  • CVC_HH_NET_WORTH_30 (total net worth)
  • CVC_HOUSE_VALUE_30 (value of owned housing)
  • CVC_HOUSE_DEBT_30 (amount of housing debt)
  • CVC_HOUSE_TYPE_30 (type of housing owned)
  • CVC_ASSETS_FINANCE_30 (value of financial assets)
  • CVC_ASSETS_NONFINANCE_30 (value of non-financial assets, excluding housing)
  • CVC_ASSETS_DEBT_30 (amount of debt, excluding housing)
  • CVC_ASSETS_RND_30 (round in which assets data were collected)

Round 14 initiated the age 30 assets section; the final round in which these questions were included was round 18. Like the age 20 and 25 sections, this module is administered in the first interview after the respondent's 30th birthday. These variables report the respondent's assets at age 30. As with the earlier sections, the variables are presented as collapsed variables which contain data from all survey rounds in which the section was administered. (See the description of the Over 20 Assets variables above for more information.)

Researchers should note that, like many income and asset variables in the data set, these variables are topcoded to protect respondent privacy. More information about topcoding is available in the Income, Assets & Program Participation section.

Variables used

Variables used in the programs that generated the 2017 (round 18) assets30 created variables are listed in the programming file. In general, several naming conventions are used:

  • Survey year was deleted from the end of each extracted qname
  • Decimals (.) and hyphens (-) in qnames were replaced with underscores (_)
  • Tildes (~) were eliminated from qnames
  • Qnames beginning with "yast" were shortened to "yas"
  • Qnames beginning with "yhhi" were shortened to "hhi"
  • Qnames beginning with "ysch" were converted to "e", and the loop numbers were collapsed to single digits
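The first five conventions above can be approximated with the Python sketch below. It is only an illustration. The function name and the sample qnames in the usage note are hypothetical, the survey year is assumed to appear as a trailing "_YYYY" suffix, and names are lowercased as an assumption. The "ysch" rule is omitted because the way loop numbers are collapsed is not specified here.

```python
import re


def normalize_qname(qname: str) -> str:
    """Apply the documented renaming conventions to an extracted qname."""
    name = qname.lower()  # assumption: program names are lowercase
    # Drop a trailing survey year, assumed to be a "_YYYY" suffix.
    name = re.sub(r"_(19|20)\d{2}$", "", name)
    # Remove tildes.
    name = name.replace("~", "")
    # Replace decimals and hyphens with underscores.
    name = re.sub(r"[.\-]", "_", name)
    # Shorten the documented prefixes.
    if name.startswith("yast"):
        name = "yas" + name[4:]
    elif name.startswith("yhhi"):
        name = "hhi" + name[4:]
    return name
```

For example, the made-up qname "YAST30-4100_2017" becomes "yas30_4100", and "YHHI-55701.01_2017" becomes "hhi_55701_01".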

Open the Over 30 Assets program file

Variables created

  • CVC_HH_NET_WORTH_35 (total net worth)
  • CVC_HOUSE_VALUE_35 (value of owned housing)
  • CVC_HOUSE_DEBT_35 (amount of housing debt)
  • CVC_HOUSE_TYPE_35 (type of housing owned)
  • CVC_ASSETS_FINANCE_35 (value of financial assets)
  • CVC_ASSETS_NONFINANCE_35 (value of non-financial assets, excluding housing)
  • CVC_ASSETS_DEBT_35 (amount of debt, excluding housing)
  • CVC_ASSETS_RND_35 (round in which assets data were collected)

Round 17 initiated the age 35 assets section. Like the earlier assets sections at each 5-year age mark, this module is administered in the first interview after the respondent's 35th birthday. These variables report the respondent's assets at age 35. As with the earlier sections, the variables are presented as collapsed variables which will eventually contain data from all survey rounds in which the section is administered. (See the description of the Over 20 Assets variables above for more information.)

Researchers should note that, like many income and asset variables in the data set, these variables are topcoded to protect respondent privacy. More information about topcoding is available in the Income, Assets & Program Participation section.

Open the Over 35 Assets program file

Variables created

  • CVC_HH_NET_WORTH_40 (total net worth)
  • CVC_HOUSE_VALUE_40 (value of owned housing)
  • CVC_HOUSE_DEBT_40 (amount of housing debt)
  • CVC_HOUSE_TYPE_40 (type of housing owned)
  • CVC_ASSETS_FINANCE_40 (value of financial assets)
  • CVC_ASSETS_NONFINANCE_40 (value of non-financial assets, excluding housing)
  • CVC_ASSETS_DEBT_40 (amount of debt, excluding housing)
  • CVC_ASSETS_RND_40 (round in which assets data were collected)

Round 20 initiated the age 40 assets section. Like the earlier assets sections at each 5-year age mark, this module is administered in the first interview after the respondent's 40th birthday. These variables report the respondent's assets at age 40. As with the earlier sections, the variables are presented as collapsed variables which will eventually contain data from all survey rounds in which the section is administered. (See the description of the Over 20 Assets variables above for more information.)

Researchers should note that, like many income and asset variables in the data set, these variables are topcoded to protect respondent privacy. More information about topcoding is available in the Income, Assets & Program Participation section.

Open the Over 40 Assets program file

The Household Net Worth (PDF) file presents code for the creation of several round 1 household income and assets variables:

  • CV_HH_NET_WORTH_Y
  • CV_HH_NET_WORTH_P
  • CV_INCOME_GROSS_YR
  • CV_HH_INCOME_SOURCE
  • CV_HH_POV_RATIO