NLSY79

NLSY79 Appendix 19: SF-12 Health Scale Scoring

The SF-12, which stands for short-form 12-question, is a brief inventory of self-reported mental and physical health. This scale was administered to respondents who had turned 40 since their last interview as part of the age 40+ health module, included in the 1998, 2000, 2002, 2004, and 2006 surveys, and in the 50+ health module administered in the 2008, 2010, 2012, 2014 and 2016 surveys. The 2018 and 2020 surveys contain the 60+ health module for respondents born in 1957 or 1958, as well as a small number of under-60 respondents in 2018 who had not previously been administered the 50+ module.

Rather than using the twelve questions separately, SF-12 users often create two summary scores:

  • PCS-12 or Physical Component Summary (measures physical health): question names H40-SF12_PCS_SCORE, H50-SF12_PCS_SCORE, H60-SF12_PCS_SCORE
  • MCS-12 or Mental Component Summary (measures mental health): question names H40-SF12_MCS_SCORE, H50-SF12_MCS_SCORE, H60-SF12_MCS_SCORE

CHRR has received permission to calculate these summary scores for NLSY79 respondents and release the scores with the main data set. Scores are created according to the manual by Ware, Kosinski, and Keller (1995) and are provided on the data set with the question names listed above. However, we are not permitted to release the scoring formula; interested users should visit the QualityMetric website for SF survey information. For users looking at the SF surveys on the Internet, note that the NLSY79 uses version 1, not version 2, of the survey.

In large national surveys of the entire U.S. population, both the PCS-12 and MCS-12 have a mean of 50 and a standard deviation of 10. The interpretation of these two scores is straightforward. NLSY79 respondents with a score above 50 have better health than the typical person in the general U.S. population (age is not held constant). NLSY79 respondents with scores below 50 have worse health than the typical U.S. person. Each one-point difference above or below 50 corresponds to one-tenth of a standard deviation. For example, a person with a score of 30 is two standard deviations below the mean.
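The arithmetic behind this interpretation can be sketched in a few lines of Python. The function name is illustrative only and is not part of any NLSY79 or QualityMetric tooling; the sketch assumes nothing beyond the population mean of 50 and standard deviation of 10 stated above.

```python
POPULATION_MEAN = 50.0  # general U.S. population mean stated above
POPULATION_SD = 10.0    # general U.S. population standard deviation stated above


def sd_from_population_mean(score: float) -> float:
    """Return how many population standard deviations a score lies from 50.

    Hypothetical helper for illustration. Negative values indicate worse
    self-reported health than the typical U.S. person.
    """
    return (score - POPULATION_MEAN) / POPULATION_SD


print(sd_from_population_mean(30))  # -2.0, two standard deviations below the mean
print(sd_from_population_mean(55))  # 0.5, half a standard deviation above the mean
```

This conversion applies to the released PCS and MCS scores only; it says nothing about how the scores themselves are computed, since the scoring formula is not released.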

Table 1. Summary Statistics for NLSY79 SF-12 Scores

NLSY79 Survey Year   Health Module   PCS Mean    PCS StDev   MCS Mean    MCS StDev
1998                 40+             52.54234    7.36181     53.09726    7.84907
2000                 40+             51.9258     8.3178      52.8005     8.7351
2002                 40+             51.8369     8.22884     52.9219     8.45037
2004                 40+             52.0493     8.05267     52.9520     8.39288
2006                 40+             51.2377     8.3259      53.0284     8.1516
2008                 50+             49.3411     10.1989     52.2461     8.9656
2010                 50+             48.9792     10.3186     52.9244     8.9174
2012                 50+             49.1479     10.1989     52.9057     9.1790
2014                 50+             48.9292     10.4001     53.2262     8.6177
2016                 50+             48.5636     11.7487     52.3556     10.6123
2018                 50+             48.9917     11.4059     53.8298     8.9462
2018                 60+             46.5358     11.4087     52.8598     8.7285
2020                 60+             45.1580     12.9898     52.0627     10.7300

As Table 1 indicates, the typical NLSY79 respondent self-reports better health than the typical U.S. respondent while in their 40s. This matches the results described in the SF-12 scoring manual, which in addition to population norms reports the norms for U.S. residents who are between the ages of 35 and 44. The manual reports the mean PCS score for this subgroup as 52.18 (std. dev. 7.30) and the mean MCS score as 50.1 (std. dev. 8.62).

The SF-12 scoring manual also indicates that for U.S. residents between the ages of 45 and 54, the mean PCS score is 49.71 (std. dev. 9.5) and the mean MCS score is 50.45 (std. dev. 9.55). NLSY79 respondents in 2008 have higher MCS scores than the overall population and similar PCS scores.

Overall, the SF-12 manual shows higher-than-average physical scores prior to age 55 and rapidly falling scores after that age. Mental scores do not appear to decline with age. Information from the NLSY79 40+, 50+, and 60+ health modules appears to match this pattern.

Reference

Ware, John, Mark Kosinski and Susan Keller. 1995. SF-12: How to Score the SF-12 Physical and Mental Health Summary Scales, 2nd edition. Boston: The Health Institute, New England Medical Center.

NLSY79 Appendix 18: Work History Data

NLSY79 WEEK NUMBERS AND CORRESPONDING DATES (Separate Excel File). The Continuous Week Crosswalk contains the start date for each week (Sunday) from January 1, 1978, through December 31, 2021, and the week numbers assigned to that week in the construction of the work history data file. These week numbers do not match the week numbers printed on the employment calendar included with the survey instrument materials for earlier survey years. Week numbers in the work history programs are assigned based upon actual dates collected during the course of the interview. The variable names for the week-by-week arrays (status, hours, dual jobs) incorporate the specific year and number of the week within the specific year. For example, the 10th week in 1989 in the status array is called STAT8910. These names do not correspond to the strictly consecutive week numbers from 1-2184 listed in the Excel spreadsheet. The spreadsheet also contains the week numbers for each calendar year so that users will have a crosswalk for both calendar-year and continuous week numbers.
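As a rough illustration of the two numbering schemes described above, the following Python sketch derives a continuous week number from a calendar date and builds a year-specific variable name. Both helper names are hypothetical. The sketch assumes that weeks run Sunday through Saturday starting January 1, 1978, and that the week within the year is zero-padded to two digits; the naming pattern is taken only from the STAT8910 example, and how years from 2000 onward are named is not described here. The Excel crosswalk remains the authoritative source.

```python
from datetime import date

WEEK_ONE_START = date(1978, 1, 1)  # a Sunday; start of continuous week 1


def continuous_week(day: date) -> int:
    """Continuous week number of the Sunday-to-Saturday week containing `day`.

    Assumption for illustration: any date before January 1, 1978 maps to 0.
    """
    if day < WEEK_ONE_START:
        return 0
    return (day - WEEK_ONE_START).days // 7 + 1


def status_variable_name(year: int, week_in_year: int) -> str:
    """Build a name such as STAT8910 (zero-padding is an assumption)."""
    return f"STAT{year % 100:02d}{week_in_year:02d}"


print(continuous_week(date(1978, 1, 8)))  # 2
print(status_variable_name(1989, 10))     # STAT8910
```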

DESCRIPTION OF THE 1979-2018 NLSY79 WORK HISTORY PROGRAM

This document provides a general explanation of the procedures and logic of the work history programming and variables. The original PL/I programming was used to establish and maintain the structure and data from survey years 1979 to 1994. Therefore, the following discussion heavily references these programs. The series of SQL programs currently in use were converted directly from the PL/I code for the 1996 release.

The original PL/I work history program was written to create key work variables like "Number of Weeks Worked since Date of Last Interview," "Number of Weeks Worked in Last Calendar Year," etc. These key variables use all recorded jobs for each respondent (up to 10 jobs). The WEEKLY LABOR STATUS, HOURS WORKED, and DUAL JOBS arrays also were created with data from up to 10 jobs for each respondent. However, only 1% of all respondents have more than 5 jobs in any given survey year, resulting in valid missing data for jobs 6 through 10 for 99% of the sample. In order to reduce the total number of variables, public data files contain the specific employer variables for only 5 jobs for each respondent.

The purpose of the WEEKLY LABOR STATUS, HOURS WORKED and DUAL JOBS arrays is to create a longitudinal work history record for each respondent through the 2018 (round 28) interview date. Because each year's survey collects information on jobs held and periods not working since the date of the last interview, it is possible to construct a continuous, week-by-week record for each respondent.

There are a few exceptions, however. In the 1979 and 1980 surveys, job information was collected only for respondents age 16 and older at the date of the interview. Additionally, the 1979 survey data contain the most cases with inconsistent or invalid employment-related data of any survey year, resulting in a greater proportion of missing gaps in the work history record. For example, in 1979 there are 86 cases that have job dates that exceed the interview date; in 1980, there are 11 cases that have job dates that exceed the interview date; in 1981 there are none.

Users should also note that 1,079 members of the military sample were dropped as of the 1985 survey. In 1991, all members of the economically disadvantaged non-black/non-Hispanic oversample were dropped as well. More information on these sample types is available in the Retention and Reasons for Noninterview section.

Description of the 1979-94 PL/I Program

The following is an abbreviated step-by-step description of the original 1979-1994 PL/I programming. In 1996, the PL/I program was converted to SQL code in a series of programs that replicate the PL/I program and functions.

  1. All of the variables used in the program are declared and most are included in the PL/I structure called VARIABLES.
  2. The variables common to all respondents, like ID, SAMPLE_ID, etc. are assigned values. The week-by-week arrays are initialized to zero and all of the variables included in the WORK_HISTORY part of the structure are initialized to -4.
  3. For each interview year, procedures (VARIABLES1979, VARIABLES1980, etc.) that assign the variables for each survey year are called if the respondent was interviewed. Start and stop dates for jobs and periods not working are sent to the WEEK procedure, where the valid month, day and year variables are converted to a week number, with week 1 being January 1, 1978. If the respondent was not interviewed, then all WORK_HISTORY variables for that survey year are set to -5.
  4. After all VARIABLES19XX are assigned, the procedure CALC is called to evaluate the various start and stop dates, to assign codes, and to create the job number for all of the jobs for each interview year. Within CALC, the procedure FILL is called to fill in the codes that are assigned to the WEEKLY LABOR STATUS and DUAL JOBS arrays and to calculate the hours worked during each week that are loaded into the HOURS WORKED array.
  5. Finally, the procedure SUMMER is called to calculate and sum the key work history variables.

CALC Procedure (in original PL/I program)

This procedure processes all jobs for each survey year, beginning with the first job. For each survey year, CALC starts by calculating the number of jobs since the date of the last interview, assigning a job number, and calculating the hourly wage for each job. If the respondent had the job at the date of the last interview, the start date becomes the date of the last interview, which is then "ceiled" or rounded up using the "ceil" function. Next, if the respondent is currently working at the job, CALC assigns the interview date, which is "floored" or rounded down using the "floor" function, as the stop date. (All dates at this point have been converted to week numbers in the WEEK procedure.)

If the start and stop dates of the job are valid and do not coincide with an interview date, the start and stop dates are "ceiled." The number of weeks tenure on the job is calculated by subtracting the start week from the stop week of the job. FILL is then called to fill in the week arrays for the particular job. The start and stop weeks of the job, the job number, and the number of hours usually worked per week (HOURSWEEK) at the job are sent to the FILL procedure.
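The rounding and tenure rules in the two paragraphs above can be sketched as follows. This is not the PL/I or SQL code, which is not available to users; the function and parameter names are hypothetical, and the sketch assumes that the WEEK procedure yields fractional week numbers that CALC then rounds.

```python
import math


def job_weeks(start_week, stop_week, *, held_at_last_interview,
              currently_working, last_interview_week, interview_week):
    """Return (start, stop, tenure) in whole weeks for one job.

    Assumed inputs: fractional week numbers relative to January 1, 1978.
    """
    if held_at_last_interview:
        # Start date becomes the date of the last interview, then is ceiled.
        start = math.ceil(last_interview_week)
    else:
        start = math.ceil(start_week)
    if currently_working:
        # Stop date becomes the interview date, which is floored.
        stop = math.floor(interview_week)
    else:
        stop = math.ceil(stop_week)
    # Tenure since the last interview: stop week minus start week.
    return start, stop, stop - start


print(job_weeks(100.3, 120.6, held_at_last_interview=False,
                currently_working=False, last_interview_week=90.0,
                interview_week=130.0))  # (101, 121, 20)
```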

If the job had any periods not working associated with it, then up to four periods not working for the employer are processed. If the start and stop dates for the periods not working are valid, a code is assigned indicating whether the respondent was out of the labor force (OLF) or unemployed for the period. If the respondent is OLF the whole period, a code of 4 is assigned. If the period not working is divided between OLF and unemployed, a temporary code of 9 is assigned and the number of weeks unemployed is determined. If the start and stop dates of the period are valid, but the labor force status cannot be determined, a code of 2 is assigned.

The period start and stop dates, CODE, and HOURSWEEK are sent to FILL. If the period dates are invalid, a code of 3 is assigned and start and stop dates of the job are passed to FILL, along with HOURSWEEK. This is only done for the first period not working for the first employer this week.

Next, tenure at the job is again calculated, this time in terms of total weeks on the job instead of just since the date of the last interview. First, a determination is made to see if the employer is the same employer a respondent reported at the time of the previous interview. If there is a previous employer number and the tenure for that previous employer is valid, then the tenure for the job from the previous interview is added to the tenure for the job being processed. Only tenure with an employer that is reported during contiguous survey years can be calculated over the total time spent with an employer. For example, consider a respondent who was interviewed in 1981, 1982 and 1983 surveys. Now suppose the respondent reported having worked for the Department of Labor at the time of the 1981 survey and left and then began working for that same employer again by the time of the 1983 survey. Because the employer numbers are only followed between contiguous interviews, there is no way to calculate total tenure with the Labor Department since the respondent did not report that employer during the 1982 survey. Only employers from the previous year's survey are compared with employers reported in the current year's survey.
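A minimal sketch of the cumulative tenure rule, under stated assumptions: the function name is hypothetical, `previous_tenure` is `None` when the employer was not reported at the immediately preceding interview, and negative values stand in for missing-data codes and are treated as invalid.

```python
from typing import Optional


def total_tenure(current_tenure: int, previous_tenure: Optional[int]) -> int:
    """Total weeks with an employer, linking only contiguous interviews."""
    if previous_tenure is None or previous_tenure < 0:
        # No link to the previous interview, or invalid prior tenure:
        # only the weeks since the last interview can be counted.
        return current_tenure
    return current_tenure + previous_tenure


print(total_tenure(52, 104))   # 156
print(total_tenure(52, None))  # 52
```

In the Department of Labor example above, the 1983 job would fall into the second case: the employer was not reported in 1982, so no earlier tenure is carried forward.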

Finally, CALC evaluates up to six periods not working or in the military between jobs. For each of the periods not working, the same logic used for the periods not working on a job is used for the periods between jobs.

User note

  1. If the start and stop dates for a job are invalid, then that job has no dates that can be sent to FILL. As a result, there is no record of that job in the WEEKLY LABOR STATUS array and no indication that the job is missing. In 1979, there were 1190 cases with any invalid start or stop dates (i.e., at least one week is unaccounted for - WEEKLY LABOR STATUS=0); in 1980, there were 942 cases; in 1981, there were 254; and in each of the following survey years, there were fewer than 200 cases.
  2. A job held in any day of a week is counted as a job for the whole week. This is achieved by "flooring" start dates and "ceiling" stop job dates to integer week values. There is one exception previously mentioned--stop dates for jobs held at the interview date are floored. This is done to avoid double counting across interview years.
  3. Start and stop dates for periods not working either within tenure with a job or between jobs are "ceiled" in FILL.
  4. The HOURS WORKED array is set to -3 if any job in the week has an invalid value for HOURSWEEK. Between 1979 and 1992, the maximum number of hours for any given week is 96. Beginning in 1993, the maximum number of hours for a given week can be reported up to 168 hours (the total number of hours possible in a single week).

FILL Procedure

The FILL procedure takes the start and stop dates that have been converted to week number values and fills in values for the WEEKLY LABOR STATUS, HOURS WORKED and DUAL JOBS arrays for each week between stopping and starting dates that are passed to it.

In FILL, the WEEKLY LABOR STATUS array is loaded with either a survey year job number or a code signifying that there was not one civilian job that week (a code of 0, 2, 3, 4, 5, or 7). The DUAL JOBS array is loaded with a survey year job number(s) if more than one civilian job is held that week; otherwise it is assigned a value of 0. The HOURS WORKED array is loaded with the number of hours worked on all jobs held that week, up to a maximum of 96 through 1992, and a maximum of 168 in subsequent years.
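The weekly hours cap can be illustrated with a short sketch. The function name is hypothetical, and representing an invalid HOURSWEEK as a negative number is an assumption made for this example; the -3 result follows the user note stating that HOURS WORKED is set to -3 if any job in the week has an invalid HOURSWEEK value.

```python
def weekly_hours(job_hours, survey_year):
    """Sum usual hours across jobs held in a week, applying the cap.

    Assumption: a negative entry in `job_hours` marks an invalid HOURSWEEK.
    """
    if any(h < 0 for h in job_hours):
        return -3  # invalid skip for the whole week
    cap = 96 if survey_year <= 1992 else 168
    return min(sum(job_hours), cap)


print(weekly_hours([40, 60], 1990))  # 96
print(weekly_hours([40, 60], 1994))  # 100
```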

FILL is called from the CALC procedure for all start and stop dates except for military start and stop dates. Military start and stop dates are determined in the VARIABLES procedures for each year, and FILL is called from those procedures to fill in a code of 7 in the WEEKLY LABOR STATUS array for active military service.

Initially, FILL checks for valid start and stop dates. If the dates are valid, then FILL takes one of three paths. The first path is to evaluate the WEEKLY LABOR STATUS array for that week to see (1) if it contains a job number, (2) if the code passed from CALC is a job number, and (3) if the previous employer number for the job is different from the job number in the WEEKLY LABOR STATUS array. If all of these statements are true, then FILL determines that the job is not a duplication of the job that exists in the WEEKLY LABOR STATUS array for that week.

Next, FILL looks at the DUAL JOBS array to see if there is a job number in DUAL JOBS. If DUAL JOBS already has a job number(s), then the current job number is compared to the job number(s) in DUAL JOBS. If the job number does not exist in DUAL JOBS, then the HOURSWEEK for that job is added to the number of hours for that week for the HOURS WORKED array and the job number is added to DUAL JOBS. If the job is a duplicate job, then nothing is done to the arrays.

The second path is taken if there is no dual job and if the week dates are associated with a job or if there is no job number in the WEEKLY LABOR STATUS array. If this is the case, FILL tests for two conditions. The first condition is met if COD is 9. (A code of 9 means that the respondent had a period not working that was part OLF and part unemployed.) If COD equals 9, then the HOURSWEEK are subtracted from the hours in the HOURS WORKED array, because the respondent is not working at the job. The number of weeks unemployed (code of 4) is arbitrarily assigned to the middle portion of the weeks not working, and the rest of the period is determined to be OLF (code of 5).
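The "middle portion" assignment for a mixed period can be sketched as below. The exact placement used by the work history programs is not documented here, so this is one plausible reading: the function name is hypothetical, the period endpoints are treated as inclusive, and when the OLF weeks cannot be split evenly the extra OLF week is placed after the unemployed block.

```python
UNEMPLOYED = 4
OLF = 5


def split_mixed_period(start_week, stop_week, weeks_unemployed):
    """Return {week: code} with unemployed weeks centered in the period."""
    weeks = list(range(start_week, stop_week + 1))  # inclusive (assumption)
    n_unemployed = max(0, min(weeks_unemployed, len(weeks)))
    # Number of OLF weeks that precede the unemployed block.
    lead = (len(weeks) - n_unemployed) // 2
    return {
        week: UNEMPLOYED if lead <= i < lead + n_unemployed else OLF
        for i, week in enumerate(weeks)
    }


codes = split_mixed_period(100, 109, 4)
print([codes[w] for w in range(100, 110)])  # [5, 5, 5, 4, 4, 4, 4, 5, 5, 5]
```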

The second condition in the second path tests to see if the value in the WEEKLY LABOR STATUS array is not a code of 4; if COD is a job number then the job number is placed into WEEKLY LABOR STATUS. If there are hours for the week and if the respondent was not working for the employer during this week, then the hours for the week are set to zero if the HOURS WORKED array is greater than zero. Otherwise, HOURS WORKED receives whatever value is in HOURSWEEK.

The third path FILL evaluates is if the week falls in a period not working and if there is a dual job. Then, the job number is deleted from the DUAL JOBS array and HOURSWEEK for the job are subtracted from the HOURS WORKED array.

Finally, if there are more than four dual jobs in the DUAL JOBS array then no other job numbers are added to DUAL JOBS because the array for each week is limited to four dual job variables.
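The duplicate check and the four-slot limit on DUAL JOBS can be sketched for a single week as follows. The names are hypothetical. What happens to the hours when the array is already full is not described above; leaving them unchanged is an assumption of this sketch.

```python
MAX_DUAL_JOBS = 4  # each week has at most four dual job variables


def add_dual_job(dual_jobs, hours, job_code, hours_week):
    """Return the updated (dual_jobs, hours) for one week.

    `dual_jobs` is the list of job codes already recorded as dual jobs.
    """
    if job_code in dual_jobs:
        return dual_jobs, hours  # duplicate job: nothing is done
    if len(dual_jobs) >= MAX_DUAL_JOBS:
        return dual_jobs, hours  # array full (hours handling is assumed)
    return dual_jobs + [job_code], hours + hours_week


print(add_dual_job([305], 40, 306, 10))  # ([305, 306], 50)
print(add_dual_job([305], 40, 305, 10))  # ([305], 40)
```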

User note

A few last notes about FILL:

  1. Civilian work takes precedence over any other activity. If the respondent has a civilian job while in the military, then the civilian job code replaces the military code in the WEEKLY LABOR STATUS array.
  2. The order of precedence in the construction of the WEEKLY LABOR STATUS array after a civilian job is as follows:
    1. a code of 3, associated with an employer but periods not working with employer are missing; if any period not working is missing, then the entire period of the job is assigned a 3. In 1979, there are 274 cases with invalid period dates, and in each of the following survey years, there are fewer than 60 cases
    2. a code of 4, unemployed
    3. a code of 5, OLF
    4. a code of 2, period not working with employer, but OLF vs unemployed status is unknown
    5. a code of 7, active military service
    6. a code of 0, no information is reported to account for the week
  3. About 32 cases have a week in which JOB # 1 from a survey week first appears in the DUAL JOBS array rather than the WEEKLY LABOR STATUS array. This occurs when (1) there is a discrepancy between the date of the previous interview date as it appears on the info sheet that the interviewer uses at the time of the interview and the interview date recorded at the previous interview or (2) the starting date and ending date for a job across interview years are the same due primarily to the way the dates are floored and ceiled. In all these cases, an erroneous entry appears in the DUAL JOBS array for that given week.
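The order of precedence in items 1 and 2 above can be expressed as a small lookup. The function name is hypothetical, and choosing the lowest job code when several civilian jobs compete is a simplification for this sketch rather than a documented rule of FILL.

```python
# Non-job codes from highest to lowest precedence, per the list above.
NON_JOB_PRECEDENCE = [3, 4, 5, 2, 7, 0]


def resolve_status(candidates):
    """Pick the WEEKLY LABOR STATUS value for a week from candidate codes."""
    job_codes = [c for c in candidates if c > 100]
    if job_codes:
        # Civilian work takes precedence over any other activity.
        return min(job_codes)  # simplification: lowest job code wins
    for code in NON_JOB_PRECEDENCE:
        if code in candidates:
            return code
    return 0  # no information reported to account for the week


print(resolve_status([7, 301]))  # 301
print(resolve_status([5, 4]))    # 4
print(resolve_status([2, 7]))    # 2
```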

CHANGES TO THE WORK HISTORY DATA

There have been a number of changes and updates to the programs and input variables used to create the Work History data over the life of the NLSY79. The most significant programming change was the conversion of the programming from PL/I to equivalent SQL code. A list of the main NLSY79 variables used in the creation of the 1979-96 work history data set is accessible at the end of this appendix.

Other changes and updates are detailed in Changes in the NLSY79 Work History Data.

DESCRIPTION AND CODES FOR VARIABLES IN 1979-2018 NLSY79 WORK HISTORY DATA

Below are discussions of three types of variables:

  • the weekly arrays created by the Work History programs 
  • other items produced by the Work History programs 
  • variables that are either used in the Work History programs, or are basic, commonly used job-specific survey items that were duplicated on the separate Work History data sets prior to the 1979-2000 release.  When the Work History data became part of the general public 1979-2000 release, these variables were assigned to areas of interest titled WORK HISTORY --MAIN -- JOB INFORMATION [YEAR], to make it easier for historical data users to recreate the separate Work History data files they may have been working with up to that point.

Variable coding information, as well as formulas for combining job-specific characteristics from several sources, are included where relevant.

Work history weekly array variables

The foundation of the work history data is the set of week-by-week arrays depicting labor force status, total number of hours, and dual job holdings if any, for each week since January 1, 1978. These array variables are found in three areas of interest in the NLSY79 public release. The construction and coding for each of the three arrays are described below, listed by their area of interest.

Area of interest: WORK HISTORY-WEEKLY LABOR STATUS

The WEEKLY LABOR STATUS array is the work history week array. Each variable corresponds to a week relative to 1/1/78.[1] There are 2185 variables in the 1979-2018 WEEKLY LABOR STATUS array--one for week #0 and one for each of the 2184 weeks from 1/1/78 to 11/10/2019.[1] [4] There are no missing data codes, and the codes that are in the array are as follows:

  0= no information reported to account for week.
  2= not working (unemployment vs. out of the labor force cannot be determined.)
  3= associated with an employer but the periods not working for the employer are missing. If all of the time with the employer cannot be accounted for, a 3 is loaded into the STATUS array instead of a job code.
  4= unemployed. If a respondent is not working and part of the time is spent looking for work or on layoff, the exact weeks spent looking for work is unknown. As a result, the number of weeks spent looking is assigned to the middle part of the period not working.
  5= out of the labor force.
  7= active military service. If a respondent has a civilian job while in active military service, the civilian job code is loaded into the array instead of a code of 7.
  >100= worked. The code represents the appropriate work history year multiplied by 100 plus the job number for that employer in that year. For example, 102=year 1, job 2; 305=year 3, job 5. This allows one to associate any characteristic for a job with that week. If a respondent has more than one job at the same time, the job number that is loaded into the array is determined by the starting date of the job with the lowest job number, not by any particular characteristics of the job such as the number of hours worked at the job. The year in the job code is the year in which the job is reported. Jobs held in year 2, but reported in year 10 would be assigned job numbers beginning with 1001 instead of 201.

 

User Notes

In some cases, a respondent reports a period not working that is part OLF and part unemployed. In these cases, a week-specific distinction between OLF and unemployed cannot be made. Users should refer to the Work History Program Description in this appendix for a discussion of how OLF and unemployed codes are assigned to the WEEKLY LABOR STATUS array in the event that such a period occurs.

Area of interest: WORK HISTORY-HOURS WORKED

The HOURS WORKED array contains the usual hours worked per week at all jobs. There are 2185 variables in the 1979-2018 HOURS WORKED array--one for week #0 and one for each of the 2184 weeks from 1/1/78 to 11/10/2019.[2] [5] The codes are as follows:

  0 = no hours worked or interview does not cover array week
  1-95 = usual hours worked per week
  96 = 96 or more hours per week
  -5 = noninterview
  -4 = valid skip
  -3 = invalid skip
  -2 = don't know
  -1 = refusal

 

User Notes

Beginning in 1993, the first all-CAPI survey year, the maximum hours allowed per week is 168.
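When reading values from the HOURS WORKED array, the codes listed above can be separated from actual hours with a helper along these lines. The function name and the label wording are illustrative only.

```python
MISSING_CODES = {
    -5: "noninterview",
    -4: "valid skip",
    -3: "invalid skip",
    -2: "don't know",
    -1: "refusal",
}


def describe_hours(value):
    """Return a readable label for one HOURS WORKED array value."""
    if value in MISSING_CODES:
        return MISSING_CODES[value]
    if value == 0:
        return "no hours worked or interview does not cover array week"
    return f"{value} usual hours worked"


print(describe_hours(-3))  # invalid skip
print(describe_hours(40))  # 40 usual hours worked
```

Note that a value of 96 in survey years through 1992 means 96 or more hours, whereas from 1993 onward values up to 168 can appear; the helper does not distinguish these cases.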

Area of interest: WORK HISTORY-DUAL JOB 1-4
 

The DUAL JOB arrays contain job numbers for any weeks when the respondent worked at more than one job simultaneously. There can be up to 2185 variables in each DUAL JOB [#] array -- one for week #0 and one for each of the 2184 weeks from 1/1/78 to 11/10/2019.[2] [5] DUAL JOB array variables are present if a dual job was reported.[3]

The codes are as follows:

  0 = no dual job
  >100 = dual job year and job number

For example, if a respondent worked at three jobs at the same time, the code for the lowest job number would be in the WEEKLY LABOR STATUS array, and the codes for the other two jobs would be in the DUAL JOB arrays (see item 3 in the user notes below). If the three jobs that the respondent held during week 190 from the 1981 (round 3) survey were jobs 1, 5, and 6, then WEEKLY LABOR STATUS would contain the value '301' for that week, the DUAL JOB 1 array for week 190 would contain the value '305' and the DUAL JOB 2 array for week 190 would contain '306'.
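The job code arithmetic in this example (survey round multiplied by 100, plus the job number) can be sketched in Python. Both function names are hypothetical. The placement rule shown is the simple "lowest job number first" case from the example, and it does not cover the CPS job exception described in the user notes.

```python
def decode_job_code(code):
    """Split a job code such as 305 into (survey round, job number)."""
    if code <= 100:
        raise ValueError("codes of 100 or less are not job codes")
    return divmod(code, 100)


def place_jobs(survey_round, job_numbers):
    """Return (status_code, dual_job_codes) for one week.

    Simplified rule: the lowest job number goes to WEEKLY LABOR STATUS and
    up to four remaining jobs go to the DUAL JOB arrays.
    """
    if not job_numbers:
        return 0, []
    codes = sorted(survey_round * 100 + job for job in job_numbers)
    return codes[0], codes[1:5]


print(place_jobs(3, [1, 5, 6]))  # (301, [305, 306])
print(decode_job_code(1001))     # (10, 1)
```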

User Notes

A few additional notes are in order:

1. The maximum number of dual jobs accounted for is 4. The variable descriptions for variables in the WORK HISTORY - DUAL JOB [#] areas of interest indicate the relevant job number and week.

2. The DUAL JOB [#] arrays do not provide labor force status in the detailed manner of the WEEKLY LABOR STATUS array. They contain only second, third, fourth, and fifth job numbers for weeks in which the respondent reports more than one employer.

3. Users should be aware that it is possible in survey years 1979-92 for the CPS job number to appear in one of the DUAL JOB [#] arrays instead of the WEEKLY LABOR STATUS array, as would be expected. In most cases, the CPS job will be the lowest number job for a given year. However, this is not always the case. Each year contains a relatively small number of cases for which JOB #1 is not the CPS job. For these cases, the job number assigned by the work history program will not necessarily be the lowest one for that year. In cases for which the CPS job is not held simultaneously to any other job, the job number for the CPS job will appear in the WEEKLY LABOR STATUS array as expected. However, in cases for which the CPS job is held simultaneously with another job with a lower job number, the possibility exists that the job number for the CPS job will appear in one of the DUAL JOB [#] arrays instead of the WEEKLY LABOR STATUS array. Mechanical changes implemented in the 1993 CAPI instrument to ensure that the CPS job is always the first job have virtually eliminated this possibility from 1993 forward.

Work history non-weekly array variables

Non-weekly array variables produced by the work history programs are listed below by Work History area of interest. Variables marked with an asterisk (*) contain an actual consecutive week number, ranging from week #0-2184, with the week of January 1, 1978, being week #1. Week #0 represents information for time prior to that date.

Area of interest: WORK HISTORY-MAIN-CREATED

TENURE[#] Total weeks tenure at each job as of interview date
MILWK-SLI Weeks of active military service since date of last interview
WKSWK-SLI Number of weeks worked since date of last interview
HRSWK-SLI Number of hours worked since date of last interview
WKSUEMP-SLI Number of weeks unemployed since date of last interview
WKSOLF-SLI Number of weeks out of the labor force since date of last interview
WKSUNACCT-SLI Percentage of weeks unaccounted for in calculating weeks worked since date of last interview
MILWK-PCY Weeks of active military service in past calendar year
WKSWK-PCY Number of weeks worked in past calendar year
HRSWK-PCY Number of hours worked in past calendar year
WKSUEMP-PCY Number of weeks unemployed in past calendar year
WKSOLF-PCY Number of weeks out of the labor force in past calendar year
WKSUNACCT-PCY Percentage of weeks unaccounted for in calculating weeks worked in past calendar year
WKSSINCELI Number of weeks since date of last interview
JOBSNUM Number of jobs ever reported as of interview date

Area of interest: WORK HISTORY-MAIN-JOB INFORMATION-[YEAR]
 

HRP[#] Usual wage earned at each job converted to an hourly rate

Area of interest: WORK HISTORY-HISTORY

  LASTINT_WK#_[YEAR]* Week of last interview
  CURRINT_WK#_[YEAR]* Week of current interview

Area of interest: WORK HISTORY-CALENDAR YEAR

  CAL_YEAR_JOB[#]_[YEAR] Job number that is loaded into the WEEKLY LABOR STATUS array for each job. The first two digits of the number are the year (01 through 28) and the last two digits are the job for that year (job 01 through 10)
  CAL_YEAR_JOBS_[YEAR] Number of jobs in past calendar year
  WKS_NWMISSC_[YEAR] Percentage of weeks not employed in past calendar year that cannot be split between unemployed and out of the labor force

Area of interest: WORK HISTORY-JOBS

  START_WK#_[YEAR]_JOB#[##] Starting week of each job
  STOP_WK#_[YEAR]_JOB#[##] Stopping week of each job
  PER[#]_START_[YEAR]_JOB#[##] Starting week of each period not working for each job
  PER[#]_STOP_[YEAR]_JOB#[##] Stopping week of each period not working for each job

Area of interest: WORK HISTORY-GAPS BETWEEN JOBS

  BSTART_[YEAR]_PERIOD[#] Week started each period not working between jobs
  BSTOP_[YEAR]_PERIOD[#] Week stopped each period not working between jobs.

Area of interest: WORK HISTORY-SINCE LAST INTERVIEW

  LASTINT_#JOBS_[YEAR] Number of jobs since the date of the last interview
  WKS_NWMISSL_[YEAR] Percentage of weeks not employed since the date of the last interview that cannot be split between unemployed and out of the labor force

Area of interest: WORK HISTORY-MILITARY

  MIL_START1_[YEAR] Starting week of first period of active military service.
  MIL_START2_[YEAR] Starting week of second period of active military service.
  MIL_STOP1_[YEAR] Stopping week of first period of active military service.
  MIL_STOP2_[YEAR] Stopping week of second period of active military service.

NLSY79 Main Data Work History variables

A third set of variables are either used in the Work History programs, or are basic, commonly used job-specific and gap-related survey items that were at one time duplicated on the separate Work History data sets prior to the 1979-2000 release. When the Work History data became part of the general public 1979-2000 release, these variables were assigned to areas of interest titled WORK HISTORY -- MAIN -- JOB INFORMATION [YEAR], to make it easier for historical data users to recreate the separate Work History data files they may have been working with up to that point. These areas of interest continue to be maintained. Go to Work History Job- and Gap-specific Survey Items table 2018 (DOCX) to see these variables, with example reference numbers from the most recent round.

VARIABLES USED IN CREATION OF 1996 AND SUBSEQUENT WORK HISTORY DATA FILES

Beginning in 1996, the work history programming was converted to SQL programming. The SQL programs, which mirror the older PL/I program, are not available to users. However, the Work History Input Variables 1996-Present Table (PDF) lists the variables used as inputs to the SQL programs. Users who need more information should contact NLS User Services.

Users should be aware that not all of the variables listed in the table appear in the NLSY79 public release data file. Variables with no valid data for any respondent, jobs 6-10, within-job gap 4 and between-job gaps 5-6 are not currently included in the main file.

Endnotes

[1] All week number references in this program are relative to 1/1/78 and end with the most recent interview date. A week #0 is included at the beginning of the week-by-week array structures to indicate time prior to 1/1/78. Users are discouraged from incorporating data contained in this week in analysis. Researchers should instead use information from the 1979 interview concerning labor force activity prior to 1/1/78 in order to construct event histories of a more thorough nature. (Some information concerning labor force activity for respondents prior to the time frame of the initial 1979 interview is asked on an age restricted basis for respondents still in their teens at the time of interview.)

[2] See footnote 1.

[3] All variables have standard missing value codes unless otherwise noted.

[4] The final 2018 (round 28) interviews were conducted in November 2019. Therefore, valid data are only present through variables for week #2184 in the current data set. The maximum week number variable in the Dual Job [#] arrays is week #2181, as no one reported multiple jobs in weeks #2182-2184.

[5] See footnote 4.

NLSY79 Appendix 17: Interviewer Characteristics Data

Interviewer Characteristics Data and Data Review

Many researchers are interested in knowing if or how much interviewers affect respondents' answers. To enable researchers to investigate these questions, NLSY79 data releases since 1988 have contained information on interviewers' characteristics. From 1979-2000, information on the characteristics of NLSY79 interviewers primarily comes from NORC's interviewer personnel files. Data from the 2002-present surveys come from forms filled in by interviewers during their NLSY79 training program.

An extensive review of longitudinal interviewer ids (INTCHARS_INT_ID) was made prior to the 2014 data release. This review has resulted in a number of improvements in the identification of previously unidentifiable interviewers and in the linkages that can be established between specific interviewers' cases across survey years. Several types of issues were addressed, including:

  • A number of interviewers who had been assigned multiple longitudinal interviewer ids in different survey years were identified and assigned a consistent longitudinal interviewer id.
  • Interviewer ids assigned in 2002 had been based only on the project id assigned to interviewers in that survey year, which disrupted the longitudinal INTCHARS_INT_ID particularly severely in that year. The longitudinal id links have been reestablished for many interviewers in 2002 accounting for several thousand respondents.
  • A number of interviewers in isolated years had been assigned inordinately large project ids which were used as their longitudinal INTCHARS_INT_ID. These ids have been shortened to be less problematic when extracting the data.
  • Identification of a number of previously unidentifiable interviewers has been made wherever possible through re-examination of available records. These interviewers had been unlinked to their interviews in other years generally owing to erroneously recorded or undocumented ids.

As a result of this review, an improved and more consistently linked set of INTERVIEWER CHARACTERISTICS variables for each survey year is included in the current data release. Because IDs can vary considerably between survey years, matching interviewers through survey years can rely significantly on documentation containing interviewer names for verification. Future updates for mismatched and unmatched interviewers will be made, depending on availability of further documents or data.

Each NLSY79 survey year from 1979-2014 has the following variables available: interviewer ID number, number of times this interviewer has interviewed the respondent, and the interviewer's race, sex, age, and level of education. Interviewer age and number of times the interviewer interviewed a particular respondent were dropped in 2016. However, variables denoting years of experience interviewing and Hispanic ethnicity were added for 2016. These variable names and response categories are listed at the end of this appendix. Researchers should note that CHRR built the 1979-2012 variables from data sources that represent the interviewer characteristics at specific points in time. Hence, changes in items like an interviewer's educational attainment are not always reflected in the data. The preliminary source reflects existing interviewer data a few months prior to the fielding of the NLSY79 1994 survey. After 1994, data sources come from short demographic questionnaires filled out during interviewer training.

As of 2014, the matching of interviewers is being compiled at NORC and sent to CHRR to code with longitudinal IDs attached. The actual procedures for matching interviewers remain the same. Interviewers are identified from a master list of interviewer names. New IDs are assigned to new interviewers who do not have a longitudinal ID. Interviewer characteristics are then assigned based on a short demographic survey filled out during interviewer training. Users should note that not all interviewers have filled out this demographic survey in the survey years that they worked.

In 2018, interviewer characteristics were included for all respondents who were fielded, but not interviewed, as well as interviewed respondents. For non-interviewed respondents, characteristics for the most recent interviewer assigned to the case are contained in the data where available. In addition, two variables depicting the percentage of "don’t know" and "refusal" responses in each interview (see variables listed at the bottom of this appendix) have been added to the Interviewer Characteristics area of interest.

Constructing the Original Interviewer Characteristics ID

The key variable, which links the NLSY data set with the interviewer characteristics data set, is INTCHARS_INT_ID. This ID variable is often similar but not necessarily identical to the Interviewer ID variable, which is entered in the questionnaire and can be found on the NLSY79 public use data set for many years. INTCHARS_INT_ID is a constructed longitudinal variable that allows identification of cases interviewed by the same person over time. The previous version of the longitudinal interviewer ID, upon which the improved INTCHARS_INT_ID is based, was constructed using the following steps:

  • First, the NLSY interviewer ID for all years prior to 1996 is divided by ten to truncate the last digit. This last digit was used to cluster interviewers together and the digit was not used in the NORC interviewer characteristics database. IDs for 1996 and 1998 do not have this last digit.
  • Second, each ID was then run through the list of all known interviewers who changed their ID. Interviewers changed their ID if they moved to a different state or were promoted or demoted. Only a partial list of interviewers who changed their ID is available, so there is no year when the characteristics of all interviewers are known. However, even if all the characteristics of an interviewer are not known, efforts were made to create a consistent ID number, since a researcher might be interested in knowing who interviewed whom from year to year even if other information like education level is not available.
  • Third, for all surveys after 2002, all NLSY79 interviewers were given new round-specific project IDs, whether or not they had previously participated in the project. These project IDs were primarily used as the 2002 longitudinal id, creating a significant number of breaks in the continuous record of contact for some interviewers who already had an ID assigned.
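The first two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `longitudinal_id`, the `id_changes` lookup and the ID values are hypothetical, and the round-specific project IDs described in the third step are not modeled.

```python
def longitudinal_id(raw_id: int, survey_year: int, id_changes: dict) -> int:
    """Illustrative sketch of steps 1 and 2 (not the actual CHRR/NORC program)."""
    # Step 1: IDs from years prior to 1996 carry a trailing clustering digit;
    # integer division by ten truncates it. Later IDs lack this digit.
    base = raw_id // 10 if survey_year < 1996 else raw_id
    # Step 2: map the ID through the (partial) list of known ID changes.
    # IDs with no recorded change are kept as-is.
    return id_changes.get(base, base)


# Hypothetical IDs, for illustration only.
known_changes = {1234: 777}
print(longitudinal_id(12345, 1990, known_changes))  # trailing digit dropped, then remapped
print(longitudinal_id(5678, 1998, known_changes))   # no trailing digit, no recorded change
```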

The resulting ID was then used to search for each identified interviewer's characteristics from one of the two sources (1979-2000 or 2002-present) noted earlier. While most IDs match, some do not. That number is relatively small in most years. Readers should note that most interviewers interviewed multiple respondents, so not finding even a single interviewer's characteristics in the data sources can affect the number of cases missing interviewer characteristics dramatically.

Table 1. Interviewers Identified in NORC database by Survey Year
Year # of Respondents Interviewed # of Interviewers Matched Percentage Not Matched
1979 12686 9838 22.4%
1980 12141 11200 7.8%
1981 12195 11850 2.8%
1982 12123 11736 3.2%
1983 12221 11980 2.0%
1984 12069 11585 4.0%
1985 10894 10850 0.4%
1986 10655 10560 1.0%
1987 10485 10485 0.0%
1988 10465 10386 10.2%
1989 10605 9906 6.6%
1990 10436 9321 10.7%
1991 9018 8933 1.0%
1992 9016 8947 0.3%
1993 9011 8933 1.0%
1994 8891 8701 2.1%
1996 8636 8050 6.8%
1998 8399 8302 1.2%
2000 8033 7814 2.7%
2002 7726 7723 0.0%
2004 7661 6851 10.6%
2006 7654 7328 4.3%
2008 7757 7742 0.1%
2010 7565 7559 0.0%
2012 7301 7293 0.1%
2014 7071 6977 1.3%
2016 6913 6913 0.0%
2018 6878 6878 0.0%

Other Interviewer Characteristics Variables

This section describes a select set of other variables available beyond the Interviewer's ID (INTCHARS_INT_ID). Additional variables are available for earlier survey years in area of interest INTERVIEWER CHARACTERISTICS.

Variables Available for Survey Years 1979-2018 (as noted below)
 

Interviewer Count (INTCHARS_YRSINTR) (available through 2014)
 

This counts the number of years the interviewer has interviewed the respondent, including the current survey year. Note that as telephone interviews become more prevalent, the number of first-time interviews expands considerably, as most interviewers are not assigned to specific cases year after year.

Interviewer Race (INTCHARS_RACE) (available through 2018)
 

1 = WHITE
2 = BLACK
3 = HISPANIC
4 = ASIAN
5 = AMERICAN INDIAN
-3 = missing

Interviewer Sex (INTCHARS_SEX) (available through 2018)
 

1 = MALE
2 = FEMALE
-3 = missing

Interviewer Age (INTCHARS_AGE) (available through 2014)
 

Age of the interviewer in the interview year, calculated as ([survey year]-[interviewer’s year of birth])
-3 = missing

Interviewer Education (INTCHARS_EDUCATION) (available through 2018)
 

1 = Grade 0-8
2 = Grade 9-11
3 = High School Graduate
4 = Vocational degree
5 = Some College
6 = College Graduate
7 = Graduate School
8 = Masters Degree
9 = Professional Degree
0 = Other

Interviewer Experience (INTCHARS_FI_EXP) (available beginning in 2016)

Interviewer experience in years, from < 1 – 17+ years

Interviewer of Hispanic Ethnicity (INTCHARS_HISPANIC) (available beginning in 2016)

Yes/No

Percent Don’t Know Responses Current Round (R##_PCT_DK) (available beginning in 2004)

Actual percent DK responses

Percent Refusal Responses Current Round (R##_PCT_REF) (available beginning in 2004)

Actual percent REF responses

NLSY79 Appendix 16: 1994 Recall Experiment

The Recall Experiment

Beginning with the 1996 survey, the NLSY79 became a biennial survey. In anticipation of the move to a two-year interview period, an experiment dubbed the "Recall Experiment" was conducted in 1994 on a portion of the eligible sample. A sub-sample was drawn from the members of the original 12686 sample still eligible for interview in 1994, who were also interviewed in 1992 and 1993. This sub-sample was treated as if their 1993 interview never took place; their date of last interview was established as the 1992 interview date. The information that drove the 1994 interview was that gathered in the 1992 interview. The affected respondents were periodically reminded, where applicable, that the reference date for their interview was not the 1993 interview date, but the 1992 interview date.

The result for the 854 "recall respondents" interviewed in 1994 is that retrospective information, pertaining mainly to the period since the last interview, was essentially re-reported for the period between the 1992 and 1993 interviews, in addition to the new information for the 1993-94 survey period. The re-reported information for the 1992-93 interview period can be compared to that previously reported during the 1993 interview for possible discrepancies.

The goal of this experiment was to gain a better sense of the possible consequences of a biennial instead of annual survey administration for respondent recall, for the accuracy and consistency of data, and for the overall respondent burden of participating in the survey. Certain segments of NLSY79 surveys in specific years have previously collected retrospectives over a two-year period or longer. For example, the more detailed two-year fertility history sponsored by the National Institute of Child Health and Human Development (NICHD) has been administered in selected survey years. However, except in the case of a respondent who actually skips one or more interviews, retrospectives in most segments of the questionnaire for a given year have required that a respondent only recall events and circumstances over the period of roughly a year.

Effects on NLSY79 Data

Because the information for the 1992-1993 interview period was reported twice -- both during the 1993 and 1994 interviews -- users may encounter some degree of difference and inconsistency in the data gathered in 1993 and 1994. Data pertaining to the past calendar year, such as spouse's labor force activity or respondent's income and assets, are not affected by the experiment. Retrospectives for which data covering the period between the 1992 and 1993 interviews would have been reported in both the 1993 and 1994 interviews are listed below.

Marital History

  • Retrospective/event history of changes in marital status since date of last interview

Regular Schooling

  • Two-year retrospective of specific months of enrollment (if any) in regular school for the 1993 calendar year only
  • Retrospective/event history of college attendance since date of last interview

Military

  • Retrospective/event history of military enlistment and separation dates since date of last interview

On Jobs/Employer Supplements

  • Retrospective/event history of employment with specific employers, and periods not working for specific employers (gaps within jobs), since date of last interview

Gaps

  • Retrospective/event history of periods of non-employment (gaps between employers), since date of last interview

Training

  • Retrospective/event history of (continued) participation since the last interview, in training programs either reported at the date of last interview or enrolled in since the date of last interview

Health

  • Retrospective/event history of most recent and most severe work-related injuries (if any), since date of last interview

Income and Assets

  • Retrospective/event history of program recipiency (respondent/spouse unemployment compensation, AFDC, government food stamps, SSI/other welfare) since either December 1992, if receipt reported during that month, or January 1993 for those not reporting receipt in December 1992

The Fertility History section is a special case with respect to the Recall Experiment. Between the 1986 and 1992 interview years, in which paper-and-pencil interviewing (PAPI) was used, information was collected in each odd-numbered year on any biological children born since the date of last interview. However, even-numbered years contained an expanded fertility history section, sponsored by NICHD. This expanded history included a re-reporting of biological children born since the date of the last NICHD interview (even-numbered year). Newly reported children were handwritten onto the records when paper-and-pencil instruments were being used. This allowed interviewers to easily identify children about whom certain series of questions should be asked. The fertility data has therefore for years contained a limited version of the 1994 Recall Experiment, specific to biological children. Other segments of the fertility history (pregnancy information for female respondents and visitation habits of biological children with non-residential parents) were collected exclusively as two-year retrospectives in even-numbered years. With the advent of CAPI interviewing, it was decided that re-reporting of biological children in NICHD years was no longer necessary. Interviewers no longer needed to rely on visual identification of children reported since the last NICHD interview, because these children could be mechanically flagged. Therefore, in 1994, respondents not belonging to the Recall Experiment sample were, for the first time since 1986, only required to report new children born since the date of last interview, instead of the last NICHD interview (two years earlier in most cases). However, Recall Experiment respondents were asked to update their biological child records since the last NICHD interview year (1992), as was the norm in even-numbered years since 1986.
This will continue to be the case for all respondents with the biennial administration, which began in 1996. However, the elimination of the odd-year survey also means the continued elimination of re-reporting of children born since the last NICHD interview.

Effects on Auxiliary Data Files and Variables in NLSY79

The potential seam effects introduced by the Recall Experiment are particularly relevant for the work history data file and the creation of the key variable, Total Net Family Income. In each case, a different procedure was used to eliminate possible discrepancies in information reported both during the 1993 and 1994 interviews.

WORK HISTORY 1979-1994 DATA FILE
The 1979-93 release of the work history data file already incorporated information covering the period between the 1992-93 interviews in each respondent's longitudinal labor force history. The possible disruptions that might occur to the longitudinal record, if inconsistent data were introduced into the formulas for a single time frame, were unpredictable and potentially serious. It was determined that information covering the period between the 1992 and 1993 interview, reported by the recall respondents in the 1994 interview, would be eliminated for the purposes of creating the 1979-94 work history file. The following basic decision rules were applied:

  1. Data pertaining to employers, gaps in employment and/or periods of military service for which the start and stop dates fell completely prior to the 1993 interview date for recall respondents were completely eliminated for the purposes of creating the 1979-94 work history data file. These should have been reported during the original 1993 interview;
  2. Data pertaining to employers, gaps in employment and/or periods of military service for which the start and stop dates fell completely within the period between the 1993 and 1994 interview were retained in their entirety. These constitute new information and would not have been reported during the 1993 interview;
  3. Data pertaining to employers, gaps in employment and/or periods of military service for which the start and stop dates fell partially prior and partially after the 1993 interview date were truncated where necessary at the 1993 interview date (e.g. start dates of these periods of employment/non-employment were set to the 1993 interview date).
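The three decision rules can be sketched as a small Python function. This is a hedged illustration, not the work history program itself: the function name, the representation of a spell as a `(start_week, stop_week)` pair, the week values, and the treatment of spells that start or stop exactly in the interview week are all assumptions made for the example.

```python
from typing import Optional, Tuple

Spell = Tuple[int, int]  # (start_week, stop_week), hypothetical representation


def apply_recall_rules(spell: Spell, interview_1993_week: int) -> Optional[Spell]:
    """Apply the three rules to one spell reported by a recall respondent in 1994."""
    start, stop = spell
    # Rule 1: spell falls completely before the 1993 interview -> eliminated.
    if stop < interview_1993_week:
        return None
    # Rule 2: spell falls completely after the 1993 interview -> retained whole.
    if start >= interview_1993_week:
        return spell
    # Rule 3: spell straddles the 1993 interview -> start date truncated.
    return (interview_1993_week, stop)


# Hypothetical week numbers, with the 1993 interview placed in week 800.
for spell in [(700, 750), (810, 830), (780, 820)]:
    print(spell, "->", apply_recall_rules(spell, 800))
```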

Users should note that, with respect to gaps in employment, the NLSY79 does not establish specific weeks when a respondent might be looking for work or laid off (making him/her unemployed as opposed to out of the labor force or OLF). Respondents report only a total number of weeks during each gap for each status. This made it impossible to determine which segment of the 1992-94 interview period should be assigned the unemployed code and which the OLF code for gaps which fell partially prior and partially after the 1993 interview, and in which the respondent reported both statuses.

TOTAL NET FAMILY INCOME
For the purposes of creating Total Net Family Income, the total number of weeks/months of program recipiency in the past calendar year (1993) and the average amount received per week/month in 1993 are required. Recall respondents would have reported all 1993 recipiency in the 1994 interview, not just that occurring prior to the 1993 interview date. Therefore, 1994 data was used exclusively to compute the total 1993 recipiency figures for the recall respondents. This avoided the task of attempting to combine data from 1993 and 1994 for these respondents and deal with potential inconsistencies in reports between the two interviews.

NLSY79 Appendix 15: Recipiency Event History Data

The NLSY79 surveys have solicited information about program recipiency since the first round in 1979. Methods used to collect information about program recipiency from 1979-1992 and from 1993 to the present differed. These methods are described below. The variables found in the RECIPIENT MONTH and RECIPIENT YEAR areas of interest have been constructed to provide users with a more straightforward sequence of variables containing monthly and yearly receipt amounts (if any) and flags for the survey year in which the data was reported. The roots of these qnames are contained in the table below. Recipiency event history variables exist for five types of recipiency (question name roots in parentheses): AFDC (Q13A-), Food Stamps (Q13F-), SSI/SSDI/other public assistance/welfare (Q13SSI-), unemployment compensation (Q13U-), and spouse/partner unemployment compensation (Q13S-). Since survey year 2018, respondents have been asked to distinguish between SSI and SSDI if possible. Array items are present from January 2017 through the current interview.

RECIPIENT MONTH QNAME ROOTS RECIPIENT YEAR QNAME ROOTS
UNEMPR-[MOYR]-AMT UNEMPR-TOTAL-[YEAR]
UNEMPSP-[MOYR]-AMT UNEMPSP-TOTAL-[YEAR]
AFDC-[MOYR]-AMT AFDC-TOTAL-[YEAR]
FDSTMPS-[MOYR]-AMT FDSTMPS-TOTAL-[YEAR]
SSI-[MOYR]-AMT SSI-TOTAL-[YEAR]
SSDI-[MOYR]-AMT SSDI-TOTAL-[YEAR]
[MOYR]-RECIP-FILL WELFARE-AMT-[YEAR]

The current NLSY79 release contains revised recipiency event history data for survey years 1979-2002. Some anomalies in the construction of the recipiency event history data in previous releases necessitated these revisions. Researchers should use only the recipiency variables with reference numbers beginning with "G." These include the revised variables for survey years 1979-2002 (containing data from calendar years 1978 through the 2002 interview) and the most up-to-date data series for 2003 through the most recent survey. The unrevised original series has been included in the release as a non-primary set of variables for historical purposes.

This appendix first contrasts the collection of information on recipiency in the paper and pencil interviewing (PAPI) years (1979-1992) to that of the computer-assisted personal interviewing (CAPI) years (1993-present). It then describes the creation processes for the recipiency event history variables. A discussion of the kinds of problems necessitating the revisions (PDF) is available.

Program Recipiency in Paper-and-Pencil Interviews

In paper-and-pencil (PAPI) NLSY79 rounds (1992 and prior), information on R and spouse unemployment compensation, AFDC, Food Stamps and other welfare recipiency was gathered for the calendar year prior to the interview year only. For instance, someone interviewed in 1992 was asked about the months of recipiency in 1991 only. An average figure per week/month was then asked for the entirety of 1991. For example, if a respondent said s/he was receiving AFDC in March, April and May of 1991, and again in September and October of 1991, s/he was only asked for an average amount per month received during those months in 1991.

Data collected in this manner generates a complete event history only for respondents who were interviewed at each interview date. For those respondents, information would be present for each month benefits were received from January 1978 through December 1991 (the year before the 1992 interview). However, a respondent skipping one or more interviews would be missing information for each calendar year preceding missed interview years. For example, a respondent missing the 1985 and 1990 interviews would be missing recipiency information for calendar years 1984 and 1989.
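The gap pattern described above follows a simple rule for the PAPI years, sketched here in Python. The function name is hypothetical, and the sketch assumes the respondent was interviewed in every other PAPI round.

```python
def missing_recipiency_years(missed_interview_years):
    """Calendar years with no PAPI recipiency data, given the missed interview years.

    Each PAPI interview covered only the calendar year before the interview,
    so a missed interview leaves that preceding calendar year uncovered.
    """
    return [year - 1 for year in missed_interview_years]


print(missing_recipiency_years([1985, 1990]))  # the example from the text
```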

Program Recipiency in CAPI Interviews

In the computer-assisted-personal-interviewing (CAPI) NLSY79 rounds (1993 - present) respondents are asked about receipt from government programs since their last interview. Surveys through the 1990s (1993, 1994, 1996 and 1998) also contain some specialized questions to aid the transition from the previous method to the current method of collection and to minimize the number of respondents with any interruptions in their month-to-month event histories.

Respondents are now asked if they have received benefits at all since their date of last interview. Individuals who report no recipiency since their last interview skip to the next section of the interview. If the respondent answers "yes," s/he is then asked for the date when the benefits began. This is considered the first spell. Respondents are then asked if benefits have been received continuously since this start date. If the respondent answers "yes," receipt has been continuous, s/he is asked for the average dollar amount received per month/week for each year in the spell. These respondents then proceed to the next section of the interview. If the respondent answers that receipt has not been continuous since this first start date, s/he is asked to report the first date s/he stopped receiving benefits. Average dollar figures per month/week are collected for each year within this first spell.

All respondents who report completing a first spell since their last interview are asked if they started receiving benefits again since the first spell ended. In the 1993, 1994 and 1996 survey years, information on up to five spells is collected in the manner described above. If there are more than five spells, the respondent is asked about the first five and the most recent. Beginning in 1998, respondents were asked about all spells of receipt.

In interviews following the initial CAPI interview (1993 in most cases), respondents are asked to verify the last month they reported receiving benefits (if any). They are then asked about any recipiency since their date of last interview. In the electronic questionnaires, the retrospective recipiency event history is collected from the date of the last interview, providing a more continuous longitudinal record, even for respondents who skip interviews.

The flow of questions in the recipiency modules is illustrated in Figure A15.1.

Figure A15.1 Flow of Program Recipiency Questions in CAPI Interviews

In the 2018 survey, some updates were made to the Income module of the questionnaire, including some relatively minor updates to the recipiency-related segments. While the structure of each of the recipiency question series remained essentially the same, the SSI questions (beginning with Q13SSI-) were split into separate segments for the respondent and/or any dependent children (Q13SSI-), and a spouse/partner (Q13SSI-SP-) if applicable. Respondents were also asked to differentiate between SSI and SSDI if they knew which they were receiving. In addition, wording was added to the TANF segment of questions (beginning with Q13A-) to specify that the questions refer to dependent children where applicable. Finally, language referencing SNAP was added to the Food Stamps segment of questions (Q13F-) as the program has been renamed.

Variable Creation

PAPI Interviews

For most of the PAPI years, the yearly and monthly receipt/non-receipt variables are taken directly from responses, and the average monthly value of benefits is used for each month that the respondent reports receiving benefits. For unemployment compensation, weekly averages were collected. This weekly average was multiplied by 4.3 and then used as the monthly average. However, there are two main exceptions to this. First, the Food Stamp program underwent a change in 1979. Prior to this, recipients were allowed to purchase food stamps at a price below their market value. Because the 1979 interview asked respondents about recipiency in 1978, respondents who reported receiving food stamps were asked how much they paid for the food stamps in addition to the dollar value of the food stamps received in the last month they received benefits in 1978. The net transfer for 1978 is estimated by subtracting the dollar amount paid from the dollar value received. In all subsequent years, respondents were only asked for the dollar value received in the last month of the previous year that benefits were received.
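The two calculations in this paragraph are simple arithmetic, sketched below in Python. The function names and dollar figures are hypothetical; only the 4.3 multiplier and the subtraction come from the text.

```python
WEEKS_PER_MONTH = 4.3  # multiplier stated in the text


def monthly_from_weekly(weekly_average: float) -> float:
    """Convert a reported weekly unemployment compensation average to a monthly one."""
    return weekly_average * WEEKS_PER_MONTH


def food_stamp_net_transfer(value_received: float, amount_paid: float) -> float:
    """Net food stamp transfer as reported in the 1979 interview:
    dollar value received minus the amount paid for the stamps."""
    return value_received - amount_paid


# Hypothetical amounts, for illustration only.
print(monthly_from_weekly(100.0))            # roughly 430 dollars per month
print(food_stamp_net_transfer(120.0, 45.0))  # value received less amount paid
```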

The second exception concerns SSI and other forms of public assistance/welfare. The series of questions pertaining to public assistance/welfare and SSI has undergone some changes since the beginning of the survey. Initially, in 1979, respondents were asked in a single question if they had received income from any of the sources mentioned above. Respondents were also asked in which months benefits were received and the average amount received each month. They were then asked to identify from which sources they received benefits. However, it is not possible to identify how much of this amount is attributable to each source if more than one source was reported.

From 1980 through 1984, the question was divided into two separate ones. Respondents were first asked if they had received any benefits from SSI in the preceding year. They were then asked in which months benefits were received and the average amount received each month. A second set of questions asked respondents if they had received public assistance/welfare in the preceding year and, if so, in which months and the average amount received each month.

The format of the questions was changed once again in 1985 and remained the same through 1996. As with the initial interview in 1979, respondents were asked in a single question if they had received any benefits from SSI or public assistance/welfare. They were then asked in which months benefits were received and the average amount received each month. However, unlike the 1979 interview, respondents were not asked to identify the source of the benefits.

In 1998, to address major welfare reform policies passed in 1996, the series was altered once again. Respondents were asked first about SSI receipt only. A separate series of questions was added to solicit information on other types of general public assistance.

The final changes were made for the 2000 interview and have been in effect since that time. Information on other forms of public assistance was curtailed significantly. The series of questions collecting information about SSI remained in the instrument.

Users should be aware that the responses since the 2000 interview contain the least amount of information, pertaining to SSI receipt. This is reflected in the recipiency event history variables since 2000 as well.

CAPI Interviews

Due to the way PAPI interviews collected data (for the calendar year prior to the survey year), information on recipiency is available beginning with January of 1978. Designating this to be month 1 of the monthly event history, all start and stop dates can be identified by their month number. This may be easily calculated using the following algorithm: month_# = (year - 1978) x 12 + month. For instance, June of 1993 would be: (1993 - 1978) x 12 + 6 = 186. Once all start and stop dates have been calculated, the event history for each individual can be created.
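The month-number formula translates directly into code. A minimal Python sketch follows; the function name `month_number` is chosen here for illustration.

```python
def month_number(year: int, month: int) -> int:
    """Month index in the recipiency event history, with January 1978 as month 1."""
    return (year - 1978) * 12 + month


print(month_number(1978, 1))  # 1, the first month of the event history
print(month_number(1993, 6))  # 186, matching the worked example in the text
```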

To illustrate this, consider Case 1 from Table A15.1. This respondent was not interviewed in 1992, which means that her/his event history from the PAPI survey years would contain information up through December of 1990. Thus, the beginning month of the CAPI event history would be January of 1991 (month 157). According to the example, this respondent was receiving benefits in December of 1990 and continued to do so until June of 1991 (month 162) and then received no further benefits. The dollar amount event history would then be formed by placing the dollar value reported for the average benefits in 1991, $135, into months 157 - 162 and zeros into the dollar amounts for months 163 - 186. This same logic can be applied to each respondent, regardless of the number of reported spells of recipiency: placing reported dollar amounts into all months within a spell (from start_spell(i) to stop_spell(i) ) and zeros into all months outside of spells (1 + stop_spell(i) to start_spell(i+1) -1).
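The fill logic for Case 1 can be sketched in Python as follows. This is a simplified illustration, not the production program: the function and variable names are hypothetical, and each spell carries a single dollar amount, whereas the survey collects a separate average for each calendar year within a spell.

```python
def month_number(year, month):
    # January 1978 is month 1 of the event history.
    return (year - 1978) * 12 + month


def build_event_history(first_month, last_month, spells):
    """Return {month_number: dollar_amount} for first_month..last_month.

    spells is a list of (start_month, stop_month, amount) tuples. Months
    inside a spell get the reported amount; all other months get zero.
    """
    history = {m: 0 for m in range(first_month, last_month + 1)}
    for start, stop, amount in spells:
        for m in range(start, stop + 1):
            history[m] = amount
    return history


# Case 1: receipt of $135 from January through June of 1991, then nothing
# through the June 1993 interview.
case1 = build_event_history(
    month_number(1991, 1),
    month_number(1993, 6),
    [(month_number(1991, 1), month_number(1991, 6), 135)],
)
print(case1[157], case1[162], case1[163], case1[186])
```

Respondents with several spells are handled by passing one tuple per spell; months between spells keep their initial value of zero.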

To illustrate more completely how each respondent's event history was created, Table A15.1 depicts four additional hypothetical cases. Cases 2 and 4 represent respondents who receive benefits continuously after their start dates; Case 3 depicts a respondent who reports no benefit receipt; and Case 5 represents a respondent who reports two completed spells of recipiency. Table A15.2 presents the event histories that would result from the information given by the respondents portrayed in Table A15.1.

Table A15.1 Five Hypothetical CAPI Cases
Question Case 1 Case 2 Case 3 Case 4 Case 5
Interview date 6/93 6/93 10/93 7/93 8/93
Year of last interview 1991 1991 1991 1991 1991
Receive Dec year before last interview? Y Y N N N
Spell_0 continuous? N Y - - -
First stop date spell_0 6/91 - - - -
Average monthly/weekly benefits in '91 months (rec'd Dec of year before last int) 135 - - - -
Receive since Jan of last interview? - - N Y Y
Start date spell_1 - - - 3/91 3/91
New spell since stop date spell_0 - - - - -
Start date spell_1 - - - - -
Spell_1 continuous? - - - Y N
Stop date spell_1 - - - - 9/91
Average monthly/weekly benefits in '91 months (1st new spell) - - - - 200
New spell since stop date spell_1? - - - - Y
Start date spell_2 - - - - 2/93
Spell_2 continuous? - - - - N
Stop date spell_2 - - - - 5/93
Average monthly/weekly benefits in '93 months (2nd new spell) - - - - 225
New spell since stop date spell_2? - - - - N
Average monthly/weekly benefits in '91 months (rec'd contn'ly since last start date) - 145 - 157 -
Average monthly/weekly benefits in '92 months (rec'd contn'ly since last start date) - 152 - 160 -
Average monthly/weekly benefits in '93 months (rec'd contn'ly since last start date) - 175 - 163 -

Table A15.2 Resultant Event Histories
Month Case 1 Dollar Case 2 Dollar Case 3 Dollar Case 4 Dollar Case 5 Dollar
1/91 135 145 0 0 0
2/91 135 145 0 0 0
3/91 135 145 0 157 200
4/91 135 145 0 157 200
5/91 135 145 0 157 200
6/91 135 145 0 157 200
7/91 0 145 0 157 200
8/91 0 145 0 157 200
9/91 0 145 0 157 200
10/91 0 145 0 157 0
11/91 0 145 0 157 0
12/91 0 145 0 157 0
1/92 0 152 0 160 0
2/92 0 152 0 160 0
3/92 0 152 0 160 0
4/92 0 152 0 160 0
5/92 0 152 0 160 0
6/92 0 152 0 160 0
7/92 0 152 0 160 0
8/92 0 152 0 160 0
9/92 0 152 0 160 0
10/92 0 152 0 160 0
11/92 0 152 0 160 0
12/92 0 152 0 160 0
1/93 0 175 0 163 0
2/93 0 175 0 163 225
3/93 0 175 0 163 225
4/93 0 175 0 163 225
5/93 0 175 0 163 225
6/93 0 175 0 163 0
7/93 -4 -4 0 163 0
8/93 -4 -4 0 -4 0
9/93 -4 -4 0 -4 -4
10/93 -4 -4 0 -4 -4
11/93 -4 -4 -4 -4 -4
12/93 -4 -4 -4 -4 -4

In each CAPI interview from 1993 through the present survey year, information is collected for all time up to the current interview date. Because not all respondents are interviewed in the same month, the resultant event histories would be of unequal length. To avoid this, a -4 is placed into each monthly dollar value from the month following the interview month to the last month of the field period for the most recent survey. These -4s function merely as placeholders and will be replaced by information collected from the next interview. For example, if the respondent represented by Case 1 is interviewed in September of 1994 and reports no benefit receipt since the last interview, then the -4s for July to December of 1993 become 0s and -4s are placed in the dollar values for October to December of 1994. These new -4s would later be replaced by information from the 1996 interview.
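The placeholder logic can be sketched as follows. The helper names `pad_after_interview` and `apply_interview` are hypothetical, and the sketch assumes that the field period ends in December of the survey year, as in the example above.

```python
PLACEHOLDER = -4  # marks months not yet covered by any interview


def pad_after_interview(history, interview_month, field_end_month):
    """Copy history and place -4 in each month after the interview month."""
    padded = dict(history)
    for m in range(interview_month + 1, field_end_month + 1):
        padded[m] = PLACEHOLDER
    return padded


def apply_interview(history, reported, interview_month, field_end_month):
    """Overwrite earlier placeholders with newly reported months, then re-pad."""
    updated = dict(history)
    updated.update(reported)
    return pad_after_interview(updated, interview_month, field_end_month)


# Case 1 after the June 1993 interview (month 186); December 1993 is month 192.
after_1993 = {m: (135 if m <= 162 else 0) for m in range(157, 187)}
after_1993 = pad_after_interview(after_1993, 186, 192)

# September 1994 interview (month 201) with no receipt reported since the
# last interview; December 1994 is month 204.
no_receipt = {m: 0 for m in range(187, 202)}
after_1994 = apply_interview(after_1993, no_receipt, 201, 204)
print(after_1993[187], after_1994[187], after_1994[202])  # -4 0 -4
```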

Handling Don't Knows and Refusals

In PAPI years when the respondent did not know whether s/he had received benefits in the previous year, a "-2" was placed in all months and dollar values for that year. For respondents who refused to answer this question, "-1" was entered into all months and dollar values for that year.

In CAPI years, when asked for the start or stop date of a spell, a respondent could respond "don't know." When the respondent does not know (or refuses to answer) the start date of a spell of recipiency, s/he is then asked approximately how many months/weeks s/he received benefits and how much s/he received in the last month/week s/he received benefits. If a respondent does not know the start date and there are valid responses for these questions (i.e., responses greater than zero), the start date is set at the first possible point of the unfilled event history and "-2" is placed into the months that the respondent reports receiving. For example, if a respondent last interviewed in 1990 and being interviewed in 1993 responds that s/he has received benefits since January of the last interview year but does not know when s/he started receiving, the start date is set at January of 1990. If this same respondent reports that s/he received benefits for six months and received $200 the last month s/he received benefits, then "-2" would be filled into the dollar values for January through June of 1990. The monthly dollar value variables for July 1990 through the interview date would then be filled with zeros. If the respondent does not know the stop date but has reported a start date, the same logic is employed using the reported start date.
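A minimal sketch of the unknown-start-date rule follows. The function name `fill_unknown_start` is hypothetical, and the June 1993 interview month in the example is an assumption, since the text gives only the interview year.

```python
DONT_KNOW = -2


def fill_unknown_start(first_unfilled_month, months_received, interview_month):
    """Fill a dollar history when the spell's start date is unknown.

    The spell is anchored at the first unfilled month; -2 covers the
    reported number of months of receipt and zeros cover the remainder.
    """
    if months_received <= 0:
        raise ValueError("a valid (greater than zero) duration is required")
    history = {}
    for m in range(first_unfilled_month, interview_month + 1):
        in_spell = (m - first_unfilled_month) < months_received
        history[m] = DONT_KNOW if in_spell else 0
    return history


# January 1990 is month 145; six months of receipt; interview assumed to
# take place in June 1993 (month 186).
history = fill_unknown_start(145, 6, 186)
print(history[150], history[151])  # -2 0
```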

Fill Flags

The recipiency variables have been assigned an XRND survey year classification. The electronic questionnaire format collects recipiency information for years in which an individual missed an interview. This allows variables for past years to be updated with data from the most recent survey, similar to the Work History arrays. As a result, the traditional non-interview code "-5" is not found in these data. In order to identify the "true" non-interviews in each year, a series of fill flags has been created. These are monthly variables which indicate the interview year from which the information was collected. For the vast majority of the PAPI years, the data will have come from the interview after that calendar year, i.e., data for March 1985 will have been provided in the 1986 interview. Respondents whose variables for months between 1978 and 1991 were not filled in this manner (from the interview year after that calendar year) will generally have missed a number of interviews and had their information filled in from a CAPI survey years later, when the event history format began collecting data from the date of last interview.

NLSY79 Appendix 14: Instrument Rosters

Instrument rosters

In the paper and pencil (PAPI) questionnaires administered from 1979-1992, basic information about specific types of subjects (household members, children, etc.) was often recorded in a table or grid structure, and subsequently stored in a similar format. With computer assisted personal interviewing (CAPI), expanded electronic versions of these grids and tables were implemented.

During the course of the survey, a number of these matrices of data, or "rosters," are constructed. Rosters contain one or more pieces of information on a given subject. They are often presented to the interviewers as lists of information that are used to verify information, or from which one of the subjects on the roster is chosen as the answer to a survey question. For example, the EMPLOYER roster (a list of employers for whom the respondent has worked since the date of last interview) is presented to the interviewer at the end of the ON JOBS module of the survey, so that s/he can verify that the list of employers, and some specific information associated with each employer, is accurate.

Many of the rosters used during the administration of the survey are not presented as contiguous blocks of data in the public release data. Often in these cases, the relevant information contained in these rosters is present in variables scattered throughout the public data. For instance, although the many variables found on the EMPLOYER roster are not present as a contiguous block, or a roster per se, much of the information contained in the roster is present in other variables in the Employer Supplements.

Occasionally, sets of created variables may also be loaded into a roster structure. The primary example of a roster constructed outside of a running survey is the EMPLOYER_HISTORY roster, which contains over 38,667 variables in the current public release data.

A number of commonly used rosters (or comparable grid items from the 1979-1992 PAPI questionnaires) are listed below, along with the formats of question names for those roster items, relevant areas of interest and helpful search criteria. Work continues on imposing more consistency on question names and substantive areas of interest, so users can expect to see improvements with each successive release. Changes will be reflected in accompanying documentation.

For further discussion of rosters, see Appendix 13: Intro to CAPI Questionnaires and Codebooks.

Household roster

The HHI_FINAL roster in the 1993-present electronic questionnaires replaced the Household Enumeration in the PAPI instruments from 1979-1992.

  • Relevant Areas of Interest = HOUSEHOLD RECORD
  • Roster Names for Survey Years 1979-Present = HHI_FINAL_[FIELDNAME]

Employer roster

The EMPLOYER roster did not exist in a cohesive fashion prior to 1993. However, most of the elements on this roster are included in a systematic fashion in the Employer Supplements sections for each survey year.

  • Areas of Interest include JOB INFORMATION; EARNINGS; INDUSTRY & OCCUPATION; TIME & TENURE W/EMPLOYER; PERIODS NOT WORKING WITHIN JOB TENURE; EMPLOYER LINKS; JOBS
  • Roster Names for Survey Years 1979-present = EMPLOYER_[FIELDNAME]

Roster of biological children

The FERTILITY AND RELATIONSHIP HISTORY/CREATED area of interest, survey year XRND, provides a cumulative and extensively reviewed record of biological children. These variables have been updated with each survey round and contain the latest information available for gender, birthdate, death date (if applicable) and latest residential status for each biological child. Researchers can use the following variables as an up-to-date cumulative roster of biological children:

  • C#DOB~M DATE OF BIRTH OF ## CHILD – MONTH
  • C#DOB~Y DATE OF BIRTH OF ## CHILD – YEAR
  • C#SEX SEX OF ## CHILD
  • C#ID TWO-DIGIT ID OF ## CHILD
  • C#RES_DLI RESIDENCE OF ## CHILD AT DATE OF LAST INTERVIEW
  • C#DOD~M DATE OF DEATH OF ## CHILD – MONTH
  • C#DOD~Y DATE OF DEATH OF ## CHILD – YEAR

Non-biological child roster

The NBIOCHILD rosters in 1994-2014 replace the paper and pencil Non-biological Children's Record Form in the PAPI instruments from 1979-1992.

  • Relevant Areas of Interest = CHILD RECORD FORM/NONBIOLOGICAL
  • Roster Names for Survey Years 1979-1996 = NBIO[FIELDNAME]
  • Roster Names for Survey Years 1998-2012 = NBIOCHILD4_[FIELDNAME]
  • Roster Names for Survey Year 2014 = NBIOCHILD_[FIELDNAME]

Employer History roster

The EMPLOYER HISTORY roster was compiled and first released in a 2013 interim public release. This roster is compiled mainly from elements in the Employer Supplements and On Jobs sections for each survey year. It constructs a single record from the first report to the most recent stopdate given for each employer reported by a respondent, and includes many commonly used employer-associated variables. More information can be found at Appendix 28: NLSY79 Employer History Roster.

  • Areas of Interest include those beginning with EMPLOYER HISTORY – [SUBSTANTIVE CONTENT]
  • Roster Names for Survey Years 1979-present = EMPLOYERS_ALL_[FIELDNAME]

NLSY79 Appendix 13: Intro to CAPI Questionnaires and Codebooks

This appendix details the development of the NLSY79 questionnaires and codebooks.

In 1992 (round 14) and prior survey years, interviews were conducted by Paper and Pencil Interviewing (PAPI). Computer-Assisted Personal Interviewing (CAPI) technology was introduced in a small experiment in 1989 (round 11) and a larger experiment in 1990 (round 12). Round 15 (1993) marked the first round of the National Longitudinal Survey of Youth 1979 to be administered entirely using CAPI technology. Wherever possible, comparability was maintained between the 1993 data documentation and that of previous survey years. However, continuous innovations in technology and certain data collection procedures have resulted in considerable improvement in the format and content of electronic questionnaires and codebook documentation. A number of these transformations are outlined below.

Questionnaire innovations

Technological advancements in CAPI survey software have facilitated efficiencies in the conduct of interviews that were not possible with the PAPI instruments. Some significant advancements that accompanied the transition to CAPI data collection have included:

  • the availability of specialized question types, allowing for more precision in data entry and processing;
  • automated accessing of information reported earlier in the survey, eliminating the need for interviewers to search back through the questionnaire or keep track of earlier responses to reference them in later questions;
  • automated text substitutions ranging from gender pronouns to large substantive texts that vary based on circumstance;
  • replacement of most hard-copy QxQ material with electronic help screens, linked to individual relevant questions;
  • imposition of minimums and maximums for many numeric questions, reducing the instances of erroneous outliers;
  • "bounded interviewing" - minimum/maximum values for dates, similar to those set for numeric questions; bounding of dates drastically reduces, if not completely eliminates, anomalies such as job gap dates that precede the appropriate start date or exceed the appropriate stop date, start dates that exceed interview dates, etc.
  • structured rostering of records pertaining to groups that are the subject of inquiry, such as employers, children, household members; the roster organizational structure provides more efficient storage of data related to the individual employers/children/etc., as well as clarity in presentation to the interviewer on the computer screen.
  • question loops: Loops are repeating sets of questions, asked about different subjects of inquiry. For example, series of similar or identical questions that need to be asked about each appropriate subject in a group (employer/child/household member/etc.) can be programmed in loops. The electronic instrument then cycles through these loops containing very similar or identical questions for each individual subject in the group. Loops reinforce the uniformity of data collection about each unit of observation in a group. Finally, question loops allow for expansion of the number of subjects or events in an event history that can be collected without adding potentially substantial volume or complication to a hard-copy questionnaire. In PAPI survey years, the number of columns available in the hard-copy questionnaires to collect information on subjects such as household members, children, gaps in employment, training programs, etc. were finite -- limited by space on the page. This limitation disappeared with the use of electronic questionnaires. This feature also allowed the expansion of the recipiency questions to a more extensive event history.
  • additional functionality of electronic questionnaires: mechanical data checks that filter respondents into the correct questions, and the capability to sort individual subjects organized in a roster, combine with question loops to help ensure that questions are administered about the correct subjects in the correct order. For example, rostering and sorting ensure that the "current/most recent employer," referred to historically as the "CPS employer," is always asked about in the first loop of questions. Although this has always been the intention, recording and processing errors in PAPI years resulted in occasional cases for which the CPS employer was not entered as the first employer.

Data and documentation innovations

Continuing enhancements to CHRR's Designer CAPI software have made possible several more advancements in the presentation of data and documentation as well. These include:

  • increased uniformity in question naming conventions, improving the ability to identify comparable questions across survey rounds. In PAPI survey rounds (1979-1992) question names changed with each round. The evolution of Designer CAPI software also entailed several rounds of changes in question naming formats. CHRR personnel have been working, as schedules and manpower allow, to eliminate question name inconsistencies resulting from these multiple transitions. This effort is ongoing, with additional progress reflected in each successive public release.
  • improvement in assignment of more precise areas of interest. Multiple areas of interest can be assigned to variables that make sense in more than one substantive category. In addition, areas of interest (particularly large and/or more generic ones) can be broken down into more precise and meaningful categorizations. Adjustments to area of interest assignments are part of an on-going effort, similar to that described above with respect to question naming.
  • greater ease in documentation capability. Vastly improved documentation features in the software have allowed sets of created variables of considerable size (Work History week-by-week arrays and the Employer History roster for example) to be incorporated into public release data sets relatively smoothly.

Continuing data and documentation enhancements

As noted above, changes in interviewing modes and rapid technological innovation led to some inconsistencies in documentation over the history of the survey. Many of these inconsistencies have arisen because of modifications in the conventions for question naming, assignment of areas of interest and formatting of various documentation components. CHRR personnel are working continually on updating documentation to increase comparability of documentation through survey years, and increase user-friendliness of the data and codebook.

Comparability in Data Presentation: "Consolidated" Variables

An effort has been made in the 1979-1993 data release to maintain comparability with PAPI data releases in terms of data presentation. Toward this end, some sets of variables have been "consolidated." In other words, the responses for multiple variables are collapsed into a single created variable or set of variables. This has been done primarily for variables that in previous years were a single data item or set of data items, but are collected in more than one variable or set of variables in the CAPI questionnaires.

In each case, the variables being consolidated are mutually exclusive with respect to substantive responses. In other words, if variable A, variable B and variable C are consolidated, respondents will have given a response to only one of these - either variable A, or variable B or variable C.

Consolidation spares users from having to access a larger number of variables and use each separately or combine the responses themselves.
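For users who want to apply the same idea to variables that have not been consolidated, the approach can be sketched as below. This is an illustration, not CHRR's procedure: the function name `consolidate` is hypothetical, it assumes that negative codes (such as -1, -2 and -4) mark missing data and that values of zero or more are substantive, and the choice to return the missing code closest to zero when no substantive response exists is an arbitrary assumption.

```python
def consolidate(*responses):
    """Collapse mutually exclusive variables into a single value."""
    if not responses:
        raise ValueError("at least one variable is required")
    # Assumption: non-negative values are substantive, negatives are missing codes.
    substantive = [r for r in responses if r >= 0]
    if len(substantive) > 1:
        raise ValueError("inputs are expected to be mutually exclusive")
    if substantive:
        return substantive[0]
    # Assumption: with no substantive response, keep the missing code closest
    # to zero (e.g. a don't know, -2, rather than a valid skip, -4).
    return max(responses)


print(consolidate(-4, 3, -4))   # 3
print(consolidate(-4, -4, -2))  # -2
```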

NLSY79 Appendix 12: Most Important Job Learning Activities (1993-94)

Respondents were asked in 1993 and in 1994 to identify the activities most important in helping them to learn how to perform their current or most recent job duties, and in helping them to learn about how work place changes would affect their jobs.

Most Important Job Learning Activities (1993) 

The 1993 data are contained in the following variables:

  • R41986. Most Important Activity To Learn Current/Most Recent Occupation (RE)
  • R41987. Prompt-Most Important Ac To Learn Current/Most Recent Occupation (RE)
  • R42641. Most Important Act To Learn How Wrkplc Chngs Would Affect Job
  • R42642. Prompt-Most Important Act To Learn How Wrkplc Chngs Would Affect Job?

These variables contain only codes. The substantive value labels for the codes are listed below:

1 More than one activity mentioned as most important
2 Classes or seminars
3 Spending time with supervisors
4 Spending time with coworkers
5 Using self-teaching materials
6 Learning new skills on own
7 Trial and error
8 Previous job experience
9 On-the-job/hands on experience, learning by doing
10 Other
20 Reported only classes or seminars
30 Reported only spending time with supervisors
40 Reported only spending time with coworkers
50 Reported only self-teaching materials
60 Reported only learning new skills on own

Most Important Job Learning Activities (1994)

These data are contained in the following variables:

  • R45931. Most Important Activity To Learn Job #1 Occupation (RE)
  • R45932. Prompt-Most Important Ac To Learn Job #1 Occupation (RE)
  • R46372. Most Important Activity To Learn Job #2 Occupation (RE)
  • R46373. Prompt-Most Important Ac To Learn Job #2 Occupation (RE)
  • R46809. Most Important Activity To Learn Job #3 Occupation (RE)
  • R46810. Prompt-Most Important Ac To Learn Job #3 Occupation (RE)
  • R47195. Most Important Activity To Learn Job #4 Occupation (RE)
  • R47196. Prompt-Most Important Ac To Learn Job #4 Occupation (RE)
  • R47530. Most Important Activity To Learn Job #5 Occupation (RE)
  • R47531. Prompt-Most Important Ac To Learn Job #5 Occupation (RE)
  • R48046. Most Important Act To Learn How Wrkplc Chngs Would Affect Job
  • R48047. Prompt-Most Important Act To Learn How Wrkplc Chngs Would Affect Job?

These variables contain only codes. The substantive value labels for the codes are listed below:

1 More than one activity mentioned as most important
2 Classes or seminars
3 Spending time with supervisors
4 Spending time with coworkers
5 Using self-teaching materials
6 Learning new skills on own
7 Trial and error
8 Previous job experience
9 On-the-job/hands on experience, learning by doing
10 Other
11 Classes/workshops/conferences
12 Practicing
13 Read journals/books/articles/viewed videos
14 Communicate with/observe experts/colleagues/peers
15 Self-taught/studying on own
16 Experience (unspecified)
20 Reported only classes or seminars
30 Reported only spending time with supervisors
40 Reported only spending time with coworkers
50 Reported only self-teaching materials
60 Reported only learning new skills on own

NLSY79 Appendix 11: Round 12 (1990) Survey Administration Methods

Round 12 of the NLSY79 survey incorporated a large-scale experiment comparing PAPI (Paper and Pencil Interviewing) and CAPI (Computer-Assisted Personal Interviewing) methods of interviewing, in anticipation of a possible conversion to CAPI-only data collection in the future. An experimental control-group design was implemented during the fielding period for the 1990 survey, allowing the examination of possible mode effects, any differences in data quality between modes, as well as differences in time and cost factors of administration. The CAPI version of the instrument was designed to replicate the paper instrument, which reduced the efficiency of the CAPI instrument somewhat. More detailed information on the results of examinations of mode effects is available through CHRR. There are no indications that data quality was adversely affected by the experimental CAPI administration; indeed, the efficiency of data collection and data quality appear to have improved. Variables depicting interview modes in the 1990 data reflect respondents' assignments to various design groups at the outset of the survey as well as the mode actually used, allowing researchers to conduct their own methodologically-oriented examinations (see vars. R34003., R34004.).

NLSY79 Appendix 9: Linking Employers from OnJobs Section to Employer Supplement and Through Survey Years

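This appendix contains several components:

  • Questionnaire Modules
  • Variables Linking OnJobs and Employer Supplement Employer Data
  • Process for Linking Jobs Through Survey Years
  • User note: Data characteristics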

For more information on these and other employer-related topics, consult the Jobs & Employers pages and Employment section of the NLSY79 Topical Guide.

Questionnaire Modules

OnJobs Module

In the OnJobs module, a list of employers for whom the respondent worked between the previous and current interviews is built (sorted most recent to least recent).

  • First, the list of employers with whom the respondent was active at the date of last interview ("DLI" employers) are verified and corrected if necessary.
  • Second, respondents are asked if they worked again for any employers they were associated with prior to the last interview ("PDLI" employers).
  • Third, respondents report any new employers ("NEW" employers) that are not on either the "DLI" or "PDLI" employer list.

The result is a complete list of employers for whom the respondent worked between the previous and current interviews. If multiple employers were reported, the list is then sorted from most to least recent stopdate, creating the final list or "roster" of ES employers for the current interview.

Before the 1979-2018 data release, and particularly for respondents with multiple employers, researchers were not able to definitively link information collected in the OnJobs module to specific employers in the ESs. Prior to 2002, data produced from the OnJobs modules were limited both in quantity and analytic utility. However, beginning in 2002, questions were added to the OnJobs modules to establish and verify a "job type" (traditional, non-traditional or self-employment) for each employer. Researchers have been interested in linking this more substantive information to specific ES employers. The 1979-2018 data release contains variables (described below) that make this possible.

Employer Supplements

An ES is administered for each employer on the final roster that results from the OnJobs module. Respondents are asked to provide extensive information about each employer. This information varies through survey years. The final job type determined for each job is available in the ES for survey years 2002-2018. However, the questions used to establish the job type are not available in the ES. The current data release includes variables that allow researchers to link data in the OnJobs module to the corresponding ESs. A description of these variables follows.

Variables Linking OnJobs and Employer Supplement Employer Data

For the first time with the 2018 data release, the NLSY79 data includes several variables linking employer data collected in the OnJobs module with the corresponding ES.

  • DLILINK.## - contains DLI employer loop number in OnJobs module corresponding to ES ## (2002-2012). Qnames Q6-8[].## contain job type and verification data for DLI employers.
  • PDLILINK.## - contains PDLI employer loop number in OnJobs module corresponding to ES ## (2002-2012). Qnames Q6-16[].## contain job type and verification data for PDLI employers.
  • NEWLINK.## - contains NEW employer loop number in OnJobs module corresponding to ES ## (2002-2012). Qnames Q6-27[].## contain job type and verification data for NEW employers.
  • EMPLINK.## - contains loop number in OnJobs module corresponding to ES ## (2014-2018) (DLI, PDLI and NEW employer loops in OnJobs module were collapsed into one loop beginning in 2014)

Consider the respondent reflected in Table 1 below. At the 2008 interview, this respondent confirms two DLI employers and reports the stopdates in Table 1. S/he also reports returning to a PDLI employer for a period between the 2006 and 2008 interviews, stopping work again on the date in Table 1. In addition, a NEW employer is reported with the stopdate in Table 1. Once these employers are confirmed, the job type questions would be asked in their entirety for newly added employers, or confirmed and updated if necessary for employers for which a job type was previously established.

Table 1: Example: 2008 Respondent OnJobs to Employer Supplement Links
OnJobs Employers Reported Stopdates reported (final roster is sorted by stopdate) ES Number (after sorting final roster by stopdate) DLILINK.## (## = ES number) PDLILINK.## (## = ES number) NEWLINK.## (## = ES number)
DLI emp #1 8/2007 4 DLILINK.04=1 PDLILINK.04=-4 NEWLINK.04=-4
DLI emp #2 11/2008 1 DLILINK.01=2 PDLILINK.01=-4 NEWLINK.01=-4
PDLI emp #1 8/2008 2 DLILINK.02=-4 PDLILINK.02=1 NEWLINK.02=-4
NEW emp #1 12/2007 3 DLILINK.03=-4 PDLILINK.03=-4 NEWLINK.03=1

In the example in Table 1:

  • Data in OnJobs DLI employer #2 (DLI loop #2) corresponds to ES #1
  • Data in OnJobs PDLI employer #1 (PDLI loop #1) corresponds to ES #2
  • Data in OnJobs NEW employer #1 (NEW loop #1) corresponds to ES #3
  • Data in OnJobs DLI employer #1 (DLI loop #1) corresponds to ES #4
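The Table 1 example can be reproduced with a short Python sketch. The function names and the tuple layout are hypothetical, and day-of-month values are set to 1 because the example reports only month and year. The sketch shows how the sort by stopdate determines ES numbers and how a link variable can then be read back to find the OnJobs loop for a given ES.

```python
from datetime import date

MISSING = -4  # code shown in Table 1 for link variables that do not apply
KINDS = ("DLI", "PDLI", "NEW")


def build_links(employers):
    """employers: list of (kind, onjobs_loop_number, stopdate) tuples."""
    # The final roster is sorted from most to least recent stopdate.
    roster = sorted(employers, key=lambda e: e[2], reverse=True)
    links = {}
    for es_number, (kind, loop, _stopdate) in enumerate(roster, start=1):
        for prefix in KINDS:
            name = f"{prefix}LINK.{es_number:02d}"
            links[name] = loop if prefix == kind else MISSING
    return links


def onjobs_source(links, es_number):
    """Return (kind, loop) of the OnJobs loop that feeds a given ES, or None."""
    for prefix in KINDS:
        loop = links.get(f"{prefix}LINK.{es_number:02d}", MISSING)
        if loop >= 1:
            return prefix, loop
    return None


employers = [
    ("DLI", 1, date(2007, 8, 1)),
    ("DLI", 2, date(2008, 11, 1)),
    ("PDLI", 1, date(2008, 8, 1)),
    ("NEW", 1, date(2007, 12, 1)),
]
links = build_links(employers)
print(links["DLILINK.01"], onjobs_source(links, 3))  # 2 ('NEW', 1)
```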

Process for Linking Jobs Through Survey Years

Beginning in July 2013, the comprehensive "Employer History" roster has been made available to data users. This roster not only establishes links between employers through multiple survey years, but compiles a large amount of information about employers from one survey year to another into a single record for each employer. For further information about the Employer History roster, see Appendix 28: Employer History Roster.

It is still possible for users to link the respondents' employers from each contiguous survey year to the next using the traditional method. This appendix describes the traditional job linking process.

Comparable variables exist for each employer in all survey years, allowing a link to be established through all contiguous interview years during which the respondent reported working for a specific employer.

Use the variables with question names beginning with "PREV_EMP." All of these previous employer variables have a reference number that begins with "W." For example, the variables listed below in Table 2 relate to employers reported during the 1990 interview.

Table 2. Variables identifying job number assigned in previous interview
Reference # Qname Variable Title
W05057.00 PREV_EMP#_1990_JOB#01 PREVIOUS JOB NUMBER AT LAST INTERVIEW, JOB 1, 1990
W05058.00 PREV_EMP#_1990_JOB#02 PREVIOUS JOB NUMBER AT LAST INTERVIEW, JOB 2, 1990
W05059.00 PREV_EMP#_1990_JOB#03 PREVIOUS JOB NUMBER AT LAST INTERVIEW, JOB 3, 1990
W05060.00 PREV_EMP#_1990_JOB#04 PREVIOUS JOB NUMBER AT LAST INTERVIEW, JOB 4, 1990
W05061.00 PREV_EMP#_1990_JOB#05 PREVIOUS JOB NUMBER AT LAST INTERVIEW, JOB 5, 1990

These variables identify the number that a job was assigned in the previous interview year, if that job was reported. If these variables contain a valid missing code (-4), then the job was not reported in the previous interview year (1989) and therefore cannot be linked to any employer in the previous year. This is essentially a "new" employer, reported for the first time during the current survey year (or possibly an employer reported before the previous interview). If any of these variables contains a valid number (1 or greater), this is the number of that job in the previous interview year. For example, if W05057.00 contains a "2", this would mean that employer #1 in 1990 is the same employer as employer #2 from 1989. One could then attach information from employer #1 in 1990 to information from employer #2 in 1989 as a continuing record of the respondent's experience with that employer. Using corresponding variables through contiguous interview years, information for an employer can be traced back through the first time the employer was ever reported by the respondent.
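Backward linking of this kind can be sketched as a simple loop. The nested-dictionary layout, the function name `trace_back` and the sample values are hypothetical; in practice the values would come from the PREV_EMP variables for each interview year.

```python
def trace_back(prev_emp, interview_years, year, job):
    """Follow an employer back through contiguous interview years.

    prev_emp maps interview year -> {job number: previous job number or -4}.
    interview_years is the sorted list of years the respondent was interviewed.
    Returns a list of (year, job number) pairs, most recent first.
    """
    chain = [(year, job)]
    while True:
        prev_job = prev_emp.get(year, {}).get(job, -4)
        position = interview_years.index(year)
        if prev_job < 1 or position == 0:
            break  # not reported at the previous interview, or no earlier interview
        year, job = interview_years[position - 1], prev_job
        chain.append((year, job))
    return chain


# Hypothetical respondent: job 1 in 1990 was job 2 in 1989 and job 1 in 1988.
prev_emp = {1990: {1: 2}, 1989: {2: 1}, 1988: {1: -4}}
print(trace_back(prev_emp, [1988, 1989, 1990], 1990, 1))
# [(1990, 1), (1989, 2), (1988, 1)]
```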

Of course, forward linking of employers can be accomplished in much the same way, using the same set of Previous Employer variables. For instance, to link information about employer #2 in 1989 with its continuation in 1990, one would search the 1990 Previous Employer Number variables for the number "2." Thus, a "2" in variable W05059.00 would indicate that employer #3 in 1990 is the continuation of employer #2 from 1989.
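A forward search of this kind might look like the following. The function name `find_continuation` and the sample values are hypothetical; the dictionary stands in for the five 1990 Previous Employer Number variables.

```python
def find_continuation(prev_emp_next_year, job_number):
    """Return the job number in the later interview that continues job_number.

    prev_emp_next_year maps each job number in the later interview to the
    job number it held at the previous interview (-4 if not reported then).
    """
    for later_job, previous_job in sorted(prev_emp_next_year.items()):
        if previous_job == job_number:
            return later_job
    return None  # the employer was not reported again


# Hypothetical 1990 values: job 3 carries a "2", so it continues 1989 employer #2.
prev_emp_1990 = {1: -4, 2: -4, 3: 2, 4: -4, 5: -4}
print(find_continuation(prev_emp_1990, 2))  # 3
```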

This procedure works through contiguous survey years, even if the respondent has skipped interviews and the survey years are not consecutive. For example, if a respondent was interviewed in all years from 1985-1990, a direct match cannot be made between an employer reported in 1990 and one reported in 1985 without first establishing matches (or the lack thereof) through all intervening years. However, if a respondent was interviewed in 1985 and not again until 1990, a link between employers reported in these two years would be accomplished in the same manner as that described above between 1989 and 1990, as there would be no intervening year(s) to interfere with a direct match.

User note: Data characteristics

Users should be aware of several data characteristics:

First, the NLSY79 employment history data are employer based. All references to a "job" should be understood as a reference to an employer. Information about work duties and positions and/or changes in duties or position performed or held during the respondent's tenure with a specific employer is collected as part of the record for that specific employer. For example, a respondent may regard him/herself as having held a number of "jobs" or positions with employer #1. However, any information collected about these different positions would all be regarded as information about employer #1.

The data for many survey years contain multiple versions of the PREVIOUS JOB NUMBER variables for employer matching. However, the PREVIOUS JOB NUMBER variables compiled as part of the Work History programming (as signified by reference numbers beginning with "W") consolidate this information into a single item for each employer for each survey year. Examples of these for 1990 are listed above in Table 2.

In the course of creating the Employer History roster mentioned earlier, a new, more comprehensive version of the unique id was created for each job ever reported. These variables are called "EMPLOYERS_ALL_UID." These new unique employer ids are present for all jobs reported by all respondents since 1979. While these ids will usually match the id described in the paragraph above, they will not always match. Users should consider using the EMPLOYERS_ALL_UID.## variables wherever necessary, as these will generally be more consistent and are available for all jobs.

This unique identification number consists of the first survey year in which the employer was reported, appended with the employer number in that survey year multiplied by 100. So for instance, an employer first reported as employer #1 in 1993 would have a unique employer id of 19930100. An employer first reported as employer #3 in 1994 would have the unique employer id of 19940300, and so on. This id number stays with the employer in subsequent survey rounds, whether or not more information is added to the employer record.
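The arithmetic behind this id can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this example; the construction simply follows the description above (first survey year, followed by the employer number multiplied by 100).

```python
from typing import Tuple


def make_employer_uid(first_year: int, employer_number: int) -> int:
    """Build the id: first survey year followed by employer number * 100."""
    return first_year * 10000 + employer_number * 100


def split_employer_uid(uid: int) -> Tuple[int, int]:
    """Recover (first survey year, employer number in that year) from an id."""
    first_year, remainder = divmod(uid, 10000)
    return first_year, remainder // 100


print(make_employer_uid(1993, 1))    # 19930100
print(make_employer_uid(1994, 3))    # 19940300
print(split_employer_uid(19940300))  # (1994, 3)
```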

Users should note that previous versions of "unique ids" for employers were only available for respondents still eligible for interview in 1996 and subsequent survey years.
