NLSY79 Appendix 18: Work History Data

DESCRIPTION OF THE 1979-2018 NLSY79 WORK HISTORY PROGRAM

Description of the 1979-94 PL/1 Program

CHANGES TO THE WORK HISTORY DATA

DESCRIPTION AND CODES FOR VARIABLES IN 1979-2018 NLSY79 WORK HISTORY DATA

VARIABLES USED IN CREATION OF 1996 AND SUBSEQUENT WORK HISTORY DATA FILE

NLSY79 WEEK NUMBERS AND CORRESPONDING DATES (Separate Excel File). The Continuous Week Crosswalk contains the start date for each week (Sunday) from January 1, 1978, through December 31, 2021, and the week numbers assigned to that week in the construction of the work history data file. These week numbers do not match the week numbers printed on the employment calendar included with the survey instrument materials for earlier survey years. Week numbers in the work history programs are assigned based upon actual dates collected during the course of the interview. The variable names for the week-by-week arrays (status, hours, dual jobs) incorporate the specific year and number of the week within the specific year. For example, the 10th week in 1989 in the status array is called STAT8910. These names do not correspond to the strictly consecutive week numbers from 1-2184 listed in the Excel spreadsheet. The spreadsheet also contains the week numbers for each calendar year so that users will have a crosswalk for both calendar-year and continuous week numbers.
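
The week numbering described above can be sketched in a few lines of Python. This is a minimal illustration, not the program used to build the file. The function names `continuous_week` and `status_name` are hypothetical, and the sketch assumes that week 1 is the seven days beginning Sunday, January 1, 1978, and that every later week also starts on a Sunday. The Excel crosswalk remains the authoritative source, particularly for how weeks are numbered within a calendar year.

```python
from datetime import date

WEEK1_START = date(1978, 1, 1)  # a Sunday; start of week 1 per the text


def continuous_week(d: date) -> int:
    """Continuous week number counted from 1/1/78 (assumed Sunday-based).

    Dates before 1/1/78 map to 0, mirroring the week #0 slot in the arrays.
    """
    if d < WEEK1_START:
        return 0
    return (d - WEEK1_START).days // 7 + 1


def status_name(year: int, week_in_year: int) -> str:
    """Build a status-array name such as STAT8910 (year 1989, week 10).

    The within-year week number must come from the crosswalk; the naming
    pattern is taken only from the single example given in the text.
    """
    return f"STAT{year % 100:02d}{week_in_year:02d}"


print(continuous_week(date(1978, 1, 8)))  # second Sunday-based week
print(status_name(1989, 10))
```

Because the within-year week is passed in rather than computed, the sketch makes no claim about how a week that straddles two calendar years is assigned.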

DESCRIPTION OF THE 1979-2018 NLSY79 WORK HISTORY PROGRAM

This document provides a general explanation of the procedures and logic of the work history programming and variables. The original PL/I programming was used to establish and maintain the structure and data from survey years 1979 to 1994. Therefore, the following discussion heavily references these programs. The series of SQL programs currently in use were converted directly from the PL/I code for the 1996 release.

The original PL/I work history program was written to create key work variables like "Number of Weeks Worked since Date of Last Interview," "Number of Weeks Worked in Last Calendar Year," etc. These key variables use all recorded jobs for each respondent (up to 10 jobs). The WEEKLY LABOR STATUS, HOURS WORKED, and DUAL JOBS arrays also were created with data from up to 10 jobs for each respondent. However, only 1% of all respondents have more than 5 jobs in any given survey year, resulting in valid missing data for jobs 6 through 10 for 99% of the sample. In order to reduce the total number of variables, public data files contain the specific employer variables for only 5 jobs for each respondent.

The purpose of the WEEKLY LABOR STATUS, HOURS WORKED and DUAL JOBS arrays is to create a longitudinal work history record for each respondent through the 2018 (round 28) interview date. Because each year's survey collects information on jobs held and periods not working since the date of the last interview, it is possible to construct a continuous, week-by-week record for each respondent.

There are a few exceptions, however. In the 1979 and 1980 surveys, job information was collected only for respondents age 16 and older at the date of the interview. Additionally, the 1979 survey data contain the most cases with inconsistent or invalid employment-related data of any survey year, resulting in a greater proportion of missing gaps in the work history record. For example, in 1979 there are 86 cases that have job dates that exceed the interview date; in 1980, there are 11 cases that have job dates that exceed the interview date; in 1981 there are none.

Users should also note that 1,079 members of the military sample were dropped as of the 1985 survey. In 1991, all members of the economically disadvantaged non-black/non-Hispanic oversample were dropped as well. More information on these sample types is available in the Retention and Reasons for Noninterview section.

Description of the 1979-94 PL/1 Program

The following is an abbreviated step-by-step description of the original 1979-1994 PL/I programming. In 1996, the PL/I program was converted to SQL code in a series of programs that replicate the PL/I program and functions. 

    1. All of the variables used in the program are declared and most are included in the PL/I structure called VARIABLES.
    2. The variables common to all respondents, like ID, SAMPLE_ID, etc. are assigned values. The week-by-week arrays are initialized to zero and all of the variables included in the WORK_HISTORY part of the structure are initialized to -4.
    3. For each interview year, procedures (VARIABLES1979, VARIABLES1980, etc.) that assign the variables for each survey year are called if the respondent was interviewed. Start and stop dates for jobs and periods not working are sent to the WEEK procedure, where the valid month, day and year variables are converted to a week number, with week 1 being January 1, 1978. If the respondent was not interviewed, then all WORK_HISTORY variables for that survey year are set to -5.
    4. After all VARIABLES19XX are assigned, the procedure CALC is called to evaluate the various start and stop dates, to assign codes, and to create the job number for all of the jobs for each interview year. Within CALC, the procedure FILL is called to fill in the codes that are assigned to the WEEKLY LABOR STATUS and DUAL JOBS arrays and to calculate the hours worked during each week that are loaded into the HOURS WORKED array.
    5. Finally, the procedure SUMMER is called to calculate and sum the key work history variables.
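
    The initialization in steps 2 and 3 can be pictured with the short Python sketch below. It is an illustration only, not a translation of the PL/I or SQL code: the function `init_record`, its arguments, and the dictionary layout are all hypothetical. It shows the weekly arrays starting at zero, the WORK_HISTORY variables starting at -4, and the -5 assigned for survey years in which the respondent was not interviewed.

```python
VALID_SKIP = -4     # initial value of the WORK_HISTORY variables (step 2)
NONINTERVIEW = -5   # value assigned when the respondent was not interviewed (step 3)


def init_record(n_weeks, survey_years, interviewed_years, var_names):
    """Return (weekly_arrays, work_history) for one respondent.

    Index 0 of each weekly array stands for week #0 (time before 1/1/78).
    """
    weekly = {name: [0] * (n_weeks + 1) for name in ("status", "hours", "dual")}
    work_history = {
        year: {
            var: (VALID_SKIP if year in interviewed_years else NONINTERVIEW)
            for var in var_names
        }
        for year in survey_years
    }
    return weekly, work_history


weekly, wh = init_record(2184, [1979, 1980], {1979}, ["WKSWK"])
print(len(weekly["status"]), wh[1979]["WKSWK"], wh[1980]["WKSWK"])
```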

    CALC Procedure (in original PL/1 program)

    This procedure processes all jobs for each survey year, beginning with the first job. CALC starts by calculating each year the number of jobs since the date of the last interview, assigning a job number, and calculating the hourly wage for each job. If the respondent had the job at the date of the last interview, the start date becomes the date of the last interview, which is then "ceiled" or rounded up using the "ceil" function. Next, if the respondent is currently working at the job, it assigns the interview date, which is "floored" or rounded down using the "floor" function, as the stop date. (All dates at this point have been converted to week numbers in the WEEK procedure.)

    If the start and stop dates of the job are valid and do not coincide with an interview date, the start and stop dates are "ceiled." The number of weeks tenure on the job is calculated by subtracting the start week from the stop week of the job. FILL is then called to fill in the week arrays for the particular job. The start and stop weeks of the job, the job number, and the number of hours usually worked per week (HOURSWEEK) at the job are sent to the FILL procedure.
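
    The rounding and tenure rules in the two paragraphs above can be sketched as follows. The sketch assumes that the WEEK procedure returns fractional week numbers, which the text implies but does not state. The function name `job_span` and its parameters are hypothetical.

```python
import math


def job_span(start, stop, last_int, curr_int, held_at_last_int, held_now):
    """Return (start_week, stop_week, tenure_weeks) for one job.

    All inputs are week numbers, possibly fractional (an assumption).
    """
    # A job held at the last interview starts at that interview date, ceiled.
    start_wk = math.ceil(last_int if held_at_last_int else start)
    # A job still held stops at the current interview date, floored;
    # any other valid stop date is ceiled.
    stop_wk = math.floor(curr_int) if held_now else math.ceil(stop)
    return start_wk, stop_wk, stop_wk - start_wk


print(job_span(10.3, 20.6, 5.2, 30.4, False, False))
print(job_span(10.3, 20.6, 5.2, 30.4, True, True))
```

    Flooring the stop date of a current job while ceiling the start date of a job carried over from the last interview keeps the shared interview week from being counted in both survey years.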

    If the job had any periods not working associated with it, then up to four periods not working for the employer are processed. If the start and stop dates for the periods not working are valid, a code is assigned indicating whether the respondent was out of the labor force (OLF) or unemployed for the period. If the respondent is OLF the whole period, a code of 5 is assigned. If the period not working is divided between OLF and unemployed, a temporary code of 9 is assigned and the number of weeks unemployed is determined. If the start and stop dates of the period are valid, but the labor force status cannot be determined, a code of 2 is assigned.

    The period start and stop dates, CODE, and HOURSWEEK are sent to FILL. If the period dates are invalid, a code of 3 is assigned and start and stop dates of the job are passed to FILL, along with HOURSWEEK. This is only done for the first period not working for the first employer this week.
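
    The code assignment for a period not working might be summarized as in the sketch below. It uses the code meanings listed later in this appendix (4 = unemployed, 5 = OLF, 2 = status unknown, 3 = period dates missing, 9 = temporary mixed code). The function `gap_code` and its inputs are hypothetical, and the handling of a period that is unemployed throughout is inferred from the code list rather than stated in the program description.

```python
def gap_code(dates_valid, weeks_unemployed, total_weeks):
    """Pick the code sent to FILL for one period not working.

    weeks_unemployed is None when labor force status cannot be determined.
    """
    if not dates_valid:
        return 3            # period dates invalid
    if weeks_unemployed is None:
        return 2            # valid dates, OLF vs. unemployed unknown
    if weeks_unemployed <= 0:
        return 5            # OLF for the whole period
    if weeks_unemployed >= total_weeks:
        return 4            # unemployed for the whole period (inferred)
    return 9                # part OLF, part unemployed (temporary code)


print([gap_code(True, w, 10) for w in (0, 4, 10, None)])
```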

    Next, tenure at the job is again calculated, this time in terms of total weeks on the job instead of just since the date of the last interview. First, a determination is made to see if the employer is the same employer a respondent reported at the time of the previous interview. If there is a previous employer number and the tenure for that previous employer is valid, then the tenure for the job from the previous interview is added to the tenure for the job being processed. Only tenure with an employer that is reported during contiguous survey years can be calculated over the total time spent with an employer. For example, consider a respondent who was interviewed in 1981, 1982 and 1983 surveys. Now suppose the respondent reported having worked for the Department of Labor at the time of the 1981 survey and left and then began working for that same employer again by the time of the 1983 survey. Because the employer numbers are only followed between contiguous interviews, there is no way to calculate total tenure with the Labor Department since the respondent did not report that employer during the 1982 survey. Only employers from the previous year's survey are compared with employers reported in the current year's survey.
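
    A minimal sketch of the tenure accumulation follows. The function `total_tenure` is hypothetical. What the program does when the previous tenure is invalid is not described above, so the sketch leaves the current value unchanged in that case.

```python
def total_tenure(tenure_since_last_int, prev_tenure):
    """Add tenure carried from the previous interview, when it is usable.

    prev_tenure is the tenure recorded for the same employer at the
    immediately preceding interview, or None if the employer was not
    reported then. Negative values stand for missing-data codes.
    """
    if prev_tenure is not None and prev_tenure >= 0:
        return tenure_since_last_int + prev_tenure
    return tenure_since_last_int


print(total_tenure(50, 30))    # employer also reported at the previous interview
print(total_tenure(50, None))  # employer absent at the previous interview
```

    In the Department of Labor example, the 1983 job has no match among the 1982 employers, so the second call applies and the weeks accumulated through 1981 are not carried forward.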

    Finally, CALC evaluates up to six periods not working or in the military between jobs. For each of the periods not working, the same logic used for the periods not working on a job is used for the periods between jobs.

    User Notes

    1. If the start and stop dates for a job are invalid, then that job has no dates that can be sent to FILL. As a result, there is no record of that job in the WEEKLY LABOR STATUS array and no indication that the job is missing. In 1979, there were 1190 cases with any invalid start or stop dates (i.e., at least one week is unaccounted for - WEEKLY LABOR STATUS=0); in 1980, there were 942 cases; in 1981, there were 254; and in each of the following survey years, there were fewer than 200 cases.

    2. A job held in any day of a week is counted as a job for the whole week. This is achieved by "flooring" start dates and "ceiling" stop job dates to integer week values. There is one exception previously mentioned--stop dates for jobs held at the interview date are floored. This is done to avoid double counting across interview years.

    3. Start and stop dates for periods not working either within tenure with a job or between jobs are "ceiled" in FILL.

    4. The HOURS WORKED array is set to -3 if any job in the week has an invalid value for HOURSWEEK. Between 1979 and 1992, the maximum number of hours for any given week is 96. Beginning in 1993, the maximum number of hours for a given week can be reported up to 168 hours (the total number of hours possible in a single week).
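
    The hours rule in note 4 can be illustrated with the sketch below. The function `weekly_hours` is hypothetical, and treating `None` or a negative number as an "invalid value for HOURSWEEK" is an assumption made for the illustration.

```python
def weekly_hours(job_hours, survey_year):
    """Total usual hours for one week across all jobs held that week."""
    if any(h is None or h < 0 for h in job_hours):
        return -3                      # any invalid HOURSWEEK marks the week
    cap = 96 if survey_year <= 1992 else 168
    return min(sum(job_hours), cap)


print(weekly_hours([40, 60], 1990))
print(weekly_hours([40, 60], 1993))
print(weekly_hours([40, -2], 1990))
```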

    FILL Procedure

    The FILL procedure takes the start and stop dates that have been converted to week number values and fills in values for the WEEKLY LABOR STATUS, HOURS WORKED and DUAL JOBS arrays for each week between stopping and starting dates that are passed to it.

    In FILL, the WEEKLY LABOR STATUS array is loaded with either a survey year job number or a code signifying that there was not one civilian job that week (a code of 0, 2, 3, 4, 5, or 7). The DUAL JOBS array is loaded with a survey year job number(s) if more than one civilian job is held that week; otherwise it is assigned a value of 0. The HOURS WORKED array is loaded with the number of hours worked on all jobs held that week, up to a maximum of 96 through 1992, and a maximum of 168 in subsequent years.

    FILL is called from the CALC procedure for all start and stop dates except for military start and stop dates. Military start and stop dates are determined in the VARIABLES procedures for each year, and FILL is called from those procedures to fill in a code of 7 in the WEEKLY LABOR STATUS array for active military service.

    Initially, FILL checks for valid start and stop dates. If the dates are valid, then FILL takes one of three paths. The first path is to evaluate the WEEKLY LABOR STATUS array for that week to see (1) if it contains a job number, (2) if the code passed from CALC is a job number, and (3) if the previous employer number for the job is different from the job number in the WEEKLY LABOR STATUS array. If all of these statements are true, then FILL determines that the job is not a duplication of the job that exists in the WEEKLY LABOR STATUS array for that week.

    Next, FILL looks at the DUAL JOBS array to see if there is a job number in DUAL JOBS. If DUAL JOBS already has a job number(s), then the current job number is compared to the job number(s) in DUAL JOBS. If the job number does not exist in DUAL JOBS, then the HOURSWEEK for that job is added to the number of hours for that week for the HOURS WORKED array and the job number is added to DUAL JOBS. If the job is a duplicate job, then nothing is done to the arrays.
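
    The first path can be sketched for a single week as shown below. This is a simplification under stated assumptions: the function `add_dual_job` is hypothetical, the duplicate test compares job codes directly rather than previous employer numbers, and the four-slot limit described later in this section is applied here. The weekly cap on hours is left out.

```python
MAX_DUAL = 4  # each week has at most four dual job slots


def add_dual_job(status_code, dual_jobs, hours, job_code, hoursweek):
    """Return updated (dual_jobs, hours) for one week.

    status_code is the value already in WEEKLY LABOR STATUS for the week.
    """
    is_job = status_code > 100 and job_code > 100
    duplicate = job_code == status_code or job_code in dual_jobs
    if not is_job or duplicate or len(dual_jobs) >= MAX_DUAL:
        return dual_jobs, hours        # nothing is done to the arrays
    return dual_jobs + [job_code], hours + hoursweek


print(add_dual_job(301, [], 40, 305, 10))
print(add_dual_job(301, [305], 50, 305, 10))
```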

    The second path is taken if there is no dual job and if the week dates are associated with a job or if there is no job number in the WEEKLY LABOR STATUS array. If this is the case, FILL tests for two conditions. The first condition is met if COD is 9. (A code of 9 means that the respondent had a period not working that was part OLF and part unemployed.) If COD equals 9, then the HOURSWEEK are subtracted from the hours in the HOURS WORKED array, because the respondent is not working at the job. The number of weeks unemployed (code of 4) is arbitrarily assigned to the middle portion of the weeks not working, and the rest of the period is determined to be OLF (code of 5).

    The second condition in the second path tests to see if the value in the WEEKLY LABOR STATUS array is not a code of 4; if COD is a job number then the job number is placed into WEEKLY LABOR STATUS. If there are hours for the week and if the respondent was not working for the employer during this week, then the hours for the week are set to zero if the HOURS WORKED array is greater than zero. Otherwise, HOURS WORKED receives whatever value is in HOURSWEEK.

    The third path FILL evaluates is if the week falls in a period not working and if there is a dual job. Then, the job number is deleted from the DUAL JOBS array and HOURSWEEK for the job are subtracted from the HOURS WORKED array.

    Finally, if there are more than four dual jobs in the DUAL JOBS array then no other job numbers are added to DUAL JOBS because the array for each week is limited to four dual job variables.
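
    The handling of a code 9 period, in which the unemployed weeks are placed in the middle of the period not working, can be illustrated as follows. The function `split_mixed_gap` is hypothetical. Treating the start and stop weeks as inclusive, and putting any odd leftover week at the end of the period, are assumptions, since the text only says the weeks are assigned to the "middle portion."

```python
def split_mixed_gap(start_wk, stop_wk, weeks_unemployed):
    """Return one status code per week for a part-OLF, part-unemployed gap."""
    n = stop_wk - start_wk + 1             # inclusive week count (assumed)
    k = max(0, min(weeks_unemployed, n))   # unemployed weeks that fit
    lead = (n - k) // 2                    # OLF weeks before the middle block
    return [5] * lead + [4] * k + [5] * (n - k - lead)


print(split_mixed_gap(10, 19, 4))
# -> [5, 5, 5, 4, 4, 4, 4, 5, 5, 5]
```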

     

    User Notes

    A few last notes about FILL:

    1. Civilian work takes precedence over any other activity. If the respondent has a civilian job while in the military, then the civilian job code replaces the military code in the WEEKLY LABOR STATUS array.

    2. The order of precedence in the construction of the WEEKLY LABOR STATUS array after a civilian job is as follows:

    1. a code of 3, associated with an employer but periods not working with employer are missing; if any period not working is missing, then the entire period of the job is assigned a 3. In 1979, there are 274 cases with invalid period dates, and in each of the following survey years, there are fewer than 60 cases
    2. a code of 4, unemployed
    3. a code of 5, OLF
    4. a code of 2, period not working with employer, but OLF vs unemployed status is unknown
    5. a code of 7, active military service
    6. a code of 0, no information is reported to account for the week

    3. About 32 cases have a week in which JOB # 1 from a survey week first appears in the DUAL JOBS array rather than the WEEKLY LABOR STATUS array. This occurs when (1) there is a discrepancy between the date of the previous interview date as it appears on the info sheet that the interviewer uses at the time of the interview and the interview date recorded at the previous interview or (2) the starting date and ending date for a job across interview years are the same due primarily to the way the dates are floored and ceiled. In all these cases, an erroneous entry appears in the DUAL JOBS array for that given week.
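
    The precedence rules in notes 1 and 2 can be expressed as a small lookup. The function `resolve_status` is hypothetical. When several civilian jobs compete for the same week it simply returns the first one supplied, because the choice among jobs is governed by separate rules described elsewhere in this appendix.

```python
# Order of precedence after a civilian job, taken from note 2 above.
NON_JOB_PRECEDENCE = [3, 4, 5, 2, 7, 0]


def resolve_status(candidates):
    """Pick the WEEKLY LABOR STATUS code for a week from competing codes."""
    jobs = [c for c in candidates if c > 100]
    if jobs:
        return jobs[0]                 # civilian work beats everything else
    for code in NON_JOB_PRECEDENCE:
        if code in candidates:
            return code
    return 0                           # nothing reported for the week


print(resolve_status([7, 301]))  # civilian job while in the military
print(resolve_status([5, 4]))
```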

    Changes to the Work History Data

    There have been a number of changes and updates to the programs and input variables used to create the Work History data over the life of the NLSY79. The most significant programming change was the conversion of the programming from PL/1 to equivalent SQL code. A list of the main NLSY79 variables used in the creation of the 1979-96 work history data set is accessible at the end of this appendix.

    Other changes and updates are detailed in Changes in the NLSY79 Work History Data.

    DESCRIPTION AND CODES FOR VARIABLES IN 1979-2018 NLSY79 WORK HISTORY DATA

    Below are discussions of three types of variables:

        • the weekly arrays created by the Work History programs 
        • other items produced by the Work History programs 
        • variables that are either used in the Work History programs, or are basic, commonly used job-specific survey items that were duplicated on the separate Work History data sets prior to the 1979-2000 release.  When the Work History data became part of the general public 1979-2000 release, these variables were assigned to areas of interest titled WORK HISTORY --MAIN -- JOB INFORMATION [YEAR], to make it easier for historical data users to recreate the separate Work History data files they may have been working with up to that point.

    Variable coding information, as well as formulas for combining job-specific characteristics from several sources, are included where relevant.

    Work history weekly array variables

    The foundation of the work history data is the set of week-by-week arrays depicting labor force status, total number of hours, and dual job holdings if any, for each week since January 1, 1978. These array variables are found in three areas of interest in the NLSY79 public release. The construction and coding for each of the three arrays are described below, listed by their area of interest.

    Area of interest: WORK HISTORY-WEEKLY LABOR STATUS

    The WEEKLY LABOR STATUS array is the work history week array. Each variable corresponds to a week relative to 1/1/78.[1] There are 2185 variables in the 1979-2018 WEEKLY LABOR STATUS array--one for week #0 and one for each of the 2184 weeks from 1/1/78 to 11/10/2019.[1] [4] There are no missing data codes, and the codes that are in the array are as follows:

      0= no information reported to account for week.
      2= not working (unemployment vs. out of the labor force cannot be determined.)
      3= associated with an employer but the periods not working for the employer are missing. If all of the time with the employer cannot be accounted for, a 3 is loaded into the STATUS array instead of a job code.
      4= unemployed. If a respondent is not working and part of the time is spent looking for work or on layoff, the exact weeks spent looking for work is unknown. As a result, the number of weeks spent looking is assigned to the middle part of the period not working.
      5= out of the labor force.
      7= active military service. If a respondent has a civilian job while in active military service, the civilian job code is loaded into the array instead of a code of 7.
      >100= worked. The code represents the appropriate work history year multiplied by 100 plus the job number for that employer in that year. For example, 102=year 1, job 2; 305=year 3, job 5. This allows one to associate any characteristic for a job with that week. If a respondent has more than one job at the same time, the job number that is loaded into the array is determined by the starting date of the job with the lowest job number, not by any particular characteristics of the job such as the number of hours worked at the job. The year in the job code is the year in which the job is reported. Jobs held in year 2, but reported in year 10 would be assigned job numbers beginning with 1001 instead of 201.
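
    Decoding a job code is a matter of integer division, as the sketch below illustrates. The function names are hypothetical. The mapping from work history year (survey round) to survey year assumes annual interviews through 1994 and biennial interviews from 1996 on, which is consistent with the round 3 = 1981 and round 28 = 2018 pairs given in this appendix.

```python
def decode_job_code(code):
    """Split a status or dual job code into (work_history_year, job_number)."""
    if code <= 100:
        raise ValueError("codes of 100 or less are not job codes")
    return divmod(code, 100)


def round_to_survey_year(rnd):
    """Survey year for a round: annual through 1994, biennial afterward."""
    return 1978 + rnd if rnd <= 16 else 1994 + 2 * (rnd - 16)


print(decode_job_code(102), decode_job_code(305), decode_job_code(1001))
# -> (1, 2) (3, 5) (10, 1)
print(round_to_survey_year(3), round_to_survey_year(28))
# -> 1981 2018
```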

     

    User Notes

    In some cases, a respondent reports a period not working that is part OLF and part unemployed. In these cases, a week-specific distinction between OLF and unemployed cannot be made. Users should refer to the Work History Program Description in this appendix for a discussion of how OLF and unemployed codes are assigned to the WEEKLY LABOR STATUS array in the event that such a period occurs.

    Area of interest: WORK HISTORY-HOURS WORKED

    The HOURS WORKED array contains the usual hours worked per week at all jobs. There are 2185 variables in the 1979-2018 HOURS WORKED array--one for week #0 and one for each of the 2184 weeks from 1/1/78 to 11/10/2019.[2] [5] The codes are as follows:

      0 = no hours worked or interview does not cover array week
      1-95 = usual hours worked per week
      96 = 96 or more hours per week
      -5 = noninterview
      -4 = valid skip
      -3 = invalid skip
      -2 = don't know
      -1 = refusal

     

    User Notes

    Beginning in 1993, the first all-CAPI survey year, the maximum hours allowed per week is 168.

    Area of interest: WORK HISTORY-DUAL JOB 1-4

    The DUAL JOB arrays contain job numbers for any weeks when the respondent worked at more than one job simultaneously. There can be up to 2185 variables in each DUAL JOB [#] array -- one for week #0 and one for each of the 2184 weeks from 1/1/78 to 11/10/2019.[2] [5] DUAL JOB array variables are present if a dual job was reported.[3]

    The codes are as follows:

      0 = no dual job
      >100 = dual job year and job number

    For example, if a respondent worked at three jobs at the same time, the code for the lowest job number would be in the WEEKLY LABOR STATUS array, and the codes for the other two jobs would be in the DUAL JOB arrays (see item 3 in the user notes below). If the three jobs that the respondent held during week 190 from the 1981 (round 3) survey were jobs 1, 5, and 6, then WEEKLY LABOR STATUS would contain the value '301' for that week, the DUAL JOB 1 array for week 190 would contain the value '305' and the DUAL JOB 2 array for week 190 would contain '306'.
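
    A user who wants every job held in a given week has to combine the status value with the dual job values for that week. The sketch below shows one way to do that for the week 190 example. The function `jobs_in_week` is hypothetical, and the values are passed in directly rather than read from the data file.

```python
def jobs_in_week(status_code, dual_codes):
    """List all job codes for one week, status array first."""
    return [c for c in [status_code, *dual_codes] if c > 100]


# Week 190 in the example: jobs 1, 5, and 6 from the 1981 (round 3) survey.
print(jobs_in_week(301, [305, 306, 0, 0]))
# -> [301, 305, 306]
```

    Codes of 100 or less are dropped, so a week with no civilian job yields an empty list.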

    User Notes

    A few additional notes are in order:

    1. The maximum number of dual jobs accounted for is 4. The variable descriptions for variables in the WORK HISTORY - DUAL JOB [#] areas of interest indicate the relevant job number and week.

    2. The DUAL JOB [#] arrays do not provide labor force status in the detailed manner of the WEEKLY LABOR STATUS array. They contain only second, third, fourth, and fifth job numbers for weeks in which the respondent reports more than one employer.

    3. Users should be aware that it is possible in survey years 1979-92 for the CPS job number to appear in one of the DUAL JOB [#] arrays instead of the WEEKLY LABOR STATUS array, as would be expected. In most cases, the CPS job will be the lowest number job for a given year. However, this is not always the case. Each year contains a relatively small number of cases for which JOB #1 is not the CPS job. For these cases, the job number assigned by the work history program will not necessarily be the lowest one for that year. In cases for which the CPS job is not held simultaneously to any other job, the job number for the CPS job will appear in the WEEKLY LABOR STATUS array as expected. However, in cases for which the CPS job is held simultaneously with another job with a lower job number, the possibility exists that the job number for the CPS job will appear in one of the DUAL JOB [#] arrays instead of the WEEKLY LABOR STATUS array. Mechanical changes implemented in the 1993 CAPI instrument to ensure that the CPS job is always the first job have virtually eliminated this possibility from 1993 forward.

    Work History non-weekly array created

    Non-weekly array variables produced by the work history programs are listed below by Work History area of interest. Variables marked with an asterisk (*) contain an actual consecutive week number, ranging from week #0-2184, with the week of January 1, 1978, being week #1. Week #0 represents information for time prior to that date.

    Area of interest: WORK HISTORY-MAIN-CREATED

    TENURE[#]

    Total weeks tenure at each job as of interview date

    MILWK-SLI

    Weeks of active military service since date of last interview

    WKSWK-SLI

    Number of weeks worked since date of last interview

    HRSWK-SLI

    Number of hours worked since date of last interview

    WKSUEMP-SLI

    Number of weeks unemployed since date of last interview

    WKSOLF-SLI

    Number of weeks out of the labor force since date of last interview

    WKSUNACCT-SLI

    Percentage of weeks unaccounted for in calculating weeks worked since date of last interview

    MILWK-PCY

    Weeks of active military service in past calendar year

    WKSWK-PCY

    Number of weeks worked in past calendar year

    HRSWK-PCY

    Number of hours worked in past calendar year

    WKSUEMP-PCY

    Number of weeks unemployed in past calendar year

    WKSOLF-PCY

    Number of weeks out of the labor force in past calendar year

    WKSUNACCT-PCY

    Percentage of weeks unaccounted for in calculating weeks worked in past calendar year

    WKSSINCELI

    Number of weeks since date of last interview

    JOBSNUM

    Number of jobs ever reported as of interview date
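
    The since-last-interview summaries can be approximated from the WEEKLY LABOR STATUS array, as in the sketch below. This is not the SUMMER procedure. The function name, the choice of window (the week after the last interview through the current interview week), and the treatment of codes 2 and 3 (left out of every count shown) are assumptions made for illustration.

```python
def summarize_since_last_interview(status, last_int_wk, curr_int_wk):
    """Count weeks by status between two interview weeks.

    status is indexed by continuous week number, with index 0 = week #0.
    """
    window = status[last_int_wk + 1 : curr_int_wk + 1]
    n = len(window)

    def count(pred):
        return sum(1 for c in window if pred(c))

    return {
        "WKSWK": count(lambda c: c > 100),    # weeks worked
        "WKSUEMP": count(lambda c: c == 4),   # weeks unemployed
        "WKSOLF": count(lambda c: c == 5),    # weeks out of the labor force
        "MILWK": count(lambda c: c == 7),     # weeks of active military service
        "WKSUNACCT_PCT": round(100 * count(lambda c: c == 0) / n) if n else 0,
    }


status = [0, 101, 101, 101, 101, 4, 4, 5, 5, 7, 0]
print(summarize_since_last_interview(status, 0, 10))
```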

    Area of interest: WORK HISTORY-MAIN-JOB INFORMATION-[YEAR]

    HRP[#]

    Usual wage earned at each job converted to an hourly rate

    Area of interest: WORK HISTORY-HISTORY

      LASTINT_WK#_[YEAR]* Week of last interview
      CURRINT_WK#_[YEAR]* Week of current interview

    Area of interest: WORK HISTORY-CALENDAR YEAR

      CAL_YEAR_JOB[#]_[YEAR] Job number that is loaded into the WEEKLY LABOR STATUS array for each job. The 1st two digits of the number are the year (01 thru 28) and the 2nd two digits are the job for that year (job 01 thru 10)
      CAL_YEAR_JOBS_[YEAR] Number of jobs in past calendar year
      WKS_NWMISSC_[YEAR] Percentage of weeks not employed in past calendar year that cannot be split between unemployed and out of the labor force

    Area of interest: WORK HISTORY-JOBS

      START_WK#_[YEAR]_JOB#[##] Starting week of each job
      STOP_WK#_[YEAR]_JOB#[##] Stopping week of each job
      PER[#]_START_[YEAR]_JOB#[##] Starting week of each period not working for each job
      PER[#]_STOP_[YEAR]_JOB#[##] Stopping week of each period not working for each job

    Area of interest: WORK HISTORY-GAPS BETWEEN JOBS

      BSTART_[YEAR]_PERIOD[#] Week started each period not working between jobs
      BSTOP_[YEAR]_PERIOD[#] Week stopped each period not working between jobs.

    Area of interest: WORK HISTORY-SINCE LAST INTERVIEW

      LASTINT_#JOBS_[YEAR] Number of jobs since the date of the last interview
      WKS_NWMISSL_[YEAR] Percentage of weeks not employed since the date of the last interview that cannot be split between unemployed and out of the labor force

    Area of interest: WORK HISTORY-MILITARY

      MIL_START1_[YEAR] Starting week of first period of active military service.
      MIL_START2_[YEAR] Starting week of second period of active military service.
      MIL_STOP1_[YEAR] Stopping week of first period of active military service.
      MIL_STOP2_[YEAR] Stopping week of second period of active military service.

     

    NLSY79 Main Data Work History variables

    A third set of variables are either used in the Work History programs, or are basic, commonly used job-specific and gap-related survey items that were at one time duplicated on the separate Work History data sets prior to the 1979-2000 release. When the Work History data became part of the general public 1979-2000 release, these variables were assigned to areas of interest titled WORK HISTORY -- MAIN -- JOB INFORMATION [YEAR], to make it easier for historical data users to recreate the separate Work History data files they may have been working with up to that point. These areas of interest continue to be maintained. Go to Work History Job- and Gap-specific Survey Items table 2018 to see these variables, with example reference numbers from the most recent round.

    VARIABLES USED IN CREATION OF 1996 AND SUBSEQUENT WORK HISTORY DATA FILES

    Beginning in 1996, the work history programming was converted to SQL programming. The SQL programs, which mirror the older PL/1 program, are not available to users. However, the Work History Input Variables 1996-Present Table lists the variables used as inputs to the SQL programs. Users who need more information should contact NLS User Services.

    Users should be aware that not all of the variables listed in the table appear in the NLSY79 public release data file. Variables with no valid data for any respondent, jobs 6-10, within-job gap 4 and between-job gaps 5-6 are not currently included in the main file.

    Endnotes

    [1] All week number references in this program are relative to 1/1/78 and end with the most recent interview date. A week #0 is included at the beginning of the week-by-week array structures to indicate time prior to 1/1/78. Users are discouraged from incorporating data contained in this week in analysis. Researchers should instead use information from the 1979 interview concerning labor force activity prior to 1/1/78 in order to construct event histories of a more thorough nature. (Some information concerning labor force activity for respondents prior to the time frame of the initial 1979 interview is asked on an age restricted basis for respondents still in their teens at the time of interview.)

    [2] See footnote 1.

    [3] All variables have standard missing value codes unless otherwise noted.

    [4] The final 2018 (round 28) interviews were conducted in November 2019. Therefore, valid data are only present through variables for week #2184 in the current data set. The maximum week number variable in the Dual Job [#] arrays is week #2181, as no one reported multiple jobs in weeks #2182-2184.

    [5] See footnote 4.