Appendix 6: Event History Creation and Documentation

National Longitudinal Survey of Youth - 1997 Cohort

The NLSY97 survey records significant life-course transitions experienced by young people, such as education, employment, program participation, and marital history, in a longitudinal format. The event history arrays document these events in a chronological format that records the significant transitions in a meaningful manner while maintaining data quality. Using these arrays, researchers can extract the status of a respondent at a point in time or over time. Event history arrays are generated for four distinct areas: employment, marital/cohabitation status, program participation, and schooling. This section presents information on each type of event history array; for details on the chronological format of the arrays and the naming conventions used to identify the variables, users should refer to Appendix 7.

Employment Event History Arrays

Three employment arrays provide information on the respondent's civilian employment on a weekly basis. These arrays include information about employee jobs and self-employment; jobs reported in the freelance section are not included in the arrays. Please see the NLSY97 User's Guide for a more complete description of these job types. All employment arrays provide information starting in the month when the respondent turned 14 and ending in the week that he or she was last interviewed.

1. EMP_STATUS
This main array presents the civilian employment status of a respondent in a particular week. The codes and their explanations follow:

Status=0: No information reported to account for week
Definition: Week cannot be assigned due to missing job start and stop dates.

Status=1: Not associated with an employer, not actively searching for an employee job
Definition: Refers to weeks during a between-jobs gap in which the respondent is not actively searching and reports working at a freelance job. Since the actual weeks working at a freelance job cannot be determined, all weeks in which the respondent is not actively searching are coded in this manner. This status code is only used when respondents reported working at a freelance job in addition to a gap in a regular job. As a result, this code only exists through round 5, after which all respondents aged out of the freelance section.

Status=2: Not working (unemployment vs. out of labor force cannot be determined)
Definition: Assigned when the respondent is not asked follow-up questions about his or her search activity during a within-job gap or a between-jobs gap.

Status=3: Associated with an employer, periods not working for employer are missing
Definition: Used when a respondent reports an indeterminate start or stop date for a within-job gap.

Status=4: Unemployed
Definition: Indicates that the respondent reports actively searching for work during a within-job gap or a between-jobs gap. When the number of weeks unemployed does not account for the entire gap period, weeks unemployed are assumed to occur in the middle of that period.

Status=5: Out of the labor force
Definition: Assigned during a between-jobs gap or a within-job gap when the respondent is either not actively searching for work or on layoff from a job.

Status=6: Active military service
Definition: Indicates that the respondent is tied to the military.

Status=9701 to 201010: Employer on roster
Definition: Refers to the employer number on the employer roster (YEMP_UID.xx). Presence of an employer number indicates that the respondent was working during a given week. Civilian work takes precedence over other activities, such as job search. Respondents who report working at an employer job for one day in a given week are listed as having worked at that job for the entire week, regardless of other activities.
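
As an illustration of how the codes above might be read in analysis code, the following minimal Python sketch maps a weekly EMP_STATUS value to a short description. The function name, the shortened labels, and the choice to raise an error for any value outside the listed codes are assumptions made for this example; they are not part of the NLSY97 data files.

```python
# Range of employer IDs that can appear in EMP_STATUS, per the code list above.
EMPLOYER_ID_MIN = 9701
EMPLOYER_ID_MAX = 201010

# Shortened labels for the nonworking codes (wording is illustrative).
STATUS_LABELS = {
    0: "no information",
    1: "not associated with an employer, not actively searching",
    2: "not working (unemployed vs. out of labor force unknown)",
    3: "associated with an employer, gap dates missing",
    4: "unemployed",
    5: "out of the labor force",
    6: "active military service",
}


def describe_emp_status(code: int) -> str:
    """Return a short description for one weekly EMP_STATUS value."""
    if code in STATUS_LABELS:
        return STATUS_LABELS[code]
    if EMPLOYER_ID_MIN <= code <= EMPLOYER_ID_MAX:
        # An employer ID means the respondent worked during that week.
        return "working for employer %d" % code
    # Assumption: anything else is treated as unexpected in this sketch.
    raise ValueError("unrecognized EMP_STATUS code: %r" % code)
```

For example, describe_emp_status(4) returns "unemployed", and describe_emp_status(9802) returns "working for employer 9802".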

2. EMP_DUAL_JOB#
If a respondent holds more than one civilian employee job during a week, the second employee job is presented in a dual job array. These arrays contain only the job number of the overlapping job; labor force status information is only included in the main array. For example, if a respondent held two civilian employee jobs (e.g., the first and third jobs listed on the employer roster) in one week, the employer number for the first job would be recorded in the EMP_STATUS array and the employer number for the third job would be recorded in the EMP_DUAL_2 array. If a respondent held three jobs (e.g., jobs #01, #04, and #05 on the roster) in one week, the first job would be recorded in the EMP_STATUS array, the employer ID for job #04 would be recorded in the EMP_DUAL_2 array, and the employer ID for job #05 would be recorded in the EMP_DUAL_3 array. Unlike the NLSY79 work history arrays, jobs are recorded in the status and dual jobs arrays based upon the order presented in the employer rosters, which is sorted by the ending date with the current or most recent job listed first.
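
The placement rule described above can be sketched in Python as follows. The function name and the employer IDs in the usage note are hypothetical, and the sketch assumes the caller has already put the week's employer IDs in roster order.

```python
def assign_job_arrays(roster_ordered_ids):
    """Place one week's overlapping jobs into EMP_STATUS and EMP_DUAL_n.

    roster_ordered_ids: employer IDs held in the week, already sorted in
    employer-roster order (an assumption of this sketch).
    """
    arrays = {}
    for position, employer_id in enumerate(roster_ordered_ids, start=1):
        # The first job goes to the main array; later jobs go to EMP_DUAL_2,
        # EMP_DUAL_3, and so on.
        name = "EMP_STATUS" if position == 1 else f"EMP_DUAL_{position}"
        arrays[name] = employer_id
    return arrays
```

With three overlapping jobs carrying the hypothetical IDs 9701, 9704, and 9705, the sketch returns 9701 for EMP_STATUS, 9704 for EMP_DUAL_2, and 9705 for EMP_DUAL_3, matching the three-job example above.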

3. EMP_HOURS
This final array calculates the total number of hours worked by a respondent at any civilian employee job during each week. Hours per week worked at each job are assumed constant except during a reported gap, when the hours for that job are assumed to be zero. Each week is assigned a code of '-3 (invalid skip)' when any of the jobs has an indeterminate gap date.
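
A minimal Python sketch of the hours rule for a single week is shown below. The JobWeek class and its field names are hypothetical, and the sketch assumes that the -3 code applies whenever any job considered for that week has an indeterminate gap date; the actual array construction may scope that check differently.

```python
from dataclasses import dataclass

INVALID_SKIP = -3  # code assigned when a gap date is indeterminate


@dataclass
class JobWeek:
    """One job's contribution to a single week (hypothetical structure)."""
    hours_per_week: int
    in_gap: bool = False             # week falls inside a reported gap
    indeterminate_gap: bool = False  # job has a gap with an unknown date


def weekly_hours(jobs):
    """Total hours across civilian employee jobs for one week."""
    if any(job.indeterminate_gap for job in jobs):
        return INVALID_SKIP
    # Hours are constant for each job, except zero during a reported gap.
    return sum(0 if job.in_gap else job.hours_per_week for job in jobs)
```

For instance, a week with a 10-hour job and a 20-hour job yields 30; if the 20-hour job is in a reported gap that week, the total is 10.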

Other Information
Continuous week crosswalk. A secondary set of variables translates the reported beginning and ending dates (day, month, and year) of employee jobs and the gaps within those jobs to the week and year naming scheme (e.g., EMP_GAP_START_YEAR.01.01 and EMP_GAP_END_YEAR.01.01 provide the start and end dates of the respondent's first gap at the first job in the continuous week and year format). More information about the week and year naming scheme is provided in Appendix 7 in this document.

Linking to survey data using unique ID codes. The created event history variables can be used in conjunction with the main file information about the respondent's employment. In the main data, unique employer ID numbers are listed under the question name YEMP_UID.xx (e.g., R24761.); these codes are used in the weekly employment status variables. Using these unique ID codes, researchers can identify the comparable job information (e.g., complete start and stop dates, fringe benefits, job satisfaction, industry and occupation, etc.) from the main file. The unique ID codes are assigned based on the survey round in which the employer was first reported.
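
The linkage described above amounts to a lookup keyed on the unique employer ID. The Python sketch below assumes the main-file job information has already been loaded into a dictionary keyed by the YEMP_UID value; the function name, the dictionary layout, and the field names in the usage note are hypothetical.

```python
# Range of employer IDs used in the weekly status variables.
EMPLOYER_ID_MIN = 9701
EMPLOYER_ID_MAX = 201010


def link_weekly_jobs(weekly_status, job_records):
    """Look up main-file job information for each employer in the weekly array.

    weekly_status: list of weekly EMP_STATUS values for one respondent.
    job_records: dict mapping a unique employer ID to that job's main-file
        information (hypothetical layout).
    Returns a dict of employer ID -> record, in order of first appearance;
    the record is None when the ID is absent from job_records.
    """
    linked = {}
    for code in weekly_status:
        is_employer = EMPLOYER_ID_MIN <= code <= EMPLOYER_ID_MAX
        if is_employer and code not in linked:
            linked[code] = job_records.get(code)
    return linked
```

Given weekly values of 5, 9701, 9701, 4, and 9802 and records for employers 9701 and 9802, the sketch returns the two job records in the order in which the employers first appear.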

User Note

The collection of freelance and self-employment information changed in the round 4 interview, as described in the introduction to Appendix 2. A small number of round 4 self-employed jobs may have a unique ID of 199999. The assignment of unique ID codes is described in detail in Appendix 8.

Denial of previously reported employers. Respondents sometimes deny that they ever worked for an employer reported in a previous round. If this situation occurs, the data for that employer remain in the event history arrays, but a flag variable (EMP_DENY) indicates that the employer was denied in a subsequent round. For example, assume that a respondent reported working for employer number 9802 from January 1, 1998 through the round 2 interview date. In round 3, however, the respondent stated that he or she never worked for that employer. The weekly STATUS variables for January 1, 1998, through the round 2 interview date will continue to report the respondent's status as working for employer 9802, but the EMP_DENY variable will also have a value of 9802, indicating that the respondent denied working for that employer during the round 3 interview.
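
Researchers who want to locate the weeks attributed to a denied employer could combine the weekly status values with the EMP_DENY values, as in the following Python sketch. The function name is hypothetical, and how the flagged weeks are then treated (kept, recoded, or dropped) is an analytic choice that this documentation does not prescribe.

```python
def flag_denied_weeks(weekly_status, denied_ids):
    """Mark the weeks whose status is an employer later denied.

    weekly_status: list of weekly EMP_STATUS values for one respondent.
    denied_ids: employer IDs taken from the EMP_DENY flag variable(s).
    Returns a list of booleans, one per week.
    """
    denied = set(denied_ids)
    return [code in denied for code in weekly_status]
```

For weekly values of 5, 9802, 9802, and 4 with employer 9802 denied, the result is False, True, True, False.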

Missing and imputed values. Occasionally, respondents cannot provide information about the start and end dates of employment periods or gaps in employment. Because dates of employment are often used in subsequent questions in the jobs section, default values are substituted for these missing values so that the interview program can continue. The missing values are then reinserted in the public use data file so that researchers will know the true value. However, to follow the flow of an interview, users may need to understand what values were substituted so that the correct question path can be followed. Similarly, in the creation of the event history arrays, some missing values are imputed. Imputation rules and the effect of each on the event history arrays are as follows:

Type of missing data: Missing job start or stop day
Imputed value in interview and event history data: Start day = 1; Stop day = 28
Effect on event history variables: If there is a valid month and year, the imputed days are used in the creation of status variables as if they were valid data. For example, a respondent with an imputed start day of "1" for employer 9701 will be listed as working for employer 9701 for the first week (and each subsequent week) of the reported month in the STATUS array.

Type of missing data: Missing job start or stop month
Imputed value in interview and event history data: Start month = 1; Stop month = 12
Effect on event history variables: In the STATUS array, weeks in imputed months are assigned a status of "0"--no information. Each month from the beginning of the job to the next known date or from the last known date to the end of the job is assigned a 0. For example, assume a respondent reports starting a job in an unknown month of 1997 but then reports a within-job gap starting on 6/1/97. All weeks in the months of January-May will be assigned a value of 0.

Type of missing data: Missing job start or stop year
Imputed value in interview and event history data: Start year = year of last interview; Stop year = year of current interview
Effect on event history variables: In the status array, weeks in imputed years are assigned a status of 0. For example, in round 2 a respondent who reported a new job with an unknown start year would be assigned an imputed value equal to the year of the round 1 interview (usually 1997). In the STATUS variables, each week from the round 1 interview date to the first known employment date (or the current interview date) would have a value of 0.

Type of missing data: Missing gap start or stop day
Imputed value in interview and event history data: Start day = 1; Stop day = 28
Effect on event history variables: If there is a valid month and year, the imputed days are used in the creation of status variables as if they were valid data. For example, a respondent with an imputed start day of "1" for a within-job gap will be assigned a value of 2, 4, or 5--depending on information about layoff and job search--for the first week (and each subsequent week) of the reported month in the STATUS array.

Type of missing data: Missing gap start or stop month; missing gap start or stop year
Imputed value in interview and event history data: Start month = job start date (or the date of a previous known gap); Stop month = job stop month (or the date of a later known gap)
Effect on event history variables: Each week in the imputed period is assigned a value of 3 in the STATUS array, meaning associated with an employer but with missing gap information. For example, a respondent with a job start date of 4/12/97 and a gap with an unknown start month and year would have an imputed gap start date of 4/12/97. Each week from that date to the next known employment date would have a status of 3.

Respondents may have more than one job in a given week due to the imputation of dates as described above. If the month or year was imputed for one job, resulting in the assignment of zeros to a given set of weeks, but another job with known dates falls in some of those weeks, the zeros will be dropped and replaced by information about the known job. However, the respondent will not be listed as having a dual job in those weeks. The imputed employer will not be listed in any array if the zeros are dropped because another job provides valid information.
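
The imputation rules for missing job start and stop dates can be sketched in Python as follows. This is a simplified illustration, not the procedure used to build the arrays: the function name, arguments, and return shape are hypothetical, and gap dates (which depend on surrounding job and gap dates) are not covered.

```python
def impute_job_date(day, month, year, *, is_start,
                    last_interview_year, current_interview_year):
    """Fill missing parts of a job start or stop date.

    Missing parts are passed as None. Returns ((day, month, year), zero_status),
    where zero_status is True when the month or year was imputed, meaning the
    weeks in the imputed period would be coded 0 (no information).
    """
    zero_status = False
    if year is None:
        # Start year = year of last interview; stop year = year of current one.
        year = last_interview_year if is_start else current_interview_year
        zero_status = True
    if month is None:
        month = 1 if is_start else 12
        zero_status = True
    if day is None:
        # Imputed days are treated as valid data, so no zero status here.
        day = 1 if is_start else 28
    return (day, month, year), zero_status
```

A start date with only the day missing, such as (None, 6, 1997), becomes (1, 6, 1997) with no zero status, while a stop date with a missing month, such as (15, None, 1997), becomes (15, 12, 1997) and is marked for a status of 0.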

Backreporter variables. Some respondents report during the current interview a new job with a start date prior to the date of the last interview that was not reported during that interview. If these jobs had been reported at the previous interview, the weeks and hours worked would have been represented in the arrays at that time. When they are instead reported in the current interview, the event history arrays created at the previous interview date are not changed to include information about these new jobs. Three "backreporter" variables alert users to changes that would have resulted if the jobs had been correctly reported during the previous interview.

The first variable, EMP_BK_WKS, tells how many weeks before the previous interview date the job started. The second and third variables show how the status and hours arrays would have been affected had the job beginning before the date of last interview been reported at the prior interview and included in the original array construction. One variable, EMP_BK_STATUS, indicates the number of weeks from the job's start date to the date of last interview for which a nonworking status would have changed to an employer ID had the job been reported during the previous interview round. The other variable, EMP_BK_HOURS, informs users about the additional number of hours per week worked on this job for the weeks from the job's start date to the date of the previous interview.

For example, assume a respondent named Mary was interviewed on January 15, 1999 (round 3), and January 15, 2000 (round 4). In round 3, Mary reported no employers. In round 4, she reported working 30 hours a week on a job that began on January 1, 1999. Since the job began 2 weeks before the round 3 interview, EMP_BK_WKS would have a value of 2. EMP_BK_STATUS would also have a value of 2, indicating that 2 weeks in the round 3 arrays would have changed from nonworking to working status. EMP_BK_HOURS would have a value of 30, indicating that 30 additional hours would have been worked in each of those weeks.

Similarly, assume a respondent named John was interviewed on the same dates as Mary in rounds 3 and 4. In round 3, John reported a job that he had worked at for 10 hours per week since the round 2 interview. In round 4, he reported a second, 20 hours-per-week job that began on January 1, 1999, 2 weeks before his round 3 interview. Like Mary, John would have a value of 2 for the EMP_BK_WKS variable. However, the weeks between January 1 and January 15, 1999, would already indicate that John was working (at the original employer). Therefore, EMP_BK_STATUS would have a value of 0, because no weeks would have changed from nonworking to working status if John had reported the new job in round 3. EMP_BK_HOURS would have a value of 20, indicating the number of hours per week that John worked at the new job. In John's case, the hours worked array variables created in round 3 would have a value of 10, reflecting the job he reported in round 3. Researchers can add the value of EMP_BK_HOURS to the value in the original round 3 arrays for the 2 weeks before January 15, 1999, to determine that John worked 30 hours per week in those weeks.
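
The Mary and John examples can be reproduced with the short Python sketch below. It is an illustration only: the function name is hypothetical, weeks are counted as whole 7-day spans between the two dates rather than with the continuous week numbering described in Appendix 7, and the status values used for the two respondents (5 for Mary, employer ID 9801 for John) are assumed for the example.

```python
from datetime import date

EMPLOYER_ID_MIN = 9701  # smallest employer ID; lower codes are nonworking


def backreporter_values(job_start, last_interview, hours_per_week,
                        prior_statuses):
    """Compute illustrative backreporter values for one late-reported job.

    prior_statuses: weekly status values from the arrays created at the
        previous interview, ending with the week of that interview.
    """
    # Whole weeks between the job start and the previous interview.
    bk_wks = (last_interview - job_start).days // 7
    window = prior_statuses[-bk_wks:] if bk_wks > 0 else []
    # Weeks that would have changed from a nonworking status to an employer ID.
    bk_status = sum(1 for status in window if status < EMPLOYER_ID_MIN)
    return {
        "EMP_BK_WKS": bk_wks,
        "EMP_BK_STATUS": bk_status,
        "EMP_BK_HOURS": hours_per_week,
    }


# Mary: no employers in round 3, then a 30-hour job starting January 1, 1999.
mary = backreporter_values(date(1999, 1, 1), date(1999, 1, 15), 30, [5, 5, 5, 5])

# John: already working in round 3, then a 20-hour job starting January 1, 1999.
john = backreporter_values(date(1999, 1, 1), date(1999, 1, 15), 20,
                           [9801, 9801, 9801, 9801])
```

Under these assumptions, Mary's values are 2, 2, and 30 for EMP_BK_WKS, EMP_BK_STATUS, and EMP_BK_HOURS, and John's are 2, 0, and 20. Adding John's EMP_BK_HOURS of 20 to the 10 hours already in his round 3 hours array gives the 30 hours per week described above.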
