Appendix 2: Employment Variable Creation

This appendix presents SAS programs for the creation of employment variables. Unless otherwise noted, the programs presented are for round 20. Apart from minor hand edits that account for inconsistencies in the raw data, the variable creation code remains largely unchanged from round to round. Users who need code for a specific round not included here may contact NLS User Services.

Notes about NLSY97 employment data

Collection of Employment Data. The employment sections of the NLSY97 questionnaire are somewhat complex. Before beginning analysis, researchers must understand the structure of each round's questionnaire, particularly the way in which jobs are classified as employee, freelance, or self-employment. This classification depends in part on the survey round and the respondent's age. In rounds 1 and 2, employee jobs were recorded in the first part of the YEMP section, which was administered only to respondents age 14 or older as of the interview date. The second part of the YEMP section collected information about freelance jobs of respondents age 14 and older and about all jobs of respondents age 12 or 13 (the implicit assumption being that respondents younger than 14 are not likely to hold employee jobs). If the respondent was at least 16 years old and made at least $200 per week in a freelance job, the job was classified as self-employment, and an extra series of questions was asked during the freelance section.

In round 3, all respondents were at least age 14 by the interview date, so the age restriction for employee jobs was no longer necessary. The structure of the section remained largely the same, with a division between employee-type and freelance jobs. Self-employment was classified in the same way as in the earlier rounds.
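The classification rule described for rounds 1-3 can be summarized in a short sketch. It is written in Python purely for illustration and is not the SAS code used to create the variables; the function name, the argument names, and the string labels are all hypothetical.

```python
def classify_job_rounds_1_to_3(age_at_interview, section_part, weekly_earnings=0.0):
    """Classify a reported job as described for rounds 1-3 (illustrative sketch).

    section_part is a hypothetical label: "employee" for the first part of the
    YEMP section, "freelance" for the second part.
    """
    if section_part == "employee":
        # The employee part was administered only to respondents age 14 or older.
        if age_at_interview < 14:
            raise ValueError("employee jobs were not collected for respondents under 14")
        return "employee"
    if section_part == "freelance":
        # A freelance job counted as self-employment only if the respondent was
        # at least 16 and made at least $200 per week at it.
        if age_at_interview >= 16 and weekly_earnings >= 200:
            return "self-employment"
        # Everything else in the second part, including all jobs of
        # 12- and 13-year-olds, stays in the freelance category.
        return "freelance"
    raise ValueError(f"unknown section part: {section_part!r}")
```

Under these assumptions, a 16-year-old making $200 per week at a job reported in the second part is classified as self-employed, while a 13-year-old with the same earnings is not.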

In round 4, the section was redesigned. Respondents born in 1980-82 (who were mostly age 18 and older when the round 4 field period began) were asked about employee-type jobs and self-employment at the same time. In addition, the minimum income requirement from the freelance section no longer applied; jobs could be classified as self-employment regardless of earnings. However, respondents born in 1983-84 (who were mostly age 16 or 17 when the round 4 field period began) continued to describe employee and freelance jobs separately. Data on self-employment jobs were still collected in the freelance section, and freelance jobs still had to meet the income criteria to qualify as self-employment. The same pattern was used in round 5.

The redesign of the employment section has important implications for created employment variables. In rounds 1-3, all of the created employment variables were based only on employee-type jobs. So, for example, the variable "Weeks Worked during Calendar Year 1999" counted only the weeks worked by a respondent at a regular employee-type job. If the respondent also reported self-employment in a lawn care business, the weeks spent working at that job were not counted in the created variable.

In round 4, when older respondents reported both employee-type and self-employed jobs in the same series of variables, this approach was reconsidered. For rounds 4 and 5, older respondents had three versions of most created variables. The first version, identified by the suffix "ET" in the question name, includes only employee-type jobs. The second version, the "SE" variables, includes only self-employed jobs reported by respondents born in 1980-83 in the regular employment section during round 5 (and similarly for respondents born in 1980-82 in round 4). These variables do not include freelance jobs or self-employment reported by younger respondents in the freelance jobs section in rounds 4 and 5, and they do not include freelance jobs or self-employment reported in rounds 1-3 by any respondent, regardless of age. Finally, the variables for all jobs include both employee-type jobs and self-employment reported during round 5 for respondents born in 1980-83 but include only employee-type jobs for respondents born in 1984. (Similarly, in round 4, these variables reported all jobs for respondents born in 1980-82 and only employee-type jobs for respondents born in 1983-84.) These last variables are identified with the suffix "ALL" in the question name.
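The three versions can be read as a lookup from suffix and birth cohort to the job types that feed the variable. The following Python sketch illustrates that reading; it is not the SAS code used by survey staff, and the names OLDER_COHORT and job_types_in_version are hypothetical. The text describes the "ET" version only for older respondents, so returning employee-type jobs for every cohort under "ET" is an assumption of this sketch.

```python
# Birth years asked about self-employment in the regular employment section,
# by round, as stated in the text (1980-82 in round 4, 1980-83 in round 5).
OLDER_COHORT = {4: range(1980, 1983), 5: range(1980, 1984)}


def job_types_in_version(suffix, survey_round, birth_year):
    """Return the job types that count toward a created variable version."""
    if survey_round not in OLDER_COHORT:
        raise ValueError("this sketch covers rounds 4 and 5 only")
    older = birth_year in OLDER_COHORT[survey_round]
    if suffix == "ET":
        # Assumption: employee-type jobs qualify regardless of cohort.
        return frozenset({"employee"})
    if suffix == "SE":
        # Self-employment reported by younger respondents in the freelance
        # section is excluded, so the younger cohort contributes nothing.
        return frozenset({"self-employment"}) if older else frozenset()
    if suffix == "ALL":
        if older:
            return frozenset({"employee", "self-employment"})
        return frozenset({"employee"})
    raise ValueError(f"unknown suffix: {suffix!r}")
```

For example, a respondent born in 1983 falls in the younger cohort in round 4 but in the older cohort in round 5, so the "ALL" version picks up that respondent's self-employment only from round 5 onward.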

Respondents' ages varied widely in round 4, when self-employed jobs were first recorded as part of the regular employment section. Also, some respondents reported employment over several years if they missed a round of interviewing. To simplify the creation of the employment variables, survey staff included self-employment job information only from January 1 of the year the respondent turned 18 onward. For example, consider a respondent who was 20 years old on his round 4 interview date in April 2001 and had not been interviewed since round 1. He reports self-employment in a computer repair business beginning on his 17th birthday in March 1998 and continuous employment at a fast-food restaurant since his round 1 interview in 1997. The round 4 created employment variables would include information about the employee-type fast-food job dating all the way back to 1997. However, the computer repair business would not be considered until January 1, 1999 (the first day of the year he turned 18). In other words, the variable "Weeks Worked in Calendar Year 1998" would count only the fast-food job, and the variable "Weeks Worked Any Job in Calendar Year 1999" would count both the fast-food job and the repair business. Similarly, the new variable "Weeks R Was Self-Employed Year 1998" would have a value of -4, or valid skip (because the respondent was not yet 18), but the variable "Weeks R Was Self-Employed Year 1999" would report the weeks the respondent worked at the computer repair business in 1999. This approach, continued in round 5, permitted users to compare the employment variables across respondents in different rounds, with confidence that the job types included were the same for all respondents of a given age.
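The age-18 rule in this example can be sketched as follows. This is a Python illustration rather than the SAS code used to create the variables; the function names are hypothetical, the birth year 1981 is inferred from the example (17th birthday in March 1998), and the day of the month used for the job start is invented because the text gives only the month.

```python
from datetime import date

VALID_SKIP = -4  # value described in the text for years before the cutoff


def self_employment_cutoff(birth_year):
    """First day on which self-employment is considered: January 1 of the
    year the respondent turned 18."""
    return date(birth_year + 18, 1, 1)


def effective_self_employment_start(job_start, birth_year):
    """Clip a reported self-employment start date to the cutoff."""
    return max(job_start, self_employment_cutoff(birth_year))


def weeks_self_employed(year, birth_year, weeks_reported):
    """Weeks self-employed in a calendar year, or a valid skip if the year
    falls before the respondent turned 18."""
    if year < birth_year + 18:
        return VALID_SKIP
    return weeks_reported


# The respondent in the example: born in 1981, repair business from March 1998.
start = effective_self_employment_start(date(1998, 3, 15), 1981)
```

Here start evaluates to January 1, 1999, and weeks_self_employed(1998, 1981, ...) returns -4 whatever number of weeks was reported for 1998.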

Researchers using the employment data may want to review the information about the employer roster structure and data collection in Appendix 8 for a deeper understanding of how roster loops work and how employers are linked across survey rounds.

"Backreporters." Occasionally respondents report a job in the current interview that started before the date of their last interview and should have been reported at that time. Appendix 6 contains a more complete description of the implications of these reports for the created employment event history variables. Backreports also affect a number of the created employment variables detailed in this appendix. Nearly all of these variables use the information provided about employment before the date of last interview. The only exception is the set of CV_WKSWK_DLI variables, which report the weeks worked since the previous interview date. The current round's CV_WKSWK_DLI variables would not include the backreported information in any case, because it predates the date of last interview, and the previous round's variables are not re-created to incorporate the new information.

For example, assume that Jane was interviewed in round 3 on April 15, 2000, and in round 4 on April 15, 2001. In the round 4 interview, she reports for the first time a job that started on April 1, 2000. The 2 weeks worked at that job before April 15, 2000, would not be reflected in any round's CV_WKSWK_DLI variables. However, those weeks would be counted in other variables. For example, CV_WKSWK_YR.00 would count all the weeks worked at any job in 2000, regardless of whether those weeks were reported in the round 3 or round 4 interview.
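The arithmetic in Jane's case can be checked with a short sketch. It is a Python illustration, not the SAS code behind the created variables; the function names are hypothetical, counting whole 7-day weeks is a simplification, and treating both interview dates as inside the "since last interview" window is an assumption.

```python
from datetime import date


def weeks_before_last_interview(job_start, last_interview):
    """Whole weeks of a backreported job that fall before the last interview."""
    days = (last_interview - job_start).days
    return max(days, 0) // 7


def in_dli_window(day, last_interview, current_interview):
    """True if a day falls between the last and current interview dates."""
    return last_interview <= day <= current_interview


round3_interview = date(2000, 4, 15)
round4_interview = date(2001, 4, 15)
job_start = date(2000, 4, 1)

# Weeks that no round's CV_WKSWK_DLI variable picks up.
missed_weeks = weeks_before_last_interview(job_start, round3_interview)

# The job's first day lies outside the round 4 window but inside calendar
# year 2000, so a calendar-year variable still counts it.
first_day_in_round4_window = in_dli_window(job_start, round3_interview, round4_interview)
first_day_in_2000 = job_start.year == 2000
```

With these dates, missed_weeks is 2, first_day_in_round4_window is False, and first_day_in_2000 is True, matching the description above.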