National Longitudinal Survey of Youth 1997 (NLSY97)

Errata for NLSY97 Round 19 Release

Newest errata

POSTED 01/16/2024

Updates to college codes (IPEDS)

Updates have been made to the IPEDS college codes (GEO69 and GEO70) for 117 cases across rounds 1 through 19; these variables are available in the Geocode release. In the majority of cases (100), a valid missing code was updated to reflect the reported state. The remaining cases reflect a small number of updates that were made to school or state codes. Users with an active agreement to use the NLSY97 Geocode data should contact User Services for details. Corrections for these codes will be reflected in the next restricted Geocode data release.

Other errata

  1. Duplicate job IDs in the NLSY97

    POSTED, updated 11/29/2023

    In response to user questions on this topic, NLS archivists have reviewed all occurrences in which the same job UID appears on the YEMP roster more than once in the same interview round. These cases fall into three distinct categories: 1) The jobs are the same, but the respondent reported working at them multiple times during the same interview period; 2) The jobs are in fact distinct and should be assigned separate unique IDs; 3) The jobs are the same, and the information collected about these jobs was duplicated. Updates have been included in the most recent version of the public data.

    Following the review of duplicate jobs, updates were made to these cases using the methods listed below:

    Category 1: These jobs include duplicate job IDs with spells that don’t overlap as well as those with spells that overlap in some weeks. For these cases, the YEMP_UID remains on the YEMP_ roster, and all interview and event history data remain intact. In the survey round in which the duplicated jobs appear, variables relating to tenure (CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR) were updated to reflect total tenure (including both spells) rather than the number of weeks by spell. In subsequent rounds, the job-specific tenure variables reflect the total number of weeks worked. Additional checks were made and, when necessary, updates were made to the CVC_TTL_JOB_YR_ALL and CVC_TTL_JOB_ADULT2/TEEN2 variables.

    Category 2: Six jobs were found to have an incorrect job ID and were updated to the correct job ID in the employer roster (YEMP_UID), and all variables associated with both jobs were updated (PUBID= 246, 3156, 4528, 4955, 5868, and 8371). This potentially included information in the EMP_STATUS and EMP_DUAL arrays along with the associated job-specific tenure measures (CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR) in that and subsequent rounds. When necessary, updates were made to the CVC_TTL_JOB_YR_ALL and CVC_TTL_JOB_ADULT2/TEEN2 variables.

    Category 3: Sixteen jobs that overlapped completely were deleted because the respondent appeared to report the exact same job twice (PUBID = 1374, 1586, 2742, 2947, 3132, 3449, 5641, 5644, 5649, 5888, 6325, 6348, 6851, 6859, 7217, and 9011). For these jobs, the extra job was deleted from the interview data (YEMP-) and the employer roster (YEMP_). Information about the repeated job in the EMP_DUAL job arrays and the EMP_HOUR array was updated to remove the job. All information for the extra job was deleted in the affected round (CV_HRLY_COMPENSATION, CV_HRLY_PAY, CV_HRS_PER_WEEK, CV_JOB_13_WKS, CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR) and updated in the combined variables (CVC_HOURS_WK_YR), along with successive rounds in which the job occurred (CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR). Additional checks were made and, when necessary, updates were made to the CVC_TTL_JOB_YR_ALL and CVC_TTL_JOB_ADULT2/TEEN2 variables.
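
Users who want to check an older extract for repeated job IDs could start from something like the sketch below. It is an illustration only, not the procedure NLS staff used. The row layout (pubid, round, job UID, start week, stop week) and the function names are hypothetical. Counting overlapping weeks once when summing tenure is an assumption, since the notice does not say how overlapping weeks were handled. Category 2 cases (distinct jobs sharing one ID) cannot be identified from dates alone and are not covered here.

```python
from collections import defaultdict

# Hypothetical roster rows: (pubid, round, job_uid, start_week, stop_week).
# Week numbers are inclusive. This layout is an assumption for illustration.
ROSTER = [
    (1, 10, 9701, 100, 120),  # same job, two different spells (like Category 1)
    (1, 10, 9701, 115, 140),
    (2, 10, 9702, 200, 230),  # same job reported twice with identical dates (like Category 3)
    (2, 10, 9702, 200, 230),
    (3, 10, 9703, 300, 310),  # no duplication
]


def find_duplicate_uids(rows):
    """Group spells by (pubid, round, job_uid); keep groups with more than one spell."""
    groups = defaultdict(list)
    for pubid, rnd, uid, start, stop in rows:
        groups[(pubid, rnd, uid)].append((start, stop))
    return {key: spells for key, spells in groups.items() if len(spells) > 1}


def classify(spells):
    """Return 'exact_duplicate' when every spell is identical, else 'multiple_spells'."""
    return "exact_duplicate" if len(set(spells)) == 1 else "multiple_spells"


def total_weeks(spells):
    """Count distinct weeks across all spells, so overlapping weeks count once (assumption)."""
    weeks = set()
    for start, stop in spells:
        weeks.update(range(start, stop + 1))
    return len(weeks)


DUPLICATES = find_duplicate_uids(ROSTER)
SUMMARY = {
    key: (classify(spells), total_weeks(spells))
    for key, spells in DUPLICATES.items()
}
```

In this toy input, respondent 1 is flagged as having multiple spells of one job and respondent 2 as an exact duplicate; respondent 3 is not flagged at all.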

    As part of this process, staff also conducted a more comprehensive tracing of job ID values and their link to the actual job name as reported over time. This review identified a total of 75 cases in which some form of data correction was necessary. These updates are included in the most recent version of the public data.

    The types of data correction are as follows. First, in a total of 30 cases, a technical error caused jobs that were continuing from previous rounds to be inadvertently assigned a new job ID. Second, in 25 cases, the respondent failed to correctly report a job as a previously held job. In both of these situations, the correct job ID was assigned, and the corresponding data and created variables were updated in the same manner as described for Category 2 above.

    Third, in two cases, a false job was reported that needed to be deleted. Fourth, in 18 cases, a duplication of job data was detected within the same survey year. The duplicate job roster data and employment section data for these cases were removed. The process of updating the data for deleted cases is the same as described above for Category 3.

    These updates have been included in the most recent version of the public data.

  2. Corrections to NLSY97 schooling roster and event history data

    POSTED 11/27/2023

    NLSY97 archival staff have identified a problem that affects colleges that appear on the NEWSCHOOL roster. The NEWSCHOOL_INTERVIEW.xx and NEWSCHOOL_PUBID.xx variables are intended, respectively, to identify the round in which a particular school is first reported and then to assign a permanent public ID to the school within each respondent's enrollment history. This allows public data users to determine whether a respondent is enrolled at a school that was first reported in an earlier round, or whether a respondent has enrolled in a new school. For 320 respondents, we discovered that some colleges which appear to be newly reported colleges are in fact schools that were originally reported in an earlier round. These updates have been included in the most recent version of the public data.

    In order to correct this problem, the original interview round and public id have been reassigned to these schools wherever they appear in later rounds. These corrections also affect the created event history array based on the college attended SCH_COLLEGE_ID_year.xx and the associated term SCH_COLLEGE_TERM_year.xx and degree SCH_COLLEGE_DEGREE_year.xx. Corrections have been made in all rounds of data.
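
The idea behind this correction can be sketched as follows. This is a rough illustration under stated assumptions, not the archival staff's actual program. The `school_key` field is hypothetical: it stands in for whatever internal information links reports of the same college and is not a public NLSY97 variable. The tuple layout and function names are likewise invented for the example.

```python
# Hypothetical enrollment reports for one respondent:
# (round, school_key, first_round, school_pubid).
# school_key is an assumed internal identifier, not a public variable.
REPORTS = [
    (5, "college_a", 5, 1),
    (6, "college_b", 6, 2),
    (8, "college_a", 8, 3),  # wrongly treated as newly reported in round 8
]


def reassign_original_ids(reports):
    """Give every report of a school the round and public ID from its earliest report."""
    original = {}
    corrected = []
    for rnd, key, first_round, pubid in sorted(reports):
        if key not in original:
            original[key] = (first_round, pubid)
        corrected.append((rnd, key) + original[key])
    return corrected


def is_new_school(current_round, first_round):
    """A school counts as new in a round when it was first reported in that round."""
    return current_round == first_round


CORRECTED = reassign_original_ids(REPORTS)
```

After the reassignment, the round 8 report of the first college carries round 5 and public ID 1 again, so a check like `is_new_school(8, 5)` correctly treats it as a previously reported school.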

    This review also identified a small number of cases in which the SCH_COLLEGE_TERM array was repeated and needed to be corrected (PUBID= 584, 1182, 1200, 2552, 3365, 3497, 5677, 6179, 6341, 6589, 6773, 7281, 7537, 8096, and 8917). Additionally, updated information was provided for cases in which the respondent reported an ‘other specify’ answer for the degree pursued (PUBID=61, 1081, 2779, 5710, 6239, 6674, 7105, 7460, and 8684); these answers were later coded into actual degrees. For these cases, the SCH_COLLEGE_STATUS array was updated and the SCH_COLLEGE_TERM, SCH_COLLEGE_ID, SCH_COLLEGE_DEGREE arrays were added to the data.

    These updates have been included in the most recent version of the public data.


  3. Addition of the COVID_ITEM_MODE variable

    POSTED 10/23/2023

    The COVID_ITEM_MODE variable has been added to the most recent version of the public data. This variable identifies the mode of interview based on the individual check items administered mechanically during the survey.

  4. Corrections to the created incarceration variables and event histories

    POSTED 10/23/2023

    Updates have been made for 15 respondents in the incarceration arrays. These updates apply to respondents with four or more arrests who completed an incarceration prior to the interview date and had a subsequent incarceration in the same round. The later incarceration was not picked up by the program that creates these variables. Updates were made to the following variables: INCARC_AGE_FIRST, INCARC_FIRST, INCARC_LENGTH_FIRST, INCARC_LENGTH_LONGEST, INCARC_TOTMONTHS, INCARC_TOTNUM, and various INCARC_STATUS variables between 2003.05 and 2019.09. The affected respondents are the following: 3127, 4044, 4937, 5066, 5092, 5286, 5404, 5760, 5893, 7375, 7483, 8571, 8607, 8761, and 8843. The updated variables are available in the most recent version of the public data.

  5. NLSY97 CV_MSA coding error in round 18 and round 19

    POSTED 5/22/2023

    Respondents who were not in the country were miscoded in the CV_MSA variable. These respondents were assigned a valid skip (-4) rather than the established category for not in the country (code=5). All of the respondents with a valid skip for these rounds should be recoded to the value of 5. These updates have been included in the most recent version of the public data.
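
For users still working from an extract that predates this correction, the recode described above might look like the following minimal sketch. It assumes the round 18 or round 19 CV_MSA values have already been read into a plain Python list; the function name is hypothetical, and only the two codes named in this notice are used.

```python
VALID_SKIP = -4      # code wrongly assigned in rounds 18 and 19
NOT_IN_COUNTRY = 5   # established CV_MSA category for respondents not in the country


def recode_cv_msa(values):
    """Replace valid skips with the 'not in country' code; leave all other codes as they are."""
    return [NOT_IN_COUNTRY if value == VALID_SKIP else value for value in values]


# Made-up example values for a handful of respondents.
round_18_values = [1, -4, 3, -5, -4]
recoded = recode_cv_msa(round_18_values)
```

The recode should be applied only to the round 18 and round 19 versions of CV_MSA, since the notice describes the error for those rounds alone.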

  6. Income and assets review

    POSTED 11/4/2022

    The NLS program has conducted a review of income and assets topcoding and standardized these values. Adjustments have been made to a very limited number of values to reflect the current standards. These adjustments are included in the most recent version of the public data.

  7. Missing 1998 machine check variables

    POSTED 10/25/2022

    A set of machine check freelance job variables from 1998 was inadvertently omitted from a previous public release. The variables are included in the most recent version of the public data.

  8. Errata for custom weighting program

    POSTED 3/25/2022

    An error in the NLSY97 custom weighting program was discovered and corrected on March 10, 2022. The program includes an option to reclassify all members of the sample as cross-sectional. A program error caused this option to be applied unintentionally, so that all user-entered oversample IDs were redefined as cross-sectional. As a result, the program generated weights without accounting for cross-sectional versus oversample status. The created variables on the public release are unaffected, as are yearly weights generated from the Weight Years page in the Custom Weighting program.

  9. Corrections in some employer ID, created, and event history variables

    POSTED 2/25/2022

    A review of employer unique id variables indicated that for three respondents, there were duplicated ids representing unique jobs in various rounds of data. This problem affected the various YEMP_UID and related XWALKID variables across a number of rounds. Additional updates were made to the associated created and event history variables: EMP_STATUS_year.xx, EMP_DUAL2/3_year.xx, EMP_BK_WKS_year, EMP_BY_STATUS_year, EMP_BK_HOURS_year, EMP_START_YEAR_year, and CV_WKSWK_JOB_DLI. The corrected values for the affected cases are included in the most recent version of the public data.

  10. Missing Employment Variable

    POSTED 1/26/2022

    As part of the 2021 data release, a large set of variables that were used to determine job type was released to the public for the first time. Unfortunately, one of these variables, the 2005 version of YEMP-9899WDZ, which collects information about on-call status, was inadvertently left off that release. This variable is included in the most recent version of the public data.

  11. Updates to college event history variables

    POSTED 12/29/2021

    Due to a programming error, the imputation for cases where a respondent reported a valid start date but invalid stop date for an enrollment period was not implemented. This only affected data from Round 18 and Round 19 reports. A total of 54 cases were affected. The corrections are included in the most recent version of the public data.

  12. Missing Assets35 variables

    POSTED 12/10/2021

    Seven variables that were part of the combined NLSY97 Assets 35 section were inadvertently omitted from a previous public release. The complete Investigator tagset for these variables is included in the most recent version of the public data.

  13. Updates to marital status and associated event history variables

    POSTED 11/30/2021

    The round-specific created variables (CV_MARSTAT and CV_MARSTAT_COLLAPSED) and associated event history variables were updated in the Round 19 release to reflect respondent corrections to marital and cohabitation status histories. The variables also are updated in the most recent version of the public data.

    In addition, an examination of partners accumulated across multiple rounds indicated that a small number of partners who at first appeared to be unique individuals were in fact the same person. Updates made to the Round 19 release include those to the PARTNERS roster along with the event history and created variables to account for these mis-reports.

  14. Mismatched industry and occupation codes in single 2008 case

    POSTED 11/23/2021

    We discovered that for a single case in the 2008 data, the industry and occupation codes were not matched to the correct job number for some of this respondent's reported jobs. The corrected industry and occupation codes for these jobs are updated in the most recent version of the public data.