Errata for 1979-2014 Data Release

Newest Errata [posted 11/27/2018]

Erroneous MSA Codes (1984-1987)

Metropolitan Statistical Area (MSA) codes have been corrected for 16 cases in survey years 1984 through 1987. Users with an active agreement to use the NLSY79 Geocode data should contact User Services (usersvc@chrr.osu.edu or 614-442-7366) for details. Corrections for these codes will be reflected in the next restricted Geocode data release.

Previously posted Errata

Documentation Errors in 2014 "Source of Health Plan" Spouse Variables [posted 7/12/2018]

Minor title documentation errors were found in a small group of 2014 NLSY79 variables. The corrected "source of a health/hospital plan" Q11-84B variable titles can be seen in the table below. These corrections will be reflected in the next public data release.

Ref# QName Corrected Variable Title
T4898700 Q11-84B~000001 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - PRIVATE INSURANCE R'S CURRENT EMPLOYER
T4898701 Q11-84B~000002 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - PRIVATE INSURANCE R'S PREVIOUS EMPLOYER
T4898702 Q11-84B~000003 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - PRIVATE INSURANCE SPOUSE'S OR PARTNER'S CURRENT EMPLOYER
T4898703 Q11-84B~000004 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - PRIVATE INSURANCE SPOUSE'S OR PARTNER'S PREVIOUS EMPLOYER
T4898704 Q11-84B~000005 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - PRIVATE INSURANCE BOUGHT DIRECTLY FROM MEDICAL INSURANCE COMPANY
T4898705 Q11-84B~000006 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - MEDICAID/WELFARE
T4898706 Q11-84B~000007 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - OTHER
T4898707 Q11-84B~000008 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - MEDICARE
T4898708 Q11-84B~000009 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - VA/MILITARY HEALTHCARE
T4898709 Q11-84B~000010 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - AFFORDABLE CARE ACT
T4898710 Q11-84B~000011 SOURCE OF HEALTH/HOSPITALIZATION (SPOUSE) - UNION

Erroneous Government Training Question Names, 1986 [posted 6/19/2018]

Two question names for government training questions in 1986 are in error. The correct question names are listed below. These corrections will be reflected in the next public data release.

Reference # Current (incorrect) Question Name Correct Question Name Title
R19602.00 GOVT-19_2 GOVT-22_1 PLACEMENT IN A JOB AS PART OF 1ST GOVT PROGRAM TRAINING
R19603.00 GOVT-19A_2 GOVT-19_2 SERVICES PROVIDED, 2ND GOVT PROGRAM TRAINING SINCE LAST INT – CLASS TRAINING?

Child Support Variable Left Out of 2004 Data [posted 6/13/2018]

The variable Q13-33U, The Total Amount of Child Support Paid by R/SPAR in 2003, has been inadvertently omitted from the public release data.

Job ID Variable Missing for 1983 Survey Year [posted 5/29/2018; updated 6/13/2018]

It has come to our attention that the JOB_UID_EMPROSTER4 variable (NLSY79 Title = "JOB UNIQUE ID FROM EMPLOYER HISTORY ROSTER JOB #04") is missing for the 1983 survey year (Round 5). The missing data will be available in the next public data release.

Assets Variables Missing from 2004 and 2008 Data [Originally posted 2/12/2018; updated with additional variables 3/1/2018]

Some assets variables were inadvertently omitted from the NLSY79 2004 and 2008 data. They will be included in the next full release of the NLSY79 data.

Erroneous Between Job Gap Dates 2000-2012 [posted 1/9/2018]

A set of erroneous between job gap dates, computed between NLSY79 survey years 2000 and 2012, should be eliminated. These gap dates were computed for respondents who reported one of the following scenarios:

  • One job ending and another starting on the same day, or;
  • One job ending one day and another job starting the next day.

Under these circumstances, no between job gap dates should exist, as there was no actual gap in employment. The number of cases affected in each survey year is contained in the table below. Most of these gaps had no actual substantive information collected. However, dates were recorded for these gaps, giving the possible appearance of an actual gap being present.

NLSY79 Survey Year  Total  # cases for whom only one gap should be eliminated
2012  134  89
2010  112  66
2008  198  133
2006  241  156
2004  187  124
2002  223  140
2000  365  224

Over half the cases affected for each survey year involved respondents with only one gap which should be eliminated (see table above). The remainder of cases involved eliminating one or more gaps and shifting subsequent gaps up where necessary. For example, a respondent might report three between job gaps, with gaps 1 and 3 being legitimate and gap 2 being erroneous. Gap 2 would be eliminated, and data for gap 3 would be shifted up to gap 2.
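The elimination-and-shift rule described above can be sketched in Python. This is only an illustration, not the program used to correct the data: the record layout (`gap_number`, `prev_job_end`, `next_job_start`), both function names, and the dates are hypothetical.

```python
from datetime import date


def is_erroneous_gap(prev_job_end: date, next_job_start: date) -> bool:
    """True when the next job starts on the same day as, or the day after,
    the previous job ended, so there is no real gap in employment."""
    return 0 <= (next_job_start - prev_job_end).days <= 1


def compact_gaps(gaps):
    """Drop erroneous gaps and shift the remaining gaps up (renumber from 1)."""
    kept = [
        g for g in gaps
        if not is_erroneous_gap(g["prev_job_end"], g["next_job_start"])
    ]
    return [{**g, "gap_number": i} for i, g in enumerate(kept, start=1)]


# Hypothetical respondent with three recorded gaps; gap 2 is erroneous
# because the next job started the day after the previous one ended.
gaps = [
    {"gap_number": 1, "prev_job_end": date(2009, 3, 1), "next_job_start": date(2009, 5, 1)},
    {"gap_number": 2, "prev_job_end": date(2009, 9, 30), "next_job_start": date(2009, 10, 1)},
    {"gap_number": 3, "prev_job_end": date(2010, 2, 15), "next_job_start": date(2010, 6, 1)},
]
cleaned = compact_gaps(gaps)
```

In this made-up case, `cleaned` holds two gaps, and the data originally stored as gap 3 now carries `gap_number` 2.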

The erroneous gap dates did not affect the construction of the work history arrays (WEEKLY LABOR STATUS, HOURS WORKED and DUAL JOB 1-4), nor did it affect the EMPLOYER HISTORY roster, as the roster does not contain information on between job gaps.

Corrections for erroneous gaps with the scenarios outlined above will be present in the next public release.

Cross-sectional Sampling Weights Variable Issue (affects all rounds) [posted 11/21/2017]

Due to a mistake in the program, 401 NLSY79 respondents were excluded from the cross-sectional weights variable (C_SAMPWEIGHT), which contains weights for only the interviewed respondents from the cross-sectional sample. [Note: this issue does NOT affect the SAMPWEIGHT variable, which is for the full sample]. The mistakenly excluded respondents are from the 'poor white cross-sectional sample' in the sampling type variable (SAMPLE_ID=2 and SAMPLE_ID=6). This is separate from the 'poor white oversample,' which was dropped after the 1990 interview. Unfortunately, this mistake affects the cross-sectional weight in all rounds. In the short term, the values for the cross-sectional weights variable have been set to '0' for all rounds in the public use data. Once the variables are created using the full cross-sectional sample, we will post them to the errata and include them in the data.

Researchers may also use the custom weighting program to create weights for the cross-sectional sample if desired. To do this, researchers should submit the cross-sectional sample IDs for the round of interest. Here are the steps to do this:

  1. To pull the appropriate CASEIDs in a particular round, researchers start by limiting the data using RNI=-4 for that round. For example, the 2010 RNI (RNUM=T3107600) indicates that 7,565 respondents were interviewed in that round.
  2. Researchers can then isolate the NLSY79 cross-sectional sample by limiting the data using SAMPLE_ID=1 through SAMPLE_ID=8 (deleting SAMPLE_ID>8). Using the example above, that leaves 4,602 cross-sectional sample respondents who were interviewed in 2010.
  3. Once the IDs are selected, submit them using the custom weighting program for the NLSY79.
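Steps 1 and 2 can be sketched in Python as a simple filter. The sketch assumes the extract has already been loaded as a list of dictionaries with keys `CASEID`, `RNI` and `SAMPLE_ID`. Those key names, the function name, and the sample rows are assumptions made for illustration; in a real extract the RNI column for a given round would be identified by its reference number.

```python
def cross_sectional_ids(records, rni_key="RNI", sample_key="SAMPLE_ID", id_key="CASEID"):
    """Return CASEIDs of interviewed respondents in the cross-sectional sample.

    Keeps rows with RNI == -4 for the round of interest (step 1) and
    SAMPLE_ID between 1 and 8 inclusive (step 2).
    """
    return [
        r[id_key]
        for r in records
        if r[rni_key] == -4 and 1 <= r[sample_key] <= 8
    ]


# Made-up rows, for illustration only.
records = [
    {"CASEID": 1, "RNI": -4, "SAMPLE_ID": 5},   # interviewed, cross-sectional: kept
    {"CASEID": 2, "RNI": 65, "SAMPLE_ID": 2},   # RNI is not -4: dropped
    {"CASEID": 3, "RNI": -4, "SAMPLE_ID": 11},  # SAMPLE_ID > 8: dropped
    {"CASEID": 4, "RNI": -4, "SAMPLE_ID": 2},   # interviewed, cross-sectional: kept
]
ids_to_submit = cross_sectional_ids(records)
```

The resulting list is what would be submitted to the custom weighting program in step 3.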

Erroneous Occupation Codes (2002 and 2004) [posted 9/1/2017]

A small number of incorrect occupation codes from survey years 2002 and 2004 have been revised. Several cases for two variables are affected. The cases and relevant variables with corrected values are listed below. These corrections will be reflected in the next public data release.

RNUM  Question Name  Variable Title
R00001.00 CASEID IDENTIFICATION CODE
R72096.00 OCCALL-EMP.01  OCCUPATION (CENSUS 3 DIGIT 00 CODES (ALL), JOB #01 2002
R78981.00 OCCALL-EMP.02  OCCUPATION (CENSUS 3 DIGIT 00 CODES (ALL), JOB #02 2004

R00001.00 SURVEY YEAR REFERENCE # INCORRECT VALUE CORRECTED VALUE
1306  2002  R72096.00  26  56
4564  2002  R72096.00  26  56
5585  2002  R72096.00  26  385
6170  2002  R72096.00  426  625
6785  2002  R72096.00  426  625
7576  2002  R72096.00  26  56
9112  2002  R72096.00  26  874
1306  2004  R78981.00  260  560
2027  2004  R78981.00  30  3920
2469  2004  R78981.00  30  3800
3933  2004  R78981.00  260  3800
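Until the corrected release is available, users who want to patch their own extract could apply the table above with a short script. The following Python sketch is one possible approach. The correction tuples are copied from the table, while the in-memory layout (a dictionary keyed by CASEID) and the function name are assumptions.

```python
# (CASEID, reference number, incorrect value, corrected value), from the table above.
CORRECTIONS = [
    (1306, "R72096.00", 26, 56),
    (4564, "R72096.00", 26, 56),
    (5585, "R72096.00", 26, 385),
    (6170, "R72096.00", 426, 625),
    (6785, "R72096.00", 426, 625),
    (7576, "R72096.00", 26, 56),
    (9112, "R72096.00", 26, 874),
    (1306, "R78981.00", 260, 560),
    (2027, "R78981.00", 30, 3920),
    (2469, "R78981.00", 30, 3800),
    (3933, "R78981.00", 260, 3800),
]


def apply_corrections(data, corrections=CORRECTIONS):
    """Patch `data` in place and return the number of values changed.

    `data` maps CASEID -> {reference number: value}. A value is replaced
    only if it still equals the listed incorrect value, so running this
    on an already corrected release changes nothing.
    """
    changed = 0
    for caseid, ref, wrong, right in corrections:
        row = data.get(caseid)
        if row is not None and row.get(ref) == wrong:
            row[ref] = right
            changed += 1
    return changed
```

Checking the stored value against the listed incorrect value before overwriting it is a deliberate safeguard: it keeps the patch from touching any case not covered by this erratum.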

Missing 'Class of Worker' Data (2008 and 2010) [posted 5/16/2017]

Class of Worker data have been found to be missing in the NLSY79 2008 and 2010 data for a small number of cases. With very few exceptions, the missing data consist of other reserved missing codes (-1/refused, -2/don’t know, -3/invalid missing). The missing codes will be included in the next public data release.

2014 Spouse/Partner History Variables [posted 5/11/2017]

The 2014 Spouse/Partner History variables, NUMSPPTR14 (R99115.00) and RELSPPTR14 (R99116.00), contain values for all NLSY79 respondents rather than only those respondents interviewed in 2014. For those respondents interviewed in 2014, the values on NUMSPPTR14 and RELSPPTR14 incorporate the information provided from their interviews. For those respondents not interviewed in 2014, the values on NUMSPPTR14 and RELSPPTR14 reflect the information from the last time they were interviewed and not 2014 up-to-date information. In future data releases, values on NUMSPPTR14 and RELSPPTR14 will be restricted to only those respondents interviewed in 2014. NLSY79 respondents who have a value of zero on T50221.00 (SAMPWEIGHT), the sampling weight for the 2014 interview, should be treated as having a missing value on these variables.
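The recommended treatment can be sketched in Python as follows. The reference numbers come from the paragraph above. The row layout (a dictionary keyed by reference number), the function name, and the use of `None` to mark a missing value are assumptions for illustration; users may prefer one of the reserved missing codes instead.

```python
def mask_not_interviewed_2014(row):
    """Return a copy of `row` with NUMSPPTR14 (R99115.00) and RELSPPTR14
    (R99116.00) set to missing when the 2014 sampling weight (T50221.00) is zero."""
    cleaned = dict(row)
    if cleaned["T50221.00"] == 0:
        cleaned["R99115.00"] = None  # assumption: None stands for "missing"
        cleaned["R99116.00"] = None
    return cleaned


# Made-up rows, for illustration only.
interviewed = {"T50221.00": 512300, "R99115.00": 2, "R99116.00": 1}
not_interviewed = {"T50221.00": 0, "R99115.00": 1, "R99116.00": 1}
```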

Uncorrectable Data Errors

Legal Form of Business Not Collected for 31 Cases in 2012 (posted 1/23/2015)

Due to an error in the questionnaire, the legal form of a business (SES-BUSOWN-12.#) was not collected in 2012 for 31 NLSY79 respondents. This error affects respondents who reported a business in 2012 that matched a job reported during their last interview and who were last interviewed in 2008 or earlier. The legal form of a business for these 31 cases should be coded as -3.

The IDs for these 31 respondents can be found in the following file: legalform12_invalidmissing.xlsx.

Since in 2012 we did not re-ask the legal form of a business that matched a job reported in the last interview, users wishing to determine the legal form of that business should use the employer number from the previous survey year variable in 2012 (EMPLOYER_EMPPREVID.#). For example, if EMPLOYER_EMPPREVID.# is 1, which means that the business is the same as job #1 in 2010, then legal form of that business in 2012 is the value of SES-BUSOWN-12.01 or BUSOWN-12.01 in 2010. If the business in 2012 is job #2 in 2010, then the legal form of that business is the value of SES-BUSOWN-12.02 or BUSOWN-12.02 in 2010.
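The lookup described above can be sketched in Python. The sketch assumes the previous survey year's variables for a respondent are held in a dictionary keyed by question name. That layout and the function name are hypothetical; the question-name pattern follows the examples in the paragraph above.

```python
def legal_form_from_prior_year(emp_prev_id, prior_year_row):
    """Look up the legal form of a business from the previous survey year.

    `emp_prev_id` is the value of EMPLOYER_EMPPREVID.# in 2012, that is,
    the job number the business had in the previous survey year.
    Returns None if neither question name is present for that job number.
    """
    for prefix in ("SES-BUSOWN-12", "BUSOWN-12"):
        key = f"{prefix}.{emp_prev_id:02d}"
        if key in prior_year_row:
            return prior_year_row[key]
    return None


# Made-up prior-year values, for illustration only.
row_2010 = {"SES-BUSOWN-12.01": 3, "BUSOWN-12.02": 1}
```

With this made-up row, an EMPLOYER_EMPPREVID.# value of 1 resolves to SES-BUSOWN-12.01, and a value of 2 falls back to BUSOWN-12.02.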

Missing Occupation, Industry and Class of Worker in 1994 data items

The occupation, industry and class of worker information for 353 CPS employers was not collected during the 1994 interview. These CPS employers either lasted less than 9 weeks since the last interview or were employers for whom the respondent worked less than 10 hours per week. They were erroneously treated as other non-CPS employers with those characteristics, for which occupation, industry and class of worker information is not collected. For those employers that were also reported in the previous survey year, and for which the respondent confirmed that his/her occupation did not change since the previous survey year, the occupation, industry and class of worker codes from the previous survey year should also apply. Users may also use data from subsequent survey years in a similar manner to attempt to fill in more of this information.

This error is present on all current NLSY79 data releases.

Missing information on Union Affiliation/Collective Bargaining in 1994 data items

Due to an error in the questionnaire, information on union affiliation and collective bargaining was not collected for a number of employers. Respondents reporting a non-self-employed job should have answered these questions. This error affects employer #1 (generally the CPS employer) for 3,210 of the 7,141 respondents who should have been asked, employer #2 for 531 of 2,215, employer #3 for 128 of 606, employer #4 for 34 of 168, and employer #5 for 6 of 48. This is 45% missing for employer #1, 24% missing for employer #2, 21% missing for employer #3, 20% missing for employer #4 and 13% missing for employer #5.
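The percentages follow directly from the counts. A short Python check is given below; the counts are taken from the paragraph above, and the helper name is arbitrary. Halves are rounded up explicitly because Python's built-in `round` would turn the 12.5% for employer #5 into 12 rather than 13.

```python
import math

# employer number -> (respondents not asked, respondents who should have been asked)
counts = {1: (3210, 7141), 2: (531, 2215), 3: (128, 606), 4: (34, 168), 5: (6, 48)}


def pct_missing(missing, eligible):
    """Percent missing, rounded to the nearest whole number with halves rounded up."""
    return math.floor(100 * missing / eligible + 0.5)


shares = {emp: pct_missing(m, n) for emp, (m, n) in counts.items()}
# shares == {1: 45, 2: 24, 3: 21, 4: 20, 5: 13}
```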

Conversely, information on union affiliation and collective bargaining was collected on a number of self-employed respondents, for whom these questions should not have been asked. This error affects employer #1 for 166 cases, employer #2 for 45 cases, and employers #3, #4 and #5 for 1 case each. This information for self-employed respondents (those with a code of "4" for class of worker) should be disregarded.

This error is present on all current NLSY79 data releases.

2 missing cases in 1994 data items

Due to probable machine glitches, the data from two (2) apparently completed interviews were rendered inaccessible. 1994 variables for cases #5078 and #10524 are missing. Any 1994 data items remaining for these cases are meaningless and should be discarded for purposes of analysis. The 1996 interview period for these cases spanned from the 1993 interview to the 1996 interview. Information that would have been collected at the 1994 interview is thus now included in the data for the 1996 survey year.

This data error is present on all current NLSY79 data releases.

NORC 1978 Memo

The 1978 NORC memo regarding race and ethnicity coding can be found here: NORC 1978 memo