Errata for 1979-2018 Data Release

NEWEST ERRATA

NLSY79 2018 Bonus and Income Data [posted 8/3/2022]

A small number of bonus and income variables were inadvertently omitted from the 2018 release. These variables can be downloaded using the missingbonusandincvars.zip file until they are made available on the next public release.

OTHER ERRATA

NLSY79 Total Net Family Income 2018 Value of Estates/Trusts/Inheritances [posted 7/25/2022]

The Total Net Family Income (TNFI_TRUNC) variable for 2018 mistakenly incorporated values from Q13-73A (value of estates/trusts/inheritances in the past calendar year), which is not consistent with the created income (TNFI_TRUNC) and poverty status (POVSTATUS) variables in other survey years. To maintain historical consistency, we will update the R28 TNFI_TRUNC variable by removing this component and recomputing TNFI_TRUNC and POVSTATUS values in the next release.

NLSY79 2018 Smoking and Alcohol Use Data [posted 7/7/2022]

In 2016, the Smoking and Alcohol Use section was not fielded. This section was fielded in 2018. Unfortunately, a documentation setting error prevented these questions from being included in the 2018 release. These data will be included in the next full release of NLSY79 data. In the meantime, the data from the 2018 Smoking and Alcohol Use section can now be downloaded using the link smokealcohol.zip.

NLSY79 Total Net Family Income 2006 Estimates Included [posted 6/22/2022]

The Total Net Family Income (TNFI) for 2006 incorporated estimates of income values where available (self-reported ranges and unfolding brackets). While the data are not incorrect, the inclusion of estimates is not consistent with TNFI in other survey years. In future data releases, estimates will be eliminated from the 2006 TNFI_TRUNC calculation. Values for these cases will be reset to -3 (invalid missing). Family Poverty Status for affected cases will also be reset to -3. As in other survey years, estimated values will still be available in the survey data for 2006 for researchers to incorporate at their discretion. To eliminate those cases from the 2006 TNFI_TRUNC variable before the next release, users can reset TNFI_TRUNC and POVSTATUS values to -3 for respondents reporting estimates for any TNFI_TRUNC components.
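
The interim workaround described above can be sketched in Python. This is a minimal illustration under stated assumptions, not CHRR's procedure: the record layout (one dictionary per respondent) and the `has_estimate` flag are hypothetical, and users would need to derive that flag themselves from the 2006 income component questions.

```python
INVALID_MISSING = -3  # code the errata assigns to affected cases


def reset_estimated_cases(records):
    """Return copies of the records with TNFI_TRUNC and POVSTATUS reset to -3
    wherever the (hypothetical) has_estimate flag is true."""
    cleaned = []
    for rec in records:
        rec = dict(rec)  # copy so the caller's data is not modified
        if rec.get("has_estimate"):
            rec["TNFI_TRUNC"] = INVALID_MISSING
            rec["POVSTATUS"] = INVALID_MISSING
        cleaned.append(rec)
    return cleaned


# Made-up example values, for illustration only.
sample = [
    {"id": 1, "TNFI_TRUNC": 52000, "POVSTATUS": 0, "has_estimate": False},
    {"id": 2, "TNFI_TRUNC": 31000, "POVSTATUS": 0, "has_estimate": True},
]
result = reset_estimated_cases(sample)
```

Returning copies rather than editing in place keeps the originally released values available for comparison.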

The following number of cases will be affected by the elimination of estimates of income values for the 2006 TNFI and POVSTATUS variables.

Reference # | Variable Name | Variable Title | Year | Total # of cases affected
T09878.00 | TNFI_TRUNC | TOTAL NET FAMILY INCOME IN PAST CALENDAR YEAR *KEY* (TRUNCATED) | 2006 | 725
T09879.00 | POVSTATUS | FAMILY POVERTY STATUS IN 2005 | 2006 | 725

As mentioned above, in future data releases, these cases will be reset to -3 values (invalid missing), comparable to other survey years.

NLSY79 Total Net Family Income 2002 and 2004 Components in Error [posted 6/22/2022]

As components of Total Net Family Income (TNFI), monthly amounts for veterans', disability, and Social Security benefits in 2002 and 2004 were all multiplied by 12, assuming receipt for every month in the past calendar year. Amounts for respondents reporting fewer than 12 months of receipt should have been multiplied by the actual number of months the benefits were received. The Family Poverty Status (POVSTATUS) for a very small set of respondents in these survey years was affected as well.
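
The arithmetic behind this error can be illustrated with a short Python sketch. The function names and dollar figures are made up for illustration; the corrections file referenced in this entry remains the authoritative source of corrected values.

```python
def annual_benefit(monthly_amount, months_received):
    """Annual benefit amount based on the months actually received."""
    if not 0 <= months_received <= 12:
        raise ValueError("months_received must be between 0 and 12")
    return monthly_amount * months_received


def overstatement(monthly_amount, months_received):
    """Amount by which the 12-month assumption overstates the annual benefit."""
    return monthly_amount * 12 - annual_benefit(monthly_amount, months_received)


# Made-up figures: $500 per month, received for 7 months.
correct = annual_benefit(500, 7)   # 3500
excess = overstatement(500, 7)     # 2500
```

A respondent who received benefits in all 12 months is unaffected, since the two calculations then agree.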

The following 2002 and 2004 variables are affected:

Reference # | Variable Name | # cases for each type of benefit affected | Year | Total # cases affected
R77037.00 | TNFI_TRUNC | 6 (veterans' benefits), 60 (disability), 5 (Social Security) | 2002 | 71
R77039.00 | POVSTATUS | | 2002 | 8
R84961.00 | TNFI_TRUNC | 5 (veterans' benefits), 78 (disability), 18 (Social Security) | 2004 | 98
R84963.00 | POVSTATUS | | 2004 | 5

The survey-level data for all of the TNFI_TRUNC components are correct and can be found in the current data release. A list of corrected and adjusted TNFI_TRUNC and POVSTATUS values for each survey year can be found in tnfi_vetdissocsec_0204_corrections.xlsx.

Corrected data for these variables will be included in the next public release.

Sex of First Biological Child [posted 5/5/2022]

The sex of the first child in the year-specific 2018 Fertility and Relationship History data was inadvertently not included in the public release. 

Users can download this variable in nlsy79_2018_sex_of_first_chid.zip.

Sex of Household Members [posted 5/5/2022]

In the NLSY79 Household Record data for 2004 (R21), 2006 (R22), and 2008 (R23), some household members incorrectly received codes of -4 (missing) for sex, rather than codes of 1 for male and 2 for female. In 2006, no variable was released for the 12th household member, and in 2008, variables for the 10th and 11th household members were not released. 

Users can download these variables in nlsy79_sex_of_hh_members_04_0_08.zip

The affected variables are:

Ref Number | Question Name | Year | Title
R85120.00 | HHI_FINAL_GENCODE.01 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #01
R85121.00 | HHI_FINAL_GENCODE.02 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #02
R85122.00 | HHI_FINAL_GENCODE.03 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #03
R85123.00 | HHI_FINAL_GENCODE.04 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #04
R85124.00 | HHI_FINAL_GENCODE.05 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #05
R85125.00 | HHI_FINAL_GENCODE.06 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #06
R85126.00 | HHI_FINAL_GENCODE.07 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #07
R85127.00 | HHI_FINAL_GENCODE.08 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #08
R85128.00 | HHI_FINAL_GENCODE.09 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #09
R85129.00 | HHI_FINAL_GENCODE.10 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #10
R85130.00 | HHI_FINAL_GENCODE.11 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #11
R85131.00 | HHI_FINAL_GENCODE.12 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #12
R85132.00 | HHI_FINAL_GENCODE.13 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #13
R85133.00 | HHI_FINAL_GENCODE.14 | 2004 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #14
T10068.00 | HHI_FINAL_GENCODE.01 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #01
T10069.00 | HHI_FINAL_GENCODE.02 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #02
T10070.00 | HHI_FINAL_GENCODE.03 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #03
T10071.00 | HHI_FINAL_GENCODE.04 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #04
T10072.00 | HHI_FINAL_GENCODE.05 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #05
T10073.00 | HHI_FINAL_GENCODE.06 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #06
T10074.00 | HHI_FINAL_GENCODE.07 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #07
T10075.00 | HHI_FINAL_GENCODE.08 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #08
T10076.00 | HHI_FINAL_GENCODE.09 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #09
T10077.00 | HHI_FINAL_GENCODE.10 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #10
T10077.10 | HHI_FINAL_GENCODE.11 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #11
T10077.20 | HHI_FINAL_GENCODE.12 | 2006 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #12
T22272.00 | HHI_FINAL_GENCODE.01 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #01
T22273.00 | HHI_FINAL_GENCODE.02 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #02
T22274.00 | HHI_FINAL_GENCODE.03 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #03
T22275.00 | HHI_FINAL_GENCODE.04 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #04
T22276.00 | HHI_FINAL_GENCODE.05 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #05
T22277.00 | HHI_FINAL_GENCODE.06 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #06
T22278.00 | HHI_FINAL_GENCODE.07 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #07
T22279.00 | HHI_FINAL_GENCODE.08 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #08
T22280.00 | HHI_FINAL_GENCODE.09 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #09
T22280.10 | HHI_FINAL_GENCODE.10 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #10
T22280.20 | HHI_FINAL_GENCODE.11 | 2008 | FINAL HOUSEHOLD RECORD - SEX OF MEMBER #11

Distance Measures for 2 NLSY79 Respondents Previously Dropped from the Sample [posted 3/25/2022]

Distance measures for 2 respondents dropped in prior rounds were erroneously maintained in the created distance variables (DISTANCE_) on the geocode release. For respondent ID 4646, these variables should be recoded to -4 for survey year 2006. For respondent ID 7645, they should be recoded to -4 for survey years 2012 through 2016. Prior errata noted that these cases were dropped from the data and that all case data (other than the stray distance values) relating to the years in question were set to 'non-interview' for the appropriate survey rounds. This includes sample weights of '0' and RNI of 64 (unable to locate youth). These measures will be updated in the next geocode release.
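
Until the next geocode release, the recode described above could be applied along the following lines. This is a sketch under assumptions: the row layout and the `id`, `year`, and `distance` keys are hypothetical stand-ins for the actual DISTANCE_ variables, and "2012 through 2016" is read as the 2012, 2014, and 2016 rounds because the NLSY79 has been fielded every two years since 1994.

```python
RECODE_VALUE = -4  # value the errata says to assign

# Respondent ID -> survey years whose distance values should be recoded.
RECODES = {
    4646: {2006},
    7645: {2012, 2014, 2016},
}


def recode_distance(rows):
    """rows: iterable of dicts with hypothetical keys 'id', 'year', 'distance'."""
    out = []
    for row in rows:
        row = dict(row)  # copy so the input is left untouched
        if row["year"] in RECODES.get(row["id"], set()):
            row["distance"] = RECODE_VALUE
        out.append(row)
    return out


# Made-up rows for illustration only.
rows = [
    {"id": 4646, "year": 2006, "distance": 12},
    {"id": 4646, "year": 2004, "distance": 12},
    {"id": 7645, "year": 2014, "distance": 30},
    {"id": 1001, "year": 2006, "distance": 5},
]
recoded = recode_distance(rows)
```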

CPSJOB Flags in NLSY79 Data [posted 3/25/2022]

CPS Job Flag Corrections – 1979, 1982-1992

The CPS job flag variables for some cases in survey years between 1982 and 1992 are erroneously set to "0" instead of "1". The affected variables are various QES-52.## variables in survey years 1982-1992 and, in the created Employer History Roster, EMPLOYERS_ALL_CPSJOB_[YEAR].## in 1979 and 1982-1992.

The Employer History Roster CPS job flags for 1979 (EMPLOYERS_ALL_CPSJOB_1979.##) were created based on the current or most recent job definition and released with the rest of the roster variables. There are no main questionnaire variables such as QES-52.## to correct in 1979. However, a small number of cases were still missing a CPS job identification and required correction (see table below).

The number of cases affected for 1979 and 1982-1992:

Survey Year | Single Job Reported | Multiple Jobs Reported
1979 | 8 | 10
1982 | 83 | 4
1983 | 32 | 0
1984 | 41 | 1
1986 | 0 | 1
1988 | 76 | 26
1989 | 2 | 0
1990 | 8 | 4
1991 | 3 | 0
1992 | 9 | 5

A list of corrections and revisions can be found in nocpsjobsided_fixes_all.xlsx. Corrected and adjusted data for the variables listed above will be included in the next public release.

Errors in 2004 and 2006 NLSY79 Payrate and Related Variables [posted 3/25/2022]

Payrate variables are incorrect for a small subset of respondents who reported teacher-specific earnings in 2004 and 2006. The payrates for those respondents were not multiplied by 100 as they should have been. Because the payrates are a component in the calculation of the Hourly Rate of Pay variables and these variables are duplicated into the Employer History Roster, a number of other items are also affected. 
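
The missing scaling step can be illustrated with a short Python sketch. The function name and the `is_affected` flag are hypothetical, and treating negative values as missing-data codes is an assumption. This shows only the factor-of-100 fix to the payrate itself, not the downstream Hourly Rate of Pay calculation; the published corrections file remains the authoritative source.

```python
def correct_payrate(payrate, is_affected):
    """Apply the missing factor of 100 to an affected payrate.

    Negative values are assumed to be missing-data codes and are left unchanged.
    """
    if is_affected and payrate >= 0:
        return payrate * 100
    return payrate


# Made-up values for illustration only.
fixed = correct_payrate(25, True)       # 2500
untouched = correct_payrate(25, False)  # 25
```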

The following variables are affected for survey years 2004 and 2006:

PAYRATE-EMP-ALL.##
HRP#
HRP#_WHRLY2

In addition, the Employer History Roster variables listed below are also affected:

EMPLOYERS_ALL_PAYRATE_2004.##
EMPLOYERS_ALL_PAYRATE_2006.##
EMPLOYERS_ALL_HRLY_WAGE_2004.##
EMPLOYERS_ALL_HRLY_WAGE_2006.##

The following number of cases are affected for each job reported in 2004 and 2006:

2004: 189 (job 1), 34 (job 2), 12 (job 3), 3 (job 4)
2006: 198 (job 1), 20 (job 2), 9 (job 3), 2 (job 4), 1 (job 5)

A list of corrected values for each survey year and job can be found in payrate_hrp_2004_2006_corrections.xlsx.

Corrected data for these variables will be included in the next public release.

Erroneous Employer Data for NLSY79 Case in 1988 [posted 3/25/2022]

Respondent #6373 has been found to have a small number of variables that contain erroneous data for an employer in survey year 1988. The respondent was not employed during the survey period, but appears to have erroneously answered a small set of questions for a past employer. Values for the variables listed below should be set to -4 (valid missing) for 1988.

R27707.00 QES-51.01 HOURS PER DAY WORKED JOB #01
R27708.00 QES-52.01 INT CHECK – IS JOB #01 SAME AS CURRENT JOB?
R27709.00 QES-52A.01 HOURS PER WEEK WORKED JOB #01
R27710.00 QES-52B.01 HOURS PER WEEK USUALLY WORKED AT HOME JOB #01
R27713.00 QES-53B.01 INT CHECK 88 – R WORK MORE/LESS THAN 10 HOURS PER WEEK? JOB #01
R27714.00 QES-54B.01 INT CHECK 88 – DID R WORK MORE/LESS THAN 9 WEEKS? JOB #01
R27722.00 QES-71A.01 TIME UNIT OF RATE OF PAY JOB #01
R27722.20 HRP1_WHRLY2 HOURLY RATE OF PAY – INCLUDIN HOURLY RATE FOR RS FIRST REPORTING NON-HOURLY TIME UNIT JOB #01
R27723.00 QES-78.01 PAID BY THE HOUR (TIME UNIT OTHER THAN HOURLY PREVIOUSLY REPORTED)? JOB #01
R27724.00 QES-79.01 HOURLY RATE OF PAY (RATE OTHER THAN HOURLY PREVIOUSLY REPORTED) JOB #01
R27726.00 QES-80.01 INT CHECK – JOB #01 SAME EMPLOYER AS ANY AT LAST INT?
R27728.00 QES-92.01 INT CHECK 88 – ARE THERE ANY MORE EMPLOYER SUPPLEMENTS? JOB #01
R27731.00 QES-87A.01 INT CHECK – R WORK LESS THAN 10 HOURS PER WEEK OR LESS THAN 9 WEEKS? JOB #01
R27732.00 QES-88C.01 WAGES SET BY COLLECTIVE BARGAINING? JOB #01

Corrected data for the variables listed above will be included in the next public release.
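
Users who want to apply this recode before the next release could do so along these lines. This is a minimal sketch: the reference numbers are copied from this entry, while the record layout (a dictionary with an `id` key plus reference-number keys) is a hypothetical stand-in for however the extract is stored.

```python
VALID_MISSING = -4  # value given in this errata entry
RESPONDENT_ID = 6373

# Reference numbers of the affected 1988 variables, as listed in this entry.
AFFECTED_1988 = [
    "R27707.00", "R27708.00", "R27709.00", "R27710.00", "R27713.00",
    "R27714.00", "R27722.00", "R27722.20", "R27723.00", "R27724.00",
    "R27726.00", "R27728.00", "R27731.00", "R27732.00",
]


def blank_erroneous_employer(record):
    """Return a copy of the record with the affected variables set to -4
    if, and only if, the record belongs to respondent 6373."""
    record = dict(record)
    if record.get("id") == RESPONDENT_ID:
        for ref in AFFECTED_1988:
            if ref in record:
                record[ref] = VALID_MISSING
    return record


# Made-up values for illustration only.
fixed = blank_erroneous_employer({"id": 6373, "R27707.00": 8, "R27709.00": 40})
other = blank_erroneous_employer({"id": 1, "R27707.00": 8})
```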

Correction to the 2006 PAST_CALENDAR_YEAR Variable [posted 2/4/2022]

The R22 (2006) PAST_CALENDAR_YEAR variable was incorrectly incremented by 1. The current answer category of 2006 should be 2005, and the current answer category of 2007 should be 2006. The corrected variable will be included as part of the next public release. Until then, it can be accessed from the following link: PAST_CALENDAR_YEAR.xlsx.
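
For users working with the uncorrected variable, the adjustment amounts to subtracting 1 from each reported year. The sketch below assumes that non-positive values are missing-data codes and should be left alone; the function name is hypothetical.

```python
def correct_past_calendar_year(value):
    """Undo the erroneous +1 in the R22 (2006) PAST_CALENDAR_YEAR variable."""
    if value > 0:  # assumption: non-positive values are missing-data codes
        return value - 1
    return value


corrected = [correct_past_calendar_year(v) for v in (2006, 2007, -4)]
# corrected is [2005, 2006, -4]
```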

Two Values for NLSY79 Variable QES-PARSE.## Miscoded [posted 6/16/2021]

Two values for QES-PARSE.## (one for each affected variable) are miscoded. The affected variables and the correct values are listed in the table below:

Survey Year | Reference Number | Qname | Title | Public ID Number | Correct Value
2004 | R78637.00 | QES-PARSE.02 | JOB TYPE OF EMPLOYER #02 | 5074 | 1
2012 | T41637.00 | QES-PARSE.04 | JOB TYPE OF EMPLOYER #04 | 2030 | 2

Corrected data for these variables will be included in the next public release.
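
The two corrections in this entry are small enough to apply by hand, but a keyed lookup keeps them auditable. In the sketch below the correction values are copied from the table in this entry; the data layout (public ID mapped to a dictionary of reference numbers) is a hypothetical stand-in.

```python
# (public ID, reference number) -> correct value, copied from this entry.
QES_PARSE_FIXES = {
    (5074, "R78637.00"): 1,  # 2004, QES-PARSE.02
    (2030, "T41637.00"): 2,  # 2012, QES-PARSE.04
}


def apply_qes_parse_fixes(data):
    """data: dict mapping public ID -> {reference number: value}."""
    fixed = {pid: dict(values) for pid, values in data.items()}
    for (pid, ref), value in QES_PARSE_FIXES.items():
        if pid in fixed and ref in fixed[pid]:
            fixed[pid][ref] = value
    return fixed


# Made-up starting values for illustration only.
data = {5074: {"R78637.00": 3}, 2030: {"T41637.00": 1}, 9: {"R78637.00": 3}}
patched = apply_qes_parse_fixes(data)
```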

Undocumented CPSOCC80 Codes in NLSY79 Data [posted 3/9/2021]

A relatively small number of undocumented codes in the CPSOCC80 occupation code variables for survey years 1992-2000 have been brought to CHRR's attention. Codes have been revised where possible. CHRR was unable to verify most of the undocumented codes for survey years 1992-1994, and the majority of undocumented codes for those survey years were set to an invalid missing code of "-3". Variables affected are:

RNUMBER QNAME VARIABLE TITLE YEAR
R3728100 CPSOCC80 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (80 CENSUS 3 DIGIT) CPS ITEM 1992
R4193700 CPSOCC80 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (80 CENSUS 3 DIGIT) CPS ITEM 1993
R4587901 CPSOCC80 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (80 CENSUS 3 DIGIT) CPS ITEM 1994
R5270900 CPSOCC80 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (80 CENSUS 3 DIGIT) CPS ITEM 1996
R6473700 CPSOCC80 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (80 CENSUS 3 DIGIT) CPS ITEM 1998
R6592900 CPSOCC80.01 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (80 CENSUS 3 DIGIT) CPS ITEM 2000

Revised values for the variables listed above are contained in cpsocc80_undoc_codes_revisions.xlsx

Users can also access the 1970 occupation codes. Those variables for survey years 1992-2000 are:

RNUMBER QNAME VARIABLE TITLE YEAR
R3727800 CPSOCC70 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (70 CENSUS 3 DIGIT) CPS ITEM 1992
R4182100 CPSOCC70 OCCUPATION AT CURRENT JOB/MOST RECENT JOB (70 CENSUS 3 DIGIT) CPS ITEM 1993
R4587904 OCCALL-EMP.01 OCCUPATION (CENSUS 3 DIGIT, 70 CODES) (ALL) JOB #01 1994
R5270600 OCCALL-EMP.01 OCCUPATION (CENSUS 3 DIGIT, 70 CODES) (ALL) JOB #01 1996
R6472600 OCCALL-EMP.01 OCCUPATION (CENSUS 3 DIGIT, 70 CODES) (ALL) JOB #01 1998
R6591800 OCCALL-EMP.01 OCCUPATION (CENSUS 3 DIGIT, 70 CODES) (ALL) JOB #01 2000

Corrected data for the CPSOCC80 variables will be included in the next public release.

CPS HOURLY RATE OF PAY and CPS Job Flag Corrections for NLSY79 1980-1993 [posted 1/6/2021]

A review of the CPS hourly rate of pay variables yielded corrections to a small number of variables:

  1. A small number of CPSHRP variables (143) from 1980-1993 were found to be missing valid values that should have been present.
  2. In addition, 40 cases were found to have a discrepancy between QES-52.## and QES-84.## (two variables that identify the CPS job between 1988 and 1993). In the vast majority of those cases, QES-52.## appears to be correct.
  3. 20 cases in 1993 were rounded in the HRP and HRP_WHRLY_2 variables but not in the CPSHRP variable. The HRP will be replaced with the more precise CPSHRP calculation.
  4. 4 respondents were identified as having an incorrect CPSHRP value, with one of those requiring an update to the HRP and HRP_WHRLY_2 values.

Corrected data for these variables have been included in the public release. 

AGEATINT updates and corrections (1979 to 2016) [posted 1/6/2021]

Due to reauthorization of CIPSEA in 2019, in response to the Foundations for Evidence-Based Policymaking Act of 2017, a review of all age variables was undertaken. The AGEATINT variables were smoothed in accordance with this review. During this process, two issues were corrected for the AGEATINT variable:

  1. In 1994, the original 1979 date of birth was used in the calculation rather than the hybrid date of birth that combines the 1979 date of birth with the 1981 date of birth. All other AGEATINT variables use the hybrid date to calculate the age at interview. This led to updates for 87 respondents.
  2. In the 2004 program, the components of the date of interview variable were mislabeled. When the program calculated the age, this mismatch caused the age to be miscalculated for 3,314 cases.
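
Both issues come down to which dates are fed into the age calculation. The errata does not state the formula, so the sketch below assumes that age at interview means completed years between the date of birth and the interview date; it does not reproduce the smoothing step or the construction of the hybrid date of birth.

```python
from datetime import date


def age_at_interview(date_of_birth, interview_date):
    """Age in completed years on the interview date (assumed definition)."""
    years = interview_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in the interview year.
    if (interview_date.month, interview_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Made-up dates: the day before and the day of a 44th birthday.
before_birthday = age_at_interview(date(1960, 6, 15), date(2004, 6, 14))  # 43
on_birthday = age_at_interview(date(1960, 6, 15), date(2004, 6, 15))      # 44
```

Under this definition, supplying a different date of birth or a mislabeled interview date component can shift the result by a year or more, which is consistent with the kinds of corrections described above.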

Corrected data for these variables have been included on the public release.

Corrections to INDALL-EMP and OCCALL-EMP variables in 2014 and 2016 [posted 1/6/2021]

In updating the R28 industry and occupation codes, staff found that a small number of codes were pulled forward erroneously for long-term non-interviews (last interview in 2002 or prior), when a different coding frame was used. The codes for non-interviewed respondents from 2002 were converted by multiplying the existing code by 10; this affected 4 respondents (4 values) in 2014 and 6 respondents (12 values) in 2016. The values for an additional 5 respondents in 2014 and 10 respondents in 2016 were removed because they were coded to the 1970 frame.
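
The conversion applied to the pulled-forward 2002 codes is a simple multiplication, sketched below. The function name is hypothetical, and leaving non-positive values unchanged assumes they are missing-data codes.

```python
def convert_pulled_forward_code(code):
    """Convert a pulled-forward 2002 code by multiplying it by 10."""
    if code > 0:  # assumption: non-positive values are missing-data codes
        return code * 10
    return code


# Made-up codes for illustration only.
converted = [convert_pulled_forward_code(c) for c in (123, 45, -4)]
# converted is [1230, 450, -4]
```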

Corrected data for these variables have been included on the public release.

Corrections to HOURS_WORKED_WEEK_ALL.## for 2002 [posted 1/6/2021]

Values for the created variables HOURS_WORKED_WEEK_ALL.## in 2002 are incorrectly set to valid missing (-4) for self-employed workers. This affected 903 self-employed jobs.

Corrected data for these variables have been included on the public release.