Newest errata
Coding errors in cause of parents' death variables
POSTED 10/1/2021
Some discrepancies have been uncovered in the NLSY97 combined cause of biological parents' death variables (YHEA29-180_COMB and YHEA29-220_COMB). First, there was a design change made in round 14 (2010) which broke up the initial cause of death category from a combined heart attack/stroke category into two separate categories. This was not indicated in the combined list of causes. Second, the values for the coding of stroke as a separate category and heart disease as an added category were inadvertently switched in the later rounds. Here is the updated list of disease categories that now appear in the most recent public data release:
Code | Category |
---|---|
1 | Heart Attack/Stroke (combined category in r13 only) |
2 | Heart Attack (separate category in r14 and later) |
24 | Stroke (separate category in r14 and later) |
2 | Accident |
3 | Cancer |
4 | Old Age |
5 | Emphysema |
6 | OTHER (SPECIFY) |
7 | Added in - Stroke (in r13 only) |
8 | Added in - Heart Disease |
9 | Added in - AIDS/HIV |
10 | Added in - homicide |
11 | Added in - liver disease |
12 | Added in - diabetes |
13 | Added in - septicemia |
14 | Added in - viral hepatitis |
15 | Added in - nephritis (kidney disease) |
16 | Added in - Alzheimer's disease |
17 | Added in - Influenza or pneumonia |
18 | Added in - Suicide |
19 | Added in - Unspecified Drug/Alcohol related |
20 | Added in - Specific cause |
21 | Added in - Unspecified cause |
22 | Added in - Not deceased |
995 | Supervisor Review |
999 | Uncodeable |
Other errata
Duplicate job information for 1 respondent
POSTED 8/23/2021
Respondent 2685 reported information for 2 jobs that duplicate the main job reported in Round 5 (survey year=2001). This information was erroneously included in the created summary and event history variables posted in the current NLSY97 data. This information has now been updated in the most recent public release.
Some spouse/partner components omitted from NLSY97 Assets30 and Assets35 values
POSTED 8/2/2021
Some components related to spouse/partners were not incorporated into the ASSETS30 and ASSETS35 values reported in survey years 2010-2017 (rounds 14-18). Respondents who reported that the house in which they lived (1) was owned all or in part by the spouse/partner, with the respondent holding no ownership share, and (2) was not the house in which the respondent was living at the date of the prior interview round are missing relevant house-related values from the ASSETS30 and/or ASSETS35 created variables. In addition, educational debt reported for a spouse/partner was not included in the calculations of the relevant ASSETS30 and/or ASSETS35 values. The affected variables are listed below, with the number of cases affected for ASSETS30 and ASSETS35, respectively, in parentheses:
- CVC_HOUSE_VALUE_30/35 (123/40)
- CVC_HOUSE_DEBT_30/35 (88/25)
- CVC_HOUSE_TYPE_30/35 (126/42)
- CVC_ASSETS_DEBTS_30/35 (1012/501)
- CVC_HH_NET_WORTH_30/35 (950/443)
These have been corrected in the most recent public release.
Missing Assets 30 data
POSTED 5/26/2021
The data collected in 2015 for variables YAST30-5074_NEW_COMB, YAST30-5082_NEW_COMB, and YAST30-FA_8A_COMB were inadvertently omitted from the public release. The complete data for these variables are now available in the most recent public release.
Invalid case data
POSTED 5/26/2021
The following XRND variables for case PUBID=1061 have been discovered to be invalid. These data will be removed from all future public releases.
- OTHERPARENTS_AGE.01
- OTHERPARENTS_CARES.01
- OTHERPARENTS_CHILDID1.01
- OTHERPARENTS_CHILDRANK.01
- OTHERPARENTS_CLOSE.01
- OTHERPARENTS_CONFLICT.01
- OTHERPARENTS_DEGREE.01
- OTHERPARENTS_EMPLOYED.01
- OTHERPARENTS_ENROLLSTAT.01
- OTHERPARENTS_ETHNICITY.01
- OTHERPARENTS_GOVTAID.01
- OTHERPARENTS_HGC.01
- OTHERPARENTS_INCOME.01
- OTHERPARENTS_RACE.01~000001
- OTHERPARENTS_RACE.01~000002
- OTHERPARENTS_RACE.01~000003
- OTHERPARENTS_RACE.01~000004
- OTHERPARENTS_RACE.01~000005
- OTHERPARENTS_RACE.01~000006
- OTHERPARENTS_RACE.01~000007
- OTHERPARENTS_RELBIRTH.01
- OTHERPARENTS_RELIGION.01
- OTHERPARENTS_RELPREG.01
- OTHERPARENTS_ROUND.01
- OTHERPARENTS_SOURCE.01
- OTHERPARENTS_UID.01
- YAST30-4000S1_COMB
- YAST30-4000S2_COMB
- YAST30-4000S3A_COMB
- YAST30-4000S3_COMB
- YAST30-4000S4_COMB
- YAST30-4000S5_COMB
- YAST30-4000S6_COMB
- YAST30-4000S8_COMB
- YAST30-4000_COMB
- YAST30-4870_COMB
- YAST30-4880_COMB
- YAST30-5010_COMB
- YAST30-5011_COMB
- YAST30-5015A_COMB
- YAST30-5015_COMB
- YAST30-5016_COMB
- YAST30-5019_COMB
- YAST30-5130_COMB
- YAST30-5210A_COMB
- YAST30-5210B1_COMB
- YAST30-5210C1_COMB
- YAST30-5210C2_COMB
- YAST30-5210D1_COMB
- YAST30-5250_COMB
- YAST30-5400_COMB
- YAST30-5410_COMB
- YAST30-5411_COMB
- YAST30-5420_COMB
- YAST30-AGE30ELIG2_COMB
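Analysts working from an extract that still contains these values may want to blank them out for this case. The following is a minimal sketch in plain Python, assuming each record is a dict keyed by variable name with a `PUBID` key. The helper name `mask_invalid`, the abbreviated `INVALID_VARS` set, the sample values, and the choice of `None` as the fill value are all illustrative assumptions, not an NLS-provided procedure.

```python
# Hypothetical helper: blank out variables known to be invalid for one case.
INVALID_PUBID = 1061

# Abbreviated for illustration; extend with the full list above.
INVALID_VARS = {
    "OTHERPARENTS_AGE.01",
    "OTHERPARENTS_CARES.01",
    "YAST30-4000_COMB",
}


def mask_invalid(records, pubid=INVALID_PUBID, variables=INVALID_VARS, fill=None):
    """Return copies of the records with the listed variables set to `fill` for `pubid`."""
    cleaned = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are not mutated
        if rec.get("PUBID") == pubid:
            for name in variables:
                if name in rec:
                    rec[name] = fill
        cleaned.append(rec)
    return cleaned


# Made-up sample values, for demonstration only.
records = [
    {"PUBID": 1061, "OTHERPARENTS_AGE.01": 44, "YAST30-4000_COMB": 1},
    {"PUBID": 1062, "OTHERPARENTS_AGE.01": 51, "YAST30-4000_COMB": 0},
]
cleaned = mask_invalid(records)
```

Only the record for PUBID=1061 is changed; all other cases pass through untouched.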
Work history variables corrected for four cases
POSTED 10/13/2020
The NLSY97 employment history arrays for four respondents (PUBID=569, 1926, 4357, 8217) were calculated using an incorrect interview date. This caused a gap in the employment event history arrays and an error in select related variables (EMP_STATUS, EMP_HOURS, CV_WKSWK_JOB_YR_ALL/ET, CV_HOURS_WK_YR_ALL/ET, CV_WKSWK_JOB_DLI, CV_WKSWK_JOB_YR).
These updates have been included in the most recent public release.
Skip pattern error in Assets section regarding tax-advantaged accounts
POSTED 10/2/2020
In the Assets at 30 and the Assets at 35 modules on the NLSY97, a questionnaire error caused a set of respondents to bypass questions on tax-advantaged accounts. The questionnaire path should have asked respondents first whether they had a pension in YAST30-4270, then whether they had tax-advantaged accounts in YAST30-FA_8, and then whether they were owed money from loans in YAST30-4300. The skip pattern error occurred when respondents who reported not having a pension were skipped past the tax-advantaged question to the loans owed question; as a result, only respondents reporting a pension were specifically asked about tax-advantaged accounts. An invalid skip (-3) has been assigned to YAST30-FA_8 and YAST35-FA_8 for all respondents who were erroneously skipped past this question. It should be noted that all respondents were asked about 'other savings or substantial assets that we haven’t already discussed' in YAST30-4880. Respondents not asked about tax-advantaged accounts would have been able to include these assets in this total, despite not being prompted with the specific tax-advantaged questions. As a result, the created variables for total assets [CVC_HH_NET_WORTH_30/35] and for financial assets [CVC_FINANCIAL_ASSETS_30/35] have been created for all respondents. Researchers should note the inconsistency in how respondents were prompted for this class of assets.
Corrections from a review of CV_HRLY_COMPENSATION and CV_HRLY_PAY variables
POSTED 6/25/2020
A comprehensive review of the created wage variables [CV_HRLY_COMPENSATION and CV_HRLY_PAY] led to updates to these created variables and select underlying raw data through all rounds. These updates have been included in the most recent version of the public-use data available on the NLS Investigator. A summary of the issues for the created wage variables follows:
- The text fill in YEMP-97300 and YEMP-97400, which should have been worded ‘per job’ or ‘per item,’ was incorrectly worded ‘per week’. The values for the created variables were recalculated to reflect the respondent’s answer for per week.
- Respondents reporting a time unit of "semi-monthly" were asked about a "monthly" amount for the YEMP-19200 and YEMP-83100 branches; the created wage variables were recalculated with the new time unit.
- The created wage program incorrectly calculated the variables for respondents who reported an ‘other specify’ time unit option in their current/most recent wages (YEMP-38014, YEMP-38107), overtime (YEMP-24502, YEMP-88502, YEMP-34404, YEMP-38202, YEMP-38003, YEMP-98404A), or compensation (YEMP-100205, YEMP-21600, YEMP-38329B, YEMP-38407) as weekly. These cases were reviewed individually to ensure that the created values match the respondent’s answer in the ‘other-specify’ verbatim. Cases in which there wasn’t a verbatim or in which the time unit rate of pay didn’t match the initial reported time unit rate of pay were assigned a -3.
- An error in the created wage program caused the compensation amount to be double-counted when respondents reported overtime and any other type of compensation in YEMP-21200 and YEMP-38329D. This occurred in select cases when the respondent reported a start wage with overtime and additional compensation and then updated the compensation for the current or most recent wage.
- In rounds 8 and 9, a coding error in the created wage program caused the number of hours worked per week to be missing for compensation paid by the hour or per job/per item, and the created compensation variables were then coded as an invalid skip. The created variables were recalculated using the actual hours.
- Longitudinal inconsistencies were corrected in the created wage program when the reported number of hours worked or the number of weeks reported were zero or an invalid answer. These instances were standardized across all rounds. In addition, in Round 11 a programming error caused all annual salaries to be divided by 52 weeks regardless of the number of weeks reported; these values were corrected using the number of weeks reported when available.
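The Round 11 correction described in the last item above can be illustrated with a short calculation. This is a sketch only, not the created wage program itself: the function name, the 52-week fallback when no valid week count is reported, and the example figures are all assumptions.

```python
def hourly_from_annual(annual_salary, hours_per_week, weeks_reported=None):
    """Convert an annual salary to an hourly rate.

    Uses the reported number of weeks when it is a positive number;
    otherwise falls back to 52 weeks (an assumption for this sketch).
    """
    if hours_per_week is None or hours_per_week <= 0:
        return None  # no rate can be computed without valid hours
    weeks = weeks_reported if weeks_reported and weeks_reported > 0 else 52
    return annual_salary / (weeks * hours_per_week)


# Made-up example: 30,000 per year, 40 hours per week, 40 weeks worked.
uncorrected = hourly_from_annual(30000, 40)    # divides by 52 weeks
corrected = hourly_from_annual(30000, 40, 40)  # divides by the 40 weeks reported
```

In this example the uncorrected rate is about 14.42 per hour and the corrected rate is 18.75, so dividing by 52 weeks understates the hourly wage of anyone who reported fewer weeks.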
Changes to raw data as well as created variables:
- In round 8, raw data wage items (YEMP-33600, YEMP-97500, YEMP-100250, YEMP-22629, YEMP-22630, and YEMP-22632) were not adjusted to include 2 implied decimals. This impacted the created wage variables. Both the raw data and the created variables were adjusted.
- A relatively small number of cases in rounds 3, 5, 6 and 7 had a duplication of jobs on their employer (YEMP) rosters. In the initial cleanup work done prior to each data release, these jobs were deleted from the YEMP roster, but the raw data in the employer supplement loops was left unedited; these have now been removed.
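For the round 8 wage items above, "2 implied decimals" means the stored integer carries its last two digits as the fractional part. Here is a minimal sketch of that adjustment, assuming negative values are missing-data codes that should pass through unchanged; the function name is hypothetical.

```python
from decimal import Decimal


def apply_implied_decimals(raw, places=2):
    """Interpret a stored integer as having `places` implied decimal digits."""
    if raw is None or raw < 0:
        return raw  # leave missing-data codes such as -3 unchanged
    # scaleb shifts the decimal exponent without floating-point rounding.
    return Decimal(raw).scaleb(-places)


wage = apply_implied_decimals(1250)  # Decimal('12.50')
```

`Decimal` is used rather than float division so that currency amounts keep exactly two decimal places.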
Corrections to the created incarceration variables and event histories
POSTED 6/5/2020
Updates have been made to the incarceration monthly arrays and incarceration created variables after a review of cases where the created variable indicating the number of months incarcerated (INCARC_TOTMONTHS) did not match the number of months incarcerated in the incarceration monthly history arrays (variables beginning with INCARC_STATUS). In almost all cases, INCARC_TOTMONTHS was longer than the number of months incarcerated in the arrays. The most common reasons for this difference were reports of overlapping incarcerations and duplicate reports of the same incarceration. Updates were made to the incarceration created variables and, in some cases, incarceration event history arrays for 75 cases. The variables most affected were INCARC_TOTMONTHS (75 updates), INCARC_TOTNUM (52 updates), INCARC_LENGTH_LONGEST (32 updates), and INCARC_LENGTH_FIRST (27 updates). Updates were also made to INCARC_FIRST (11 updates) and INCARC_AGE_FIRST (9 updates). Updates to the incarceration status array were made for 28 cases. These updates have been included in the most recent version of the public data.
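Overlapping or duplicated spell reports inflate a simple sum of spell lengths, which is one way a total can exceed the month count in a status array. The sketch below contrasts the two counts. It assumes each spell is a hypothetical `(start, end)` pair of inclusive month indices and is not the program used to build INCARC_TOTMONTHS.

```python
def naive_total(spells):
    """Sum spell lengths; double-counts overlapping or duplicate spells."""
    return sum(end - start + 1 for start, end in spells)


def distinct_months(spells):
    """Count each incarcerated month once, however many spells cover it."""
    months = set()
    for start, end in spells:
        months.update(range(start, end + 1))
    return len(months)


# Made-up spells: one overlapping report and one exact duplicate.
spells = [(10, 15), (14, 18), (10, 15)]
naive = naive_total(spells)         # 6 + 5 + 6 = 17
distinct = distinct_months(spells)  # months 10 through 18 = 9
```

The gap between the two numbers (17 versus 9 here) is the kind of discrepancy the review looked for.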
Invalid dates for freelance job start dates
POSTED 2/10/2020
29 cases in 1998 have invalid freelance job start dates of 01/79 for the variable R2478300/R2478301 (FREELANCE_STARTDATEC.01). These dates should be set to missing as invalid skips (-3/-3).
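Users of an extract that still contains these dates can apply the correction themselves. This minimal sketch assumes that R2478300 holds the month and R2478301 the four-digit year (the erratum does not say which is which) and that records are plain dicts. The function name and sample records are hypothetical.

```python
INVALID_SKIP = -3


def fix_freelance_start(rec, month_var="R2478300", year_var="R2478301"):
    """Set a 01/79 freelance start date to invalid skip (-3/-3)."""
    rec = dict(rec)  # copy so the caller's record is not mutated
    if rec.get(month_var) == 1 and rec.get(year_var) == 1979:
        rec[month_var] = INVALID_SKIP
        rec[year_var] = INVALID_SKIP
    return rec


# Made-up records for demonstration only.
fixed = fix_freelance_start({"PUBID": 1, "R2478300": 1, "R2478301": 1979})
kept = fix_freelance_start({"PUBID": 2, "R2478300": 6, "R2478301": 1997})
```

If the year is stored as two digits in a given extract, the comparison value would need to be 79 instead of 1979.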