Item Nonresponse by Section
This section quantifies the extent of missing data, formally called item nonresponse, in each section of the NLSY79. The six tables below show which areas of the NLSY79 respondents are least likely to answer by tracking the total number and percentage of questions with missing data in each section of the survey. To give readers a detailed view of this problem, six surveys are analyzed. Nonresponse rates are examined first in the 1979 survey and then in surveys at roughly five-year intervals (1984, 1989, 1994, 1998, and 2004). These years were chosen to capture the major changes in the NLSY79. The 1979 survey shows the initial levels of nonresponse. The 1984 survey shows the amount of nonresponse just before one part of the respondent pool was dropped. The 1989 data show nonresponse after the first set of NLSY79 respondents was dropped. The 1994 data show what occurred after the survey switched from paper-and-pencil interviewing (PAPI) to computer-assisted personal interviewing (CAPI). While no major survey changes occurred in the 1998 and 2004 surveys, these surveys show nonresponse rates after many respondents had participated around 20 times.
The first column of each table contains the section names within the survey. The second column shows the total number of questions that respondents and interviewers should have answered in that section. This number is determined by first counting, within each section, the questions each respondent should have answered. A question is considered answerable if its recorded answer is neither a valid skip (-4) nor a noninterview (-5). The section total is then obtained by summing these counts across all NLSY79 respondents.
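The counting rule above can be sketched in Python. This is a minimal illustration, not the procedure actually used to build the tables: the function name, the data layout, and the sample responses are hypothetical. The codes -4 (valid skip) and -5 (noninterview) come from the text; the codes -1 (refusal), -2 (don't know), and -3 (invalid skip) are assumed here to follow the usual NLSY missing-data convention.

```python
from collections import Counter

# Missing-data codes. -4 and -5 are given in the text; -1, -2, and -3 are
# assumed to follow the usual NLSY convention.
REFUSAL, DONT_KNOW, INVALID_SKIP = -1, -2, -3
VALID_SKIP, NONINTERVIEW = -4, -5


def tally_section(responses_by_respondent):
    """Count answerable questions and nonresponses for one section.

    responses_by_respondent maps a respondent id to the list of coded
    answers that respondent has for the section's questions.
    """
    counts = Counter()
    for answers in responses_by_respondent.values():
        for value in answers:
            if value in (VALID_SKIP, NONINTERVIEW):
                continue  # not answerable, so excluded from the denominator
            counts["asked"] += 1
            if value == DONT_KNOW:
                counts["dont_know"] += 1
            elif value == REFUSAL:
                counts["refused"] += 1
            elif value == INVALID_SKIP:
                counts["invalid_skip"] += 1
    return counts


# Hypothetical data: three respondents, four questions each.
section = {
    "r1": [1, 0, -4, 3],
    "r2": [-2, 5, -3, -4],
    "r3": [-5, -5, -5, -5],  # noninterview: contributes nothing
}
totals = tally_section(section)
print(dict(totals))
```

Note that don't knows, refusals, and invalid skips stay in the denominator: they are questions that should have been answered but were not.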
The third (don't know), fourth (refusal), and fifth (invalid skip) columns show the total number of nonresponses found in each section. Columns six, seven, and eight show the same information except in percentage form. The ninth column shows the total percentage of questions missed and is the sum of the previous three percentages. The last column, labeled rank, shows which sections have the most (closer to 1) and least (further from 1) amount of nonresponse.
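As a rough illustration of how the percentage and rank columns follow from the counts, the sketch below recomputes them for three rows of Table 1. The function and field names are hypothetical, and the ranks it prints are relative to these three rows only, not to all 21 sections of the 1979 survey.

```python
def add_rates_and_rank(rows):
    """Add percentage columns and a rank (1 = most nonresponse) to each row."""
    out = []
    for row in rows:
        asked = row["asked"]
        missed = row["dont_know"] + row["refused"] + row["invalid_skip"]
        out.append({
            **row,
            "pct_dont_know": 100 * row["dont_know"] / asked,
            "pct_refused": 100 * row["refused"] / asked,
            "pct_invalid_skip": 100 * row["invalid_skip"] / asked,
            "pct_total": 100 * missed / asked,
        })
    # Rank by total percentage missed, highest first.
    ordered = sorted(out, key=lambda r: r["pct_total"], reverse=True)
    for rank, r in enumerate(ordered, start=1):
        r["rank"] = rank
    return out


# Counts copied from three rows of Table 1 (1979).
rows_1979 = [
    {"name": "Military", "asked": 145619, "dont_know": 491,
     "refused": 24, "invalid_skip": 10885},
    {"name": "On Jobs", "asked": 230982, "dont_know": 135,
     "refused": 2, "invalid_skip": 903},
    {"name": "Work Experience", "asked": 67695, "dont_know": 288,
     "refused": 15, "invalid_skip": 9476},
]
ranked = add_rates_and_rank(rows_1979)
for r in ranked:
    print(f'{r["name"]}: {r["pct_total"]:.2f}% missed, rank {r["rank"]}')
```

Computing the total from the raw counts, rather than adding three already-rounded percentages, avoids small rounding differences in the last digit.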
The bottom row of each table combines the information and shows totals. For example, the bottom of the "# Questions Asked" column in the 1979 table shows that almost four million questions (3,975,146) were expected to be filled in by respondents or interviewers. While the 1979 survey contains many questions, other years are not far behind. The 1984 survey had about 3.1 million questions, 1989 had 1.8 million, 1994 had 3.7 million, 1998 had 4.1 million, and 2004 had 3.7 million. Readers are cautioned that each year of NLSY79 data contains far more data points than these totals suggest, because the tables exclude questions obviously labeled as machine checks, date and time stamps, and questions with valid skip or noninterview data flags.
The six tables show that the overall rate of missing data dropped steadily for many years. In 1979, 2.7 percent of the questions in the survey were not answered. This number drops to 1.9 percent in 1984, falls to 0.9 percent in 1989, and reaches a low point of 0.7 percent in 1994. After 1994, the rate rises again, to 0.92 percent in 1998 and 1.42 percent in 2004. Hence, nonresponse problems are of somewhat less concern after the initial round of surveying.
Combining the data from all sections in all the tables shows that the majority of nonresponse is caused by don't knows and invalid skips. The surveys examined asked a total of about 20 million questions. Of these questions, more than 140,000, or 0.7 percent, were don't knows, and slightly more than 127,000, or 0.6 percent, were invalid skips. The last category, refusals, contains about 26,000 questions, which is roughly 0.1 percent of all questions asked.
Examining the tables over time shows a steady decrease through 1998 in the amount of data missing due to invalid skips. In 1979, invalid skips accounted for 2.1 percent of the questions asked. This number dropped sharply to 1.2 percent by 1984 and then to 0.25 percent by 1989. CAPI dramatically reduced the problem of invalid skips, with only 57 questions out of almost 3.7 million incorrectly skipped in 1994 and 75 questions out of about 4.1 million in 1998. Invalid skips rose to 0.07 percent in 2004 but remained far below the pre-CAPI levels.
While invalid skips fall over time, the percentage of refusals has increased slightly. Refusals accounted for 0.01 percent in 1979, 0.07 percent in 1984, 0.10 percent in 1989, 0.16 percent in 1994, 0.19 percent in 1998, and 0.20 percent in 2004. Nevertheless, while refusals steadily increase over time, in absolute terms the numbers are still quite small.
While invalid skips fall and refusals rise over time, the trend in don't knows is more complex. Don't knows accounted for 0.6 percent in 1979, 0.6 percent in 1984, 0.5 percent in 1989, 0.5 percent in 1994, 0.7 percent in 1998, and 1.1 percent in 2004. These figures suggest that don't knows follow a roughly U-shaped pattern over time.
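The pooled figures and the year-by-year trends described in the preceding paragraphs can be rechecked from the Total rows of Tables 1 through 6. The sketch below is only a worked calculation under that assumption; the variable names are illustrative, and the counts are copied from the tables.

```python
# Total rows of Tables 1-6:
# year -> (questions asked, don't knows, refusals, invalid skips)
TOTALS = {
    1979: (3975146, 22140, 596, 83073),
    1984: (3067473, 18209, 2030, 36918),
    1989: (1793721, 9062, 1883, 4426),
    1994: (3669260, 19670, 6036, 57),
    1998: (4078125, 29968, 7548, 75),
    2004: (3716570, 42718, 7494, 2705),
}


def pct(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole


# Rates for each survey year.
by_year = {
    year: {
        "dont_know": pct(dk, asked),
        "refused": pct(ref, asked),
        "invalid_skip": pct(inv, asked),
        "total": pct(dk + ref + inv, asked),
    }
    for year, (asked, dk, ref, inv) in TOTALS.items()
}

# Pooled rates across all six surveys: sum each column, then divide.
asked_all, dk_all, ref_all, inv_all = (sum(col) for col in zip(*TOTALS.values()))
pooled = {
    "dont_know": pct(dk_all, asked_all),
    "refused": pct(ref_all, asked_all),
    "invalid_skip": pct(inv_all, asked_all),
}

for year, rates in by_year.items():
    print(year, {name: round(value, 2) for name, value in rates.items()})
print("pooled", asked_all, {name: round(value, 1) for name, value in pooled.items()})
```

Pooling sums the counts before dividing, so larger surveys carry more weight than they would in a simple average of the six yearly percentages.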
The last column, labeled rank, shows that missing data are not confined to a single section or area of the survey. Table 1 shows that in 1979 the work experience section, with 14.5 percent of its questions missing valid data, had the most problems. Fourteen percent of all questions asked in this section are labeled as invalid skips, and only 0.5 percent of the questions were either refusals or don't knows. Military experience, the second most problematic section, had roughly half the rate of missing data (7.8 percent) of work experience. The table shows that the problem of invalid skips is not related to subject matter, since the section with the fewest problems (rank 21 out of 21), titled "On Jobs," also focuses on labor market issues, like work experience.
While the "On Jobs" section of the survey consistently has among the fewest problems in these surveys, the section with the most problems changes. Table 2, which examines the 1984 survey, shows the most problems in the "Fertility" section. Of the almost half-million questions asked in the fertility section, 5.6 percent contain missing data. While the majority of problems (3.4 percent) were due to invalid skips, a surprisingly large 2 percent of the missing responses are don't knows. The second most problematic section in the 1984 survey was "Drug Use," where 2.7 percent of the questions have missing data. As with "Fertility," the major portion of the problem is invalid skips (1.8 percent), but don't knows (0.8 percent) also account for a significant share. Interestingly, refusals account for only 0.1 percent, a relatively small proportion for a sensitive topic, suggesting that some of the don't knows were hidden refusals.
Table 1. Extent of Refusals, Don't Knows and Invalid Skips in 1979
Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
Family Background | 660803 | 6196 | 90 | 12292 | 0.94% | 0.01% | 1.86% | 2.81% | 7 |
Marital Status | 32995 | 131 | 25 | 467 | 0.40% | 0.08% | 1.42% | 1.89% | 14 |
Fertility | 82141 | 679 | 23 | 624 | 0.83% | 0.03% | 0.76% | 1.61% | 17 |
Schooling | 402134 | 994 | 14 | 5592 | 0.25% | 0.00% | 1.39% | 1.64% | 16 |
Pay | 211504 | 22 | 0 | 3482 | 0.01% | 0.00% | 1.65% | 1.66% | 15 |
World of Work | 220185 | 2220 | 31 | 2883 | 1.01% | 0.01% | 1.31% | 2.33% | 10 |
Military | 145619 | 491 | 24 | 10885 | 0.34% | 0.02% | 7.47% | 7.83% | 2 |
CPS | 396697 | 862 | 8 | 10969 | 0.22% | 0.00% | 2.77% | 2.98% | 5 |
On Jobs | 230982 | 135 | 2 | 903 | 0.06% | 0.00% | 0.39% | 0.45% | 21 |
Employer Supp. | 291836 | 2009 | 69 | 3575 | 0.69% | 0.02% | 1.23% | 1.94% | 13 |
Last Job | 44504 | 31 | 0 | 261 | 0.07% | 0.00% | 0.59% | 0.66% | 20 |
Work Experience | 67695 | 288 | 15 | 9476 | 0.43% | 0.02% | 14.00% | 14.45% | 1 |
Gov. Training | 36728 | 62 | 28 | 2124 | 0.17% | 0.08% | 5.78% | 6.03% | 3 |
Other Training | 103662 | 52 | 0 | 2936 | 0.05% | 0.00% | 2.83% | 2.88% | 6 |
Not at Work | 90768 | 79 | 7 | 5019 | 0.09% | 0.01% | 5.53% | 5.62% | 4 |
Health | 67869 | 358 | 2 | 545 | 0.53% | 0.00% | 0.80% | 1.33% | 18 |
Significant Others | 58816 | 669 | 0 | 585 | 1.14% | 0.00% | 0.99% | 2.13% | 12 |
Residences | 52845 | 94 | 7 | 1029 | 0.18% | 0.01% | 1.95% | 2.14% | 11 |
Rotter Scale | 202976 | 1277 | 15 | 521 | 0.63% | 0.01% | 0.26% | 0.89% | 19 |
Income & Assets | 321685 | 1667 | 216 | 6813 | 0.52% | 0.07% | 2.12% | 2.70% | 8 |
Expectations | 252702 | 3824 | 20 | 2092 | 1.51% | 0.01% | 0.83% | 2.35% | 9 |
Total | 3975146 | 22140 | 596 | 83073 | 0.56% | 0.01% | 2.09% | 2.66% | - |
Table 2. Extent of Refusals, Don't Knows and Invalid Skips in 1984
Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
Calendar | 88462 | 8 | 0 | 4 | 0.01% | 0.00% | 0.00% | 0.01% | 15 |
Marital Status | 50206 | 273 | 18 | 561 | 0.54% | 0.04% | 1.12% | 1.70% | 4 |
Schooling | 324139 | 1031 | 469 | 2164 | 0.32% | 0.14% | 0.67% | 1.13% | 9 |
Military | 123126 | 337 | 41 | 1352 | 0.27% | 0.03% | 1.10% | 1.41% | 7 |
CPS | 333267 | 467 | 5 | 4270 | 0.14% | 0.00% | 1.28% | 1.42% | 6 |
On Jobs | 140382 | 0 | 0 | 17 | 0.00% | 0.00% | 0.01% | 0.01% | 16 |
Gaps in Jobs | 120601 | 15 | 0 | 175 | 0.01% | 0.00% | 0.15% | 0.16% | 13 |
Gov. Training | 31226 | 38 | 0 | 59 | 0.12% | 0.00% | 0.19% | 0.31% | 12 |
Other Training | 45002 | 7 | 0 | 736 | 0.02% | 0.00% | 1.64% | 1.65% | 5 |
Fertility | 462288 | 9141 | 891 | 15739 | 1.98% | 0.19% | 3.40% | 5.57% | 1 |
Child Care | 114317 | 201 | 13 | 1157 | 0.18% | 0.01% | 1.01% | 1.20% | 8 |
Health | 52866 | 35 | 3 | 29 | 0.07% | 0.01% | 0.05% | 0.13% | 14 |
Alcohol | 314511 | 33 | 47 | 2234 | 0.01% | 0.01% | 0.71% | 0.74% | 11 |
Drug Use | 414007 | 3464 | 300 | 7454 | 0.84% | 0.07% | 1.80% | 2.71% | 2 |
Income & Assets | 439646 | 2945 | 241 | 938 | 0.67% | 0.05% | 0.21% | 0.94% | 10 |
Attitudes | 13427 | 214 | 2 | 29 | 1.59% | 0.01% | 0.22% | 1.82% | 3 |
Total | 3067473 | 18209 | 2030 | 36918 | 0.59% | 0.07% | 1.20% | 1.86% | - |
Table 3 shows the amount of nonresponse in the 1989 survey. The most problematic section is "Income," which is missing data in 1.3 percent of its questions, with the CPS section a close second at 1.2 percent. Unlike in earlier years, the major missing data problem in both the "Income" (1 percent) and CPS (0.8 percent) sections is don't knows, not invalid skips (0.1 percent for income and 0.4 percent for CPS).
Table 3. Extent of Refusals, Don't Knows and Invalid Skips in 1989
Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
Intro. | 14647 | 20 | 1 | 41 | 0.14% | 0.01% | 0.28% | 0.42% | 7 |
Marital | 86563 | 372 | 121 | 450 | 0.43% | 0.14% | 0.52% | 1.09% | 3 |
Schooling | 76999 | 179 | 39 | 217 | 0.23% | 0.05% | 0.28% | 0.56% | 6 |
Military | 33579 | 1 | 1 | 40 | 0.00% | 0.00% | 0.12% | 0.13% | 10 |
CPS | 406265 | 3320 | 52 | 1650 | 0.82% | 0.01% | 0.41% | 1.24% | 2 |
On Jobs | 39749 | 0 | 0 | 1 | 0.00% | 0.00% | 0.00% | 0.00% | 12 |
Gaps | 91565 | 91 | 1 | 894 | 0.10% | 0.00% | 0.98% | 1.08% | 4 |
Gov. Training | 49657 | 118 | 35 | 233 | 0.24% | 0.07% | 0.47% | 0.78% | 5 |
Fertility | 152546 | 6 | 35 | 92 | 0.00% | 0.02% | 0.06% | 0.09% | 11 |
Health | 154024 | 120 | 74 | 168 | 0.08% | 0.05% | 0.11% | 0.24% | 9 |
Alcohol | 217441 | 74 | 400 | 201 | 0.03% | 0.18% | 0.09% | 0.31% | 8 |
Income | 470686 | 4761 | 1124 | 439 | 1.01% | 0.24% | 0.09% | 1.34% | 1 |
Total | 1793721 | 9062 | 1883 | 4426 | 0.51% | 0.10% | 0.25% | 0.86% | - |
Table 4 shows that the most problematic area in the 1994 survey is the asset questions, which are missing 2.5 percent of their answers (roughly three-quarters of those missing being don't knows). The second most problematic area is the income questions, which are missing 1.3 percent of their answers. While refusal rates were not an issue in the three previous surveys, the 1994 survey shows refusals becoming significant. Slightly more than half a percent (0.6 percent) of the "Assets" section questions and more than one-fifth of a percent (0.2 percent) of the "Income" section questions were refused.
Table 4. Extent of Refusals, Don't Knows and Invalid Skips in 1994
Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
Intro. | 36251 | 62 | 14 | 0 | 0.17% | 0.04% | 0.00% | 0.21% | 12 |
Marital Status | 137540 | 1522 | 193 | 0 | 1.11% | 0.14% | 0.00% | 1.25% | 3 |
School | 60166 | 302 | 2 | 0 | 0.50% | 0.00% | 0.00% | 0.51% | 7 |
Military | 27372 | 6 | 1 | 0 | 0.02% | 0.00% | 0.00% | 0.03% | 15 |
CPS | 269452 | 28 | 9 | 0 | 0.01% | 0.00% | 0.00% | 0.01% | 17 |
On Jobs | 79567 | 6 | 7 | 0 | 0.01% | 0.01% | 0.00% | 0.02% | 16 |
Employ. Suppl. | 1060679 | 7092 | 1342 | 8 | 0.67% | 0.13% | 0.00% | 0.80% | 5 |
Training | 194147 | 246 | 29 | 47 | 0.13% | 0.01% | 0.02% | 0.17% | 13 |
Fertility | 450871 | 1859 | 763 | 0 | 0.41% | 0.17% | 0.00% | 0.58% | 6 |
Child Care | 26453 | 109 | 12 | 0 | 0.41% | 0.05% | 0.00% | 0.46% | 9 |
Relationship | 81477 | 285 | 113 | 0 | 0.35% | 0.14% | 0.00% | 0.49% | 8 |
Health | 282702 | 623 | 199 | 0 | 0.22% | 0.07% | 0.00% | 0.29% | 11 |
Alcohol | 164663 | 46 | 61 | 0 | 0.03% | 0.04% | 0.00% | 0.06% | 14 |
Income | 305693 | 3176 | 672 | 1 | 1.04% | 0.22% | 0.00% | 1.26% | 2 |
Prog. Participation | 118305 | 297 | 63 | 0 | 0.25% | 0.05% | 0.00% | 0.30% | 10 |
Assets | 169301 | 3239 | 930 | 1 | 1.91% | 0.55% | 0.00% | 2.46% | 1 |
Drugs | 204621 | 772 | 1626 | 0 | 0.38% | 0.79% | 0.00% | 1.17% | 4 |
Total | 3669260 | 19670 | 6036 | 57 | 0.54% | 0.16% | 0.00% | 0.70% | - |
Table 5 examines the 1998 survey. Because the survey was fielded every other year in the late 1990s, there is no 1999 interview, which would exactly continue the five-year pattern. The 1998 survey is used as the closest substitute. This table, like the one for 1994, shows that the most problematic area is again the asset questions, which are missing 3.6 percent of their answers (about 75 percent of those missing being don't knows). The second most problematic area is the marital history questions, which added a new section that asked detailed questions about the work history and past life of the respondent's spouse. This expanded section is missing 1.8 percent of its answers. In the 1998 survey only two sections have relatively high refusal rates: assets (0.90 percent) and drug use (0.68 percent).
Table 5. Extent of Refusals, Don't Knows and Invalid Skips in 1998
Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
Intro. | 10060 | 6 | 4 | 0 | 0.06% | 0.04% | 0.00% | 0.10% | 12 |
Marital Status | 207805 | 3296 | 520 | 1 | 1.59% | 0.25% | 0.00% | 1.84% | 2 |
School | 53928 | 197 | 45 | 0 | 0.37% | 0.08% | 0.00% | 0.45% | 10 |
Military | 25691 | 0 | 0 | 0 | 0.00% | 0.00% | 0.00% | 0.00% | 15 |
CPS | 301160 | 44 | 12 | 0 | 0.01% | 0.00% | 0.00% | 0.02% | 13 |
On Jobs | 117144 | 2 | 0 | 1 | 0.00% | 0.00% | 0.00% | 0.00% | 14 |
Employ. Suppl. | 1081493 | 10265 | 1441 | 1 | 0.95% | 0.13% | 0.00% | 1.08% | 3 |
Training | 241013 | 1559 | 143 | 1 | 0.65% | 0.06% | 0.00% | 0.71% | 7 |
Fertility | 578831 | 3180 | 1097 | 50 | 0.55% | 0.19% | 0.01% | 0.75% | 6 |
Child Care | 23241 | 57 | 11 | 1 | 0.25% | 0.05% | 0.00% | 0.30% | 11 |
Relationship | 86632 | 371 | 154 | 0 | 0.43% | 0.18% | 0.00% | 0.61% | 9 |
Health | 350533 | 2460 | 223 | 0 | 0.70% | 0.06% | 0.00% | 0.77% | 5 |
Income | 608849 | 3410 | 847 | 10 | 0.56% | 0.14% | 0.00% | 0.70% | 8 |
Assets | 174570 | 4702 | 1566 | 10 | 2.69% | 0.90% | 0.01% | 3.60% | 1 |
Drugs | 217175 | 419 | 1485 | 0 | 0.19% | 0.68% | 0.00% | 0.88% | 4 |
Total | 4078125 | 29968 | 7548 | 75 | 0.73% | 0.19% | 0.00% | 0.92% | - |
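The share of missing answers that are don't knows, quoted above for the asset questions, can be checked directly against the Assets rows of Tables 4 and 5. The helper below is a hypothetical illustration of that arithmetic, not part of the original analysis.

```python
def dont_know_share(dont_know, refused, invalid_skip):
    """Return don't knows as a percentage of all missing answers."""
    missing = dont_know + refused + invalid_skip
    if missing == 0:
        return 0.0  # no missing answers, so the share is defined as zero
    return 100 * dont_know / missing


# Counts copied from the Assets rows of Table 4 (1994) and Table 5 (1998).
assets_1994 = dont_know_share(3239, 930, 1)
assets_1998 = dont_know_share(4702, 1566, 10)

print(f"1994 Assets: {assets_1994:.1f}% of missing answers are don't knows")
print(f"1998 Assets: {assets_1998:.1f}% of missing answers are don't knows")
```

Both results are close to three-quarters, with the 1994 share slightly above that mark and the 1998 share almost exactly on it.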
Table 6 examines the 2004 survey. This survey has two new sections that are not seen in the previous tables. The first is found in the employer supplement and asks respondents detailed questions about the pensions available from their employer and their participation in these pensions. This new section is ranked first in problems and has missing responses to 2.5 percent of all questions. The second new section is the over-40 health module. The goal of this section is to provide researchers with a baseline health measure that will be updated at ten-year intervals. The health section is ranked 8th out of 13 sections and has a nonresponse rate of slightly more than three-quarters of one percent.
Table 6. Extent of Refusals, Don't Knows and Invalid Skips in 2004
Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
Intro. | 91277 | 39 | 16 | 4 | 0.04% | 0.02% | 0.00% | 0.06% | 12 |
Marital Status | 77954 | 371 | 66 | 106 | 0.48% | 0.08% | 0.14% | 0.70% | 9 |
School | 56716 | 554 | 39 | 4 | 0.98% | 0.07% | 0.01% | 1.05% | 7 |
Military | 39772 | 20 | 5 | 0 | 0.05% | 0.01% | 0.00% | 0.06% | 13 |
Employ. Suppl. | 734366 | 7729 | 1001 | 275 | 1.05% | 0.15% | 0.04% | 1.23% | 6 |
Pensions | 189861 | 3753 | 508 | 485 | 1.98% | 0.27% | 0.26% | 2.50% | 1 |
Training | 307708 | 2943 | 887 | 322 | 0.96% | 0.29% | 0.10% | 1.35% | 5 |
Fertility | 521658 | 5801 | 733 | 1216 | 1.11% | 0.14% | 0.23% | 1.49% | 3 |
Child Care | 34561 | 12 | 4 | 7 | 0.03% | 0.01% | 0.02% | 0.07% | 11 |
Relationship | 1004 | 2 | 0 | 0 | 0.20% | 0.00% | 0.00% | 0.20% | 10 |
Over 40 Health | 622644 | 4386 | 402 | 14 | 0.70% | 0.06% | 0.00% | 0.77% | 8 |
Income | 412656 | 4382 | 1199 | 39 | 1.06% | 0.29% | 0.01% | 1.36% | 4 |
Assets | 626393 | 12726 | 2634 | 233 | 2.03% | 0.42% | 0.04% | 2.49% | 2 |
Total | 3716570 | 42718 | 7494 | 2705 | 1.15% | 0.20% | 0.07% | 1.42% | - |