Item Nonresponse by Section

This section quantifies the extent of missing data, formally called item nonresponse, in each section of the NLSY79. The six tables below show which areas of the NLSY79 respondents are least likely to answer by tracking the total number and percentage of questions that have missing data in each section of the questionnaire. To give readers a detailed view of this problem, six surveys are analyzed: the 1979 survey and then surveys at roughly five-year intervals (1984, 1989, 1994, 1998, and 2004). These years were chosen to capture the major changes in the NLSY79. The 1979 survey shows the initial levels of nonresponse. The 1984 survey shows the amount of nonresponse just before one part of the respondent pool was dropped. The 1989 data show nonresponse after the first set of NLSY79 respondents was dropped. The 1994 data show what occurred after respondents and interviewers switched from paper-and-pencil interviewing (PAPI) to computer-assisted personal interviewing (CAPI). While no major survey changes occurred in the 1998 and 2004 surveys, these surveys show nonresponse rates after many respondents had participated around 20 times.

The first column of the tables contains the section names within the survey. The second column shows the total number of questions that respondents and interviewers should have answered in that section. This number is determined by first counting, within each section, the number of questions each respondent should have answered. A question is considered answerable if it does not have a valid skip (-4) or noninterview (-5) as its answer. The section total is obtained by summing these counts across all NLSY79 respondents.
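The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used to build the tables: the function name `count_answerable`, the data layout (one list of coded answers per respondent), and the sample values are all hypothetical. Only the -4 and -5 codes come from the text.

```python
# Codes named in the text: these answers mean the question was not answerable.
VALID_SKIP = -4
NONINTERVIEW = -5

def count_answerable(section_answers):
    """Count questions that should have been answered in one section.

    section_answers maps a respondent ID to that respondent's list of
    coded answers for the section (hypothetical layout for illustration).
    """
    return sum(
        1
        for answers in section_answers.values()
        for value in answers
        if value not in (VALID_SKIP, NONINTERVIEW)
    )

# Made-up data: three respondents, four questions each.
example_section = {
    1: [3, -4, 1, -2],     # one valid skip, so three answerable questions
    2: [-5, -5, -5, -5],   # not interviewed, so zero answerable questions
    3: [2, 5, -3, -1],     # four answerable questions
}

section_total = count_answerable(example_section)
print(section_total)
```

Note that the negative codes other than -4 and -5 still count as answerable: those are exactly the questions that end up in the nonresponse columns.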

The third (don't know), fourth (refusal), and fifth (invalid skip) columns show the total number of nonresponses of each type found in each section. Columns six, seven, and eight show the same information as a percentage of the questions asked. The ninth column shows the total percentage of questions missed and is the sum of the previous three percentages. The last column, labeled rank, shows which sections have the most (closer to 1) and the least (further from 1) nonresponse.
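As a rough check on how the percentage and rank columns relate to the counts, the sketch below recomputes them for three rows copied from Table 1. The function names and the dictionary layout are illustrative assumptions. The ranks it produces apply only within this three-row subset, not to the full 21-section table.

```python
# Counts copied from three rows of Table 1 (1979):
# section -> (questions asked, don't knows, refusals, invalid skips)
counts = {
    "Work Experience": (67695, 288, 15, 9476),
    "Military": (145619, 491, 24, 10885),
    "On Jobs": (230982, 135, 2, 903),
}

def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

def summarize(counts):
    """Build the percentage columns and the rank for each section."""
    rows = {}
    for name, (asked, dont_know, refused, invalid) in counts.items():
        rows[name] = {
            "pct_dont_know": percent(dont_know, asked),
            "pct_refused": percent(refused, asked),
            "pct_invalid": percent(invalid, asked),
            "pct_missed": percent(dont_know + refused + invalid, asked),
        }
    # Rank 1 goes to the section with the highest total percentage missed.
    ordered = sorted(rows, key=lambda n: rows[n]["pct_missed"], reverse=True)
    for rank, name in enumerate(ordered, start=1):
        rows[name]["rank"] = rank
    return rows

summary = summarize(counts)
for name, row in summary.items():
    print(f"{name}: {row['pct_missed']:.2f}% missed, rank {row['rank']}")
```

Here the total percentage is computed from the summed counts rather than by adding three rounded percentages, so it can differ from a sum of the printed columns in the last decimal place.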

The bottom row of each table combines the information and shows totals. For example, the bottom of the "# Questions Asked" column in the 1979 survey shows that almost four million questions (3,975,146) were expected to be filled in by respondents or interviewers. While the 1979 survey contains many questions, other years are not far behind. The 1984 survey had 3 million questions, 1989 had 1.8 million, 1994 had 3.7 million, 1998 had 4.1 million, and 2004 had 3.7 million. Readers are cautioned that each year of NLSY79 data contains far more data points, since the tables exclude questions obviously labeled as machine checks, date and time stamps, and questions with valid skip or noninterview data flags.

The six tables show that the overall rate of missing data dropped steadily through 1994 and then rose again. In 1979, 2.7 percent of the questions in the survey were not answered. This number drops to 1.9 percent in 1984, falls to 0.9 percent in 1989, and reaches a low point of 0.7 percent in 1994. After 1994 the number rises again, to 0.92 percent in 1998 and 1.42 percent in 2004. Hence, nonresponse problems are of slightly less concern after the initial round of surveying.

Combining the data from all sections in all the tables shows that the majority of nonresponse is caused by don't knows and invalid skips. The surveys examined asked a total of about 20 million questions. Of these questions, more than 140,000, or 0.7 percent, were don't knows and slightly more than 127,000, or 0.6 percent, were invalid skips. The last category, refusal, contains about 26,000 questions, which is roughly 0.1 percent of all questions asked.
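The pooled figures can be reproduced from the Total rows of the six tables. The sketch below is only an arithmetic check; the variable names are illustrative, and the counts are copied from the bottom row of each table.

```python
# Total rows of Tables 1-6:
# year -> (questions asked, don't knows, refusals, invalid skips)
totals = {
    1979: (3975146, 22140, 596, 83073),
    1984: (3067473, 18209, 2030, 36918),
    1989: (1793721, 9062, 1883, 4426),
    1994: (3669260, 19670, 6036, 57),
    1998: (4078125, 29968, 7548, 75),
    2004: (3716570, 42718, 7494, 2705),
}

# Sum each column across the six survey years.
asked, dont_know, refused, invalid = (
    sum(column) for column in zip(*totals.values())
)

print(f"Questions asked: {asked:,}")
for label, count in [
    ("Don't knows", dont_know),
    ("Refusals", refused),
    ("Invalid skips", invalid),
]:
    print(f"{label}: {count:,} ({100.0 * count / asked:.1f}%)")
```

The pooled count of questions asked comes to 20,300,295, with 141,767 don't knows, 25,587 refusals, and 127,254 invalid skips, which matches the rounded figures in the paragraph above.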

Examining the tables over time shows a steady decrease in the amount of data missing due to invalid skips. In 1979, invalid skips accounted for 2.1 percent of the questions asked. This number dropped sharply to 1.2 percent by 1984 and then down to 0.25 percent by 1989. Analysis indicated that CAPI dramatically lowered the problem of invalid skips with only 57 questions out of almost 3.7 million incorrectly skipped in 1994 and 75 questions out of 4 million in 1998.

While invalid skips fall over time, the percentage of refusals increases slightly. Refusals accounted for 0.01 percent in 1979, 0.07 percent in 1984, 0.10 percent in 1989, 0.16 percent in 1994, 0.19 percent in 1998, and 0.20 percent in 2004. Nevertheless, while refusals steadily increase over time, in absolute terms the numbers are still quite small.

While invalid skips fall and refusals rise over time, the trend in don't knows is more complex. Don't knows accounted for 0.6 percent in 1979, 0.6 percent in 1984, 0.5 percent in 1989, 0.5 percent in 1994, 0.7 percent in 1998, and 1.1 percent in 2004. These figures suggest that don't knows follow a U-shaped pattern over time.
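The three trends can be laid side by side by computing each category's rate per survey year from the Total rows of the tables. This is an illustrative sketch with assumed names (`rates`, `lowest_dk_year`), not part of the original analysis.

```python
# Total rows of Tables 1-6:
# year -> (questions asked, don't knows, refusals, invalid skips)
totals = {
    1979: (3975146, 22140, 596, 83073),
    1984: (3067473, 18209, 2030, 36918),
    1989: (1793721, 9062, 1883, 4426),
    1994: (3669260, 19670, 6036, 57),
    1998: (4078125, 29968, 7548, 75),
    2004: (3716570, 42718, 7494, 2705),
}

# Percentage of questions asked that fall in each nonresponse category.
rates = {
    year: {
        "dont_know": 100.0 * dk / asked,
        "refused": 100.0 * ref / asked,
        "invalid": 100.0 * inv / asked,
    }
    for year, (asked, dk, ref, inv) in totals.items()
}

for year, r in rates.items():
    print(
        f"{year}: don't know {r['dont_know']:.2f}%, "
        f"refused {r['refused']:.2f}%, invalid skip {r['invalid']:.2f}%"
    )

# The bottom of the U: the year with the lowest don't know rate.
lowest_dk_year = min(rates, key=lambda y: rates[y]["dont_know"])
print("Lowest don't know rate:", lowest_dk_year)
```

At two decimal places the don't know rate bottoms out in 1989 and is highest in 2004, consistent with the U shape described above.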

The last column, labeled rank, shows that missing data are not confined to a single section or area of the survey. Table 1 shows that in 1979 the work experience section, with 14.5 percent of the questions missing valid data, had the most problems. Fourteen percent of all questions asked in this section are labeled as invalid skips, and only 0.5 percent of the questions were either refusals or don't knows. Military experience, the second most problematic section, had roughly half the rate of missing data (7.8 percent) of work experience. The table suggests that the problem of invalid skips is not related to subject matter, since "On Jobs," the section with the fewest problems (rank 21 out of 21), focuses on labor market issues just as the work experience section does.

While the "On Jobs" section of the survey is consistently at or near the bottom of the rankings in these surveys, the section with the most problems changes. Table 2, which examines the 1984 survey, shows the most problems in the "Fertility" section. Of the almost half-million questions asked in the fertility section, 5.6 percent contain missing data. While the majority of problems (3.4 percent) were due to invalid skips, a surprisingly large 2 percent of the questions were answered with don't knows. The second most problematic section in the 1984 survey was "Drug Use," where 2.7 percent of the questions have missing data. Like "Fertility," the major portion of the problem is invalid skips (1.8 percent), but don't knows (0.8 percent) also account for a significant share. Interestingly, refusals account for only 0.1 percent, a relatively small proportion for a sensitive topic, suggesting that some of the don't knows were hidden refusals.

Table 1. Extent of Refusals, Don't Knows and Invalid Skips in 1979

| Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
|---|---|---|---|---|---|---|---|---|---|
| Family Background | 660803 | 6196 | 90 | 12292 | 0.94% | 0.01% | 1.86% | 2.81% | 7 |
| Marital Status | 32995 | 131 | 25 | 467 | 0.40% | 0.08% | 1.42% | 1.89% | 14 |
| Fertility | 82141 | 679 | 23 | 624 | 0.83% | 0.03% | 0.76% | 1.61% | 17 |
| Schooling | 402134 | 994 | 14 | 5592 | 0.25% | 0.00% | 1.39% | 1.64% | 16 |
| Pay | 211504 | 22 | 0 | 3482 | 0.01% | 0.00% | 1.65% | 1.66% | 15 |
| World of Work | 220185 | 2220 | 31 | 2883 | 1.01% | 0.01% | 1.31% | 2.33% | 10 |
| Military | 145619 | 491 | 24 | 10885 | 0.34% | 0.02% | 7.47% | 7.83% | 2 |
| CPS | 396697 | 862 | 8 | 10969 | 0.22% | 0.00% | 2.77% | 2.98% | 5 |
| On Jobs | 230982 | 135 | 2 | 903 | 0.06% | 0.00% | 0.39% | 0.45% | 21 |
| Employer Supp. | 291836 | 2009 | 69 | 3575 | 0.69% | 0.02% | 1.23% | 1.94% | 13 |
| Last Job | 44504 | 31 | 0 | 261 | 0.07% | 0.00% | 0.59% | 0.66% | 20 |
| Work Experience | 67695 | 288 | 15 | 9476 | 0.43% | 0.02% | 14.00% | 14.45% | 1 |
| Gov. Training | 36728 | 62 | 28 | 2124 | 0.17% | 0.08% | 5.78% | 6.03% | 3 |
| Other Training | 103662 | 52 | 0 | 2936 | 0.05% | 0.00% | 2.83% | 2.88% | 6 |
| Not at Work | 90768 | 79 | 7 | 5019 | 0.09% | 0.01% | 5.53% | 5.62% | 4 |
| Health | 67869 | 358 | 2 | 545 | 0.53% | 0.00% | 0.80% | 1.33% | 18 |
| Significant Others | 58816 | 669 | 0 | 585 | 1.14% | 0.00% | 0.99% | 2.13% | 12 |
| Residences | 52845 | 94 | 7 | 1029 | 0.18% | 0.01% | 1.95% | 2.14% | 11 |
| Rotter Scale | 202976 | 1277 | 15 | 521 | 0.63% | 0.01% | 0.26% | 0.89% | 19 |
| Income & Assets | 321685 | 1667 | 216 | 6813 | 0.52% | 0.07% | 2.12% | 2.70% | 8 |
| Expectations | 252702 | 3824 | 20 | 2092 | 1.51% | 0.01% | 0.83% | 2.35% | 9 |
| Total | 3975146 | 22140 | 596 | 83073 | 0.56% | 0.01% | 2.09% | 2.66% | - |

Table 2. Extent of Refusals, Don't Knows and Invalid Skips in 1984

| Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
|---|---|---|---|---|---|---|---|---|---|
| Calendar | 88462 | 8 | 0 | 4 | 0.01% | 0.00% | 0.00% | 0.01% | 15 |
| Marital Status | 50206 | 273 | 18 | 561 | 0.54% | 0.04% | 1.12% | 1.70% | 4 |
| Schooling | 324139 | 1031 | 469 | 2164 | 0.32% | 0.14% | 0.67% | 1.13% | 9 |
| Military | 123126 | 337 | 41 | 1352 | 0.27% | 0.03% | 1.10% | 1.41% | 7 |
| CPS | 333267 | 467 | 5 | 4270 | 0.14% | 0.00% | 1.28% | 1.42% | 6 |
| On Jobs | 140382 | 0 | 0 | 17 | 0.00% | 0.00% | 0.01% | 0.01% | 16 |
| Gaps in Jobs | 120601 | 15 | 0 | 175 | 0.01% | 0.00% | 0.15% | 0.16% | 13 |
| Gov. Training | 31226 | 38 | 0 | 59 | 0.12% | 0.00% | 0.19% | 0.31% | 12 |
| Other Training | 45002 | 7 | 0 | 736 | 0.02% | 0.00% | 1.64% | 1.65% | 5 |
| Fertility | 462288 | 9141 | 891 | 15739 | 1.98% | 0.19% | 3.40% | 5.57% | 1 |
| Child Care | 114317 | 201 | 13 | 1157 | 0.18% | 0.01% | 1.01% | 1.20% | 8 |
| Health | 52866 | 35 | 3 | 29 | 0.07% | 0.01% | 0.05% | 0.13% | 14 |
| Alcohol | 314511 | 33 | 47 | 2234 | 0.01% | 0.01% | 0.71% | 0.74% | 11 |
| Drug Use | 414007 | 3464 | 300 | 7454 | 0.84% | 0.07% | 1.80% | 2.71% | 2 |
| Income & Assets | 439646 | 2945 | 241 | 938 | 0.67% | 0.05% | 0.21% | 0.94% | 10 |
| Attitudes | 13427 | 214 | 2 | 29 | 1.59% | 0.01% | 0.22% | 1.82% | 3 |
| Total | 3067473 | 18209 | 2030 | 36918 | 0.59% | 0.07% | 1.20% | 1.86% | - |

Table 3 shows the amount of nonresponse in the 1989 survey. The most problematic section is "Income," which is missing data in 1.3 percent of its questions, with the CPS section a close second at 1.2 percent. Unlike earlier years, the major missing data problem in both the "Income" (1 percent) and CPS (0.8 percent) sections is don't knows, not invalid skips (0.1 percent for income and 0.4 percent for CPS).

Table 3. Extent of Refusals, Don't Knows and Invalid Skips in 1989

| Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
|---|---|---|---|---|---|---|---|---|---|
| Intro. | 14647 | 20 | 1 | 41 | 0.14% | 0.01% | 0.28% | 0.42% | 7 |
| Marital | 86563 | 372 | 121 | 450 | 0.43% | 0.14% | 0.52% | 1.09% | 3 |
| Schooling | 76999 | 179 | 39 | 217 | 0.23% | 0.05% | 0.28% | 0.56% | 6 |
| Military | 33579 | 1 | 1 | 40 | 0.00% | 0.00% | 0.12% | 0.13% | 10 |
| CPS | 406265 | 3320 | 52 | 1650 | 0.82% | 0.01% | 0.41% | 1.24% | 2 |
| On Jobs | 39749 | 0 | 0 | 1 | 0.00% | 0.00% | 0.00% | 0.00% | 12 |
| Gaps | 91565 | 91 | 1 | 894 | 0.10% | 0.00% | 0.98% | 1.08% | 4 |
| Gov. Training | 49657 | 118 | 35 | 233 | 0.24% | 0.07% | 0.47% | 0.78% | 5 |
| Fertility | 152546 | 6 | 35 | 92 | 0.00% | 0.02% | 0.06% | 0.09% | 11 |
| Health | 154024 | 120 | 74 | 168 | 0.08% | 0.05% | 0.11% | 0.24% | 9 |
| Alcohol | 217441 | 74 | 400 | 201 | 0.03% | 0.18% | 0.09% | 0.31% | 8 |
| Income | 470686 | 4761 | 1124 | 439 | 1.01% | 0.24% | 0.09% | 1.34% | 1 |
| Total | 1793721 | 9062 | 1883 | 4426 | 0.51% | 0.10% | 0.25% | 0.86% | - |

Table 4 shows that the most problematic area in the 1994 survey includes the asset questions, which are missing 2.5 percent of their answers (75 percent of those missing being don't knows). The second most problematic area includes income questions, which are missing 1.3 percent of their answers. While in the three previous surveys refusal rates were not an issue, the 1994 survey shows refusals are becoming significant. Slightly more than half a percent (0.6 percent) of the "Asset" section questions and more than one fifth of a percent (0.2 percent) of the "Income" section questions were refused.

Table 4. Extent of Refusals, Don't Knows and Invalid Skips in 1994

| Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
|---|---|---|---|---|---|---|---|---|---|
| Intro. | 36251 | 62 | 14 | 0 | 0.17% | 0.04% | 0.00% | 0.21% | 12 |
| Marital Status | 137540 | 1522 | 193 | 0 | 1.11% | 0.14% | 0.00% | 1.25% | 3 |
| School | 60166 | 302 | 2 | 0 | 0.50% | 0.00% | 0.00% | 0.51% | 7 |
| Military | 27372 | 6 | 1 | 0 | 0.02% | 0.00% | 0.00% | 0.03% | 15 |
| CPS | 269452 | 28 | 9 | 0 | 0.01% | 0.00% | 0.00% | 0.01% | 17 |
| On Jobs | 79567 | 6 | 7 | 0 | 0.01% | 0.01% | 0.00% | 0.02% | 16 |
| Employ. Suppl. | 1060679 | 7092 | 1342 | 8 | 0.67% | 0.13% | 0.00% | 0.80% | 5 |
| Training | 194147 | 246 | 29 | 47 | 0.13% | 0.01% | 0.02% | 0.17% | 13 |
| Fertility | 450871 | 1859 | 763 | 0 | 0.41% | 0.17% | 0.00% | 0.58% | 6 |
| Child Care | 26453 | 109 | 12 | 0 | 0.41% | 0.05% | 0.00% | 0.46% | 9 |
| Relationship | 81477 | 285 | 113 | 0 | 0.35% | 0.14% | 0.00% | 0.49% | 8 |
| Health | 282702 | 623 | 199 | 0 | 0.22% | 0.07% | 0.00% | 0.29% | 11 |
| Alcohol | 164663 | 46 | 61 | 0 | 0.03% | 0.04% | 0.00% | 0.06% | 14 |
| Income | 305693 | 3176 | 672 | 1 | 1.04% | 0.22% | 0.00% | 1.26% | 2 |
| Prog. Participation | 118305 | 297 | 63 | 0 | 0.25% | 0.05% | 0.00% | 0.30% | 10 |
| Assets | 169301 | 3239 | 930 | 1 | 1.91% | 0.55% | 0.00% | 2.46% | 1 |
| Drugs | 204621 | 772 | 1626 | 0 | 0.38% | 0.79% | 0.00% | 1.17% | 4 |
| Total | 3669260 | 19670 | 6036 | 57 | 0.54% | 0.16% | 0.00% | 0.70% | - |

Table 5 examines the 1998 survey. Because the survey was fielded every other year in the late 1990s, there is no 1999 interview, which would have exactly continued the five-year pattern; the 1998 survey is used as the closest substitute. This table, like the one for 1994, shows that the most problematic area is again the asset questions, which are missing 3.6 percent of their answers (75 percent of those missing being don't knows). The second most problematic area is the marital history questions, which added a new section that asked detailed questions about the work history and past life of the respondent's spouse. This expanded section is missing 1.8 percent of its answers. In the 1998 survey only two sections have relatively high refusal rates: assets (0.9 percent) and drug use (0.68 percent).

Table 5. Extent of Refusals, Don't Knows and Invalid Skips in 1998

| Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
|---|---|---|---|---|---|---|---|---|---|
| Intro. | 10060 | 6 | 4 | 0 | 0.06% | 0.04% | 0.00% | 0.10% | 12 |
| Marital Status | 207805 | 3296 | 520 | 1 | 1.59% | 0.25% | 0.00% | 1.84% | 2 |
| School | 53928 | 197 | 45 | 0 | 0.37% | 0.08% | 0.00% | 0.56% | 10 |
| Military | 25691 | 0 | 0 | 0 | 0.00% | 0.00% | 0.00% | 0.00% | 15 |
| CPS | 301160 | 44 | 12 | 0 | 0.01% | 0.00% | 0.00% | 0.02% | 13 |
| On Jobs | 117144 | 2 | 0 | 1 | 0.00% | 0.00% | 0.00% | 0.00% | 14 |
| Employ. Suppl. | 1081493 | 10265 | 1441 | 1 | 0.95% | 0.13% | 0.00% | 1.08% | 3 |
| Training | 241013 | 1559 | 143 | 1 | 0.65% | 0.06% | 0.00% | 0.71% | 7 |
| Fertility | 578831 | 3180 | 1097 | 50 | 0.55% | 0.19% | 0.01% | 0.75% | 6 |
| Child Care | 23241 | 57 | 11 | 1 | 0.25% | 0.05% | 0.00% | 0.30% | 11 |
| Relationship | 86632 | 371 | 154 | 0 | 0.43% | 0.18% | 0.00% | 0.61% | 9 |
| Health | 350533 | 2460 | 223 | 0 | 0.70% | 0.06% | 0.00% | 0.77% | 5 |
| Income | 608849 | 3410 | 847 | 10 | 0.56% | 0.14% | 0.00% | 0.70% | 8 |
| Assets | 174570 | 4702 | 1566 | 10 | 2.69% | 0.90% | 0.01% | 3.60% | 1 |
| Drugs | 217175 | 419 | 1485 | 0 | 0.19% | 0.68% | 0.00% | 0.88% | 4 |
| Total | 4078125 | 29968 | 7548 | 75 | 0.73% | 0.19% | 0.00% | 0.92% | - |

Table 6 examines the 2004 survey. This survey has two new sections that are not seen in the previous tables. The first is found in the employer supplement and asks the respondent detailed questions about the pensions available from their employer and the respondent's participation in these pensions. This new section is ranked first in problems and has missing responses to 2.5 percent of all questions. The second new section is the "Over 40 Health" module. The goal of this section is to provide researchers with a baseline health measure that will be updated at ten-year intervals. The health section is ranked eighth out of 13 sections and has a nonresponse rate of slightly more than three-quarters of one percent.

Table 6. Extent of Refusals, Don't Knows and Invalid Skips in 2004

| Section Name | # Questions Asked | # Don't Knows | # Refused | # Invalid Skipped | % Don't Knows | % Refused | % Invalid Skipped | Total % Missed | Rank |
|---|---|---|---|---|---|---|---|---|---|
| Intro. | 91277 | 39 | 16 | 4 | 0.04% | 0.02% | 0.00% | 0.06% | 12 |
| Marital Status | 77954 | 371 | 66 | 106 | 0.48% | 0.08% | 0.14% | 0.70% | 9 |
| School | 56716 | 554 | 39 | 4 | 0.98% | 0.07% | 0.01% | 1.05% | 7 |
| Military | 39772 | 20 | 5 | 0 | 0.05% | 0.01% | 0.00% | 0.06% | 13 |
| Employ. Suppl. | 734366 | 7729 | 1001 | 275 | 1.05% | 0.15% | 0.04% | 1.23% | 6 |
| Pensions | 189861 | 3753 | 508 | 485 | 1.98% | 0.27% | 0.26% | 2.50% | 1 |
| Training | 307708 | 2943 | 887 | 322 | 0.96% | 0.29% | 0.10% | 1.35% | 5 |
| Fertility | 521658 | 5801 | 733 | 1216 | 1.11% | 0.14% | 0.23% | 1.49% | 3 |
| Child Care | 34561 | 12 | 4 | 7 | 0.03% | 0.01% | 0.02% | 0.07% | 11 |
| Relationship | 1004 | 2 | 0 | 0 | 0.20% | 0.00% | 0.00% | 0.20% | 10 |
| Over 40 Health | 622644 | 4386 | 402 | 14 | 0.70% | 0.06% | 0.00% | 0.77% | 8 |
| Income | 412656 | 4382 | 1199 | 39 | 1.06% | 0.29% | 0.01% | 1.36% | 4 |
| Assets | 626393 | 12726 | 2634 | 233 | 2.03% | 0.42% | 0.04% | 2.49% | 2 |
| Total | 3716570 | 42718 | 7494 | 2705 | 1.15% | 0.20% | 0.07% | 1.42% | - |