NLSY79 Appendix 17: Interviewer Characteristics Data

National Longitudinal Survey of Youth - 1979 Cohort

Interviewer Characteristics Data and Data Review

Many researchers are interested in knowing whether, and how much, interviewers affect respondents' answers. To enable researchers to investigate these questions, NLSY79 data releases since 1988 have contained information on interviewers' characteristics. For 1979-2000, information on the characteristics of NLSY79 interviewers comes primarily from NORC's interviewer personnel files. Data for the 2002-present surveys come from forms filled in by interviewers during their NLSY79 training program.

An extensive review of longitudinal interviewer ids (INTCHARS_INT_ID) was made prior to the 2014 data release. This review has resulted in a number of improvements in the identification of previously unidentifiable interviewers and in the linkages that can be established between specific interviewers' cases across survey years. Several types of issues were addressed, including:

  • A number of interviewers who had been assigned multiple longitudinal interviewer ids in different survey years were identified and assigned a consistent longitudinal interviewer id.
  • Interviewer ids assigned in 2002 had been based only on the project id assigned to interviewers in that survey year, which severely disrupted the longitudinal INTCHARS_INT_ID in that year. The longitudinal id links have been reestablished for many 2002 interviewers, accounting for several thousand respondents.
  • A number of interviewers in isolated years had been assigned inordinately large project ids, which were used as their longitudinal INTCHARS_INT_ID. These ids have been shortened to make them less problematic when extracting the data.
  • A number of previously unidentifiable interviewers have been identified wherever possible through re-examination of available records. These interviewers had not been linked to their interviews in other years, generally owing to erroneously recorded or undocumented ids.

As a result of this review, an improved and more consistently linked set of INTERVIEWER CHARACTERISTICS variables for each survey year is included in the 2014 data release. Because IDs can vary considerably between survey years, matching interviewers across survey years often relies heavily on documentation containing interviewer names for verification. Future updates for mismatched and unmatched interviewers will be made as further documents or data become available.

Each NLSY79 survey year has the following variables available: interviewer ID number, number of times this interviewer has interviewed the respondent, and the interviewer's race, sex, age, and level of education. The variable names and response categories are listed at the end of this appendix. Researchers should note that CHRR built the 1979-2012 variables from data sources that represent the interviewer characteristics at specific points in time. Hence, changes in items like an interviewer's educational attainment are not always reflected in the data. The preliminary source reflects existing interviewer data a few months prior to the fielding of the NLSY79 1994 survey. After 1994, data sources come from short demographic questionnaires filled out during interviewer training.

As of 2014, the interviewer characteristics information is being compiled at NORC and sent to CHRR to code with longitudinal IDs attached. The procedures for matching interviewers remain the same. Interviewers are identified from a master list of interviewer names. New IDs are assigned to new interviewers who do not have a longitudinal ID. Interviewer characteristics are then assigned based on a short demographic survey filled out during interviewer training.
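The ID assignment step described above can be sketched roughly as follows. This is only an illustration: the function name, the name-to-ID dictionary layout, and the rule of taking the next unused integer for a new interviewer are all assumptions, not the documented NORC/CHRR procedure.

```python
def assign_longitudinal_ids(master, roster):
    """Return a name -> longitudinal ID map for the current round's roster.

    master: dict of interviewer name -> existing longitudinal ID
            (hypothetical layout of the master list)
    roster: list of interviewer names working in the current round
    """
    ids = dict(master)  # copy so the caller's master list is not mutated
    # Assumption: new IDs simply continue from the largest existing ID.
    next_id = max(ids.values(), default=0) + 1
    for name in roster:
        if name not in ids:  # new interviewer without a longitudinal ID
            ids[name] = next_id
            next_id += 1
    return {name: ids[name] for name in roster}


# Made-up names and IDs, for illustration only.
master = {"A. Smith": 101, "B. Jones": 102}
current = assign_longitudinal_ids(master, ["B. Jones", "C. Lee"])
print(current)
```

Returning interviewers keep their existing ID, while a name that is absent from the master list receives a fresh one.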

Constructing the Original Interviewer Characteristics ID

The key variable, which links the NLSY data set with the interviewer characteristics data set, is INTCHARS_INT_ID. This ID variable is often similar but not necessarily identical to the Interviewer ID variable, which is entered in the questionnaire and can be found on the NLSY79 public use data set for many years. INTCHARS_INT_ID is a constructed longitudinal variable that allows identification of cases interviewed by the same person over time. The previous version of the longitudinal interviewer ID, upon which the improved INTCHARS_INT_ID is based, was constructed using the following steps:

  • First, the NLSY interviewer ID for all years prior to 1996 is divided by ten to truncate the last digit. This last digit was used to cluster interviewers together and the digit was not used in the NORC interviewer characteristics database. IDs for 1996 and 1998 do not have this last digit.
  • Second, each ID was then run through the list of all known interviewers who changed their ID. Interviewers changed their ID if they moved to a different state or were promoted or demoted. Only a partial list of interviewers who changed their ID is available, so there is no year when the characteristics of all interviewers are known. However, even when not all of an interviewer's characteristics are known, efforts were made to create a consistent ID number, since a researcher might be interested in knowing who interviewed whom from year to year even if other information, such as sex, is not available.
  • Third, for all surveys from 2002 onward, all NLSY79 interviewers were given new round-specific project IDs, whether or not they had previously participated in the project. These project IDs were primarily used as the 2002 longitudinal id, creating a significant number of breaks in the continuous record of contact for some interviewers who already had an ID assigned.
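The first two steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name and the old-to-new layout of `id_changes` are hypothetical, IDs from 1996 onward are left untouched (the text confirms this only for 1996 and 1998), and the third step (round-specific project IDs) is not modeled.

```python
def original_longitudinal_id(raw_id, survey_year, id_changes):
    """Sketch of steps one and two of the original ID construction.

    raw_id:      interviewer ID as recorded in the questionnaire
    survey_year: survey year of the interview
    id_changes:  dict mapping a superseded ID to the ID it should be
                 linked to (hypothetical layout of the partial change list)
    """
    # Step 1: before 1996 the last digit was a clustering digit, so
    # integer division by ten drops it.
    base = raw_id // 10 if survey_year < 1996 else raw_id
    # Step 2: apply a known ID change, if this interviewer has one.
    return id_changes.get(base, base)


# Made-up IDs, for illustration only.
print(original_longitudinal_id(12345, 1990, {}))           # last digit dropped
print(original_longitudinal_id(1234, 1996, {}))            # left as recorded
print(original_longitudinal_id(12345, 1990, {1234: 777}))  # change list applied
```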

The resulting ID was then used to search for each identified interviewer's characteristics from one of the two sources (1979-2000 or 2002-present) noted earlier. While most IDs match, some do not. That number is relatively small in most years. Readers should note that most interviewers interviewed multiple respondents, so not finding even a single interviewer's characteristics in the data sources can affect the number of cases missing interviewer characteristics dramatically.
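To see why one unmatched interviewer can matter so much, consider the following sketch, which links cases to characteristics through INTCHARS_INT_ID and reports the share of cases left without characteristics. The record layouts, helper names, and all data values are made up for illustration.

```python
def link_cases(cases, characteristics):
    """Attach interviewer characteristics to each case via INTCHARS_INT_ID.

    cases:           list of dicts, each with an "INTCHARS_INT_ID" key
    characteristics: dict of INTCHARS_INT_ID -> dict of characteristics
    Cases whose interviewer is not found get None.
    """
    return [
        {**case, "interviewer": characteristics.get(case["INTCHARS_INT_ID"])}
        for case in cases
    ]


def share_unmatched(linked):
    """Fraction of linked cases with no interviewer characteristics."""
    missing = sum(1 for case in linked if case["interviewer"] is None)
    return missing / len(linked)


# Made-up data: three interviewers, one of whom (101) conducted 40 of 50 cases
# and is absent from the characteristics source.
cases = [{"INTCHARS_INT_ID": i} for i in [101] * 40 + [102] * 5 + [103] * 5]
characteristics = {102: {"INTCHARS_SEX": 1}, 103: {"INTCHARS_SEX": 1}}

linked = link_cases(cases, characteristics)
print(share_unmatched(linked))
```

In this toy example only one of three interviewers is unmatched, yet most cases end up without interviewer characteristics.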

Table 1. Interviewers Identified in NORC database by Survey Year

Year   # of Respondents Interviewed   # of Interviewers Matched   Percentage Not Matched
1979   12686                          9838                        22.4%
1980   12141                          11200                       7.8%
1981   12195                          11850                       2.8%
1982   12123                          11736                       3.2%
1983   12221                          11980                       2.0%
1984   12069                          11585                       4.0%
1985   10894                          10850                       0.4%
1986   10655                          10560                       1.0%
1987   10485                          10485                       0.0%
1988   10465                          10386                       10.2%
1989   10605                          9906                        6.6%
1990   10436                          9321                        10.7%
1991   9018                           8933                        1.0%
1992   9016                           8947                        0.3%
1993   9011                           8933                        1.0%
1994   8891                           8701                        2.1%
1996   8636                           8050                        6.8%
1998   8399                           8302                        1.2%
2000   8033                           7814                        2.7%
2002   7726                           7723                        0.0%
2004   7661                           6851                        10.6%
2006   7654                           7328                        4.3%
2008   7757                           7742                        0.1%
2010   7565                           7559                        0.0%
2012   7301                           7293                        0.1%
2014   7071                           6977                        1.3%
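As a reading aid for Table 1, the sketch below assumes the percentage column is the share of interviewed respondents whose interviewer was not matched, that is (interviewed - matched) / interviewed. That formula is an inference from the column headings rather than something the appendix states, and the function name is hypothetical.

```python
def percent_not_matched(interviewed, matched):
    """Share of interviewed cases without a matched interviewer, in percent."""
    return round(100 * (interviewed - matched) / interviewed, 1)


# Counts taken from the 1979 and 1990 rows of Table 1.
print(percent_not_matched(12686, 9838))
print(percent_not_matched(10436, 9321))
```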

Other Interviewer Characteristics Variables

This section describes other variables available beyond the Interviewer's ID (INTCHARS_INT_ID).

Variables Available for Survey Years 1979-2014

Interviewer Count (INTCHARS_YRSINTR)

This counts the number of years the interviewer has interviewed the respondent, including the current survey year. Note that as telephone interviews become more prevalent, the number of first-time interviews expands considerably, as most interviewers are not assigned to specific cases year after year.
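A count of this kind could be derived from a respondent's interview history roughly as follows. The sketch assumes the count is cumulative over all earlier survey years with the same interviewer (not only consecutive ones); the function name and input layout are hypothetical.

```python
def years_interviewed_counts(history):
    """Return survey year -> count of years that year's interviewer has
    interviewed this respondent, including the current survey year.

    history: list of (survey_year, interviewer_id) pairs for one respondent
    """
    seen = {}    # interviewer_id -> number of interviews so far
    result = {}  # survey_year -> count for that year's interviewer
    for year, interviewer_id in sorted(history):
        seen[interviewer_id] = seen.get(interviewer_id, 0) + 1
        result[year] = seen[interviewer_id]
    return result


# Made-up history: interviewer 11 returns in 1982 after a gap.
history = [(1979, 11), (1980, 11), (1981, 22), (1982, 11)]
print(years_interviewed_counts(history))
```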

Interviewer Race (INTCHARS_RACE)

-3 = missing

Interviewer Sex (INTCHARS_SEX)

1 = MALE
-3 = missing

Interviewer Age (INTCHARS_AGE)

Age of the interviewer in the interview year, calculated as ([survey year]-[interviewer’s year of birth])
-3 = missing
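The age calculation above amounts to a single subtraction. In the sketch below, the function name and the use of `None` for an unknown year of birth are assumptions; only the formula and the -3 missing code come from the text.

```python
MISSING = -3  # missing-value code listed for INTCHARS_AGE


def interviewer_age(survey_year, birth_year):
    """Age in the interview year: survey year minus year of birth."""
    if birth_year is None:  # assumption: unknown birth year maps to missing
        return MISSING
    return survey_year - birth_year


print(interviewer_age(1994, 1950))
print(interviewer_age(1994, None))
```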

Interviewer Education (INTCHARS_EDUCATION)

1 = Grade 0-8
2 = Grade 9-11
3 = High School Graduate
4 = Vocational degree
5 = Some College
6 = College Graduate
7 = Graduate School
8 = Masters Degree
9 = Professional Degree
0 = Other
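When working with an extract, the education codes can be turned into labels with a small lookup such as the one below. The labels are copied from the list above; treating -3 as missing for this variable is an assumption carried over from the other variables, and the function name is hypothetical.

```python
EDUCATION_LABELS = {
    1: "Grade 0-8",
    2: "Grade 9-11",
    3: "High School Graduate",
    4: "Vocational degree",
    5: "Some College",
    6: "College Graduate",
    7: "Graduate School",
    8: "Masters Degree",
    9: "Professional Degree",
    0: "Other",
}


def decode_education(code):
    """Return the label for an INTCHARS_EDUCATION code, or None if missing."""
    if code == -3:  # assumption: same missing code as the other variables
        return None
    # Codes outside the documented list are flagged rather than guessed.
    return EDUCATION_LABELS.get(code, f"Unknown code {code}")


print(decode_education(6))
print(decode_education(-3))
```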