Geographic Residence & Neighborhood Composition

Geocode file variables

  • Information on the State, county, and metropolitan statistical area of residence for each respondent (the current residence variables) is merged with information from several other data files, namely the City Reference File (Census 1973, 1982, 1983, 1987, 1992) and the County & City Data Book (Census 1972, 1977, 1983, 1988, 1994), to provide detailed information on the environmental characteristics of the State, county, and metropolitan statistical area in which each NLSY79 respondent resides. NOTE: Users may attach additional county- and metropolitan statistical area-level data from other sources by merging that information with the Geocode data on the State, county, and metropolitan statistical area of residence codes in the Geocode file.
  • For select survey years, Geocode information is available on the location of respondents' jobs, the location of colleges attended, and the point of discharge from military service.
  • Unemployment rate of each respondent's labor market of current residence:
    • The source of the 'Unemployment Rate' variables is the May issue of the Bureau of Labor Statistics' Employment and Earnings for the year following the survey year. Figures from March of each survey year are used. The source table supplies unemployment rates for each State and for selected metropolitan statistical areas. Respondents who reside within one of these metropolitan statistical areas are assigned the appropriate unemployment rate. For those residing outside of these areas, a "balance of State" unemployment figure is computed from the State totals for the size of the civilian labor force and the number employed, after subtracting the corresponding figures for the population living in those metropolitan statistical areas.
    • Additional information on these variables can be found in Appendix 7 in the NLSY79 Geocode Codebook Supplement.
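
The "balance of State" calculation described above can be sketched as follows. This is one plausible reading of the description, not the procedure documented in Appendix 7: it assumes that the labor force and employment counts of the listed metropolitan statistical areas are subtracted from the State totals, and that the rate is the unemployed share of what remains. The function name and the figures are hypothetical.

```python
def balance_of_state_rate(state_labor_force, state_employed, msa_figures):
    """Unemployment rate (percent) for the part of a State outside its listed MSAs.

    msa_figures is a sequence of (civilian_labor_force, employed) pairs,
    one per metropolitan statistical area that has its own published rate.
    """
    msa_figures = list(msa_figures)  # accept any iterable, including generators
    msa_labor_force = sum(lf for lf, _ in msa_figures)
    msa_employed = sum(emp for _, emp in msa_figures)

    # Assumption: remove the MSA counts from the State totals.
    balance_labor_force = state_labor_force - msa_labor_force
    balance_employed = state_employed - msa_employed
    if balance_labor_force <= 0:
        raise ValueError("no civilian labor force remains outside the MSAs")

    unemployed = balance_labor_force - balance_employed
    return 100.0 * unemployed / balance_labor_force


# Made-up figures, in thousands, for illustration only.
rate = balance_of_state_rate(1000, 920, [(400, 380), (200, 180)])
print(round(rate, 1))  # prints 10.0
```

With these invented numbers, 400 thousand people remain in the labor force outside the two areas, 360 thousand of them employed, which gives a rate of 10 percent.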

Types of County or Metropolitan Statistical Area Environmental Characteristics on the NLSY79 Geocode CD:

  • Population sizes
  • Percent of population that is:
    • urban
    • black
    • female
    • under 5 years old
    • 65+ years old
  • Birth/death/marriage/divorce rates
  • Physician and hospital bed rates
  • Crime rates
  • Poverty level data
  • Educational attainment levels 
  • Median family and per capita income
  • Recipients of and payments from:
    • AFDC
    • SSI
    • Social Security
  • Labor force statistics:
    • total labor force
    • civilian labor force
    • number of females in the civilian labor force
    • civilians unemployed versus employed
    • percent employed in various industries
  • Unemployment rate for labor market of residence
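
Characteristics that are not in the list above can be attached from an outside source by matching on the State and county of residence codes, as the NOTE on merging suggests. A minimal sketch follows, assuming respondent records and the external county table have already been read into Python; the field names (`state`, `county`, `median_rent`) and the values are hypothetical and do not correspond to actual Geocode variable names.

```python
def attach_county_data(respondents, county_table, fields):
    """Left-join county-level fields onto respondent records.

    respondents  -- list of dicts, each with "state" and "county" codes
    county_table -- dict keyed by (state, county) holding external data
    fields       -- names of the external fields to copy over
    """
    merged = []
    for row in respondents:
        key = (row["state"], row["county"])
        extra = county_table.get(key, {})
        combined = dict(row)  # copy so the input records are not modified
        for field in fields:
            # None marks respondents whose county is absent from the source.
            combined[field] = extra.get(field)
        merged.append(combined)
    return merged


# Hypothetical records for illustration only.
respondents = [
    {"id": 1, "state": 39, "county": 49},
    {"id": 2, "state": 39, "county": 999},  # no match in the external source
]
county_table = {(39, 49): {"median_rent": 550}}
merged = attach_county_data(respondents, county_table, ["median_rent"])
```

Keeping every respondent and filling unmatched counties with a missing value makes coverage gaps in the external source visible instead of silently dropping cases.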

Geographic Residence: Detailed geographic mobility information was collected in 1979-80, in 1982, and from 2000 forward; data were gathered on the country/county/State and timing of up to five residential moves since January 1978 or since the last interview. Beginning in 2000, only significant geographical moves were recorded.

Neighborhood Quality: The neighborhood quality series (1992 and 1994-2000) is taken from the National Commission on Children Parent & Child Study, 1990 Parent Questionnaire. In this series of questions, respondents rate how much of a problem issues such as crime, lack of police protection, unsupervised children, and joblessness are in their neighborhood.

Other Geographic Variables: Users may obtain special permission to use zip code and Census tract data available at the BLS offices in Washington, DC.

Edited versus Unedited Versions of State/County of Residence: For some years (1979-82, 1988-89, 1991-92), two versions of the State and county of residence variables have been included in the "Geocode xxxx" files. The set occurring at the beginning of each file is the edited version, while the variables found near the end of the files for these years are unedited. If a variable has an actual source question number/name, it is the original from NORC. If the source question name says "created," it is the edited/created version. Note that the unedited variables are sometimes combined into a single variable, with the State and county codes appended to each other. These raw variables are preceded by the word "GEOCODE" in the variable title. The edited residence variables contain the corrections made for erroneous address information and are the ones from which the Geocode files themselves are constructed. Users should be aware that the edited versions of these variables do not contain data for respondents who are in the active military forces or who are living abroad or in a U.S. territory. Codes of "-4" appearing in the unedited versions of the State or county variables (because foreign country and U.S. territory codes are placed in one field or the other) should not appear in the edited versions of these residence variables.
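
Users working with a combined unedited variable may need to separate it into its State and county parts. The sketch below assumes the combined value is the State code followed by a fixed-width, three-digit county code; that layout is an assumption made for illustration and should be checked against the codebook for the year in question. It also treats any negative value, such as -4, as a non-location code. The function name and sample value are hypothetical.

```python
def split_state_county(combined, county_digits=3):
    """Split a combined State+county code into a (state, county) pair.

    Negative values (such as -4) are treated as non-location codes
    and returned as (None, None).
    """
    if combined < 0:
        return None, None
    # Assumed layout: the low-order digits hold the county code.
    return divmod(combined, 10 ** county_digits)


print(split_state_county(39049))  # prints (39, 49)
print(split_state_county(-4))     # prints (None, None)
```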

New Geocode Procedures for Assigning Residence Codes and Hand-Editing Discrepant Cases: During the 1988 hand-editing process, it became evident that the telephone numbers were very accurate, even in cases for which the address information contained discrepancies. Beginning in 1989, the area code and phone exchange were used to assign State and county of residence codes. The State assigned by the area code was then compared to the State assigned on the basis of zip code alone and the State contained in the original NORC respondent file. A "quality of match" variable was computed on the basis of how well these States match. For a more detailed discussion of these new assignment and matching procedures, refer to Appendix 10: Geocode Documentation in the Geocode Codebook Supplement. This process was used through the 1994 release.
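
The idea behind a "quality of match" measure can be illustrated with a small sketch that compares the three State assignments (from the area code, from the zip code, and from the original NORC file). The categories below are hypothetical and are not the coding used in the Geocode files; the actual scheme is described in Appendix 10.

```python
def match_quality(phone_state, zip_state, norc_state):
    """Summarize how well three State assignments agree.

    The labels are illustrative only, not the codes used in the data.
    """
    distinct = len({phone_state, zip_state, norc_state})
    labels = {
        1: "all three agree",
        2: "two of three agree",
        3: "no agreement",
    }
    return labels[distinct]


print(match_quality(39, 39, 39))  # prints all three agree
print(match_quality(39, 39, 21))  # prints two of three agree
```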

The hand-editing procedure has also been streamlined. In 1989, the first year in which the phone assignment procedure was used, the residence codes assigned on the basis of the area code and exchange were compared to the raw residence variables received from NORC. Those with information that did not match were identified for individual examination. Ideally, the discrepancies requiring individual examination would be reduced to those cases which are "genuine movers" or which have zip codes covering multiple counties and would require some verification that the correct county was assigned based upon the phone information. The current process for identifying discrepancies and hand-editing is aimed more directly at achieving this objective. 

Beginning in 1990, the residence codes assigned based on phone information were compared to the 1989 CHRR-edited residence information to identify cases for individual examination. Because the previous year's edited variables incorporate the corrections that were made in the hand-editing process from earlier years, repeated editing of the same cases across years decreased. Through this process, the discrepancies in residential Geocode information were reduced. The number of cases requiring individual examination also decreased and was restricted more closely to the population of "genuine movers" and people with multiple-county zip codes and phone numbers that require verification of county of residence. 
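
The comparison step described above amounts to flagging respondents whose phone-based residence codes differ from the previous year's edited codes. A minimal sketch follows, under the assumption that both sets of codes are held as dictionaries keyed by a respondent identifier; the names and data are hypothetical, and the sketch does not reproduce the CHRR procedure itself.

```python
def cases_to_review(phone_assigned, prior_edited):
    """Return respondent IDs whose codes changed since the prior edited file.

    Both arguments map a respondent ID to a (state, county) pair.
    Respondents with no prior-year record are skipped here, which is a
    simplifying assumption of this sketch.
    """
    flagged = []
    for respondent_id, codes in phone_assigned.items():
        previous = prior_edited.get(respondent_id)
        if previous is not None and previous != codes:
            flagged.append(respondent_id)
    return sorted(flagged)


# Hypothetical codes for illustration only.
phone_assigned = {1: (39, 49), 2: (39, 41), 3: (21, 111)}
prior_edited = {1: (39, 49), 2: (39, 49)}
print(cases_to_review(phone_assigned, prior_edited))  # prints [2]
```

Because the comparison is made against already-corrected codes, a respondent who has not moved and whose address was fixed in an earlier year is not flagged again.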

The hand-editing process in previous years included not only these genuine movers and multi-county zip code dwellers, but also other cases in which elements of the address were simply in error or incompatible with each other. Some of these cases could require editing for the same errors in more than one year, even if the respondent stayed in one location. Hand-editing procedures were further streamlined, and in some cases automated, to produce the 1992 data.

Beginning in 1996, a new procedure for verifying and assigning correct final Geocode information was instituted. This procedure is now performed using specialized address tracking Geocode software. The processes are described in Appendix 10. CHRR staff members believe that the current procedures not only are more efficient in identifying true discrepancies and streamlining the hand-editing process, but also result in more accurate and consistent assignment of State and county codes in general.

Missing Values, New England Cases, and Mobility: Missing values in location of residence variables and metropolitan statistical area codes are associated with respondents who are in the active military forces or who are living abroad or in a U.S. territory. Users should be aware that, because the New England County Metropolitan Area (NECMA) codes are not comparable to metropolitan statistical areas from the remainder of the country, New England cases are eliminated from some of the procedures used to construct the Geocode files.

The review and hand-editing process has been periodically revised to improve the accuracy of the data and the efficiency of data production. The potential implications for effects on mobility rates between some years due to these changes have been noted in Appendix 10: Geocode Documentation. Users should read Appendix 10 carefully to gain a better understanding of the issues outlined above and their implications for specific research endeavors.

Comparison to Other NLS Cohorts: Data on the respondent's area of residence are available for all cohorts. Geographic residence information for those NLSY79 children who resided with their mother can be inferred from the residence data of their mothers. The NLSY97 main created variables indicate whether the respondent lives in an urban or rural area, whether the respondent lives in a Metropolitan Statistical Area, and in which Census region the respondent resides. More detailed information is available on the restricted-use Geocode CD. Region of residence and geographic mobility of Original Cohort respondents are provided for most survey years. Geographic data for NLSY79 respondents fall into two categories: information on the main public file and more detailed information released on a restricted-access Geocode CD.

Survey Instruments and Documentation: Data on residence at birth and at age 14, as well as the 1979-82 present/most recent residence series, were collected using questions found within Section 1 ("Family Background" and "On Family") of the 1979, 1980, and 1982 questionnaires. All other variables are created from or determined by the geographic information provided by each NLSY79 respondent within the locator section of the questionnaire or from the interviewing Face Sheet or internal NORC locating files. Several attachments and appendices in the NLSY79 Codebook Supplement and/or the NLSY79 Geocode Codebook Supplement offer creation procedure information and coding systems for the geographic residence variables. The following are relevant to the Geocode:

Areas of Interest: Residence variables can be found within the "Family Background," "Key Variables," "Geocode xxxx," or "Misc. xxxx" areas of interest; the table above specifies the particular areas of interest for each variable. All environmental variables, including the 'Unemployment Rate for the Labor Market of Current Residence,' are found in the restricted-release "Geocode xxxx" areas of interest on the Geocode CD.