
NLSY79

Race, Ethnicity & Immigration

Important Information About Using Race, Ethnicity and Immigration Data

  • Race and ethnicity variables for household members are based on information collected on the Household Screener, on which race and one ethnic background were recorded for each household member.
  • The interviewer's identification of the respondent's race can be subjective. Each interview from 1979-1986 and 1988-1998 collected information on the interviewer's direct observation of the race of the respondent ("white," "black," or "other").
  • No special instructions are provided within the Question by Question Specifications as to how the interviewer is to code race.

Additional instructions for coding race, ethnic origin, and the racial/ethnic identifier variable can be found in the Household Screener and Interviewer's Reference Manual (1978) and in a NORC memo dated 10/4/78 available from NLS User Services.

The following race and ethnicity variables are available for NLSY79 respondents: 

  1. a racial/ethnic variable based on the sample identification code assigned by NORC
  2. a series of self-reported ethnic origin variables collected during the 1979 and 2002 surveys
  3. a set of interviewer identifications of the race of the respondent at the time of the interview
  4. racial/ethnic identification for current and past spouse/partners
  5. variables representing the respondent's immigration history and status collected during the 1990 survey
  6. a 1979 variable indicating whether a foreign language was spoken in the house during the respondent's childhood
  7. a series of variables recorded by the interviewer indicating whether the survey was administered in English or another language
  8. race/ethnicity variables for each family member listed during the 1978 screener
  9. race of the interviewer where available at each interview
  10. country of origin of the respondent's parents and the respondent's country of birth, available on the restricted Geocode release

Race and ethnic origin information is also available for each household member identified during the 1978 household screening. In 2002 respondents were asked to identify their race/ethnicity using questions that conformed to Federal government definitions. Of related interest is a series of immigration questions, fielded in 1990, that included the collection of information on country of citizenship at the time that foreign-born respondents entered the U.S.

Race/Ethnicity

The variable 'Racial/Ethnic Cohort from Screener' (R02147.) designates the respondent as "Hispanic," "black," or "nonblack/non-Hispanic" and provides the basis for weighting NLSY79 data. This variable is collapsed from R01736., 'Sample Identification Code,' which includes such values as "supplemental male black" or "cross-sectional female Hispanic." This code was assigned by NORC to each respondent based on information gathered during the 1978 household screening. In the creation of the 'Sample Identification Code' and thus the 'Racial/Ethnic Cohort' variable, both race and ethnic origin information collected at the time of the 1978 household screening were used. Interviewers conducting the screening were instructed to:

  1. code race by observation into three categories, "white," "black," or "other"
  2. inquire about the ethnicity of all household members age 14 or above
  3. but assign ethnicity, without asking, to those members who were under age 14

Coding procedures used by NORC to assign the "Hispanic," "black," and "nonblack/non-Hispanic" identifications to respondents included the following classification guidelines:

"Hispanics" were those who self-identified as Hispanic, whose ethnicity screener code was 1-4
  1. Mexican American, Chicano, Mexican, Mexicano
  2. Cuban, Cubano
  3. Puerto Rican, Puertorriqueno, Boricua
  4. Latino, Other Latin American, Hispano, or Spanish descent
Persons who did not self-identify as Hispanic but who met the following conditions were also classified as "Hispanic":
  • those who identified themselves in the ethnic origin categories that included Filipino (code 6) or Portuguese (code 13)
  • those whose householder or householder's spouse reported speaking Spanish at home as a child
  • those whose family surname is listed on the Census list of Spanish surnames
"Blacks"
  • included those for whom race was coded "black" and ethnic origin was "non-Hispanic" or those whose ethnic origin was coded black, Negro, or Afro-American (code 5) regardless of race coding
"Nonblack/non-Hispanics"
  • included those whose race was coded "white" or "other" and who did not identify themselves as either black or Hispanic in answer to the ethnicity question. Instructions to interviewers for coding race included coding in the "other" category those persons who were Japanese, Chinese, Vietnamese, Asian Indian, Native American, Korean, Eskimo, Pacific Islander, or of another race besides black or white.
  • Father's race was to be used to assign race for those of mixed descent except for some cases of those under age 14 of Spanish descent.  Users should note that this decision rule is different from that applied to the NLSY79 children, for whom the mother's race is used.  Spanish origins were to be given preference; if at least one ethnicity mentioned was of Spanish origin, the Spanish origin was to be coded (or, for those under 14, if at least one parent was Hispanic, the Hispanic parent's ethnicity was assigned).
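The core of the classification guidelines above can be sketched in a few lines of Python. This is only an illustration, not NORC's actual procedure: the function name and the string labels are hypothetical, and the additional "Hispanic" conditions (Filipino or Portuguese origin, Spanish spoken at home, Spanish surname) and the mixed-descent rules are left out because the text does not specify how they combine.

```python
from typing import Optional

# Screener ethnicity codes named in the guidelines above.
HISPANIC_ETHNICITY_CODES = {1, 2, 3, 4}
BLACK_ETHNICITY_CODE = 5  # "black, Negro, or Afro-American"


def racial_ethnic_cohort(race: str, ethnicity_code: Optional[int]) -> str:
    """Return 'Hispanic', 'black', or 'nonblack/non-Hispanic'.

    `race` is a hypothetical label for the interviewer-observed race; only
    the value 'black' matters here. `ethnicity_code` is the screener ethnic
    origin code, or None when it is unknown.
    """
    # Self-identified Hispanic origin takes precedence over observed race.
    if ethnicity_code in HISPANIC_ETHNICITY_CODES:
        return "Hispanic"
    # Ethnic origin code 5 counts as black regardless of race coding;
    # otherwise a non-Hispanic person observed as black is also black.
    if ethnicity_code == BLACK_ETHNICITY_CODE or race == "black":
        return "black"
    return "nonblack/non-Hispanic"
```

For example, `racial_ethnic_cohort("black", 2)` yields "Hispanic" under this sketch, because Hispanic self-identification is checked before race.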

A series of ethnic identification variables, '1st-6th Racial/Ethnic Origin' and 'Racial/Ethnic Origin with Which R Identifies Most Closely' (R00096.-R00102.), provides extensive ethnicity information. Respondents were asked during the 1979 interviews to name the racial/ethnic origins with which they identified. A listing of more than 20 categories, including "Black," "English," "French," "German," "American Indian," "Irish," "Mexican," "Mexican-American," and "Puerto Rican," was presented on a Show Card. If a respondent offered more than one origin, he or she was also asked for the ethnic group with which he or she most closely identified. Users should be aware that frequency counts for the coding category "Indian American, or Native American" are unusually high. About 5 percent of respondents reported this racial/ethnic origin, compared to Census estimates of approximately 0.5 percent of the population. This may have resulted from some respondents' misinterpretation of the term "Native American." Table 1 compares frequencies of the 1979 first (or most closely held) ethnic identification with the NORC-assigned racial/ethnic identification.

Table 1. Ethnicity by Racial or Ethnic Cohort from Screener (Unweighted Data)
Columns, left to right: Respondent's Self-Identification, Racial/Ethnic Group (note 1); Total; then NORC-Assigned Race/Ethnicity in three columns: NonBlack/Non-Hispanic, Non-Hispanic Black, Hispanic or Latino
 
Total 12686   7510 3174 2002
 
Black 3049   19 3017 13
 
Total Hispanic or Latino 1834   46 5 1783
  Cuban 116   1 0 115
  Chicano 59   0 0 59
  Mexican 383   5 0 378
  Mexican-American 734   15 1 718
  Puerto Rican 328   7 1 320
  Other Hispanic or Latino 118   7 0 111
  Other Spanish 96   11 3 82
 
Total European 5281   5100 82 99
  French 311   290 10 11
  German 1395   1376 5 14
  Greek 31   29 0 2
  English 1561   1476 51 34
  Irish 949   933 3 13
  Italian 497   474 7 16
  Polish 238   234 3 1
  Portuguese 97   88 3 6
  Russian 45   45 0 0
  Scottish 122   120 0 2
  Welsh 35   35 0 0
 
Total Asian 117   93 11 13
  Asian Indian 22   20 2 0
  Chinese 26   22 4 0
  Filipino 43   33 4 6
  Japanese 19   14 0 5
  Korean 6   3 1 2
  Vietnamese 1   1 0 0
 
Hawaiian/Pacific Islander 20   17 0 3
 
American Indian 622   585 17 20
 
Other 779   736 21 22
 
American 743   692 10 41
             
None (note 2) 241   222 11 8
 
1 R00102., 'Racial/Ethnic Origin with Which R Identifies Most Closely,' is used unless it was not answered, in which case R00096., '1st or Only Ethnic Origin,' is used. Those listing only one ethnic background did not answer R00102.
2 Includes totals of 98 "don't know," 132 "none," 10 "invalid skips," and 1 "refusal."
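The selection rule in note 1 amounts to a simple fallback. A minimal Python sketch follows; the function and argument names are hypothetical, and it assumes that unanswered items carry negative codes (or are missing entirely), which should be checked against the codebook before use.

```python
def primary_origin(most_closely, first_or_only):
    """Pick the origin code used to classify a respondent in Table 1.

    most_closely  : value of R00102. (origin identified with most closely)
    first_or_only : value of R00096. (1st or only ethnic origin)
    Assumption: negative values or None mean the item was not answered.
    """
    if most_closely is not None and most_closely >= 0:
        return most_closely
    # Respondents who listed a single origin never answered R00102.,
    # so fall back to the first (or only) origin.
    return first_or_only
```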

Immigration

In 1990, NLSY79 respondents born outside the United States, its territories, or Puerto Rico were asked a series of questions on their immigration history and visa status. Dates of first and most recent entrance into the United States to live for six or more months and information on whether the respondent was the principal entrant/immigrant were collected. For respondents' or principal entrant/immigrants' first and most recent entry or change in visa/immigration status, details were gathered on:

  1. visa or immigration status at entry date
  2. form of temporary entry visa
  3. citizenship status (that is, citizen or permanent resident alien) and relationship of the sponsoring relative
  4. country of citizenship at entry date or date of change of status

Also recorded for the respondent was information on:

  1. current citizenship/residence/visa status in the United States
  2. residence inside/outside the United States
  3. expectations to return to the United States to live permanently or to return to his or her country of birth to live permanently
  4. the total number of years spent outside the United States since initial entry

Of related interest is the variable 'Is R a Citizen of the U.S.,' available from the 1984 and 1990 interviews.

Foreign Language Used or Spoken

For each household member, information is available from the screener on presence of a Spanish surname and whether Spanish was the language spoken in the home when that individual was a child. The 1979 interview asked whether a foreign language (Spanish, French, German, other) was spoken at home during the respondent's childhood. In addition, interviews record for each survey whether English, Spanish, or another foreign language was used to administer the Household Interview Forms ('English or Foreign Language Used for Household Record') and questionnaire ('Int Remarks - Was Interview Conducted in English or Foreign Language').

Comparison to Other NLS Cohorts: Race is available for all cohorts; ethnicity is available for all cohorts except the Older Men and Young Men. Users should be aware that coding categories for race and ethnicity have varied among cohorts and over time. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide.

Reference

NORC. 1978 Household Screener and Interviewer's Reference Manual. Chicago, IL: National Opinion Research Center - University of Chicago, 1978.

Survey Instruments & Documentation
  • Race and ethnicity variables originating from the screener are located on the second page of the Household Screener. Questions concerning the ethnicity of the respondent are included in the "Family Background" section (Section 1) of the 1979 questionnaire. Interviewer remarks regarding race are located in the final section ("Interviewer's Remarks") of each questionnaire. Immigration questions are located in Section 13, "Immigration," of the 1990 questionnaire.
  • For further information on the coding of race and ethnicity in the Household Screener, see the 1978 Household Screener and Interviewer's Reference Manual (NORC 1978). Those needing additional information on coding procedures should request a copy of a NORC memo dated 10/4/78 available from NLS User Services.
Areas of Interest
  • 'Birthplace (Country and State) of R's Mother/Father' and 'Birthplace (Country) of Father's Father' are available in "Geocode 1979" (on the Geocode CD) areas of interest. 
  • Race and ethnicity variables are included in the following areas of interest: 'Racial/Ethnic Cohort from Screener' is a "Common" variable. Ethnicity variables originating from the 1979 interview as well as all immigration variables have been placed in the "Family Background" area of interest. The interviewer's remarks variables are located in "Interviewer Remarks." Race variables for household members originating from the 1978 household screening are located in "Misc. 1979."  'Current Residence In "Misc. xxxx."

Household Composition

Created variables

TYPE OF RESIDENCE. These variables reflect the type of residence in which the respondent was living at each survey point (e.g., own dwelling unit, parental household, jail). Although these variables exist for each year, they were actually created (compiled from multiple versions of the Household Interview) only for 1979-1986. A single version of the Household Interview was used beginning in 1987.

Important information: Using household composition data

  • Some familiarity with the following survey instruments (see Survey Instruments section for descriptions of each of these instruments) which gather information on households is necessary:
    • the NLSY79 Household Interview Forms
    • the NLSY79 "Household Enumeration"
    • the NLSY79 Face Sheet
    • and the household screeners that were used to select respondents for the NLSY79 cohort
  • This section does not contain information pertaining to variables about the characteristics or experiences of household members, the presence of partners within the household, or geographical areas of residence. Any information collected specifically on household members will be in specific topics of interest, such as age, sex, educational status, and so forth. The availability of information on partners is discussed in the Marital Status, Marital Transitions & Attitudes section.
  • Income of partners is omitted from 'Total Net Family Income', family size, and family income variables. Inferring a monetary relationship between household members who do not have a legal relationship by their own design is more tenuous than inferring a monetary relationship between designated family members. Therefore, partners are excluded. You can easily add or subtract from the family size by designating your own qualifying relationships.
  • Spousal pairs are inconsistent for three respondents. In the created relationship codes for household members (R00001.51, R00001.53), respondents 9707, 8522, and 1414 are considered spouses of 9706, 8521, and 1413, respectively. However, 9706 is considered 9707's partner, 8521 is considered 8522's "other non-relative," and 1413 is considered 1414's husband or brother-in-law. These assigned relationships are reflective of respondents' own explanations of the relationships. Relationship codes linking respondents may be weak outside of immediate family relationships. 
  • Do not use the 1982 'Version of Household Record from Last Interview' as a substitute for the missing 1981 version because it may contain inaccuracies and because not all 1981 interviewees were interviewed in 1982.
  • This section describes variables related to household and family composition, household identification, linkages between members of multiple respondent households, and household residence.

Household members

The term "household" refers to all individuals sharing the respondent's primary residence at the time of the interview. For respondents living in temporary quarters (except temporary military quarters), the usual residence is defined as that person's permanent residence. For those living in their own dwelling unit or in military family housing, the usual residence is the person's dwelling unit. For example, if a male college student is living in a temporary residence, such as a fraternity, those who share his permanent residence, such as his parents' address, would be considered his household members. However, if that same college student were living in his own apartment, all those living in his apartment would be considered his household members.

Persons analyzing military households should note that household screener information was not collected for persons in the military sample. Thus, while military units are included in the total 8,770 unique households, military units cannot be multiple respondent households. Household specification for those respondents enlisted in the military is as follows: (1) for those in the military who are married but living in military quarters other than military family housing, the household is the household of the respondent's spouse, and (2) for those in the military who are unmarried, no household information is recorded.

During PAPI surveys, information about a respondent's household was gathered during a separately administered household interview. Three different versions of the Household Interview Forms were used prior to 1987:

  • Version A was completed by a parent of those respondents living in a parental household
  • Version B was administered to youth not living at a permanent address
  • and Version C was answered by those respondents living in their own dwelling unit or independent living quarters

Table 1 details, by survey year, the relevant universes and residence types specific to each version; notes on variations in administration of the forms are included. A series of variables entitled 'Version of Household Record Used' is available for the 1979-80 and 1982-86 survey years. To determine the version of the household interview used in 1981, it is necessary to match information from the variable, 'Type of Residence R is Living In,' to residence information that was included on the three different forms. Beginning in 1987, only one version of the household interview was used, as all respondents were 22 or older and living predominantly on their own. Since the introduction of CAPI interviews in 1993, household information has been collected in the first section of the main questionnaire rather than in a separate instrument.
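As a rough illustration of the 1981 matching step, the lookup below maps a residence type to a form version using the 1981 rows of Table 1. It is a sketch only: the text labels are hypothetical stand-ins for the numeric codes of 'Type of Residence R is Living In,' and the age-dependent exceptions in the table notes (respondents living in the other parent's or a spouse's parents' home) are not handled.

```python
# Residence labels follow the abbreviations used in the 1981 rows of Table 1.
VERSION_BY_RESIDENCE_1981 = {
    "parental home": "A",
    "dorm": "B",
    "jail": "B",
    "hospital": "B",
    "mil/temp living quarters": "B",
    "own dwelling unit": "C",
    "orph": "C",
    "relig": "C",
    "mil/other living quarters": "C",
}


def household_version_1981(residence_label):
    """Return 'A', 'B', or 'C' for a residence label, or None if unmatched."""
    key = residence_label.strip().lower()
    return VERSION_BY_RESIDENCE_1981.get(key)
```

Returning None for unmatched labels makes cases that need manual review easy to isolate.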

All members of the respondent's household are enumerated each survey year on the household record; in 1978, household members were listed on the household screener. The relationship generally listed for each household member on the household record is relative to the youth respondent, such as 'Household Record - Relationship to Youth Member # 1.' For variables from the screener and for one series of 1979 household record variables, the relationship of household members (only family members in the screener) is relative to the householder. Anyone who lives in the residence but is temporarily away is listed; anyone who is there only temporarily is removed from the listing. For the screener and for interviews in which the respondent lives in a new household, that is, living with new people rather than living at a new address, the householder generally is listed first, followed by a spouse; any children; any other relatives; and any roomers, boarders, hired help, or other usual unrelated residents.

Table 1. Guide to the household forms: NLSY79 1979-2018
Year | Household Version | Conducted with | R's Residence
1979 | Version A | Parent of R only | Parental home
1979 | Version B1 | Youth Respondent | Dorm, jail, hospital, temporary living quarters
1979 | Version B2 | Youth Respondent | Military sample member
1979 | Version C | Youth Respondent | Own dwelling unit, orphanage, religious institution, other living quarters
1980 | Version A | Parent of R only | Parental home
1980 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1980 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 2)
1981 | Version A | Parent of R only | Parental home (note 1)
1981 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1981 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 2)
1982 | Version A | Parent of R only | Parental home (note 1)
1982 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1982 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 2)
1983 | Version A | Parent of R only | Parental home (note 3)
1983 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1983 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 4)
1984 | Version A | Parent of R only | Parental home (note 3)
1984 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1984 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 5)
1985 | Version A | Youth R or Parent | Parental home (note 6)
1985 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1985 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 4)
1986 | Version A | Youth R or Parent | Parental home (note 6)
1986 | Version B | Youth Respondent | Dorm, jail, hospital, mil/temp living quarters
1986 | Version C | Youth Respondent | Own dwelling unit, orph, relig, mil/other living quarters (note 4)
1987-2020 | One HH version only | Youth respondent only | Any residence
1 Includes youth respondents under 18, living in other parent's or spouse's parents' home.
2 Includes youth respondents over 18, living in other parent's or spouse's parents' home.
3 Preferred version of household interview for youth respondents living in other parent's or spouse's parents' home.
4 Permissible (though not preferred) version of household interview for youth respondents living in other parent's or spouse's parents' home.
5 Included some youth respondents still in parental household (with explanation as to circumstances--code "17" added).
6 Included youth respondents in other parent's or spouse's parents' home (codes "18" and "19" added to reflect whether household interview conducted with the youth respondent or the parent).

Family members

Within the listing of household members, family units are identified through family unit numbers and relationship codes. A family unit includes all those related by blood, marriage, or adoption. For each member of the household in every survey year, including the 1978 screener, the family unit number is listed on the "Household Enumeration" or the screener, such as 'Household Record - Family Unit # 1 Member # 1.' All family members in an interrelated group will share a family unit number, with number 1 assigned to the respondent's family. Each additional interrelated group or individual adult sharing the household but not related to another group or individual in the household constitutes an additional family unit. For example, if Mr. and Mrs. Brown are boarders in the same house with Mr. and Mrs. Smith, the Smiths are the first family unit and the Browns are a second family unit. The reliability of 1979-92 family unit numbers beyond those assigned to the respondent's family is questionable. Beginning in 1993, family units were assigned electronically; the definition of a family unit remains the same. Codes were added for partner's family/relations. All others are assigned a code of "9."

An enumeration of a respondent's children is also available. Several variables have been created as part of the Supplemental Fertility File (area of interest "Fertility and Relationship History/Created"), including variables such as '# of Own Children in Household,' 'Age of Youngest Child in Household,' and a variety of variables for each biological child listed, with some exceptions, in order of age. Unedited variables from the Children's Record Form (areas of interest "Child Record Form/Biological" and "Child Record Form/Nonbiological") are also available for both biological and nonbiological children. If the household rosters and the marriage event histories disagree, use the marriage event histories rather than the household roster data. See the "Fertility" section of this guide for more information about the collection of information on the respondent's children.

Additionally, information about household members includes the sex of the member, his or her relationship to the respondent, age, highest grade completed, and whether the member receives pay for work (age restrictions apply).

Finally, information on whether the mother and father of each child (in 1991, new children only) live in the household is available for the 1987-2014 survey years. In all other years, information on whether the father of the child is present is available for children of female respondents.

Family size

Beginning with the 1990 release, a family size variable, comparable to the family size variable created for the computation of the 'Total Net Family Income' and 'Poverty Status' variables, was created for each year. The variable is constructed by simply cycling through the household record "relationship codes" and increasing the family size by one each time a qualifying relationship relative to the respondent is encountered. Qualifying relationships include all relations by blood, marriage, and adoption. Foster relationships, partners, boarders, guardians, and other individuals are not considered family members in the creation of this variable.

Program derivation

The SPSS program statements for a sample survey year (1979-92) FAMILY SIZE variable are as follows:

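* Count roster members whose relationship code is a qualifying relationship.
COUNT FAMSZXX=RELR1 TO RELR15 (0 THRU 32,37 THRU 44,47 THRU 49).

* Assign -5 to cases whose sampling weight is zero.
IF (WEIGHTXX EQ 0) FAMSZXX=-5.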

After 1993, the roster was expanded to accommodate up to 20 individuals. The SPSS program is the same but the number of relationships to check is five larger. Additionally, the respondent is not on the household roster after 1992, so FAMSZXX is initialized to "1."
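For users working outside SPSS, the same counting logic can be approximated in Python. This is a sketch under stated assumptions rather than the program used to create the released variables: the function and parameter names are hypothetical, the qualifying code ranges are copied from the SPSS statement above, and the `respondent_on_roster` flag is an illustrative way to handle the later rosters on which the respondent does not appear.

```python
# Qualifying relationship codes, taken from the SPSS COUNT statement:
# 0 THRU 32, 37 THRU 44, 47 THRU 49.
QUALIFYING_CODES = set(range(0, 33)) | set(range(37, 45)) | set(range(47, 50))
ZERO_WEIGHT_CODE = -5  # value assigned when the sampling weight is zero


def family_size(relationship_codes, sampling_weight,
                respondent_on_roster=True, qualifying=QUALIFYING_CODES):
    """Count household members whose relationship code qualifies as family.

    relationship_codes   : roster relationship codes for one respondent-year
    sampling_weight      : that year's sampling weight
    respondent_on_roster : False for rosters that omit the respondent, in
                           which case the count starts at 1
    qualifying           : set of codes treated as family relationships
    """
    if sampling_weight == 0:
        return ZERO_WEIGHT_CODE
    size = 0 if respondent_on_roster else 1
    size += sum(1 for code in relationship_codes if code in qualifying)
    return size
```

Passing a different `qualifying` set is one way to add or drop relationships such as partners when building a custom family size measure.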

Household identification and linkages

The NLSY79 screening procedure allowed more than one member of a household to be selected for interviewing. The original 12,686 respondents were members of 8,770 households; 6,742 respondents or 53 percent of the sample were members of households from which more than one respondent originated, while 5,944 respondents or 47 percent were members of single respondent households (Table 2). To establish linkage of respondents originating from the same household, variables identify other interviewed household members and their relationships as of the 1979 interview. The 1979 variable providing the unique household identification number of each household is R00001.49, 'Household Identification Number' (HHID). The same HHID is assigned to all respondents who originated from the same household in 1979. In multiple-respondent households, the HHID corresponds to the lowest respondent 'Identification Code' of all respondents interviewed in that household; in single-respondent households, the HHID corresponds to the respondent 'Identification Code.' The HHID variable was constructed using other created variables from the NLSY79 main data set and exists only for 1979.  Multiple respondent households can also be identified through variables that identify other respondents in the same household and their relationship to the first respondent. Reference numbers include R00001.50-R00001.61, for example, 'Identification Code of 1st Other Interviewed Youth in R's Household.'
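The HHID rule described above (the lowest identification code among the respondents interviewed in a household) can be illustrated with a short Python sketch. The function name and inputs are hypothetical, and treating None or non-positive values as "no other respondent" is an assumption about how empty slots are coded.

```python
def household_id(respondent_id, other_youth_ids):
    """Return the lowest ID among a respondent and co-resident respondents.

    respondent_id   : the respondent's own identification code
    other_youth_ids : identification codes of other interviewed youth in the
                      household; None or non-positive entries are ignored
    """
    valid_others = [i for i in other_youth_ids if i is not None and i > 0]
    # In a single-respondent household the list is empty, so the HHID is
    # simply the respondent's own identification code.
    return min([respondent_id, *valid_others])
```

Every respondent from the same household gets the same value, so grouping on the result recovers the household clusters.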

Table 2. Distribution of respondents living within single and multiple respondent households: NLSY79

Household Type (note 1) | Households | Respondents | % of Sample (note 2)
Single Respondent Households | 5944 | 5944 | 46.9
Multiple Respondent Households | 2826 | 6742 | 53.1
  2 Respondent Households | 1985 | 3970 | 31.3
  3 Respondent Households | 634 | 1902 | 15.0
  4 Respondent Households | 170 | 680 | 5.4
  5 Respondent Households | 32 | 160 | 1.3
  6 Respondent Households | 5 | 30 | 0.2
Total | 8770 | 12686 | 100.0
1 Household types are based on information gathered during the 1978 household screening.
2 Numbers have been rounded to the nearest tenth.
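The respondent counts and percentages in Table 2 follow directly from the household counts. The short Python check below reproduces them; the variable names are illustrative, and only the household counts from the table are used as input.

```python
# Number of households by how many respondents each contributed (Table 2).
HOUSEHOLDS_BY_SIZE = {1: 5944, 2: 1985, 3: 634, 4: 170, 5: 32, 6: 5}

# Respondents per row = households x respondents per household.
respondents_by_size = {size: size * n for size, n in HOUSEHOLDS_BY_SIZE.items()}

total_households = sum(HOUSEHOLDS_BY_SIZE.values())
total_respondents = sum(respondents_by_size.values())
multi_respondents = total_respondents - respondents_by_size[1]

# Share of the full sample, rounded to the nearest tenth of a percent.
share_of_sample = {
    size: round(100 * n / total_respondents, 1)
    for size, n in respondents_by_size.items()
}
```

Because each row is rounded separately, the multiple-respondent shares sum to 53.2, while the percentage computed from the combined 6,742 respondents is 53.1.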

Although these matches represent unique samples for a number of research topics, be aware that matches may not be demographically representative due to the age restrictions applied to all members chosen from a household. The primary types of relationships that existed among respondents within multiple respondent households at the time the surveys began included brothers, sisters, husbands, and wives (Table 3). Other relationships included cousins, brothers- and sisters-in-law, step-brothers or -sisters, and other types of household members.

Table 3. Number of NLSY79 Civilian Respondent Pairs Interviewed in 1979 and 1992

Type of Pair | Respondent Members 1979 | Respondent Members 1992 | Households 1979 | Households 1992
Siblings | 5863 | 4806 | 2448 | 2149
  Two Siblings | 3386 | 2744 | 1693 | 1572
  Three Siblings | 1725 | 1427 | 575 | 446
  Four Siblings | 604 | 519 | 151 | 116
  Five Siblings | 130 | 99 | 26 | 13
  Six Siblings | 18 | 17 | 3 | 2
Spouses (note 1) | 334 | 216 | 167 | 120
1 Excludes three cases in which the relationship assigned to the respondent pair is "spouse" for only one member of the pair.

Household residence

Household residence refers to the type of dwelling or living situation of the respondent. Household residence information is available for the respondent at each survey point, for the respondent during his or her childhood, and for the respondent's children during recent surveys. The variable 'Type of Residence R is Living In' classifies the respondent's actual place of residence at the time of each survey. From 1979-86, it was created based upon responses to several questions asking about different types of dwelling units. In these years, several versions of the Household Interview Forms (the instrument completed before the main questionnaire and used to construct the household enumeration) existed. The universes for these different versions were dependent upon the type of dwelling unit in which the respondent lived (parental home, own dwelling unit, individual or group quarters), the sample type of the respondent (military or civilian), and who answered the household interview section questions (respondent or parent). The responses to questions designating type of residence from each of these versions were combined into one variable reflecting type of residence for the entire sample.

Beginning in 1987, the several versions of the Household Interview Forms were combined and all types of residences were coded in one question. Therefore, after 1986, this question is no longer considered a "created" variable. The 'Type of Residence R is Living In' variables include categories such as dorm, fraternity or sorority, hospital, jail or juvenile detention center, orphanage, religious institution, own dwelling unit, parents' household, and specific types of military quarters. The codes assigned to response categories for type of residence in 1979 differ significantly from those in other survey years. Also, in earlier years, respondents living in parental homes were treated as valid skips; in later years, these respondents were assigned a separate code that differs by year.

Retrospective information describing the respondent's childhood living arrangements was collected during 1988 in a three-part series of questions on the Childhood Residence Calendar. In Part 1, the respondent's identification of any type of parent with whom he or she lived for four or more months was recorded. Coding categories included biological, adoptive, or stepmother or father for each age from birth through 18 years, for example, 'Lived with Biological Mother at Birth,' 'Lived with Adoptive Father at Age 16.' Ages at which the respondent stopped living with a parent, the reason for ending shared living arrangements, and the frequency of visitation with the absent parent during the first year after coresidence ended were also collected. For those ages when the respondent reported not living with a parent, information was collected in Part 2 of the Childhood Residence Calendar on:

  • coresidence with grandparent(s), other relative(s), foster parent(s), or friend(s)
  • residence in a children's home or orphanage, a group care home, a detention center/jail/prison, or another institution
  • use of another type of arrangement
  • for those ages ten and over, whether the respondent was left to be on his or her own

Variable titles for this series include 'Lived with Foster Parent(s) (Not Living With a Parent) at Age-7,' 'Lived in Children's Home/Orphanage (Not Living With a Parent) At Age-2,' and 'Left to be on Own (Not Living With a Parent) at Age-15.' The number of foster or group care arrangements experienced by the individual is also recorded. Finally, Part 3 of this supplement identifies, for each age during which the respondent experienced more than one living arrangement when not living with a parent, the place at which he or she stayed the longest. Data quality issues are discussed by Haurin (1991).

Information on the residence of respondents' children is available, for the most part, since the 1982 survey year. Note that edited variables based on the 1989 and 1991 raw data are not available until the subsequent year's release. These edited variables, cleaned and checked for consistency, include residences of each biological child in birth order (with some anomalies), such as 'Usual Residence of 7th Child,' and combine information collected for residence of children of male respondents with that of children of female respondents. Coding categories include in the respondent's household, with other parent, with other relatives, in foster care, with adoptive parents, in a long-term care institution, away at school, deceased, lives part-time with both parents, lives part-time with the respondent and another person, and other. The unedited variables upon which the edited variables are based can be found in the "Fertility" section of the main youth questionnaire and include residence of all biological children; residence of all children born by the time of the previous interview, collected annually since 1984; and residence of all children born since the last interview, collected since 1983. Unedited residence information for nonbiological children is available for 1985, 1986, 1988, 1990, 1992, and from 1994 forward. Coding categories for all unedited variables are the same as for edited variables.

References

Haurin, R. Jean. "Childhood Residence Patterns: Evidence from the National Longitudinal Surveys of Work Experience of Youth." Columbus, OH: CHRR, The Ohio State University, 1991.

Morgan, William R. "Sibling Influences on the Career Plans of Male and Female Youth." Columbus, OH: CHRR, The Ohio State University, 1983.

Comparison to Other NLS Surveys

All biological children of NLSY79 mothers are included in the NLSY79 Child data set. NLSY79 young adults, regardless of whether they are living with their mothers, complete a household interview almost identical to that in the main youth survey.

Information on the respondent's household is available for all other cohorts for most survey years. Data generally include the age, sex, relationship to the respondent, and educational attainment of all occupants; the enrollment status of those of school age; and the occupation and weeks worked of residents age 14 and older. In the pre-1980 surveys of the Original Cohorts, data were generally collected only for family members living in the respondent's household and not for unrelated household members. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide.

Survey Instruments & Documentation

The 1988 childhood residence data were collected using questions in Section 16 "Childhood Residence" and the supplemental Childhood Residence Calendar. Information on residence of respondent's children is collected in the "Fertility" section of the questionnaire. Questions on distance of a respondent's child to the child's mother, father, or to the respondent also are located in the "Fertility" section. The questions on distance from the respondent's residence to that of his or her father or mother can be found in the "Family Background" section of the 1979 questionnaire. General information on the Supplemental Fertility File variables, such as the edited residence of children variables, can be found in Appendix 5: Supplemental Fertility and Relationship Variables of the NLSY79 Codebook Supplement. A technical appendix in Morgan (1983) presents details on respondent sibling matching procedures.

Areas of Interest
  • The family size and type of residence variables are included in the "Key Variables" area of interest. 
  • Edited residence of children variables have been placed in "Fertility and Relationship History/Created," while unedited residence of children variables have been placed in the "Birth Record" and "Birth Record xxxx" areas of interest.
  • Information from the household interview, which is transcribed onto the household enumeration, is included in "Household Record."
  • The distance from the respondent's residence to that of each child not living in the household, as well as the distance each child lives from his or her mother (for children of male respondents) or father (for children of female respondents), is available in the "Birth Record xxxx" area of interest for 1984-86, 1988, 1990, 1992, and 1994-2016. The distance from the respondent's residence to the residence(s) of the respondent's mother and father was collected during the 1979 interview.

Geographic Residence & Neighborhood

Created variables

PUBLIC USE VARIABLES

  • Region of residence at each survey date (Northeast, North Central, South, or West)
  • Information on whether the current residence is in an urban or rural county
    • Through 1996, this series was based on the respondent's State and county of residence and the "% urban population" data from the County & City Data Book. From 1998 to 2002, this item was based on whether the respondent was living in an urbanized area or in an area with a population greater than 2,500. Beginning in 2004, this item indicates whether the respondent resides within an urban cluster or urbanized area. For further information, see the Geocode Codebook Supplement.
  • Information on whether the current residence is in a Metropolitan Statistical Area (MSA), the central city of an MSA, or outside of an MSA
    • Based upon zip code, State, and county matches with metropolitan statistical designations for place of residence, the location of the respondent is determined to be within or outside of a metropolitan statistical area.
  • Beginning in 1988, whether the current residence is in the United States

GEOCODE FILE VARIABLES

  • The specific county and State (both edited) of residence at the time of interview, coded with Federal Information Processing Standards (FIPS) codes
  • Similar information is provided for the respondent's residence at birth and at age 14
  • The specific metropolitan area of residence at the time of interview. As applicable, information may be included for the following types of metropolitan areas:
    • SMSA-Standard Metropolitan Statistical Area
    • MSA-Metropolitan Statistical Area
    • CMSA-Consolidated Metropolitan Statistical Area
    • PMSA-Primary Metropolitan Statistical Area
    • NECMA-New England County Metropolitan Area
    • CBSA-Core Based Statistical Area
  • Distance between respondent addresses at each interview round (see Appendix 22: Migration Distance Variables for Respondent Locations):
    • This supplements the data on State and county of residence and is available only on the Geocode release
    • The distance between the respondent's addresses at each date of interview was created for all unique pairs of survey years
    • The data described here do not provide a location for the respondent's residence; these variables provide only the distances between the various places the respondent has lived
    • This pairwise matrix of variables supports various types of migration research by allowing users to consider the distance between residences and to identify return migration to an area where the respondent has lived in the past
  • Indicators of the quality of the geographic data:
    • An address may not be available for the respondent
    • In such cases, the respondent's address is geocoded to the centroid of the zip code when the zip code can be determined
    • To identify these cases, an indicator for the quality of this distance measure was created based on the quality of the matches in both years
  • An indicator for whether the respondent was located in the same zip code was created for all pairs of years
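The pairwise distance variables lend themselves to flagging return migration. The following is a minimal sketch, assuming the distances for one respondent have been loaded into a dictionary keyed by pairs of survey years. The data layout, the function name `find_return_moves`, and the 1-mile "same place" threshold are illustrative assumptions and are not part of the Geocode release.

```python
def find_return_moves(distances, years, same_place_miles=1.0):
    """Return (origin_year, away_year, return_year) triples for one respondent.

    distances: dict mapping (earlier_year, later_year) -> miles between the
               two interview addresses (hypothetical layout).
    years:     sorted list of survey years with a known address.
    """
    returns = []
    for i, origin in enumerate(years):
        for j in range(i + 1, len(years)):
            away = years[j]
            if distances[(origin, away)] <= same_place_miles:
                continue  # still at (or near) the origin address
            for back in years[j + 1:]:
                if distances[(origin, back)] <= same_place_miles:
                    returns.append((origin, away, back))
                    break  # record only the first return after this move away
    return returns


# Illustrative values only: the respondent is about 300 miles away in 1982
# and back near the 1980 location in 1984.
example = {
    (1980, 1982): 300.0,
    (1980, 1984): 0.4,
    (1982, 1984): 299.8,
}
print(find_return_moves(example, [1980, 1982, 1984]))
```

Because addresses of lower quality are geocoded to a zip code centroid, a threshold this tight could miss or invent returns; in practice the distance quality indicator would be consulted before treating a small distance as "same place."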

Important information about using restricted-use Geocode data

  • The level of detail available determines whether a variable is placed within the restricted release "Geocode xxxx" files. For example, general country level information, such as whether the respondent resided at various points in time within or outside of the United States, is available to all users with no restriction, while the specific county or SMSA in which he or she resided at a specific interview point is present only within the restricted-use Geocode data files.
  • Researchers interested in using restricted-use Geocode data must submit an application to BLS. These confidential files are available for use only at the BLS National Office in Washington, DC, and at Federal Statistical Research Data Centers (FSRDCs) on statistical research projects approved by BLS. Access to data is subject to the availability of space and resources. Information about applying to use the zip code and Census tract data is available on the BLS Restricted Data Access page.
  • The "Misc. xxxx" areas of interest contain a set of variables titled 'Does R Live on a Farm or in a Rural Area?' The interviewer answers this question based on observation when at the respondent's permanent residence; if the interview takes place elsewhere, the interviewer asks the respondent about the place of residence. There are no consistent criteria for the definition of nonfarm property as rural. These variables should not be considered a replacement for the created KEY VARIABLE, 'Current Residence Urban/Rural?'
  • The coding of respondents' geographic location before 1993 required extensive hand-editing and may not be completely accurate. The most common error is the potential assignment of a respondent to an adjacent county of residence. Data on addresses, zip codes, and phone numbers are used to clean the geographic codes. The post-1988 use of telephone number information improved data quality. A brief discussion below provides more information on both the hand-edits performed each year and the created variable that indicates the extent of hand-editing required for each case; see Appendix 10 in the Geocode Codebook Supplement for more details.

Geographic data for NLSY79 respondents fall into two categories: information on the main public file and more detailed information released as restricted-use Geocode data. Table 1 lists NLSY79 geographic variables along with their areas of interest. Variables with a "Geocode xxxx" area of interest are restricted-use data; all others are public use.

Table 1. Select Residence Variables by Survey Year and Area of Interest: NLSY79 Main and Geocode Files
Variables Survey Year(s) Area of Interest Documentation
Residence at Birth      
  Country - U.S. or Other Country 1979, 1983 Geocode 1979 --
  Country - Actual Other Country 1979 Geocode 1979 Attachment 101
  County 1979 Geocode 1979 Attachment 102
  State 1979 Geocode 1979 Attachment 102
  South/Non-South 1979 Family Background Attachment 100
Residence at Age 14      
  Country - U.S. or Other Country 1979 Geocode 1979 --
  Country - Actual Other Country 1979 Geocode 1979 Attachment 101
  County 1979 Geocode 1979 Attachment 102
  State 1979 Geocode 1979 Attachment 102
  South/Non-South 1979 Family Background Attachment 100
  Area of Residence - Urban/Rural 1979 Family Background User's Guide and Appendix 6
Present Residence      
  Lived in Since Birth 1979 Family Background --
  Year of Move to 1979 Family Background --
Most Recent Residence      
  5th-1st Country/County/State Since Jan. 1978 1979 Geocode 1979 Attachment 101, Attachment 102
  Month/Year of Move(s) 1979 Family Background --
  5th-1st Country/County/State Since Last Int. 1980 Geocode 1980 Attachment 101, Attachment 102
  Month/Year of Move(s) 1980 Family Background Attachment 102
  9th-1st Country/County/State Since 1980 Int. 1982 Geocode 1982 Attachment 101, Attachment 102
  Month/Year of Move(s) 1982 Family Background --
Current Residence      
  Region 1979-2020 Key Variables Attachment 100
  Urban/Rural 1979-2020 Key Variables User's Guide and Appendix 6
  SMSA/Central City 1979-2020 Key Variables User's Guide and Appendix 6
  In U.S. 1979-2020 Misc. xxxx NLSY79 User's Guide
  County 1979-2020 Geocode xxxx Attachment 102
  State 1979-2020 Geocode xxxx Attachment 102
  SMSA 1979-2020 Geocode xxxx Attachment 104
  PMSA 1979-2020 Geocode xxxx Attachment 104
  MSA 1979-2020 Geocode xxxx Attachment 104
  CMSA 1979-2020 Geocode xxxx Attachment 104
  MSA/CMSA/NECMA 1979-2020 Geocode xxxx Appendix 10
  CBSA 1979-2020 Geocode xxxx Appendix 10
  Main Reason for Moving Since Date of Last Interview 2018, 2020 Family Background NLSY79 User's Guide
         

Related Variables: Related NLSY79 main file variables discussed in the Household Composition and Family Background sections of this guide include:

  • Type of residence or dwelling unit at the time of interview (such as dorm, hospital, jail, orphanage, own home, and so forth)
  • Childhood living arrangements of NLSY79 respondents from birth to age 18, including not only information on persons with whom the respondent lived (such as biological versus adoptive and step-parents) but also on institutions such as children's homes, group care homes, or detention centers/jails/prisons in which he or she may have resided.

Geocode file variables

  • Information on the State, county, and metropolitan statistical area of residence for each respondent (the current residence variables) is merged with information from several other data files, namely the City Reference File (Census 1973, 1982, 1983, 1987, 1992) and the County & City Data Book (Census 1972, 1977, 1983, 1988, 1994), to provide detailed information on the environmental characteristics of the State, county, and metropolitan statistical areas in which each NLSY79 respondent resides. NOTE: Users may attach additional county and metropolitan statistical area-level data from a variety of sources by merging information from the desired source with the Geocode data based upon the State, county, and metropolitan statistical area of residence codes in the Geocode file.
  • For select survey years Geocode information is available on the location of respondents' jobs, the location of colleges attended, and the point of discharge from military service
  • Unemployment rate of each respondent's labor market of current residence:
    • The source of the 'Unemployment Rate' variables is the May issue of the Bureau of Labor Statistics' Employment and Earnings for the year following the survey year. Figures from March of each survey year are used. This table supplies unemployment rates for each State and for selected metropolitan statistical areas. Respondents who reside within one of these metropolitan statistical areas are assigned the appropriate unemployment rate. For those residing outside of these areas, a "balance of State" unemployment figure is computed using State total figures for the size of the civilian labor force and the number employed and subtracting the population living in metropolitan statistical areas.
    • Additional information on these variables can be found in Appendix 7 in the NLSY79 Geocode Codebook Supplement.
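The "balance of State" calculation can be illustrated with a short sketch. It assumes that the adjustment removes the labor force and employment counts of the listed metropolitan statistical areas from the State totals, which is one reading of the description above; the procedure actually used is documented in Appendix 7. The function name and the figures are hypothetical.

```python
def balance_of_state_rate(state_labor_force, state_employed, msa_figures):
    """Unemployment rate (percent) for the part of a State outside listed MSAs.

    msa_figures: iterable of (labor_force, employed) pairs, one per MSA
                 within the State that has its own published rate.
    """
    msa_labor_force = sum(lf for lf, _ in msa_figures)
    msa_employed = sum(emp for _, emp in msa_figures)

    balance_labor_force = state_labor_force - msa_labor_force
    balance_employed = state_employed - msa_employed
    if balance_labor_force <= 0:
        raise ValueError("no labor force remains outside the listed MSAs")

    unemployed = balance_labor_force - balance_employed
    return 100.0 * unemployed / balance_labor_force


# Made-up figures (thousands of persons), not taken from Employment and Earnings.
rate = balance_of_state_rate(
    state_labor_force=1000.0,
    state_employed=930.0,
    msa_figures=[(400.0, 376.0), (200.0, 188.0)],
)
print(round(rate, 2))
```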

Types of County or Metropolitan Statistical Area Environmental Characteristics in the NLSY79 restricted-use Geocode data:

  • Population sizes
  • Percent of population that is:
    • urban
    • black
    • female
    • under 5 years old
    • 65+ years old
  • Birth/death/marriage/divorce rates
  • Physician and hospital bed rates
  • Crime rates
  • Poverty level data
  • Educational attainment levels 
  • Median family and per capita income
  • Recipients of and payments from:
    • AFDC
    • SSI
    • Social Security
  • Labor force statistics:
    • total labor force
    • civilian labor force
    • number of females in the civilian labor force
    • civilians unemployed versus employed
    • percent employed in various industries
  • Unemployment rate for labor market of residence
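Area-level measures from outside sources can be attached in the way the note on merging additional data describes, by matching on State and county codes. The sketch below uses only plain Python dictionaries; the field names `state_fips` and `county_fips`, the record layout, and the `median_rent` measure are assumptions made for illustration and do not correspond to actual Geocode variable names.

```python
def attach_county_data(respondents, county_table):
    """Left-join county-level measures onto respondent records.

    respondents:  list of dicts with 'state_fips' and 'county_fips' keys
                  (hypothetical field names).
    county_table: dict mapping (state_fips, county_fips) -> dict of measures.
    """
    merged = []
    for row in respondents:
        key = (row["state_fips"], row["county_fips"])
        extra = county_table.get(key, {})  # unmatched rows gain no new fields
        merged.append({**row, **extra})
    return merged


# Illustrative records; the IDs and the measure values are invented.
respondents = [
    {"id": 1, "state_fips": "39", "county_fips": "049"},
    {"id": 2, "state_fips": "17", "county_fips": "031"},
]
county_table = {("39", "049"): {"median_rent": 850}}

merged = attach_county_data(respondents, county_table)
print(merged)
```

Keeping the codes as zero-padded strings avoids mismatches when one source stores them as numbers and another as text.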

Geographic Residence: Detailed geographic mobility information was collected during the 1979-80 and 1982 surveys and from 2000 forward; data were gathered on the country/county/State and timing of up to five residential moves since January 1978 or since the last interview. Beginning in 2000, only significant geographical moves were recorded.

Neighborhood Quality: The neighborhood quality series (1992 and 1994-2000) is taken from the National Commission on Children Parent & Child Study, 1990 Parent Questionnaire. In this series of questions, respondents rate the extent to which issues such as crime, lack of police protection, unsupervised children, and joblessness are problems in their neighborhood.

Other Geographic Variables: Users may obtain special permission to use zip code and Census tract data available at the BLS offices in Washington, DC.

Edited versus Unedited Versions of State/County of Residence: For some years (1979-82, 1988-89, 1991-92), two versions of the State and county of residence variables have been included in the "Geocode xxxx" files. The set occurring at the beginning of each file is the edited version, while the variables found near the end of the files for these years are unedited. If the variable has an actual source question number/name, it is the original from NORC. If the source question name says created, it is the edited/created version. Note that the unedited variables are sometimes combined into a single variable, with the State and county code appended to each other. These raw variables are preceded by the word "GEOCODE" in the variable title. The edited residence variables contain the corrections made for erroneous address information and are the ones from which the Geocode files themselves are constructed. Users should be aware that the edited version of these variables does not contain data for those respondents who are in the active military forces or who are living abroad or in a U.S. territory.  Codes of "-4" appearing in the unedited versions of the State or county variables (because foreign country and U.S. territory codes are placed in one field or the other) should not appear in the edited versions of these residence variables.
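Where an unedited variable stores the State and county codes appended to each other, the two parts must be separated before they can be compared with the edited variables. The sketch below assumes the standard FIPS widths (a two-digit State code followed by a three-digit county code) and treats any negative value as a missing-data code. Both points are assumptions about the layout and should be checked against the codebook for the year in question.

```python
def split_state_county(combined):
    """Split a combined residence code into (state, county) FIPS strings.

    Assumes a 2-digit State code followed by a 3-digit county code, the
    standard FIPS widths; negative values are treated as missing-data codes
    and returned as (None, None).
    """
    if combined is None or combined < 0:
        return (None, None)
    text = f"{combined:05d}"  # restore leading zeros lost in numeric storage
    if len(text) != 5:
        raise ValueError(f"unexpected code width: {combined!r}")
    return (text[:2], text[2:])


print(split_state_county(39049))  # State '39', county '049'
print(split_state_county(1001))   # leading zero restored: State '01', county '001'
print(split_state_county(-4))     # missing-data code
```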

New Geocode Procedures for Assigning Residence Codes and Hand-Editing Discrepant Cases: During the 1988 hand-editing process, it became evident that the telephone numbers were very accurate, even in cases for which the address information contained discrepancies. Beginning in 1989, the area code and phone exchange were used to assign State and county of residence codes. The State assigned by the area code was then compared to the State assigned on the basis of zip code alone and the State contained in the original NORC respondent file. A "quality of match" variable was computed on the basis of how well these States match. For a more detailed discussion of these new assignment and matching procedures, refer to Appendix 10: Geocode Documentation in the Geocode Codebook Supplement. This process was used through the 1994 release.
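The idea of comparing three State assignments can be sketched as follows. This is not the "quality of match" variable itself, whose construction is given in Appendix 10; the function name and the three labels are hypothetical and only illustrate how agreement among the phone-based, zip-based, and NORC-reported States might be summarized.

```python
def state_match_level(phone_state, zip_state, norc_state):
    """Summarize agreement among three State codes for one respondent.

    Returns 'all_agree', 'two_agree', or 'none_agree'. These labels are
    illustrative and are not the codes used in the Geocode files.
    """
    distinct = len({phone_state, zip_state, norc_state})
    return {1: "all_agree", 2: "two_agree", 3: "none_agree"}[distinct]


print(state_match_level("39", "39", "39"))
print(state_match_level("39", "39", "18"))
print(state_match_level("39", "18", "17"))
```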

The hand-editing procedure has also been streamlined. In 1989, the first year in which the phone assignment procedure was used, the residence codes assigned on the basis of the area code and exchange were compared to the raw residence variables received from NORC. Those with information that did not match were identified for individual examination. Ideally, the discrepancies requiring individual examination would be reduced to those cases which are "genuine movers" or which have zip codes covering multiple counties and would require some verification that the correct county was assigned based upon the phone information. The current process for identifying discrepancies and hand-editing is aimed more directly at achieving this objective. 

Beginning in 1990, the residence codes assigned based on phone information were compared to the 1989 CHRR-edited residence information to identify cases for individual examination. Because the previous year's edited variables incorporate the corrections that were made in the hand-editing process from earlier years, repeated editing of the same cases across years decreased. Through this process, the discrepancies in residential Geocode information were reduced. The number of cases requiring individual examination also decreased and was restricted more closely to the population of "genuine movers" and people with multiple-county zip codes and phone numbers that require verification of county of residence. 

The hand-editing process in previous years included not only these genuine movers and multi-county zip code dwellers, but also other cases for which elements of the address are simply in error or incompatible with each other. Some of these cases could potentially require editing for the same errors in more than one year, even if the respondent stayed in one location. Hand-editing procedures were further streamlined, and in some cases automated, to produce the 1992 data.

Beginning in 1996, a new procedure for verifying and assigning correct final Geocode information was instituted. This procedure is now performed using specialized address tracking Geocode software. The processes are described in Appendix 10. CHRR staff believe that the current procedures not only identify true discrepancies and streamline hand-editing more efficiently but also assign State and county codes more accurately and consistently in general.

Missing Values, New England Cases, and Mobility: Missing values in location of residence variables and metropolitan statistical area codes are associated with respondents who are in the active military forces or who are living abroad or in a U.S. territory. Users should be aware that, because the New England County Metropolitan Area (NECMA) codes are not comparable to metropolitan statistical areas from the remainder of the country, New England cases are eliminated from some of the procedures used to construct the Geocode files.

The review and hand-editing process has been periodically revised to improve the accuracy of the data and the efficiency of data production. The potential effects of these changes on mobility rates between some years are noted in Appendix 10: Geocode Documentation. Users should read Appendix 10 carefully to gain a better understanding of the issues outlined above and their implications for specific research endeavors.

Comparison to Other NLS Cohorts: Data on the respondent's area of residence are available for all cohorts. Geographic residence information for those NLSY79 children who resided with their mother can be inferred from the residence data of their mothers. The NLSY97 main created variables indicate whether the respondent lives in an urban or rural area, whether the respondent lives in a Metropolitan Statistical Area, and in which Census region the respondent resides. More detailed information is available on the restricted-use Geocode data. Region of residence and geographic mobility of Original Cohort respondents are provided for most survey years.

Survey Instruments & Documentation

Data on residence at birth and at age 14, as well as the 1979-82 present/most recent residence series, were collected using questions found within Section 1 ("Family Background" and "On Family") of the 1979, 1980, and 1982 questionnaires. All other variables are created from or determined by the geographic information provided by each NLSY79 respondent within the locator section of the questionnaire or from the interviewing Face Sheet or internal NORC locating files. Several attachments and appendices in the NLSY79 Codebook Supplement and/or the NLSY79 Geocode Codebook Supplement offer creation procedure information and coding systems for the geographic residence variables. The following are relevant to the Geocode:

Areas of Interest

Residence variables can be found within the "Family Background," "Key Variables," "Geocode xxxx," or "Misc. xxxx" areas of interest; the table above specifies the particular areas of interest for each variable. All environmental variables, including the 'Unemployment Rate for the Labor Market of Current Residence,' are present in the "Geocode xxxx" areas of interest in the restricted-use Geocode data.

Gender

Important Information About Using Gender Data

During screening, sex was determined by observation and asked directly of respondents only if it was "not obvious" to the interviewer. On March 1, 1986, 'Sex of R' was changed for 45 cases as a result of inconsistencies generated from interviewer checks for respondent's sex in the fertility section of the 1982 survey instrument; three additional cases were changed shortly thereafter. Each of these cases was verified by NORC for accuracy. 'Sex of R' (R02148.) for the following identification codes (R00001.) was changed:

  • From male to female: 712, 1306, 1933, 2212, 2286, 2287, 2433, 3960, 4157, 6102, 7571, 7645, 7890, 8542, 8690, 8826, 9150, 9713, 10511, and 12676
  • From female to male: 1663, 3388, 3582, 3583, 3865, 4524, 4579, 4917, 5929, 6198, 6360, 6466, 6840, 7620, 7624, 8321, 8543, 8596, 9166, 9555, 10347, 11110, 11114, 12257, and 12387

The variable series 'Int Remarks - Sex of R' provides interviewers' observations of the sex of the respondent for the 1982-1986 and 1988-1998 survey years. These observations are subject to a small degree of error from erroneous interviewer observation or from recoding and data entry errors. As a result, a small number of respondents may appear to "change" sex across surveys in this series.
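One way to screen for such cases is to compare each round's interviewer observation with 'Sex of R.' The sketch below assumes the yearly observations for one respondent have been gathered into a dictionary keyed by survey year; that layout, the string values, and the function name are hypothetical.

```python
def apparent_sex_changes(remarks_by_year, sex_of_r):
    """List survey years whose interviewer-recorded sex differs from 'Sex of R'.

    remarks_by_year: dict mapping survey year -> recorded value, with None
                     for rounds without an observation (hypothetical layout).
    """
    return sorted(
        year
        for year, observed in remarks_by_year.items()
        if observed is not None and observed != sex_of_r
    )


# Invented example: one discrepant round and one round with no observation.
remarks = {1982: "female", 1983: "female", 1984: "male", 1985: None}
print(apparent_sex_changes(remarks, "female"))
```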

Variables available within the NLSY79 provide information on the sex of each respondent, their children, and members of their household. Information on the sex of the respondent can be found in:

  • a single 1979 variable, 'Sex of R' (R02148.)
  • a set of yearly interviewer remarks variables, 'Int Remarks - Sex of R' 

The 1979 'Sex of R' variable (R02148.) is derived from R01736., 'Sample Identification Code,' a variable which defines each respondent's membership in one of the subsamples that compose the NLSY79 (such as the "cross-sectional male, nonblack/non-Hispanic poor," "supplemental female black," and so forth). Subsample identification was based on information gathered during the 1978 household screening.

Comparison to Other NLS Cohorts: Sex for all biological children born to female members of the NLSY79 is available. Information on sex is also available for the NLSY97. Sex is implicit by membership in the Original Cohorts. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide.

Survey Instruments & Documentation

A copy of the 1978 Household Screener used to collect information on sex of the respondent and other household members can be found in the Household Screener and Interviewer's Reference Manual (NORC 1978). Interviewer observations are recorded in the final section of each questionnaire, entitled "Interviewer's Remarks." Household members' sex is collected during the administration of the Household Interview Forms. A copy of the Information Sheet, containing sex of respondents' children, can be found near the beginning of the yearly Question by Question Specifications. The Child Record Form (CRF) is a separate child "inventory" referenced in the "Fertility" section of the questionnaire; sample copies can be found in the Question by Question Specifications. Finally, a general description of the derivation of the Supplemental Fertility File variables, such as sex of children, appears in Appendix 5: Supplemental Fertility and Relationship Variables in the NLSY79 Codebook Supplement.

Areas of Interest

All sex variables discussed above are located on the main NLSY79 data set. 'Sex of R' (R02148.) and the 'Sample Identification Code' (R01736.) can be found in the "Common" area of interest, while the interviewer remarks variables are located in "Interviewer Remarks." The Supplemental Fertility File variables have been placed in "Fertility and Relationship History/Created." Children's sex, listed separately for biological and nonbiological children on the CRF, is in the "Child Record Form/Biological" and "Child Record Form/Nonbiological" areas of interest, respectively. Variables collected during the household interview can be found in "Household Record," and variables from the Household Screener are housed in "Misc."

Age

Created variables

AGEATINT: These variables provide the respondent's age at each interview date.

Important information: Using age data

The eligibility for inclusion in this cohort was based on the 1979 age reports, as are weights. Birth date questions were asked again in the 1981 survey because:

  • a number of discrepancies between birth dates found on the military file and the NLSY79 files were discovered
  • a number of inconsistencies between age as recorded on the "Household Enumeration" and the main questionnaire were apparent

The 1981 birth dates should be used to determine age, with the 1979 dates used only as a backup. Differences between 1979 and 1981 birth dates remained for approximately 200-250 respondents after the 1981 fielding; editing on a case-by-case basis was performed by CHRR staff on only the 1981 variable. Inconsistencies in age or birth date information may appear for a number of reasons: age and birth date information has been collected at multiple survey points, giving rise to respondent-reported inconsistencies; respondents' ages for sample selection were based on date of birth information reported at the time of the 1978 household screening by individuals who may not have been the respondent; and responses to interviewer check items, that is, the age reported to the interviewer that determines when age-specific questions should be asked, may not be the same age as that calculated from previously reported age or birth date information. For example, a respondent whose age was 16 as calculated from the birth date reported in 1981 may have answered questions that were specific to a 17-year-old.

Date of birth information was collected from each NLSY79 respondent during the 1979 and 1981 interviews:

  • The variable 'Age of R,' gathered during the 1979-83 surveys, is the self-reported age of the respondent as of the interview date.
  • The NLSY79 main data files also contain a yearly created variable, 'Age of R at Interview Date.' These created variables are constructed using the 1981 date of birth information coupled with the 1979 birth date for the 491 respondents not interviewed in 1981. 
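The preference for the 1981 birth date, with the 1979 report as a fallback, can be expressed in a few lines. This is a minimal sketch and not the code used to build 'Age of R at Interview Date'; the function name and arguments are hypothetical, and the created variable may treat partial or missing dates differently.

```python
from datetime import date


def age_at_interview(interview_date, birth_1981=None, birth_1979=None):
    """Age in completed years, preferring the 1981 birth date report."""
    birth = birth_1981 if birth_1981 is not None else birth_1979
    if birth is None:
        return None
    age = interview_date.year - birth.year
    # Subtract one if the birthday has not yet occurred in the interview year.
    if (interview_date.month, interview_date.day) < (birth.month, birth.day):
        age -= 1
    return age


# Invented dates: the two reports disagree, so the 1981 report is used.
with_both = age_at_interview(
    date(1979, 3, 15), birth_1981=date(1962, 6, 1), birth_1979=date(1961, 6, 1)
)
# Respondent not interviewed in 1981: only the 1979 report is available.
fallback = age_at_interview(date(1979, 8, 1), birth_1979=date(1960, 2, 10))
print(with_both, fallback)
```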

Table 1 shows the age distribution of the NLSY79 for the 1979 survey year. Table 2 shows the age distribution of the NLSY79 for the 2020 survey year.

Table 1. Age of NLSY79 respondents as of date of interview for the 1979 survey year
Age Male Female Total
14 533 471 1004
15 804 766 1570
16 784 780 1564
17 756 756 1512
18 838 798 1636
19 812 871 1683
20 829 827 1656
21 832 850 1682
22 215 164 379
Total 6403 6283 12686

This table uses the created variable 'Age of R at Interview Date.'

Table 2. Age of NLSY79 respondents as of date of interview for the most recent survey year (2020)
Age Male Female Total
55 14 16 30
56 375 376 751
57 432 476 908
58 446 489 935
59 425 517 942
60 430 437 867
61 345 375 720
62 306 384 690
63 280 363 643
64 23 26 49
Total 3076 3459 6535

This table uses the created variable 'Age of R at Interview Date.' In the 2020 survey year, 3327 males and 2824 females were not interviewed.
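The non-interview counts follow directly from the totals in the two tables, as the short check below shows. It uses only the figures printed above; the variable names are illustrative.

```python
# Totals from Table 1 (1979) and Table 2 (2020), by sex.
interviewed_1979 = {"male": 6403, "female": 6283}
interviewed_2020 = {"male": 3076, "female": 3459}

# Respondents in the 1979 sample who were not interviewed in 2020.
not_interviewed = {
    sex: interviewed_1979[sex] - interviewed_2020[sex]
    for sex in interviewed_1979
}
print(not_interviewed)

# Share of the original 1979 sample interviewed in 2020.
share_2020 = sum(interviewed_2020.values()) / sum(interviewed_1979.values())
print(f"{share_2020:.1%}")
```

The difference counts everyone not interviewed in 2020 for any reason, so it should not be read as a refusal or attrition rate on its own.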

Comparison to Other NLS Surveys

Age data are available for all NLS cohorts. These variables include both the age of the respondents as of a fixed date during the initial survey year and as of the interview date in various years. Date of birth is also available for all cohorts. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide.

Retirement

As the 1979 Youth cohort approached their fifties, questions were added about retirement preparation and expectations, starting with the 2006 survey year. 

Table 1 shows the retirement preparation/expectation variables and the survey years in which they were asked. Variable question names begin with the "RETIRE_EXP" prefix.

In 2018, the retirement section was revamped and expanded so that respondents (many of whom were starting to retire) could provide more complete information about their retirement funds. Information on retirement income can be found in the Pension Benefits & Pension Plans section of this User's Guide.

Also in 2018, respondents were asked if their employer had ever offered an early retirement window, one with a special financial incentive like a cash bonus or improved pension benefits. 

Table 1. Retirement Preparation & Expectation Questions asked of NLSY79 Respondents: 2006-2018
Question Survey Year
  2006¹    2008  2010    2012 2014    2016 2018
What is the probability that you will not be working for pay at age 67? age 65? age 62? * *  *  *      
How would you define retirement for yourself? ("Select All" categories include stop work, receive pension, work for fun, etc.) * *  *  * *    
Approximately at what age do you think you'll retire? * *  *  * *    
Do you expect to work for pay..or at a family business for pay/profit in the future?   *  * * *    
...Ever calculated how much retirement income you would need at retirement? * * *  * *    
...Ever consulted a financial planner about how to plan your finances after retirement? * * *  * *    
...Ever read any magazines or books on retirement spending? * * *  * *    
...Ever used a computer program to help you plan your retirement? * * *  * *    
...Ever attended any meetings on retirement or retirement planning? * * *  * *    
Were any of the meetings organized by your (or your spouse's/partner's) employer? * * *  * *    
Did you (or your spouse/partner) attend the meetings on a voluntary basis or was attendance required? * *  *   * *    
What do you think the chances are... that you will be working full-time after you reach age 62? (age 65?)          * *  *
...your health will limit your work activity during the next 10 years?          * *  *
...you (and spouse/partner) will leave inheritances totaling $10,000 or more? $100,000 or more?          * *  
...you will live to be 75 or more? (85 or more?)          * *  *
...you will ever have to move to a nursing home?          * *   * 
How much planning have you done for your retirement?              *
Right now, would you like to leave work altogether, but plan to keep working because you need health insurance?              *
Right now, would you like to leave work altogether, but plan to keep working because you need the money?              *

¹ Only a random sample of 991 respondents was asked the retirement expectations questions in 2006.

Survey Instruments & Documentation: Questions about retirement expectations can be found in the following sections: Retirement Expectations Part I (2006), Retirement Expectations Part II (2006), and Retirement Expectations (2008-2018).
Areas of Interest: Retirement Expectations variables are found in the "Retirement" area of interest.

Business Ownership

Created variables

BUSOWN_SOURCEYR. Survey year when business data was collected. Cross-round variable.

BUSOWN-UID-MATCH.01, 02, etc. Unique job ID in the Employer History Roster originally assigned to Business 01, 02, etc. Cross-round variable.

BUSOWN-MATCH-QUALITY.01, 02, etc. Provides quality of the match between Business 01, 02, etc. ownership reported and past employers. Cross-round variable.

BUSOWN-11_TRUNC.01, 02, etc. Money respondent used to establish/acquire Business 01, 02, etc. (amount truncated). Cross-round variable.

BUSOWN-16_TRUNC.01, 02, etc.  Respondent's estimate of sales/revenue of Business 01, 02, etc., generated in a typical year (amount truncated). Cross-round variable.

Note: Users are encouraged to use these XRND (cross-round) variables, as they combine the information gathered in 2010, 2012, 2014, and 2016.

Prior to 2010, only limited information was collected on business ownership in the NLSY79. In 2010 (round 24), NLSY79 respondents who were current or former business owners were asked a lengthy series of questions about each business owned since age 18. The business ownership questions were also asked in subsequent survey years for those not interviewed in 2010. 

Respondents who reported having owned a business gave the year each business was established and how ownership was acquired: whether the respondent established the business themselves or with partners, received ownership as a gift, purchased ownership, inherited ownership, or received an ownership stake through marriage. Respondents answered questions about working for a related type of business prior to starting their business, the source of the money used to establish or acquire the business, the number of employees and the number of physical locations of the business, the legal form of the business (sole proprietorship, corporation, etc.), the sales or revenue in a typical year, and whether the business was family owned.

In addition, all NLSY79 respondents interviewed in round 24 (regardless of "own business" status) were asked several questions on family ownership of businesses, their own patent-seeking activities, and whether they consider themselves to be entrepreneurs. These questions were repeated in subsequent rounds for those who missed the round 24 interview.

To find these business ownership questions in NLS Investigator, use "BUSOWN" as the Question Name search criterion. These variables are listed as cross-round (XRND) variables. To determine which survey year the data were collected, use the created variable "BUSOWN_SOURCEYR."

Employer History Roster

The Employer History roster includes information on virtually all employers reported by NLSY79 respondents, with many of the employer characteristics reported for each employer included in a single record. These variables are classified as "XRND" variables, rather than being assigned to a single survey year. Data from each survey year are currently available. The variables will be updated as necessary as revisions are made and with each progressive round.

Structure

In earlier years, building a record of commonly used employer-specific characteristics reported in different survey years required a rather complex linking process. Researchers needed to use variables indicating the employer number in the previous survey year and establish the link through each preceding year individually. Appendix 9: Linking Employers Through Survey Years details this process.

By contrast, the Employer History roster is designed to simplify employer linking for many of the most commonly used employer characteristics. The Employer History data are constructed in a roster structure with one record per job, for up to 65 jobs (and therefore up to 65 records) per respondent. Many employer characteristics have been compiled into each employer's record. The record for each employer includes each of the variables listed in the table below. For instance, the record for employer #1 contains EMPLOYERS_ALL_CPSJOB_[YEAR].01, EMPLOYERS_ALL_UNION_[YEAR].01, EMPLOYERS_ALL_IND_[YEAR].01, EMPLOYERS_ALL_OCC_[YEAR].01, etc. for each survey year in which employer #1 was reported.
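The naming pattern just described can be generated programmatically when assembling a variable list. The sketch below is illustrative only: the helper name is hypothetical, and it assumes that [YEAR] is written as a four-digit survey year and that the job number is zero-padded to two digits, as in the ".01" examples above.

```python
def roster_qname(stem: str, year: int, job: int) -> str:
    """Build a year-specific Employer History question name.

    stem: the characteristic, e.g. "CPSJOB", "UNION", "IND", "OCC".
    year: survey year (assumed here to appear as four digits).
    job:  roster position; the roster currently holds up to 65 jobs.
    """
    if not 1 <= job <= 65:
        raise ValueError("job number must be between 1 and 65")
    return f"EMPLOYERS_ALL_{stem}_{year}.{job:02d}"


# Question names for employer #1 in a hypothetical survey year.
names = [roster_qname(stem, 1994, 1) for stem in ("CPSJOB", "UNION", "IND", "OCC")]
```

Variables that are not year-specific, such as EMPLOYERS_ALL_UID.[JOB#], follow a different pattern and are not covered by this helper.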

Data Notes

Employer information for additional employers (jobs 6-10) in some of the older survey years may not have been included in the roster. These jobs were often stored on separate data tapes whose original data have been difficult to recover. Should any of these data on additional employers be recovered in the future, they will be added to the Employer History roster and users will be notified. These employers comprise a very small proportion of those ever reported and involve a very small number of respondents.

Locating Employer History variables

The Employer History items can be easily located within the NLSY79 dataset on NLS Investigator by using one of the following search criteria:

  • Question Name (enter search term) index, "starts with", enter "EMPLOYERS_ALL" search term
  • Area of Interest (pick from list) index, "equals", choose areas of interest beginning with "EMPLOYERS_ALL"
  • Reference Number (enter search term), "starts with", enter "E"

Table 1 lists the contents of the Employer History roster. The variables are listed under the Areas of Interest currently assigned. The "variable names" column contains the qnames for each of the variables. Currently there are up to 65 jobs for each respondent included on the roster. This means that at least one respondent has reported 65 jobs over the course of the NLSY79's 27 rounds. Most respondents of course will have a much smaller number of jobs reported since 1979.

Table 1: Employer History Roster Contents by Area of Interest

EMPLOYER HISTORY--JOB CHARACTERISTICS
Variable names Description Notes
EMPLOYERS_ALL_GOVJOB_[YEAR].[JOB#] Was this job a government job? Each job, each survey year through 1987
EMPLOYERS_ALL_CPSJOB_[YEAR].[JOB#] Was this job the CPS (current/most recent) job? Each job, each survey year
EMPLOYERS_ALL_UNION_[YEAR].[JOB#] R covered by union or employee contract on the job? Each job, each survey year
EMPLOYERS_ALL_CURWK_[YEAR].[JOB#] R currently working for employer at date of interview? Each job, each survey year
EMPLOYER HISTORY--JOB EMPLOYER IDs
EMPLOYERS_ALL_NUM_ARRAY_[YEAR].[JOB#] Number loaded into Work History Labor Force Status array Each job, each survey year; from this we hope to eventually create a parallel Labor Force Status array with a single job number for each job all the way through
EMPLOYERS_ALL_PREVID_[YEAR].[JOB#] ID number of job in previous survey year Each job, each survey year
EMPLOYERS_ALL_ID_[YEAR].[JOB#] ID number of job in survey year Each job, each survey year
EMPLOYERS_ALL_UID.[JOB#] A "unique" employer id assigned to employers, consisting of the survey year and an employer number for that survey year Each job, each survey year
EMPLOYER HISTORY--JOB HOURS WORKED
EMPLOYERS_ALL_HOURSDAY_[YEAR].[JOB#] Hours per day usually worked at job Each job, each survey year
EMPLOYERS_ALL_HOURSWEEK_[YEAR].[JOB#] Hours per week usually worked at job Each job, each survey year
EMPLOYER HISTORY--JOB INDUSTRY, OCCUPATION AND CLASS OF WORKER
EMPLOYERS_ALL_IND_[YEAR].[JOB#] Type of business or industry for employer Each job, each survey year
EMPLOYERS_ALL_OCC_[YEAR].[JOB#] Occupation for employer Each job, each survey year
EMPLOYERS_ALL_COW_[YEAR].[JOB#] Class of worker for employer Each job, each survey year
EMPLOYER HISTORY--JOB ORIGINAL STARTDATES AND MOST RECENT STOPDATES AND REASON LEFT JOB
EMPLOYERS_ALL_STOPDATE_MOST_RECENT.[JOB#]~[D/M/Y] Most recent stopdate for employer Each job
EMPLOYERS_ALL_STARTDATE_ORIGINAL.[JOB#]~[D/M/Y] Original startdate for employer Each job
EMPLOYERS_ALL_WHYLEFT_MOST_RECENT.[JOB#] Most recent reason given for leaving employer Each job
EMPLOYER HISTORY -- JOB PAYRATES AND TIME UNITS
EMPLOYERS_ALL_TIMERATE_[YEAR].[JOB#] Time unit for rate of pay Each job, each survey year
EMPLOYERS_ALL_PAYRATE_[YEAR].[JOB#] Payrate for employer Each job, each survey year
EMPLOYERS_ALL_HRLY_WAGE_[YEAR].[JOB#] Hourly rate of pay for employer Each job, each survey year
EMPLOYER HISTORY -- JOB START DATES
EMPLOYERS_ALL_STADATE_[YEAR].[JOB#]~[D/M/Y] Startdate  for employer Each job, each survey year
EMPLOYER HISTORY -- JOB START WEEKS
EMPLOYERS_ALL_STARTWEEK_[YEAR].[JOB#] Week number of start date for job Each job, each survey year, calculated by Work History programs
EMPLOYER HISTORY -- JOB STOP DATES
EMPLOYERS_ALL_STOPDATE_[YEAR].[JOB#]~[D/M/Y] Stop date for employer Each job, each survey year
EMPLOYER HISTORY -- JOB TENURE AND PRETENURE
EMPLOYERS_ALL_PRETEN_[YEAR].[JOB#] Months worked for employer before date of last interview Each job, each survey year
EMPLOYERS_ALL_TENURE_[YEAR].[JOB#] Total weeks tenure with employer Each job, each survey year
EMPLOYERS_ALL_PAST_[YEAR].[JOB#] R work for employer before date of last interview? Each job, each survey year
EMPLOYER HISTORY -- JOB WHY LEFT
EMPLOYERS_ALL_WHYLEFT_[YEAR].[JOB#] Reason left job Each job, each survey year
EMPLOYER HISTORY -- WITHIN JOB GAPS REASON NOT WORKING
EMPLOYERS_ALL_WHYNOWK_[YEAR].[JOB#].[GAP#] Reason not working for within job gap Each job, each gap, each survey year
EMPLOYER HISTORY -- WITHIN JOB GAPS START DATES
EMPLOYERS_ALL_PERSTAR_[YEAR].[JOB#].[GAP#] Week number of start dates for within job gap Each job, each gap, each survey year, calculated by Work History programs
EMPLOYER HISTORY -- WITHIN JOB GAPS STOP DATES
EMPLOYERS_ALL_PERSTOP_[YEAR].[JOB#].[GAP#] Week number of stop dates for within job gap Each job, each gap, each survey year, calculated by Work History programs
EMPLOYER HISTORY -- WITHIN JOB GAPS WEEKS LOOKING
EMPLOYERS_ALL_LOOK_[YEAR].[JOB#].[GAP#] Any weeks looking for work during within job gap Each job, each gap, each survey year
EMPLOYER HISTORY -- WITHIN JOB GAPS WEEKS NOT WORKING
EMPLOYERS_ALL_NOTLOOK_[YEAR].[JOB#].[GAP#] Number of weeks not looking for work during within job gap Each job, each gap, each survey year
EMPLOYERS_ALL_WKSNOTWK_[YEAR].[JOB#] Any weeks not working for employer Each job, each survey year

Work History Data

Created Variables

STATUS ARRAY: These XRND variables constitute a week-by-week array spanning from January 1, 1978 through the current interview date, which contain either the job number of the current/most recent principal job, or the alternate labor force status (active military enlistment, unemployed, OLF, etc.) for each week.
HOURS ARRAY: These XRND variables constitute a week-by-week array spanning from January 1, 1978 through the current interview date, which contain the total number of hours worked at all jobs for each week.
DUAL JOBS ARRAY: These XRND variables constitute a set of week-by-week arrays spanning from January 1, 1978 through the current interview date, which contain up to 4 additional job numbers held concurrently with whatever principal job is reflected in the STATUS array. 

Note: The "XRND" assignment indicates that the data are not necessarily tied to a single round. Instead, they contain data reflecting the most current round in which a respondent was interviewed. These are generally variables that do not contain the typical "-5" non-interview code for specific survey years. A respondent's interview status in any given year can be determined by using the REASON FOR NON-INTERVIEW variables which are present from 1980 forward. 

A series of summary variables, listed below, are created based upon the week-by-week labor force status arrays produced by the NLSY79 Work History program. These summary variables are present on the NLSY79 main data files and provide a count of the number of weeks that a respondent held a given labor force status, that is, working, unemployed, out of labor force, or in the active Armed Forces. Each summary variable is available for the period since the last interview and in the past calendar year. Variables which indicate the percentage (if any) of weeks not accounted for due to missing data or indeterminate status in the Work History arrays are also calculated.

 

NUMBER OF WEEKS SERVICE IN ACTIVE ARMED FORCES IN PAST CALENDAR YEAR
NUMBER OF WEEKS SERVICE IN ACTIVE ARMED FORCES, LAST INT TO PRESENT

NUMBER OF WEEKS OUT OF LABOR FORCE IN PAST CALENDAR YEAR
NUMBER OF WEEKS OUT OF LABOR FORCE SINCE LAST INT

NUMBER OF WEEKS UNEMPLOYED IN PAST CALENDAR YEAR
NUMBER OF WEEKS UNEMPLOYED SINCE LAST INT

NUMBER OF WEEKS WORKED IN PAST CALENDAR YEAR
NUMBER OF WEEKS WORKED SINCE LAST INT

NUMBER OF HOURS WORKED IN PAST CALENDAR YEAR/SINCE LAST INT
PERCENT OF WEEKS UNACCTD FOR IN PAST CALENDAR YEAR/SINCE LAST INT

The first set of variables uses "Past Calendar Year," that is, the full calendar year previous to the current survey year, for its summations. The second set, which uses "Last Interview Date" as the reference period, allows researchers to piece together a cumulative set of figures for each respondent (up to the most current point of interview) depicting total number of weeks with a given labor force status. The variables containing the percentage of weeks unaccounted for serve to alert users to the completeness of a respondent's record over time. Because respondents can skip interview years, users should be careful in employing the "Last Interview Date" variables to compose cumulative histories. The "Past Calendar Year" variables, in turn, cover the same period of time for each respondent interviewed in a given year, so comparative analyses can be conducted for a comparable time period across all respondents interviewed in that year.
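As a rough illustration of piecing together a cumulative figure from the "since last interview" variables, the sketch below sums one respondent's week counts across survey years and reports the years that were skipped. The function name and the data are hypothetical, and treating every negative value as missing (including the "-5" non-interview code) is a simplifying assumption, not a rule taken from the codebook.

```python
from typing import Dict, List, Tuple


def cumulative_weeks(since_last_int: Dict[int, int]) -> Tuple[int, List[int]]:
    """Sum 'since last interview' week counts over survey years.

    Returns the total over years with a valid (non-negative) value and
    the sorted list of years that carried a negative missing code.
    """
    total = sum(v for v in since_last_int.values() if v >= 0)
    skipped = sorted(y for y, v in since_last_int.items() if v < 0)
    return total, skipped


# Hypothetical respondent: interviewed in 1984 and 1986, missed 1985.
# The 1986 value covers the whole period since the 1984 interview.
weeks_worked = {1984: 40, 1985: -5, 1986: 90}
total, skipped = cumulative_weeks(weeks_worked)  # (130, [1985])
```

Checking the skipped years against the percent-of-weeks-unaccounted-for variables is one way to judge how complete the resulting total is.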

 

Important Information About Using Work History Data

The work history program constructs and consolidates in one place a great deal of employment-related information, sparing the time and effort involved in distilling these variables from the NLSY79 main data files.

Beginning with the release of the 2000 data, the Work History file is incorporated into the main NLSY79 data set. Variables previously located on the separate Work History file can be identified by searching for areas of interest beginning with "Work History." In addition, the reference numbers for work history variables begin with "W." Key work history variables are described below.

Weekly Arrays: Week-by-week records of the respondent's labor force status and associated job(s), if employed, and the total number of hours worked each week at any job, if employed, are available. This information is contained in the three arrays described above.

Although data on only up to five jobs are released, data are collected on all jobs. Data for the extra jobs are used to construct summary KEY variables by the work history programs. The number of jobs has exceeded ten for one case in 1991 and 1992, two cases in 1998, and one case in 2000.

Many researchers focus on data for the CPS job. Because the CPS was Job #1 only in select years, researchers should see the "Important Information" box in the Labor Force Status section for an elaboration of this concept.

Employment Gaps: Gaps within tenure with a specific employer are reported in association with that employer. They occur between the start and stop dates given for an employer. The respondent does not consider himself/herself completely disassociated from the relevant employer during these periods, although he or she was not actively working for that employer. Specific variables for each gap include start and stop dates; the reason that the respondent was not working; the number of weeks that a respondent was unemployed (looking for work or on layoff) or out of the labor force (OLF or not looking for work); and, for those who were OLF at some time during a gap, the reason they were not looking for work. See the Work Experience section for a discussion of gaps with respect to job tenure.

Gaps between employers are gaps in a respondent's employment during which he or she was not associated with any employer. The specific variables collected with respect to "within job gaps" (see the discussion in the Work Experience section on tenure with a specific employer) are also collected with respect to gaps between employers, with the exception of the reason that the respondent was not working during the gap.

The information collected on reasons for employment gaps allows specific dates to be fixed for unemployed or OLF status only if a respondent was unemployed or OLF for the entire period of the gap. If the respondent was unemployed for part of the gap and OLF for the other part, the number of weeks unemployed and OLF is recorded, but the specific dates of periods for which the respondent was actively looking for work/on layoff and not looking for work are not collected. This prevents the Work History program from assigning specific week numbers to these statuses in the event of such a "split gap." Instead, the number of weeks reported as unemployed is assigned to the middle of the total gap period, with the remainder of weeks at the beginning and end of the gap period being assigned an OLF status. Users examining the week-by-week status array containing labor force statuses should be aware that "split gaps" will appear as a series of "5" codes, followed by a series of "4" codes, followed by another series of "5" codes (5 5 5 5 5 .... 4 4 4 4 4 .... 5 5 5 5 5). Although the start and stop dates for the whole gap will be those actually reported by the respondent, the assignment of the unemployed and OLF statuses will not represent actual dates reported by the respondent. They represent only the number of weeks that a respondent reported having held each status, with the unemployed status being arbitrarily assigned to the middle portion of the gap.
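The "split gap" pattern described above can be reproduced with a short sketch. It takes the status codes from the example in the text (4 for unemployed weeks, 5 for OLF weeks) and places the unemployed weeks in the middle of the gap. How the Work History program divides an odd number of OLF weeks between the two ends is not stated here, so the rounding in this sketch is an assumption, and the function name is hypothetical.

```python
from typing import List

UNEMPLOYED = 4  # code shown for the middle of a split gap
OLF = 5         # code shown for the beginning and end of a split gap


def split_gap_statuses(gap_weeks: int, weeks_unemployed: int) -> List[int]:
    """Weekly status codes for a gap that is part unemployed, part OLF."""
    if not 0 <= weeks_unemployed <= gap_weeks:
        raise ValueError("weeks_unemployed must lie between 0 and gap_weeks")
    olf_weeks = gap_weeks - weeks_unemployed
    leading = olf_weeks // 2          # assumption: any extra OLF week goes last
    trailing = olf_weeks - leading
    return [OLF] * leading + [UNEMPLOYED] * weeks_unemployed + [OLF] * trailing


# A 10-week gap with 4 weeks reported as unemployed.
codes = split_gap_statuses(10, 4)  # [5, 5, 5, 4, 4, 4, 4, 5, 5, 5]
```

The counts of each code match what the respondent reported; the positions within the gap do not represent reported dates.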

Summary Labor Force Related Variables: Variables are constructed summarizing different aspects of a respondent's labor force activity, including total number of hours worked, weeks worked, weeks unemployed, weeks out of the labor force, and weeks in active military service. There are two sets of these variables, referring to each of two time periods--the period since the last interview and the past calendar year (see the Labor Force Status section). Variables are also created indicating the number of weeks since the previous interview and the percent of weeks for which a definite status cannot be determined in constructing the summary variables discussed above. See the Work Experience section for further notes on these variables.

Tracing Employers Back Through Contiguous Survey Years: Of particular interest to many researchers have been the PREV_EMP# and TENURE variables associated with each employer. The PREV_EMP# allows a respondent's association with a given employer to be traced back through contiguous survey years. Using PREV_EMP# and the appropriate start and stop dates, a TENURE variable is constructed for each job reported, which depicts total weeks of tenure with each employer across contiguous survey years. Examine the work history documentation in Appendix 18 to determine if any such time-saving variable constructions exist. 

Creation of the Work History Data

The work history is a complete retrospective up to and including the respondent's most recent date of interview. The questions in these survey sections are constructed to collect a complete history for each respondent, regardless of period of noninterview. For example, a respondent previously interviewed in 1984 and not interviewed again until 1989 will have a complete labor force history as of the 1989 interview, as information for the intervening period will be recovered in the 1989 interview. The Work Experience section contains a discussion of possible discrepancies or inconsistencies in these data. Researchers should be aware that, although such possibilities exist, they have not appeared to be a major factor in the quality or completeness of the work history record.

Be aware that for respondents with simultaneous active military status and civilian employment status, civilian labor force activity will take precedence over military status. For the purposes of constructing the week-by-week status array, the civilian job number will replace the military status code for weeks in which both statuses occur. The order of precedence for various labor force status codes is detailed in the work history documentation (see the discussion of the work history PL/I program in Appendix 18 of the NLSY79 Codebook Supplement); see also the Work Experience section.

For purposes of constructing the status array and computing the summary labor force activity variables, the work history programs require that specific week numbers be assigned on the basis of the job-specific start and stop dates. In the event that missing data occur in the job-specific start and stop dates, the programs take one of two actions. 

  1. If only the day in a given date is missing, the program assigns the number "15," placing these dates in the middle of the month. This allows an approximate week number to be assigned. The possibility still exists, however, that a negative job/gap duration will result because the day is arbitrarily fixed. For example, a start date of 10/-2/90, which indicates a missing day, and a stop date of 10/6/90 would be read by the work history program as 10/15/90 and 10/6/90 respectively. Therefore, when the week numbers are assigned, the arbitrary assignment of "15" as the start day would give an erroneous impression that a job started after it stopped. The status array and computed summary variables will reflect the invalid data in the week numbers.
  2. Dates missing a month or year cannot be estimated by the work history program and therefore have invalid missing codes for the week numbers. The status array and other computed variables cannot be calculated for activity within periods for which either or both of the dates have such missing information. These will also register invalidly missing information for any period in which specific dates and week numbers cannot be determined.
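The two cases above can be sketched as follows, using the start and stop dates from the example in the first item. The function name is hypothetical, any negative component is treated as a missing code (the text shows "-2" for a missing day), and four-digit years are used for convenience.

```python
from datetime import date
from typing import Optional

MISSING_DAY_FILL = 15  # a missing day is placed in the middle of the month


def resolve_date(month: int, day: int, year: int) -> Optional[date]:
    """Return a usable date, or None when the month or year is missing."""
    if month < 0 or year < 0:
        return None  # case 2: no week number can be assigned
    return date(year, month, MISSING_DAY_FILL if day < 0 else day)


# Case 1: a start date of 10/-2/90 is read as 10/15/90.
start = resolve_date(10, -2, 1990)
stop = resolve_date(10, 6, 1990)
duration_days = (stop - start).days  # -9: the job appears to start after it stops
```

A negative duration like this one is a signal that the day was imputed rather than reported.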

Comparison to Other NLS Cohorts: The NLSY97 Event History file contains created variables summarizing the month and year in which major life events occurred for each respondent, along with all main file data. Variables cover topics such as marital status, enrollment, employment status, and program participation. The NLSY97 Event History file presents employment status information in a format similar to the NLSY79 employment information, using a continuous week timeline. Although the NLS has collected information on labor force behavior since its inception, only partial work histories for respondents in the Original Cohorts can be constructed for certain survey years. The degree of completeness of the work history data varies by cohort and survey year. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide using the tabs above for more information.

Survey Instruments and Documentation: The work history data are constructed from information gathered in the "Military History," "Current Labor Force Status or CPS," "Employer Supplement," and "Periods not Working" sections of the NLSY79 instruments. The work history program converts dates reported in these sections (start and stop dates, employment gap dates, enlistment and discharge dates) to week numbers, using January 1, 1978, as week #1. Week-by-week histories of a respondent's labor force activity are constructed by filling in the weeks between the reported beginning and ending dates for different activities (or inactivity) with the appropriate code. In turn, this weekly accounting makes possible the construction of the summary variables.
Work History-Specific Documentation: Prior to the release of the 2000 data, work history variables were documented in a series of text files on the separate work history data set. In 2000, this information was moved to the Codebook Supplement. Appendix 18: Work History Data provides information about the logic and procedures used to create the work history arrays, as well as additional coding information for selected variables.
Areas of Interest: The majority of the work history variables are constructed from variables found in the "Military," "Job Information," "Periods Not Working within Job Tenure," "Jobs," "CPS," and "Between Job Gaps" areas of interest on the main data set. The resulting arrays are located in the "Work History" area of interest. The summary variables are included in the "Key Variables" area of interest.
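The conversion of dates to week numbers, with January 1, 1978 as week #1, and the filling in of weeks between a start and stop date can be sketched as below. This assumes that weeks are consecutive 7-day blocks counted from January 1, 1978; the actual boundaries and the handling of earlier dates are documented in Appendix 18, and the function names here are hypothetical.

```python
from datetime import date
from typing import Dict

WEEK_ONE_START = date(1978, 1, 1)


def week_number(d: date) -> int:
    """Week number of a date, counting January 1-7, 1978 as week 1."""
    if d < WEEK_ONE_START:
        raise ValueError("dates before January 1, 1978 are not handled here")
    return (d - WEEK_ONE_START).days // 7 + 1


def fill_status(status: Dict[int, int], start: date, stop: date, code: int) -> None:
    """Assign one status code to every week from start through stop."""
    for week in range(week_number(start), week_number(stop) + 1):
        status[week] = code


# Fill a hypothetical three-week spell with code 4 (unemployed).
status: Dict[int, int] = {}
fill_status(status, date(1978, 1, 1), date(1978, 1, 20), 4)
# status is now {1: 4, 2: 4, 3: 4}
```

Counting the weeks that carry a given code in such a mapping is the kind of tally that the summary variables report.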

Wages

Created variables

  • HRP#: These variables contain a computed hourly rate of pay for each job for which wage information was collected.
  • PAYRATE-EMPALL-##: These variables represent the pay rate value for each job for which a pay rate was collected. In 1979-1993, pay rates and time units were collected in a single set of questions. Beginning in 1994, pay rates were compiled from a series of questions depending on the time unit reported.
  • PAYRATE-SP-ALL: These variables represent the pay rate value for the spouse/partner's main job. In 1979-1993, pay rates and time units were collected in a single set of questions. Beginning in 1994, pay rates were compiled from a series of questions depending on the time unit reported.
  • CPSHRP: This variable is the hourly rate of pay for the CPS job, computed from 1979 to 1994.

Important information: Using wages data

The rate of pay variables listed in the Created Variables box above are created using the HRS_WORKED_WK cross-round (XRND) variables from the NLSY79 Work History data. Those XRND variables constitute a week-by-week array spanning from January 1, 1978 through the current interview date and contain the total number of hours worked at all jobs for each week; see the Work History Data section and Appendix 18 for more information. For those who report that they performed one or more hours of work at home (1988 to present) and that the number of hours worked at home was not included in the usual hours worked per week, the total number of hours usually worked including work at home is used. This inclusion of home hours has produced, for a small number of respondents, extreme hourly rates of pay because both the hours worked at home and the hours worked at a place of business are counted. Low numbers in total hours worked--for respondents who did not include home work in their first reported usual hours worked--produce, when combined with rate of pay, erroneous hourly rates of pay. For the most part, accurate total hours worked can be constructed from these raw data. Note that:

  • the calculation procedure, which factors in each respondent's usual wage, time unit of pay, and usual hours worked per day/per week produces, at times, extremely low and extremely high pay rate values;
  • no editing of values reported by a respondent occurs even if the value is extreme, such as $25,000 per hour;
  • no 'Hourly Rate of Pay Job #1-5' data are available for those respondents reporting a time unit of "other"; and
  • any hourly wage rate information reported in the 1988-1993 follow-up question is not included in the creation statements.
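To show how a usual wage, a time unit, and usual hours can combine into an hourly figure, and why unusually low hours yield extreme values, here is a simplified sketch. It is not the NLSY79 creation program: the function name, the time-unit labels, and the conversion factors (such as 52 weeks per year) are assumptions made for illustration only.

```python
from typing import Optional


def hourly_rate(pay: float, unit: str,
                hours_per_day: float = 0.0,
                hours_per_week: float = 0.0) -> Optional[float]:
    """Convert a reported pay rate to an hourly rate, unedited.

    Returns None when the time unit is "other" or the needed hours are
    missing, mirroring the absence of hourly data for "other" time units.
    """
    if unit == "hour":
        return pay
    if unit == "day" and hours_per_day > 0:
        return pay / hours_per_day
    if unit == "week" and hours_per_week > 0:
        return pay / hours_per_week
    if unit == "year" and hours_per_week > 0:
        return pay / (52 * hours_per_week)  # assumed 52 paid weeks per year
    return None


typical = hourly_rate(400.0, "week", hours_per_week=40)  # 10.0
extreme = hourly_rate(400.0, "week", hours_per_week=1)   # 400.0, left unedited
```

Because reported values are not edited, screening the computed rates for implausible values is left to the user.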

Year(s) Universe
1979-1980 Current job from which R was not laid off in Employer Supplements; other jobs that are government-sponsored part-time or summer jobs, government sponsored jobs for those not in regular school, part of a tax credit program or any other government sponsored program in Employer Supplements; other jobs R is > 15 years of age & >= 20 hours/week & >= 9 weeks worked since date of last interview in Employer Supplements
1981 Current job from which R was not laid off in Employer Supplements; other jobs that are government-sponsored part-time or summer jobs, government sponsored jobs for those not in regular school, part of a tax credit program or any other government sponsored program in employer supplement; other jobs >= 20 hours/week & >= 9 weeks worked since date of last interview in Employer Supplements
1982-1984 Current/most recent job in Employer Supplements; other jobs that are government-sponsored part-time or summer jobs, government sponsored jobs for those not in regular school, part of a tax credit program or any other government sponsored program in employer supplement; other jobs >= 20 hours/week & >= 9 weeks worked since date of last interview in Employer Supplements
1985 Current/most recent job in Employer Supplements; other jobs that are part of a tax credit program or any government sponsored program in employer supplement; other jobs >= 20 hours/week & >= 9 weeks worked since date of last interview in Employer Supplements
1986 Current/most recent job in Employer Supplements; other jobs that are part of a tax credit program or any government sponsored program in employer supplement; other jobs >= 10 hours/week & >= 9 weeks worked since date of last interview in Employer Supplements
1987 Current/most recent job in Employer Supplements; other jobs that are part of any government sponsored program in employer supplement; other jobs >= 10 hours/week & >= 9 weeks worked since date of last interview in Employer Supplements
1988-1992 Current/most recent job in Employer Supplements; all other jobs except those for which class of worker = working without pay in family business or farm in Employer Supplements
1993-current survey year All jobs in Employer Supplements
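
The year-by-year rules above for "other jobs" can be sketched as a small function. This is an illustrative encoding only. It covers the hours, weeks, age, and class-of-worker conditions for jobs other than the current/most recent job. It does not model the separate routes by which the current job or a government-sponsored or tax credit job enters the universe. The function and parameter names are hypothetical.

```python
def other_job_in_universe(year, hours_per_week, weeks_worked,
                          age=None, unpaid_family_worker=False):
    """Check whether an 'other' job meets the general universe rule for a survey year.

    Only the hours/weeks/age/class-of-worker conditions are encoded; jobs that
    qualify as the current job or through a government program are not handled.
    """
    if year < 1979:
        raise ValueError("NLSY79 interviews begin in 1979")
    if year >= 1993:
        return True  # all jobs in the Employer Supplements
    if year >= 1988:
        # 1988-1992: all other jobs except unpaid work in a family business or farm
        return not unpaid_family_worker
    # 1979-1985 require >= 20 hours/week; 1986-1987 require >= 10 hours/week
    min_hours = 20 if year <= 1985 else 10
    if hours_per_week < min_hours or weeks_worked < 9:
        return False
    if year <= 1980:
        # 1979-1980 additionally require R to be older than 15
        if age is None:
            raise ValueError("age is required for 1979-1980")
        return age > 15
    return True


if __name__ == "__main__":
    print(other_job_in_universe(1985, 15, 12))  # below the 20-hour threshold
    print(other_job_in_universe(1986, 15, 12))  # meets the 10-hour threshold
```

Here `weeks_worked` stands for weeks worked since the date of last interview, as in the table.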

Data on respondents' usual earnings (inclusive of tips, overtime, and bonuses but before deductions) have been collected during every survey year for each employer for whom the respondent worked since the last interview date. The amount of earnings, reported in dollars and cents, is coupled with information on the applicable unit of time, such as per day, per hour, per week, or per year. Since 1988, those respondents reporting any unit of time other than "per hour" have been asked a follow-up question on whether they were paid by the hour on that job; if so, an hourly wage rate was collected.

The raw earnings data, collected in the Employer Supplements during each round of the survey and in Section 10 of the 1979 questionnaire, can be found in the variable series 'Rate of Pay Job #1-5' and 'Time Unit of Rate of Pay Job #1-5.' Two sets of variables provide information based on the combined earnings and time unit data. The first set, 'Hourly Rate of Pay Job #1-5,' provides the hourly wage rate for each job as reported. The actual responses of those respondents who report wages with an hourly time unit in the initial earnings question appear in this variable. For those reporting a time unit other than "per hour" or "other" in the initial earnings question, an hourly rate of pay has been calculated. 
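
As a rough illustration of how an hourly rate might be derived from a reported rate of pay and time unit, consider the sketch below. It is not the NLSY79 creation procedure. The time-unit labels, the use of usual hours worked per week, the 52-week year, and the 5-day work week are all assumptions made for this example, and the actual creation statements may differ.

```python
# Assumed conversion constants (hypothetical; not taken from NLSY79 documentation).
WEEKS_PER_YEAR = 52.0
WEEKS_PER_MONTH = WEEKS_PER_YEAR / 12.0
DAYS_PER_WEEK = 5.0


def hourly_rate(rate, time_unit, hours_per_week):
    """Convert a reported rate of pay to dollars per hour.

    Returns None when no hourly rate can be produced: a time unit of "other",
    a missing rate, or missing/zero usual hours for a non-hourly time unit.
    """
    if rate is None or time_unit == "other":
        return None
    if time_unit == "per hour":
        return rate  # the reported value is used as-is
    if not hours_per_week or hours_per_week <= 0:
        return None
    if time_unit == "per day":
        return rate / (hours_per_week / DAYS_PER_WEEK)
    if time_unit == "per week":
        return rate / hours_per_week
    if time_unit == "per month":
        return rate / (hours_per_week * WEEKS_PER_MONTH)
    if time_unit == "per year":
        return rate / (hours_per_week * WEEKS_PER_YEAR)
    raise ValueError(f"unrecognized time unit: {time_unit!r}")


if __name__ == "__main__":
    print(hourly_rate(800.0, "per week", 40))
    print(hourly_rate(41600.0, "per year", 40))
    print(hourly_rate(100.0, "other", 40))
```

Returning None for "other" matches the note that no hourly rate is available for respondents reporting that time unit.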

A second set of variables based on responses to the initial set of wage/time unit questions, entitled 'Hourly Rate of Pay Current/Most Recent Job,' identifies the hourly earnings for the job identified as the CPS job, that is, the job that the respondent held most recently. Hourly wage rates collected in the follow-up question from respondents who initially reported a time unit other than "per hour" can be found in the 1988-93 variable series 'Paid by the Hour (Time Unit Other than Hourly Previously Reported) Job #1-5' and 'Hourly Rate of Pay (Rate Other than Hourly Previously Reported) Job #1-5.' Table 1 depicts the core set of rate of pay variables present on the NLSY79 combined Main/Work History Data.

Table 1. Core rate of pay variables: NLSY79 combined main and work history files

Variable Title | Years | Area of interest
Rate of Pay Job #1-5 | 1979-current survey year | Job Information
Time Unit of Rate of Pay Job #1-5 | 1979-current survey year | Job Information
Hourly Rate of Pay Job #1-5 | 1979-current survey year | Job Information
Hourly Rate of Pay Current/Most Recent Job | 1979-93 | CPS

Starting in 1986, follow-up questions asked respondents whose earnings had changed to report the wage rate and time unit that applied when they first started working for a new employer. In 1986 and 1987, those who were not working for the employer at the interview date were also asked for wage information at the time they left that employer. These data can be found in the following variables: 'Wages Changed Since First Began Working Job #1-5,' 'Rate of Pay When 1st Began Working at Job #1-5,' 'Time Unit of Rate of Pay When 1st Began Working at Job #1-5,' 'Rate of Pay When Last Worked at Job #1-5,' and 'Time Unit of Rate of Pay When Last Worked at Job #1-5.'

Comparison to Other NLS Surveys

Starting in 1988, NLSY79 children age 10 and older have been asked about the number of hours usually worked and usual earnings in a week. In the NLSY97, several questions are used to determine the job's rate of pay as of the start date. The rate may be defined according to different scales (such as per month, per week, per day, or per hour). Additional information is collected on whether the respondent received any pay from overtime, tips, commissions, bonuses, incentive pay, and other sources when the job started. Questions about freelance employment gather information about the usual number of hours the respondent worked per week and the usual weekly earnings as of the job's start date. In rounds 1-3, respondents who were age 16 or older and reported earning $200 or more per week at a freelance job were considered self-employed.

For the Original Cohorts, rate of pay is available for the CPS job and for many dual or intervening jobs. For more precise details about the content of each survey, consult the appropriate cohort's User's Guide.

Survey Instruments & Documentation: Section 10, "Jobs," of the 1979 questionnaire and the Employer Supplements for 1980-current survey year collected these raw data.

Areas of Interest: The 'Rate of Pay Job #1-5,' 'Time Unit of Rate of Pay Job #1-5,' and 'Hourly Rate of Pay Job #1-5' variables for each job can be found in the "Job Information" area of interest on the main NLSY79 data files. The 'Hourly Rate of Pay Current/Most Recent Job' (1979-1993) variables for each year are located in the "CPS" area of interest. All other main file variables discussed above have been placed in the yearly "Misc. xxxx" areas of interest.