Types of Variables

The NLSY79 Child and Young Adult data release contains comprehensive information from the 1986 through the current survey round. The file also contains child-specific information from the mother's main Youth interviews. Certain variables are derived from the mother's longitudinal record while other data items represent the questions administered during the Child and Young Adult interviews and the responses from each child assessment. Finally, there is an extensive set of created variables on the file, based on the assessment and interview data.

Detailed information on the types of data available for the NLSY79 Children and Young Adults can be found by examining the field instruments and by searching the database indices. Instructions on how to search the database can be found in the Investigator User's Guide. Researchers who are interested in items based on data from the mother's record are encouraged to access copies of the main Youth questionnaires and to review the NLSY79 main Youth documentation. These items are available in the NLSY79 section of the website. Information on how to link child and mother data can be found primarily in the section on Linking Children, Young Adults, and Mothers.

The NLSY79 Child data include demographic and family background, pre- and postnatal health history, home environment reports, information on child care and school experiences, items and scores from the biennial child assessments, and reports from the child "10 and older" self-report questionnaire. The Young Adult data contain questionnaire items from all Young Adult interview years, covering areas such as family background, schooling, training, work and military experiences, relationship history, fertility, health, and drug and alcohol use, as well as a set of created variables for each round. Geographic information for young adults is available on a separate geocode file.

The type of variable may affect (1) the physical placement of the variable within the codebook (its sequence in the reference number list) and (2) the assignment of a variable to a particular area (or areas) of interest. Types of variables that appear in the public releases of the Child and Young Adult files include:

  1. Direct (or raw) responses from a questionnaire or assessment or other survey instrument.
  2. Recoded or edited variables constructed from raw data according to consistent procedures, e.g., coding of verbatim responses about jobs done for pay or religion other than the precoded categories. Such variables are marked as recode versions of the original.
  3. Constructed variables based on responses to more than one data item or multiple reports to the same item, either from cross-sectional or longitudinal information. Some of these created variables are indices or scale summations, such as the assessment scores, and others are individual items edited for consistency where necessary, e.g., child background characteristics such as age, date of birth, and gender. (See additional information below.)
  4. Constructed variables from a non-NLS data source, e.g., the County & City Data Book information present on the NLSY79 Young Adult geocode file.
  5. Variables provided by NORC or another outside organization based on sources not directly available to the user, e.g., the transcript data and test scores from the child school survey.
  6. Data collected from or about one universe of respondents reconstructed with a second universe as the unit of observation, e.g., variables on the NLSY79 Child datafile that are based on inputs from the mother's main Youth record but linked to each child.

Constructed Child variables based on main Youth data. Constructed variables, drawn from the mothers' records, provide information on each mother's household composition, quarterly employment referenced to the birth of each child, and family background. While most information is cross-sectional, many variables link maternal events or behaviors to specific points in the child's life cycle after or, in some instances, before the child's birth. Any item from the mother's complete main Youth record can be linked to the Child and Young Adult files.

Constructed Child- and Mother-Specific Variables. In addition to the questionnaire items and constructed assessment scores, the NLSY79 Child data set contains a number of other constructed variables. Some constructed variables, such as pre- and postnatal care and child usual residence, are drawn from child-specific information collected in the mother's main Youth interview. Other constructed items, such as maternal household composition and family background, are created from mother-based information that does not vary across children. Constructed variables are generally found in the following Areas of Interest: 


These created items include sibling identifiers, maternal family background, maternal household composition at each interview, and family educational background. Details on these constructed variables can be found in the Topical Guide to the Data. Mother-specific information present on the NLSY79 main data file and on special data sources such as the work history and geocode main Youth files can be linked with the Child data by case ID.
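The linkage by ID described above can be sketched as follows. This is a minimal illustration in Python, assuming the child and mother extracts have already been loaded as lists of dictionaries. The field names (`child_id`, `mother_id`, `highest_grade`) and all values are hypothetical placeholders, not actual NLSY79 variable names or data.

```python
# Hypothetical, made-up records. Field names are placeholders,
# not NLSY79 reference numbers or question names.
mothers = [
    {"mother_id": 1, "highest_grade": 12},
    {"mother_id": 2, "highest_grade": 16},
]
children = [
    {"child_id": 101, "mother_id": 1},
    {"child_id": 102, "mother_id": 1},
    {"child_id": 201, "mother_id": 2},
]


def link_children_to_mothers(children, mothers):
    """Attach each mother's fields to every one of her children (many-to-one join)."""
    mothers_by_id = {m["mother_id"]: m for m in mothers}
    linked = []
    for child in children:
        mother = mothers_by_id.get(child["mother_id"])
        if mother is None:
            continue  # no matching mother record in this extract
        # Child fields take precedence if a key appears in both records.
        linked.append({**mother, **child})
    return linked


linked = link_children_to_mothers(children, mothers)
for row in linked:
    print(row)
```

Because mother-based information does not vary across children, each mother's values are repeated on every child row, so the child remains the unit of observation in the linked result.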

Constructed Young Adult Variables. In addition to the questionnaire items from the Young Adult surveys, several constructed variables for Young Adults are available. Some of these created variables are available for all young adult respondents who were interviewed in any survey year (designated as XRND), while others are specific to a particular survey round. The Young Adult constructed variables are located in the Areas of Interest called YA COMMON KEY VARIABLES and YA FERTILITY AND RELATIONSHIP DATA - CREATED. 

The following key variables are constructed for all young adults: young adult ID (Y00001.00, CASEID), date of birth, gender, race, the ID code of the mother, comprehensive biological child information, dates of first marriage and first cohabitation, as well as the following:


Additionally, XRND flags for completing specific degrees, as well as the month and year the degrees were received, are available for all young adults.

Details about key variables: Two key identification codes are provided: that of the Young Adult and that of the mother. Any child who has not yet aged into the Young Adult sample, or who is ineligible for fielding, or who has been fielded but not interviewed, will have a missing value (-7) on these two ID variables. Only children who have ever been interviewed as Young Adults have valid values. These variables are provided for users who want to quickly restrict their sample to ever-interviewed Young Adults. The ever-interviewed Young Adults also have an updated date of birth (month and year), gender, and race based on mother's racial/ethnic cohort from the 1978 screener.
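Restricting a sample with the ID variables, as described above, might look like the following sketch. It assumes the extract is a list of dictionaries and uses hypothetical field names (`child_id`, `ya_id`, `ya_mother_id`) with invented values. Only the -7 missing-value code comes from the description above.

```python
# Missing-value code that the two Young Adult ID variables carry for
# children who have never been interviewed as Young Adults.
NEVER_INTERVIEWED = -7

# Hypothetical, made-up records; field names are illustrative placeholders.
records = [
    {"child_id": 101, "ya_id": 101, "ya_mother_id": 1},
    {"child_id": 102, "ya_id": -7, "ya_mother_id": -7},
    {"child_id": 201, "ya_id": 201, "ya_mother_id": 2},
]


def ever_interviewed(records):
    """Keep only respondents with a valid Young Adult ID."""
    return [r for r in records if r["ya_id"] != NEVER_INTERVIEWED]


ya_sample = ever_interviewed(records)
print([r["child_id"] for r in ya_sample])
```

Filtering on either ID variable should give the same result, since both are set to -7 for the same children.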

Beginning with the 2000 release, three interview status variables are provided. First is the year of most recent Young Adult survey (Y12051.). This variable allows users to quickly identify when data for a non-year-specific variable would have been pulled. For example, if a respondent was last interviewed in 1994, only information from that year would have been available to use in constructing variables such as ever cohabited or ever reported a first marriage. 

The second interview status variable is the number of Young Adult interviews completed by a respondent (Y12052.). This variable allows users to assess how many respondents have multiple time points for repeated measures. Users are reminded, however, that there are a variety of factors that influence a respondent's value on this variable, such as when the respondent aged into the sample, during what years there were age or other restrictions applied to the fielded sample, and whether or not the respondent was actually interviewed in a given year. There are two flags per survey year, located in the CHILD BACKGROUND area of interest, allowing the user to identify whether a respondent was eligible to be interviewed as a Young Adult and whether or not a Young Adult interview occurred.

The last of these interview status variables is the number of Child survey years where the respondent has at least some interview or assessment data available (Y12053.). Users should be aware that the Child survey consists of two or three instruments, depending on the age of the child, and some respondents may have data for only one of these instruments in a given survey year. This variable, as with the number of Young Adult interviews, is provided to help users gain a quick portrait of data availability. 
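As a rough sketch of how the interview status variables above could be used to profile data availability, the Python example below tabulates the number of completed Young Adult interviews and picks out respondents with repeated measures. The field names (`ya_id`, `last_ya_year`, `n_ya_interviews`, `n_child_years`) and all values are hypothetical stand-ins for the three status variables, not names used on the data file.

```python
from collections import Counter

# Hypothetical, made-up records; each field stands in for one of the
# interview status variables described above.
respondents = [
    {"ya_id": 101, "last_ya_year": 1994, "n_ya_interviews": 1, "n_child_years": 4},
    {"ya_id": 201, "last_ya_year": 2000, "n_ya_interviews": 3, "n_child_years": 2},
    {"ya_id": 301, "last_ya_year": 1998, "n_ya_interviews": 2, "n_child_years": 5},
    {"ya_id": 401, "last_ya_year": 2000, "n_ya_interviews": 1, "n_child_years": 0},
]

# Distribution of the number of completed Young Adult interviews.
interview_counts = Counter(r["n_ya_interviews"] for r in respondents)

# Respondents with at least two time points for repeated measures.
repeated = [r["ya_id"] for r in respondents if r["n_ya_interviews"] >= 2]

# Respondents whose cross-round variables draw on a single year of data.
single_round = {
    r["ya_id"]: r["last_ya_year"]
    for r in respondents
    if r["n_ya_interviews"] == 1
}

print(dict(interview_counts))
print(repeated)
print(single_round)
```

A count like this only gives a quick portrait. As noted above, a respondent's number of interviews also reflects when they aged into the sample and which restrictions applied to the fielded sample in each year.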

New Variables Created by Researchers. Researchers sometimes use the NLS public datasets to generate a new variable for their research. In some cases, researchers want to make that new variable publicly available (through their own data repository) so that it can be easily accessed for follow-up studies. This is permissible as long as researchers use public (rather than restricted) NLS data and make it clear that they, rather than the NLS team, are the authors of the variable.