# Sample Weights & Design Effects

### Methodology for Calculating Weights

The assignment of individual respondent weights involved a number of different adjustments. Complete details are found in the NLSY97 Technical Sampling Report, which has step-by-step descriptions of the entire adjustment process. Some of the major adjustments are:

Adjustment 1. Computation of a base weight, reflecting the case's selection probability for the screening sample. This step also corrects for missed housing units and caps the base weights in the supplemental sample to prevent extremely high weights;

Adjustment 3. Development of a combination weight to allow the black and Hispanic cases from the cross-sectional sample to be merged with those from the supplemental sample (non-Hispanic, non-blacks in the supplemental sample were not eligible for the NLSY97 sample);

#### Calculation of Weights under the New "Cumulating Cases" Strategy

Starting in Round 4, a new "Cumulating Cases" strategy has been used to calculate weights. Instead of calculating separate cross-sectional (CX) and supplemental (SU) base weights and later combining the separate sample weights, a Horvitz-Thompson approach to weighting is used.

In the Horvitz-Thompson approach, the weights are determined across samples using only the overall selection probability (into either sample) of the individual element, giving a single unified set of weights for the cumulated cases. This approach is straightforward: only Adjustments 1 and 3 above are modified. Because the samples are independently drawn, the probability for a case to be in either sample is simply the sum of the probabilities of being in each sample. Thus, the base weight for a case is the inverse of the sum of its sample selection probabilities:

$$w_i = \frac{1}{p_i^{CX} + p_i^{SU}}$$
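As a minimal sketch of this calculation (using made-up selection probabilities, not actual NLSY97 values), the combined base weight can be computed like this:

```python
# Hypothetical selection probabilities for three cases; a probability of 0
# means the case could not be selected into that sample.
p_cx = [0.0002, 0.0001, 0.00015]   # probability of entering the CX sample
p_su = [0.0, 0.0003, 0.00005]      # probability of entering the SU sample

# Horvitz-Thompson base weight: reciprocal of the total probability of
# selection into either sample (the samples are drawn independently).
base_weights = [1.0 / (cx + su) for cx, su in zip(p_cx, p_su)]
# e.g. the first case's base weight is 1 / 0.0002 = 5000
```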

Under the old strategy, the separate CX and SU step 1 weights were the reciprocals of the selection probabilities for just that sample:

$$w_i^{CX} = \frac{1}{p_i^{CX}}, \qquad w_i^{SU} = \frac{1}{p_i^{SU}}$$

The only other change is that Adjustment 3 is now unnecessary.

### Summary of NLSY97 Weights for Rounds 1 through 20

Table 1 summarizes the weights for each of the NLSY97 rounds. The sum of the weights is the same in every round, but when there are fewer positive weights to share this sum, the individual weights tend to increase. The weight that a Round 1 respondent would carry is, when that person is a nonrespondent in a later round, spread across that round's (or panel's) respondents.
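To see why the mean weight rises as the respondent count falls, here is an illustrative sketch with hypothetical weights (the actual NLSY97 nonresponse adjustments are more elaborate than a single ratio adjustment):

```python
# Hypothetical Round 1 weights and later-round response status.
round1_weights = [2000.0, 2500.0, 1800.0, 2200.0]
responded = [True, True, False, True]

population_total = sum(round1_weights)  # this sum is held fixed across rounds
retained = [w for w, r in zip(round1_weights, responded) if r]

# Spread the nonrespondents' share over the remaining respondents.
factor = population_total / sum(retained)
later_weights = [w * factor for w in retained]

# Fewer positive weights share the same total, so each weight increases.
assert abs(sum(later_weights) - population_total) < 1e-9
```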

#### Table 1: Summary Table for NLSY97 Rounds 1-20 Weights Under "Cumulating Cases" Strategy

**Rounds 1–9**

| Statistic | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 |
|---|---|---|---|---|---|---|---|---|---|
| N (> 0) | 8,984 | 8,386 | 8,209 | 8,081 | 7,883 | 7,898 | 7,756 | 7,503 | 7,338 |
| Sum | 19,378,453 | 19,378,454 | 19,378,453 | 19,378,453 | 19,378,453 | 19,378,453 | 19,378,454 | 19,378,453 | 19,378,453 |
| Mean | 2,157.00 | 2,310.81 | 2,360.64 | 2,398.03 | 2,458.26 | 2,453.59 | 2,498.51 | 2,582.76 | 2,640.84 |
| Standard deviation | 931.01 | 1,011.70 | 1,021.15 | 1,052.08 | 1,066.22 | 1,092.47 | 1,118.88 | 1,162.17 | 1,200.80 |
| Min. (> 0) | 760.71 | 846.23 | 858.76 | 889.08 | 866.55 | 864.75 | 900.60 | 868.23 | 916.66 |
| 5th %ile | 887.20 | 938.12 | 969.00 | 983.16 | 997.49 | 992.86 | 1,006.25 | 1,016.25 | 1,037.19 |
| 25th %ile | 1,072.03 | 1,135.96 | 1,168.73 | 1,185.60 | 1,222.83 | 1,204.79 | 1,230.15 | 1,273.38 | 1,263.82 |
| Median | 2,596.59 | 2,777.05 | 2,806.46 | 2,869.30 | 2,955.45 | 2,955.01 | 3,009.67 | 3,144.57 | 3,215.89 |
| 75th %ile | 2,909.73 | 3,111.65 | 3,154.13 | 3,203.60 | 3,271.03 | 3,293.50 | 3,356.67 | 3,458.88 | 3,579.02 |
| 95th %ile | 3,268.16 | 3,556.58 | 3,649.77 | 3,786.59 | 3,831.66 | 3,904.73 | 3,997.80 | 4,158.40 | 4,183.79 |
| Max. | 15,761.82 | 16,718.19 | 16,646.80 | 16,950.37 | 17,277.02 | 17,167.26 | 17,852.01 | 18,026.17 | 18,594.09 |

**Rounds 10–18**

| Statistic | R10 | R11 | R12 | R13 | R14 | R15 | R16 | R17 | R18 |
|---|---|---|---|---|---|---|---|---|---|
| N (> 0) | 7,559 | 7,418 | 7,490 | 7,559 | 7,479 | 7,423 | 7,141 | 7,103 | 6,734 |
| Sum | 19,378,454 | 19,378,453 | 19,378,455 | 19,378,453 | 19,378,452 | 19,378,452 | 19,378,452 | 19,378,454 | 19,378,453 |
| Mean | 2,563.63 | 2,612.36 | 2,587.20 | 2,563.63 | 2,591.05 | 2,610.60 | 2,713.69 | 2,728.21 | 2,877.70 |
| Standard deviation | 1,157.31 | 1,180.74 | 1,178.50 | 1,175.61 | 1,201.68 | 1,225.20 | 1,271.64 | 1,282.70 | 1,318.34 |
| Min. (> 0) | 918.64 | 897.41 | 894.00 | 866.01 | 876.70 | 882.87 | 891.34 | 890.61 | 926.76 |
| 5th %ile | 1,022.26 | 1,046.29 | 1,024.90 | 997.75 | 1,011.30 | 1,020.56 | 1,050.96 | 1,064.28 | 1,093.74 |
| 25th %ile | 1,230.84 | 1,267.50 | 1,254.40 | 1,235.94 | 1,231.02 | 1,242.24 | 1,278.73 | 1,282.44 | 1,378.47 |
| Median | 3,124.67 | 3,178.10 | 3,146.90 | 3,144.54 | 3,155.05 | 3,167.25 | 3,327.70 | 3,390.06 | 3,525.73 |
| 75th %ile | 3,469.62 | 3,541.70 | 3,534.60 | 3,516.00 | 3,563.37 | 3,597.63 | 3,754.63 | 3,755.54 | 3,903.84 |
| 95th %ile | 4,044.36 | 4,127.30 | 4,054.50 | 4,012.05 | 4,118.80 | 4,232.26 | 4,310.88 | 4,380.94 | 4,586.40 |
| Max. | 18,521.05 | 18,857.91 | 19,165.40 | 18,911.82 | 19,222.75 | 19,377.82 | 20,094.05 | 19,803.54 | 20,072.81 |

**Rounds 19–20**

| Statistic | R19 | R20 |
|---|---|---|
| N (> 0) | 6,947 | 6,713 |
| Sum | 19,378,453 | 19,378,452 |
| Mean | 2,789.47 | 2,886.71 |
| Standard deviation | 1,275.79 | 1,351.23 |
| Min. (> 0) | 918.04 | 937.37 |
| 5th %ile | 1,076.51 | 1,090.59 |
| 25th %ile | 1,346.11 | 1,359.18 |
| Median | 3,397.94 | 3,527.77 |
| 75th %ile | 3,786.88 | 3,978.75 |
| 95th %ile | 4,441.97 | 4,626.92 |
| Max. | 20,043.66 | 21,090.85 |

### Calculating Design-Corrected Standard Errors for NLSY97

#### Overview

The National Longitudinal Survey of Youth, 1997 (NLSY97) is a combination of two area-probability samples. The Cross-Sectional (CX) sample is an equal-probability multi-stage cluster sample of housing units for the entire United States (every housing unit in the United States in 1997 had an equal probability of being in the CX sample). The Supplemental (SU) Sample is a multi-stage cluster sample of housing units that oversamples Hispanic and non-Hispanic Black youths; it is designed so that every eligible Hispanic and non-Hispanic Black youth in the United States in a housing unit in 1997 had an equal probability of being in the SU sample.

Since these samples are cluster samples, standard errors are larger for the NLSY97 than simple random sample calculations (calculated without correction for the design) would indicate. To calculate standard errors correctly, design variables must be supplied to your statistical software. Without these design variables, statistical software will assume a simple random sample and underestimate standard errors. For the variables selected below in Table 1, we see multiplier effects on standard errors (labeled DEFT) ranging from 1.30 to 1.86.
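The DEFT values in Table 1 are multipliers on the standard error, while the DEFF values are multipliers on the variance, so DEFT is the square root of DEFF. For example, using the family-income row of Table 1:

```python
import math

# DEFF is the variance multiplier reported by Stata's estat effects;
# DEFT is the corresponding standard-error multiplier: DEFT = sqrt(DEFF).
# The DEFF value below is taken from the family-income row of Table 1.
deff = 1.728
deft = math.sqrt(deff)
# deft is approximately 1.314, the value reported in the table
```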

In order to facilitate the calculation of design effects, we provide two design variables for every Round 1 NLSY97 interview: VSTRAT and VPSU. VSTRAT is the Variance STRATum, while VPSU is the Variance Primary Sampling Unit. The combination of VSTRAT and VPSU reflects the first-stage and second-stage units selected as part of the NORC National Sampling Frame. There are two second-stage units (VPSU) for each first-stage unit (VSTRAT).

First-stage units in the NLSY97 are called Primary Sampling Units (PSUs), each of which is composed of one or more counties. The largest urban areas are selected with certainty to guarantee their representation in the NLSY97. Second-stage units in the NLSY97 are called segments, each of which is one or more Census-defined blocks. The first-stage and second-stage units are selected with probabilities proportional to size (housing units for the CX sample; minority youths for the SU sample), and the sample housing units (third-stage units) are then selected to be an equal-probability sample.

To create the variables VSTRAT and VPSU, we recode the PSUs and segments, depending on whether the PSU was selected with certainty. Certainty PSUs are considered strata, so all the segments in one certainty PSU are in one VSTRAT value, with segments divided so that half are assigned to VPSU = 1 while the other half are assigned to VPSU = 2.  Some certainty PSUs are large enough to be divided into multiple VSTRAT values with up to twenty segments in one VSTRAT value (ten in each VPSU). Non-certainty PSUs are paired into one VSTRAT value with one PSU assigned to VPSU = 1 while the other PSU is assigned to VPSU = 2. It is rare, but possible, for PSUs to be combined in one VPSU. This strategy was designed by Kirk Wolter.

Here is sample Stata code to analyze the variable ANALYSISVAR within an NLSY97DATAFILE with the appropriate weight variable for the analysis, WTVAR:

```stata
use NLSY97DATAFILE.dta, clear
svyset [pweight=WTVAR], strata(vstrat) psu(vpsu) singleunit(scaled)
svy: mean ANALYSISVAR                            // mean for continuous variables
svy: proportion ANALYSISVAR                      // proportion for categorical variables
estat effects                                    // design effects: generates DEFF and DEFT
svy, subpop(if SUBGROUP==1): mean ANALYSISVAR    // mean within a subpopulation
svy: tabulate ANALYSISVAR                        // one-way table
```

The results generated by running this code on select NLSY97 variables are shown in Table 1. They report design-corrected standard errors as well as standard errors assuming simple random sampling as would be estimated in the absence of these design variables.

#### Table 1: Variance estimation for selected NLSY97 variables

| Variable | Mean/proportion | Sample size | Estimate | Design-corrected std error | SRS std error | DEFF | DEFT |
|---|---|---|---|---|---|---|---|
| Gross family income 2009 from 2010 interview [T5206900] | mean | 6527 | 64858.5 | 923.831 | 773.64 | 1.728 | 1.314 |
| ASVAB score for Math (percentile) [R9829600] | mean | 7093 | 50.410 | 0.638 | 0.367 | 3.447 | 1.857 |
| ASVAB score for Math (percentile) for females [R9829600, R0536300] | mean | 8102 | 51.508 | 0.705 | 0.509 | 2.222 | 1.491 |
| Weeks worked in 2008 [Z9061800] | mean | 8011 | 40.021 | 0.270 | 0.224 | 1.700 | 1.304 |
| Ever received a bachelor's degree or higher as of 2011 interview [T6657200] | proportion | 7398 | 0.300 | 0.009 | 0.006 | 2.614 | 1.617 |
| Never received a high school diploma as of 2011 interview [T6657200] | proportion | 7398 | 0.199 | 0.007 | 0.005 | 2.368 | 1.539 |
| Lived with 2 parents (at least 1 bioparent) in Round 1 [R1205300] | proportion | 8953 | 0.675 | 0.007 | 0.005 | 2.216 | 1.489 |

Tabulations use the Round 1 Cumulating Cases Sampling Weight [R1236101].

One thing to note is that the sample size for math ASVAB scores for females is greater than the sample size for math ASVAB scores for all respondents. Although the estimate uses only the 3,503 observations of females with non-missing ASVAB scores, the variance calculation uses information from all observations in the data, whether or not they have valid values for the specific variables.

In SAS, one would use the SURVEY procedures (for example, PROC SURVEYMEANS and PROC SURVEYFREQ) to calculate the design-corrected standard errors. SPSS is menu-driven, so no code is given here, but you can create design-corrected standard errors within SPSS using the Complex Samples add-on.

#### Potential Problems with Sparse Data

If you are calculating design-corrected standard errors using a subsample of the NLSY97 data set and/or using a variable that has a large number of observations with missing values, you may receive a message such as this:

Stata error: "missing standard error because of stratum with single sampling unit"

VSTRAT and VPSU were created so that there was a minimum of 7 NLSY97 respondents within a VSTRAT/VPSU cell.  However, if all respondents within a cell are missing on a variable, it will be impossible to calculate the standard error. If the dataset is subset (to males or females, for example), this error becomes more likely to happen.

The best workaround is to merge two VSTRATA together to eliminate this problem (the VSTRATA are ordered so that similar VSTRATA are numerically consecutive). To diagnose the problem, run a frequency of the data by VSTRAT/VPSU and look for VSTRAT values with only one VPSU with respondents. Here is an example:

| VSTRAT | VPSU | # of cases |
|---|---|---|
| … | … | … |
| x-1 | 1 | 3 |
| x-1 | 2 | 5 |
| x | 1 | 4 |
| x+1 | 1 | 6 |
| x+1 | 2 | 4 |
| … | … | … |

The error occurs because VSTRAT = x has four cases with VPSU = 1 but none with VPSU = 2, which prevents Stata from calculating the variance for this stratum (there is nothing to compare VPSU = 1 against). The cases were sorted by VSTRAT and VPSU so that the most similar VSTRATA are numbered consecutively and the most similar cases always fall within the two VPSU values of one VSTRAT value. The easiest solution is therefore to make room for VSTRAT x within VSTRAT x-1 by combining the two VPSUs within VSTRAT x-1; the VSTRAT x cases are then moved into the other VPSU value within VSTRAT x-1. Note that VSTRAT x-1 is chosen instead of VSTRAT x+1 in this example because VSTRAT x-1 has fewer total cases (8) than VSTRAT x+1 (10). In some, but not all, cases the VPSU 1 cases in VSTRAT x are "more similar" to the VSTRAT x-1 cases than to the VSTRAT x+1 cases. Here are the two programming steps:

1. If VSTRAT = x-1 and VPSU = 2, then set VPSU = 1.

2. If VSTRAT = x, then set VSTRAT = x-1 and VPSU = 2.
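The two steps above can be sketched in Python (with hypothetical case counts and a hypothetical singleton stratum x = 5; in practice you would apply the same recodes to VSTRAT and VPSU in your statistical package):

```python
# Cases as (VSTRAT, VPSU) pairs.  Stratum x = 5 is the problem stratum:
# all of its cases fall in VPSU 1, so its variance cannot be estimated.
x = 5
cases = [(4, 1), (4, 2), (4, 2), (5, 1), (5, 1), (6, 1), (6, 2)]

recoded = []
for vstrat, vpsu in cases:
    if vstrat == x - 1 and vpsu == 2:
        vpsu = 1                   # step 1: collapse stratum x-1 into VPSU 1
    elif vstrat == x:
        vstrat, vpsu = x - 1, 2    # step 2: move stratum x into stratum x-1 as VPSU 2
    recoded.append((vstrat, vpsu))

# Stratum x is gone, and stratum x-1 again has cases in both VPSUs.
```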

Here is the revised frequency:

| VSTRAT | VPSU | # of cases |
|---|---|---|
| … | … | … |
| x-1 | 1 | 8 |
| x-1 | 2 | 4 |
| x+1 | 1 | 6 |
| x+1 | 2 | 4 |
| … | … | … |

This eliminates the "stratum with a single sampling unit" error. In severe cases of data subsetting, this step may be required more than once, although repeated merging may also indicate that the "clustering" has effectively been removed (by using less than 10 percent of the total sample, for example).