NLSY79 Appendix 23: Revised Asset and Debt Variables and Computed TOTAL Net Wealth Variables
The NLSY79 public release data contains a variety of asset and debt data, as well as computed variables depicting total net family wealth at each survey point. Below are descriptions of the revised asset and debt variables for survey years 1985-2000 (found in the ASSETS area of interest, with question names ending in “_REVISED”) and of the constructed Total Net Family Wealth (TNFW_TRUNC) variables.
Revised Asset and Debt Variables
The public release contains a revised set of asset and debt variables for survey years 1985-2000. These revised asset and debt variables were created in 2008 and fixed a number of problems with the NLSY79 data by eliminating some implausible outliers and generating uniform topcodes for all rounds. These variables are then used in the computation of Total Net Family Wealth variables. Two versions of each asset and debt variable appear in the data. Using the market value of residential property owned by the respondent in 1987 as an example, the following variables appear in the data:
R23627.00 [Q13-118] MARKET VALUE OF RESIDENTIAL PROPERTY R/SPOUSE OWN (TRUNC)
R23627.01 [Q13-118_REVISED] MARKET VALUE OF RESIDENTIAL PROPERTY R/SPOUSE OWN (TRUNC) (REVISED)
In the example, Q13-118 is the original variable which remains in the dataset to allow researchers to reproduce previous results. Q13-118_REVISED, as indicated, is a revised version of Q13-118, which uses a modified topcoding algorithm. This algorithm provides researchers with some additional information with respect to the topcoded range of responses.
Table 1 gives an example using the 1987 property value question of how seven different types of cases were handled by the revision process.
Table 1. Hypothetical Examples of How NLSY79 Asset/Debt Data Were Modified
Case | Original R23627.00 Q13-118 | Revised R23627.01 Q13-118_REVISED | Explanation |
Case #1 | $150,001 | $276,984 | Originally above the topcode and the value is still above the topcode but the topcode is now higher, revealing more information |
Case #2 | $150,001 | $151,500 | Originally above topcode and now below topcode. Value is no longer topcoded. |
Case #3 | -1 | -1 | Originally “refused”. |
Case #4 | -2 | -2 | Originally “don't know”. |
Case #5 | -3 | -3 | Originally an “invalid skip”. |
Case #6 | -4 | -4 | Originally a valid skip. Since valid skip means does not have the asset the item is changed to zero. |
Case #7 | -2 | -2 | Originally “don't know”. |
Not every asset or debt variable has a revised version. Fifteen asset/debt categories were created in each year, matching the categories found in the 1990s NLSY79 wealth module. The categories are: Home Value, Mortgage Value, Property Debt Value, Cash Saving, Stocks/Bonds, Trusts, Business Assets, Business Debts, Vehicle Value, Vehicle Debt, Possession Value, Other Debt Value, IRA, 401K, and Certificate of Deposit Value. In 2000 and subsequent survey years, some complexities were added to the asset and wealth module as the cohort aged. For these more recent survey rounds, each asset/debt category corresponds to multiple individual asset/debt variables.
For example, in 2004 respondents were asked to report the values of two homes. Their values are combined to form the "home value" category. Similarly, in 2004 the "stocks/bonds" category represents the individually-reported values of government bonds, mutual funds, life insurance surrender values, stocks, corporate bonds, and money owed to the respondent.
Details of the Revisions to Asset Values 1985-2000
The process for creating the revised variables described above can be broken down into the three steps described below.
Step 1 -- Cleaning Raw Data
The original raw data for many asset/wealth variables contain a number of “out-of-range” codes. Some of these out-of-range codes denoted values that exceeded the maximum value accommodated in the questionnaire and allowed for by the data entry software. These out-of-range codes were originally given the top code value when released to the public. Examination of cases with out-of-range codes suggests that some were mistakes and not actually out of range. This issue arises throughout the pre-1993 Paper and Pencil Interviewing (PAPI) years but is most prevalent in survey years 1988 and 1989. For the REVISED set of variables (for example, Q13-118_REVISED), these out-of-range codes were assigned an "invalid missing" (-3) code. These cases can be found by identifying values that were top coded in the original version of a variable (Q13-118) but were assigned an “invalid missing” code (-3) in the REVISED version (Q13-118_REVISED).
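The comparison described in the last sentence can be sketched in a few lines of Python. This is only an illustration: the record layout, the function name `reclassified_out_of_range`, and the topcode value of 150,001 (borrowed from the hypothetical cases in Table 1) are assumptions, and the actual topcode value differs by variable and survey year.

```python
# Sketch: find cases that were top coded in the original variable but carry
# the "invalid missing" code (-3) in the REVISED variable.
# The record layout and the topcode value are illustrative assumptions.

INVALID_MISSING = -3  # code named in the text


def reclassified_out_of_range(records, topcode_value):
    """Return the ids of cases top coded originally but -3 after revision."""
    return [
        rec["id"]
        for rec in records
        if rec["original"] == topcode_value and rec["revised"] == INVALID_MISSING
    ]


# Made-up example records (not real NLSY79 data).
cases = [
    {"id": 1, "original": 150001, "revised": 276984},  # still a dollar value
    {"id": 2, "original": 150001, "revised": -3},      # reclassified
    {"id": 3, "original": 82000, "revised": 82000},    # never top coded
    {"id": 4, "original": -3, "revised": -3},          # missing in both
]

flagged = reclassified_out_of_range(cases, topcode_value=150001)
print(flagged)
```

Case 4 is not flagged because it was already missing in the original variable; only cases that moved from the top code to -3 are of interest here.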
Step 2 -- Unfolding Brackets
Unfolding brackets are a means for respondents to estimate certain asset values of which they were unsure. Respondents who are unable to report a specific value for an asset or liability are asked a series of questions to establish a loose range for a value (see Appendix 20: Round 20 (2002) Early Bird and Income Recall Experiments for further information on the use of unfolding brackets in the NLSY79). A respondent who does not know or refuses to report the value of his certificate of deposit (CD) would first be asked if the CD is worth more than a randomly assigned entry amount ($10,000 for some respondents and $20,000 for others). If the value is not above the entry amount, the respondent is asked if the value of his CD is worth $5,000 or more. If the value is above the entry amount, he is asked if the value would amount to $30,000 or more. These three questions result in four potential reported ranges: 1) below $5,000; 2) between $5,000 and the entry amount; 3) between the entry amount and $30,000; and 4) above $30,000.
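The three-question sequence for the CD example can be sketched as a small function. The function name, the argument names, and the use of 0 as the lower bound of the lowest range are assumptions made for illustration; the dollar thresholds come from the paragraph above.

```python
# Sketch of the unfolding-bracket sequence for the CD example.
# Names and the lower bound of 0 are illustrative assumptions.


def bracket_range(entry_amount, above_entry, follow_up_yes):
    """Map the answers to the bracket questions onto a (low, high) range.

    entry_amount  -- randomly assigned entry amount (10000 or 20000 in the text)
    above_entry   -- True if the value is above the entry amount
    follow_up_yes -- answer to the follow-up question: "$30,000 or more?" when
                     above_entry is True, "$5,000 or more?" otherwise
    A high bound of None marks the open-ended top range.
    """
    if above_entry:
        if follow_up_yes:
            return (30000, None)        # range 4: $30,000 or more
        return (entry_amount, 30000)    # range 3: entry amount to $30,000
    if follow_up_yes:
        return (5000, entry_amount)     # range 2: $5,000 to entry amount
    return (0, 5000)                    # range 1: below $5,000


print(bracket_range(10000, above_entry=False, follow_up_yes=True))
print(bracket_range(20000, above_entry=True, follow_up_yes=False))
```

Each respondent answers only two of the three questions, since the follow-up depends on the answer to the entry-amount question.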
Beginning in survey year 2000, as part of an experiment for estimation of values, unfolding brackets were used for four asset/debt categories and the net wealth question. In 2004, unfolding brackets were expanded to all asset/debt categories for which respondents were not sure of a value.
Estimates given for asset values were incorporated into the calculation of TNFW_TRUNC variables by:
- Using the midpoint of the range for quartiles 1, 2, and 3; and
- Using the median of values between the floor and highest value reported for quartile 4.
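A minimal sketch of the two rules follows. It assumes that a bracketed range is given as a (low, high) pair with high set to None for the open-ended top range, and that the median for the top range is taken over exact dollar values reported by other respondents at or above the floor. Both the data layout and that reading of the second rule are assumptions; the programs linked from this appendix remain the authoritative source.

```python
# Sketch: turn a bracketed range into a point estimate.
# The (low, high) layout and the tail definition are illustrative assumptions.
from statistics import median


def bracket_estimate(low, high, reported_values):
    """Return a point estimate for a value known only to lie in [low, high].

    Closed ranges use the midpoint. The open-ended top range (high is None)
    uses the median of exactly reported values at or above the floor, or
    None when no such values are available.
    """
    if high is not None:
        return (low + high) / 2
    tail = [v for v in reported_values if v >= low]
    if not tail:
        return None
    return median(tail)


# Made-up exact reports from other respondents (not real NLSY79 data).
exact_reports = [1000, 35000, 50000, 90000]

print(bracket_estimate(5000, 10000, exact_reports))   # closed range
print(bracket_estimate(30000, None, exact_reports))   # open-ended top range
```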
Step 3 -- Revision of Top Codes
In the final step, new and consistent top codes were calculated for the wealth data.
The NLSY79 has used three basic types of top coding algorithms for financial data. In the early years of the survey (up to 1988), every response above a specified ceiling value, such as $100,000 for some variables, was recoded to the ceiling value plus one dollar, such as $100,001. Unfortunately, this algorithm biases the sample mean sharply downward because the right tail of the distribution is truncated. From 1989 to 1994, a new algorithm was implemented, replacing all values above the hard ceiling with the average of all outlying values. Beginning in 1996, the hard ceiling value was eliminated and the average of the top 2% of values above the traditional hard ceiling was used as a topcode value.
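The three approaches can be compared with a short sketch. The function names are hypothetical, the inputs are assumed to be valid dollar values with missing codes already removed, and the third function implements only one possible reading of the post-1996 rule (replace the top 2% of valid values with their average).

```python
# Sketch of three top coding approaches. Function names are hypothetical and
# inputs are assumed to be valid dollar values (no negative missing codes).


def topcode_ceiling_plus_one(values, ceiling):
    """Early-years rule: recode anything above the ceiling to ceiling + 1."""
    return [ceiling + 1 if v > ceiling else v for v in values]


def topcode_mean_above_ceiling(values, ceiling):
    """1989-1994 rule: replace values above the ceiling with their average."""
    above = [v for v in values if v > ceiling]
    if not above:
        return list(values)
    mean_above = sum(above) / len(above)
    return [mean_above if v > ceiling else v for v in values]


def topcode_top_percent(values, percent=2):
    """One reading of the later rule: average the top `percent` of values."""
    if not values:
        return []
    n_top = max(1, len(values) * percent // 100)
    cutoff = sorted(values, reverse=True)[n_top - 1]
    top = [v for v in values if v >= cutoff]  # ties at the cutoff are included
    mean_top = sum(top) / len(top)
    return [mean_top if v >= cutoff else v for v in values]


sample = [50000, 120000, 180000]  # made-up values
print(topcode_ceiling_plus_one(sample, ceiling=100000))
print(topcode_mean_above_ceiling(sample, ceiling=100000))
```

In this made-up sample the first rule lowers the total from 350,000 to 250,002, which illustrates the downward bias noted above, while the second rule leaves the total unchanged.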
In addition to the multiple topcoding methods used over time, some researchers commented on the lack of information above a hard ceiling. The data cleaning steps described above also dramatically changed a number of the highest asset values. For these reasons, home and vehicle values were re-topcoded. Homes and vehicles are clearly identifiable objects which pose the possibility of reidentifying respondents. Business-related values over $1,000,000 are also topcoded due to reidentification concerns. Other asset or debt categories are no longer topcoded because it is difficult to use them to identify a particular respondent. If the variable was previously topcoded it was re-topcoded using the top 2% of values as described above.
Computed Net Wealth Variables
The current data release includes an updated set of net wealth variables. Unlike previous versions of these items, these updated variables (named “TNFW_TRUNC”) do not incorporate imputations.
Below is a brief history and description of the variables that were used as components, sometimes combined by category, to create the TNFW_TRUNC figures. Information is still available on individual assets and debts for researchers who want to probe a particular aspect of a respondent’s financial life, such as their debts or ownership of vehicles. For a link to the programs used to create the TNFW_TRUNC variables, go to Programs Creating Total Net Family Wealth Variables.
Asset and debt questions were first introduced in 1985. They were included in each survey through the 1990s, with the exception of 1991. Beginning in 2000, the assets module has been included in alternating rounds. These questions solicit information about specific categories of assets and debt. The 1985 survey included questions about ten categories. In subsequent years, these categories were expanded. Through the 1990s, the values for each category were collected with a single question per category. In 2000 and beyond, multiple values were collected with respect to some categories, providing some additional definition to those categories. For instance, questions about home value were expanded to include the value of a secondary residence as well. A single question on stocks/bonds/mutual funds/etc. was split to ask individually about stock values, bond values, mutual fund values, etc. Questions on business values and debts were expanded to establish sole or joint ownership of the business and percentage of the business that respondents owned if applicable.
In the 2000 and subsequent surveys, respondents were also given the opportunity to provide estimates for some values of which they were unsure or for which they refused to report a precise figure. Respondents could report values within a bounded range (reporting a low and a high value) and/or report values simply over or under certain amounts (see the description of unfolding brackets earlier in the appendix), enabling a range to be established. These estimates were incorporated into the TNFW_TRUNC values where available. Questions whose names contain the strings “_SR000001” and “_SR000002” hold the low and high values of any bounded range the respondent reported. Questions whose names contain the strings “UAB_A”, “UAB_B” and “UAB_C” hold responses to the unfolding brackets if the respondent reported estimates in that manner. The exception is 2000, the introductory year, when only a few questions (Q13-122, Q13-123C and Q13-125) were followed by unfolding brackets if necessary.
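When working with a list of question names, the naming pattern above can be used to group estimate questions with their root question. The sketch below is an assumption-laden illustration: the function name and role labels are invented, it treats the strings as suffixes (as in names such as NFA_1A_SR000001), and it does not handle any loop or occurrence suffixes that might follow them.

```python
# Sketch: split a question name into its root and the kind of estimate it holds.
# Role labels and suffix handling are illustrative assumptions.

SUFFIX_ROLES = {
    "_SR000001": "bounded range, low value",
    "_SR000002": "bounded range, high value",
    "_UAB_A": "unfolding bracket, question A",
    "_UAB_B": "unfolding bracket, question B",
    "_UAB_C": "unfolding bracket, question C",
}


def split_qname(qname):
    """Return (root, role); role is 'root' when no estimate suffix is found."""
    for suffix, role in SUFFIX_ROLES.items():
        if qname.endswith(suffix):
            return qname[: -len(suffix)], role
    return qname, "root"


for name in ["NFA_1A_SR000001", "NFA_2A_UAB_C", "Q13-131"]:
    print(name, "->", split_qname(name))
```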
The categories of assets and debt are listed below, along with the survey year in which they were introduced.
1) Home value (1985)
2) Mortgages (1985)
3) Other residential debt (1985)
4) Value of farm/business/real estate (1985)
5) Debts of farm/business/real estate (1985)
6) Market value of vehicles (1985)
7) Debt of vehicles (1985)
8) Value of stocks/bonds/mutual funds (1988)
9) Value of CDs (1994)
10) Value of trusts (1988–2000 only)*
11) Value of IRAs (1994)
12) Value of 401ks and 403bs (1994)
13) Value of cash savings (1985)
14) Value of other assets like jewelry/collections (1985)
15) Value of all other debts like credit cards/student loans (1985)
*Trusts were dropped in 2000 as few respondents reported being trust fund recipients.
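Conceptually, net wealth is the sum of the asset categories minus the sum of the debt categories. The sketch below illustrates that arithmetic only; the category keys and the function name are hypothetical, and it assumes every category has already been resolved to a non-negative dollar amount. It is not the program used to create TNFW_TRUNC, which is available through the link mentioned above.

```python
# Sketch: net wealth as total assets minus total debts across the categories
# listed above. Keys and the handling of missing codes are assumptions.

ASSET_CATEGORIES = {
    "home_value", "business_value", "vehicle_value", "stocks_bonds", "cds",
    "trusts", "iras", "retirement_401k_403b", "cash_savings", "other_assets",
}
DEBT_CATEGORIES = {
    "mortgage", "other_residential_debt", "business_debt", "vehicle_debt",
    "other_debt",
}


def net_wealth(category_values):
    """Return assets minus debts; categories absent from the dict count as 0."""
    for name, value in category_values.items():
        if name not in ASSET_CATEGORIES | DEBT_CATEGORIES:
            raise KeyError(f"unknown category: {name}")
        if value < 0:
            # Negative values are missing-data codes; this sketch refuses to
            # guess how they should be handled.
            raise ValueError(f"unresolved missing code in {name}: {value}")
    assets = sum(v for k, v in category_values.items() if k in ASSET_CATEGORIES)
    debts = sum(v for k, v in category_values.items() if k in DEBT_CATEGORIES)
    return assets - debts


# Made-up household (not real NLSY79 data).
household = {
    "home_value": 200000,
    "mortgage": 120000,
    "cash_savings": 5000,
    "other_debt": 3000,
}
print(net_wealth(household))
```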
Table 2 lists the reference numbers, question names and titles from the 1996 survey for each of the categories, and the imputed variable created for each category:
Table 2. Round 17 (1996) Asset and Debt Variable Categories
Asset/Debt Category | R Num | Root Qname (rev/trunc version used wherever exists) | Description | |
Q13-118_REVISED | R57282.01 | Q13-118 | Mkt Val Res Property R-Sp Own 96 | |
Q13-119 | R57283.01 | Q13-119 | Amount R-Sp Owe On Res Property 96 | |
Q13-120_REVISED | R57284.01 | Q13-120 | Amt Oth Debt R-Sp Owes On Res Prop 96 | |
Q13-122_REVISED | R57286.01 | Q13-122 | Amount In Savings Accounts 96 | |
Q13-123A_REVISED | R57288.01 | Q13-123A | Amount In IRAs-Keough 96 | |
Q13-123C_REVISED | R57290.01 | Q13-123C | | |
Q13-123E_REVISED | R57292.01 | Q13-123E | Amount In CDs, Loans, Mortg 96 | |
Q13-125_REVISED | R57294.01 | Q13-125 | Mkt Val Of Stocks, Bonds R-Sp Have 96 | |
Q13-127_REVISED | R57296.01 | Q13-127 | Total Val Of Estate, Invest Trust 96 | |
Q13-131_REVISED | R57300.01 | Q13-131 | Ttl Mkt Val Farm, Bsns, Oth Prop? 96 | |
Q13-132_REVISED | R57301.01 | Q13-132 | Ttl Amt Debts, Liablty Farm, Bsns 96 | |
Q13-135_REVISED | R57304.01 | Q13-135 | Amt R-Sp Owe On Vehicles 96 | |
Q13-136_REVISED | R57305.01 | Q13-136 | Mkt Val Of Vehicles R-Sp Own 96 | |
Q13-138_REVISED | R57307.01 | Q13-138 | Ttl Mkt Val Items Over $500 96 | |
Q13-140_REVISED | R57309.01 | Q13-140 | Total Amt R-Sp Owe To Creditors 96 |
As an example, Table 3 lists the twelve variables used in 2004 to create the home value variable, as a component of “TNFW_TRUNC.”
Table 3: 2004 Home Value Variables
R Number | Qname | Title |
R83792.00 | NFA_1A_TRUNC | Mkt Val Res Property R/Sp Own 2004 |
R83793.00 | NFA_1A_SR000001 | Est Market Value Of Residential Property R/Spar Own |
R83794.00 | NFA_1A_SR000002 | Est Market Value Of Residential Property R/Spar Own |
R83795.00 | NFA_1A_UAB_A | Market Value Of Residence In 2003 More Than Entry Amount |
R83795.10 | NFA_1A_UAB_B | Market Value Of Residence In 2003 More Than $5k |
R83796.00 | NFA_1A_UAB_C | Market Value Of Residence In 2003 More Than $30k |
R83814.00 | NFA_2A_TRUNC | Market Value of (2nd) Residential Property R/Spouse Own |
R83814.00 | NFA_2A_SR000001 | Est Market Value Of (2nd) Residential Property R/Spar Own |
R83815.00 | NFA_2A_SR000002 | Est Market Value Of (2nd) Residential Property R/Spar Own |
R83816.00 | NFA_2A_UAB_A | Market Value Of (2nd) Residence In 2003 More Than Entry Amount |
R83816.10 | NFA_2A_UAB_B | Market Value Of (2nd) Residence In 2003 More Than $5k |
R83817.00 | NFA_2A_UAB_C | Market Value Of (2nd) Residence In 2003 More Than $30k |
Table 4 lists the root variables included in the asset and debt categories starting with Round 21 (2004). While only the root question names are shown in Table 4, the additional bounded range and unfolding bracket estimate questions, shown in Table 3, are incorporated as well whenever applicable.
Table 4: Data Underlying Mid-Level Asset and Debt Categories Used Beginning in Round 21 (2004)
Asset/Debt Category | Rnum (the first Rnum is listed for sets of variables that fall in loops and have more than one occurrence) | Root Qname (truncated version used wherever exists) | Description |
HOME/RESIDENTIAL VALUE | |||
R83792.00 | NFA_1A | Value of 1st Home | |
R83814.00 | NFA_2A | Value of 2nd Home/Time Share | |
MORTGAGE | |||
R83799.00 | NFA_1B | Mortgage on 1st Home | |
R83820.00 | NFA_2B | Mortgage on 2nd Home/Time Share | |
PROPERTY DEBT | |||
R83808.00 | NFA_1C | Other Property Debt on 1st Home | |
R83827.00 | NFA_2C | Other Property Debt on 2nd Home/Time Share | |
CASH SAVINGS | |||
R83630.00 | FA_1A | Total Amount In Checking, Savings, And Money Market Funds | |
STOCKS/BONDS | |||
R83648.00 | FA_3A | Total Money If R-Spar Cashed In US Government Savings bonds | |
R83657.00 | FA_4A | Total Money If R-Spar Sold Mutual Funds | |
R83666.00 | FA_5A | Total Money If Insurance Policies Cashed | |
R83764.00 | FA_9A | Money R-Spar Have If Sold/Paid Amt Owe On Stock | |
R83773.00 | FA_10A | Amt Of $ If Cash/Pay Off Securities/ Bonds | |
R83782.00 | FA_11A | R-Spouse/Partner Owed Money From Personal Or Mortgage Loans | |
TRUSTS | |||
These variables contain only -4s and -5s after survey year 2000, since the trust asset value questions were not asked. Because there was a chance the underlying questions would be brought back, the variables (all missing) are still present in several years after 2000 to preserve a consistent sequence. | |||
BUSINESS ASSETS | |||
R83868.00 | Q13-FJT-11.## | Market Value Of Farm in 2003, Excluding Crops Held Under Commodity Credit Loans | |
R83873.00 | Q13-FJT-12B.## | Percentage Of Farm Owned By R Or Spouse | |
R83896.00 | Q13-BPPJT-11.## | Market Value Of Business Professional Practice | |
R83904.00 | Q13-BPPJT-12B.## | Percentage Of Professional Practice That R Owns | |
R83907.00 | Q13-BPPJT-12E.## | Market Value Of R Share Of Professional Practice | |
R83927.00 | Q13-REJT-11.## | Market Value Of Additional Real Estate | |
R83932.00 | Q13-REJT-12B.## | Percentage Of Real Estate R-Spouse Own | |
R83936.00 | Q13-REJT-12E.## | Market Value Of R Share Of Real Estate | |
R83943.00 | Q13-131 | Total Market Value Of Farm/Business/Other Property R/Spouse Own | |
BUSINESS DEBT | |||
R83869.00 | Q13-FJT-12.## | Total Amount Of Debts Owed On Farm | |
R83900.00 | Q13-BPPJT-12.## | Total Amount Of Debts Owed On Professional Practice | |
R83950.00 | Q13-132 | Total Amount Of Debts On Farm/Business/Other Property R/Spouse Owe | |
CAR VALUE | |||
R83979.00 | NFA_4C.## | Market Value Of Vehicle | |
R83611.00 | SC_12A.01.## | Current Value Of Vehicle | |
R84105.00 | NFA_5C | Market Value Of Other Personal Use Vehicles | |
CDs | |||
R83639.00 | FA_2A | Total Money If R-Spouse Cashed In Certificate of Deposits/CDs | |
CAR DEBT | |||
R84080.00 | NFA_4F.## | Total Amount Owed By R-Spouse On Vehicle After Last Car Payment | |
R84098.00 | NFA_5B | Total Amount Owed By R-Spar On All Other Personal Use Vehicles | |
R83621.00 | SC_12B.## | Balance Owed On Vehicle After Last Payment | |
POSSESSIONS | |||
R84116.00 | NFA_6E | Market Value Of Collections Worth $1000 Or More | |
R84125.00 | NFA_7D | Market Value Of Individual R-Spouse Items Worth $1000 Or More | |
OTHER DEBT | |||
R84132.00 | DEBT_1A | Total Balance Owed On All Credit Card Accounts Together | |
R84139.00 | DEBT_2A | Total Amount R-Spouse Owes On Student Loans | |
R84147.00 | DEBT_2D | Total Amount Owed On Student Loans For Children | |
R84154.00 | DEBT_3A | Total Amount R-Spouse Owes To Other Businesses After Most Recent Payment | |
R84161.00 | DEBT_4A | Total Amount R-Spouse Owes Other Persons, Institutions, Companies to nearest $1000 | |
IRAs | |||
R83733.00 | FA_8D_TRUNC.## | Total Money If Tax Advantage Account Cashed | |
401Ks | |||
R83683.00 | FA_6E | Total Value Of Emp-Sponsored Retiremt Plans | |
R83700.00 | FA_7C | Tot Balance Of Spar-Emp Sponsored Retiremt Plans |
Researchers can find all the original respondent data (truncated where necessary) in the NLSY79 dataset in the NLS Investigator by searching the root qnames listed in the tables above.