Sample Construction

Most of the statistical formulas associated with sampling theory are based upon the assumption of simple random sampling. Specifically, the statistical formulas for specifying sampling precision (estimates of sampling variance) for a given sample size are premised on simple random sampling. Simple random sampling, however, requires that every element in the population have an equal chance of being selected, which in turn presupposes a complete enumeration of the population. Since no enumeration of the total population of the United States (or its subdivisions) is available, all surveys of the general public are based upon an approximation of the actual population, and survey samples are generated by a process closely resembling true random sampling.
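As a rough illustration of the precision formulas referred to above, the following Python sketch computes the approximate margin of error for an estimated proportion under simple random sampling. The function name, the 95 percent confidence multiplier (z = 1.96), and the example proportion of 0.5 are illustrative assumptions rather than details of this study, and the formula ignores both the finite population correction and any design effect from stratification or weighting.

```python
import math


def srs_margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p estimated from a
    simple random sample of size n (hypothetical helper, not from the study)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    # Standard error of a sample proportion under simple random sampling.
    standard_error = math.sqrt(p * (1.0 - p) / n)
    return z * standard_error


# Most conservative case (p = 0.5) for a sample of 3,000 interviews.
moe_3000 = srs_margin_of_error(0.5, 3000)
print(f"{100 * moe_3000:.2f} percentage points")
```

Under these assumptions, a sample of 3,000 yields a margin of error of a little under two percentage points for a proportion near 50 percent.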

The survey sample was based on a modified stratified random digit dialing method, using an area probability/RDD sample rather than a single-stage/RDD sample. There are several important advantages to using an area probability base: (1) it draws the sample proportionate to the geographic distribution of the target population rather than the geographic distribution of telephone households, which is vital to constructing unbiased population estimates from telephone surveys; (2) it allows greater geographic stratification of the sample to control for known geographic differences in non-response rates; and (3) it facilitates the use of Census estimates of population characteristics to weight the completed sample to correct for other forms of sampling bias. Moreover, the precision of sample estimates is generally improved by stratification.

Hence, as specified in the study design, the adult household population of the United States was stratified by the ten NHTSA regions. The estimated distribution of the population by stratum was calculated on the basis of the Bureau of the Census, Resident Population of the United States, Regions and States by Selected Age Groups and Sex: April 1, 1990 Census and July 1, 1995 to July 1, 2050 Estimates (release date, February 1998). At the time of the survey, these were the most recent projections of the distribution of adult population by state. Based on these Census data on the geographic distribution of the target population, the total sample was proportionately allocated by stratum. The geographic allocation of the cross-sectional sample for the survey is presented in Table 1.

TABLE 1
Population Aged 16 and Older by NHTSA Region: February, 1998

Region        States                             Population     Cross-Section    Cross-Section
                                                               Proportion (%)           Sample
Total                                           207,594,256            100.00          (3,000)
Region I      CT, ME, MA, NH, RI, VT             10,565,351              5.09              153
Region II     NJ, NY                             20,321,261              9.79              294
Region III    DE, DC, MD, PA, VA, WV             21,443,106             10.33              310
Region IV     AL, FL, GA, KY, MS, NC, SC, TN     39,268,081             18.92              567
Region V      IL, IN, MI, MN, OH, WI             37,650,887             18.14              544
Region VI     AR, LA, NM, OK, TX                 23,792,984             11.46              344
Region VII    IA, KS, MO, NE                      9,780,660              4.71              141
Region VIII   CO, MT, ND, SD, UT, WY              6,771,875              3.26               98
Region IX     AZ, CA, HI, NV                     29,589,196             14.25              428
Region X      AK, ID, OR, WA                      8,410,855              4.05              122

Source: Resident Population of the United States, Regions and States by Selected Age Groups and Sex: April 1, 1990 Census and July 1, 1995 to July 1, 2050 Estimates, U.S. Bureau of the Census (release date, February 1998).
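The proportionate allocation in Table 1 can be sketched in Python as follows. The population counts are copied from the table; the function name and the choice to round each stratum independently to the nearest whole interview are assumptions made for illustration, not a description of the procedure actually used.

```python
# Population aged 16 and older by NHTSA region, copied from Table 1.
REGION_POPULATION = {
    "Region I": 10_565_351,
    "Region II": 20_321_261,
    "Region III": 21_443_106,
    "Region IV": 39_268_081,
    "Region V": 37_650_887,
    "Region VI": 23_792_984,
    "Region VII": 9_780_660,
    "Region VIII": 6_771_875,
    "Region IX": 29_589_196,
    "Region X": 8_410_855,
}


def proportional_allocation(populations, total_sample):
    """Allocate total_sample across strata in proportion to population,
    rounding each stratum independently (an illustrative assumption)."""
    total_population = sum(populations.values())
    return {
        stratum: round(total_sample * count / total_population)
        for stratum, count in populations.items()
    }


allocation = proportional_allocation(REGION_POPULATION, 3000)
total_population = sum(REGION_POPULATION.values())
for region, n in allocation.items():
    share = 100 * REGION_POPULATION[region] / total_population
    print(f"{region:<12}{share:7.2f}%{n:6d}")
```

Because each stratum is rounded separately, the allocated counts need not add up exactly to the target; with the figures in Table 1 they sum to 3,001 rather than 3,000.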

Once the sample had been geographically stratified, with sample allocation proportionate to population distribution, a sample of assigned telephone banks was randomly selected from an enumeration of the Working Residential Hundreds Blocks of the active telephone exchanges within each region. A Working Residential Hundreds Block was defined as a block of 100 potential telephone numbers within an exchange that included three or more residential listings. (Blocks with one or two listings were excluded because in most cases such listings represent errors in the published listings.)
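The definition of a Working Residential Hundreds Block given above can be illustrated with a short Python sketch. It assumes that listings arrive as ten-digit North American telephone numbers and that a hundreds block is identified by the first eight digits (area code, exchange, and the first two digits of the line number); the function name and the fictitious example numbers are hypothetical.

```python
from collections import Counter


def working_hundreds_blocks(listed_numbers, min_listings=3):
    """Return the 8-digit block prefixes that contain at least
    min_listings distinct residential listings."""
    # Normalize to digits only and drop duplicates of the same number.
    normalized = set()
    for number in listed_numbers:
        digits = "".join(ch for ch in number if ch.isdigit())
        if len(digits) == 10:  # skip malformed entries
            normalized.add(digits)
    # Count distinct listed numbers per hundreds block.
    counts = Counter(digits[:8] for digits in normalized)
    return sorted(block for block, n in counts.items() if n >= min_listings)


# Fictitious listings: three fall in block 212-555-01xx, two in 212-555-02xx.
listings = [
    "212-555-0101",
    "212-555-0117",
    "212-555-0199",
    "212-555-0245",
    "212-555-0250",
]
blocks = working_hundreds_blocks(listings)
print(blocks)
```

In this example only the first block qualifies, since the second contains just two listings and falls below the three-listing threshold.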

The use of residential listings to identify working residential exchanges is generally described as "list-assisted" or "truncated" RDD sampling. In a series of empirical studies, Brick et al. demonstrated that only about four percent of all telephone households are excluded in national samples using this method. In addition, these studies indicate that the differences between covered and uncovered samples are trivial in most instances. The principal advantage of list-assisted sampling is that an equal-probability systematic sample of telephone numbers can be selected under this procedure, and the variances of estimates from a list-assisted sample are usually lower than those from a clustered design such as the Mitofsky-Waksberg RDD method.

In the third-stage sample, a two-digit number was randomly generated by computer for each Working Residential Hundreds Block selected in the second-stage sample and appended to the block's prefix to form a complete telephone number. This third-stage sampling process is the random digit dialing (RDD) component. Every telephone number within a selected Hundreds Block has an equal probability of selection, regardless of whether it is listed or unlisted.
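The random digit dialing step can be sketched in Python as follows. The sketch assumes each selected block is represented by its eight-digit prefix; the function name, the example prefixes, and the seed are hypothetical and serve only to make the illustration reproducible.

```python
import random


def rdd_numbers(blocks, rng=None):
    """Generate one telephone number per hundreds block by appending a
    uniformly random two-digit suffix (00 through 99) to the block prefix."""
    if rng is None:
        rng = random.Random()
    numbers = []
    for block in blocks:
        suffix = rng.randrange(100)  # each of the 100 suffixes equally likely
        numbers.append(f"{block}{suffix:02d}")
    return numbers


# Two fictitious blocks; a seeded generator keeps the example repeatable.
selected_blocks = ["21255501", "21255507"]
sample = rdd_numbers(selected_blocks, rng=random.Random(1998))
print(sample)
```

Because the suffix is drawn without reference to any directory, listed and unlisted numbers within a block are equally likely to be generated.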

The third-stage RDD sample of telephone numbers was then dialed by SRBI interviewers to determine which were currently working residential household phone numbers. Non-working and non-residential numbers were immediately replaced by other RDD numbers selected within the same stratum in the same fashion as the initial number. Ineligible households (e.g., no adult in the household, language barriers) were also immediately replaced. Non-answering numbers were not replaced until the research protocol (in this study, a five-call protocol) was exceeded.
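The replacement rules described above can be summarized in a small Python sketch. The disposition labels and the function name are hypothetical, and the sketch assumes that the five-call protocol is treated as exhausted once five attempts have been made without an answer; the source does not spell out that boundary.

```python
# Hypothetical disposition codes for a dialed number.
IMMEDIATE_REPLACEMENT = {"non_working", "non_residential", "ineligible"}


def needs_replacement(disposition, attempts, max_attempts=5):
    """Decide whether a sampled number should be replaced by a new RDD
    number drawn from the same stratum."""
    if disposition in IMMEDIATE_REPLACEMENT:
        return True  # replaced right away, regardless of attempt count
    if disposition == "no_answer":
        # Assumption: replace only after the call protocol is exhausted.
        return attempts >= max_attempts
    return False  # e.g. completed interviews or pending callbacks


print(needs_replacement("non_working", attempts=1))
print(needs_replacement("no_answer", attempts=3))
print(needs_replacement("no_answer", attempts=5))
```

Keeping non-answering numbers in the sample until the protocol is exhausted avoids systematically replacing hard-to-reach households with easier-to-reach ones.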