Estimating Statistical Significance

The estimates of sampling precision presented in the previous section yield confidence bands around the sample estimates, within which the true population value should lie. This type of sampling estimate is appropriate when the goal of the research is to estimate a population distribution parameter. However, the purpose of some surveys is to compare population parameters estimated from independent samples (e.g., annual tracking surveys) or from subsets of the same sample. In such instances, the question is not simply whether the sample statistics differ, but whether the difference between the sample estimates is statistically significant (i.e., beyond the expected limits of sampling error for both sample estimates).

To test whether a difference between two sample proportions is statistically significant, a rather simple calculation can be made. Call the total sampling error (i.e., var(x) in the previous formula) of the first sample s1 and the total sampling error of the second sample s2. Then the sampling error of the difference between these estimates is sd, which is calculated as:

$$ s_d = \sqrt{s_1^2 + s_2^2} $$

Any difference between observed proportions whose absolute value exceeds sd is statistically significant at the specified confidence level. Note that this technique is mathematically equivalent to generating standardized tests of the difference between proportions.
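
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the two samples are independent so that the sampling error of the difference is the square root of the sum of the squared sampling errors. The function names and the example figures are hypothetical and are not taken from the survey.

```python
import math


def pooled_sampling_error(s1: float, s2: float) -> float:
    """Sampling error of the difference between two independent estimates.

    s1 and s2 are the sampling errors of the two estimates, expressed in
    the same units (for example, percentage points).
    """
    return math.sqrt(s1 ** 2 + s2 ** 2)


def is_significant(p1: float, p2: float, s1: float, s2: float) -> bool:
    """True when the observed difference exceeds the pooled sampling error."""
    return abs(p1 - p2) > pooled_sampling_error(s1, s2)


# Hypothetical example: two estimates with sampling errors of 3 and 4 points.
sd = pooled_sampling_error(3.0, 4.0)
print(sd)                                     # 5.0
print(is_significant(62.0, 55.0, 3.0, 4.0))   # True: 7 points exceeds 5
print(is_significant(58.0, 55.0, 3.0, 4.0))   # False: 3 points does not
```

Both sampling errors must be computed at the same confidence level for the comparison to be meaningful.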

Table 7 illustrates the pooled sampling error between subsamples of various sizes. The table can be used to find the size of difference in proportions between drivers and non-drivers, or other subsamples, that would be statistically significant.

TABLE 7. Pooled Sampling Error Expressed as Percentages for Given Sample Sizes (Assuming P=Q)

Sample                                           Sample Size
  Size    50   100   200   300   400   500   600   700   800   900  1000  1500  2000  2500  3000  3500  4000

  4000  14.1  10.0   7.1   5.9   5.1   4.7   4.3   4.0   3.8   3.6   3.5   3.0   2.7   2.5   2.4   2.3   2.2
  3500  14.1  10.0   7.1   5.9   5.2   4.7   4.3   4.1   3.8   3.7   3.5   3.0   2.7   2.6   2.4   2.3
  3000  14.1  10.0   7.2   5.9   5.2   4.7   4.4   4.1   3.9   3.7   3.6   3.1   2.8   2.7   2.5
  2500  14.1  10.0   7.2   6.0   5.3   4.8   4.5   4.2   4.0   3.8   3.7   3.2   2.9   2.8
  2000  14.2  10.1   7.3   6.1   5.4   4.9   4.6   4.3   4.1   3.9   3.8   3.3   3.1
  1500  14.2  10.2   7.4   6.2   5.5   5.1   4.7   4.5   4.3   4.1   4.0   3.6
  1000  14.3  10.3   7.6   6.5   5.8   5.4   5.1   4.8   4.7   4.5   4.4
   900  14.4  10.4   7.7   6.5   5.9   5.5   5.2   4.9   4.8   4.6
   800  14.4  10.4   7.8   6.6   6.0   5.6   5.3   5.1   4.9
   700  14.5  10.5   7.9   6.8   6.1   5.7   5.5   5.2
   600  14.6  10.6   8.0   6.9   6.3   5.9   5.7
   500  14.7  10.8   8.2   7.2   6.6   6.2
   400  14.8  11.0   8.5   7.5   6.9
   300  15.1  11.4   9.0   8.0
   200  15.6  12.1   9.8
   100  17.1  13.9
    50  19.8
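
Entries of this kind can be computed directly from the two sample sizes. The sketch below is illustrative only and rests on two assumptions that this section does not state: a 95 percent confidence multiplier of 1.96, and a sampling error for a proportion of the form z * sqrt(P * Q / (n - 1)). Under those assumptions the computed values agree with the Table 7 entries checked here; the function names are hypothetical.

```python
import math

Z_95 = 1.96  # assumed 95 percent confidence multiplier (not stated in this section)


def sampling_error(n: int, p: float = 0.5, z: float = Z_95) -> float:
    """Sampling error of a proportion, in percentage points.

    Assumes the form z * sqrt(p * q / (n - 1)); p = 0.5 is the P=Q case.
    """
    return 100.0 * z * math.sqrt(p * (1.0 - p) / (n - 1))


def pooled_error(n1: int, n2: int, p: float = 0.5, z: float = Z_95) -> float:
    """Pooled sampling error of the difference between two subsamples."""
    return math.sqrt(sampling_error(n1, p, z) ** 2 + sampling_error(n2, p, z) ** 2)


# Spot-check a few pairs of subsample sizes against Table 7.
for n1, n2 in [(50, 50), (100, 100), (1000, 500), (4000, 4000)]:
    print(n1, n2, round(pooled_error(n1, n2), 1))
```

For example, with subsamples of 1,000 and 500 the sketch gives about 5.4 percentage points, so an observed difference between the two groups would need to exceed that to be statistically significant under these assumptions.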