Inference for non-probability surveys
Introduction
In week 7 we discussed model-assisted survey inference. This week we go one step further and discuss model-based approaches, that also underlie imputation models. As long as we can come up with a statistical model that describes the ‘thing’ we are interested in a good way, we can use that model for making predictions about data that are missing (imputation), or about the population as a whole. A statistical model is used as the basis for performing inference; inclusion probabilities no longer play a role.
Literature
- Meng, X. L. (2018). Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics, 12(2), 685-726.
- Mercer, A. W., Kreuter, F., Keeter, S., & Stuart, E. A. (2017). Theory and practice in nonprobability surveys: parallels between causal inference and survey inference. Public Opinion Quarterly, 81(S1), 250-271.
- Valliant, R. (2020) Comparing alternatives for estimation from nonprobability samples. Journal of Survey Statistics and Methodology, 8(20), 231-263
Lecture
We review the U.S. presidential dataset again. How does one design a good inference model? The Total Survey Error Model is reviewed, with a focus on non-probability based datasets. We discuss several approaches to doing non-probability inference using purely model-based inference, pseudo-likelihood methods, and mass imputation. Slides
Class and Take home exercises
Can you model? We introduce a short competition. Who can design the best model?
Class Exercise
Survey data
Population data
Mass imputation population