Why access panels cannot weight election polls accurately

There are a lot of reasons why one would not want to use access panels for predicting electoral outcomes. These are well discussed in many places on- and offline. I’ll briefly summarize them, before adding some thoughts on why access panels do so badly at predicting election outcomes.

1. Access panels don’t draw random samples, but rely on self-selected samples. A slightly better way to get panel respondents is a quota sample, but even these have problems, well discussed here, here and here for example. The bottom line is that access panel respondents are not ‘normal’ people, and so the voting preferences of these not-normal people are likely to give a biased picture.

2. Because of these problems, survey managers use weighting: they correct their sample for known biases. If they know that elderly people with a low level of education are underrepresented in an access panel, they weight them up. I think this is bad practice. It has been shown that weighting does not solve the problem, and can sometimes make biases worse for general surveys. Here are some additional and specific problems that are often neglected. In short, weighting only works if the weighting variables predict the dependent variable to a great extent.
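
To make that last point concrete, here is a minimal sketch of post-stratification on a single variable. All group names and percentages are hypothetical, chosen only for illustration, and the two helper functions are not taken from any survey package. In case A party support depends on age only, so weighting on age removes the bias. In case B panel members differ from the population in a way that age does not capture, and the weights change nothing.

```python
# Illustrative post-stratification sketch. All numbers are hypothetical.

def poststratification_weights(population_share, sample_share):
    """Weight per group = population share / sample share."""
    return {g: population_share[g] / sample_share[g] for g in population_share}


def weighted_estimate(support_by_group, sample_share, weights):
    """Weighted share of panel respondents who support a party."""
    weighted_size = {g: sample_share[g] * weights[g] for g in sample_share}
    total = sum(weighted_size.values())
    return sum(weighted_size[g] * support_by_group[g] for g in sample_share) / total


population_share = {"young": 0.5, "old": 0.5}
sample_share = {"young": 0.8, "old": 0.2}   # the panel over-represents the young
weights = poststratification_weights(population_share, sample_share)
no_weights = {g: 1.0 for g in sample_share}

# Case A: support depends on age only, so panel members resemble
# non-members of the same age. True population support is 0.20.
support_a = {"young": 0.30, "old": 0.10}
unweighted_a = weighted_estimate(support_a, sample_share, no_weights)  # about 0.26
weighted_a = weighted_estimate(support_a, sample_share, weights)       # about 0.20

# Case B: support is unrelated to age; panel members of every age
# support the party more (0.30) than the population does (0.20).
support_b = {"young": 0.30, "old": 0.30}
unweighted_b = weighted_estimate(support_b, sample_share, no_weights)  # about 0.30
weighted_b = weighted_estimate(support_b, sample_share, weights)       # about 0.30
```

In case B the estimate stays at 0.30 whatever the age weights are, because the weighting variable carries no information about the outcome.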

Weighting is usually done with socio-demographic variables. From political science research, we know that socio-demographics do a bad job of explaining voting behavior: explained variances for regression models normally don’t exceed 10%.

So, let me get down to the main point I would like to make in this post, a point which I have not seen discussed anywhere.

Panel survey managers have ‘resolved’ the weakness of their weighting models by including a variable that does predict voting behavior fairly well: past voting behavior. If one knows that past Social Democrat voters are underrepresented, one can weight on that variable. This is all very well, if one has good data on past voting behavior for all panel members. The panels currently do not. Their information is wrong in two ways:

1. Access panels will never have past-vote information for people who did not vote previously. These are mainly young people, or people who normally do not vote in elections. If these new voters voted like everyone else there would be no problem, but new voters have very specific voting preferences.

2. Conversely, access panels cannot predict well who is not going to vote in the current elections. If the people who now stay home disproportionately voted for one party in the previous elections, this will lead to an overestimation of the vote for that party.
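
The arithmetic behind both problems can be sketched as follows. This is a deliberately simplified illustration: the parties, shares and turnout fractions are hypothetical, the function `actual_shares` is made up for this example, and it assumes that nobody switches party between elections. Under those assumptions a poll weighted on past vote simply reproduces the previous result.

```python
# Illustrative sketch with hypothetical numbers; assumes nobody switches party.

def actual_shares(past_share, votes_again, new_voters, new_voter_share):
    """Vote shares in the current election.

    past_share:      party -> share of the vote in the previous election
    votes_again:     party -> fraction of its past voters who vote again
    new_voters:      number of new voters, as a fraction of the past electorate
    new_voter_share: party -> share of the new voters choosing that party
    """
    votes = {
        party: past_share[party] * votes_again[party]
        + new_voters * new_voter_share[party]
        for party in past_share
    }
    total = sum(votes.values())
    return {party: v / total for party, v in votes.items()}


past_share = {"A": 0.20, "other": 0.80}

# A poll weighted on past vote, with no view of new voters or of people
# staying home, reproduces the previous result under these assumptions.
poll_estimate = dict(past_share)

# Problem 1: new voters favor party A, so the poll underestimates it.
with_new_voters = actual_shares(
    past_share,
    votes_again={"A": 1.0, "other": 1.0},
    new_voters=0.10,
    new_voter_share={"A": 0.5, "other": 0.5},
)  # party A ends up at about 0.23, above the poll's 0.20

# Problem 2: past A voters stay home more often, so the poll overestimates it.
with_dropouts = actual_shares(
    past_share,
    votes_again={"A": 0.6, "other": 0.9},
    new_voters=0.0,
    new_voter_share={"A": 0.0, "other": 0.0},
)  # party A ends up at about 0.14, below the poll's 0.20
```

Even without any change in preferences, the poll misses in opposite directions in the two scenarios, purely because of who enters and who leaves the electorate.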

I believe these two problems are larger than most people think. The first problem can explain why the PVV vote was underestimated in 2006 and 2010: the PVV attracted many new voters in those elections. The second problem explains why the PVV vote was overestimated in 2012: many people who voted PVV in the previous elections stayed home this time.

So, for panel survey managers who want a bit of free advice on how to improve their polls: try to get a clear view of the new voters, and of the people unlikely to vote. That may be hard, especially because non-voters are not very interested in politics, and will therefore not sign up for online access panels voluntarily. But it is certainly not impossible.

Peter Lugtig
Associate Professor data quality

I am an associate professor of data quality at Utrecht University, department of Methodology and Statistics.
