Mark the Ballot: Four Bayesian model comparison

I now have four different Bayesian models for estimating the Coalition's two-party preferred (TPP) vote share from the polls of multiple polling houses:

A dynamic linear model with input from TPP polling data;
A dynamic linear model with input from the primary voting intention polling data;
A Beta-walk model with input from TPP polling data; and
A Dirichlet-walk model with input from the primary voting intention polling data.

While each is a hidden Markov model, they differ in their input data (the TPP estimates or the primary vote estimates from the polling houses) and in the distribution used to propagate the hidden state from one period to the next (the Normal, Beta or Dirichlet distribution). The models based on the Normal distribution are dynamic linear models (DLM).
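
As a rough illustration of the three kinds of step, here is a minimal Python sketch. It is not the JAGS code behind these models: the function names, the `concentration` parameter (which controls how tightly each draw clusters around the previous value) and all of the numbers are assumptions made for the example.

```python
import random


def normal_step(prev, sigma, rng):
    """Gaussian random-walk step (the DLM case): a Normal draw centred on prev."""
    return rng.gauss(prev, sigma)


def beta_step(prev, concentration, rng):
    """Beta-walk step: a Beta draw whose mean equals prev (assumed parameterisation)."""
    return rng.betavariate(prev * concentration, (1.0 - prev) * concentration)


def dirichlet_step(prev_shares, concentration, rng):
    """Dirichlet-walk step: a Dirichlet draw whose mean equals prev_shares.

    Built from independent Gamma draws normalised to sum to one.
    """
    draws = [rng.gammavariate(share * concentration, 1.0) for share in prev_shares]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical starting values: a TPP share, and four primary vote shares.
tpp_normal = normal_step(0.48, 0.002, rng)
tpp_beta = beta_step(0.48, 50_000, rng)
primaries = dirichlet_step([0.40, 0.37, 0.13, 0.10], 50_000, rng)
```

One practical difference shows up even in this toy version: the Beta and Dirichlet steps always return valid shares (between zero and one, and summing to one in the Dirichlet case), whereas nothing in the Normal step enforces those bounds.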

Because of size constraints, only the models which use the pollsters' TPP input data have a daily hidden Markov model. In these models, we estimate the hidden Coalition TPP vote share for each day in the period under analysis.

The models that use the primary vote data from the pollsters track a weekly hidden Markov model. In these models, we estimate the hidden Coalition TPP vote share for each week in the period under analysis. The weeks in these models begin on the first day of the earliest poll in the input.
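
The weekly bucketing described above can be sketched as follows. This is only an illustration: it assumes each poll is represented by a single date, and the dates themselves are made up.

```python
from datetime import date


def week_index(poll_date, first_poll_date):
    """Zero-based week number, where week 0 starts on the earliest poll's date."""
    return (poll_date - first_poll_date).days // 7


# Hypothetical poll dates, for illustration only.
poll_dates = [date(2015, 1, 6), date(2015, 1, 12), date(2015, 1, 13), date(2015, 2, 3)]
first = min(poll_dates)
weeks = [week_index(d, first) for d in poll_dates]
```

Because the weeks are anchored to the earliest poll rather than to calendar weeks, adding an older poll to the input shifts every week boundary.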

All of the models are run using the free JAGS software. The fastest model is the TPP dynamic linear model. The slowest (by a country mile) is the primary vote share dynamic linear model. With speed in mind, I am thinking of testing the Biips software, which claims to be faster than JAGS. But any move away from JAGS will take some time.

Differences can arise between the models for a host of reasons. Comparing the models can be useful for better understanding the underlying trends in voting preferences (and whether there are issues that need to be factored into future analysis).

At the moment there is a substantial divergence between the TPP models and the primary vote share models. My suspicion is that this has come about because of the collapse of the Palmer United Party, combined with the way in which pollsters allocate preferences, which draws on preference flows at the last election and on the stated voting intention of poll respondents. I will need to do some work to test this hypothesis.
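
To make the preference-allocation step concrete, here is a minimal sketch of how a Coalition TPP figure can be built from primary votes and a set of preference flows. The primary shares and flow rates below are invented for the example; they are not the pollsters' figures or actual election results, and the function name is my own.

```python
def coalition_tpp(primaries, flows_to_coalition):
    """Coalition TPP = Coalition primary + each minor party's share times its flow rate."""
    tpp = primaries["Coalition"]
    for party, share in primaries.items():
        if party in flows_to_coalition:
            tpp += share * flows_to_coalition[party]
    return tpp


# Hypothetical primary vote shares (they sum to 1.0).
primaries = {"Coalition": 0.40, "Labor": 0.37, "Greens": 0.13, "PUP": 0.02, "Other": 0.08}

# Hypothetical share of each minor party's preferences flowing to the Coalition.
flows = {"Greens": 0.17, "PUP": 0.54, "Other": 0.47}

tpp = coalition_tpp(primaries, flows)  # roughly 0.47 with these made-up inputs
```

If a party's vote collapses and its former supporters are re-labelled under a party or grouping with a different flow rate, the resulting TPP shifts even when nothing else changes, which is the kind of effect I would want to check for.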

Monday, June 8, 2015
