Bayesian Aggregation

This page provides some background on the Bayesian poll aggregation model I use for the 2028 Australian Federal election.


The Gaussian Random Walk Model

This is a linear model with three important assumptions. The model assumes:

  • poll results are noisy observations of an unobserved, underlying whole-of-population voting intention, and this voting intention on any particular day changes by only a small random amount from the day before (this is the 'random walk' assumption),
  • the methodology deployed by each pollster has a built-in house effect when compared with the other pollsters, and
  • collectively, the house effects of those pollsters with more than five polls sum to zero; that is, their house-effect biases cancel out.
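
One way to read the sum-to-zero assumption is as a centring step: the house effects of the pollsters with more than five polls are shifted so that they add up to zero, and every other pollster is shifted by the same amount. The sketch below illustrates that reading only. It is not the fitted model, and the function name, pollster names and numbers are all hypothetical.

```python
MIN_POLLS = 6  # "more than five polls", as stated in the third assumption


def centre_house_effects(raw_effects, poll_counts, min_polls=MIN_POLLS):
    """Shift all house effects so the qualifying pollsters sum to zero."""
    # Only pollsters with enough polls anchor the constraint.
    core = [p for p in raw_effects if poll_counts.get(p, 0) >= min_polls]
    if not core:
        raise ValueError("no pollster has enough polls to anchor the constraint")
    # Subtracting the core mean makes the core effects sum to zero.
    shift = sum(raw_effects[p] for p in core) / len(core)
    return {p: effect - shift for p, effect in raw_effects.items()}


# Hypothetical uncentred house effects, in percentage points.
raw = {"Pollster A": 1.5, "Pollster B": -0.5, "Pollster C": 0.5, "Pollster D": 2.0}
counts = {"Pollster A": 12, "Pollster B": 9, "Pollster C": 7, "Pollster D": 3}
centred = centre_house_effects(raw, counts)
```

Here Pollster D has only three polls, so it does not help set the zero point, but it is still measured against the level the other three define.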

Each of these assumptions needs some explanation. For the first assumption, the day-to-day change in voting intention is hard-coded as a draw from a normal distribution with a standard deviation of 0.1 percentage points. This is a modelling choice. It can be interrogated, but it appears reasonable given the polling movements to date. If there were a sharp, sudden movement in polling (for example, one associated with a change of political leader), then we might need to introduce a discontinuity into the time-series.
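
To get a feel for what a daily standard deviation of 0.1 percentage points implies, here is a minimal simulation sketch. It assumes the daily steps are independent and centred on zero, which is the usual reading of a Gaussian random walk but is my assumption here. The starting level of 50 and the function names are hypothetical.

```python
import math
import random

DAILY_SD = 0.1  # percentage points per day, the value stated in the text


def simulate_walk(start, n_days, daily_sd=DAILY_SD, seed=0):
    """Return n_days values, each the previous value plus a normal step."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_days - 1):
        # Assumed zero-mean step: no built-in drift up or down.
        path.append(path[-1] + rng.gauss(0.0, daily_sd))
    return path


def drift_sd(n_steps, daily_sd=DAILY_SD):
    """Standard deviation of the total change after n_steps independent steps."""
    return daily_sd * math.sqrt(n_steps)


walk = simulate_walk(start=50.0, n_days=365)  # hypothetical starting level
hundred_day_sd = drift_sd(100)  # 0.1 * sqrt(100), about 1 percentage point
```

Under these assumptions the spread of the cumulative change grows with the square root of the number of days, so over 100 days the walk typically moves by about one percentage point.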

Implicit in the second assumption is that pollsters do not change their methodology. In practice, given enough time, it is very unlikely that a methodology stays unchanged. Where it becomes clear that a pollster has changed its methodology, either from a public statement or because the polling dynamics have clearly changed, a separate polling series for that pollster will be commenced from the date of the methodological change.
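
Starting a separate series can be as simple as relabelling a pollster's polls from the change date onward, so that the model estimates a fresh house effect for them. The sketch below assumes polls are held as a list of dictionaries. The field names, pollster name, dates and values are hypothetical and are not taken from the model code.

```python
from datetime import date


def split_series(polls, pollster, change_date, new_label=None):
    """Relabel a pollster's polls from change_date onward as a separate series."""
    new_label = new_label or f"{pollster} (from {change_date.isoformat()})"
    relabelled = []
    for poll in polls:
        if poll["pollster"] == pollster and poll["date"] >= change_date:
            # Copy the record rather than mutating the caller's data.
            poll = {**poll, "pollster": new_label}
        relabelled.append(poll)
    return relabelled


# Hypothetical polls; "value" is a vote share in percentage points.
polls = [
    {"pollster": "Pollster A", "date": date(2026, 3, 1), "value": 51.0},
    {"pollster": "Pollster A", "date": date(2026, 6, 1), "value": 49.5},
    {"pollster": "Pollster B", "date": date(2026, 6, 2), "value": 50.5},
]
updated = split_series(polls, "Pollster A", date(2026, 5, 1))
```

The earlier Pollster A poll keeps its label, the later one moves to the new series, and Pollster B is untouched.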

The third assumption is somewhat ad hoc. Nonetheless, you can use the house effect plots to make your own adjustment to the time-series, up or down. If you have reason to believe a particular pollster is likely to be more accurate than the others, you can mentally adjust the time-series by the amount of that pollster's median house effect.

For example, suppose you believe that Newspoll is the gold standard in Australian polling, and the median house effect for Newspoll is (say) plus 1 percentage point for a particular series. You can then add one percentage point to the time-series line to get an aggregation that accommodates your belief about the collective house effects.
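
The same adjustment can be written as a one-line shift. This is only the arithmetic of the example above: the aggregate values are made up, and the function name is hypothetical.

```python
def adjust_to_pollster(aggregate, median_house_effect):
    """Shift every point of the aggregate by the trusted pollster's median house effect."""
    return [value + median_house_effect for value in aggregate]


# Hypothetical aggregate values (percentage points) on three consecutive days.
aggregate = [49.5, 49.75, 50.0]
# Trusted pollster's median house effect of plus 1 percentage point.
adjusted = adjust_to_pollster(aggregate, 1.0)
```

A negative median house effect shifts the line down in the same way.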

The code for this model can be found on my GitHub site.