Wednesday, April 10, 2013

Tweaking the Bayesian model

My suspicion is that Morgan has stopped producing its fortnightly face-to-face polls and its irregular phone polls.

Certainly the face-to-face poll was vexing for psephologists with its house bias of somewhere between 2.5 and 3.5 percentage points in Labor's favour. Some (like me) included it in their aggregation models and noted the issue of bias, some others included it with an adjustment for bias, while yet others excluded it. You can see the consistent nature of the Morgan face-to-face bias in the next two charts (the Morgan series is in light blue at the top of both charts).

As Morgan appears to have ceased publishing these two poll series, I am looking at dropping them from my Bayesian aggregation. The one clear benefit from dropping these polls is that the sum-to-zero constraint within the Bayesian model is more likely to yield a two-party preferred (TPP) vote share estimate that is closer to the population parameter. The current sum-to-zero estimate is skewed by the substantial bias in the Morgan face-to-face series.

My back-of-the-envelope calculation was that the Bayesian aggregation would move about 0.5 percentage points in the Coalition's favour. As it turns out, the envelope was pretty well on the mark. My personal assessment is that this new aggregation is less biased than the aggregation that included the Morgan face-to-face poll.
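The arithmetic behind an estimate of this kind can be sketched in a few lines of Python. The house count (six) and the flat 3.0-point Morgan F2F bias below are illustrative assumptions, not figures taken from the model; the helper name `zero_point_shift` is also hypothetical. The idea is that house effects are only identified relative to one another, so a sum-to-zero constraint places the "zero bias" line at the mean true bias of the houses inside the constraint.

```python
def zero_point_shift(true_biases, constrained):
    """Shift of the aggregate caused by a sum-to-zero constraint.

    Under the constraint, the aggregate sits at the mean true bias of the
    houses flagged True in `constrained` (in percentage points).
    """
    included = [b for b, keep in zip(true_biases, constrained) if keep]
    return sum(included) / len(included)

# Hypothetical: five unbiased houses plus Morgan F2F at +3.0 points to Labor.
biases = [0.0, 0.0, 0.0, 0.0, 0.0, 3.0]
with_f2f = zero_point_shift(biases, [True] * 6)
without_f2f = zero_point_shift(biases, [True] * 5 + [False])

# Movement towards the Coalition when Morgan F2F leaves the constraint
print(with_f2f - without_f2f)
```

With these assumed inputs the aggregate moves by the F2F bias divided by the number of houses, which is why one strongly biased house among a handful can move the estimate by a noticeable fraction of a point.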

The relative house biases, under the assumption that the cumulative biases from all of the polling houses sum to zero, are as follows:

This adjustment will affect the national seat estimation models.

Because I find the other LOESS charts useful when thinking about the trend, they also follow (noting that the cessation of the Morgan face-to-face series also affects these charts). Even if I keep the Morgan F2F series in the LOESS charts, the trend line will shift around the point where the series stops on the right-hand side. This is demonstrated in the next two charts. The most recent decline in Labor's TPP vote share in the second chart is artificially inflated by the cessation of the Morgan F2F poll series.
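The end-of-series effect can be reproduced with synthetic data. The sketch below is not the code behind the charts: it is a hand-rolled, tricube-weighted local linear fit (a single LOESS-style step), and the poll values, polling frequencies, and window size `k` are all made-up assumptions. The underlying vote share is held flat at 45.0, with one hypothetical series reading 3 points higher until it stops at day 140.

```python
def local_linear(xs, ys, x0, k=12):
    """Tricube-weighted local linear fit at x0 using the k nearest points."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    h = max(abs(x - x0) for x, _ in nearest)      # window half-width
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in nearest:
        u = x - x0
        w = (1 - (abs(u) / h) ** 3) ** 3          # tricube weight
        s0 += w
        s1 += w * u
        s2 += w * u * u
        t0 += w * y
        t1 += w * u * y
    # The intercept of the weighted least-squares line is the fit at x0
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

# Synthetic polls: the true value is flat at 45.0 throughout.
other_days = range(0, 176, 7)    # unbiased series, weekly, days 0..175
f2f_days = range(0, 141, 14)     # biased series, fortnightly, stops at day 140
xs = list(other_days) + list(f2f_days)
ys = [45.0] * len(other_days) + [48.0] * len(f2f_days)

mid = local_linear(xs, ys, 70)   # both series present around day 70
end = local_linear(xs, ys, 175)  # biased series has stopped before day 175
print(round(mid, 2), round(end, 2))
```

In this toy setting the fitted trend sits above 45 mid-series and drops near the right-hand edge, even though nothing in the underlying value has changed. The apparent decline comes entirely from the biased series dropping out of the smoothing window.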


Another option is to include the Morgan face-to-face data points in the aggregation, but exclude them from the sum-to-zero constraint. The JAGS model for this approach is as follows:

    model {
        ## Based on Simon Jackman's model, with an additional element for
        ## the various rounding effects of the different polling houses
        ## and a sum-to-zero constraint on house effects

        ## -- observational model
        for(i in 1:NUMPOLLS) { # for each poll result ...
            roundingEffect[i] ~ dunif(-houseRounding[i], houseRounding[i])
            yhat[i] <- houseEffect[house[i]] + walk[day[i]] + roundingEffect[i]
            y[i] ~ dnorm(yhat[i], samplePrecision[i]) # distribution
        }

        ## -- temporal model
        for(i in 2:PERIOD) { # for each day under analysis ...
            walk[i] ~ dnorm(walk[i-1], walkPrecision) # AR(1)
        }

        ## -- sum-to-zero constraint on house effects (ignoring Morgan F2F)
        ## (MORGANF2F must index a house in 2:HOUSECOUNT for this to hold)
        houseEffect[1] <- -sum( houseEffect[2:HOUSECOUNT] ) + houseEffect[MORGANF2F]

        ## -- priors
        sigmaWalk ~ dunif(0, 0.01)          ## uniform prior on std. dev.
        walkPrecision <- pow(sigmaWalk, -2) ##   for the day-to-day random walk
        walk[1] ~ dunif(0.4, 0.6)           ## initialisation of the daily walk

        for(i in 2:HOUSECOUNT) { ## vague normal priors for house effects
            houseEffect[i] ~ dnorm(0, pow(0.1, -2))
        }
    }
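The constraint line in the model can be checked numerically outside JAGS. The Python sketch below mirrors the expression for `houseEffect[1]` and confirms that the houses other than Morgan F2F then sum to zero. The house indices and effect values are hypothetical draws on the proportion scale used by the model (0.030 is 3 percentage points), and `anchor_effect` is an illustrative helper, not part of the model.

```python
def anchor_effect(effects, excluded):
    """Value of houseEffect[1] so that every house except `excluded` sums to zero.

    `effects` maps house index (2..HOUSECOUNT) to its effect, mirroring
    houseEffect[1] <- -sum(houseEffect[2:HOUSECOUNT]) + houseEffect[MORGANF2F]
    """
    return -sum(effects.values()) + effects[excluded]

# Hypothetical effects for houses 2..5, with house 5 standing in for Morgan F2F.
MORGANF2F = 5
draws = {2: -0.004, 3: 0.002, 4: 0.007, 5: 0.030}

first = anchor_effect(draws, MORGANF2F)
constrained_sum = first + sum(v for k, v in draws.items() if k != MORGANF2F)
print(first, constrained_sum)
```

Adding the Morgan F2F effect back after negating the full sum cancels its contribution, so the F2F effect is left free to take whatever value the data support while the remaining houses carry the constraint.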

Compared with the aggregation above, this model affects the texture of the aggregation prior to the cessation of the Morgan F2F poll (but not its vertical position). The later section of the aggregation series (after the cessation of the F2F series) is largely unchanged.

In the next chart we can see that this model places the estimate of zero house bias in much the same place as the chart above (although it looks more leftwards on the chart with the inclusion of the F2F series).

Feedback sought: Please let me know if you can see any methodological or analytical reasons why I should not remove the Morgan phone and face-to-face series from the Bayesian analysis.

Also let me know if I should remove them completely or, as I have done in this update, simply remove them from the sum-to-zero constraint.

Whatever I do, by the time the September 14 election comes around, the Morgan face-to-face and phone series will no longer be informing my six-month Bayesian aggregation.

1 comment:

  1. I'd certainly at least remove them from the sum-to-zero constraint. (I think just doing that is quite sufficient). F2F definitely seems to be gone. I'm not sure if Morgan phone is really gone, since it was becoming so irregular anyway that it is hard to tell if it has ceased. The test will come in election week - are Morgan willing to go with their new methods entirely for the one that their reputation rides on the most, or will they switch back to phone-polling for the crunch poll?