This is a question for a simple multiple regression against the formula:
TPP_estimate = coalition_pv + (α × green_pv) + (β × other_pv)
In English, the Coalition's two-party preferred vote-share estimate comprises the Coalition primary vote, plus a proportion of the Greens' primary vote (α), and a proportion of the other parties' primary vote (β). In this equation, α and β are both values between 0 and 1 (on the continuum from no flow of preferences through to 100% flow). I decided to solve this regression using a simple Bayesian model, as follows.
model {
    ## -- preference flows
    for(poll in 1:NUMPOLLS) { # for each poll result - rows
        yhat[poll] <- pv_coalition[poll] + (alpha * pv_greens[poll]) + (beta * pv_other[poll])
        y[poll] ~ dnorm(yhat[poll], tau)
    }

    ## priors
    alpha ~ dunif(0.0, 1.0)
    beta ~ dunif(0.0, 1.0)
    sigma ~ dunif(0.001, 0.1)
    tau <- pow(sigma, -2)
}
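As a rough cross-check on the Bayesian model, the same regression can be sketched as ordinary least squares in plain Python. This is not the JAGS model above, only a non-Bayesian analogue: with flat priors on alpha and beta, the least-squares solution matches the posterior mode whenever it falls inside the 0 to 1 range. The function name, the tuple layout, and the poll numbers are all hypothetical, and vote shares are assumed to be proportions rather than percentages.

```python
def fit_preference_flows(polls):
    """Least-squares estimate of (alpha, beta).

    polls: list of (tpp, pv_coalition, pv_greens, pv_other) tuples,
    all expressed as proportions between 0 and 1 (an assumption).
    """
    # Move the Coalition primary vote to the left-hand side, so that
    # residual = alpha * greens + beta * other, then accumulate the
    # sums needed for the 2x2 normal equations.
    sgg = sgo = soo = sgr = sor = 0.0
    for tpp, coalition, greens, other in polls:
        residual = tpp - coalition
        sgg += greens * greens
        sgo += greens * other
        soo += other * other
        sgr += greens * residual
        sor += other * residual

    det = sgg * soo - sgo * sgo
    if abs(det) < 1e-12:
        raise ValueError("Greens and other primaries are collinear; cannot fit")

    # Cramer's rule for the 2x2 system.
    alpha = (sgr * soo - sor * sgo) / det
    beta = (sor * sgg - sgr * sgo) / det
    return alpha, beta


# Hypothetical, noise-free polls built from known flows (0.17 and 0.53),
# used only to show that the fit recovers them.
primaries = [(0.40, 0.11, 0.12), (0.42, 0.10, 0.11),
             (0.38, 0.13, 0.12), (0.41, 0.12, 0.10)]
demo_polls = [(c + 0.17 * g + 0.53 * o, c, g, o) for c, g, o in primaries]

alpha_hat, beta_hat = fit_preference_flows(demo_polls)
print(round(alpha_hat, 4), round(beta_hat, 4))
```

Unlike the Bayesian version, this sketch gives point estimates only. It does not constrain the flows to lie between 0 and 1, and it produces no credible intervals.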
I undertook the analysis for each polling house, using their polling data since the last Federal election, with the following results.
On both charts, I have marked with a vertical gray line the preference flow I use in my models (0.1697 for the Greens and 0.533 for other parties). I used Antony Green's earlier work to set my preference flows within my models.
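To make the arithmetic concrete, here is a minimal Python sketch that applies those two preference flows to a single set of primary votes. The function name and the poll figures are hypothetical, and vote shares are assumed to be proportions rather than percentages.

```python
GREENS_FLOW = 0.1697  # share of Greens primary votes flowing to the Coalition
OTHER_FLOW = 0.533    # share of other parties' primary votes flowing to the Coalition


def coalition_tpp(pv_coalition, pv_greens, pv_other,
                  alpha=GREENS_FLOW, beta=OTHER_FLOW):
    """Coalition two-party preferred estimate from primary vote shares."""
    return pv_coalition + alpha * pv_greens + beta * pv_other


# Hypothetical poll: Coalition 41%, Greens 11%, others 12% (as proportions).
estimate = coalition_tpp(0.41, 0.11, 0.12)
print(round(estimate, 4))  # 0.41 + 0.018667 + 0.06396 -> 0.4926
```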
The gray line falls within the 95% credibility interval for each of the polling houses. Therefore, I cannot argue that any of the polling houses are using different preference flows from the one I am using. If the pollsters are using different preference flows, this test did not demonstrate that.
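The check described above can be sketched in Python as follows. The posterior draws here are hypothetical stand-ins generated from a normal distribution, not output from the JAGS model, and the helper names are assumptions made for illustration.

```python
import random
import statistics


def interval_95(samples):
    """Central 95% interval: the 2.5th and 97.5th percentiles of the samples."""
    # n=40 yields cut points every 2.5%, so the first and last are the bounds.
    cuts = statistics.quantiles(samples, n=40)
    return cuts[0], cuts[-1]


def within_interval(value, samples):
    """True if value lies inside the central 95% interval of the samples."""
    lo, hi = interval_95(samples)
    return lo <= value <= hi


# Hypothetical posterior draws for one polling house's Greens flow (alpha).
rng = random.Random(1)
alpha_draws = [rng.gauss(0.20, 0.05) for _ in range(4000)]

lo, hi = interval_95(alpha_draws)
print(round(lo, 3), round(hi, 3), within_interval(0.1697, alpha_draws))
```

If the reference flow falls inside the interval, the data give no grounds for saying that house uses a different flow. That is weaker than showing the flows are the same.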
Very interesting exercise. I understand that some pollsters apply preference distributions from the last election to their results on a state-by-state basis rather than nationally. If, for instance, a higher share of the Greens' national vote comes from Victoria than in 2013, then the Green preference flow to the Coalition will be slightly lower by such methods. However, the differences created by this are so small that I wouldn't expect a regression to catch them on a pollster-by-pollster basis.
My equation uses a small fourth term which is a constant term for each parliamentary term (currently 0.14%) to account for the influence of three-cornered contests in reducing the Coalition 2PP. It would be possible to get fancy with this concept using the state-by-state distributions of the Coalition votes and the size of the Nationals vote, but for very little gain in accuracy.
From last August on I've been using the 2PPs implied by the published primaries to modify the pollster's published 2PP (mainly to avoid throwing away information useful for estimating what the 2PP was before the pollster rounded it). In this time I've ended up adjusting Morgan's published 2PP in the Coalition's favour 15 times, in Labor's favour 2 times and not adjusting it at all 3 times.