Mark the Ballot: 2021

Monday, December 13, 2021

Latest polling charts

It's time for a complete set of updated polling charts. We will start with the primary votes. What is important to note here is that both Labor and the Coalition have lost primary votes to independents and other parties in the past 6 months. While most pollsters have Labor as the clear favourite to win the next election, if these primary vote polls are accurate, it would be an historic win. Labor has not previously won government with less than 38 per cent of the primary vote (well, 37.99 per cent in 2010, to be precise).

When it comes to two-party-preferred (2pp) vote estimates, the pollsters use different approaches. Resolve Strategic does not calculate a 2pp estimate. Roy Morgan relies on respondent preferences. Other pollsters use preference flows from the most recent federal election. For consistency, I have calculated a 2pp estimate for each poll based on the primary vote poll results and preference flows in 2019.
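The calculation of a 2pp estimate from primary votes can be sketched roughly as follows. This is a minimal illustration, not the code used for the charts: the function name `two_party_preferred`, the party groupings, the poll numbers and the flow fractions are all assumptions made up for the example, and they are not the actual 2019 preference flows.

```python
def two_party_preferred(primaries, flows_to_coalition):
    """Estimate the Coalition 2pp from primary votes and preference flows.

    primaries: party -> primary vote share in per cent.
    flows_to_coalition: party -> fraction of that party's vote that ends up
        with the Coalition (1.0 for the Coalition itself, 0.0 for Labor).
    """
    total = sum(primaries.values())
    coalition = sum(share * flows_to_coalition[party]
                    for party, share in primaries.items())
    # Normalise in case the published primaries do not sum to exactly 100.
    return 100.0 * coalition / total


# Illustrative numbers only: not a real poll, and not the real 2019 flows.
poll = {"Coalition": 36.0, "Labor": 38.0, "Greens": 11.0, "Other": 15.0}
flows = {"Coalition": 1.0, "Labor": 0.0, "Greens": 0.2, "Other": 0.5}

coalition_2pp = two_party_preferred(poll, flows)
print(f"Coalition 2pp: {coalition_2pp:.1f}, Labor 2pp: {100 - coalition_2pp:.1f}")
```

Applying one fixed set of flows to every pollster's primaries is what makes the resulting 2pp estimates comparable across pollsters.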

The most recent published Roy Morgan polls point to a Labor landslide (although this reduces somewhat when I calculate the 2pp based on preference flows in 2019). The most recent Newspoll and Essential polls suggest a comfortable Labor win. The Resolve Strategic polls suggest a narrow Coalition win. Interestingly, while the pollsters had similar estimates to each other in the first half of 2021, the pollsters are more diverse now.

The following charts start with the pollsters' estimates, and my calculations of 2pp based on the most recent election. We then look at the exponentially weighted polling average by pollster. Finally, I use a Bayesian implementation of a Gaussian state space model to aggregate my estimate of the Coalition's 2pp based on the 2019 preference flows and the assumption that the polling house effects sum to zero.
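For readers unfamiliar with the idea, an exponentially weighted average by pollster might look something like the sketch below. The function names and the smoothing parameter `alpha=0.3` are assumptions for illustration; the charts may well have been produced with a different weighting.

```python
def ewm_average(values, alpha=0.3):
    """Exponentially weighted moving average; values must be in date order."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    current = None
    for value in values:
        # The first poll seeds the average; later polls are blended in.
        current = value if current is None else alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def ewm_by_pollster(polls, alpha=0.3):
    """polls: list of (pollster, value) tuples, already sorted by date."""
    series = {}
    for pollster, value in polls:
        series.setdefault(pollster, []).append(value)
    return {name: ewm_average(vals, alpha) for name, vals in series.items()}
```

A larger `alpha` makes the average track the newest poll more closely; a smaller one gives a smoother line that reacts more slowly.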

The attitudinal polling follows.

Note: the data for these plots was sourced from Wikipedia. The code for producing the plots is available on my GitHub site.

Friday, November 12, 2021

Model Building

I have resurrected the Bayesian aggregation that I used at the past few elections. I have re-coded it so that the model works with values on a unit scale, centred near zero. This is not a huge change to the model, but Stan should work better with the quantities expressed in these terms. The model is as follows.

// STAN: Two-Party Preferred (TPP) Vote Intention Model 

data {
    // data size
    int<lower=1> n_polls;
    int<lower=1> n_days;
    int<lower=1> n_houses;
    
    // assumed standard deviation for all polls
    real<lower=0> pseudoSampleSigma;
    
    // poll data
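    vector<lower=-50,upper=50>[n_polls] y; // TPP vote share, in percentage points relative to 50
    int<lower=1,upper=n_houses> house[n_polls]; // polling house index for each poll
    int<lower=1,upper=n_days> day[n_polls]; // day index for each poll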
}

transformed data {
    real sigma = 0.15; // day-to-day std deviation in percentage points 
}

parameters {
    vector[n_days] hidden_vote_share; 
    vector[n_houses] pHouseEffects;
}

transformed parameters {
    // -- sum to zero constraint on house effects
    vector[n_houses] houseEffect;
    houseEffect[1:n_houses] = pHouseEffects[1:n_houses] - mean(pHouseEffects[1:n_houses]);
}

model {
    // -- temporal model [this is the hidden state-space model]
    hidden_vote_share[1] ~ cauchy(0, 10); // PRIOR (reasonably uninformative)
    hidden_vote_share[2:n_days] ~ normal(hidden_vote_share[1:(n_days-1)], sigma);
    
    // -- house effects model
    pHouseEffects ~ normal(0, 8); // PRIOR (reasonably uninformative)

    // -- observed data / measurement model
    y ~ normal(houseEffect[house] + hidden_vote_share[day], pseudoSampleSigma);
}
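
To make the model's assumptions concrete, here is a small Python sketch that simulates data from the same structure: a Gaussian random walk for the hidden vote share, house effects centred to sum to zero, and noisy polls. It is only an illustration. The function name, the starting distribution, the spread of the house effects and the value of `poll_sigma` (standing in for `pseudoSampleSigma`) are all assumptions, and the indices are 0-based here rather than 1-based as in Stan.

```python
import random


def simulate_polls(n_days=100, n_houses=3, n_polls=40,
                   sigma=0.15, poll_sigma=1.5, seed=1):
    """Simulate polls from a random-walk state space model (hypothetical settings)."""
    rng = random.Random(seed)

    # Hidden state: a Gaussian random walk in centred percentage points.
    # The starting spread of 2 is an arbitrary choice for the simulation.
    hidden = [rng.gauss(0, 2)]
    for _ in range(n_days - 1):
        hidden.append(rng.gauss(hidden[-1], sigma))

    # House effects: subtract the mean so that they sum to zero,
    # mirroring the transformed parameters block of the Stan model.
    raw = [rng.gauss(0, 1) for _ in range(n_houses)]
    mean_raw = sum(raw) / n_houses
    house_effect = [r - mean_raw for r in raw]

    # Each poll = hidden state on its day + its house effect + sampling noise.
    polls = []
    for _ in range(n_polls):
        day = rng.randrange(n_days)
        house = rng.randrange(n_houses)
        y = rng.gauss(hidden[day] + house_effect[house], poll_sigma)
        polls.append((day, house, y))
    return hidden, house_effect, polls


hidden, house_effect, polls = simulate_polls()
```

Fitting the Stan model to simulated data like this is one way to check that it recovers a known hidden state and known house effects.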

The key output from this model is a pooled estimate of the two-party-preferred (2pp) vote share for the Coalition. All other things being equal, you would expect a Coalition 2pp vote share of 46.4 to result in a Labor landslide if the election were held now.

I have also started work on a seats model for the House of Representatives. This model will take the raw poll results and predict the number of seats each party would win. When completed, the model will also predict the probability of a hung parliament. To win outright, a party needs to win 76 of the 151 seats in the chamber.

The model will look at three things:

How many seats will be won by the minor parties and independents. At the moment, I am doing this with a simple auto-regressive model. I anticipate that the number of independent and minor party seats will be much the same as in the last parliament, noting a slight tendency over time for the number of independents and minor party members to increase.
How the two-party-preferred (2pp) outcome maps to the number of remaining seats (after minor parties and independents) won by the major parties. Quite surprisingly, the relationship is not as tight as you might think. Three times over the past 20 elections, the party with fewer 2pp votes has won a seat majority in the house.
How the poll predictions for the 2pp at the election relate to the actual 2pp vote outcome. Last election demonstrated that sometimes the polls can have a big miss.
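
As a rough illustration of how the first two parts could be combined, the following Monte Carlo sketch draws a crossbench size, then splits the remaining seats between the major parties given the 2pp. It is not the model described here: the function name and every default parameter (`slope`, `noise_sd`, `crossbench_mean`, `crossbench_sd`) are hypothetical placeholders rather than fitted values, and the crossbench is drawn from a simple normal distribution instead of an auto-regressive model.

```python
import random

TOTAL_SEATS = 151
MAJORITY = 76


def simulate_outcomes(coalition_2pp, n_sims=10_000, seed=42,
                      slope=0.015, noise_sd=0.04,
                      crossbench_mean=6, crossbench_sd=1.5):
    """Monte Carlo sketch; all default parameters are HYPOTHETICAL."""
    rng = random.Random(seed)
    counts = {"Coalition majority": 0, "Labor majority": 0, "Hung parliament": 0}
    margin = 2 * coalition_2pp - 100  # Coalition minus Labor 2pp, in points

    for _ in range(n_sims):
        # Part 1: seats won by minor parties and independents.
        crossbench = max(0, round(rng.gauss(crossbench_mean, crossbench_sd)))
        remaining = TOTAL_SEATS - crossbench

        # Part 2: Coalition share of the remaining seats, given the 2pp margin.
        share = 0.5 + slope * margin + rng.gauss(0, noise_sd)
        share = min(1.0, max(0.0, share))
        coalition = round(remaining * share)
        labor = remaining - coalition

        if coalition >= MAJORITY:
            counts["Coalition majority"] += 1
        elif labor >= MAJORITY:
            counts["Labor majority"] += 1
        else:
            counts["Hung parliament"] += 1

    return {outcome: n / n_sims for outcome, n in counts.items()}


probabilities = simulate_outcomes(coalition_2pp=47.0)
```

The third part, polling error, would enter by drawing `coalition_2pp` itself from a distribution in each simulation instead of holding it fixed.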

So far I have coded the first two parts of this model. My test outputs for a notional Coalition 2pp vote share of 47 per cent show Labor winning an outright majority around 96 per cent of the time. But these are just test outputs from a model in development. I will back-test the model on past elections before it is finalized. The test outputs are as follows. Note: I expect the spreads to be wider on the Labor and Coalition charts once the polling uncertainty is factored into the model.

I mentioned above that the vote share only translates roughly into seats in Parliament. The next chart is the regression (and its associated confidence interval) for the relationship between 2pp vote share and seats won. The second chart is the prediction interval I use in the model. As you can see from these charts, in 1998 the then government lost the 2pp vote by around 2 percentage points (Labor had about two percentage points more 2pp vote than the Coalition). However, the Coalition still finished ahead in seats by 10 percentage points. You will note that all of the past 20 elections fall within the model's 95% prediction interval.

Given the Coalition's TPP of 46.4 (which translates to a government-opposition margin of -7.2), we are in the bottom left hand corner of the previous two charts. If this does not change before the election, a change of government is the most likely outcome.
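
The margin figure is simply the government's 2pp less the opposition's 2pp. As a quick check of the arithmetic (the function name is just for illustration):

```python
def government_margin(government_2pp):
    """Government minus opposition 2pp, in percentage points."""
    opposition_2pp = 100.0 - government_2pp
    return government_2pp - opposition_2pp


print(round(government_margin(46.4), 1))  # 46.4 - 53.6 = -7.2
```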

Hopefully, I will have these models finished in the next couple of weeks.

Sunday, October 31, 2021

The 2022 Election

Hello World!

It has been a while since I last blogged.

My expectation is that the next Australian Federal Election will be in May 2022. With the current poor polling for the government, an election before the new year is highly unlikely.

So, as I have done in the past, I thought I would dust off the old programs and start a regular (probably fortnightly in the first instance) blog in anticipation of an election being called in April and held in May.

So far, all I have is one Jupyter Notebook that extracts the polling data from Wikipedia and produces a locally weighted scatterplot smoothing (LOWESS) regression for all of the data points. This is a quick method for aggregating the polls. Over the next few weeks I will get the Bayesian charts up and running again, and I will start my regular polling of the gambling sites for the odds on the election outcome and the individual seat outcomes.
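
For anyone curious about what LOWESS does under the hood, here is a stripped-down sketch in plain Python: for each point it fits a straight line to the nearest neighbours, weighted by a tricube kernel. It is an illustration only, not the notebook's code (which presumably uses a library routine). The function name and the `frac` default are assumptions, and the robustness iterations of full LOWESS are left out.

```python
def lowess(xs, ys, frac=0.5):
    """Minimal LOWESS: tricube-weighted local linear fits, no robustness passes."""
    n = len(xs)
    k = max(2, min(n, int(round(frac * n))))  # points in each local window
    fitted = []
    for x0 in xs:
        # Bandwidth: the distance to the k-th nearest point
        # (falls back to 1.0 if the window collapses onto duplicate x values).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights fall smoothly to zero at the edge of the window.
        ws = [(1 - min(1.0, abs(x - x0) / h) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        # Weighted least-squares slope around the weighted means.
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Usage: x could be days since the first poll, y the 2pp in each poll.
days = list(range(10))
line = [2 * d + 1 for d in days]
smooth = lowess(days, line, frac=0.5)
```

A larger `frac` uses more of the polls in each local fit and gives a smoother curve.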

As always, I will make my work publicly available. This election cycle it will be available on my GitHub page.

The first lot of charts follow: