The Ipsos monthly poll has been released. It estimates Labor would receive 53 per cent of the two-party preferred (TPP) vote if an election were held now. Popping these latest numbers into the aggregation, we get an aggregate estimate of 54.2 to 45.8 per cent in Labor's favour.

Turning to the moving averages, which do not cope well with disruption events, we can see the short-run averages are coming in close to the Bayesian model. The longer-run averages will need more time to come into line.
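To see why the longer-run averages need more time, here is a minimal sketch using made-up numbers rather than the actual polling data. It smooths a series that steps from 49 to 45 on a hypothetical day 60 with trailing simple averages over 7 and 30 days. The window lengths, the step size and the `trailing_mean` helper are all assumptions for illustration; the moving averages charted on this blog may be constructed differently.

```python
import numpy as np

def trailing_mean(series, window):
    """Trailing simple moving average; NaN until a full window is available."""
    out = np.full(len(series), np.nan)
    for t in range(window - 1, len(series)):
        out[t] = series[t - window + 1 : t + 1].mean()
    return out

# Hypothetical TPP series with a discontinuity: 49 for 60 days, then 45.
step_day = 60
series = np.concatenate([np.full(step_day, 49.0), np.full(60, 45.0)])

short_run = trailing_mean(series, window=7)
long_run = trailing_mean(series, window=30)

check_day = step_day + 9  # ten days of post-step observations
print(f'short-run average on day {check_day}: {short_run[check_day]:.2f}')
print(f'long-run average on day {check_day}:  {long_run[check_day]:.2f}')
```

Ten days after the step, the 7-day window contains only post-step values, while the 30-day window still blends 20 pre-step days with 10 post-step days.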

The Ipsos primary vote numbers were a little unusual, with 35 per cent of the primary vote going outside of the Coalition and Labor parties. Also unusual was Labor's low primary vote share in this poll.

Extrapolating a TPP from the primary vote aggregations yields the following.

# Mark the Ballot

Psephology by the numbers

## Monday, September 17, 2018

## Wednesday, September 12, 2018

## Monday, September 10, 2018

### Fourth Morrison poll - Second Newspoll

Newspoll is out at the start of another parliamentary sitting fortnight. It's the same headline message as the previous Newspoll. Labor is on 56 per cent of the two-party preferred vote, well ahead of the Coalition on 44 per cent. With these numbers, there is not a lot of subtlety: Labor would win in a landslide if an election were held at the moment.

I would not read too much into the slight downwards slant of the Morrison period to the right of the chart after the discontinuity. The first day of this period has three polls informing its position, the last day has just one poll. The slant may disappear as more polls come in.

The moving average models are coming around. They will over-shoot the Bayesian model before coming into line. They are not designed for the discontinuity we have seen.

Turning to the primary votes aggregation, we see a similar picture.

## Sunday, September 2, 2018

### Monte Carlo simulation of elections

Between elections, the Australian Electoral Commission (AEC) redraws the electoral boundaries to ensure each seat has a similar number of voters. Now that this redistribution process has been completed, I can use the new seats to model election outcomes.

The first thing I needed was the recalculated margins for each seat. For this data I used Wikipedia. Antony Green has also undertaken these calculations. For the seats that had not been redistributed, we have original polling outcome data from the AEC. This base, expressed as margins, looks something like this.

With this base, I have built a Monte Carlo simulation. In a Monte Carlo simulation we sample from probability distributions many thousands of times to identify the range of possible outcomes. These are then analysed to identify the probabilities for different events occurring. The model needs to consider those factors that can see the results vary.
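As a rough illustration of the idea, the sketch below runs a toy Monte Carlo over five made-up seats. The margins are hypothetical, the seed is arbitrary, and the swing is applied nationally rather than state by state; the TPP figures and standard deviations are the ones quoted in this post. It is a simplified stand-in, not the full model.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, for repeatability only

# Hypothetical Coalition margins (as proportions) for five made-up seats.
# Negative values are Labor-held seats; positive values are Coalition-held.
base = np.array([-0.08, -0.02, 0.01, 0.04, 0.12])

swing_to_coalition = 0.4500 - 0.5036  # current TPP less the 2016 TPP
n_sims = 20_000

# One polling error per simulated election; one noise term per seat per election.
polling_error = rng.normal(0.0, 0.01, size=(n_sims, 1))
seat_noise = rng.normal(0.0, 0.03, size=(n_sims, len(base)))

margins = base + swing_to_coalition + polling_error + seat_noise
coalition_wins = margins >= 0

seat_probability = coalition_wins.mean(axis=0)  # win probability for each seat
seat_counts = coalition_wins.sum(axis=1)        # seats won in each election
count_distribution = np.bincount(seat_counts, minlength=len(base) + 1) / n_sims

print('Coalition win probability by seat:', seat_probability.round(3))
print('Probability of winning 0..5 seats:', count_distribution.round(3))
```

Each row of `margins` is one simulated election. Averaging down the columns gives seat-level probabilities, and summing across the rows gives the distribution of seat counts.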

The biggest source of uncertainty I need to manage is polling uncertainty. It is not unusual for an aggregated opinion poll to be plus or minus two percentage points from the final election outcome. In the Monte Carlo model I have assumed that the actual election outcome will be normally distributed around the poll estimate with a standard deviation of one percentage point.
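A quick check shows how the one-point standard deviation relates to the "plus or minus two points" rule of thumb. This is a minimal sketch with an arbitrary seed and sample size: it draws election outcomes around a 45 per cent poll estimate and counts how many land within two points.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed

poll_estimate = 0.45  # Coalition TPP from the aggregation
polling_sd = 0.01     # one percentage point

# Simulated "actual" outcomes, normally distributed around the poll estimate.
outcomes = poll_estimate + rng.standard_normal(100_000) * polling_sd

within_two_points = np.mean(np.abs(outcomes - poll_estimate) <= 0.02)
print(f'Share of outcomes within two points of the poll: {within_two_points:.3f}')
```

With a normal distribution, two standard deviations either side of the mean covers roughly 95 per cent of outcomes, so this assumption treats a miss of more than two points as uncommon but possible.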

Another source of uncertainty is the way in which the swing in the individual seats is distributed around the national swing and the way in which this swing varies state-by-state. Historically, individual seat swings have been close to normally distributed around the national swing with a standard deviation of 3 percentage points. They have also been close to normally distributed around state swings with a standard deviation of 2.5 percentage points.
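Under these assumptions the chance of a single seat staying with the Coalition has a closed form, which makes a handy sanity check on any simulation. If the polling error and the seat-level noise are independent normals, their sum is normal with variance equal to the sum of the two variances. The sketch below works through one hypothetical seat. The function name and the 3-point margin are illustrative, and state-level variation is ignored.

```python
from math import erf, sqrt

def coalition_hold_probability(margin, swing, seat_sd=0.03, polling_sd=0.01):
    """P(margin + swing + noise >= 0) for independent normal noise terms."""
    combined_sd = sqrt(seat_sd ** 2 + polling_sd ** 2)
    z = (margin + swing) / combined_sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF evaluated at z

# Hypothetical seat: 3-point Coalition margin, 5.36-point national swing to Labor.
p = coalition_hold_probability(margin=0.03, swing=-0.0536)
print(f'Probability the Coalition holds the seat: {p:.3f}')
```

A seat whose margin exactly matches the swing comes out at 50 per cent, and the probability moves away from that as the gap between margin and swing grows relative to the combined standard deviation.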

For this model, I have used state swings, based on the most recent state-by-state Newspoll (which pre-dates the Morrison ascendancy), and then adjusted for a change in the aggregate two-party preferred (TPP) since the Morrison ascendancy. I draw random numbers from a Dirichlet distribution to achieve this adjustment. The state swings used by the model can be seen in the following kernel density estimate plot, which charts the state swings to the Coalition in percentage points since the 2016 election. The largest swings against the government at the moment appear to be in WA and Queensland.
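The Dirichlet step can be sketched as follows. The enrolment and TPP figures are the ones used in the code for this post; the seed, the number of draws and the use of NumPy's `Generator` interface are choices made for this illustration. A Dirichlet draw sums to one, so rescaling it by the national Coalition vote guarantees that the voter-weighted average of the state TPPs equals the national TPP in every draw.

```python
import numpy as np

states = ['NSW', 'Vic', 'Qld', 'WA', 'SA', 'Tas', 'ACT', 'NT']
voters = np.array([5211182, 4094212, 3203789, 1615900,
                   1200395, 381409, 290654, 138581])
tpp_2016 = np.array([0.5053, 0.4817, 0.5410, 0.5466,
                     0.4773, 0.4264, 0.3887, 0.4294])
tpp_est_now = np.array([0.4963, 0.4597, 0.5000, 0.4996,
                        0.5103, 0.4164, 0.3787, 0.4194])
national_tpp = 0.4500

# Concentration parameters: each state's share of the Coalition vote, scaled up
# so that the draws vary only a little around those shares.
alpha = tpp_est_now * voters / voters.sum() * 10_000_000

rng = np.random.default_rng(seed=0)          # arbitrary seed
shares = rng.dirichlet(alpha, size=1_000)    # each row sums to one

# Convert each state's share of the Coalition vote back into a state TPP.
state_tpp = shares * voters.sum() * national_tpp / voters
state_swing = state_tpp - tpp_2016

weighted_national = (state_tpp * voters).sum(axis=1) / voters.sum()
mean_swing = state_swing.mean(axis=0) * 100  # percentage points

for state, swing in zip(states, mean_swing):
    print(f'{state}: {swing:+.1f}')
```

The large scale factor keeps the draws tight around the state estimates; a smaller value would let the state swings wander further from the Newspoll breakdown.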

There are two key factors that I have not modeled. The first is the sophomore effect - a bump that first-term members of Parliament get when running for re-election. The second factor is the retirement effect - a decline in the party vote in a seat following the retirement of a long-standing member for that party. Labor has a large number of first-term parliamentarians, and is likely to benefit from the sophomore effect at the next election, notwithstanding it also has a number of retirees.

A further (and perhaps more critical) factor I have not modeled is the outcome in seats currently held by other parties. For this analysis I have simply assumed those seats will continue to be held by other parties.

In the current climate, with an estimated aggregate TPP of 45 per cent for the Coalition, the model predicts a substantial victory for Labor were an election held now. Based on a run of 100,000 simulations, the model predicts Labor is most likely to win 95 seats, and the Coalition 51 seats.

While this is the most likely outcome, there is a cluster of possible outcomes for both parties. But there is little doubt, if an election were held now, a significant Labor majority would be the outcome.
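"Most likely" here means the mode of the simulated seat counts. The short sketch below shows the calculation on ten made-up results, which are illustrative only and not output from the model. The majority threshold of 76 assumes a 151-seat House.

```python
import numpy as np

# Hypothetical Labor seat counts from ten simulated elections (illustrative only).
labor_seats = np.array([93, 95, 96, 95, 94, 95, 97, 95, 96, 94])
majority = 76  # assumed threshold: more than half of a 151-seat House

counts = np.bincount(labor_seats)                # occurrences of each seat count
most_likely = int(counts.argmax())               # the modal outcome
p_most_likely = counts[most_likely] / len(labor_seats)
p_majority = np.mean(labor_seats >= majority)

print(f'Most likely Labor seat count: {most_likely} (probability {p_most_likely:.2f})')
print(f'Probability of a Labor majority: {p_majority:.2f}')
```

Even the modal outcome usually carries a modest probability on its own, which is why the spread of nearby outcomes matters as much as the single most likely number.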

Turning to the individual seat outcomes, these are charted below. In this chart, the seats where we have the Coalition at zero or 100 per cent probability are not sorted.

And finally, my rough and ready code for this exercise. Usual caveats apply: this is a work in progress.

```python
# PYTHON: Monte-Carlo simulation of election outcomes
# -- NOTE: a number of data sources need to be updated
#          in this code before it is run.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import sys
sys.path.append( '../bin' )
plt.style.use('../bin/markgraph.mplstyle')

# --- version information
print('Python version: {}'.format(sys.version))

# --- Seat data
# Seat data sourced from
# https://en.wikipedia.org/wiki/Pre-election_pendulum_for_the_next_Australian_federal_election
workbook = pd.ExcelFile('./Data/Seats.xlsx')
df = workbook.parse('seats')
df.index = df.Seat

Coalition_TPP_2016 = 0.5036

# ===> UPDATE HERE <===
Coalition_TPP_now = 0.4500
# TO DO - source Coalition_TPP_now directly from TPP aggregation

Swing_to_Coalition = Coalition_TPP_now - Coalition_TPP_2016
others = df[df['LNP TPP'].isnull()]
df = df[df['LNP TPP'].notnull()]
base = ((df['LNP TPP'] / 100.0) - 0.5)
# NOTE: base < 0 is Labor; base > 0 is Coalition

# --- State Data - note: TPP from the Coalition's perspective.
states = ['NSW', 'Vic', 'Qld', 'WA', 'SA', 'Tas', 'ACT', 'NT']

# voters from https://www.aec.gov.au/Enrolling_to_vote/Enrolment_stats/national/index.htm
# ===> UPDATE HERE <===
voters = [5211182, 4094212, 3203789, 1615900, 1200395, 381409, 290654, 138581]
voters = pd.Series(voters, index=states)

# State 2016 TPP source: https://results.aec.gov.au/20499/Website/HouseTppByState-20499.htm
tpp_2016 = [0.5053, 0.4817, 0.5410, 0.5466, 0.4773, 0.4264, 0.3887, 0.4294]

# latest TPP estimate draws on
# https://www.theaustralian.com.au/national-affairs/turnbull-axed-as-coalition-closed-the-gap-on-labor/news-story/487dd05cd4dc95693bd6c55b44bfbe88
# ===> UPDATE HERE <===
tpp_est_now= [0.4963, 0.4597, 0.5000, 0.4996, 0.5103, 0.4164, 0.3787, 0.4194]

# the multinomial vector for drawing the Dirichlet random numbers of state swings
alpha_scale = 10000000
state_alpha = (tpp_est_now * voters * alpha_scale / voters.sum()).astype(int)

# --- let's simulate ...
Monte_Carlo_N = 100000

# next line - preallocate space to speed up calculations
simulations = pd.DataFrame(np.zeros((len(base),Monte_Carlo_N)))
simulations.index = df.index
state_swings = pd.DataFrame(np.zeros((len(states),Monte_Carlo_N)))
state_swings.index = states

print('Commencing ', str(Monte_Carlo_N), ' simulation run ...')
for i in range(Monte_Carlo_N) :

    # -- progress indication
    if i % (Monte_Carlo_N // 20) == 0 :
        print(i)

    # -- polling uncertainty - polls often out by up to +/- two percentage points
    pollingUncertainty = np.random.standard_normal(1) * 0.01 # = standard deviation
    #pollingUncertainty = 0.0

    # -- variable swing by state - use a dirichlet random to manage this element to ensure
    #    total Coalition vote is the same as the Coalition TPP for all eligible voters
    # NOTE: Comment out this section to use national swings rather than state swings
    # NOTE: drawing random numbers from the Dirichlet distribution is slow
    state_dirichlet = np.random.dirichlet(state_alpha) # proportion of Coalition vote in each state
    state_Coalition_tpp = state_dirichlet * (voters.sum() * Coalition_TPP_now) / voters
    state_swing_to_coalition = state_Coalition_tpp - tpp_2016
    state_swings[i] = state_swing_to_coalition # we will plot this
    Swing_to_Coalition = df.State.map(state_swing_to_coalition)

    # -- TO DO - retirement effect

    # -- TO DO - sophomore effect

    # -- variable swing seat-by-seat - normally distributed noise around 0
    # -- use a standard deviation of 0.03 for national swings
    # -- use a standard deviation of 0.025 for state swings
    # -- https://marktheballot.blogspot.com/2016/11/how-are-seat-swings-distributed-around.html
    # -- https://marktheballot.blogspot.com/2012/11/state-swings.html
    seatDistributedAroundSwing = np.random.standard_normal(len(base)) * 0.025 # = standard deviation

    # -- bring it all together ...
    simulations[i] = base + Swing_to_Coalition + pollingUncertainty + seatDistributedAroundSwing

print('Finished simulation ... analysing data ...')
sumCoalition = simulations[simulations >= 0].count()
sumLabor = len(base) - sumCoalition
simSummary = pd.concat({'Coalition': sumCoalition.value_counts(),
    'Labor': sumLabor.value_counts()}, axis=1)
min_value = simSummary.index.min()
max_value = simSummary.index.max()+1
simSummary = pd.DataFrame(simSummary[['Labor', 'Coalition']],
    index=range(min_value, max_value))
simSummary = simSummary / simSummary.sum()
simSummary = simSummary.sort_index()

# -- seat count distributional plot
print('About to plot ...')
ax = simSummary.plot.bar()
ax.set_title('Election Outcome Probabilities for Coalition TPP: ' +
    str(Coalition_TPP_now * 100.0))
ax.set_xlabel('Seats Won')
ax.set_ylabel('Probability')
ticks = np.arange(min_value, max_value, 5)
ax.set(xticks=[x - ticks[0] for x in ticks], xticklabels=ticks)

fig = ax.figure
fig.set_size_inches(8, 4)
fig.tight_layout(pad=1)
fig.text(0.99, 0.01, 'marktheballot.blogspot.com.au',
    ha='right', va='bottom', fontsize='x-small',
    fontstyle='italic', color='#999999')
fig.savefig('./Graphs/SeatCountProbabilities.png', dpi=125)
plt.close()

# -- most likely outcome plot
ml_coalition = simSummary[simSummary['Coalition'] == simSummary['Coalition'].max()].index[0]
ml_labor = len(base) - ml_coalition
ml_other = len(others)
ml_outcome = pd.Series(data=[ml_labor, ml_coalition, ml_other],
    index=['Labor', 'Coalition', 'Other'])
ax = ml_outcome.plot.barh()
ax.set_title('Most likely Election Outcome for Coalition TPP: ' +
    str(Coalition_TPP_now * 100.0))
ax.set_xlabel('Number of Seats Won by Party')
ax.set_ylabel('')
for i in ax.patches:
    ax.text(x=1, y=i.get_y()+.16, s=str(i.get_width()), color='white')

fig = ax.figure
fig.set_size_inches(8, 4)
fig.tight_layout(pad=1)
fig.text(0.99, 0.01, 'marktheballot.blogspot.com.au',
    ha='right', va='bottom', fontsize='x-small',
    fontstyle='italic', color='#999999')
fig.savefig('./Graphs/SeatLikelyOutcome.png', dpi=125)
plt.close()

# -- plot state swings to the Coalition - as a KDE
state_swings = state_swings * 100 # convert to percentage points
ax = state_swings.T.plot.kde()
ax.set_title('State Swing Kernel Density Estimates for Coalition TPP: ' +
    str(Coalition_TPP_now * 100.0))
ax.set_xlabel('Swing to the Coalition in Percentage Points')
ax.set_ylabel('Density')

fig = ax.figure
fig.set_size_inches(8, 4)
fig.tight_layout(pad=1)
fig.text(0.99, 0.01, 'marktheballot.blogspot.com.au',
    ha='right', va='bottom', fontsize='x-small',
    fontstyle='italic', color='#999999')
fig.savefig('./Graphs/Seat-StateSwingKDE.png', dpi=125)
plt.close()

# -- plot base
base = base * 100 # convert to percentage points
base.sort_values(inplace=True)
ax = base.plot.barh(color='royalblue')
ax.set_title('2016 Coalition Margins Starting Point')
ax.set_xlabel('Percentage Points (Labor is <0; Coalition is >0)')
ax.set_ylabel('Seat')

fig = ax.figure
fig.set_size_inches(8, 30)
fig.tight_layout(pad=1)
fig.text(0.99, 0.005, 'marktheballot.blogspot.com.au',
    ha='right', va='bottom', fontsize='x-small',
    fontstyle='italic', color='#999999')
fig.savefig('./Graphs/Seat-baseMargins.png', dpi=125)
plt.close()

# -- plot individual seat outcomes
sumSeatCoalition = simulations[simulations >= 0].count(axis=1)
sumSeatCoalition = sumSeatCoalition / Monte_Carlo_N
sumSeatLabor = 1.0 - sumSeatCoalition
seatSummary = pd.DataFrame(data={'Coalition': sumSeatCoalition, 'Labor': sumSeatLabor})
seatSummary = seatSummary[['Labor', 'Coalition']] # correct order for colours
seatSummary.sort_index(inplace=True)
ax = seatSummary.plot.barh(stacked=True, legend=False)
ax.set_title('Seat Win Probabilities for Coalition TPP: ' +
    str(Coalition_TPP_now * 100.0))
ax.set_xlabel('Probability')
ax.set_ylabel('')

fig = ax.figure
fig.set_size_inches(8, 30)
fig.tight_layout(pad=1)
fig.text(0.99, 0.005, 'marktheballot.blogspot.com.au',
    ha='right', va='bottom', fontsize='x-small',
    fontstyle='italic', color='#999999')
fig.savefig('./Graphs/SeatWinProbabilitiesNameOrder.png', dpi=125)
plt.close()

seatSummary.sort_values(by='Coalition', inplace=True)
ax = seatSummary.plot.barh(stacked=True, legend=False)
ax.set_title('Seat Win Probabilities for Coalition TPP: ' +
    str(Coalition_TPP_now * 100.0))
ax.set_xlabel('Probability')
ax.set_ylabel('')

fig = ax.figure
fig.set_size_inches(8, 30)
fig.tight_layout(pad=1)
fig.text(0.99, 0.01, 'marktheballot.blogspot.com.au',
    ha='right', va='bottom', fontsize='x-small',
    fontstyle='italic', color='#999999')
fig.savefig('./Graphs/SeatWinProbabilitiesOutcomeOrder.png', dpi=125)
plt.close()
```

## Wednesday, August 29, 2018

### Third Morrison Poll

Roy Morgan has released the results of two recent Morgan polls. The first is dated 18-19 August. The second is dated 25-26 August. They are slightly more benign for the Coalition, with the first TPP estimate at 49.5 per cent and the second at 46 per cent. The resulting aggregation follows.

The primary vote aggregation follows (with One Nation included in Other).

A number of TPP estimates can be derived from the primary vote model.

Finally, we can look at One Nation's primary vote in its own right.

### Second Morrison poll

Essential Media has come in with their poll estimate of the two-party preferred vote share for the government at 45 per cent, compared with Labor on 55 per cent.

All but one of the poll-bias adjusted moving averages are starting to turn.

Moving to the primary vote model.

As always, the data for this analysis is sourced from Wikipedia.
