Following his success in predicting the 2012 US presidential election, I decided to read Nate Silver's book, The Signal and the Noise. I am enjoying it so far.
It inspired me to throw together a quick and dirty Monte Carlo simulation of the 2013 Australian Federal Election based on the current state of national opinion polling (assuming the Coalition wins 52.5 per cent of the two-party-preferred vote). In effect, the simulation runs the election 100,000 times and collates the range of potential outcomes given the current polling.
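To make the idea concrete, here is a minimal Python sketch of this kind of simulation. It is not the R code behind this post. It assumes a uniform national swing plus independent noise in each seat, and the function name, the standard deviations, the 50.0 baseline and the evenly spaced pendulum are all made up for illustration.

```python
import random


def simulate_elections(seat_2pp, last_national_2pp, polled_national_2pp,
                       poll_sd=1.5, seat_sd=3.0, runs=10_000, seed=1):
    """Return the number of seats won in each simulated election.

    seat_2pp holds the party's two-party-preferred share (per cent) in each
    seat at the previous election. poll_sd and seat_sd are assumed values.
    """
    rng = random.Random(seed)
    seats_won_per_run = []
    for _ in range(runs):
        # Draw a national result around the polled figure (polling error).
        national = rng.gauss(polled_national_2pp, poll_sd)
        swing = national - last_national_2pp
        # Apply the swing uniformly, plus independent noise in each seat.
        won = sum(
            1 for base in seat_2pp
            if base + swing + rng.gauss(0.0, seat_sd) > 50.0
        )
        seats_won_per_run.append(won)
    return seats_won_per_run


# Made-up pendulum: 148 seats spread evenly from 35 to 65 per cent.
pendulum = [35.0 + 30.0 * i / 147 for i in range(148)]
outcomes = simulate_elections(pendulum, last_national_2pp=50.0,
                              polled_national_2pp=52.5)
print(min(outcomes), max(outcomes))
```

A real run would replace the made-up pendulum with actual seat margins and would need defensible values for the two standard deviations, which drive how wide the resulting distribution is.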
The result is a probability distribution of the number of seats won by each party. The most likely outcome is 63 seats for the ALP and 85 seats for the Coalition. There is a better than 50 per cent chance that the ALP wins between 62 and 65 seats and the Coalition between 83 and 86 seats.
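Statements like these fall straight out of the list of simulated seat counts. The sketch below (Python, with a hypothetical helper name and a tiny made-up sample rather than real simulation output) shows one way to read off the most likely outcome and the chance of landing in a given range.

```python
from collections import Counter


def summarise(seat_counts, low, high):
    """Return (most common seat count, share of runs with low <= count <= high)."""
    tally = Counter(seat_counts)
    mode, _ = tally.most_common(1)[0]
    in_range = sum(n for seats, n in tally.items() if low <= seats <= high)
    return mode, in_range / len(seat_counts)


# Made-up results from eight simulated elections, for illustration only.
example_runs = [62, 63, 63, 64, 66, 61, 63, 65]
mode, share = summarise(example_runs, 62, 65)
print(mode, share)  # 63 0.75
```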
I am assuming that Wilkie would retain Denison and that Katter would retain Kennedy. For the other seats currently held by the Greens and independents, I am assuming they would return to the major parties. I am following Poliquant's logic on this. I used Antony Green's pendulum for some of the underlying arithmetic.
This is a very naive model! Over the coming months I plan to refine it. Refinements would include better treatment of the non-major-party seats, adding state-by-state opinion polling results, accounting for systematic bias from the various pollsters (what Simon Jackman calls "house effects"), and a sprinkling of Bayesian intelligence. As we get closer to the election I will look at ballot position and whether or not each seat has a retiring member.
In addition to the above national totals, the model could also provide seat-by-seat probabilities. I have not written code for this as yet, but it should not take too long.
Well, I have now written some code to aggregate the individual seat results from each simulation run. I have also made some minor modifications to the way this second version of the model handles the seats held by others, and I have tidied the plots a little. Nonetheless, it is still a very naive model. I have a long way to go before I am comfortable that it is making robust predictions from the available data.
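The aggregation step can be sketched as follows. This is a Python illustration under assumed inputs, not the R code for the model: it counts how often each seat is won across the simulated elections and divides by the number of runs. The seat names, shares, swing and noise levels are hypothetical.

```python
import random


def seat_win_probabilities(seats, swing_mean, swing_sd=1.5, seat_sd=3.0,
                           runs=5_000, seed=1):
    """Estimate each seat's win probability by simulation.

    seats maps seat name -> previous two-party-preferred share (per cent).
    Returns seat name -> share of simulated elections in which the party wins.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in seats}
    for _ in range(runs):
        swing = rng.gauss(swing_mean, swing_sd)  # one national swing per run
        for name, base in seats.items():
            # Add seat-level noise, then check who crosses 50 per cent.
            if base + swing + rng.gauss(0.0, seat_sd) > 50.0:
                wins[name] += 1
    return {name: count / runs for name, count in wins.items()}


# Hypothetical seats and shares, not real electorates.
example_seats = {"Safe": 62.0, "Marginal": 49.0, "Opponent-held": 40.0}
probs = seat_win_probabilities(example_seats, swing_mean=2.5)
for name, p in probs.items():
    print(f"{name}: {p:.2f}")
```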
The updated charts for the new 100,000-election simulation follow:
The R code for these models can be found here. As always, if you see any errors in the code, or ways I can improve the analysis, please drop me a line.