Monday, January 31, 2022

Newspoll 56-44 in Labor's favour

Today's Newspoll has the Coalition with 44 per cent and Labor with 56 per cent of the two-party preferred vote. This is about as bad as it gets for the Coalition. Today's poll suggests the Coalition faces a landslide loss at the 2022 election (most likely to be in May 2022).

This is not the first published poll since the 2019 election to have the government at or below 45 per cent. But it is the first poll with this result when the two-party preferred tally is calculated using preference flows at the last election.

The Bayesian aggregation tells much the same story.

The attitudinal polling suggests a slide in the Prime Minister's standing.

Of interest in the primary votes is the sizable vote for the minor, non-duopoly parties.

I will report the betting market response in tomorrow's post. But there has already been some movement. Right now Labor is on \$1.35 and the Coalition is on \$3.00.

Sunday, January 30, 2022

Simulating election outcomes from betting market data

Because betting market odds can be expressed as probabilities, they can be used in Monte Carlo simulations to reflect the election outcome that punters think most likely. However, there are a couple of technical issues that need to be considered. One is simple; the other, less so.

Bookmakers make their money by ensuring that, whoever wins, their outlays will always be less than their takings. A bookmaker adjusts her odds as bets are laid to maintain a margin for herself. Fair enough: bookmakers need to cover their costs and make a profit. This margin is variously known as the bookmaker's overround, vigorish, vig or juice. To get fair probabilities from the bookmaker's odds, we need to correct for the bookmaker's margin.
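As a minimal sketch of that correction, the snippet below converts decimal odds to implied probabilities and rescales them so they sum to one. Proportional rescaling is only one of several ways to strip out the margin, and it is an assumption here rather than a statement of the method used for the charts. The function name is hypothetical, and the odds of \$1.35 and \$3.00 are used purely as illustrative inputs.

```python
def fair_probabilities(decimal_odds):
    """Convert the decimal odds for one market into probabilities summing to one.

    Returns the rescaled probabilities and the overround (the amount by which
    the raw implied probabilities exceed one).
    """
    raw = [1.0 / odds for odds in decimal_odds]  # implied probabilities, margin included
    total = sum(raw)                             # greater than 1 when there is a margin
    return [p / total for p in raw], total - 1.0


# Illustrative two-outcome market (assumed inputs)
probs, overround = fair_probabilities([1.35, 3.00])
print([round(p, 3) for p in probs], round(overround, 3))
```

With these inputs the raw implied probabilities add up to a little over one, and the excess is the bookmaker's margin that the rescaling removes.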

The more challenging technical issue is what is known as the Favourite-Longshot Bias (FLB). First reported by R M Griffith in the American Journal of Psychology in 1949, the FLB is the observation that, on average, short odds underestimate the probability of winning and long odds overestimate it. As a consequence, bets placed on higher-valued odds (the longshots) offer worse rates of return than bets placed on lower-valued odds (the favourites). This tendency in betting markets has been empirically validated many times.

It is evident in our data. In almost every seat, Sportsbet offers odds of \$101 on the United Australia Party (UAP) winning the seat. This implies an individual seat win probability of 1/101, or roughly 0.0099. In a 151-seat Parliament, the most likely outcome for these same odds across every seat is the UAP winning one or two seats at the next election. Sorry, but I would be surprised if the UAP won any of these seats with those odds. Consistent with FLB, these odds of \$101 overstate the UAP's probability of winning.
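The arithmetic behind that claim can be checked with a binomial calculation. This is a rough sketch under two simplifying assumptions that go beyond the data: that the \$101 odds apply in all 151 seats (rather than almost all), and that seats are independent. The overround is ignored.

```python
from math import comb

n_seats = 151
p_win = 1 / 101  # implied win probability from odds of 101, overround ignored


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent contests."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


expected_seats = n_seats * p_win
pmf = [binom_pmf(k, n_seats, p_win) for k in range(4)]

print(f"expected seats: {expected_seats:.2f}")
for k, prob in enumerate(pmf):
    print(f"P(UAP wins {k} seats) = {prob:.3f}")
```

Under these assumptions the expected number of UAP seats is about 1.5, a single seat is the most likely count, and the chance of winning no seats at all is only a little over one in five.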

If I don't adjust the bookmaker's odds for FLB, the Monte Carlo simulations end up with an implausibly large number of Greens, independents and other minor parties winning seats in the House of Representatives, as evident in the following charts.

So what adjustments should I make? One option is to ignore longshot odds over a certain value (say \$20). This is the approach I took for the 2019 election. While this yields a more plausible simulation, the cut-off is arbitrary. Another option is to transform the odds so that the higher-valued odds are made even higher to correct for FLB. I have considered two transformations: first, multiplying the raw odds by their square root (which is the same as raising the raw odds to the power of 1.5); and second, squaring the raw odds (which is the same as raising the raw odds to the power of 2). The next set of charts comes from the Monte Carlo simulation where the raw odds were multiplied by their square root. The results are similar to those from the approach of ignoring longshot odds over \$20.

The final treatment I considered was squaring the raw-odds, before converting them to probabilities and standardizing these probabilities so that they sum to one in each seat.
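The power transformations described above can be sketched as a single function with the exponent as a parameter. This is an illustration only: the function name and the odds for the example seat are hypothetical, and the proportional rescaling step is my reading of "standardizing these probabilities so that they sum to one".

```python
def flb_adjusted_probabilities(decimal_odds, power=2.0):
    """Raise the raw odds to `power`, convert to probabilities, rescale to sum to one.

    power=1.0 leaves the odds untouched, power=1.5 multiplies each odd by its
    square root, and power=2.0 squares the odds.
    """
    adjusted = [odds**power for odds in decimal_odds]
    raw = [1.0 / odds for odds in adjusted]
    total = sum(raw)
    return [p / total for p in raw]


# Hypothetical odds for one seat: favourite, challenger, two longshots
seat_odds = [1.50, 2.80, 15.0, 101.0]

for power in (1.0, 1.5, 2.0):
    probs = flb_adjusted_probabilities(seat_odds, power)
    print(power, [round(p, 4) for p in probs])
```

Because longer odds grow faster than shorter odds under the transformation, a higher exponent moves probability away from the longshots and towards the favourite.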

I am still deciding which transformation of the raw odds is best. While the squared approach yields probabilities for the number of seats won by the Greens and Others more in line with the current parliament, the variance of the probability distributions for all four groups has been reduced. The bias-variance trade-off suggests that we need to at least consider the possibility that we might be over-fitting the betting market data. Nonetheless, for the moment, I plan to use this squared-odds approach for managing the favourite-longshot bias (FLB) within the Monte Carlo simulations. Otherwise, the number of seats won by the Greens and Others still seems too high. But I will review this approach from time to time.
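For readers who want to see the mechanics, here is a minimal sketch of a Monte Carlo simulation over seats using independent draws. It is not the code behind the charts. The function name is hypothetical, and the seat probabilities are invented placeholders, with every seat given the same probabilities purely to keep the example short. In practice each seat would have its own FLB-adjusted probabilities.

```python
import random
from collections import Counter


def simulate_seat_totals(seat_probs, n_sims=2000, seed=1):
    """Draw a winner in every seat, independently, n_sims times.

    seat_probs is a list with one dict per seat mapping group -> win probability.
    Returns a list of Counters, one per simulation, with seats won by each group.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        tally = Counter()
        for probs in seat_probs:
            groups = list(probs)
            weights = [probs[group] for group in groups]
            tally[rng.choices(groups, weights=weights)[0]] += 1
        totals.append(tally)
    return totals


# Hypothetical inputs: the same made-up probabilities in all 151 seats
example_seat = {"Coalition": 0.40, "Labor": 0.55, "Greens": 0.03, "Other": 0.02}
seat_probs = [example_seat] * 151

totals = simulate_seat_totals(seat_probs)
labor_majority = sum(t["Labor"] >= 76 for t in totals) / len(totals)  # 76 of 151 seats
greens_distribution = Counter(t["Greens"] for t in totals)

print(f"share of simulations with a Labor majority: {labor_majority:.2f}")
print("Greens seat counts:", sorted(greens_distribution.items()))
```

Each simulation produces one seat tally per group, so the collection of tallies gives a distribution of seats won for each of the four groups.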

For the individual seat outcome probabilities, I will take the less aggressive approach of multiplying the raw-odds by their square root before calculating the probabilities and adjusting for the bookmaker's margin.

If you have alternate/better treatments for FLB, please drop a note in the comments below and argue your case.

Caveat

A couple of people have noted that I am using independent draws in the above Monte Carlo simulation. They argued that, because voting across electorates is correlated, I should have used dependent (correlated) draws. It is a fair point, but this requires substantial work to understand and model the dependency structure. I have added this to my to-do list. One of my interlocutors provided a couple of useful links:

Friday, January 28, 2022

Individual seat betting markets

I have started collecting Sportsbet's daily odds for the 151 seats in the House of Representatives.

In the following plots (one for each seat), I have ignored any odds in a seat in excess of \$25. These larger-valued odds can be problematic for two related reasons. First, the bookmaker's overround for each seat is typically sizeable. Second, the larger-valued odds can overestimate the win probability because of a phenomenon known as the favourite-longshot bias.
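A rough sketch of that filtering step follows. The function name and the example odds are hypothetical, and rescaling the remaining probabilities so that they sum to one is an assumption on my part about how the overround is then handled.

```python
def drop_longshots(seat_odds, cutoff=25.0):
    """Ignore odds above `cutoff`, then rescale what is left to sum to one.

    seat_odds maps each candidate or party to its decimal odds in one seat.
    """
    kept = {name: odds for name, odds in seat_odds.items() if odds <= cutoff}
    raw = {name: 1.0 / odds for name, odds in kept.items()}
    total = sum(raw.values())
    return {name: p / total for name, p in raw.items()}


# Hypothetical odds for a single seat
example_odds = {"Coalition": 1.70, "Labor": 2.10, "Greens": 26.0, "UAP": 101.0}
print(drop_longshots(example_odds))
```

In this example only the two shorter-priced contenders survive the cut-off, and their probabilities are rescaled to sum to one.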

While there is not much to see from my first two days of data collection (yesterday and today), these charts should provide a good indication of who the broader public thinks will win each seat, and how that opinion changes over time.

The seats that punters think are a close contest are: Braddon (TAS), Flinders (VIC), Casey (VIC), Bass (TAS), Deakin (VIC), Gilmore (NSW), Flynn (QLD), Boothby (SA), Longman (QLD), Reid (NSW), La Trobe (VIC), Lindsay (NSW), Robertson (NSW), Hunter (NSW), Dobell (NSW), Higgins (VIC), Goldstein (VIC), Hasluck (WA), and Wentworth (NSW).

The seats where the punters think the Greens have a chance are: Melbourne (VIC), Higgins (VIC), Griffith (QLD), Macnamara (VIC), Cooper (VIC), Brisbane (QLD), Wills (VIC), and Kooyong (VIC).

The seats where other parties and independents have a chance, according to the punters, are: Kennedy (QLD), Clark (TAS), Mayo (SA), Warringah (NSW), Indi (VIC), Flinders (VIC), Goldstein (VIC), Wentworth (NSW), North Sydney (NSW), Kooyong (VIC), Hughes (NSW), Hume (NSW), Nicholls (VIC), and Mackellar (NSW).