Thursday, April 25, 2013

Towards a more integrated primary vote share model

A few weeks ago I said that I had developed a Bayesian model for primary vote shares. At the time, I was seeking (but had not attained) an integrated solution that ensured the primary votes shares across the four party groups (Coalition, Labor, Green and Other) summed to one (or one hundred per cent). My original model typically summed in the range 99.5 to 100.5 per cent.

I have now have a more integrated model working. This new model typically sums in the range 99.95 to 100.05 per cent (an order of magnitude improvement on the unintegrated model). The model is as follows.

model {
    #### -- observational model 
    for(poll in 1:NUMPOLLS) { # for each poll result - rows
        for(party in 1:PARTIES) { # for each party - columns
            yhat[poll, party] <- houseEffect[house[poll], party] + 
                walk[pollDay[poll], party] 
            primaryVotes[poll, party] ~ dnorm(yhat[poll, party], precision[poll, party])
        }
    }
            
    #### -- temporal model (a daily walk where today is much like yesterday)
    for(day in 2:PERIOD) { # rows
        for (party in 1:PARTIES) { # columns
            tmp[day, party] ~ dnorm(walk[day-1, party], walkPrecision[party])
        }
    }

    ## -- impose a sum-to-one constraint ... total of all parties sums to one every day
    for(day in 1:PERIOD) { # rows
        walk[day, 1:PARTIES] <- tmp[day, 1:PARTIES] / sum(tmp[day, 1:PARTIES ])
    }

    ## -- constrained priors for the day-to-day variance of the temporal model
    for(party in 1:PARTIES) { # for each party
        sigmaWalk[party] ~ dunif(0, 0.005)  ## uniform prior on std. dev.  
        walkPrecision[party] <- pow(sigmaWalk[party], -2)   
    }

    ## -- uninformative priors for first day in the temporal model
    for (party in 1:PARTIES) { # for each party
        tmp[1, party] ~ dunif(0.0001, 0.9999) # fairly uninformative
    }

    #### -- sum-to-zero constraint on house effects (ignoring Morgan F2F)
    for (party in 1:PARTIES) { # for each party, house effects across houses sum to zero 
        houseEffect[1, party] <- 0 - sum( houseEffect[2:HOUSECOUNT, party] ) + 
            houseEffect[MORGANF2F, party]
    }
    for(house in 2:HOUSECOUNT) { # for each house, house effects across the parties sum to zero
        houseEffect[house, 1] <- 0 - sum( houseEffect[house, 2:PARTIES] ) 
    }
    # but note, we do not apply a double constraint to houseEffect[1, 1] [NEWSPOLL, LABOR]
        
    ## -- vague normal priors for house effects - centred on zero
    for (party in 2:PARTIES) { # for each party (cols)
        for(house in 2:HOUSECOUNT) { #  (rows)
            houseEffect[house, party] ~ dnorm(0, pow(0.1, -2))
        }
    }
}

The results from the model (with a 100,000 iteration run) are as follows ... noting this model took an hour of computing time.













While I have a working model, I remain concerned that it is inelegant. If you are feeling particularly wonkish, you can help me improve the model.

3 comments: