Monday, November 9, 2015

Update to the Dirichlet, sum-to-zero model

For some time I have been wrestling with one aspect of the Dirichlet, sum-to-zero model of primary voting intention. I think I have finally solved it.

A draw from the Dirichlet distribution is an n-tuple (a list) of proportions, where each proportion is greater than or equal to zero and less than or equal to one, and the proportions sum to one. In our model we use a 4-tuple, where each element represents the primary vote share for one party grouping: [Coalition, Labor, Greens, Other].
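
To make the 4-tuple concrete, here is a minimal Python sketch (not part of the JAGS model) that draws one Dirichlet sample by normalising independent Gamma draws, a standard construction. The concentration values in `alphas` and the helper name `sample_dirichlet` are hypothetical, chosen only for illustration.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet(alphas) sample by normalising Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(42)
groups = ["Coalition", "Labor", "Greens", "Other"]

# Hypothetical concentration parameters (assumption, not from the model)
alphas = [40.0, 35.0, 12.0, 13.0]

shares = sample_dirichlet(alphas, rng)
for group, share in zip(groups, shares):
    print(f"{group:10s} {share:.3f}")

# Each share lies in [0, 1] and the four shares sum to one
print("sum of shares:", round(sum(shares), 10))
```

Whatever values are used for `alphas`, every draw is a valid set of vote shares: no share is negative and the shares always add to one.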

The Dirichlet, sum-to-zero model has three parts:
  1. A temporal model, where the estimated national vote share for each party is pretty much like it was on the previous day. 
  2. A house-effects model, where the house effects:
    • across all houses for a particular party sum to zero
    • across all parties for a particular house sum to zero
  3. An observational model where the poll result for a particular day, subject to house effects, supports the national estimated vote-share for each of the parties for that day in the temporal model.

What I had been struggling with was the double constraint in the house-effects element of the model. For a long time, I simply ignored the second constraint in the corner case (houseEffect[1, 1]), as JAGS would not allow the node to be defined twice. The code snippet for this follows.

    #### ----- House-effects model
    ## -- vague priors ...
    for (h in 2:HOUSECOUNT) { 
        for (p in 2:PARTIES) { 
            houseEffect[h, p] ~ dunif(-0.1, 0.1)
        }
    }

    ## -- sum to zero constraint - but only in one direction for houseEffect[1, 1]
    for (p in 2:PARTIES) { 
        houseEffect[1, p] <- 0 - sum( houseEffect[2:HOUSECOUNT, p] )
    }
    for(h in 1:HOUSECOUNT) { 
        # includes a constraint for houseEffect[1, 1], but only in one direction
        houseEffect[h, 1] <- 0 - sum( houseEffect[h, 2:PARTIES] )
    }

While the absence of the double constraint troubled me a little, the model appeared to work. So I largely left it untouched. With the move to JAGS 4.0.1, I revisited this model. It occurred to me that I could code the second constraint as a requirement that the sum of houseEffect[1, 2:PARTIES] equaled the sum of houseEffect[2:HOUSECOUNT, 1]. I did this with the following addition to the model (including the introduction of zero as a defined constant data element).

    data {
        zero <- 0.0
    }
    model {
        ## ... rest of the model ...

        ## -- the other direction constraint on houseEffect[1, 1]
        zero ~ dsum( houseEffect[1, 1], sum( houseEffect[2:HOUSECOUNT, 1] ) )
    }

From the earlier code, houseEffect[1, 1] contains the negated sum of houseEffect[1, 2:PARTIES]. When added to the sum of houseEffect[2:HOUSECOUNT, 1], the result should be zero.
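
As a rough numeric check of this arithmetic, the following Python sketch (not JAGS, and not part of the model) mirrors the deterministic assignments from the house-effects snippet and then adds up both directions. The sizes `HOUSECOUNT = 5` and `PARTIES = 4` and the random seed are assumptions for illustration; indices are 0-based, so row 0 and column 0 stand in for houseEffect[1, ] and houseEffect[ , 1].

```python
import random

HOUSECOUNT, PARTIES = 5, 4  # hypothetical sizes (assumption)
rng = random.Random(1)

# Free parameters: houseEffect[h, p] ~ dunif(-0.1, 0.1) for h >= 2, p >= 2
effect = [[0.0] * PARTIES for _ in range(HOUSECOUNT)]
for h in range(1, HOUSECOUNT):
    for p in range(1, PARTIES):
        effect[h][p] = rng.uniform(-0.1, 0.1)

# houseEffect[1, p] <- 0 - sum(houseEffect[2:HOUSECOUNT, p]) for p >= 2
for p in range(1, PARTIES):
    effect[0][p] = -sum(effect[h][p] for h in range(1, HOUSECOUNT))

# houseEffect[h, 1] <- 0 - sum(houseEffect[h, 2:PARTIES]) for every house
for h in range(HOUSECOUNT):
    effect[h][0] = -sum(effect[h][p] for p in range(1, PARTIES))

# Sums across parties for each house, and across houses for each party
row_sums = [sum(row) for row in effect]
col_sums = [sum(effect[h][p] for h in range(HOUSECOUNT)) for p in range(PARTIES)]

# The quantity targeted by the dsum line:
# houseEffect[1, 1] + sum(houseEffect[2:HOUSECOUNT, 1])
corner = effect[0][0] + sum(effect[h][0] for h in range(1, HOUSECOUNT))

print("max |row sum|:", max(abs(s) for s in row_sums))
print("max |col sum|:", max(abs(s) for s in col_sums))
print("corner sum:   ", corner)
```

In this sketch the corner sum comes out as zero up to floating-point error. That follows from the algebra: the first column is the negated sum of the other columns, and each of those columns already sums to zero over houses. This may be why the extra constraint leaves the results looking the same.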

Let's revisit the charts. We will start with the vote estimate charts, before moving to the house effects charts (which we will look at from the perspective of houses and parties).

Does this additional bit of code make the model work better? I don't think so, but it is a more accurate specification of the necessary constraints in the model.

The full model code can be seen here (it is the third of the four models on the linked page).
