Comments on Mark the Ballot: Exploring a non-linear house effects model

Julian - you ask a heap of questions, which I will seek to answer.

Simon Jackman's Bayesian model needs a constraint. Without one, it yields the same shaped line, but randomly placed (up or down) on the chart with each run. I am pretty sure Simon mentions this in his text (or one of his journal articles), but being on the road, I can't give you the page reference. The sum-to-zero constraint is easy to implement, but (you are right) it offers no guarantee of correctness. So yes, you need to think about where the line should be placed, and therefore how it is constrained. As a personal rule of thumb, I sometimes use the mid-point between Nielsen and Newspoll as an approximation, but that too is problematic.

I have had a lengthy look at calibrating against previous elections (without compelling success). At the last election, the sum-to-zero constraint was a point in Labor's favour. But problems arise when applying it to subsequent elections: too few elections; new polling houses; changed methodologies; non-linearity in the polling response to changes in population voting intentions; and so on.

On to my non-linear model. I chose constraints to yield something that looked reasonable (bad practice, I know, which is why I flagged it as experimental and problematic). It was a test of concept, with huge problems. If I release the constraints entirely (non-informative prior), the model collapses to something close to a flat line.
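What the sum-to-zero identification does can be sketched in a few lines (illustrative Python with made-up numbers, not the actual JAGS model):

```python
import numpy as np

# Hypothetical house effects from one model run. Without a constraint the
# set is identified only up to a common additive offset, so each run can
# land at a different vertical position on the chart.
raw = np.array([1.8, 0.4, -0.2, 1.0])   # one effect per polling house

# Sum-to-zero removes the shared offset by centring on the mean, which
# pins the latent voting-intention line to a single vertical position.
constrained = raw - raw.mean()

print(constrained)
print(constrained.sum())   # zero, up to floating point
```

Nothing here guarantees the pinned position is the *correct* one, which is exactly the objection discussed above: the constraint resolves the identifiability problem, not the bias problem.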
Thinking about this some more, I think the problem was in this statement: << beta-for-pollster * (poll-result - minimum-poll-result) >>. Rather than the minimum, the anchor should be a central tendency of some type (mid-point, mean, median, etc.). I may also need a degree-2 polynomial to model this better. Anyway, something to think about some more.

I wrestled with the circularity problem you mentioned (given the previous paragraph, some would say unsuccessfully). It is a limitation of JAGS' directed acyclic graph (DAG) approach.

My expertise is not quite up to a "compound Poisson-Gaussian process (periodic jumps, the amplitude of which is Gaussian distributed)". I look forward to seeing your work on this.

-- Mark Graph, 30 July 2013

Incidentally, the effects being talked about here demonstrate why a "sum-to-zero" constraint is pretty dangerous. Somewhere like the US, with loads of polls and lots of organisations, I can find it plausible that there is no net bias.

On the other hand, Australia has a small number of pollsters. I am not convinced there is any good reason the net bias should be zero.

The better approach is to calibrate the biases off previous elections. The problems then are: 1) few data points, and 2) the pollsters may change their methods periodically. #1 is what it is - you can't get around that. #2 is potentially tractable analytically - you could extend the model to allow the biases of the pollsters to follow a random walk in time, or alternatively have the biases jump at Poisson-distributed times, in either direction, with some amplitude.
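The two bias-evolution ideas can be sketched as forward simulations (illustrative Python; the step sizes and jump rate are assumed values for the example, not estimates):

```python
import numpy as np

rng = np.random.default_rng(1)
n_weeks = 300

# Idea 1: the house bias drifts as a random walk (small weekly steps).
random_walk_bias = np.cumsum(rng.normal(0.0, 0.05, n_weeks))

# Idea 2: the bias is piecewise constant, jumping only when the pollster
# changes methodology. Jump times are Poisson (rare events); each jump is
# signed, with Gaussian amplitude - a compound Poisson process.
jump_occurs = rng.random(n_weeks) < 0.01    # ~1% chance per week
jump_size = rng.normal(0.0, 1.0, n_weeks)   # signed jump amplitude
jump_bias = np.cumsum(jump_occurs * jump_size)

print(random_walk_bias[-1], jump_bias[-1], int(jump_occurs.sum()))
```

The random-walk version changes a little every week; the jump version stays flat between rare methodology changes, which is closer to how pollsters actually behave.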
The most realistic model is a compound Poisson-Gaussian process (periodic jumps, the amplitude of which is Gaussian distributed). The techniques for dealing with this situation are well known from stochastic calculus and the like (the jump-diffusion model is a very popular options-pricing technique).

-- Julian King, 29 July 2013, 9:01 am

This is a particularly interesting idea; I have been experimenting with something myself.

What is your reasoning for a prior of [-0.1, 0.1] on the beta parameter? This is not an uninformative prior given the intrinsic scale of the problem, and this is evident from your graphs, where the right side of the confidence limits piles up against 0.1.

It looks like the Morgan F2F value wants to go much higher than this.

Have you tried something truly uninformative, like [-1, 1]? Beta = -1 corresponds to the case where the response is completely static, i.e. changes in general sentiment have no effect on the panel. Perhaps a Gaussian with a standard deviation of 1.0 is reasonable.

There is nothing wrong with having very uninformative priors. Remember that your priors should be based on your understanding of the problem before seeing the data - you should not be using the effects you see in the data in front of you to select the prior on beta.

It doesn't surprise me at all that the betas are poorly constrained. You can get an idea of the potential magnitude of the effect by plotting, for a particular poll series, the difference between the poll values and your model as a function of the model value (or the poll value, whichever you prefer).
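That diagnostic can be sketched as follows (illustrative Python; the poll and model numbers are invented for the example):

```python
import numpy as np

# Invented example data: the model's 2PP estimate at each poll date, and
# the published poll values for one hypothetical polling house.
model_2pp = np.array([46.0, 47.5, 48.2, 45.1, 49.0, 47.0, 44.5, 48.8])
poll_2pp = np.array([45.2, 47.9, 49.3, 44.0, 50.1, 47.4, 43.6, 49.9])

# Residual of each poll from the model...
residual = poll_2pp - model_2pp

# ...regressed (or just scatter-plotted) against the model value. A clear
# trend in the residuals suggests a non-zero beta; flat scatter of the
# size of the sampling error means beta is poorly constrained.
slope, intercept = np.polyfit(model_2pp, residual, 1)
print(slope)
```

In this fabricated series the residuals grow with the model value, so the fitted slope comes out positive; with real data over a narrow 2PP range, the slope would mostly reflect noise, as noted below.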
The 2PP vote has not really strayed outside the range ~44 to ~50, and the scatter of the poll values around the model is of the order of the statistical uncertainty, so there is not a good baseline from which to determine the betas. You would be able to determine beta better if the 2PP were much more volatile, if you had lots more poll data, or if the polls had much larger sample sizes.

An interesting alternative is to consider the case where beta relates not to the value from the poll under consideration, but to the value of the model at that point in time. Unfortunately, this (I think) leads to a circularity problem, where the value of the model at any point depends on beta, but the value of beta depends on the model.

-- Julian King, 29 July 2013, 8:52 am
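For concreteness, the two placements of beta being contrasted can be written out (illustrative Python; the function and variable names are mine, not from the model code):

```python
# Version 1: the non-linear house-effect term is driven by the published
# poll value. 'center' is the central-tendency anchor suggested above in
# place of the minimum poll result. No circularity, but the regressor
# (the poll itself) is noisy.
def expected_poll_v1(latent_vote, house_effect, beta, poll_value, center):
    return latent_vote + house_effect + beta * (poll_value - center)

# Version 2: beta is applied to the model's own latent value. This is the
# circular case: the expected poll depends on the latent series, which is
# estimated using beta, which is estimated from the polls - awkward to
# express in a directed acyclic graph.
def expected_poll_v2(latent_vote, house_effect, beta, center):
    return latent_vote + house_effect + beta * (latent_vote - center)

# With beta = 0, both reduce to the usual additive house-effect model.
print(expected_poll_v2(47.0, 0.5, 0.0, 47.0))  # 47.5
```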