The 2014 elections may prove to be no different, and psephologists will list the usual suspects — highly fluid voter preferences, false responses, and of course, sampling error — by way of rationalising the discrepancies. But one reason for the sometimes embarrassing, way-off-the-mark estimates of seat shares will never be proffered: that most of these national surveys, comprising 5,000 to 15,000 randomly selected voters, are designed to estimate party-wise national vote shares, not national seat shares. Or to put it differently, if seats in the Lok Sabha were distributed across parties based on the party-wise vote shares, most of these surveys would be fairly accurate.
But they aren’t. And this is the biggest weakness in the design of pre-poll surveys. While the mathematics of random sampling to estimate population parameters, whether it be voting preferences or choice of toothpaste, rests on fairly strong foundations, dating back to seminal work in probability theory and statistics by greats such as Pascal, Fermat and Gauss, the same cannot be said of the techniques used by psephologists in disaggregating party vote shares into the corresponding seat shares. Such an exercise is, to put it charitably, more art than science. Indeed, there is very little published academic literature on the techniques used by pollsters in converting aggregate vote shares into seat shares to be able to hold them up to scientific scrutiny.
Essentially, psephologists use voting distributions from previous elections to distribute the estimated vote share across constituencies and derive a seat share estimate. The technique is questionable on three counts: First, voter preferences, demographics, and political conditions change significantly over a five-year period, injecting a large degree of error. Second, the method is found wanting when a new party or new alliance debuts. This explains why most pollsters were way off the mark in estimating the Aam Aadmi Party’s seat share in the December 2013 Delhi Assembly elections.
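The commonly described version of this technique is a uniform swing model: take each constituency's vote shares from the previous election, shift them by the nationally estimated change in each party's vote share, and award every seat to whichever party then leads. The sketch below is a minimal, hypothetical illustration of that idea — the party names and constituency-level shares are invented for the example, not drawn from any actual survey.

```python
# Minimal sketch of a uniform-swing seat projection.
# All party names and vote shares below are hypothetical.

def project_seats(prev_results, prev_national, new_national):
    """Apply the national swing uniformly to each constituency's
    previous vote shares, then award each seat to the leading
    party (first-past-the-post)."""
    seats = {}
    for shares in prev_results:  # one dict of party -> vote share per seat
        swung = {
            party: share + (new_national[party] - prev_national[party])
            for party, share in shares.items()
        }
        winner = max(swung, key=swung.get)
        seats[winner] = seats.get(winner, 0) + 1
    return seats

# Three hypothetical constituencies (vote shares from a past election):
prev = [
    {"A": 0.42, "B": 0.38, "C": 0.20},
    {"A": 0.35, "B": 0.45, "C": 0.20},
    {"A": 0.30, "B": 0.33, "C": 0.37},
]
prev_nat = {"A": 0.36, "B": 0.39, "C": 0.25}  # previous national shares
new_nat = {"A": 0.40, "B": 0.35, "C": 0.25}   # survey's new estimate

print(project_seats(prev, prev_nat, new_nat))  # {'A': 1, 'B': 1, 'C': 1}
```

The model's weakness is visible in its central assumption: the swing is applied identically everywhere, so any constituency-specific shift — exactly what the article's three objections point to — is lost.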
Third, an Indian general election is an aggregate of 543 unique Presidential style elections. While overarching national issues, track record of the outgoing government, choice of PM candidate and party affinities matter, so do issues and circumstances specific to each of the 543 constituencies: the profile of candidates in the fray, the track record of the serving MP if he or she is seeking re-election, and the constituency’s own demographic profile, which — even if not too different from the state’s — will, nevertheless, disproportionately influence electoral outcomes in a multi-cornered, first-past-the-post fight.
While media houses are primarily interested in generating good copy from the surveys they commission and not so much in their credibility, there is a larger public interest in improving the accuracy of such pre-poll surveys. Publicly and privately commissioned surveys determine political realignments, and it would be in everyone’s interest that political actors base their decisions on more accurate estimates of their party’s electoral prospects. There is also the spectre of manipulating surveys to shape public opinion. This falls under a broader discussion dealing with regulation of opinion polls — but it is germane to point out that a survey designed to capture constituency-specific vote shares largely drives out the subjectivity associated with deriving these estimates from larger aggregate vote shares.
There are still other issues that bedevil pre-poll surveys. One, a representative sample of the electorate is not the same as a representative sample of those who will actually turn out to vote on polling day. Two, certain social groups are notorious for not revealing their voting intentions. These are areas for research, and an effort must be made to put the art of forecasting Indian elections, given their unique complexities, on a firmer scientific footing.
The author, a former public policy analyst, is the co-founder of satire website The Unreal Times and co-author of the novel, Unreal Elections