Pronab Sen was the chief statistician before moving to the Planning Commission as principal advisor. He tells Dilasha Seth there are built-in systems to detect data errors and that the Central Statistics Office (CSO) should look into why these did not work. Edited excerpts:
How could there be such a miscalculation in the IIP data for January?
Basically, what happened was very unusual. The sugar season starts from October. So, what they reported was the cumulative data for four months, October-January, instead of that for only January. This should have been detected. When I saw the IIP number for January, everyone was excited about the 6.8 per cent increase in industrial production, but consumer non-durables rising 42 per cent was surprising. This meant there was some error. But at the end of the day, there is a human element attached to it.
Aren’t there any internal checks to avert these data errors?
We usually have built-in computerised checks to detect such errors. So, when data comes in, the computer flags it if the growth is greater than, say, x per cent or less than y per cent. What the CSO needs to check is why these computerised checks did not work. These are standard checks.
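The kind of range check described above can be sketched in a few lines. This is only an illustration, not the CSO's actual system: the function name `flag_growth` and the threshold values are hypothetical, standing in for the "x per cent" and "y per cent" mentioned in the answer.

```python
def flag_growth(previous, current, lower_pct, upper_pct):
    """Return True when growth over the previous value falls outside
    the band [lower_pct, upper_pct].

    The thresholds are illustrative assumptions, not CSO parameters.
    """
    if previous <= 0:
        # Growth is undefined without a positive base value,
        # so the entry is sent for manual review.
        return True
    growth_pct = (current - previous) / previous * 100.0
    return growth_pct > upper_pct or growth_pct < lower_pct


# Hypothetical band of -20 to +20 per cent.
print(flag_growth(100.0, 142.0, -20.0, 20.0))  # a 42 per cent rise is flagged
print(flag_growth(100.0, 106.8, -20.0, 20.0))  # a 6.8 per cent rise passes
```

With a band like this, a 42 per cent jump in a category would be held back for review before publication, while a 6.8 per cent rise would pass.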
Products like machine tools and publishing are showing a high increase for February, too. Do you see an issue here?
One should not look at the data on a product basis, but on a category basis. For each product, there may be 1,000 companies. But the sample size is less than 10 for each product. So, the weight of each company at the product level is very high. Therefore, if one company is shut for a while for some reason, it would have a negative effect on the numbers. So, one should not look at the data at the product level.
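A rough calculation shows why a small sample makes product-level numbers so volatile. The sketch below assumes, purely for illustration, that every sampled company produces the same amount and that the product figure is a simple sum of their output; the actual IIP weighting is not described here, and the function names are hypothetical.

```python
def product_growth_pct(base_outputs, current_outputs):
    """Percentage change in total output across the sampled companies."""
    base_total = sum(base_outputs)
    current_total = sum(current_outputs)
    return (current_total - base_total) / base_total * 100.0


def growth_if_one_firm_shuts(n_firms):
    """Growth in the product figure when one of n equal-sized firms
    reports zero output and the others are unchanged (an assumption)."""
    base = [100.0] * n_firms
    current = [100.0] * (n_firms - 1) + [0.0]
    return product_growth_pct(base, current)


print(growth_if_one_firm_shuts(8))     # sample of 8 companies
print(growth_if_one_firm_shuts(1000))  # all 1,000 companies covered
```

Under these assumptions, one shutdown in a sample of eight pulls the product figure down by 12.5 per cent, whereas the same shutdown among 1,000 companies would move it by only about 0.1 per cent.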
Earlier, it was the GDP numbers, then export numbers and now, the IIP numbers. Is there a problem with the data collection system?
In all three cases, the problem is not with data collection but at the statistical compilation level. We have built-in checks and these have worked well in the past. But why these did not work in these cases is the question. Are people entering data while these programmes are switched off? I feel this is not a computing problem, but an organisational problem.