
How not to be fooled by randomness

We're the only species smart enough to think an octopus can be psychic

If enough people ask enough animals to predict enough sporting events, it is likely some will guess correctly a few times in a row
Faye Flam | Bloomberg
Last Updated : Dec 10 2017 | 11:29 PM IST
There’s a common thread to the dubious claims and irreproducible results that have plagued some fields of science. In most of these cases, scientists saw illusory patterns in the randomness of our world. It’s the same error that led people to be captivated by an octopus named Paul who, in 2010, correctly predicted the outcomes of eight German World Cup matches.
 
It might seem surprising that a cephalopod could make so many predictions from mere random guesses. But as statistician David Hand explains in his book “The Improbability Principle”, if enough people ask enough animals to predict the outcomes of enough sporting events, eventually it is likely some will guess correctly a few times in a row.
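Hand's point can be put in numbers with a rough sketch. The function below is a hypothetical illustration, not something from the book, and it assumes each animal's guesses are independent coin flips with a 50 percent chance of being right (draws are ignored).

```python
def chance_of_streak(num_animals, streak_length, p_correct=0.5):
    """Probability that at least one animal gets every prediction right.

    Assumes each guess is independent and correct with probability p_correct.
    """
    # Chance that one particular animal gets the whole streak right.
    p_single = p_correct ** streak_length
    # Chance that nobody manages it, subtracted from 1.
    return 1.0 - (1.0 - p_single) ** num_animals


for n in (1, 100, 1000):
    print(n, "animals:", round(chance_of_streak(n, 8), 3))
```

Under these assumptions a single animal has a 1 in 256 chance of eight correct calls, but the chance that some animal in a large enough crowd manages it climbs quickly toward certainty.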
 
Something similar happens when thousands of scientists go looking for weird, headline-grabbing results — a common practice in psychology. For a while, it looked like you could make yourself happier by holding a pencil in your teeth, become more empathetic by spending three minutes reading classic literature, or gain power by assuming special “power poses”. All have gone the way of poor, discredited Paul the Octopus.
 
The good news is that tools to separate real results from random noise exist. The bad news is that many researchers in afflicted fields never learned to use them correctly — a conclusion echoed by reformers from within, and reflected in a statement issued last year by the American Statistical Association.
 
Statistical analysis falls into two major schools: frequentist and Bayesian. The frequentist school of thought hinges on the idea that the probability of something happening corresponds to the fraction of times it would happen given many chances. Roll a fair die often enough and five will come up very close to one-sixth of the time.
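The frequentist idea can be illustrated with a short simulation. This is a minimal sketch that assumes a fair die and uses a hypothetical helper, frequency_of_five; the observed share of fives wanders for small samples and settles near one-sixth (about 0.167) as the number of rolls grows, without ever being guaranteed to hit it exactly.

```python
import random


def frequency_of_five(num_rolls, seed=0):
    """Roll a simulated fair die num_rolls times; return the share of fives."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    fives = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 5)
    return fives / num_rolls


for rolls in (60, 6_000, 600_000):
    print(rolls, "rolls:", round(frequency_of_five(rolls), 4))
```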
 
Back in 1865, mathematicians Benjamin and Charles Peirce, a father-and-son team, used the concept to help settle a dispute involving Hetty Green, who would later become known as the richest woman in America. The story, told in detail by journalist Louis Menand in his book “The Metaphysical Club”, starts with the mathematicians getting called as expert witnesses to determine whether Green (then Hetty Robinson) had forged her aunt’s signature on an alternative will that would have bequeathed her a fortune of $2 million.
 
The signature in question was perfectly identical to a signature on the original will, suggesting that Robinson had traced it. Most authentic signatures vary a bit. What were the odds these two signatures would be identical by chance?
 
The Peirces noted that the signature had 30 down strokes. They found 44 other examples of the aunt’s signature, measured the down strokes, and calculated that a given down stroke matched across two signatures 5 percent of the time. They calculated odds of one in 68 that three down strokes would match in two signatures, one in 144 that four would match, and one in trillions that all 30 would match.
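The logic behind the final figure can be sketched by multiplying probabilities. The snippet below takes the 5 percent per-stroke figure quoted above at face value and assumes each down stroke matches independently of the others, an assumption added here for illustration. It does not attempt to reproduce the intermediate odds for three or four matches, whose derivation is not given.

```python
def chance_all_match(p_single, num_strokes):
    """Probability that every stroke matches, assuming independent strokes."""
    return p_single ** num_strokes


p_all = chance_all_match(0.05, 30)
print(f"About 1 in {1 / p_all:.1e}")
```

Whatever the exact per-stroke rate, multiplying 30 small probabilities together produces a number far smaller than one in a trillion, which is why a perfect match looked so suspicious.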
 
This calculation bears some resemblance to what scientists do to determine what they call statistical significance — which is expressed as a p-value. The technique was invented to help researchers separate real results from noise by giving them a sense of whether they should be surprised enough by their data to take a closer look.
 
But psychologists and medical researchers usually use an arbitrary cutoff of .05 (1 in 20) to define what’s statistically significant, a standard far less stringent than the one-in-trillions calculated by the Peirces. This porous filter was originally intended to flag preliminary results that deserved a second look, not to serve as a proxy for truth.
 
The problem with p-values goes beyond that. Gerd Gigerenzer, a psychologist and longtime science critic at the Max Planck Institute for Human Development, points to a survey published in 2002, which indicated most professors of psychology don’t know what p-values represent. They think they know, but they don’t.
 
And because they don’t understand p-values, they routinely calculate them incorrectly, he said, allowing the publication of lots of high-profile noise under a veneer of statistical rigour. In a 2004 paper titled “Mindless Statistics”, Gigerenzer illustrated the crux of the problem with an anecdote from the writings of physicist Richard Feynman.
 
If researchers comb through their data fishing for weird things, that’s fine, but to calculate their statistical significance requires a separate experiment. Otherwise they’ll end up with the same problem that afflicted one of Paul’s successors, a koala named Oobi-Ooobi, who was fired last year after his sports prediction powers suddenly — and not so mysteriously — disappeared.
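A small simulation shows why the separate experiment matters. It is a sketch under stated assumptions: every effect being tested is pure noise, each study runs 20 comparisons, and a p-value for a true null result is treated as a uniform random number between 0 and 1 (which holds for continuous test statistics). The function name fishing_trip and its parameters are hypothetical.

```python
import random


def fishing_trip(num_comparisons, num_studies=10_000, alpha=0.05, seed=1):
    """Simulate studies that test many noise-only effects at cutoff alpha.

    Returns (share of studies with at least one "significant" result,
             share of those results that pass a fresh follow-up experiment).
    """
    rng = random.Random(seed)
    found = 0
    replicated = 0
    for _ in range(num_studies):
        # With no real effect, each p-value is uniform on [0, 1].
        p_values = [rng.random() for _ in range(num_comparisons)]
        if min(p_values) < alpha:
            found += 1
            # Separate experiment on the one effect that looked striking.
            if rng.random() < alpha:
                replicated += 1
    return found / num_studies, replicated / max(found, 1)


found_rate, replication_rate = fishing_trip(20)
print("Studies that found something:", round(found_rate, 2))
print("Findings that survived a fresh test:", round(replication_rate, 2))
```

In this setup most fishing trips turn up something that clears the .05 bar, yet only about one in 20 of those catches holds up when tested again on new data.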
© 2017 Bloomberg