In 2011 Daryl Bem proved that
precognition exists! People can know something before it happens. Specifically,
he ran an experiment that went like this: students were shown a computer
screen displaying two curtains. They were told that a picture would appear
behind one of the curtains, and they had to predict which one. The computer then
used a random number generator to choose which curtain the picture would appear behind. If the subjects had no
ability to know the future, you would expect a 50-50 success rate. But Bem’s
experiment showed that subjects predicted correctly at a statistically
significant rate above 50%.
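To get a feel for what “statistically significant” means here, consider a quick sketch in Python. The numbers are made up (a hypothetical 53% hit rate over 1,000 guesses, not Bem’s actual data); the point is just that a small edge over chance can yield an impressive-looking p-value.

```python
# Illustrative only: hypothetical numbers, not Bem's data.
from scipy.stats import binomtest

trials = 1000  # hypothetical number of guesses
hits = 530     # hypothetical 53% hit rate
result = binomtest(hits, trials, p=0.5, alternative="greater")
print(f"hit rate = {hits / trials:.1%}, one-sided p-value = {result.pvalue:.3f}")
# Prints a p-value of about 0.03: "significant" at the usual 0.05
# threshold, even though the edge over chance is only 3 percentage points.
```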
It would have been easy to dismiss
this as an improperly run experiment, except that
- Professor Bem was a highly
regarded scientist, having initiated a major field of study within psychology
- He had taught
at Carnegie Mellon, Stanford, Harvard, and Cornell
- His study was peer-reviewed, and no problems were found with its methodology or analysis
and so “Feeling the Future” was
published in one of the most prestigious psychology journals.
Consternation! Pretty much no one
believed in precognition, but how could “good”, peer-reviewed, published
science be totally wrong? Several scientists tried repeating Bem’s study and
found no precognition. And others started looking at whether some
well-accepted studies were replicable and found that they, too, didn’t hold up.
Some estimates suggested that as many as 50% of published results are not
repeatable. And so began the so-called Replication Crisis in psychology.
Sadly, the studies that
have not held up include some great ideas. You know about them if you’ve read
wonderful books like Nobel laureate Daniel Kahneman’s Thinking, Fast and Slow or Dan Ariely’s Predictably
Irrational. Some examples:
Social Priming: A famous study
found that subjects who had been exposed to a list of words that included ones
relating to old age walked more slowly when leaving the building.
Ego depletion: This 1998 experiment
has been cited over 3,000 times. Student volunteers were placed in a room with
freshly baked, fragrant chocolate chip cookies. Some groups were permitted to
eat them, other groups were not. After a while, all were given an impossible
puzzle to try to solve. Those whose willpower had been taxed by refraining
from eating gave up after an average of eight minutes; those who had been
allowed to eat the cookies lasted an average of 19 minutes.
Facial Feedback Hypothesis: A
highly cited 1988 experiment had subjects hold a pen in their mouths
while looking at cartoons. Those instructed to hold the pen in a way
that forced them to smile found the cartoons significantly funnier than the
control group did.
There are many reasons why
invalid results can be published and never refuted. Some are:
- Significance level: Psychology
accepts a result as significant at the 95% confidence level (p < 0.05). That means that even
when there is no real effect at all, about one study in every 20 will look “significant”
purely by chance (see the first simulation sketch after this list). By contrast, the benchmark
in particle physics is roughly one spurious conclusion in 3.5 million (the “five sigma” standard)!
- Journals publish almost
exclusively positive results, so there’s a lot of pressure on researchers to find
significance.
- Journals are also very
reluctant to publish papers showing that a study failed to replicate. And if
the replication works, it’s old news! So there’s little incentive to try to
replicate a study: it won’t get you a publishable paper whether the replication
succeeds or fails.
- “P-hacking”: You collect a lot of data and then search it for a significant result. If you
slice and dice your data enough, you are quite likely to come up with a
statistically significant result purely by chance. So if you don’t get
significance with your entire group of subjects, try splitting them into subgroups:
males and females; young and old; gay and straight;
first-born vs. later-born; and so on. In Bem’s study, he found that
precognition worked with erotic images and male subjects. With 20 subgroups you
have a good chance of finding a spuriously significant result (see the second sketch below).
- “Data peeking”: This is a way of selectively collecting the data you would like. You
run a study on, say, 20 subjects and look at the results. If they don’t look good,
you decide that the procedure isn’t quite right, throw away that “bad” data, make a small change, and try again.
Keep doing this until you get a really good result on 20 subjects, then declare those 20 to be the start of the real study
and keep going (see the third sketch below). It has been suggested that Bem may have done this. It’s
clearly dishonest if you know that the procedural changes you are making don’t
really matter. But there’s a widespread belief that in many cases minor changes
to the procedure do matter. (Perhaps just convenient wishful thinking.)
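Here is the first simulation sketch, my own illustration in Python (not code from any of the studies above): run a pile of “studies” where the true effect is zero and watch how often p < 0.05 shows up anyway.

```python
# Simulate null "studies": both groups drawn from the same distribution,
# so any "significant" difference is a false positive.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_studies = 10_000
false_positives = 0
for _ in range(n_studies):
    control = rng.normal(size=30)
    treated = rng.normal(size=30)  # same distribution: no real effect
    if ttest_ind(control, treated).pvalue < 0.05:
        false_positives += 1

print(f"{false_positives / n_studies:.1%} of null studies came out 'significant'")
# Expect roughly 5%: one spurious finding per 20 studies.
```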
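The second sketch shows p-hacking at work: slice the same kind of pure-noise data into 20 subgroups and see how often at least one subgroup comes out “significant”.

```python
# With 20 independent subgroup tests on noise, the chance that at least
# one is "significant" is 1 - 0.95**20, about 64%.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
n_experiments, n_subgroups = 2_000, 20
lucky = 0
for _ in range(n_experiments):
    pvalues = [ttest_ind(rng.normal(size=20), rng.normal(size=20)).pvalue
               for _ in range(n_subgroups)]
    if min(pvalues) < 0.05:
        lucky += 1

print(f"{lucky / n_experiments:.0%} of experiments found a 'significant' subgroup")
```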
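And the third sketch: data peeking as described above. Throw away 20-subject pilots of pure noise until one happens to “work”.

```python
# Keep discarding and re-running 20-subject pilots until one is
# "significant". There is no effect to find, yet success is guaranteed.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
attempts = 0
while True:
    attempts += 1
    control = rng.normal(size=20)
    treated = rng.normal(size=20)  # same distribution: nothing to find
    if ttest_ind(control, treated).pvalue < 0.05:
        break

print(f"'Success' after {attempts} discarded pilots")
# On average it takes about 20 tries; the lucky batch then gets declared
# the start of the "real" study.
```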
Lots of psychologists are trying
hard to fix the many problems in the discipline.
- There are many groups
working to replicate and either support or debunk existing theories.
- There’s a lot of
pressure to “preregister” studies. This means you specify in advance what your
hypothesis is, what data you’ll collect, and how you’ll analyze it. That leaves
less wiggle room for hacking the data after the fact.
- There’s pressure on journals
to accept Registered Reports: well-designed studies that the journal agrees
in advance to publish, regardless of the outcome.
In the meantime, I plan to keep
smiling and eating chocolate chip cookies.