
The replication crisis in science has just begun. It will be big.

Summary: After a decade of slow growth out of public view, the replication crisis in science is breaking into the open. It began with psychology and biomedical studies and is now spreading to many other fields — overturning what we were told was settled science, the foundation of our personal behavior and public policy. Here is an introduction to the conflict (there is pushback), with the usual links to detailed information at the end, and some tentative conclusions about the effects on the public’s trust in science. It’s early days yet, with the real action yet to begin.

“Men only care for science so far as they get a living by it, and that they worship even error when it affords them a subsistence.”
— Goethe, from Conversations of Goethe with Eckermann and Soret.

Mickey Kaus referred to undernews as those “stories bubbling up from the blogs and the tabs that don’t meet MSM standards.” More broadly, it refers to information which mainstream journalists pretend not to see. By mysterious processes it sometimes becomes news. A sufficiently large story can mark the next stage in a social revolution. Game, the latest counter-revolution to feminism, has not yet reached that stage. The replicability crisis of science appears to be doing so, breaching like a whale from the depths of the sea in which it has silently grown.

The crisis can be seen in general media articles about failures in specific fields, often with large public policy consequences.

Slowly, articles appear about the systemic nature of the crisis, a result of science’s institutions not adapting to their massive growth in size and funding since WWII (more appear below).

This crisis emerged a decade ago as problems in a few fields — especially health care and psychology. Slowly, similar problems emerged in other fields, usually as failures to replicate widely accepted research. Even economics, with its high standards for transparency, has been hit. The landmark 2010 paper “Growth in a Time of Debt” by Harvard professors Carmen Reinhart and Kenneth Rogoff — used to justify austerity policies in scores of nations — was found to rest on serious spreadsheet errors. Even physics has been affected, as William Wilson notes.

“Two of the most vaunted physics results of the past few years — the announced discovery of both cosmic inflation and gravitational waves at the BICEP2 experiment in Antarctica, and the supposed discovery of superluminal neutrinos at the Swiss-Italian border — have now been retracted, with far less fanfare than when they were first published.”  {See this about the former and this about the latter.}

Now those who are paying attention see that there is a structural problem in modern science: a deterioration of the always sloppy (as with most social processes) self-correcting dynamics of institutional research. Only small-scale research has been conducted so far, so we do not know how broad and deep this dysfunction is. The available evidence suggests that “large” is the most likely answer.

The stakes are almost beyond imagination. It’s not just a matter of time and money wasted when bad studies send research down blind alleys. Science is one of our best ways to see the world, and effective public policy requires reliable research on scores of subjects, from health care to climate change. Trillions of dollars, the world’s rate of economic growth, and the health of billions of people can be affected.

Actions and resistance

Talk precedes action, and there have been several high-level conferences about this crisis, such as the February 2014 workshop by the Subcommittee on Replicability in Science, part of the Advisory Committee to the NSF Directorate for Social, Behavioral, and Economic Sciences. It produced a typically thorough report: “Social, Behavioral, and Economic Sciences Perspectives on Robust and Reliable Science”.

Journalists describe the replication crisis as “Whig history” — another step in the inevitable evolution and perfection of science. They seldom mention the scientists — and science institutions — resisting reforms, making the outcome uncertain (here’s an example in social psychology). This hidden side of the crisis is described by David Funder (professor of psychology, UC Riverside) at his website.

It’s not just – or even especially – about psychology. I was heartened to see that the government representatives saw the bulk of problems with replication as lying in fields such as molecular biology, genetics, and medicine, not in psychology. Psychology has problems too, but is widely viewed as the best place to look for solutions since the basic issues all involve human behavior.

It makes me a bit crazy when psychologists say (or sometimes shout) that everything is fine, that critics of research practices are “witch hunting,” or that examining the degree to which our science is replicable is self-defeating. Quite the reverse: psychology is being looked to as the source of the expertise that can improve all of science. As a psychologist, I’m proud of this.

Backlash and resistance.

This issue came up only a couple of times and I wish it had gotten more attention. It seemed like nobody at the table (a) denied there was a replicability problem in much of the most prominent research in the major journals or (b) denied that something needed to be done. As one participant said, “we are all drinking the same bath water.” … {But} there will be resistance out there. And we need to watch out for it.

…One of Geoff Cumming’s graduate students, Fiona Fidler, recently wrote a thesis on the history of null hypothesis significance testing {NHST}. It’s a fascinating read and I hope will be turned into a book soon. One of its major themes is that NHST has been criticized thoroughly and compellingly many times over the years.  Yet it persists, even though – and, ironically, perhaps because – it has never really been explicitly defended!  Instead, the defense of NHST is largely passive.  People just keep using it.  Reviewers and editors just keep publishing it; granting agencies keep giving money to researchers who use it.  Eventually the critiques die down.  Nothing changes.

That could happen this time too.  The defenders of the status quo rarely actively defend anything. They aren’t about to publish articles explaining why NHST tells you everything you need to know, or arguing that effect sizes of r = .80 in studies with an N of 20 represent important and reliable breakthroughs, or least of all reporting data to show that major counter-intuitive findings are robustly replicable.   Instead they will just continue to publish each others’ work in all the “best” places, hire each other into excellent jobs and, of course, give each other awards.  This is what has happened every time before.

Things just might be different this time.  Doubts about statistical standard operating procedure and the replicability of major findings are rampant across multiple fields of study, not just psychology.  And, these issues have the attention of major scientific studies and even the US Government.  But the strength of the resistance should not be underestimated.
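Funder’s point about small samples and the significance filter can be made concrete with a toy simulation. This is a minimal sketch, not anything from his post: the true effect size, sample size, and study count below are illustrative assumptions.

```python
# Toy simulation of the dynamic Funder describes: a modest true effect,
# small samples, and an NHST filter (p < .05) on what gets "published".
# All parameters here are illustrative assumptions, not from any cited study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_r, n, n_studies = 0.20, 20, 10_000   # modest true effect, N = 20 per study

def one_study(rho, n):
    """Simulate one study: n pairs drawn from a bivariate normal with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    return stats.pearsonr(x, y)           # returns (observed r, p-value)

published = []                            # observed r of studies passing p < .05
replications_passed = 0                   # exact replications that also reach p < .05

for _ in range(n_studies):
    r_obs, p = one_study(true_r, n)
    if p < 0.05:                          # the NHST publication filter
        published.append(r_obs)
        _, p_rep = one_study(true_r, n)   # one exact replication attempt, same N
        replications_passed += p_rep < 0.05

print(f"Studies passing p < .05: {len(published)} of {n_studies}")
print(f"Mean published |r|: {np.mean(np.abs(published)):.2f} (true r = {true_r})")
print(f"Replication rate among 'significant' findings: {replications_passed / len(published):.0%}")
```

Under these assumptions only a small fraction of studies clear the p < .05 bar, the ones that do report effect sizes several times larger than the true one, and exact replications at the same sample size fail most of the time. That pattern of selection and inflation is what replication projects keep finding.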

Other signs of pushback to the replication crisis

Institutions seldom reform without a fight, a surprise only to those who believe journalists’ “Whig history” of science.

Conclusions

“But what a weak barrier is truth when it stands in the way of an hypothesis!”
— Mary Wollstonecraft, from A Vindication of the Rights of Woman (1792).

This just touches on the many dimensions of the replication crisis. For example, there is a large and growing literature about the misuse of statistics, and the first steps toward understanding the various causes of replication failure (almost certainly structural issues, perhaps common to many or all of the sciences today).

We can only guess how many of the sciences have serious problems with replication — and with the methodological problems that produce it. This might be one of the greatest challenges to science since the backlash against Darwin’s theory of evolution. Depending on the extent of the problem and the resistance of institutions to reform, it might become the largest challenge since the Roman Catholic Church’s assault in the 16th and 17th centuries, when it put the works of famous scientists on the Index Librorum Prohibitorum (e.g., Copernicus, Kepler, Galileo). But this time the problems are within science, not external to it.

The likely (but not certain) eventual result is reforms that strengthen the institutions of science, but the crisis might have severe side effects. America has long had a rocky relationship with science, from the 1925 Scopes “Monkey Trial” about evolution to the modern climate wars. With confidence in our institutions already low and falling, news about replication failures in “settled science” might erode the public’s willingness to trust scientists. How many replication failures, like the rise and fall of fears about non-celiac gluten sensitivity, can occur without consequences? The damage might take a long time to heal.

Many sciences are vulnerable, but climate science might become the most affected. It combines high visibility, a central role in one of our time’s major public policy questions, and a frequent disregard for the methodological safeguards that other sciences rely upon.

Watch for the next developments in this important story.

To learn more about the crisis of science

Some early articles about the crisis

  1. An early warning that something was amiss: “Problems With Null Hypothesis Significance Testing (NHST)” by Jeffrey A. Gliner et al in The Journal of Experimental Education, 2002 — “The results show that almost all of the textbooks fail to acknowledge that there is controversy surrounding NHST.”
  2. “Why Most Published Research Findings Are False”, John P. A. Ioannidis, Public Library of Science Medicine, 30 August 2005.
  3. “Who’s Afraid of Peer Review?”, John Bohannon, Science, 4 October 2013 — “A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals.”

A few good articles about the crisis

  1. “The Truth Wears Off” by Jonah Lehrer in The New Yorker, 13 December 2010 — “Is there something wrong with the scientific method?” Gives some powerful examples.
  2. “Replication studies: Bad copy” by Ed Yong in Nature, 16 May 2012 — “In the wake of high-profile controversies, psychologists are facing up to problems with replication.”
  3. “How science goes wrong: Scientific research has changed the world. Now it needs to change itself”, The Economist, 19 October 2013.
  4. An excellent intro to the subject: “The Replication Crisis in Psychology” by Edward Diener and Robert Biswas-Diener, NOBA, 2016.
  5. “Big Science is broken” by Pascal-Emmanuel Gobry at The Week, 18 April 2016.
  6. “What does research reproducibility mean?” by Steven N. Goodman et al in Science Translational Medicine, 1 June 2016.

Some of the many papers about the replication crisis

  1. “Statistical errors in medical research – a review of common pitfalls” by Alexander M. Strasak et al, Swiss Medical Weekly, 27 January 2007 — “Standards in the use of statistics in medical research are generally low. A growing body of literature points to persistent statistical errors, flaws and deficiencies in most medical journals.”
  2. “What errors do peer reviewers detect, and does training improve their ability to detect them?” by Sara Schroter et al in the Journal of the Royal Society of Medicine, 1 October 2008 — Showed the massive failure of peer review to catch a deliberately flawed paper submitted to the British Medical Journal.
  3. “Reliability of ‘new drug target’ claims called into question”, Brian Owens, Nature, 5 September 2011 — An internal study at Bayer found that results matched the published findings in only 14 of 67 target-validation projects. These projects covered the majority of Bayer’s work in oncology, women’s health, and cardiovascular medicine over the previous 4 years. See the paper: “Reliability of ‘new drug target’ claims called into question”, Asher Mullard, Nature Reviews Drug Discovery.
  4. “Academic bias & biotech failures” at Life Sci VC, 28 March 2011 — “The unspoken rule is that at least 50% of the studies published even in top tier academic journals – Science, Nature, Cell, PNAS, etc… – can’t be repeated with the same conclusions by an industrial lab.”
  5. “Believe it or not: how much can we rely on published data on potential drug targets?”, Florian Prinz et al, Nature Reviews Drug Discovery, September 2011.
  6. “In cancer science, many ‘discoveries’ don’t hold up”, Reuters, 28 March 2012 — About Amgen’s study, “Drug development: Raise standards for preclinical cancer research” by C. Glenn Begley and Lee M. Ellis in Nature, 28 March 2012. They tested 53 “landmark” papers about cancer; 47 could not be replicated.
  7. “Weak statistical standards implicated in scientific irreproducibility” by Erika Check Hayden, Nature, 11 November 2013 — “One-quarter of studies that meet commonly used statistical cutoff may be false.” About “Revised standards for statistical evidence” by Valen E. Johnson in PNAS, 26 November 2013. (The arithmetic behind such estimates is sketched after this list.)
  8. “Estimating the reproducibility of psychological science” by the Open Science Collaboration, Science, 28 August 2015. Part of the Reproducibility Project: Psychology of the Center for Open Science.
  9. “Questions About Questionable Research Practices in the Field of Management” by George C. Banks et al, Journal of Management, January 2016. See the HuffPo article by one of the authors: “How and Why Scientists Fudge Results, and What We Can Do About It”.
  10. “Records found in dusty basement undermine decades of dietary advice” by Sharon Begley at STAT, 12 April 2016 — Powerful but unpublished studies decisively refuted the consensus belief about the dangers of animal fats. They were probably unpublished because they contradicted the ruling paradigm. The NYT also covered this. See these two papers in the British Medical Journal: “Re-evaluation of the traditional diet-heart hypothesis: analysis of recovered data from Minnesota Coronary Experiment (1968-73)”, 12 April 2016 — and “Use of dietary linoleic acid for secondary prevention of coronary heart disease and death: evaluation of recovered data from the Sydney Diet Heart Study {1966-73} and updated meta-analysis”, 5 February 2013.
  11. “1,500 scientists lift the lid on reproducibility” by Monya Baker, Nature, 25 May 2016 — “Survey sheds light on the ‘crisis’ rocking research.”
  12. “Replication initiatives will not salvage the trustworthiness of psychology” by James C. Coyne at BioMed Central (peer-reviewed, open access), 31 May 2016.
  13. “Is Most Published Research Really False?”, Jeffrey T. Leek and Leah R. Jager, Annual Reviews, March 2017.
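A note on item 7 above, and on Ioannidis’s 2005 article in the earlier list: both headline estimates rest on the same arithmetic. The share of “significant” findings that are true depends on the prior probability that the tested hypothesis is real, the study’s power, and the significance threshold. Here is a minimal sketch; the priors and power used are assumptions for illustration, not figures from either paper.

```python
# Positive predictive value (PPV) of a finding that clears p < alpha:
#   PPV = (power * prior) / (power * prior + alpha * (1 - prior))
# The inputs below are illustrative only, not taken from the cited papers.
def ppv(prior: float, power: float, alpha: float = 0.05) -> float:
    """Share of 'significant' results that reflect a true effect."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

for prior, power in [(0.5, 0.8), (0.1, 0.8), (0.1, 0.2)]:
    print(f"prior = {prior:.0%}, power = {power:.0%} -> PPV = {ppv(prior, power):.0%}")
# Low prior odds and low power push PPV down sharply: well-powered tests of
# long-shot hypotheses, or under-powered tests generally, yield many false positives.
```

Under the less favorable assumptions above, most positive results are false, even before any questionable research practices enter the picture.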

Sources of on-going information

  1. List of replication attempts in psychological research. Many failed.
  2. Investigating Variation in Replicability: A “Many Labs” Replication Project by the Open Science Collaboration. See a summary at National Geographic.
  3. The master website for anyone interested in this subject: Retraction Watch.

For More Information

Please like us on Facebook, follow us on Twitter. For more information see all posts about experts (our reliance on and trust of them), especially these…
