The replication crisis in science has just begun. It will be big.

Summary: After a decade of slow growth beneath public view, the replication crisis in science is breaking into the open. It began in psychology and biomedical studies and is now spreading to many other fields, overturning what we were told was settled science — the foundations of our personal behavior and public policy. Here is an introduction to the conflict (there is pushback), with the usual links to detailed information at the end, and some tentative conclusions about effects on the public’s trust in science. It’s early days yet, with the real action still to come.

“Men only care for science so far as they get a living by it, and that they worship even error when it affords them a subsistence.”
— Goethe, from Conversations of Goethe with Eckermann and Soret.

Replication Crisis in Science

Mickey Kaus referred to undernews as those “stories bubbling up from the blogs and the tabs that don’t meet MSM standards.” More broadly, it refers to information which mainstream journalists pretend not to see. By mysterious processes it sometimes becomes news. A sufficiently large story can mark the next stage in a social revolution. Game, the latest counter-revolution to feminism, has not yet reached that stage. The replicability crisis of science appears to be doing so, breaching like a whale from the depths of the sea in which it has silently grown.

The crisis is seen in these general media articles about failures in specific fields, often with large public policy consequences.

  • “A Study on Fats That Doesn’t Fit the Story Line” by the NYT, looking at long-hidden research suggesting that animal fats are not worse than vegetable fats. See #10 below for links to these studies.
  • “The sugar conspiracy” by Ian Leslie in The Guardian — “In 1972, a British scientist sounded the alarm that sugar – and not fat – was the greatest danger to our health. His findings were ridiculed and his reputation ruined. How did the world’s top nutrition scientists get it so wrong for so long?”
  • “How scientists fell in and out of love with the hormone oxytocin” by Brian Resnick at VOX — “Scientists believed a whiff of the chemical could increase trust between humans. Then they went back and checked their work.”
  • “Cancer Research Is Broken” by Daniel Engber at Slate — “There’s a replication crisis in biomedicine — and no one even knows how deep it runs.”

Slowly articles appear about the systematic nature of the crisis, a result of science’s institutions not adapting to their massive growth in size and funding since WWII (more appear below).

This crisis emerged a decade ago as problems in a few fields, especially health care and psychology. Slowly similar problems emerged in other fields, usually as failures to replicate widely accepted research. Even economics, with its high standards for transparency, has been hit. The landmark 2010 paper “Growth in a Time of Debt” by Harvard professors Carmen Reinhart and Kenneth Rogoff, used to justify austerity policies in scores of nations, was found to have serious errors in its spreadsheet. Even physics has been affected, as William Wilson notes.

“Two of the most vaunted physics results of the past few years — the announced discovery of both cosmic inflation and gravitational waves at the BICEP2 experiment in Antarctica, and the supposed discovery of superluminal neutrinos at the Swiss-Italian border — have now been retracted, with far less fanfare than when they were first published.”  {See this about the former and this about the latter.}

Now those who are paying attention see a structural problem in modern science: a deterioration of the always sloppy (as with most social processes) self-correcting dynamics of institutional research. Only small-scale research into the problem has been done so far, so we do not know how broad and deep this dysfunction runs. The available evidence suggests that “large” is the most likely answer.

The stakes are almost beyond imagination. It’s not just a matter of time and money wasted when bad studies send research down blind alleys. Science is one of our best ways to see the world, and effective public policy requires reliable research on scores of subjects, from health care to climate change. Trillions of dollars, the world’s rate of economic growth, and the health of billions can be affected.

Actions and resistance

Talk precedes action, and there have been several high-level conferences about this crisis, such as the February 2014 workshop by the Subcommittee on Replicability in Science, part of the Advisory Committee to the NSF Directorate for Social, Behavioral, and Economic Sciences. It produced a typically thorough report: “Social, Behavioral, and Economic Sciences Perspectives on Robust and Reliable Science”.

Journalists describe the replication crisis as “Whig history”: another step in the inevitable evolution and perfection of science. They seldom mention the scientists — and science institutions — resisting reforms, making the outcome uncertain (here’s an example in social psychology). This hidden side of the crisis is described by David Funder (Professor of Psychology, UC-Riverside) at his website.

David Funder

It’s not just – or even especially – about psychology. I was heartened to see that the government representatives saw the bulk of problems with replication as lying in fields such as molecular biology, genetics, and medicine, not in psychology. Psychology has problems too, but is widely viewed as the best place to look for solutions since the basic issues all involve human behavior.

It makes me a bit crazy when psychologists say (or sometimes shout) that everything is fine, that critics of research practices are “witch hunting,” or that examining the degree to which our science is replicable is self-defeating. Quite the reverse: psychology is being looked to as the source of the expertise that can improve all of science. As a psychologist, I’m proud of this.

Backlash and resistance.

This issue came up only a couple of times and I wish it had gotten more attention. It seemed like nobody at the table (a) denied there was a replicability problem in much of the most prominent research in the major journals or (b) denied that something needed to be done. As one participant said, “we are all drinking the same bath water.” … {But} there will be resistance out there. And we need to watch out for it.

…One of Geoff Cumming’s graduate students, Fiona Fidler, recently wrote a thesis on the history of null hypothesis significance testing {NHST}. It’s a fascinating read and I hope will be turned into a book soon. One of its major themes is that NHST has been criticized thoroughly and compellingly many times over the years.  Yet it persists, even though – and, ironically, perhaps because – it has never really been explicitly defended!  Instead, the defense of NHST is largely passive.  People just keep using it.  Reviewers and editors just keep publishing it; granting agencies keep giving money to researchers who use it.  Eventually the critiques die down.  Nothing changes.

That could happen this time too.  The defenders of the status quo rarely actively defend anything. They aren’t about to publish articles explaining why NHST tells you everything you need to know, or arguing that effect sizes of r = .80 in studies with an N of 20 represent important and reliable breakthroughs, or least of all reporting data to show that major counter-intuitive findings are robustly replicable.   Instead they will just continue to publish each others’ work in all the “best” places, hire each other into excellent jobs and, of course, give each other awards.  This is what has happened every time before.

Things just might be different this time.  Doubts about statistical standard operating procedure and the replicability of major findings are rampant across multiple fields of study, not just psychology.  And, these issues have the attention of major scientific studies and even the US Government.  But the strength of the resistance should not be underestimated.
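Funder’s point about NHST can be made concrete with a small simulation (our illustration, not from his post). Under a true null hypothesis p-values are uniformly distributed, so we can model a researcher who measures five outcomes and reports whichever one clears p &lt; .05:

```python
import random

random.seed(0)
n_studies, n_outcomes, alpha = 100_000, 5, 0.05

# Under the null, each outcome's p-value is Uniform(0, 1).
# A study "finds an effect" if ANY of its outcomes clears alpha.
hits = sum(
    min(random.random() for _ in range(n_outcomes)) < alpha
    for _ in range(n_studies)
)
print(f"False-positive rate reporting the best of {n_outcomes} outcomes: "
      f"{hits / n_studies:.1%}")
# Analytically: 1 - (1 - 0.05)**5, roughly 23%, not the nominal 5%.
```

The same arithmetic applies to trying several analyses, subgroups, or stopping rules on a single outcome: the flexibility is what makes nominal significance levels misleading.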

Other signs of pushback to the replication crisis

Institutions seldom reform without a fight, a surprise only to those who believe journalists’ “Whig history” of science.

Conclusions

“But what a weak barrier is truth when it stands in the way of an hypothesis!”
— By Mary Wollstonecraft in A Vindication of the Rights of Woman (1792).

This just touches on the many dimensions of the replication crisis. For example, there is the large and growing literature about the misuse of statistics — and the first steps to understanding the various causes of replication failure (almost certainly from structural issues, perhaps common to many or all sciences today).
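One recurring form of statistical misuse is headline effect sizes from small samples. Take Funder’s hypothetical above of r = .80 with N = 20: even at face value, such a result is far less precise than the headline number suggests, as a quick Fisher-z confidence interval shows (a sketch for illustration):

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via the Fisher z-transform."""
    z = math.atanh(r)                  # transform to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)        # standard error on the z scale
    return tuple(math.tanh(z + sign * z_crit * se) for sign in (-1, 1))

lo, hi = fisher_ci(0.80, 20)
print(f"r = .80, N = 20  ->  95% CI [{lo:.2f}, {hi:.2f}]")
# Roughly [0.55, 0.92]: consistent with anything from a moderate
# to a near-perfect correlation.
```

And that interval assumes an unbiased estimate; with publication selection for significance, small-N effect sizes are also inflated upward.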

We can only guess at how many of the sciences have serious problems with replication — and the methodological problems that produce it. This might be one of the greatest challenges to science since the backlash to Darwin’s theory of evolution. Depending on the extent of the problem and the resistance of institutions to reform, it might become the largest challenge since the Roman Catholic Church’s assault in the 16th and 17th centuries, when it put the works of famous scientists (e.g., Copernicus, Kepler, Galileo) on the Index Librorum Prohibitorum. But this time the problems are within science, not external to it.

The likely (but not certain) eventual result is reforms that strengthen the institutions of science, but the crisis might have severe side effects. America has long had a rocky relationship with science, from the 1925 Scopes “Monkey Trial” about evolution to the modern climate wars. With our confidence in our institutions low and falling, news about replication failures in “settled science” might erode the public’s willingness to trust scientists. How many replication failures like the rise and fall of fears about non-celiac gluten sensitivity can occur without consequences? The damage might take long to heal.

Many sciences are vulnerable, but climate science might become the most affected. It combines high visibility, a central role in one of our time’s major public policy questions, and a frequent disregard for the methodological safeguards that other sciences rely upon.

Watch for the next developments in this important story.

How science goes wrong

To learn more about the crisis of science

Some early articles about the crisis

  1. An early warning that something was amiss: “Problems With Null Hypothesis Significance Testing (NHST)” by Jeffrey A. Gliner et al in The Journal of Experimental Education, 2002 — “The results show that almost all of the textbooks fail to acknowledge that there is controversy surrounding NHST.”
  2. “Why Most Published Research Findings Are False”, John P. A. Ioannidis, Public Library of Science Medicine, 30 August 2005.
  3. “Who’s Afraid of Peer Review?”, John Bohannon, Science, 4 Oct 2013 — “A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals.”
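The core of Ioannidis’s 2005 argument (#2 above) is simple arithmetic: combine a base rate of true hypotheses with typical power and the p &lt; .05 threshold, and the share of published “significant” findings that are real can fall near a coin flip. A sketch with illustrative numbers (the prior and power are our assumptions, not figures from the paper):

```python
def ppv(prior, power, alpha=0.05):
    """P(a claimed effect is real | the test was significant at alpha)."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Suppose 10% of tested hypotheses are true and studies run at 50% power:
print(f"PPV = {ppv(prior=0.10, power=0.50):.0%}")  # 53%
# Nearly half of "significant" findings would then be false, with honest
# statistics, no bias, and no fraud at all.
```

Bias and flexible analysis (the “researcher degrees of freedom” Ioannidis describes) only push the number lower.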

A few good articles about the crisis

  1. “The Truth Wears Off” by Jonah Lehrer in The New Yorker, 13 December 2010 — “Is there something wrong with the scientific method?” Gives some powerful examples.
  2. “Replication studies: Bad copy” by Ed Yong in Nature, 16 May 2012 — “In the wake of high-profile controversies, psychologists are facing up to problems with replication.”
  3. “How science goes wrong: Scientific research has changed the world. Now it needs to change itself”, The Economist, 19 October 2013.
  4. An excellent intro to the subject: “The Replication Crisis in Psychology” by Edward Diener and Robert Biswas-Diener, NOBA, 2016.
  5. “Big Science is broken” by Pascal-Emmanuel Gobry at The Week, 18 April 2016.
  6. “What does research reproducibility mean?” by Steven N. Goodman et al in Science Translational Medicine, 1 June 2016.

Some of the many papers about the replication crisis

  1. “Statistical errors in medical research – a review of common pitfalls” by Alexander M. Strasak et al, Swiss Medical Weekly, 27 January 2007 — “Standards in the use of statistics in medical research are generally low. A growing body of literature points to persistent statistical errors, flaws and deficiencies in most medical journals.”
  2. “What errors do peer reviewers detect, and does training improve their ability to detect them?” by Sara Schroter et al in the Journal of the Royal Society of Medicine, 1 October 2008 — Showed massive failure of peer review on deliberately flawed papers submitted to the British Medical Journal.
  3. “Reliability of ‘new drug target’ claims called into question”, Brian Owens, Nature, 5 September 2011 — An internal study at Bayer found that in only 14 of 67 target-validation projects did results match the published findings. These projects covered the majority of Bayer’s work in oncology, women’s health, and cardiovascular medicine over the preceding 4 years. See the paper: “Reliability of ‘new drug target’ claims called into question”, Asher Mullard, Nature Reviews Drug Discovery.
  4. “Academic bias & biotech failures” at Life Sci VC, 28 March 2011 — “The unspoken rule is that at least 50% of the studies published even in top tier academic journals – Science, Nature, Cell, PNAS, etc… – can’t be repeated with the same conclusions by an industrial lab.”
  5. “Believe it or not: how much can we rely on published data on potential drug targets?”, Florian Prinz et al, Nature Reviews Drug Discovery, Sept 2011.
  6. “In cancer science, many “discoveries” don’t hold up”, Reuters, 28 March 2012 — About Amgen’s study, “Drug development: Raise standards for preclinical cancer research” by C. Glenn Begley and Lee M. Ellis in Nature, 28 March 2012. They tested 53 “landmark” papers about cancer; 47 could not be replicated.
  7. “Weak statistical standards implicated in scientific irreproducibility” by Erika Check Hayden, Nature, 11 November 2013 — “One-quarter of studies that meet commonly used statistical cutoff may be false.” About “Revised standards for statistical evidence” by Valen E. Johnson in PNAS, 26 November 2013.
  8. “Estimating the reproducibility of psychological science” by the Open Science Collaboration, Science, 28 August 2015. Part of The Reproducibility Project: Psychology of the Open Science Foundation.
  9. “Questions About Questionable Research Practices in the Field of Management” by George C. Banks et al, Journal of Management, January 2016. See the HuffPo article by one of the authors: “How and Why Scientists Fudge Results, and What We Can Do About It”.
  10. “Records found in dusty basement undermine decades of dietary advice” by Sharon Begley at STAT, 12 April 2016 — Powerful but unpublished studies decisively refuted the consensus belief about the dangers of animal fats. They probably went unpublished because they contradicted the ruling paradigm. The NYT also covered this. See these two papers in the British Medical Journal: “Re-evaluation of the traditional diet-heart hypothesis: analysis of recovered data from Minnesota Coronary Experiment (1968-73)”, 12 April 2016 — and “Use of dietary linoleic acid for secondary prevention of coronary heart disease and death: evaluation of recovered data from the Sydney Diet Heart Study (1966-73) and updated meta-analysis”, 5 February 2013.
  11. “1,500 scientists lift the lid on reproducibility” by Monya Baker, Nature, 25 May 2016 — “Survey sheds light on the ‘crisis’ rocking research.”
  12. “Replication initiatives will not salvage the trustworthiness of psychology” by James C. Coyne at BioMed Central (peer-reviewed, open access), 31 May 2016.
  13. “Is Most Published Research Really False?”, Jeffrey T. Leek and Leah R. Jager, Annual Review of Statistics and Its Application, March 2017.

Sources of on-going information

  1. List of replication attempts in psychological research. Many failed.
  2. Investigating Variation in Replicability: A “Many Labs” Replication Project by the Open Science Collaboration. See a summary at National Geographic.
  3. The master website for anyone interested in this subject: Retraction Watch.

For More Information

For more information see all posts about experts (our reliance on and trust of them), especially these…

Let's do it -- for Science!

45 thoughts on “The replication crisis in science has just begun. It will be big.”

    1. It’s disappointing to see that Baumeister’s piece is pigeonholed as nothing more than “resistance to reforms.” I agree it seems reactionary at times, but it makes some excellent points.

      Knee-jerk mea-culpa, OMG-type approaches to reform do run the risk of discouraging important, excellent exploratory research; and our reform efforts might also make the “rich richer and poor poorer,” in scientific terms, by making it difficult for all but the best-funded (i.e., oldest, wealthiest, had the most initial undergrad and grad opportunities, etc.) researchers to be successful in the field.

      As others have noted, exploratory research, low-powered studies, etc. are not the problem, in themselves, so let’s not effectively demonize or disincentivize them. Rather, let’s push for clear distinctions between exploratory and confirmatory research, with pre-registration for the latter, and with norms for publishing data and code, and discussing research decisions more transparently, all around.

    2. Darrin,

      This post is 2300 words. My typical post is 1,000 words because readership falls off pretty quickly after that. My point was that journalists misreport this as a “Whig History” of inevitable progress. There is resistance to reform, and these institutional forces might prevent substantial changes. Length prohibited more comment about this.

      As for Baumeister’s “Charting the future of social psychology on stormy seas: Winners, losers, and recommendations“, it is accurately described as “resistance to reform.” Like all such, he believes things are fine today, imagines a wide range of ill effects from reforms — and believes that only small reforms should be made. That you agree with some of his points does not mean his response is not “resistance to reform”.

      “My position is that the field has actually done quite well in recent decades, and improvement should be undertaken as further refinement of a successful approach, in contrast to the Cassandrian view that the field’s body of knowledge is hopelessly flawed and radical, revolutionary change is needed.”

  1. Science is at the point of diminishing returns. A. Bell and Edison and Pasteur could make groundbreaking discoveries with a small lab and a few assistants. Now it takes millions and a lab full of people to make incremental advances, which have little practical application. The Greeks thought rational philosophy was the greatest human accomplishment, Romans thought law, Europe in the middle ages thought theology. We think science. The future will believe in something else.

    1. dashui,

      “Now it takes millions and a lab full of people to make incremental advances, which have little practical application.”

      That’s clearly false, on both counts. Massive breakthroughs can be made in the social sciences without “millions” (they’re now where the hard sciences were centuries ago). Second, your statement about “little practical application” is quite astounding on the brink of another industrial revolution — with innovations in genetics, AI, power systems, and nanotechnology looming ahead.

  2. Thanks for this post and especially all the links to documentation. The question is: will responsible climate scientists pay attention? We know the Skeptical Science pseudo-scientists will deny it.

    1. dpy6629,

      Most of these revelations came from inside the various fields of science, although often from junior or very junior members. My guess (emphasis on guess) is that is no longer likely in climate science. If so, changes will be like those in nutrition — occurring only decades later. Most likely when the weather has had the last word.

      In nutritional science, they discovered that they had had the correct answers all along — but preferred not to see them.

  3. “Massive breakthroughs can be made in the social sciences without “millions” (they’re now where the hard sciences were centuries ago).”

    And yet the feeling persists that something is amiss. To wrangle the last secrets of matter, we need a billion-dollar accelerator; hundreds of millions are poured into massively complex pharmaceutical research; fundamental research now requires machines of exquisite sophistication; climate science requires massive supercomputers and we don’t seem any closer to a clear picture; and finally, Moore’s law is slowly but surely coming to an end.

    Everywhere I look, be it in politics, law, industry, science, etc., I see systems of ever-increasing complexity escaping our control. This crisis of science is a good case: the number of publications is so large that it is beyond the capacity of any single institution to check their validity, specialization makes it increasingly difficult to find someone competent to judge the quality of a paper, and verification requires costly effort in time and money for no substantial reward. The result is not very surprising.

    1. zi,

      “To wrangle the last secrets of matter,”

      Again, all of your examples are in the physical sciences. Look to the social sciences, which are centuries behind in evolution — and in which a lone researcher can make massive theoretical advances, and small teams can produce landmark research.

      “Everywhere I look,be it in politics, law, industry, science etc, I see systems of ever-increasing complexity escaping our control.”

      That’s nostalgia. Those things were never under our control. The very operation of our social, political, and economic systems was not understood (little reliable theory, almost no accurate data). Nature itself was a massive random element, sweeping large fractions of the population off the board without warning.

    2. Oh, researchers are chugging along just fine, the goals are getting more ambitious, that’s all. Engineered biological systems, better understanding of physiology, neurobiology, development of useful transgenic tools, AI, quantum computing – it’s pretty crazy stuff. Plenty of less exciting things down the chain like new materials, ongoing improvements in good old semiconductors, everyday electronics, photonics, carbon, power systems, information systems, at last we have semi-useful natural language interaction with machines (and machine translation), semi-intelligent tools (robots if you will) for more and more applications. And yes, plenty of dysfunction as well, that’s just human nature. There’s big money in all this after all.

    3. Pete,

      “researchers are chugging along just fine, the goals are getting more ambitious, that’s all.”

      The first clause is quite bizarre in the face of the evidence shown that research is not “chugging along just fine”. A large and growing body of scientists disagree with you.

      Can you provide any evidence for your second clause? It sounds unlikely. In 1687 Newton titled his magnum opus quite grandly as “Mathematical Principles of Natural Philosophy”. In 1936 Keynes titled his magnum opus the “The General Theory of Employment, Interest and Money”, again with almost megalomaniac confidence.

    4. Second one first.

      Newton and Keynes were theorists, or at least that’s what I know them for primarily. Newton especially is a great example.

      Your article, and the important points it brings up, are about the experimental side of science. I’ll try to explain why I think this distinction is relevant.

      In the time of Euclid or Archimedes, who I think are up there among Newton’s predecessors, the distance to the sun or moon was estimated using sticks, shadows, and human cleverness. (Archimedes did one better, he got to work naked in a bath :-) )

      Newton himself already had it a bit harder for data- some of his laws came from many years of observations by Kepler.

      Einstein and Bohr, who could be his successors, each came up with theory to understand hundreds or possibly thousands of person-years of experimental work.

      Today the more ambitious works of the physical sciences are looking at things like studying the genome, measuring the cocktail of countless proteins and chemicals in individual people and animals, the study of gravitational waves, atoms cooled to a tiny fraction of a degree above absolute zero to study quantum mechanics, new inventions in telescopes, microscopes, you name it.

      The massive and still unsuccessful efforts to make fusion work is an example of an engineering project, but that too relies on building blocks which if not fundamental is at least ambitious.

      Perhaps you are right about the grand-ness of the theory, but I think compared to the data that fed Newton, present day experiments are reaching for more. (Given the resources of our world and the apparent difficulty in finding useful work for people to do, why not?)

      Social sciences are different, they don’t have the advantages of being able to apply so much hardware and technology to control experimental conditions. In that sense, their work is harder. They also don’t have so much distance between the experimental work and the theory and the funding…

  4. Of all the signs pointing to a mounting replication crisis in science, I don’t see how the causal link between cholesterol and heart disease is one. The ‘sugar conspiracy’ article is an example of how PR tactics are used to obfuscate scientific knowledge and create controversy where there really is none. The good ole “saturated fat isn’t bad, it’s sugar that’s worse than cocaine!” line is nothing new, and has been used to sell fad diets for decades (and market products like eggs as health food). That’s not to say nutritional science doesn’t have serious problems, but an article that bad isn’t the best window into understanding them.

    1. “Of all the signs pointing to a mounting replication crisis in science, I don’t see how the causal link between cholesterol and heart disease is one.”

      Well here is a list of medical studies comparing cholesterol levels in the blood and longevity!

      http://vernerwheelock.com/179-cholesterol-and-all-cause-mortality/

      In country after country, those with high blood cholesterol lived longest!

      The idea that saturated fat was harmful was started by Ancel Keys (an easy name to GOOGLE!) who cherry picked data from 7 countries out of a much larger data set to ‘prove’ his contention! I stake my life on this because I now eat saturated fat and don’t take statins.

      The only reason that people aren’t up in arms about the science crisis – the waste, the damage to people’s health – the pollution of the scientific record – is that they all assume that the areas they haven’t looked into are just fine!

      Many years ago as a post doc, I was concerned about some very serious problems with the apparatus we used. I was told these would be fixed *after* two PhD students had finished collecting data for their theses! At that point I moved from chemistry into software development.

      I have watched the growing corruption of science ever since.

    2. gettingmoresceptical,

      There’s pretty convincing evidence that saturated fat causes atherosclerosis. Researchers as far back as 1908 were able to artificially induce the disease in rabbits and monkeys by feeding them on a diet of egg yolks (see Anichkov’s work, another easy name to GOOGLE!). Over the next 108 years, there’s been a lot more evidence demonstrating that relationship. The literature is vast, and I think the link between SF and atherosclerosis is at this point very well demonstrated.

      I wanted to say that I see the SF – atherosclerosis link as analogous to the smoking – lung cancer connection (i.e. there is a solid body of evidence that threatens an industry and a concerted effort by that industry to obfuscate the evidence). The PR tactic of tobacco companies was to create doubt about the link between smoking and lung cancer. They didn’t have to convince anyone that smoking was healthy, they just had to create enough controversy so that people would throw up their hands and say “might as well smoke because no one knows anything”. There is a lot of nutritional research that is funded by food industry groups, usually with the aim of demonstrating the health benefits of eating their product (and countering research demonstrating otherwise).

  5. The articles you linked that I read were very good — the Guardian one excluded. It was like finding a basket full of gold nuggets and one turd. That (Guardian) article didn’t describe mechanisms of dysfunctional research any more than a tobacco industry PR job would accurately describe problems in lung cancer research.

    1. somegreatnotion,

      Regarding the relationship between heart disease and SF/cholesterol, you might find this analysis of the well known Framingham study interesting: “Framingham follies” by Michael Eades, MD. They didn’t find the expected correlations at all! This illustrates a big part of the problem – that myths get ingrained in science, and are then very hard to shift, because people bury difficult evidence for a whole lot of all-too-human reasons!

      I’m not a biochemist or physiologist, I am merely trying to point out that most of the people objecting to chunks of received medical wisdom, are not doing so without serious evidence!

      My only personal stake in this is that while I watch my sugar intake, I eat as much saturated fat as I like, and I don’t take a statin (partly because I once did, and suffered very unpleasant side effects), so I don’t even have my cholesterol measured.

    2. David,

      I suggest caution about drawing critical conclusions from one guy’s blog post — someone not even an expert in the field he critiques. Dr Michael Eades is co-author of Protein Power, a diet book that looks like a knock-off of Dr Atkins’ faddish diet. Not a strong qualification to evaluate one of the largest and most respected health care studies.

      For more about the Framingham Heart Study see Wikipedia and their website.

  6. A few anecdotal thoughts from the inside (I’m a Ph.D. student in Microbiology).

    One key issue that is rarely discussed (albeit constantly on the minds of those who work in the field) is the effect of hyper competitiveness in Biomedical Sciences. The increases in the NIH budget in the 90s and early 2000s expanded the number of trainees (Grad Students and Post Docs) in the field greatly. Unfortunately, the availability of senior positions has not kept pace.

    The result of this is a constant drive to produce by junior members of the field. Junior faculty members pressure those who work under them to drop their n-values (i.e., replicate number). Controls that a reviewer isn’t likely to ask for aren’t run, even if the grad student actually driving the project forward believes them to be important. If a reagent from one supplier gives a positive result, who cares if the same reagent from supplier B does not! A post doc may not even mention it to his/her superior; it could mean literally months and tens of thousands of dollars down the drain. Heck, I’ve seen it cost someone her job. To sum it up succinctly in a sickly ironic mantra, people in the building where I work often say “don’t think, just do.” Such a motto does not breed reliable Science, and in my mind contributes to the replication crisis more than anyone wants to admit.

    The solution? I am at a loss for a simple one. Like the problems discussed in the articles above, only systemic and painful changes are likely to work long-term. Though it might take the pressure off and produce more reliable results, no one wants to relax graduation requirements. There are only so many professorships at top-tier institutions (if everyone could work at Harvard, no one would want to work at Harvard). There is often a stigma attached to getting a job in “industry,” and while jobs are available the number of positions does not equal the number of applicants. Finally, no one wants to admit fewer Grad Students. We are just too cheap and work too hard.

    1. Paul,

      Thanks for the link. It’s always useful to see how scientists see the problem. Color me skeptical about Briggs’ analysis. As a good 21st C American, he identifies the problems as coming from outsiders. Scientists are pure as snow, perhaps even victims. Got to love “papers that professors were forced to write” — by other professors, who control the tenure-granting system.

      He moans about money not being spent according to his priorities, as if expanding the number of students in college had no justification.

      Also, governments provide money for science, and they are going to have a large voice in how it’s spent. You might as well complain about the sun rising. They’re not going to let Prof Briggs personally decide. But this takes the cake as pure delusion.

      “We have nitwit, avaricious, power-hungry politicians running around telling scientists that “The science is settled!” This isn’t only in global warming, where devilish politicos are scheming to prosecute troublesome truth tellers, but in any matter sexual.”

      Climate scientists created the “science is settled” meme (the exact origin of the specific phrase is debated). Gavin, James Hansen, Phil Jones and a score of other climate scientists pushed this story. Climate activists on the Left echoed them. Politicos then picked it up.

  7. Thank you for a very interesting blog. I will argue that the “replication crisis” is evidence of great strength in science, and that the seeming consensus in “climate science” is in fact evidence of weakness of intellectual content. And yes, scientific papers have always had errors.

    What seems to be missing from the discussion is some idea of what separates “science” from technology or engineering. Science can be practiced for free and by anyone with a brain. Most of what is presented to the public as science is not at all scientific, though it may be highly technical, or merely complicated like many nutrition studies.

    The idea that science can only be practiced with massive expenditure demonstrates a lack of understanding of science, as does the statement that the “science is settled.” “Theory” plays a totally different role in science from that in technology like engineering or most industrial chemistry.

    In engineering, theory is used to design things like buildings and bridges. Theory is a guide to what we know about materials. Theory plays a similar role in chemistry. For technology in general, theory is used to understand how the universe works, or more accurately, theory defines what we understand about how the universe works.

    But in science, theory is used to define the limits of our understanding. Thus in science, theory presents a barrier that is to be demolished, and proven wrong, or at least in error.

    The arbitration of “truth” in science depends on the outcome of direct experiments in the context of conflicting theories. One would prefer such studies to be isolated in a laboratory, but controlled field studies can work just as well. Relativity predicts the orbit of the planet Mercury better than Newton’s theory. Some other theory may prove even better.

    So here is an example of the scientific method at work. Let’s say that your theory is that temperature at each point in space, in particular on your back porch at 9 in the morning, is uniquely defined and readily measurable. My theory is that any actual measurement of temperature will always be a multivariate function of the type of thermometer and the coupling of the thermometer into the local air mass and radiation environment, and that any given measurement may vary from another nominally identical measurement by 10 degrees F.

    So I propose an experiment. We go to a hardware store and buy as many different thermometers as we can afford, and at least four of one kind. We also buy a block of polystyrene foam and enough mounting hardware so that we can observe the different thermometers all within a few inches of each other in the identical environment. We can calibrate the thermometers in the kitchen sink with ice water and on the stove in a pot of water.

    Now, what do you predict to be the outcome of this experiment for A) different thermometers mounted on the same surface of the support block, or B) different orientations of the thermometers (taking different samples of the local radiation environment)? Recognize that if they do not all read the same, you must favor my theory.

    Anyone who participates in my experiment will discover that temperature measurement is a complicated business, and the notion of averaging multiple thermometer readings does not describe the complexity of atmospheric radiation exchange, or tell us what is happening in the lower troposphere. And if you get in your car and transport the apparatus into the countryside, you may gain understanding of the nature of the urban heat island effect.
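    The prediction above can be sketched with a small simulation. The bias and radiative-coupling numbers below are made up for illustration, not measured values; the point is only that nominally identical instruments with different orientations can disagree by several degrees:

    ```python
    import random

    random.seed(42)

    TRUE_TEMP_F = 68.0  # the supposedly "uniquely defined" back-porch temperature

    def read_thermometer(instrument_bias, radiative_coupling):
        """One reading: true temperature plus an instrument bias plus a
        radiation-exposure term that depends on mounting orientation."""
        solar_loading = random.uniform(0.0, 10.0)  # degrees F of sun exposure
        noise = random.gauss(0.0, 0.5)             # small random sensor noise
        return TRUE_TEMP_F + instrument_bias + radiative_coupling * solar_loading + noise

    # Four nominally identical thermometers with (hypothetical) different
    # calibration biases and orientations toward the sun.
    readings = [read_thermometer(bias, coupling)
                for bias, coupling in [(-0.5, 0.2), (0.3, 0.8), (1.0, 0.5), (-1.2, 1.0)]]
    spread = max(readings) - min(readings)
    print(f"readings: {[round(r, 1) for r in readings]}, spread: {spread:.1f} F")
    ```

    Under these assumptions the readings never agree, and the spread depends as much on the radiation environment as on the air temperature itself.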

    1. You see much bigger problems in the area of pharmaceutical research:
      http://www.nature.com/nrd/journal/v10/n9/full/nrd3439-c1.html
      Scientists from Bayer found that 75% of the published results could not be reproduced.
      http://www.nature.com/nature/journal/v483/n7391/full/483531a.html
      Only 6 out of 53 papers could be replicated.

      Then there’s this one:
      http://science.sciencemag.org/content/342/6154/60.full
      A member of Science’s staff wrote a spoof paper, riddled with scientific and statistical errors, and sent 304 versions of it to a range of peer-reviewed journals. It was accepted for publication by more than half of them.

      That is not even considering the unfavourable results of new pharmaceuticals that are filed away to never again see the light of day, or are “recut” to make it look like the researcher was looking at something else.
      These effects are being addressed, but slowly.

    2. MindBody,

      I agree that the problems in biomedical research are among the most serious aspects of the replication crisis – and the most surprising, considering they have the strictest standards and review mechanisms.

      Thank you for the papers. This post listed the Bayer paper, but not the other two. I’ll add them to the list of cites. I had read the 2013 Science article about the spoof paper — it is one of their best ever!

  8. I’d like to add this link to a video by the Nobel Prize-winning physicist Ivar Giaever: “Global Warming Revisited” at the 2015 Lindau Nobel Laureate Meetings.

    I’d love to hear him debate against any of the ‘climate scientists’!

    Abstract

    I resigned from the society in 2011 because of the following statement from the American Physical Society:

    “The evidence is incontrovertible: Global warming is occurring. If no mitigating actions are taken, significant disruptions in the Earth’s physical and ecological systems, social systems, security and human health are likely to occur. We must reduce emissions of greenhouse gases beginning now.”

    First: nothing in science is incontrovertible. Second: the “measured” average temperature increase in 100 years or so, is 0.8 Kelvin. Third: since the Physical Society claim it has become warmer, why is everything better than before? Fourth: the maximum average temperature ever measured was in 1998, 17 years ago. When will we stop wasting money on alternative energy?

    1. Getting More Skeptical,

      Why do you put “climate scientists” in scare quotes? Even if Giaever is right on all points, that doesn’t mean climate scientists are not valid scientists. The history of science has many instances of fields taking wrong turns. It is, after all, a social activity — and so quite fallible. The value of science is that it is eventually self-correcting, not that it is always-correct revealed truth.

  9. My interest in this topic is purely personal. I have no professional qualifications at all in statistical research, other than a few basics that I remember from my undergraduate university courses. My interest in this started when I was trying to decide how much confidence I could have in projecting the opinion percentages from popular opinion polls onto the target populations. Intuitively, I suspected “zero,” but I’ve been trying to clarify my reasons for that, and to get some ideas from some researchers in survey methodology.

    After a few days of poking around, I finally came up with some fruitful search terms, like “researcher degrees of freedom” for example, and in the process I learned about what people are calling a “replication crisis.”

    I agree with people who are saying that it might be unfair, and even backwards, to single out psychology in discussing the problem. It even seems to me to do credit to the field, that it was in the front lines of publicizing the problem. I also agree with people who are treating it at least partly as an ethical issue. I don’t think it’s all, or even mostly, in the ethics of researchers. I think it’s part of a widespread and growing moral and spiritual crisis in all of society, affecting all professions and all the institutions of society.

    I’ll say what I think is an indispensable part of the solution, that anyone who wants to can do to help:

    1. Practice and promote continual self-improvement for the benefit of all people everywhere.

    2. Help with the growth and spread of healthy community life at the grass roots level, in every corner of the world and every corner of society.

    1. Jim,

      “how much confidence I could have in projecting the opinion percentages from popular opinion polls onto the target populations. Intuitively, I suspected “zero,”

      That’s quite false. To take the best funded examples, national presidential election polls for the popular vote taken on the eve of the election are almost always within their margins of error. That was true in 2016. See the numbers here.
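      For context, the margin of error that pollsters quote comes from a standard sampling formula. A minimal illustration, using a made-up poll of 1,000 respondents with 52% support:

      ```python
      import math

      def margin_of_error(p, n, z=1.96):
          """95% margin of error for an estimated proportion p from n respondents."""
          return z * math.sqrt(p * (1 - p) / n)

      # Hypothetical poll: 52% support among 1,000 respondents.
      moe = margin_of_error(0.52, 1000)
      print(f"+/- {moe * 100:.1f} points")  # roughly +/- 3.1 points
      ```

      This only captures sampling error; non-sampling problems (question wording, non-response bias) are why election polls, which can be checked against actual results, are the best-validated case.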

      “I agree with people who are saying that it might be unfair, and even backwards, to single out psychology in discussing the problem.”

      Who is doing so? The replication crisis started in biomedicine — and its implications there are far more important than in psychology. Dietary and nutritional science also has severe problems, with large effects.

    2. Larry, sorry. I should have specified, the opinion polls I had in mind are about religious views. I’m aware that there’s some reason for some confidence in election polls, and possibly some other kinds of polls, if there are any others that can be validated empirically. Comparing election polls to actual election results makes it possible and rewarding to refine the methodology to make it progressively more accurate. With opinion polls about religious views, or any other opinions, which have never been tested empirically, I don’t see any grounds at all for trusting their confidence intervals to include the actual population percentages. I’m not sure they would even claim, themselves, that their confidence intervals apply to the opinion percentages.
