Karl Popper explains how to open the deadlocked climate policy debate

Summary: Many factors have frozen the public policy debate, but none more important than the disinterest of both sides in tests that might provide better evidence and perhaps restart the discussion. Even worse, too little thought has been given to the criteria for validating climate science theories (aka their paradigm) and the models built upon them. This series looks at the answers to these questions given us by generations of philosophers and scientists, which we have ignored. This post shows how Popper’s insights can help us. The clock is running for actions that might break the deadlock. Eventually the weather will give us the answers, perhaps at ruinous cost.

“Confirmations should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory — an event which would have refuted the theory.”
— Karl Popper in Conjectures and Refutations: The Growth of Scientific Knowledge (1963).

“I’m considering putting “Popper” on my list of proscribed words.”
— Steve McIntyre’s reaction at Climate Audit to a mention of Popper’s work on falsification as the hallmark of science, an example of why the policy debate has gridlocked.

This graph creates a high bar for useful predictions by climate models

Global fossil carbon emissions
From the Department of Energy’s Carbon Dioxide Information Center.

What test of climate models suffices for public policy action?

Climate scientists publish little about the nature of climate science theories. What exactly is a theory or a paradigm? Must theories be falsifiable, and if so, what does that mean? Scientists have their own protocols for such matters, and so usually leave these questions to philosophers and historians or symposiums over drinks. Yet in times of crisis, when the normal process of science fails to meet our needs, the answers to these questions provide tools that can help.

A related but distinct debate concerns the public policy response to climate change, which uses the findings produced by climate scientists and other experts. Here insights about the dynamics of the scientific process and the basis for proof can guide decision-making by putting evidence and expert opinion in a larger context.

A previous post in this series (links below) described how Thomas Kuhn’s theories explain the current state of climate science. This post looks to the work of Karl Popper (1902-1994) for advice about breaking the gridlocked public policy debate about climate change. At the end of this post is the best-known section of his work about this.

Popper said scientific theories must be falsifiable, and that prediction was the gold standard for their validation. Less well known is his description of what makes a compelling prediction: it should be “risky” — of an outcome contrary to what we would otherwise expect. A radical new theory that predicts that the sun will rise tomorrow is falsifiable by darkness at noon — yet watching the dawn provides little evidence for it. Contrast that with the famous 1919 test of general relativity, whose prediction was contrary to that of the then-standard theory.

How does this apply to climate science?

NOAA: Long-term global temperature graph
From NOAA’s interactive Climate At A Glance graphing page.

Predictions of warming

“The globally averaged combined land and ocean surface temperature data as calculated by a linear trend, show a warming of 0.85 [0.65 to 1.06] °C, over the period 1880 to 2012, when multiple independently produced datasets exist. …

“It is extremely likely that more than half of the observed increase in global average surface temperature from 1951 to 2010 was caused by the anthropogenic increase in greenhouse gas concentrations and other anthropogenic forcings together. The best estimate of the human-induced contribution to warming is similar to the observed warming over this period.”

— From the Summary for Policymakers of the IPCC’s Working Group I report of AR5.

Popper’s insight raises the bar for testing the predictions of climate models. The world has warmed since the late 19th century; anthropogenic forces became dominant only after WWII. The naive prediction is that warming will continue. This requires no knowledge of greenhouse gases or theory about anthropogenic global warming.

A risky test requires a prediction that differs from “more of the same”. Forecasts of accelerated warming late in the 21st century qualify as “risky” but provide no evidence today. Hindcasts — matching model projections vs. past observations — provide only weak evidence for the policy debate, as past data was available to the model’s developers.
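
One way to make “more of the same” concrete is to score a model against a naive baseline that knows nothing about greenhouse gases. The sketch below is only an illustration under stated assumptions: the baseline is a straight-line extrapolation of the earlier record, the score is a standard mean-squared-error skill score, and every function name and number is hypothetical (invented for the example, not real observations or model output).

```python
from statistics import mean


def linear_trend(years, values):
    """Least-squares slope and intercept of values against years."""
    y_bar, v_bar = mean(years), mean(values)
    numerator = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    denominator = sum((y - y_bar) ** 2 for y in years)
    slope = numerator / denominator
    return slope, v_bar - slope * y_bar


def naive_forecast(past_years, past_values, future_years):
    """'More of the same': extend the past linear trend into the future."""
    slope, intercept = linear_trend(past_years, past_values)
    return [slope * y + intercept for y in future_years]


def mean_squared_error(predicted, observed):
    """Average squared difference between predictions and observations."""
    return mean((p - o) ** 2 for p, o in zip(predicted, observed))


def skill_score(model_forecast, baseline_forecast, observed):
    """1 - MSE(model) / MSE(baseline).

    Positive: the model beat the naive baseline. Zero or negative: it did not.
    Undefined (division by zero) if the baseline happens to be perfect.
    """
    return 1.0 - (mean_squared_error(model_forecast, observed)
                  / mean_squared_error(baseline_forecast, observed))


# Hypothetical anomalies (deg C), invented purely for illustration.
past_years = [2000, 2001, 2002, 2003, 2004]
past_values = [0.00, 0.10, 0.20, 0.30, 0.40]
future_years = [2005, 2006]
observed = [0.45, 0.50]          # what happened afterwards (made up)
model_forecast = [0.45, 0.52]    # what the model said beforehand (made up)

baseline = naive_forecast(past_years, past_values, future_years)
print(skill_score(model_forecast, baseline, observed))
```

A score near zero would mean the model told us nothing that simple extrapolation did not. Only a clearly positive score on data unavailable to the developers starts to resemble the risky prediction Popper asks for.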

As usual in climate science, these points have been made and ignored. For example, “Should we assess climate model predictions in light of severe tests?” by Joel Katzav (Professor of Philosophy, Eindhoven University of Technology) in EOS (of the American Geophysical Union), 11 June 2011. He builds upon Popper’s call for “severe testing” in The Logic of Scientific Discovery (2005). It’s worth reading in full; here is an excerpt.

The scientific community has placed little emphasis on providing assessments of CMP {climate model prediction} quality in light of performance at severe tests. Consider, by way of illustration, the influential approach adopted by Randall et al. in chapter 8 of their contribution to the fourth IPCC report. This chapter explains why there is confidence in climate models thus: “Confidence in models comes from their physical basis, and their skill in representing observed climate and past climate changes”.

…CMP quality is thus supposed to depend on simulation accuracy. However, simulation accuracy is not a measure of test severity. If, for example, a simulation’s agreement with data results from accommodation of the data, the agreement will not be unlikely, and therefore the data will not severely test the suitability of the model that generated the simulation for making any predictions.

…It appears, then, that a severe testing approach to assessing CMP quality would be novel. Should we, however, develop such an approach? Arguably, yes …. First, as we have seen, a severe testing assessment of CMP quality does not count simulation successes that result from the accommodation of data in favor of CMPs. Thus, a severe testing assessment of CMP quality can help to address worries about relying on such successes, worries such as that these successes are not reliable guides to out-of-sample accuracy, and will provide important policy-relevant information as a result.

Conclusions

The public policy debate about climate change has gridlocked in part because many consider the evidence given as insufficient to warrant massive expenditures and regulatory changes. The rebuttal has largely consisted of “trust us” and screaming “denier” at critics. Neither has produced progress; future historians will wonder why anyone expected them to do so.

This series seeks tests that both sides can accept — that might move the policy debate beyond today’s futile bickering.

The insights of Thomas Kuhn and advice by Popper offer a possible solution: test models from the past 4 Assessment Reports using observations from our past but their future. Run them with observations made after their creation, not scenarios, so they produce predictions rather than projections, and compare their output with what was actually observed after their creation. This will produce better evidence than we have today but still might not provide the “risky” prediction necessary to warrant massive public policy action that diverts resources from other critical challenges (e.g., preparing for return of past extreme weather events, addressing poverty, avoiding destruction of ocean ecosystems).
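
The comparison step of that proposal can be sketched in a few lines. This is a minimal illustration, not anyone’s actual validation code: it assumes each archived model run has been reduced to a yearly series of predicted anomalies, keeps only the years after the model’s creation (so its developers could not have tuned to them), and reports the mean absolute error. The function names, the creation year, and all numbers are hypothetical. Re-running old models with observed inputs, which is the hard part, is not shown.

```python
def out_of_sample_errors(created_year, predicted, observed):
    """Prediction minus observation for years the model could not have seen.

    predicted, observed: dicts mapping year -> anomaly (deg C).
    Only years strictly after created_year and present in both are kept.
    """
    years = sorted(y for y in predicted if y > created_year and y in observed)
    return {y: predicted[y] - observed[y] for y in years}


def mean_absolute_error(errors):
    """Average size of the errors, ignoring their sign."""
    if not errors:
        raise ValueError("no out-of-sample years to compare")
    return sum(abs(e) for e in errors.values()) / len(errors)


# Hypothetical values, invented for illustration only.
created_year = 2001
predicted = {2000: 0.30, 2002: 0.40, 2003: 0.50}
observed = {2000: 0.30, 2002: 0.35, 2003: 0.50, 2004: 0.60}

errors = out_of_sample_errors(created_year, predicted, observed)
print(sorted(errors), mean_absolute_error(errors))
```

The year 2000 is dropped because it predates the model. Agreement there would be a hindcast, not a prediction.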

The criteria to prove current theories about climate change have received too little attention, mostly focusing on increasingly elaborate hindcasts (see this list of papers). Progress will come from better efforts to test the models, new insights from climate scientists, and the passage of time. But by themselves these might prove insufficient to produce timely policy action on the necessary scale. We should add to that list “developing better methods of model validation”.

Karl Popper
Karl Popper’s reaction to modern climate science: facepalm.

Update: a “severe test” more severe than Popper’s

In a comment at Climate Etc Willard points to a powerful analysis by Deborah Mayo: “Severe tests, arguing from error, and methodological underdetermination” in Philosophical Studies, 86 (3) 1997. There are levels of severe tests, some more severe than Popper’s. Excerpt, red emphasis added.

“Popper’s problem here is that the grounds for the “best tested” badge would also be grounds for giving the badge to countless many other (not yet even thought of) hypotheses, had they been the ones considered for testing. So this alternative hypothesis objection goes through for Popper’s account.

“This is not the case for the severity criterion I have set out. A non-falsified hypothesis H that passes the test failed by each rival hypothesis H 0 that has been considered, has passed a severe test for Popper – but not for me. Why not? Because for H to pass a severe test in my sense it must have passed a test with high power at probing the ways H can err. And the test that alternative hypothesis H 0 failed need not be probative in the least so far as H’s errors go. So long as two different hypotheses can err in different ways, different tests are needed to probe them severely.”

For More Information

Join the debate about this post at Climate Etc., the website of Judith Curry (Prof of Atmospheric Science at GA Inst of Tech): 530 comments and still running strong. Post your thoughts about this here or there.

For more information see The keys to understanding climate change, My posts about climate change, and especially these about computer models…

  1. About models, increasingly often the lens through which we see the world.
  2. Will a return of rising temperatures validate the IPCC’s climate models?
  3. We must rely on forecasts by computer models. Are they reliable?
  4. A frontier of climate science: the model-temperature divergence.
  5. Do models accurately predict climate change? — By eminent climate scientist Roger Pielke Sr.
  6. Do models accurately predict climate change?
  7. How accurate are climate scientists’ findings? Look at ocean warming.

Popper’s advice to us

Excerpt from Conjectures and Refutations: The Growth of Scientific Knowledge
by Karl Popper (1963)

After the collapse of the Austrian Empire there had been a revolution in Austria: the air was full of revolutionary slogans and ideas, and new and often wild theories. Among the theories which interested me Einstein’s theory of relativity was no doubt by far the most important. Three others were Marx’s theory of history, Freud’s psychoanalysis, and Alfred Adler’s so-called individual psychology.

There was a lot of popular nonsense talked about these theories, and especially about relativity (as still happens even today), but I was fortunate in those who introduced me to the study of this theory. We all — the small circle of students to which I belonged — were thrilled with the result of Eddington’s eclipse observations which in 1919 brought the first important confirmation of Einstein’s theory of gravitation. It was a great experience for us, and one which had a lasting influence on my intellectual development.

The three other theories I have mentioned were also widely discussed among students at that time. I myself happened to come into personal contact with Alfred Adler, and even to co-operate with him in his social work among the children and young people in the working-class districts of Vienna where he had established social guidance clinics.

It was during the summer of 1919 that I began to feel more and more dissatisfied with these three theories — the Marxist theory of history, psychoanalysis, and individual psychology; and I began to feel dubious about their claims to scientific status. My problem perhaps first took the simple form, “What is wrong with Marxism, psychoanalysis, and individual psychology? Why are they so different from physical theories, from Newton’s theory, and especially from the theory of relativity?”

To make this contrast clear I should explain that few of us at the time would have said that we believed in the truth of Einstein’s theory of gravitation. This shows that it was not my doubting the truth of those other three theories which bothered me, but something else. Yet neither was it that I merely felt mathematical physics to be more exact than the sociological or psychological type of theory. Thus what worried me was neither the problem of truth, at that stage at least, nor the problem of exactness or measurability. It was rather that I felt that these other three theories, though posing as sciences, had in fact more in common with primitive myths than with science; that they resembled astrology rather than astronomy.

I found that those of my friends who were admirers of Marx, Freud, and Adler, were impressed by a number of points common to these theories, and especially by their apparent explanatory power. These theories appeared to be able to explain practically everything that happened within the fields to which they referred. The study of any of them seemed to have the effect of an intellectual conversion or revelation, opening your eyes to a new truth hidden from those not yet initiated. Once your eyes were thus opened you saw confirming instances everywhere: the world was full of verifications of the theory. Whatever happened always confirmed it. Thus its truth appeared manifest; and unbelievers were clearly people who did not want to see the manifest truth; who refused to see it, either because it was against their class interest, or because of their repressions which were still “un-analysed” and crying aloud for treatment.

The most characteristic element in this situation seemed to me the incessant stream of confirmations, of observations which “verified” the theories in question; and this point was constantly emphasized by their adherents.

A Marxist could not open a newspaper without finding on every page confirming evidence for his interpretation of history; not only in the news, but also in its presentation — which revealed the class bias of the paper — and especially of course in what the paper did not say. The Freudian analysts emphasized that their theories were constantly verified by their “clinical observations.”

As for Adler, I was much impressed by a personal experience. Once, in 1919, I reported to him a case which to me did not seem particularly Adlerian, but which he found no difficulty in analysing in terms of his theory of inferiority feelings, although he had not even seen the child. Slightly shocked, I asked him how he could be so sure. “Because of my thousandfold experience,” he replied; whereupon I could not help saying: “And with this new case, I suppose, your experience has become thousand-and-one-fold.”

What I had in mind was that his previous observations may not have been much sounder than this new one; that each in its turn had been interpreted in the light of “previous experience,” and at the same time counted as additional confirmation. What, I asked myself, did it confirm? No more than that a case could be interpreted in the light of the theory. But this meant very little, I reflected, since every conceivable case could be interpreted in the light of Adler’s theory, or equally of Freud’s.

I may illustrate this by two very different examples of human behaviour: that of a man who pushes a child into the water with the intention of drowning it; and that of a man who sacrifices his life in an attempt to save the child. Each of these two cases can be explained with equal ease in Freudian and in Adlerian terms. According to Freud the first man suffered from repression (say, of some component of his Oedipus complex), while the second man had achieved sublimation. According to Adler the first man suffered from feelings of inferiority (producing perhaps the need to prove to himself that he dared to commit some crime), and so did the second man (whose need was to prove to himself that he dared to rescue the child). I could not think of any human behaviour which could not be interpreted in terms of either theory.

It was precisely this fact — that they always fitted, that they were always confirmed — which in the eyes of their admirers constituted the strongest argument in favour of these theories. It began to dawn on me that this apparent strength was in fact their weakness.

With Einstein’s theory the situation was strikingly different. Take one typical instance — Einstein’s prediction, just then confirmed by the findings of Eddington’s expedition. Einstein’s gravitational theory had led to the result that light must be attracted by heavy bodies (such as the sun), precisely as material bodies were attracted. As a consequence it could be calculated that light from a distant fixed star whose apparent position was close to the sun would reach the earth from such a direction that the star would seem to be slightly shifted away from the sun; or, in other words, that stars close to the sun would look as if they had moved a little away from the sun, and from one another.

This is a thing which cannot normally be observed since such stars are rendered invisible in daytime by the sun’s overwhelming brightness; but during an eclipse it is possible to take photographs of them. If the same constellation is photographed at night one can measure the distances on the two photographs, and check the predicted effect.

Now the impressive thing about this case is the risk involved in a prediction of this kind. If observation shows that the predicted effect is definitely absent, then the theory is simply refuted. The theory is incompatible with certain possible results of observation — in fact with results which everybody before Einstein would have expected.[1] This is quite different from the situation I have previously described, when it turned out that the theories in question were compatible with the most divergent human behaviour, so that it was practically impossible to describe any human behaviour that might not be claimed to be a verification of these theories.

These considerations led me in the winter of 1919-20 to conclusions which I may now reformulate as follows.

(1) It is easy to obtain confirmations, or verifications, for nearly every theory — if we look for confirmations.

(2) Confirmations should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory — an event which would have refuted the theory.

(3) Every “good” scientific theory is a prohibition: it forbids certain things to happen. The more a theory forbids, the better it is.

(4) A theory which is not refutable by any conceivable event is nonscientific. Irrefutability is not a virtue of a theory (as people often think) but a vice.

(5) Every genuine test of a theory is an attempt to falsify it, or to refute it. Testability is falsifiability; but there are degrees of testability; some theories are more testable, more exposed to refutation, than others; they take, as it were, greater risks.

(6) Confirming evidence should not count except when it is the result of a genuine test of the theory; and this means that it can be presented as a serious but unsuccessful attempt to falsify the theory. (I now speak in such cases of corroborating evidence.)

(7) Some genuinely testable theories, when found to be false, are still upheld by their admirers – for example by introducing ad hoc some auxiliary assumption, or by re-interpreting the theory ad hoc in such a way that it escapes refutation. Such a procedure is always possible, but it rescues the theory from refutation only at the price of destroying, or at least lowering, its scientific status. (I later described such a rescuing operation as a conventionalist twist or a conventionalist stratagem.)

One can sum up all this by saying that the criterion of the scientific status of a theory is its falsifiability, or refutability, or testability.

20 thoughts on “Karl Popper explains how to open the deadlocked climate policy debate”

  1. For systems controlled by many variables the problem of verification gets much worse. The phrase “all other things remaining equal…” gets used frequently even though everyone knows it’s not true, and worse, we don’t even know how much it’s not true. This aspect of non-physics endeavors (physics having few variables, all tightly controlled) can and does often permanently muddy the water. Some inquiries are intrinsically hopeless. Like they say in Monty Python and the Holy Grail: Camelot? It is a silly place. Let’s not go there.

  2. Good subject.

    The points you are making are the intellectual justifications of what I argue (unsuccessfully): I shall treat CAGW warnings seriously when the tense changes from conditionals to declaratives. When “may, might, could, should” become “will, does and shall”. The change, of course, forces verification and – threatening to the speakers – accountability.

    But I would – will – argue that what we witness is not strictly a problem of bad scientific behavior. Look to any newspaper and you will see the same: “the fall of Ramadi could be a turning point in the war”, “Chinese stock collapse may trigger global recession”, “Clinton’s latest emails might be the smoking gun”. You could say these statements reflect uncertainty or a difficulty in predicting future events. I suggest – believe – they reflect a socially negative compulsion for drama.

    We live in a constantly reinforced “climate” of tension designed to give us a feeling of importance and necessity. Today is “special”. Our climate science is “special”. By extension we, our lives, our tasks, are special.

    The drama gives meaning to the lives of the 21st century western society. The migrant crisis “may” destabilize Europe. We live in a state of alarm without solution – perhaps Europe won’t fall apart? If so, action would have been worse than just living with the anxiety? We feel alive, existing in anticipation of …. something. Actually, anything.

    It is too easy to say the artificial drama is the fault of the media, that it underpins competition for viewers. While that is true, I say it is bigger, a response to our Western life of lost meaning. We invent enemies. Tribal distrust of the Other defines one’s identity as well as focus in Afghanistan; the threat equivalent in America is Climate Change.

    Are we driven genetically to drama because physical threats used to be ever present and eternal vigilance necessary to our survival? I say – the optimist here – the drama created is an unnecessary holdover useful to the governors. We do what they want because we are alarmed. But almost all threats don’t come to pass. While ignoring potential harm is obviously not good, an over emphasis is destructive. But organizationally useful.

    CAGW is drama writ large. The cast, the volume, the rhetoric are planetary. Mann doesn’t say anything more substantial or testable than the sky is falling, but he is guaranteed the front page. Climate science that allowed – or insisted – on Popper tests and technical accountability would end Mannian drama. It would also have forestalled Bush’s invasion of Iraq.

    The promotion of verifiable knowledge over drama would, not might or should, change society. The climate change fiasco would be only one of many – opinion here – things to be impacted in a positive way. But I would not look for it right now. A compulsion for drama will only disappear if our social loss of meaning is fixed.

    1. Interesting that you mention the invasion of Iraq, I noted the same in my post:
      United Nations was supposed to solve international problems of a cultural character – not to become one!:

      Here is a famous quote in which inductivism is evident:
      “What you will see is an accumulation of facts and disturbing patterns of behavior. The facts on Iraqis’ behavior – Iraq’s behavior demonstrate that Saddam Hussein and his regime have made no effort – no effort – to disarm as required by the international community. Indeed, the facts and Iraq’s behavior show that Saddam Hussein and his regime are concealing their efforts to produce more weapons of mass destruction.”
      (Ref: US secretary of state’s address to the United Nations security council).

  3. Thank you for bringing attention to Popper. I think the core problem with the IPCC assessment reports is that they breach, in several ways, the guidelines provided by Popper. I have several posts on this at my site, Science or Fiction?

    I wish more people were familiar with his work. Regarding the excerpt (Excerpt from Conjectures and Refutations: The Growth of Scientific Knowledge), I think that is a really great excerpt which really contains the essence.

  4. During my review of Popper’s work, mainly The Logic of Scientific Discovery, I have collected some quotes which I find essential.

    I would like to share this selection with you.
    (If the comment becomes too long and you choose to delete it, you will have my full understanding, I still hope that you find it valuable.)

    “The theory to be developed in the following pages stands directly opposed to all attempts to operate with the ideas of inductive logic. It might be described as the theory of the deductive method of testing, or as the view that a hypothesis can only be empirically tested—and only after it has been advanced.”

    “But I shall certainly admit a system as empirical or scientific only if it is capable of being tested by experience. These considerations suggest that not the verifiability but the falsifiability of a system is to be taken as a criterion of demarcation. In other words: I shall not require of a scientific system that it shall be capable of being singled out, once and for all, in a positive sense; but I shall require that its logical form shall be such that it can be singled out, by means of empirical tests, in a negative sense: it must be possible for an empirical scientific system to be refuted by experience.”

    “Every scientific theory implies that under certain conditions, certain things will happen. Every test consists in an attempt to realize these conditions, and to find out whether we can obtain a counter-example even if these conditions are realized; for example by varying other conditions which are not mentioned in the theory.” “This fundamentally clear and simple procedure of experimental testing can in principle be applied to probabilistic hypotheses in the same way as it can be applied to non-probabilistic or, as we may say, for brevity’s sake, ‘causal’ hypotheses.” “Testing by experiment thus has two aspects: variation of conditions is one; and keeping constant the conditions which are mentioned as relevant in the hypothesis is another – the one aspect which interests us here. It is decisive for the idea of repeating an experiment.”

    “Tests of the simplest probabilistic hypotheses involve such sequences of repeated and therefore independent experiments – as do also tests of causal hypotheses. And the hypothetically estimated probability of propensity will be tested by the frequency distributions in these independent test sequences. (The frequency distribution of an independent sequence ought to be normal, or Gaussian; and as a consequence it ought to indicate clearly whether or not the conjectured propensity should be regarded as refuted or corroborated by the statistical test.)”

    “An experiment is thus called ‘independent’ of another, or of certain conditions, or not affected by these conditions, if and only if they do not change the probability of the result. And conditions which in this way have no effect upon the probability of the result are called irrelevant conditions.”

    “I should say that those objective conditions which are conjectured to characterize the event (or experiment) and its repetitions determine the propensity, and that we can in practice speak of the propensity only relative to those selected repeatable conditions; for we can of course in practice never consider all the conditions under which an actual event has occurred or an actual experiment has taken place.”

    “Thus in any explanatory probabilistic hypothesis, part of our hypothesis will always be that we have got the relevant list of conditions .. characteristic of the kind of event which we wish to explain.”

    “… it is always possible to find some way of evading falsification, for example by introducing ad hoc an auxiliary hypothesis, or by changing ad hoc a definition. It is even possible without logical inconsistency to adopt the position of simply refusing to acknowledge any falsifying experience whatsoever. Admittedly, scientists do not usually proceed in this way, but logically such procedure is possible.”

    “From my point of view, a system must be described as complex in the highest degree if, in accordance with conventionalist practice, one holds fast to it as a system established forever which one is determined to rescue, whenever it is in danger, by the introduction of auxiliary hypotheses. For the degree of falsifiability of a system thus protected is equal to zero.”

    “the empirical method shall be characterized as a method that excludes precisely those ways of evading falsification which … are logically possible. According to my proposal, what characterizes the empirical method is its manner of exposing to falsification, in every conceivable way, the system to be tested. Its aim is not to save the lives of untenable systems but … exposing them all to the fiercest struggle for survival.”

    “a subjective experience, or a feeling of conviction, can never justify a scientific statement, and that within science it can play no part except that of an object of an empirical (a psychological) inquiry. No matter how intense a feeling of conviction it may be, it can never justify a statement. Thus I may be utterly convinced of the truth of a statement; certain of the evidence of my perceptions; overwhelmed by the intensity of my experience: every doubt may seem to me absurd. But does this afford the slightest reason for science to accept my statement? Can any statement be justified by the fact that Karl Popper is utterly convinced of its truth? The answer is, ‘No’; and any other answer would be incompatible with the idea of scientific objectivity.”

    “All this glaringly contradicts the programme of expressing, in terms of a ‘probability of hypotheses’, the degree of reliability which we have to ascribe to a hypothesis in view of supporting or undermining evidence.”
    Note: The assessment report from IPCC is full of expressions of degree of reliability!
    See: Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties

    “The epistemological idea of simplicity plays a special part in theories of inductive logic, for example in connection with the problem of the ‘simplest curve’. Believers in inductive logic assume that we arrive at natural laws by generalization from particular observations. If we think of the various results in a series of observations as points plotted in a co-ordinate system, then the graphic representation of the law will be a curve passing through all these points. But through a finite number of points we can always draw an unlimited number of curves of the most diverse form. Since therefore the law is not uniquely determined by the observations, inductive logic is confronted with the problem of deciding which curve, among all these possible curves, is to be chosen.”

    “We may if we like distinguish four different lines along which the testing of a theory could be carried out. First there is the logical comparison of the conclusions among themselves, by which the internal consistency of the system is tested. Secondly, there is the investigation of the logical form of the theory, with the object of determining whether it has the character of an empirical or scientific theory, or whether it is, for example, tautological. Thirdly, there is the comparison with other theories, chiefly with the aim of determining whether the theory would constitute a scientific advance should it survive our various tests. And finally, there is the testing of the theory by way of empirical applications of the conclusions which can be derived from it.”

    “And although I believe that in the history of science it is always the theory and not the experiment, always the idea and not the observation, which opens up the way to new knowledge, I also believe that it is always the experiment which saves us from following a track that leads nowhere: which helps us out of the rut, and which challenges us to find a new way.”

  5. I came to this post via the re-post of your 09/21 blog at WUWT today. I got down to the 4th paragraph under “Conclusions” and experienced a disorienting moment when I realized the simple test you proposed was not, in fact, business as usual for the modeling crowd. I’ve always thought the proper way to test was develop the model with the middle 50% of the data and then run it to see if you get the two 25% tails with any accuracy. I did not realize this idea could be so revolutionary.

    1. D.J.

      That’s exactly my experience. I have extensive experience in finance, where model failure can mean unemployment — and hindtesting is the equivalent to seeing that the propellers turn on the airplane (i.e., the first stage of testing). I was astonished to discover the indifference to model validation among climate scientists and their disinterest in the large body of knowledge developed about it in other fields.

      My most important post about this is Climate scientists can restart the climate policy debate & win: test the models! See the section at the end describing the cli sci lit about testing. You already know the bottom line, so you won’t be shocked.
