Whether environmental modellers are wrong
[This is a comment invited by Issues in Science and Technology as a reply to the article “When All Models Are Wrong” in their Winter 2014 issue. The article is not online there but has been archived by thefreelibrary.com. My comment will appear in the Spring 2014 issue.]
I was interested to read Saltelli and Funtowicz’s article “When All Models Are Wrong”1, not least because several people sent it to me due to its title and mention of my blog2. The article criticised complex computer models used for policy making – including environmental models – and presented a checklist of criteria for improving their development and use.
As a researcher in uncertainty quantification for environmental models, I heartily agree we should be accountable, transparent, and critical of our own results and those of others. Open access journals — particularly those accepting technical topics (e.g. Geoscientific Model Development) and replications (e.g. PLOS One) — would seem key, as would routine archiving of preprints (e.g. arXiv.org) and (ideally non-proprietary) code and datasets (e.g. FigShare.com). Yet academic promotion and funding structures directly or indirectly penalise these activities, even though they would improve the robustness of scientific findings. I also enjoyed the term “lamp-posting”: examining only the parts of models we find easiest to see.
However, I found parts of the article somewhat uncritical themselves. The statement “the number of retractions of published scientific work continues to rise” is not particularly meaningful. Even the fraction of papers retracted is difficult to interpret, because an increase could be due to changes in time lag (retraction of older papers), detection (greater scrutiny, e.g. RetractionWatch.com), or relevance (obsolete papers not being retracted). It is not currently possible to reliably compare retraction rates across disciplines. But in one study of scientific bias, measured by the fraction of null results reported, Geosciences and Environment/Ecology were ranked second only to Space Science in their objectivity3. It is not clear we can assert there are “increasing problems with the reliability of scientific knowledge”.
There was also little acknowledgement of existing research on the question “Which of those uncertainties has the largest impact on the result?”: for example, the climate projections used in UK adaptation4. Much of this research goes beyond sensitivity analysis, part of the audit proposed by the authors, because it explores not only uncertain parameters but also inadequately represented processes. Without an attempt to quantify structural uncertainty, a modeller implicitly makes the assumption that errors could be tuned away. While this is, unfortunately, common in the literature, the community is making strides in estimating structural uncertainties for climate models5,6.
The authors make strong statements about political motivation of scientists. Does a partial assessment of uncertainty really indicate nefarious aims? Or might scientists be limited by resources (computing, person, or project time) or, admittedly less satisfactorily, statistical expertise or imagination (the infamous “unknown unknowns”)? In my experience modellers might already need tactful persuasion to detune carefully tuned models, and consequently increase uncertainty ranges; slinging accusations of motivation would not help this process. Far better to argue the benefits of uncertainty quantification. By showing that sensitivity analysis helps us understand complex models and highlight where effort should be concentrated, we can be motivated by better model development. And by showing where we have been ‘surprised’ by too small uncertainty ranges in the past, we can be motivated by the greater longevity of our results.
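For readers less familiar with the variance-based sensitivity analysis mentioned above, the core idea can be sketched in a few lines. Everything below is an illustrative assumption rather than anything from the article or my own models: a toy three-input model, independent uniform input distributions, and a 'pick-freeze' Monte Carlo estimate of first-order Sobol indices, which show how much of the output variance each input drives on its own.

```python
import random

def sobol_first_order(model, n_inputs, n_samples=20000, seed=1):
    """Estimate first-order Sobol sensitivity indices by the
    'pick-freeze' Monte Carlo method, assuming independent
    Uniform(0, 1) inputs."""
    rng = random.Random(seed)
    # Two independent input sample matrices.
    A = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    yA = [model(x) for x in A]
    mean = sum(yA) / n_samples
    var = sum((y - mean) ** 2 for y in yA) / n_samples
    indices = []
    for i in range(n_inputs):
        # Re-run with input i 'frozen' to its values from A and
        # everything else resampled (taken from B).
        Ci = [b[:i] + [a[i]] + b[i + 1:] for a, b in zip(A, B)]
        yC = [model(x) for x in Ci]
        # Cov(Y_A, Y_C) estimates Var(E[Y | X_i]).
        cov = sum(ya * yc for ya, yc in zip(yA, yC)) / n_samples - mean ** 2
        indices.append(cov / var)
    return indices

# Hypothetical toy model: input 0 dominates, input 2 is inert.
toy = lambda x: 4.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]
print([round(s, 2) for s in sobol_first_order(toy, 3)])
```

In this toy case the first input dominates and the third has almost no effect, which is exactly the kind of result that tells a modeller where to concentrate effort.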
With thanks to Richard Van Noorden, Ed Yong, Ivan Oransky and Tony O’Hagan.
1. Issues in Science & Technology, Winter 2014.
2. “All Models Are Wrong”, now hosted by PLOS at https://blogs.plos.org/models.
3. Fanelli (2010): “Positive” Results Increase Down the Hierarchy of the Sciences, PLoS ONE 5(4): e10068.
4. “Contributions to uncertainty in the UKCP09 projections”, Appendix 2.4 in: Murphy et al. (2009): UK Climate Projections Science Report: Climate change projections. Met Office Hadley Centre, Exeter. Available from http://ukclimateprojections.defra.gov.uk
5. Rougier et al. (2013): Second-Order Exchangeability Analysis for Multimodel Ensembles, J. Am. Stat. Assoc. 108(503), 852-863.
6. Williamson et al. (2013): History matching for exploring and reducing climate model parameter space using observations and a large perturbed physics ensemble, Climate Dynamics 41(7-8), 1703-1729.
I spoke extensively with Silvio Funtowicz at the 2011 Lisbon conference I helped Silvio’s old co-author Jerry Ravetz organise. Other attendees included Steve McIntyre, Ross McKitrick, Hans von Storch and Judy Curry.
We all agreed, as does Sir Brian Hoskins, that integrated assessment models are a bit of a joke. As Sir Brian put it:
“Lock two economists in a room and two hours later you’ll have three theories of economics.”
Thanks Tamsin, very interesting read.
I seem to have read it a bit differently than you. The main point I came away with was that the authors see a dichotomy between models used for political/policy advice, which need to be simpler, more transparent, and better vetted, and those used for more esoteric research (where the other constraints you mention are less potentially disastrous). So, in terms of climate modelling, how clearly has the uncertainty inherent in the system been communicated to policy makers? How well have they been advised on the inability to model clouds or water vapour? How well has it been explained that a strong positive feedback from CO2 has been incorporated into models because the modellers couldn’t think of any other explanation for observations at the time the models were being parameterized? How well have the uncertainties associated with decade- to century-length forecasts been conveyed? And so on.
A good example of why I ask these questions can be found in the report just issued by the Australian Government’s Climate Change Authority. The executive summary is worth a read – no uncertainty expressed there at all:
Absolutely, there were many angles I could have taken but I was given a short word limit …
Liz Stephens, David Demeritt and I wrote a paper on how uncertainties could be visualised more effectively and transparently – including giving less processed simulation output (not smoothed maps or time/ensemble averages). Links: blog post and pdf.
I agree that talking about the adequacy of the processes is part of that too. The IPCC reports do lay all these out in detail, including assessments of how consistent the different lines of evidence are (the confidence statements, as opposed to the likely statements). But it also needs as many two-way conversations with policy makers as possible.
Was there a particular line or lines in that ES you were worried about? At first skim, the only parts where I’d want to double-check whether the statements were over-confident with respect to the original projections were: (a) floods, because regional precipitation projections are so variable; (b) sea level, because local effects are so important; and (c) the uncertainty in the cumulative carbon budget for 2degC. I’d also say that “Australia is likely to better adapt to projected impacts if global warming is limited to less than 2 degrees above pre-industrial levels” is not a particularly useful statement 🙂 Of course it’s easier to adapt to less climate change than more – if there is a critical threshold it’s better to define the local effects of that instead.
Sorry, here’s a no registration link for the (preprint) of that pdf. Or email me for the final typeset copy.
“In my experience modellers might already need tactful persuasion to detune carefully tuned models, and consequently increase uncertainty ranges;”
Blessed be the tactful persuaders. I’d just give them standards to work to and sack those who didn’t.
That’s how it works in engineering science anyway.
This would be a bit of a surprise as a) Integrated Assessment Models are built by other experts as well as economists and b) they are the models that have gone furthest in addressing the critical question “Which of those uncertainties has the largest impact on the result?”.
See figure 7 here, for instance http://www.jbs.cam.ac.uk/fileadmin/user_upload/research/workingpapers/wp1105.pdf
Most economists are indeed not very good at building theories of the discipline of economics, so that it would not be surprising to find multiple theories coinciding within the same layperson.
Fortunately, economists are much better at their core expertise, which is to build a theory of the economy.
Hi Chris and Richard:
I have paraphrased and attributed incorrectly, so apologies for my faulty memory. Sir Brian Hoskins’ comments came in a 10-minute segment on IAMs (integrated assessment models) starting at 01:25 here.
Sir Brian was actually responding to Quentin Cooper’s point about the two economists generating three theories. Listen from 9:30 to hear Quentin put the question and Sir Brian’s response. Hoskins points out that the economic theories used in the models have not been validated. Put that issue on top of climate models which have recently been invalidated by ‘the pause’ and “Houston, we have a problem”.
For the purposes supposed to be served by IAMs, it doesn’t really matter if the ‘missing heat is hiding’ in the deep ocean or has already left the planet; the point is it is not at the surface we inhabit, farm and wade through floodwater on, despite predictions that it would be. Observations have dropped out of the bottom of the range, and that’s that. If the climate model output used as the economic model input is crap, it doesn’t matter how good your unvalidated theory of the economy is, you’ll still get useless output. Add in the unexpected, like Putin turning off the gas tap, or UKIP winning the 2015 general election and giving the CCC and the Climate Change Act the old heave-ho, and all bets are off.
It already happened in Australia, and is coming to a bunch of European countries near you next.
“The authors make strong statements about political motivation of scientists.”
I don’t think that’s really true – there’s just a hint of it in rule 1.
Most of what they are saying is just re-iterating the basic common sense Feynman-esque rules of modelling – don’t fool yourself, spell out your assumptions, check parameter dependence, audit, transparency etc.
Apology accepted. You do make an interesting point. But it does matter if the missing heat is hiding in the oceans, because a lot of it would then be expected to come out into surface warming in the next couple of El Nino years. If it’s ‘already left the planet’ that will not be true.
I think it’s throughout:
– Rule 1: “Is it used to elucidate or to obfuscate?”
– Rule 3: “we are defining pseudoscience as the practice of ignoring or hiding the uncertainties in model inputs in order to ensure that model outputs can be linked to preferred policy choices”
– Rule 4: “What were the motives behind the use of such implausible assumptions?”
– Summary: “But if they have not been followed…have good reason to be skeptical of both the motives of the modelers and the plausibility of model outputs.”
Every now and then you see an economist blunder into climate science. The result is usually ugly.
Anthony Watts has an interesting post by Dr. Ball up.
You would be welcome to visit and add your thoughts. (At least I will appreciate your visiting.)
As a physicist who has become a climate modeler, I was wondering what you thought of Robert Brown’s post over at WUWT: http://wattsupwiththat.com/2014/05/07/the-global-climate-model-clique-feedback-loop/#more-108744
This seems like something you would have a perspective on.
I understand “Environmental Modeling” to be a subset of “Forecasting”, with the potential to have the highest economic impacts – of the order of $10 to $100 trillion.
Have “environmental modelers” even thought to incorporate the standards and practices of forecasting principles, as detailed in:
Principles of Forecasting: A Handbook for Researchers and Practitioners, J. Scott Armstrong (ed.): Norwell, MA: Kluwer Academic Publishers, 2001
May I recommend the special interest group: Public Policy Forecasting
They apply the methodology of Forecasting Audits to public policy forecasts.
I suppose we can call these IAMs a subset of dynamic systems models? If so: I’ve been a model builder, and the economics was the easy piece. The models I built were intended to have extensive weather data. Once I realized how much it would cost to get the right data coverage, and how difficult it was to model sea ice, I decided to call for an alliance. I think we met in Vancouver in 1994 to discuss the model architecture and set budgets. The project took a couple of years, and in the end we decided we lacked the ice data… so we punted. So where am I headed with this? When these models are built, the basic economics equations are easy. The hard parts are the actual data, the physical models, and the human response we have to tie into the models to mimic how we as a herd behave to events and incoming information. What I found was that we behave irrationally. This means these models behave a bit like a roulette wheel or a game of craps… oh well.
See: J. Scott Armstrong, Kesten C. Green and Andreas Graefe, “The Golden Rule of Forecasting: Be Conservative”, MPRA Paper No. 53579, 10 February 2014. http://mpra.ub.uni-muenchen.de/53579/
Presentation, June 2014 – aka “Forecast unto others as you would have them forecast unto you.”
GoldenRuleOfForecasting.com – supporting materials.
Richard Betts states clearly that climate models are not useful for policy making.
I see no reason to disagree with that clear point.
The climate obsession has morphed into a huge industry selling doom and offering a solution based on doing one thing: reducing CO2 emissions.
No cost-benefit analysis is accepted. Lord Stern’s faux economic analysis is a sour joke of historical scale, similar to Mao’s disastrous five-year plans of the 1950s.
What has this bizarre climate obsession actually done besides divert immense sums of money from more productive uses?
And it has been driven by models that don’t pass any reasonable standard.