7/18/2010

Statistics: The Achilles Heel of Climate Scientists

When it comes to scientific use of statistics, "Odds Are, It's Wrong." Science News wrote a great exposé of the problems that arise when statistics are used to link A with B, such as fertilizer with crop yields, dog barking with hunger and various genes with various diseases. This last one is interesting because, as in climate prediction (where the outcome is the climate of the future), hundreds of variables could simultaneously influence the outcome, so establishing a link by statistical methods must be done carefully and is very often done incorrectly.

The article gets into a bit of math, but some of the concepts offer very simple illustrations. For example, how could a drug test for baseball players that is correct 95% of the time be right about only 50% of the players it flags, with the other 50% being non-cheaters who are falsely accused? Box 4 of the article makes it clear.

Suppose an anonymous player tests positive. What is the probability that he really is using steroids? Since the test really is accurate 95 percent of the time, the naïve answer would be that probability of guilt is 95 percent. But a Bayesian knows that such a conclusion cannot be drawn from the test alone. You would need to know some additional facts not included in this evidence. In this case, you need to know how many baseball players use steroids to begin with — that would be what a Bayesian would call the prior probability. Now suppose, based on previous testing, that experts have established that about 5 percent of professional baseball players use steroids. Now suppose you test 400 players.
How many would test positive?

• Out of the 400 players, 20 are users (5 percent) and 380 are not users.
• Of the 20 users, 19 (95 percent) would be identified correctly as users.
• Of the 380 nonusers, 19 (5 percent) would incorrectly be indicated as users.
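
The arithmetic in that box can be checked with a few lines of Python. This is only a rough sketch: the function name `screen` and its parameters are made up for illustration, and it assumes, as the box does, that the test is 95 percent accurate for users and nonusers alike. Exact fractions are used so that rounding does not blur the counts.

```python
from fractions import Fraction


def screen(population, prevalence, accuracy):
    """Return (true positives, false positives, P(user | positive test)).

    Assumes the same accuracy applies to users and nonusers,
    which is how the baseball example is set up.
    """
    users = population * prevalence
    nonusers = population - users
    true_positives = users * accuracy             # users correctly flagged
    false_positives = nonusers * (1 - accuracy)   # nonusers wrongly flagged
    share_real = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_real


# Numbers from the example: 400 players, 5% users, 95% accurate test.
tp, fp, share_real = screen(400, Fraction(5, 100), Fraction(95, 100))
print(f"correctly flagged users:  {tp}")
print(f"falsely flagged nonusers: {fp}")
print(f"chance a positive test is a real user: {float(share_real):.0%}")
```

So 38 players test positive, and only 19 of them are actually users. A positive result from the "95 percent accurate" test is right just half the time.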

Author Tom Siegfried writes,
... in practice, widespread misuse of statistical methods makes science more like a crapshoot. It’s science’s dirtiest secret.
Maybe that's why every other study of coffee drinking seems to be favorable, with the rest being unfavorable. That is about what you would expect from completely random results derived from poor use of statistics. The same disputes come up with every one of a hundred variables in global warming, including water currents, how land masses affect wind currents, various aerosols in the atmosphere, volcanoes and scores of others. Focusing on just one, CO2, the claims vary: CO2 causes global warming; CO2 rises as a result of global warming; or the two rise and fall together but neither causes the other. Meanwhile, this writer notes that crystal clear nights are cooler than humid or cloudy nights. CO2 does not seem to cause as much warming as any number of other factors.

It is apt that Mark Twain popularized the saying that, going from bad to worse, you have "lies, damned lies and statistics." More than a hundred years later, we have a bigger problem: we rely on the damned statistics more than ever. Perhaps it will be another century before we really learn to use this potentially useful tool.
