The word “significance” has many meanings. Those varied meanings actually serve the purposes of this post.
On one hand, “significance” refers to “importance.”1 On the basis of their profoundly disruptive impact,2 Nissen and Wolski’s claims certainly were important and worthy of our attention in this series of posts. In that sense, Nissen and Wolski’s meta-analyses3,4 about rosiglitazone (Avandia, Avandamet, Avandaryl; GlaxoSmithKline) had “significance.” But what about the other meanings of “significance?”
More important, for the current purposes, is the definition of “significance” as “meaning.”1 What is the meaning of Nissen and Wolski3,4? The answer to that question requires yet another definition of “significance.” I ask those readers who are either uncomfortable with, or uninterested in, statistics to briefly bear with me, as I lay the groundwork for yet more debunking of Nissen and Wolski to come.
In statistical jargon, “significance” refers to outcomes of specialized mathematical tests suggesting that differences between groups are unlikely to have occurred by chance alone. In other words, if differences between groups as large as those observed would seldom occur by chance alone, they are said to be “statistically significantly different,” or just “significant.” “Significance” in the statistical sense is no guarantee of “significance” in the sense of being meaningful, but, on the other hand, it is best to reserve judgement that results are meaningful if they could easily have been produced by chance processes. The estimate of how frequently results at least as different as those observed could have been produced by chance alone is conventionally referred to as “p” (for the “probability” of achieving them by chance alone). The threshold for those mathematical tests should be adjusted, depending upon how you want to balance the cost of possibly mistakenly accepting differences as real against the cost of possibly mistakenly dismissing them as too likely attributable to chance. Rather than bother with actually thinking about such important matters, however, most medical literature arbitrarily uses a “p” value of 0.05, which is equivalent to one chance in twenty of occurring by chance alone; in other words, odds of 19 to 1 that something other than chance was at work in producing the results.5
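The idea behind “p” can be made concrete with a small simulation. The sketch below, with entirely invented numbers (group sizes, event rates, and the observed difference are all hypothetical, not taken from any of the studies discussed here), estimates how often chance alone would produce a difference in event counts at least as large as one that was “observed”:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def simulated_p_value(observed_diff, group_size, event_rate, trials=10_000):
    """Fraction of chance-only simulations in which two identical groups
    differ in event counts by at least `observed_diff`."""
    at_least_as_large = 0
    for _ in range(trials):
        # Both groups share the SAME underlying event rate, so any
        # difference between them is produced by chance alone.
        a = sum(random.random() < event_rate for _ in range(group_size))
        b = sum(random.random() < event_rate for _ in range(group_size))
        if abs(a - b) >= observed_diff:
            at_least_as_large += 1
    return at_least_as_large / trials

# Hypothetical example: 100 patients per group, 10% event rate,
# and an "observed" difference of 10 events between the groups.
p = simulated_p_value(observed_diff=10, group_size=100, event_rate=0.1)
print(f"estimated p = {p:.3f}")
# By the usual convention, p < 0.05 would be labeled "statistically
# significant" -- the arbitrary 19-to-1 threshold discussed above.
```

Note that the simulation only estimates how surprising the difference would be under chance; it says nothing about whether the difference is meaningful, which is precisely the distinction this post draws.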
However, as those of you who are familiar with statistics already know, the most important test of “significance” is not a small “p” value, but actually replicating the results in multiple independent experimental studies (See my post, “ADOPT the DREAM of a RECORD for Avandia,” December 26, 2013.). This is especially true when the value of “p” was estimated from a meta-analysis, which is an “artificial” or “pretend” study.6
This is actually a very important matter. The reason for it may be found in something known as “Bayes Theorem.” Again, in deference to those who lack a strong statistical background (and may even want to keep it that way!), I will try to keep this very simple: starting with an estimate of how likely an idea (“hypothesis”) is to be true, you combine it with new evidence to produce a revised estimate, which makes the idea more likely or less likely than before. Rather than relying upon mere impressions, however, Bayes Theorem provides a formal way to obtain that revised estimate. The revised estimate depends upon (1) the old estimate, (2) the new estimate, (3) the strength of the old estimate, and (4) the strength of the new estimate. (S.B. Leavitt provides what may be the best explanation that I have seen, without going into the mathematics involved, of using Bayes Theorem to make medical decisions.7)
Even if one accepts on blind faith (as many people evidently have) Nissen and Wolski’s own original3 (and dubious) estimate of how unlikely it was that they would have obtained their results by chance alone, and even if one refuses to recognize (as many people evidently have) the numerous alternative explanations (biases) for their results, Nissen and Wolski still provided only a weak initial estimate of the likelihood of only a very modest effect. Combining Nissen and Wolski’s highly questionable original estimate3 with the information from additional studies makes it very likely that differences as large as those they claimed would frequently occur by mere chance, even ignoring the serious flaws in their meta-analysis, without which they could not have reached the conclusions that they did (See my post, “Nissen, Wolski, and How Not to Do Meta-Analyses,” December 19, 2013.).
Hiatt, Kaul, and Smith obtained very different results in their meta-analysis8 of exactly the same set of data as Nissen and Wolski’s original meta-analysis.3 What is more, when Nissen and Wolski added new studies to their original meta-analysis to produce their “updated meta-analysis”4 (even without correcting the serious mistakes of the original), they completely wiped out the appearance of a “statistically significant” result, no matter how much they claimed otherwise (Again, see my post, “Nissen, Wolski, and How Not to Do Meta-Analyses,” December 19, 2013.).
Others refuted Nissen and Wolski. Nissen and Wolski even refuted themselves. Their results lacked “statistical significance” and were not meaningful. In both senses, Nissen and Wolski’s prattlings were not significant.
Stay tuned for RECORD and more bad news for Nissen and Wolski on rosiglitazone.
© 2014 Myron Shank, M.D., Ph.D.
1 Anonymous. Significance. Dictionary.com. http://dictionary.reference.com/browse/significance
2 Initially on the sales, but later on the availability of rosiglitazone (Avandia, Avandamet, Avandaryl; GlaxoSmithKline).
5 In other words, (20-1) to 1.
6 The reader should not overlook the fact that Nissen and Wolski’s meta-analyses were not even performed credibly. See my post, “Nissen, Wolski, and How Not to Do Meta-Analyses,” December 19, 2013.
7 Leavitt S.B. The 6 worst words in evidence-based medicine. Pain-Topics News/Research Updates, Saturday, December 14, 2013. http://updates.pain-topics.org/2013/12/the-6-worst-words-in-evidence-based.html?showComment=1387733972654#c3773157618242887352