The patron saint of meta-analysis, Richard Peto, who helped bring the approach back from what we used to dismiss as ‘unplanned pooling,’ stressed approaches that would avoid bias and error, including selection of endpoints before the trials are analyzed and a reasonably high level of statistical significance.1
Nissen and Wolski said they used the Peto method for their original meta-analysis of rosiglitazone (Avandia; GlaxoSmithKline).2 So, how did they measure up?
As noted in my January 27, 2014 post, Concerns at FDA: Nissen’s Avandia Analysis, Deputy Director Robert Temple, in a memo to Janet Woodcock, Director of the Food and Drug Administration’s Center for Drug Evaluation and Research, was critical of Nissen and Wolski’s2 peculiar and unblinded1 selective choices of studies (from among those that met their inclusion criteria).3 Dr. Temple also raised questions about the (improper) inclusion in their meta-analysis of the DREAM study, on which their hypothesis had been based.4 About the Nissen and Wolski meta-analysis, Dr. Temple concluded that “there was potential bias in its development (or at least the analysis was done with knowledge of what would emerge).”1
Nissen and Wolski reported doing eight different comparisons in their original meta-analysis.2 If these eight comparisons had been independent of (unrelated to) each other, the probability that they would have found a difference by chance, where none existed, would really have been 0.34,5 instead of the 0.05 that they claimed.2 In other words, instead of the probability of erroneously concluding, by chance, that there were differences between treatment with and without rosiglitazone being one-in-twenty, as they implied, the probability would be more than one-in-three.
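The 0.34 figure follows from the standard formula for the chance of at least one false positive among independent tests, 1 - (1 - alpha)^m. A minimal sketch of that arithmetic, assuming eight independent comparisons each tested at 0.05 (the function name is only illustrative):

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Chance of at least one false positive among independent tests."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid it together with probability (1 - alpha) ** comparisons.
    return 1.0 - (1.0 - alpha) ** comparisons


rate = familywise_error_rate(alpha=0.05, comparisons=8)
print(f"{rate:.2f}")  # prints 0.34
```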
Unfortunately, acute myocardial infarctions and cardiovascular deaths are linked; whatever factors affect one, affect the other. “If the comparisons are not independent, it really is impossible to compute the probability . . . .”6 The truth lies somewhere between “p”=0.05 and “p”=0.34–a useless result.
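To see why linked comparisons land somewhere between the two figures, here is a rough simulation sketch. It assumes, purely for illustration, that the eight test statistics are standard normal under the null hypothesis and share one common correlation, rho; neither the model nor the values of rho come from the rosiglitazone data.

```python
import random


def simulated_familywise_rate(rho, comparisons=8, trials=20000,
                              critical_z=1.96, seed=0):
    """Estimate the chance that any of several equally correlated tests
    crosses the two-sided 0.05 threshold when no true difference exists."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    shared_weight = rho ** 0.5         # weight on the factor common to all tests
    own_weight = (1.0 - rho) ** 0.5    # weight on each test's own noise
    false_alarms = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)
        # Each statistic has unit variance and correlation rho with the others.
        if any(abs(shared_weight * shared + own_weight * rng.gauss(0.0, 1.0))
               > critical_z for _ in range(comparisons)):
            false_alarms += 1
    return false_alarms / trials


for rho in (0.0, 0.5, 1.0):
    print(rho, simulated_familywise_rate(rho))
```

With rho = 0 the estimate should sit near 0.34, with rho = 1 near 0.05, and intermediate values of rho fall in between. Since the real correlation among the endpoints is unknown, so is the real error rate.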
One common technique is to make the threshold for statistical significance much stricter, adjusting by dividing the required “p” value for significance by the number of comparisons made–in this case, 0.05/8=0.00625.7 This is the Bonferroni adjustment.8 The Bonferroni adjustment is approximate and only appropriate if the comparisons are independent of each other.9
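As a quick check of the arithmetic, the sketch below computes the Bonferroni threshold alongside the exact per-comparison threshold for independent tests (the calculation in note 7, usually called the Šidák correction; that name is mine, not the cited sources'). The function names are only illustrative.

```python
def bonferroni_threshold(alpha: float, comparisons: int) -> float:
    """Approximate per-comparison threshold: alpha divided by the number of tests."""
    return alpha / comparisons


def sidak_threshold(alpha: float, comparisons: int) -> float:
    """Exact per-comparison threshold when the tests are independent."""
    # Solve 1 - (1 - p) ** comparisons = alpha for p.
    return 1.0 - (1.0 - alpha) ** (1.0 / comparisons)


print(f"{bonferroni_threshold(0.05, 8):.5f}")  # prints 0.00625
print(f"{sidak_threshold(0.05, 8):.5f}")       # prints 0.00639
```

The two thresholds differ only in the fourth decimal place, which is why the simpler division is so widely used.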
Some other techniques only test the individual components if the over-all comparison is significant–in other words, if (and only if) we know that there is a difference, we are entitled to find out where it is. This is known as a multiple-stage test.10
There are other techniques, each of which has its own uses and limitations.10
Nissen and Wolski used none of these.
Return for concerns about what the Nissen and Wolski meta-analysis2 left out.
© 2014 Myron Shank, M.D., Ph.D.11
1 Temple Robert. Memorandum to Janet Woodcock: Data on Rosiglitazone, August 8, 2010. http://www.fda.gov/downloads/drugs/drugsafety/postmarketdrugsafetyinformationforpatientsandproviders/ucm226066.pdf
3 Schachtman Nathan A. Learning to embrace flawed evidence–the Avandia MDL’s Daubert opinion. Schachtman Law January 10th, 2011. http://schachtmanlaw.com/learning-to-embrace-flawed-evidence-the-avandia-mdls-daubert-opinion/.
4 As I pointed out, “It is never appropriate to test a hypothesis with the same data that was used to generate it.” I also noted that this was inexcusable behavior from a statistician, such as Dr. Nissen’s co-author, Kathy Wolski. For the statistically literate, I pointed out that excluding DREAM’s statistical “degrees of freedom,” which were improperly included, would have been sufficient, by itself, to have eliminated any pretense of statistical significance for Nissen and Wolski’s meta-analysis.
6 Anonymous. The multiple comparisons problem. GraphPad Statistics Guide http://www.graphpad.com/guides/prism/6/statistics/index.htm?stat_the_problem_of_multiple_compar.htm. Accessed January 26, 2014.
7 The exact calculation is 1.00 − 0.95^(1/8) = 0.0064 (the exponent is 1/8, not 8; this is the 8th root of 0.95).
8 Anonymous. Glossary of statistical terms: “Bonferroni adjustment,” The Institute for Statistics Education. http://www.statistics.com/index.php?page=glossary&term_id=611. Accessed January 26, 2014.11 Again, this is inexcusable for a statistician, such as Kathy Wolski.
9 Anonymous. Bonferroni. Simple Interactive Statistical Analysis http://www.quantitativeskills.com/sisa/calculations/bonhlp.htm. Accessed January 26, 2014.
10 Anonymous. The GLM procedure: multiple comparisons. SAS/STAT User’s Guide. http://v8doc.sas.com/sashtml/stat/chap30/sect35.htm. Accessed January 26, 2014.
11 I served: in the speaker program for Takeda Pharmaceuticals America, Inc. and Eli Lilly and Company, Chicago, Illinois, August 7, 1999; as a consultant for Takeda Pharmaceuticals America, Inc. and Eli Lilly and Company, “Current Update and Discussion of the Glitazone Class of Oral Antidiabetics,” January 3, 2001; on Avandia Regional Advisory Panels in Cleveland, Ohio (February 4, 1999), Toledo, Ohio (September 15, 1999), San Juan, Puerto Rico (February 2000), and Beverly Hills, California (April 2000); as a preceptor for SmithKline Beecham, Lima, Ohio (June 23-25, 1999); and in a “Meet-the-Specialist,” SmithKline Beecham, Lima, Ohio (2000). In 2000, SmithKline Beecham merged with GlaxoWellcome to form GlaxoSmithKline. Before it became impractical to continue prescribing rosiglitazone (Avandia, Avandamet, Avandaryl; GlaxoSmithKline), I prescribed both it and pioglitazone (ACTOS, ACTOSPlusMet, DuetAct; Takeda).