All sectors of the developed world have witnessed unprecedented growth in information, disseminated by multiple mechanisms in innumerable formats. Medical information has been drastically affected by this phenomenon. While the first medical journals date back to the end of the 18th century (The Medical Repository, the first medical journal published in the United States, was founded in 1797, and the New England Journal of Medicine was founded in 1812[1]), specialty medical journals did not appear until the early 20th century, followed by the subspecialty journals later in the 20th century and more recently by a wave of open-access online peer-reviewed publications. Currently, MEDLINE/PubMed contains over 5,000 biomedical journals that cumulatively publish more than 800,000 articles per year[2].
Evidence-based medicine requires that decisions be based on the scientific evidence that emanates from this rapidly expanding information base. In the face of the constantly increasing number of peer-reviewed papers, this imperative creates the need to summarize articles on the same topic published across diverse journals. In fact, literature review is the first step scientists take to sharpen their scientific questions and understand what has been accomplished in the field. The challenge of summarizing the published literature is often offered to an interested fellow or resident as a good way to engage in scientific writing. Depending on the methodologic rigor of these literature summaries, they can take the form of an overview of the literature, a systematic review, or a meta-analysis.
Many medical journals, including JBJS, help readers evaluate the impact of each clinical article by assigning a level of evidence[3]. In the hierarchy of evidence levels, meta-analysis carries the noble, highest rank[4]. However, failure to appreciate the differences between a systematic review and a meta-analysis can lead to mislabeling and attendant erroneous characterization of the true value of the information presented in the paper. Even worse, authors may overuse the statistical methodology called meta-analysis[5-8].
The paper by Dijkman et al., published in the current issue of The Journal, offers a summary of "meta-analyses" related to the field of orthopaedics over twenty years. The authors conducted a comprehensive assessment of the quantity and quality of meta-analyses published in peer-reviewed journals or conducted by the Cochrane Collaboration related to the field of orthopaedics. While the authors documented an increase in the quantity of meta-analyses published in 2005 and 2008 as compared with prior years, the quality of these studies remained problematic.
Let's turn our attention to the definition of meta-analysis. A better understanding of the definition may offer insights into the persisting problems with the quality of these published studies. Webster's dictionary defines meta-analysis as "a quantitative statistical analysis of several separate but similar experiments or studies in order to test the pooled data for statistical significance." This definition assumes that certain statistical methods will be applied to a group of studies to derive a single summary statistic. Ideally, the whole emerges as better than the sum of the parts. Similar studies that may each be too small to be conclusive are summarized into a single estimate of efficacy that overcomes the limited power of each study.
However, to ensure the validity of a meta-analysis, it is critical that the meta-analysis be conducted on individual studies that each warrant a high level of evidence (such as randomized controlled trials). Further, the populations examined, the choice of controls, the details of the intervention, and the measures of outcome must be similar across all studies included in the analysis. In fact, it is critical that authors perform a formal test of heterogeneity among candidate studies to determine whether results of the studies selected for the meta-analysis describe the same underlying effect or distribution of effects. The results of the heterogeneity testing suggest whether aggregation or "pooling" is appropriate and which statistical methods to use[4].
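For readers unfamiliar with the mechanics, the pooling and heterogeneity testing described above can be sketched briefly. The example below is a minimal illustration, not a substitute for dedicated meta-analysis software: it applies inverse-variance fixed-effect pooling to hypothetical effect sizes (the study values are invented for illustration), then computes Cochran's Q statistic and the derived I-squared measure of heterogeneity.

```python
# Minimal sketch of inverse-variance fixed-effect pooling with a
# Cochran's Q test of heterogeneity. All study data are hypothetical.
import math

def pool_fixed_effect(effects, variances):
    """Pool study effect sizes by inverse-variance weighting.

    Returns the pooled effect, its standard error, Cochran's Q,
    and the I^2 heterogeneity statistic.
    """
    weights = [1.0 / v for v in variances]          # larger studies weigh more
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    # Cochran's Q: weighted squared deviations of each study from the pool
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: proportion of total variation beyond what chance (df) explains
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, se_pooled, q, i2

# Hypothetical log odds ratios and variances from three small trials
effects = [-0.30, -0.10, -0.25]
variances = [0.05, 0.08, 0.06]
pooled, se, q, i2 = pool_fixed_effect(effects, variances)
print(f"pooled = {pooled:.3f} (SE {se:.3f}), Q = {q:.2f}, I^2 = {i2:.1%}")
```

When Q is large relative to its degrees of freedom (I-squared high), the studies likely do not share a single underlying effect, and fixed-effect pooling is inappropriate; a random-effects model, or no pooling at all, should be considered instead.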
While Dijkman et al. described in detail the quality assessment of each meta-analysis, they did not highlight the proportion of studies in which such a formal heterogeneity analysis was conducted, nor the results of those analyses. Furthermore, it remained unclear whether each meta-analysis under consideration weighted its analysis by study size and/or quality score. The heterogeneity, or lack of "combinability," may be due to different inclusion and exclusion criteria that focus on populations with distinct prognostic factors, to longer or shorter follow-up in situations in which the duration of follow-up influences outcome, and to varying methods of outcome ascertainment.
The second concern lies in the quality of the individual studies. Combining poor studies and magnifying their impact by calling the results of such aggregation a meta-analysis may lead to suboptimal medical care and unjustifiable increases in health-care costs. In order to provide the highest level of evidence, a meta-analysis should be conducted on a set of combinable smaller randomized controlled trials that failed to reach conclusions due to suboptimal power. Case series, uncontrolled cohort studies, and small retrospective studies should not be considered for inclusion in a meta-analysis, as they do not individually provide a strong enough level of evidence.
In conclusion, meta-analysis is a powerful statistical methodology that should be applied carefully and with caution. Misuse of meta-analysis may lead to inappropriate conclusions and the recommendation of treatment strategies that lack rigorous examination of efficacy. While an overview of the literature is indeed a productive, enriching experience for a new member of the scientific community, a formal meta-analysis requires the wisdom and analytic sophistication that emerges from the collaboration of clinician scientists and well-trained methodologists.
*In support of their research for or preparation of this work, one or more of the authors received, in any one year, outside funding or grants in excess of $10,000 from The National Institutes of Health. Neither they nor a member of their immediate families received payments or other benefits or a commitment or agreement to provide such benefits from a commercial entity.
1. Kahn RJ, Kahn PG. The Medical Repository--the first U.S. medical journal (1797-1824). N Engl J Med. 1997;337:1926-30.
2. Detailed Indexing Statistics: 1965-2008. http://www.nlm.nih.gov/bsd/index_stats_comp.html. Accessed 2009 Oct 5.
3. JBJS: Levels of Evidence for Primary Research Question. http://www2.ejbjs.org/misc/instrux.dtl#levels. Accessed 2009 Oct 5.
4. Egger M, Smith GD, Phillips AN. Meta-analysis: principles and procedures. BMJ. 1997;315:1533-7.
5. Egger M, Schneider M, Davey Smith G. Spurious precision? Meta-analysis of observational studies. BMJ. 1998;316:140-4.
6. Egger M, Smith GD, Sterne JA. Uses and abuses of meta-analysis. Clin Med. 2001;1:478-84.
7. Goodman SN. Have you ever meta-analysis you didn't like? Ann Intern Med. 1991;114:244-6.
8. Sterne JA, Egger M, Smith GD. Systematic reviews in health care: Investigating and dealing with publication and other biases in meta-analysis. BMJ. 2001;323:101-5.