Publication bias refers to the phenomenon whereby the strength or direction of the findings in a study influences whether it will be published1. (In this article, the terms significant and nonsignificant are used to describe the strength of a study's findings. To describe the direction of a study's findings, the terms positive, negative, neutral, and nonpositive are used. See the Materials and Methods section for a complete definition of these terms.) With the medical community's increasing reliance on meta-analysis as the so-called gold standard in clinical decision-making, the systematic exclusion of certain research from the literature is a concern. The overrepresentation of positive studies in the literature could, for example, result in overly favorable impressions of clinical interventions and adversely affect patient care2-5.
Several prior investigations have suggested that studies with positive6,7 or statistically significant8-13 results are more likely to reach full publication. A recent study in the Cochrane Database of Systematic Reviews examined the fate of abstracts presented at scientific meetings and found that eventual publication was significantly associated with the reporting of positive findings (relative risk, 1.17 [95% confidence interval, 1.02 to 1.35]; p = 0.03) and the reporting of significant results (relative risk, 1.30 [95% confidence interval, 1.14 to 1.47]; p = 0.00004)14.
In the field of orthopaedic surgery, a follow-up of 318 abstracts presented at the Annual Meeting of the American Academy of Orthopaedic Surgeons found that a positive outcome (odds ratio, 1.62 [95% confidence interval, 1.01 to 2.59]; p < 0.05) and significance (odds ratio, 2.05 [95% confidence interval, 1.24 to 3.39]; p = 0.005) were significantly associated with eventual publication15. However, a smaller study by the same investigators, of abstracts presented at the Annual Meeting of the Australian Orthopaedic Association, did not find a significant association between study findings and eventual publication16.
Among articles published in orthopaedic journals, positive findings are common. In particular, two recent studies found positive outcomes to account for 66% (200) of 301 articles and 75% (420) of 559 articles published in leading orthopaedic journals17,18. While the specific reasons for this finding are unknown, many believe that it may be due to the preferential publication of positive studies by orthopaedic journals. The existence of such a belief is pernicious, as it may discourage the authors of nonpositive studies from submitting their research for publication.
Recently, Lynch et al. conducted a study of manuscripts submitted to The Journal of Bone and Joint Surgery (American Volume)19. They found that studies with negative outcomes were no more likely to be accepted than were positive studies despite the fact that the negative studies tended to be of higher quality. While no scientific factor was found to be predictive of acceptance, two nonscientific factors (commercial funding and United States location) were associated with acceptance for publication. While that study had many important strengths, it did not include a power analysis, did not control for confounders, and was restricted to manuscripts about hip and knee arthroplasty. As such, it remains unclear whether publication bias exists in the orthopaedic journal editorial review process.
The purpose of this study was to investigate factors associated with acceptance for publication by The Journal of Bone and Joint Surgery (American Volume). In particular, we sought to determine whether positive findings rendered a manuscript submitted to The Journal of Bone and Joint Surgery more likely to be accepted for publication, after controlling for a number of scientific factors including the level of evidence, prospectivity, blinding, presence of a control group, sample size, and subspecialty field.
Eligibility Criteria
All 1181 clinical and basic-science manuscripts submitted to The Journal of Bone and Joint Surgery between January 1, 2004, and June 30, 2005, for publication as scientific articles were reviewed. (This sample includes 209 manuscripts on adult reconstruction that were recently reported in the study by Lynch et al.19.) Eight manuscripts for which abstracts were not available were excluded. Since the system used to classify study findings applied only to studies in which an item was evaluated (see below), all 318 studies not featuring an evaluation of some kind were excluded. This left 855 manuscripts available for analysis. Manuscript review was conducted retrospectively.
Ethics
Beginning on January 1, 2004, all authors submitting work to The Journal of Bone and Joint Surgery for publication were informed that "The Journal shall have the right to use (and to permit others to use) the Data in reviewing and/or editing the Work and for any other purpose other than the creation or publication of any other work based exclusively on the Data." The review and analysis of submitted manuscripts is covered by this statement.
Direction of Study Findings
For each submitted manuscript, The Journal of Bone and Joint Surgery provided a blinded abstract labeled only with the article title, the manuscript number, and the article type (clinical or research). Blinded abstracts did not include any information on the final disposition (acceptance or rejection), author characteristics (including names, institution, or country), or any other identifying information. All abstracts were analyzed by two blinded reviewers (K.O. and C.T.M.), with discrepancies resolved by a third blinded reviewer (M.B.). Reviewers classified the findings of each study as positive, negative, neutral, or not applicable. Studies that favored the experimental item over the existing standard of care, or otherwise arrived at a favorable conclusion regarding the item being evaluated, were graded as having positive findings. Conversely, studies that favored the existing standard of care over the experimental item, or otherwise arrived at an unfavorable conclusion regarding the item being evaluated, were graded as negative. Studies that judged the experimental item to be as good as the existing standard of care, or otherwise arrived at a neutral conclusion regarding the item being evaluated, were graded as neutral. All other studies, including those in which no value judgment was made, were categorized as not applicable. In this report, negative and neutral studies are together referred to as nonpositive.
Other Manuscript Characteristics
From the blinded abstracts (described above), study type was classified as clinical or nonclinical. Additionally, the orthopaedic subspecialty field of each manuscript was recorded as adult reconstruction hip, adult reconstruction knee, foot and ankle, hand and wrist, pediatric orthopaedics, practice management, rehabilitation medicine, shoulder and elbow, sports medicine and arthroscopy, spine, musculoskeletal tumor and metabolic disease, trauma, or basic science.
For each clinical study, a level of evidence was assigned by an individual with advanced training in clinical epidemiology (K.O.), according to the guidelines published by The Journal of Bone and Joint Surgery20. The reliability of this assessment has previously been reported21. In addition, the presence or absence of prospectivity, blinding, and controlling, as well as sample size, were assessed for all studies by this same individual.
Results of the Editorial Review Process
The final disposition (acceptance or rejection) was reported by The Journal of Bone and Joint Surgery in spreadsheet form and was recorded for each manuscript. This information was recorded by one investigator (K.O.) only after the direction of study findings had been assessed for all manuscripts. Similarly, the initial grade received by each manuscript was provided by The Journal of Bone and Joint Surgery and was recorded. At The Journal, manuscripts are graded as A (accept), B (accept with revisions), C+ (revise and resubmit), or C (reject).
Power Analysis
We decided a priori that an absolute difference of 10% in acceptance rates between positive and nonpositive studies would be considered meaningful. On the basis of the results of a previous study22, we estimated that positive studies would outnumber nonpositive studies by a 4:1 margin (80% compared with 20%). On the basis of information provided by The Journal of Bone and Joint Surgery23, we estimated that 20% of the submitted manuscripts would be accepted. From these values, we calculated that a 10% absolute difference in acceptance rates would correspond to acceptance rates of 22% and 12% for positive and nonpositive studies, respectively (since a 4:1 weighted average of 22% and 12% equals 20%). Using a two-sided significance level of 0.05 and 80% power, we calculated that 785 manuscripts would be needed to detect a significant difference between acceptance rates of 22% and 12%. The actual number of manuscripts analyzed (855) exceeded this target by approximately 9%.
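The arithmetic behind this power analysis can be sketched as follows. The source does not state which sample-size formula was used, so the choice here (a pooled-variance normal approximation with a continuity correction, applied to a 4:1 allocation) is an assumption, and the function name `two_proportion_sample_size` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_sample_size(p_large, p_small, ratio, alpha=0.05, power=0.80):
    """Return (n_large_group, n_small_group) for comparing two proportions.

    ratio is the allocation ratio (larger group : smaller group).
    ASSUMPTION: pooled-variance normal approximation with a continuity
    correction; the article does not say which formula the authors used.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    delta = abs(p_large - p_small)
    # Overall proportion, weighted by the allocation ratio.
    p_bar = (ratio * p_large + p_small) / (ratio + 1)
    null_term = z_alpha * sqrt((ratio + 1) * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(
        p_large * (1 - p_large) + ratio * p_small * (1 - p_small)
    )
    n_uncorrected = (null_term + alt_term) ** 2 / (ratio * delta ** 2)
    # Continuity correction inflates the uncorrected size of the smaller group.
    correction = (1 + sqrt(1 + 2 * (ratio + 1) / (n_uncorrected * ratio * delta))) ** 2
    n_small = ceil(n_uncorrected / 4 * correction)
    return ratio * n_small, n_small


# Check that a 4:1 weighted average of 22% and 12% gives the assumed 20% overall rate.
overall_rate = (4 * 0.22 + 1 * 0.12) / 5

n_positive, n_nonpositive = two_proportion_sample_size(0.22, 0.12, ratio=4)
total_needed = n_positive + n_nonpositive
print(round(overall_rate, 2), n_positive, n_nonpositive, total_needed)
```

Under these assumptions the sketch arrives at 628 positive and 157 nonpositive manuscripts, for a total of 785. Other formulas (for example, without the continuity correction) give smaller totals, so the agreement with the reported target should be read as suggestive rather than as confirmation of the method used.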
Data Analysis
Reviewer agreement in the grading of study outcome was assessed by means of reliability analysis with use of the intraclass correlation coefficient (2,1). In the multivariate analysis, multiple logistic regression was used to adjust for all variables simultaneously. Associations were estimated by odds ratios and 95% confidence intervals. The level of evidence was defined for clinical studies only. P values were not adjusted for multiple comparisons, and a p value of <0.05 was considered significant. All tests were two-sided. Statistical analysis was performed with use of SAS (version 9; SAS, Cary, North Carolina), Stata (version 9; StataCorp, College Station, Texas), and SPSS (version 15.0; SPSS, Chicago, Illinois).
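For readers unfamiliar with the intraclass correlation coefficient (2,1), the sketch below computes it from a subjects-by-raters table with use of the standard two-way random-effects, absolute-agreement, single-rater formula. The numeric coding of study direction (1 for positive, 0 for neutral, -1 for negative) and the toy ratings are hypothetical; the source does not report the coding or the raw ratings.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings is a list of rows, one per subject, each holding one score per rater.
    """
    n = len(ratings)        # number of subjects (here, abstracts)
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Partition the total sum of squares into subjects, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# HYPOTHETICAL toy data: two reviewers grading six abstracts
# (1 = positive, 0 = neutral, -1 = negative). Not the study's data.
toy_ratings = [[1, 1], [1, 0], [0, 0], [-1, -1], [1, 1], [-1, 0]]
print(icc_2_1(toy_ratings))
```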
Of the 1181 manuscripts submitted to The Journal of Bone and Joint Surgery (American Volume) for publication as scientific articles between January 1, 2004, and June 30, 2005, 855 manuscripts, including 636 clinical studies and 219 nonclinical studies, met the inclusion criteria. The level of evidence was generally low, with more than half (53.8%; 342) of the 636 clinical studies labeled as Level-IV evidence (Table I).
Interobserver reliability for the determination of study findings was acceptable (intraclass correlation coefficient, 0.74). Among the 855 studies that featured an evaluation of some kind, 72.5% (620) were positive compared with 15.2% (130) that were neutral and 12.3% (105) that were negative. On the whole, 21.8% (186) of 855 manuscripts gained acceptance for publication, while 78.2% (669) were rejected (Table II).
Among the 235 manuscripts with nonpositive findings, the acceptance rate was 23% (fifty-four studies), which included 21% (twenty-two) of 105 negative studies and 25% (thirty-two) of 130 neutral studies. Positive studies were accepted at a rate of 21.3% (132 of 620). The difference between nonpositive and positive studies was not significant (odds ratio, 1.10 [95% confidence interval, 0.77 to 1.58]; p = 0.593). In the multivariate analysis, the direction of study findings was not a significant predictor of manuscript acceptance (odds ratio, 0.92 [95% confidence interval, 0.62 to 1.35]; p = 0.652) (Table III).
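The unadjusted comparison can be checked directly from the counts given above (54 of 235 nonpositive and 132 of 620 positive manuscripts accepted). The sketch below uses the log odds ratio with a Wald (Woolf) standard error; the source does not name the method used for the unadjusted interval or p value, so this choice is an assumption, and the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_wald(events_a, total_a, events_b, total_b, level=0.95):
    """Odds ratio of group A vs. group B with a Wald interval and two-sided p value.

    ASSUMPTION: Woolf's log-odds standard error; not confirmed by the article.
    """
    a, b = events_a, total_a - events_a   # group A: accepted, rejected
    c, d = events_b, total_b - events_b   # group B: accepted, rejected
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(log_or) / se))
    return exp(log_or), exp(log_or - z_crit * se), exp(log_or + z_crit * se), p_value


# Nonpositive (54 of 235 accepted) vs. positive (132 of 620 accepted).
or_est, ci_low, ci_high, p_val = odds_ratio_wald(54, 235, 132, 620)
print(f"OR {or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), p = {p_val:.3f}")
```

With these counts the sketch gives an odds ratio of 1.10 with a 95% confidence interval of 0.77 to 1.58, in agreement with the unadjusted values reported in the text. The adjusted odds ratio of 0.92 comes from the multiple logistic regression and cannot be reproduced from the marginal counts alone.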
In the multivariate analysis of all manuscripts, the level of evidence was significantly associated with acceptance for publication. In particular, studies with Level-III or IV evidence were less likely to be accepted than studies with Level I or II (odds ratio, 0.40 [95% confidence interval, 0.22 to 0.70]; p = 0.001). Study type, prospectivity, blinding, controlling, and sample size were not found to be significant predictors of publication in the multivariate analysis of all manuscripts (Table III).
Subgroup analyses were conducted by level of evidence. Among the 156 manuscripts with a high level of evidence (I or II), the acceptance rate was 5% (one) of twenty negative studies compared with 37% (thirty-four) of ninety-two positive studies and 36% (sixteen) of forty-four neutral studies (p = 0.018). No significant differences in rates of acceptance were found among studies with a low level of evidence (III or IV) or nonclinical studies (data not shown).
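The subgroup comparison among Level-I and II manuscripts can be examined with a Pearson chi-square test on the three-by-two table of direction against disposition. The source does not state which test produced the reported p value, so the use of an uncorrected chi-square test here is an assumption, and the function name is hypothetical.

```python
from math import exp


def chi_square_three_groups(accepted, totals):
    """Pearson chi-square for three groups by accepted/rejected (2 degrees of freedom).

    ASSUMPTION: uncorrected Pearson test; the article does not name the test used.
    """
    if len(accepted) != 3 or len(totals) != 3:
        raise ValueError("the closed-form p value below holds only for three groups")
    overall_rate = sum(accepted) / sum(totals)
    stat = 0.0
    for acc, tot in zip(accepted, totals):
        expected_acc = tot * overall_rate
        expected_rej = tot - expected_acc
        stat += (acc - expected_acc) ** 2 / expected_acc
        stat += ((tot - acc) - expected_rej) ** 2 / expected_rej
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return stat, exp(-stat / 2)


# Negative, positive, neutral: accepted counts and group sizes from the text.
stat, p_val = chi_square_three_groups(accepted=[1, 34, 16], totals=[20, 92, 44])
print(f"chi-square = {stat:.2f}, p = {p_val:.3f}")
```

Under this assumption the test statistic is approximately 8.0 and the p value approximately 0.018, in line with the reported value. Because only one of twenty negative studies was accepted, an exact test might be preferred in practice; that alternative is not evaluated here.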
This observational study of manuscripts submitted to The Journal of Bone and Joint Surgery (American Volume) found no evidence of publication bias in the editorial review process, as studies with positive and nonpositive findings were accepted at similar rates. In subgroup analysis of studies with a high level of evidence (I or II), negative studies were less likely to be accepted for publication than were positive or neutral studies. However, the perils of subgroup analysis are well known24, and one must use caution when interpreting results such as these. Nevertheless, these findings warrant further exploration in future studies.
Our failure to detect an association between the direction of study findings and acceptance for publication is consistent with results recently reported by Lynch et al.19. However, our use of a power analysis suggests that this failure to detect a significant difference was not due to a small sample size. In addition, the fact that we controlled for a wide range of covariates decreases the likelihood that the observed results were unduly influenced by confounding. Finally, the fact that all subspecialty fields of orthopaedics were included in our study may enhance the generalizability of our conclusions.
Two prior studies have examined publication bias as it relates to the strength of study findings (as opposed to direction)25,26. Both were in the field of general medicine, and neither detected a significant difference in acceptance rates. The main findings of those prior studies are compared in Table IV.
In the multivariate analysis, a higher level of evidence was significantly associated with acceptance for publication. This finding is consistent with the results previously reported by Lee et al.25 (Table IV). Nevertheless, Level-IV studies accounted for more than half (56%; seventy-four) of all 131 clinical manuscripts accepted for publication during the study period. This is consistent with previous research by Obremskey et al., who found that 58% of the clinical studies published in The Journal of Bone and Joint Surgery between January and June 2003 were Level IV27.
Prospectivity, blinding, and controlling were not predictive of acceptance in the multivariate analysis, which may have been due in part to residual confounding by level of evidence. Sample size was also not found to be associated with acceptance. Sample size sufficiency (i.e., power) is certainly the more relevant concern, but this factor could not be examined, given that the vast majority of manuscripts did not include a power analysis. Sample size was utilized as a surrogate for sufficiency, under the assumption that larger studies would be more likely to be adequately powered. However, it remains unclear whether study power is associated with acceptance for publication.
The results of our investigation must be considered within the context of its study design. As noted above, our study benefits from the fact that it had a large sample size, was adequately powered, and controlled for a number of potential confounders. Since the majority of manuscripts submitted to The Journal of Bone and Joint Surgery during the study period were included, our results may be generalizable to the publication of orthopaedic research broadly, at least in a general orthopaedic surgery journal.
Our study has several limitations. First, it was conducted retrospectively. Although reviewers were blinded to the final disposition of each manuscript when assessing study findings, it is possible that they were nevertheless aware of certain studies that had or had not been published on the basis of their reading of the literature, which could have introduced bias. Some may consider the fact that the direction of study findings was assigned on the basis of subjective assessment of author conclusions to be a limitation of our study. However, this approach has been successfully used by us22 as well as other investigators6,7,19 in the past. The assessment of study findings was done in duplicate by two independent reviewers with discrepancies resolved by a third independent reviewer, and interobserver reliability was acceptable (0.74). While interobserver reliability was not excellent (>0.90), this is unlikely to have affected our conclusions, given that the acceptance rates of positive and nonpositive studies were quite similar in this study. The fact that study findings and other scientific variables were assessed on the basis of abstracts alone may also be considered a shortcoming of our study. However, this approach has been used successfully in the past by our research group22 as well as others6,7,12,13,28-30. The fact that we chose not to adjust for multiple comparisons in conducting exploratory analyses may be considered a limitation because of an increased risk of detecting false positives; however, in making this choice, we followed the example set by previous authors of similar studies, including Olson et al.26 and Lee et al.25. Furthermore, the one factor that emerged as a significant predictor in the multivariate analysis (level of evidence) was highly significant (p = 0.001) and would have remained so even if a Bonferroni correction had been used.
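The Bonferroni argument above amounts to comparing the observed p value with the significance level divided by the number of comparisons. The sketch below shows that comparison; the source does not state how many comparisons were made, so the values of `n_comparisons` used here are hypothetical.

```python
def bonferroni_significant(p_value, n_comparisons, alpha=0.05):
    """Return True if p_value stays below the Bonferroni-adjusted threshold."""
    return p_value < alpha / n_comparisons


# p = 0.001 for level of evidence (from the text); comparison counts are HYPOTHETICAL.
for m in (5, 10, 20):
    print(m, bonferroni_significant(0.001, n_comparisons=m))
```

A p value of 0.001 remains below the adjusted threshold for any number of comparisons smaller than fifty, since 0.05 divided by fifty equals 0.001.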
While it is likely that nonpositive studies are being disproportionately lost before reaching full publication, it does not appear that these losses are taking place at the editorial review stage. Instead, many of these losses may be occurring before manuscripts are submitted for publication. Surveys administered to researchers who did not publish their findings have generally indicated that the failure to publish is primarily due to investigator-based factors8,12,31,32. In a recent follow-up of unpublished orthopaedic studies, for example, Sprague et al. found that 59.1% (seventy-one) of 120 manuscripts had never been submitted to a journal for publication33. While the reasons for this nonsubmission are many, one of the most common is the belief that the study will not be accepted for publication32,33.
In orthopaedic surgery, decisions regarding the evaluation and treatment of disease are increasingly being made on the basis of information contained within the published literature. As such, factors that inappropriately influence its content have the potential not only to compromise the reliability of the literature but also to adversely affect patient care. Orthopaedic journals have a responsibility to select manuscripts for publication on the basis of research quality and scientific merit, without being influenced by other factors. However, researchers also have a responsibility to ensure the integrity of the orthopaedic literature. By failing to submit studies with nonpositive findings for publication, investigators may be indirectly acting to create skewed impressions of clinical interventions.
Recently, the International Committee of Medical Journal Editors began requiring the prospective registration of all clinical trials at or a similar database34. The registration of all trials at the time of inception may make it easier, in the future, to follow up on studies that do not reach full publication in a timely manner. While this measure is certainly an important first step in promoting the dissemination of negative outcomes, it is by no means a panacea. Orthopaedic surgeons must join medical journals in working to protect the integrity of the literature, for the advancement of science as well as the benefit of their patients. In particular, orthopaedic researchers should submit negative and neutral studies for publication, confident that the likelihood of acceptance will not be influenced by the direction of study findings. Orthopaedic surgeons should also strive to conduct studies with a higher level of evidence, not only because they are more likely to be accepted but also because they provide better evidence for the care of our patients.
Note: The authors thank Dr. David Zurakowski, Cathy Griffin, Laurie Lagasse, Sheila Marshall, David Seo, Andrew Tye, Chen Xie, and Jennifer Darrah for their assistance in the preparation of this manuscript.