Conducting a randomized clinical trial is a considerable challenge. In an ideal trial, patients would enter the study, comply with the assigned treatment, and complete the follow-up protocol, but this is rarely the case. Common problems in randomized clinical trials include patients' insistence that they receive a treatment to which they were not originally assigned or their failure to comply with the follow-up protocol—i.e., either skipping a scheduled appointment or dropping out from the study altogether. The intention-to-treat principle is intended to deal with some of these issues.
The intention-to-treat principle dictates that all patients who had been randomly allocated to treatment(s) under the auspices of the study are included in the final data analysis according to the original treatment group to which they had been randomly assigned. Thus, the patients who crossed over to another treatment and those lost to follow-up are analyzed according to their original treatment group1-9. The aim of an intention-to-treat analysis is to preserve the randomization scheme used to allocate the patients to the various treatment groups. This randomization scheme forms the theoretical basis for the validity of the statistical calculations. It is important that the conclusions of the study can be generalized to entire patient populations, not only to the individuals included in a given study.
An alternative to the intention-to-treat principle is the per-protocol analysis or "as-treated" analysis, in which patients are analyzed at the time of follow-up according to the treatment that they had actually received. The intention-to-treat analysis is sometimes referred to as an "effectiveness" analysis, whereas the per-protocol analysis is referred to as an "efficacy" analysis5,7.
In this study, we examined the use of the intention-to-treat principle in randomized orthopaedic clinical trials and investigated whether the authors had adhered to the strict definition of this principle. Special emphasis was placed on the handling of missing data—i.e., the extent to which patients were lost to follow-up and the methods used to account for them.
We conducted a literature search of randomized clinical trials published between January 2005 and August 2008, in eight leading orthopaedic journals: the American and British volumes of The Journal of Bone and Joint Surgery, Spine, Journal of Pediatric Orthopaedics, Journal of Shoulder and Elbow Surgery, The Journal of Arthroplasty, Journal of Orthopaedic Trauma, and The Journal of Hand Surgery (American volume). The selection was based on a high citation index, our wish to use journals from all orthopaedic subspecialties, our access to the journals, and the findings of a similar study on the use of levels of evidence in orthopaedic journals10. The reported use of intention-to-treat analysis was determined by reviewing the statistical methods section and searching for the string "intent" throughout the entire report of each randomized clinical trial that we found.
We then evaluated in greater depth the trials in which the authors had claimed to have used intention-to-treat analysis. We identified three principal methods of application of the intention-to-treat principle: (1) strict adherence to the intention-to-treat principle—i.e., studies in which data analysis included all randomized patients according to their original treatment allocation (i.e., the authors ignored crossovers and adjusted for missing data); (2) intention-to-treat analysis with exclusion of missing data—i.e., studies in which the data analysis was conducted according to the patient's original treatment allocation (crossovers were ignored) but only patients who completed the follow-up protocol were included; and (3) modified intention-to-treat analysis—studies in which the data analysis included only the patients who started the treatment being evaluated (e.g., those who attended at least the first session of physical therapy). Figure 1 presents a graphical definition of the three types of intention-to-treat analysis. Each clinical trial was further classified according to the nature of the interventions under study: surgery compared with nonsurgical management, two nonsurgical interventions, and two surgical interventions.
Articles that accounted for missing data were evaluated for the manner in which the missing data were adjusted. Four methods were used: (1) the last observation carried forward—i.e., the last observed value is used as a replacement for the missing observations; (2) mean/median imputation—i.e., the mean or median for the treatment group is used as a replacement for the missing observations; (3) worst outcome imputation—i.e., all missing data are replaced by the worst outcome; and (4) longitudinal regression imputation—i.e., imputation is done according to a predictive regression model based on several covariates.
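The first three adjustment methods can be illustrated with a minimal Python sketch. The function names, the use of None to mark a missed visit, and the example scores are illustrative assumptions and are not taken from any of the reviewed trials. Regression-based imputation is omitted because it depends on a fitted predictive model.

```python
from statistics import mean, median


def locf(series):
    """Last observation carried forward within one patient's visit series.

    A missing visit (None) takes the most recent observed value. Leading
    missing values stay None because there is nothing to carry forward.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def group_impute(values, how="mean"):
    """Replace missing final outcomes with the treatment group's mean or median."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if how == "mean" else median(observed)
    return [fill if v is None else v for v in values]


def worst_outcome_impute(values, worst):
    """Replace every missing outcome with the worst possible outcome."""
    return [worst if v is None else v for v in values]


# Hypothetical functional scores (higher is better); None marks a missed visit.
visits_one_patient = [40, 55, None, None]
final_scores_one_arm = [70, None, 90, 80, None]

print(locf(visits_one_patient))
print(group_impute(final_scores_one_arm, "mean"))
print(group_impute(final_scores_one_arm, "median"))
print(worst_outcome_impute(final_scores_one_arm, worst=0))
```

Note that the first function works along one patient's visits, whereas the other two work across the patients of one treatment arm at a single follow-up point.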
The proportion of patients lost to follow-up was calculated as the number of patients missing at the last follow-up point for each treatment arm, divided by the number of patients who were originally allocated to that treatment arm. The total proportion of patients lost to follow-up was calculated as the total number of patients missing at the last follow-up point divided by the total number of patients who underwent randomization.
The difference in the proportion of missing data between groups was calculated as the proportion of patients lost to follow-up in the intervention group minus the proportion of patients lost to follow-up in the control group. In trials that compared nonsurgical and surgical management, the nonsurgical group was always considered the control. For trials including two surgical (or two nonsurgical) interventions, the previous gold standard treatment was considered the control. In studies that included more than one intervention arm, the maximum difference in the proportion between any intervention arm and the control group was used.
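These calculations can be sketched as follows. The function names are hypothetical, and reading "maximum difference" as the difference of largest absolute size (with its sign retained) is an assumption. The example uses the loss counts quoted later in this article for the clavicular fracture trial: five of sixty-seven surgical patients and sixteen of sixty-five nonoperative patients.

```python
def proportion_lost(n_missing, n_allocated):
    """Proportion lost in one arm: missing at last follow-up / originally allocated."""
    return n_missing / n_allocated


def total_proportion_lost(arms):
    """Total proportion lost: all missing patients / all randomized patients.

    `arms` is a list of (n_missing, n_allocated) tuples, one per treatment arm.
    """
    return sum(m for m, _ in arms) / sum(n for _, n in arms)


def difference_vs_control(control, interventions):
    """Intervention-minus-control difference in the proportion lost.

    With several intervention arms, return the difference that is largest in
    absolute size (an assumed reading of "maximum difference").
    """
    p_control = proportion_lost(*control)
    diffs = [proportion_lost(*arm) - p_control for arm in interventions]
    return max(diffs, key=abs)


nonoperative = (16, 65)  # control arm: (missing, allocated)
surgical = (5, 67)       # intervention arm: (missing, allocated)

total = total_proportion_lost([nonoperative, surgical])
diff = difference_vs_control(nonoperative, [surgical])
print(f"total lost: {total:.3f}, difference: {diff:.3f}")
```

A negative difference means that proportionally more patients were lost from the control arm than from the intervention arm.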
The duration of follow-up reported in each trial was recorded as (1) until discharge, (2) up to six months (including six months), (3) longer than six months to one year (including one year), or (4) longer than one year. We compared the total proportion of patients lost to follow-up in each study according to the duration of follow-up.
Each randomized clinical trial was evaluated by at least two of the four authors of the present study. Our agreement regarding the above-mentioned classifications ranged from 80% to 95%. In cases in which there was disagreement, the randomized clinical trial was reviewed by all of the authors and further discussed until consensus was achieved.
Statistical analysis was performed with use of R 2.7.0 software (Vienna, Austria)11. The chi-square test was performed to compare use of the intention-to-treat principle among the selected journals. We checked for a time-dependent trend between use of the intention-to-treat principle and the year of publication (i.e., a yearly increase in the proportion of randomized clinical trials in which intention-to-treat-based analysis was used). This was done by performing a logistic regression analysis with use of the intention-to-treat principle as the dependent variable and the year of publication as the explanatory variable.
The proportion of patients lost to follow-up was reported as a mean and standard deviation. Analysis of both the total proportion and the difference in the proportion of patients lost to follow-up according to follow-up time and intervention types (e.g., surgical compared with nonsurgical treatment, two nonsurgical treatments, or two surgical treatments) was done with the Kruskal-Wallis test. All of the p values reported are two-sided.
Source of Funding
No external funding source financed this research.
We found 274 randomized clinical trials, and the intention-to-treat principle had been used in ninety-six (35%) of them (Table I). The highest proportions of studies using the intention-to-treat principle were published in the American volume of The Journal of Bone and Joint Surgery and in Spine (42% and 45%, respectively), and the difference among journals was significant (p = 0.001). There was a trend for a yearly increase in the proportion of randomized clinical trials using intention-to-treat analyses between 2005 and 2008 (p = 0.025). A strict intention-to-treat analysis was used in forty-five (47%) of the ninety-six trials, an intention-to-treat analysis with exclusion of missing data was used in forty-four (46%), a modified intention-to-treat method was used in six (6%), and the method of intention-to-treat analysis was unclear from the description in one.
The investigators excluded the patients lost to follow-up in twenty-one (72%) of the twenty-nine trials in which surgical intervention was studied. The authors used strict intention-to-treat analysis in thirty-seven (56%) of the sixty-six trials in which only nonsurgical interventions were considered, and the investigators excluded the patients lost to follow-up in twenty-three (35%) of those sixty-six trials. All six of the trials in which a modified intention-to-treat method was used involved nonsurgical intervention groups. The surgical and nonsurgical randomized clinical trials differed significantly with regard to the method of intention-to-treat analysis (p = 0.002).
No patient was lost to follow-up in seventeen (38%) of the forty-five trials in which strict intention-to-treat analysis was used. In the other twenty-eight of these trials, the authors accounted for missing data. "Last observation carried forward" was used in eighteen (40%) of the forty-five trials with use of strict intention-to-treat analysis, the mean or median for the treatment group was used to replace missing data in four (9%), a complex imputation method (regression) was used in two, and worst-outcome imputation was employed in three. One article did not specify how the missing data were accounted for.
After exclusion of three studies in which the exact number of patients lost to follow-up was not available, we determined the proportions of patients lost to follow-up, according to treatment arm, in each randomized clinical trial in which the intention-to-treat principle had been used. We found that 13.2% had been lost from the clinical trials that compared surgical and nonsurgical interventions; 14.4%, from the trials comparing two nonsurgical interventions; and 12.6%, from the trials comparing two surgical interventions (Table II). These differences did not reach a level of significance (p = 0.9). The total proportion of patients lost to follow-up, however, was found to increase significantly as the follow-up time increased (p = 0.0003). The means of the total proportions of patients lost to follow-up were 0.5% for the eight clinical trials that lasted until hospital discharge, 14% for the twenty-seven with a follow-up time of up to six months, 14% for the thirty-one with a follow-up time of more than six months to one year, and 17% for the thirty with a follow-up time of more than one year.
Figure 2, which presents box-plot graphs, and Table II show that the proportions of patients lost to follow-up differed according to the type of clinical trial (i.e., the interventions under study). More patients in the control (nonsurgical) group were lost to follow-up in the clinical trials that compared surgical with nonsurgical treatments (p = 0.01).
Our results showed that the use of the intention-to-treat principle in orthopaedic randomized controlled studies is still relatively sparse and not uniform. The authors of about half of the clinical trials in which it was claimed that the intention-to-treat principle had been used had not adhered to its strict requirements. Most of the violations of the intention-to-treat protocol involved the handling of missing data—i.e., the exclusion of patients lost to follow-up instead of adjustment for missing data. This might introduce bias to the results and conclusions of the trials. The possibility of this bias is even more relevant to clinical trials comparing surgical with nonsurgical treatments, in which the proportion of patients lost to follow-up was shown to be larger in the control (nonsurgical) groups. We believe that patients in nonsurgical groups are less motivated to comply with a follow-up protocol, especially if they do not have any complications, and that patients who undergo surgery might feel more inclined to comply with follow-up schedules.
Several of the articles on methodology that have been published in the orthopaedic literature did not precisely delineate what is meant by the intention-to-treat principle12-17. Those publications defined intention-to-treat analysis as the analysis of patients according to the original treatment group to which they had been allocated, regardless of whether they crossed over at any point. This definition completely ignores the need to account for missing data.
In a survey similar to ours, the authors examined the use of the intention-to-treat principle in articles published in 1997 in BMJ: British Medical Journal, The Lancet, JAMA: The Journal of the American Medical Association, and The New England Journal of Medicine4. The authors found that use of the intention-to-treat principle was mentioned in 119 (48%) of the randomized clinical trials. The authors of twelve of these 119 studies did not include patients who had not started the assigned treatment. The survey also examined the handling of missing data. Some primary outcome data were reported to be missing from eighty-nine (75%) of the trials in which the intention-to-treat principle had been used, and >10% of the primary outcome data were missing from twenty-nine (24%).
Figure 3 provides an example illustrating the importance of implementing the intention-to-treat principle. In this example, a trial is designed to compare a new surgical procedure with a cast immobilization technique for treatment of a nondisplaced fracture. Displacement is the primary failure end point. Two hundred patients are randomized to each treatment arm. The surgery requires a two-week preparation period in order for local edema to subside and for skin conditions to be suitable for the operation. During this period, the patients scheduled for surgery are treated with a splint and undergo daily skin examination. The two approaches (surgical and cast treatment) have the same outcome: specifically, 10% of the fractures in each treatment arm displace during the two weeks after the injury and another 10% in each treatment arm displace during the one-year trial period. With use of the "as-treated" approach, there are sixty treatment failures in 220 patients who were treated conservatively: 200 treated with a cast and twenty who underwent fracture displacement while being treated with a splint. These results are compared with twenty instances of fracture displacement that occurred in 180 patients who underwent surgery. According to this analytic approach, the relative risk reduction is 0.6 (p = 0.0001), favoring the new surgical procedure. With use of intention-to-treat analysis, however, the 400 patients remain in the groups to which they were randomized and forty in each treatment arm of 200 patients have fracture displacement, so the relative risk reduction is now 0 (p = 1). Note that the researchers could have randomized the patients after the two-week skin-preparation period. Note also that, with the intention-to-treat approach, the investigators are comparing the two treatment arms—i.e., they are comparing surgery preceded by splinting with cast treatment—as opposed to comparing surgery alone with cast treatment.
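The arithmetic of this example can be checked with a short Python sketch. The text does not state which test produced the quoted p values, so the chi-square test with Yates continuity correction used here is an assumption, and the function names are hypothetical.

```python
import math


def relative_risk_reduction(control_risk, treated_risk):
    """RRR = 1 - (risk with the new treatment / risk with the control)."""
    return 1 - treated_risk / control_risk


def yates_p_value(a, b, c, d):
    """Two-sided p for the 2x2 table [[a, b], [c, d]].

    Chi-square test with 1 degree of freedom and Yates continuity correction
    (an assumed choice of test, not confirmed by the source).
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


n_per_arm = 200
early = 20  # 10% of each arm displaces within two weeks of injury
late = 20   # another 10% of each arm displaces during the one-year period

# As-treated: surgical-arm patients whose fracture displaced under the splint
# are counted with the conservatively treated patients.
conservative_n = n_per_arm + early             # 220 patients
conservative_fail = (early + late) + early     # 60 failures
surgery_n = n_per_arm - early                  # 180 patients
surgery_fail = late                            # 20 failures

rrr_as_treated = relative_risk_reduction(
    conservative_fail / conservative_n, surgery_fail / surgery_n)
p_as_treated = yates_p_value(
    conservative_fail, conservative_n - conservative_fail,
    surgery_fail, surgery_n - surgery_fail)

# Intention-to-treat: every patient stays in the randomized arm (40 of 200 fail).
itt_fail = early + late
rrr_itt = relative_risk_reduction(itt_fail / n_per_arm, itt_fail / n_per_arm)
p_itt = yates_p_value(itt_fail, n_per_arm - itt_fail,
                      itt_fail, n_per_arm - itt_fail)

print(f"as-treated: RRR = {rrr_as_treated:.2f}, p = {p_as_treated:.5f}")
print(f"intention-to-treat: RRR = {rrr_itt:.2f}, p = {p_itt:.1f}")
```

Under these assumptions, the as-treated figures round to the relative risk reduction of 0.6 and the p value of 0.0001 quoted above, and the intention-to-treat figures are 0 and 1.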
An example illustrating the importance of accounting for missing data can be found in a randomized clinical trial, by the Canadian Orthopaedic Trauma Society18, comparing surgical with nonoperative treatment for displaced midshaft clavicular fractures. The authors of that study stated that they used the intention-to-treat principle. Careful scrutiny of the article, however, reveals that the analysis did not include patients lost to follow-up. Five (7%) of the sixty-seven patients in the surgical treatment group and sixteen (25%) of the sixty-five patients in the nonoperative group were lost to follow-up. This difference was found to be significant (p = 0.014). Two patients in the nonoperative group who had complications were not followed and were omitted from the analysis. We performed our own analysis using "last observation carried forward," which means that all of the patients who were free of complications at their last visit were considered henceforth to be patients without complications. A comparison of that analysis with the analysis by the investigators from the Canadian Orthopaedic Trauma Society renders the original conclusions in favor of the surgical intervention less convincing (Table III).
There are several methods with which to adjust for missing data. Choosing one can be a challenging task, and the choice could be closely related to the reason for the missing data. If, as hypothesized above, the reason for lack of compliance is lack of clinical need (e.g., no complications), it would be logical to use "last observation carried forward." The influence of the adjustment for missing data should, however, be examined by trying several adjustment methods, an approach termed sensitivity analysis5,19-21. In our survey, the authors of only six articles mentioned having compared more than one imputation method for missing data. It is important to note that the worst-outcome imputation method has received much criticism. Use of this approach might introduce bias when a large amount of data is missing in one group, causing its treatment to seem unsuccessful. The "last observation carried forward" approach has also been criticized as being inappropriate because the assumption that the last observation is the long-term outcome is not always justified. There are many other contemporary methods of handling missing data, including multiple imputation, expectation-maximization algorithms, and propensity adjustments. Missing-data imputation can depend on other variables, such as the number of patients and the number of end points in a study. For the sake of brevity, we did not describe all of the methods or use them all in our data-set example. The interested reader can find many references for these methods in the current literature.
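A sensitivity analysis for a binary outcome such as a complication can be sketched in Python by recomputing the between-group risk difference under different treatments of the missing patients. All counts below are hypothetical and are not taken from any trial discussed here. The function and key names are likewise illustrative. The "no_event" rule corresponds to "last observation carried forward" only under the assumption that every missing patient was free of complications at the last visit.

```python
def arm_risk(arm, missing_as):
    """Risk of the event in one arm under a given rule for missing patients.

    missing_as: "exclude"  -> analyze only patients with complete follow-up
                "no_event" -> count missing patients as event-free
                "event"    -> count missing patients as events (worst outcome)
    """
    events = arm["events"]
    n = arm["events"] + arm["non_events"]
    if missing_as == "event":
        events += arm["missing"]
        n += arm["missing"]
    elif missing_as == "no_event":
        n += arm["missing"]
    elif missing_as != "exclude":
        raise ValueError(f"unknown rule: {missing_as}")
    return events / n


def risk_difference(control, intervention, missing_as):
    """Control risk minus intervention risk under one missing-data rule."""
    return arm_risk(control, missing_as) - arm_risk(intervention, missing_as)


# Hypothetical trial with more patients lost from the nonsurgical (control) arm.
nonsurgical = {"events": 20, "non_events": 30, "missing": 15}
surgical = {"events": 10, "non_events": 50, "missing": 5}

results = {rule: risk_difference(nonsurgical, surgical, rule)
           for rule in ("exclude", "no_event", "event")}
for rule, diff in results.items():
    print(f"{rule:>8}: risk difference = {diff:.3f}")
```

If the conclusion of a trial holds under every rule, the missing data are unlikely to have driven it. If the apparent benefit shrinks or disappears under one of the rules, the result should be interpreted with more caution.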
Over the four years studied, the authors of only about a third of the orthopaedic trials used some variation of the intention-to-treat principle. As has been mentioned for randomized trials in surgery22, there is still room for improvement in the performance and analysis of randomized clinical trials in orthopaedics. We conclude by quoting from Fisher et al.3: "One of the great intellectual advances of the twentieth century [was] the concept of randomization." We should go to greater lengths to preserve the benefits of randomization through correct implementation of the intention-to-treat principle. 