Eighty-one patients with a diagnosis of rotator cuff disease seen in the clinical practice of two physicians specializing in the treatment of shoulder disorders (R.Z.T. and A.P.P.) were prospectively enrolled in the study as a consecutive series over a twelve-month period. All patients meeting the inclusion and exclusion criteria (listed below) were asked to participate, and all of those who agreed were enrolled and followed. Approval was obtained from the hospital institutional review board prior to the initiation of the study. A history was recorded, a physical examination was performed, and shoulder radiographs were made for all patients before enrollment. Some patients had magnetic resonance imaging of the shoulder performed prior to the initial evaluation, and a magnetic resonance imaging scan was obtained for all additional patients in whom a rotator cuff tear was suspected on the basis of the history and physical examination. The final diagnosis was based on the initial examination; the initial imaging studies (radiographs and magnetic resonance imaging scans obtained prior to the initial examination); and, if one was obtained, the magnetic resonance imaging scan after the initial examination. A total of forty-three magnetic resonance imaging scans were performed. On the basis of the history, physical examination, and radiographic data, one of the authors (R.Z.T. or A.P.P.) made a diagnosis of either rotator cuff tendinitis or a rotator cuff tear (partial or full-thickness) and the patient was subsequently asked to participate in the study. The history and physical examination findings considered to be consistent with rotator cuff tendinitis included anterolateral or lateral shoulder pain, night pain, pain with overhead activities, point tenderness over the rotator cuff tendon insertion along the greater tuberosity, and pain on rotator cuff testing with resistance against scapular plane forward elevation and external rotation at the side. A clinical diagnosis of rotator cuff disease was made on the basis of one or more of those findings.
Inclusion criteria included an age of more than eighteen years, a clinical diagnosis of a rotator cuff tear or rotator cuff tendinitis, and the patient's agreement to receive nonoperative treatment. Exclusion criteria included early surgical repair (which was done for an acute full-thickness tear or a chronic tear in a patient under the age of sixty), glenohumeral arthritis, adhesive capsulitis, and/or the patient's lack of willingness to receive nonoperative treatment. Fourteen patients were diagnosed with a rotator cuff tear, and sixty-seven were diagnosed with rotator cuff tendinitis. All tears were confirmed with magnetic resonance imaging. Of the sixty-seven patients diagnosed with tendinitis, twenty-nine had a magnetic resonance imaging scan confirming the absence of a rotator cuff tear. The remaining thirty-eight patients had a clinical history and findings on examination that were consistent with rotator cuff tendinitis and it appeared that the rotator cuff was intact, but magnetic resonance imaging was not performed for them. Thus, some of these patients who were assigned a diagnosis of tendinitis may have actually had an unrecognized rotator cuff tear. There were thirty-nine male and forty-two female patients. The dominant shoulder was affected in 67% of the patients. The average age was fifty-one years (range, nineteen to eighty-eight years; median, fifty-one years).
All patients completed a baseline questionnaire form that included the ASES score and the SST. The SST is a shoulder-specific outcome measure that focuses on characterizing the function of patients with pathological shoulder conditions. The SST consists of twelve "yes/no" questions about important activities of daily living that can be performed by patients with normally functioning shoulders. The questions specifically assess shoulder comfort and function12. The validity, ease of use, reproducibility, sensitivity, reliability, and responsiveness of the SST have been documented for several shoulder disorders10,12-17. The ASES score is a patient self-assessed region-specific outcome measure for evaluating pain on a visual analog scale and function with ten Likert-style questions18. Normal ASES scores have been reported19. The ASES has been shown to be a valid, reliable, and responsive outcome tool9.
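As a rough illustration of how the two questionnaires yield a single number, the sketch below scores the SST as the count of "yes" answers and the ASES score as the sum of a pain half and a function half. The ASES weighting used here (50 points from a 0-to-10 pain scale, 50 points from ten items each graded 0 to 3) is the commonly cited formula and is an assumption, as it is not stated in the text above. The function names are hypothetical.

```python
def sst_score(answers):
    """Return the SST score: the number of 'yes' answers among twelve items."""
    if len(answers) != 12:
        raise ValueError("the SST has twelve yes/no questions")
    return sum(1 for answer in answers if answer)


def ases_score(pain_vas, function_items):
    """Return an ASES score from 0 (worst) to 100 (best).

    Assumed weighting (not given in the text):
      pain_vas        0 (no pain) to 10 (worst pain), worth up to 50 points
      function_items  ten items graded 0 (unable) to 3 (not difficult),
                      worth up to 50 points in total
    """
    if not 0 <= pain_vas <= 10:
        raise ValueError("pain must be between 0 and 10")
    if len(function_items) != 10:
        raise ValueError("ten function items are expected")
    if any(not 0 <= item <= 3 for item in function_items):
        raise ValueError("each function item must be between 0 and 3")
    pain_part = (10 - pain_vas) * 5
    function_part = sum(function_items) * 5 / 3
    return pain_part + function_part
```

The change score analyzed in this study would then simply be the follow-up score minus the baseline score for each instrument.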
All patients underwent a variety of nonsurgical treatments for the rotator cuff disorder. These treatments included rest, application of ice, activity modifications, formalized and home physical therapy, anti-inflammatory pain medications, and subacromial corticosteroid injection. Various combinations of each were prescribed (Table I). All injections were performed by the same shoulder surgeon (R.Z.T.) in his office. Physical therapy included shoulder stretching and strengthening (rotator cuff, deltoid, and scapular stabilizer) exercises.
The patients were reevaluated with a follow-up questionnaire at a minimum of six weeks (mean, 3.6 months; maximum, 12.5 months) after the baseline evaluation. The follow-up questionnaire included the ASES score, the SST, and three questions evaluating relief of shoulder pain and improvement of function after treatment (Table II). The three questions (two fifteen-item questions and one four-item question derived from anchor questions designed by Juniper et al.11 and Tubach et al.20) were the anchors used to determine the minimal clinically important differences.
Statistical Analysis
The change in the ASES and SST scores from baseline to the time of follow-up was calculated for each patient. Minimal clinically important differences were determined with use of the anchor-based technique described by Juniper et al.11. Patients were classified on the basis of their responses to the three questions regarding pain relief and functional improvement after treatment (Table II). The fifteen items of the questions in the present study are exactly the same as those used by Juniper et al. except that the question stem was changed to reflect the fact that we were evaluating shoulder pain while Juniper et al. were evaluating asthma symptoms. The four-item question in the present study is exactly the same as that used by Tubach et al.20. Because the ASES score includes direct questions regarding both pain and function, we compared the ASES score with the responses to the two separate fifteen-item improvement questions (one evaluating pain and the other evaluating function). Because the SST directly assesses only shoulder function, we compared the SST score only with the response to the fifteen-item shoulder function question and not to the anchor question evaluating pain relief. While the SST and the ASES score both include questions that inherently ask the patient about shoulder pain, weakness, and overall function, we only included improvement (anchor) questions regarding the aspects of shoulder outcome directly assessed by the questionnaires. Both the ASES score and the SST scores were compared with the response to the four-item improvement question.
We classified patients who answered "No change," "Almost the same, hardly any worse at all," or "Almost the same, hardly any better at all" on the fifteen-item questions and "None—no good at all, ineffective treatment" or "Poor—some effect but unsatisfactory" on the four-item question as having no change in the status of the shoulder. We considered patients who answered "A little better," "Somewhat better," "A little worse," or "Somewhat worse" on the fifteen-item questions and "Good—satisfactory effect with occasional episode of pain or stiffness" on the four-item question as having experienced a small change equivalent to the "minimal important difference."11,20 Minimal clinically important differences were calculated for the ASES score with use of both fifteen-item questions and the four-item question, whereas minimal clinically important differences were calculated for the SST with use of only the fifteen-item function question and the four-item question. Minimal clinically important differences were calculated by subtracting the mean change score of all patients classified as having no change from the mean change score of all patients who were classified as experiencing a "minimal important difference." T tests were performed to compare means between the "unchanged" and "minimal important difference" groups. P values of <0.05 were considered significant.
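The anchor-based calculation described above can be sketched as follows for one fifteen-item question. This is a minimal sketch rather than the analysis code used in the study: the function names and the record layout are hypothetical, and the Welch (unequal-variance) form of the t statistic is an assumption, since the text does not state which form of the t test was used.

```python
from math import sqrt
from statistics import mean, variance

# Response wordings of the fifteen-item questions, as quoted in the text.
NO_CHANGE = {
    "No change",
    "Almost the same, hardly any worse at all",
    "Almost the same, hardly any better at all",
}
MINIMAL = {"A little better", "Somewhat better", "A little worse", "Somewhat worse"}


def classify(response):
    """Map an anchor response to 'no_change', 'minimal', or None (not used)."""
    if response in NO_CHANGE:
        return "no_change"
    if response in MINIMAL:
        return "minimal"
    return None


def mcid_from_anchor(records):
    """records: iterable of (anchor_response, change_score) pairs.

    Returns (mcid, groups), where mcid is the mean change of the
    'minimal' group minus the mean change of the 'no_change' group.
    """
    groups = {"no_change": [], "minimal": []}
    for response, change in records:
        label = classify(response)
        if label is not None:
            groups[label].append(change)
    mcid = mean(groups["minimal"]) - mean(groups["no_change"])
    return mcid, groups


def welch_t(a, b):
    """Welch t statistic and approximate degrees of freedom (assumed test form)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

A p value would be obtained by referring the t statistic to a t distribution with the returned degrees of freedom. Responses outside the two listed sets are ignored by this sketch because they fall into neither comparison group.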
A secondary analysis was performed to determine the effect of various patient-related and treatment factors on the various calculated minimal clinically important differences. Two-sample t tests were utilized to compare minimal clinically important differences on the basis of the sex of the patient and hand dominance. Spearman correlations were utilized to compare minimal clinically important differences on the basis of age and duration of follow-up. The effect of the baseline ASES and SST scores on the minimal clinically important differences was determined by first defining the cutoff point between high and low baseline scores as the mean baseline ASES score (50.1 ± 17.5 points) and SST score (6 ± 3.2 points). The patients were then categorized as having a minimal clinically important difference or not on the basis of the categories established above for each of the anchor assessment questions. The association between high/low baseline assessment scores and achievement of a minimal clinically important difference was evaluated with use of odds ratios. P values of <0.05 were considered significant.
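The odds-ratio step of the secondary analysis can be illustrated with a two-by-two table of baseline category (above or below the mean baseline score) against achievement of a minimal clinically important difference. The sketch below is illustrative only: the function names are hypothetical, and the 95% confidence interval uses the standard log (Woolf) approximation, which is an assumption because the text does not specify how significance of the odds ratios was assessed.

```python
from math import exp, log, sqrt


def is_high_baseline(score, cutoff):
    """Dichotomize a baseline score at the cutoff (e.g., the mean baseline score)."""
    return score > cutoff


def odds_ratio(high_yes, high_no, low_yes, low_no):
    """Odds ratio with an approximate 95% confidence interval.

    high_yes / high_no: high-baseline patients who did / did not achieve the difference
    low_yes  / low_no : low-baseline patients who did / did not achieve the difference
    """
    cells = (high_yes, high_no, low_yes, low_no)
    if any(cell <= 0 for cell in cells):
        raise ValueError("all four cells must be positive for this approximation")
    ratio = (high_yes * low_no) / (high_no * low_yes)
    se_log = sqrt(sum(1 / cell for cell in cells))
    lower = exp(log(ratio) - 1.96 * se_log)
    upper = exp(log(ratio) + 1.96 * se_log)
    return ratio, lower, upper
```

Under this approximation, an interval that includes 1 corresponds to no significant association at the 0.05 level.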
Source of Funding
There was no external source of funding for this investigation.
As mentioned, the minimal clinically important difference in the SST score was determined with use of the fifteen-item function question and four-item improvement question. On the basis of the fifteen-item function question, twenty-five patients were classified as having a minimal important difference and nine, as having no change. The mean changes in the SST score were -0.33 point and 1.72 points, respectively, for the patients with no change and those with a minimal important difference as determined with the fifteen-item function question. The minimal clinically important difference in the SST score as determined by the patients’ responses to the fifteen-item function question was 2.05 points (p = 0.02). On the basis of the four-item improvement question, forty-eight patients were classified as having a minimal important difference and twenty-four, as having no change. The mean changes in the SST score were 0.5 point and 2.83 points for the patients with no change and those with a minimal change, respectively, as determined with the four-item question. The minimal clinically important difference in the SST score as determined with use of the four-item question was 2.33 points (p = 0.0009). Consequently, the minimal clinically important difference in the SST score for patients with rotator cuff disease is 2 points.
The minimal clinically important difference in the ASES score was determined with use of the fifteen-item function question, the fifteen-item pain question, and the four-item improvement question. The mean changes in the ASES score were 6.85 and 18.87 points, respectively, in the group determined to have no change and the group determined to have a minimal change on the basis of their responses to the fifteen-item function question. The minimal clinically important difference in the ASES score was 12.01 points (p = 0.03) as determined with use of the fifteen-item function question. On the basis of the fifteen-item pain question, twenty-six patients were classified as having a minimal important difference and eight, as having no change. The mean changes in the ASES score were 4.69 points and 21.60 points, respectively, for the patients who reported no change and those who reported a minimal change in response to the fifteen-item shoulder pain question. The minimal clinically important difference in the ASES score as determined with the fifteen-item pain question was 16.92 points (p = 0.004). Lastly, the mean changes in the ASES score were 9.13 and 25.85 points, respectively, for the patients who were classified as having no change and those who were classified as having a minimal change on the basis of the four-item question. The minimal clinically important difference in the ASES score as determined with the four-item question was 16.72 points (p < 0.0001). Consequently, the minimal clinically important difference in the ASES score for patients with rotator cuff disease is between 12 and 17 points.
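As a check on the arithmetic, the reported minimal clinically important differences for both instruments can be recomputed from the reported group means by simple subtraction. The values below are copied from the text. The differences of 0.01 point in two of the ASES estimates presumably reflect rounding of the group means before publication, which is an assumption. The dictionary layout and function name are hypothetical.

```python
# (mean change with no change, mean change with minimal change, reported difference)
REPORTED = {
    ("SST", "fifteen-item function"): (-0.33, 1.72, 2.05),
    ("SST", "four-item improvement"): (0.5, 2.83, 2.33),
    ("ASES", "fifteen-item function"): (6.85, 18.87, 12.01),
    ("ASES", "fifteen-item pain"): (4.69, 21.60, 16.92),
    ("ASES", "four-item improvement"): (9.13, 25.85, 16.72),
}


def recompute(reported):
    """Return {key: (recomputed difference, reported difference)}."""
    result = {}
    for key, (no_change, minimal, stated) in reported.items():
        result[key] = (round(minimal - no_change, 2), stated)
    return result


if __name__ == "__main__":
    for (instrument, anchor), (calc, stated) in recompute(REPORTED).items():
        print(f"{instrument:4s} {anchor:22s} recomputed={calc:6.2f} reported={stated:6.2f}")
```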
All eighty-one patients were included in the analysis of factors affecting minimal clinically important differences. No associations were found between any calculated minimal clinically important difference (in the ASES score based on the fifteen-item pain question, fifteen-item function question, or four-item improvement question or in the SST score based on the fifteen-item function question or four-item improvement question) and sex, hand dominance, or age (p > 0.05) (Tables III, IV, and V). The duration of follow-up had a significant effect on the minimal clinically important differences in the ASES score as assessed with the fifteen-item function question (p = 0.04) and the four-item improvement question (p = 0.03), with a longer duration of follow-up associated with a greater minimal clinically important difference. The minimal clinically important difference in the ASES score as determined with the fifteen-item pain question and the minimal clinically important differences in the SST score as determined with the fifteen-item function and four-item improvement assessments were not affected by duration of follow-up (p > 0.05) (Table V). Finally, there was no association between the baseline SST or ASES scores (i.e., whether they were above or below the mean baseline scores) and achievement of a minimal clinically important difference at the time of follow-up (p > 0.05) (Table VI). It should be noted that no power analysis was performed, and therefore the nonsignificant findings may be a result of an inadequate sample size.
Outcome studies can have limited benefit if clinicians are unable to interpret how the results relate to their clinical practice. Reference values are required in order to determine if outcome changes due to a treatment are of real benefit to patients. Minimal clinically important differences are such references that can guide clinicians in making the decision to change their clinical practice on the basis of the results of an outcome study. The SST and ASES score are two outcome questionnaires commonly used in studies evaluating various conditions of the shoulder. We determined, in a homogeneous population of patients treated for rotator cuff disease, that the minimal clinically important difference in the SST score is 2 points and the minimal clinically important difference in the ASES score is between 12 and 17 points. These values represent approximations based on the various questions used to determine the minimal clinically important differences. Age, sex, hand dominance, and baseline SST and ASES scores had no apparent effect on minimal clinically important differences, but this lack of effect may be a reflection of inadequate power. Patients with a longer duration of follow-up may have a larger minimal clinically important difference when the ASES score is used than when the SST is used, as we did not find the minimal clinically important differences in the SST score to be affected by the duration of follow-up. However, again, this lack of an effect may be a result of inadequate sample size.
A variety of methods have been described for estimating minimal clinically important differences. These can be classified into three general categories: distribution-based, opinion-based, and anchor-based. Distribution methods are based on the properties of a given instrument in an untreated population. For example, if the values of an untreated population lie within a narrow range, then a small change may represent a clinically important difference. Distribution methods for estimating minimal clinically important differences include calculation of the standard error of measurement and the standard deviation21. The most commonly used distribution method is calculation of the between-person standard deviation of the control group at baseline4. The advantage of distribution methods is that they are easy to use because they require only a single time point. Limitations include the fact that the estimates of variability differ between studies and the absence of a clinical component to these approaches4,21. While distribution-based approaches provide an estimate of the minimal clinically important difference that is useful for determining sample sizes in a study design, this approach alone is insufficient for the interpretation of the results of clinical studies21.
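For comparison, the two distribution-based quantities mentioned above can be sketched as follows. The half-standard-deviation rule and the formula for the standard error of measurement (standard deviation multiplied by the square root of one minus the reliability coefficient) are standard conventions assumed here and are not taken from this study. The reliability value in the usage note is a hypothetical placeholder, and the function names are hypothetical as well.

```python
from math import sqrt


def half_sd(baseline_sd):
    """Half of the between-person baseline standard deviation (assumed rule of thumb)."""
    return 0.5 * baseline_sd


def standard_error_of_measurement(baseline_sd, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return baseline_sd * sqrt(1 - reliability)
```

For example, with the baseline standard deviations of this cohort (17.5 points for the ASES score and 3.2 points for the SST), `half_sd` gives 8.75 and 1.6 points, respectively, and `standard_error_of_measurement(17.5, 0.84)` gives about 7 points for a placeholder reliability of 0.84. These figures are illustrative only, not findings of the study, and the cohort was not an untreated control group.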
With opinion-based approaches, various methods are used to arrive at a consensus between experts and patients regarding minimal clinically important differences. Although expert opinion regarding minimal clinically important differences may be useful for interpreting outcomes, these methods are infrequently utilized in studies estimating minimal clinically important differences.
With the anchor-based approach, a patient's retrospective rating of a change in status is used to determine the minimal clinically important difference. Once the anchor has been chosen, several different methods can be employed to derive the minimal clinically important difference. The easiest and most widely utilized technique is to specify a range of anchor-instrument results that correspond to the minimal clinically important difference and calculate the change in outcome score that correlates with that range of values1,11. We utilized that method in our analysis. Other authors have utilized receiver operating characteristic curves to determine minimal clinically important differences on the basis of the results of anchor instruments8,9.
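The receiver operating characteristic alternative mentioned above was not used in this study, but it can be sketched as follows. The sketch assumes that each patient is labeled as improved or not improved by the anchor and that the cutoff is chosen to maximize the Youden index (sensitivity plus specificity minus one). That selection rule is an assumption, as other criteria are also in use, and the function name is hypothetical.

```python
def roc_cutoff(changes, improved):
    """Return (cutoff, youden_index, sensitivity, specificity).

    changes : change scores (follow-up minus baseline)
    improved: booleans from the anchor question, one per patient
    A patient is predicted 'improved' when the change score is >= cutoff.
    """
    pairs = list(zip(changes, improved))
    n_pos = sum(1 for _, flag in pairs if flag)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both improved and not-improved patients are required")
    best = None
    for cutoff in sorted(set(changes)):
        true_pos = sum(1 for score, flag in pairs if flag and score >= cutoff)
        true_neg = sum(1 for score, flag in pairs if not flag and score < cutoff)
        sensitivity = true_pos / n_pos
        specificity = true_neg / n_neg
        youden = sensitivity + specificity - 1
        # Keep the lowest cutoff when several share the same Youden index.
        if best is None or youden > best[1]:
            best = (cutoff, youden, sensitivity, specificity)
    return best
```

Unlike the mean-difference method, this approach yields the change score that best separates improved from unimproved patients, so the two estimates need not agree.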
A variety of instruments have been utilized to evaluate outcomes in patients with shoulder disorders; these include the SST, ASES score, Constant score, UCLA score, Western Ontario Rotator Cuff Index, Penn Shoulder Score, and Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire. Despite the prevalent use of these instruments, very little is known about what constitutes a clinically relevant change in their scores. Michener et al. used the ASES score to evaluate sixty-three patients at an initial therapy visit and then used the ASES score and a global rating of change question to reevaluate them three to four weeks after the physical therapy9. Using a receiver operating characteristic curve method of analysis, the authors determined the minimal clinically important difference to be 6.4 points. Limitations of this study include a relatively heterogeneous population of patients, which included eight different diagnoses along with fifteen patients who had been treated with an operation; a lack of evaluation of factors affecting the minimal clinically important difference; and the use of a single anchor question in the estimation. Our estimation of the minimal clinically important difference is substantially larger than that reported by Michener et al. Leggin et al. used the Penn Shoulder Score and a 5-point global rating of change scale to evaluate 109 patients with a total of seven different shoulder-disorder diagnoses7. Using a method of analysis that was similar to the one that we used in the current study, and as described by Juniper et al.11, they determined the minimal clinically important difference in the Penn Shoulder Score to be 11.4 points.
The clinical utility of a minimal clinically important difference is affected by the population of patients from which it is derived. A population that is heterogeneous in terms of patient disease limits the ability to use a minimal clinically important difference accurately in the evaluation of any specific disease. This is because neither the ASES score nor the SST evaluates all shoulder diseases equally well. The responsiveness of both the SST and the ASES score has been determined to be affected by patient diagnosis10,22, with the ability of both questionnaires to demonstrate a change in outcome for patients with shoulder instability being inferior to their ability to show such a change for those with rotator cuff disease10,22. Consequently, the amount of change a patient perceives to be clinically important is likely to be affected by the diagnosis with both of these instruments. For instance, a patient with shoulder instability is likely to be able to perform all of the tasks on the SST despite having substantial dysfunction because the instrument is less responsive in cases of instability; thus, a minimal clinically important difference would be relatively small for patients with this diagnosis. In contrast, the instrument is very responsive for patients with rotator cuff disease, and the minimal clinically important difference would be much larger in these cases since the instrument is better able to capture patient outcome change. Deriving a minimal clinically important difference from a group consisting of two populations combined would result in a minimal clinically important difference that was inappropriate for either population. From a statistical standpoint, it may be argued that a more heterogeneous study population would make the minimal clinically important difference more generalizable and potentially superior. However, from a clinical standpoint, a minimal clinically important difference derived from a study population that is homogeneous with regard to diagnosis would be more clinically applicable and therefore superior for evaluation of patients with that specific diagnosis. The homogeneous nature of our study population improves the ability of the derived minimal clinically important differences to be translated into clinical use.
The duration of follow-up was found to significantly affect some of the minimal clinically important differences in the ASES score and therefore should be considered when applying these values clinically. Patients with a longer duration of follow-up had higher minimal clinically important differences. Consequently, patients who had had treatment further in the past required a larger change in function in order to feel that they had had a clinically relevant improvement. One possible explanation for these results is that, as patients improve functionally over time, they expect to have a greater improvement. Alternatively, a longer duration after treatment may affect a patient's perception of his or her current status and create a bias toward an underestimation of the change in status. Consequently, the minimal acceptable improvement increases. The duration of follow-up did not affect the minimal clinically important difference in the SST score. This could be because there were twelve fewer scoring options for the SST than for the ASES score, rendering the SST less sensitive to more subtle changes and differences. It is possible that the longer duration of follow-up actually did affect the minimal clinically important difference in the SST score but the limited number of scoring options did not allow this difference to become statistically apparent. This lack of effect could also be a result of an inadequate power to detect a change. Age, hand dominance, sex, and baseline outcome scores did not affect minimal clinically important differences. With the number of patients available, our results indicate that these minimal clinically important differences are reasonable for the evaluation of the results of most patients who are treated conservatively for rotator cuff disease, and we believe that the values are independent of various patient-related factors.
While we have shown that the duration of follow-up affects the minimal clinically important difference in the ASES score, the overall quantity of the incremental effect was relatively small. To quantify it, we performed a separate analysis to evaluate the effect of the duration of follow-up on the minimal clinically important difference in the ASES score. In this analysis, we determined the minimal clinically important differences with use of the four-item improvement question and the fifteen-item shoulder function question (both of which showed a significant correlation with the duration of follow-up) for individuals with a maximum of eighteen, twenty-four, thirty, and thirty-six weeks of follow-up. (We were unable to analyze the data separately for patients followed for less than eighteen weeks or those followed for more than thirty-six weeks because of small sample sizes.) While there was a trend for the minimal clinically important differences to increase with an increasing duration of follow-up, all of the minimal clinically important differences at the eighteen, twenty-four, thirty, and thirty-six-week time points were between 12 and 17 points. Consequently, our finding that the minimal clinically important difference in the ASES score fell between 12 and 17 points was supported even when we took into account the effects of the duration of follow-up.
The current study had several limitations. First, we were unable to determine the exact diagnosis (rotator cuff tendinitis or a tear) for several patients in the cohort. Patients with a tear may have a different threshold for an important decrease in symptoms than patients with tendinitis. We could not accurately determine whether this was the case because magnetic resonance imaging scans were not performed on all patients and thus it is possible that some patients classified as having tendinitis had an unrecognized rotator cuff tear. Also, because rotator cuff tendinitis is a clinical diagnosis, some patients may have been included in the tendinitis group despite having a different diagnosis such as biceps tendinitis, a labral tear, or early osteoarthritis that was not apparent on radiographs.
A second limitation of this study is that the findings may be applicable only to patients with nonoperative treatment of rotator cuff disease. It is extremely difficult to perform this type of study in a cohort of patients undergoing rotator cuff surgery (i.e., to compare preoperative with postoperative status) because it is necessary for a large number of patients to have no substantial improvement in order to determine the minimal clinically important difference. This is very unlikely after rotator cuff repair with current techniques. On the basis of limited information, it appears that minimal clinically important differences, in general, are not affected by the treatment method1. Therefore, we believe that our data could possibly guide physicians on how to interpret outcomes data following surgical as well as nonoperative treatment of rotator cuff disease.
A third limitation is that the anchor questions used in this study as well as those used by Juniper et al.11 and Tubach et al.20 (from which our anchors were derived) have not been validated. We could have utilized the anchor described by Michener et al.9, as doing so might have allowed a better comparison between studies, but we would have had to change the question to reflect our study protocol and it is possible that this would have affected the results. Also, because their anchor, like ours, was not validated, we did not believe that it was superior to ours.
Finally, the lack of a significant association between various factors and the minimal clinically important differences in our study may have been a result of a small sample size. This problem may be alleviated by the study of a larger patient population. Additional study is also required to determine if these minimal clinically important differences can be utilized for patients with other diagnoses, including adhesive capsulitis, osteoarthritis, and instability.