Study Groups
Three study groups were used. The investigation relied on a prospectively
maintained computerized database of the records of patients undergoing
shoulder surgery at a major sports medicine clinic. Institutional review
board approval for the collection of outcome data for these patients was
obtained. Data from the subjective assessment, objective assessment, surgical
findings, and outcomes assessment were recorded preoperatively and at
standard intervals postoperatively.
Group A was specifically constructed to assess test-retest reliability.
Group A comprised fifty-six patients with shoulder instability (twenty-one
patients), rotator cuff disease (twenty), and glenohumeral arthritis
(fifteen). The mean patient age was 49.5 years (range, 15.0 to 77.6 years).
Thirty-four (61%) of the patients were male. The twenty-one patients with
instability included those with anterior instability (eleven),
multidirectional instability (seven), and posterior instability (three). The
type of instability was determined on the basis of the history and clinical
evaluation and was confirmed by examination with the patient under anesthesia
and by the surgical findings. The twenty patients with rotator cuff disease
included those with partial-thickness cuff tears (seven patients) and those
with full-thickness cuff tears (thirteen). Differentiation between
partial-thickness and full-thickness rotator cuff tears was based on the
surgical findings. All fifteen patients with glenohumeral arthritis had
full-thickness chondral loss due to osteoarthritis. ASES shoulder scale
scores, subjective and objective assessment data, and demographic data were
obtained with an original preoperative questionnaire, with a second
preoperative questionnaire completed within four weeks after the original,
and at periodic postoperative intervals (three months, six months, twelve
months, and then yearly). The results were maintained prospectively in a
computerized database.
Group B comprised the patients within the overall shoulder computerized
database. Internal consistency, content validity, construct validity, and
responsiveness were assessed within Group B. Of the 1066 patients in the
group, 455 had instability, 474 had rotator cuff disease, and 137 had
arthritis. The 455 patients with instability included those with anterior
instability (293 patients), multidirectional instability (101), and posterior
instability (sixty-one). The mean age of the patients with instability was
30.3 years (range, 13.2 to 74.5 years), and 280 (62%) were male. The 474
patients with rotator cuff disease included those with partial-thickness cuff
tears (223 patients) and those with full-thickness cuff tears (251). The mean
age of these patients was 56.1 years (range, 17.8 to 95.1 years), and 355
(75%) were male. The 137 patients with arthritis included those with
partial-thickness chondral loss (eight patients) and those with full-thickness
chondral loss (129 patients) that was due to osteoarthritis (108 patients),
osteonecrosis (seven), or rheumatoid arthritis (twenty-two). The mean age of
the 137 arthritis patients was 62.3 years (range, 32.9 to 82.5 years), and 116
(85%) were male. ASES shoulder scale scores, subjective and objective
assessment data, and demographic data were collected preoperatively and at
periodic postoperative intervals (three months, six months, twelve months,
and then yearly) (Fig. 1). The results were maintained prospectively in a
computerized database.
Group C was specifically constructed to evaluate criterion validity. It
comprised 106 patients with instability (sixty-eight patients), rotator cuff
disease (thirty), and arthritis (eight). The sixty-eight patients with
instability included those with anterior instability (fifty-five patients),
multidirectional instability (ten), and posterior instability (three). The
mean age of the sixty-eight patients with instability was 30.5 years (range,
13.7 to 59.5 years), and forty-one (60%) were male. The thirty patients with
rotator cuff disease included those with partial-thickness cuff tears (twelve)
and those with full-thickness cuff tears (eighteen). The mean age of the
thirty patients with a rotator cuff tear was 56.5 years (range, 28.3 to 90.2
years), and twenty-two (73%) were male. All eight patients with arthritis had
full-thickness chondral loss due to osteoarthritis. The mean age of the
arthritis patients was 61.5 years (range, 40.7 to 78.4 years), and seven of
the eight patients were male. ASES shoulder scale scores, subjective and
objective assessment data, and demographic data were obtained preoperatively
and at periodic postoperative intervals (three months, six months, twelve
months, and then yearly). The results were maintained prospectively in a
computerized database. In addition to the preoperative ASES shoulder scale,
patients completed the Short Form-12 (SF-12) health-related quality-of-life
scale [21].
Test-Retest Reliability
Test-retest reliability was determined in Group A. These patients completed
an original preoperative questionnaire and a second preoperative questionnaire
within four weeks after the original questionnaire. There was no interval
change in health status as ascertained by health history forms with a complete
review of systems. The intraclass correlation coefficient (two-way
mixed-effects model) was determined for the overall ASES shoulder scale and
for the component scales of pain, work, sports, putting on a coat, sleeping
difficulty, washing, toileting, hair-combing, reaching a high shelf, lifting
10 lb (4.5 kg), and throwing a ball. An intraclass correlation coefficient of
≥0.75 was considered acceptable [21-25].
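The reliability calculation described above can be sketched in Python as
follows. The text specifies only a two-way mixed-effects model, so the choice
of the single-measurement, consistency form of the coefficient, the function
name icc_two_way_mixed, and the sample scores are all assumptions made for
illustration; this is not the authors' analysis code.

```python
def icc_two_way_mixed(scores):
    """Single-measurement, consistency-type ICC from a two-way ANOVA.

    scores: list of rows, one per patient; each row holds that patient's
    score at each administration (here, two preoperative questionnaires).
    The consistency form is an assumption; the text does not specify it.
    """
    n = len(scores)          # number of patients
    k = len(scores[0])       # number of administrations
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between administrations
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical ASES scores (first, second questionnaire) for five patients.
example_scores = [[62, 65], [48, 45], [80, 78], [35, 40], [55, 57]]
icc = icc_two_way_mixed(example_scores)
is_acceptable = icc >= 0.75  # acceptability threshold taken from the text
print(f"ICC = {icc:.2f}, acceptable: {is_acceptable}")
```

The same function could be applied to each component scale in turn (pain,
work, sports, and so on) by passing that component's paired scores.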
Internal Consistency
Internal consistency in Group B was determined for patients with
instability, rotator cuff disease, and arthritis. Preoperative scores on the
ASES shoulder scale were used to establish internal consistency. Overall
internal consistency for all eleven components was determined. A Cronbach
alpha of >0.60 was considered acceptable [21-25].
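For readers unfamiliar with the statistic, a minimal Python sketch of the
standard Cronbach alpha formula is given below. The function name and the
three-item sample data are hypothetical; the actual analysis used all eleven
components of the scale, and the software used is not stated in the text.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for a list of rows (one per patient, one column per item).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals)
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(items[0])
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical component scores for five patients on three items.
example_items = [
    [3, 2, 3],
    [1, 1, 0],
    [2, 3, 2],
    [0, 1, 1],
    [3, 3, 2],
]
alpha = cronbach_alpha(example_items)
is_acceptable = alpha > 0.60  # acceptability threshold taken from the text
print(f"Cronbach alpha = {alpha:.2f}, acceptable: {is_acceptable}")
```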
Content Validity
Content validity in Group B was determined for patients with instability,
rotator cuff disease, and arthritis. Preoperative ASES shoulder scale scores
were used to establish content validity. Floor effects (the lowest possible
score) and ceiling effects (the highest possible score) were determined for
the overall ASES shoulder scale. Floor and ceiling effects of <15% were
considered acceptable [21-25].
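The floor and ceiling calculation amounts to counting the patients at each
extreme of the scale. The Python sketch below assumes that the overall ASES
score ranges from 0 to 100, which is not stated in the text; the function name
and sample scores are likewise hypothetical.

```python
def floor_ceiling_effects(scores, lowest=0.0, highest=100.0):
    """Return (floor %, ceiling %) for a list of overall scale scores.

    lowest/highest are the minimum and maximum possible scores; the 0-100
    defaults are an assumption about the ASES scale.
    """
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == lowest) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == highest) / n
    return floor_pct, ceiling_pct


# Hypothetical preoperative scores for ten patients.
example_scores = [0, 35, 48, 52, 60, 64, 71, 80, 93, 100]
floor_pct, ceiling_pct = floor_ceiling_effects(example_scores)
is_acceptable = floor_pct < 15 and ceiling_pct < 15  # threshold from the text
print(f"floor {floor_pct:.0f}%, ceiling {ceiling_pct:.0f}%, acceptable: {is_acceptable}")
```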
Criterion Validity
Criterion validity was determined in Group C. The preoperative ASES
shoulder scale scores were used to establish criterion validity. The overall
ASES shoulder scale score was correlated with each domain (physical
functioning, role-physical, role-emotional, bodily pain, mental health,
vitality, and social function) of the SF-12 health-related quality-of-life
scale. Scores for these domains were determined from the physical component
score portion of the SF-12. The Pearson correlation coefficient was used to
assess the strength and significance of each correlation.
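A minimal Python sketch of this correlation step is shown below. The Pearson
coefficient is computed from its definition; the two-sided p value uses the
large-sample Fisher z approximation, which is a swapped-in technique chosen so
that the example needs only the standard library and is not necessarily the
test the authors applied. The function names are hypothetical, and the sample
size of 106 assumes that every patient in Group C contributed to each
correlation.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p value for r via the Fisher z approximation.

    Valid only for |r| < 1 and n > 3; an approximation, not an exact t test.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Correlations reported in the text; n = 106 is an assumption (see above).
for domain, r in [("physical functioning", 0.57),
                  ("role-physical", 0.32),
                  ("bodily pain", 0.58)]:
    p = approx_p_value(r, 106)
    print(f"{domain}: r = {r:.2f}, significant at 0.05: {p < 0.05}")
```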
Construct Validity
Construct validity was studied in Group B. Twenty-three hypotheses
(constructs) were developed by consensus. Preoperative ASES shoulder scale
scores were studied. Appropriate hypothesis testing was performed. Constructs,
scaling of independent variables, and statistical testing are shown in the
Appendix.
Responsiveness
Responsiveness to change was assessed in Group B. The preoperative ASES
shoulder scale scores were compared with the postoperative scores. The mean
duration of follow-up was 742 days (range, 312 to 3839 days) for patients with
instability, 426 days (range, 300 to 690 days) for patients with rotator cuff
disease, and 703 days (range, 365 to 2489 days) for patients with glenohumeral
osteoarthritis. Surgical intervention for patients with instability was open
(208 patients) or arthroscopic (247) stabilization of the shoulder. Surgical
intervention for patients with rotator cuff disease was acromioplasty with
rotator cuff débridement (107 patients) or repair (367). Surgical
intervention for arthritis patients was arthroplasty.
Effect size was calculated with the following formula: (mean postoperative
scale - mean preoperative scale)/standard deviation of preoperative scale. The
standardized response mean was calculated with the following formula: (mean
postoperative scale - mean preoperative scale)/standard deviation of the
change in scale. Small effects were considered ≥0.20, moderate effects were
considered ≥0.50, and large effects were considered ≥0.80 [21-25].
It was anticipated that there would be large effects for all three groups of
patients, with the largest effects seen in the patients with arthritis
undergoing arthroplasty.
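The two responsiveness formulas given above can be written directly in Python.
The sketch below follows the definitions in the text; the function names, the
label used for effects below 0.20, and the paired sample scores are
hypothetical.

```python
from statistics import mean, stdev


def effect_size(pre, post):
    """(mean postoperative - mean preoperative) / SD of preoperative scores."""
    return (mean(post) - mean(pre)) / stdev(pre)


def standardized_response_mean(pre, post):
    """Mean change / SD of the change scores (pre and post must be paired)."""
    change = [b - a for a, b in zip(pre, post)]
    return mean(change) / stdev(change)


def classify_effect(value):
    """Label an effect using the 0.20 / 0.50 / 0.80 cutoffs from the text."""
    magnitude = abs(value)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.20:
        return "small"
    return "below small"  # label not defined in the text


# Hypothetical paired preoperative and postoperative scores.
pre = [40, 50, 60, 45, 55]
post = [75, 70, 90, 60, 85]
es = effect_size(pre, post)
srm = standardized_response_mean(pre, post)
print(f"effect size = {es:.2f} ({classify_effect(es)})")
print(f"standardized response mean = {srm:.2f} ({classify_effect(srm)})")
```

Because both statistics share the same numerator for paired data, their ratio
equals the standard deviation of the change scores divided by the standard
deviation of the preoperative scores. The effect size therefore exceeds the
standardized response mean exactly when the change scores are more variable
than the preoperative scores.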
Results
Test-Retest Reliability
There was acceptable test-retest reliability (intraclass correlation
coefficient of ≥0.75) for the overall ASES shoulder scale and for all
domains except sleeping on the affected side (Table I).
Internal Consistency
There was acceptable internal consistency (Cronbach alpha >0.60) for the
ASES shoulder scale in patients with shoulder instability (Cronbach alpha =
0.61), rotator cuff disease (Cronbach alpha = 0.64), and glenohumeral
arthritis (Cronbach alpha = 0.62).
Content Validity
The distribution of the preoperative ASES shoulder scale for the overall
study population of 455 patients with shoulder instability, 474 patients with
rotator cuff disease, and 137 patients with glenohumeral arthritis is shown in
Figure 1. For patients with
shoulder instability, the mean ASES shoulder scale was 64.0, the standard
deviation was 22.7, skewness was -0.52, and kurtosis was -0.67. For patients
with rotator cuff disease, the mean ASES shoulder scale was 56.5, the standard
deviation was 19.3, skewness was -0.08, and kurtosis was -0.60. For
glenohumeral arthritis patients, the mean ASES shoulder scale was 51.3, the
standard deviation was 18.6, skewness was -0.10, and kurtosis was -0.86. There
were acceptable (<15%) floor and ceiling effects of the ASES shoulder scale
for patients with shoulder instability, rotator cuff disease, and glenohumeral
arthritis (Table II).
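The distribution statistics reported above can be reproduced in outline with
the Python sketch below. It uses the simple moment-based definitions of
skewness and excess kurtosis; the text does not say which estimator was used,
and statistical packages often apply small-sample corrections, so values may
differ slightly from those of the original analysis. The function name and
sample scores are hypothetical.

```python
from statistics import mean, stdev


def describe(scores):
    """Return mean, sample SD, skewness, and excess kurtosis of the scores.

    Skewness and kurtosis use uncorrected central moments (an assumption):
    skewness = m3 / m2**1.5, excess kurtosis = m4 / m2**2 - 3.
    """
    n = len(scores)
    avg = mean(scores)
    m2 = sum((s - avg) ** 2 for s in scores) / n
    m3 = sum((s - avg) ** 3 for s in scores) / n
    m4 = sum((s - avg) ** 4 for s in scores) / n
    return {
        "mean": avg,
        "sd": stdev(scores),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical preoperative scores for eight patients.
summary = describe([22, 35, 48, 55, 63, 70, 78, 85])
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Negative skewness indicates a longer tail toward low scores, and negative
excess kurtosis indicates a distribution flatter than the normal curve, which
is the pattern reported for all three diagnostic groups.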
Criterion Validity
Significant correlations (p < 0.05) were found between the ASES shoulder
scale and the physical functioning (r = 0.57), role-physical (r = 0.32), and
bodily pain (r = 0.58) domains of the SF-12 scale
(Table III). There were appropriate nonsignificant correlations (p > 0.05)
between the ASES shoulder scale and the role-emotional, mental health,
vitality, and social function domains of the SF-12 scale (Table III).
Construct Validity
Significant differences or associations (p < 0.05) were found for all
twenty-three hypotheses (constructs). The results are shown in the Appendix.
Responsiveness
There were large (≥0.80) effect sizes and standardized response means for
patients with shoulder instability, rotator cuff disease, and glenohumeral
arthritis (Table IV). The largest effect
size was seen for patients with arthritis, and the largest standardized
response mean was seen for patients with rotator cuff disease. The effect size
was greater than the standardized response mean for patients with rotator cuff
disease and patients with arthritis because the standard deviation for the
change scores was greater than the standard deviation for the preoperative
scores in these patients.
Discussion
In this study, the ASES subjective shoulder scale demonstrated
acceptable psychometric performance for patients with shoulder instability,
rotator cuff disease, and glenohumeral arthritis. The ASES subjective shoulder
scale was reliable as it showed reproducibility when completed a second time
by the same patient (test-retest reliability), and it showed consistency
between questions (internal consistency). The ASES subjective shoulder scale
was valid as it showed minimal floor and ceiling effects (content validity),
correlated well with the appropriate domains of the SF-12 (criterion
validity), and followed accepted hypotheses (construct validity). The ASES
subjective shoulder scale was responsive as it showed large effect sizes after
treatment.
The limitations of this study include the use of a large, prospectively
maintained computerized database. Although this provided a large sample size
for the assessment of ASES subjective shoulder scale scores, there was
heterogeneity among the patients. Patients with shoulder instability included
those with anterior instability, multidirectional instability, and posterior
instability. Patients with rotator cuff disease included those with
partial-thickness tears and those with full-thickness tears. Patients with
glenohumeral arthritis included those with osteoarthritis, rheumatoid
arthritis, and varying degrees of chondrosis. In addition, there was variation
in the specific surgical techniques used. Although this heterogeneous
population and management may preclude focusing on a specific shoulder
diagnosis, such as traumatic anterior instability with a Bankart lesion
treated with open labral repair, it does allow for generalizability to more
diverse populations with various shoulder disorders and treatments. With
respect to construct validity, significant differences or associations were
found for all twenty-three constructs; however, it is not clear whether these
represent clinically important differences. The minimal difference in ASES
shoulder scale score that represents a clinically important difference in
functional status was not established. With respect to criterion validity, the
physical functioning, role-physical, and bodily pain domains of the SF-12
scale were used for comparison with the ASES score. These domains of the SF-12
and the Short Form-36 (SF-36) may not be accepted universally as criteria;
however, they are frequently used for the establishment of criterion validity
for orthopaedic condition-specific outcome instruments.
Michener et al. recently analyzed the psychometric properties of the ASES
shoulder score in sixty-three patients with varied diagnoses, including
impingement syndrome, instability, rotator cuff disease, adhesive capsulitis,
glenohumeral arthritis, shoulder weakness, and humeral fractures [19].
Similar to our results, they found acceptable test-retest reliability
(intraclass correlation coefficient = 0.84), internal consistency (Cronbach
alpha = 0.86), criterion validity (with the SF-36), construct validity, and
responsiveness (standardized response mean = 1.5). These authors estimated
that the minimal change in the ASES score that was clinically important was
6.4.
Hollinshead et al. found high correlations between the Rotator Cuff Quality
of Life measure, the Functional Shoulder Elevation Test, and the ASES shoulder
scale [15]. Gartsman et al. reported substantial improvement in the Constant
and Murley shoulder score, the University of California at Los Angeles (UCLA)
shoulder score, and the ASES shoulder scale after arthroscopic repair of
full-thickness tears of the rotator cuff [2]. Skutek et al. correlated the
ASES shoulder scale to the Constant and Murley shoulder score [1]. Beaton and
Richards found the ASES shoulder scale, the Simple Shoulder Test, and the
Shoulder Severity Index to be reliable and responsive [3]. Cook et al. showed
acceptable test-retest reliability and internal consistency for the ASES
shoulder scale, the UCLA shoulder score, the Constant and Murley shoulder
score, and the Shoulder Pain and Disability Index [10].
A comprehensive assessment of shoulder outcomes would include a generic
measure of health-related quality of life, a shoulder-specific measure of
function, and a measure of patient satisfaction. Outcomes assessment of the
management of shoulder instability, rotator cuff disease, and glenohumeral
arthritis has typically used various shoulder-specific outcome instruments.
Consensus should be formed with regard to a standardized approach to outcomes
assessment for shoulder disorders. Widespread adoption of a single reliable,
valid, and responsive shoulder-specific outcome instrument would allow for the
rigorous comparison of treatment results. In this study, we found that the
ASES subjective shoulder scale demonstrated acceptable psychometric
performance for patients with shoulder instability, rotator cuff disease, and
glenohumeral arthritis. However, the ASES subjective shoulder scale score may
not be the optimal outcome measure. Reliability was acceptable, but may not be
precise enough to be used on an individual basis. Other shoulder-specific
outcome instruments were not compared in the same group of patients.
Psychometric testing of other shoulder-specific instruments in patients with
varied shoulder disorders would be helpful to allow for comparison of
psychometric properties and may provide the impetus for the formal development
of a widely accepted shoulder-specific outcome instrument.
A table describing the details of the construct validity study is available
with the electronic versions of this article, on our web site at
(go to
the article citation and click on "Supplementary Material") and on
our quarterly CD-ROM (call our subscription department, at 781-449-9780, to
order the CD-ROM).