The management of chondral injuries of the knee has recently received much
attention with techniques such as abrasion arthroplasty, microfracture,
autologous osteochondral grafting, allogeneic osteochondral grafting,
periosteal grafting, and autologous chondrocyte
implantation1-18.
Outcomes assessment after the treatment of chondral disorders of the knee has
involved the use of various outcome instruments, such as the International
Knee Documentation Committee (IKDC)
form1, the Tegner
activity
scale2,3,
the Cincinnati knee
scale2,4-8,
the Hospital for Special Surgery knee
scale9-11,
the Western Ontario and McMaster Universities Osteoarthritis Index
(WOMAC)6,12-14,
the Knee Society knee
scale6, and the
Lysholm knee
scale3,8,15,16.
Recently, the International Cartilage Repair Society folded its assessment
documentation into the new IKDC
form19.
The Lysholm knee scale is a condition-specific outcome measure that
contains eight domains: limp, locking, pain, stair-climbing, support,
instability, swelling, and squatting (see
Appendix)20,21.
An overall score of 0 to 100 is calculated, with 95 to 100 indicating an
excellent result; 84 to 94, a good result; 65 to 83, a fair result; and
<65, a poor result. Originally designed to assess ligament injuries of the
knee, the Lysholm knee scale has been used for a variety of knee conditions,
including chondral disorders.
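The score bands quoted above can be expressed as a small lookup. This is a minimal sketch; the function name `lysholm_category` is an illustrative assumption and is not part of the published scale.

```python
def lysholm_category(score: int) -> str:
    """Map an overall Lysholm score (0 to 100) to the result band given in the text."""
    if not 0 <= score <= 100:
        raise ValueError("Lysholm score must lie between 0 and 100")
    if score >= 95:
        return "excellent"
    if score >= 84:
        return "good"
    if score >= 65:
        return "fair"
    return "poor"  # any score below 65


# Hypothetical scores, one per band boundary
example_bands = [lysholm_category(s) for s in (100, 94, 83, 64)]
```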
The use of outcome instruments whose psychometric properties have been
rigorously established is essential. The important psychometric properties of
an outcome instrument include reliability, validity, and
responsiveness22.
Reliability refers to the reproducibility of the measure, either within the
same subjects on repeated administration (test-retest reliability) or between
observers (interobserver reliability). Validity questions whether an outcome
instrument actually
measures what it is intended to measure. Components of validity include
content validity ("face" validity and floor and ceiling effects),
criterion validity (how an instrument compares with an accepted so-called
gold-standard instrument), and construct validity (does the instrument follow
expected noncontroversial hypotheses?). Responsiveness assesses the ability of
the instrument to detect change over time or in response to treatment.
The purpose of this study was to determine the psychometric properties of
the Lysholm knee scale for various chondral disorders of the knee.
Study Groups
Three study groups were used. Group A comprised 1657 patients with chondral
disorders of the knee. The Lysholm scale, demographic data, subjective
assessment, and objective assessment were measured preoperatively and at
periodic postoperative intervals (three, six, and twelve months and yearly
thereafter) and were maintained prospectively in a computerized database. The
mean age of the patients was forty-four years (range, fourteen to eighty-eight
years). A total of 1011 patients (61%) were male and 646 (39%) were female.
Chondral lesions included traumatic chondral injuries involving only one
compartment in 679 patients (41%), traumatic chondral injuries involving two
or more compartments in 249 patients (15%), and degenerative chondral lesions
in 729 patients (44%). Of the 679 patients with traumatic unicompartmental
chondral lesions, associated lesions included ligament injuries in 230
patients and meniscal injuries in 285 patients. Of the 249 patients with
traumatic multicompartmental chondral lesions, associated lesions included
ligament injuries in sixty-five patients and meniscal injuries in 107
patients. Of the 729 patients with degenerative chondral lesions, associated
lesions included ligament injuries in eighty patients and meniscal injuries in
277 patients.
Group B was a subset of Group A and comprised fifty-seven patients with a
variety of arthroscopically documented chondral disorders of the knee. The
Lysholm knee scale, demographic data, subjective assessment, and objective
assessment were measured preoperatively at two different times, no more than
four weeks apart. Postoperative assessment was the same as that for Group A.
The data were maintained prospectively in a computerized database.
Twenty-three patients had traumatic unicompartmental chondral lesions, nine
patients had traumatic chondral lesions involving two or more compartments,
and twenty-five patients had degenerative chondral lesions. Of the
twenty-three patients with traumatic unicompartmental chondral lesions,
fifteen had isolated chondral lesions, five had associated ligament injuries,
and three had associated meniscal injuries. Of the nine patients with
traumatic chondral lesions involving two or more compartments, three had
isolated chondral lesions, five had associated ligament injuries, and one
patient had an associated meniscal injury. Of the twenty-five patients with
degenerative lesions, two had associated ligament injuries and eleven had
associated meniscal injuries. The mean age of the patients was forty-four
years (range, twenty-eight to sixty-four years). Thirty-five patients (61%)
were male.
Group C was a subset of Group A and was composed of 248 patients with
chondral injuries. In addition to the preoperative Lysholm knee scale,
patients completed the Short Form-12 (SF-12) health-related quality-of-life
scale23, the
Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), and
the Tegner activity scale. Postoperative assessment was the same as that for
Group A. Data were maintained prospectively in a computerized database.
Degenerative chondral lesions were present in eighty-four patients,
unicompartmental traumatic chondral lesions were present in 125 patients, and
multicompartmental traumatic lesions were present in thirty-nine patients. The
mean age of these patients was forty years (range, thirteen to seventy-four
years), and 165 (67%) were male. Isolated chondral defects were present in 107
patients, associated anterior cruciate ligament injury was present in
sixty-six patients, associated meniscal injury was present in forty-seven
patients, and associated anterior cruciate ligament and meniscal injury was
present in twenty-eight patients.
Test-Retest Reliability
Test-retest reliability was determined in Group B. These patients completed
an original preoperative questionnaire and a second preoperative questionnaire
within four weeks of the original questionnaire. There was no interval change
in health status as ascertained by health history forms with a complete review
of systems. The intraclass correlation coefficient was determined for the
overall Lysholm scale and for the component scales of pain, instability,
locking, stair-climbing, limp, support, swelling, and squatting. An intraclass
correlation coefficient of >0.70 was considered
acceptable22.
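The text does not state which form of the intraclass correlation coefficient was calculated. As a minimal sketch, the code below assumes a two-way random-effects, absolute-agreement, single-measurement coefficient, often written ICC(2,1). The function name `icc_2_1` and the sample scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per patient and one column per administration.

    Assumption: two-way random effects, absolute agreement, single measurement.
    """
    n = len(scores)       # number of patients
    k = len(scores[0])    # number of administrations (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between administrations
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical [first, second] preoperative Lysholm scores for five patients
retest_scores = [[58, 60], [72, 70], [45, 49], [90, 88], [63, 63]]
icc_value = icc_2_1(retest_scores)
meets_threshold = icc_value > 0.70  # acceptability threshold used in the text
```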
Internal Consistency
Internal consistency was determined in Group A. Preoperative Lysholm knee
scales were used to establish internal consistency, and overall internal
consistency for all eight domains was determined. A Cronbach alpha of >0.60
was considered
acceptable22.
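Cronbach alpha is conventionally computed from the item variances and the variance of the summed score. The sketch below uses that standard formula with sample variances; the function name and the example data are illustrative assumptions, not values from this study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for a table with one row per patient and one column per domain."""
    k = len(items[0])  # number of domains (eight for the Lysholm knee scale)
    item_variances = [variance([row[j] for row in items]) for j in range(k)]
    total_variance = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores on three domains for five patients
domain_scores = [
    [5, 10, 15],
    [3, 6, 10],
    [5, 10, 25],
    [0, 2, 5],
    [3, 6, 15],
]
alpha = cronbach_alpha(domain_scores)
acceptable = alpha > 0.60  # threshold used in the text
```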
Content Validity
Content validity was determined in Group A. Preoperative Lysholm knee
scales were used to establish content validity. Floor effects (the proportion
of patients who obtain the lowest possible score) and ceiling effects (the
proportion of patients who obtain the highest possible score) were determined
for the overall Lysholm scale and for the eight domains. Floor and ceiling
effects of <30% were considered
acceptable22.
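Floor and ceiling effects as defined above are simple proportions. A minimal sketch follows; the function names, the 0-to-5 domain range, and the sample scores are hypothetical.

```python
def floor_ceiling(scores, min_score, max_score):
    """Return (floor, ceiling): the fractions of patients at the lowest and highest possible score."""
    n = len(scores)
    floor = sum(1 for s in scores if s == min_score) / n
    ceiling = sum(1 for s in scores if s == max_score) / n
    return floor, ceiling


def is_acceptable(proportion, threshold=0.30):
    """Effects below 30% are treated as acceptable, following the text."""
    return proportion < threshold


# Hypothetical scores on a domain assumed to range from 0 to 5
domain = [0, 0, 2, 5, 5, 5, 2, 2, 4, 4]
floor, ceiling = floor_ceiling(domain, min_score=0, max_score=5)
```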
Criterion Validity
Criterion validity was determined in Group C. Preoperative Lysholm knee
scales were used to establish criterion validity. Correlation of the overall
Lysholm scale to domains (physical functioning, role-physical, and bodily
pain) of the SF-12 health-related quality-of-life scale, to domains (pain,
stiffness, and function) of the WOMAC, and to the Tegner activity scale was
performed. The Pearson correlation coefficient was used for the continuous
outcome measures (SF-12 and Tegner), whereas the Spearman rho was used for the
categorical outcome measure (WOMAC).
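The two correlation coefficients named above differ only in whether raw values or ranks are correlated. The sketch below implements both in plain Python, with tied values given their average rank; the function names and sample data are illustrative assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical paired scores: overall Lysholm score and a second instrument
lysholm = [40, 55, 62, 70, 85]
other_scale = [30, 50, 48, 75, 90]
r = pearson_r(lysholm, other_scale)
rho = spearman_rho(lysholm, other_scale)
```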
Construct Validity
Construct validity was studied in Group A. Preoperative Lysholm knee scales
were used to establish construct validity. Nine hypotheses (constructs) were
developed by consensus and were tested in these patients:
Patients with lower activity levels would have lower scores on the Lysholm
knee scale. Activity level was measured on a 5-point ordinal scale, with 1
point indicating an inactive patient and 5 points indicating an extremely
active patient. The Spearman rho was used to determine significance.
Patients with a greater number of chondral surfaces with
Outerbridge24
grade-4 changes would have lower scores on the Lysholm knee scale. The number
of surfaces (zero to six) included the lateral tibial plateau, lateral femoral
condyle, medial tibial plateau, medial femoral condyle, patella, and trochlear
groove. Analysis of variance was used to determine significance.
Patients with full-thickness chondral defects (Outerbridge grade 4) would
have lower scores on the Lysholm knee scale than would patients with
partial-thickness chondral defects (Outerbridge grade 3). The independent
samples t test was used to determine significance.
Patients with chondral defects and associated meniscal tears would have
lower scores on the Lysholm knee scale than would patients with isolated
chondral defects. The independent samples t test was used to determine
significance.
Patients who had more difficulty with the activities of daily living would
have lower scores on the Lysholm knee scale than would patients who had less
difficulty with the activities of daily living. Difficulty with the activities
of daily living was measured on a 10-point ordinal scale, with 1 point
indicating an inability to do the activities of daily living and 10 points
indicating no difficulty with the activities of daily living. The Pearson
correlation coefficient was used to determine significance.
Patients who had more difficulty working because of the knee would have
lower scores on the Lysholm knee scale than would patients who had less
difficulty working because of the knee. Difficulty with working because of the
knee was measured on a 10-point ordinal scale, with 1 point indicating an
inability to work because of the knee and 10 points indicating no difficulty
with working because of the knee. The Pearson correlation coefficient was used
to determine significance.
Patients who had more difficulty with sports because of the knee would have
lower scores on the Lysholm knee scale than would patients who had less
difficulty with sports because of the knee. Difficulty with sports because of
the knee was measured on a 10-point ordinal scale, with 1 point indicating an
inability to participate in sports because of the knee and 10 points
indicating no difficulty with sports because of the knee. The Pearson
correlation coefficient was used to determine significance.
Patients with previous knee surgery would have lower scores on the Lysholm
knee scale than would patients without previous knee surgery. The independent
samples t test was used to determine significance.
Patients with a poorer assessment of overall knee function would have lower
scores on the Lysholm knee scale than would patients with a better assessment
of overall knee function. Overall knee function was assessed on an ordinal
10-point scale, with 1 indicating severely poor knee function and 10
indicating excellent knee function. The Pearson correlation coefficient was
used to determine significance.
Responsiveness
Responsiveness to change was assessed in Group C. The preoperative scores
on the Lysholm knee scale were compared with the scores at a mean of 51.2
months (range, 12.5 to 79.4 months) after treatment with arthroscopic
microfracture16.
The effect size was calculated with the formula: (mean postoperative
score − mean preoperative score)/standard deviation of the preoperative score.
The standardized response mean was calculated with the formula: (mean
postoperative score − mean preoperative score)/standard deviation of the
change in score. Effects of ≥0.20 were considered small, effects of ≥0.50 were
considered moderate, and effects of ≥0.80 were considered large22.
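The two responsiveness formulas above can be sketched directly from paired preoperative and postoperative scores. The function names and sample scores are hypothetical, and the cutoffs are read as lower bounds (0.20, 0.50, 0.80), which is an interpretation of the thresholds stated in the text.

```python
from statistics import mean, stdev


def effect_size(pre, post):
    """(mean postoperative - mean preoperative) / SD of the preoperative scores."""
    return (mean(post) - mean(pre)) / stdev(pre)


def standardized_response_mean(pre, post):
    """Mean paired change / SD of the paired changes."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


def magnitude(value):
    """Label an effect using the cutoffs quoted in the text, read as lower bounds."""
    size = abs(value)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "below small"


# Hypothetical paired Lysholm scores for three patients
pre_scores = [50, 60, 70]
post_scores = [70, 75, 95]
es = effect_size(pre_scores, post_scores)
srm = standardized_response_mean(pre_scores, post_scores)
```

Because the scores are paired, the mean of the changes equals the difference between the two means, so both statistics share a numerator and differ only in the standard deviation used as the denominator.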
Test-Retest Reliability
The overall Lysholm knee scale demonstrated acceptable test-retest
reliability (intraclass correlation coefficient of >0.70)
(Table I). The component scores
for instability, locking, limp, support, swelling, and squatting also
demonstrated acceptable test-retest reliability
(Table I). There was less than
acceptable (intraclass correlation coefficient of <0.70) test-retest
reliability for the components of pain and stair-climbing
(Table I).
Internal Consistency
The Lysholm knee scale demonstrated acceptable internal consistency
(Cronbach alpha = 0.65).
Content Validity
The overall mean score on the Lysholm knee scale (and standard deviation)
was 58.3 ± 19.0 (range, 2 to 100) with acceptable (<30%) floor and
ceiling effects (Table II). The
domains of pain, swelling, limp, instability, support, stair-climbing, and
locking had acceptable (<30%) floor effects, and the domains of pain,
swelling, squatting, and stair-climbing demonstrated acceptable ceiling
effects (<30%). The domain of squatting had a high (>30%) floor effect,
and the domains of limp, instability, support, and locking had high (>30%)
ceiling effects (Table II).
Criterion Validity
Significant (p < 0.05) correlation was found between the overall Lysholm
knee scale and the physical functioning, role-physical, and bodily pain
domains of the SF-12 scale and the pain, stiffness, and function domains of
the WOMAC scale (Table III). In
addition, a significant (p < 0.05) correlation was demonstrated between the
overall Lysholm scale and the Tegner activity scale.
Construct Validity
All nine hypotheses (constructs), tested in the 1657 patients in Group A,
were found to be significant (p < 0.05).
Patients with lower activity levels had significantly lower scores on the
Lysholm knee scale (r = 0.410, p < 0.001).
Patients with a greater number of chondral surfaces with Outerbridge
grade-4 changes had significantly lower scores on the Lysholm knee scale (p
< 0.001) (Fig. 1). (The mean
score was 59.9 ± 19.0 for 776 patients with no surface involvement,
59.1 ± 19.3 for 396 patients with one surface involved, 55.3 ±
18.5 for 373 patients with two surfaces involved, 53.4 ± 16.5 for
sixty-three patients with three surfaces involved, 54.9 ± 19.6 for
thirty-four patients with four surfaces involved, 49.3 ± 10.6 for four
patients with five surfaces involved, and 42.6 ± 26.2 for eleven
patients with six surfaces involved.)
Patients with full-thickness chondral defects (Outerbridge grade 4) had
significantly lower scores on the Lysholm knee scale than did patients with
partial-thickness chondral defects (Outerbridge grade 3). (The mean score was
56.7 ± 19.0 for 881 patients with full-thickness defects and 59.9
± 19.0 for 776 patients with partial-thickness defects [p =
0.001].)
Patients with chondral defects and associated meniscal tears had
significantly lower scores on the Lysholm knee scale than did patients with
isolated chondral defects. (The mean score was 56.4 ± 19.2 for 795
patients with a chondral defect and meniscal tear and 59.1 ± 18.8 for
862 patients who had a chondral defect alone [p = 0.01].)
Patients who had more difficulty with the activities of daily living had
significantly lower scores on the Lysholm knee scale than did patients with
less difficulty with the activities of daily living (r = 0.421, p <
0.001).
Patients with more difficulty working because of the knee had significantly
lower scores on the Lysholm knee scale than did patients with less difficulty
working because of the knee (r = 0.407, p < 0.001).
Patients with more difficulty with sports because of the knee had
significantly lower scores on the Lysholm knee scale than did patients with
less difficulty with sports because of the knee (r = 0.330, p < 0.001).
Patients with previous knee surgery had significantly lower scores on the
Lysholm knee scale than did patients without previous knee surgery. (The mean
score was 56.7 ± 18.9 for the 848 patients who had previous surgery and
59.9 ± 19.0 for the 774 who had no previous surgery [p = 0.001]. Data
were not available for thirty-five patients.)
Patients with a poorer assessment of overall knee function had
significantly lower scores on the Lysholm knee scale than did patients with a
better assessment of overall knee function (r = 0.475, p < 0.001).
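Several of the group comparisons above report only means, standard deviations, and group sizes. As a rough sketch, an independent-samples t statistic can be recovered from such summary values. The code assumes a pooled-variance (Student) test, which the text does not specify, and uses a normal approximation for the two-sided p value that is reasonable only for large samples such as these. The function names are illustrative.

```python
from math import erfc, sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student t statistic and degrees of freedom from group summary statistics."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


def two_sided_p_normal(t):
    """Two-sided p value from the standard normal distribution (large-sample approximation)."""
    return erfc(abs(t) / sqrt(2))


# Summary values reported in the text: full-thickness versus partial-thickness defects
t_stat, df = pooled_t(56.7, 19.0, 881, 59.9, 19.0, 776)
p_approx = two_sided_p_normal(t_stat)
```

A negative t statistic here means that the first group (full-thickness defects) had the lower mean score.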
Responsiveness
The overall Lysholm knee scale demonstrated a large (≥0.80) overall
effect size ([81.6 − 58.2]/20.2 = 1.16) and a large (≥0.80) overall
standardized response mean ([81.6 − 58.2]/21.3 = 1.10)
(Table IV). The domains of
pain, limp, swelling, and squatting had large (≥0.80) effect sizes and
standardized response means, and the domains of locking, stair-climbing, and
support had moderate (≥0.50) effect sizes and standardized response means. There
was a small (≥0.20) effect size and standardized response mean for the
domain of instability.
In the present study, the Lysholm knee scale demonstrated, in general,
acceptable psychometric parameters (test-retest reliability, internal
consistency, floor and ceiling effects, criterion validity, construct
validity, and responsiveness) to justify its use in outcomes assessment for
various chondral disorders of the knee. However, it may not be the optimal
outcome instrument. Other knee-specific outcome measures were not compared.
There were high (>30%) floor effects for the domain of squatting and
high (>30%) ceiling effects for the domains of limp, instability, support, and
locking. Thus, these domains may lack the ability to differentiate functional
status for chondral disorders of the knee. There was less than acceptable
test-retest reliability for the domains of pain and stair-climbing. Thus,
these domains may lack the repeatability necessary for scientific precision
and may require further refinement to improve reliability. There was a small
effect in terms of responsiveness for the domain of instability. Thus, this
domain may lack the ability to measure changes in function over time or in
response to treatment.
The limitations of this study include the heterogeneity of the study
population. All of these patients had chondral disorders of the knee; however,
they represented various lesions. Some had focal traumatic chondral defects,
whereas others had diffuse degenerative chondral lesions. Some were isolated
chondral lesions, whereas others had associated meniscal or ligament injuries.
Some had unicompartmental lesions, whereas others had multicompartmental
lesions. This heterogeneity in the validation population precludes focusing
specifically on patients with isolated, traumatic, unicompartmental chondral
defects. Therefore, the Lysholm knee scale may not be measuring the impact of
the chondral lesion alone on knee function. However, the use of a large,
prospectively maintained computerized database on 1657 patients with varied
lesions allows for generalizability to more diverse populations with chondral
disorders in the knee as commonly seen in clinical settings. When studying
construct validity, significant differences or associations were found for all
nine constructs; however, it is not clear whether these represent clinically
important differences. The minimal difference in the Lysholm knee scale that
represents a clinically important difference in functional status has not been
established. A subset of 248 patients was assessed with the SF-12, WOMAC, and
Tegner scales in order to determine criterion validity and responsiveness to
change. Responsiveness was measured after intervention with arthroscopic
microfracture. There is controversy with regard to the optimal technique of
chondral resurfacing; however, microfracture is an established and frequently
performed
procedure16.
The general psychometric performance of the Lysholm knee scale has been
previously examined. Irrgang et al. studied the Activities of Daily Living
Scale of the Knee Outcome Survey and the Lysholm knee scale in 397 patients
with varied diagnoses, including ligament injury, arthritis, meniscal tear,
tendinitis, patellofemoral pain, and plica
syndrome25. For
internal consistency of the Lysholm knee scale, they found Cronbach alpha
values of 0.60 to 0.73, which were similar to the value (0.65) observed in the
present study. For criterion validity, correlations between the Lysholm scale
and a global rating of function were 0.54 to 0.57. For responsiveness of the
Lysholm scale, they found effect sizes of 0.82 to 1.13, which were slightly
lower than the value (1.16) observed in the present study. Marx et al. studied
the reliability, validity, and responsiveness of the Cincinnati knee-rating
system, the American Academy of Orthopaedic Surgeons sports knee-rating scale,
the Activities of Daily Living scale of the Knee Outcome Survey, and the
Lysholm knee scale in subsets of thirty-one patients (reliability), 133
patients (validity), and forty-two patients (responsiveness) with a variety of
knee disorders26.
For reliability of the Lysholm knee scale, they found an intraclass
correlation coefficient of 0.95, similar to the intraclass correlation
coefficient (0.91) that we observed in the present study. For criterion
validity of the Lysholm knee scale, they found substantial correlations
between the Lysholm knee scale and the physical functioning, role-physical,
and bodily pain subscales of the SF-36, which were similar to the significant
correlations that we observed with these subscales of the SF-12. They also
found substantial correlations between the Lysholm knee scale and the other
knee scales, and both the physician and patient rating of severity. Those
authors found no ceiling or floor effects with the Lysholm scale in their
patient population. For responsiveness of the Lysholm knee scale, they found a
standardized response mean of 0.9, similar to the value (1.10) that we
observed in the present study. Other authors have correlated the Lysholm knee
scale to a single-assessment numeric evaluation of
function27 and
other instruments, including the Feagin and Blake scale, the Hospital for
Special Surgery knee scale, and the IKDC
scale28. Although
the aforementioned studies examined the general psychometric properties of the
Lysholm knee scale in patients with varied diagnoses, the validity of the
scale for patients with chondral disorders has not been established.
The management of chondral injuries in the knee has received much
attention1-18.
A standardized approach to outcomes assessment for chondral disorders of the
knee would be helpful. A comprehensive assessment of outcomes would include a
generic measure of health-related quality of life, a knee condition-specific
measure of function, and a measure of patient satisfaction. In general, the
Lysholm knee scale demonstrated acceptable psychometric parameters; however,
certain domains demonstrated high ceiling or floor effects, which limit their
discriminative ability, and certain domains demonstrated less than acceptable
test-retest reliability, which limits their repeatability. Psychometric
testing of other condition-specific knee instruments in patients with chondral
disorders of the knee would be helpful to allow for comparison of psychometric
properties and may provide the impetus for the formal development of a widely
accepted chondral-specific outcome instrument.
A table showing the details of the Lysholm knee scale is available with the
electronic versions of this article, on our web site at
(go to the article citation and click on "Supplementary Material")
and on our quarterly CD-ROM (call our subscription department, at
781-449-9780, to order the CD-ROM).