We performed a cross-sectional study of a consecutive cohort of patients
who underwent arthroscopic débridement for osteoarthritis of the knee.
We obtained human-subjects-protection approval from the institutional review
board of the Lifespan Academic Medical Center. Patients were eligible for
arthroscopic débridement if they met American College of Rheumatology
criteria for osteoarthritis of the tibiofemoral joint and treatment with oral
anti-inflammatory medications had failed. Criteria for inclusion in the study
were an age of eighteen to seventy years, an osteoarthritis grade of 2 or
higher according to the Kellgren-Lawrence radiographic
scale [10], and the
ability to communicate regarding the outcomes of the procedure. Patients who
had previously sustained a traumatic injury to the knee were included, but
those with previous infection were not. Other exclusion criteria were
osteoarthritis involving the patellofemoral joint only, a diagnosis other than
osteoarthritis (including isolated cartilage lesions without arthritis), and
confounding diagnoses (such as osteoarthritis of the ipsilateral hip or
radiculopathy). One hundred and twenty-two patients were eligible for the
study; twelve (10%) were lost to follow-up, leaving 110 patients available for
data analysis.
The cohort analyzed consisted of thirty-six men and seventy-four women with
an average age of 61.7 years and a mean body mass index of 31.8 at the time of
the index arthroscopy. The patients had used a mean of two different
nonsteroidal anti-inflammatory drugs preoperatively, and all had discontinued
use of these drugs for at least four weeks prior to the index procedure.
Twenty of the 110 patients had received intra-articular corticosteroid
injections, but no patient had had an injection within three months before the
surgery. Twelve of the 110 patients had had a course of physiotherapy
consisting of quadriceps-strengthening and stationary bicycle riding. The mean
follow-up period was thirty-four months (range, twenty-four to seventy-four
months).
Clinical data were collected by a research nurse who had no knowledge of
the radiographic findings, intraoperative findings, or treatment expectations.
Pain was assessed preoperatively and postoperatively with the pain domain of
the Knee Society scoring system, which is a joint-specific score ranging from
0 to 50 points, with 50 points indicating no pain and 0 points indicating
severe pain [11]. All
patients were asked standard questions to facilitate scoring and group
assignment. Mild pain was defined as pain experienced only when the patient
walked, and it was usually occasional. Moderate pain, graded as either
occasional or continuous, was defined as being associated with limitations in
functional activity and occasional use of analgesics. Severe pain was present
at night or at rest and required the frequent use of analgesics. The mean
preoperative pain score in our series was 11.9 (of 50) points.
Symptoms were further classified as being primarily present in stance,
mechanical (for example, locking or buckling), or inflammatory. The body mass
index was calculated from height and weight measurements. A standard physical
examination of the knee was performed. Standing, weight-bearing
anteroposterior radiographs were used to measure the tibiofemoral angle and
the widths of the medial and lateral joint spaces [12]. The
severity of the arthritis was scored with the Kellgren-Lawrence radiographic
method [10]. All
radiographs were scored without knowledge of the patient's identity, clinical
data, intraoperative findings, or treatment outcome. Each radiograph was
measured by two observers so that we could calculate the interobserver error,
and both observers scored the radiographs on two occasions, approximately
eight weeks apart, so that we could calculate the intraobserver error. The
distribution of Kellgren-Lawrence grades, joint space widths, and tibiofemoral
angles is presented in Table I.
All of the arthroscopic procedures were carried out by one surgeon
(R.K.A.), who used superolateral, anterolateral, and anteromedial portals and
a Dyonics 4-mm arthroscope (Smith and Nephew, Andover, Massachusetts).
Cartilage lesions were inspected and were palpated with an angled probe with a
5-mm tip to stage their severity. Lesion severity was scored with a
modification of the grading system described by Noyes and Stabler [13]. Each
articular surface was graded, and the scores were added to create
compartmental scores and a whole knee score. The distribution of the
intraoperative scores is shown in Table I.
Following grading, a limited surgical débridement of damaged
cartilage was performed with use of a motorized chondrotome. Loose flaps of
articular cartilage were resected, crater edges were smoothed, and loose
bodies were removed. Torn meniscal cartilage and hypertrophic synovial tissue
were resected. No bone-drilling or abrasion was done. The joint was then
irrigated and evacuated.
After the procedure, the patients wore a knee immobilizer and walked with
partial weight-bearing for two to three days. They then began range-of-motion
exercises and gait as tolerated.
The operating surgeon did not participate in the preoperative clinical or
radiographic assessment or the assessment of the postoperative outcome.
Statistical Methods
The data from the observers who performed the clinical, radiographic, and
intraoperative evaluations were collected by a database manager and were
analyzed according to a preestablished protocol by a biostatistician. On the
clinical basis presented below, we dichotomized the Knee Society pain scores
for selected analyses, with scores of 0 to 20 points indicating a treatment
failure and scores of ≥30 points indicating a treatment success. We used
Pearson correlations and simple (least-squares) linear regression to explore
associations between the postoperative pain score and the intraoperative
lesion-severity score. The paired t test was used to compare preoperative and
postoperative pain scores. All other bivariate analyses comparing patient
characteristics and operative outcomes by using the dichotomized pain severity
variable and by using the presence or absence of mechanical symptoms were
conducted with the independent-samples t test and the Pearson chi-square test.
All analyses were performed with use of Stata software, version 7 (Stata,
College Station, Texas), with p < 0.05 considered to be significant.
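The dichotomization rule described above can be sketched in Python as follows. This is a minimal illustration only and not the study's analysis code, which was run in Stata. The function name `classify_pain_score` is hypothetical, and returning `None` for scores of 21 to 29 points is an assumption, since the text defines only the failure range (0 to 20 points) and the success range (30 points or more).

```python
from typing import Optional

FAILURE_MAX = 20  # scores of 0 to 20 points: treatment failure
SUCCESS_MIN = 30  # scores of 30 points or more: treatment success
SCORE_MAX = 50    # the Knee Society pain domain ranges from 0 to 50 points


def classify_pain_score(score: int) -> Optional[str]:
    """Dichotomize a Knee Society pain score as 'success' or 'failure'.

    Returns None for scores of 21 to 29 points, which the text does not
    assign to either group (assumption made for this sketch).
    """
    if not 0 <= score <= SCORE_MAX:
        raise ValueError(f"pain score must be between 0 and {SCORE_MAX}")
    if score <= FAILURE_MAX:
        return "failure"
    if score >= SUCCESS_MIN:
        return "success"
    return None


# Example: classify a handful of postoperative scores.
labels = [classify_pain_score(s) for s in (0, 10, 20, 30, 45, 50)]
```

Keeping the two cutoffs as named constants makes it easy to check how sensitive a dichotomized analysis is to the chosen thresholds.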
Clinical Findings
The mean pain score for the entire cohort of 110 patients improved from
11.9 points preoperatively to 30.8 points postoperatively (t = -9.1; p <
0.001). No patient with a postoperative pain score of ≥30 points elected to
have further treatment, and all patients seeking further treatment, either
medication or total knee replacement, had a pain score of ≤20 points. This
allowed pain scores to be dichotomized as success or failure. Overall,
seventy-two (65%) of the 110 patients had substantial pain relief
postoperatively. Of the thirty-eight patients for whom the treatment failed,
seventeen (15% of the total population) underwent total knee replacement
during the follow-up period. The mean time to the total knee replacements was
fourteen months after the arthroscopy. Total knee replacement was performed in
three (5%) of the fifty-eight Kellgren-Lawrence grade-2 knees at a mean of
twenty months after the arthroscopy, in seven (22%) of the thirty-two grade-3
knees at a mean of ten months after the arthroscopy, and in seven (35%) of the
twenty grade-4 knees at a mean of fifteen months after the arthroscopy. The
severity of the preoperative pain and the type of symptoms had no influence on
the postoperative pain scores, with the numbers available. Sixty-two of the
110 patients had symptoms that were considered to be mechanical, six patients
had pain in stance only, thirty-six patients had inflammatory symptoms, and
the symptoms could not be specifically categorized in the remainder of the
patients. The physical examination revealed joint-line tenderness, pain, and
crepitation in essentially all patients; these findings did not reflect the
type of symptoms or predict the presence or absence of meniscal tears. Of the
patients who had improvement after the arthroscopy, forty-nine (68%) had it
within one year and most (forty-four) had it within the first six months
(Fig. 1). Twenty-three other
patients continued to have improvement, some for up to 2.5 years after the
arthroscopy.
Radiographic Findings
The Kellgren-Lawrence grades were reproduced with an intraobserver
reliability of ±0.27 on repeated measurements. Intraobserver and
interobserver error was 1° (tibiofemoral angle) and 1 mm (joint space
width) on repeated measurements. The severity of the arthritis, as graded with
the Kellgren-Lawrence system, had a profound effect on outcome (p < 0.0001)
(Fig. 2). Forty-nine (84%) of
the fifty-eight knees with minimal radiographic changes (grade 2) had
substantial pain relief postoperatively and were considered to have a clinical
success, whereas only five (25%) of the twenty knees with severe
osteoarthritis (grade 4) had adequate pain relief. Interestingly, of the
thirty-two knees with moderate osteoarthritis (grade 3), seventeen (53%) had
pain relief and the remainder had a treatment failure. (Further analysis of
this group is presented below.) Thirty-five (60%) of the fifty-eight patients
with grade-2 osteoarthritis had improvement within the first six months after
the arthroscopy. However, an additional twelve patients demonstrated
improvement at twenty-four to thirty-six months postoperatively
(Fig. 3). Six (19%) of the
thirty-two patients with grade-3 arthritis had slow, steady improvement for
the first twenty-four months after the arthroscopy. Minimal improvement was
seen in the knees with grade-4 osteoarthritis.
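As an illustration of how the association between Kellgren-Lawrence grade and outcome could be checked, the sketch below applies a Pearson chi-square test to the success counts reported in this paragraph (49 of 58, 17 of 32, and 5 of 20 knees). The text does not state exactly which test produced the reported p value for this comparison, so the choice of a 3 × 2 chi-square test is an assumption, and the helper `pearson_chi_square` is a hypothetical name. The counts are taken as printed.

```python
import math

# (successes, failures) by Kellgren-Lawrence grade, from the counts in the text.
counts = {
    2: (49, 58 - 49),
    3: (17, 32 - 17),
    4: (5, 20 - 5),
}


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a rows-by-columns count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


table = [list(counts[grade]) for grade in sorted(counts)]
chi2, dof = pearson_chi_square(table)

# With exactly 2 degrees of freedom, the chi-square upper-tail
# probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2) if dof == 2 else float("nan")
```

Under these assumptions the resulting p value falls below 0.0001, which is in line with the significance level reported for the grade-outcome association.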
With the numbers available, we found no association between limb alignment
(the tibiofemoral angle) and the outcome when limb alignment was considered to
be a continuous function, but we did find an association when it was
considered relative to a presumptive normal range of 5° to 9°
(Fig. 4). Twenty-two (85%) of
the twenty-six knees with a normal tibiofemoral angle had minimal
postoperative pain and were considered to have had successful treatment. The
postoperative pain score was 42 (of 50) points for the knees with a normal
tibiofemoral angle. The mean pain score (27 points) was lower for the knees in
malalignment, and fewer of those knees were considered to have had successful
treatment (p = 0.0013). Knees in valgus alignment did particularly poorly,
whereas thirty-six (64%) of the fifty-six knees with mild varus (a
tibiofemoral angle of 0° to 4°) did well. Medial and lateral joint
space widths were associated with the outcome when they were considered as
either continuous or dichotomous functions. A joint space width of ≤2 mm,
particularly on the medial side, was associated with poorer postoperative pain
scores and a higher likelihood of treatment failure compared with a joint
space width of ≥3 mm (p < 0.001). Knees with a medial joint space width
of 0 to 2 mm had a mean postoperative pain score (and standard deviation) of
14.7 ± 4.4 points compared with a score of 33.2 ± 1.9 points for
knees with a medial joint space width of ≥3 mm (p = 0.0001). Five (31%) of
the sixteen knees with a preoperative medial joint space width of ≤2 mm had
substantial pain relief and successful treatment, whereas sixty-three (69%) of
the ninety-one knees with a medial joint space width of ≥3 mm were
considered to have had successful treatment.
Intraoperative Findings
The associations between lesion severity as assessed during the operation
and postoperative pain, considered as continuous variables, are presented in
Figure 5. More severe cartilage
lesions in all three compartments were associated with poorer postoperative
pain scores, with the strongest relationship observed between the whole-joint
lesion-severity score and postoperative pain. It is most likely that the
associations were moderately strong because pain reflected multicompartmental
disease. When considered as a dichotomous variable, the severity of the
cartilage lesions corresponded strongly with outcome. A whole-joint score of
≥30 points was most predictive of clinical failure. The eighty-nine knees
with less severe lesions (<30 points) had a mean postoperative pain score
of 28.4 ± 1.5 points, and sixty-nine (78%) of those knees met the
criteria for successful treatment. Conversely, only two (10%) of the
twenty-one knees with a whole-joint lesion-severity score of ≥30 points had
clinical success. The mean pain score in this group was 17.5 ± 0.8
points (p < 0.001). A score of <12 points for the severity of the lesion
in the medial compartment was most commonly associated with clinical benefit.
The mean lesion-severity scores for each compartment are presented in
Table II. Significant
differences in these mean scores were observed between the knees with and
those without a clinically successful result.
Meniscal tears were found in seventy-nine knees; crystal deposition, in
eight; and loose bodies, in nine. With the numbers available, these findings
had no influence on the outcome, probably because of the ubiquity of meniscal
tears and the relative rarity of loose bodies or crystal deposition.
Analysis of Kellgren-Lawrence Grade-3 Knees
Since there were approximately equal numbers of clinical successes and
failures in the group of patients with Kellgren-Lawrence grade-3
osteoarthritis of the knee, this group was analyzed further in an attempt to
identify preoperative clinical characteristics that might be associated with
outcome. However, no such characteristics could be identified with the numbers
available. Nearly all of the grade-3 knees had alignment that was close to the
normal range and joint space widths of ≥3 mm; thus, we found no additional
radiographic predictors beyond the Kellgren-Lawrence grade. With both the
intraoperative lesion-severity scores and postoperative pain considered as
continuous variables, knees in which more severe lesions were seen
intraoperatively were less likely to have a successful clinical outcome (r =
-0.5; p = 0.0034). Knees that were considered to be clinical successes (a pain
score of ≥30 points) had a mean intraoperative lesion-severity score of
18.5 points, whereas knees that were clinical failures had a mean
intraoperative lesion-severity score of 25.0 points (p = 0.013). Knees with an
intraoperative lesion-severity score of <30 points had a mean postoperative
pain score of 28.9 points (in the range of clinical success), whereas those
with an intraoperative score of ≥30 points had a mean postoperative pain
score of 12.5 points (in the range of the scores for patients requiring
further treatment). Although this difference was not significant (p = 0.10),
post hoc sample-size estimates revealed that an increase in the sample size of
10% would bring this comparison to a level of significance (p = 0.04). When
dichotomized, postoperative pain scores were less well predicted by
intraoperative lesion-severity scores.
A consensus on the role of arthroscopy in the treatment of osteoarthritis of
the knee has been elusive for many years. Osteoarthritis has a clinical
spectrum of severity, with or without coexisting mechanical derangements.
However, in most studies of which we are aware, the authors did not explore
the role of arthroscopy in subsets of patients with varying disease severity
but rather aggregated the outcome scores of all knees without regard to the
extent of the arthritis. The older literature, consisting largely of
single-cohort studies of forty-four to 441 patients, has described
satisfactory clinical outcomes in 50% to 78% of knees followed for twelve to
forty-two months [1-5].
Generally speaking, about two-thirds of knees with osteoarthritis have a good
clinical response to arthroscopy. The authors of one study reported clinical
success rates of 80% (thirty-two of forty knees) at twelve months
postoperatively and 59% (nineteen of thirty-two knees) at sixty
months [14]. Some
studies in which the authors attempted to compare the results of arthroscopic
intervention with those in control populations have had methodological flaws
that make interpretation of their findings difficult. One comparison with
joint lavage did demonstrate some improvement with arthroscopy, but post hoc
sample-size estimates demonstrated the likelihood of type-II (β)
errors [6]. The authors
of a recent study compared arthroscopic débridement with sham surgery
in a population of men in the United States Department of Veterans Affairs
hospital system [7],
but osteoarthritis of the knee has a female predominance, making the results
of that study difficult to generalize to the population at large. Also, the
authors did not perform subgroup analyses to determine if there were subsets
of patients who may have received benefit from the arthroscopic treatment.
Finally, and most importantly, the investigators inappropriately used a
radiographic grading system that obscured the severity of arthritis. The
Kellgren-Lawrence grading system scores the severity of osteoarthritis of the
entire joint on a scale of 0 to 4
points [10]. In the
Veterans Affairs study, each compartment was scored on a scale of 0 to 4
points, with a maximum whole-joint score of 12 points. A score of 4 points,
therefore, could mean either mild osteoarthritis of three compartments or
severe osteoarthritis of one compartment, and a grade of mild or moderate
osteoarthritis gives no assurance that one compartment is not severely
involved. Thus, that method obscures the severity of osteoarthritis and makes
interpretation of mild and moderate arthritis difficult.
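The ambiguity of a summed compartmental score can be made concrete with a short enumeration. The sketch below lists every way three compartment grades of 0 to 4 points can add up to a whole-joint score of 4 points. The compartment order (medial, lateral, patellofemoral) and the function name `compartment_patterns` are assumptions made for illustration; the text specifies only that three compartments were each scored from 0 to 4.

```python
from itertools import product

GRADES = range(5)  # each compartment is graded from 0 to 4 points
TARGET_SUM = 4     # whole-joint score discussed in the text


def compartment_patterns(total):
    """All (medial, lateral, patellofemoral) grade triples that add up to total."""
    return [grades for grades in product(GRADES, repeat=3) if sum(grades) == total]


patterns = compartment_patterns(TARGET_SUM)

# The worst single-compartment grade that each pattern can hide.
worst_grades = sorted({max(grades) for grades in patterns})
```

The same summed score of 4 points arises from fifteen different grade patterns, and the most severely affected compartment in those patterns ranges from grade 2 to grade 4, so the sum alone cannot distinguish diffuse mild disease from severe single-compartment disease.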
The carefully defined cohort in our study was representative of the overall
population of patients with osteoarthritis of the knee, and we took measures
to make our observations both internally valid and generalizable to that
population. Features to minimize bias were incorporated into the study design.
Susceptibility bias was minimized by including consecutive patients, and the
disease was carefully defined to always include tibiofemoral osteoarthritis.
We minimized performance bias by limiting the treatment to a consistent,
arthroscopic chondral and meniscal débridement carried out by one
surgeon. Transfer bias was minimized by a loss-to-follow-up rate of 10% of the
consecutive cohort. To minimize observer bias, three separate observers
collected the clinical, radiographic, and intraoperative data.
The interpretation of the data in this study has certain limitations. The
study was performed with a cross-sectional methodology, and clinical outcome
was compared internally among subsets of patients. Subgroup analyses were, in
some cases, prohibited by small numbers of patients.
The cohort as a whole had an improvement in the mean pain score from 11.9
to 30.8 points, and seventy-two (65%) of the 110 patients were considered to
have substantial improvement. These results are consistent with the findings
in the general literature, but they demonstrate the loss of information that
results from aggregating data. With the numbers studied, we did not observe
any demographic, clinical, or physical characteristics that were associated
with outcome. Symptoms and physical signs are too nonspecific in the presence
of osteoarthritis to predict outcome, and the arthritis obscures the diagnosis
of discrete internal derangements [15]. In
a series of 154 patients with symptomatic osteoarthritis and forty-nine
asymptomatic controls, 140 (91%) of the arthritic patients had meniscal tears;
however, the relationship between the meniscal lesions and symptoms was
uncertain [8]. We found
meniscal tears in seventy-nine (72%) of 110 knees. Although it has
occasionally been asserted that patients with mechanical symptoms have
better-than-average results after arthroscopy, little quantitative support for
this view can be found in the literature [2,16,17].
The authors of one study reported substantially better outcomes for patients
with mechanical symptoms, although follow-up times were very variable (range,
six to sixty months) and were not specified for patients with and without
mechanical symptoms [5]. Our data
do not support a role for the type of symptoms in predicting a successful
outcome.
Some investigators [3,16-18] have sought to describe radiographic features
that are associated with the clinical outcome of arthroscopic
débridement in osteoarthritic knees, although these features often have
not been described in quantitative terms. Analysis of our data revealed
radiographic subsets associated with specific outcomes after arthroscopic
débridement in knees with osteoarthritis. The Kellgren-Lawrence score,
limb alignment, and joint space width, all of which reflect the severity of
arthritis, were all associated with the clinical outcome both in knees with
mild osteoarthritis and those with severe osteoarthritis. Fifty-two (90%) of
fifty-eight knees with mild osteoarthritis (Kellgren-Lawrence grade 2), normal
or slightly varus alignment, and a joint space width of ≥3 mm were improved
after arthroscopic débridement, and we believe that the procedure
should be strongly considered as appropriate treatment in such cases.
Conversely, only five of twenty knees with severe osteoarthritis
(Kellgren-Lawrence grade 4), limb malalignment, and a joint space width of
<2 mm had clear-cut relief of symptoms. Arthroscopic débridement
probably should not be routinely advised for such patients, but it could be
recommended for specific treatment goals (for example, alleviation of
mechanical locking). Valgus knees did particularly poorly, an observation
also made by others [3,18], but
mild varus alignment was compatible with pronounced pain relief. Still
unresolved by this study is the role of arthroscopy for patients with moderate
osteoarthritis (Kellgren-Lawrence grade 3). For this group, the severity of
the cartilage lesions measured intraoperatively was the only strong indicator
of clinical outcome, and the likelihood of substantial pain relief could not
be predicted preoperatively. Patients need to be counseled that their clinical
outcome may depend on the severity of the cartilage lesions identified at
surgery and that their expectations of benefit must take this factor into
account [15].