Abstract
Background: Accurate and reliable radiographic classifications of
the relative severity and outcome of Legg-Calvé-Perthes disease are
essential in the study of that disease. As part of a prospective multicenter
study*, we sought to define more
clearly the lateral pillar classification of severity and the Stulberg
classification of outcome; we sought especially to define the borderlines
between classification groups.
Methods: We performed interobserver and intraobserver trials of the
lateral pillar and Stulberg classifications using sets of twenty radiographs
chosen from a prospective study of 345 hips. To establish reliable definitions
of the lateral pillar classification, we added a new, intermediate group
termed the B/C border group, which includes femoral heads with a thin
or poorly ossified lateral pillar and those with a loss of exactly 50% of the
original height of the lateral pillar. The resulting classification consists
of four groups: A, B, B/C border, and C. In our application of the
classification system of Stulberg et al., we defined a class-II femoral head
as round and fitting within 2 mm of a circle on both anteroposterior and
frog-leg lateral radiographs. We defined a Stulberg class-III femoral head as
out of round by more than 2 mm on either view and a Stulberg class-IV femoral
head as one with at least 1 cm of flattening of the weight-bearing articular
surface. To assess interobserver and intraobserver agreement, we performed two
trials of each classification with six orthopaedic surgeons reviewing twenty
radiographs or pairs of radiographs.
Results: In the first trial of the lateral pillar classification,
there was 81% agreement per radiograph and the average weighted kappa was
0.71. In the second trial, there was 85% agreement per radiograph and the
weighted kappa averaged 0.79. Intraobserver reliability testing showed a 77%
match between Trials 1 and 2, an average weighted kappa of 0.81, and an
average generalizability coefficient of 0.91. In Trial 1 of the Stulberg
classification, there was 91% agreement per radiograph and an average weighted
kappa of 0.82. In Trial 2, there was 92% agreement per radiograph and an
average weighted kappa of 0.82. Intraobserver reliability testing showed an
89% match between Trials 1 and 2, an average weighted kappa value of 0.88, and
an average generalizability coefficient of 0.92.
Conclusions: The interobserver and intraobserver trials of these
classifications produced kappa values and generalizability coefficients in the
substantial to excellent range. The modified lateral pillar classification and the redefined
Stulberg classification are sufficiently reliable and accurate for use in
studies of Legg-Calvé-Perthes disease.
There is great variability in the clinical and radiographic expression of
Legg-Calvé-Perthes
disease1-59.
There are rare cases of minimal disease with little or no permanent change in
the contour of the femoral head, which goes through the healing stages
rapidly33,60,61.
The majority of cases are moderately severe, with round or ovoid femoral heads
at
maturity37,44,49,51,62-65.
While these hips do well for many years, only the truly round heads last a
lifetime49,66,67.
Children over the age of nine years at the onset of symptoms more frequently
have flattening of the femoral head and acetabulum, with some hips becoming
symptomatic in
adolescence63,68,69.
The worst outcome is a flattened femoral head in a round acetabulum,
resembling an adult hip with osteonecrosis, which usually occurs in children
over the age of eleven years at the onset of
symptoms56. Young
children often fare better than older ones, but a small percentage of younger
children have slow healing and substantial permanent deformity of the femoral
head70,71.
Reported outcomes have varied, with good results in sixty-four of eighty hips
in one series72, in
three of forty-nine hips in
another73, and in
none of thirty-four hips in
another74. Some
authors consider Legg-Calvé-Perthes disease to be a mild disorder that
requires little treatment, whereas others believe the prognosis to be
unfavorable and recommend surgery for most
patients28,32,49,51,62-64,66,67,70,73,75-81.
In such a varied disorder, it is important to identify factors that provide
the clinician with an accurate prognosis. In addition, it is essential to have
a reproducible classification with which to compare results between
studies.
*The Legg-Perthes Study Group includes Peter Armstrong, John G. Birch,
William Browning, Alvin H. Crawford, Peter DeLuca, Frederick Dietz, Ann Dzus,
Keith Gabriel, Neil Green, Richard Gross, Curtis Gruel, William Herndon, John
A. Herring, Brian Hotchkiss, Jay Jarvis, Charles Johnston, Vicki Kalen, John
King, Dale Maples, Richard McIvor, Peter Meehan, Marc Moreau, Raymond
Morrissy, Colin Moseley, Richard Nicol, George Rab, B. Stevens Richards,
Thomas Rinsky, Andy Sullivan, George Thompson, Timothy Ward, Hugh Watts,
Stuart Weinstein, David Yngve, and Seymour Zimbler. Diane Ramey was the study
coordinator.
Classification of radiographs to assess the severity of
Legg-Calvé-Perthes disease began with Legg, who described a cap and a
mushroom-type of
deformity82.
Waldenstrom described three types of disease, with only the third being
associated with a poor
result83. In the
1950s, Goff described three types: a spherical type, a cap type, and an
irregular
type84.
Modern classification began with Catterall, who defined four types, with
the first two associated with a good prognosis and the third and fourth
associated with a poorer
prognosis10. In
group 1 of Catterall's classification, radiographic changes are restricted to
the anterior part of the capital epiphysis. In group 2, density changes
involve the central portion of the femoral head with more of the anterior
segment involved as well. In group 3, most of the epiphysis has radiographic
changes, with the uninvolved parts lying medial and lateral to the central
segment. In group 4, the entire epiphysis is involved. Catterall also defined
a group of risk signs that indicated a more severe disease course. The four
head-at-risk signs are the Gage sign (a radiolucent "V"
in the lateral portion of the epiphysis), calcification lateral to the
epiphysis, lateral subluxation of the femoral head, and a horizontal physis.
The Catterall system was used widely, but the results were difficult to
reproduce in interobserver
trials85-88.
Hardcastle et al. found the classification to have poor reliability but
improved accuracy if groups 2 and 3 were
combined86.
Christensen et al. found poor interobserver agreement on the classification
even when groups 2 and 3 were
combined85. Van Dam
et al. noted that the classification frequently changed during the course of
the disease when it had been applied prior to the fragmentation
stage87.
Salter and Thompson classified severity on the basis of the extent and
location of the subchondral fracture that appears early in the course of the
disease53.
Mukherjee and Fabry noted good correlation between the Salter-Thompson
classification and outcome, and they were able to apply the classification to
ninety-four of 116
hips89. They also
found a strong correlation between the Salter-Thompson and Catterall
classifications. However, Wiig et al. did not find good interobserver
agreement for the Salter-Thompson classification or for any other
classification that they
evaluated90. The
subchondral fracture may be missed in many patients because it appears
briefly in the course of the radiographic changes and is seen in only a
minority of patients. It was present in 376 (30%) of the 1264 hips in the
series of Salter and
Thompson53 and in
ninety-seven (68%) of 142 hips in a subgroup studied in
Cleveland53.
In our multicenter study, the Catterall classification was abandoned after
repeated trials and educational sessions failed to improve agreement among the
investigators91.
Subsequently, our group developed the lateral pillar
classification35,
which was shown to have acceptable interobserver agreement and a strong
predictive value in a number of
studies88,92-95.
Accurate classification of the radiographic outcome in patients with
Legg-Calvé-Perthes disease is essential for the performance of studies
comparing treatment. Since its publication in 1981, the classification
described by Stulberg et al. has been frequently used to classify the end
results in patients with this
disorder56.
Recently, the reproducibility of the classification was called into question
by Neyt et al.96,
but we are not aware of any other studies in which it was evaluated.
In our long-term multicenter study of treatment of
Legg-Calvé-Perthes disease, we carefully analyzed the hips to be
certain that our designations of the lateral pillar and Stulberg
classifications were accurate and reliable. Radiographs of hips that were
difficult to classify were presented to members of the study group, who
instructed us to develop more reliable definitions of both classifications. We
then specifically sought to define the borderlines between classification
groups. In published descriptions of the two
classifications35,56,
typical examples of the various groups are well described, but the transitions
between one classification group and the next are not. The refinements of
these classifications and the results of evaluations of the interobserver and
intraobserver reliability of the systems are the subject of this paper.
A controlled, long-term multicenter study of patients with
Legg-Calvé-Perthes disease was initiated in 1984. All patients were
over six years of age at the onset of symptoms and had had no prior treatment.
Hips that had reached the stage of early reossification were excluded, as were
patients with prior steroid treatment, hip infection, a history of
developmental dysplasia of the hip, multiple epiphyseal dysplasia,
hemoglobinopathy, hypothyroidism, juvenile arthritis, diabetes, metabolic or
neoplastic disease, renal failure, and failure to follow protocol. Each
investigator followed one of five treatment protocols and submitted
anteroposterior and frog-leg lateral radiographs at four, eight, and twelve
months during the first year and at least yearly thereafter. Additional
radiographs made in the course of management were also submitted to the study.
The parents or guardians of all patients gave informed consent for study
participation. The study began well before the development of the
institutional review board process. Thirty-eight investigators submitted cases
to the study. The complete protocol of this study is described in Part II,
which follows this article in this issue of The Journal. Initially,
438 patients with a total of 451 involved hips were enrolled in the study.
After some hips were excluded and others were lost to follow-up at skeletal
maturity, 345 hips in 337 patients remained in the study.
Lateral Pillar Classification
The lateral pillar classification was determined from anteroposterior
radiographs of the pelvis made in the early fragmentation stage of the
disease. The radiographic changes necessary for classification were usually
evident within six months after the onset of symptoms. The radiographs that
showed the greatest lucency of the lateral pillar in the early fragmentation
period were chosen for classification. We often used several radiographs made
several months apart in the early fragmentation period to compensate for
differences in radiographic quality as well as to account for changes due
to progression of the fragmentation.
The lateral pillar is defined as the lateral portion of the femoral head,
on the anteroposterior radiograph, that is demarcated from the central portion
of the head by a lucent line of fragmentation
(Fig. 1). Although the lateral
pillar was originally described as the lateral 15% to 30% of the femoral head,
in this review we found the area of this demarcated segment to range from 5%
to 30% of the femoral
head35
(Fig. 2). When there was no
visible demarcation, the lateral one-fourth of the head was arbitrarily chosen
to represent the lateral pillar (Fig.
3). In the original lateral pillar classification, group-A hips
are defined as those with no involvement of the lateral pillar, with no
density changes and no loss of height of the lateral pillar
(Fig. 1). Group-B hips have
lucency in the lateral pillar and may have some loss of height
(Fig. 4), but not exceeding 50%
of the original height. Group-C hips are those with more lucency in the
lateral pillar and >50% loss of
height35
(Fig. 3).
The senior author (J.A.H.) had previously classified the study radiographs
with use of the lateral pillar system as the radiographs were received at the
study headquarters over a period of ten years. In preparation for a review by
the study group, that author and another (H.T.K.) reclassified all of the hips
in the study. We specifically tried to define the borderlines between the
groups to increase the accuracy of the classification. As we reviewed the
radiographs, we identified a group of hips with radiographic findings that
were more severe than those typical of group B but less severe than those seen
in group C. We attempted to refine the original definitions of the lateral
pillar groups, but after many efforts we were unable to reliably classify
these hips with use of those definitions. These radiographs were reviewed at a
meeting of the study group, and again consensus could not be reached on their
classification. After further review, we created a new classification group
termed the B/C border group for these cases. We defined the
radiographic findings in this group as (1) a very narrow lateral pillar (2 to
3 mm wide) that was >50% of the original height
(Fig. 2), (2) a lateral pillar
with very little ossification but with at least 50% of the original height
(Fig. 5), or (3) a lateral
pillar with exactly 50% of the original height that is depressed relative to
the central pillar (Fig. 6).
The definition of group A remained a hip with no density changes in the
lateral pillar and no loss of the height of the lateral pillar
(Fig. 1). Group B consisted of
hips with a lateral pillar of >50% of the original height, a width of more
than a few millimeters, and substantial ossification
(Fig. 4). Group C was defined
as hips with collapse of the lateral pillar beyond 50% of the original height
(Fig. 3).
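The four-group definitions can be read as a simple decision rule. The following is only a minimal sketch under stated assumptions, not the procedure used by the study group, who classified radiographs by eye: the field names, the 3-mm cutoff standing in for "very narrow (2 to 3 mm wide)," and the idea of expressing pillar height as a fraction of the original height are all illustrative. Hips that meet none of the A, B, or C definitions fall through to the B/C border group.

```python
from dataclasses import dataclass

# Assumed cutoff for a "very narrow" lateral pillar (the text says 2 to 3 mm wide).
NARROW_PILLAR_MM = 3.0


@dataclass
class LateralPillar:
    """Hypothetical reading of one anteroposterior radiograph."""
    height_fraction: float  # remaining pillar height / original height (0 to 1)
    width_mm: float         # width of the lateral pillar
    density_change: bool    # any lucency in the lateral pillar
    well_ossified: bool     # substantial ossification of the pillar


def lateral_pillar_group(p: LateralPillar) -> str:
    """Return 'A', 'B', 'B/C border', or 'C' for one hip."""
    # Group A: no density changes and no loss of height.
    if not p.density_change and p.height_fraction >= 1.0:
        return "A"
    # Group C: collapse beyond 50% of the original height.
    if p.height_fraction < 0.5:
        return "C"
    # Group B: more than 50% height, more than a few millimeters wide,
    # and substantially ossified.
    if p.height_fraction > 0.5 and p.width_mm > NARROW_PILLAR_MM and p.well_ossified:
        return "B"
    # Everything else (narrow pillar, poor ossification, exactly 50% height).
    return "B/C border"


if __name__ == "__main__":
    print(lateral_pillar_group(LateralPillar(0.7, 2.5, True, True)))
```

In practice the height and width estimates are themselves subjective, which is why the borderline cases were given their own group rather than forced into B or C.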
Stulberg Classification
The Stulberg classification was also reviewed, with two authors (J.A.H. and
H.T.K.) and the study group using the original descriptions of Stulberg et
al.56 to review and
classify the most recent anteroposterior and frog-leg lateral radiographs of
all 345 hips in the present study. We found that there was general agreement
on the classification of "typical" cases, but it was difficult to
delineate the borderlines between groups with use of the published criteria
for the groupings. To improve the accuracy of the classification, we specified
the definitions and techniques of applying the Stulberg classification as
follows.
Stulberg et al. defined a class-I hip as "a completely normal hip
joint"56
(Figs. 7-A and
7-B). Class-II hips were
defined as spherical with a larger-than-normal femoral head, a shorter femoral
neck, or an abnormally steep acetabulum. The femoral head contour was said to
match the same concentric circle on the anteroposterior and frog-leg lateral
radiographs (Figs. 8-A and
8-B). Stulberg et al. noted
that all class-II hips had a fair Mose rating, meaning that the head fell
within 2 mm of a concentric circle on both radiographic
views56. In this
review, we found the use of templates with circles 2 mm apart to be a source
of error. The problems included difficulty seeing the outline of the femoral
head through the template on low-contrast radiographs and difficulty being
certain that we were using the same concentric circle for both the
anteroposterior and the frog-leg lateral radiograph. To improve the accuracy
of this determination, a new method was devised to measure circle fit (Figs.
9-A,
9-B,
9-C,
9-D,
9-E). A line is drawn across
the femoral head at its widest part. A perpendicular line is then drawn at the
midpoint of that line. The point of a compass is moved up and down that line
until the center of the head is located. A best-fit circle is then drawn over
the outline of the femoral head. The same radius is maintained on the compass,
and a best-fit circle is drawn on the frog-leg lateral radiograph. If the
femoral head fits within 2 mm of the circle on both views, the hip is
classified as Stulberg class II. If the femoral head does not fit within 2 mm
on either view, it is classified in a more severe group.
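For readers working with digitized images, the circle-fit test can be expressed numerically. This is a minimal sketch, assuming that the outline of the femoral head has been traced as (x, y) points in millimeters (already corrected for magnification) and that the circle center on each view has been located as described above; the function names and the synthetic outlines are hypothetical, and the study itself used a compass on film.

```python
import math


def max_deviation_mm(outline, center, radius):
    """Largest radial distance between traced outline points and the circle."""
    cx, cy = center
    return max(abs(math.hypot(x - cx, y - cy) - radius) for x, y in outline)


def fits_within_tolerance(ap_outline, ap_center, lat_outline, lat_center,
                          radius, tol_mm=2.0):
    """True if the head lies within tol_mm of the same-radius circle on both views."""
    return (max_deviation_mm(ap_outline, ap_center, radius) <= tol_mm
            and max_deviation_mm(lat_outline, lat_center, radius) <= tol_mm)


def synthetic_outline(radius_fn, n=19):
    """Illustrative outline over the upper half of the head (angles 0 to pi)."""
    points = []
    for i in range(n):
        t = math.pi * i / (n - 1)
        r = radius_fn(t)
        points.append((r * math.cos(t), r * math.sin(t)))
    return points


if __name__ == "__main__":
    # A head that wobbles by at most 1 mm around a 25-mm circle.
    round_head = synthetic_outline(lambda t: 25.0 + math.sin(3 * t))
    # A head flattened by 4 mm at the top of the dome.
    ovoid_head = synthetic_outline(lambda t: 25.0 - 4.0 * math.sin(t))
    print(fits_within_tolerance(round_head, (0, 0), round_head, (0, 0), 25.0))
    print(fits_within_tolerance(round_head, (0, 0), ovoid_head, (0, 0), 25.0))
```

The key point carried over from the manual method is that one radius is used for both views; only the center is relocated on the frog-leg lateral radiograph.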
Stulberg et al. described class-III heads as nonspherical, with an ovoid,
mushroom, or umbrella shape, but not
flat56 (Figs.
10-A and
10-B). Class-IV hips were
described as having a flat femoral head and acetabulum. The border between
classes III and IV was not well defined in the original
description56. We
noted some degree of flattening of various portions of femoral heads that
otherwise appeared ovoid. We then arbitrarily defined class IV as a femoral
head with at least 1 cm of flattening in a weight-bearing area on either the
anteroposterior or the frog-leg lateral radiograph (Figs.
11-A and
11-B). There is considerable
variability in this group, with severely affected hips having a concave
articular surface (Figs. 12-A
and 12-B). Stulberg et al.
described class-V hips as those with a flat femoral head and a normal femoral
neck and
acetabulum56 (Figs.
13-A and
13-B).
For the final definitions used in this study, class I indicated a femoral
head that cannot be distinguished from normal; class II, a round femoral head
that fits within 2 mm of the same circle, drawn with the compass technique,
on both the anteroposterior and the frog-leg lateral radiographs (Figs.
8-A and
8-B); class III, an ovoid
femoral head that does not fit within 2 mm of the circle, drawn with the
compass technique, on one or both radiographic views (Figs.
10-A and
10-B); class IV, a femoral
head with at least 1 cm of flattening of the weight-bearing area on one or
both views (Figs. 11-A,
11-B,
12-A,
12-B); and class V, a femoral
head with collapse, usually central, within a round acetabulum (Figs.
13-A and
13-B).
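The final definitions can be summarized as a short decision rule. The sketch below is illustrative only: the argument names are hypothetical, the deviations are assumed to be the largest departures from the best-fit circle on each view, and the order in which the criteria are checked (class V before class IV before the circle-fit test) is our assumption for resolving hips that satisfy more than one description.

```python
def stulberg_class(indistinguishable_from_normal: bool,
                   ap_deviation_mm: float,
                   lateral_deviation_mm: float,
                   flattening_mm: float,
                   collapse_in_round_acetabulum: bool = False) -> str:
    """Return the Stulberg class ('I' to 'V') under the definitions used here."""
    # Class I: cannot be distinguished from a normal femoral head.
    if indistinguishable_from_normal:
        return "I"
    # Class V: collapse, usually central, within a round acetabulum.
    if collapse_in_round_acetabulum:
        return "V"
    # Class IV: at least 1 cm of flattening of the weight-bearing area
    # on one or both views (pass the larger of the two measurements).
    if flattening_mm >= 10.0:
        return "IV"
    # Class II: within 2 mm of the same circle on both views.
    if ap_deviation_mm <= 2.0 and lateral_deviation_mm <= 2.0:
        return "II"
    # Class III: out of round by more than 2 mm on one or both views.
    return "III"


if __name__ == "__main__":
    print(stulberg_class(False, 1.5, 3.0, 4.0))
```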
Interobserver Trials
Twenty anteroposterior radiographs of hips in the fragmentation stage were
chosen for the trials of the lateral pillar classification, and twenty cases
with both anteroposterior and frog-leg lateral radiographs at skeletal
maturity were selected for the Stulberg classification trials. The radiographs
selected for both trials included an equal number of images that had been
deemed easy or difficult to classify. The hips that were easy to classify had
characteristics typical of a classification group. Those that were difficult
to classify were close to the borderline between two classification groups. No
Stulberg class-V hips were included because of their scarcity in the study,
and the investigators were so informed. Six observers were instructed on the
details of the appropriate classification in a ten-minute lecture before each
trial. Four individuals participated in both the lateral pillar and the
Stulberg classification trial, and two different pairs of observers each
participated in one of the trials. The radiographs were placed on view boxes
without identification of the treatment or treating physician. The
classification was written down without discussion by three staff surgeons and
three orthopaedic fellows. A second trial was performed in a similar fashion
four to five weeks later with the radiographs in a different order. We chose a
sample size of twenty radiographs for the lateral pillar classification trial
and twenty sets of radiographs for the Stulberg classification trial in order
to have sufficient variability of radiographic findings without inducing
fatigue and loss of attention among the observers. Several studies support the
use of approximately this number of raters and
samples97,98.
Statistical Methods
Several approaches were used to assess the intraobserver and interobserver
reliability of the lateral pillar and Stulberg classifications. This reflects
the recent work by Feinstein and
Cicchetti99,
Kraemer100,
Cicchetti and
Feinstein101, and
Maclure and
Willett102, which
demonstrated that no one approach, such as the commonly used kappa or the
generalizability coefficient, is capable of providing a complete measure of
reliability in this type of setting.
The weighted kappa was used instead of the simple kappa, as the weighted
kappa is preferable when the categories of response have a natural
order100, such as
is the case for the lateral pillar and Stulberg classifications. Whereas the
simple kappa counts all failures to agree as equivalent, the weighted kappa
considers some differences to be less important than others. To that end, the
lateral pillar classifications were coded as 1 for "A," 2 for
"B," 2.5 for "B/C," and 3 for "C." These
codes reflect the assumption that misclassifying a "B" as a
"B/C" is not as great a mistake as is misclassifying a
"B" as an "A." The Stulberg classifications were coded
as 2 for I or II, 3 for III, and 4 for IV. The choice of 2 for I or II
reflects the fact that a misclassification of I as II or the reverse would be
of no clinical importance in terms of the long-term outcome for the
patient56.
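To illustrate how the numeric codes enter the calculation, the following sketch computes a weighted kappa for two sets of ratings. It assumes that the penalty for a disagreement is the absolute difference between the two codes (a linear weighting); the text does not state which weighting function was applied, so this choice, like the function and variable names, is an assumption made for illustration.

```python
from collections import Counter

LATERAL_PILLAR_CODES = {"A": 1.0, "B": 2.0, "B/C": 2.5, "C": 3.0}
STULBERG_CODES = {"I": 2.0, "II": 2.0, "III": 3.0, "IV": 4.0}


def weighted_kappa(ratings_1, ratings_2, codes):
    """Weighted kappa = 1 - (observed disagreement / chance-expected disagreement).

    Disagreement between two categories is taken as the absolute difference
    of their numeric codes (assumed linear weighting).
    """
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_1)
    observed = sum(abs(codes[a] - codes[b])
                   for a, b in zip(ratings_1, ratings_2)) / n
    marginal_1, marginal_2 = Counter(ratings_1), Counter(ratings_2)
    expected = sum(marginal_1[a] * marginal_2[b] * abs(codes[a] - codes[b])
                   for a in marginal_1 for b in marginal_2) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1.0 - observed / expected


if __name__ == "__main__":
    first = ["A", "B", "B", "B/C", "C", "C"]
    near_miss = ["A", "B", "B/C", "B/C", "C", "C"]  # one B read as B/C
    far_miss = ["A", "B", "A", "B/C", "C", "C"]     # one B read as A
    print(weighted_kappa(first, near_miss, LATERAL_PILLAR_CODES))
    print(weighted_kappa(first, far_miss, LATERAL_PILLAR_CODES))
```

With these codes, reading a "B" as a "B/C" lowers the statistic less than reading a "B" as an "A," and a Stulberg class I read as class II carries no penalty at all because both share the code 2.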
Interobserver Reliability
We assessed the prospect of two or more observers assigning the same
classification to the same radiograph. Separate reliability assessments were
performed for each trial.
To gain a basic sense of interobserver reliability, we calculated the
percentage of observers who assigned the most commonly assigned classification
to each radiograph. Those percent agreements per radiograph were then averaged
for the twenty radiographs and reported as the gross percent agreement. This
was done for the three fellows and the three staff orthopaedists separately
and for all six observers combined.
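The gross percent agreement described above amounts to a short calculation. A minimal sketch follows; the function names and the two example radiographs are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings_for_one_radiograph):
    """Percentage of observers who chose the most commonly assigned class."""
    modal_count = Counter(ratings_for_one_radiograph).most_common(1)[0][1]
    return 100.0 * modal_count / len(ratings_for_one_radiograph)


def gross_percent_agreement(ratings_by_radiograph):
    """Average of the per-radiograph percent agreements."""
    per_radiograph = [percent_agreement(r) for r in ratings_by_radiograph]
    return sum(per_radiograph) / len(per_radiograph)


if __name__ == "__main__":
    example = [
        ["B", "B", "B", "B", "B", "B"],        # all six observers agree
        ["B", "B", "B", "B/C", "B/C", "C"],    # three of six agree
    ]
    print(gross_percent_agreement(example))
```

Note that this measure takes no account of how far apart the disagreeing classifications are, which is one reason it was supplemented with the weighted kappa.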
To obtain an overall weighted kappa assessment of reliability, the weighted
kappa was calculated for each unique pair of the six observers, yielding
fifteen kappa values. The average of those fifteen values was recorded as the
overall kappa value. In addition, the weighted kappa was calculated for the
three unique pairs of fellows and the three unique pairs of staff
orthopaedists. According to Landis and
Koch103, a kappa
of 0.21 to 0.40 is considered to be fair; 0.41 to 0.60, moderate; 0.61 to
0.80, substantial; and 0.81 to 1.0, excellent.
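The pairing scheme and the interpretive bands can be sketched as follows. The agreement function is passed in as a parameter so that any pairwise statistic can be averaged; the names are hypothetical, and values below 0.21 are reported only as falling below the bands quoted in the text.

```python
from itertools import combinations


def pairwise_average(ratings_by_observer, agreement_fn):
    """Average a pairwise statistic over every unique pair of observers.

    Returns the average and the number of pairs (fifteen for six observers).
    """
    pairs = list(combinations(sorted(ratings_by_observer), 2))
    values = [agreement_fn(ratings_by_observer[a], ratings_by_observer[b])
              for a, b in pairs]
    return sum(values) / len(values), len(pairs)


def kappa_band(kappa):
    """Label a kappa value using the bands quoted above."""
    k = round(kappa, 2)
    if k >= 0.81:
        return "excellent"
    if k >= 0.61:
        return "substantial"
    if k >= 0.41:
        return "moderate"
    if k >= 0.21:
        return "fair"
    return "below the bands quoted in the text"


if __name__ == "__main__":
    for value in (0.71, 0.79, 0.82):
        print(value, kappa_band(value))
```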
It should be noted that the value of the simple or weighted kappa statistic
is influenced by how frequently certain categories appear on a set of
radiographs. For that reason, the same set of observers could have a markedly
different kappa value if, for example, 80% of the hips shown in a set of
radiographs are Stulberg class II, 10% are Stulberg class III, and 10% are
Stulberg class IV, as opposed to being evenly split across the Stulberg
categories. Consequently, these kappa values can be used as a general
indication of reliability, but the reader is cautioned against comparing the
kappa values presented here with kappa values published in other studies.
Another measure of reliability is the generalizability
coefficient104,
which is a measure of how much of the variation in the responses is due to
differences between the observers or differences between the radiographs. This
measure is also vulnerable to differing distributions of classifications of
the radiographs. If most of the radiographs have the same or similar
classifications, it is very difficult to obtain a large value for the
generalizability coefficient. Conversely, if there is a great diversity of
categories among the radiographs, a less than stellar set of observers could
still yield an unexpectedly high coefficient. Again, the reader is cautioned
against comparing results between studies.
Intraobserver Reliability
For each observer, we noted the percentage of the twenty radiographs for
which the same response was given in both trials. The average of those
percentages was calculated for the three fellows, the three staff
orthopaedists, and all six observers combined. A weighted kappa was calculated
for each observer, with use of the two classifications of each radiograph by
that observer. The average of those kappas was calculated for the same three
groups. The generalizability coefficient was calculated for each observer, and
the averages were determined in the same manner.
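The intraobserver percentage and the group averages can be sketched in a few lines. The observer labels and ratings below are hypothetical and serve only to show the arithmetic.

```python
def percent_match(trial_1, trial_2):
    """Percentage of radiographs given the same class in both trials."""
    if len(trial_1) != len(trial_2) or not trial_1:
        raise ValueError("trials must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(trial_1, trial_2))
    return 100.0 * same / len(trial_1)


def group_average(value_by_observer, members):
    """Average a per-observer value over a named group of observers."""
    return sum(value_by_observer[m] for m in members) / len(members)


if __name__ == "__main__":
    first = ["B"] * 20
    second = ["B"] * 17 + ["B/C"] * 3   # three radiographs reclassified
    print(percent_match(first, second))
```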
Lateral Pillar Classification
Interobserver Reliability
In Trial 1, at least three of the six observers agreed on the classification
for each of the twenty radiographs, at least four agreed on eighteen, at least
five agreed on twelve, and all six agreed on seven. Therefore, the percent agreement per radiograph ranged
from 50% to 100%, with an average of 81% for the twenty radiographs. The
weighted kappa values for each pairing of the observers ranged from 0.49 to
0.89, with an average of 0.71 (95% confidence limits, 0.51, 0.91). The
generalizability coefficient averaged 0.70 for the fellows, 0.92 for the staff
orthopaedists, and 0.79 overall.
In Trial 2, we saw some evidence of a learning curve. At least three of the
six observers agreed on the classification for nineteen of the twenty
radiographs, at least four of the six agreed on seventeen, at least five of
the six agreed on sixteen, and all six agreed on ten. For one radiograph, no
more than two observers agreed on the classification. Therefore, the percent
agreement per radiograph ranged from 33% to 100%, with an average of 85% for
the twenty radiographs. The weighted kappa values for each pairing of the
observers ranged from 0.57 to 0.92, with an average of 0.79 (95% confidence
limits, 0.60, 0.96). The generalizability coefficient averaged 0.85 for the
fellows, 0.96 for the staff orthopaedists, and 0.88 overall.
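The average percent agreement in each trial can be recovered from the cumulative counts reported above. The sketch below converts "at least k observers agreed" counts into "exactly k" counts and averages them; the function name is hypothetical, and for Trial 2 it assumes that exactly two observers agreed on the one radiograph for which no more than two matched, which is consistent with the reported minimum of 33%.

```python
def average_agreement(at_least, n_observers=6, n_radiographs=20):
    """Average percent agreement from counts of radiographs on which
    at least k observers agreed (keys are k, values are counts)."""
    def count_at_least(k):
        if k > n_observers:
            return 0
        # Levels below the lowest one reported apply to every radiograph.
        return at_least.get(k, n_radiographs)

    total_matching = 0
    for k in range(1, n_observers + 1):
        exactly_k = count_at_least(k) - count_at_least(k + 1)
        total_matching += k * exactly_k
    return 100.0 * total_matching / (n_observers * n_radiographs)


if __name__ == "__main__":
    trial_1 = {3: 20, 4: 18, 5: 12, 6: 7}
    trial_2 = {3: 19, 4: 17, 5: 16, 6: 10}
    print(round(average_agreement(trial_1)))  # 97 of 120 responses
    print(round(average_agreement(trial_2)))  # 102 of 120 responses
```

For Trial 1 this gives 97 matching responses of a possible 120, or 81%, and for Trial 2 it gives 102 of 120, or 85%, in agreement with the averages reported.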
Intraobserver Reliability
The percentage of the twenty radiographs for which the results matched
between Trial 1 and Trial 2 ranged from 55% to 90% for the six observers, with
an average of 77%. The six weighted kappa values ranged from 0.60 to 0.92,
with an average of 0.81 (95% confidence limits, 0.65, 0.96). The six
generalizability coefficients ranged from 0.77 to 0.98, with an average of
0.91.
Stulberg Classification
Interobserver Reliability
In Trial 1, at least three of the six observers agreed on the
classification for all twenty radiographs, at least four of the six agreed on
nineteen, at least five of the six agreed on eighteen, and all six agreed on
twelve. Therefore, the percent agreement per radiograph ranged from 50% to
100%, with an average of 91% for the twenty radiographs. The weighted kappa
values for each pairing of the observers ranged from 0.72 to 0.94, with an
average of 0.82 (95% confidence limits, 0.64, 0.99). The generalizability
coefficient was 0.86 for the fellows, 0.93 for the staff orthopaedists, and
0.89 overall.
In Trial 2, we saw no evidence of a learning curve. At least four of the
six observers agreed on the classification for all twenty radiographs, at
least five of the six agreed on nineteen, and all six agreed on eleven.
Therefore, the percent agreement per radiograph ranged from 67% to 100%, with
an average of 92% for the twenty radiographs. The weighted kappa values for
each pairing of the observers ranged from 0.72 to 0.94, with an average of
0.82 (95% confidence limits, 0.64, 0.99). The generalizability coefficient was
0.90 for the fellows, 0.86 for the staff orthopaedists, and 0.88 overall.
Intraobserver Reliability
The percentage of the twenty radiographs for which the classifications
matched between Trial 1 and Trial 2 ranged from 85% to 95% for the six
observers, with an average of 89%. The six weighted kappa values ranged from
0.83 to 0.94, with an average of 0.88 (95% confidence limits, 0.72, 0.99). The
generalizability coefficients ranged from 0.89 to 0.97, with an average of
0.92. These values represent excellent intraobserver reliability.
The classification of radiographs to determine the severity of
Legg-Calvé-Perthes disease has a long history. Although the Catterall
classification was used for many years in almost all studies, a number of
authors were unable to achieve satisfactory interobserver and intraobserver
agreement in objective
studies85-88.
At the outset of our investigation, our study group assumed that the members
could reliably apply the classification. However, as the study progressed,
submitted data sheets and radiographs revealed a wide disparity in the use of
the classification. Multiple subsequent efforts to standardize the use of the
classification were unsuccessful. Subsequently, the lateral pillar
classification was developed, tested, and thereafter used to classify
severity.
Ritterbusch et
al.88 found the
lateral pillar classification to have good interobserver reliability. In their
study, all three observers agreed on the classification of fifty-six of
seventy-eight hips when they used the lateral pillar classification compared
with thirty-two of seventy-eight hips when they used the Catterall
classification. Ritterbusch et al. also noted good correlation between the
lateral pillar classification and the final outcome as rated with the Stulberg
system (Spearman correlation, 0.64 compared with 0.38 for the Catterall
classification). Podeszwa et
al.92 also found
the lateral pillar classification to have good intraobserver and interobserver
reliability. Pediatric orthopaedic experience had no observable effect on the
performance of the observers. The kappa values ranged from 0.637 to 0.842,
with an average of 0.742. Pietrzak et
al.93 also found
better agreement with the lateral pillar classification than with the
Catterall system. Specchiulli and
Scialpi94 noted
that the lateral pillar classification was 80% reproducible compared with 42%
for the Catterall system. Meurer et
al.95, on the other
hand, found the two classifications to have similar reproducibility.
In our repeated radiographic reviews, it became evident that the borderline
between groups B and C of our classification system was unclear and that many
hips fell along this border. We attempted to refine the definitions of groups
B and C but were unable to do so. We then created another classification
group. Since the concepts of lateral pillar groups B and C were well
established, we elected to name this new group the B/C border group
rather than rename all of the groups. The addition of this group enabled us to
better define the changes in the femoral head and to more critically specify
the classification of borderline cases. Using these definitions, the six
examiners were able to classify the hips with good-to-excellent interobserver
and intraobserver reliability. The degree of agreement is encouraging
considering that half of the hips originally had been considered difficult to
classify because of radiographic features near the borderlines of the
classification grades.
The question arises as to why our observers did not reach perfect agreement
with regard to their classifications. It is our opinion that such perfection
is unlikely with a biologic classification based on clinical radiographs.
Radiographs were made at four to six-month intervals and, in many cases, more
frequently. However, some patients may have a short period of fragmentation,
and the hip may pass through a classic appearance without a proper radiograph
available for review. Hips are irritable in the fragmentation stage of this
disease and often remain externally rotated, making it difficult to obtain a
true anteroposterior projection. The quality of the actual radiographs is
variable, the size and weight of the patients vary, and copy radiographs were
used in this study. Finally, biologic events such as fragmentation of the
femoral head represent continual changes from mildest to most severe, and
borderlines arbitrarily created for classification purposes are not natural
occurrences and are by nature imprecise.
The lateral pillar classification is applied to radiographs made about the
time of early fragmentation, and the radiograph that shows the greatest
density changes in the lateral pillar is used [35]. Some
clinicians attempt to classify the hip prior to early fragmentation, and
classification at this stage is not reliable with our method. Lappin et al.
noted that, on average, a duration of symptoms of seven months was required
for accurate application of the lateral pillar classification [105].
As the classification was developed, we found that fragmentation occurred at
an average of six months after the onset of symptoms. Once the lateral pillar
has separated from the central fragment, is clearly seen to be elevated
relative to the collapsed central segment, and retains more than 50% of its
original height, the hip can be reliably classified as group B. On the other
hand, if the lateral pillar has clearly collapsed by more than an estimated
50% of its original height, the hip can be classified as group C. Hips that do
not easily fit these definitions are now classified as B/C border hips.
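The group boundaries described above can be summarized as a simple decision rule. The following Python fragment is only an illustrative sketch and is not part of the study method: the function name, its parameters, and the optional `border_margin_pct` tolerance are hypothetical, and the input is assumed to be an observer's estimate of the remaining lateral pillar height as a percentage of the original height. The rule for group A (no loss of height) comes from the original lateral pillar classification and is not restated in this section. Assigning a thin or poorly ossified pillar to group C when it has also collapsed by more than 50% is likewise an assumption of the sketch.

```python
def lateral_pillar_group(remaining_height_pct,
                         thin_or_poorly_ossified=False,
                         border_margin_pct=0.0):
    """Return 'A', 'B', 'B/C border', or 'C' for one hip.

    remaining_height_pct: estimated remaining lateral pillar height,
        as a percentage of the original height (hypothetical input).
    thin_or_poorly_ossified: True for a thin or poorly ossified pillar.
    border_margin_pct: hypothetical tolerance around 50% within which the
        observer treats the height as "exactly 50%".
    """
    if not 0.0 <= remaining_height_pct <= 100.0:
        raise ValueError("remaining_height_pct must be between 0 and 100")

    # Clear collapse of more than 50% of the original height: group C.
    if remaining_height_pct < 50.0 - border_margin_pct:
        return "C"

    # Thin or poorly ossified pillar, or a loss of (about) exactly 50%.
    if thin_or_poorly_ossified or abs(remaining_height_pct - 50.0) <= border_margin_pct:
        return "B/C border"

    # No loss of height: group A (assumption taken from the original
    # lateral pillar classification, not from this section).
    if remaining_height_pct == 100.0:
        return "A"

    # More than 50% of the original height is maintained: group B.
    return "B"
```

In practice the height is estimated visually on the radiograph, so a numeric rule of this kind can only approximate the judgment of the observer.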
We encourage attention to the details of the definitions and methods of the
classification prior to clinical use. In this study, we provided a ten-minute
summary of these details prior to the classification trials so that the
investigators began with the same ground rules.
Classification earlier in the disease process would be desirable, and Tsao
et al. made efforts to accomplish this [58]. Using
serial scintigraphy of the femoral head at four-month intervals, they found
that hips with early revascularization of the lateral column had better
results than did those with delayed revascularization. They were able to
classify hips three months earlier than they could using radiographic
classifications. Their findings emphasize the prognostic importance of the
lateral pillar of the femoral head and in some cases may allow earlier
classification.
We and others [43,62,74,88]
have long assumed that the classification system of Stulberg et al. was
reliable. However, in our review of the definitions of Stulberg et al. and in
our group meetings, we found that the definitions describe typical cases but
do not specify the degree of deviation that establishes the borderline between
groups. We attempted to define those borderlines without fundamentally
changing the classification. We suggest a new technique for measuring
sphericity, and we allow a 2-mm deviation from a single concentric circle
within class II. We found that the use of the protractor technique improved
the ability of the observer to construct a circle that best fit the femoral
head and to fit the same circle to the frog-leg lateral radiograph. We also
specify the degree of "flatness" that represents the minimum
criterion for categorizing a hip as class IV: a flattened segment of the
weight-bearing surface measuring ≥1 cm. Femoral head
shapes within class IV range from ovoid to concave. In the future, it may be
useful to further subdivide this group and evaluate the natural history of the
subgroups.
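The borderlines that we specify for Stulberg classes II, III, and IV can be written as a short rule. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the deviations are assumed to be the largest departure of the femoral head outline from the best-fit circle on each view, and classes I and V are not modeled because their criteria do not depend on these measurements.

```python
def stulberg_class(deviation_ap_mm, deviation_lateral_mm, flattening_mm=0.0):
    """Return 'II', 'III', or 'IV' from sphericity measurements.

    deviation_ap_mm: deviation from the best-fit circle on the
        anteroposterior radiograph, in millimeters (hypothetical input).
    deviation_lateral_mm: the same measurement on the frog-leg lateral view.
    flattening_mm: length of flattening of the weight-bearing articular
        surface, in millimeters.
    """
    if min(deviation_ap_mm, deviation_lateral_mm, flattening_mm) < 0:
        raise ValueError("measurements must be non-negative")

    # At least 1 cm of flattening of the weight-bearing surface: class IV.
    if flattening_mm >= 10.0:
        return "IV"

    # Out of round by more than 2 mm on either view: class III.
    if deviation_ap_mm > 2.0 or deviation_lateral_mm > 2.0:
        return "III"

    # Round, within 2 mm of the circle on both views: class II.
    # (A normal, class-I hip is not distinguished in this sketch.)
    return "II"
```

A head that deviates by exactly 2 mm on both views is returned as class II, in keeping with the 2-mm allowance described above.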
We performed interobserver and intraobserver trials of the Stulberg
classification after an instructional discussion, and we found an excellent
level of agreement, with an average interobserver kappa of 0.82 (range, 0.72
to 0.94). The average intraobserver kappa was 0.88.
Neyt et al. [96]
found interobserver kappa values around 0.783 and intraobserver values between
0.659 and 0.952 for the Stulberg classification, which they deemed to be
"marginally acceptable." We used weighted kappa values, an
approach that penalizes more severe errors more than "near
misses." Landis and
Koch [103] noted that
the interpretation of the kappa statistic is arbitrary. They considered values
between 0.61 and 0.80 as showing "substantial" agreement and
values between 0.81 and 1.00 as demonstrating "almost perfect"
agreement. The improved reliability of the Stulberg classification in our
study may in part be a result of our revision of the definitions of the groups
and our use of a different technique to assess circle fit. In the prospective
study, <1% of the hips were classified as Stulberg class V. This was
expected on the basis of the age limits for acceptance into the study. Because
they posed no classification issues, we eliminated Stulberg class-V hips from
the analysis. Neyt et al. included that group, which makes comparison between
the studies difficult.
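To illustrate how a weighted kappa penalizes severe disagreements more than near misses, the following Python sketch computes the statistic for one pair of observers and labels the result with the two Landis and Koch bands quoted above. It is a sketch under stated assumptions rather than the analysis used in this study: the linear disagreement weights, the function names, and the example ratings are all hypothetical, and the averaging of kappa values across observers is not shown.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, categories):
    """Linear-weighted kappa for two observers rating the same cases.

    categories must list the ordered groups, e.g. mildest to most severe.
    Linear weights are an assumption of this sketch.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set")
    if len(categories) < 2:
        raise ValueError("at least two ordered categories are required")

    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    k = len(categories)

    # Joint and marginal counts of category indices.
    joint = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)

    # Disagreement weight grows with the distance between the two grades.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    observed = sum(weight(i, j) * c / n for (i, j), c in joint.items())
    expected = sum(weight(i, j) * marg_a[i] * marg_b[j] / n ** 2
                   for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - observed / expected


def agreement_label(kappa):
    """Label a kappa value with the Landis and Koch bands quoted in the text."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    return "below substantial"


groups = ["A", "B", "B/C border", "C"]
reference = ["A", "A", "C", "C"]
near_miss = ["A", "B", "C", "C"]   # one disagreement by a single grade
far_miss = ["A", "C", "C", "C"]    # one disagreement by three grades
print(weighted_kappa(reference, near_miss, groups))
print(weighted_kappa(reference, far_miss, groups))
```

With these hypothetical ratings, the near miss yields a higher weighted kappa than the three-grade disagreement, even though each observer differs from the reference on only one hip.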
We believe that our method of applying the Stulberg classification is a
clarification rather than a true modification. The exact technique used by
Stulberg et al. cannot be discerned from the original article [56], and
there may be some differences between their application and ours. We believe
that these differences are probably small enough that results classified with
our method can be compared with the long-term prognostic findings of Stulberg
et al. We conclude that our modifications of the lateral pillar and Stulberg
classifications are reproducible and may be used for meaningful studies of
Legg-Calvé-Perthes disease.
1. Arie E, Johnson F, Harrison MH, Hughes JR, Small P. Femoral head shape in Perthes' disease. Is the contralateral hip abnormal? Clin Orthop. 1986;209:77-88.
2. Axer A, Hendel D. Recurrent Legg-Calvé-Perthes' disease. A case report. Clin Orthop. 1977;126:170-1.
3. Axer A, Gershuni DH, Hendel D, Mirovski Y. Indications for femoral osteotomy in Legg-Calvé-Perthes disease. Clin Orthop. 1980;150:78-87.
4. Baksi DP. Palliative operations for painful old Perthes' disease. Int Orthop. 1995;19:46-50.
5. Barnes JM. Premature epiphysial closure in Perthes' disease. J Bone Joint Surg Br. 1980;62:432-7.
6. Bellyei A, Mike G. Acetabular development in Legg-Calvé-Perthes disease. Orthopedics. 1988;11:407-11.
7. Blakemore ME, Harrison MH. A prospective study of children with untreated Catterall group 1 Perthes' disease. J Bone Joint Surg Br. 1979;61:329-33.
8. Bos CF, Bloem JL, Bloem RM. Sequential magnetic resonance imaging in Perthes' disease. J Bone Joint Surg Br. 1991;73:219-24.
9. Bowen JR, Schreiber FC, Foster BK, Wein BK. Premature femoral neck physeal closure in Perthes' disease. Clin Orthop. 1982;171:24-9.
10. Catterall A. The natural history of Perthes' disease. J Bone Joint Surg Br. 1971;53:37-53.
11. Catterall A. Natural history, classification, and x-ray signs in Legg-Calvé-Perthes' disease. Acta Orthop Belg. 1980;46:346-51.
12. Catterall A. Legg-Calvé-Perthes syndrome. Clin Orthop. 1981;158:41-52.
13. Catterall A, Pringle J, Byers PD, Fulford GE, Kemp HB, Dolman CL, Bell HM, McKibbin B, Ralis Z, Jensen OM, Lauritzen J, Ponseti IV, Ogden J. A review of the morphology of Perthes' disease. J Bone Joint Surg Br. 1982;64:269-75.
14. Catterall A. Legg-Calvé-Perthes' disease. New York: Churchill Livingstone; 1982.
15. Chacko V, Joseph B, Seetharam B. Perthes' disease in South India. Clin Orthop. 1986;209:95-9.
16. Clarke TE, Finnegan TL, Fisher RL, Bunch WH, Gossling HR. Legg-Perthes disease in children less than four years old. J Bone Joint Surg Am. 1978;60:166-8.
17. Clarke NM, Harrison MH. Painful sequelae of coxa plana. J Bone Joint Surg Am. 1983;65:13-8.
18. Conway JJ. A scintigraphic classification of Legg-Calvé-Perthes disease. Semin Nucl Med. 1993;23:274-95.
19. Cooperman DR, Stulberg SD. Ambulatory containment treatment in Perthes' disease. Clin Orthop. 1986;203:289-300.
20. Cotler JM, Donahue J. Innominate osteotomy in the treatment of Legg-Calvé-Perthes disease. Clin Orthop. 1980;150:95-102.
21. Danielsson L, Pettersson H, Sunden G. Early assessment of prognosis in Perthes' disease. Acta Orthop Scand. 1982;53:605-11.
22. de Sanctis N, Rega AN, Rondinella F. Prognostic evaluation of Legg-Calvé-Perthes disease by MRI. Part I: the role of physeal involvement. J Pediatr Orthop. 2000;20:455-62.
23. de Sanctis N, Rondinella F. Prognostic evaluation of Legg-Calvé-Perthes disease by MRI. Part II: pathomorphogenesis and new classification. J Pediatr Orthop. 2000;20:463-70.
24. Dickens DRV, Menelaus MB. The assessment of prognosis in Perthes' disease. J Bone Joint Surg Br. 1978;60:189-94.
25. Dominguez R, Oh KS, Young LW, Goodman M. Acute chondrolysis complicating Legg-Calvé-Perthes disease. Skeletal Radiol. 1987;16:377-82.
26. Ellis W. Metaphysis in Perthes disease: a method of assessment and selection for treatment. J Pediatr Orthop. 1984;4:731-4.
27. Fisher RL, Roderique JW, Brown DC, Danigelis JA, Ozonoff MB, Sziklas JJ. The relationship of isotopic bone imaging findings to prognosis in Legg-Perthes disease. Clin Orthop. 1980;150:23-9.
28. Fulford GE, Lunn PG, Macnicol MF. A prospective study of nonoperative and operative management for Perthes' disease. J Pediatr Orthop. 1993;13:281-5.
29. Gershuni DH. Preliminary evaluation and prognosis in Legg-Calvé-Perthes disease. Clin Orthop. 1980;150:16-22.
30. Gower WE, Johnston RC. Legg-Perthes disease. Long-term follow-up of thirty-six patients. J Bone Joint Surg Am. 1971;53:759-68.
31. Green NE, Beauchamp RD, Griffin PP. Epiphyseal extrusion as a prognostic index in Legg-Calvé-Perthes disease. J Bone Joint Surg Am. 1981;63:900-5.
32. Griffin PP, Green NE, Beauchamp RD. Legg-Calvé-Perthes disease: treatment and prognosis. Orthop Clin North Am. 1980;11:127-39.
33. Herring JA, Lundeen MA, Wenger DR. Minimal Perthes' disease. J Bone Joint Surg Br. 1980;62:25-30.
34. Herring JA. Legg-Calvé-Perthes disease: a review of current knowledge. Instr Course Lect. 1989;38:309-15.
35. Herring JA, Neustadt JB, Williams JJ, Early JS, Browne RH. The lateral pillar classification of Legg-Calvé-Perthes disease. J Pediatr Orthop. 1992;12:143-50.
36. Herring JA, Williams JJ, Neustadt JN, Early JS. Evolution of femoral head deformity during the healing phase of Legg-Calvé-Perthes disease. J Pediatr Orthop. 1993;13:41-5.
37. Herring JA. The treatment of Legg-Calvé-Perthes disease. A critical review of the literature. J Bone Joint Surg Am. 1994;76:448-58.
38. Herring J. Legg-Calvé-Perthes disease. Rosemont, IL: American Academy of Orthopaedic Surgeons; 1996.
39. Hirohashi K, Kanbara T, Kuroda K, Okajima M, Hyashi M, Shimazu A. Perthes' disease—a classification based on the extent of epiphyseal and metaphyseal involvement. Int Orthop. 1980;4:47-55.
40. Hoffinger SA, Henderson RC, Renner JB, Dales MC, Rab GT. Magnetic resonance evaluation of "metaphyseal" changes in Legg-Calvé-Perthes disease. J Pediatr Orthop. 1993;13:602-6.
41. Hoikka V, Poussa M, Yrjonen T, Osterman K. Intertrochanteric varus osteotomy for Perthes' disease. Radiographic changes after 2-16-year follow-up of 126 hips. Acta Orthop Scand. 1991;62:549-53.
42. Ippolito E, Tudisco C, Farsetti P. Long-term prognosis of Legg-Calvé-Perthes disease developing during adolescence. J Pediatr Orthop. 1985;5:652-6.
43. Ippolito E, Tudisco C, Farsetti P. The long-term prognosis of unilateral Perthes' disease. J Bone Joint Surg Br. 1987;69:243-50.
44. Ismail A, Macnicol M. Prognosis in Perthes' disease: a comparison of radiological predictors. J Bone Joint Surg Br. 1998;80:310-4.
45. Joseph B. Morphological changes in the acetabulum in Perthes' disease. J Bone Joint Surg Br. 1989;71:756-63.
46. Keret D, Harrison MH, Clarke NM, Hall DJ. Coxa plana—the fate of the physis. J Bone Joint Surg Am. 1984;66:870-7.
47. Kiepurska A. Late results of treatment in Perthes' disease by a functional method. Clin Orthop. 1991;272:76-81.
48. Leitch JM, Paterson DC, Foster BK. Growth disturbance in Legg-Calvé-Perthes disease and the consequences of surgical treatment. Clin Orthop. 1991;262:178-84.
49. McAndrew MP, Weinstein SL. A long-term follow-up of Legg-Calvé-Perthes disease. J Bone Joint Surg Am. 1984;66:860-9.
50. Norlin R, Hammerby S, Tkaczuk H. The natural history of Perthes' disease. Int Orthop. 1991;15:13-6.
51. Poussa M, Yrjonen T, Hoikka V, Osterman K. Prognosis after conservative and operative treatment in Perthes' disease. Clin Orthop. 1993;297:82-6.
52. Reinker KA, Larsen IJ. Patterns of progression in Legg-Perthes disease. J Pediatr Orthop. 1983;3:455-60.
53. Salter RB, Thompson GH. Legg-Calvé-Perthes disease. The prognostic significance of the subchondral fracture and a two-group classification of the femoral head involvement. J Bone Joint Surg Am. 1984;66:479-89.
54. Song HR, Dhar S, Na JB, Cho SH, Ahn BW, Ko SM, Suh SW, Koo KH. Classification of metaphyseal change with magnetic resonance imaging in Legg-Calvé-Perthes disease. J Pediatr Orthop. 2000;20:557-61.
55. Sponseller PD, Desai SS, Millis MB. Abnormalities of proximal femoral growth after severe Perthes' disease. J Bone Joint Surg Br. 1989;71:610-4.
56. Stulberg SD, Cooperman DR, Wallensten R. The natural history of Legg-Calvé-Perthes disease. J Bone Joint Surg Am. 1981;63:1095-108.
57. Thompson GH, Salter RB. Legg-Calvé-Perthes disease. Current concepts and controversies. Orthop Clin North Am. 1987;18:617-35.
58. Tsao AK, Dias LS, Conway JJ, Straka P. The prognostic value and significance of serial bone scintigraphy in Legg-Calvé-Perthes disease. J Pediatr Orthop. 1997;17:230-9.
59. Yrjonen T, Poussa M, Hoikka V, Osterman K. Poor prognosis in atypical Perthes' disease. Radiographic analysis of 19 hips after 35 years. Acta Orthop Scand. 1992;63:399-402.
60. Katz JF. Minimal Legg-Calvé-Perthes disease. J Mt Sinai Hosp NY. 1968;35:408-16.
61. Dal Monte A, Andrisano A, Capanna R, Rubbini L. Long term results of centralising intertrochanteric osteotomy in Legg-Calvé-Perthes' disease. (Report on 60 cases). Ital J Orthop Traumatol. 1982;8:413-22.
62. Chigwanda PC. Early natural history of untreated Perthes' disease. Cent Afr J Med. 1992;38:334-42.
63. Willett K, Hudson I, Catterall A. Lateral shelf acetabuloplasty: an operation for older children with Perthes' disease. J Pediatr Orthop. 1992;12:563-8.
64. Kruse RW, Guille JT, Bowen JR. Shelf arthroplasty in patients who have Legg-Calvé-Perthes disease. A study of long-term results. J Bone Joint Surg Am. 1991;73:1338-47.
65. Catterall A. Adolescent hip pain after Perthes' disease. Clin Orthop. 1986;209:65-9.
66. Canario AT, Williams L, Wientroub S, Catterall A, Lloyd-Roberts GC. A controlled study of the results of femoral osteotomy in severe Perthes' disease. J Bone Joint Surg Br. 1980;62:438-40.
67. Yrjonen T. Prognosis in Perthes' disease after noncontainment treatment. 106 hips followed for 28-47 years. Acta Orthop Scand. 1992;63:523-6.
68. Olney BW, Asher MA. Combined innominate and femoral osteotomy for the treatment of severe Legg-Calvé-Perthes disease. J Pediatr Orthop. 1985;5:645-51.
69. Noonan KJ, Price CT, Kupiszewski SJ, Pyevich M. Results of femoral varus osteotomy in children older than 9 years of age with Perthes disease. J Pediatr Orthop. 2001;21:198-204.
70. Snyder CR. Legg-Perthes disease in the young hip—does it necessarily do well? J Bone Joint Surg Am. 1975;57:751-9.
71. Schoenecker PL, Stone JW, Capelli AM. Legg-Perthes disease in children under 6 years old. Orthop Rev. 1993;22:201-8.
72. Kelly FB Jr, Canale ST, Jones RR. Legg-Calvé-Perthes disease. Long-term evaluation of non-containment treatment. J Bone Joint Surg Am. 1980;62:400-7.
73. Sponseller PD, Desai SS, Millis MB. Comparison of femoral and innominate osteotomies for the treatment of Legg-Calvé-Perthes disease. J Bone Joint Surg Am. 1988;70:1131-9.
74. Martinez AG, Weinstein SL, Dietz FR. The weight-bearing abduction brace for the treatment of Legg-Perthes disease. J Bone Joint Surg Am. 1992;74:12-21.
75. Kamegaya M, Shinada Y, Moriya H, Tsuchiya K, Akita T, Someya M. Acetabular remodelling in Perthes' disease after primary healing. J Pediatr Orthop. 1992;12:308-14.
76. Evans IK, Deluca PA, Gage JR. A comparative study of ambulation-abduction bracing and varus derotation osteotomy in the treatment of severe Legg-Calvé-Perthes disease in children over 6 years of age. J Pediatr Orthop. 1988;8:676-82.
77. Menelaus MB. Lessons learned in the management of Legg-Calvé-Perthes disease. Clin Orthop. 1986;209:41-8.
78. Nomura T, Terayama K, Watanabe S. Perthes' disease: a comparison between two methods of treatment, Thomas' splint and femoral osteotomy. Arch Orthop Trauma Surg. 1980;97:135-40.
79. Edsberg B, Rubinstein E, Reimers J. Containment of the femoral head in Legg-Calvé-Perthes' disease and its prognostic significance. Acta Orthop Scand. 1979;50:191-5.
80. Lloyd-Roberts GC, Catterall A, Salamon PB. A controlled study of the indications for and the results of femoral osteotomy in Perthes' disease. J Bone Joint Surg Br. 1976;58:31-6.
81. Kruse RW, Guille JT, Bowen JR. Shelf arthroplasty in patients who have Legg-Calvé-Perthes disease. A study of long-term results. J Bone Joint Surg Am. 1991;73:1338-47.
82. Legg AT. Osteochondral trophopathy of the hip-joint. Surg Gynecol Obstet. 1916;22:307-23.
83. Waldenstrom H. The definite form of the coxa plana. Acta Radiol. 1922;1:384-95.
84. Goff CW. Legg-Calvé-Perthes syndrome and related osteochondroses of youth. Springfield, IL: Thomas; 1954.
85. Christensen F, Soballe K, Ejsted R, Luxhoj T. The Catterall classification of Perthes' disease: an assessment of reliability. J Bone Joint Surg Br. 1986;68:614-5.
86. Hardcastle PH, Ross R, Hamalainen M, Mata A. Catterall grouping of Perthes' disease. An assessment of observer error and prognosis using the Catterall classification. J Bone Joint Surg Br. 1980;62:428-31.
87. Van Dam BE, Crider RJ, Noyes JD, Larsen LJ. Determination of the Catterall classification in Legg-Calvé-Perthes disease. J Bone Joint Surg Am. 1981;63:906-14.
88. Ritterbusch JF, Shantharam SS, Gelinas C. Comparison of lateral pillar classification and Catterall classification of Legg-Calvé-Perthes' disease. J Pediatr Orthop. 1993;13:200-2.
89. Mukherjee K, Fabry G. Evaluation of the prognostic indices in Legg-Calvé-Perthes disease: statistical analysis of 116 hips. J Pediatr Orthop. 1990;10:153-8.
90. Wiig O, Terjesen T, Svenningsen S. Inter-observer reliability of radiographic classifications and measurements in the assessment of Perthes' disease. Acta Orthop Scand. 2002;73:523-30.
91. Herring JA, Hair M, Short D, Brown R; Legg-Perthes Study Group. Comparison of a computerized system of analysis (Gross-Harry) of Legg-Perthes radiographs with radiographic measurements by physicians. In: Uhthoff HK, Wiley JJ, editors. Behavior of the growth plate. New York: Raven Press; 1988. p 393-400.
92. Podeszwa DA, Stanitski CL, Stanitski DF, Woo R, Mendelow MJ. The effect of pediatric orthopedic experience on interobserver and intraobserver reliability of the Herring lateral pillar classification of Perthes disease. J Pediatr Orthop. 2000;20:562-5.
93. Pietrzak S, Napiontek M, Tomaszewski M. [Inter-observer variation of the Caterall and Herring classification in Perthes disease]. Chir Narzadow Ruchu Ortop Pol. 2000;65:33-8. Polish.
94. Specchiulli F, Scialpi L. Catterall versus Herring classification in Perthes' disease. Chir Organi Mov. 1997;82:289-93.
95. Meurer A, Schwitalle M, Humke T, Rosendahl T, Heine J. [Comparison of the prognostic value of the Catterall and Herring classification in patients with Perthes disease]. Z Orthop Ihre Grenzgeb. 1999;137:168-72. German.
96. Neyt JG, Weinstein SL, Spratt KF, Dolan L, Morcuende J, Dietz FR, Guyton G, Hart R, Kraut MS, Lervick G, Pardubsky P, Saterbak A. Stulberg classification system for evaluation of Legg-Calvé-Perthes disease: intra-rater and inter-rater reliability. J Bone Joint Surg Am. 1999;81:1209-16.
97. Donner A, Eliasziw M. Sample size requirements for reliability studies. Stat Med. 1987;6:441-8.
98. Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med. 1998;17:101-10.
99. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol. 1990;43:543-9.
100. Kraemer HC. Measurement of reliability for categorical data in medical research. Stat Methods Med Res. 1992;1:183-99.
101. Cicchetti DV, Feinstein AR. High agreement but low kappa: II. Resolving the paradoxes. J Clin Epidemiol. 1990;43:551-8.
102. Maclure M, Willett WC. Misinterpretation and misuse of the kappa statistic. Am J Epidemiol. 1987;126:161-9.
103. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-74.
104. Feldt LS, Brennan RL. Reliability in generalizability theory. In: Linn RL, editor. Educational measurement. Part I: theory and general examples. 3rd ed. Phoenix: Oryx Press; 1993. p 127-41.
105. Lappin K, Kealey D, Cosgrove A. Herring classification: how useful is the initial radiograph? J Pediatr Orthop. 2002;22:479-82.