Background: Providing the best treatment options and appropriate
prognostic information to patients with cartilaginous neoplasms of long bones
depends on distinguishing benign from malignant lesions. Correlative
interpretation of imaging, histopathology, and clinical information is the
current method for making this distinction, yet the reliability of this
approach has not been critically evaluated. This study quantifies the
interobserver reliability of the determination of grade for cartilaginous
neoplasms among a group of experienced musculoskeletal pathologists and radiologists.
Methods: Nine recognized musculoskeletal pathologists and eight
recognized musculoskeletal radiologists reviewed forty-six consecutive cases
of cartilaginous lesions in long bones that underwent open biopsy or
intralesional curettage. All diagnosticians had a bulleted history and
preoperative conventional radiographs for review. Pathologists reviewed the
original hematoxylin and eosin-stained glass slides from each case.
Radiologists reviewed any additional imaging that was available, variably
including serial radiographs, magnetic resonance imaging, and computed
tomography scans. Each diagnostician classified a lesion as benign, low-grade
malignant, or high-grade malignant. Kappa coefficients were calculated as a
measure of reliability.
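The abstract does not say which kappa statistic was used. As an illustration only, the following is a minimal sketch of Fleiss' kappa, one common choice when more than two raters assign each case to one of several categories (here benign, low-grade malignant, or high-grade malignant). The function name `fleiss_kappa` and the input layout (one row per case, one count per category) are assumptions made for this example, not the authors' method.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who placed case i in category j.
    Every case must be rated by the same number of raters (at least two).
    Illustrative sketch only; the source does not specify its kappa variant.
    """
    n_cases = len(counts)
    if n_cases == 0:
        raise ValueError("need at least one case")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per case")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must have the same number of ratings")
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per case: fraction of rater pairs that agree.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_expected == 1.0:
        # All ratings fall in a single category; kappa is undefined.
        raise ValueError("kappa undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: three cases, nine raters, categories
# (benign, low-grade malignant, high-grade malignant).
example = [
    [9, 0, 0],  # unanimous: benign
    [0, 9, 0],  # unanimous: low-grade malignant
    [0, 0, 9],  # unanimous: high-grade malignant
]
print(fleiss_kappa(example))
```

A value of 1 indicates complete agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.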
Results: Kappa coefficients for interrater reliability were 0.443
for the pathologists and 0.345 for the radiologists (p < 0.0001 for both).
Kappa coefficients for a subgroup of cases determined to be high risk by
subsequent clinical course were poorer at 0.236 and 0.206, respectively (p
< 0.0001 for both). Slightly improved agreement among radiologists was
noted for the twenty lesions that had magnetic resonance imaging available
(kappa = 0.437, p < 0.0001), but not for the lesions analyzed with serial
plain radiographs or computed tomography scans.
Conclusions: This study demonstrates low reliability for the grading
of cartilaginous lesions in long bones, even among specialized and experienced
pathologists and radiologists. This included low reliability both in
differentiating benign from malignant lesions and in differentiating
high-grade from low-grade malignant lesions, both of which are critical to the
safe treatment of these neoplasms. This may explain in part the wide variation
in outcomes reported for chondrosarcomas treated in different medical centers.
New diagnostic and grading strategies linked to protocol-driven treatments are
needed, but they must be measured against the long-term gold standard of