R.L. Barrack and P.M. Morgan reply:
We would like to thank the letter writers for their interest in our study, "The Value of Intraoperative Gram Stain in Revision Total Knee Arthroplasty." We apologize for the error in Table I; the correct number in the last column should be 154 rather than 142. This error arose in the editing process and had no effect on any of the statistical analyses. There are clearly differences of opinion regarding the statistical tests used and the statistical terminology. To resolve these differences, we submitted the manuscript, the letter, and the raw data to the University of Minnesota Biostatistics Design and Analysis Center. The responses that follow are based on their analysis and conclusions.
The letter writers’ suggestion that intraoperative Gram stain may be at least as good as any one preoperative test (when evaluated individually) is initially intuitive but is ultimately a misleading way to evaluate the data. As has been shown by multiple authors [1-3], preoperative infection studies (erythrocyte sedimentation rate, C-reactive protein, cell count and differential of aspirated fluid) should not be interpreted individually but should instead be used together when evaluating a patient for periprosthetic infection. When used in combination, these studies have been shown to have excellent sensitivity and specificity and, for this reason, remain the essential diagnostic tool for the revision total joint surgeon. As previous authors have shown [2,4,5], the intraoperative Gram stain is a particularly poor study for detecting infection at the site of an arthroplasty. Although it adds nothing to the preoperative workup, it does add cost and potentially operative time while the surgeon waits for urgent Gram stain results.
In terms of our sample of patients, we used a large, multicenter registry that included all revision total knee arthroplasties performed over a set period of time. One center obtained preoperative aspiration routinely, whereas the other two used it selectively. This sampling method, although imperfect, is as free of implicit or explicit bias as is possible when using a multicenter registry.
We found no strong argument for replacing the familiar and interpretable measures of sensitivity, specificity, and positive/negative predictive value (SSPNPV) with the letter writers’ suggested methods. The Youden index (sensitivity + specificity - 1) is unable to differentiate between high sensitivity/poor specificity and poor sensitivity/high specificity. In the investigation for periprosthetic infection, we would therefore suggest that knowing a study's actual sensitivity and specificity is preferable.
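The point about the Youden index can be illustrated with a minimal sketch. The two sensitivity/specificity pairs below are hypothetical values chosen only for illustration; they are not results from the study.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Hypothetical test A: high sensitivity, poor specificity.
j_high_sens = youden_index(sensitivity=0.95, specificity=0.30)

# Hypothetical test B: poor sensitivity, high specificity.
j_high_spec = youden_index(sensitivity=0.30, specificity=0.95)

# Both tests collapse to the same index even though their clinical
# behavior (ruling out versus ruling in infection) is opposite.
print(f"J (high sensitivity test): {j_high_sens:.2f}")
print(f"J (high specificity test): {j_high_spec:.2f}")
```

Because the index is symmetric in its two inputs, it cannot indicate which of the two error types a test is prone to.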
In the context suggested, the diagnostic odds ratio is difficult to interpret and appears misleading (a diagnostic odds ratio of 250 for a test with poor sensitivity and high specificity); we question its appropriateness when compared with the more standard SSPNPV. Similarly, we have reservations about applying accuracy to the event of periprosthetic infection without a mechanism in place to correct for agreement by chance. The Youden index is used even more rarely than the diagnostic odds ratio in the orthopaedic literature, probably for the reasons noted, and there is certainly no basis to favor these measures over the SSPNPV analysis presented in the paper.
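A short sketch shows how both concerns can arise. All numbers are hypothetical and chosen only for illustration; they are not the study's data. Cohen's kappa is used here as one common chance-corrected agreement measure, which is an assumption, since the reply does not name a specific correction.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = [sens / (1 - sens)] / [(1 - spec) / spec]."""
    return (sensitivity / (1.0 - sensitivity)) / ((1.0 - specificity) / specificity)


def accuracy_and_kappa(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return raw accuracy and Cohen's kappa for a 2x2 table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # raw accuracy
    # Agreement expected by chance from the marginal proportions.
    test_pos, truth_pos = (tp + fp) / n, (tp + fn) / n
    expected = test_pos * truth_pos + (1 - test_pos) * (1 - truth_pos)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical test: poor sensitivity (20%), very high specificity (99.9%).
dor = diagnostic_odds_ratio(sensitivity=0.20, specificity=0.999)
print(f"Diagnostic odds ratio: {dor:.1f}")

# Hypothetical cohort of 100 revisions with 10 infections; the test
# detects 2 of them and produces no false positives.
acc, kappa = accuracy_and_kappa(tp=2, fp=0, fn=8, tn=90)
print(f"Accuracy: {acc:.2f}, Cohen's kappa: {kappa:.2f}")
```

In this hypothetical case the diagnostic odds ratio is close to 250 although the test misses 80% of infections, and the raw accuracy is high mainly because most patients are uninfected, whereas the chance-corrected agreement is far lower.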
Finally, the suggestion that the Mann-Whitney U test is favored over the Mann-Whitney t test with Gaussian approximation, or that a decision matrix is more appropriate than chi-square, is a distinction without a difference. In both cases, these are minor differences in terminology that have no impact on the statistical analysis.
In summary, there is currently no basis to support an increased use of intraoperative Gram stain for the diagnosis of periprosthetic infection. The Gram stain, while of great historical interest, has numerous inherent problems, including sampling error and tremendous operator dependence in both performing and interpreting the test. These problems are compounded by the fact that the test is frequently used intraoperatively, when time pressures can introduce even more variability. The existing literature, including this study, supports care and caution in using this test for clinical decision-making. Diligent performance and interpretation of a full set of preoperative serologic tests and, in most cases, examination of aspirated fluid are essential when treating the failed total knee arthroplasty. When this is done, the remaining role of the Gram stain is extremely limited.