To The Editor:
"Association Between Hospital and Surgeon Procedure Volume and
Outcomes of Total Hip Replacement in the United States Medicare
Population" (2001;83-A:1622-9), by Katz et al., provides the basis
for discussing one of the most important issues facing health care
today: the results of specialization. Their data support the concept
that specialists provide better outcomes. Specifically, in the Medicare
population, patients treated with primary total hip replacement
by surgeons who performed more than fifty of these procedures per
year had a markedly reduced complication rate in comparison with those
patients whose surgeons performed ten or fewer of these procedures per
year.
In a paper presented at the annual meetings of the American Academy
of Orthopaedic Surgeons and the American Shoulder and Elbow Surgeons
in 2001, we reported the results of a study that made use of the
1998 database of the Center for Medical Consumers (www.medicalconsumers.org/#Main_Index)
to determine the volume distribution among surgeons and hospitals
in New York State of total/partial shoulder replacements, total/partial
hip replacements, and total knee replacements [1]. We learned that
14,644 hip replacements, 12,328 knee replacements,
and 902 shoulder replacements were performed by 1175, 820, and 389
surgeons, respectively. Approximately 40% of surgeons who performed
hip and knee replacements in New York State performed ten or more
replacements in that year. In contrast, only ten (<3%) of all
surgeons who performed shoulder replacements did ten or more such
procedures in 1998, and more than three-quarters of these surgeons
performed only one or two. Seventy-eight percent of the shoulder
replacements were performed by surgeons who did ten or fewer of
these procedures per year, whereas <31% of hip and knee arthroplasties
were performed by such low-volume surgeons. More than forty percent
of patients who had shoulder arthroplasty were operated upon by
surgeons who performed only one or two of these procedures per year
(Table).
These results, coupled with those of Katz et al., suggest that
many patients are undergoing arthroplasty done by surgeons who do
not perform this procedure frequently, that the complication rate
is higher for these low-volume surgeons, and that the skew in the
distribution of experienced surgeons is more dramatic for shoulder
arthroplasty than it is for hip or knee replacement.
Patients routinely ask, "Who is the best person to do my procedure?"
The answers often given are: "Someone on the provider list of your
health plan," "someone near your home," and "someone suggested by
your primary-care physician." Rarely given are the answers: "Someone
who does a critical number of these procedures" and "someone who
can document his or her personal efficacy in treating the condition
in question." Where should the standard of excellence fit into the
formula for surgeon selection, and by what means can information
about surgeon experience be provided to patients considering surgery?
There are now over twenty articles documenting the correlation
between procedure volume and results of total joint replacement
in the peer-reviewed literature. Katz et al. provided another. What is
missing is a discussion of the underlying causes of this correlation.
The authors may wish to comment on the following.
• Is the busiest surgeon busiest because she or he does
the best job, i.e., is volume a marker of quality (as in the case
of restaurants, where the best ones tend to have the longest lines
out front)?
• Does the busiest surgeon do the best job because he
or she has done more, i.e., does "practice make perfect"?
• There is evidence that low-volume surgeons tend to
operate on patients who have a greater risk of complications [2].
Does a surgeon's experience improve patient selection (as in buying
art or watermelons)?
• Does high volume beget better support services for
a procedure, i.e., are the better nurses and therapists assigned
to frequently performed procedures (like the benefits assigned to frequent
fliers)?
• Is there a limit to the volume effect, or does quality
continue to improve with increasing volume?
What is also missing is a discussion of the implications of the
data. The authors may also wish to consider the following questions.
If volume data are important, for what procedures should surgeon-volume
data be collected, how, and by whom? If quality and volume are associated, shouldn't
the volume data be made accessible to patients so that they can consider
this information along with that regarding proximity and payer in making
the decision of where to have surgery? Is the surgeon or center
obligated to disclose volume as a part of informed consent? With
a few exceptions, such surgeon-specific data are difficult for patients
to acquire. For patients electing to have surgery performed by low-volume
surgeons, how can they be protected from the potential risks of
this choice? Are low-volume surgeons at enhanced legal risk? If
so, how might they be protected? In that low-volume surgeons have
a financial disincentive to refer their patients to high-volume
surgeons, how can this conflict of interest be best handled? Are the
American Academy of Orthopaedic Surgeons and implant manufacturers encouraging
surgeons to perform arthroplasties by holding "sawbones learning
centers," even though the surgeons who attend may perform only one
or two of these procedures per year? Is the volume effect transferable, i.e.,
if one is a high-volume surgeon in terms of performing hip arthroplasties, does
this experience apply to knee, ankle, and shoulder arthroplasties
as well? Recognizing that every surgeon begins his or her career
as a "low-volume surgeon," how can our educational process accommodate
the inevitability of the learning curve in a way that does not jeopardize
patient care? What do the effects of surgeon procedure volume suggest
to payers, such as Medicare, with respect to regionalization of major
surgical procedures? If low volumes of total hip replacement (i.e., <10/yr)
are associated with a higher complication rate, can it be inferred
that the same volume criterion (<10/yr) would apply to total
ankle, knee, elbow, and shoulder replacement as well as spinal instrumentation
and endoscopic carpal tunnel release, or does this type of study
need to be repeated for each procedure? If spine surgery is best
done by a spine surgeon, hip surgery is best done by a hip surgeon,
and hand surgery is best done by a hand surgeon, what is general
orthopaedics?
Answers to these questions have huge implications for surgical
education, practice distribution, and health-care financing. The Journal
and the orthopaedic community are challenged to consider these implications,
remembering that our first duty is to the patients we serve. What
is in their best interest?
J.N. Katz, E. Losina, C.B. Phillips, N.N. Mahomed, R.A.
Lew, W.H. Harris, R. Poss, J.A. Baron, A.H. Fossel, N. Maher, J.
Barrett, and J. Tullar reply:
We are pleased to respond to Dr. Matsen's thoughtful commentary
on our article. Dr. Matsen addressed scientific aspects of the association
between volume and outcomes, including causality and the need to
account for factors such as comorbidity that affect outcome, and the
clinical and health-care policy implications of our findings. Our response
addresses both of these considerations.
Scientific Considerations
Dr. Matsen raised the question of whether outcomes beget volume
(people flock to certain restaurants because the restaurants are
excellent) or volume begets outcomes (practice makes perfect). In
the absence of a randomized trial (which would probably be infeasible),
we cannot establish causality with certainty. Luft and colleagues [3]
proposed a method that makes use of cross-sectional data to gain
insight into the causal direction of associations between volume
and outcomes. We have adapted their approach, as follows.
We started by recognizing that hospitals with more beds and those
with teaching programs perform a higher volume of hip replacements.
Indeed, these two factors accounted for 26% of the variance in the
logarithm of total hip replacement volume in our analyses. We then
examined the association between hospital rate of mortality and the
residuals from this regression. (The residual is the difference
between a hospital's volume of total hip replacement that can be
predicted on the basis of its number of beds and its teaching status and
the actual volume of total hip replacement.) If outcomes drive volume, then
the residuals should be associated with mortality; that is, a hospital
with an especially high rate of mortality should have a lower annual
volume of total hip replacements than predicted on the basis of
its number of beds and its teaching status (because patients would avoid
the hospital). Similarly, a hospital with an especially low rate
of mortality should have a higher volume of total hip replacement
than predicted because patients would flock to the hospital. In fact,
the residuals explained virtually none (0.04%) of the variability
in mortality. This finding lends no support to the hypothesis that
outcomes of hip replacement drive volume, and it is more consistent
with a practice-makes-perfect mechanism.
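The residual analysis described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not our actual analysis code: it assumes that volume is modeled on a logarithmic scale, it uses a squared Pearson correlation as a stand-in for "variability explained," and every function name and number in it is hypothetical.

```python
import math

def ols_fit(X, y):
    """Least-squares coefficients for y ~ X via the normal equations.
    Each row of X must already include a leading 1.0 for the intercept."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef

def volume_residuals(beds, teaching, volume):
    """Actual minus predicted log volume, where the prediction uses only
    bed count and teaching status (0 or 1). The sign convention does not
    affect the variance-explained figure computed below."""
    X = [[1.0, float(b), float(t)] for b, t in zip(beds, teaching)]
    y = [math.log(v) for v in volume]
    coef = ols_fit(X, y)
    predicted = [sum(c * x for c, x in zip(coef, row)) for row in X]
    return [yi - pi for yi, pi in zip(y, predicted)]

def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

if __name__ == "__main__":
    # Hypothetical toy hospitals, invented purely for illustration.
    beds = [100, 250, 400, 550, 700, 850]
    teaching = [0, 0, 1, 0, 1, 1]
    volume = [8, 20, 45, 30, 90, 120]
    mortality = [0.013, 0.011, 0.009, 0.012, 0.007, 0.008]
    residuals = volume_residuals(beds, teaching, volume)
    print(r_squared(residuals, mortality))
```

If outcomes drove volume, the printed share would be appreciable; a value near zero, like the 0.04% reported here, gives that hypothesis no support.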
Dr. Matsen also raised the question of selection bias, that is, whether
low-volume surgeons tend to operate on patients who are at greater
risk for complications. Indeed, patients who are operated on in
low-volume hospitals are more likely to be older, less educated,
nonwhite, and poor (our unpublished data). However, our analyses
adjusted for demographic and clinical factors, including age, race,
gender, arthritis diagnosis, comorbidity, and poverty status. Even
after this adjustment, low-volume hospitals and surgeons had worse
perioperative outcomes. Thus, an imbalance between high- and low-volume
centers for these variables does not account for the differences
in outcome. Of course, claims data are not ideal sources of information
on comorbidity and cannot account for differences in technical complexity
among cases. Thus, it remains possible that aspects of case severity
that we could not measure (or could not measure well) with claims data
may explain some of the differences in outcome.
Dr. Matsen asked whether the enhanced support services in high-volume
centers might account for the superior results. In work that is
not yet published, we examined whether hospital characteristics
explain the association between volume and outcome. Our analyses indicate
that hospital characteristics account for little of the effect of
volume on outcome, leaving us to conclude once again that a "practice-makes-perfect"
effect is the dominant mechanism. Dr. Matsen's comment also raises
the question of which has greater influence on outcomes: the experience
of the surgeon or the characteristics of the hospital? We examined
the independent effects of surgeon volume and hospital volume in
our analyses. As our article shows, mortality following primary total
hip replacement is driven largely by hospital volume and not by
surgeon volume. On the other hand, rates of dislocation and infection
are influenced by both hospital and surgeon volume, with surgeon
volume being the greater influence of the two. The finding that
some outcomes are driven more by surgeon volume and others, by hospital
volume has important implications for a patient's choice of hospital
and surgeon. For example, even within high-volume hospitals where
twenty-six to fifty total hip replacements per year are performed
in the Medicare population, patients whose surgeons perform five
or fewer of these procedures per year have dislocation rates that
are threefold those of patients whose surgeons perform more than
fifty hip replacements per year.
Dr. Matsen also asked whether there are discrete threshold volume
values above which outcomes become stable. In response to this comment,
we have split our highest volume stratum into two substrata, 100
to 150 procedures per year and more than 150. The mortality rates
were 0.57% for the highest-volume substratum (more than 150 procedures)
and 0.74% for the next substratum (100 to 150). These two mortality
rates are not significantly different, but the pattern shows no
evidence of a threshold. An analysis of dislocation yielded similar
results. These limited data suggest that higher volume is associated
with a better perioperative outcome at all points along the continuum
of hospital and surgeon volume, with no evidence of a discrete threshold.
Dr. Matsen asked whether these observations must be confirmed
for each individual surgical procedure (e.g., total shoulder arthroplasty)
or whether the associations between volume and outcome that are
observed in regard to one procedure can be generalized to others. A
recent review of the literature on associations between volume and
outcome found inverse associations in 77% of reports [4].
Thus, the association is not universal. We hesitate to generalize
the implications of our findings for hip replacement to other orthopaedic
procedures until more research on some of these procedures has been
performed.
Policy Implications
The decision of whether to have surgery in a high- or a low-volume
center is complex, especially if the patient lives a great distance
from a high-volume center. The advantages of care in a high-volume
center are clear. For example, we reported in our article that the
rate of mortality within ninety days of elective primary total hip
replacement in high-volume centers is just over half of that in
the lowest-volume centers (adjusted odds ratio = 0.58). While this is
impressive, the difference in absolute risk for ninety-day mortality
is modest (1.3% in hospitals where ten or fewer procedures per year
are performed versus 0.7% in hospitals where more than 100 are performed).
If we assume these mortality rates, then, for every 167 patients
whose care is transferred from a low-volume hospital where fewer than
ten total hip replacements per year are performed in the Medicare
population to a hospital where more than 100 are performed, one
life would be saved (1/0.006 = 167). On a national scale, if the
approximately 6700 patients who had a primary total hip replacement
in centers with annual procedure volumes of fewer than ten
in 1995 were instead referred to centers with procedure volumes
in excess of 100 per year, forty lives would have been saved. If these
patients were referred to centers where fifty-one to 100 procedures
were performed per year (mortality = 0.9%), twenty-seven lives would
have been saved.
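The arithmetic behind these estimates can be reproduced with a short Python sketch. The mortality rates and the patient count are the figures quoted above; the function names and the rounding conventions (rounding the number needed to treat up, and lives saved to the nearest whole number) are assumptions made for illustration.

```python
import math

def number_needed_to_treat(risk_low_volume, risk_high_volume):
    """Patients who must be shifted to the higher-volume setting to avert
    one death: the reciprocal of the absolute risk reduction, rounded up."""
    absolute_risk_reduction = risk_low_volume - risk_high_volume
    return math.ceil(1 / absolute_risk_reduction)

def lives_saved(n_patients, risk_low_volume, risk_high_volume):
    """Expected deaths averted if n_patients are shifted, rounded to the
    nearest whole number."""
    return round(n_patients * (risk_low_volume - risk_high_volume))

# Ninety-day mortality rates quoted in the text.
LOWEST_VOLUME = 0.013   # ten or fewer procedures per year
MID_HIGH_VOLUME = 0.009 # fifty-one to 100 procedures per year
HIGHEST_VOLUME = 0.007  # more than 100 procedures per year

print(number_needed_to_treat(LOWEST_VOLUME, HIGHEST_VOLUME))  # 167
print(lives_saved(6700, LOWEST_VOLUME, HIGHEST_VOLUME))       # 40
print(lives_saved(6700, LOWEST_VOLUME, MID_HIGH_VOLUME))      # 27
```

The same two functions can be reused with other pairs of rates to see how sensitive the estimate is to the assumed absolute risk difference.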
While the potential advantages of shifting patients from
low- to high-volume centers are easy to calculate, the disadvantages
of referral to a high-volume center are more subtle. Many patients prefer
to receive care in low-volume settings. The reasons that patients
select low-volume centers have not been well studied but likely
include the hospital affiliation of the surgeon to whom they are
referred, the recommendations of their primary-care physicians,
recommendations of family and friends, convenience of the location
for patients and their families and friends, and other factors.
Some patients might simply refuse to have the procedure if it could only
be performed at the distant high-volume center rather than at the
local low-volume hospital. This would have important effects on
quality-adjusted life expectancy. A patient with a ten-year life
expectancy who spends the remainder of his or her life with end-stage
hip arthritis would live two to five quality-adjusted life-years
less than would a patient who has a successful total hip replacement [5-7].
We have not modeled the trade-offs formally, but it is clear from
these examples that mandatory referral to a high-volume center saves
some lives at the expense of an unknown but potentially large number
of quality-adjusted life-years. Our data also suggest that patients
who elect not to travel to the high-volume center may be older, poorer,
and less educated. Thus, mandatory referral to high-volume centers could
exacerbate existing disparities in the utilization of total hip
replacement among whites, blacks, and Hispanics, as well as between
poor and nonpoor [8].
In response to another of Dr. Matsen's questions, we do not know
whether low-volume surgeons are at legal risk, but it would seem
prudent from this standpoint to fully disclose data on surgeon and
hospital volume. Our data do not provide answers to several other provocative
questions that Dr. Matsen raised, including how to align financial incentives
with referral to high-volume centers and how to manage the tension between
educating more surgeons in the techniques of arthroplasty and the resultant
increase in low-volume surgeons. We invite continued dialogue, research,
and policy analysis to address these important concerns.
These complex issues are especially critical because payers are
paying attention to volume. The Centers for Medicare and Medicaid
Services (CMS), which manages the Medicare program, has initiated
a pilot program that designates centers of excellence for total
hip and knee replacement surgery. Similar programs in cardiac surgery
were successful in reducing costs with no compromise in outcomes.
Volume is one of many indicators of quality used in the CMS project.
Payers in the private sector have also committed to using high-volume
providers. The Leapfrog Group, a consortium of major businesses
dedicated to improving health-care quality and efficiency, has identified
referral to high-volume providers as a strategic goal for improving employees'
health [9]. We believe that programs to restrict care to high-volume centers
should await formal, comprehensive policy analysis and that the
choice of hospital and surgeon should be left to the patient. We
agree with Dr. Matsen that the medical community has an obligation
to inform patients fully of these volume-outcome relationships and
of the volume of surgeries performed by specific surgeons and at
specific hospitals. As with many other complex medical and surgical
decisions, we believe that patient preferences should drive the choice
of surgeon and hospital and that our job as researchers and clinicians
is to inform patients fully and to help them to make choices that
are congruent with their preferences [10].