Metal-on-metal hip resurfacing is a popular treatment for hip osteoarthritis in younger patients, as it offers the advantage of low wear and a reduced dislocation rate1-4. To date, more than 80,000 resurfacing hip replacements have been performed worldwide and this number is increasing, in part because of the recent approval of these devices by the United States Food and Drug Administration. Recently, several authors have raised concerns about destructive lesions that develop in some patients with these devices5-9. These lesions are often heterogeneous in their clinical presentation and, as a result, have been variously described as cysts10,11, bursae, soft-tissue reactions8,12-15, adverse reactions to metal debris16, and pseudotumors, and often cause severe local destruction to bone and soft-tissue7. The cause of these lesions is unknown, although some authors have implicated toxicity8,17 while others have suggested a hypersensitivity reaction18 related to the metal wear debris from these implants.
One way to investigate the link between metal wear debris and soft-tissue reactions in large metal-on-metal bearings is to measure the location and magnitude of wear on the surface of retrieved implants. However, the measurement of wear in hard-on-hard bearing couples is difficult and requires a highly accurate measurement technique. Conventional techniques lack resolution for measuring linear wear and/or require reconstruction of the surface for volumetric wear determinations. All measurement techniques can potentially give inaccurate results if the implant is damaged during removal, or if there are low levels of wear19. Optical profilometry can accurately identify the location, depth, and volume of the wear scar20,21, on both femoral and acetabular components, even in low-wear situations. This method can also accurately detect edge wear of the acetabular component, which can significantly increase bearing surface wear in hard-on-hard bearing couples20,21.
This study had two hypotheses. The primary hypothesis was that implants that were retrieved because of pseudotumor would have higher linear and volumetric wear than implants that were retrieved because of other conditions, and the secondary hypothesis was that this wear could be attributed to an increased prevalence of edge wear in patients with pseudotumor.
We have established an Implant Retrieval Bank at our institute and, as part of this effort, we have collected a number of large-joint metal-on-metal implants that were retrieved during revision procedures in patients with pseudotumor and other conditions. We define a pseudotumor as a solid mass, with or without cystic components, which is neither neoplastic nor infective. We identified thirty-nine resurfacing implants, all of which were retrieved during revision procedures performed at our institution. During the revision procedure, the operative findings were recorded and the implants were carefully explanted so as to minimize damage to the components. Tissue samples were obtained intraoperatively for histopathological and microbiological examination. A diagnosis of pseudotumor was made on the basis of criteria established at our institution and was based on imaging, surgical, and histopathological findings7, and the implants that were retrieved from patients with this diagnosis made up the pseudotumor group. Microbiological testing was performed to evaluate the presence of possible infection. All implants revised for a diagnosis other than pseudotumor were defined as control implants (i.e., the control group). Implants were not included in the study if the reason for revision was not clear.
Three implants were excluded from the study group because the reason for revision was in doubt; these implants were also not used as part of the control group. In total, the implants retrieved from thirty-six patients were analyzed (eighteen implants in the pseudotumor group and eighteen implants in the control group) (Table I).
This study had institutional review board approval (09/H0606/11). Demographic variables and details of the original primary surgery were collected preoperatively. The preoperative and immediate postoperative radiographs from the revision procedure were identified. One implant in the pseudotumor group and none in the control group were from patients who had evidence of cutaneous allergy to metals.
Of the thirty-six implants in this study, 47% were from patients who had undergone their primary surgery at our institution and 53% were from patients who had received their implants at another center. The acetabular and femoral components were stored in 10% formaldehyde solution to preserve bone and soft tissue.
The mean time from the primary operation to revision and the mean time from the onset of symptoms to revision are reported for both groups in Table I.
Linear and Volumetric Wear Measurements
Bearing surface wear was measured with a noncontact optical coordinate measuring system (Artificial Hip Profiler; RedLux, Southampton, United Kingdom) and was characterized in terms of linear wear, volumetric wear, and local radius. The highly reflective spherical metal surfaces of the implants were scanned with a noncontact point sensor, and the positions of the surface points in space were recorded, thus allowing the creation of a map of the surface within a few minutes and with a resolution of 20 nm, with no stitching required22. A best-fit sphere was then fitted to the unworn part of the surface, so that the deviation from the perfect sphere and the local radius at each point could be computed22.
The wear-patch contour is easily defined on the local radius map, and the points within the wear patch were excluded from the original sphere radius and center calculation. The software was then used to calculate the linear deviation from the assumed original sphere to determine the linear wear and to calculate the wear volume of the wear patch as volumes of discrete prisms summed across the cloud of data points. For the cup, wear was assessed from measurements taken up to a distance of 0.5 mm from the edge of the acetabular component. This distance was chosen to minimize artifact resulting from the measurement technique and to minimize the effect of variations in cup design, thus providing a conservative estimate of wear.
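The fitting and deviation steps described above can be sketched as follows. This is a minimal illustration in Python, not the Artificial Hip Profiler software: the algebraic least-squares fit, the function names, and the synthetic 25-mm head with a 0.02-mm-deep polar wear patch are all assumptions made for the example. The prism-based volume summation is omitted.

```python
import math

def solve_linear(a, b):
    """Solve the small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_sphere(points):
    """Algebraic least-squares fit of x^2 + y^2 + z^2 = 2ax + 2by + 2cz + d."""
    ata = [[0.0] * 4 for _ in range(4)]
    atb = [0.0] * 4
    for x, y, z in points:
        row = (2 * x, 2 * y, 2 * z, 1.0)
        rhs = x * x + y * y + z * z
        for i in range(4):
            atb[i] += row[i] * rhs
            for j in range(4):
                ata[i][j] += row[i] * row[j]
    a, b, c, d = solve_linear(ata, atb)
    return (a, b, c), math.sqrt(d + a * a + b * b + c * c)

def radial_deviation(points, center, radius):
    """Signed distance of each point from the fitted sphere (negative = material lost on a head)."""
    return [math.dist(p, center) - radius for p in points]

# Synthetic femoral head (hypothetical values): nominal radius 25 mm,
# wear patch within 0.3 rad of the pole, 0.02 mm deep at its centre.
TRUE_CENTER, TRUE_RADIUS, PATCH_ANGLE, MAX_DEPTH = (1.0, -2.0, 0.5), 25.0, 0.3, 0.02
all_points, unworn_points = [], []
for i in range(31):
    theta = i * (math.pi / 2) / 30
    worn = theta < PATCH_ANGLE
    depth = MAX_DEPTH * (1 - (theta / PATCH_ANGLE) ** 2) if worn else 0.0
    r = TRUE_RADIUS - depth
    for j in range(36):
        phi = j * 2 * math.pi / 36
        p = (TRUE_CENTER[0] + r * math.sin(theta) * math.cos(phi),
             TRUE_CENTER[1] + r * math.sin(theta) * math.sin(phi),
             TRUE_CENTER[2] + r * math.cos(theta))
        all_points.append(p)
        if not worn:
            unworn_points.append(p)

# Fit only the unworn points, then measure every point against that sphere.
center, radius = fit_sphere(unworn_points)
linear_wear_mm = -min(radial_deviation(all_points, center, radius))
print(f"fitted radius = {radius:.4f} mm, max linear wear = {linear_wear_mm * 1000:.1f} um")
```

Excluding the worn points from the fit matters: if they were included, the fitted sphere would be pulled toward the wear patch and the linear wear would be underestimated.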
Edge-Wear Measurements
The presence or absence of edge wear was determined visually. For the acetabular component, edge wear was deemed to be present if the wear scar traversed the edge of the acetabular socket. For every cup with edge wear, the corresponding femoral head showed a flattened stripe on its local radius map. For the implants for which no cup was retrieved, head stripes were therefore used as evidence of edge wear. With use of this method, cups and femoral heads were classified as having “edge wear” or “no edge wear.”
In addition, an objective method for quantifying edge wear was developed for the femoral component by measuring the aspect ratio (length:width) of the wear patch. A program was developed that isolated the top 2.5% of values in the local radius data point distribution. This provided us with an automated method to find the flattest part of the wear patch. On that part of the patch, length and width distances between four cardinal points were calculated. These distances were calculated for all femoral heads for which a wear scar was visible on the wear map. The aspect ratio measurements were correlated with the visual classification of edge wear.
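One way such an aspect-ratio calculation could be implemented is sketched below. The source does not specify how the four cardinal points were located, so this example assumes that the patch extents are measured along the principal axes of the selected points; the function name, the flat (u, v) map coordinates, and the synthetic stripe are likewise hypothetical.

```python
import math

def wear_patch_aspect_ratio(samples, top_fraction=0.025):
    """samples: (u, v, local_radius) tuples on a flattened wear map.

    Keeps the points whose local radius lies in the top `top_fraction` of the
    distribution (ties at the cut-off are kept), then returns length / width
    of that region measured along its principal axes.
    """
    radii = sorted((s[2] for s in samples), reverse=True)
    k = max(1, round(top_fraction * len(radii)))
    cutoff = radii[k - 1]
    patch = [(u, v) for u, v, r in samples if r >= cutoff]

    n = len(patch)
    mu = sum(u for u, _ in patch) / n
    mv = sum(v for _, v in patch) / n
    suu = sum((u - mu) ** 2 for u, _ in patch)
    svv = sum((v - mv) ** 2 for _, v in patch)
    suv = sum((u - mu) * (v - mv) for u, v in patch)

    # Orientation of the major principal axis of the selected points.
    angle = 0.5 * math.atan2(2 * suv, suu - svv)
    ca, sa = math.cos(angle), math.sin(angle)
    along = [(u - mu) * ca + (v - mv) * sa for u, v in patch]
    across = [-(u - mu) * sa + (v - mv) * ca for u, v in patch]
    length = max(along) - min(along)
    width = max(across) - min(across)
    return math.inf if width == 0 else length / width

# Synthetic map (hypothetical): a flattened stripe about four times longer than wide.
stripe = [(u, v, 25.0 + math.exp(-((u / 4) ** 2 + v ** 2)))
          for u in (-10 + 0.5 * i for i in range(41))
          for v in (-10 + 0.5 * j for j in range(41))]
aspect = wear_patch_aspect_ratio(stripe)
print(f"aspect ratio = {aspect:.2f}")
```

A stripe-shaped patch gives a high ratio, whereas a roughly circular polar patch gives a ratio close to 1.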
Reliability Testing of the Wear-Measurement Method
Because optical profilometry is a relatively new measurement technique, its reliability and repeatability were assessed against gravimetric analysis (Sartorius ME235S; Sartorius, Goettingen, Germany; resolution, 0.01 mg; linearity, ≤0.1 mg) on six new resurfacing implants on which wear patches had been artificially induced. In addition, the interobserver reliability was determined with use of five explanted resurfacing implants, each measured by three individuals (S.G.-J., A.R., and H.S.G.).
Radiographic Measurements
Acetabular component abduction angle and anteversion were measured from supine anteroposterior postoperative radiographs, with use of Ein Bild Roentgen Analyse-Femoral Component Analysis software (EBRA-FCA, Innsbruck, Austria)23,24. The anteroposterior head-neck ratio, which has been associated with an increased risk of pseudotumor25, was also calculated.
Statistical Methods
The distribution of continuous data (wear and demographic) was assessed with use of the Shapiro-Wilk test. Differences between the pseudotumor and the control groups were tested with use of the Fisher exact test for categorical data and t test or Mann-Whitney test for continuous data, depending on distribution. The reliability of the Artificial Hip Profiler measurements and the interobserver reliability were determined with use of intraclass correlation coefficients. Multivariate analysis was used to test for the effect of sex on volumetric and linear wear. Odds ratios were calculated for edge wear.
Correlations between volumetric and linear wear and between head and cup wear were assessed with use of the Spearman rho test. Throughout the analysis, standard deviations were used to describe the spread of the data, and the level of significance was set at 5% (p < 0.05).
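For readers unfamiliar with the rank correlation used here, the following is a minimal sketch of Spearman's rho computed as the Pearson correlation of average ranks. It is not the statistical package used in the study, the function names are hypothetical, and the wear values are invented for illustration only.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Invented example values (not study data): linear wear in um, volumetric wear in mm^3.
linear_wear = [1.0, 4.0, 2.5, 30.0, 8.0]
volumetric_wear = [0.2, 0.9, 0.5, 12.0, 3.0]
rho = spearman_rho(linear_wear, volumetric_wear)
print(f"rho = {rho:.2f}")
```

Because the statistic depends only on ranks, a few very high-wearing implants do not dominate it, which suits wear data that are not normally distributed.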
Source of Funding
The National Institute for Health Research (NIHR) Biomedical Research Unit of the Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), University of Oxford, funded this study. Finsbury Orthopaedics (Leatherhead, United Kingdom) performed all wear analysis, without charge.
Demographic Data
Thirty-six femoral heads and twenty-four acetabular components were retrieved from thirty-six hip-resurfacing implants (four Cormet [Corin Medical, Cirencester, United Kingdom], nineteen Birmingham [Midland Medical Technologies, Birmingham, United Kingdom], and thirteen Conserve [Wright Medical, Arlington, Tennessee]). Fewer acetabular components were retrieved because it has been common practice in our healthcare system to retain the cup in cases of femoral neck fracture and femoral head osteonecrosis.
There were more females in the pseudotumor group than in the control group (89% [sixteen of eighteen patients] in the pseudotumor group, and 56% [ten of eighteen patients] in the control group; p = 0.04). Of the eighteen patients in the control group, seven had revision arthroplasty for fracture; six, for infection; three, for instability; one, for pain; and one, for loosening due to femoral head osteonecrosis.
The results of the Shapiro-Wilk test for normality indicated that the acetabular anteversion and inclination measurements were normally distributed (p > 0.05). There was no difference in the mean acetabular inclination or anteversion between the pseudotumor and control groups (Table II). However, there was a greater range of acetabular component inclination in the pseudotumor group (Fig. 1), with 50% (9 of 18 components) falling outside the Lewinnek safe zone26 for inclination and 28% (5 of 18 components) for anteversion, compared with 33% (6 of 18) and 28% (5 of 18), respectively, in the control group.
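The safe-zone tally can be illustrated with a short sketch. The limits below are the commonly cited Lewinnek values (inclination 40° ± 10°, anteversion 15° ± 10°); the source does not restate them, and the angles and function names are hypothetical rather than study data.

```python
# Commonly cited Lewinnek limits, in degrees (assumed; not restated in the source).
SAFE_INCLINATION = (30.0, 50.0)
SAFE_ANTEVERSION = (5.0, 25.0)

def outside(angle_deg, limits):
    """True if the angle falls outside the closed interval `limits`."""
    low, high = limits
    return angle_deg < low or angle_deg > high

def percent_outside(angles_deg, limits):
    """Return (count, rounded percentage) of cups outside the given limits."""
    flagged = sum(outside(a, limits) for a in angles_deg)
    return flagged, round(100 * flagged / len(angles_deg))

# Hypothetical inclination angles, for illustration only.
example_inclinations = [38.0, 52.5, 61.0, 44.0, 29.0, 47.5]
count, percent = percent_outside(example_inclinations, SAFE_INCLINATION)
print(f"{count} of {len(example_inclinations)} cups ({percent}%) outside the inclination zone")
```

The same rounding reproduces the percentages quoted in the text from their counts (9, 5, and 6 of 18 components).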
Linear and Volumetric Wear Measurements
The results of the Shapiro-Wilk test indicated that the wear data were not normally distributed (p = 0.004 to p < 0.001).
The wear data are summarized in Table III. For the whole set of measured implants, there was a highly significant correlation between linear and volumetric wear in both the femoral (ρ = 0.93, p < 0.001) and acetabular (ρ = 0.84, p < 0.001) components. There was also a significant cross-correlation between the femoral component wear and the acetabular component wear (volumetric wear: ρ = 0.87, p < 0.001; linear wear: ρ = 0.65, p < 0.001).
There was no significant correlation between wear and cup anteversion, cup abduction angle, or femoral head-neck ratio (Fig. 1).
The linear wear rate of the femoral components in the pseudotumor group (8.4 ± 8.7 μm/yr) was significantly greater than the wear rate in the control group (2.9 ± 3.9 μm/yr, p = 0.01; Fig. 2-a). This difference probably resulted from the effect of three high-wearing implants in the pseudotumor group. The volumetric wear rate of the femoral components in the pseudotumor group (3.3 ± 5.7 mm³/yr) was also significantly greater than that in the control group (0.8 ± 1.2 mm³/yr, p = 0.009; Fig. 2-b).
Similar differences in wear were observed in the acetabular components, in which the linear wear rate in the pseudotumor group (16.1 ± 21.4 μm/yr) was significantly greater than that in the control group (1.0 ± 1.5 μm/yr, p = 0.001; Fig. 2-d). The volumetric wear rate in the pseudotumor group (2.5 ± 6.3 mm³/yr) was also significantly greater than that in the control group (0.4 ± 0.8 mm³/yr, p = 0.01; Fig. 2-c).
Multivariate and univariate analysis did not reveal a significant effect of sex on wear (p = 0.61 and p = 0.45, respectively).
Edge-Wear Measurements
Analysis identified two distinctive visual patterns, polar wear and edge wear, on both the femoral and acetabular components (Fig. 3). The aspect ratio measurements of bearings that had a visual classification of edge wear were significantly higher than those of bearings with no edge wear (Fig. 4-a). All bearing surfaces with visible edge wear had an aspect ratio of >3, whereas only one implant with polar wear had an aspect ratio of >3. An aspect ratio of >3 was therefore used as an indicator of edge wear. According to this classification, 67% (twenty-four of thirty-six, odds ratio = 2.03) of patients had evidence of edge wear (aspect ratio >3) on either the femoral or acetabular component; 94% (seventeen of eighteen, odds ratio = 17.16) of subjects in the pseudotumor group had edge wear, compared with 33% (six of eighteen, odds ratio = 0.49) in the control group (p < 0.001). Subjects with signs of edge wear had significantly greater linear wear, volumetric wear, linear wear rates, and volumetric wear rates than did subjects who did not have edge wear (Fig. 4-b).
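The group comparison of edge-wear prevalence can be checked with a two-sided Fisher exact test on the reported counts (seventeen of eighteen in the pseudotumor group and six of eighteen in the control group). The sketch below is a plain-Python illustration under that assumption, not the statistical package used in the study, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance keeps tables that tie with the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Rows: pseudotumor, control; columns: edge wear, no edge wear (counts from the text).
p_value = fisher_exact_two_sided(17, 1, 6, 12)
print(f"p = {p_value:.5f}")
```

With these counts the p value falls below 0.001, which is consistent with the significance level reported above.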
Wear-Measurement-Method Reliability Tests
There was a high reliability of the optical profilometry method as compared with the gravimetric method (intraclass correlation coefficient = 0.99). There was also a high interobserver reliability (intraclass correlation coefficient = 0.99).
The aims of this study were to measure the magnitude of wear and the prevalence of edge wear in resurfacing implants retrieved from patients who underwent revision procedures for the treatment of pseudotumor, and to compare these results with a control group of implants from patients who underwent revision for other reasons. This study demonstrated significantly greater wear in the implants that were retrieved from patients with pseudotumor as compared with the implants retrieved from control patients. There was over four times more total linear wear and over three times more total volumetric wear of the femoral and acetabular components in the pseudotumor group than there was in the control group (Table III). Our study also showed that patients with pseudotumors have a significantly higher prevalence of edge wear than controls.
Wear
High wear was detected on the femoral and acetabular components of implants revised for pseudotumor. In addition, the average linear and volumetric wear rates of both components were significantly higher in the pseudotumor group, which suggests that, in most cases, pseudotumor is associated with increased bearing surface wear.
Some authors have demonstrated that pseudotumor tissue displays a mixed inflammatory picture on histological analysis, with features of both a foreign body response to wear debris and features of a Type-IV hypersensitivity reaction7. In vitro studies suggest that pseudotumors are caused by toxicity, as cobalt-chromium particles cause rapid macrophage death at high concentrations27. The results of our study suggest that metal wear-related toxicity is responsible for a significant proportion of pseudotumors; however, there are outliers in both the pseudotumor and control groups.
It is difficult to define high and low wear in metal-on-metal bearings, as there are almost no long-term studies linking wear to outcome. Schmalzried et al. measured the long-term linear wear in retrieved large-head metal-on-metal couples, associating good outcome with a total linear wear rate of approximately 4 μm/year (head and cup)28. Others have reported total linear wear rates in well-functioning devices of 0.8 to 14 μm/year29. Using 4 μm/year as a threshold and using the femoral head linear wear rate as an indicator, we have divided our patients into two groups: those with “high” wear (≥4 μm/year), and those with “low” wear (<4 μm/year). This stratification revealed that a wear rate that is ≥4 μm/year does not always give rise to a pseudotumor, as four patients in the control group of our study had “high” wear and did not have any histological or clinical evidence of pseudotumor (Fig. 5). Conversely, two patients in the pseudotumor group had “low” wear and still developed florid clinical reactions that were consistent with pseudotumors (Fig. 5). This suggests that the cause of pseudotumor may not be exclusively related to metal toxicity, but may also have an allergic component, as suggested by some authors18.
Patients with pseudotumor had increased wear when compared with the control group; the reason for this increased wear is not yet clear but may possibly be due to edge-loading. Some authors have suggested that implant design affects wear17. Other studies have demonstrated a clear correlation between cup inclination and serum ion concentration30. De Haan et al. have demonstrated that a cup inclination angle of >55° is associated with a fourfold increase in serum cobalt and chromium ion concentrations30, which appears to be related to an increased likelihood of edge-loading and wear.
Edge Wear
In our study, patients with a pseudotumor were nearly three times as likely to have signs of edge wear as control patients were (94% compared with 33%). The reason for this is unclear, as there was no significant difference in the mean component abduction angle between the two groups. However, there was greater variability in the component abduction angle in the pseudotumor group. Analysis performed on retrieved ceramic bearings suggests that edge-loading is influenced by acetabular cup anteversion and inclination31. The mechanism of edge wear may be related to impingement between the femoral neck and the acetabular component or the pelvis. Any mechanism that leads to microseparation of the bearing surfaces is likely to cause edge wear, and the edge wear will be more severe if the bearing surfaces are highly loaded31. Sixteen of the eighteen patients in the pseudotumor group in our study were female, compared with ten of eighteen patients in the control group. Women are known to have a greater range of hip motion32 and increased joint laxity33 as compared with men; this may lead to an increased likelihood of impingement and edge-loading34,35. Two of the three patients who underwent revision for instability in our study demonstrated signs of femoral neck impingement against the socket. Women are also more likely to have cutaneous sensitivity to metals36,37. Although this is not associated with an overall increased risk of revision in conventional hip arthroplasty, there may be a higher risk associated with metal-on-metal couples37. It is important to note, however, that there was a predominance of females in the pseudotumor group, which may have led to selection bias.
A larger study with a more even sex distribution may eliminate this bias; however, as the majority of patients who undergo revision procedures for the treatment of pseudotumor are female, we believe that this study cohort is representative because it reflects the epidemiological distribution that is associated with this condition38.
Limitations and Conclusions
One potential limitation of this study is that we may have underestimated the magnitude of acetabular wear. In theory, during edge-loading, the maximum wear will occur on the edge of the acetabular component. Our measurement protocol measures wear to within 0.5 mm of the edge of the cup so as to not overestimate wear; consequently, the theoretical point of maximum wear is not fully assessed. However, despite this limitation, there was still a significant difference seen between the pseudotumor and control groups, and this limitation does not apply to femoral heads. A further limitation of this study relates to our reporting of the relationship between implant position and wear; our study of this relationship was underpowered due to small sample size.
In summary, this study demonstrates that the hip resurfacing implants of patients who underwent revision for the treatment of pseudotumor have a significantly greater wear rate than the hip resurfacing implants of patients who underwent revision for other conditions. There is a significant association between pseudotumor and increased metal-on-metal bearing wear. However, we have also demonstrated that not all patients with high wear develop a pseudotumor and not all patients with pseudotumor have high wear. This suggests that at least some diagnosed “pseudotumors” may be caused by a different mechanism, such as allergy, rather than toxicity to metal wear debris.