It is often useful to have a pre-intervention health status measurement in clinical trials for the purpose of calculating a "change score" (post-intervention status versus pre-intervention status) or to model interactions that might affect outcome on the basis of the pre-intervention status. Baseline, pre-intervention health status can be measured in situations in which the intervention is planned and scheduled electively. However, in the absence of an actual baseline measurement (e.g., when intervention is undertaken for traumatic or unexpected conditions), the only way to obtain information on the pre-intervention health status is to ask patients to recall. There are several reasons why recalled information might not accurately reflect actual baseline health status. The patient may simply not remember accurately, which would produce a random error in either a positive or negative direction. However, recall error might be systematically biased in one direction or the other if the intervention (or its consequences) results in a changed perspective for a patient that makes him or her view the baseline status differently than he or she would have prior to the intervention, a phenomenon known as "response shift."
When considering whether it is valid to substitute recall baseline data for actual baseline health status, it is essential to know the types of patients and health conditions to which this practice might be reasonably applied and how far back in time recall memory remains accurate. Previous work has suggested that, for the study of patients who have had spinal operations or hip replacements, recall bias may not be an important limitation when recalled data are collected within six weeks of the intervention1,2. Recall up to three months after total hip replacement has been reported to be more accurate in young patients (i.e., patients who are less than sixty-five years of age) and for the measurement of pain and function as compared with stiffness and emotional status1. Furthermore, recalled data more closely reflect actual baseline health-status measurements in patients who have had a simple primary hip replacement rather than a complex revision procedure1.
In their paper, "Older Patients Can Accurately Recall Their Preoperative Health Status Six Weeks Following Total Hip Arthroplasty," Marsh et al. suggest that recall bias may not be a limitation in the use of retrospective baseline data at six weeks after total hip replacement. Data collected on the day of surgery (pre-intervention) correlated very well with preoperative data collected four weeks prior to surgery, and the day-of-surgery data were more accurate than recall data (collected up to six weeks after surgery). The authors point out that power was always better if a baseline adjustment was made (i.e., calculating a "change score") versus making simple group comparisons of only postoperative functional data. They also note that, while recalled data were similar to actual baseline data, an increase in within-subject variability resulted in a loss of power, which suggests that a larger sample size would be required when substituting recalled data for actual baseline health-status data. Although the magnitude was small, recalled data for the WOMAC, the SF-12 Physical, and the Lower Extremity Functional Scale measures tended to reflect a worse recalled quality of life compared with the actual baseline data. This suggests that patients may have experienced a response shift after total hip replacement surgery (or that an alternate reference point was used as per the implicit theory of change). A worse pre-intervention quality of life due to response shift has been documented previously for patients undergoing total joint replacement1,3,4. Response shift is a change in an individual's internal standards or values or in the conceptualization of his or her own quality of life5. It can be triggered by a life event or by an intervention that changes the person's frame of reference relative to that of others and to his or her own concept of ideal health status. Thus, the patient may respond to a quality-of-life questionnaire differently, independent of any treatment effect.
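The link between added variability and sample size can be illustrated with a minimal sketch. It assumes a two-sided, two-sample comparison of mean change scores under a normal approximation. The function name `n_per_group` and all numeric values are hypothetical and are not taken from Marsh et al.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd_change, alpha=0.05, power=0.80):
    """Approximate patients per group needed to detect a between-group
    difference `delta` in mean change score (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_beta) * sd_change / delta) ** 2)


# Hypothetical values, for illustration only.
delta = 5.0       # difference in mean change score worth detecting
sd_actual = 10.0  # SD of change scores using the actual baseline
sd_recall = 12.0  # larger SD assumed when the baseline is recalled

n_actual = n_per_group(delta, sd_actual)
n_recall = n_per_group(delta, sd_recall)
print(n_actual, n_recall)
```

Under these assumed values, a 20% increase in the standard deviation of the change score raises the required sample size by roughly 44%, because the sample size scales with the square of the standard deviation.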
Baseline health status data collected after intervention may in fact be more useful than actual preoperative data in situations in which a major change in patient perspective (response shift) causes the patient to conceptualize baseline health as worse after an intervention than he or she would have conceptualized it before receiving treatment. In a study of HIV-infected patients who were starting antiviral therapy, adjusting for retrospectively collected baseline data showed a stronger association with changes in clinical indicators than did prospective baseline data, resulting in a larger effect size and better power when using the retrospective data6. It should be noted that effect size is only increased if patients systematically report their baseline as significantly worse when asked retrospectively versus prospectively, since this provides a larger change score.
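The arithmetic behind this point can be sketched as follows. The example assumes that effect size is expressed as a standardized response mean (mean change divided by the standard deviation of change), which is one common choice and not necessarily the measure used in the cited study. All scores are invented for illustration, with higher scores meaning better health.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, post):
    """Mean change score divided by the SD of the change scores."""
    changes = [p - b for b, p in zip(baseline, post)]
    return mean(changes) / stdev(changes)


# Hypothetical scores for five patients (higher = better health).
post = [70, 65, 80, 75, 60]
prospective = [50, 48, 58, 57, 42]    # baseline measured before treatment
retrospective = [44, 43, 53, 50, 37]  # baseline recalled as worse afterward

srm_prospective = standardized_response_mean(prospective, post)
srm_retrospective = standardized_response_mean(retrospective, post)
print(srm_prospective, srm_retrospective)
```

In this invented data set, the recalled baseline is systematically lower, so the mean change score is larger and the standardized response mean rises. If recall added scatter without a systematic shift, the denominator would grow and the effect size would fall instead.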
In summary, it appears that recall bias at six weeks may not be a significant concern for older patients undergoing primary total hip replacement. Thus, retrospective baseline data can be reliably substituted for actual baseline data under these circumstances for group comparisons. While there is a small apparent response shift in the direction of an increased effect size, investigators need to be aware that the increased variability of recall data leads to a loss of power and that a larger sample size is therefore required when recalled rather than actual baseline data are used to study patient groups after total hip replacement surgery. Before the results obtained with patients undergoing scheduled total hip replacement are extrapolated to other conditions, such as unexpected trauma, much more information is required regarding the magnitude, direction, and consistency of response shift under those circumstances.
*The authors did not receive any outside funding or grants in support of their research for or preparation of this work. Neither they nor a member of their immediate families received payments or other benefits or a commitment or agreement to provide such benefits from a commercial entity.
1. Howell J, Xu M, Duncan CP, Masri BA, Garbuz DS. A comparison between patient recall and concurrent measurement of preoperative quality of life outcome in total hip arthroplasty. J Arthroplasty. 2008;23:843-9.
2. Hägg O, Fritzell P, Odén A, Nordwall A; Swedish Lumbar Spine Study Group. Simplifying outcome measurement: evaluation of instruments for measuring outcome after fusion surgery for chronic low back pain. Spine. 2002;27:1213-22.
3. Razmjou H, Yee A, Ford M, Finkelstein JA. Response shift in outcome assessment in patients undergoing total knee arthroplasty. J Bone Joint Surg Am. 2006;88:2590-5.
4. Razmjou H, Schwartz CE, Yee A, Finkelstein JA. Traditional assessment of health outcome following total knee arthroplasty was confounded by response shift phenomenon. J Clin Epidemiol. 2009;62:91-6.
5. Sprangers MA, Schwartz CE. Integrating response shift into health-related quality of life research: a theoretical model. Soc Sci Med. 1999;48:1507-15.
6. Nieuwkerk PT, Tollenaar MS, Oort FJ, Sprangers MA. Are retrospective measures of change in quality of life more valid than prospective measures? Med Care. 2007;45:199-205.