Abstract
Background: Although an investigator may limit
bias through randomization, concealment of patient allocation, and
blinding, the results of randomized trials may be less convincing
when the sample size is not sufficiently large to reveal a true
difference between treatment groups. When the sample size is small, randomized
trials are prone to beta errors (type-II errors), that
is, to concluding that no difference between treatment
groups exists when, in fact, there is a difference. The purpose
of this study of randomized trials involving fracture care published
between 1968 and 1999 was twofold: (1) to evaluate type-II error
rates and study power (1 - β) for the primary outcomes
and (2) to identify whether investigators clearly identified the
primary and secondary outcomes.
Methods: To be eligible, studies were required to
(1) be published in English, (2) be described as a randomized trial,
(3) involve the care of adult patients with fractures, treated either
operatively or nonoperatively, and (4) contain sufficient outcome
information to enable study power to be calculated. Computer database
searches were performed independently by two investigators to identify
all potentially relevant study titles. Additional strategies to
identify articles included (1) hand searches of selected orthopaedic
journals from 1989 to 1999, (2) searches of the bibliographies of
potentially relevant articles, and (3) review by content experts
to identify missing studies. For each study, a standard power calculation
was performed on the primary and secondary outcomes. For those studies
in which the primary outcome was not explicitly reported, the most
clinically relevant measure was chosen by consensus. Acceptable
study power was agreed a priori to be 80% (type-II
error of ≤0.20).
Results: We identified 620 potentially relevant
citations from MEDLINE, of which only 187 were potentially eligible.
We identified nine more articles with other searches, and application
of the eligibility criteria to the 196 articles eliminated seventy-nine.
Thus, we analyzed 117 studies in which a total of 19,942 patients
with orthopaedic trauma had been randomized. Sample sizes ranged
from ten to 662 patients (mean and standard deviation, 95 ± 79 patients).
The largest single group of trials (34%) involved the treatment of
hip fractures. The mean overall study power among the 117 trials
was 24.65% (range, 2% to 99%). The type-II
error rate for primary outcomes was 90.52%.
Conclusions: Mean type-II error rates in the orthopaedic
trauma trials that we analyzed exceeded accepted standards. Investigators
can reduce type-II error rates by performing power and sample-size
calculations prior to conducting a trial.
Although there is agreement that a randomized trial is the
best study design for the assessment of treatment effectiveness,
it is believed that trials of surgical therapies can be too small
to have a meaningful impact on clinical practice. Studies with a small
sample size are prone to beta errors (type-II errors), that
is, to concluding that no difference between treatment
groups exists when, in fact, there is a difference1-3. Typically, investigators accept
a beta error rate of 20% (β = 0.20),
which corresponds with a study power of 80%. Most investigators
agree that studies with beta error rates of >20% (study power
of <80%) carry an unacceptably high risk
of false-negative results4-6.
Therefore, although an investigator may limit bias through randomization,
concealment of patient allocation, and blinding, the results of
a randomized trial are less convincing when the sample size is not
sufficiently large to reveal a true difference between treatment
groups7.
Previous investigators have examined the prevalence of type-II
error rates in many medical fields4-6,8-11.
To our knowledge, there has been no systematic appraisal of the
type-II error rates in trials of orthopaedic trauma treatment. Given
the increased popularity of randomized trials in the orthopaedic
trauma literature, the purpose of our study was twofold: (1) to
evaluate the type-II error rates for primary and secondary outcomes
in studies in which "nonsignificant" results were
reported, and (2) to identify whether investigators clearly reported
the primary and the secondary outcomes in trials of treatments for
orthopaedic trauma.
Eligibility Criteria
We included studies that met the following eligibility criteria: (1)
published in English, (2) described as a randomized trial, (3) involved
the care of adult patients with fractures, treated either operatively
or nonoperatively, and (4) contained sufficient outcome information
to enable calculation of study power.
Study Identification
We conducted a search of MEDLINE from 1968 to 1999 with use of
the following keywords: "fractures (MeSH)" and "randomized
controlled trials (publication type)." The search was restricted
to "human subjects" and "English language" articles.
Two independent reviewers applied the eligibility criteria to potentially
eligible study titles. One of the two reviewers was trained in health
research methodology, and the other was an orthopaedic traumatologist
with experience in the conduct of randomized trials. After a second
application of the eligibility criteria to abstracts by the independent
reviewers, the complete articles for the potentially eligible studies
were retrieved. Two of us reviewed the methods section of each of the
retrieved articles to ensure that all inclusion criteria were met.
In addition to the MEDLINE searches, two of us performed a search
of the National Institutes of Health PubMed computerized database,
and one of us conducted a search of the Cochrane database. For both
searches, we used "fractures" and "randomized
trials" as keywords.
Additional strategies to identify relevant citations included: (1)
hand searches of the tables of contents of the Journal of
Orthopaedic Trauma, Journal of Trauma, Clinical Orthopaedics and
Related Research, and Acta Orthopaedica Scandinavica published
from 1989 to 1999; (2) review of the reference lists of eligible
(included) studies to identify other potentially eligible studies;
and (3) a review by content experts (traumatologists) of the list
of eligible studies to identify any missing studies.
Characteristics of Eligible Studies
Two reviewers independently abstracted general characteristics
of each eligible study. These included first author (surgeon, nonsurgeon,
or epidemiologist), epidemiology affiliation, geographic location,
category of intervention, body region of focus (upper extremity,
lower extremity, or spine), number of participating centers, and
whether or not the study was funded.
Determination of Primary Outcomes in Eligible
Studies
In the overwhelming majority (94%) of studies, multiple
outcomes were reported but the primary outcome was not identified.
The same reviewers independently reviewed all study outcomes in
each eligible trial and identified the most relevant outcome measure
as the "primary outcome." Relevant outcome measures
were considered those that pertained directly to the interventions
that were compared. Although in the majority of studies no explicit
statement was made about the primary outcome, the study title, abstract,
and introduction often contained information that could be used
to infer the authors’ intentions. When no information was
available, we used our best judgment to designate a primary outcome
for the study. We based our choice on the important clinical outcome
specific to the interventions that were compared. All other study
outcomes were designated as secondary outcomes. Any discrepancies
were resolved by consensus. The chosen primary outcome of each study
was described as positive (a difference between treatments) or negative
(no difference between treatments).
Calculation of Type-II Error Rates
A type-II error (beta error) occurs when investigators conclude
that there was no difference between two interventions when a difference
actually exists. Study power (1 - β) is the ability
of a study to show a difference when one actually exists (Fig. 1). Standard post
hoc power calculations were conducted for each outcome
in the studies that demonstrated no or nonsignificant differences between
treatment groups. The method of calculation used to determine study
power depended upon the type of outcome (continuous or dichotomous).
For continuous outcomes (such as time to fracture-healing in weeks),
standard power calculations were performed12 (see
Appendix [Formulae for Standard Power Calculations]). When
outcomes were dichotomous (such as the presence or absence of deep
infection), we chose to calculate study power by the method of Pocock12 (see Appendix [Formulae
for Standard Power Calculations]). In both instances, the
area under the curve for the calculated Zβ values
was determined from a standard normal curve table. The power was
calculated by subtracting the area from 1. Acceptable study power
was agreed a priori to be 80% (type-II
error of ≤0.20).
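The post hoc calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual procedure: it assumes two groups of equal size and a two-sided type-I error of 0.05 (Zα = 1.96), and it replaces the printed normal table with the cumulative normal function, so power is returned directly as the area to the left of Zβ (equivalently, 1 minus the upper-tail area). The function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_continuous(n_per_group: float, delta: float, sd: float,
                     z_alpha: float = 1.96) -> float:
    """Approximate power for a difference in means (equal group sizes assumed).

    Rearranges n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2 to solve for z_beta.
    """
    z_beta = (delta / sd) * math.sqrt(n_per_group / 2.0) - z_alpha
    return normal_cdf(z_beta)


def power_dichotomous(n_per_group: float, p_treatment: float, p_control: float,
                      z_alpha: float = 1.96) -> float:
    """Approximate power for a difference in proportions (equal group sizes assumed).

    Uses z_beta = sqrt(n / (2 * s)) * D - z_alpha, where s is the mean of the
    two binomial variances and D is the absolute difference in proportions.
    """
    s = (p_treatment * (1.0 - p_treatment) + p_control * (1.0 - p_control)) / 2.0
    d = abs(p_treatment - p_control)
    z_beta = math.sqrt(n_per_group / (2.0 * s)) * d - z_alpha
    return normal_cdf(z_beta)


if __name__ == "__main__":
    # Hypothetical trial: 48 patients per group, union rates of 95% versus 90%.
    small_trial_power = power_dichotomous(48, 0.95, 0.90)
    print(f"power = {small_trial_power:.2f}, beta = {1.0 - small_trial_power:.2f}")
```

A trial whose computed power falls below 0.80 would be classed as underpowered under the a priori threshold stated above.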
Validity of Power Calculations
To ensure accuracy, the same two investigators independently performed
a random sample of the power calculations for thirty articles. The
remaining power calculations were not performed until 100% agreement
was obtained. Inconsistencies between study results and power calculations
were also examined independently as an additional check for validity. We
also calculated standard 95% confidence intervals for each
outcome and compared them with post hoc power calculations
as a final check of validity.
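The confidence-interval check can be illustrated for a dichotomous outcome. The article does not state which interval was used, so the sketch below assumes the simple normal-approximation (Wald) interval for a difference between two proportions; the function name and the example counts are hypothetical.

```python
import math


def diff_in_proportions_ci(events_t: int, n_t: int, events_c: int, n_c: int,
                           z: float = 1.96) -> tuple[float, float]:
    """Wald 95% confidence interval for the difference in event proportions."""
    p_t = events_t / n_t
    p_c = events_c / n_c
    diff = p_t - p_c
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_t * (1.0 - p_t) / n_t + p_c * (1.0 - p_c) / n_c)
    return diff - z * se, diff + z * se


if __name__ == "__main__":
    # Hypothetical trial: 45 of 50 united with treatment, 40 of 50 with control.
    lower, upper = diff_in_proportions_ci(45, 50, 40, 50)
    print(f"difference 95% CI: {lower:.3f} to {upper:.3f}")
```

An interval that spans zero but also includes a clinically important difference is consistent with a low post hoc power for the same outcome.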
Assessing Reviewer Agreement
Agreement in the application of study eligibility criteria as well
as the identification of study outcomes and study results (positive
or negative) was quantitated with the kappa statistic with quadratic
weighting. The kappa statistic, a measure of the agreement between
two or more observers beyond chance, provided a measure of agreement
between the reviewers with regard to titles, abstracts, and methods
sections of potentially relevant studies. Donner and Klar13 and Fleiss14 provided
persuasive arguments in favor of the use of this statistic instead
of other measures of interobserver agreement that have been proposed.
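For readers unfamiliar with the statistic, the following is a minimal sketch of a quadratically weighted kappa for two reviewers. It assumes that ratings are coded as integers from 0 to k - 1; the function name and the example ratings are hypothetical and are not taken from the study data. With only two categories (for example, include or exclude), the quadratic weights reduce to the unweighted kappa.

```python
def quadratic_weighted_kappa(ratings_a: list[int], ratings_b: list[int],
                             n_categories: int) -> float:
    """Chance-corrected agreement between two raters with quadratic weights."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b) or n_categories < 2:
        raise ValueError("need two equally long, non-empty rating lists")
    k = n_categories
    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) ** 2) / ((k - 1) ** 2)  # disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical title screening: 1 = potentially eligible, 0 = not eligible.
    reviewer_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    reviewer_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    print(round(quadratic_weighted_kappa(reviewer_1, reviewer_2, 2), 2))
```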
Literature Search
We identified 620 potentially relevant study titles from the MEDLINE
database search. Application of the study eligibility criteria eliminated
433 titles and left 187 for further consideration. The advanced
PubMed and Cochrane database searches yielded an additional two
articles not identified by the MEDLINE search. A review of 11,800
study titles from a hand search of the four journals from the previous
ten years identified seven additional potentially eligible studies.
Bibliography searches and suggestions from content experts did not
yield additional relevant studies. In total, 196 studies were considered
to be potentially eligible on the basis of the study title alone
and were retrieved for a detailed review (see electronic Appendix [references]).
Agreement on the application of eligibility criteria to the study
titles was substantial (kappa = 0.88, 95% confidence
interval = 0.81 to 0.95).
Complete Manuscript Review
Application of the eligibility criteria to 196 complete articles eliminated
seventy-nine studies (Table I). The majority of these studies
(forty-three [54%] of the seventy-nine)
were excluded because the authors reported a positive result (or
a significant outcome) and were therefore ineligible for type-II
error calculation.
Twenty studies were eliminated because more than two fracture
treatment methods were used. Four articles were found to be duplicate
publications of the same research presented in other articles and
were removed from the study group. Three studies that involved children
were also eliminated because we aimed specifically to identify studies
of adult patients. Two studies that had significant primary or secondary
end points were removed, and one study was removed because it did
not focus on fracture management. Six studies were found to have
insufficient information for statistical calculation of the study
power and were thus eliminated. Ultimately, 117 trials with nonsignificant
results met all of the eligibility criteria and were used for all
subsequent power analyses.
Study Characteristics
The 117 eligible trials were published in twenty-five different journals
(see electronic Appendix [Table E-1]). The majority of
the studies (seventy-three [62%] of the
117) were published in The Journal of Bone and Joint Surgery (American
and British editions), Acta Orthopaedica Scandinavica, Clinical
Orthopaedics and Related Research, and Injury. Forty-six
(39%) of the studies were conducted in Scandinavia; twenty-three
(20%), in North America; twenty-two (19%), in
the United Kingdom; and twelve (10%), in other countries
in Europe (see electronic Appendix [Table E-2]).
A surgeon was the first author of 108 (92%) of the articles,
and nine (8%) articles were written by nonsurgeons. None
of the randomized trials had an epidemiologist as the first author, and
only four (3%) had at least one author with cited training in
biostatistics (MSc or PhD) or affiliation with a department of statistics,
public health, or clinical epidemiology. A total of 19,942 patients
were randomized in the 117 trials. Study sample sizes ranged from
ten to 662 patients (mean and standard deviation, 95 ±
79 patients). One hundred and eight (92%) of the studies
were conducted at only one center, and 115 (98%) focused
upon interventions related to fracture repair. Fractures of the
hip were the primary focus of forty (34%) of the studies.
Outcomes Assessment
The vast majority of the eligible trials (110; 94%)
involved multiple outcomes, but they were not explicitly defined
as primary or secondary end points. On the basis of the nature of the
treatment comparisons in each trial, we identified 190 primary outcome
measures (see electronic Appendix [Table E-3]).
Almost 50% of the articles reported primary outcomes such
as clinical or functional scores (accounting for forty-seven [25%] of
the 190 primary outcomes), radiographic results or scores (twenty-one;
11%), or complications (twenty; 11%). We found
that secondary end points were reported 101 times in the 117 trials.
They included complications (accounting for fourteen [14%] of
the 101 secondary outcomes), implant failures (twelve; 12%),
pain (ten; 10%), and reoperations (nine; 9%) (see
electronic Appendix [Table E-4]).
Study Power and Type-II Error Rates
The study power for the 190 defined primary outcomes averaged
24.65% (range, 2% to 99%), which corresponded
with an average beta value of 0.75 (range, 0.01 to 0.98) (Table II). We found
that 172 of 190 primary outcomes were limited by type-II errors.
Analysis of secondary outcomes revealed that the study power averaged
only 19.66% (range, 2% to 99%); thus,
the average beta value was 0.80 (range, 0.01 to 0.98). Type-II errors
limited secondary outcomes in 113 of the 117 trials. Of the 117
studies, only five (4%) even mentioned study power in the
methods section.
We performed a systematic review to examine the rates of type-II
errors in 117 clinical trials with "negative" outcomes in
the orthopaedic trauma literature. The current study was strengthened
by our use of predefined eligibility criteria, a comprehensive search
of the literature to identify relevant studies, assessment of the
reproducibility of study selection, determination of the primary
and secondary outcomes for each study by consensus, and detailed
calculations of study power performed in duplicate for each eligible
study. The majority of studies (95%) that met our eligibility
criteria did not meet conventional standards of acceptable type-II
error rates (study power of 80% or beta value of ≤0.20)
with regard to both their primary and their secondary outcomes.
The results are limited by the fact that we determined the primary
and secondary outcomes by consensus because few, if any, authors
explicitly stated the primary outcome in their study. Moreover,
as we identified only the articles published in English, it remains
uncertain whether these findings can be generalized to articles
published in other languages.
Most surgeons are familiar with the concept that the results
of a particular study may appear to be true when, in fact, they
are due to chance (or random sampling error). This erroneous false-positive
conclusion is designated as a type-I or alpha error (Fig. 1). By convention,
most authors of orthopaedic studies adopt an alpha error rate of
0.05. Thus, investigators can expect a false-positive error about
5% of the time5,15.
Less appreciated by investigators who conduct surgical trials is
the risk of concluding "no difference" between
treatments when a difference actually exists (Fig. 1). This type
of conclusion is termed a type-II error (beta error). It is as
important to minimize the probability of a type-II error as it is
to minimize the probability of a type-I error. By convention, investigators
in clinical orthopaedic trials have designated acceptable type-II
error rates as 0.20, a 20% chance of a false-negative conclusion.
Cohen16 defended the choice of
a type-II error rate that is four times larger than the type-I error
rate with the rationale that increasing the study power (or lowering
the type-II error rate) would result in large increases in the sample
size. For example, with the type-I error rate held at 0.05, decreasing the beta error from 0.20 to 0.05 would
increase the required sample size by roughly two-thirds.
Such an increase would be prohibitive for many trials in
orthopaedics5.
Additionally, this type of error is seen as less egregious because
a wrong conclusion that there is no difference between treatments
is not likely to effect a substantial change in the clinical practice
of medicine. This is not necessarily the case, however. For example,
one study (reference 178 in the list in the electronic Appendix)
demonstrated "no difference" between operative
and nonoperative management of calcaneal fractures. If the conclusion
of that study is true, no patient would be likely to choose to have
a calcaneus fixed, given the reported complications of operative
treatment. However, if the conclusion is false and there is actually
an advantage to reduction and fixation, the reported study will
have done a serious disservice to all patients with that injury.
The results of randomized studies are given much greater weight
than are those of retrospective or case-control studies, but
if they are underpowered they can lead to conclusions that may justify
an inferior treatment.
Study Power in Clinical Trials
The power of a study is the probability that it will demonstrate a
difference between two treatments when one actually exists2,3,9. Power (1 - β)
is simply the complement of the type-II error (beta error). Thus,
if we accept a 20% chance of an incorrect study conclusion
(β = 0.20), we also accept the corollary
that we will come to the correct conclusion 80% of the
time. Study power can be calculated before the start of a clinical
trial to assist with the determination of sample size or after the
completion of a study to determine whether the negative findings
were true or due to chance17.
The power of a statistical test is typically a function of the magnitude
of the treatment effect, the designated type-I error (alpha error)
rate, and the sample size5,12.
When a clinical trial is designed, investigators can decide upon
the desired study power (1 - β) and calculate
the sample size needed to achieve this goal. If investigators conduct
a post hoc power analysis after the completion
of a study, they use the actual sample size to calculate the power
of the study.
The magnitude of the effect is, for example, the difference between
the mean functional score of the surgically treated group and that
of the nonoperatively treated group. The difference can be divided
by the standard deviation of the control group to compensate for
the variability of the functional scores in each group (variance
or standard deviations about the mean scores). The resultant value
is termed the "effect size." Interpretation of
the effect size is largely a clinical decision and should represent
the point at which a surgeon will change his or her practice if
the results are true2,3,5. Cohen16 suggested broad guidelines for the
interpretation of effect sizes, with 0.2 considered a small effect;
0.5, a moderate effect; and 0.8, a large effect.
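The effect-size calculation described above can be written out as a short sketch. The function names are hypothetical. Treating Cohen's three reference values as the lower bounds of bins is a simplifying assumption of this example, not a rule stated in the article.

```python
def effect_size(mean_treatment: float, mean_control: float,
                sd_control: float) -> float:
    """Standardized difference: difference in means over the control-group SD."""
    return abs(mean_treatment - mean_control) / sd_control


def describe_effect(d: float) -> str:
    """Label an effect size against Cohen's reference values (0.2, 0.5, 0.8)."""
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "moderate"
    if d >= 0.2:
        return "small"
    return "below the small benchmark"


if __name__ == "__main__":
    # Healing times of 120 days (treatment) versus 140 +/- 40 days (control).
    d = effect_size(120.0, 140.0, 40.0)
    print(d, describe_effect(d))
```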
Sample size plays an important role in power analyses. The smaller
the difference that an investigator wishes to detect, the larger
the sample size required for the study. Extreme examples of large
sample sizes needed to detect relatively small treatment effects
can be seen in the clinical trials of treatments for cardiovascular
disease. In a recent trial of angiotensin-converting-enzyme inhibitor
therapy for patients at high risk for cardiovascular events, investigators
recruited 9297 patients to identify a 0.5% difference (p = 0.02)
in myocardial infarction rates between the treatment and placebo
groups18.
Study Results
More than 90% of the 117 trials included in the current
review were underpowered (<80%) for the treatment
effect of their primary outcome. This result is similar to the findings
of Chung et al.9, who reported
that 86% of thirty-nine trials with negative outcomes in
the hand literature lacked sufficient power to identify a moderate
treatment effect. Williams et al.6 reviewed
forty-one articles in the cardiovascular literature and identified
80% that were insufficiently powered to detect at least
one outcome. In a review of fourteen articles that reported negative
results in emergency medicine, Brown et al.8 found
none that met acceptable standards of study power.
Not unexpectedly, only five (4%) of the 117 eligible
articles in our study included a discussion of sample size and power issues.
Brown et al.8 found that only
one of fourteen reports in their series provided the study power
for the given sample size. Perhaps the lack of consideration of
sample size and power issues in the current group of trials was
related, in part, to the disproportionately high percentage of single-center
initiatives (>90%), led by surgeons. Involvement
of someone from a department of epidemiology or biostatistics was
infrequent.
Choice of Outcome Measure: Primary and Secondary
In their review of thirty-nine reports with negative outcomes in
the hand literature, Chung et al.9 identified
the choice of outcome measure as a potential source of insufficient
study power. The most common primary outcome measure in our series
was some measure of patient function, either a score or a qualitative
description. Of the forty-seven primary outcomes that were based
on patient function, ten were continuous variables (scores) and
thirty-seven were dichotomous variables (good versus poor). Only four
of the ten studies that reported continuous outcomes and none of
the thirty-seven studies that reported dichotomous outcomes were
adequately powered.
Considerations in the Plan of a Clinical Trial
Given the prevalence of type-II errors in clinical trials involving
orthopaedic trauma, future investigators should preplan estimated
sample-size requirements on the basis of conventionally accepted
standards for study power (80%) and type-I errors (α = 0.05).
Small pilot studies on a topic of interest or previous reports in
the literature can be helpful to determine the likely treatment
effect and to avoid type-II errors. For example, when a trial of
alternate strategies for the treatment of tibial shaft fractures
is planned, an investigator may identify, on systematic review of
the literature, an article that reports that the time to fracture-healing
with treatment A is 120 ± 45 days whereas the time
to healing with treatment B (control group) can be up to 140 ±
40 days. The expected treatment difference is twenty days, and the effect
size is 0.5 (20/40). We know that this is a moderate effect
and is likely to be clinically relevant16.
The anticipated sample size for this continuous outcome measure
is determined by use of the following equation12:

$$N = 2\left[\frac{(Z_\alpha + Z_\beta)\,s}{d}\right]^2$$

where $Z_\alpha$ = 1.96, $Z_\beta$ = 0.84, s = 40,
and d = 20.
This study will require approximately sixty-three patients per treatment group (about 126 in total)
to obtain sufficient power to identify a difference of twenty days
between treatments. An investigator may then review the records
from his or her center for the last year and decide whether enough
patients are likely to present to the center to meet the sample-size
requirements.
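The arithmetic for the continuous outcome can be checked with a short sketch. It assumes the usual two-group formula with equal allocation, in which the result is a count per treatment group. The function name is hypothetical.

```python
import math


def n_per_group_continuous(sd: float, delta: float,
                           z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Patients per group needed to detect a difference in means of `delta`."""
    return 2.0 * ((z_alpha + z_beta) * sd / delta) ** 2


if __name__ == "__main__":
    # Tibial shaft example: standard deviation 40 days, difference 20 days.
    n = n_per_group_continuous(sd=40.0, delta=20.0)
    print(math.ceil(n))
```

Raising the desired power means passing a larger `z_beta` (for example, 1.645 for 95% power), which increases the result accordingly.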
Let us assume that this same investigator chooses nonunion instead
of time to union as the primary outcome. On the basis of previous
reports in the literature, treatment A will result in a 95% union
rate and treatment B (control group) will result in a 90% union
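rate. A different sample-size calculation, for dichotomous variables,
is used12:

$$n = \frac{P_A(100 - P_A) + P_B(100 - P_B)}{(P_A - P_B)^2} \times f(\alpha,\beta)$$

where n = number of patients per group, $P_A$ = 95,
$P_B$ = 90 (both expressed as percentages), and f(α,β) = 7.9.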
A total of 869 patients is required to identify a 5% difference in
nonunion rates between treatments. An investigator may realize that
this number is too large for the trial to be conducted at one center
and may try to obtain support from multiple sites for this trial.
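The dichotomous calculation can be reproduced in the same way. This sketch assumes the percentage form of the formula attributed to Pocock, with equal allocation to two groups and f(α,β) = 7.9 for a type-I error of 0.05 and a power of 80%. The function name is hypothetical.

```python
def n_per_group_dichotomous(pct_a: float, pct_b: float,
                            f_alpha_beta: float = 7.9) -> float:
    """Patients per group needed to detect a difference between two percentages."""
    variance_term = pct_a * (100.0 - pct_a) + pct_b * (100.0 - pct_b)
    return variance_term / (pct_a - pct_b) ** 2 * f_alpha_beta


if __name__ == "__main__":
    # Union rates of 95% (treatment A) versus 90% (treatment B).
    n = n_per_group_dichotomous(95.0, 90.0)
    print(f"per group: {n:.1f}, total: {round(2 * n)}")
```

Doubling the per-group figure gives the total of 869 patients quoted above.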
In conclusion, a systematic review of articles in the orthopaedic
trauma literature showed that the majority of clinical trials are
limited with regard to sample size and are insufficiently powered.
Investigators can avoid the risk of type-II errors by enlisting
the aid of biostatisticians to perform a priori power
and sample-size calculations when clinical trials are planned.
Formulae for Standard Power Calculations
For continuous variables, we used the equation: $N = 2[(Z_\alpha + Z_\beta)s/\delta]^2$,
where N = sample size, $Z_\alpha$ = 1.96, and δ = difference
between treatments. The standard deviation (s) was determined by
calculating the pooled standard deviation between treatment groups:
$s^2 = [(N_{treatment} - 1)(s_{treatment})^2 + (N_{control} - 1)(s_{control})^2]/(N_{treatment} + N_{control} - 2)$.
For dichotomous variables, we used the equation: $Z_\beta = \sqrt{n/(2s)}\,D - Z_\alpha$,
where n = sample size per group and D = difference between the proportions.
The variance term (s) was calculated with the formula: $s = [P_T(1 - P_T) + P_C(1 - P_C)]/2$,
where $P_T$ and $P_C$ = proportion
of events in the treatment and control groups, respectively.
A reference list of the identified randomized trials and tables showing
the journals from which the 117 articles were obtained, characteristics
of the eligible trials, and primary and secondary outcomes are available
with the electronic versions of this article, on our web site at
www.jbjs.org (go to the article citation and click on "Supplementary
Material") and on our quarterly CD-ROM (call our subscription
department, at 781-449-9780, to order the CD-ROM).
1. Dorrey F, Swiontkowski MF. Statistical tests. What do we learn from a clinical study? p values versus confidence intervals. Advances Orthop Surg. 1997;21:81-5.
2. Guyatt GH, Jaeschke R, Heddle N, Cook D, Shannon H, Walter S. Basic statistics for clinicians: 1. Hypothesis testing. CMAJ. 1995;152:27-32.
3. Guyatt GH, Jaeschke R, Heddle N, Cook D, Shannon H, Walter S. Basic statistics for clinicians: 2. Interpreting study results: confidence intervals. CMAJ. 1995;152:169-73.
4. Staquet MJ, Rozencweig M, Von Hoff DD, Muggia FM. The delta and epsilon errors in the assessment of cancer clinical trials. Cancer Treat Rep. 1979;63:1917-21.
5. Streiner DL. Sample size and power in psychiatric research. Can J Psychiatry. 1990;35:616-20.
6. Williams JL, Hathaway CA, Kloster KL, Layne BH. Low power, type II errors, and other statistical problems in recent cardiovascular research. Am J Physiol. 1997;273:487-93.
7. Guyatt GH, Sackett DL, Cook DJ. Users’ guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1993;270:2598-601.
8. Brown CG, Kelen GD, Ashton JJ, Werman HA. The beta error and sample size determination in clinical trials in emergency medicine. Ann Emerg Med. 1987;16:183-7.
9. Chung KC, Kalliainen LK, Hayward RA. Type II (beta) errors in the hand literature: the importance of power. J Hand Surg [Am]. 1998;23:20-5.
10. Freiman JA, Chalmers TC, Smith H Jr, Kuebler RR. The importance of beta, type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 "negative" trials. N Engl J Med. 1978;299:690-4.
11. Mittendorf R, Arun V, Sapugay AM. The problem of the type II statistical error. Obstet Gynecol. 1995;86:857-9.
12. Pocock SJ. Clinical trials: a practical approach. New York: Wiley; 1983. p 123-40.
13. Donner A, Klar N. The statistical analysis of kappa statistics in multiple samples. J Clin Epidemiol. 1996;9:1053-8.
14. Fleiss JL. Measuring agreement between two judges on the presence or absence of a trait. Biometrics. 1975;31:651-9.
15. Dorrey F, Swiontkowski MF. Statistical tests. What they tell us and what they don’t. Advances Orthop Surg. 1997;21:81-5.
16. Cohen J. Statistical power analysis for the behavioral sciences. Rev. ed. New York: Academic Press; 1977.
17. Goodman SN, Berlin JA. The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med. 1994;121:200-6.
18. Yusuf S, Sleight P, Pogue J, Bosch J, Davies R, Dagenais G. Effect of an angiotensin-converting-enzyme inhibitor, ramipril, on cardiovascular events in high-risk patients. The Heart Outcomes Prevention Study Investigators. N Engl J Med. 2000;342:145-53.