The Lancet

Volume 372, Issue 9656, 20 December 2008–2 January 2009, Pages 2152-2161

Lecture
De testimonio: on the evidence for decisions about the use of therapeutic interventions

https://doi.org/10.1016/S0140-6736(08)61930-3

Randomised controlled trials

The introduction of randomised controlled trials (RCTs) in the middle of the 20th century has had a profound effect on the practice of medicine, and their essential features are well described.4, 6, 7 An RCT involves comparing the effects of two (or more) interventions that have been allocated randomly to groups of contemporaneously treated patients.

Double-blind RCTs, when properly done and analysed, unquestionably provide confidence in the internal validity of the results6, 8 in so far as the

The null hypothesis

The analysis of RCTs has traditionally been based on the null hypothesis, which presumes there is no difference between treatments. The null hypothesis is tested by estimating the probability (the frequency) of obtaining a result as extreme as, or more extreme than, the one observed were there no difference. If the probability is less than some arbitrary value—usually less than 1 in 20 (ie, p<0·05)—then the null hypothesis is rejected. This so-called frequentist approach to the design and analysis
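
The frequentist logic described here can be made concrete with a small simulation. The sketch below uses hypothetical response counts (30/50 responders on treatment versus 18/50 on control, not figures from the lecture) and estimates, by a permutation test, the probability of a difference as extreme as the one observed if the null hypothesis were true:

```python
# Permutation test of the null hypothesis that two treatments do not
# differ in response rate (hypothetical data for illustration only).
import random

random.seed(42)

# Hypothetical trial outcomes: 1 = responded, 0 = did not respond.
treatment = [1] * 30 + [0] * 20   # 30/50 responders
control   = [1] * 18 + [0] * 32   # 18/50 responders

observed_diff = sum(treatment) / len(treatment) - sum(control) / len(control)

# Under the null hypothesis, treatment labels are exchangeable, so we
# repeatedly reshuffle the pooled outcomes and recompute the difference.
pooled = treatment + control
n_perm = 10_000
n_extreme = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    perm_diff = sum(pooled[:50]) / 50 - sum(pooled[50:]) / 50
    # Count permutations at least as extreme as the observed difference.
    if abs(perm_diff) >= abs(observed_diff):
        n_extreme += 1

p_value = n_extreme / n_perm
print(f"observed difference: {observed_diff:.2f}, p = {p_value:.4f}")
```

With these hypothetical counts the estimated p value falls below 0·05, so a frequentist analysis would reject the null hypothesis; the next section notes why that rejection is weaker evidence than it may appear.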

Probability

In the frequentist approach, if the p value is sufficiently small either the null hypothesis is false or a very rare event has occurred. By convention, a probability of less than 5% (ie, p<0·05) is generally used to distinguish between these two possibilities. However, a p value of greater or less than 0·05 neither disproves nor proves (respectively) the null hypothesis. Some, though not all, of the problems with p values can be avoided by expressing results as confidence intervals, which indicate
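
Reporting a confidence interval rather than a bare p value can be sketched as follows, again with hypothetical numbers (30/50 versus 18/50 responders, not from the lecture) and the usual normal approximation for a difference in proportions:

```python
# 95% confidence interval for a difference in response rates
# (hypothetical data; normal approximation to the binomial).
import math

# Hypothetical trial: 30/50 responders on treatment, 18/50 on control.
p1, n1 = 30 / 50, 50
p2, n2 = 18 / 50, 50
diff = p1 - p2

# Standard error of the difference between two independent proportions.
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# A 95% interval uses the 1.96 quantile of the standard normal distribution.
lower, upper = diff - 1.96 * se, diff + 1.96 * se
print(f"difference {diff:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

Unlike a p value alone, the interval conveys both the plausible size of the treatment effect and the precision with which it has been estimated.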

Generalisability

RCTs are generally done in selected populations of patients for a finite—usually relatively brief—period of time. In clinical practice the intervention is likely to be used in a more heterogeneous population of patients—often with comorbid illnesses—and frequently for much longer periods. The extent to which the findings from RCTs have external validity and can be extrapolated—or generalised—to wider populations of patients28, 29 has become an increasingly important issue. Table 2 outlines the

Resources

The costs of RCTs are substantial in money, time, and energy. Figure 2 shows the range of costs of 153 RCTs that were completed in 2005–06. These data combine the costs of trials that were funded by the UK National Institute for Health Research and the UK Medical Research Council as well as those incurred by three major pharmaceutical companies in their phase II and III studies. The median cost was £3 202 000 with an interquartile range of £1 929 000 to £6 568 000.

These data are neither

Observational studies

The nomenclature describing observational (ie, non-randomised) studies is confused. I eschew a distinction between controlled and uncontrolled studies because all observational studies involve some form of implicit (informal) or explicit (formal) comparisons. Nor do I consider the terms cohort studies or quasiexperimental studies particularly illuminating. Cohort studies include studies that are, in reality, distinct entities, and quasiexperimental is a term that I have never found to be

Historical controlled trials

Table 3 lists examples of interventions of unquestioned benefit, as demonstrated by historical controlled trials where comparisons are made between a new intervention and past experience with the condition. In the past, the use of historical controls has been much criticised.6 During the late 1980s, however, clinical trialists became less hostile to the concept. Prompted by the emerging AIDS epidemic they accepted45 that some of the traditional approaches to clinical trial design were

Case–control studies

Case–control studies compare the use of an intervention in groups with and without a particular disease or condition. These studies, like other observational designs, provide information about an association with exposure to a particular intervention but do not necessarily show whether the relation is causal. The problems of selection bias and confounding are no less relevant to the interpretation of case–control studies than they are with other controlled observational designs. They can,

Hierarchies of evidence

The first hierarchy of evidence was published in the late 1970s.64 Since then many similar hierarchies, of increasing elaboration and complexity, have appeared. A survey in 200265 identified 40 such grading systems, and a study in 2006 identified 20 more.66

The hierarchy in table 1, like others, places RCTs at the highest level, with a lesser place for evidence based on observational studies. This hierarchical approach to evidence has not only been adopted by many in the evidence-based medicine and

Conclusion

Experiment, observation, and mathematics, individually and collectively, have a crucial role in providing the evidential basis for modern therapeutics. Arguments about the relative importance of each are an unnecessary distraction. Hierarchies of evidence should be replaced by accepting—indeed embracing—a diversity of approaches. This is not a plea to abandon RCTs and replace them with observational studies. Nor is it a claim that the bayesian approaches to the design and analysis of
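
The bayesian alternative alluded to here can be illustrated with a minimal sketch. The data (30/50 versus 18/50 responders) and the flat Beta(1, 1) prior are assumptions for illustration, not from the lecture; the conjugate Beta–binomial update then gives a direct answer to the clinical question, namely the probability that the treatment is better than the control:

```python
# Bayesian comparison of two response rates (hypothetical data).
# A Beta(1, 1) prior combined with a binomial likelihood yields a
# Beta posterior, so no numerical integration machinery is needed.
import random

random.seed(1)

# Hypothetical trial: 30/50 responders on treatment, 18/50 on control.
post_treat = (1 + 30, 1 + 20)   # Beta(31, 21) posterior
post_ctrl  = (1 + 18, 1 + 32)   # Beta(19, 33) posterior

# Monte Carlo: draw response rates from both posteriors and compare.
draws = 20_000
wins = sum(
    random.betavariate(*post_treat) > random.betavariate(*post_ctrl)
    for _ in range(draws)
)
print(f"P(treatment better than control | data) = {wins / draws:.3f}")
```

Where the frequentist analysis yields a statement about the frequency of extreme results under the null hypothesis, the bayesian analysis yields a direct probability statement about the treatments, which is often what prescribers and patients actually want to know.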

References (71)

  • R Harbour et al.

    A new system for grading recommendations in evidence based guidelines

    BMJ

    (2001)
  • AR Jadad et al.

    Randomized controlled trials

    (2007)
  • MD Rawlins et al.

    National Institute for Clinical Excellence and its value judgments

    BMJ

    (2004)
  • SJ Pocock

    Clinical Trials

    (1983)
  • DG Altman et al.

    The revised CONSORT statement for reporting randomized trials: explanation and elaboration

    Ann Intern Med

    (2001)
  • D Moher et al.

    The CONSORT Statement: revised recommendations for improving the quality of reports of parallel-group randomized trials

    Ann Intern Med

    (2001)
  • M Clarke et al.

    Reports of clinical trials should begin and end with up-to-date systematic reviews of other relevant evidence: a status report

    J R Soc Med

    (2007)
  • MD Rawlins

    De testimonio

    (2008)
  • K Rothman

    No adjustments are needed for multiple comparisons

    Epidemiology

    (1990)
  • VM Montori et al.

    Randomised trials stopped early for benefit: a systematic review

    JAMA

    (2005)
  • P Armitage et al.

    Statistical methods in medical research

    (2002)
  • SJ Pocock

    When not to stop clinical trials for benefit

    JAMA

    (2005)
  • SJ Pocock et al.

    More on subgroup analyses

    N Engl J Med

    (2008)
  • R Wang et al.

    Statistics in medicine—reporting of subgroup analyses in clinical trials

    N Engl J Med

    (2007)
  • D Ashby

    Bayesian statistics in medicine: a 25 year review

    Stat Med

    (2006)
  • DJ Spiegelhalter et al.

    Bayesian approaches to randomised trials

    J R Stat Soc [Ser A]

    (1994)
  • DJ Spiegelhalter et al.

    Bayesian methods in health technology assessment: a review

    Health Technol Assess

    (2000)
  • Feasibility, safety, and efficacy of domiciliary thrombolysis by general practitioners: Grampian region early anistreplase trial

    BMJ

    (1992)
  • SJ Pocock et al.

    Grampian region early anistreplase trial

    BMJ

    (1992)
  • JM Bland et al.

    Bayesians and frequentists

    BMJ

    (1998)
  • D Berry et al.

    Introduction to bayesian methods: floor discussion

    Clin Trials

    (2005)
  • Guidance for the use of bayesian statistics in medical device trials

    (2006)
  • DA Berry

    Introduction to bayesian methods III: use and interpretation of bayesian tools in design and analysis

    Clin Trials

    (2005)
  • AP Grieve

    25 years of Bayesian methods in the pharmaceutical industry: a personal, statistical bummel

    Pharm Stat

    (2007)
  • N Black

    Why we need observational studies to evaluate the effectiveness of health care

    BMJ

    (1996)