Starting or Strengthening a Drug Bulletin - A Practical Manual
(2005; 165 pages)

8.4.1 Efficacy

Efficacy describes to what extent a drug achieves its intended effect (e.g. longer survival, reduced morbidity, pain relief, contraception).

1. Strength of the evidence

The strength of the evidence must be assessed by looking at the primary outcome measures used in the trials and at other aspects of the design of the study. The outcome measures and the design and conduct of randomised controlled trials are often inadequate, and lead to unreliable or irrelevant conclusions. Therefore careful appraisal of trial reports is needed to assess the reliability of the trial results.

When evaluating a treatment for a disease from which patients die, the most obvious and measurable outcome is whether the treatment improves survival. However, even when ‘survival’ is the most appropriate primary endpoint, very often in clinical trials a surrogate endpoint, such as transient symptomatic relief and/or improvement of certain laboratory tests, is used instead. The reason for this is that it allows trials to be shorter or to require the inclusion of fewer patients.

Another problem is the use of combined endpoints (e.g. definite myocardial infarction (MI) and death from MI, i.e. cause-specific morbidity and mortality). A combined endpoint may miss important effects of treatment that actually shorten overall survival and/or lead to other serious complications [5]. The primary endpoint of real interest for patients is death from all causes, with all serious events, such as cancer, included in the endpoint. Box 8.1 shows a hierarchy of endpoints. The hierarchy is from the US National Cancer Institute and relates to evaluation of treatments for cancer, but it can be adapted to other therapeutic areas too. Box 8.2 shows a hierarchy of study design.

These hierarchies do not include non-clinical evidence, which should also be considered in an evaluation - e.g. pharmacokinetic studies, dose-ranging studies, studies in healthy volunteers, toxicology (see Box 8.1 and the annexe at the end of this chapter).

Box 8.1 Strength of endpoints (ranked in descending order)


• Total mortality (or overall survival from a defined point in time)
Comment: This outcome is arguably the most important one to patients and is also the most easily defined and least subject to investigator bias.

• Cause-specific mortality (or cause-specific mortality from a defined point in time)
Comment: Although this may be the most biologically important in a disease-specific intervention, it is a more subjective endpoint than total mortality and more subject to investigator bias in its determination. It may also miss important effects of therapy that actually shorten overall survival, for example with oestrogens in the treatment of prostate cancer.

• Cause-specific morbidity alone or in combination with cause-specific mortality (B-1) (*a)
Comment: It is also a more subjective endpoint than total mortality and more subject to investigator bias in its determination. It may also miss important harm induced by therapy that may actually shorten overall survival and/or quality of life.

• Carefully assessed quality of life (i.e. assessed independently of other indicators of activity in daily life) (*b).

• Indirect surrogates
1) Disease-free survival
2) Progression-free survival
3) Tumour response rate
4) Scales and other measures that are not clinically validated in the specific clinical condition or population (*a).


Source: Based on a hierarchy from the US National Cancer Institute web site.

*a: Added by the manual’s editors, to make more applicable to other diseases and interventions.

*b: If “carefully assessed quality of life” is combined with overall survival, the combined endpoint could be classified as A-2.

Box 8.2 Hierarchy of study design


• Systematic review (with homogeneity) of randomised controlled trials or single large-scale randomised controlled trial (mega-trial).

• At least a single randomised controlled trial.

• Systematic review of cohort studies or non-randomised controlled trials.

• Systematic review of case-control studies.

• Case series (includes poor-quality cohort and case-control studies) (*a).

• Expert opinion without explicit critical appraisal.


Source: Based on a hierarchy from Levels of Evidence and Grades of Recommendation by Centre for Evidence-Based Medicine, Institute of Health Sciences, Oxford, UK.
[ of evidence.asp#notes]

*a: All-or-none case series (i.e. when all patients died before the treatment became available, but some now survive on it; or when some patients died before treatment became available, but none now die on it) are classified as 1c.

2. How reliable is the evidence?

It is helpful to use a checklist of strength of endpoints (e.g. as in Box 8.1) and of other design features to look out for when appraising a clinical trial. A simple validated scale such as the Jadad scale [6] is easy for beginners to use for assessing the quality of clinical trials. (Note: it was originally developed as an instrument for assessing large numbers of clinical trials for systematic reviews and/or surveys, such as studies of time trends in the quality of clinical trials.) It involves asking five questions when assessing a clinical trial:

1. Is the study randomised?
2. Is the study double blinded?
3. Is there a description of withdrawals?
4. Is the randomisation adequately described?
5. Is the blindness adequately described?

A positive answer to any question scores 1 point. Good quality randomised controlled trials score 3 points or more.
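The scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the simplified rule as stated in this manual (one point per positive answer, 3 or more counted as good quality); the class and field names are hypothetical, and the sketch does not attempt to model any refinements of the published scale.

```python
from dataclasses import dataclass, astuple


@dataclass
class TrialReport:
    """Answers to the five questions above (field names are illustrative)."""
    randomised: bool
    double_blinded: bool
    withdrawals_described: bool
    randomisation_adequately_described: bool
    blinding_adequately_described: bool


def simple_quality_score(report: TrialReport) -> int:
    """Return one point per positive answer, giving a score from 0 to 5."""
    return sum(1 for answer in astuple(report) if answer)


def is_good_quality(report: TrialReport, threshold: int = 3) -> bool:
    """Apply the rule of thumb from the text: 3 points or more."""
    return simple_quality_score(report) >= threshold


# Hypothetical trial: randomised and double blinded, but withdrawals,
# the randomisation method and the blinding method are not described.
example = TrialReport(True, True, False, False, False)
print(simple_quality_score(example), is_good_quality(example))
```

A score is only a starting point: a trial can score well and still use an irrelevant endpoint or comparator, which is why the questions in the following sections are asked separately.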

Other tools are the CONSORT statement [7] (for clinical trials) and the QUOROM statement [8] (for systematic reviews), which set out standards for the reporting of clinical trials and systematic reviews. The UK “Critical Appraisal Skills Programme” (CASP) has tools for appraising various types of study on its web site.

3. Are other key aspects of the design of the trial appropriate and relevant?

Is the test treatment and/or comparator appropriate?

Occasionally, wrong comparators are used. Examples include comparing the new drug with a non-standard treatment, with a standard treatment given at a suboptimal dose or at a dose so high that it induces more adverse effects, or with placebo when an established treatment already exists. Another important consideration is whether the new drug used in the trials is exactly the same as the one finally marketed: i.e. the same pharmaceutical form, with the same excipients, same administration route, etc. Any difference might influence adherence to treatment, adverse effects, etc.

Is the study population and/or context appropriate?

Study populations and/or contexts often do not represent those where the new treatment would be applied. For example, evidence of efficacy of oseltamivir for influenza prevention is very limited in people at high risk (e.g. because of old age, chronic obstructive respiratory disease, cardiovascular disease and diabetes) [9]. Nevertheless, it is licensed for influenza prevention in these populations in many countries.

Is the trial duration appropriate?

Therapies for chronic conditions need long-term studies. Check the appropriateness of the trial duration according to the purpose of the trial. Studies lasting two or three months are not sufficient to draw reliable conclusions about the benefit or harm from drugs used for life-long therapy (e.g. antihypertensive or lipid-lowering drugs). For example, it may be many years before it becomes apparent that a drug causes cancer.

Is the statistical test appropriate for the purpose of the study?

Comparative trials are often designed to prove only that the new drug is no less effective than, or only as effective as, a standard treatment (non-inferiority or equivalence trials). These do not help in determining whether the new drug is an advance over existing therapy. Sadly, they represent a large proportion of industry-sponsored clinical trials. A superiority test is usually reserved for trials in which the new drug is tested against placebo.
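The difference between these kinds of claim can be made concrete with a small sketch. It assumes the trial reports a confidence interval for the difference in a beneficial outcome (new drug minus standard treatment) and a pre-specified non-inferiority margin; the function name, the labels and the numbers are hypothetical and serve only to illustrate the reasoning.

```python
def classify_comparison(ci_lower: float, ci_upper: float, margin: float) -> str:
    """Classify a trial result from the confidence interval of the difference
    (new drug minus standard) in a beneficial outcome.

    margin is the largest loss of efficacy the trial designers agreed to
    tolerate (a positive number, on the same scale as the difference).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        # The whole interval lies above zero.
        return "superior"
    if ci_lower > -margin:
        if ci_upper < 0:
            # Inside the margin, but the whole interval lies below zero.
            return "non-inferior by the margin, yet statistically worse"
        return "non-inferior, superiority not shown"
    return "non-inferiority not shown"


# Hypothetical differences in response rate, with a margin of 5 percentage points.
print(classify_comparison(0.02, 0.10, 0.05))
print(classify_comparison(-0.03, 0.04, 0.05))
print(classify_comparison(-0.04, -0.01, 0.05))
print(classify_comparison(-0.08, 0.01, 0.05))
```

Only the first result supports a claim of therapeutic advance in efficacy. The third case shows why the choice of margin deserves scrutiny: a drug can meet a non-inferiority criterion while being measurably less effective than the standard treatment.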

4. Could there be bias in the trial?

Consider carefully whether bias could have been introduced. It is fairly easy to check for bias:

• at the start of the trial, by looking for differences between the baseline characteristics of the groups. But be careful: the p values for differences in key baseline data are sometimes close to or even less than 0.05, yet authors may still claim “no difference in baseline characteristics” and make no adjustment in analysing the results.

• during the trial, the most important thing to check is whether the trial report takes account of patients who have dropped out of the trial due to adverse reactions. This influences both the comparability of groups and the results.

• at the end of the trial, the most important check is whether measurement of outcome has been manipulated, particularly when subjective outcome measures are used [10].
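When a report gives baseline counts but no p value, the baseline check described above can be repeated by hand. The sketch below assumes a simple two-proportion z-test with a normal approximation, which is one common choice and not necessarily the test the trial authors used; the function name and the numbers are hypothetical.

```python
import math


def two_proportion_p_value(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Two-sided p value for a difference between two proportions
    (pooled z-test, normal approximation)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / se
    # P(|Z| > |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical baseline table: patients with diabetes in each arm.
p = two_proportion_p_value(30, 200, 48, 200)
print(f"p = {p:.3f}")
if p < 0.05:
    print("Baseline imbalance: check whether the analysis was adjusted for it.")
```

With many baseline variables, some small p values will arise by chance, so the result is a prompt to look at how the authors handled the imbalance and not proof of faulty randomisation.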

5. Interpretation of results

You should look carefully at the authors’ interpretation of the results. Authors often misinterpret results, especially when they have conflicting interests. Look at the statistical tests, the level of significance, confidence intervals, p value and statistical power for the efficacy analysis. When analysing harm, look carefully for signs of harm from adverse events, including whether any adverse reaction is misclassified as a non-drug related adverse event (see Section 8.4.2 and the annexe at the end of this chapter for a further discussion of evaluation of harm). When the intervention increased morbidity, cause-specific morbidity that was a secondary outcome measure may be listed only among the adverse events.

Be aware that authors might present interpretations of results without giving the reason or a citation to validate their conclusions. For instance, the authors of the VIGOR study of rofecoxib (two employees and 11 paid consultants of the sponsor) wrote that the comparator naproxen was cardioprotective [11]. By claiming this they tried to hide the cardiotoxicity of rofecoxib [12]. Headings and abstracts, which are routinely reproduced in databases like PubMed, may be misleading and not supported by the data in the article [13]. So never draw any conclusion without having read the complete original article.
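Checking the confidence interval yourself is often the quickest way to test an author's interpretation. The sketch below computes a relative risk and its approximate 95% confidence interval from event counts, using the standard log-scale approximation; the function name and the counts are hypothetical and are not taken from any trial discussed in this manual.

```python
import math


def relative_risk_ci(events_treated: int, n_treated: int,
                     events_control: int, n_control: int,
                     z: float = 1.96) -> tuple[float, float, float]:
    """Return (relative risk, lower limit, upper limit).

    Uses the usual approximation: the standard error of ln(RR) is
    sqrt(1/a - 1/n1 + 1/c - 1/n2). Needs at least one event per group.
    """
    if events_treated <= 0 or events_control <= 0:
        raise ValueError("the approximation needs at least one event in each group")
    rr = (events_treated / n_treated) / (events_control / n_control)
    se_log = math.sqrt(1 / events_treated - 1 / n_treated
                       + 1 / events_control - 1 / n_control)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical serious adverse events: 20 of 1000 on the new drug, 10 of 1000 on the comparator.
rr, lower, upper = relative_risk_ci(20, 1000, 10, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

In this hypothetical case the interval includes 1, so authors might write “no significant difference”, yet the data are equally compatible with a four-fold increase in risk. A trial that is too small to detect harm has not shown the absence of harm.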

6. Do the investigators have any conflicting interests?

You should look at the declaration of interests. Is the publication written by employees of a pharmaceutical company and/or by paid consultants, or by authors who declare that they have no conflicting interests? Conflicting interests of the authors could be the reason for interpretations of the results that cannot be (fully) explained by the data presented.

Box 8.3 summarises the areas that need to be checked when appraising the evidence. You should write about the deficiencies in the trials in your article. For example, if you find no data on overall survival related to the intervention, you should state this clearly in your assessment. However, be aware that with increasingly sophisticated manipulation of data (especially in trials sponsored by pharmaceutical companies), it may not be possible to uncover all the faults in a clinical trial [14].

Box 8.3 Summary of what to check when appraising a clinical trial

1. What was the purpose of the study?

• Is treatment necessary? (See Section 8.2.1 and Box 8.1).

• Is the outcome measure appropriate and relevant? (See Box 8.1).

• Is an effective treatment already available?

2. How strong is the study design? (see Box 8.2)

3. Are the other key design elements of the trial appropriate and relevant?

• Is the test treatment and/or comparator appropriate?

• Is the study population and/or context appropriate?

• Is the trial duration appropriate?

• Which is appropriate: superiority, equivalence, or non-inferiority test?

4. Is bias in the study possible?

• at the start of the trial: method of randomisation; comparability of groups (difference in baseline data)

• during and at the end of the trial: masking (blinding); patients who withdrew from the study, e.g. as a result of adverse reactions

• at the end of the trial: measurement of outcome - consider whether it could have been manipulated, particularly when subjective outcome measures are used.

5. Are there factors that might lead to misinterpretation of results?

• intention-to-treat analysis (ITT): was an ITT analysis planned at the outset and were the results analysed on the basis of ITT?

• efficacy analysis: statistical test, level of significance, confidence intervals, p value, statistical power

• harm analysis: have any adverse effects been misclassified as adverse events (i.e. as unrelated to the drug treatment)?

6. Do the investigators have any conflicting interests?
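The intention-to-treat check in point 5 of the box can be illustrated with simple arithmetic. The sketch below compares a per-protocol success rate (completers only) with an intention-to-treat style rate in which every randomised patient is kept in the denominator and dropouts are counted as treatment failures. Counting dropouts as failures is an assumption made here for illustration, as it is only one of several conventions for missing outcomes; the function name and all numbers are hypothetical.

```python
def success_rates(randomised: int, completed: int, successes: int) -> tuple[float, float]:
    """Return (intention-to-treat style rate, per-protocol rate).

    ITT style: all randomised patients in the denominator, dropouts
    counted as failures (an assumption of this sketch).
    Per protocol: only patients who completed the trial.
    """
    if not 0 <= successes <= completed <= randomised or completed == 0:
        raise ValueError("need 0 <= successes <= completed <= randomised and completed > 0")
    return successes / randomised, successes / completed


# Hypothetical trial: many patients on the new drug withdrew because of adverse reactions.
itt_new, pp_new = success_rates(randomised=100, completed=70, successes=56)
itt_std, pp_std = success_rates(randomised=100, completed=95, successes=62)

print(f"Per protocol:       new {pp_new:.0%} vs standard {pp_std:.0%}")
print(f"Intention to treat: new {itt_new:.0%} vs standard {itt_std:.0%}")
```

In this example the new drug looks clearly better among completers, but worse once the patients who withdrew are taken into account. A report that gives only the per-protocol result hides this reversal.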

The WHO Essential Medicines and Health Products Information Portal was designed and is maintained by Human Info NGO. Last updated: December 6, 2017