Research and treatment of pulmonary vascular diseases pose distinct challenges. Despite these hurdles (or perhaps because of them), a cadre of basic scientists, geneticists, physiologists, clinical researchers and clinicians have dedicated themselves to the study and care of patients with pulmonary vascular disease. The European Respiratory Journal (ERJ) series “Advances in Pathobiology, Diagnosis, and Treatment of Pulmonary Hypertension” 1–10 comprehensively categorises the achievements of these individuals and concludes this month with a fitting finale. Peacock et al. 11 (this issue of the ERJ) and Hoeper et al. 12 have written statements from recent conferences focused on the methodology of randomised controlled trials (RCTs) in pulmonary arterial hypertension (PAH); until recently, a seemingly quixotic discussion. As with all important research, these documents potentially raise more questions than they answer.
The availability of bosentan (an endothelin-1 receptor antagonist) and other safe, effective therapies for PAH led the working groups to re-evaluate the ethical and scientific merit of using untreated placebo-control arms in RCTs 2, 13–15. This is a very controversial area faced by researchers in other conditions with pre-existing effective therapies, such as congestive heart failure (CHF) 16–19. In general, it is considered to be ethical to perform such trials if: 1) the patient is fully informed about the alternatives; and 2) there is minimal risk of irreversible or severe morbidity, or mortality, posed by withholding effective therapy.
While the former concern is relatively straightforward, the working groups recognised the difficulty in defining the boundaries of the latter. As the success of future placebo-controlled trials in this rare disease will depend on patients' and physicians' willingness to enrol in such trials, future study designs will have to be acceptable to both groups. Such acceptability may be important to investigate before initiating new studies 20. Researchers will also have to satisfy local and central regulatory requirements if they are to delay clinically-proven therapy during the conduct of RCTs with placebo.
Could other study designs suffice? The working-group statements considered performing future placebo-controlled studies in PAH with concurrent clinically-proven therapy, a standard approach in CHF which minimises ethical dilemmas. Scientifically, it makes sense to test new therapies with distinct mechanisms in the setting of established treatment, as clinicians and patients demand new therapies that provide incremental improvement over the current standard of care.
A significant problem with this approach is that a therapy which may be effective when used alone may not appear effective in combination with other therapies because of an interaction. This could potentially shelve a useful drug for patients who are intolerant of established therapies. Co-administration with other therapy may also require increased sample size and longer trial duration in order to achieve the same precision in results. Lastly, this approach is not appropriate for drugs with mechanisms similar to those of current therapies.
The working groups point out the significant difficulties with active-control equivalency (or non-inferiority) studies in PAH. While a non-inferiority study without a placebo arm can answer whether the effect of a new drug is “not worse” than that of an established drug, there is no way to prove that the active control (and subsequently the new drug) had a benefit in that particular study (assay sensitivity) 16. We depend entirely on evidence from other studies in identical populations that the active drug is effective. This assumption is very difficult to meet in common diseases with large clinical trials, much less in the rare disease of PAH. Other limitations, including increased sample size, confinement to the use of previous endpoints and inclusion criteria, establishment of a non-inferiority margin, problems with “per protocol” analysis and the stance of regulatory bodies, made the working groups hesitant to endorse this study design for PAH trials 19, 21. Clearly, the design of future RCTs in PAH will have to meet a combination of ethical, scientific and feasibility requirements.
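The role of the margin and of sample size in such a comparison can be illustrated with a minimal sketch. It assumes a continuous endpoint where higher values are better, a normal approximation and a two-sided 95% confidence interval; the function name, the margin and all numbers are hypothetical and are not taken from the working-group statements.

```python
import math

def noninferiority_check(mean_new, mean_ctrl, sd_new, sd_ctrl,
                         n_new, n_ctrl, margin, z=1.96):
    """Return (difference, lower CI bound, verdict) for new minus control.

    Assumes higher values are better. The new drug is declared
    "not worse" only if the lower bound of the confidence interval
    for the difference lies above -margin.
    """
    diff = mean_new - mean_ctrl
    se = math.sqrt(sd_new ** 2 / n_new + sd_ctrl ** 2 / n_ctrl)
    lower = diff - z * se
    return diff, lower, lower > -margin

# Hypothetical data: same observed difference, two different trial sizes.
print(noninferiority_check(30, 35, 60, 60, 100, 100, margin=25))
print(noninferiority_check(30, 35, 60, 60, 50, 50, margin=25))
```

With the same observed difference, the larger trial meets the margin and the smaller one does not, which is one way to see the sample-size burden. Note that nothing in this calculation shows that the active control itself worked in the trial; assay sensitivity still rests on external evidence.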
The statements provide a much-needed definition for endpoints in RCTs for PAH. The ultimate clinical endpoint for this fatal disease is time-until-death. Certain PAH therapies, such as intravenous epoprostenol and lung transplantation, entail risk and some compromise in quality of life as a trade-off for the benefit received, so that prolonging the time until resorting to these modalities would also be desirable.
The working groups endorsed combining these events into composite endpoints. The advantages of composites include increased precision (from an increased number of events), a less restrictive definition of “failure of therapy”, and the inclusion of multiple endpoints without increasing Type I error 22. A potential drawback of this approach is a less certain answer. Therapeutic decisions and “clinical-worsening” endpoints included in the composite endpoint are often subjectively determined. Significant effects on such endpoints may not be generalisable without objective protocol guidelines. Also, the components of a composite endpoint are equally weighted. Clearly, hospitalisation or initiation of epoprostenol is preferable to death; however, these three events would be considered equivalent in a combined endpoint.
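The equal weighting of components can be made concrete with a small sketch of a time-to-first-event composite. The component names, the follow-up length and the patient records are hypothetical; this is one common way to construct such an endpoint, not necessarily the definition used in any particular PAH trial.

```python
from typing import Dict, Optional, Tuple

def time_to_first_event(events: Dict[str, Optional[float]],
                        follow_up: float) -> Tuple[float, Optional[str]]:
    """Return (time, component) of the earliest event within follow-up.

    `events` maps a component name to its event time in weeks, or None
    if that event never occurred. A patient with no event is censored
    at `follow_up` and the component is reported as None.
    """
    observed = {name: t for name, t in events.items()
                if t is not None and t <= follow_up}
    if not observed:
        return follow_up, None
    first = min(observed, key=observed.get)
    return observed[first], first

# Hypothetical patients followed for 16 weeks.
patient_a = {"death": None, "hospitalisation": 10.0, "epoprostenol": 14.0}
patient_b = {"death": 12.0, "hospitalisation": None, "epoprostenol": None}
patient_c = {"death": None, "hospitalisation": None, "epoprostenol": None}

for patient in (patient_a, patient_b, patient_c):
    print(time_to_first_event(patient, follow_up=16.0))
```

Patients A and B each contribute exactly one event to the analysis, although one was hospitalised and the other died, which is the equal-weighting concern described above.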
Substantial improvement or maintenance of daily function and quality of life (intermediate endpoints) in this progressive disease would also be quite valuable, even if the time until ultimate clinical endpoints was unchanged in the end. Intermediate endpoints are distinct from surrogate endpoints, defined by the Food and Drug Administration as “a laboratory measurement or physical sign that is used in therapeutic trials as a substitute for a clinically meaningful endpoint... and is expected to predict the effect of the therapy” 23. Valid surrogate endpoints are measures or quantities which, although not necessarily clinically important in themselves, reflect the effect of a therapy on the ultimate clinical endpoints.
A single measure may be both a surrogate and an intermediate endpoint. For example, it appears that therapy which changes the distance walked in six minutes (6-minute walk distance; 6MWD) also reduces mortality 24, so that this measure may be a surrogate endpoint in PAH. However, the 6MWD may also be an intermediate endpoint, in that some absolute change in distance walked may actually improve a patient's quality of life. In this latter case, while we may not draw inferences about beneficial effects on survival, we would welcome a drug which improved function in a clinically significant way (providing it did not shorten survival).
Peacock et al. 11 and Hoeper et al. 12 highlight the unclear clinical relevance of certain parameters; these may then not be optimal intermediate endpoints. On the other hand, if we believe that a measure, such as cardiac index, is a valid surrogate endpoint in PAH, even a small effect on this measure may still have dramatic implications regarding long-term outcomes.
Peacock et al. 11 and other authors 25, 26 touch on the criteria for valid surrogate endpoints, which are helpful to review, as follows. 1) Surrogate measures must be reliable. The parameter must not vary within an individual subject and the measurement must have intra- and inter-centre reproducibility for multicentre trials. 2) Biological plausibility is a consideration. The surrogate should be integral (or closely related) to a pathophysiological factor which leads to morbidity or mortality in PAH. 3) There must be a strong prognostic relationship between the marker and clinical outcomes, preferably documented in multiple studies. 4) There must be evidence from other RCTs in PAH that a quantitative modification of the surrogate measure reliably leads to a similar modification in the target outcome.
Investigators must incorporate the measurement of potential surrogates as secondary outcomes into RCTs that are adequately powered to show an effect of an intervention on established endpoints. These trials must show that the effect of an intervention on clinical outcomes is preceded by some quantifiable effect on the surrogate, in order to validate the measure in PAH. While such validation studies may be expensive, these investigations prevent the use of unvalidated surrogate endpoints and uninterpretable results. Inconclusive studies are more costly to sponsors and can delay approval of effective therapies.
Which potential surrogates considered by the working groups fulfil these criteria? Haemodynamics fulfil all of the criteria for valid surrogates 24, although the “right” parameter for PAH studies is unclear and could differ depending on the intervention being tested. For example, cardiac index may serve well as a surrogate endpoint for a clinical trial of a pulmonary vasodilator, whereas a pure inotrope may have a short-term effect on this endpoint without having a significant impact on long-term survival.
The invasiveness of these measures is a drawback. Also, regulatory agencies are hesitant to accept haemodynamics as surrogates due to their inadequacy in CHF, where therapies which improved the haemodynamic profile subsequently had adverse effects on overall outcomes 27. It is difficult to extrapolate the success or failure of potential surrogate endpoints, or therapies, from one disease to the other. Decisions regarding acceptability of endpoints must be individualised according to disease.
Echocardiographic measurements also show promise as surrogates, since they fulfil all of the criteria 28, 29. One benefit over haemodynamics is the non-invasiveness of the measure. Inter-centre variability and study quality, based on subject characteristics, are potential weaknesses.
Plasma biomarkers are an attractive option as assessment is relatively non-invasive, some have low within-subject variability and assays may be performed centrally. While certain biomarkers meet the first three criteria (e.g. brain natriuretic peptide 30), RCTs of effective therapies with biomarker data have not been published, so the last criterion remains to be fulfilled.
6MWD appears to satisfy the requirements for validity and is the primary endpoint in virtually all recent RCTs in PAH, although some issues remain 2. This effort-dependent endpoint can be subtly and unintentionally biased by clinicians or subjects in clinical trials of therapies with prominent side-effects or tell-tale laboratory abnormalities. For example, a patient in a study of a prostacyclin analogue with prominent flushing, jaw pain and gastrointestinal side-effects may put forth a different effort during exercise than a patient with no side-effects. In addition, conduct of the test may affect subject performance; methods must be standardised to ensure consistency in studies.
It is important to decide not only which measure is the best surrogate, but also which form of that measure should be used. A difference in mean 6MWD between two groups at the conclusion of a study is a different result from a difference between the changes in 6MWD in the two groups. One of these functional forms may be a valid surrogate endpoint and the other may not. Future research should better define optimal parameters.
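The distinction between the two functional forms can be seen in a short sketch. The data are invented (baseline, final) 6MWD pairs in metres, chosen so that the groups differ at baseline; the function names are hypothetical.

```python
from statistics import mean

def final_mean_difference(treated, control):
    """Difference in mean final 6MWD (treated minus control)."""
    return mean([final for _, final in treated]) - \
        mean([final for _, final in control])

def change_difference(treated, control):
    """Difference in mean change from baseline (treated minus control)."""
    return mean([final - base for base, final in treated]) - \
        mean([final - base for base, final in control])

# Hypothetical (baseline, final) pairs; treated patients start lower.
treated = [(300.0, 340.0), (320.0, 360.0), (340.0, 380.0)]
control = [(340.0, 350.0), (360.0, 370.0), (380.0, 390.0)]

print(final_mean_difference(treated, control))  # treated end lower
print(change_difference(treated, control))      # treated improve more
```

In this invented example the treated group finishes with a lower mean 6MWD yet shows a larger improvement from baseline, so the two forms point in opposite directions. Randomisation makes such baseline imbalance less likely in large trials, but it can occur by chance in the small samples typical of a rare disease.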
How can we maximise what we learn from each clinical investigation? The working group participants proposed sharing individual-patient data. There has been a relative explosion of RCTs in PAH. Each trial assembles a well-characterised cohort of patients with an exceedingly rare disease, offering the opportunity to: 1) better define the epidemiology of the disease and predictors of outcome; 2) merge datasets to perform meta-analyses; and 3) collect samples of plasma and genetic material for high-throughput analyses. While additional costs are a concern to sponsors, the first two options mainly require prospective planning in terms of informed consent and appropriate safeguards for data-sharing and cooperation among study investigators, which are ostensibly achievable goals.
A two-by-two factorial trial allows investigators to study two therapies “for the price of one” in the right setting. In this design, all patients are randomised to either active drug 1 or placebo 1 and either active drug 2 or placebo 2. Such trials are most efficient when the effect of each therapy does not depend on the presence or the absence of the other therapy. While these studies may be slightly more complex, they present the opportunity to test two drugs with a single trial in this rare disease.
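The arithmetic behind this design can be sketched as follows. The four cell means are hypothetical changes in an outcome (for example, metres of 6MWD), and the function is an illustrative contrast calculation only, without the variance estimates a real analysis would need.

```python
def factorial_effects(cell_means):
    """Main effects and interaction from a two-by-two factorial trial.

    `cell_means` maps (on_drug_1, on_drug_2) to the mean outcome in
    that cell. Each main effect averages the drug's effect in the
    absence and in the presence of the other drug; the interaction is
    the difference between those two effects.
    """
    m00 = cell_means[(False, False)]  # placebo 1 + placebo 2
    m10 = cell_means[(True, False)]   # drug 1 + placebo 2
    m01 = cell_means[(False, True)]   # placebo 1 + drug 2
    m11 = cell_means[(True, True)]    # drug 1 + drug 2
    effect_1 = ((m10 - m00) + (m11 - m01)) / 2
    effect_2 = ((m01 - m00) + (m11 - m10)) / 2
    interaction = (m11 - m01) - (m10 - m00)
    return effect_1, effect_2, interaction

# Hypothetical cell means: purely additive effects, then an antagonistic case.
additive = {(False, False): 0.0, (True, False): 20.0,
            (False, True): 30.0, (True, True): 50.0}
antagonistic = dict(additive)
antagonistic[(True, True)] = 35.0

print(factorial_effects(additive))
print(factorial_effects(antagonistic))
```

When the effects are additive, every patient contributes to the estimate of both drugs and the interaction term is zero. In the antagonistic case the averaged main effects understate what each drug achieves alone, which is why the design is most efficient when the two therapies do not interact.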
These working-group statements initiate the discussion of how we should best study potential therapies for pulmonary arterial hypertension. We are fortunate to have the opportunity to address these issues. While many unknowns regarding the longevity of effects of new medical therapies and the timing of lung transplantation remain, discoveries may lie in novel methodological approaches. Future working groups such as these will surely be fertile ground for innovation, as we re-evaluate the answers we already have and formulate the next questions we want to ask.
© ERS Journals Ltd