Abstract
The French Pulmonary Hypertension Network (FPHN) registry and the Registry to Evaluate Early And Long-term Pulmonary Arterial Hypertension Disease Management (REVEAL) have developed predictive models for survival in pulmonary arterial hypertension (PAH). In this collaboration, we assess the external validity (or generalisability) of the FPHN ItinérAIR-HTAP predictive equation and the REVEAL risk score calculator.
Validation cohorts approximated the eligibility criteria defined for each model. The REVEAL cohort comprised 292 treatment-naïve, adult patients diagnosed <1 year prior to enrolment with idiopathic, familial or anorexigen-induced PAH. The FPHN cohort comprised 1737 patients with group 1 PAH.
Application of FPHN parameters to REVEAL and REVEAL risk scores to FPHN demonstrated estimated hazard ratios that were consistent between studies and had high probabilities of concordance (c-index of 0.72, 95% CI 0.64–0.80, and 0.73, 95% CI 0.70–0.77, respectively).
The REVEAL risk score calculator and FPHN ItinérAIR-HTAP predictive equation showed good discrimination and calibration for prediction of survival in the FPHN and REVEAL cohorts, respectively, suggesting prognostic generalisability in geographically different PAH populations. Once prospectively validated, these may become valuable tools in clinical practice.
REVEAL risk score and ItinérAIR-HTAP equation show geographic validity (generalisability) in distinct PAH populations http://ow.ly/INyx8
Introduction
Pulmonary arterial hypertension (PAH) is a rare, progressive condition characterised by increased pulmonary arterial pressure and a progressive increase in pulmonary vascular resistance leading to right ventricular failure and death [1–5]. PAH may be idiopathic, heritable (related to a genetic defect), or associated with another condition or exposure to toxins or drugs (e.g. appetite suppressants (anorexigens)). Despite progress in our management of PAH over the past decade and the availability of new therapies, patient outcome, although improved, remains poor [3, 6–10]. Timely and accurate estimates of mortality risk may prompt earlier initiation of interventions to improve survival [11].
Patient registries have provided important observational data that characterised the survival of patients with pulmonary hypertension [2, 3, 6, 7, 12–17]. A number of registries have been implemented over the past decade to collect data on patients with PAH and analyse the course of PAH in the current treatment era [6, 7, 16, 18–20]. Two such contemporary PAH registries include the US-based Registry to Evaluate Early And Long-term PAH Disease Management (REVEAL) [12, 20] and the French Pulmonary Hypertension Network (FPHN) registry [6, 21]. REVEAL enrolled adult and paediatric patients meeting a broad definition of World Health Organization group 1 PAH and receiving any current treatment. The FPHN registry enrolled patients with all forms of PAH, including patients enrolled in previous FPHN registries.
One of the pre-specified objectives of both REVEAL [12, 20] and the FPHN registry [6, 21] was to identify predictors of short- and long-term survival derived from baseline evaluations prior to treatment. REVEAL and FPHN investigators derived predictive models of survival based on multiple parameters, and have reported contemporary prognostic equations [2, 14]. Each of these equations has been statistically validated in a population from the respective registry that did not contribute to construction of the model. The results of these analyses suggested an improvement in the survival of patients with PAH compared with that reported at the time of the US National Institutes of Health primary pulmonary hypertension registry [13, 14, 18, 22–24].
Validating models developed in one cohort in a different patient population assesses their generalisability [25, 26]. In this report, we describe the results of a collaboration designed to assess the external validity (or generalisability) of the REVEAL and French risk-predictive models.
Materials and methods
Patient source
REVEAL and FPHN are observational, prospective registries of patients diagnosed with pulmonary hypertension in the USA (REVEAL) and France (FPHN), which were conducted during the same era of disease knowledge, patient management and treatment. The methodology of REVEAL and the baseline characteristics of enrolled patients have been described previously [12, 20]. Briefly, the REVEAL registry includes 3515 patients diagnosed with group 1 PAH enrolled from 55 participating centres (university-affiliated and community hospitals) between 2006 and 2009 (fig. E1) [25]. The FPHN is a national centralised network of specialised pulmonary hypertension units in university hospitals coordinated by one referral centre in France. The FPHN conducted a first PAH registry in 2002–2003 (hereafter referred to as the FPHN ItinérAIR-HTAP), which included 674 patients followed for at least 3 years (fig. E1) [6, 14, 15]. In November 2006, the FPHN initiated a new national registry (hereafter referred to as the FPHN registry) of all groups of pulmonary hypertension (fig. E1) [21]. Patients with a diagnosis of pulmonary hypertension on or after November 2006 are prospectively included in the FPHN registry. Patients with an earlier date of diagnosis were retrospectively registered. The FPHN registry remains open and currently includes more than 6000 patients with pulmonary hypertension.
Study design
The REVEAL and FPHN collaboration consisted of two modules: 1) evaluation of the FPHN ItinérAIR-HTAP predictive equation using a REVEAL validation cohort; and 2) evaluation of the REVEAL prognostic risk score calculator using a FPHN registry validation cohort.
The REVEAL, FPHN ItinérAIR-HTAP, and later FPHN registries were developed independently with different enrolment criteria, data collection and timing of follow-up. For this inter-study analysis, validation cohorts were built to approximate the inclusion and exclusion criteria originally defined by each registry to develop their predictive models. The parameters used to build the cohorts were not the parameters used to predict mortality. Study time-points for survival analyses differed between registries and are addressed in the description of each module below. For both study modules, survival was censored at the target follow-up time upon which the individual equations were originally based (3 years for the REVEAL validation cohort of the FPHN ItinérAIR-HTAP equation [14, 15] and 1 year for the FPHN validation cohort of the REVEAL risk score [2, 13]). Patient values were not available for all variables included in the predictive models. Missing data did not pose a challenge for the FPHN registry validation of the REVEAL risk score because missing data are explicitly included as part of the reference category, and assigned zero points, in the computation of the REVEAL risk score. For the REVEAL validation of the FPHN ItinérAIR-HTAP equation, however, no rule had been developed for missing data, so the analyses were carried out on the subset of patients with non-missing data for all three parameters of the FPHN ItinérAIR-HTAP equation. Following these two rules allowed validation of each equation in the population for which it was intended to be used, as the REVEAL score was intended to provide a prediction even when data were missing and the FPHN ItinérAIR-HTAP equation was intended to provide a combined prognostic evaluation when both 6-min walking distance (6MWD) and cardiac output were available.
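The two data-handling rules described above (censoring at the target follow-up time, and restricting the FPHN ItinérAIR-HTAP analysis to patients with complete predictors) can be sketched as follows. This is a minimal illustration only: the function names and the example values are hypothetical and are not taken from either registry's analysis code.

```python
from typing import Optional, Sequence, Tuple


def censor_at_horizon(time_years: float, died: bool,
                      horizon_years: float) -> Tuple[float, bool]:
    """Truncate follow-up at the horizon; later deaths count as censored."""
    if time_years > horizon_years:
        return horizon_years, False
    return time_years, died


def has_complete_data(values: Sequence[Optional[float]]) -> bool:
    """True when none of the required predictors is missing (None)."""
    return all(v is not None for v in values)


# 3-year horizon for the FPHN ItinérAIR-HTAP equation,
# 1-year horizon for the REVEAL risk score (both from the text).
print(censor_at_horizon(4.2, True, 3.0))  # death after the horizon is censored
print(censor_at_horizon(0.8, True, 1.0))  # death before the horizon is kept

# Hypothetical patient: 6MWD and sex recorded, cardiac output missing.
print(has_complete_data([410.0, 1.0, None]))
```

Under the complete-case rule, the last patient would be left out of the module 1 analysis, whereas the REVEAL risk score would still be computable because missing items score zero points.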
Module 1: evaluation of the FPHN ItinérAIR-HTAP predictive equation in a REVEAL validation cohort
The FPHN ItinérAIR-HTAP is a three-term equation (including female sex, greater 6MWD and higher cardiac output) predicting survival at 3 years from diagnosis [14, 15]. The kernel of the equation gives the survival at year t as S(t) = [exp(−0.02 − 0.28×t)]^A(x,y,z), where the exponent A(x,y,z) is a function of the three patient-level predictors. It was derived from survival analysis of a combined population of incident patients (n=56) and additional cases diagnosed within 3 years of study entry (n=134) recruited in the FPHN ItinérAIR-HTAP registry from October 2002 to October 2003 [14, 15]. To approximate the FPHN ItinérAIR-HTAP equation eligibility criteria [15], the REVEAL validation cohort for this analysis comprised recently diagnosed (diagnostic right heart catheterisation <1 year prior to enrolment), treatment-naïve patients aged ≥18 years with idiopathic PAH (IPAH), familial PAH (FPAH) or anorexigen-associated PAH (anorexigen-APAH), comparable to the incident population selected when developing the FPHN ItinérAIR-HTAP equation. This validation study cohort was then divided into subgroups of patients without missing data for all FPHN ItinérAIR-HTAP equation parameters versus those with any missing data.
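Reading A(x,y,z) as an exponent applied to the baseline term, as in a proportional hazards model, the kernel can be evaluated as in the sketch below. The baseline coefficients (−0.02 and −0.28) come from the text; the coefficients inside A(x,y,z) are not given here, so the values of A used in the loop are purely hypothetical placeholders.

```python
import math


def baseline_survival(t_years: float) -> float:
    """Baseline term exp(-0.02 - 0.28*t) quoted in the text."""
    return math.exp(-0.02 - 0.28 * t_years)


def predicted_survival(t_years: float, a_exponent: float) -> float:
    """Baseline raised to the patient-specific exponent A(x, y, z).

    The exponent form is an assumption about the equation's structure;
    a_exponent must be supplied by the caller.
    """
    return baseline_survival(t_years) ** a_exponent


# Hypothetical exponents: smaller A means a more favourable risk profile.
for a in (0.25, 0.5, 1.0):
    print(a, round(predicted_survival(3.0, a), 3))
```

With A = 1 the prediction equals the baseline term itself; any A below 1 raises predicted survival, which is how favourable predictor values would act in this form.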
For the FPHN ItinérAIR-HTAP equation development cohort, survival was estimated from diagnosis rather than enrolment (to avoid survivor bias) in a combined population of incident and prevalent patients (a delayed entry model was used to avoid immortal time bias associated with left truncation [27]). Transplanted patients were censored at the time of transplant, regardless of subsequent outcome, in order to correspond to the FPHN ItinérAIR-HTAP analysis even though patients were followed post-transplant in REVEAL. Thus, censoring at transplant is an analysis decision rather than a study design element.
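The delayed-entry idea can be illustrated with a small Kaplan–Meier sketch in which a patient only joins the risk set after enrolment, and transplant is treated like any other censoring event. This is a toy implementation written for illustration; the record layout and the example data are hypothetical and do not reproduce the registries' analyses.

```python
from typing import List, Tuple

# (entry, exit, died), all in years since diagnosis.
# entry: time of enrolment; exit: death, transplant or last contact.
Record = Tuple[float, float, bool]


def km_delayed_entry(records: List[Record], horizon: float) -> float:
    """Kaplan-Meier survival at `horizon` with left-truncated risk sets."""
    death_times = sorted({exit_ for _, exit_, died in records
                          if died and exit_ <= horizon})
    survival = 1.0
    for t in death_times:
        # At risk at t: already enrolled (entry < t) and still followed.
        at_risk = sum(1 for entry, exit_, _ in records if entry < t <= exit_)
        deaths = sum(1 for entry, exit_, died in records
                     if died and exit_ == t and entry < t)
        if at_risk > 0:
            survival *= 1.0 - deaths / at_risk
    return survival


toy = [
    (0.0, 1.0, True),    # incident patient, died at 1 year
    (0.0, 3.5, False),   # incident patient, alive past 3 years
    (1.5, 2.0, True),    # prevalent patient enrolled at 1.5 years, died at 2
    (1.5, 2.5, False),   # prevalent patient, transplanted at 2.5: censored
    (0.5, 4.0, False),   # enrolled at 0.5 years, alive past 3 years
]
print(km_delayed_entry(toy, horizon=3.0))
```

In the toy data the two prevalent patients do not count as at risk during the first 1.5 years, which is what prevents their pre-enrolment survival from inflating the estimate.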
Module 2: evaluation of the REVEAL risk score calculator in a FPHN validation cohort
The REVEAL simplified risk score calculator was derived from the predictive equation developed from survival analysis of the REVEAL cohort (n=2716) [2, 13]. It includes 19 variables significantly associated with 1-year survival, assigning weighted values for each independent prognostic factor of survival identified from multivariate analysis (table 1). The REVEAL predictive model was developed on a predominantly prevalent cohort. It was further validated in a prospective cohort of patients with newly diagnosed PAH showing that the prognostic equation accounts for survivor bias and is a good model for patients with newly diagnosed disease [2, 13]. The FPHN validation cohort was selected to approximate the inclusion/exclusion criteria defined to develop the REVEAL risk score calculator [2, 13]. The data cut-off for the FPHN registry was November 2011.
For the REVEAL equation development cohort, time zero was defined by the patient enrolment date (date of informed consent). Enrolment date is not applicable to the FPHN registry because it is not used in the same way as in REVEAL. Thus, time zero was defined for the FPHN validation cohort as follows: for patients with an initial evaluation before November 1, 2006, time zero was the date of the first visit on or after November 1, 2006; for patients with an initial evaluation on or after November 1, 2006, time zero was the date of the initial evaluation. The REVEAL risk score calculator identified renal insufficiency as a risk factor, but renal function data were not collected in the FPHN registry. Thus, the validation analysis was performed without the renal insufficiency parameter.
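The time-zero rule for the FPHN validation cohort can be written as a short function. The sketch below assumes each patient has an initial evaluation date and a list of visit dates; the function name, the handling of patients with no qualifying visit (returning None) and the example dates are illustrative assumptions.

```python
from datetime import date
from typing import Optional, Sequence

REGISTRY_START = date(2006, 11, 1)  # cut-off date named in the text


def time_zero(initial_evaluation: date,
              visits: Sequence[date]) -> Optional[date]:
    """Return time zero for one patient, or None if no visit qualifies."""
    if initial_evaluation >= REGISTRY_START:
        # Evaluated on or after the cut-off: time zero is that evaluation.
        return initial_evaluation
    # Evaluated earlier: first visit on or after the cut-off.
    later_visits = [v for v in visits if v >= REGISTRY_START]
    return min(later_visits) if later_visits else None


# Hypothetical patient first evaluated in 2005.
print(time_zero(date(2005, 3, 14),
                [date(2006, 6, 1), date(2007, 1, 9), date(2007, 7, 2)]))
# Hypothetical patient first evaluated in 2008.
print(time_zero(date(2008, 2, 20), []))
```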
Statistical methods
The external validity of each equation was based on the performance of the prognostic models in terms of calibration and discrimination, following the approach of Altman et al. [28] and Cook [29]. Cook [29] notes “the performance of risk prediction models in the cardiovascular literature is often judged solely on the basis of the c statistic despite the existence of large prospective cohort studies from which risk can be estimated directly” and “calibration has largely been overlooked in discussions of model fit”. Therefore, we sought to demonstrate external validity through an evaluation of not just discrimination, but calibration as well.
Model calibration was assessed by comparing the Kaplan–Meier estimates and 95% confidence intervals observed in the validation cohort to the survival predicted from the equations; predicted survival was based on the published predictive equations. Discrimination for the equations and the REVEAL calculator was determined by calculating the probability of concordance (c-index) [30, 31]. In addition to the analysis of the published equation, the fit of the β-coefficients proposed in the FPHN ItinérAIR-HTAP predictive equation was further evaluated by fitting a new Cox proportional hazards model to the REVEAL data, while otherwise maintaining the structure of the original three-term FPHN ItinérAIR-HTAP model.
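As an illustration of what the probability of concordance measures, the sketch below implements one common estimator (Harrell's C) for right-censored data. It is offered as an assumption about the general technique; the exact estimator and confidence interval method used in the study are those of the cited references and are not reproduced here. The toy data are hypothetical.

```python
from typing import Sequence


def c_index(times: Sequence[float], events: Sequence[bool],
            risk_scores: Sequence[float]) -> float:
    """Harrell's C: share of usable pairs ordered correctly by risk."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i died before patient j's
            # last observed time, so the ordering of outcomes is known.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical follow-up times (years), death indicators and risk scores.
times = [1.0, 2.0, 3.0, 4.0]
events = [True, True, False, True]
risk = [0.9, 0.4, 0.5, 0.1]
print(c_index(times, events, risk))
```

A value of 0.5 corresponds to ordering no better than chance and 1.0 to perfect ordering, which gives context for the c-index values reported in the results.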
Results
Module 1: evaluation of the FPHN ItinérAIR-HTAP predictive equation in a REVEAL validation cohort
Patient characteristics
The REVEAL validation cohort for evaluation of the FPHN ItinérAIR-HTAP predictive equation comprised 292 patients with non-missing data for all FPHN ItinérAIR-HTAP equation parameters (fig. 1). The REVEAL validation and the FPHN ItinérAIR-HTAP equation development cohorts had approximately comparable characteristics (table 2), except that the REVEAL validation cohort contained a higher proportion of females (78% versus 63%) and a lower proportion of anorexigen-APAH (3.4% versus 13.2%) than the FPHN ItinérAIR-HTAP equation development cohort. Both cohorts had a similar functional class (FC) profile at diagnosis, with most patients in New York Heart Association FC III. The REVEAL validation cohorts with missing data for 6MWD or cardiac output (n=144) and non-missing data (n=292) were generally similar, but differed with respect to the proportion of patients in functional class IV (table E1).
Survival
Survival from diagnosis in the REVEAL validation cohort according to FPHN ItinérAIR-HTAP equation risk quartiles is shown in figure 2a. The mortality risk quartiles separate clearly over 3 years, although only the highest mortality risk quartile shows meaningful separation from the others over the first year. At 3 years of follow-up, patients in the highest mortality risk quartile had the lowest survival estimate, at 53.9%, compared with 82.5%, 87.7% and 93.3% in the other three quartiles.
The predicted versus observed survival at 3 years is shown in figure 2b. Risk stratification of the REVEAL validation cohort using the FPHN ItinérAIR-HTAP equation showed good discrimination between high- and low-mortality risk patients. Observed survival in REVEAL was consistent with predicted survival and was slightly better than that predicted by the FPHN ItinérAIR-HTAP equation, particularly in the middle strata. The largest differences were in the second and third risk quartiles where there were eight and 11 observed deaths, although 14.8 and 20.1 deaths were predicted. Although the validation focused exclusively on the aetiologies for which the model was intended, observed REVEAL survival was also computed for APAH-connective tissue disease (CTD) patients stratified by risk quartiles. These patients had survival estimates of 82.9%, 69.0%, 62.0%, and 45.2%, all falling somewhat below the predicted values of 87.2%, 77.0%, 66.5%, and 48.8%. By contrast, the primary analysis cohort had observed survival significantly or marginally better than predicted in the middle quartiles.
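One simple way to summarise this kind of calibration comparison is an observed-to-expected ratio per stratum, where the expected number of deaths is the sum of each patient's predicted probability of death. The sketch below assumes that definition of "predicted deaths" (the text does not state how the figures were computed) and reuses the observed and predicted counts quoted above; the function names are hypothetical.

```python
from typing import Sequence


def expected_deaths(predicted_survival: Sequence[float]) -> float:
    """Sum of predicted death probabilities (1 - predicted survival)."""
    return sum(1.0 - s for s in predicted_survival)


def observed_to_expected(observed: float, expected: float) -> float:
    """Ratio below 1 means fewer deaths were observed than predicted."""
    return observed / expected


# Observed and predicted deaths quoted in the text for the middle quartiles.
for label, obs, pred in [("second quartile", 8, 14.8),
                         ("third quartile", 11, 20.1)]:
    print(label, round(observed_to_expected(obs, pred), 2))
```

Both ratios come out at a little over one half, consistent with the statement that observed survival was better than predicted in the middle strata.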
The three-term FPHN ItinérAIR-HTAP Cox proportional hazards model refitted to the REVEAL data for the target aetiologies is shown in table 3. Application of the FPHN ItinérAIR-HTAP parameters to the REVEAL validation cohort demonstrated a good correlation of estimated hazard ratios (HRs) between the two studies and a robust c-index of 0.72 (95% CI 0.64–0.80). Refitting the three β-coefficients in the FPHN ItinérAIR-HTAP equation to the REVEAL data, without otherwise changing the structure of the equation, suggested a similar fit for these three parameters in REVEAL, although 6MWD appeared to be a moderately stronger predictor (HR 0.99 versus 1.00 per 1-m increase) in REVEAL and cardiac output may be a moderately weaker predictor (HR 0.85 versus 0.76 per L·min−1). We conducted a sensitivity analysis replacing cardiac output in the equation with cardiac index (which relates cardiac output to body surface area) to determine whether the cardiac output estimate may be affected by potential differences in body weight and body surface area between patients in the USA and France. We found similar trends (HR 0.75, 95% CI 0.44–1.29), most likely because cardiac output was the weakest of the three predictors in the equation in both the French model development cohort and the US validation cohort.
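Hazard ratios quoted per 1-m increase in 6MWD look close to 1 because the unit is small. Under the log-linear assumption of a Cox model, an HR per unit can be rescaled to a larger increment by raising it to the corresponding power. The sketch below applies this to the rounded HR of 0.99 from the text; because that value is rounded to two decimals, the rescaled figure is only indicative, and the 50-m increment is an arbitrary illustrative choice.

```python
def rescale_hazard_ratio(hr_per_unit: float, units: float) -> float:
    """HR for a change of `units`, given the HR for a 1-unit change."""
    return hr_per_unit ** units


# Rounded HR per 1-m increase in 6MWD in REVEAL (from the text),
# rescaled to a hypothetical 50-m increase.
print(round(rescale_hazard_ratio(0.99, 50), 2))

# An HR of exactly 1.00 stays at 1.00 whatever the increment.
print(rescale_hazard_ratio(1.00, 50))
```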
Module 2: evaluation of the REVEAL risk score calculator in a FPHN validation cohort
Patient characteristics
The FPHN validation cohort included 1737 patients with group 1 PAH, representing all subgroups that were part of the REVEAL development cohort (fig. 3). Characteristics of this cohort are summarised in table 4. In the FPHN validation cohort, 48% of patients were newly diagnosed and 53% were in FC III. Mean±sd 6MWD was 356±139 m.
Survival
Survival from time zero among the FPHN validation cohort according to the REVEAL risk classes is shown in figure 4a. In the first month, there was a separation of patients in the highest two risk classes. At 6 and 12 months, survival corresponded to risk stratification.
Observed 1-year survival for patients in mortality risk score strata 0–7, 8, 9, 10–11 or ≥12 was 96.4%, 88.7%, 86.3%, 78.6% and 62.1%, respectively. These values are close to the predicted survival for each of these pre-specified risk strata, although there were statistically significantly more events observed than expected in the stratum associated with a risk score of 8 (34 observed versus 20.9 expected), suggesting the negative predictive value associated with a score of exactly 8 may be lower than predicted (88.7% based on the observed Kaplan–Meier estimate versus 90–95% based on the internal REVEAL validation). The c-index for 1-year survival and risk score was 0.73 (95% CI 0.69–0.77) (fig. 4b).
Discussion
There is currently considerable international effort towards studying survival, risk scores and predictive equations in PAH [27, 32, 33]. This collaborative study is an important first step in the external statistical validation of two predictive models of survival in patients with PAH in the modern treatment era. Both the REVEAL and the FPHN validation cohorts demonstrated good discrimination for the FPHN and REVEAL predictive models, respectively.
The FPHN ItinérAIR-HTAP equation accurately stratified a matched US population according to risk in patients with IPAH, heritable PAH and anorexigen-APAH. The better survival observed in the REVEAL validation cohort compared with the survival predicted by the FPHN ItinérAIR-HTAP equation may reflect the effects of exclusion of patients with missing data and/or inherent differences between the registries, such as enrolment eras, coincident PAH-targeted drug availability and body mass index. The validation cohort assembled from the FPHN registry demonstrated good discrimination by the REVEAL risk score calculator. Observed and predicted survival were well correlated using the REVEAL risk score in the FPHN validation cohort.
Despite the fact that each validation cohort was constructed to approximate each equation development cohort, the two resulting populations were not precisely matched. Notable differences included the different enrolment periods. Despite these differences, the HRs estimated in the validation cohort closely resembled the original estimates from the FPHN ItinérAIR-HTAP equation, probably reflecting the fact that cardiac output was a weaker predictor than sex and 6MWD in the FPHN ItinérAIR-HTAP equation. Indeed, replacing cardiac output with cardiac index in the Cox refit model produced a similar nonsignificant HR. Each equation demonstrated good discriminatory ability in the respective validation cohort: the REVEAL risk calculator and the FPHN ItinérAIR-HTAP equation had a c-index of 0.72 and 0.57, respectively, in the original derivation populations. The same discriminatory ability was maintained in the validation cohorts, with a c-index of 0.72 and 0.73 in modules 1 and 2, respectively. It is important to note, however, that similarly high c-indexes have been achieved with single parameters in other studies [34] and there is considerable debate about whether use of multi-factor models provides sufficient benefit over individual factors to warrant the added complexity [24]. The potential importance of evaluating risk of death in PAH based on multifactorial assessment compared with a single parameter has been shown in several analyses using the REVEAL risk score [24, 35]. These models may offer a prognosis better fitted to a patient's individual level of risk that might not be captured from a single clinical parameter. However, the relative value of single parameter evaluations, which are simpler than predictive equations and often quite powerful, warrants further study.
The validation of the FPHN ItinérAIR-HTAP predictive model was conducted retrospectively, 5–6 years after the equation was derived [15]. Thus, the better than predicted survival among the REVEAL cohort could be explained by an “era effect”; since completion of the FPHN ItinérAIR-HTAP there has been a dramatic reduction in the use of conventional therapy alone. While correlation does not determine causality, the change in treatment patterns over this short time frame is striking and may potentially explain the “era effect”. A more in-depth discussion of the “era effect” seen across numerous PAH registries is provided by McGoon et al. [36].
Narrowly defining the validation cohorts to be similar to the development cohorts may limit the breadth of the external validity. For example, a large number of patients were excluded from the evaluation of the French equation due to missing 6MWD or cardiac output evaluations, which are, in addition to sex, the parameters needed to apply the three-term FPHN ItinérAIR-HTAP equation. The fact that patients with missing evaluations were more likely to be FC IV demonstrates that data were not missing completely at random. This reflects clinical practice and emphasises that equations which require non-missing data are only generalisable to patients with complete data. While it may seem appropriate to generalise the formula only to those patients who have the measures involved in the formula, it is also possible that a different cohort with more or less missing data might yield different results. It is important to clarify, however, that these data are missing due to the tests not being performed, not due to poor compliance with electronic case report form completion. Thus, the missing data highlight the fact that the target population for intended use may be smaller than expected, but they do not limit the equation's potential usefulness within that target population. The REVEAL equation involves explicit scoring instructions related to missing data, but it could still be affected by differences in the extent of missing data, which is part of the reason that validation in different cohorts is critical. Another difference between the equations is that the FPHN equation addresses pre-transplant mortality, while the REVEAL risk score is intended to predict overall mortality without restriction for transplanted patients.
While the absence of censoring at transplant in the REVEAL risk score is inconsistent with the proportional hazards assumption (as post-transplant predictors differ from pre-transplant risk factors), censoring patients at transplant in the FPHN equation is itself clearly a form of informative censoring. In spite of these differences, the REVEAL and FPHN validation cohorts provided a large population of patients for a robust evaluation of these models.
Statistical validation, which emphasises discrimination and calibration of a mathematical prediction, is not synonymous with clinical validation of the tool. Data from the REVEAL and FPHN registries, and the two equations derived from them, have been useful to compare patients' characteristics, predict outcomes in two different populations, and test the validity of prognostic indicators such as FC, 6MWD and cardiac index. However, the use of the REVEAL risk score calculator and the FPHN ItinérAIR-HTAP equation in routine clinical practice requires further evaluation in patients with PAH, with prospective experimental designs rather than observational [35].
Although this inter-study collaboration has reduced the chances of unintended errors without adding the biases common to internal validation, our study has several major limitations. This analysis was retrospective and not all clinical parameters were available for every patient; post hoc procedures were required for handling missing data. Because the FPHN ItinérAIR-HTAP equation was developed to be applied to patients diagnosed with IPAH, FPAH or anorexigen-induced PAH, APAH patients (e.g. PAH associated with CTD, congenital heart disease or portopulmonary hypertension) enrolled in REVEAL were necessarily excluded from the validation cohort to be consistent with proposed usage of the equation. The use of the FPHN cohort as an external validation of the REVEAL risk score is limited because renal function measurements, for which the REVEAL risk score assigns a “+1” score, are not collected by the FPHN registry. Another limitation of both equations is that they are subject to survivor bias if they are applied to a different time frame than the proposed and validated time frames. An equation predicting survival from time of diagnosis may be inappropriately pessimistic for prevalent patients and an equation designed for use at any point during follow-up may not sufficiently capture some of the short-term risks in the first few months after diagnosis [2, 15]. Moreover, biases related to diagnostic and treatment availability may have been influential. Finally, we acknowledge the risk that missingness could have occurred nonrandomly in our study, which may significantly bias the results when the cohort is restricted to subjects with no missing data and limits generalisability. Accurate assessment of disease severity and prognosis is necessary to guide disease management in PAH, with current guidelines recommending a combination of established prognostic parameters [4].
Predictive models such as these may have the potential to be useful tools in everyday clinical practice to identify high-risk patients, select appropriate advanced therapies for patients [37, 38], or refer patients for lung transplantation [39]. Further assessment remains necessary to determine whether the use of these predictive models will facilitate individualisation and optimisation of therapeutic strategies [40]. Further study may also allow identification of risk thresholds around which risk is incrementally higher and clinical aggressiveness may be modified. As our understanding of PAH and its management continues to evolve and new therapies become available, these predictive equations will need updating to reflect current practice.
In conclusion, the REVEAL risk score calculator and the FPHN ItinérAIR-HTAP predictive equation appear accurate and well calibrated in FPHN and REVEAL validation cohorts, respectively, suggesting their prognostic generalisability in geographically different PAH populations, with the important caveat that great care must be taken in the application and interpretation in populations with substantial missing data. Once prospectively validated, these equations may become valuable tools to guide therapeutic strategies in clinical practice.
Acknowledgements
The authors are saddened to report the passing of Robyn J. Barst, MD, in April 2013. She was an esteemed physician, investigator, and colleague. Her research focused extensively on pulmonary hypertension and she was a distinguished leader in the field of paediatric pulmonary hypertension. Dr. Barst's contributions to the field are invaluable.
Editorial support for this manuscript was provided by Anna Lau (Percolation Communications LLC, Annandale, NJ, USA) and funding was provided by Actelion Pharmaceuticals US, Inc (San Francisco, CA, USA).
Footnotes
This article has supplementary material available from erj.ersjournals.com
Support statement: Actelion Pharmaceuticals US Inc. is the sponsor of the REVEAL Registry, and provided funding and support for the analysis presented.
Disclosures: Disclosures can be found alongside the online version of this article at erj.ersjournals.com
- Received January 6, 2014.
- Accepted January 25, 2015.
- Copyright ©ERS 2015