The mean prevalence of congenital cardiac disease at birth is 7.7 per 1000 live births.Reference Samanek1 Congenital cardiac disease is a generic term for malformations of the heart present at birth. Three of the most common malformations are (a) the narrowing of the aorta, that is, aortic coarctation; (b) an opening in the wall dividing the left and right heart chambers, that is, ventricular septum defect; and (c) an opening in the wall dividing the left and right atrium, that is, atrial septum defect. In the Netherlands, approximately 1600 children with a congenital cardiac defect are born each year.Reference Kamphuis, Zwinderman and Vogels2 At least 85% of these patients reach adulthood owing to the successes of cardiac surgery.Reference Kamphuis, Zwinderman and Vogels2 Even after corrective surgery, however, most patients have residual lesions with varying effects on daily functioning, for example, exercise capacity and quality of life.Reference Kamphuis, Ottenkamp and Vliegen3, Reference Lane, Lip and Millane4
In daily clinical practice, many treating cardiologists assess the exercise capacity or functional status of patients with congenital cardiac disease following the four New York Heart Association classes (Table 1). These four classes were originally developed to help physicians evaluate the effect of cardiac symptoms on patients’ daily activities, but are also increasingly used to estimate patients’ functional status in clinical trials.Reference Bennett, Riegel, Bittner and Nichols5 The New York Heart Association classification has been found to be clinically useful, as it is associated with survival and quality of life.Reference Kamphuis, Zwinderman and Vogels2, Reference Dimopoulos, Diller and Koltsida6, Reference Moons, Van, De, Gewillig and Budts7
Table 1 New York Heart Association classification.

In large-scale medical research where patients cannot be seen by a physician at each time point, it would be advantageous if functional status could be assessed by patients themselves. However, the usefulness of such patient-based New York Heart Association class assessment critically depends on its agreement with the cardiologist-based score. Since the New York Heart Association classification is a physician-based score, the cardiologist can be seen as the gold standard.
To our knowledge, only four studies have explicitly compared patient-assessed with cardiologist-assessed New York Heart Association class.Reference Ekman, Kjork and Andersson8–Reference Subramanian, Weiner, Gradus-Pizlo, Wu, Tu and Murray11 These studies included patients with heart failure. Of these four studies, only Goode et alReference Goode, Nabb, Cleland and Clark9 made a direct comparison between patient-assessed and physician-assessed New York Heart Association class. The remaining three studiesReference Ekman, Kjork and Andersson8, Reference Kubo, Schulman, Starling, Jessup, Wentworth and Burkhoff10, Reference Subramanian, Weiner, Gradus-Pizlo, Wu, Tu and Murray11 inferred the New York Heart Association class from patient-reported functional class scalesReference Ekman, Kjork and Andersson8, Reference Kubo, Schulman, Starling, Jessup, Wentworth and Burkhoff10, Reference Subramanian, Weiner, Gradus-Pizlo, Wu, Tu and Murray11 or a quality-of-life questionnaire.Reference Subramanian, Weiner, Gradus-Pizlo, Wu, Tu and Murray11 The four studies found different levels of agreement. Unlike patients with heart failure, patients with a congenital cardiac malformation are born with impairments. Therefore, these patients may be more used to their limitations, which might result in a different perspective on their functional status. The objective of this study is to compare three patient-based New York Heart Association class assessments with cardiologist-assessed New York Heart Association class in outpatients with congenital cardiac disease, so that the best choice among them can be identified.
Materials and methods
Study population and procedure
Consecutive patients who attended one of the four cardiologists of the congenital heart outpatient clinic of the Academic Medical Center from March to June 2007 were asked to participate in this study. Patients who were not literate in Dutch were excluded.
Patients with various confirmed congenital cardiac defects completed three questionnaires assessing the New York Heart Association class preceding their visit to the cardiologist. The treating cardiologist completed the New York Heart Association assessment on the same day as the patient assessments and was blinded to the patients’ responses. Since no ethical approval is required for completion of self-report questionnaires under Dutch law, the medical ethics committee exempted this study from ethical approval. This study was conducted in full accordance with the principles of the “Declaration of Helsinki”, as amended in Tokyo, Venice, and Johannesburg.
Measures
Sociodemographic and clinical data
Sex, birth date, employment status, and presence of co-morbidity – such as diabetes, renal diseases, hypertension, chronic obstructive pulmonary diseases, rheumatoid arthritis, chronic allergies, chronic back pain, limitation in the use of arm or leg, or “other illnesses” – were measured through self-report. Primary diagnosis was extracted from the CONCOR database, a nationwide registry for patients with congenital cardiac disease.Reference van der Velde, Vriend, Mannens, Uiterwaal, Brand and Mulder12
New York Heart Association class assessment
Patient-based translation
We directly translated the four classes into patient-based statements (see Appendix 1). Class II, for example, was formulated as “I am slightly limited in performing physical activities. I do not experience any symptoms at rest, but ordinary physical activities cause extraordinary fatigue, palpitation or dyspnea”. Patients were asked to choose the statement that was most applicable to them. The New York Heart Association classes were directly derived from the answers.
Self-constructed questionnaire
The self-constructed questionnaire was devised with the help of four expert cardiologists and consisted of 11 questions concerning possible physical limitations as a result of the following cardiac symptoms: fatigue, dyspnoea, and palpitation. We queried the presence of each of the symptoms separately at three different levels of exertion: heavy, exemplified by running, doing sports, biking with adverse wind, climbing a flight of 20 steps; ordinary, exemplified by climbing a flight of three steps, walking, dressing; and at rest or when performing the slightest exertion, which was exemplified by standing up, reading a book, talking. For example, “Do you experience palpitations during regular physical activities (walking, climbing three steps, showering, getting (un)dressed)?” For all questions, a “yes” or “no” response option was used. The final question assessed whether discomfort of possible symptoms at rest increased when any physical activity was undertaken.
Specific Activity Scale
The Specific Activity Scale consists of five stem questions, of which three have a different number of sub-questions – that is, 4–8 – each addressing a different number of example activities – that is, 1–5. The Specific Activity Scale is based on the metabolic expenditure values of activities that a patient reports he or she can or cannot do, and classifies patients into one of the four functional classes.Reference Goldman, Hashimoto, Cook and Loscalzo13 The Specific Activity Scale is available in the original article by Goldman et al.Reference Goldman, Hashimoto, Cook and Loscalzo13 The Specific Activity Scale functional classification system is comparable to the New York Heart Association classification system.Reference Ekman, Kjork and Andersson8 The Specific Activity Scale was translated into Dutch. Minor cultural adaptations, which left the structure intact and did not affect the New York Heart Association scoring, included the deletion of four examples that are inapplicable to or too specific for Dutch adults – that is, roller skating, hang washed clothes, bowl, and push power lawnmower – and the conversion of weight (pounds) and speed (miles) into the International System of Units – that is, kilograms and kilometres.
Pilot, order, and debriefing questions
Two pilot studies were conducted to test the appropriateness of the wording of the three questionnaires. Improvements were made to the questionnaires, wherever needed. The order of the three questionnaires was counterbalanced to avoid order effects. Thus, there were six different sets of questionnaires, which were alternately administered to the participating patients.
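The counterbalancing scheme can be illustrated with a short Python sketch. Three questionnaires can be ordered in 3! = 6 ways, which matches the six sets mentioned above. The function names are hypothetical, and the round-robin rule is only one assumed reading of "alternately administered"; the exact allocation procedure is not described in the text.

```python
from itertools import permutations

# The three patient-based instruments compared in this study.
QUESTIONNAIRES = (
    "patient-based translation",
    "self-constructed questionnaire",
    "Specific Activity Scale",
)


def questionnaire_orders():
    """Return every possible ordering of the three questionnaires (3! = 6)."""
    return list(permutations(QUESTIONNAIRES))


def assign_order(patient_index):
    """Assign an ordering to a patient by cycling through the six sets.

    Assumption: 'alternately administered' is read as a simple round-robin
    over consecutive patients (patient_index counts from 0).
    """
    orders = questionnaire_orders()
    return orders[patient_index % len(orders)]


if __name__ == "__main__":
    for i in range(6):
        print(i, " -> ", ", ".join(assign_order(i)))
```

With 86 patients, a round-robin of this kind would use each ordering 14 or 15 times, so each questionnaire appears about equally often in first, second, and third position.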
After completing the three questionnaires, the patients were asked two additional debriefing questions: “In your opinion, which questionnaire describes your physical functioning best?” and “Which questionnaire did you find easiest to answer?”
Cardiologist-assessed New York Heart Association class
The standard definition of the New York Heart Association classification (Table 1) was used by the treating cardiologists to assess patients’ New York Heart Association class following regular clinical guidelines.
Statistical analysis
For the patient-based translation of the New York Heart Association class, the scores were mapped directly to a New York Heart Association class. For the self-constructed questionnaire, the New York Heart Association classes were calculated by following an algorithm designed after consulting an expert cardiologist and following clinical guidelines in assessing the New York Heart Association class. Patients were categorised as New York Heart Association class I if they answered negatively to all questions, indicating that they were not at all physically limited. Patients who indicated that they were physically limited at heavy exertion were rated as New York Heart Association class II. Patients limited at ordinary exertion were rated as New York Heart Association class III. Patients were categorised as New York Heart Association class IV if they indicated that they had experienced at least one of the three cardiac symptoms at rest and that the discomfort experienced at rest increased when any physical activity was undertaken. In all, 38 patients (44.2%) completed the self-constructed questionnaire inconsistently. For example, some patients reported cardiac symptoms at ordinary, but not at heavy, exertion. For these 38 patients, the New York Heart Association class was blindly assessed by one of the cardiologists (BJMM) by manually rating the answers. For the Specific Activity Scale, we followed the original scoring procedure as developed by Goldman et al.Reference Goldman, Hashimoto, Cook and Loscalzo13
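The scoring rules for the self-constructed questionnaire can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm actually used: the data layout, the function names, the pooling of the three symptoms per exertion level, and the rule that any pattern not covered by the four class descriptions is returned as `None` (that is, referred for manual rating) are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

SYMPTOMS = ("fatigue", "dyspnoea", "palpitation")
LEVELS = ("heavy", "ordinary", "rest")


def make_answers(heavy: Tuple[str, ...] = (),
                 ordinary: Tuple[str, ...] = (),
                 rest: Tuple[str, ...] = ()) -> Dict[str, Dict[str, bool]]:
    """Hypothetical helper: build yes/no answers per symptom and exertion level."""
    present = {"heavy": heavy, "ordinary": ordinary, "rest": rest}
    return {s: {lvl: s in present[lvl] for lvl in LEVELS} for s in SYMPTOMS}


def nyha_from_answers(answers: Dict[str, Dict[str, bool]],
                      worse_with_activity: bool) -> Optional[int]:
    """Return NYHA class 1-4, or None when the answers need manual rating.

    Assumption: a patient counts as limited at a level if ANY of the three
    symptoms is reported at that level, and limitation at a lower exertion
    level is expected to imply limitation at all higher levels.
    """
    limited = {lvl: any(answers[s][lvl] for s in SYMPTOMS) for lvl in LEVELS}

    # Class I: every question answered with "no".
    if not any(limited.values()) and not worse_with_activity:
        return 1

    if limited["rest"]:
        # Class IV: symptoms at rest that increase with any activity.
        if limited["ordinary"] and limited["heavy"] and worse_with_activity:
            return 4
        return None  # inconsistent or not covered: manual rating

    if worse_with_activity:
        return None  # discomfort "at rest" worsens, but no symptom at rest

    if limited["ordinary"]:
        # Class III requires limitation at heavy exertion as well.
        return 3 if limited["heavy"] else None

    return 2  # limited at heavy exertion only
```

For example, `nyha_from_answers(make_answers(ordinary=("palpitation",)), False)` returns `None`, mirroring the inconsistent pattern described above (symptoms at ordinary but not at heavy exertion), which in the study was rated manually by a cardiologist.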
The association between patient- and cardiologist-assessed New York Heart Association class was calculated by the Spearman rank correlation coefficient and was interpreted as small (if smaller than 0.30), medium (if ranging from 0.30 to 0.50), or large (if bigger than 0.50).Reference Cohen14 To assess the agreement between patient- and cardiologist-assessed New York Heart Association class, we calculated percent agreement and weighted kappa, which was interpreted as slight (if 0.20 or lower), fair (if ranging from 0.21 to 0.40), moderate (if ranging from 0.41 to 0.60), or substantial (if 0.61 or higher).Reference Cohen15, Reference Landis and Koch16 Weighted kappa was used, as the inclusion of a weight variable enabled the calculation of kappa in SPSS, despite the unequal range of scores across types of raters, that is, cardiologists and patients.17 Since co-morbidity is known to affect self-reported health, we also explored the level of agreement for patients without co-morbidity.
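The three statistics named above can be illustrated with a small, dependency-free Python sketch. It is not the SPSS procedure used in the study. The function names are hypothetical, and linear disagreement weights are an assumption, since the weighting scheme is not specified in the text. Fixing the category set to classes 1 to 4 for both raters is one way to handle raters who do not use the same range of scores.

```python
def percent_agreement(patient, cardiologist):
    """Percentage of patients for whom both ratings are identical."""
    same = sum(p == c for p, c in zip(patient, cardiologist))
    return 100.0 * same / len(patient)


def weighted_kappa(patient, cardiologist, categories=(1, 2, 3, 4)):
    """Cohen's weighted kappa with linear disagreement weights (assumed).

    Raises ZeroDivisionError if expected disagreement is zero, for example
    when both raters place every patient in the same single class.
    """
    n, k = len(patient), len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    observed = [[0.0] * k for _ in range(k)]
    for p, c in zip(patient, cardiologist):
        observed[idx[p]][idx[c]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    obs_dis = exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)          # 0 on the diagonal, 1 at maximum distance
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]    # disagreement expected by chance
    return 1.0 - obs_dis / exp_dis


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return ranks


def spearman(patient, cardiologist):
    """Spearman rank correlation: Pearson correlation of the tie-averaged ranks."""
    ra, rb = average_ranks(patient), average_ranks(cardiologist)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5


def over_and_underestimation(patient, cardiologist):
    """Count patients rating themselves in a higher / lower class than the cardiologist."""
    over = sum(p > c for p, c in zip(patient, cardiologist))
    under = sum(p < c for p, c in zip(patient, cardiologist))
    return over, under
```

Restricting the two input lists to patients without co-morbidity before calling these functions would correspond to the subgroup analysis described above.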
Results
Patients
A total of 86 adult outpatients with a congenital malformation of the heart participated. The median age was 35.8 years, and more than half of the patients (53.5%) were women. Most patients (74.4%) worked at least part time. Patients were primarily diagnosed with the Marfan syndrome (26.7%), aortic coarctation (16.3%), valve malformation (15.1%), or Tetralogy of Fallot (12.8%; see Table 2). In all, 15 patients (17.4%) were categorised into “other congenital cardiac defects”, which comprised 11 different diagnoses, for example, Eisenmenger's syndrome, Ebstein's syndrome, and atrial septum defect. A total of 56 patients (65.1%) reported no co-morbidity, whereas 23 patients (26.7%) reported one co-morbidity. The most common co-morbidities were hypertension (9.3%), chronic back pain (5.8%), chronic obstructive pulmonary diseases (2.3%), and rheumatoid arthritis (2.3%). Patients with one or more co-morbidities were distributed across the New York Heart Association classes as follows: 17 patients in class I (27.9% of that class), 12 patients in class II (54.5%), and one patient in class III (33.3%).
Table 2 Patient characteristics.

Other congenital cardiac defects include, for example, Eisenmenger's syndrome, Ebstein's syndrome, and atrial septum defect
Comparison of patient- and cardiologist-assessed New York Heart Association class
Patient–cardiologist agreement and association for each questionnaire are presented in Table 3. The agreement between the patient-based translation and the cardiologist assessment was 75.6%. The patient-based translation correlated highly (Spearman rank correlation coefficient is 0.54), and agreed moderately (weighted kappa is 0.43) with the cardiologist-assessed New York Heart Association class. In 11 cases, the New York Heart Association class assessed by the patient-based translation was overestimated, as patients reported a higher New York Heart Association class compared with the cardiologist assessment, whereas in 10 cases New York Heart Association class was underestimated by patients. When calculating agreement, including only patients without co-morbidity (56 patients), the percentage agreement increased from 75.6% to 82.1%, and weighted kappa from 0.43 to 0.51.
Table 3 Patient–cardiologist agreement and association per questionnaire.

Data are presented as frequencies, unless stated otherwise
The agreement between the self-constructed questionnaire and cardiologist assessment was 70.6%, with a high correlation (Spearman rank correlation coefficient is 0.59) and moderate agreement (weighted kappa is 0.44). The self-constructed questionnaire led primarily to the overestimation of patient-assessed New York Heart Association class – 22 overestimations versus three underestimations. A similar increase in agreement levels was seen when agreement was calculated for only those patients without co-morbidity (from 70.6% to 78.2% and weighted kappa from 0.44 to 0.53).
The Specific Activity Scale agreed in 74.4% of the cases with the cardiologist assessment. There was a moderate correlation (Spearman rank correlation coefficient is 0.40) and a fair agreement (weighted kappa is 0.28). The Specific Activity Scale led to underestimation in 18 cases, and in only four cases to overestimation compared with the cardiologist-assessed New York Heart Association class. Again, agreement levels between the Specific Activity Scale and the cardiologist were calculated for patients without co-morbidity, showing a small increase in agreement percentages, from 74.4% to 78.6%, and a decrease in weighted kappa, from 0.28 to 0.18.
As shown in Table 3, in two occurrences there was maximal discrepancy between the patient and cardiologist, that is, the patient rated himself/herself in class IV, whereas the physician rated the patient in class I. Inspection of the data showed that the same patient was involved in both occurrences. Additional analyses showed that this patient reported having one co-morbidity, that is, “other illness”.
Debriefing questions
In all 18 (20.9%) and eight (9.3%) patients did not answer the first and second debriefing questions, respectively. Four (4.7%) and one (1.2%) patient(s) chose all three questionnaires. The first and second debriefing questions were thus completed following the instruction by 89.5% and 74.4% of the patients, respectively. The distribution of their preference is given in Table 4. Patients reported the Specific Activity Scale as the questionnaire best describing their functional status followed by the self-constructed questionnaire, and patient-based translation. Both the patient-based translation and Specific Activity Scale were reported as easiest to complete, followed by the self-constructed questionnaire.
Table 4 Answers to the debriefing questions for each questionnaire.

Data are presented in percentage (numbers)
Discussion
This study was conducted to explore which patient-based New York Heart Association class assessment agrees best with cardiologist-assessed New York Heart Association class and can be used in future research contexts. The patient-based translation was found to be the best choice in assessing the New York Heart Association class in congenital cardiac disease patients given its adequate agreement, its equal over- and underestimation, and its ease of completion. The patient-based translation can be used for research purposes; however, its 75.6% agreement with the cardiologist precludes its use in the individual case. The self-constructed questionnaire also showed adequate agreement, but led primarily to overestimation of the New York Heart Association class, whereas the Specific Activity Scale showed only fair agreement, and led primarily to the underestimation of the New York Heart Association class. Interestingly, for all three questionnaires agreement levels for patients without co-morbidity were higher than agreement levels for the total group.
In general, the agreement levels found in our study are higher than the agreement levels found in the four previous studies.Reference Ekman, Kjork and Andersson8–Reference Subramanian, Weiner, Gradus-Pizlo, Wu, Tu and Murray11 Goode et alReference Goode, Nabb, Cleland and Clark9 used a direct comparison that is comparable to the patient-based translation and found an agreement of kappa is 0.28. Similar to our study, there was equal over- and underestimation of patient-assessed New York Heart Association class. In the study by Goode et al, patients were referred to the cardiologist for the first time, whereas in our study the New York Heart Association class was assessed by the regularly treating cardiologist. Perhaps the latter cardiologists have more clinical data – for example, electrocardiography or echography – about the patient to base their rating on. Moreover, they may be better informed about possible co-morbidity and therefore might be more accurate. In addition, patients may also be better aligned to the physician owing to experience (“training effect”). These factors may have resulted in a higher level of agreement between patient- and cardiologist-assessed New York Heart Association class in our study.
Subramanian et alReference Subramanian, Weiner, Gradus-Pizlo, Wu, Tu and Murray11 assessed patient-based New York Heart Association class by means of the Kansas City Cardiomyopathy Questionnaire. A general nurse or a project coordinator assessed the New York Heart Association class on average 20 days after the patient assessment. An agreement level of 43%, with a weighted kappa of 0.28, was found. Factors likely to have contributed to this greater discrepancy between patient-based and cardiologist-assessed New York Heart Association class include a lower health literacy of the patients, assessment by a nurse or project coordinator instead of a cardiologist, and a time lapse of 20 days between assessments.
In the study by Kubo et al,Reference Kubo, Schulman, Starling, Jessup, Wentworth and Burkhoff10 patients with heart failure were interviewed by a physician assistant or nurse, who recorded their answers on a questionnaire. These answers were then categorised into the New York Heart Association classes by one of three independent raters. The New York Heart Association class assessed by the cardiologist was subsequently compared with the New York Heart Association class scored by each of these independent raters. Results showed agreement levels ranging from 57% to 65%, with weighted kappa scores ranging from 0.55 to 0.63. Contrary to our study, rater-based patient assessment underestimated the New York Heart Association class compared with cardiologist-assessed New York Heart Association class.Reference Goode, Nabb, Cleland and Clark9 The agreement levels found by Kubo et al were lower, whereas slightly higher kappa levels were reported compared to those found in our study. The raters of the patient-based assessments are health-care professionals, and thus they are likely to be more closely aligned to the cardiologists.
Ekman et alReference Ekman, Kjork and Andersson8 found an agreement of 32% between the Specific Activity Scale and the cardiologist-assessed New York Heart Association class. Kappa was not calculated. Similar to our study, the Specific Activity Scale scores primarily underestimated the New York Heart Association class compared with cardiologist assessment.Reference Ekman, Kjork and Andersson8 Similar to the study conducted by Goode et al,Reference Goode, Nabb, Cleland and Clark9 patients were referred to an outpatient heart failure clinic for the first time. The discrepancy between the Specific Activity Scale and the cardiologist-assessed New York Heart Association class might be explained by similar reasons formulated to explain the results of Goode et al.Reference Goode, Nabb, Cleland and Clark9
On a general note, patients with a congenital cardiac defect are born with their impairments, and as a consequence have visited a cardiologist their entire life, contrary to patients with heart failure. This might result in better alignment with their cardiologist in the interpretation of their symptoms and the assessment of the New York Heart Association class. Moreover, both the patient and the cardiologist assessed the New York Heart Association class on the same day, possibly further increasing agreement levels. In addition, in all four studies the presence of co-morbidity was not assessed and its influence on the level of agreement was therefore not explored. Co-morbidity is relevant for patients with heart failure as they are generally in the higher age ranges.
The finding that co-morbidity affects patient–cardiologist agreement in assessing the New York Heart Association class deserves further attention, especially since co-morbidity is common in patients with congenital cardiac disease. It might be hypothesised that the cardiologists filter out the impact of co-morbidity in assessing the New York Heart Association class, whereas patients do not. The maximal discrepancy in patient and cardiologist assessment found in this study might also be explained by the ability of the cardiologist to discriminate between congenital cardiac defect and co-morbidities, since the patient involved reported to have one co-morbidity. On the basis of the results of this study, we added an instruction to the patient-based translation (see Appendix 1) in which we ask patients to only consider functional impairments caused by their congenital cardiac defect.
The results of this study raise the following question: who should be the gold standard in assessing the New York Heart Association class, the cardiologist or the patient? Since the New York Heart Association class is based on the patient's subjective perceptions of disease-related restrictions in physical activity, and the patient is by definition the expert on these subjective perceptions, the patient can be considered the gold standard. In contrast, the ability of the cardiologists to filter out the impact of co-morbidity when assessing the New York Heart Association class argues for the cardiologist as the gold standard. One can also choose an empirical approach to this question. Future studies should examine whose subjective assessment, that is, cardiologist or patient, is most closely aligned with an objective measure of patients’ functional status such as an exercise test, for example, the 6-minute walking test. The assessment most closely related to that objective measure can then be considered the gold standard.
The limitations of this study merit attention. First, the sample size was too small to explore patient characteristics affecting the patient–cardiologist agreement. For example, it would have been interesting to compare the patients who underestimated versus overestimated their New York Heart Association class with regard to a number of background characteristics. However, the sample size was sufficiently large to explore whether co-morbidity affects patient–cardiologist agreement levels, by examining the group of patients without co-morbidity separately. Second, the focus of this study was on outpatients. As a consequence, the distribution of the New York Heart Association classes was skewed, with most patients being categorised in physician-based New York Heart Association classes I and II and none in class IV. This may have resulted in lower kappa levels.Reference Byrt, Bishop and Carlin18 More importantly, the results are only generalisable to patients with New York Heart Association classes I and II. Although it may not be too far fetched to expect that the simple, direct translation of the New York Heart Association classes would also work for the two higher classes, this needs to be confirmed in future studies with patients who have poorer function, such as hospitalised patients. Third, the distribution of congenital cardiac defects in our sample was not representative of the population of adults with congenital cardiac disease, since patients with the Marfan syndrome constituted the largest group. Fourth, we were unable to describe the patient sample with respect to a number of clinical characteristics, such as cardiac functioning and type of treatment. However, we did present data on the type of congenital cardiac disease, physician-based functional class and co-morbidity, allowing for some characterisation of the patient sample in clinical terms.
We would also like to highlight the strengths of our study. It is the first study that addresses outpatients with congenital cardiac disease, compares three patient-based questionnaires in a counterbalanced manner, and explores whether co-morbidity affects the patient–cardiologist agreement in the New York Heart Association class assessment. A final strength of this paper is that patients and their treating cardiologists completed the New York Heart Association assessment on the same day. This study shows that the simple and direct translation of the New York Heart Association class, as provided in Appendix 1, is a valuable patient-based tool that can be used in future studies of outpatients with congenital cardiac defects.
In summary, the patient-based translation with the instruction to only consider functional impairments caused by the congenital cardiac defect is recommended in future studies of outpatients with congenital cardiac disease, given its adequate agreement with cardiologist-assessed New York Heart Association class, its equal over- and underestimation, and its ease of completion.
Acknowledgements
We are very grateful to all participants. Moreover, we thank all participating congenital cardiologists from the Academic Medical Center in Amsterdam. This study was partly financed by the Interuniversity Cardiology Institute of the Netherlands.
Appendix
Patient-based translation

†This instruction was adapted on the basis of this study