As a result of improved survival of CHD patients,Reference Moons, Bovijn, Budts, Belmans and Gewillig 1 the population of adults with CHD is growing and ageing, with an estimated prevalence of three per 1000 adults.Reference van der Bom, Bouma, Meijboom, Zwinderman and Mulder 2 Adults accounted for two-thirds of the entire CHD population in 2010 in Quebec, Canada.Reference Marelli, Ionescu-Ittu, Mackie, Guo, Dendukuri and Kaouache 3 Residual anomalies and late-onset complications have increased the need for hospitalisations and health care utilisation.Reference Mackie, Pilote, Ionescu-Ittu, Rahme and Marelli 4 , Reference Opotowsky, Siddiqi and Webb 5 As CHD patients age, they develop an increased risk of life-long cardiacReference Lin, Liu, Wu, Chen, Chang and Chu 6 and cerebrovascular events. Moreover, care gaps, common in adult CHD, have increased the risk of cardiovascular events.Reference Mackie, Ionescu-Ittu, Therrien, Pilote, Abrahamowicz and Marelli 7 – Reference Yeung, Kay, Roosevelt, Brandon and Yetman 9
As a byproduct of patient care, vast quantities of information are collected and stored in administrative health databases for the purposes of registration, billing, or record-keeping. The reuse of patient data for research has gained considerable importance in non-adult CHD, as well as adult CHD, populations. Follow-up of CHD populations can be traced for decades using health administrative databases. Multiple years of data permit studying change over time for numerous variables. In addition, such data usually include vital statistics, physician visits, hospital discharge abstracts, pharmaceutical prescriptions, and claims data routinely filed by physicians. Administrative health data sources have thus emerged as an important source of population-based analyses for adult CHD patients in order to guide public health policy and resource allocation in industrialised countries.
Data reuse, also called secondary use of data, refers to studies whose purpose is not directly related to the initial reason for collecting data or to the care of the individual patient who is the subject of the health information. Such comprehensive and broad data sources, although not initially developed for the study of disease distribution or disease trends, offer the opportunity for population-level analyses. Data on population health differ between countries in terms of availability, size, and content. Denmark, for example, has gathered a wide range of data variables on all its citizens, including very comprehensive data on a patient's trajectory within the health system.Reference Frank 10 Such data sources are typically collected on national or state-wide levels. They are increasingly used to study adult CHD populations worldwide, mainly in developed countries where at least some portion of the population benefits from government-funded health insurance.
Against this backdrop, we carried out a systematic literature review to identify all the studies based on secondary use of administrative health data sources that provided new knowledge on adult CHD. Our purpose was to review the outcomes covered, the data source characteristics, and the strengths and limitations of administrative data sources used to address knowledge gaps in the adult CHD populations.
Materials and methods
Systematic search
This systematic review focuses on studies reusing administrative health data repositories. The review is in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).Reference Shamseer, Moher and Clarke 11
Search strategy
We conducted a comprehensive search of PubMed and Embase for relevant peer-reviewed publications from January 1, 2006, to January 1, 2016. The search strategy was developed by H.G. and S.C. with the help of a reference librarian. A comprehensive list of MeSH terms and keywords was used to query Medline and Embase (Supplementary Table 1). The search strategy also included screening of reference lists of relevant publications – the “snowball” search technique.
Eligibility criteria
Publications were selected if they met the following criteria: the study relied on administrative health data of any kind – e.g., expenditures, hospital discharges, claims, national survey, and death certificates – in the methodology, either for initiation of the research or for follow-up, regardless of outcomes – electronic medical record was not considered as “administrative health data”; the study population comprised adult CHD patients (⩾18 years) or included both adults and children but distinguished the two groups in the results and presented specific comments referring to adult CHD; and the study was published in a peer-reviewed journal in English.
CHD was defined according to the criteria of Mitchell et alReference Mitchell, Korones and Berendes 12 already used for several reviewsReference van der Bom, Bouma, Meijboom, Zwinderman and Mulder 2 – that is, “a gross structural abnormality of the heart or intrathoracic great vessels that is actually or potentially of functional significance”. Thus, we excluded publications dealing with non-structural lesions – such as cardiomyopathies and congenital arrhythmias – ductus arteriosus in premature infants, mitral valve prolapse, or isolated bicuspid aortic valve. We included patients with Marfan syndrome when they presented a complication that was “functionally significant” or required an invasive intervention as they have commonalities: the relatively low prevalence, the absence of curative treatment, the need for cardiac surgery, and the importance of life-long follow-up specialised care.
Study selection
Supplementary Figure 1 represents the study selection process. The PubMed and Embase searches yielded 2217 publications. After exclusion of 29 duplicates, the titles and abstracts of 2188 records were screened and assessed according to the following exclusion criteria: the adult population was not specifically studied or mentioned; the data source used was non-exhaustive or comprised a registry of patients volunteering to be included. As such, studies from the CONCOR registry or from tertiary centres with high-volume care of CHD, such as the databases from the CHD Program at the University Hospitals Leuven, Belgium, or from the Royal Brompton Hospital in London, United Kingdom, were excluded based on these criteria. Two of the authors, H.G. and S.C., independently read the first 50 abstracts to harmonise the search. Disagreements were resolved by consensus meetings. In case a database was used in several studies, all the corresponding articles were considered for review. Finally, 197 full-text publications were independently selected for eligibility assessment by both authors – H.G. and S.C. At this stage, 10 publications were added to the 197 after searching the reference lists of relevant publications. After a detailed review of the full text of these 207 eligible publications, 59 were finally included in this systematic review (Supplementary Fig 1).
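The selection flow can be checked with simple arithmetic. The short Python sketch below only restates the counts reported in this paragraph; the number of publications excluded at the full-text stage is not stated in the text and is derived here by subtraction, so it should be read as an inferred figure.

```python
# Counts reported in the study selection paragraph.
identified = 2217                  # records from the PubMed and Embase searches
duplicates = 29
screened = identified - duplicates

full_text_from_search = 197
added_from_reference_lists = 10    # "snowball" additions
full_text_assessed = full_text_from_search + added_from_reference_lists

included = 59
# Derived, not reported in the text: exclusions at the full-text stage.
excluded_at_full_text = full_text_assessed - included

print(screened, full_text_assessed, excluded_at_full_text)
```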
Data extraction
Data were extracted using a standardised collection form. Attention was given to the study characteristics: year of publication, data source and coverage, follow-up duration, definition of CHD diagnoses, study design, population included and its characteristics, exclusion criteria, objectives of the study (classified into categories), potential bias, and journal impact factor obtained from the Journal Citation Reports (Thomson Reuters, New York, New York, United States of America). We then grouped the selected articles according to the database used and briefly described each one: name, coverage, sponsoring organisation, data sources, and available data.
Results
Description of administrative health data sources used
To date, 59 studies relied on secondary use of administrative health databases to describe specific issues associated with adult CHD patients. Most of them originated in the United States of America (n=32; 55%) and Canada (n=17; 28%). Only four (7%) were from Europe and six (10%) from Asia (Supplementary Fig 2). In some countries, publications were derived from several administrative databases. Canadian publications originated mostly from the province-wide Quebec CHD database, in 15 out of 17 Canadian studies. American studies were derived from one state database, the California Office of Statewide Health Planning, and from five national databases: the Nationwide Inpatient Sample, the multiple cause of death (MCOD) public-use data file, the Pediatric Health Information System, the United Network for Organ Sharing, and the University Health System Consortium Clinical Database/Resource Manager.
The 59 studies were derived from 12 different data sources from six countries. Only two of them were CHD-specific – the Quebec CHD Database and the Danish Register of CHD – and were created by merging several national or provincial data sources before extracting CHD patients’ data. Table 1 shows the characteristics of each source. A unique personal identifier provided for every inhabitant is described in Denmark, in Canada, and in the United States of America with the United Network for Organ Sharing. In the Database from the Canadian Institute for Health Information, excluding Quebec, multiple hospitalisations for the same patient can be tracked with the use of this unique patient identifier. Further, in Quebec and Denmark, using this encrypted number, data from different administrative sources, for example, physicians’ claims database, hospital discharge summary database, death registry, and prescription registry, may be cross-linked for each patient to provide a global and longitudinal overview of a patient’s history. Thus, such databases contain patient-level data, whereas the Nationwide Inpatient Sample, for example, contains hospitalisation-level data. Among the 59 publications included, 34 (58%), derived from eight data sources, linked administrative data at a patient level, whereas 25 (42%), derived from two data sources, contained hospitalisation-level information (Table 1). Supplementary Results give further details on data sources from Quebec, the United States of America, Taiwan, and Denmark, which were particularly productive in reusing administrative databases in the field of adult CHD in recent years.
Table 1 Description of administrative health data sources used in the included articles.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180522111451555-0078:S1047951118000446:S1047951118000446_tab1.gif?pub-status=live)
NHS=National Health Service; OSHPD=Office of Statewide Health Planning and Development
Validation procedures and/or specific algorithms that merge several administrative data sources were used to increase the internal validity of CHD diagnoses in 37% (n=22) of the included studies from three data sources: Denmark, Quebec, and Taiwan.
Study characteristics related to methodology
Among the 59 publications included in this review, 24 used a cross-sectional design,Reference Mackie, Pilote, Ionescu-Ittu, Rahme and Marelli 4 , Reference Gurvitz, Inkelas, Lee, Stout, Escarce and Chang 13 – Reference Beausejour Ladouceur, Lawler and Gurvitz 35 15 used a cohort study design,Reference Patel, Weiss and Allen 36 – Reference Islam, Yasui, Kaul, Marelli and Mackie 50 nine were non-analytical descriptive,Reference Marelli, Ionescu-Ittu, Mackie, Guo, Dendukuri and Kaouache 3 , Reference Opotowsky, Siddiqi and Webb 5 , Reference Marelli, Mackie, Ionescu-Ittu, Rahme and Pilote 51 – Reference Maxwell, Maxwell and Wong 57 and one used a time-series analysis.Reference Ionescu-Ittu, Mackie and Abrahamowicz 58 In six studies, the authors constructed a matched control cohortReference Lin, Liu, Wu, Chen, Chang and Chu 6 , Reference Videbaek, Laursen, Olsen, Hofsten and Johnsen 59 – Reference Lowe, Therrien, Ionescu-Ittu, Pilote, Martucci and Marelli 63 using a non-CHD control population when available.Reference Lin, Liu, Wu, Chen, Chang and Chu 6 , Reference Videbaek, Laursen, Olsen, Hofsten and Johnsen 59 – Reference Maxwell, Wong, Kin and Lobato 62 Four studies combined cohort study and nested case–control study to assess the cumulative incidence or prevalence of an outcome, such as atrial arrhythmia, coronary artery disease, and stroke, as well as its predictors (Supplementary Table 2).Reference Mylotte, Pilote and Ionescu-Ittu 8 , Reference Lanz, Brophy, Therrien, Kaouache, Guo and Marelli 44 , Reference Bouchardy, Therrien and Pilote 64 , Reference Roifman, Therrien and Ionescu-Ittu 65 Because of the inherent nature of the data sources, all studies were retrospective. Study characteristics are summarised in Supplementary Table 2.
Study objectives and end points
Although all of these publications dealt with the adult CHD field, the questions they answered were not the same, as shown in Supplementary Table 3. Overall, five (8.5%) studies addressed demographics of the adult CHD population, such as prevalence, incidence, and sex ratio,Reference Marelli, Ionescu-Ittu, Mackie, Guo, Dendukuri and Kaouache 3 , Reference Thompson, Kuklina, Bateman, Callaghan, James and Grotegut 34 , Reference Islam, Yasui, Kaul, Marelli and Mackie 50 , Reference Marelli, Mackie, Ionescu-Ittu, Rahme and Pilote 51 , Reference Chan, Ting, Ho, Poon, Cheung and Cheng 53 , Reference Afilalo, Therrien, Pilote, Ionescu-Ittu, Martucci and Marelli 66 and 19 (32.2%) assessed specific issues in the adult CHD population. These were cardiovascular – atrial arrhythmia,Reference Bernier, Marelli and Pilote 37 , Reference Wu, Lu, Chen, Chiu, Kao and Huang 48 , Reference Bouchardy, Therrien and Pilote 64 pulmonary hypertension,Reference Lowe, Therrien, Ionescu-Ittu, Pilote, Martucci and Marelli 63 coronary artery disease,Reference Roifman, Therrien and Ionescu-Ittu 65 heart failure,Reference Rodriguez, Moodie and Parekh 25 and strokeReference Lanz, Brophy, Therrien, Kaouache, Guo and Marelli 44 , Reference Wu, Chen, Kao and Huang 47 , Reference Nyboe, Olsen, Nielsen-Kudsk and Hjortdal 60 – or non-cardiovascular – maternal issues,Reference Opotowsky, Siddiqi, D’Souza, Webb, Fernandes and Landzberg 20 , Reference Thompson, Kuklina, Bateman, Callaghan, James and Grotegut 34 , Reference Hassan, Patenaude, Oddy and Abenhaim 43 pneumonia,Reference Nyboe, Olsen, Nielsen-Kudsk, Johnsen and Hjortdal 61 non-alcoholic cirrhosis,Reference Krieger, Moko and Wu 24 cancer,Reference Lee, Chen and Jeng 45 dementia, gastrointestinal bleed, and chronic kidney disease.Reference Afilalo, Therrien, Pilote, Ionescu-Ittu, Martucci and Marelli 66 As the data were routinely derived from health insurance claims, resource utilisation and cost/cost-effectiveness were reported in, respectively, 26 and 12 articles. Resource utilisation included lengths of stay in hospitalisation and/or ICU, total hospital charges, emergency department visits, outpatient visits (to general practitioner or cardiologist), number of procedures, and number of admissions. In 16 articles (24%), the authors focused on patterns of care, comparing specialised versus non-specialised centres,Reference Mylotte, Pilote and Ionescu-Ittu 8 , Reference Gurvitz, Inkelas, Lee, Stout, Escarce and Chang 13 , Reference Fernandes, Chamberlain and Grady 29 , Reference Maxwell, Maxwell and Wong 57 teaching versus non-teaching hospitals,Reference Opotowsky, Landzberg, Kimmel and Webb 14 , Reference Karamlou, Diggs, Ungerleider, McCrindle and Welke 54 adult versus paediatric hospitals,Reference Bhatt, Rajabali, He and Benavidez 27 or even analysing the impact of hospital volumeReference Opotowsky, Landzberg, Kimmel and Webb 14 , Reference Kim, Gauvreau, Bacha, Landzberg and Benavidez 16 , Reference Bhatt, Patel and Patel 28 , Reference Hayward, Dewland and Moyers 30 , Reference Singh, Badheka and Patel 32 , Reference Maxwell, Steppan and Cheng 46 , Reference Karamlou, Diggs, Ungerleider, McCrindle and Welke 54 , Reference Lu, Agrawal, Lin and Williams 56 on outcomes. Finally, these databases covered long periods of follow-up, which made it possible to assess temporal trends in half of the articles (n=30, Supplementary Table 3).
Publication trends and journal impact factor
No qualifying study was published before 2007. From 2007 to 2015, the annual number of articles rose more than fivefold, from three articles in 2007 to 16 from four different countries in 2015 (Fig 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180522111451555-0078:S1047951118000446:S1047951118000446_fig1g.jpeg?pub-status=live)
Figure 1 Number of articles on adult CHD patients using administrative databases published per year and per country. NIS=Nationwide Inpatient Sample.
Journal impact factors for all 59 articles ranged from 0.825 to 17.047, with a median of 4.04 and an interquartile range of 3.15 to 7.44. The impact factors according to the source of publications are shown in Table 2. Articles derived from the Denmark, Quebec, and United States of America non-Nationwide Inpatient Sample data sources were published in very-high-impact-factor journals (impact factor >13.5) (Fig 2a). Figure 2b represents the cumulative number of articles and their impact factor according to the year of publication. By 2015, 11 (18.6%) of the 59 articles had been published in very-high-impact-factor journals.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180522111451555-0078:S1047951118000446:S1047951118000446_fig2g.jpeg?pub-status=live)
Figure 2 Distribution of journal impact factors (IF) according to ( a ) the source of the publication and ( b ) the year of publication. NIS=Nationwide Inpatient Sample.
Table 2 Journal impact factors according to the source of publications.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180522111451555-0078:S1047951118000446:S1047951118000446_tab2.gif?pub-status=live)
NIS=Nationwide Inpatient Sample
Discussion
Although generally developed for registration and billing purposes, administrative health data are rich sources of information and were extensively used over the past decade in adult CHD research in a growing number of countries. This systematic review highlights the differences between data sources used with a wide variation in the availability of patient-level compared with hospitalisation-level episodes of care or in the availability of internal validation of diagnoses. In addition, the characteristics of the studies are influenced by the structure of the data sources from which they were derived. Thus, such data sources were particularly used for assessing resource utilisation and temporal trends of outcome end points over long observation periods.
In this review, we excluded all registries assembled as primary data sources even if they covered large adult CHD populations. For example, since November 2001 in the Netherlands, adult CHD patients have been recruited in tertiary and secondary medical centres and included in the nationwide CONCOR registry.Reference van der Velde, Vriend, Mannens, Uiterwaal, Brand and Mulder 67 Thus, this registry makes it possible to study only a survival cohort of patients and comprises neither very mild CHD in patients lost to medical surveillance nor critical CHD that led to death before enrolment. Moreover, the CONCOR registry has an inclusion bias because of its recruitment in tertiary and secondary medical centres. The same limitations exist in the CHD database of Leuven, a Belgian tertiary medical centre,Reference Moons, Bovijn, Budts, Belmans and Gewillig 1 or in the Royal Brompton Hospital in London.Reference Brida, Dimopoulos and Kempny 68 On the other hand, by their nature, these sources provide highly detailed data for clinical assessment. Here, we focused on studies relying on administrative health databases to provide new knowledge on adult CHD. In areas in which access to health care is universal, as in Taiwan, Quebec, or Denmark, the entire population is covered by the data collection, and non-participation is minimal. This is a paramount difference compared with registries or clinical databases in which participation is voluntary and therefore biased. Owing to their comprehensiveness and their large size, these databases can generate sample sizes usually not available in single or even multi-institutional databases. This is particularly helpful and relevant when studying rare diagnoses or procedures.
As shown, research based on administrative data sources can be done at a patient level or at a hospitalisation level depending on the characteristics of the data source. The use of a consistent set of identifiers in administrative health databases allows researchers to build histories of individuals. In the Quebec CHD Database, the three available province-wide administrative databases were linked using patients’ unique encrypted health care insurance numbers.Reference Marelli, Mackie, Ionescu-Ittu, Rahme and Pilote 51 In Denmark, the number assigned to all residents at birth or upon immigration is included in all national registers, and the Danish Government has gathered nearly 200 databases, on everything from medical records to socio-economic data on jobs and salaries.Reference Frank 10 Going even further, the Danish Biobank Register covers more than 22 million samples from 5.4 million individuals, linked to national administrative registries at an individual level. Conversely, in other countries, patient identifiers may change over time, making longitudinal analyses more difficult or impossible. Indeed, the unit of measure in the Nationwide Inpatient Sample system is the hospital stay, not the patient timeline. The Nationwide Inpatient Sample does not identify individual patients, and recurrent hospitalisations appear as distinct observations.Reference Khera and Krumholz 69
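The difference between the two levels of analysis can be illustrated with a minimal Python sketch. The records, identifiers, and function names below are hypothetical and are not drawn from any of the data sources reviewed. The sketch only shows why a stable identifier is needed to turn a list of hospital stays into patient histories.

```python
from collections import defaultdict

# Hypothetical discharge records: (encrypted_patient_id, admission_date, diagnosis)
discharges = [
    ("p01", "2004-03-02", "atrial arrhythmia"),
    ("p02", "2004-05-11", "heart failure"),
    ("p01", "2009-08-20", "stroke"),
]

def hospitalisation_level(records):
    """Without an identifier, each stay is a separate observation."""
    return len(records)

def patient_level(records):
    """Group stays by identifier to rebuild each patient's history in date order."""
    histories = defaultdict(list)
    for patient_id, admitted, diagnosis in records:
        histories[patient_id].append((admitted, diagnosis))
    # ISO-formatted dates sort chronologically as plain strings.
    return {pid: sorted(stays) for pid, stays in histories.items()}

histories = patient_level(discharges)
print(hospitalisation_level(discharges))  # number of stays
print(len(histories))                     # number of distinct patients
```

With hospitalisation-level data only the first count is available, so a readmission cannot be told apart from a new patient.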
This difference of level of analysis is one of the factors that appear to be reflected in the journal impact factor of published studies: Quebec, Denmark, and the United States non-Nationwide Inpatient Sample reach the highest impact factors where patient-level analyses are possible. Another contributing factor to high-impact publications appears to be related to the use of validation procedures to migrate raw data from administrative data sources to “clean” data assembled in the form of a database (Quebec). Publications in high-impact-factor journals reflect enhanced study quality and rigour, with the increase in the cumulative impact factor of published studies over time being an encouraging indicator of the growing quality of studies being produced in adult CHD research with such data sources.
The absence of shared identifiers between the administrative health database and other data sources – for example, a clinical research file – prevents record linkage among heterogeneous data sources. Thus, outpatient data or vital status is not available in United States databases, except for United Network for Organ Sharing, if death did not occur at hospital. In addition, there is usually no link with outpatient clinical data.Reference Khera and Krumholz 69 Methodologies using probabilistic linkage based on variables such as admission date, discharge date, patient sex, and patient date of birth have been developed to merge information from different origins. For example, this methodology has been used to merge information from a registry, the Society of Thoracic Surgeons (STS) Congenital Heart Surgery Database, with a paediatric administrative data set, the Pediatric Health Information Systems (PHIS).Reference Pasquali, Jacobs and Shook 70 Similarly, the availability of direct identifiers allowed linkage of the Pediatric Cardiac Care Consortium data with the National Death Index and the United Network for Organ Sharing, thereby providing significant information regarding the long-term outcomes after surgical procedures.Reference Spector, Menk and Vinocur 71 However, until now, none of these methods has been used to assess specific issues in adults.
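As an illustration of linkage on indirect identifiers, the Python sketch below scores candidate record pairs on the four variables named above. It is a deliberately simplified stand-in and not the method used in the cited studies: the agreement weights, the threshold, and all field names are assumptions, whereas a real probabilistic linkage would estimate weights from the data.

```python
# Hypothetical agreement weights and cut-off, chosen for illustration only.
WEIGHTS = {"admission_date": 4.0, "discharge_date": 4.0, "birth_date": 5.0, "sex": 1.0}
THRESHOLD = 10.0

def match_score(a, b):
    """Sum the weights of the fields on which two records agree."""
    return sum(
        weight
        for field, weight in WEIGHTS.items()
        if a.get(field) is not None and a.get(field) == b.get(field)
    )

def link(registry, admin):
    """Pair each registry record with its best-scoring administrative record,
    keeping the pair only if the score reaches the threshold."""
    links = []
    if not admin:
        return links
    for i, reg in enumerate(registry):
        best_score, best_j = max(
            ((match_score(reg, adm), j) for j, adm in enumerate(admin)),
            key=lambda pair: pair[0],
        )
        if best_score >= THRESHOLD:
            links.append((i, best_j))
    return links

registry = [
    {"admission_date": "2010-01-05", "discharge_date": "2010-01-12",
     "birth_date": "2009-11-30", "sex": "F"},
    {"admission_date": "2011-06-01", "discharge_date": "2011-06-03",
     "birth_date": "2011-05-20", "sex": "M"},
]
admin = [
    # Same stay with a discharge date that differs by one day.
    {"admission_date": "2010-01-05", "discharge_date": "2010-01-13",
     "birth_date": "2009-11-30", "sex": "F"},
    {"admission_date": "2012-02-02", "discharge_date": "2012-02-04",
     "birth_date": "2011-12-25", "sex": "M"},
]

print(link(registry, admin))
```

Tolerating one disagreeing field, as the assumed threshold does here, trades some false matches for fewer missed links. That trade-off is why linked data sets generally need validation.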
As CHD is associated with life-long co-morbidities, and also benefits from life-long specialised care, longitudinal studies across the lifespan are essential.Reference Afilalo, Therrien, Pilote, Ionescu-Ittu, Martucci and Marelli 66 In fact, as shown in this systematic review, data are usually available over a large number of years, facilitating longitudinal studies in which unique identifiers can be followed up over time. Follow-up can be traced for years or decades in order to analyse CHD-specific surgical outcomes, adverse events, co-morbidities, or practice patterns. Using the Quebec CHD Database from 1990 to 2005, Mylotte et al showed a significant increase in referrals to specialised adult CHD centres following the introduction of clinical guidelines.Reference Mylotte, Pilote and Ionescu-Ittu 8 This change in clinical practice was independently associated with reduced mortality. Such extensive historical data are important for actionable policy-driven decision-making. This is underscored by studies that monitor health care utilisation, cost of disease burden,Reference Mackie, Pilote, Ionescu-Ittu, Rahme and Marelli 4 , Reference Opotowsky, Siddiqi and Webb 5 , Reference Islam, Yasui, Kaul and Mackie 49 , Reference Islam, Yasui, Kaul, Marelli and Mackie 50 , Reference Lu, Agrawal, Lin and Williams 56 and inadequate access to services,Reference Marelli and Gurvitz 72 , Reference Marelli, Therrien, Mackie, Ionescu-Ittu and Pilote 73 for which administrative data sources are ideally suited.Reference Riehle-Colarusso, Bergersen and Broberg 74 Thus, administrative health databases are a powerful tool to assess patient management and outcomes and to further develop quality of care improvement programmes.
Diagnostic validity has been an important criticism of CHD studies using administrative data sources.Reference Khera and Krumholz 69 Indeed, in this review, only 37% of studies reported validation procedures. Overall, in this review, all published articles used International Classification of Diseases codes (Eighth, Ninth, or Tenth Revision) to identify CHD patients, which often lack sufficient detail to adequately characterise specific CHD phenotypes or procedures; for example, there is no International Classification of Diseases code for a Norwood procedure. The lack of granularity in the coding schemes – for example, detailed anatomic diagnoses or procedures – and the lack of standardised definitions may give a coarse overview of the diagnoses or the patient’s clinical status. Hence, researchers are limited to investigating broad classes of defects such as severe CHD,Reference Marelli, Ionescu-Ittu, Mackie, Guo, Dendukuri and Kaouache 3 , Reference Marelli, Mackie, Ionescu-Ittu, Rahme and Pilote 51 simple CHD,Reference Videbaek, Laursen, Olsen, Hofsten and Johnsen 59 univentricular,Reference Karamlou, Diggs and Welke 19 , Reference Collins, Fram, Tang, Robbins and Sutton 22 , Reference Krieger, Moko and Wu 24 , Reference Collins, Fram, Tang, Robbins and St John Sutton 26 , Reference Seckeler, Moe and Thomas 31 , Reference Tabtabai, DeFaria Yeh, Stefanescu, Kennedy, Yeh and Bhatt 33 or valvular diseases.Reference Ionescu-Ittu, Mackie and Abrahamowicz 58 More rarely, lesions with an unequivocal definition, such as coarctation of the aortaReference Bhatt, Patel and Patel 28 , Reference Wu, Chen, Kao and Huang 47 , Reference Roifman, Therrien and Ionescu-Ittu 65 or tetralogy of Fallot,Reference Wu, Lu, Chen, Chiu, Kao and Huang 48 have been studied specifically.
In some studies, lesion-specific algorithms substantially enhance the quality of the work relating to atrial septal defect to distinguish it from persistent foramen ovale.Reference Mylotte, Quenneville and Kotowycz 41 Even when the relevant code exists, however, there may be errors due to the coding process. Physicians may lack expertise in the International Classification of Diseases terminology. In jurisdictions where coding is done by administrative personnel, coding errors may occur owing to staff’s limited medical knowledge or because of poor documentation in the medical record, leading to variations in the quality of administrative data on diagnosis.Reference Cronk, Malloy and Pelech 75 , Reference Frohnert, Lussky, Alms, Mendelsohn, Symonik and Falken 76 In Quebec, authors minimised misclassification bias by using all available data for a given subject, including inpatient, outpatient, procedural, and provider information. From these, they developed an algorithm, and tested it by manually auditing almost a third of the files.Reference Marelli, Mackie, Ionescu-Ittu, Rahme and Pilote 51 In Taiwan, to minimise misclassification bias, at least two corresponding codes were required for confirmation if CHD codes originated from outpatient data, also using procedural codes to enhance diagnostic accuracy.Reference Lin, Liu, Wu, Chen, Chang and Chu 6 , Reference Chen, Hsiao and Cheng 42 In Denmark, included patients’ hospital records were validated to secure a correct diagnosis and status.Reference Nyboe, Olsen, Nielsen-Kudsk and Hjortdal 60 , Reference Nyboe, Olsen, Nielsen-Kudsk, Johnsen and Hjortdal 61
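A case-confirmation rule of the kind described for Taiwan can be sketched in a few lines of Python. This is an illustrative simplification rather than the published algorithm: the function names are hypothetical, the rule that a single inpatient code is sufficient is an assumption, and the procedural codes mentioned above are not modelled. The ICD-9 range 745 to 747 (congenital anomalies of the heart and circulatory system) is used as the CHD code set.

```python
def is_chd_code(icd9_code):
    """True for ICD-9 categories 745-747 (congenital heart and circulatory anomalies)."""
    return icd9_code.split(".")[0] in {"745", "746", "747"}

def confirmed_chd(claims, min_outpatient=2):
    """Apply the confirmation rule to one patient's claims.

    claims: list of (setting, icd9_code), with setting 'inpatient' or 'outpatient'.
    Assumption: one inpatient CHD code is enough, whereas outpatient CHD codes
    must appear at least `min_outpatient` times.
    """
    inpatient = sum(1 for setting, code in claims
                    if setting == "inpatient" and is_chd_code(code))
    outpatient = sum(1 for setting, code in claims
                     if setting == "outpatient" and is_chd_code(code))
    return inpatient >= 1 or outpatient >= min_outpatient

# Hypothetical patients
single_outpatient = [("outpatient", "745.5"), ("outpatient", "427.31")]
repeated_outpatient = [("outpatient", "745.5"), ("outpatient", "745.5")]
one_inpatient = [("inpatient", "747.10")]

print(confirmed_chd(single_outpatient))
print(confirmed_chd(repeated_outpatient))
print(confirmed_chd(one_inpatient))
```

Requiring a repeated outpatient code is a common way to guard against "rule-out" diagnoses that are entered once and never confirmed, at the cost of missing patients seen only once.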
As administrative databases also contain diagnostic codes for co-morbid conditions, they are a source of misclassification. For example, the studies presented in this review have examined conditions including dementia, gastrointestinal bleed, chronic kidney disease,Reference Afilalo, Therrien, Pilote, Ionescu-Ittu, Martucci and Marelli 66 stroke,Reference Lanz, Brophy, Therrien, Kaouache, Guo and Marelli 44 , Reference Wu, Chen, Kao and Huang 47 or coronary artery disease.Reference Roifman, Therrien and Ionescu-Ittu 65 Depending on the clinical question, inherent limitations in this type of data include the lack of accurate assessment of unmeasured confounders including smoking status, alcohol, drug abuse, and obesityReference Lee, Chen and Jeng 45 , Reference Wu, Chen, Kao and Huang 47 or absence of detail on left- versus right-sided heart failure and specifics of prosthetic materials.Reference Lanz, Brophy, Therrien, Kaouache, Guo and Marelli 44 Similarly, family history, lifestyle factors, and drug prescription information were available only in a few nationwide integrated data systems.Reference Nyboe, Olsen, Nielsen-Kudsk and Hjortdal 60 , Reference Nyboe, Olsen, Nielsen-Kudsk, Johnsen and Hjortdal 61
Records in administrative health databases only include data for individuals who use the services during the period of interest. Those without access to care, those who failed to encounter the health care system during the study period, or those who may have migrated may thus not be captured. In areas in which access to care is universal as in Taiwan, Quebec, or Denmark, and with long follow-up periods, this bias is minimal.Reference Marelli, Mackie, Ionescu-Ittu, Rahme and Pilote 51 , Reference Nyboe, Olsen, Nielsen-Kudsk, Johnsen and Hjortdal 61 Conversely, in countries where health care is supported by private insurance, the information extracted from the administrative health databases may be influenced by access to care and insurance status, socio-economic level, and ethnicity, thus limiting the generalisability of findings to other countries with different structures for access to care.Reference Marelli, Ionescu-Ittu, Mackie, Guo, Dendukuri and Kaouache 3 , Reference Beausejour Ladouceur, Lawler and Gurvitz 35 , Reference Maxwell, Steppan and Cheng 46 , Reference Bouchardy, Therrien and Pilote 64 Recently, based on this knowledge, Gilboa et alReference Gilboa, Devine and Kucik 77 estimated the CHD prevalence across all age groups in the United States of America by extrapolating from the population of Quebec and applying a race–ethnicity adjustment factor.
Limitations
Some limitations of the present systematic review need to be discussed. First, some details on the databases used in these studies are not reported in the published articles and are available only on websites. Moreover, the specificities of each administrative health database depend on those of the corresponding health care system, which may not be extensively described in each study. Second, the analyses were made at the article level, not at the database level. We recognise that most of the sources have led to several articles. Thus, sources may be over-represented when they originated from research groups that are highly productive in terms of publications. However, we did not have source-level data available for analysis. Third, because this systematic review was designed to study the contribution of a methodology, the use of administrative health data sources in adult CHD research, a meta-analysis or other statistical analyses were not applicable. Finally, we reported an increasing number of studies using an administrative health database in the field of adult CHD around the industrialised world, but such studies do not capture the growing population of adult CHD patients in underdeveloped countries.Reference Webb, Mulder and Aboulhosn 78
In conclusion, this systematic literature review focuses on the secondary use of administrative health data sources for adult CHD research purposes in industrialised countries. With the increasing access and use of these data sources, understanding their features and limitations is critical to ensure appropriate interpretation and extends beyond the scope of adult CHD. Although not designed for research purposes, such data sources can be particularly useful for the assessment of population-level epidemiology, outcomes, and health services research over long observation periods, providing a powerful tool to further develop quality of care improvement programmes. Study quality is enhanced with validation procedures, unique identifiers over time, and comprehensive data capture. Prevailing limitations include diagnostic accuracy in specific subgroups, unmeasured confounders, and lack of clinically relevant patient-level data. Geographic variations in health insurance limit generalisability between jurisdictions. In the future, efforts to standardise diagnostic coding will facilitate data pooling, integration, and reuse of existing data at a supranational level to compare and aggregate results where relevant. Interoperability, quality control, validation, and merging with clinical data sources would optimise the specificity and validity of study findings. Data are increasingly covering a variety of modalities, including administrative databases, electronic medical records, clinical registries, research data sets, monitoring systems, and biobanks. Harmonising data collection will improve the translational potential of adult CHD research.
Acknowledgements
The authors thank Sophie Guiquerro for her help in building the search algorithm and Sarah Zohar for her comments.
Financial Support
S.C. was supported by the French Federation of Cardiology, Institut Servier, and the Fondation pour la Recherche Médicale. A.J.M. is supported by the Canadian Institute of Health Research, the Heart and Stroke Foundation of Canada, and the Fonds de recherche du Québec – Santé.
Conflicts of Interest
None.
Ethical Standards
The authors assert that all procedures contributing to this study comply with the ethical standards of relevant national guidelines on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008, and have been approved by the institutional ethics committee of the Queens University of Belfast.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S1047951118000446