In many areas of medicine, patient populations are so heterogeneous that single-centre and even multi-institutional studies fall short in answering key clinical questions. As a result, it becomes challenging to clearly identify best practices and to assure that patients cared for at different institutions receive the standard of care – in fact, the “standard of care” is often elusive to define. Numerous clinical registries focused on care quality have emerged in the past several decades. The Pediatric Cardiac Critical Care Consortium (PC4) is a quality-focused clinical database gathering data related to the ICU stay for both surgical and medical paediatric patients with cardiac disease [1].
Since the collaborative's inception, the data coordinating centre of PC4 has recognised that the mission of the consortium – transparent collaboration, cross-boundary partnerships, and evidence-based advancement – can only be achieved if the data collected are accurate, complete, and timely. The data audit process put in place within PC4 is rigorous and extensive. In 2016, Gaies et al. [2] reported on the initial data audits undertaken by the data coordinating centre at the eight founding sites. Since that time, PC4 has expanded to over 50 institutions, with a total of nearly 100,000 individual cardiac ICU admissions included in the database (Fig 1).
The database itself has undergone two revisions, and the audit process has also been refined to address both the database updates and the scientific and quality improvement needs of the consortium. We believe it is important to be publicly transparent regarding the data integrity of the clinical registry so that readers of our published work can judge the analyses in context. This analysis updates the consortium’s audit process and audit outcomes after 6 years of data submission.
Materials and methods
The clinical registry and data collection process
The clinical champions (cardiac ICU physicians) and data entry personnel who make up each institution's PC4 team undergo extensive training prior to beginning data entry. Each institution receives a two-day training on the database and data entry process, which includes a review of the current data definitions manual. (Prior to the COVID-19 pandemic, these trainings were all conducted in person at the participating institution; training is currently completed remotely. Only pre-COVID data are presented here.) Any staff who will participate in data entry are required to pass a certification exam prior to submitting cases via the web-based data entry portal. Paper case report forms are available, although some sites elect to enter data directly online or to use a combination of paper and direct entry methods.
Supporting site data entry personnel remains a high priority to optimise accurate data collection. Weekly conference calls allow data entry staff to request definition clarification and/or assistance with entering data for unusual clinical circumstances. Frequently asked questions are documented, and the answers are available on the PC4 website. In addition, the PC4 lead data manager, executive director, and clinicians from the database committee are available on conference calls and via email for additional queries. Finally, sessions at the annual PC4 conference focus specifically on optimising accurate data entry and improving local internal data auditing practices. The growing number of databases focused on the care of paediatric cardiac patients requires that the data entry personnel for these various registries at any given site maintain routine communication to assure that the data are consistent across platforms.
Data in a number of epidemiologic, clinical, and outcomes areas are entered into the registry; a non-exhaustive list of data categories and sub-categories is shown in Table 1. In total, a complex surgical admission can include as many as 473 data fields, and a complex medical admission can include up to 428 data fields. Over 90% of the database fields are mandatory for case submission.
The audit process, content, and quantitative analysis
The audit process as initially created rested on a foundation of onsite audits completed by members of the data coordinating centre. Audits include two areas of focus: (1) census adjudication to assure that all qualifying cases were included in the database and (2) data verification. The cases to be audited are randomly selected by the data coordinating centre, and auditor access to the electronic medical record is arranged by the site's PC4 personnel in conjunction with their information technology team. Data are collected in compliance with the Health Insurance Portability and Accountability Act and transferred to the data coordinating centre using an encrypted and secure communication system. Both blinded abstraction (an auditor comparing the medical record to the data entered in the database) and source data verification (an auditor asking the onsite data entry personnel to retrieve the data for each audited field from the medical record and comparing that to what was actually entered into the database) are utilised during each audit visit. The minimum goal for each audit is to review 60 discrete ICU admissions or 10% of annual case submissions, whichever is fewer. Any discrepancies identified are adjudicated on site to confirm that they are true errors. Additionally, any systematic issue, such as a misunderstanding of a definition that resulted in repeated errors, results in a follow-up report to the site that identifies all cases where the error was made, allowing for correction of all affected records. At the conclusion of the audit, the audit team meets with the onsite clinical champion(s) and data entry personnel to review overall findings and provide suggestions for improved data entry as needed. In the following weeks, the data coordinating centre produces a formal report for the site and assigns a final “score” based on discrepancy rate (accuracy), completeness, and timeliness of data entry. The overarching goal of the audit process, in addition to assuring data accuracy, is to identify and correct systematic data entry errors and to provide supportive recommendations to the site team. It is expressly not meant to be a punitive process.
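For illustration only, the sampling rule described above can be expressed as a simple calculation; the function below is a minimal sketch under our own naming, not part of any PC4 tooling.

```python
def audit_sample_size(annual_submissions: int) -> int:
    """Minimal sketch of the PC4 audit sampling goal: review 60 discrete
    ICU admissions or 10% of annual case submissions, whichever is fewer.
    (Illustrative only; not PC4 software.)"""
    return min(60, round(0.10 * annual_submissions))

# e.g. a site submitting 400 cases per year would target 40 audited
# encounters, while a site submitting 1,000 would target the 60-case cap.
print(audit_sample_size(400))   # 40
print(audit_sample_size(1000))  # 60
```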
The data coordinating centre, along with the database and executive committees, identified the original subset of data fields to be audited, focusing on fields of greatest value in assessing care quality (e.g. dates/times that establish length of stay, mortality, and complications associated with significant morbidity) as well as data elements where a higher degree of inaccuracy might be expected. Specific criteria were developed to define a clinically important discrepancy for each audited field (e.g. a 500 g error in weight is significantly different for a neonate than for a teenager), and discrepancies were coded as major or minor. For example, missing an arrhythmia episode that required therapy is a major discrepancy, while entering the date of therapy initiation incorrectly is a minor discrepancy. Each institution's audit report includes a measurement of overall accuracy ([fields without a major or minor discrepancy/total audited fields] × 100) and major discrepancy rate ([fields with a major discrepancy/audited fields with potential for a major discrepancy] × 100).
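As a worked illustration of these two scores, consider a hypothetical audit (the counts below are invented for clarity and do not come from any PC4 site) of 5,000 fields with 25 minor and 15 major discrepancies, where 4,000 of the audited fields carried the potential for a major discrepancy:

\[
\text{Overall accuracy} = \frac{5000 - (25 + 15)}{5000} \times 100 = 99.2\%
\]
\[
\text{Major discrepancy rate} = \frac{15}{4000} \times 100 = 0.375\%
\]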
Initial site audits are completed within 9–12 months of the initiation of data entry. If sites pass the initial audit, subsequent audits are completed at 2−3 year intervals (the decision making on exact timing is reviewed below). The executive committee of PC4 defined a passing score as >97% overall accuracy and a major discrepancy rate of <1.5%. Additionally, near real-time data entry is a significant emphasis within PC4, as one particularly valuable benefit of collaborative participation is the ability to use the web-based data analysis platform to assess institutional performance across numerous quality metrics and to identify high-performing hospitals. This makes data entry timeliness an essential component of participation in the collaborative, with a goal of 90% of cases submitted within 30 days of hospital discharge. The audit measures whether the participating site has met this goal at the time of audit. In addition to the objective audit scoring rubric, the audit team also makes a qualitative assessment based on several criteria, including but not limited to the site personnel's demonstrated familiarity with database definitions, the prevalence of errors noticed in non-audited fields, the engagement level of the clinical champion(s), and the data entry personnel's fund of knowledge in paediatric cardiac critical care.
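Taken together, the objective passing criteria amount to a simple conjunction; the following is a minimal sketch under assumed names (not PC4 software) of how a site's objective result could be checked.

```python
def passes_objective_criteria(overall_accuracy_pct: float,
                              major_discrepancy_pct: float,
                              timely_submission_pct: float) -> bool:
    """Sketch of the objective PC4 passing criteria described above:
    >97% overall accuracy, <1.5% major discrepancy rate, and 90% of
    cases submitted within 30 days of hospital discharge.
    (Function and argument names are illustrative assumptions.)"""
    return (overall_accuracy_pct > 97.0
            and major_discrepancy_pct < 1.5
            and timely_submission_pct >= 90.0)

# e.g. a site with 99.4% accuracy, 0.52% major discrepancies, and 93%
# timely submission would pass the objective portion of the audit.
print(passes_objective_criteria(99.4, 0.52, 93.0))  # True
```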
The audit team debriefs with the PC4 site personnel and the data coordinating centre after each audit to communicate general findings and discuss recommendations, with the final report and audit score communicated to the site a short time later. Failure to meet any of the objective passing criteria triggers an automatic re-education programme for that site with a repeat audit in 1 year. Sites that do not pass the initial audit also forfeit the opportunity to be “un-blinded” – a process that allows sites in good standing to view site-specific data on the analytics platform with each site identified. A site that passes the initial audit will be re-audited every 2−3 years.
As the collaborative grew and the database underwent revisions, it became necessary to review and amend our auditing process as well. We formed an audit committee composed of both clinical champions and data collection personnel from PC4 participating sites. These individuals were trained in the auditing process and participated in two audits with data coordinating centre members and/or other experienced auditors prior to auditing independently. The committee now consists of 21 members, and the lead data manager of PC4 continues to participate in all initial audits. Of note, audits are currently being performed remotely owing to restrictions related to the COVID-19 pandemic; all data presented in this analysis represent pre-COVID, in-person audits.
The timing of follow-up audits after passing the initial audit has also been revised after we noted a clear pattern in audit results over time, with high-performing sites continuing to perform well on follow-up audits. If a site (1) passes its initial audit and the first follow-up audit at 2 years, (2) has had no changes in site personnel (clinical champions or data entry personnel), and (3) the database has not undergone a revision, the next audit can be deferred for an additional year.
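Assuming a baseline 2-year re-audit interval, this deferral rule can be sketched as the conditional below; the names and the exact interval arithmetic are our own illustrative assumptions rather than PC4 policy text.

```python
def next_audit_interval_years(passed_initial: bool,
                              passed_first_followup: bool,
                              personnel_changed: bool,
                              database_revised: bool) -> int:
    """Illustrative sketch: a site passing both audits, with stable
    personnel and no database revision, may defer its next audit for
    an additional year beyond the assumed 2-year baseline."""
    if (passed_initial and passed_first_followup
            and not personnel_changed and not database_revised):
        return 3  # baseline interval deferred by one year
    return 2
```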
The PC4 database is currently in its third version; with every version update, the audit fields and scoring rubric are reviewed and modified as needed to account for changes in data fields and definitions. These modifications are made by the audit committee with significant input from the PC4 data manager and executive director. We remain focused on assuring continued rigorous assessment of the accuracy, reliability, and timeliness of data entry that supports the research and quality improvement functions of PC4.
Results
As of July 2019, audits of encounters entered into version 2 of the database had been completed at 21 sites undergoing their first audit, 17 sites undergoing their second, and one site undergoing its third. Overall accuracy and major discrepancy rates for each of those audits are shown in Table 2. Of the four audits with a major discrepancy rate >1%, three were initial audits and one was a follow-up audit. The leading cause of these high major discrepancy rates was misinterpretation of data definitions. Additionally, the audit committee has observed that centres with engaged clinical champions and/or a robust internal audit system tend to have far fewer errors/discrepancies.
A total of 2219 cardiac ICU encounters were included in the 39 audits, for an average of 57 encounters per site. A total of 191,124 fields were audited across all 39 sites, 175,124 of which were included in the final audit score. There were 1122 discrepancies identified, for an overall accuracy rate of 99.4%. A major discrepancy was defined for 135,392 of the audited fields; 700 discrepancies were identified in these fields, for a major discrepancy rate of 0.52%. Only one site failed to achieve a passing score; that site had a major discrepancy rate of 2.49%. A second site achieved a passing score, but the audit team reported subjective concerns with data quality. Both of these sites were re-audited within a year, and the data from those follow-up audits are not included in our reported data. No audited site had an overall accuracy of <97%, and 35 of the 39 audits had major discrepancy rates <1%. The audited fields with the highest combined major and minor discrepancy rates are shown in Table 3.
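These rates follow directly from the reported counts:

\[
\text{Overall accuracy} = \frac{175{,}124 - 1122}{175{,}124} \times 100 \approx 99.4\%
\]
\[
\text{Major discrepancy rate} = \frac{700}{135{,}392} \times 100 \approx 0.52\%
\]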
Table 3 abbreviations: ARF = acute renal failure; CPR = cardiopulmonary resuscitation; CRRT = continuous renal replacement therapy; ECMO = extracorporeal membrane oxygenation; ECPR = cardiopulmonary resuscitation requiring ECMO.
Two variables with high discrepancy rates, “arrhythmia types” and “latest arrhythmia end date,” require a deeper dive into the electronic medical record (EMR) than other variables. The documentation is not always easy to locate, depending on the EMR type, familiarity with the EMR, and/or the accuracy of clinicians' notes. A third variable found to have a higher major discrepancy rate is the “current surgical status at CICU admission.” This field depends on the timing of the encounter (pre-operative versus post-operative) and is most commonly coded incorrectly when the patient is admitted preoperatively and then undergoes the index surgery. All three fields underwent intentional definition clarification as version 3 education rolled out, and the data coordinating centre continues to provide data definition instruction on weekly conference calls. In addition, we have encouraged clinical champions to continue to act as reliable resources for their data entry personnel and to perform internal audits so that systematic errors are caught early and corrected. Complete field-by-field audit results aggregated for all sites are provided in Supplementary Table S1.
Discussion
Clinical registries have become a cornerstone of research and quality assessment within the healthcare community; this is especially true in subspecialty areas where there is significant heterogeneity within the patient population being evaluated, leaving single-institution and even multi-institutional datasets underpowered to support statistically sound conclusions. A crucial but often underreported foundational aspect of clinical registries is the need to assure that the data are complete and accurate. In fact, a recent review of 153 clinical registries in the United States found that only 18.3% utilise audits to monitor data accuracy [3]. PC4's commitment to complete and accurate data has served as a key driver of our auditing process.
When others have assessed clinical registry data quality, the results have been highly variable. An evaluation of clinical databases in Australia, utilising a scoring rubric that encompassed completeness of patient inclusion, reliability of data entry, explicitness of data definitions, amount of missing data, and methods of data validation, showed that only 14 of the 40 included databases (41.2%) received a passing score of >75% [4]. Similarly, a survey in the United Kingdom that evaluated the quality of data in 105 clinical databases across a variety of specialties (cancer, surgery, trauma, cardiovascular) found that only 27% of the databases met the highest criteria for incorporating a data validation process, 42% met the highest criteria for completeness, and 36% met the highest criteria for utilising explicit definitions [5]. A description of the development and implementation of a 138-participant adult ICU database noted that there was no process in place for onsite training at new centres and no onsite audits, observing that “periodic audits of data against original medical records and specific data sets are a key goal for the future” [6].
Among cardiovascular-specific databases, Andrianopoulos et al. described the audit process for two databases in Australia: one for interventional catheterisation and one for cardiovascular surgery. The catheterisation database audited 4.3% of the variables in the database for 3% of the entered cases in the first phase of audits, finding an overall agreement of 96.5%; the second phase assessed 6.4% of the variables from 5% of the cases, with an agreement of 97% [7]. Our registry compares favourably, with 25–30% of entered fields examined for each audited encounter and an overall agreement of 99.4% in the database version 2 audits completed between January 2015 and July 2019.
Perhaps the database most closely clinically aligned with PC4 is the Society of Thoracic Surgeons Congenital Heart Surgery Database. The Society of Thoracic Surgeons reported on the audit process and outcomes for its General Thoracic Surgery Database and its Adult Cardiac Surgery Database in 2013 [8] and 2017 [9], and more recently completed a similar review for the Society of Thoracic Surgeons Congenital Heart Surgery Database. For that database, onsite source data verification audits were initiated in 2007; at that time, there were 58 participating sites submitting approximately 15,000 operations annually. By 2018, these numbers had risen to 119 sites submitting approximately 39,000 operations per year. Over the course of 10 audit cycles, 73 total audits took place between 2011 and 2018. Unlike PC4, the Society of Thoracic Surgeons Congenital Heart Surgery Database utilises independent auditors rather than peer review and now performs the majority of its audits remotely rather than onsite. The data assessment during the most recently reported audit cycle (2017) demonstrated a completeness rate of 97.6% and an agreement rate of 97.4% [10]. The PC4 data quality compares favourably to these audit outcomes, with an overall agreement of 99.4% in the version 2 audit cycle. (Completeness of entered encounters is not part of the PC4 audit process, as over 90% of PC4 data fields are mandatory and a case cannot be submitted until all mandatory fields are entered.)
It is vital to have a rigorous system of data auditing in place for data registries to be safely and appropriately used to guide programmatic decision making aimed at improving the quality of care delivered. If such decisions are made in a setting of incomplete and/or inaccurate data, there could be unintended consequences and the potential for patient harm. Wang et al. delineated six specific parameters that should be used when assessing the quality of a registry database [11]; Table 4 lists each of these parameters and describes how each is addressed within the PC4 registry itself and/or within the audit process.
Sources of data entry error are variable; Assareh et al. [12], in a report on clinical database data quality, divided the error sources into five categories:
- Method of data collection (real-time direct entry versus use of a data collection form with subsequent data transfer)
- Forms/platform (layout, ease of use)
- Personnel (training, motivation, communication, and feedback)
- Instruction
- Software (internal validity checks, user-friendly interface, embedded definitions)
Similarly, a literature review on the accuracy of medical registry data found that insufficient training of data collection personnel, unclear definitions, inaccurate transcription, and incomplete abstraction were the four most common sources of data inaccuracy [13]. A survey of adult ICU registry participants in New Zealand noted that a lack of dedicated staff and a lack of feedback to data collectors to decrease errors also contributed to decreased data accuracy [14].
Accurate data entry is the cornerstone of the PC4 collaborative. PC4 utilises a strategy of optimising training and ongoing support with the goal of maximising data integrity. Our reporting platform allows real-time transparent benchmarking across centres, and we believe it is crucial to instil the utmost confidence in the data among our users to promote collaborative quality improvement through use of this platform. As a result, we believe that the data integrity of our registry led hospitals to use the benchmark data and achieve an aggregate 24% reduction in ICU mortality at PC4 centres following 2 years of participation [15]. Though expensive and time-intensive, the components of our approach – rigorous in-person training, weekly conference calls, daily access to the PC4 data manager and physician leads, resourcing at annual meetings, and in-person audits – are all essential activities that contribute to accurate data entry. Our audit data, as demonstrated in this analysis, and the improvement in outcomes we are witnessing among our participants seem to justify this investment.
The data integrity process within PC4 has many secondary benefits as well. The data entry staff are frequently former cardiac ICU nurses or advanced practice nurses, bringing a solid core knowledge of the patient population to the task of data abstraction. They have invested their own time to be part of the database and audit committees, thus ensuring continued improvement of the clinical registry and the audit process. Many of these data entry personnel also get involved in presenting the data to their local leaders and engage in quality improvement projects both locally and within the national PC4 community. Invested participation in PC4 by these individuals represents a potential career development and advancement opportunity.
We continue to evaluate the census adjudication process that assesses the timeliness of data entry and the completeness of case inclusion. We consider these important aspects of data integrity, but it is arduous and time-consuming to review more than a small sample for this purpose. Further, due to patient privacy considerations, it is nearly impossible to conduct census adjudication remotely. Finally, our findings from reviewing high-error fields suggest the need to revise some definitions and re-educate data collection personnel, resulting in near real-time quality improvement of our data. Additionally, this process has yielded important insights that allow us to exclude certain fields from research and quality improvement activities to avoid using inaccurate data.
Conclusion
PC4 has instituted a rigorous and extensive auditing process that includes initial and follow-up audits of every participating site, demonstrating an extremely high level of accuracy across a broad array of audited fields. Areas for future consideration in data collection include establishing the ability to track patients who receive care across multiple centres, connecting with outside databases to assure complete transplant and mortality reporting, and automating abstraction of objective data from the EMR – perhaps directly into the database platform. Although the data presented represent PC4's pre-COVID pandemic experience, it is worth noting that the PC4 audit team and data coordinating centre have transitioned to a remote audit format since the spring of 2020. It will be important to analyse not only the cost savings (which are substantial) but also the effectiveness of this new auditing approach, how it affects the recruitment and retention of volunteer auditors, and the effect of the missed opportunity to engage with each other in person. PC4 will continue to intentionally and thoroughly re-evaluate our auditing process as the collaborative continues to grow and respond to new developments in our field.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S1047951121004984
Acknowledgements
The authors recognise the work of all audit committee members who have contributed to the PC4 auditing process and all data collection team members at the participating consortium sites whose efforts contributed to the data quality.
Financial support
None.
Conflicts of interest
None.