Cochrane's Linked Data Project: How it Can Advance our Understanding of Surrogate Endpoints

Published online by Cambridge University Press:  01 January 2021

Abstract

Cochrane has developed a linked data infrastructure to make the evidence and data from its rich repositories more discoverable to facilitate evidence-based health decision-making. These annotated resources can enhance the study and understanding of biomarkers and surrogate endpoints.

Type
Symposium Articles
Copyright
Copyright © American Society of Law, Medicine and Ethics 2019

Cochrane is a global, not-for-profit organization made up of over 13,000 members with 50,000 supporters whose aim is to improve health-care decision making through the production of high-quality systematic reviews and other synthesized evidence content. Cochrane's network is made up of researchers, health-care professionals, patients, carers, and others interested in improving health outcomes for everyone. Cochrane's scope is broad, meaning they will address any relevant health question, though to date the focus has been on pairwise comparison reviews of interventions fed by primary evidence and data from randomized controlled trials.

The Cochrane Database of Systematic Reviews (CDSR) is the largest single repository of systematic reviews in the world, and all the reviews in the CDSR conform to Cochrane's high-quality methods and structure. The CDSR currently contains nearly 8,000 published reviews, which are continually updated when new evidence becomes available, a key and unique feature of Cochrane Reviews. However, the CDSR data is currently not as structured and well-described as it could be. Thus, in 2014, Cochrane initiated the Linked Data Project, whose aim is to make Cochrane's vast evidence base more discoverable and useful for decision-making.1 In what follows, we describe the current state of the Cochrane Linked Data Project in greater detail, and then discuss how this project can be useful for advancing our understanding of surrogate endpoint biomarkers.

Reaching Decision Makers and Researchers

Cochrane is widely considered the “gold standard” for health evidence in the form of its systematic reviews and meta-analyses that address specific PICO (Population, Intervention, Comparator, Outcome) questions of clinical relevance. Its groups span healthcare domains from breast cancer to acute respiratory infections, as well as broader topic areas such as how care is delivered.2 Cochrane's mission is three-fold: (1) to promote evidence-informed health decision-making; (2) to help ensure that evidence reaches decision makers and researchers in a timely manner; and (3) to provide evidence in a context that facilitates rapid uptake and bridging the “know-do” gap.

The CDSR is published on The Cochrane Library and, though it is technically a database, the reviews are presented and rendered as standard journal articles. The problem with this journal presentation format (PDF and HTML) is that the reviews often run to hundreds of pages, with many figures, tables, and analyses containing valuable results (e.g., effect estimates, summary of findings, risk of bias assessments of trials and others) that are difficult to locate within the review and even more difficult to compare across reviews for PICO questions that span multiple reviews or topic areas. Yet the ability to quickly and reliably compare evidence across reviews is critical for navigating the evidence and compiling the data needed to inform decision- and policy-making. Thus, Cochrane's journal article format for its reviews is, in some ways, undermining its own mission.

The PICO Ontology: Making Cochrane's Data More Useful

Cochrane's Linked Data Project began as a way of leveraging the structured data and unique identifiers (IDs) in Cochrane data repositories. Cochrane CENTRAL is the largest repository of reports of randomized and quasi-randomized trials and currently contains 1.3 million records, many of which are “studified” or linked together under a common trial ID.3 Although studies and reviews are published in a document view in The Cochrane Library, these records are, in fact, underpinned by highly structured databases that, as Cochrane updates the reviews, represent a dynamic evidence base. But even though there is relevant data and metadata on the clinical questions addressed in these records, it is neither consistent nor semantically structured. The goal of the Linked Data Project is to create a model, using a PICO ontology, that can describe the clinical question in a structured, computable way. This rich PICO metadata links the reports of trials included in Cochrane Reviews with the review questions and their analyses.

Cochrane's PICO ontology was developed to describe the minimal set of characteristics needed to describe a record (either a review or a study) in a way that would allow researchers or decision-makers to quickly identify relevant resources in Cochrane's databases.4 The model captures population (P) characteristics such as sex, age, and condition; intervention (I) and comparator (C) characteristics such as classification, intervention, delivery method, setting, dose, duration, schedule and others; and outcome (O) characteristics such as classification, outcome measure, and domain, including endpoints. The PICO fields reference terms from controlled terminology sets including SNOMED CT (https://browser.ihtsdotools.org/?), RxNorm (https://www.nlm.nih.gov/research/umls/rxnorm/), MedDRA (https://www.meddra.org/), and WHO ATC (https://www.whocc.no/atc_ddd_index/) to populate the Cochrane ontology (or vocabulary). Unique, persistent identifiers (IDs) are assigned to all concepts in the model.
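As a rough illustration of the model just described, a PICO annotation can be thought of as sets of coded concepts grouped under P, I, C, and O. The sketch below is a minimal Python rendering under stated assumptions: the class names, field names, concept identifiers, and the terminology assigned to each concept are hypothetical placeholders, not Cochrane's actual schema and not real SNOMED CT or WHO ATC codes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A coded term; the ids and sources used below are made-up placeholders."""
    concept_id: str   # persistent identifier (hypothetical format)
    label: str        # human-readable name
    source: str       # terminology the term is drawn from, e.g. "SNOMED CT"


@dataclass
class PicoAnnotation:
    """Minimal PICO description of one record (a review or a study)."""
    population: set = field(default_factory=set)
    intervention: set = field(default_factory=set)
    comparator: set = field(default_factory=set)
    outcome: set = field(default_factory=set)

    def labels(self, component: str) -> list:
        """Return the sorted labels for one component, e.g. 'outcome'."""
        return sorted(c.label for c in getattr(self, component))


# Hypothetical identifiers and source assignments, for illustration only.
diabetes = Concept("EX-P-001", "Diabetes mellitus", "SNOMED CT")
agents = Concept("EX-I-001", "Antihyperglycemic agents", "WHO ATC")
placebo = Concept("EX-C-001", "Placebo", "SNOMED CT")
hba1c = Concept("EX-O-001", "HbA1c", "SNOMED CT")

record = PicoAnnotation(
    population={diabetes},
    intervention={agents},
    comparator={placebo},
    outcome={hba1c},
)
print(record.labels("outcome"))
```

Storing concepts rather than free-text labels is what would let two records be matched on identifier even when their wording differs, which is the point of assigning persistent IDs.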

The PICO ontology is the model that now underpins Cochrane's new PICO annotation tool, which allows for the annotation of three separate sections of a Cochrane Review with the concepts contained within the model:

  1. The methods section, which outlines the PICO question that was framed in the Review protocol

  2. Each of the studies that have been found by the authors and included in the Review

  3. Each of the meta-analyses conducted for the Review

The tool is easily modified to allow PICO annotation of reviews, studies or meta-analyses from other sources. These annotations are then put through a quality assurance process that is overseen by metadata specialists at Cochrane, as well as information specialists in Cochrane's review groups.
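To make the three annotation levels and the quality assurance step more concrete, the following Python sketch models a review annotated at the methods, study, and analysis levels, and runs a simple automated check of the kind that might precede human review. It is an assumption-laden illustration: the structure, the identifiers, and the rule that every concept must come from one of the four named terminology sets are invented here and do not describe Cochrane's actual tool or QA process.

```python
from dataclasses import dataclass, field

# Terminology sets named in the text; treating them as an allow-list is an assumption.
KNOWN_SOURCES = {"SNOMED CT", "RxNorm", "MedDRA", "WHO ATC"}


@dataclass
class Annotation:
    """One coded concept attached to part of a review (all values hypothetical)."""
    component: str    # "P", "I", "C" or "O"
    concept_id: str   # placeholder identifier
    source: str       # terminology the concept comes from


@dataclass
class AnnotatedReview:
    """The three annotated sections of a review, as listed above."""
    review_id: str
    methods: list = field(default_factory=list)    # PICO question from the protocol
    studies: dict = field(default_factory=dict)    # study id -> annotations
    analyses: dict = field(default_factory=dict)   # analysis id -> annotations


def qa_problems(review: AnnotatedReview) -> list:
    """Return human-readable problems a metadata specialist might want to check."""
    problems = []
    # Assumes study ids and analysis ids never collide with each other.
    sections = {"methods": review.methods, **review.studies, **review.analyses}
    for name, annotations in sections.items():
        if not annotations:
            problems.append(f"{name}: no annotations")
        for a in annotations:
            if a.component not in {"P", "I", "C", "O"}:
                problems.append(f"{name}: unknown component {a.component!r}")
            if a.source not in KNOWN_SOURCES:
                problems.append(f"{name}: unknown source {a.source!r}")
    return problems


review = AnnotatedReview(
    review_id="CD000000",  # placeholder, not a real review
    methods=[Annotation("P", "EX-001", "SNOMED CT")],
    studies={"study-1": [Annotation("O", "EX-002", "free text")]},
    analyses={"analysis-1.1": []},
)
print(qa_problems(review))
```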

This rich, semantically structured metadata generated in the PICO annotation process is then used to fulfil a number of core use cases for dissemination of Cochrane evidence. For example, a “PICOfinder” user interface has been developed and undergone several iterations through user testing with various core personas such as researchers, health care professionals, and information specialists. There are also search and concept application programming interfaces (APIs) developed that allow this data to flow and support interoperability with other tools and systems, such as the MAGIC App (http://magicproject.org/), a guideline authoring tool which codes PICO against recommendations used in the development of health policy and guidelines. Later in 2019, Cochrane will roll out a PICO search and browse beta on The Cochrane Library platform that will address these core use cases of making Cochrane evidence more discoverable and usable with the aim of increasing impact on health care decision-making. The PICO ontology, annotation, and related dissemination work are also core pillars in Cochrane's efforts to make its data FAIR, according to the FAIR data principles for scientific data management and stewardship.5

Using Linked Data for Studying Biomarkers

Several Cochrane Reviews and many study records in the repositories address the utility and validity of biomarker uses. The PICO metadata facilitates biomarker validation research by making the relevant evidence in Cochrane's data repositories accessible, so that a specific biomarker can be validated or rejected for a disease state, process, or response to an intervention or exposure.

For example, in the majority of cases a biomarker has been used in the O (Outcome) portion of the study or review, either to indicate a pharmacological or biological response to an intervention6 or as a surrogate endpoint.7 A growing body of Cochrane evidence uses biomarkers in the I/C (Intervention or Comparison) role, usually by evaluating the effectiveness of an intervention strategy guided by presence or absence of a specific biomarker,8 but sometimes by assessing the effects of making individuals aware of their biomarker status.9 A smaller number of reviews address a specific biomarker as part of the P (Population) by either restricting the population of interest to individuals who display the biomarker10 or performing sub-analyses of the effectiveness of the intervention in individuals with or without the biomarker.11

Cochrane Reviews employ systematic and explicit methods to identify, select, and critically appraise relevant research, and to collect and analyze data from the research identified. Since reviews typically extract and synthesize data on multiple outcomes, the clinical utility of biomarker outcomes can be indirectly assessed by comparing effect sizes for the outcome assessed using a biomarker of interest with those assessed by non-biomarker outcomes (see the SGLT2 inhibitor example below). This can be done on a study-by-study basis or across studies using the meta-analyses and data provided in the review. The Risk of Bias assessment provided in the review for each study highlights any flaws in the design, conduct, analysis, and reporting of the study that could undermine its conclusions.

Reviews that include biomarkers as Intervention or Comparator provide a more direct measure of clinical utility by showing the evidence12 (or lack thereof13) for effects of a biomarker-guided intervention strategy on improving specific outcomes. Again, this can be assessed on a study-by-study or synthesized basis, with risks of bias provided. A special category of Cochrane Reviews, diagnostic test accuracy reviews, directly addresses analytical validity by measuring sensitivity, specificity, ROC curves and other measures of the biomarker when assessed against a relevant reference standard.14 Cochrane does not currently provide PICO metadata for reviews of this sort, but we plan to do so in future.

To illustrate how Cochrane's newly linked data can facilitate the study of biomarkers, we can look at a few use cases. For example, Hemoglobin A1c (HbA1c) is a biomarker with multiple contexts of use. It is (1) a diagnostic biomarker used to identify patients with Type 2 diabetes mellitus; (2) a pharmacodynamic/response biomarker when evaluating patients with diabetes, to assess response to antihyperglycemic agents; and (3) a surrogate endpoint for reduction of microvascular complications associated with diabetes mellitus.

The HbA1c biomarker is included as an outcome in 52 Cochrane Reviews and 8,814 trials in CENTRAL. It is worth noting that biomarkers in Cochrane Reviews will be primarily those that have found some use in clinical practice (such as HbA1c), whereas biomarkers that have only been used in research are less likely to have been studied in Cochrane Reviews (e.g., microRNA: 1 Review and 704 trials in Cochrane datastores).

The HbA1c example translates into an Intervention Review with the following PICO:

  • P – Diabetes

  • I/C – Antihyperglycemic agents

  • O – HbA1c

Thus, the generic format for other biomarkers will be:

  • P – Condition

  • I/C – Intervention(s)

  • O – Biomarker

This format would apply to many Intervention Reviews in Cochrane's data repository. Because Cochrane Reviews also report outcomes that focus on clinical endpoints wherever these are available, there is the potential to make indirect comparisons of biomarker results with clinical endpoints. For example, the Cochrane Review on Insulin and glucose-lowering agents for treating people with diabetes and chronic kidney disease15 contains the PICO:

  • P – Diabetes and chronic kidney disease

  • I – Insulin or glucose-lowering agents

  • C – Placebo

  • O – HbA1c, other biomarkers, clinical endpoints

The results from this review show a Summary of Analyses for one of the glucose-lowering agents (SGLT2 inhibitors) vs. placebo (see Figure 1). The biomarker outcomes and clinical endpoints have not been compared to each other, but by finding the overlapping studies that address both, the review has already done most of the work needed to answer that question.

Figure 1. Screenshot of Summary of Results Table from Cochrane Review CD011798: Insulin and glucose-lowering agents for treating people with diabetes and chronic kidney disease
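The step of finding reviews that report both a biomarker outcome and clinical endpoints can be sketched as a simple filter over outcome annotations. The Python below is an illustrative sketch only: the function name and the flat dictionary of outcome labels are assumptions, and real PICO metadata would be keyed on concept identifiers rather than labels. The outcomes listed for CD011798 are those this article discusses; the other two entries are invented to exercise the filter.

```python
def reviews_with_both(reviews, biomarker, clinical_endpoints):
    """Return ids of reviews whose outcomes include the biomarker
    and at least one of the given clinical endpoints."""
    wanted = set(clinical_endpoints)
    hits = []
    for review_id, outcomes in reviews.items():
        outcomes = set(outcomes)
        if biomarker in outcomes and outcomes & wanted:
            hits.append(review_id)
    return sorted(hits)


# Toy index of outcome labels per review (EXAMPLE-A and EXAMPLE-B are invented).
reviews = {
    "CD011798": {"HbA1c", "All-cause mortality", "Myocardial infarction"},
    "EXAMPLE-A": {"HbA1c"},
    "EXAMPLE-B": {"All-cause mortality"},
}

matches = reviews_with_both(
    reviews, "HbA1c", {"All-cause mortality", "Myocardial infarction"}
)
print(matches)
```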

We can also examine the forest plots from this review, which show that HbA1c is clearly lower with the SGLT2 inhibitors (Figure 2a). However, these agents do not appear to have an effect on the clinical endpoints of all-cause mortality (Figure 2b) or incidence of myocardial infarction (Figure 2c). While this analysis is by no means definitive, it does show how a drug can have a clear effect on a (presumed) surrogate endpoint without affecting the clinical endpoint. With Cochrane's PICOfinder, it is far easier to find this information and use it as the starting point for a more rigorous evaluation of surrogacy.

Figure 2. Forest Plots for Effect of SGLT2 Inhibitors on HbA1c, All-cause Mortality, and Myocardial Infarction from Cochrane Review CD011798
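The pattern read off the forest plots (a clear effect on the surrogate, no apparent effect on the clinical endpoint) can be expressed as a simple screening rule on pooled estimates and their confidence intervals. The sketch below is one possible way to flag such cases, not a method used in the review. The class, the function names, and above all the numbers are placeholders chosen to show the pattern; they are not the values reported in CD011798.

```python
from dataclasses import dataclass


@dataclass
class Effect:
    """A pooled effect estimate with its 95% confidence interval."""
    outcome: str
    measure: str      # "MD" (mean difference) or "RR" (risk ratio)
    estimate: float
    ci_low: float
    ci_high: float


# Value that corresponds to "no effect" for each measure.
NULL_VALUE = {"MD": 0.0, "RR": 1.0}


def excludes_null(effect: Effect) -> bool:
    """True when the confidence interval does not contain the no-effect value."""
    null = NULL_VALUE[effect.measure]
    return not (effect.ci_low <= null <= effect.ci_high)


def surrogate_discordance(surrogate: Effect, clinical: Effect) -> bool:
    """Flag cases where the surrogate moves but the clinical endpoint does not."""
    return excludes_null(surrogate) and not excludes_null(clinical)


# Placeholder numbers only; NOT taken from the review.
hba1c = Effect("HbA1c (%)", "MD", -0.30, -0.50, -0.10)
mortality = Effect("All-cause mortality", "RR", 0.80, 0.50, 1.30)

flagged = surrogate_discordance(hba1c, mortality)
print(flagged)
```

A flag of this kind is only a prompt for closer inspection: a confidence interval that includes the null value is not evidence that there is no effect, which is why the text treats such comparisons as a starting point for a more rigorous evaluation.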

Another interesting example is the Cochrane Review of Supplemental oxygen for caesarean section during regional anaesthesia,16 which includes the following in the analyses: the clinical endpoints of 1- and 5-minute Apgar scores and the surrogate endpoints of various oxygenation levels. Again, we can see in the summary table that there was no effect of supplemental oxygen on Apgar scores, but there were effects on the oxygenation biomarkers (Figure 3).

Figure 3. Apgar scores as clinical endpoints with secondary outcomes of various oxygenation levels from Cochrane Review CD006161: Supplemental oxygen for caesarean section during regional anaesthesia

Semi-Automating Surrogate Endpoint Guidance

One powerful application for this approach (i.e., leveraging Cochrane's PICO ontology to identify evidence for or against surrogate endpoint biomarkers) would be to interface directly with the United States Food and Drug Administration's “Table of Surrogate Endpoints That Were the Basis of Drug Approval or Licensure.”17 First published in 2018, this table lists over 100 surrogate endpoint biomarkers that the Agency considers to be potentially useful for evaluating the effectiveness of new medicines, but it does not provide any direct links to the evidence base that would inform such use.18

A useful addition to this table would be a mechanism that allowed users to explore the body of research in which each of the biomarkers on the table had been used in relevant clinical trial reports and systematic reviews. This could include links to detailed analysis tables in Cochrane's data repositories that relate to the “Disease or use” and “Patient population” (columns 1 and 2 in the FDA's table, which correspond to P in Cochrane's PICO ontology), the “Surrogate endpoint” (column 3 in the table, corresponding to an O in PICO), and the “Drug mechanism of action” (column 5, corresponding to I/C in PICO). Once mapped to the Cochrane vocabulary, the concepts in the table could generate an API call to Cochrane's PICOfinder, which would return the relevant evidence and facilitate drilling down to the detailed results data in the analysis tables.
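The column-to-PICO correspondence described above lends itself to a small sketch. The Python below groups one table row under PICO components and encodes the result as a query string. Everything beyond the column mapping itself is an assumption: the key names, the query parameter scheme, and the `https://example.org/pico-search` endpoint are placeholders, not Cochrane's actual API, and the example row is built from this article's HbA1c example rather than copied from the FDA table. In practice the terms would first be mapped to concept identifiers in the Cochrane vocabulary.

```python
from urllib.parse import urlencode

# Columns 1 and 2 -> P, column 3 -> O, column 5 -> I/C, as described in the text.
# The dictionary keys are hypothetical names for those columns.
COLUMN_TO_PICO = {
    "disease_or_use": "population",
    "patient_population": "population",
    "surrogate_endpoint": "outcome",
    "drug_mechanism_of_action": "intervention",
}


def row_to_pico(row: dict) -> dict:
    """Group the values of one table row under PICO components."""
    pico = {}
    for column, value in row.items():
        component = COLUMN_TO_PICO.get(column)
        if component and value:
            pico.setdefault(component, []).append(value)
    return pico


def build_query_url(base_url: str, pico: dict) -> str:
    """Encode a PICO dict as a query string (the parameter scheme is invented)."""
    params = [
        (component, term)
        for component, terms in sorted(pico.items())
        for term in terms
    ]
    return f"{base_url}?{urlencode(params)}"


# Illustrative row, not an actual entry from the FDA table.
row = {
    "disease_or_use": "Type 2 diabetes mellitus",
    "patient_population": "Patients with diabetes",
    "surrogate_endpoint": "HbA1c",
    "drug_mechanism_of_action": "Antihyperglycemic agents",
}

pico = row_to_pico(row)
url = build_query_url("https://example.org/pico-search", pico)
print(url)
```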

After subsequent analyses are completed, these links to the evidence could be maintained dynamically with the Cochrane evidence base, so that as reviews and their results are updated on the basis of newly published trial data, users of the FDA table could be notified and subsequent analyses revised as appropriate. This example demonstrates the power of the PICO ontology and linked data approach: connecting evidence via persistent identifiers and common terminology sets improves the discovery of relationships between biomarker surrogate endpoints, clinical endpoints, and other relevant data from systematic reviews and meta-analyses.
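One way such dynamic maintenance could work is to compare the review version that a link was built against with the version currently published. The sketch below assumes that the `.pubN` suffix seen in the review DOIs in this article's reference list (for example `CD009460.pub2`) marks successive published versions, and that an identifier without a suffix is the first version. The function names and the placeholder identifiers are hypothetical, and the sketch does not describe an existing notification service.

```python
import re

# Matches identifiers such as "CD009460.pub2" or "CD012600" (no suffix).
REVIEW_ID = re.compile(r"^(CD\d{6})(?:\.pub(\d+))?$")


def parse_review_id(identifier: str) -> tuple:
    """Split an identifier into (review number, version).

    Assumption: an identifier with no ".pubN" suffix is version 1.
    """
    match = REVIEW_ID.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised review identifier: {identifier!r}")
    return match.group(1), int(match.group(2) or 1)


def updated_reviews(linked: list, current: list) -> list:
    """Return review numbers whose current version is newer than the linked one."""
    latest = dict(parse_review_id(i) for i in current)
    stale = []
    for identifier in linked:
        number, version = parse_review_id(identifier)
        if latest.get(number, version) > version:
            stale.append(number)
    return stale


# Placeholder identifiers; "current" simulates one review having been updated.
linked = ["CD000001.pub2", "CD000002"]
current = ["CD000001.pub3", "CD000002"]

to_notify = updated_reviews(linked, current)
print(to_notify)
```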

Conclusion

The Cochrane Library contains a rich repository of studies and syntheses of studies relevant to biomarker research and validation. The Linked Data Project, with its PICO ontology and controlled hierarchical vocabulary, will improve the discoverability and usefulness of this resource, including by helping to advance the rapid, systematic study of biomarkers, and so better fulfil Cochrane's core mission to make evidence more available and accessible to the people who need it.

Footnotes

The authors are salaried employees of Cochrane.

References

1. See <https://linkeddata.cochrane.org/> (last visited July 9, 2019).
2. Effective Practice and Organization of Care Group and Public Health, see <https://www.cochrane.org/about-us/our-global-community/review-group-networks> (last visited July 9, 2019).
6. Sampson, A. L., Singer, R. F., and Walters, G. D., “Uric Acid Lowering Therapies for Preventing or Delaying the Progression of Chronic Kidney Disease,” Cochrane Database of Systematic Reviews 10 (2017): Art. No.: CD009460, doi: http://dx.doi.org/10.1002/14651858.CD009460.pub2.
7. Graudal, N. A., Hubeck-Graudal, T., and Jurgens, G., “Effects of Low Sodium Diet Versus High Sodium Diet on Blood Pressure, Renin, Aldosterone, Catecholamines, Cholesterol, and Triglyceride,” Cochrane Database of Systematic Reviews 4 (2017): Art. No.: CD004022, doi: http://dx.doi.org/10.1002/14651858.CD004022.pub4.
8. Schuetz, P., Wirz, Y., Sager, R., et al., “Procalcitonin to Initiate or Discontinue Antibiotics in Acute Respiratory Tract Infections,” Cochrane Database of Systematic Reviews 10 (2017): Art. No.: CD007498, doi: http://dx.doi.org/10.1002/14651858.CD007498.pub3.
9. Marteau, T. M., French, D. P., Griffin, S. J., et al., “Effects of Communicating DNA-Based Disease Risk Estimates on Risk-Reducing Behaviours,” Cochrane Database of Systematic Reviews 10 (2010): Art. No.: CD007275, doi: http://dx.doi.org/10.1002/14651858.CD007275.pub2.
10. Southern, K. W., Patel, S., Sinha, I. P., and Nevitt, S. J., “Correctors (Specific Therapies for Class II CFTR Mutations) for Cystic Fibrosis,” Cochrane Database of Systematic Reviews 8 (2018): Art. No.: CD010966, doi: http://dx.doi.org/10.1002/14651858.CD010966.pub2.
11. Schuit, E., Panagiotou, O. A., Munafò, M. R., et al., “Pharmacotherapy for Smoking Cessation: Effects by Subgroup Defined by Genetically Informed Biomarkers,” Cochrane Database of Systematic Reviews 9 (2017): Art. No.: CD011823, doi: http://dx.doi.org/10.1002/14651858.CD011823.pub2.
12. Aabenhus, R., Jensen, J. U. S., Jørgensen, K. J., et al., “Biomarkers as Point-of-Care Tests to Guide Prescription of Antibiotics in Patients with Acute Respiratory Infections in Primary Care,” Cochrane Database of Systematic Reviews 11 (2014): Art. No.: CD010130, doi: http://dx.doi.org/10.1002/14651858.CD010130.pub2.
13. Petsky, H. L., Cates, C. J., Li, A., et al., “Tailored Interventions Based on Exhaled Nitric Oxide Versus Clinical Symptoms for Asthma in Children and Adults,” Cochrane Database of Systematic Reviews 4 (2009): Art. No.: CD006340, doi: http://dx.doi.org/10.1002/14651858.CD006340.pub3.
14. Alldred, S. K., Takwoingi, Y., Guo, B., et al., “First Trimester Ultrasound Tests Alone or in Combination with First Trimester Serum Tests for Down's Syndrome Screening,” Cochrane Database of Systematic Reviews 3 (2017): Art. No.: CD012600, doi: http://dx.doi.org/10.1002/14651858.CD012600; Rompianesi, G., Hann, A., Komolafe, O., et al., “Serum Amylase and Lipase and Urinary Trypsinogen and Amylase for Diagnosis of Acute Pancreatitis,” Cochrane Database of Systematic Reviews 4 (2017): Art. No.: CD012010, doi: http://dx.doi.org/10.1002/14651858.CD012010.pub2; Gupta, D., Hull, M. L., Fraser, I., et al., “Endometrial Biomarkers for the Non-Invasive Diagnosis of Endometriosis,” Cochrane Database of Systematic Reviews 4 (2016): Art. No.: CD012165, doi: http://dx.doi.org/10.1002/14651858.CD012165; Shaikh, N., Borrell, J. L., Evron, J., and Leeflang, M. M. G., “Procalcitonin, C-Reactive Protein, and Erythrocyte Sedimentation Rate for the Diagnosis of Acute Pyelonephritis in Children,” Cochrane Database of Systematic Reviews 1 (2015): Art. No.: CD009185, doi: http://dx.doi.org/10.1002/14651858.CD009185.pub2.
18. Hey, S. P., et al., “Challenges and Opportunities for Biomarker Validation,” Journal of Law, Medicine & Ethics 47, no. 3 (2019): 357-361; Shrager, J., Shapiro, M., and Hoos, W., “Is Cancer Solvable? Towards Efficient and Ethical Biomedical Science,” Journal of Law, Medicine & Ethics 47, no. 3 (2019): 369-373; Hey, S. P., et al., “Surrogate Endpoints and Drug Regulation: What Is Needed to Clarify the Evidence,” Journal of Law, Medicine & Ethics 47, no. 3 (2019): 381-387; McShane, L. M., “Biomarker Validation: Context and Complexities,” Journal of Law, Medicine & Ethics 47, no. 3 (2019): 388-392; Zhang, A. D., and Ross, J. S., “Biomarkers as Surrogate Endpoints: Ongoing Opportunities for Validation,” Journal of Law, Medicine & Ethics 47, no. 3 (2019): 393-395; Stern, A. D., “Managing the Use and Dissemination of Information about Biomarkers: The Importance of Incentive Structures,” Journal of Law, Medicine & Ethics 47, no. 3 (2019): 396-397.