Introduction
Citizen science is expanding and increasing in visibility.Reference Guerrini, Pauwels, Denton and Kuiken1 Indeed, citizen science is so popular that the term is in danger of losing any clear meaning. Here we use the term “citizen science” to refer to “scientific endeavors in which individuals without specific scientific training participate as volunteers in one or more activities relevant to the research process other than (or in addition to) allowing personal data or specimens to be collected from them.”2 Citizen science is celebrated for its potential to democratize science, as well as to supplement conventional science carried out by professional scientists. The extent to which democratization involves a rejection of — or at least significant departures from — norms and policies that have evolved in the context of conventional science is a matter of ongoing debate.Reference Aungst, Fishman and McGowan3
The expansion of citizen science has been associated with advances in technology, including genomic sequencing and digital technologies such as mobile devices.Reference Kullenberg and Kasperowski4 The co-development of citizen science and digital technologies related to health is apparent in a number of areas:
Direct-to-consumer (DTC) genetic testing with return of raw data to consumers, enabling n-of-1 genomic research and consumer pooling of data to create genomic data commons,Reference Evans5 as well as research-directed “data donation” to DTC genetic testing companies (e.g., 23andMe).Reference Bietz, Patrick and Bloss6
Crowd-sourced and -funded research projects, such as the American and British Gut Projects, which enroll participants online and give them access to sequencing results for their own use or donation to other projects.7
DIYbio, including patient-led condition-focused research and technology development projects (e.g., Crohnology, NGLY1.org, the Do-It-Yourself Open Artificial Pancreas System (OpenAPS), Nightscout)Reference Lee, Hirschfield and Wedding8 and the Quantified Self movement, which focuses on self-tracking and knowledge generation via wearable technologies.9
Platforms for peer-to-peer connection and data aggregation and sharing such as CitSci.org and Open Humans.org, both nonprofit, and the platform created by the for-profit company PatientsLikeMe, recently acquired by insurance giant UnitedHealth Group.Reference Wang and Lynn10
Environmental health studies enabled by sensors, including efforts focused on particular health conditions (e.g., asthma), the built environment (e.g., walkability), or inequities in toxic exposures (e.g., traffic exhaust, pollution from industrial farming).Reference Barrett, Chrisinger, King, English, Richardson and Garzón-Galvis11
The range of initiatives that satisfy even a well-delineated definition of citizen science, encompassing variation in goals, structure, scale, and technologies employed, is a challenge for ethical analysis and policy making.
Keeping that complexity in mind, this article focuses on data sharing in the context of health-related citizen science, including whether citizen scientists have an obligation to share data and should commit to open science. Like citizen science, “open science” is a term open to multiple interpretations. It has a family resemblance to citizen science, insofar as open science as a movement aspires to “make scientific research and data accessible to all.”12 Yet openness in science can vary along multiple dimensions. Open science has been linked to six distinct but related principles: open methodology, open source, open data, open access, open peer review, and open educational resources.Reference Phillips and Knoppers13 Data sharing and open science have been favored by recent policy, but even in conventional science these developments are not without their critics. Concerns about the interests of data generators and data quality and stewardship of resources come into play, as well as concerns about privacy and security and control in the case of health-related human data. With open data and open access in particular, privacy and security concerns are amplified. Further, even if certain norms and policies can be justified for professional scientists, justifications may be weaker for uncompensated volunteers. We also briefly consider whether citizen scientists have an obligation to make their findings accessible via publication in journals.
Data Sharing in Conventional Science
The commitment to making knowledge more easily accessible to others in conventional science arguably dates back to the advent of scientific journals in the mid-17th century. However, the modern concept of open science, especially as it relates to the obligation to share data in the context of health-related research, can most clearly be traced to norms that were developed in the 1990s, as scientists embarked on one of the largest, publicly-funded, international collaborative projects in history, the Human Genome Project. At the time, the generation of human genome sequence data was incredibly expensive and time-consuming (it took 13 years and almost $3 billion to complete the project), and thus efforts to sequence the human genome were concentrated in a handful of large, well-funded sequencing centers across the globe.Reference Cook-Deegan and McGuire14 Each center was responsible for analyzing particular segments of the genome, with the goal that all of the generated data would eventually be pooled to create a reference human genome for the benefit of our global society. However, the work was slow-going, and there were legitimate concerns that the ability to create this important community resource would be thwarted by professional competition and self-interest. Of particular concern was the potential for scientists working on the Human Genome Project to make intellectual property claims on their discoveries, which would stymie data accessibility and make it excessively expensive and complicated to conduct clinical research that relied on these basic scientific discoveries. Thus, in 1996, at a summit in Bermuda, representatives from the major DNA sequencing centers in 5 nations agreed to a groundbreaking set of guiding principles for the Human Genome Project, which required that all DNA data be released into a publicly accessible database within 24 hours of generation. 
These “Bermuda Principles” created an ethos of broad data sharing that has persisted with respect to genomic data and eventually spread to other research outputs.15 In 2003, the same year that economist Paul David coined the term “open science,” the NIH released its statement on sharing research data, which sets an expectation for “the timely release and sharing of final research data from [all] NIH-supported studies for use by other researchers.”Reference David, Uhlir and Esanu16 Similar policies have also been adopted by sponsors internationally.17
Even researchers who do not receive funding from a sponsor that requires data sharing may be required to make the outputs of their research publicly accessible. Many journals now require that published articles be accompanied by the underlying research data, and there are an increasing number of journals that are open access.Reference Taichman18 Policies like these that support data sharing “aim to reduce transaction costs, promote data re-use, increase rigor and reproducibility, decrease redundant research, better involve patients, consumers, and others, facilitate researcher transparency in sharing processes and results, and improve connections with a larger variety of actors to produce more innovative approaches and solutions over the medium to long term.”Reference Gold19 In fact, even if a researcher is not compelled to share data in order to obtain research funding or as a condition of publication, some have argued that there is a moral obligation for individuals who participate in research to share data so that it is widely and responsibly used.Reference Drazen20
Although most can agree that making data widely available to advance research is a good thing, there are several areas of persistent concern. First, it is important to protect the professional interests of data generators through fair publication policies and clear procedures for ensuring that those who spend the time and intellectual capital on generating data get credit for their work. This has been addressed in several ways, including by granting data generators a certain grace period during which only they can publish on the data, through standards for collaboration and acknowledgement, and by shifting norms in academic medical centers so that team science is taken seriously and treated as a legitimate endeavor for the purposes of promotion and tenure. Second, additional effort is needed to promote data quality and stewardship of resources. If data repositories become dumpsters for “big bad data” then they will be worse than useless.Reference Hoffman, Podgorski, Merson, Gaye and Geurin21 Sharing “junk data” increases transaction costs and makes it less likely that research findings will be reproducible, which ultimately undermines the purpose of an open science model and weakens public trust. This can be addressed by ensuring that all research is conducted under a quality management system that requires that minimum standards be met.22 Finally, when health-related human data are shared, there are added concerns about issues such as privacy and security, group harm, and ownership and control. These issues cannot easily be addressed, but creating governance structures that involve members of the community whose data are being shared can help safeguard against harm, ensure stakeholder perspectives are taken into account, and promote trust. Engaging community members in this way is also consistent with the principles of citizen science and is one way to integrate non-professionals into the scientific enterprise.
Data Sharing in Citizen Science
In this section, we review considerations that favor, qualify, or weigh against an obligation to share data in the context of health-related citizen science. The affirmative case for data sharing and open science begins with principles, values statements, and codes of ethics that have emerged from within the citizen science community itself. Implicit ethical commitments of practitioners, relationship to achievement of citizen science goals, and funder requirements are also relevant. We then look at caveats and concerns that add nuance and condition imperatives to share data or maximize access. In particular, we consider privacy and security, quality and cost, and safety, and look at how data sharing platforms are addressing concerns in practice. Finally, we examine arguments against a data sharing obligation, including lack of consensus regarding a general (versus professional) duty of beneficence and the disparate cost-benefit pictures of citizen scientists and professional scientists.
The Case for an Obligation to Share
Principles, Values, and Codes of Ethics. The European Citizen Science Association (ECSA) has identified 10 principles that underlie good practice in citizen science. Principle 7 is: “Citizen science project data and meta-data are made publicly available and where possible, results are published in an open access format.”23 The ECSA has also issued a policy brief highlighting synergies between the citizen science and open science movements. One of the areas of convergence mentioned is “contribution to common goods and shared resources” such as “a body of knowledge, methods and tools, or a pool of data that then serve as infrastructures for further research and civic action.”24 Another ECSA policy brief envisions citizen scientists leading academic scientists by example toward greater openness.25 The values statement of the Citizen Science Association (based in Maine) is in line, affirming a belief that “the practice of citizen science will grow stronger when more diverse actors…collect, interpret, analyze, curate, and share data.”26 The North American and European Congresses of DIYbio have disseminated draft codes of ethics endorsing “Open Access” and “Transparency.”27 There are minor differences, but both advocate sharing of ideas, knowledge, and data. In addition, researchers who interviewed project coordinators and conducted a focus group with volunteers attending a citizen science networking event heard that citizen science is “dominated by an ethic of openness” and that citizen scientists value and take pride in broad data sharing.Reference Bowser28 In sum, when citizen scientists address data sharing in normative terms, they support it and do so using language that suggests a strong affinity to open science.
Implicit Ethical Commitments. The word “citizen” derives from the Latin civitas (city), suggesting contribution to a larger enterprise.29 Even one of the more individualistic-seeming forms of citizen science, the quantified self, is associated with a more collectivist notion, the quantified community.30 As for “science,” in Robert Merton’s classic work, the foundational norms of modern science include “communism” (sometimes modified to “communalism” to avoid any confusion with Marxist ideology). According to Merton, “[s]ecrecy is the antithesis of this norm,” while “full and open communication is its enactment.”Reference Merton and Merton31
Connection to Achievement of Goals. Data sharing and openness have been cited as particularly salient to achievement of the goals of patient-led and -centric research. For example, Sharon Terry, the parent of two children affected by the genetic disorder pseudoxanthoma elasticum or PXE, has made the case that data sharing is a duty owed to participants, who want to see as much good as possible come from their contributions.Reference Terry32 Matthew Might and Matt Wilsey, also parents of children affected by rare genetic disorders, have advocated for a principle of share early and often. They elaborate: “Share data. Share negative results. For findings too small to be publishable, turn to the Web and publish them in short blog posts. Get the information out there.”Reference Might and Wilsey33
Funder or Sponsor Requirements. In the context of citizen science, external resources or support may be necessary or desirable for purposes such as hiring project coordinators (who may manage activities such as engagement, enrollment, and communication) and professional scientists (who may provide training and ongoing consultation or supervision) and developing or purchasing relevant technologies, especially for large-scale projects. As noted above, many funders of scientific research have adopted policies mandating or encouraging data sharing. In the case of federal agencies funding or sponsoring citizen science, data sharing is generally required by the Crowdsourcing and Citizen Science Act. In particular, the Act states that federal science agencies “shall, where appropriate and to the extent practicable, make data collected through a crowdsourcing or citizen science project…available to the public, in a machine readable format, unless prohibited by law.”34 As part of the consent process, agencies are supposed to notify participants whether they (participants) are authorized to publish data. The law also directs agencies to make technologies and applications developed through a covered crowdsourcing or citizen science project available to the public.
Caveats and Concerns
Privacy and Security
The challenge of promoting broad data sharing while protecting the privacy of participants and securing data against unauthorized access is prominent in commentary on sharing data generated through biomedical research. While many citizen scientists may be willing to trade some personal or group privacy for achievement of goals such as advancing research, helping people impacted by a health condition, and compiling evidence that an injustice exists and should be remedied, privacy and security clearly warrant attention in any discussion of data sharing in the context of citizen science.35 The explanation of ECSA Principle 7 includes a caveat recognizing that privacy and security concerns may prevent data sharing.36 In her work on citizen science, Sharona Hoffman underlines how public access may harm those whose information is included in datasets, especially given the weakness of U.S. laws restricting use by present and prospective employers, financial institutions, and marketers.Reference Hoffman37 Mobile applications may be particularly likely to facilitate generation or sharing of data outside user expectations. For example, participants in a citizen science study of stress may wear a sensor that generates electrocardiogram data. These data indeed have value relative to monitoring and management of stress but can also reveal cocaine use.Reference Kumar38 And evidence is accumulating that disclosures about when and how data from mobile applications will be shared — including in the context of app-based research studies — do not always accord with best practices.Reference Hickvale, Torous, Larsen, Moore and Rothstein39
The universal adoption of best practices for disclosures as part of the consent process for app-based research studies would be a step in the right direction. Anne Bowser and colleagues have put forward a helpful set of privacy-focused recommendations for citizen science projects, including data obfuscation and minimizing personal information collection, training volunteers via brief modules on safe privacy practices and privacy-setting options, reminders about parameters as projects unfold, and checks for unintentionally revealing patterns in data to be made public.40 Efforts to pass laws or develop policies that would simplify and standardize consents and terms of use for DTC genetic testing and consumer-facing health-related digital technologies, such as the recently-proposed Protecting Personal Health Data Act,41 would also help to ensure that when citizen scientists make choices to trade off privacy to advance other values and interests, those choices are informed.
The caveat to promotion of higher standards and greater uniformity is the importance of nuance. For example, it is unclear that small-scale community-led citizen science projects should be governed in the exact same manner and by the exact same standards as large-scale efforts, or efforts with significant academic or corporate involvement and so greater resources at their disposal.42 And, as noted in the introduction, some citizen scientists may be leery of any standards that are imposed without a convincing justification, or that seem overly paternalistic or tied to securing the hegemony of conventional science. Nuance is also needed where the data to be shared are collected by a proxy (e.g., parent or guardian) or have implications for third parties (e.g., data from a wearable device that includes continuous video monitoring of surroundings, data that could be used to stigmatize a group), as the case for additional safeguards and/or some form of external consultation or review is especially strong for citizen science projects that include one or more of these features.Reference Nebeker43
Quality and Cost
As noted, quality and cost are other concerns that frequently surface in discussions of data sharing. There is little benefit, indeed there is harm, in proliferating and wasting resources on junk data.44 Given the variety of initiatives that count as citizen science, it would be unsurprising if the data emerging from some initiatives are of poor quality. However, Bonney and colleagues contend that, with appropriate measures, data quality can be equivalent for citizen science projects and projects carried out entirely by professional scientists.Reference Bonney, Elliott and Rosenberg45 Further, new computing tools address many data-quality issues, and advances in sensing technologies have improved accuracy in studies that employ these technologies to collect data.Reference Carlson and Cohen46 To the extent digital health companies adopt proposed best practices such as using auditable and transparent validation methods,47 this technology-driven trend toward higher quality should extend to many kinds of technology-enabled health-related research. In addition, sharing data (including making data public) itself facilitates external review, which can improve quality.48 Nonetheless, two forms of bias are worthy of attention. The first is selection bias, which arises if those who are attracted to citizen science and have the time and energy and technology to participate, and are undeterred by privacy and other concerns, are not representative of the population under study. For example, participants in a study using Apple ResearchKit are likely to be more affluent and educated than the general population.Reference Del Savio, Prainsack, Buyx, Maddox, Rumsfeld, Payne and Nebeker49 The second is the bias introduced if project leaders or participants shape data to serve personal agendas. Trials led by access-oriented patient advocacy organizations may be regarded as particularly prone to this kind of bias.
These two problems have not been solved, but recommendations have been offered, such as early and meaningful engagement of members of groups likely to be underrepresented, and transparency and the creation of incentives to use bias-minimizing research methods.Reference Wenner, Kimmelman and London50 As for cost, it is important to make the case that investments in citizen science data sharing produce commensurate benefits. Perhaps methods developed to capture the value of open science could be adapted for this purpose.51
Safety
If data sharing is understood broadly, it picks up activities such as posting information to support do-it-yourself device manufacturing. Considerable technical sophistication may be required both to create instructions that will lead to a working device and to follow those instructions. Technology hobbyists have long shared this kind of information, but the stakes are higher when, for example, the device is intended to inject a drug that will stop a life-threatening allergic reaction. In a report from the Citizen Health Innovators Project, general enthusiasm for citizen science is tempered in the case of the Four Thieves Vinegar Biohacking Collective’s posting of information on how to make a home version of the EpiPen. The report quotes Jose Gomez-Marquez, Co-director of the MIT Little Devices Lab, as stating that “‘[p]utting the idea out there… could be dangerous.’”52 Intellectual property law may limit copying of an existing device like the EpiPen, but that leaves open the question of how best to manage the dissemination of information about innovations that do more than mimic existing devices, and may cause harm if deficient or improperly executed but are unlikely to result in severe injury or death. One example would be code permitting a parent to monitor blood glucose readings from the continuous glucose monitoring system of a child with Type 1 diabetes remotely via a smartphone — a simplified description of the original basis for Nightscout.53 The parent went on to share his solution, and then the actual code, with other parents and patients with diabetes. 
These acts of sharing led to a global collaborative endeavor of further innovation and sharing, as well as a broader movement to accelerate progress in the treatment of diabetes linked to the motto “#WeAreNotWaiting” (including “#WeAreNotWaiting to bridge disconnected data islands”).54 Nightscout project leaders have responded to the safety challenge by, first, acknowledging that it exists, and second, reaching out to the U.S. Food and Drug Administration (FDA) to discuss how safety concerns can be addressed in an open source, citizen science context. For example, leaders have shared their processes for “ensuring that new features and code are well reviewed and tested before being made available to the general public, and for making sure that any reported issues that might represent a safety issue can be appropriately triaged, prioritized, fixed, and the fix distributed back to the people using the software.”Reference Leibrand55 The FDA has stressed the need for a single entity, even if nontraditional, as a point of accountability — a challenge given the nature of the group.56 The FDA has expressed particular concerns about DIY closed loop systems, like the OpenAPS, that involve automated insulin delivery. In May 2019, the FDA issued a strongly-worded warning after one such system resulted in an insulin overdose requiring medical intervention; DIY developers replied with a tweet encouraging open sharing of information about adverse events.Reference Snouffer, de Marco and Brown57
How Data-Sharing Platforms Are Addressing Concerns
While some citizen science-oriented web-based platforms simply catalog projects, CitSci.org and Open Humans are notable for doing much more. Both platforms support data uploading and sharing, and both have invested in supporting citizen scientists in conducting their work responsibly through embedded features and policies. The developers of CitSci.org have created an analytical framework that captures both volunteer personal privacy (e.g., what information about volunteers qua project members is visible and to whom) and project data openness, and a range of decision control options in each domain. Based on their experience, they believe that “decisions related to governance — that is, the balance of decision-making power regarding who can make an information sharing choice about what information to share and when to share it — are usually best left to each project.”58 At the same time, the platform sometimes nudges projects toward privacy protective practices via default settings. For example, in the domain of volunteer personal privacy, the default choice is anonymized user name for public view. The framework recognizes four options for project data openness: closed (open only to project managers), semi-open (viewable by project managers and volunteers, either all project volunteers or the project volunteer who collected the data), enhanced semi-open (also viewable by registered platform users), and open (fully public). The developers believe that a totally closed approach is contrary to the ethos of citizen science. However, they favor permitting different choices to be made for different kinds and/or levels of data. 
In particular, they state that “providing options to open or close data at the level of the individual data point brings an added benefit to a project, as data points that otherwise may have been left uncollected for fear of their being exposed, may instead be collected and protected, leading to a more complete and representative data set for analysis and even reuse.”59 They recommend that sharing decisions, especially for potentially sensitive data, be pushed down to individual volunteers.
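The four openness levels and the per-data-point choice described above can be illustrated with a minimal sketch. It is not CitSci.org's implementation: the names Openness, Role, DataPoint, and can_view, and the collector_only flag, are hypothetical and introduced here only for illustration, on the assumption that each data point may either carry its own openness level or inherit the project default.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Openness(IntEnum):
    """Hypothetical encoding of the four project data openness options."""
    CLOSED = 0              # open only to project managers
    SEMI_OPEN = 1           # project managers and project volunteers
    ENHANCED_SEMI_OPEN = 2  # also registered platform users
    OPEN = 3                # fully public


class Role(IntEnum):
    """Hypothetical viewer roles, from least to most privileged."""
    PUBLIC = 0
    REGISTERED_USER = 1
    VOLUNTEER = 2
    MANAGER = 3


@dataclass
class DataPoint:
    collector_id: str
    value: float
    # None means "inherit the project-level default" (an assumption).
    openness: Optional[Openness] = None
    # Under SEMI_OPEN, restrict viewing to the volunteer who collected it.
    collector_only: bool = False


def can_view(point: DataPoint, project_default: Openness,
             viewer_role: Role, viewer_id: Optional[str] = None) -> bool:
    """Return True if the viewer may see this individual data point."""
    level = point.openness if point.openness is not None else project_default
    if viewer_role == Role.MANAGER:
        return True  # managers can view data at every level
    if level == Openness.CLOSED:
        return False
    if viewer_role == Role.VOLUNTEER:
        if level == Openness.SEMI_OPEN and point.collector_only:
            return viewer_id == point.collector_id
        return True
    if viewer_role == Role.REGISTERED_USER:
        return level >= Openness.ENHANCED_SEMI_OPEN
    return level == Openness.OPEN  # unregistered public viewers
```

In this sketch a volunteer could mark a single sensitive reading as closed while the rest of the project stays open, which is the kind of data-point-level choice the developers argue leads to more complete data sets.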
CitSci.org also addresses quality concerns. For example, the developers have built in data management and metadata documentation features. Metadata entries are optional, but completion indicators and the award of badges for metadata excellence encourage projects to take full advantage of features that enhance quality and increase transparency. Further, these and other platform features are continually refined based on user feedback to ensure that they are tailored to user needs.60
CitSci.org is intended to serve the full range of citizen science projects, whereas Open Humans, as the name suggests, facilitates sharing of human data. As of March 2020, 25 projects including the Nightscout Data Commons were listed on the Open Humans “Explore and Share Your Data” page.61 As with CitSci.org, many choices about data access and other aspects of governance are left to individual projects and participants. Guidance for projects is conveyed via a suite of policies. A Project Guidelines document lays out expectations for projects using the platform in areas such as data management and security. Regarding data management, the guidelines include: explain the data you’ll receive and your data security measures, be aware of existing de-identification standards and risks of re-identification (and don’t look to Open Humans to de-identify data for you), don’t ask for more data than you need, and share data with project members. The security section covers practices such as using HTTPS to encrypt interactions, monitoring and limiting administrative access, and constantly updating operating systems and software packages to ensure installation of the latest security features.62 The Data Use Policy has an extensive section on the “inherent identifiability” of data, including the potential for genomic and location data to lead to re-identification.63 The Public Data Guidelines document includes a warning related to data quality and safety concerns: “Use at your own risk.”64
The Open Humans website is itself an open source project, and as with CitSci.org the developers work with projects to troubleshoot challenges and to enhance its features. Dana Lewis, an administrator of the Nightscout Data Commons as well as an inventor of the OpenAPS, has written: “I’ve been able to work closely with Mad [Ball, Open Humans co-founder and Executive Director] and suggest the addition of a few features to make it easier to use for research and downloading large datasets from projects. I’ve also been documenting some tools I’ve created (like a complex json to csv converter…), also with the goal of facilitating more researchers to be able to dive in and do research without needing specific tool[s] or technical experience.”Reference Lewis65 The instructions for adding data to the Nightscout Data Commons on the Nightscout Foundation website are simple. The webpage includes a reminder that data deletion/removal is always an option (and a commitment to best efforts to remove data from active/ongoing research studies if that further step is desired), and it describes how to learn who has been approved to access data, a feature of the Open Humans platform.66
The Case against an Obligation to Share
The caveats and concerns reviewed above weigh against a maximalist approach to data sharing, and we have described how several of the platforms created for citizen science data sharing accommodate limits to sharing based on data sensitivity and other factors. But we have not yet considered direct challenges to the notion that citizen scientists have a prima facie duty to share data. One challenge might begin with an argument that the existence of a general duty to benefit others through enterprises like data sharing is debatable. For example, the philosopher Bernard Gert took the position that apart from duties to benefit others associated with professional and other specific roles, the only moral obligations that exist are captured by rules against causing harm to others or doing evil.Reference Gert67 Beneficence, then, is a moral ideal rather than a moral duty, and while data sharing might be praiseworthy it would not generally be morally required of uncompensated volunteers. Further, one might argue that codes of ethics and values statements trumpeting a commitment to data sharing and open science likely reflect the views of more privileged, more highly resourced, and less vulnerable participants in citizen science, rather than universal sentiments in the citizen science community. In addition, while many may claim the identity of “citizen scientist,” thereby suggesting an allegiance to the norms associated with the roles of “citizen” and “scientist,” some individuals or groups will simply be engaged in an activity that third parties label as citizen science.
Finally, the context and consequences of a commitment to data sharing are very different for citizen scientists as compared to professional scientists. Data sharing is associated with financial costs as well as costs in the form of investment of time and effort and experiences of frustration, although platform developers may seek to minimize burdens. Professional scientists have, or may more readily seek, funding and can hire support staff in order to minimize the extent to which they personally bear these costs. Professional scientists are also arguably more likely to benefit in reputational and financial terms from data sharing (e.g., building good will with professional colleagues and potential collaborators, gaining promotion on the basis of “team science” activities, qualifying for or retaining large research grants tied to data sharing).
If any of these arguments are even somewhat persuasive, the fallback position might be advocacy for a limited, qualified obligation to share data on the part of citizen scientists. The case for an obligation is strongest where (a) an individual or group has generated data or made discoveries related to a health condition associated with great suffering, such that failure to share that data or discovery would be perceived by many people as wrong or even evil, or (b) the data or discovery is unlikely to be generated or made by others, as in the case of a rare disease. Also, the financial and nonfinancial costs associated with data sharing should not be excessive. In many cases, the most that should be asked of volunteers would be allowing a project coordinator or other leader to share data (e.g., by not asserting copyright and other rights to impede sharing), with a privacy caveat for sensitive data. The considerations reviewed above also justify investment in efforts to minimize the time and effort required of citizen scientists in order to share data, and praise and support for sites like CitSci.org and Open Humans and individuals like Dana Lewis.
Citizen Scientists and Journal Publication
As noted above, publication plays a role in enhancing data quality. Publication in a journal may be especially valuable for quality enhancement if articles are vetted through a rigorous peer review process. Publication also helps ensure that results are findable by professional scientists and other citizen scientists. In the DIYbio Europe draft Code of Ethics, “results” are included in the list of things to be shared.68
Publication of citizen science-related papers in journals is increasing, although in one analysis the number of publications in the “medical” category was small relative to publications in categories such as avian, marine, terrestrial invertebrate, plant, astronomy, and environmental.Reference Follett and Strezov69 Also, some of the activity captured by publication counts may be reflective of articles in “outreach” sections of journals. If that is the case, there may be barriers preventing a ramp-up in articles reporting research findings on par with the recent expansion of health-related citizen science. Bonney and colleagues have suggested that lack of acceptance by first-tier journals, or consignment to sections that would not be considered for “real” science, reflects overgeneralization of quality concerns about citizen science or simple prejudice, since as noted above citizen science can be equivalent to conventional science in quality with appropriate data quality assurance measures.70 Publication fees and requirements such as institutional review board (IRB) approval may also be impediments to publication for citizen science projects that lack institutional affiliations and the resources to pay fees and engage a commercial IRB. Even apart from the financial aspect, IRBs “may promote decisions specific to data ownership, data management, and informed consent that directly conflict with the aims of research that is explicitly participant-led.”Reference Grant, Wolf, Nebeker and Cooper71 The desirability of IRB involvement in citizen science is discussed elsewhere in this issue. There is room to debate whether citizen scientists should be subject to the same ethics review requirements as professional scientists, so long as they can describe a reasonable process for protecting the rights and interests of anyone who would be a human subject were the research regulated.
The challenges of achieving acceptance by scientific journals may be especially great for citizen scientists reporting the results of n-of-1 studies. While Sean Ahrens, the founder of Crohnology, succeeded in publishing the results of his self-experimentation with the ingestion of pig whipworm eggs in the American Journal of Gastroenterology, the article was accompanied by an extensive editor’s note justifying its inclusion in a medical journal. The editor concluded the note by denying any bearing of the article on judgments about treatment efficacy and suggested that the outcome of reading articles of this type would be a growth in clinician empathy rather than any change in clinical practice.Reference Ahrens72 Many citizen scientists might be grateful for the inclusion but infuriated by the hints of condescension.
Finally, it is worth noting that there are complex questions regarding authorship when citizen science findings are published in journals. The Vancouver Protocol requires involvement in drafting or revising a manuscript for authorship. Some journals do not specify a role in writing as a criterion, but Sarna-Wojcicki and colleagues note that the editors of these journals still envision authors as having email accounts and being literate in English.Reference Sarna-Wojcicki73 At present norms seem to be in flux. For example, the Our Voice family of citizen science projects encompasses a journal article in which the author line includes “On Behalf of Burnie Brae Citizen Scientists” and citizen scientists are listed by name in the Acknowledgements.Reference Tucket74 It also encompasses a journal article in which there is no mention of citizen scientists in the author line and the community partner is listed by name in the Acknowledgements but the citizen scientists are thanked collectively.75 Consistent with best practice for professional scientist-professional scientist collaborations, ideally discussions about authorship will be initiated early on in collaborations between professional scientists and citizen scientists and among citizen scientists, and continue as projects evolve. Especially in the context of research with indigenous communities, sharing authorship and other benefits of the research with citizen scientists can demonstrate respect, build trust, strengthen long-term collaborative relationships, and reduce the risk of misunderstanding and misappropriation of knowledge.Reference Castleden, Morgan, Neimanis, Smith, Bélisle-Pipon and Resnik76
Conclusion
There is considerable overlap between the citizen science and open science movements. Unsurprisingly, then, open sharing of health-related data is an aspiration that is strongly supported within the citizen science community. Publication of findings is also endorsed. At the same time, the diversity of health-related citizen science projects and the different kinds and degrees of participant vulnerability counsel against simple endorsement of data sharing and publication of findings as universal norms. Likewise, the sensitivity of much health-related data should spark skepticism about any insistence on unrestricted public access as the only approach to data sharing consistent with a commitment to open science. We have illustrated how champions of data sharing in the citizen science space are acting upon a nuanced understanding of the data sharing imperative by, for example, developing platforms for data sharing that involve members of the communities whose data are being shared in decision making about levels of access and other aspects of governance. We have also touched on some continuing challenges, including addressing quality and safety concerns. The creativity and commitment to continuous learning on display in the platform development efforts we describe fuel our hope that citizen scientists and their allies will use these qualities to make progress in resolving the remaining issues.