There is evidence that the bodies, activities, and experiences of women patients are not adequately reflected in the design and clinical use of medical implants. There is also evidence of worse outcomes for women from at least some types of implant failure, and it is likely that the two are connected. In this article I explore this issue, focusing mainly on artificial joints such as hips and knees. In the first part, I outline the problem of disproportionate and unacceptable levels of implant failure in women. The problem manifests mainly at the level of outcome statistics, and mechanisms of bias remain hidden. Given this, I explore opportunities for gender bias in the design, testing, regulation, and use of implants. This analysis reveals that specific instances of inattention and bias are likely to be small, can be difficult to identify, and the risks are difficult to predict. This means that if gender bias in implant design is an ethical issue, it may be one with no clearly blameworthy player. Furthermore, from a practical perspective there is no single obvious point at which to intervene. In part II I argue that philosophers working in other areas have explored structurally similar moral problems (sometimes referred to as “moral aggregation problems”), such as the type of environmental harm caused by cumulative small actions of many players, injustices arising from the global labor market, and implicit bias in the workplace. I describe some of this work and identify three features shared by different moral aggregation problems that should be addressed in any solution. 
These are: (1) structural factors: aggregative harms are usually more than the mere accumulation of enough smaller acts, and structural factors are also important; (2) expediency: the act or acts that comprise the harm are expedient for those who contribute to it; (3) visibility factors: the smaller acts that jointly bring about aggregative harms may be hard to see or identify, and the agents who act in these ways may be unconscious of doing so. In the final part of the article, I draw on my analysis of these shared features to identify strategies for addressing gender bias in medical implant design and use.
It is worth noting that throughout the article I refer primarily to “gender biases” rather than sexism or sex differences. This is despite much of the medical literature I draw on distinguishing male and female research participants based on a biological understanding of sex. My concern is to explore ways in which the exclusion and disregarding of women in design, testing, and use of devices can lead to harm, and propose solutions to this. Although clinical research focused on sex differences is an imperfect tool for identifying the nature and extent of differences in outcomes for women, it remains the best current source of this information. Some clinical research is beginning to explore how gender and sex come apart in impacts on patient outcomes, but such research remains rare and is fraught with methodological challenges (for example, Pelletier et al. 2016; Norris et al. 2017).
I. Gender and Medical Implant Failures
In 2010, medical device company DePuy Orthopaedics issued a worldwide recall of its ASR (articular surface replacement) hip system.[Footnote 1] The recall came after more than 93,000 ASR hips had been sold, and the company had been aware for years of high failure rates and serious complications (Cohen 2011; Therapeutic Goods Administration 2011). The ASR hip was one of a number of innovative hip replacements that used metal surfaces for both the ball and the socket. These “metal‐on‐metal” hips were supposed to be longer‐lasting and had been advertised as a better option for young patients (Silverman, Ashley, and Sheth 2016). In practice, however, this did not turn out to be the case. Rates of wear on the components were higher than bench‐testing predicted, and the harms associated with debris from wear were also more serious than anticipated. Patients experienced inflammatory responses, high metal ion concentrations in their blood, and painful growths at the site of the implant (pseudo tumors). Raised metal ion levels put patients at risk of other complications, including serious cardiovascular illness, changes to thyroid function, and negative impact on cognitive and nerve function (Silverman, Ashley, and Sheth 2016). The ASR hip had particularly bad outcomes, but other metal‐on‐metal hip implants also have high failure rates and are associated with similar harms (Cohen 2012).
Not only did the hips perform much worse than predicted, they also failed at much higher rates in women than in men (Smith et al. 2012; Inacio et al. 2013; Haughom et al. 2015). This finding was not expected at the time, but resonates with more general concerns about the safety and effectiveness of medical devices for women patients. A number of studies have found gender differences associated with orthopedic implants, including women having higher sensitivity to the metals found in joint replacements (Caicedo et al. 2017), and female pediatric patients having worse outcomes after plate implant to repair displaced forearm fracture (Vopat et al. 2014). Cardiac devices have also been a focus of concern, with evidence that women are less likely than men to benefit from some implantable cardiac devices and more likely to have complications (Dhruva and Redberg 2012). Tissue repair meshes, too, have been associated with poor outcomes for women. A recent example is the unacceptably high complication and revision rates associated with synthetic mesh repair of vaginal prolapses (Menchen, Wein, and Smith 2012).
Statistics showing that one group has worse outcomes than another from a device can indicate the existence of bias, but understanding why it comes about and how it can be addressed requires further investigation. In what follows I present an original analysis of the risk factors for gender bias in the design, testing, regulation, and use of implantable hips. The outcomes for patients implanted with medical devices such as metal‐on‐metal hips depend on many factors that can be grouped broadly into two categories: (1) device factors, such as the design of the implant and its suitability for treating the patient's condition; and (2) features of the clinical encounter, such as the timeliness of the diagnosis and referral, and quality of communication between treating clinician and patient. I focus first on device factors and then on factors within the clinical encounter.
Device Factors
The successful design of medical implants requires the incorporation of relevant information about prospective patients into the design of the device. This can include how heavy or light, tall or short, active or sedentary patients will be, and what sorts of activities they will need to be able to do. Women and men may tend to differ in their characteristics, clinical needs, or activities. Gender biases can be built into devices when the possible impact of such differences is ignored in the design, bench‐testing, clinical testing, and regulatory approval processes.
In the case of artificial joints, differences between men and women can be influenced by underlying anatomical and biological factors as well as social factors. For example, joint stability in women can be affected by hormonal cycles and pregnancy as well as fitness (Marnach et al. 2003; Shultz et al. 2005). Gait is influenced by a combination of factors, including underlying biomechanics, footwear, and the types of physical activities the person routinely does. Studies indicate that men and women tend to have differences in gait, such that the sex of a walker can often be determined by gait alone (Barclay, Cutting, and Kozlowski 1978; Kozlowski, Brooks, and van der Zwan 2016).[Footnote 2] Ordinary activities that put significant strain on joints may also tend to be different for women and men.
One activity that is not taken into account by those who design and test artificial hips is sexual intercourse. This is an ordinary activity for many people who have hip replacements (Lavernia and Villa 2016). A recent motion‐capture study demonstrated that there are significant differences in the range of motion and stress on the hip joint during sexual intercourse for women and men (Charbonnier et al. 2014). These types of movements are not reflected in wear tests and simulations used in hip‐joint bench‐testing, so gender differences in performance or safety of the joint under these stresses are not tested.[Footnote 3] The study tested twelve sexual positions, and found that eleven of these were safe for men, whereas only eight were safe for women.[Footnote 4] Safe positions were those that did not require movement beyond the limits of the artificial joint. Movements beyond these limits can lead to dislocation, which is known to sometimes occur during sexual intercourse (Dahm, Jacofsky, and Lewallen 2004).
As well as indicating a potentially greater dislocation risk for women during sexual intercourse, the motion‐capture study found that all the sexual positions tested involved greater range of motion for the hip joint of the woman than for the man. Greater range of motion is associated with higher wear due to the impact of pressure on the edge of the artificial hip socket (Mellon et al. 2013; Matthies et al. 2014). This can result in fragments breaking from the edge of the socket, a problem that tends to compound because the debris gets into the joint and increases the rate of deterioration (Matthies et al. 2014).
Women suffer from both dislocations and wear‐related complications of metal‐on‐metal hips at higher rates than men do (Smith et al. 2012). The motion‐capture study provides evidence that at least one activity, sexual intercourse, is likely to produce higher risk of dislocation and wear for women's hips than men's. There may well be other activities that involve greater risk for women than men. For example, Stephen Mellon and his colleagues found evidence of greater edge‐loading (and thus risk of wear) on metal‐on‐metal hips for women than for men when moving from a seated to standing position (Mellon et al. 2013).
The different impact of activities such as sexual intercourse can interact with biological factors to compound gender disparities. For example, the same amount of metal‐on‐metal hip joint wear may lead to worse health problems in women than in men because women's bodies tend to be more sensitive to wear particles (Caicedo et al. 2017). Moreover, women on average have lower blood volume than men, so the same amount of wear can result in higher blood metal ion concentrations.
Existing design processes and bench‐testing regimes do not predict or investigate the impact of any of these gender differences. One way to make up for this inattention to gender would be to conduct high‐quality clinical testing on women and men. However, this does not always occur: only some devices are tested in clinical trials with human patients. Those that do not undergo clinical testing are approved based on their similarity to devices that have been found safe in historical clinical trials (Ardaugh, Graves, and Redberg 2013). Unfortunately, between 1977 and 1993 the United States Food and Drug Administration (FDA) human research ethics guidelines for pharmaceutical trials recommended the exclusion of women of childbearing age from specific types of clinical research (FDA 1977; Vidaver et al. 2000). This measure was intended to prevent risks to pregnant women and their unborn children, such as had occurred from the use of thalidomide. However, it led to very low representation of women in all types of clinical research, and to a lack of knowledge about gender differences in diseases and responses to treatments (Kirschstein and Merritt 1985).
The historical exclusion of women from clinical trials, combined with the reliance of device approvals on the similarity of new devices to existing devices approved in the past, means that many medical implants in current use have been approved based on clinical data from trials that excluded women. The ASR hip was approved on the basis of its similarity to more than sixty related devices, including at least twenty‐one approved before 1993 when the guidelines changed (Ardaugh, Graves, and Redberg 2013).
Current clinical trials are not always more gender‐balanced than historical trials, a well‐documented problem in cardiovascular disease research (Dhruva and Redberg 2012; Zusterzeel et al. 2014). Furthermore, when women are included in research, many studies still fail to record the sex or gender of participants, or fail to analyze outcomes by sex or gender (Zucker and Beery 2010). A recent study of rates of sex‐specific analysis in orthopedic research found that fewer than a third of studies published in the top orthopedics journals (only 30% of those published in 2010) included sex‐specific analysis (Hettrich et al. 2015). Without sex‐ or gender‐specific analysis, the potential of a clinical trial to illuminate gender differences or similarities is lost, even if the trial included women participants.
One way of exerting pressure on manufacturers to take account of gender differences in the development of devices is to require sex‐ and gender‐specific data for the approval of devices. To date, regulators have taken few steps to ensure this. Recent guidance from the FDA on the inclusion of women in device clinical trials is nonbinding, and gives leeway to developers in determining whether and to what extent women need to be represented in clinical trials and which gender differences need to be reported and included in labeling (FDA 2014).
Regulators have paid even less attention to the risk of gender bias within design processes and laboratory‐based testing. There is no prompt for FDA staff to ask whether gender has been considered in the development of testing equipment (such as hip simulators used to test wear) or whether laboratory test data (and reference standards or safety thresholds it is evaluated against) reflect both male and female patients.
Regulatory bodies are not the only bodies with the power to influence device design and testing. Standards organizations, such as the International Organization for Standardization (ISO), have significant influence over which aspects of device design and performance are subject to laboratory tests and how these tests are carried out. They are well positioned to influence the behavior of the device industry, and can also drive changes in expectations by regulators. However, at present, gender differences are largely ignored in standards and there is no evidence of any effort to address this.
This may be partly because standards bodies are dominated by industry. According to the ISO: “Industry experts drive all aspects of the standard development process, from deciding whether a new standard is needed to defining all the technical content” (ISO 2017). Industry players are unlikely to press for the collection of information about the performance of their devices for specific patient subgroups (such as male and female patients). Such information may limit the market for a device, impose arduous labeling requirements, or indicate the (clinical) desirability of different devices for men and women—outcomes that would not be financially attractive to manufacturers.
There is little transparency in ISO processes for developing new standards, so it is difficult to find out who sits on the committees that develop and approve standards, or to discover the criteria they use. However, there is no evidence that standards committees are expected to consider matters of equity and diversity when deciding which materials, products, or processes should be subject to standards, or when deciding what these standards should be. Thus it is unlikely that the ISO committees responsible for hip‐wear testing standards would have considered whether the simulator set‐up, loads, and movement cycles were equally representative of women and men patients.
The combination of inattention to gender in current design and bench‐testing processes; exclusion of women from past clinical trials; low rates of analysis and reporting sex and gender differences in current clinical trials; and failure of regulators and standards bodies to set strong expectations around the evaluation of sex and gender differences provides multiple opportunities for the design and approval of devices that do not reflect important sex and gender differences.
Factors in the Clinical Encounter
Even when careful research does reveal gender differences in outcomes with a particular type of device, or leads to the development of special devices for women or men, it falls upon clinicians to ensure that this informs treatment decisions, implant selection, and counseling for patients. If the factors that affect outcomes cannot be addressed by creating gender‐specific implants, the mitigation of risks may require advice to patients from their treating doctor. An example would be advice about which sexual positions are safe with an artificial hip.
There is evidence that the nature and quality of clinical consultations are influenced by clinician and patient gender in ways that disadvantage women, thus potentially compounding (rather than mitigating) the device factors described above. The combination of male clinician and female patient, for example, appears to be less patient‐centered than other gender combinations, and clinicians are more paternalistic in this combination (Sandhu et al. 2009). These findings come from observational studies, and clinicians may be unaware of differences in their consultation style when treating women as compared to men. Orthopedic surgery has the lowest representation of women of all surgical specialties: 3% in the UK, 5% in Australia, and 6% in the US (Australian Orthopaedic Association 2018). As a result, the combination of male clinician and female patient is common in hip replacement.
Advice about specific types of risk, such as those associated with sexual activity, may be particularly susceptible to gender bias. As a consequence of the design limitations of hip prostheses, women are more likely than men to have more limited sex lives following a hip replacement. This disparity is typical of the wider trend to downplay women's sexual problems, including the impact of disease and treatment on women's sex lives (for example, Tiefer 2001; Bancroft, Loftus, and Long 2003). It is also exacerbated by poor communication from clinicians about sex after hip replacement. Surgeons have long been inattentive to the performance of hip implants during sexual activities. This is the case even though hip arthritis patients regard being able to have enjoyable sex as an important quality of life issue affected by their arthritis and hip replacement. Patients are keen for further information about the impact of treatments, including joint replacement, on their sex lives (Currey 1970; Stern et al. 1991; Laffosse et al. 2008; Meiri, Rosenbaum, and Kalichman 2014; Lavernia and Villa 2016). Surgeons, however, often don't discuss sex with hip replacement patients. For example, Diane Dahm and her colleagues found that 80% of surgeons do not initiate such conversations, speculating that reasons for this could include a lack of reliable information to inform the conversation, or surgeons feeling uncomfortable about raising the topic (Dahm, Jacofsky, and Lewallen 2004).
Dahm and her colleagues also found that surgeons are more likely to initiate conversations about sex with younger patients and with married patients, suggesting that assumptions about patients’ likely sexual activities may influence whether or not the topic is discussed.
Surgeons’ reluctance to discuss sex affects women disproportionately for several reasons: women are more likely than men to experience sexual difficulties due to hip osteoarthritis (Currey 1970; Laffosse et al. 2008; Lavernia and Villa 2016); sexual difficulties are more likely to play a part in women's decisions to undergo hip replacement (Laffosse et al. 2008); and women are less likely than men to initiate a conversation with their surgeon about the impact of hip replacement on sexual function (Dahm, Jacofsky, and Lewallen 2004).
This analysis of clinical factors suggests that clinicians may not discuss known gender differences in hip replacement, such as risks associated with sex. They may also act under more general forms of gender bias that influence their manner in clinical consultations. As a consequence, current clinical practice may tend to exacerbate the risks women patients already face due to device factors.
II. Gender Bias in Implant Design and Use as a Moral Aggregation Problem
The type of gender bias involved in poor outcomes for women from medical implants is far from simple. There is no clearly blameworthy individual or group. The relative contributions of the various factors to a given device failure may be difficult to determine. In the case of hips, whereas one woman might have a dislocation during sexual intercourse, another woman might suffer due to metal hypersensitivity, and another might have an underlying clinical problem that makes the implant less viable. In a particular case, negligence or bias is likely to be involved, but is not necessarily involved. The biases in question are often small and difficult to see when taken on their own, are likely to interact in complicated ways, and are spread throughout a complex system. They are related to all aspects of design, testing, and implantation as well as clinical interactions between doctors and patients. Furthermore, many of these biases are likely to be implicit, meaning that players such as biomedical engineers or clinicians are not aware of their biases or are not able to control them consciously (Brownstein and Saul 2016).
The analysis I have offered suggests a real health disparity between genders in the development and use of implantable devices. Yet it is not a health disparity that can be straightforwardly identified or tracked through routine collection of health data, in the way that other health disparities, such as differences in mortality rates between populations, often are. The data required to do so is not routinely collected and would be difficult and expensive to collect. Nor is there a single body with clear responsibility for addressing the disparity. This creates challenges both for recognizing and addressing the problem.
Despite these challenges, the situation is not hopeless. This ethical problem is structurally similar to several problems in other areas of applied ethics. Specifically, it is similar to what have sometimes been called aggregation problems, in which small and often separately harmless actions by individuals and/or groups in different parts of a system interact to give rise to a new harm that is more than the sum of its parts. Just as gender bias in the development and use of implants is initially identified through population‐level data, which prompts investigation of underlying causes, other moral aggregation problems are often diagnosed by attending to the aggregate harms and working back from these to identify the small factors that caused them.
In the remainder of this article I will unpack several general features of moral aggregation problems. I will argue that these features are present in the problem of gender bias in medical device design and use. Understanding how they have been addressed in other contexts provides new insights into how this problem can be solved. My analysis draws on familiar instances that have been explored by philosophers: the tragedy of the commons; moral aggregation problems in global justice (such as the global trade conditions that result in exploitative labor practices); and aggregative workplace discrimination that involves the accumulation of micro‐inequities. The analysis is intended both to apply to gender bias in implant design and use, and to generalize to other moral aggregation problems.
One of the most familiar problems in applied ethics that involves aggregation is the tragedy of the commons, in which an ecosystem that is shared collapses due to overuse. Garrett Hardin argued that this occurs as an inevitable consequence of the rationality for individuals to increase their use of a shared resource (Hardin 1968).[Footnote 5] The problem is often described in terms of simple, idealized cases. Hardin, for example, begins with a case in which graziers run their herds on shared land. He argues that the benefit of adding another animal to a herd is wholly enjoyed by the individual grazier who owns that herd, whereas the cost (in terms of environmental degradation) is shared with the other users of the common. As such it is always rational for each grazier to add another animal (Hardin 1968, 1244), but if they all do so the environmental impact will be devastating.
Urgent real‐world problems involving commons, such as pollution and climate change, do not involve such limited numbers of relatively equally positioned individuals as in Hardin's idealized case. Instead, they involve countless unequally positioned individuals, private institutions (both local and multinational), and public institutions at different levels, from local councils to global organizations such as the United Nations. Nevertheless, the logic is often the same: individuals, companies, or nations derive a direct benefit from their use of the resource, whereas costs—in terms of cumulative harm to the environment—are shared.
It is not only large‐scale environmental problems that have this sort of structure. Judith Lichtenberg has coined the term “new harms” to refer to a cluster of moral challenges that arise from globalization (Lichtenberg 2010). These moral challenges include exploitation resulting from global labor practices, and wars associated with mining conflict minerals for global markets, such as the ongoing conflict in the Democratic Republic of the Congo. These problems are characterized by geographical distance and complex relationships among unequally situated individual and group players.
The process of medical device design and use is also complex in this way. It involves individual players with varying degrees of power: vulnerable individual patients; individual biomedical engineers; individual employees of regulators and device companies; and individual surgeons who often wield significant power in hospitals. In addition, it involves local, national, and multinational group players including hospitals, national regulators, international standards bodies, and device companies that range in size from small businesses to powerful multinational corporations.
Often the individual acts that accumulate to cause harm in these cases are benign. There is nothing essentially harmful about animals grazing on the commons, buying a pair of shoes, or recruiting a male participant into a hip implant trial. The harm of environmental degradation emerges once the harmless acts aggregate beyond a tipping point. The harm of labor exploitation involves many people buying shoes in a competitive marketplace, alongside other features of the economic and political systems within which the production and sale of the shoes occurs. Underrepresentation of women in clinical trials occurs if individual recruiters systematically treat male and female participants differently, or if the trial protocols are designed and implemented in a way that excludes more (or all) women.
The harmlessness of many of the individual acts in such cases has prompted Elizabeth Kahn to describe the tragedy of the commons as an “essentially aggregative” harm: “more than the sum of many acts which are harmful; rather, it is a harm that only results when the actions in question occur together” (Kahn 2014, 226). Samantha Brennan has referred to aggregates of this type as “moral lumps” due to the nonlinear way that the harms and goods accumulate, and the significance of particular quantities or structures in the emergence of the good or harm (Brennan 2006). Although the term is rather imprecise, Brennan's idea of lumpiness is highly suggestive of the nature of the quality that is of interest. It is not mere aggregation, but the accumulation of instances into an aggregate of a particular kind.
Structural factors play a significant part in the emergence and nature of aggregative harms. Structural aspects include the pattern of the small acts that accumulate, such as their frequency and concentration. They include the social structures, such as political and corporate hierarchies, that mediate players’ interactions. They also include geographical factors, such as the physical distance between purchasers of fast fashion and those who manufacture it. The lumpiness involved in many moral aggregation problems is challenging because it means that otherwise harmless or insignificant acts become harmful. More controversially, it might mean that otherwise morally acceptable acts become wrong when they contribute to the formation of a “lump” with a particular size and structure.
In the case of medical device design and use, there are various structural factors mediating interactions among key players. Physical distance is one factor and is particularly relevant for devices that are developed and sold by multinational device companies. Other, nonphysical barriers prevent players who are geographically close from interacting efficiently. Biomedical engineers who work for device companies can have their opportunities to interact with other players and share ideas hampered by intellectual property concerns and other commercial considerations (Heller and Eisenberg 1998). This could mean lost opportunities to learn about and collaborate on addressing poor outcomes for women patients. Some biomedical engineers develop devices in universities and may tend to work in disciplinary silos (Pober, Neuhauser, and Pober 2001). Increasingly, universities too are concerned about secrecy and protection of knowledge when their researchers collaborate (Cohen and Walsh 2007). Patients, who are the most vulnerable parties in the medical implant case and also those whose experiences must be factored into design and processes to prevent harm, are isolated from the other players partly by medical hierarchies and their lowly position within these. As such, patient representation and influence on device design and testing is limited, despite recognition of their importance for developing devices (Shah and Robinson 2007; Hutchison, Rogers, and Entwistle 2017).
The way the harms accumulate in the medical device case is also complex. For example, when it comes to device design and testing, inattention to possible gender differences by individual biomedical engineers is not challenged by expectations of the regulators who will approve or reject their design. It is also not challenged by the existence of standards from the ISO that reflect likely gender differences.
Structure is not the only important feature of moral aggregation problems. One feature that Hardin's analysis of the tragedy of the commons emphasizes is the expediency for each individual grazier of adding another animal (Hardin 1968). In fact, the expediency for individual players of the acts that contribute to aggregative harms is often a significant factor in their accumulation. This is certainly true in the medical device case. Biomedical engineers involved in designing implants (and those who design simulators and other testing equipment) are constrained by expectations of productivity. The market case for developing a new device is affected by the need for longer or more consultative design processes, such as involving patients in device design (Shah and Robinson 2007). Cost considerations also affect decisions about the need for more tests or use of specialized testing equipment. Regulators, too, often work under tight funding constraints. They experience pressure to approve products within short timeframes to ensure that beneficial devices and medicines can help patients sooner rather than later (for example, Schattner 2016). Clinicians are often under pressure to limit consultation times, whether this be to ensure all patients receive timely care or to maximize their income.
These expediencies are not necessarily unacceptable: there are moral imperatives for efficiency in healthcare. Ethicists who write on the topic of healthcare justice recognize that solutions to health disparities need to accommodate resource constraints. This often involves a trade‐off between maximizing the total amount of healthcare that can be achieved with the resources available and ensuring fairness across different patients and health conditions (Cookson and Dolan 2000; Persad, Wertheimer, and Emanuel 2009). What is challenging about aggregative harms is that the expedient acts may be benign when taken on their own, but harmful in aggregate within a system. In a resource‐poor setting there might in fact appear to be a moral imperative to act in these ways (which may be regarded as desirably efficient rather than expedient) to maximize value from the resources available. It is not only the apparently benign nature of the acts that contributes to this: these acts may also be unconscious or invisible, or their relationship to the aggregative harm may not be understood by the actors.
Structure and expediency are important features of aggregative harms, but they interact with another factor: visibility. The acts that aggregate to cause harm are often invisible, and the harm they give rise to is often unpredictable. There is more than one way in which visibility can be challenging. Structural factors, like geographical distance, limit the visibility of harms to some players. Shoe purchasers in the US or Australia are unlikely to interact with the factory workers who make their shoes, and are not necessarily familiar with work conditions in countries where they are made. Likewise, biomedical engineers do not meet patients suffering from device complications. Expediency can also interact with visibility, especially in contexts such as the public health system, where an emphasis on efficient use of public funds might frame all decision‐making and tend to obscure other sorts of moral considerations.
In addition to these visibility issues, the medical implants case involves gender biases, some of which are likely to be implicit. Implicit biases result from attitudes that a person has but may not be aware of, or may not be consciously able to control (Brownstein and Saul 2016). Thus, players’ own motivations may be invisible to them. These could include the attitudes that underpin some of the phenomena described in part I, such as the different consultation styles by clinicians depending on patient gender, or employees of regulators deciding on nonbinding (rather than binding) guidelines for the inclusion of women in trials and reporting trial outcomes by sex and gender.
One context in which the visibility of small harms, including visibility issues related to implicit bias, has received considerable attention is workplace discrimination. It is widely accepted that the biases that harm women and members of other socially disadvantaged groups in their careers include very small differences in treatment that may be invisible both to the perpetrators and to those against whom the bias occurs. The term “micro‐inequities” was coined by Mary Rowe to explain the way that small, hard‐to‐see biases in treatment can have a cumulative effect, affecting women's lives in significant and harmful ways that cannot be understood by focusing on individual instances of harm (Rowe 1974, 1977, 2008).
Rowe's process of identifying these small factors was organic; appointed at MIT as an ombudsperson in 1973, she heard innumerable stories of small slights from women and members of minority groups. She came to believe that these small factors were cumulatively significant in the underrepresentation and low status of members of these groups, and that addressing them was critical for bringing about change. Writing in 1977 on the types of micro‐inequity she discovered, she said:
These minutiae are usually not (practically speaking) actionable; most are such petty incidents that they may not even be identified, much less protested. They are however important, like the dust and ice in Saturn's rings, because, taken together, they constitute formidable barriers. (Rowe 1977, 56)
The connection of workplace micro‐inequities to other moral aggregation problems has rarely been emphasized (a notable exception is Brennan 2013, 2016), partly because the emphasis of philosophical reflection on micro‐inequities has tended to be on the invisibility to their perpetrators of the biases in question, rather than the aggregative nature of the harms they cause. Discussion has focused largely on whether individuals can be morally blameworthy for behavior that results from a bias they are unaware of or cannot control (for example, Holroyd 2012; Saul 2013; Levy 2017). Discussing micro‐inequities alongside other moral aggregation problems highlights both the aggregative aspects of harm from micro‐inequities and the relevance of visibility to other moral aggregation problems.
To address gender bias in medical device design, testing, and use will require making the bias visible, addressing the structural factors that underpin it, and challenging expedient practices in designing, testing, and using implants.
III. Addressing Aggregative Harms: Lessons for the Medical Implant Case
I have argued that aggregative harms have three features that make them difficult to address. These are: (1) structural features, (2) the involvement of expediency considerations, and (3) their invisibility. I have also indicated how these features manifest in the case of medical implant design and use. In this final part of the article, I describe some strategies used for addressing moral aggregation problems in other contexts and explain how these might inform a response to bias in the design and use of medical implants.
A useful distinction has been made between top‐down responses to these sorts of problems and bottom‐up responses (for example, Lawford‐Smith 2015, 316). Top‐down responses focus on the imposition of regulations by governments or by global institutions, whereas bottom‐up approaches focus on unilateral action by individuals. Hardin's preferred solution is a top‐down one. He argues that the tragedy of the commons should be averted by coercive regulation: “mutual coercion, mutually agreed upon by the majority of the people affected” (Hardin 1968, 1247). Top‐down approaches have the advantage that they apply to those who would be motivated by the underlying reasons for the law (such as protection of the environment), as well as those who would not be thus motivated.
However, structural factors such as the complex relationships among different players, and the geographical distances involved, can mean that top‐down solutions are slow to emerge, subject to corruption, and vulnerable to changes in government or other restructurings within and among group players. This is exacerbated in the global context by the challenges of international law, which operates through treaties between sovereign states. Legislation on these matters often represents a compromise position that is too weak to properly address the problem. For example, the actual pledges and policies to cut greenhouse emissions made by governments in response to climate change may be inadequate to meet the agreed upon targets, and thus inadequate to avoid reaching a dangerous tipping point (Harvey 2016).
There are similar challenges in the medical device context: the harms associated with gender bias in medical implant design and use arise within a complex, global, multilevel “system.” This system has come into existence and come to take its current form over time. It was not intentionally formed, nor is it controlled by a single governing body. As such, there is no locus of control over the system as a whole from which to intervene with a top‐down approach to address gender bias. National regulators such as the FDA and the Australian Therapeutic Goods Administration may be able to oversee top‐down action within their jurisdictions, but this would address only some of the challenges identified.
Recognition that top‐down solutions do not always work and that individuals have limited capacity to exercise their agency when it comes to such solutions has prompted many philosophers to focus on bottom‐up solutions to global moral aggregation problems such as climate change and labor exploitation. Specifically, they argue that irrespective of any top‐down measures, individuals have moral obligations to respond unilaterally to climate change and the new harms. Marion Hourdequin argues that individuals have a moral obligation to take unilateral measures to reduce their personal contributions to exploitation of a commons (Hourdequin 2010). Elizabeth Kahn argues that commons users have a responsibility to act unilaterally and form collectives to govern the use of the common (Kahn 2014). On this approach, individuals have obligations, but they are not obliged to take unilateral action to curtail their own contributions. In a similar vein, Holly Lawford‐Smith has argued that successful collectivization is required, and should be supported by public signaling (Lawford‐Smith 2015). Where there are no existing groups capable of acting, the first step toward forming such groups is individuals signaling their willingness to cooperate with others to bring about change. In her work on the new harms, Judith Lichtenberg, too, emphasizes the importance of collectivization. Not only are collective responses likely to be more successful because we can do more together, she argues, but recent work in psychology indicates that it is easier for individuals to act when others around them are acting (Lichtenberg 2014, 237).
These bottom‐up approaches face various problems, including the risk that bottom‐up participation will be distributed unfairly across individuals (for example, Few, Brown, and Tompkins 2007). In the context of medical device design and use, there is quite limited scope for collectivization and signaling. Patient activism is sometimes effective in addressing dangers associated with specific devices. In Australia, for example, bottom‐up collectivization led by patients was instrumental in a successful class action concerning harm from the ASR hip (Australian Broadcasting Commission 2011; Brown 2016). However, if the aim is the prevention of harm to patients through good device design, testing, and use, then relying on class actions by patients who have been harmed is a poor mechanism for achieving it. Bottom‐up action by concerned biomedical engineers, surgeons, and employees of regulators and device companies could make a powerful contribution to change, but it would require the biases and their cumulative impact to be visible to these players, a point I return to below.
The distinction between “top‐down” and “bottom‐up” responses does not foreground the roles of intermediate players such as corporations. It also oversimplifies in the face of the many different layers of regulation that can apply to contributors to moral aggregation problems. Labor practices, for example, are a product of local, state, national, and international laws as well as the practices and policies of private companies and international markets. All these organizational layers fall between the individual workers who produce goods in factories and the individual consumers who buy these goods. Efforts to address the exploitation of factory workers can occur at any level, and the best strategies will vary depending on the level at which they are targeted. Recognizing this, Elinor Ostrom, who was awarded the Nobel Prize in Economics in 2009 for her work on managing commons, recommends multilevel, multiscale approaches to complex global matters such as climate change. She argues that small‐to‐medium‐sized organizational units are well placed to undertake the sort of collective responses that might be effective (Ostrom 2010). In the case of medical device design and use, an Ostrom‐style approach would recognize the significant role of local regulators, standards bodies, and device companies in any solution.
Unfortunately, the poor visibility of the small acts that accumulate to cause aggregative harms, and the unpredictability of these harms, form a significant barrier to Ostrom‐style approaches. In order for these approaches to get off the ground, the way in which the actions of various players accumulate to cause harm must be made visible.
Detecting the small factors and their cumulative impact and documenting them has been a critical precursor to developing strategies to address other moral aggregation problems. Recognition of this justifies the resources dedicated to measuring carbon emissions and understanding their involvement in global warming. It also underpins the related task of identifying the elements of Earth's climate system, how they interact, and individual tipping points for key elements such as polar ice sheets and major forests (for example, Lenton et al. 2008). In Rowe's work on micro‐inequities, too, the detection of the small acts or events that aggregated to cause harm to women and minorities was critical (Rowe 1974, 1977, 2008).
Above, in part I, I outlined some of the sources of gender bias in hip‐implant design and use. In so doing, this article (and others like it) can potentially play a role in making bias visible. The findings can inform awareness and training programs for medical engineers, surgeons, and people in regulatory roles pertaining to implants, to help make the issue more widely understood.
Where the harms involve biases that might be implicit, such as micro‐inequities in the workplace, the process of making visible must also include making implicit biases visible to those who hold them. In the medical device context, many of the biases involved are likely to be implicit. For instance, clinicians and engineers may be unconscious of their stereotyped assumptions about the sex lives of elderly women with hip arthritis. For this reason, individuals might continue to assume that it is others (corrupt persons with explicitly biased beliefs) who are the main offenders. Ensuring that individual biomedical engineers, employees of regulators, members of standards committees, surgeons, and other clinicians learn about the problem (that it exists and how these biases accumulate to cause harm) is important. So too is encouraging individuals to take steps to identify any implicit biases they might have, for example by taking publicly available Implicit Association Tests (for example, Project Implicit 2018). The manner in which this is done, and the training or other support provided alongside it, matters to its effectiveness. As the use of implicit association tests and implicit bias training has increased, so too has backlash from members of some privileged groups (Kaplan 2006; Pendry, Driscoll, and Field 2007; Dean, Victor, and Guidry‐Grimes 2016). There is some evidence that implicit bias training that encourages perspective‐taking is less likely to induce backlash (Kaplan 2006).
The combination of raising awareness of the multiple sources of bias, and encouraging individuals involved in the process to identify and confront their own implicit biases, is a necessary first step. Once the biases have been made visible, top‐down, bottom‐up, and multilevel measures to address them become possible.
It is also worth noting that the actions of some parties are more visible than others. For example, guidelines and policies by regulators are highly visible to device manufacturers, and have significant influence (especially if they are binding). Regulators can also influence clinical use of implants, especially by requirements for labeling and via the indications for use. The standards produced by organizations such as the ISO are also highly visible to designers and manufacturers of devices. Actions by these visible players have more potential to affect the system as a whole than actions by other parties. Nor is their impact limited to the force of binding regulations: more visible parties can more effectively signal their concern about gender bias and willingness to take action to tackle the problem.
Finally, the issue of expediency must be addressed. I identified this as a major exacerbating factor in the case of medical implants, due to the pressures that all players are under to ensure efficiency. For at least some of these players (clinicians and regulators), the need to be efficient with resources and to ensure that helpful implants are available for use in patients without undue delays are appropriately understood as moral imperatives. I think there are two necessary steps in successfully challenging expedient decision‐making in the context of moral aggregation problems. The first is making visible the aggregative harms and the way that the expedient decisions contribute to them. This is an essential step in illuminating the competing moral considerations that decision‐makers must weigh against the need for efficiency. On its own, however, this may not be enough. Even with a clear picture of the moral problem, and an understanding of how their acts contribute, many players are likely to weigh the efficiency considerations more heavily than their contribution to the aggregative harm. Even in view of the aggregative harm, there is a high cost to addressing the small and uncertain contributions of various factors. For example, it would be costly to develop hip simulators and hip wear testing standards that better reflect the patterns of wear on artificial hips implanted in women, and it may be difficult to provide hard evidence of the benefit of doing so.
In view of this, the second step in addressing expediency issues is identifying where in the system a first change can be made that most efficiently influences the rest of the system. Doing so could be justified on the basis that the harm as a whole outweighs the costs of addressing contributing factors, and that the contributing factors can be most efficiently addressed by acts by specific players. It is part of the logic of aggregation problems that they are nonlinear, an aspect of the “lumpiness” identified by Brennan (Brennan 2006). For this reason, the cost of each harm should not be counted as its fraction of the cost of the aggregate, and responsibility for change should not be understood to fall equally and fractionally across all players. The analysis presented here offers a basis for questioning expedient decisions and challenging expectations of who should act to address these issues.
IV. The Need for Detection
In this article I have analyzed gender bias in the design, testing, and use of medical implants. I argued that this bias can be understood as a type of moral aggregation problem, and that possible solutions can be informed by philosophical work on aggregation problems in other areas, such as environmental ethics and workplace micro‐inequities. Drawing on work in these areas, I identified three features of moral aggregation problems: the importance of structural factors, the role of expediency, and the problem of visibility. I explained how these apply in the medical device case, and how understanding these features can help inform a feasible response.
One key activity supported by the analysis is “detective work”: collecting relevant data and undertaking analyses that can identify the existence and nature of aggregative harms. The first part of this article presents my efforts to detect the underlying factors associated with gender disparities in outcomes from hip implants. Philosophers frequently rely on empirical work undertaken by others to establish such empirical details. For example, those working in environmental ethics refer to the work of climate scientists. However, some of the aggregative harms that philosophers seek to describe and understand have not been explored by empirical researchers in other fields, or the interpretation of the empirical work has not focused on these issues. This is true of the gender biases associated with implantable devices that I focus on in this article. In such cases, philosophers must either do the necessary work themselves (as I did in part I) or seek interdisciplinary collaborations with those who can. Either way, the importance of this detective work in understanding and addressing specific moral aggregation problems should be recognized.
I argued that, once moral aggregation problems are made visible, a multilevel approach is needed, one in which individual and institutional players all act, notwithstanding the sometimes‐small biases occurring on their watch and the unpredictable relationship between these and specific harms. However, I noted that some players can make a bigger difference more efficiently. For example, action by regulators such as the FDA is more visible throughout the medical device system than actions by other players such as individual surgeons. For this reason, action by the FDA can more easily influence change. It can do so directly by obliging others to act, but also indirectly by signaling the importance of the problem and willingness to tackle it.
This article focused on the issue of gender bias in medical implant design and use, and my recommendations apply to this case. However, the analysis of key features involved in ethical issues that arise due to aggregation may have wider applicability.