The landscape of healthcare is being continuously altered by the development of new technology and devices aimed at benefiting healthcare providers, patients, and healthcare budgets. The basic premise of introducing a new medical device is that it performs better than available alternatives or fills a gap where there is none (Reference Edmondson, Bohmer and Pisano1). Objective measures form the mainstay of evaluation but can be difficult and take time to generate; therefore, subjective assessment is often the basis for the introduction of new medical technology (Reference Jennett2–Reference Antman, Lau, Kupelnick, Mosteller and Chalmers4). The assessment of medical devices is challenging (Reference Bernard, Vaneau and Fournel5;Reference Maisel6). Objective measures are relatively straightforward indicators where one device is compared with another using measurable indicators of performance. Subjective measures of performance include human elements such as perceived usability and preference that can be harder to define. While the degree to which these subjective measures influence decisions regarding technology/device implementation is complex and certainly context specific, there is little doubt that they are important in determining the uptake and utility of medical devices in clinical practice (Reference de Veer, Fleuren, Bekkema and Francke7). Furthermore, promoting subjective benefits is not a new concept; it is a core principle in the field of marketing (Reference Manu and Sriram8). The ideal assessment of a medical device should combine both objective and subjective measures in comparing it with alternatives (Reference Ginsburg9;Reference Shah and Robinson10).
Even when both are performed, it is common to find discordance between objective and subjective measures (Reference Hróbjartsson and Gøtzsche11;Reference Hróbjartsson and Gøtzsche12). The interaction between objective device performance and subjective user perception in the evaluation of new medical devices has not been well studied. The first aim of the present study was to test the accuracy of a new device for controlling gravity driven intravenous (IV) fluid infusion rate and to compare it with the current standard, the roller-clamp device. The second aim was to explore the relationship between objective performance measures and subjective user perceptions in the evaluation of the new device.
METHODS
This study comprised a laboratory study and experimental clinical setting study, comparing two different flow rate control devices.
Devices to Control IV Fluid Rate
Two devices controlling IV fluid rate were compared. The current standard, or control device, is the roller-clamp (Supplementary Figure 1a). This device is a small plastic dial housed in a plastic frame through which the IV tubing passes. Rolling the dial up or down in the frame alters the amount of compression on the IV tubing and thus alters the flow rate. The compression of the tubing and, therefore, the point of flow rate regulation occurs exclusively at the tangential axis of the dial.
The test device (Supplementary Figure 1b) is a plastic unit through which the IV tubing is threaded. The section of the device through which the tubing passes consists of a fixed concave outer wall and a convex inner wall that can be moved (by rotating the dial) to reduce the lumen of the tube as it passes through the device. In this device the tubing is compressed over the length that passes the moveable convex wall of the device (~3 cm). The mechanism of each device is shown diagrammatically in Supplementary Figure 2.
Laboratory Study Design
The two devices were tested and compared in a bench top laboratory experiment, conducted by one author (M.D.H.). Both devices were set at rates of 10, 40, 80, and 200 drops per minute with the aid of a custom-designed electronic motion-detecting counting device. The drop counter was calibrated and was demonstrated to have an accuracy of 100 percent for counting of drops. The data were logged by means of software designed specifically for this study. There was no time limit to achieve the target rate. Once the target rate was set, the drop rate was recorded for the duration of a 500 ml volume infusion. This gave a total of eight test runs (two runs per drop rate) for each device. The accuracy of fluid infusion rate was compared over time for both devices.
Clinical Setting Study Design
Participants for the clinical setting evaluation study were registered nurses with experience administering IV fluids (n = 32) and first year student nurses (n = 34) who had no experience with IV fluid delivery. Registered nurses were recruited from the surgical wards of Auckland City Hospital during working shifts over three consecutive months. Student nurses were recruited during first year nursing lectures over two consecutive months. Both student and registered nurses were recruited for this study to account for any familiarity effect while also having a setting representative of clinical practice. The study participants were asked to set the IV fluid rate for each of the two devices in the way that they usually did for a gravity driven infusion (i.e., counting the drip rate). An IV infusion was set up using a 1 L or 500 ml bag of 0.9 percent normal saline (Baxter Healthcare) connected to a standard tube giving set (Baxter Healthcare). The fluid was infused into an empty container and was not administered to patients. Our custom made counting device (Supplementary Figure 3) measured the drop rate (averaged over 10 drops) and was not visible to the participant.
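As an illustration of how a drop rate averaged over 10 drops might be derived from the counter's output, the following is a minimal sketch only. The study's logging software is not described, so the function name drops_per_minute, the use of drop timestamps in seconds, and the reading of "averaged over 10 drops" as the last 10 inter-drop intervals are all assumptions made for this example.

```python
def drops_per_minute(timestamps_s, window=10):
    """Mean drop rate (drops per minute) over the last `window` inter-drop intervals.

    `timestamps_s` is a list of drop detection times in seconds (hypothetical input;
    the study's actual data format is not described).
    """
    if len(timestamps_s) < window + 1:
        raise ValueError("need at least window + 1 drop timestamps")
    recent = timestamps_s[-(window + 1):]
    elapsed_s = recent[-1] - recent[0]
    if elapsed_s <= 0:
        raise ValueError("timestamps must be strictly increasing")
    # `window` intervals elapsed in `elapsed_s` seconds; scale to one minute.
    return 60.0 * window / elapsed_s


# Hypothetical example: one drop detected every second.
example_timestamps = [float(i) for i in range(11)]
example_rate = drops_per_minute(example_timestamps)
print(f"{example_rate:.1f} drops per minute")
```

Averaging over several intervals smooths the drop-to-drop variation that a single interval would show, at the cost of responding more slowly while the participant is still adjusting the device.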
Objective Measurement
During the participant phase, the devices were introduced to the participants only by a brief standardized verbal explanation of how to operate them and there was no reference to differences between the devices. There was no information provided to the participants on the exact purpose of the study, only that they were required to run through the following tasks once with each device. All participants completed two stations in a randomized order.
Station 1: Participants were given 30 seconds to achieve a target rate of 60 drops per minute with each of the devices.
Station 2: Participants had no time limit to achieve a target rate of 60 drops per minute with each device and were told to stop the clock once they considered that they had achieved the rate.
Subjective Measurement
Participants were asked to complete a survey adapted from the twenty-eight item Universal Design Performance Measure for Products (UDPMP; The Centre for Universal Design, 2003) (13). The survey has seven subscales that evaluate seven universal design principles (equitable use, flexibility in use, simple and intuitive use, perceptible information, tolerance for error, low physical effort, and size and space for approach and use) (13). The adaptations to the survey were to exclude four items that were not applicable to the two devices and to include five new questions, which asked which device the participant preferred, which device was easier to use, which device was perceived as more accurate, and which device required less adjustment. An open question asked participants to comment on the strengths and weaknesses of the control and test devices.
Statistical Analysis
All statistical analyses were conducted using SPSS statistical package version 19 (SPSS Inc, Birmingham, AL). Independent-samples Student's t-tests were used for all bench top comparisons and for objective measure comparisons between registered nurses and student nurses. Paired-samples t-tests were used for objective measure comparisons between devices, and one-sample t-tests were used for assessing the accuracy of each device under each condition compared with the target rate. Chi-square tests for goodness of fit were used to evaluate user preference, ease of use, perceived accuracy, and perceived requirement for adjustment. The null hypothesis was that equal numbers of participants would choose the control and test device for each of the four characteristics. Assumptions for statistical tests were met and results are reported as significant if p < .05, unless otherwise stated.
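The goodness-of-fit test against an equal split can be sketched as follows. This is an illustration only, not the SPSS procedure used in the study: the function name is hypothetical, and the counts in the example (34 versus 32 out of 66 respondents) are assumed for demonstration because the per-question counts are not given in the text. For one degree of freedom the upper-tail probability of the chi-square distribution equals erfc(sqrt(χ2/2)), so only the Python standard library is needed.

```python
import math


def chi_square_equal_split(n_control, n_test):
    """Chi-square goodness-of-fit test of two choice counts against a 50/50 split.

    Returns (chi2, p_value) with df = 1. The counts passed in are whatever the
    caller supplies; the example below uses hypothetical values.
    """
    total = n_control + n_test
    if total == 0:
        raise ValueError("at least one response is required")
    expected = total / 2
    chi2 = ((n_control - expected) ** 2 + (n_test - expected) ** 2) / expected
    # For df = 1, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 34 participants choose one device, 32 the other.
chi2, p = chi_square_equal_split(34, 32)
print(f"chi-square = {chi2:.3f}, df = 1, p = {p:.3f}")
```

With 66 respondents the expected count under the null hypothesis is 33 per device, so even a difference of 16 choices between devices (41 versus 25) sits only just inside the p < .05 threshold.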
RESULTS
Objective Measures
Experimental Evaluation
These tests revealed no significant difference between the two devices in terms of the error in the mean drop rate over a 500 ml volume infusion. The control device had a mean rate 10.36 percent below the target rate and the test device had a mean rate 8.72 percent below the target rate (p = .636). The change in rate over time due to reducing hydrostatic pressure was also not significantly different between the two devices, with the control device having a mean change in rate of -2.49 percent per 1,000 drops (approximately 50 ml) and the test device having a mean change in rate of -1.67 percent per 1,000 drops (p = .303).
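The two bench top summary measures (percent deviation of the mean rate from target, and percent change in rate per 1,000 drops) can be illustrated with a short sketch. How the authors derived these figures is not stated, so everything here is an assumption: the function names, the use of an ordinary least-squares slope for the trend, the expression of that slope as a percentage of the target rate, and the example readings, which are hypothetical and not study data.

```python
def percent_error(mean_rate_dpm, target_dpm):
    """Signed deviation of the mean rate from the target, as a percentage of target."""
    return 100.0 * (mean_rate_dpm - target_dpm) / target_dpm


def percent_change_per_1000_drops(cumulative_drops, rates_dpm, target_dpm):
    """Least-squares slope of rate against cumulative drops, scaled to
    percent of target per 1,000 drops (an assumed definition)."""
    n = len(cumulative_drops)
    if n < 2 or n != len(rates_dpm):
        raise ValueError("need two or more paired observations")
    mean_x = sum(cumulative_drops) / n
    mean_y = sum(rates_dpm) / n
    sxx = sum((x - mean_x) ** 2 for x in cumulative_drops)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(cumulative_drops, rates_dpm))
    slope_dpm_per_drop = sxy / sxx
    return 100.0 * slope_dpm_per_drop * 1000.0 / target_dpm


# Hypothetical readings for a 40 drops-per-minute target.
drops = [0, 1000, 2000]
rates = [40.0, 39.0, 38.0]
mean_rate = sum(rates) / len(rates)
print(f"error: {percent_error(mean_rate, 40.0):.2f} percent of target")
print(f"trend: {percent_change_per_1000_drops(drops, rates, 40.0):.2f} percent per 1,000 drops")
```

A negative trend corresponds to the rate falling as the bag empties, which is the direction expected from reducing hydrostatic pressure.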
Clinical Evaluation
When given 30 seconds to set a target rate of 60 drops per minute, registered nurses, student nurses, and the pooled group were all significantly off target with both devices (Table 1). Accuracy was much improved when no time limit was imposed to set the target rate (Table 1). There were no significant differences in accuracy between registered nurses and student nurses with or without a time limit for each device (Table 2). Between the two devices, there was no significant difference in accuracy over 30 seconds or without a time limit, and there was no difference in the time taken to set a desired rate without a time limit (Table 3).
Table 1. Ability of Registered and Student Nurses to Achieve the Target Rate (60 Drops per Minute (dpm)) with and without a Time Limit for Each Device
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190725120620798-0873:S0266462315000586:S0266462315000586_tab1.gif?pub-status=live)
Table 2. Differences in Error Rates (Drops per Minute Off Target Rate of 60 dpm) between Registered and Student Nurses with Each Device, and Time Taken to Set Each Device
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190725120620798-0873:S0266462315000586:S0266462315000586_tab2.gif?pub-status=live)
Table 3. Head to Head Comparison of Devices among All Participants (Drops per Minute Off Target Rate of 60 dpm)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190725120620798-0873:S0266462315000586:S0266462315000586_tab3.gif?pub-status=live)
Subjective Measures
The results were pooled for both registered nurses and student nurses. The results showed no significant difference in device preference (χ2 = 0.061 df = 1; p = .806), and although the control device was considered easier to use (χ2 = 3.879 df = 1; p = .049), the test device was considered more accurate (χ2 = 11.879 df = 1; p = .001) and considered to require less adjustment (χ2 = 10.242 df = 1; p = .001) (Figure 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190725120620798-0873:S0266462315000586:S0266462315000586_fig1g.gif?pub-status=live)
Figure 1. User opinion.
The results were also compared between registered nurses and student nurses. The results showed that registered nurses were more likely to prefer the test device compared with the student nurses (χ2 = 4.951 df = 1; p = .026). There was no significant difference in perceived ease of use, accuracy, or requirement for adjustment between the registered nurse and student nurse groups.
When participants were asked to comment on the strengths and weaknesses of each device (Supplementary Table 1), both registered and student nurses most commonly expressed the opinion that the test device was more accurate but that it was hard to turn the dial while the control device was less accurate but easier to use.
Detailed results of responses to the survey of Universal Design Performance Measure for Products are displayed in Supplementary Table 2. These results mirror the summary opinions that the control device is easier to use (items 5, 18, 19, 20, and 24) but the test device offers superior accuracy (item 6), is more intuitive (items 10, 12, and 22), and safer (items 16 and 17).
DISCUSSION
This study has tested the accuracy of a new device to control IV fluid rate and compared it with the standard roller-clamp device. The roller-clamp device is a ubiquitous technology to set and control IV flow rate, but there are well-described concerns about this approach (Reference Bissett, Brandt and Windsor14–Reference Rooker and Gorard17), which creates an opportunity for an alternative approach. Experimental testing found no difference in the accuracy of the two devices. Clinical testing of the two devices by experienced and student nurses revealed a discordance between the objective measures of accuracy and the subjective user perceptions of accuracy. This study also highlighted a range of subjective performance measures (such as ease of use and perceived safety) that have an influence on device preference. Given that the primary role of IV fluid rate control devices is to ensure an accurate rate of delivery, it is important to examine the reason behind the objective/subjective discordance in assessment of accuracy. It is also important to consider the role of other subjective performance measures that do not have a clear objective measure counterpart in user acceptance of medical devices.
The study is a clear demonstration that user opinion regarding a new medical device is influenced by more than just the objective primary performance measures. The following discussion explores the explanations (broadly categorized into design characteristics and demand characteristics) and implications of these results.
Design Characteristics
The separation of objective and subjective measures is by no means a new concept, but the interaction between these has not been discussed in the context of evaluating the performance of a medical device. This concept is, however, well covered in the placebo research literature (Reference Faasse, Cundy, Gamble and Petrie18–Reference Moerman20). Meta-analyses highlight that there is little evidence that the placebo response is based on objectively measured outcomes; rather, it is subjective characteristics that are primarily responsible (Reference Hróbjartsson and Gøtzsche11;Reference Hróbjartsson and Gøtzsche12). Similarly, in this study the subjective opinion of the participants was that the test device was more accurate, and yet objective measures showed that the two devices were equally accurate. There were other similarities with a placebo response. One review (Reference Moerman and Jonas21) points out that the placebo response is affected by size, color (Reference de Craen, Roos, de Vries and Kleijnen22), branding and labeling (Reference Branthwaite and Cooper23), and sophistication and expectation (Reference Desharnais, Jobin, Côté, Lévesque and Godin24). These factors describe key differences in design between the two devices used in this study. However, the concept of a placebo response (the effect in response to an inert substance or intervention) is not strictly applicable to the evaluation of medical devices and the meaning response may better explain our results. To illustrate the difference between a meaning response and a placebo response, Moerman and Jonas (Reference Moerman and Jonas21) use the example of branded aspirin being more effective than nonbranded aspirin (Reference Branthwaite and Cooper23).
They propose that the improved efficacy of the branded aspirin over nonbranded aspirin is a meaning response rather than a placebo response, as it is not a difference in the product (active versus inert versus nothing) causing this difference but rather a difference in the meaning of the product to the participants. Similarly, in this study, differences in device design are likely contributors to a difference in the meaning response. The test device in this study is a much larger, colored, and refined device, with specific features that declare its purpose of controlling flow rate. Also in contrast to the control device, the test device has a product title branded on it and numbers labeled to indicate increasing and decreasing flow rates, and its dial mechanism is indexed, providing the user with feedback by means of a clicking noise. The combination of these design characteristics provides a basis for the observed results in this study, which we contend can be explained, at least in part, by a meaning response.
Demand Characteristics
A further explanation for the observed objective/subjective discordance may result from the concept of demand characteristics. Briefly, the term demand characteristics refers to an experimental artifact where participants form an interpretation of the experiment's purpose and unconsciously change their behavior to fit that interpretation (Reference Orne25). Inaccuracies in the delivery of gravity driven IV infusions are well documented both historically (Reference Bivins, Rapp, Powers, Butler and Haack26–Reference Rithalia and Rozkovec29) and more recently (Reference Bissett, Brandt and Windsor14–Reference Rooker and Gorard17). Research indicates that the setting up of IV fluid infusions is a stressful experience for nurses because of the risk of error and harm to patients (Reference Husch, Sullivan and Rooney30). This means that a device that offers greater accuracy has the potential to reduce stress and would be in demand. Also, the nurses were aware that a new device, with added features, was being compared with the current device, and the participants could be led to believe that the new device was likely to be superior. This may contribute to participants' perception of improved accuracy and a desire to use the new device without it necessarily achieving a more accurate performance.
Furthermore, it is reasonable to assume that registered nurses' responses may be more susceptible to the influence of demand characteristics than those of student nurses. This is due to their greater clinical experience and knowledge of the inaccuracies and potential hazards associated with IV fluid infusions. In fact, our results do demonstrate that registered nurses were more likely to prefer the test device than student nurses (χ2 = 4.951 df = 1; p = .026). With these results we may hypothesize that clinical experience and published literature have taught the registered nurses to place more value on perceived improved accuracy over the control device. Another possible role of demand characteristics in the objective/subjective discordance regarding accuracy in this study may be hinted at by subjective measures of other performance characteristics. Characteristics such as the opinion that the test device was easier to figure out/explain how to use, was less prone to user error, or was less likely to cause harm could conceivably have an impact on the perceived accuracy of the device. It was not possible in this study to include objective measures to correlate with these subjective differences, but these findings do support the likely role of demand characteristics, both in this study and in the evaluation of medical devices in general.
The implications of the discordance between objective and subjective assessment of medical devices are worth consideration. Human factor engineering seeks to optimize the interface between human and device (or system/technology) to enhance the benefits and minimize potential risks. Testing prototype devices in practical, real-world environments and making changes based on lessons learnt are a vital part of the process of device design and manufacture. The added dimension with medical devices is that, in addition to the benefit to device users, there need to be benefits, or at least no harm, to patients. The issue highlighted in this study is that it is possible to design devices in line with the principles of human factor engineering and to find that users believe there is a benefit in the absence of any.
The present case study illustrates how design features can produce a meaning response that leads to differences in perceived benefits and a desire for a new device. Moreover, testing a new device on participants who are aware of the limitations of existing devices, in the presence of other subjective characteristics, can allow demand characteristics to produce perceived benefits when there are none. The obvious question that arises is whether a new device should be adopted in the absence of objective evidence of superiority. There are numerous examples of user demand driven technology adoption before objective evidence of benefit and safety, including laparoscopic (Reference Himal31) and robotic (Reference Patel, Linares and Joseph32) surgery. The converse is also true, where despite objective evidence the lack of user acceptance presents a significant barrier to implementation (Reference de Veer, Fleuren, Bekkema and Francke7;Reference Yarbrough and Smith33). This study has clearly demonstrated that user perception does not reliably correlate with the objective performance in the case of flow rate accuracy. Despite general acceptance that objective evidence of benefit should be established a priori, there is a tendency for the adoption of new devices based on user demand and subjective perception, and this can lead to increased risk, adverse outcomes, and wasted resources (Reference Himal31).
There are some noteworthy limitations of this study. First, the authors acknowledge that more than one objective measure has to be considered when evaluating medical devices. Not all subjective measures in this study have (or could have) an objective measure corollary, in part due to the small scale of the study. Thus the measures that have been subjectively assessed (e.g., ease of use, reliability, and user preference) do not have an objective counterpart. This is because obtaining objective data on these characteristics would require resources far greater than the capacity of this research. It is important to emphasize that the aim of this study was to evaluate the accuracy of the devices and investigate the relationship between objective and subjective assessment of a single characteristic (in this case accuracy). Of course this has implications for the generalizability of the findings. However, the authors’ intention was to provide an exemplar case study to discuss the broader issue of medical device evaluation.
This case study of a new and current device for setting and controlling the flow rate of IV fluids has provided an excellent opportunity to examine the interaction between objective performance measures and subjective user perceptions. In addition to providing evidence that the new device is not more accurate, we have shown that even with a relatively simple device and a simple task, device and demand characteristics come into play, and that a strong discordance can develop between the objective and subjective measures. This goes some way to explaining why devices are adopted in the absence of objective evidence of benefit, a phenomenon contributed to by what has been termed “persuasive design” (Reference Fogg34;Reference Redström, IJsselsteijn, de Kort, Midden, Eggen and van den Hoven35). This can undermine the primacy of objective evidence by convincing users of improved performance through accentuated features thought to enhance user perception, and users' perception can have a large say in the implementation of new devices. This is not to say that subjective measures are not important in the evaluation process, but they must be considered in the context of the device, the environment, and how they relate to important objective measures. Greater awareness of the perils of persuasive design, and the deliberate assessment of it as part of the evaluation of new medical devices, through such tools as universal design performance measures, is encouraged. This should help to reduce the risk of premature medical device introduction, with the potential for risk reduction and cost savings.
SUPPLEMENTARY MATERIAL
Supplementary Figures 1–3 http://dx.doi.org/10.1017/S0266462315000586
Supplementary Tables 1–2 http://dx.doi.org/10.1017/S0266462315000586
CONFLICTS OF INTEREST
The authors report no potential or real conflict of interest.