Significant outcomes
- We found that the digital DARS is a robust questionnaire, with psychometric properties comparable to those of previous versions
- Different scores were found for the ‘Sensory Experience’ subscale in the digital and paper-and-pencil versions of the DARS
- Our results highlight the need to validate digital versions of psychometric instruments rather than directly assuming equivalence
Limitations
- Our sample size was relatively small (69 participants)
- Our sample included a heterogeneous distribution of diagnoses
- We did not consider the time gap between completing the two formats of the DARS questionnaire
Introduction
In recent years, we have been witnessing a digital revolution (Hodson, 2018) and evolving from an analogue world to a digital one, in which digital medicine and digital psychiatry are emerging fields (Keesara et al., 2020). This revolution is replacing pencil and paper with computers, tablets, and mobile devices, which is highly pertinent to the field of psychometrics, specifically to the validity of self-report questionnaires (Alfonsson et al., 2014; van Ballegooijen et al., 2016). Digital or online questionnaires have various advantages over traditional formats, including greater comfort with digital instruments among younger generations (Prensky, 2001), greater compliance and fewer missing data (Gwaltney et al., 2008), and greater sincerity when answering sensitive questions, for example, regarding substance use, traumatic events, or suicide (Barak, 2007; Lin et al., 2007; Christensen & Hickie, 2010; Torous et al., 2015). Nevertheless, certain concerns regarding digital instruments must be acknowledged: some people may feel uncomfortable with digital devices or might perceive a gap in security or privacy in online measures (Alfonsson et al., 2014); furthermore, different digital interfaces may affect scores or psychometric features of questionnaires (Tourangeau et al., 2004; Thorndike et al., 2009).
When psychometric instruments are used in populations whose characteristics differ from those of the populations for which they were created, there has to be a process of validation and adaptation (Wild et al., 2009). Similarly, with the migration from paper-and-pencil to digital formats, some authors recommend demonstrating the equivalence between the two formats and evaluating the inter-format reliability, rather than assuming a direct equivalence (Coons et al., 2009).
Regarding depression, many self-report screening and symptom severity questionnaires are available in online and digital formats. Accordingly, several studies have demonstrated the digital reliability of the Patient Health Questionnaire-9, General Health Questionnaire, Beck Depression Inventory, Center for Epidemiologic Studies Depression Scale, Clinically Useful Depression Outcome Scale, and Montgomery−Asberg Depression Rating Scale-Self-report (Alfonsson et al., 2014; van Ballegooijen et al., 2016).
Anhedonia, a core symptom of depression that is also present in other mental disorders, such as personality disorders, eating disorders, anxiety disorders, or psychotic disorders (Ritsner, 2014), is a complex phenomenon first described as a decreased ability to experience pleasure in a general sense (Ribot, 1897; Chapman et al., 1976). The perspective of anhedonia as a transdiagnostic phenomenon has recently gained interest and might help in the development of psychiatric taxonomy (Trøstheim et al., 2020).
The modern concept of anhedonia encompasses motivational anhedonia (the desire to be involved in a particular activity) and consummatory anhedonia (the actual ability to enjoy this activity; Treadway & Zald, 2011), and even cognitive characteristics such as the ability to anticipate and predict reward (Der-Avakian & Markou, 2012). However, most questionnaires, including the frequently proposed gold standard, the Snaith–Hamilton Pleasure Scale (SHAPS; Snaith et al., 1995), only focus on specific aspects of anhedonia. To deal with this limitation, the Dimensional Anhedonia Rating Scale (DARS; Rizvi et al., 2015) was designed. The DARS evaluates different features of anhedonia (interest, motivation, effort, and consummatory pleasure) across different domains (hobbies, food and drink, social activity, and sensory experiences).
To date, the DARS is available in its original English version (Rizvi et al., 2015), as well as in a Spanish (Arrua-Duarte et al., 2019) and a German (Wellan et al., 2021) validation. The original DARS was developed in three studies: first, item selection was carried out in a community sample; validation was then carried out in 150 community participants with an online questionnaire, and in 52 patients with depression and 50 controls with a paper-and-pencil questionnaire (Rizvi et al., 2015). With the community sample, the authors used an online questionnaire, assuming that scales completed in web-based and in laboratory-based environments may be comparable (Risko et al., 2006). This questionnaire is the only digital version of the DARS. The studies showed comparable psychometric properties for the paper-and-pencil and digital scales.
In the original version, the DARS demonstrated good reliability and validity against the SHAPS, the gold standard for measuring anhedonia (Cronbach’s alpha = 0.96 and 0.75–0.99 for subscales; Rizvi et al., 2015). The Spanish version has also demonstrated high internal consistency, as shown by its high Cronbach’s alpha (overall 0.92 and 0.91–0.92 for subscales; Arrua-Duarte et al., 2019). No digital or online validations of the Spanish DARS have been conducted.
The Spanish validation was performed with paper-and-pencil questionnaires, but a digital questionnaire was subsequently developed. Therefore, here we aim to analyse whether the Spanish digital version of the DARS has the same psychometric properties as the traditional paper-and-pencil one, hypothesising that no differences between the two versions will be found.
Material and methods
Participants
This study is a secondary analysis of previous work designed to conduct the Spanish validation of the DARS (Arrua-Duarte et al., 2019). A total of 134 patients older than 18 years of age were recruited from July 2016 to February 2017 in the Psychiatry Department at Fundación Jiménez Díaz University Hospital, Madrid, Spain. Participants were recruited in three facilities (an outpatient mental health centre, a psychiatric hospitalisation unit, and a consultation-liaison psychiatry unit) by two psychiatrists (EAD and MMB). At each site, psychiatrists selected at least one patient from those attending daily appointments, with any of these psychiatric disorders according to DSM-5 criteria: depressive disorders, psychotic disorders, adjustment disorders, anxiety disorders, personality disorders, bipolar disorders, and eating disorders (American Psychiatric Association, 2013). Patients from the inpatient setting were assessed one day before their respective discharge to avoid acute symptoms. All participated without compensation. Exclusion criteria were diagnoses other than those mentioned above, acute substance use, decompensated medical or neurological conditions, and illiteracy or not being fluent in Spanish.
The study was carried out in accordance with the Declaration of Helsinki and approved by the Fundación Jiménez Díaz Hospital Ethics Committee. All participants provided written informed consent after a complete description of the study.
Assessment
The attending psychiatrist recruited patients to participate in the study during their consultation at the facilities mentioned above. In the first step, patients completed the full paper-and-pencil questionnaire for the original work (Arrua-Duarte et al., 2019), which included the DARS. In the second step of the study, participants were given an access code for the tool MEmind (available via Apple Store and Google Play; Barrigón et al., 2017), which they were to access on their own to complete the digital version of the scale on any electronic device within the next week.
The MEmind Wellness Tracker tool is a web application, developed by the Psychiatry Department of the Fundación Jiménez Díaz University Hospital, with two interfaces, one for health care professionals resembling an Electronic Health Record and another for patients, accessible via web or an app on all types of electronic devices with any operating system. This research protocol was uploaded to the patient interface.
The DARS is a novel and dynamic self-administered instrument used to evaluate anhedonia. It consists of 17 items divided into four categories (Pastimes/Hobbies, Foods/Drinks, Social Activities, and Sensory Experiences; Rizvi et al., 2015). In each of the four categories, patients have to give two or three examples of pleasurable experiences and then assess their current desire, motivation, effort, and consummatory pleasure for their examples. Answers are given on a 5-point Likert scale (Not at all = 0; Slightly = 1; Moderately = 2; Mostly = 3; Very Much = 4). The sum of all the items provides the total score, with higher scores reflecting greater motivation, effort and pleasure and, consequently, less anhedonia (Fig. 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221026100108999-0228:S0924270821000454:S0924270821000454_fig1.png?pub-status=live)
Fig. 1. DARS as shown in MEmind.
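The scoring rule described above (17 items, each rated 0 to 4, summed into a total) can be sketched as follows. This is only an illustration in Python, not code from the study or from MEmind; the function name `dars_total` and the input validation are assumptions introduced here.

```python
from typing import Sequence

N_ITEMS = 17                   # number of DARS items, as stated in the text
MIN_SCORE, MAX_SCORE = 0, 4    # Likert anchors: Not at all = 0 ... Very Much = 4


def dars_total(responses: Sequence[int]) -> int:
    """Return the DARS total score (hypothetical helper, for illustration).

    Higher totals reflect greater motivation, effort and pleasure,
    that is, less anhedonia.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item responses, got {len(responses)}")
    for r in responses:
        if not MIN_SCORE <= r <= MAX_SCORE:
            raise ValueError(f"item response {r} is outside {MIN_SCORE}..{MAX_SCORE}")
    return sum(responses)


# A respondent answering "Moderately" (2) to every item scores 34 out of 68.
print(dars_total([2] * N_ITEMS))
```

Under this rule the total ranges from 0 (all items "Not at all") to 68 (all items "Very Much").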
Statistical analysis
The statistical analysis was performed with R software version 3.4.1 (‘single candle’). Because the sample size was small and the data could therefore not be assumed to be normally distributed, we first contrasted the paper-and-pencil and digital formats using a Wilcoxon test. To provide more robust results, we conducted tests for each item. In this way, we assessed a total of 17 hypotheses. Later, we assessed the validity of the scales by obtaining the consistency index (Kappa coefficient; Fleiss, 1971), with values below 0.40 representing poor agreement, values between 0.40 and 0.75 fair to good agreement, and values greater than 0.75 excellent agreement (Fleiss et al., 2003). Two measures of reliability were calculated to obtain more consistent results: Cronbach’s alpha (Cronbach, 1951) and Guttman’s coefficient (λ3; Guttman, 1945), with higher values representing better internal consistency for both coefficients. The intraclass correlation coefficient (ICC; Shrout & Fleiss, 1979) was obtained to decide whether or not both scales were stable and reliable; values less than 0.5 represent poor reliability, between 0.5 and 0.75 moderate, between 0.75 and 0.9 good, and greater than 0.90 excellent reliability (Koo & Li, 2016). Finally, to compare the paper-and-pencil and digital models, and therefore their dimensional structure, we calculated the comparative fit index (CFI) for the scale (Bentler, 1990) and the root mean squared error (RMSE; Lehmann & Casella, 1998).
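The interpretive bands quoted above for the Kappa coefficient and the ICC can be written down as simple lookup functions. The sketch below is illustrative Python, not the R code used in the analysis; the function names are hypothetical, and the treatment of values falling exactly on a cut-off (0.40, 0.50, 0.75, 0.90) is an assumption, since the text does not specify it.

```python
def kappa_band(kappa: float) -> str:
    """Agreement band for a Kappa coefficient, following the cut-offs in the text."""
    if kappa < 0.40:
        return "poor"
    if kappa <= 0.75:          # assumption: 0.40 and 0.75 count as "fair to good"
        return "fair to good"
    return "excellent"


def icc_band(icc: float) -> str:
    """Reliability band for an ICC, following the cut-offs in the text."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:             # assumption: 0.75 itself is counted as "good"
        return "moderate"
    if icc <= 0.90:            # assumption: 0.90 itself is counted as "good"
        return "good"
    return "excellent"


# Example calls with arbitrary illustrative values.
print(kappa_band(0.45))
print(icc_band(0.80))
```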
Results
Sample description
Of the 134 initial patients who completed the paper-and-pencil version of the DARS, 69 (51.5%) subsequently completed the digital format. There were no differences regarding age or sex between participants completing both steps and those who only completed the paper-and-pencil format; however, differences were observed regarding diagnoses (Table 1).
Table 1. Sample description
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221026100108999-0228:S0924270821000454:S0924270821000454_tab1.png?pub-status=live)
DARS, Dimensional Anhedonia Rating Scale
†Total sample; ‡Participants who only filled paper-and-pencil DARS; §Participants who filled paper-and-pencil and digital DARS.
* Withdrawing participants and second step participants are compared.
Migration from paper-and-pencil DARS to digital DARS
The total score of the paper-and-pencil DARS was higher than the total score of the digital DARS (49.35 ± 13.96 vs. 43.03 ± 15.16; Z = −3.536; p < 0.001). There were differences for items 13, 15, and 16 (all of them part of the ‘Sensory experiences’ subscale), whereas scores for the remaining items were similar (Table 2).
Table 2. Wilcoxon test for paper-and-pencil and digital formats of DARS
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221026100108999-0228:S0924270821000454:S0924270821000454_tab2.png?pub-status=live)
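For readers unfamiliar with the paired comparison summarised in Table 2, the following is a minimal sketch of a Wilcoxon signed-rank test with a normal approximation. It is written in Python for illustration and is not the R code used in the study; the function name, the handling of zero differences (dropped), the tie correction, and the sign convention of Z (based on the smaller rank sum, without continuity correction) are all assumptions.

```python
from math import sqrt
from typing import Sequence, Tuple


def wilcoxon_signed_rank(paper: Sequence[float],
                         digital: Sequence[float]) -> Tuple[float, float, float]:
    """Return (W_plus, W_minus, z) for paired scores (illustrative sketch)."""
    if len(paper) != len(digital):
        raise ValueError("paired samples must have the same length")
    # Zero differences carry no sign information and are dropped.
    diffs = [p - d for p, d in zip(paper, digital) if p != d]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1      # mean of the 1-based ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t              # used in the tie-corrected variance
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / sqrt(variance)
    return w_plus, w_minus, z
```

With this convention Z is never positive, so the direction of the difference has to be read from the two rank sums (or from the group means), as in the comparison of total scores above.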
Internal consistency
Cronbach’s alpha was the same for the digital and paper-and-pencil DARS (α = 0.94). For Guttman’s coefficient (λ3), the value was 0.97 for the digital version and 0.96 for the paper-and-pencil format.
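As a reminder of what the internal consistency coefficient measures, Cronbach’s alpha for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements this standard formula in Python for illustration; it is not the R code used in the study, the function name is hypothetical, and the tiny data sets in the usage lines are invented.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix (sketch).

    `rows` holds one sequence of item scores per respondent.
    """
    if len(rows) < 2:
        raise ValueError("at least two respondents are needed")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are needed")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer the same number of items")

    item_variances = [variance(col) for col in zip(*rows)]   # sample variances
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented data: perfectly consistent items give alpha = 1,
# while two unrelated items give alpha = 0.
consistent = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [4, 4, 4]]
unrelated = [[0, 0], [0, 1], [1, 0], [1, 1]]
print(cronbach_alpha(consistent), cronbach_alpha(unrelated))
```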
Test and re-test reliability
The ICC for the digital version was 0.95 (F = 18.5, p < 0.001), and for the paper-and-pencil version 0.94 (F = 16.7, p < 0.001).
Agreement between paper and digital versions items
As shown in Table 3, the weighted Kappa between the paper-and-pencil and digital DARS ranged from 0.30 to 0.52.
Table 3. Agreement in individual items for paper-and-pencil and digital DARS
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221026100108999-0228:S0924270821000454:S0924270821000454_tab3.png?pub-status=live)
p-value was under 0.001 for all items.
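The item-level agreement reported in Table 3 can be illustrated with a weighted Kappa for two sets of ratings on the 0 to 4 response scale. The sketch below is Python written for illustration, not the study's R code; the weighting scheme is not stated in the text, so linear disagreement weights are assumed here, and the function name is hypothetical.

```python
from typing import Sequence


def weighted_kappa(first: Sequence[int], second: Sequence[int],
                   n_categories: int = 5) -> float:
    """Linearly weighted Kappa for paired ratings coded 0..n_categories-1 (sketch)."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    k = n_categories
    if any(not 0 <= r < k for r in list(first) + list(second)):
        raise ValueError(f"ratings must lie in 0..{k - 1}")

    n = len(first)
    # Observed joint proportions of (first rating, second rating).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weights |i - j|; the constant 1/(k - 1) cancels out.
    observed_disagreement = sum(abs(i - j) * observed[i][j]
                                for i in range(k) for j in range(k))
    expected_disagreement = sum(abs(i - j) * row[i] * col[j]
                                for i in range(k) for j in range(k))
    if expected_disagreement == 0:
        raise ValueError("Kappa is undefined when all ratings fall in one category")
    return 1 - observed_disagreement / expected_disagreement


# Identical ratings give 1; fully reversed ratings on this scale give -0.5.
print(weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4]))
print(weighted_kappa([0, 1, 2, 3, 4], [4, 3, 2, 1, 0]))
```

Quadratic weights, (i − j)², are a common alternative and would generally yield different values from the linear scheme assumed here.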
Equivalence between digital and paper versions of DARS
The last part of our statistical analysis comprised the comparison of both scales by calculating the CFI and the RMSE. First, the CFI value was 0.973 for the digital DARS and 0.974 for the paper-and-pencil DARS. Second, the RMSE was 0.11 for the digital DARS and 0.10 for the paper-and-pencil DARS.
Discussion
The paper-and-pencil DARS total score was significantly higher than the digital DARS total score. No significant differences between scores were found for the subscales ‘Pastimes and hobbies’, ‘Food and drinks’, and ‘Social Activities’, whereas for ‘Sensory Experiences’ three out of five items had different scores in the two formats. The level of agreement was fair to good for most items, with the exception of most items of the ‘Sensory Experiences’ subscale and one item in the subscale ‘Food and drinks’. Finally, internal consistency was excellent for both formats.
This excellent internal consistency suggests that our digital version of the DARS is a reliable questionnaire, with psychometric properties comparable to those of previous versions. Cronbach’s α (Cα) of our digital DARS (Cα 0.94) was similar to both original validations (Cα 0.92), and also similar to the Spanish paper-and-pencil format (Cα 0.92). Additionally, internal consistency was similar to other anhedonia questionnaires available in Spanish, such as the Revised Physical Anhedonia Scale (Cα 0.92), Revised Social Anhedonia Scale (Cα 0.95; Fonseca-Pedrero et al., 2009), or Anticipatory and Consummatory Interpersonal Pleasure Scale (Cα 0.92; Gooding et al., 2016) and even higher than the Spanish SHAPS (Cα 0.77; Fresán & Berlanga, 2013). Nevertheless, we need to be cautious about assuming that the digital migration of the DARS is equivalent to the paper-and-pencil one, at least in our sample. Total scores of the DARS were higher in the paper-and-pencil than in the digital version, the level of agreement was around 0.40 for most items (that is, just fair to good), and different scores were found for the ‘Sensory Experience’ subscale in the digital and paper-and-pencil DARS. The lower agreement in this specific subscale could have different explanations. Since these are the last questions, the lower level of agreement may represent participants' fatigue; moreover, sensory experiences are more abstract concepts than those evaluated in the rest of the subscales. While participants could ask the researcher for clarification if needed when completing the paper-and-pencil version, in the digital version they answered according to their own criteria.
Also, we have to take into account that 26% of the patients who completed both versions were patients with psychosis, and there is a well-known difficulty in abstract thinking in psychosis (McCutcheon et al., 2019). Furthermore, although non-significant, there were differences in the baseline level of anhedonia: total scores of the DARS were lower (representing higher anhedonia) in those participants who did not complete the digital version, so anhedonia itself could explain lower motivation to complete the questionnaires in the second step.
Another particularity of our study was the two-step design, as participants who were recruited to validate the Spanish DARS were invited to complete the same questionnaires in a digital format during the next week on their own, without any reminders, control over the testing environment, or economic compensation. Indeed, almost half of the initially selected patients decided not to complete them. This possibly represents a lack of interest of participants in the study, or other clinical features, including anhedonia as mentioned above. It is noteworthy that a high percentage of patients with depression (26 out of 40) did not complete the online questionnaires, while most patients with psychosis did (18 out of 26). The overrepresentation of people with psychosis might reflect the therapeutic relationship with the recruiting clinician; greater severity of depression may also have contributed to format differences across these items. Furthermore, as the paper-and-pencil version of the DARS was always completed prior to the digital version, order effects cannot be ruled out. Finally, due to our sample size, we did not formally evaluate the structural invariance of the paper-and-pencil and digital formats of the DARS; future research completing such analyses would provide valuable guidance on how to interpret any differences across these formats.
Our results highlight that full equivalence cannot be taken for granted in the digital migration of psychometric instruments, as has been previously pointed out (Alfonsson et al., 2014; van Ballegooijen et al., 2016). These results contrast with previous results of our group demonstrating the equivalence between the paper-and-pencil and the electronic format of the SHAPS (Montoro et al., 2020), the gold standard questionnaire for assessing anhedonia. Although almost all psychiatric self-report questionnaires are equivalent in digital and classical formats, this is not universally valid. Notably, for more complex instruments such as the Symptom Checklist-90-R or the General Health Questionnaire-28, studies showed worse reliability between paper-and-pencil and digital versions (Alfonsson et al., 2014). These questionnaires are designed to capture many different domains of psychological constructs and have several subscales, similarly to the DARS, which probably explains the partial equivalence for the DARS, while for a simple questionnaire such as the SHAPS equivalence was clearer (Montoro et al., 2020). Therefore, when researchers design studies, they must carefully consider which format to use to engage participants and properly measure psychological phenomena.
This work is original in studying the digital migration of a novel anhedonia questionnaire and in its design, which allowed participants to decide whether to participate in both steps. This design also represents the main limitation of the study, as a substantial part of the sample was lost to follow-up, resulting in a relatively small sample size. Other limitations are that neither the time gap between completing the two formats nor the interface on which participants chose to complete the digital questionnaire (computer, smartphone, or any other device) was considered, although the latter is known to influence results (Tourangeau et al., 2004; Thorndike et al., 2009). The initial sample composition, with a variety of psychiatric diagnoses, may be considered a strength, as a real clinical population was represented; nevertheless, the final non-selected sample might have affected our results.
Our findings demonstrated that the digital DARS is a robust questionnaire, but full equivalence with the paper-and-pencil format cannot be assumed without caution, highlighting the need for digital validation of psychometric instruments, especially in studies in which both formats are intended to be used.
Acknowledgements
This work was partly funded by Instituto de Salud Carlos III under Grant PI16/01852, the American Foundation for Suicide Prevention under Grant LSRG-1-005-16, the Ministerio de Ciencia, Innovación y Universidades under Grants RTI2018-099655-B-I00 and TEC2017-92552-EXP; and the Comunidad de Madrid under Grants Y2018/TCS-4705 and PRACTICO-CM.
Author contributions
EAD and MMB recruited and assessed the participants, conducted the literature search, and co-wrote the first draft of the manuscript. IB performed the data analysis and drafted the results section. LCQ, SJR and SHK collaborated in designing the study. EBG designed the study, obtained the funding, and participated in the data analysis. MLB collaborated in drafting the manuscript and made the figures and tables. All authors critically reviewed the article and gave their final approval to the manuscript.
Conflict of interest
EBG designed MEmind Wellness Tracker. The other authors declare no conflict of interest.
Data availability statement
Data are available on request.