Significant outcomes
- COVID-19 pandemic-related psychopathology covaries with the pandemic pressure over time.
- There was a relative decrease in the level of COVID-19 pandemic-related psychopathology over the course of the second wave of the pandemic in Denmark, likely indicating habituation.
- Following training based on manually labelled clinical notes from electronic health records, machine learning methods may be used to label large numbers of clinical notes that cannot be handled manually.
Limitations
- This work is based on a subset of clinical notes containing COVID-19-related search queries and is thus likely to underestimate the extent of COVID-19-related psychopathology.
- While the developed models perform relatively well, they do not include recent developments in natural language processing where pre-trained models are used to obtain a contextualised representation of natural text.
- While the estimated feature importance scores allow for insights into the applied model, it should be noted that retraining could result in different importance scores given the non-deterministic elements of the training process (e.g. random splits).
Introduction
The COVID-19 pandemic is believed to have a major negative impact on global mental health due to the viral disease itself as well as the associated lockdowns, social distancing, isolation, fear, and increased uncertainty (Brooks et al., Reference Brooks, Webster, Smith, Woodland, Wessely, Greenberg and Rubin2020; Sønderskov et al., Reference Sønderskov, Dinesen, Vistisen and Østergaard2021; Szcześniak et al., Reference Szcześniak, Gładka, Misiak, Cyran and Rymaszewska2021; He et al., Reference He, Wei, Yang, Zhang, Cheng, Feng, Yang, Zhuang, Chen, Ren, Li, Wang, Mao, Chen, Liao, Cui, Li, He, Lei and Qiu2021). Individuals with preexisting mental illness are likely to be particularly vulnerable to the psychological stress associated with the COVID-19 pandemic (Jefsen et al., Reference Jefsen, Rohde, Nørremark and Østergaard2020; Rohde et al., Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020; Saunders et al., Reference Saunders, Buckman, Leibowitz, Cape and Pilling2021). Indeed, based on manual screening of clinical notes from electronic health records, Rohde et al. (Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020) recently showed that many patients with mental illness appear to have developed COVID-19 pandemic-related psychopathology – that is, symptoms of mental illness that seem to be either directly or indirectly caused by the pandemic – during the first wave of the pandemic in the spring of 2020. Here, we continue this effort in an attempt to continuously monitor the extent of COVID-19 pandemic-related psychopathology at the level of the entire patient population of the Psychiatric Services in the Central Denmark Region (CDR) by means of machine learning.
Methods
Setting and data
This project is based on data from clinical notes from the Psychiatric Services of the CDR, which provides inpatient, outpatient, and emergency treatment for all types of mental disorders for the approximately 1.3 million inhabitants of the CDR. For a further description of the patient population (e.g., diagnostic distribution), please see Hansen et al. (Reference Hansen, Enevoldsen, Bernstorff, Nielbo, Danielsen and Østergaard2021).
Manual labelling of COVID-19 pandemic-related psychopathology
The point of departure for the training of a machine learning model to monitor COVID-19 pandemic-related psychopathology is the manual labelling of such cases by Rohde et al. (Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020). In brief, in the work by Rohde et al., all 412,804 clinical notes on adult patients (aged 18 or above) from the Psychiatric Services of the CDR in the period from March 1st, 2020 to March 23rd, 2020 were initially filtered using the following COVID-19-related search queries (Danish (English)): ‘corona’ (corona), ‘covid’ (covid), ‘virus’ (virus), ‘epidemi’ (epidemic), ‘pandemi’ (pandemic), and ‘smitte’ (contaminate/contamination), including compounds. A total of 11,072 of the 412,804 clinical notes contained at least one of these search queries and were subsequently manually labelled by CR and OHJ as either ‘containing a description of COVID-19 pandemic-related psychopathology’ or ‘not containing a description of COVID-19 pandemic-related psychopathology’. An attempt was made to distinguish between rational and pathological reactions to the pandemic. A total of 1357 of the 11,072 clinical notes were found to include descriptions of COVID-19 pandemic-related psychopathology. For more details, please see Rohde et al. (Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020).
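As a rough illustration of this filtering step, a substring search of the kind described above could be implemented as in the following sketch. The DataFrame and column names (`notes`, `note_text`) are assumptions made for the example, not the actual pipeline used by Rohde et al.:

```python
# Minimal sketch of keyword filtering; `notes` and `note_text` are assumed names.
import re
import pandas as pd

notes = pd.read_csv("clinical_notes.csv")  # hypothetical export of clinical notes

# COVID-19-related search queries; plain substring matching also catches compounds
# such as 'coronakrise' or 'smittefare'.
queries = ["corona", "covid", "virus", "epidemi", "pandemi", "smitte"]
pattern = re.compile("|".join(queries), flags=re.IGNORECASE)

filtered = notes[notes["note_text"].str.contains(pattern, na=False)]
```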
Training and testing machine learning models for labelling of COVID-19 pandemic-related psychopathology
To train and test supervised machine learning models for the labelling of COVID-19 pandemic-related psychopathology, the 11,072 manually labelled clinical notes (1357 describing COVID-19 pandemic-related psychopathology and 9715 without COVID-19 pandemic-related psychopathology) from Rohde et al. were split randomly into a training set (70% of the notes) and a test set (30% of the notes), while ensuring that no patient appeared in both the training and the test set. Subsequently, Naïve Bayes and XGBoost models were trained using a 5-fold cross-validated grid search on the training set. The Naïve Bayes classifier, which provides baseline performance, is a simple and efficient machine learning classifier based on Bayes' theorem (Zhang, Reference Zhang2004). XGBoost is a state-of-the-art gradient boosting model that ensembles weak learners – typically decision trees – and has been shown to consistently obtain competitive results in classification tasks (Chen & Guestrin, Reference Chen and Guestrin2016). The models were trained on the natural language text in the clinical notes, the text field indicators (‘observations of the patient’, ‘current mental status’, ‘plan’, ‘conclusion’), the patient’s sex, diagnosis, and inpatient or outpatient status. The trained models were then validated on the test set consisting of the remaining 30% of the clinical notes. Furthermore, we examined the most important features, defined as the average gain of the splits in decision trees using the given feature. Model training and selection were performed in Python (version 3.7.9) using scikit-learn version 0.23.2 (Buitinck et al., Reference Buitinck, Louppe, Blondel, Pedregosa, Mueller, Grisel, Niculae, Prettenhofer, Gramfort, Grobler, Layton, VanderPlas, Joly, Holt and Varoquaux2013) and imblearn version 0.7.0 (Lemaître et al., Reference Lemaître, Nogueira and Aridas2017). For XGBoost, the Python package xgboost version 1.0.2 was used (Chen & Guestrin, Reference Chen and Guestrin2016).
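A minimal sketch of this training setup is given below. It is a simplified illustration rather than the authors' exact pipeline: it uses only the note text (via a TF-IDF representation), omits the structured features and any class-imbalance handling via imblearn, and the file and column names are assumptions.

```python
# Simplified training sketch: patient-level 70/30 split, TF-IDF text features,
# and a 5-fold cross-validated grid search over an XGBoost classifier.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, GroupShuffleSplit
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

labelled = pd.read_csv("labelled_notes.csv")  # hypothetical file with the 11,072 labelled notes

# 70/30 split that keeps all notes from a given patient in the same partition
splitter = GroupShuffleSplit(n_splits=1, test_size=0.30, random_state=42)
train_idx, test_idx = next(splitter.split(labelled, groups=labelled["patient_id"]))
train, test = labelled.iloc[train_idx], labelled.iloc[test_idx]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=10_000)),  # bag-of-words text representation
    ("clf", XGBClassifier()),
])

param_grid = {  # illustrative hyperparameter grid
    "clf__n_estimators": [100, 300],
    "clf__max_depth": [3, 6],
    "clf__learning_rate": [0.05, 0.1],
}

search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=5, n_jobs=-1)
search.fit(train["note_text"], train["label"])  # label: 1 = pandemic-related psychopathology

print("Test AUC:", search.score(test["note_text"], test["label"]))
```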
Application of the trained machine learning model
Following training and testing, the better performing of the two machine learning models (see the Results section), as judged by the area under the receiver operating characteristic curve (AUC), was applied to the 78,540 clinical notes (out of a total of 33,363,187) from all adult patients (aged 18 or above) from the Psychiatric Services of the CDR in the period from April 1st, 2020 to March 23rd, 2021 that contained at least one of the COVID-19-related search queries also used by Rohde et al. (Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020). These notes were labelled dichotomously (COVID-19 pandemic-related psychopathology: yes/no) based on a threshold calculated from the test set to result in 95% specificity.
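Continuing from the training sketch above, the decision threshold yielding approximately 95% specificity on the test set could be derived and applied as follows; `new_notes` is a hypothetical stand-in for the 78,540 unlabelled notes:

```python
# Sketch: pick the threshold giving ~95% specificity on the test set, then
# apply it to the unlabelled notes (continues from the training sketch above).
import pandas as pd
from sklearn.metrics import roc_curve

test_scores = search.predict_proba(test["note_text"])[:, 1]
fpr, tpr, thresholds = roc_curve(test["label"], test_scores)

# specificity = 1 - FPR; take the lowest threshold that still keeps FPR <= 5%
threshold = thresholds[fpr <= 0.05][-1]

new_notes = pd.read_csv("notes_2020_04_to_2021_03.csv")  # hypothetical unlabelled notes
new_notes["predicted_label"] = (
    search.predict_proba(new_notes["note_text"])[:, 1] >= threshold
).astype(int)
```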
Validation of the performance of the trained machine learning model
To validate the performance of the machine learning model in the labelling of COVID-19 pandemic-related psychopathology from April 1st, 2020 to March 23rd, 2021, 500 randomly drawn clinical notes from this period were labelled manually by CR and OHJ [using an approach similar to that of Rohde et al. (Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020)] and compared to the labelling performed by the model. Lastly, the number of clinical notes containing COVID-19 pandemic-related psychopathology over the period from March 1st, 2020 to March 23rd, 2021 (manually labelled notes from March 1st, 2020 to March 23rd, 2020 and machine learning model labelled notes from April 1st, 2020 to March 23rd, 2021) was compared with the number of COVID-19-related deaths in Denmark (SSI, 2021) via Pearson correlation. For more information on model selection, preprocessing, and validation, see the Supplementary Material.
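The comparison with the number of COVID-19-related deaths could, for instance, be carried out as sketched below; the weekly aggregation, data frames, and column names are illustrative assumptions rather than the exact analysis performed:

```python
# Sketch: Pearson correlation between weekly counts of flagged notes and
# weekly COVID-19 deaths; date columns are assumed to be datetimes.
import pandas as pd
from scipy.stats import pearsonr

flagged = new_notes[new_notes["predicted_label"] == 1]
weekly_notes = flagged.resample("W", on="note_date").size().rename("notes")

deaths = pd.read_csv("covid19_deaths_dk.csv", parse_dates=["date"])  # hypothetical SSI export
weekly_deaths = deaths.resample("W", on="date")["deaths"].sum()

aligned = pd.concat([weekly_notes, weekly_deaths], axis=1).dropna()
r, p_value = pearsonr(aligned["notes"], aligned["deaths"])
print(f"Pearson r = {r:.2f} (R² = {r**2:.2f}), p = {p_value:.4f}")
```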
Approval
This project was approved by the Chief Medical Officer of Psychiatry of the Central Danish Region as part of quality development aimed at monitoring the level of pandemic-related psychopathology during the COVID-19 pandemic.
Results
Training and testing machine learning models for labelling of COVID-19 pandemic-related psychopathology
The best performing machine learning model was XGBoost, which achieved an AUC of 0.88 with 95% specificity and 36% sensitivity for the labelling of COVID-19 pandemic-related psychopathology in the test set. For comparison, the Naïve Bayes model achieved an AUC of 0.83. Figure 1 shows the most important features of the XGBoost model. The text field indicator ‘current mental state’ was found to be the most important feature, but words related to psychopathology, such as ‘angst’ (anxiety) and ‘presset’ (under pressure/stress), were also important. The same was the case for a ‘missing’ diagnosis (diagnosis not yet assigned).
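For reference, gain-based feature importances of the kind shown in Fig. 1 can be extracted from a fitted XGBoost model as sketched below, assuming the simplified text-only pipeline from the training sketch in the Methods section:

```python
# Sketch: feature importance as the average gain of splits using each feature
# (`search` is the fitted GridSearchCV object from the training sketch).
best_pipeline = search.best_estimator_
booster = best_pipeline.named_steps["clf"].get_booster()
gain = booster.get_score(importance_type="gain")

# Map the booster's internal feature names (f0, f1, ...) back to TF-IDF terms;
# with scikit-learn 0.23.2, use get_feature_names() instead.
terms = best_pipeline.named_steps["tfidf"].get_feature_names_out()
top20 = sorted(gain.items(), key=lambda kv: kv[1], reverse=True)[:20]
for feature_id, score in top20:
    print(terms[int(feature_id[1:])], round(score, 2))
```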
Application of the trained machine learning model
When the trained XGBoost model was applied to the 78,540 clinical notes from the period from April 1st, 2020 to March 23rd, 2021 that contained a COVID-19-related search query, 7125 of the notes were labelled as containing descriptions of COVID-19 pandemic-related psychopathology (see Fig. 2 for an illustration).
Validation of the performance of the trained machine learning model
For validation, the labels assigned by the XGBoost model were compared to the manual labels assigned by CR and OHJ. This comparison revealed a slight performance decrease to a specificity of 92% and a sensitivity of 33%, but no systematic decline in performance over time (see Figure S1 in the Supplementary Material), as might otherwise have been expected if there had been a distribution shift (i.e., a substantial change in the content of the clinical notes over time).
Figure 2 shows the development in COVID-19 pandemic-related psychopathology alongside the number of deaths due to COVID-19 in Denmark. There was a positive correlation between the level of pandemic-related psychopathology and the number of deaths due to COVID-19 in Denmark (R² = 0.13, 95% CI [0.06, 0.20], p < 0.0001; smoothed: R² = 0.20, 95% CI [0.12, 0.29], p < 0.0001).
Discussion
In this project, we successfully trained an XGBoost machine learning model to monitor the extent of COVID-19 pandemic-related psychopathology reported in the clinical notes of the electronic health records of a large psychiatric service. The training and testing of the model were based on a dataset manually labelled for COVID-19 pandemic-related psychopathology. Subsequently, the trained XGBoost model labelled more recent clinical notes from the electronic health records for COVID-19 pandemic-related psychopathology. When the number of COVID-19-related deaths was compared with the extent of COVID-19 pandemic-related psychopathology, the expected positive association was observed.
The most important feature for the XGBoost model was the text field indicator ‘current mental state’. This is likely because notes describing psychopathology (of which the ‘current mental state’ text field is perhaps the most prominent example) are more likely to contain descriptions of COVID-19 pandemic-related psychopathology than notes not directly linked to descriptions of psychopathology. In accordance with this observation, words describing psychopathology, such as ‘anxiety’ and ‘under pressure/stress’, were also found to be important features. Furthermore, a missing diagnosis (diagnosis not yet assigned) was found to be of importance, suggesting that there were many newly referred patients among those experiencing pandemic-related psychopathology. Intensifiers such as ‘meget’ (very) may be important through an interaction effect when appearing together with other meaningful words, e.g. ‘meget bekymret for COVID-19’ (very worried about COVID-19). In fact, intensifiers were often used by OHJ and CR in their manual labelling to differentiate between rational and pathological reactions to the pandemic (Rohde et al., Reference Rohde, Jefsen, Nørremark, Danielsen and Østergaard2020).
Interpretation of feature importance requires a thorough understanding of the data used for model training. For instance, ‘video consultation’ could be interpreted as reflecting a patient’s strong preference for staying at home due to a perceived risk of contracting COVID-19. However, due to hospital policies during the pandemic lockdown, many consultations were held online, and the term ‘video consultation’ therefore often appeared alongside a COVID-19-related search query without being related to the impact of the pandemic on the patient. Similarly, features such as ‘tider’ (appointments/times) can have multiple meanings depending on the context. Specifically, it refers to hospital appointments in many cases, but the word may also appear in phrases such as ‘til tider’ (at times).
The observed positive association between the number of deaths due to COVID-19 and the number of clinical notes describing pandemic-related psychopathology indicates that the pandemic has had a negative psychological impact on patients with mental illness. This finding is clinically meaningful (and therefore serves as an indirect validation of the model) and compatible with the covariation between the pandemic pressure and the level of psychological well-being observed at the general population level in Denmark (Sønderskov et al., Reference Sønderskov, Dinesen, Vistisen and Østergaard2021). In our data, however, the association seems less pronounced for the second wave than for the first wave of the pandemic. As there was no systematic decline in model performance over the observation period, the weakened association is likely due to habituation – i.e. patients becoming less sensitive to the development of the pandemic as time passes, possibly because the situation appears less uncertain than it did in the initial phase of the pandemic.
While our current approach shows promising results, it does have limitations. Perhaps most importantly, the developed model is likely to underestimate the number of cases of COVID-19 pandemic-related psychopathology. Hence, rather than providing estimates of absolute numbers, the model is better suited to monitor the relative development in COVID-19 pandemic-related psychopathology over time. This is an important metric that may aid psychiatric services in planning management of the psychological consequences of the COVID-19 pandemic. Furthermore, the results are a testament to the potential of applying machine learning to structured and natural language data from electronic health records in clinical psychiatry (Hansen et al., Reference Hansen, Enevoldsen, Bernstorff, Nielbo, Danielsen and Østergaard2021). Specifically, we show that, following training based on manually labelled clinical notes from electronic health records, machine learning methods can be used to label large numbers of clinical notes that cannot be handled manually.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/neu.2022.2
Acknowledgements
The authors thank Bettina Nørremark from Aarhus University Hospital – Psychiatry for her assistance with extraction of data.
Data availability
Due to the sensitive nature of the data, they are only available for quality development projects to employees in the Central Denmark Region upon application to, and approval by, the Chief Medical Officer of Psychiatry.
Financial support
The study is supported by an unconditional grant from the Novo Nordisk Foundation to Østergaard (grant number: NNF20SA0062874). Rohde is supported by grants from the Danish Diabetes Academy, funded by the Novo Nordisk Foundation (grant number: NNF17SA0031406), and from the Lundbeck Foundation (grant number: R358-2020-2342). Jefsen is supported by a grant from the Health Research Foundation of Central Denmark Region (grant number: R64-A3090-B1898). Østergaard is further supported by grants from the Lundbeck Foundation (grant numbers: R358-2020-2341 and R344-2020-1073), the Danish Cancer Society (grant number: R283-A16461), the Central Denmark Region Fund for Strengthening of Health Science (grant number: 1-36-72-4-20), the Danish Agency for Digitisation Investment Fund for New Technologies (grant number: 2020-6720), and Independent Research Fund Denmark (grant number: 7016-00048B).
Conflict of interest
Rohde received the 2020 Lundbeck Foundation Talent Prize. Østergaard received the 2020 Lundbeck Foundation Young Investigator Prize. Furthermore, Østergaard owns units of mutual funds with stock tickers DKIGI and WEKAFKI, as well as units of exchange traded funds with stock tickers TRET, 2B76, EXH2, QDVE, QDVH, USPY, SADM, and BATE. Danielsen has received speaker honorarium from Otsuka Pharmaceutical. The remaining authors report no conflicts of interest.