Characterizing DSM-5 and ICD-11 personality disorder features in psychiatric inpatients at scale using electronic health records

Published online by Cambridge University Press:  23 September 2019

Sergio A. Barroilhet
Affiliation:
Center for Quantitative Health, Division of Clinical Research and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Department of Psychiatry, Tufts University School of Medicine, Boston, MA, USA; University Psychiatric Clinic, University of Chile Clinical Hospital, Santiago, Chile
Amelia M. Pellegrini
Affiliation:
Center for Quantitative Health, Division of Clinical Research and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
Thomas H. McCoy
Affiliation:
Center for Quantitative Health, Division of Clinical Research and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
Roy H. Perlis*
Affiliation:
Center for Quantitative Health, Division of Clinical Research and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
*Author for correspondence: Roy H. Perlis, E-mail: rperlis@partners.org

Abstract

Background

Investigation of personality traits and pathology in large, generalizable clinical cohorts has been hindered by inconsistent assessment and failure to consider a range of personality disorders (PDs) simultaneously.

Methods

We applied natural language processing (NLP) of electronic health record notes to characterize a psychiatric inpatient cohort. A set of terms reflecting personality trait domains were derived, expanded, and then refined based on expert consensus. Latent Dirichlet allocation was used to score notes to estimate the extent to which any given note reflected PD topics. Regression models were used to examine the relationship of these estimates with sociodemographic features and length of stay.

Results

Among 3623 patients with 4702 admissions, being male, non-white, having a low burden of medical comorbidity, being admitted through the emergency department, and having public insurance were independently associated with greater levels of disinhibition, detachment, and psychoticism. Being female, white, and having private insurance were independently associated with greater levels of negative affectivity. The presence of disinhibition, psychoticism, and negative affectivity were each significantly associated with a longer stay, while detachment was associated with a shorter stay.

Conclusions

Personality features can be systematically and scalably measured using NLP in the inpatient setting, and some of these features associate with length of stay. Developing treatment strategies for patients scoring high in certain personality dimensions may facilitate more efficient, targeted interventions, and may help reduce the impact of personality features on mental health service utilization.

Type
Original Articles
Copyright
Copyright © Cambridge University Press 2019

Introduction

Personality disorder (PD) diagnoses have an important public health impact as they predict increased utilization of medical and mental health care services (Twomey et al., 2015; Tyrer et al., 2015; Huprich, 2018). Studies using structured diagnostic interviews have identified a PD diagnosis in 40–82% of psychiatric outpatient populations (Zimmerman et al., 2005; Newton-Howes et al., 2010; Beckwith et al., 2014) and in 64–74% of psychiatric inpatient populations (Grilo et al., 1998; Keown et al., 2005; Stevenson et al., 2011), further increasing utilization in these settings (Twomey et al., 2015).

The variability in these prevalence estimates illustrates the challenge of studying PDs in real-world settings. Despite high levels of health care resource use, high rates of polypharmacy and hospital admissions (Quirk et al., 2016), and the associated economic burden (Soeteman et al., 2008), evaluating personality dimensions is still not a part of routine assessment in psychiatric inpatient units (Fok et al., 2014; Jacobs et al., 2015). Likewise, in administrative data sets, PDs may not be coded consistently, or may be treated as a single undifferentiated category (Jiménez et al., 2004; McLay et al., 2005; Compton et al., 2006; Jacobs et al., 2015; Newman et al., 2018). On the other hand, the current categorical diagnosis for PDs has been questioned as not scientifically valid, while PD clinical features are being increasingly understood as dimensional phenotypes (Bjelland et al., 2009; Haslam et al., 2012; Skodol, 2012; Tyrer et al., 2015). Accordingly, the DSM-5 and ICD-11 have both moved toward dimensional models of PD (Bach et al., 2018a, 2018b), and these models remain to be studied. Novel approaches to explore personality dimensions in psychiatric cohorts are needed (Quirk et al., 2016).

To address this gap, we applied natural language processing (NLP) of electronic health records (EHRs) to characterize a large inpatient psychiatric cohort (Manning and Schütze, 1999). We hypothesized that EHR notes would capture relevant clinical descriptions as unstructured data, quantifiable by validated algorithmic tools that have been previously used for medical (Yu et al., 2014; Yim et al., 2016) and mental health research (Althoff et al., 2016; Can et al., 2016; McCoy et al., 2016; Birnie et al., 2018; McCoy et al., 2018; Afshar et al., 2019). In particular, we examined the relationship between these dimensions and sociodemographic and clinical features, as a means of more comprehensively characterizing personality psychopathology in a real-world setting.

Methods

Subjects

Sociodemographic and clinical data were extracted from the health records of patients in the adult psychiatry inpatient unit at Massachusetts General Hospital between 2010 and 2016. Sociodemographic data included age, sex, race, and type of insurance, as well as relevant clinical factors such as admission route (i.e. either via the emergency room or not), length of stay, and Charlson Comorbidity Index. Admission and discharge documentation were extracted for estimation of personality trait domains by NLP. These EHR data were managed as an i2b2 datamart (Murphy et al., 2010).

The Partners HealthCare Human Research Committee approved the study protocol, waiving the requirement for informed consent as detailed by 45 CFR 46.116 as no participant contact was required in this study based on secondary use of data arising from routine clinical care.

Generation of personality phenotypes

Building on our prior work in transdiagnostic psychiatric phenotypes, we developed personality-specific transdiagnostic phenotypes based on NLP (McCoy et al., 2018). This process seeds an NLP model using expert-defined, or curated, terms. As with our prior work, we consulted relevant texts to guide phenotypic seed term generation; in this case, the DSM-5 and ICD-11. The DSM-5 (section III) (American Psychiatric Association, 2013) and ICD-11 (Tyrer et al., 2015; Bach and First, 2018) assess PDs based on determining levels of functioning/impairment and stylistic traits organized in personality dimensions. In the DSM-5, these dimensions are Negative Affectivity, Detachment, Antagonism, Disinhibition, and Psychoticism. The ICD-11 includes the same dimensions, except Psychoticism, and adds Anankastia (or Compulsivity) as a new dimension. Definitions of overlapping dimensions are similar between the DSM-5 and ICD-11 (Bach et al., 2018b). These extracted trait domain definitions, according to Skodol (2018) and Tyrer et al. (2015), are shown in Table 1 along with examples of the personality features that comprise these dimensions. These DSM-5 and ICD-11 derived terms were then expanded using the Personality Inventory for DSM-5 items (Krueger et al., 2012), other personality trait studies (Ashton et al., 2004, 2012; Bach et al., 2018a, 2018b), and a thesaurus (Dictionary.com, LLC, 2019). From the generated synonym list, a clinically refined set of NLP seed terms was selected based on expert consensus (S.A.B., R.H.P.; Table 1).

Table 1. Personality trait domains in DSM-5 and ICD-11

As these pre-selected term lists are unlikely to capture the full diversity of clinical vocabulary, we applied a previously reported method for expanding clinical vocabularies (McCoy et al., 2018). In this method, Latent Dirichlet allocation (LDA) is used to fit a probabilistic topic model to all documents. Topic loadings have been used as LDA-determined phenotypes for computational phenotyping, as discussed in our prior research (McCoy et al., 2017, 2018). Briefly, with an LDA-based topic model, documents are probability distributions over topics, and each topic is a probability distribution over the full vocabulary (Blei et al., 2003; Blei, 2012). The posterior term-topic distributions are inspected to identify the topic under which the cumulative probability of the expert-selected personality tokens within each list is greatest. This total cumulative probability of the seed word list is used to identify the relevant topic. Thereafter, that topic's topic-document weights are used as the phenotype for the relevant domain. In essence, this approach asks which LDA topics capture the greatest number of curated tokens for a given PD, and then uses the ‘best’ topic to represent that disorder. The tokens (terms) incorporated in topics corresponding to each concept are listed in Table 1, and the entire process is outlined in Fig. 1. For the topic modeling, we used the R interface to a Gibbs sampler implementation of LDA (topicmodels v0.2), one of many widely used open source implementations of LDA licensed under free software licenses (McCallum, 2002; Řehůřek and Sojka, 2010; Grün and Hornik, 2018).
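The topic-matching step can be illustrated with a minimal sketch. It is a toy illustration under stated assumptions, not the authors' pipeline (which used the R package topicmodels): the vocabulary, the probabilities, the seed list, and the helper name `select_topic` are all hypothetical. Given fitted term-topic distributions, it sums the probability of the seed terms under each topic, picks the topic with the largest sum, and reads that topic's weight in each document as the phenotype.

```python
# Toy illustration of matching a curated seed-term list to an LDA topic.
# All terms, probabilities, and names here are hypothetical.

def select_topic(topic_term_probs, seed_terms):
    """Return (best_topic_index, per-topic summed seed-term probability)."""
    scores = [
        sum(topic.get(term, 0.0) for term in seed_terms)
        for topic in topic_term_probs
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Each topic is a probability distribution over a (tiny) vocabulary.
topic_term_probs = [
    {"impulsive": 0.40, "reckless": 0.30, "withdrawn": 0.05, "anxious": 0.25},
    {"impulsive": 0.05, "reckless": 0.05, "withdrawn": 0.60, "anxious": 0.30},
]

# Each document (clinical note) is a probability distribution over topics.
doc_topic_weights = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]

# Hypothetical curated seed terms for one trait domain.
disinhibition_seeds = ["impulsive", "reckless"]

best_topic, seed_mass = select_topic(topic_term_probs, disinhibition_seeds)

# The phenotype for the domain is each document's weight on the matched topic.
phenotype = [weights[best_topic] for weights in doc_topic_weights]
```

In this sketch a seed term absent from a topic contributes zero probability, so the choice of topic depends only on how much probability mass each topic places on the curated list.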

Fig. 1. Diagram of personality phenotype generation through the transfer of a human expert language model into a model learned through unsupervised machine learning. A probabilistic topic model is learned from patients’ clinical documentation through LDA. The learned topics are then matched to the personality symptom domains by linking the learned topic under which expert-identified tokens are most common. Thereafter, the linked topics are used as the phenotype for the linked personality domain.

Study design and analysis

We used robust clustering to account for individuals with multiple admissions. Linear regression modeling adjusting for sex, age, race, insurance type, Charlson Comorbidity Index, and route of admission was used to analyze personality domain loadings in different sociodemographic profiles. Linear regression adjusting for these sociodemographic variables, as well as for other personality trait domains, was used to explore the association between personality trait domains and hospital length of stay. Analyses utilized Stata/SE 13.1 (StataCorp, College Station, TX, USA).
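To make the idea of clustering on the individual concrete, the following is a simplified sketch of ordinary least squares with cluster-robust (sandwich) standard errors. It is not the analysis code, which was run in Stata with multiple covariates: it uses a single predictor, made-up data, a hypothetical function name (`ols_cluster_robust`), and an assumed finite-sample correction that the text does not specify.

```python
import math
from collections import defaultdict

def ols_cluster_robust(x, y, clusters):
    """Fit y = a + b*x by OLS; return (a, b, se_a, se_b) with cluster-robust SEs."""
    n, k = len(x), 2
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(u * v for u, v in zip(x, y))
    det = n * sxx - sx * sx
    # Inverse of X'X for the design matrix with columns [1, x].
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    a = inv[0][0] * sy + inv[0][1] * sxy
    b = inv[1][0] * sy + inv[1][1] * sxy

    # Sum the score contributions (residual * regressors) within each cluster.
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, clusters):
        resid = yi - (a + b * xi)
        scores[g][0] += resid
        scores[g][1] += resid * xi

    # Diagonal of the sandwich inv * (sum_g s_g s_g') * inv, built per cluster.
    var_a = var_b = 0.0
    for s0, s1 in scores.values():
        var_a += (inv[0][0] * s0 + inv[0][1] * s1) ** 2
        var_b += (inv[1][0] * s0 + inv[1][1] * s1) ** 2

    # Finite-sample correction: an assumed, commonly used choice.
    g_count = len(scores)
    correction = (g_count / (g_count - 1)) * ((n - 1) / (n - k))
    return a, b, math.sqrt(correction * var_a), math.sqrt(correction * var_b)

# Hypothetical toy data: five admissions from three patients.
score = [0.0, 1.0, 2.0, 3.0, 4.0]      # domain loading (arbitrary units)
los_days = [1.0, 3.0, 2.0, 5.0, 4.0]   # length of stay in days
patient = ["A", "A", "B", "B", "C"]    # repeated admissions share a patient ID

intercept, slope, se_intercept, se_slope = ols_cluster_robust(score, los_days, patient)
```

The point estimates are the usual OLS values; only the standard errors change, because residuals from the same patient are summed before being squared rather than treated as independent.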

Results

Characteristics of the full set of 4702 admissions for 3623 individuals are displayed in Table 2. Individual personality trait domains differed in their association with sociodemographic features (Table 3). Being male, non-white, having a low burden of medical comorbidity, being admitted through the emergency room, and having public insurance were independently associated with higher levels of disinhibition, detachment, and psychoticism. On the other hand, being female, white, and using private insurance were independently associated with increased levels of negative affectivity. Age was also associated with personality features: on average, patients with increased levels of disinhibition and psychoticism were younger, while patients with more negative affectivity were older.

Table 2. Cohort characteristics at admission

Table 3. Association between sociodemographic features and personality trait domains§

CI, confidence interval; ER, emergency room.

ᵃ β (95% confidence interval) is equal to the variation (and its 95% CI) in days of length of stay, if the named personality domain score increased/decreased by 10%.

*p < 0.05; **p < 0.01; ***p < 0.001.

We next examined the association between personality trait domains extracted from clinical notes and length of inpatient stay. As shown in Table 4, the presence of disinhibition, psychoticism, and negative affectivity was significantly associated with a longer length of stay. In contrast, detachment was associated with a shorter length of stay. A 10% increase in the disinhibition domain score was associated with a ~2.7-day increase in length of stay. Similarly, a 10% increase in the psychoticism and negative affectivity domain scores was associated with an increase in length of stay of ~0.8 and ~0.7 days, respectively. On the other hand, having a 10% increase in detachment features was associated with a decreased length of stay by nearly 0.3 days.

Table 4. Regression model of personality trait domains and hospital length of stay (n = 4687 admissions)

ER, emergency room.

ᵃ β (95% confidence interval) is equal to the variation (and its 95% CI) in days of length of stay, if the named personality trait domain score increased/decreased by 10%.

Discussion

As anticipated based on studies using traditional personality measures, we observed an association between sociodemographic features and individual personality trait domains (Lynn and Martin, 1997; Kjelsås and Augestad, 2004). Demographic profiles are useful to predict certain behaviors (Krismayer et al., 2019), but their relationship with dimensional traits is less studied (Al-Halabí et al., 2010).

In particular, we found that greater scores in disinhibition, negative affectivity, and psychoticism were associated with a significantly longer length of stay, while a greater score in detachment was associated with a decreased length of stay. One way to interpret the effect sizes we observed is to compare our results to the US national average length of stay in inpatient psychiatric units, which is 6.6 days (Heslin et al., 2015). According to our results, an increase of 10% in the disinhibition dimension score may increase inpatient length of stay by 40% when compared to the national average. Likewise, patients scoring 10% higher in either psychoticism or negative affectivity may have a length of stay about 12% longer than the national average. Conversely, an increase of 10% in the detachment dimension score may decrease length of stay by 6% when compared to the national average. Given these results, personality may be a relevant factor to consider in terms of length of stay in the psychiatric inpatient setting.
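The comparison with the national average is a simple ratio, sketched below using the rounded day values quoted in the Results. The variable and function names are illustrative. Because the inputs are rounded, the output reproduces the percentages in the text only approximately: roughly 41% for disinhibition, 12% for psychoticism, and 11% for negative affectivity, while detachment comes out nearer 4.5% than the reported 6%, which may reflect rounding of the coefficient.

```python
NATIONAL_AVERAGE_LOS_DAYS = 6.6  # Heslin et al., 2015, as cited in the text

# Approximate change in length of stay (days) per 10% increase in domain score,
# taken from the rounded values quoted in the Results.
delta_days = {
    "disinhibition": 2.7,
    "psychoticism": 0.8,
    "negative affectivity": 0.7,
    "detachment": -0.3,
}

def relative_change(days, baseline=NATIONAL_AVERAGE_LOS_DAYS):
    """Express a change in days as a percentage of the baseline length of stay."""
    return 100.0 * days / baseline

relative = {domain: round(relative_change(d), 1) for domain, d in delta_days.items()}

for domain, pct in relative.items():
    print(f"{domain}: {pct:+.1f}% of the national average length of stay")
```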

While there is no doubt that PDs in general are associated with an increase in mental health services use in the outpatient setting (Twomey et al., 2015; Tyrer et al., 2015), this relationship has been less clear in terms of psychiatric inpatient services use. In contrast to our results, several epidemiological studies (Jacobs et al., 2015; Piccinelli et al., 2016; Pauselli et al., 2017; Newman et al., 2018) and service use studies (Jiménez et al., 2004; McLay et al., 2005; Compton et al., 2006; Leontieva and Gregory, 2013; Habermeyer et al., 2018) have shown that PDs do not necessarily increase, and may even shorten, length of stay. Consequently, personality may have been overlooked as an addressable factor in efforts to optimize services use. Only a few studies have found that personality was associated with an increased length of stay (Tyrer and Simmonds, 2003; Fok et al., 2014). However, neither of these studies explored which personality traits or diagnoses were associated with this outcome.

The only prior study we identified that similarly investigated the association between different personality types and use of services in the psychiatric inpatient setting is Keown et al. (2005). This study considered a cohort of 193 patients from a community served by a mental health team in the UK, who were assessed using a structured interview, diagnosed according to the ICD-10, and followed over a 4-year period. Keown et al. found that among non-psychotic patients, having paranoid, dependent, and emotionally unstable PD was associated with an increased length of stay by 150 days in the 4-year period when a patient had one of these PDs, and up to 321 days for patients who had two or all of these PDs. Among psychotic patients, length of stay was associated with having more paranoid and anxious traits. Conversely, in the latter group of psychotic patients, the presence of anankastic traits was associated with a shorter length of stay.

The results of the Keown et al. study are in line with those from our study, since there is evidence of a correspondence between unstable personality and disinhibition, between paranoid personality and psychoticism, and between anxiety/dependence and negative affectivity (Skodol, 2018). On the other hand, unlike the Keown et al. study, we found that detachment – and not anankastic traits – was associated with a shorter length of stay. Interpersonal distance and restriction in the expression of affect may be associated with diminished expression of need for care, so when behavioral symptoms remit, these patients may be more likely to be discharged. However, the anankastia domain also shows a correlation with detachment (Skodol, 2018), ranging from 0.46 (Bach et al., 2018b) to 0.79 (Lugo et al., 2019).

Limitations

There are several limitations of our study to be considered. Extracting personality trait domains from EHR notes of psychiatry inpatients is limited by the fact that topics identified by the NLP process may account for state-related symptoms in the context of acute psychiatric syndromes like depression or psychosis, and not for stable personality traits. However, some studies show that trait assessments established during acute episodes (e.g. a major depressive episode) may be valid reflections of personality pathology rather than artifacts of symptomatic state (Morey et al., 2010; Sevilla-Llewellyn-Jones et al., 2017). Another alternative is that personality may itself influence symptom expression, and hence a clinical feature may be an expression of both symptoms and traits (von Gunten et al., 2009; Widiger, 2011). Personality dimensions and common psychiatric disorders also covary (Wright and Simms, 2015) and may be part of spectra, that is, larger constellations of syndromes sharing some common features (Kotov et al., 2017). The approach taken here does not distinguish trait from state effects, but it may still capture relevant clinical features at a given point in time.

Conversely, this study does address several key limitations in the prior evidence base. First, personality diagnosis tends to be overlooked by clinicians; some studies indicate that PD prevalence may be underestimated in psychiatry inpatient settings (Fok et al., 2014; Jacobs et al., 2015), especially in the absence of structured assessments (Zimmerman et al., 2008; Leontieva and Gregory, 2013; Newman et al., 2018). Second, when using only a clinical diagnostic approach, there may be variation regarding which disorders are more likely to be diagnosed and which may be overlooked. This may be based on factors such as symptom severity, expectation of response to treatment, or familiarity with particular PD diagnoses (Zimmerman and Morgan, 2013; Zimmerman, 2016). Finally, most prior personality studies have used a categorical diagnostic approach for PD diagnosis, which has been criticized for its questionable validity (Haslam et al., 2012; Skodol, 2012; Tyrer et al., 2015).

To address these limitations, we used NLP and machine learning as a novel method to overcome underdiagnosis, selective diagnosis, and lack of characterization of personality in the inpatient setting. In particular, our methodology allows access to extensive clinical information, deliberations, and clinicians' clinical judgment that may not be reflected in coded diagnoses. Likewise, we used a dimensional model to assess personality, in contrast to previous studies that used a categorical approach. This method may account more realistically for specific and clinically significant personality features in the inpatient setting.

Conclusion

In aggregate, our study suggests that personality features can be systematically and scalably measured using NLP in the inpatient setting, and that these features may relevantly contribute to service utilization. Developing treatment strategies for patients scoring high in PD features may facilitate more efficient, targeted interventions, and may help reduce the impact on mental health service utilization.

Author contributions

Drs. Barroilhet and Perlis designed the study, analyzed the results, and drafted the manuscript. Dr. McCoy developed the natural language processing methodology and revised the manuscript. Ms. Pellegrini revised the manuscript. All authors have given final approval of this submission.

Financial support

This work was supported by the National Institute of Mental Health (R.H.P., grant number 1R01MH106577). The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Conflict of interest

Dr. Barroilhet and Ms. Pellegrini report no conflicts of interest. Dr. McCoy receives research funding from the Brain and Behavior Research Foundation, National Institute on Aging, Telefonica Alfa, and The Stanley Center at the Broad Institute. Dr. Perlis holds equity in Psy Therapeutics and Outermost Therapeutics; serves on the scientific advisory boards of Genomind and Takeda; and consults to RID Ventures. Dr. Perlis receives research funding from NIMH, NHLBI, NHGRI, and Telefonica Alfa. Dr. Perlis is an associate editor for JAMA-Network Open.

Ethical standards

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. The study protocol has been approved by the Partners HealthCare Human Research Committee (protocol number 2016P002084). The requirement for informed consent was waived as detailed by 45 CFR 46.116 since no participant contact was required in this study based on secondary use of data arising from routine clinical care.

References

Afshar, M, Phillips, A, Karnik, N, Mueller, J, To, D, Gonzalez, R, Price, R, Cooper, R, Joyce, C and Dligach, D (2019) Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation. Journal of the American Medical Informatics Association: JAMIA 26, 254261.CrossRefGoogle Scholar
Al-Halabí, S, Herrero, R, Saiz, PA, Garcia-Portilla, MP, Corcoran, P, Teresa Bascaran, M, Errasti, JM, Lemos, S and Bobes, J (2010) Sociodemographic factors associated with personality traits assessed through the TCI. Personality and Individual Differences 48, 809814.CrossRefGoogle Scholar
Althoff, T, Clark, K and Leskovec, J (2016) Large-scale analysis of counseling conversations: an application of natural language processing to mental health. Transactions of the Association for Computational Linguistics 4, 463476.CrossRefGoogle Scholar
American Psychiatric Association (2013) Diagnostic and Statistical Manual of Mental Disorders (DSM-5®), 5th Edn. Washington, DC: American Psychiatric Association.Google Scholar
Ashton, MC, Lee, K, Perugini, M, Szarota, P, de Vries, RE, Di Blas, L, Boies, K and De Raad, B (2004) A six-factor structure of personality-descriptive adjectives: solutions from psycholexical studies in seven languages. Journal of Personality and Social Psychology 86, 356366.CrossRefGoogle Scholar
Ashton, MC, Lee, K, de Vries, RE, Hendrickse, J and Born, MPH (2012) The maladaptive personality traits of the personality inventory for DSM-5 (PID-5) in relation to the HEXACO personality factors and schizotypy/dissociation. Journal of Personality Disorders 26, 641659.CrossRefGoogle Scholar
Bach, B and First, MB (2018) Application of the ICD-11 classification of personality disorders. BMC Psychiatry 18, 351.CrossRefGoogle Scholar
Bach, B, Sellbom, M and Simonsen, E (2018a) Personality inventory for DSM-5 (PID-5) in clinical versus nonclinical individuals: generalizability of psychometric features. Assessment 25, 815825.CrossRefGoogle Scholar
Bach, B, Sellbom, M, Skjernov, M and Simonsen, E (2018b) ICD-11 and DSM-5 personality trait domains capture categorical personality disorders: finding a common ground. Australian & New Zealand Journal of Psychiatry 52, 425434.CrossRefGoogle Scholar
Beckwith, H, Moran, PF and Reilly, J (2014) Personality disorder prevalence in psychiatric outpatients: a systematic literature review. Personality and Mental Health 8, 91101.CrossRefGoogle Scholar
Birnie, KI, Stewart, R and Kolliakou, A (2018) Recorded atypical hallucinations in psychotic and affective disorders and associations with non-benzodiazepine hypnotic use: the South London and Maudsley Case Register. BMJ Open 8, e025216.CrossRefGoogle Scholar
Bjelland, I, Lie, SA, Dahl, AA, Mykletun, A, Stordal, E and Kraemer, HC (2009) A dimensional versus a categorical approach to diagnosis: anxiety and depression in the HUNT 2 study. International Journal of Methods in Psychiatric Research 18, 128137.CrossRefGoogle Scholar
Blei, DM (2012) Probabilistic topic models. Communications of the ACM 55, 7784.CrossRefGoogle Scholar
Blei, DM, Ng, AY and Jordan, MI (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 9931022.Google Scholar
Can, D, Marín, RA, Georgiou, PG, Imel, ZE, Atkins, DC and Narayanan, SS (2016) ‘It sounds like…’: a natural language processing approach to detecting counselor reflections in motivational interviewing. Journal of Counseling Psychology 63, 343350.CrossRefGoogle Scholar
Compton, MT, Craw, J and Rudisch, BE (2006) Determinants of inpatient psychiatric length of stay in an urban county hospital. The Psychiatric Quarterly 77, 173188.CrossRefGoogle Scholar
Dictionary.com, LLC (2019) Thesaurus.com. Long Beach, CA: Lexico Publishing Group.Google Scholar
Fok, ML, Stewart, R, Hayes, RD and Moran, P (2014) The impact of co-morbid personality disorder on use of psychiatric services and involuntary hospitalization in people with severe mental illness. Social Psychiatry and Psychiatric Epidemiology 49, 16311640.CrossRefGoogle Scholar
Grilo, CM, McGlashan, TH, Quinlan, DM, Walker, ML, Greenfeld, D and Edell, WS (1998) Frequency of personality disorders in Two Age cohorts of psychiatric inpatients. American Journal of Psychiatry 155, 140142.CrossRefGoogle Scholar
Grün, B and Hornik, K (2018). Topicmodels, v0.2-8. Available at https://cran.rproject.org/web/packages/topicmodels/index.html.Google Scholar
Habermeyer, B, De Gennaro, H, Frizi, RC, Roser, P and Stulz, N (2018) Factors associated with length of stay in a Swiss mental hospital. The Psychiatric Quarterly 89, 667674.CrossRefGoogle Scholar
Haslam, N, Holland, E and Kuppens, P (2012) Categories versus dimensions in personality and psychopathology: a quantitative review of taxometric research. Psychological Medicine 42, 903920.CrossRefGoogle Scholar
Heslin, KC, Elixhauser, A and Steiner, CA (2015) Hospitalizations Involving Mental and Substance Use Disorders Among Adults, 2012. HCUP Statistical Brief #191. Rockville, MD: Agency for Healthcare Research and Quality.
Huprich, SK (2018) Personality pathology in primary care: ongoing needs for detection and intervention. Journal of Clinical Psychology in Medical Settings 25, 43–54.
Jacobs, R, Gutacker, N, Mason, A, Goddard, M, Gravelle, H, Kendrick, T and Gilbody, S (2015) Determinants of hospital length of stay for people with serious mental illness in England and implications for payment systems: a regression analysis. BMC Health Services Research 15, 439.
Jiménez, RE, Lam, RM, Marot, M and Delgado, A (2004) Observed-predicted length of stay for an acute psychiatric department, as an indicator of inpatient care inefficiencies. Retrospective case-series study. BMC Health Services Research 4, 4.
Keown, P, Holloway, F and Kuipers, E (2005) The impact of severe mental illness, co-morbid personality disorders and demographic factors on psychiatric bed use. Social Psychiatry and Psychiatric Epidemiology 40, 42–49.
Kjelsås, E and Augestad, LB (2004) Gender, eating behavior, and personality characteristics in physically active students. Scandinavian Journal of Medicine & Science in Sports 14, 258–268.
Kotov, R, Krueger, RF, Watson, D, Achenbach, TM, Althoff, RR, Bagby, RM, Brown, TA, Carpenter, WT, Caspi, A, Clark, LA, Eaton, NR, Forbes, MK, Forbush, KT, Goldberg, D, Hasin, D, Hyman, SE, Ivanova, MY, Lynam, DR, Markon, K, Miller, JD, Moffitt, TE, Morey, LC, Mullins-Sweatt, SN, Ormel, J, Patrick, CJ, Regier, DA, Rescorla, L, Ruggero, CJ, Samuel, DB, Sellbom, M, Simms, LJ, Skodol, AE, Slade, T, South, SC, Tackett, JL, Waldman, ID, Waszczuk, MA, Widiger, TA, Wright, AGC and Zimmerman, M (2017) The Hierarchical Taxonomy of Psychopathology (HiTOP): a dimensional alternative to traditional nosologies. Journal of Abnormal Psychology 126, 454–477.
Krismayer, T, Schedl, M, Knees, P and Rabiser, R (2019) Predicting user demographics from music listening information. Multimedia Tools and Applications 78, 2897–2920.
Krueger, RF, Derringer, J, Markon, KE, Watson, D and Skodol, AE (2012) Initial construction of a maladaptive personality trait model and inventory for DSM-5. Psychological Medicine 42, 1879–1890.
Leontieva, L and Gregory, R (2013) Characteristics of patients with borderline personality disorder in a state psychiatric hospital. Journal of Personality Disorders 27, 222–232.
Lugo, V, de Oliveira, SES, Hessel, CR, Monteiro, RT, Pasche, NL, Pavan, G, Motta, LS, Pacheco, MA and Spanemberg, L (2019) Evaluation of DSM-5 and ICD-11 personality traits using the Personality Inventory for DSM-5 (PID-5) in a Brazilian sample of psychiatric inpatients. Personality and Mental Health 13, 24–39.
Lynn, R and Martin, T (1997) Gender differences in extraversion, neuroticism, and psychoticism in 37 nations. The Journal of Social Psychology 137, 369–373.
Manning, CD and Schütze, H (1999) Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
McCallum, AK (2002) ‘MALLET: A Machine Learning for Language Toolkit’. Available at http://mallet.cs.umass.edu.
McCoy, TH Jr, Castro, VM, Roberson, AM, Snapper, LA and Perlis, RH (2016) Improving prediction of suicide and accidental death after discharge from general hospitals with natural language processing. JAMA Psychiatry 73, 1064–1071.
McCoy, TH, Castro, VM, Snapper, LA, Hart, KH, Januzzi, JL, Huffman, JC and Perlis, RH (2017) Polygenic loading for major depression is associated with specific medical comorbidity. Translational Psychiatry 7, e1238.
McCoy, TH, Yu, S, Hart, KL, Castro, VM, Brown, HE, Rosenquist, JN, Doyle, AE, Vuijk, PJ, Cai, T and Perlis, RH (2018) High throughput phenotyping for dimensional psychopathology in electronic health records. Biological Psychiatry 83, 997–1004.
McLay, RN, Daylo, A and Hammer, PS (2005) Predictors of length of stay in a psychiatric ward serving active duty military and civilian patients. Military Medicine 170, 219–222.
Morey, LC, Shea, MT, Markowitz, JC, Stout, RL, Hopwood, CJ, Gunderson, JG, Grilo, CM, McGlashan, TH, Yen, S, Sanislow, CA and Skodol, AE (2010) State effects of major depression on the assessment of personality and personality disorder. The American Journal of Psychiatry 167, 528–535.
Murphy, SN, Weber, G, Mendis, M, Gainer, V, Chueh, HC, Churchill, S and Kohane, I (2010) Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). Journal of the American Medical Informatics Association: JAMIA 17, 124–130.
Newman, L, Harris, V, Evans, LJ and Beck, A (2018) Factors associated with length of stay in psychiatric inpatient services in London, UK. The Psychiatric Quarterly 89, 33–43.
Newton-Howes, G, Tyrer, P, Anagnostakis, K, Cooper, S, Bowden-Jones, O and Weaver, T, COSMIC study team (2010) The prevalence of personality disorder, its comorbidity with mental state disorders, and its clinical significance in community mental health teams. Social Psychiatry and Psychiatric Epidemiology 45, 453–460.
Pauselli, L, Verdolini, N, Bernardini, F, Compton, MT and Quartesan, R (2017) Predictors of length of stay in an inpatient psychiatric unit of a general hospital in Perugia, Italy. The Psychiatric Quarterly 88, 129–140.
Piccinelli, M, Bortolaso, P, Bolla, E and Cioffi, I (2016) Typologies of psychiatric admissions and length of inpatient stay in Italy. International Journal of Psychiatry in Clinical Practice 20, 116–120.
Quirk, SE, Berk, M, Chanen, AM, Koivumaa-Honkanen, H, Brennan-Olsen, SL, Pasco, JA and Williams, LJ (2016) Population prevalence of personality disorder and associations with physical health comorbidities and health care service utilization: a review. Personality Disorders: Theory, Research, and Treatment 7, 136–146.
Řehůřek, R and Sojka, P (2010) Software framework for topic modelling with large corpora. In Proceedings of LREC 2010 Workshop New Challenges for NLP Frameworks. p. 46–50, 5 pp. ISBN 2-9517408-6-7.
Sevilla-Llewellyn-Jones, J, Cano-Domínguez, P, de-Luis-Matilla, A, Peñuelas-Calvo, I, Espina-Eizaguirre, A, Moreno-Kustner, B and Ochoa, S (2017) Personality traits and psychotic symptoms in recent onset of psychosis patients. Comprehensive Psychiatry 74, 109–117.
Skodol, AE (2012) Personality disorders in DSM-5. Annual Review of Clinical Psychology 8, 317–344.
Skodol, AE (2018) Can personality disorders be redefined in personality trait terms? American Journal of Psychiatry 175, 590–592.
Soeteman, DI, Hakkaart-Van Roijen, L, Verheul, R and Van Busschbach, J (2008) The economic burden of personality disorders in mental health care. Journal of Clinical Psychiatry 69, 259–265.
Stevenson, J, Datyner, A, Boyce, P and Brodaty, H (2011) The effect of age on prevalence, type and diagnosis of personality disorder in psychiatric inpatients. International Journal of Geriatric Psychiatry 26, 981–987.
Twomey, CD, Baldwin, DS, Hopfe, M and Cieza, A (2015) A systematic review of the predictors of health service utilisation by adults with mental disorders in the UK. BMJ Open 5, e007575.
Tyrer, P and Simmonds, S (2003) Treatment models for those with severe mental illness and comorbid personality disorder. The British Journal of Psychiatry. Supplement 44, S15–S18.
Tyrer, P, Reed, GM and Crawford, MJ (2015) Classification, assessment, prevalence, and effect of personality disorder. The Lancet 385, 717–726.
von Gunten, A, Pocnet, C and Rossier, J (2009) The impact of personality characteristics on the clinical expression in neurodegenerative disorders – a review. Brain Research Bulletin 80, 179–191.
Widiger, TA (2011) Personality and psychopathology. World Psychiatry 10, 103–106.
Wright, AGC and Simms, LJ (2015) A metastructural model of mental disorders and pathological personality traits. Psychological Medicine 45, 2309–2319.
Yim, W-W, Yetisgen, M, Harris, WP and Kwan, SW (2016) Natural language processing in oncology: a review. JAMA Oncology 2, 797–804.
Yu, S, Kumamaru, KK, George, E, Dunne, RM, Bedayat, A, Neykov, M, Hunsaker, AR, Dill, KE, Cai, T and Rybicki, FJ (2014) Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing. Journal of Biomedical Informatics 52, 386–393.
Zimmerman, M (2016) Improving the recognition of borderline personality disorder in a bipolar world. Journal of Personality Disorders 30, 320–335.
Zimmerman, M and Morgan, TA (2013) The relationship between borderline personality disorder and bipolar disorder. Dialogues in Clinical Neuroscience 15, 155–169.
Zimmerman, M, Rothschild, L and Chelminski, I (2005) The prevalence of DSM-IV personality disorders in psychiatric outpatients. The American Journal of Psychiatry 162, 1911–1918.
Zimmerman, M, Chelminski, I and Young, D (2008) The frequency of personality disorders in psychiatric patients. Psychiatric Clinics of North America 31, 405–420.

Table 1. Personality trait domains in DSM-5 and ICD-11

Fig. 1. Diagram of personality phenotype generation through the transfer of a human expert language model into a model learned through unsupervised machine learning. A probabilistic topic model is learned from patients’ clinical documentation through LDA. The learned topics are then matched to the personality symptom domains by linking each domain to the learned topic under which its expert-identified tokens are most common. Thereafter, the linked topics are used as the phenotype for the linked personality domain.
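
The linking step described in Fig. 1 can be illustrated with a minimal sketch. It assumes a topic model has already been fitted and that its per-topic token probabilities are available as plain dictionaries; the function names, the token lists, and all numeric values below are hypothetical and are not taken from the study. "Most common" is read here as the largest summed token probability, which is one possible interpretation rather than the authors' stated criterion.

```python
from typing import Dict, List


def link_domains_to_topics(
    topic_word: List[Dict[str, float]],
    expert_terms: Dict[str, List[str]],
) -> Dict[str, int]:
    """Link each domain to the topic that gives its expert terms the most probability mass."""
    links: Dict[str, int] = {}
    for domain, terms in expert_terms.items():
        # Summed probability of the domain's expert terms under each learned topic.
        mass = [sum(topic.get(term, 0.0) for term in terms) for topic in topic_word]
        links[domain] = max(range(len(mass)), key=mass.__getitem__)
    return links


def domain_scores(doc_topic: List[float], links: Dict[str, int]) -> Dict[str, float]:
    """Use a note's loading on the linked topic as its score for that domain."""
    return {domain: doc_topic[k] for domain, k in links.items()}


# Toy values, made up for illustration only.
topic_word = [
    {"withdrawn": 0.30, "isolated": 0.25, "sleep": 0.05},  # topic 0
    {"impulsive": 0.35, "reckless": 0.20, "sleep": 0.05},  # topic 1
]
expert_terms = {
    "detachment": ["withdrawn", "isolated"],
    "disinhibition": ["impulsive", "reckless"],
}

links = link_domains_to_topics(topic_word, expert_terms)
scores = domain_scores([0.7, 0.3], links)  # one note's topic loadings
```

In this toy case each domain is linked to a different topic, and the note's score for a domain is simply its loading on the linked topic.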

Table 2. Cohort characteristics at admission

Table 3. Association between sociodemographic features and personality trait domains§

Table 4. Regression model of personality trait domains and hospital length of stay (n = 4687 admissions)