Disasters result in loss of life, destruction of property, and degradation of the environment. At times, the magnitude of damage is so severe that it is beyond the coping capacity of the affected community. Disasters place the affected population under tremendous pressure to cope with and adjust mentally to a devastating situation. The World Health Organization (WHO) states that people suffer from a host of mental health problems during disasters and that mental health is vital for the overall well-being and resilience of the affected communities after disasters.1 In the response and recovery phase that follows a disaster, mental health aspects are considered an integral component of the community’s resilience.Reference Satcher, Friel and Bell2, Reference Becker3 Understanding and researching the psychological consequences of the disaster involve logical and methodological challenges as disasters shatter the lives of people unexpectedly.Reference Goldmann and Galea4
In the past decade, the usage of social networking sites in general has increased manifold.Reference Kuss and Griffiths5 Social media has also played a crucial role during emergencies and natural disasters.Reference Gugan and Gnana6, Reference Imran, Castillo and Diaz7 Studies confirm that social media data have been used extensively in disaster management for preparedness,Reference Anson, Watson and Wadhwa8 situational awareness,Reference Verma, Vieweg and Corvey9–Reference Aupetit, Imran and Aupetit11 information dissemination,Reference Kaigo12–14 relief and response coordination,Reference Imran, Elbassuoni and Castillo15–Reference Alshareef and Grigoras18 fund-raising,Reference Okada, Ishida and Yamauchi19 damage assessment,Reference Ashktorab, Brown and Nandi20, Reference Cervone, Schnebele, Waters, Thakuriah, Tilahun and Zellner21 and relief and rescue activities.Reference Ashktorab, Brown and Nandi20, Reference O’Sullivan, Kuziemsky and Toal-Sullivan22 Apart from information about the disaster, individual updates and posts having emotional content are also available from social media platforms.Reference David, Ong and Legara23 It is also evident from the studies that the mental health of individuals and of the affected population as a group can be predicted and analyzed from social media updates posted by the affected people.Reference Wongkoblap, Vadillo and Curcin24 Therefore, in emergency situations, the emotional content available on social media platforms may aid in understanding the psychological impact and overall well-being of the population, as the data are real time and the updates are posted by the affected people. Our review aims to analyze the possibility, effectiveness, and procedures of using social media data to understand the emotional and psychological impact of an unforeseen disaster on the community. Specifically, this study tries to answer the following questions:
Is it feasible to use social media data to understand individual/population’s emotional characteristics during and after a natural disaster or a calamity?
What are all the ways in which social media data can be leveraged to understand and analyze the mental health consequences of a disaster?
METHODS
The systematic review was conducted to examine how social media can be used to understand the emotions, well-being, or mood of the affected community or individuals during and after a disaster. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for searching and assessing the published articles.Reference Liberati, Altman and Tetzlaff25
Search Strategy
The electronic literature databases PubMed, PsycINFO, and PsycARTICLES were searched in November 2018 for collecting the relevant articles published between 2009 and November 2018. The databases were searched using the identified keywords as provided in Appendix 1. A manual search of the reference lists in the full-text articles considered for the study was also done.
Inclusion and Exclusion Criteria
Two independent reviewers screened the articles by reading the title and abstract and filtered the studies based on the inclusion and exclusion criteria. Primary research papers published in peer-reviewed journals or presented at international conferences in the English language were included. Studies that analyze or identify mental health, well-being, mood, emotions, or sentiments at the individual level or of the overall population in a disaster scenario using social media data were included for review. Studies of all design types were included in the review.
Studies were excluded if they used social media to choose the respondents for their survey or to fill up a questionnaire regarding disasters. Studies that analyze the social networks used by a population during disasters, which did not focus on the emotions of the individuals or a community, were excluded.
After the first phase of screening, the full text of the screened articles was examined by 2 reviewers independently to decide whether to include or exclude each study in the review. Discrepancies between the 2 were resolved by a third reviewer through discussion. Methodological quality assessment of the included studies was done using the Mixed Methods Appraisal Tool (MMAT).Reference Pluye, Robert, Cargo and Bartlett26 MMAT is an efficient tool to evaluate the quality of studies with diverse designs.Reference Souto, Khanassov and Hong27 However, studies were not excluded based on the results of the MMAT assessment, as methodologically poor articles also provided a few insights for the review.
Data Extraction
A standard template for data extraction was designed using Microsoft Excel based on the review objectives and content of the included studies. Data were extracted on disaster setting and period, study objective, the platform used for data collection, data collection and pre-processing methods, analyzing techniques, tools used, and findings/outcome of the study. Apart from these details, bibliographic information (author and year) was also extracted.
RESULTS
The initial search fetched 3326 articles on PubMed, 1316 articles on PsycINFO, and 57 articles on PsycARTICLES. After screening the title and the abstract, 21 articles from PubMed, 18 articles from PsycINFO, and 3 articles from PsycARTICLES were identified. Thirteen articles were selected through a manual search of the reference lists in the identified articles. After removing the duplicates, 37 articles were selected for full-article review. Among these, based on our inclusion and exclusion criteria, 18 articles were included for the review. The results of searching and screening articles using the PRISMA flowchart are provided in Figure 1.
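The screening counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the number of duplicates removed and the number of articles excluded at full-text review are not stated explicitly and are derived here by subtraction, on the assumption that nothing else was removed at those steps.

```python
# Counts exactly as reported in the text.
hits = {"PubMed": 3326, "PsycINFO": 1316, "PsycARTICLES": 57}
screened_in = {"PubMed": 21, "PsycINFO": 18, "PsycARTICLES": 3}
manual_search = 13
full_text_reviewed = 37
included = 18

total_hits = sum(hits.values())                            # all database records
candidates = sum(screened_in.values()) + manual_search     # before de-duplication

# Derived values (assumption: only duplicates were dropped between the
# candidate pool and the full-text pool).
duplicates_removed = candidates - full_text_reviewed
excluded_at_full_text = full_text_reviewed - included

print(total_hits, candidates, duplicates_removed, excluded_at_full_text)
```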

FIGURE 1 Flowchart of Review Selection process and Results.
We extracted the data from the 18 articles identified for the review. Table 1 shows the summary of the included articles,Reference Khalid, Helander and Hood28–Reference Gruebner, Lowe and Sykora45 and Appendix 2 gives the data extracted from the included studies in an Excel sheet. Appendix 3 shows the list of studies that were excluded after the full text of the articles was completely read.
TABLE 1 Summary of Included Studies

Because the included studies are heterogeneous and have varied objectives and outcomes, we chose to present the results descriptively. Studies can be categorized based on various characteristics such as disaster location, language used in the data, social network platform, and emotions extracted. Appendix 4 summarizes the number of studies based on their characteristics. Among the studies, 9 are from the USA,Reference Schulz, Thanh, Paulheim and Schweizer30, Reference Glasgow, Fink and Boyd-Graber31, Reference Lu, Hu and Wang33, Reference Doré, Ort and Braverman35, Reference Jones, Wojcik and Sweeting38–Reference Neppalli, Caragea and Squicciarini40, Reference Gruebner, Lowe and Sykora42, Reference Gruebner, Lowe and Sykora45 2 each from FranceReference Gruebner, Sykora and Lowe37, Reference Lin, Margolin and Wen41 and Japan,Reference Vo and Collier29, Reference Su, Cacciatore and Liang43 and 1 each from Germany,Reference Gaspar, Pedro and Panagiotopoulos36 Mexico,Reference De Choudhury, Monroy-Hernández and Gloria32 South Korea,Reference Woo, Cho and Shim34 and the Netherlands.Reference Van Lent, Sungur and Kunneman44 Except for 1 study,Reference Khalid, Helander and Hood28 all the reviewed studies used data from Twitter, a widely used social media platform, for analysis. De Choudhury et al.Reference De Choudhury, Monroy-Hernández and Gloria32 extracted posts from Blog del Narco (BDN) along with tweets. Khalid et al.Reference Khalid, Helander and Hood28 collected personal narratives and stories from blogs for emotional analysis.
The largest group of included studiesReference Vo and Collier29, Reference Schulz, Thanh, Paulheim and Schweizer30, Reference Doré, Ort and Braverman35, Reference Gruebner, Sykora and Lowe37, Reference Jones, Wojcik and Sweeting38, Reference Lin, Margolin and Wen41, Reference Gruebner, Lowe and Sykora42, Reference Van Lent, Sungur and Kunneman44, Reference Gruebner, Lowe and Sykora45 examined negative emotions (sadness, anger, fear, confusion, disgust, shame) in the social media texts as an indicator of mental health problems. Three studiesReference Lu, Hu and Wang33, Reference Neppalli, Caragea and Squicciarini40, Reference Su, Cacciatore and Liang43 observed sentiments (positive, negative, and neutral) in the social media texts to analyze the psychological state of the population. Apart from extracting the emotions and sentiments from the texts, 3 articlesReference Khalid, Helander and Hood28, Reference Woo, Cho and Shim34, Reference Gaspar, Pedro and Panagiotopoulos36 analyzed the social media texts qualitatively based on their context.
Based on this review, a common framework of steps was formulated for examining the psychological components of texts obtained from social media. Figure 2 shows this framework.

FIGURE 2 Framework to extract sentiments and emotions from social media texts.
Data Collection Techniques
Articles that use updates and posts in social networking sites in their study were considered in this review. All but 1 of the included studies collected tweets from the micro-blogging site Twitter. The widespread use of Twitter may be because Twitter provides developers with streaming and search application programming interfaces (APIs) through which third parties can extract its data. This gave researchers easy access to public tweets for their research.Reference Zimmer and Proferes46 Also, only a small percentage of Twitter users apply privacy settings to hide their posts.Reference Meeder, Tam and Kelly47 In the included studies, data were widely collected from Twitter’s search APIReference Vo and Collier29–Reference Glasgow, Fink and Boyd-Graber31, Reference Lu, Hu and Wang33, Reference Lin, Margolin and Wen41 or streaming API.Reference Doré, Ort and Braverman35, Reference Neppalli, Caragea and Squicciarini40 Twitter’s firehose was used in 2 studiesReference De Choudhury, Monroy-Hernández and Gloria32, Reference Su, Cacciatore and Liang43 for collecting the tweets. Tools like SOCIALmetrics™,Reference Woo, Cho and Shim34 Radian 6,Reference Gaspar, Pedro and Panagiotopoulos36 Twitris,Reference Hampton and Shalin39 Twitter package for R,Reference Jones, Wojcik and Sweeting38 and TwiNLReference Van Lent, Sungur and Kunneman44 were also used to harvest the tweets, but these tools also connect to the Twitter API to extract data. In 3 studies, authors obtained tweets from the Harvard Centre for Geographical Analysis.Reference Gruebner, Sykora and Lowe37, Reference Gruebner, Lowe and Sykora42, Reference Gruebner, Lowe and Sykora45 Data were gathered based on keywords, hashtags, and location information.
Instead of using keywords to gather the tweets, 2 studies identified users and collected their posts on Twitter.Reference Jones, Wojcik and Sweeting38, Reference Lin, Margolin and Wen41 The data collection method in each study was decided based on the purpose of its research. For instance, if the objective of the research was to investigate emotions toward a particular event, then data were collected using keywords and hashtags. In the case of spatial analysis of emotions or sentiments, location-based data collection techniques were used. If the purpose was to study the psychology of people in a specific population, then data were collected from the users’ updates.
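The three collection strategies described above (keyword or hashtag, location, and user based) can be illustrated as filters over posts that have already been downloaded. This is a minimal sketch, not the method of any included study: the `Post` record, the function names, and the sample posts are hypothetical, and no call to the Twitter API is made.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Post:
    """Hypothetical record for one already-downloaded post."""
    user: str
    text: str
    coords: Optional[Tuple[float, float]] = None  # (latitude, longitude)


def by_keywords(posts, keywords):
    """Event-focused collection: keep posts mentioning any keyword or hashtag."""
    wanted = [k.lower() for k in keywords]
    return [p for p in posts if any(k in p.text.lower() for k in wanted)]


def by_bounding_box(posts, south, west, north, east):
    """Spatial collection: keep geotagged posts inside a bounding box."""
    return [
        p for p in posts
        if p.coords is not None
        and south <= p.coords[0] <= north
        and west <= p.coords[1] <= east
    ]


def by_users(posts, users):
    """Population-focused collection: keep posts from a fixed set of users."""
    wanted = set(users)
    return [p for p in posts if p.user in wanted]


# Made-up sample data for illustration only.
posts = [
    Post("a", "Flood water rising near the bridge #flood", (40.7, -74.0)),
    Post("b", "Great coffee this morning"),
    Post("c", "So scared, the storm is here", (29.9, -90.1)),
]

event_posts = by_keywords(posts, ["#flood", "storm"])
local_posts = by_bounding_box(posts, 40.0, -75.0, 41.0, -73.0)
cohort_posts = by_users(posts, ["b"])
```

Which filter applies depends on the research question, as the paragraph above notes; the filters can also be combined, for example keyword matching within a bounding box.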
Data Preprocessing
After collecting the tweets, some standard preprocessing steps need to be followed to extract the needed information. Data preprocessing cleans the data by removing unwanted tweets and prepares it for exploration. Preprocessing also involves removing the tweets that were not written in the language of interest. Retweets, duplicates, advertising tweets, automated tweets from applications, location check-in tweets, and tweets that contain links to external sources were eliminated before the analysis. This level of data filtering was done to eliminate tweets that do not have any emotional content. If the research was focusing on a particular location,Reference Vo and Collier29, Reference Glasgow, Fink and Boyd-Graber31, Reference De Choudhury, Monroy-Hernández and Gloria32, Reference Gaspar, Pedro and Panagiotopoulos36, Reference Lin, Margolin and Wen41 if the study’s aim was to examine mental health with respect to space,Reference Doré, Ort and Braverman35, Reference Van Lent, Sungur and Kunneman44, Reference Gruebner, Lowe and Sykora45 or if the study involved visualization of emotions or sentimentsReference Lu, Hu and Wang33, Reference Gruebner, Sykora and Lowe37, Reference Neppalli, Caragea and Squicciarini40, Reference Gruebner, Lowe and Sykora42 in a geographical area, then tweets without geographical location were excluded. Preprocessing also involves handling negations and word normalization.Reference Vo and Collier29 Negations were identified using negation words, and each negation was replaced with a NOT tag attached to the negated word.Reference Schulz, Thanh, Paulheim and Schweizer30 Word normalization means transforming the text into a uniform form so that words are represented consistently. Stemming was used in 1 of the studiesReference Schulz, Thanh, Paulheim and Schweizer30 for word normalization.
Translating tweets to the required language or removing tweets that were not in the language of interest is also part of preprocessing. In general, as text analysis tools and lexicons were available in English, tweets were either translated to English or, if not in English, removed.Reference Doré, Ort and Braverman35, Reference Jones, Wojcik and Sweeting38, Reference Lin, Margolin and Wen41–Reference Su, Cacciatore and Liang43, Reference Gruebner, Lowe and Sykora45 In the included studies, GermanReference Gaspar, Pedro and Panagiotopoulos36 and FrenchReference Gruebner, Sykora and Lowe37 tweets were translated to English.
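Several of the preprocessing steps described above (dropping retweets, tweets with links, and duplicates, then tagging negations) can be sketched in a few lines. The negation word list, the `NOT_` prefix format, the one-word negation scope, and the `RT @` retweet convention are assumptions made for illustration; the included studies do not specify these details here. Language filtering, translation, and stemming are left out.

```python
import re

URL_RE = re.compile(r"https?://\S+")
NEGATIONS = {"not", "no", "never", "cannot"}  # illustrative list, an assumption


def is_retweet(text):
    """Assumes the common 'RT @user' prefix marks a retweet."""
    return text.startswith("RT @")


def tag_negations(text):
    """Replace 'negation word + next word' with a single NOT_-tagged token.

    Assumption: the negation applies only to the immediately following token.
    A negation word at the very end of the text is dropped.
    """
    out = []
    negate = False
    for tok in text.lower().split():
        if tok in NEGATIONS:
            negate = True
            continue
        out.append("NOT_" + tok if negate else tok)
        negate = False
    return " ".join(out)


def preprocess(tweets):
    """Drop retweets, tweets containing links, and duplicates; tag negations."""
    seen = set()
    cleaned = []
    for text in tweets:
        if is_retweet(text) or URL_RE.search(text):
            continue
        normalized = tag_negations(text)
        if normalized in seen:
            continue
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned


raw = [
    "RT @x: help",
    "I am not safe",
    "I am not safe",
    "see http://example.com now",
    "We are fine",
]
cleaned = preprocess(raw)
```

Tagging the negated word keeps "safe" and "NOT_safe" as distinct features, so a later classifier or lexicon lookup does not treat them as the same signal.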
Feature Extraction and Data Analysis
After preprocessing, the collected data were turned into information and the requisite results through feature extraction and data analysis. Features include emoticons, word unigrams, parts of speech, sentiments, and character trigrams and 4-grams.Reference Vo and Collier29, Reference Schulz, Thanh, Paulheim and Schweizer30 These features were extracted before applying machine learning techniques to the data for their classification. Some studies used established systems and tools for analysis. In that case, feature extraction was done by an inbuilt module in the system itself. Natural language processing (NLP) methods were used for feature extraction in SOCIALmetrics™Reference Woo, Cho and Shim34 and EMOTIVE systems.Reference Gruebner, Sykora and Lowe37, Reference Gruebner, Lowe and Sykora42, Reference Gruebner, Lowe and Sykora45
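Some of the surface features listed above (word unigrams, character trigrams and 4-grams, and emoticons) can be extracted with the standard library alone. The sketch below is illustrative: the feature-name prefixes and the small emoticon pattern are assumptions, and part-of-speech and sentiment features are omitted because they would need an external tagger or lexicon.

```python
import re
from collections import Counter

# Small illustrative emoticon pattern such as :) ;-( =D ; not exhaustive.
EMOTICON_RE = re.compile(r"[:;=][-']?[()DPp/\\|]")


def word_unigrams(text):
    """Lowercased alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def char_ngrams(text, n):
    """All overlapping character n-grams of the lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def extract_features(text):
    """Bag of prefixed features: w= words, c3=/c4= character n-grams, emo= emoticons."""
    feats = Counter()
    feats.update("w=" + w for w in word_unigrams(text))
    feats.update("c3=" + g for g in char_ngrams(text, 3))
    feats.update("c4=" + g for g in char_ngrams(text, 4))
    feats.update("emo=" + e for e in EMOTICON_RE.findall(text))
    return feats


features = extract_features("So sad :(")
```

The resulting counts can be fed to a classifier as a sparse feature vector; character n-grams are useful for informal text because they tolerate misspellings and elongated words.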
Extraction of emotions from the tweets was carried out using classification approaches (machine learning algorithms, lexicon-based methods, or tools built on these methods). Except for 2 studies, researchers used previously developed and evaluated systems or tools for extraction of context or theme and classification of emotions. In 2 studies,Reference Vo and Collier29, Reference Schulz, Thanh, Paulheim and Schweizer30 a dataset annotated with emotions by human annotators was created for training the classifier. Machine learning algorithms were used for classification, and the performance of the models was evaluated using the human-annotated dataset. Both these articles concluded that the multinomial naive Bayes (MNB) model performs well for classification of emotions from texts. Many studies used Linguistic Inquiry and Word Count (LIWC) for the detection of emotions in the Twitter posts.Reference Glasgow, Fink and Boyd-Graber31, Reference De Choudhury, Monroy-Hernández and Gloria32, Reference Doré, Ort and Braverman35, Reference Jones, Wojcik and Sweeting38, Reference Lin, Margolin and Wen41 LIWC is a validated computer program, developed to categorize text into psychologically meaningful categories.Reference Tausczik and Pennebaker48 The LIWC tool extracts the emotional features from tweets and is also user-friendly because it does not require any programming skills.
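To make the multinomial naive Bayes approach concrete, the following is a minimal from-scratch sketch with add-one (Laplace) smoothing, trained on a handful of made-up labeled examples. It is not the implementation used in the cited studies; the class name, the smoothing value, and the toy training data are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial naive Bayes over token lists (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(n / n_docs)  # log prior
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy human-annotated examples (hypothetical).
train = [
    ("i am so scared".split(), "fear"),
    ("scared of the storm".split(), "fear"),
    ("so sad about the loss".split(), "sadness"),
    ("we lost everything so sad".split(), "sadness"),
]
model = MultinomialNB().fit([d for d, _ in train], [y for _, y in train])

pred_fear = model.predict("scared storm".split())
pred_sad = model.predict("sad loss".split())
```

In practice the model would be trained on thousands of annotated tweets and evaluated on a held-out annotated set, as the two studies describe.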
The EMOTIVE system was used in studies to identify emotions in the tweets.Reference Gruebner, Sykora and Lowe37, Reference Gruebner, Lowe and Sykora42, Reference Gruebner, Lowe and Sykora45 EMOTIVE was developed for the purpose of emotional analysis of informal text messages; it uses NLP pipelines for processing data and adopts an ontology approach to extract emotions.Reference Sykora, Jackson and O’Brien49 Anger, unpleasantness, anxiety, sadness, fear, happiness, inhibition, and calm are the basic emotions examined in the studies. There was little difference in the list of emotions studied based on the language of the tweets considered. Negative emotions such as anger, anxiety, sadness, and fear were correlated with psychological problems. To quantify the negative affect of the community exposed to long-term violence, negative emotional words (sad, anger, anxiety, and inhibition) identified by LIWC were considered.Reference De Choudhury, Monroy-Hernández and Gloria32 The emotional categories of anxiety, sadness, and anger were considered as a distress response of the community to a disaster.Reference Lin, Margolin and Wen41 SOCIALmetrics™ was applied in one of the studies for processing and text analysis of Korean tweets.Reference Woo, Cho and Shim34 SOCIALmetrics™ is a social media analysis tool that uses NLP and text mining techniques for processing and extracting information from social media texts. In this article, keywords related to suicide and depression were identified by extracting negative emotional words such as anger, anxiety, sad, hurt, suffering, and shock in the tweets, which were then analyzed by using SOCIALmetrics™.
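The lexicon-based idea behind such tools can be sketched as counting how many words in a text fall into each emotion category and expressing the result as a percentage of all words. The tiny lexicon below is invented for illustration and is not the LIWC dictionary or the EMOTIVE ontology; the function names are likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; real dictionaries are far larger and validated.
TOY_LEXICON = {
    "anger": {"angry", "furious", "hate"},
    "anxiety": {"scared", "afraid", "worried", "nervous"},
    "sadness": {"sad", "crying", "grief", "lost"},
}


def emotion_rates(text):
    """Percentage of tokens in the text that match each emotion category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for category, words in TOY_LEXICON.items():
            if tok in words:
                counts[category] += 1
    n = len(tokens) or 1  # avoid division by zero on empty text
    return {category: 100.0 * counts[category] / n for category in TOY_LEXICON}


def negative_affect(text):
    """Simple aggregate: sum of the negative-emotion category rates."""
    return sum(emotion_rates(text).values())


rates = emotion_rates("I am scared and sad")
score = negative_affect("I am scared and sad")
```

Averaging such per-post scores by day or by area gives the kind of community-level negative affect or distress measure that the studies above track over time and space.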
Apart from examining emotions, 2 studies focused on sentiments (positive, negative, and neutral) in the social media data. In 1 study, the sentiments of the posts were studied using tools such as CoreNLP, SentiStrength, and SentiWordNet, and their uncertainty was measured by entropy.Reference Lu, Hu and Wang33 In another study, SentiStrength was used initially, and then machine learning algorithms were used for classifying the sentiments. The models were validated using a human-annotated dataset. Finally, the support vector machine (SVM) was reported as the best-performing classifier of sentiments.Reference Neppalli, Caragea and Squicciarini40 In both studies, sentiments were mapped in the geographical area, and their variations with respect to space and time were studied.
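One way to read "uncertainty measured by entropy" is as the Shannon entropy of a set of sentiment labels: zero when all labels agree and highest when they are evenly split. The sketch below assumes this reading; how the cited study actually defined and applied entropy is not described here, and the example labels are made up.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical labels assigned to one post by three different tools.
agree = shannon_entropy(["negative", "negative", "negative"])
split = shannon_entropy(["positive", "negative"])
three_way = shannon_entropy(["positive", "negative", "neutral"])
```

With three possible labels the maximum is log2(3), about 1.58 bits, so values near that bound flag posts or areas where the sentiment signal is ambiguous.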
Tweets containing death-related talk were automatically classified by improving the DUALIST framework, which uses an MNB model. In 1 of the articles,Reference Glasgow, Fink and Boyd-Graber31 SVM was used to correct the false confidence of a naive Bayes model, and this model performed better than LIWC and the DUALIST MNB model in classifying the death-related tweets.Reference Glasgow, Fink and Boyd-Graber31
Other than extracting and classifying the emotions, the context in which the emotions and sentiments were expressed was also examined qualitatively in 3 articles.Reference Khalid, Helander and Hood28, Reference Gaspar, Pedro and Panagiotopoulos36, Reference Su, Cacciatore and Liang43 In 1 study,Reference Khalid, Helander and Hood28 the authors used Leximancer, a text mining tool to extract themes and concepts from the narratives in the blogs after disasters. The concepts and their emotional relationship were analyzed with the help of semantic maps and ontologies. In the other 2 studies,Reference Gaspar, Pedro and Panagiotopoulos36, Reference Su, Cacciatore and Liang43 a small number of tweets were coded manually by the coders into several categories and themes, and then these coded tweets were used to extract concepts from a huge volume of posts. Finally, the results were presented qualitatively. In this way, the context in which the emotions are stated can be identified.
Hampton and Shalin,Reference Hampton and Shalin39 in their study, examined the use of adjectives and their antonyms by the population during disasters. An antonym-pair corpus was created based on tweets posted during disasters. By comparing lexical choice in disaster situations with normative use, they showed that it was possible to recognize patterns of disruption among the population, thereby identifying those in need in disaster scenarios.
After the extraction of emotions, statistical analyses were carried out: clustering of emotions,Reference Gruebner, Sykora and Lowe37, Reference Gruebner, Lowe and Sykora42, Reference Gruebner, Lowe and Sykora45 regressionReference Jones, Wojcik and Sweeting38 to ascertain the change in emotions with time, and multivariate regression to examine proximity, gender, interpersonal communication, media exposure, and their association with distress intensity changes.Reference Lin, Margolin and Wen41 Associations between pre- and peri-disaster discomfort rates and postdisaster discomfort rates were examined by spatial regression techniques.Reference Gruebner, Lowe and Sykora45 The mapping of sentiments and emotions was done to examine the spatial distribution of mental health of the population and to understand how emotions vary based on the distance from the disaster.
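As a simple illustration of using regression to track change in emotion over time, the sketch below fits an ordinary least squares line to a daily negative-emotion rate. It is a deliberately reduced stand-in: the included studies used richer models (multivariate and spatial regression) whose specifications are not given here, and the daily values below are hypothetical.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: days since the disaster and the share of tweets (%)
# expressing negative emotion on each day.
days = [0, 1, 2, 3, 4]
negative_rate = [10.0, 8.0, 6.0, 4.0, 2.0]

slope, intercept = ols_slope_intercept(days, negative_rate)
```

A negative slope indicates that expressed negative emotion declines as time since the event increases; the same fit applied to distance from the disaster instead of days would describe spatial decay.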
DISCUSSION
The language use and emotions expressed in social media posts are predictors of various mental health problems.Reference D’Andrea, Chiu and Casas52–Reference Mowery, Smith and Cheney55 There are several existing methodologies that can be used for ascertaining mental health from social media data.Reference Wongkoblap, Vadillo and Curcin24 But these methods or techniques are minimally used in a disaster context. Applying these techniques and broadening their use in a disaster context will help those responding to the disaster (e.g., the government, social service organizations, disaster management bodies) understand the population’s mental health.
Facebook is one of the most widely used social networking sites globally, and in a recent study, it was rated as the appropriate public social networking site for negative emotional expression.Reference Waterloo, Baumgartner and Peter56 But Facebook was scarcely used in disaster mental health studies, to the best of our knowledge. The reason may be that users often restrict public access to their Facebook updates and the data can be collected only with the users’ consent. Nearly all the included studies used Twitter posts as the source of information for research, as Twitter itself provides streaming and REST APIs for developers to collect tweets for research.
The classification of emotions and sentiments from the texts is done either by machine learning models or by using lexicons. From the included studies, it is not possible to identify a single best-performing technique or lexicon, because performance depends on the situation and on factors such as the size and volume of the social media texts, lexicon size, and the Web environment.
The traditional method of studying mental health during a disaster was by carrying out surveys and interviews of the affected population.Reference Walker-Springett, Butler and Adger57–Reference Cerdá, Paczkowski and Galea59 Surveys were always carried out after the disasters, and it was not possible to ascertain the mood of the population before and during the emergency situation. This limitation can be overcome by using social media data for mental health research, as posts or updates on social networking sites are made in real time by the affected people at the time of the disaster. Furthermore, social media updates from the population before the disaster are also accessible.Reference Gruebner, Lowe and Sykora42 The emotional state of the affected population before, during, and after a disaster can therefore be analyzed over time from social media data. Due to financial and time restrictions, traditional surveys are limited in scope in a disaster situation. In such cases, social media data will help researchers, decision makers, and disaster response teams to get insights into the psychological characteristics of the affected population.
All the included studies analyzed only the text updates posted by the public. Research confirms that images or photos posted by people can be used to identify and predict their emotional state and mood.Reference Reece and Danforth60–Reference Lin, Jia and Guo62 In the future, text and image analysis can be done together to study and understand the mental health of a population during a disaster more efficiently. There is evidence to show that online social network analysis plays a significant role in predicting and identifying the mental state of people in social networks.Reference Michael, Silenzio and States63–Reference Wang, Zhang and Sun65 Hence, combining online social network analysis with posts from social media would aid in understanding the emotional impact of the disaster on the population during and after disasters.
CONCLUSION
As disaster strikes the population unexpectedly, information from traditional sources may not be available and it is advisable to get information from multiple sources. Social media is one of the sources for data gathering during a disaster. The reviewed studies indicate that social media data provide valuable information about the emotions of the population during and after disasters and augment the traditional methods of information gathering at the time of a disaster. The collected data also aid the public health professionals in the response team in decision making. Further research that incorporates image and social network analysis along with social media texts would provide more reliable results. These research improvements should be incorporated into practice during disasters to provide psychological support to people as a part of disaster response for building resilience of the affected population.
Limitations
Based on the included studies, we summarized some of the limitations of using social media data for psychological analysis in a disaster scenario. The first limitation is that social media data do not represent the entire population, because social network users are mostly younger adults aged 18 to 29 years and the socioeconomically privileged.Reference Duggan50 Therefore, the results obtained will not be representative of the community. Second, the Twitter API service allows users to collect only a sample of the tweets,Reference Morstatter, Pfeffer and Liu51 so the data are not representative of the entire Twitter activity. Third, there may be no connection between the disaster location and the person who tweets; there is no confirmation that all the tweets collected originate from the location under study. Fourth, during preprocessing, generally only tweets in a particular language were considered and tweets with links were eliminated. This may lead to the loss of some important information contained in the removed tweets.
Conflict of Interest
The authors declare that there is no conflict of interest.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/dmp.2019.40