Introduction
Social anxiety disorder (SAD) is a mental disorder characterized by anxiety regarding social situations, avoidance of social stimuli, and negative self-beliefs (Clark & Wells, 1995; Rapee & Heimberg, 1997). The neural basis of these characteristic manifestations has been explored using neuroimaging studies, which have suggested hyperactivation of the fear circuit (Bruhl, Delsignore, Komossa, & Weidt, 2014; Freitas-Ferrari et al., 2010). For example, typical social cues such as other people's faces or speech anticipation induce excessive activation of limbic emotion-related regions, including the amygdala (Davies et al., 2017; Gentili et al., 2008; Kim et al., 2018; Kraus et al., 2018; Phan, Fitzgerald, Nathan, & Tancer, 2006; Stein, Goldin, Sareen, Zorrilla, & Brown, 2002) and insula (Boehme et al., 2014; Choi, Shin, Ku, & Kim, 2016; Gentili et al., 2008; Kim et al., 2018; Straube, Kolassa, Glauer, Mentzel, & Miltner, 2004), and decreased activity of visual cortices related to processing of facial stimuli, including the fusiform gyrus and intraparietal sulcus (Binelli et al., 2016; Gentili et al., 2008). Because patients are highly conscious of other people's eyes, activation of regions related to theory of mind (ToM), including the superior temporal sulcus, middle temporal gyrus, and temporoparietal junction, is another characteristic (Brühl et al., 2011; Choi et al., 2016; Gentili et al., 2008; Kim et al., 2018). Negative self-beliefs in patients with SAD have been linked to abnormal engagement of regions associated with self-related processing such as the medial prefrontal cortex (Blair et al., 2008, 2011; Goldin & Gross, 2010; Yoon et al., 2016) and frontoparietal attentional network, including the anterior cingulate cortex (ACC) and inferior parietal cortex (Becker, Simon, Miltner, & Straube, 2017; Goldin & Gross, 2010; Kim, Yoon, Shin, Lee, & Kim, 2016), as well as areas of emotional processing including the amygdala (Blair et al., 2011; Brühl et al., 2011).
Resting-state functional connectivity studies have also demonstrated impairments of the networks associated with self-related processing and ToM (Choi et al., 2016; Cui et al., 2017; Yun et al., 2017).
Symptoms of SAD can be improved through appropriate pharmacological treatment or psychological interventions such as cognitive-behavioral therapy (CBT) (Mayo-Wilson et al., 2014). Common CBT techniques for SAD are exposure and cognitive restructuring (Heimberg, 2002). Repetitive exposure to feared conditions may induce habituation, extinction, or new learning (Heimberg, 2002; Tryon, 2005). The mechanism of this cognitive restructuring has been explored using neuroimaging studies. The most consistent findings include decreased activation (Burklund, Torre, Lieberman, Taylor, & Craske, 2017; Goldin & Gross, 2010; Klumpp et al., 2017; Månsson et al., 2013) and changed functional connectivity (Whitfield-Gabrieli et al., 2016; Young et al., 2017; Yuan et al., 2016) of the amygdala.
Other psychological intervention-related findings include modulated insula activation (Duval, Joshi, Russman Block, Abelson, & Liberzon, 2018), changed ACC activity (Burklund et al., 2017; Klumpp et al., 2016, 2017), and altered connectivity of the default-mode or cerebellum-prefrontal network (Yuan et al., 2017, 2018).
Despite these proven effects, deciding to participate in social situations and start long-term therapy in a clinical setting is still challenging for patients with SAD (Olfson et al., 2000). Training alone at home can provide a chance for therapeutic gains without visits to a formal setting, and mobile-based virtual reality (VR) makes this self-training technically possible (Kim et al., 2017). The VR technique exposes individuals repeatedly and stepwise to virtual social situations in order to reduce the fear response and avoidance reaction. Additionally, this technique can give individuals immediate objective feedback on their presentations. Although VR exposure therapy in a clinical setting is effective for treating social fears (Anderson et al., 2013; Bouchard et al., 2017), virtual reality self-training (VRS) at home may be a good interim modality to reduce social anxiety, increase hope for treatment, and encourage patients to seek further formal treatment. A reduction in subjective anxiety after a 2-week application of VRS has already been verified (Kim et al., 2017). Despite this effectiveness, the underlying brain mechanism has not yet been identified.
The current study addresses this mechanism using functional magnetic resonance imaging (fMRI) with two different tasks, the distress task and the speech evaluation task. These were designed to reflect characteristics of SAD, such as anxiety regarding external social stimuli and negative self-beliefs. The distress task measured attitudes toward social cues by asking whether participants felt distress while viewing a set of facial images of people as a simulated audience. Previous studies have frequently focused on negative stimuli, such as angry faces, to evoke social anxiety (Phan et al., 2006; Stein et al., 2002). In ordinary social situations, however, an audience does not usually express anger merely because the speaker is clumsy, and is more likely to maintain a neutral expression or even smile. To simulate this more ecologically valid environment, the facial expressions in our task were neutral or happy. Nevertheless, we expected patients with SAD to feel distress with these stimuli because they tend to misperceive neutral others as having angry attitudes toward them (Roth & Heimberg, 2001). Hyperactivity of the threat-detection system in patients with SAD, such as amygdala activation in response to neutral faces (Cooney, Atlas, Joormann, Eugène, & Gotlib, 2006), further justifies our task.
The speech evaluation task had participants evaluate their own speech in conjunction with negative adjectives. Previous fMRI studies of negative beliefs in patients with SAD have mainly used tasks related to autobiographical memory to report abnormal responses of reappraisal-related brain regions including the amygdala (Goldin, Manber-Ball, Werner, Heimberg, & Gross, 2009) or increases in attention-related parietal cortex responses by mindfulness-based stress reduction (Goldin, Ziv, Jazaieri, Hahn, & Gross, 2013). In contrast, we chose to have participants evaluate their own speech in order to address negative beliefs. The adoption of speech evaluation was based on previous studies showing that self-beliefs in self-performance ratings of a speech were negatively biased in SAD (Cody & Teachman, 2010; Koban et al., 2017). Although VRS does not involve an intervention in cognitive reappraisal, we expected that it could improve patients' negative self-beliefs, which have been considered to be one of the main treatment targets (Hofmann, Moscovitch, Kim, & Taylor, 2004).
This study aimed to neurobiologically verify the possibility of VRS as a tool to improve symptoms using fMRI with the distress and speech evaluation tasks in patients with SAD with and without VRS. In terms of distress, we hypothesized that abnormal limbic and ToM-related activity while processing emotional information of faces would be restored with decreased social anxiety after VRS, whereas activity of the visual cortices would increase with improving attention to the faces after VRS. In terms of speech evaluation, we expected that a decrease in negative evaluation after VRS would change activity in the regions related to negative self-beliefs, such as the medial prefrontal cortex, ACC, inferior parietal lobule, and amygdala.
Method
Participants
Among 115 volunteers who emailed the application form to participate in this study after seeing an internet advertisement, a total of 61 participants (19–30 years old) who were evaluated as having high social anxiety through their responses to the screening questionnaires were interviewed by a psychiatrist (K.M.K.). The inclusion criteria were a DSM-5 diagnosis of SAD (American Psychiatric Association, 2013) and more than 30 points on the total score of the Liebowitz social anxiety scale-self report (LSAS) (Fresco et al., 2001). The exclusion criteria were (1) lifetime diagnosis of a major psychiatric disorder including psychotic disorder, bipolar disorder, substance use disorder, or organic mental disorder, (2) current use or history of any psychiatric treatment including psychotropic medication or CBT, (3) lifetime diagnosis of a neurological disorder or having medical conditions preventing MRI, (4) pregnancy, and (5) left-handedness (Annett, 1970). Nine volunteers were excluded because they did not meet the criteria, and the remaining 52 volunteers participated in the study.
Using randomization stratified by sex and severity of social anxiety, 24 participants were assigned to the VRS group and 28 to the waiting list (WL) group. The VRS group received eight sessions of VRS, whereas the WL group received no training. The levels of anxiety and depression in these participants were evaluated using the Hospital Anxiety and Depression Scale (HADS) (Zigmond & Snaith, 1983). In the VRS group, three participants dropped out during self-training, and the remaining 21 underwent MRI scans. In the WL group, eight participants dropped out during the waiting time, and the remaining 20 were scanned. There were no statistical differences in age, sex, intelligence quotient, LSAS scores, or anxiety and depression scores of the HADS between the two groups (online Supplementary Table 1). Neither group received psychopharmacological medication. The Institutional Review Board at Gangnam Severance Hospital, Yonsei University, approved the study procedure. Written informed consent was obtained from all participants.
Interventions
The VRS content included three environments: school life, business life, and daily life. Each environment consisted of four situations, which were set at four different levels of difficulty by increasing the number of virtual persons appearing (see the sample videos: https://youtu.be/LxfSPaSJSTE). Since each situation included three topics, there were three environments, 12 situations, and 36 topics. These environments were displayed via the head-mounted display (HMD), which consisted of a Samsung Galaxy S6 mounted in a Samsung Gear VR powered by Oculus, and participants operated the environments themselves by clicking the built-in buttons on the side of the HMD. They trained alone by repeatedly performing speeches following the narration provided in the content. The participants' eye movement, speaking time, and heart rate were automatically measured as program-embedded variables for immediate feedback, and participants also self-evaluated their speech. Using the scores of each variable, the program recommended whether to repeat a speech or move on to the next one. Further description of VRS is detailed in a previous paper (Kim et al., 2017).
The training was developed for individuals to carry out by themselves, but participants made visits to the VR clinic to ensure it was performed correctly. They visited the clinic eight times over 2 weeks. Participants practiced one of the situations regardless of the type of environment in the first session, and practiced two situations in each of the remaining seven sessions: a new situation and the situation practiced at the previous session. Accordingly, participants had to complete a total of eight situations and 24 topics by the end of training. After each session, participants completed the simulator sickness questionnaire (SSQ) to assess the degree of cybersickness (Kennedy, Lane, Berbaum, & Lilienthal, 1993).
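The counts given above (12 situations, 36 topics, and eight situations with 24 topics completed over eight sessions) follow directly from the content structure and the session rule. The short Python sketch below reproduces that arithmetic. The situation indices are hypothetical placeholders, since the text does not state which situations each participant chose.

```python
from itertools import product

ENVIRONMENTS = ["school life", "business life", "daily life"]
DIFFICULTY_LEVELS = [1, 2, 3, 4]   # more virtual persons at higher levels
TOPICS_PER_SITUATION = 3

# Every (environment, difficulty) pair is one situation.
situations = list(product(ENVIRONMENTS, DIFFICULTY_LEVELS))
total_topics = len(situations) * TOPICS_PER_SITUATION


def session_plan(n_sessions=8):
    """Hypothetical plan: session 1 has one situation; each later session
    repeats the previous session's new situation and adds one new one."""
    plan = []
    for s in range(n_sessions):
        plan.append([s] if s == 0 else [s - 1, s])
    return plan


plan = session_plan()
completed_situations = {idx for session in plan for idx in session}
completed_topics = len(completed_situations) * TOPICS_PER_SITUATION

print(len(situations), total_topics)                  # 12 36
print(len(completed_situations), completed_topics)    # 8 24
```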
Procedures
Both groups completed the initial assessment, including the LSAS and HADS, the speech test, and fMRI scanning, and the follow-up assessment consisting of the same tests after VRS or waiting time (about 3 weeks after the initial assessment). Before fMRI scanning, participants performed the speech test, in which they made a presentation on a specific topic (future plans in the initial assessment and a vacation experience in the follow-up assessment) in front of two audience members, and watched a recorded video of another person's presentation on the same topic.
Experimental tasks
Participants performed two different behavioral tasks (Fig. 1) in two fMRI scanning sessions. The first was the distress task, which used a block design and asked participants to imagine making a social speech. The task included two types of experimental block and a rest block. During the 22-s experimental blocks, the word ‘self’ or ‘other’ was given for 2 s, then five image sets of eight different faces were displayed on the screen for 4 s each. The position of each face differed between image sets, and in every set the sex ratio was 1:1 and the expression ratio of happiness v. neutrality was 1:3. The facial images were selected from Korean Facial Expressions of Emotion (Park et al., 2011). In the block that started with ‘self’, participants were requested to concentrate on their internal state and respond whether they had distress due to internal physical or psychological reactions. In the block that began with ‘other’, participants were instructed to concentrate on their external conditions and respond whether they felt uncomfortable from external threats such as facial expressions in the pictures. These two conditions were named ‘internal block’ and ‘external block’, respectively. Participants responded with ‘yes’ or ‘no’ buttons whenever the set of faces changed, and the responses were automatically saved. Each experimental block was repeated 10 times, and the order was pseudo-randomized. The 18-s rest block was always located between the experimental blocks. In this block, the word ‘rest’ was displayed for 2 s, then mosaic images of eight faces lasting for 4 s were displayed four times.
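As a consistency check on the block timing and stimulus composition described for the distress task, the following Python sketch recomputes the block durations, the face composition of each image set, and the number of responses per condition. All constant names are illustrative and are not taken from the task software.

```python
CUE_SECONDS = 2          # duration of the 'self' / 'other' / 'rest' cue word
IMAGE_SET_SECONDS = 4    # duration of each set of eight faces


def block_duration(n_image_sets):
    """Cue word followed by n image sets."""
    return CUE_SECONDS + n_image_sets * IMAGE_SET_SECONDS


experimental_block_s = block_duration(5)   # internal or external block
rest_block_s = block_duration(4)           # mosaic faces

FACES_PER_SET = 8
male_faces = FACES_PER_SET // 2                 # sex ratio 1:1
happy_faces = FACES_PER_SET * 1 // (1 + 3)      # happiness v. neutrality 1:3
neutral_faces = FACES_PER_SET - happy_faces

REPETITIONS_PER_CONDITION = 10
# One yes/no response is requested each time the set of faces changes.
responses_per_condition = REPETITIONS_PER_CONDITION * 5

print(experimental_block_s, rest_block_s)        # 22 18
print(male_faces, happy_faces, neutral_faces)    # 4 2 6
print(responses_per_condition)                   # 50
```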
The second task was the speech evaluation task, which used an event-related design and had participants evaluate their own presentation and another person's presentation from the speech test performed before fMRI scanning. The task included three conditions: self, other, and mosaic. In the self and other conditions, an adjective associated with poor speech skills was presented under their own face or the other speaker's face. Participants were instructed to answer ‘yes’ or ‘no’ according to whether the person's presentation fit well with the adjective. In the mosaic condition, an animal's name was presented under a mosaic screen. Participants were asked to respond whether or not the animal name had three letters. Each condition included 32 trials, and different adjectives or animal names were used in each trial. A total of 96 trials were presented pseudo-randomly. Each trial lasted for 3 s with an average 5-s jittered inter-stimulus interval.
Behavioral outcome analysis for the distress task was done through the ‘distress index’, which was defined as the number of ‘yes’ responses divided by the total response number in the experimental blocks. A higher index indicated greater distress. Behavioral outcome analysis for the speech evaluation task was performed through the ‘negative evaluation index’, which was defined as the number of ‘yes’ responses divided by the total response number in each of the self and other conditions. A high index meant that the participant gave a negative assessment of the speech.
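Both behavioral indices are simple proportions of ‘yes’ responses. A minimal Python sketch of that calculation is given below. It assumes that missed trials are recorded as `None` and are left out of the denominator, which is one reading of "total response number"; the function name is hypothetical.

```python
def yes_proportion(responses):
    """Number of 'yes' responses divided by the number of recorded responses.

    `responses` holds 'yes', 'no', or None (assumed marker for a missed trial).
    Returns None when no response was recorded at all.
    """
    recorded = [r for r in responses if r in ("yes", "no")]
    if not recorded:
        return None
    return sum(r == "yes" for r in recorded) / len(recorded)


# Made-up responses for illustration only.
distress_index = yes_proportion(["yes", "no", "no", "yes", None])
negative_evaluation_self = yes_proportion(["yes", "yes", "no", "yes"])

print(distress_index)             # 0.5
print(negative_evaluation_self)   # 0.75
```

The distress index pools responses from the internal and external blocks, whereas the negative evaluation index is computed separately for the self and other conditions.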
Image acquisition and preprocessing
MRI scanning was performed using a Siemens Magnetom Verio 3T scanner (Siemens Medical Solutions, Erlangen, Germany). Functional images were collected using a gradient echo planar imaging sequence (repetition time, 2000 ms; echo time, 30 ms; flip angle, 90°; number of slices, 30; slice thickness, 3 mm; and matrix size, 64 × 64). Three scans were discarded before image acquisition for signal equilibrium. T1-weighted structural images were acquired with a 3D spoiled-gradient-recall sequence (repetition time, 1900 ms; echo time, 2.46 ms; flip angle, 9°; number of slices, 176; slice thickness, 1 mm; and matrix size, 256 × 256).
Preprocessing and analysis of fMRI data were performed with Statistical Parametric Mapping, version 12 (http://www.fil.ion.ucl.ac.uk/spm). Functional images were corrected for differences in slice acquisition time and were realigned to correct individual head motions. After co-registration of the corrected functional images on the structural images, transformation matrices obtained by spatial normalization of the structural images were applied to the co-registered images. These normalized functional images were smoothed with a Gaussian kernel of 6 mm full-width at half-maximum.
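For readers more used to specifying a Gaussian kernel by its standard deviation, the 6 mm full-width at half-maximum (FWHM) can be converted with the standard relation FWHM = 2·sqrt(2·ln 2)·sigma. The Python sketch below only illustrates this conversion and is not part of the SPM pipeline.

```python
from math import log, sqrt


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * sqrt(2.0 * log(2.0)))


sigma_mm = fwhm_to_sigma(6.0)
print(round(sigma_mm, 3))   # 2.548
```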
Statistical analysis
For the individual analysis, the internal, external, and rest conditions in the distress task and the self, other, and mosaic conditions in the speech evaluation task were used as regressors of interest in the general linear model (GLM). Six head motion parameters and two sessions were included as regressors of non-interest. In the distress task, contrast images were created by subtracting the rest condition from the average of the internal and external conditions to observe the neurobiological response to other people's facial expressions. In addition, contrast images that subtracted the internal condition from the external condition were used to identify neural responses caused by different cognitive processes regardless of others' facial images. In the speech evaluation task, contrast images were created by subtracting the other condition from the self condition, the mosaic condition from the self condition, and the mosaic condition from the other condition. To find these task-related neural substrates in patients with SAD, a one-sample t test was performed using the contrast images of the initial scan for all participants, regardless of group. Given that activated regions are observed more extensively in the block design than in the event-related design (Chee, Venkatraman, Westphal, & Siong, 2003), significant results were defined as the areas that survived the threshold of a family-wise error (FWE)-corrected p < 0.01 in the distress task and FWE-corrected p < 0.05 in the speech evaluation task, with a cluster size k > 50.
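The contrasts described above can be written as weight vectors over the regressors of interest. The Python sketch below spells them out and applies one of them to a set of made-up parameter estimates. The regressor ordering and dictionary keys are assumptions for illustration, and the motion and session regressors (which would receive zero weight) are omitted.

```python
DISTRESS_REGRESSORS = ("internal", "external", "rest")
SPEECH_REGRESSORS = ("self", "other", "mosaic")

# Contrast name -> (regressor order, weights). Ordering is an assumption.
CONTRASTS = {
    "distress: mean(internal, external) > rest": (DISTRESS_REGRESSORS, (0.5, 0.5, -1.0)),
    "distress: external > internal": (DISTRESS_REGRESSORS, (-1.0, 1.0, 0.0)),
    "speech: self > other": (SPEECH_REGRESSORS, (1.0, -1.0, 0.0)),
    "speech: self > mosaic": (SPEECH_REGRESSORS, (1.0, 0.0, -1.0)),
    "speech: other > mosaic": (SPEECH_REGRESSORS, (0.0, 1.0, -1.0)),
}


def contrast_value(betas, weights):
    """Weighted sum of parameter estimates for one voxel or region."""
    return sum(b * w for b, w in zip(betas, weights))


# Each subtraction contrast should have weights that sum to zero.
weights_sum_to_zero = all(abs(sum(w)) < 1e-12 for _, w in CONTRASTS.values())

# Made-up betas for (internal, external, rest), for illustration only.
example_betas = (1.2, 0.8, 0.4)
example = contrast_value(
    example_betas, CONTRASTS["distress: mean(internal, external) > rest"][1]
)
print(weights_sum_to_zero, round(example, 3))
```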
Among these task-related areas, those found in the contrast subtracting the rest condition from the average of the internal and external conditions in the distress task and in the contrast subtracting the other condition from the self condition in the speech evaluation task were considered to be regions of interest (ROIs) for further analysis. Their regional activity was extracted with MarsBaR version 0.44 in each of the initial and follow-up scans. Pearson correlation analysis was used to assess the relationship between regional neural activity at baseline and severity of social anxiety (LSAS total, anxiety, and avoidance scores). Then, we used the GLM to test whether changes in regional activity between the initial and follow-up scans differed between the groups. In the regions showing an interaction effect between time and group, a post-hoc paired t test was conducted to determine whether changes in regional activity over time occurred only in the VRS group.
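With two groups and two time points, the time-by-group interaction is equivalent to comparing change scores (follow-up minus initial) between the groups with a pooled-variance t test, where F = t² on (1, N − 2) degrees of freedom. The Python sketch below illustrates that equivalence together with the post-hoc paired t test. It is not the SPSS procedure used in the study, and all data values are made up.

```python
from math import sqrt
from statistics import mean, variance


def change_scores(initial, follow_up):
    """Follow-up minus initial, per participant."""
    return [post - pre for pre, post in zip(initial, follow_up)]


def pooled_t(x, y):
    """Independent-samples t test with pooled variance; returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny)), df


def paired_t(initial, follow_up):
    """Paired t test on within-participant change; returns (t, df)."""
    d = change_scores(initial, follow_up)
    return mean(d) / sqrt(variance(d) / len(d)), len(d) - 1


# Made-up regional activity values for illustration only.
vrs_initial, vrs_follow_up = [1.0, 2.0, 3.0, 4.0], [2.0, 3.5, 4.0, 6.0]
wl_initial, wl_follow_up = [1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.0, 4.5]

t_inter, df_inter = pooled_t(
    change_scores(vrs_initial, vrs_follow_up),
    change_scores(wl_initial, wl_follow_up),
)
f_interaction = t_inter ** 2          # interaction F on (1, df_inter)
t_vrs, df_vrs = paired_t(vrs_initial, vrs_follow_up)

print(round(f_interaction, 2), df_inter)
print(round(t_vrs, 2), df_vrs)
```

Under this formulation, 41 participants give 39 error degrees of freedom for the interaction.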
To identify the factors associated with improvement of social anxiety in the VRS group, we conducted a linear regression with the change in the LSAS total score as the dependent variable. First, to find the candidate variables for the final regression model, univariable regression analyses were performed using changes in behavioral indices of fMRI tasks and regional activities as explanatory variables. Candidate variables whose p-value was less than 0.2 in the univariable regression were entered into the stepwise multivariable regression model. The explanatory power of the model was expressed as the adjusted R².
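The two numerical rules in this paragraph, screening at p < 0.2 and reporting the adjusted R², can be sketched in Python as follows. The variable names and p-values are hypothetical, the stepwise selection itself is not implemented, and the adjusted R² uses the usual formula 1 − (1 − R²)(n − 1)/(n − k − 1) for n observations and k predictors.

```python
def screen_candidates(univariable_p, threshold=0.2):
    """Keep explanatory variables whose univariable p-value is below threshold."""
    return [name for name, p in univariable_p.items() if p < threshold]


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared for a linear model with an intercept."""
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / residual_df


# Hypothetical univariable p-values, for illustration only.
hypothetical_p = {
    "distress_index_change": 0.01,
    "right_LOG_change": 0.05,
    "left_insula_change": 0.60,
}
candidates = screen_candidates(hypothetical_p)

# Hypothetical model: R-squared 0.5, 21 participants, 2 predictors.
example_adjusted = adjusted_r2(0.5, n_obs=21, n_predictors=2)

print(candidates)
print(round(example_adjusted, 3))
```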
Continuous variables and a categorical variable of baseline demographic characteristics were compared using the t test and the χ² test, respectively. The time and group effects of the LSAS scores and behavioral measures during the fMRI tasks were analyzed using the GLM. Statistical analyses for demographic variables, behavioral measures, and extracted regional neural activities were conducted using SPSS software (ver. 23; SPSS Inc., Chicago, IL, USA).
Results
Changes in behavioral assessments
Figure 2 displays the behavioral measure scores during the initial and follow-up assessments in each group. An interaction effect between time and group was observed in the LSAS total, anxiety, and avoidance scores (F(1,39) = 5.8, p = 0.02; F(1,39) = 6.9, p = 0.01; and F(1,39) = 4.7, p = 0.04, respectively) and the distress index (F(1,36) = 6.7, p = 0.01), and at a marginal level in the negative evaluation index for the self (F(1,39) = 3.6, p = 0.06), but not in the HADS anxiety and depression scores or the negative evaluation index for the other. The post-hoc test confirmed that, compared with the initial assessment, the follow-up assessment showed significantly decreased LSAS anxiety and avoidance scores (t(20) = −3.8, p < 0.01; and t(20) = −3.2, p < 0.01, respectively), distress index (t(17) = −2.4, p = 0.03), and negative evaluation index for the self (t(20) = −4.1, p < 0.01) in the VRS group. In the WL group, no significant changes in these scores were found. There was no discontinuation of VRS due to simulator sickness, and the mean total score of the SSQ in the VRS group was 27.7 (standard error: 7.37).
Initial responses to distress and negative self-evaluation
Table 1 shows the results from the one-sample t test in the initial scan. Significant responses to distress (subtracting the rest condition from the average of the internal and external conditions) were identified in 14 areas, including the bilateral dorsomedial prefrontal cortex, right premotor cortex, right supramarginal gyrus, bilateral lateral occipital and lingual gyri, bilateral anterior insula, and bilateral thalamus. Among these areas, right lingual gyrus (coordinates: 14/−72/14) activity was negatively correlated with initial LSAS total scores (r = −0.32, p < 0.05) and avoidance subscale scores (r = −0.33, p < 0.05), but not with anxiety subscale scores (Fig. 3a). No region showed a significant difference between the internal and external conditions.
Abbreviations used in Table 1: MNI, Montreal Neurological Institute; Vox, number of voxels; R., right; L., left; PMC, premotor cortex; DMPFC, dorsomedial prefrontal cortex; SMG, supramarginal gyrus; LOG, lateral occipital gyrus; ACC, anterior cingulate cortex; VLPFC, ventrolateral prefrontal cortex; AG, angular gyrus; MTG, middle temporal gyrus; PCC, posterior cingulate cortex.
Significant regional responses to negative self-evaluation were analyzed using the self > other contrast and were identified only in the bilateral ACC. Activity in these ACC regions was positively correlated with initial LSAS total scores (right: r = 0.53, p < 0.01; left: r = 0.40, p = 0.01), anxiety scores (right: r = 0.53, p < 0.01; left: r = 0.41, p < 0.01), and avoidance scores (right: r = 0.49, p < 0.01; left: r = 0.36, p < 0.02) (Fig. 3b, c). In addition, activations were found in the bilateral dorsomedial prefrontal cortex, left ventrolateral prefrontal cortex, left ACC, left angular gyrus, left middle temporal gyrus, right lateral occipital gyrus, bilateral posterior cingulate cortex, right insula, and bilateral thalamus in the self > mosaic contrast, and in the bilateral dorsomedial prefrontal cortex, left ventrolateral prefrontal cortex, left middle temporal gyrus, and bilateral posterior cingulate cortex in the other > mosaic contrast.
Changes in responses to distress and negative self-evaluation
An interaction effect between time and group in regional neural responses to distress was found in the right lingual gyrus (coordinates: 20/−88/0; F(1,38) = 3.7, p = 0.06) and left thalamus (coordinates: −10/−14/10; F(1,38) = 3.4, p = 0.07) at a marginally significant level. The post-hoc paired t test showed that in the VRS group, neural activity in both regions significantly increased in the follow-up scan compared with the initial scan (right lingual gyrus: t(19) = 4.3, p < 0.01; left thalamus: t(19) = 3.3, p < 0.01), but did not change in the WL group (Fig. 4). However, no interaction effect between time and group in regional neural responses to negative self-evaluation was observed on either side of the ACC.
In univariable linear regression, changes in the distress index and changes in right lateral occipital gyrus (coordinates: 46/−72/−2), left lateral occipital gyrus (coordinates: −26/−94/2), and right thalamus (coordinates: 20/−26/−2) activity were found to be meaningful explanatory variables for changes in the LSAS total scores (online Supplementary Table 2). Among these variables, changes in the distress index and right lateral occipital gyrus activity were selected in the final multivariable model, and the adjusted R² was 0.48.
Discussion
This study aimed to find neurobiological evidence for the therapeutic effect of VRS. When we evaluated patients after approximately 3 weeks, both the anxiety and avoidance scores had significantly decreased in the VRS group, but not in the WL group. Because VRS induced significant improvement despite the short training period, examining brain changes following VRS is a legitimate way to analyze the treatment mechanism. The distress index also decreased in the VRS group, but not in the WL group, further suggesting that the mechanism of VRS can be properly investigated using neuroimaging analysis.
In terms of distress, our hypotheses included limbic activation, and the main ROIs were the amygdala and insula. However, the amygdala was not activated in response to our stimuli. This may be due to the nature of our stimuli, which were neutral or happy, in contrast to the harsh and angry expressions used in previous studies that reported amygdala hyperactivation (Davies et al., 2017; Gentili et al., 2008; Phan et al., 2006; Stein et al., 2002). Insula activation was observed in our study, consistent with the findings of other studies (Boehme et al., 2014; Choi et al., 2016; Gentili et al., 2008; Kim et al., 2018; Straube et al., 2004). Since the insula is a center of the disgust emotion (Wicker et al., 2003), this finding may result from patients' perception of neutral or happy expressions as conveying disgust, though there was no behavioral evidence for such perception other than the distress index in our experiment. This interpretation may be supported by patients' tendencies to rate happy faces as less approachable or untrustworthy (Campbell et al., 2009; Gutiérrez-García & Calvo, 2016). Alternatively, insula activation during the distress task may be derived from salience processing.
The insula is a key node of the salience network and plays a central role in the detection of behaviorally relevant stimuli (Uddin, 2015). Even neutral and happy expressions can be perceived as salient stimuli for patients who are overly concerned about the audience's reaction. In fact, excessive salience processing and insula overactivity in patients with SAD have been consistently reported in previous studies (Duval et al., 2018; Klumpp, Post, Angstadt, Fitzgerald, & Phan, 2013). However, because functional changes in the insula following VRS were not found in our study, there is no evidence that this biased perception would be improved by this short training.
The involvement of multiple thalamic areas during the distress task can also be considered limbic activation in response to emotional stimuli. The thalamus functions as a sophisticated sensory relay and is involved in emotional operations through connections to limbic structures such as the amygdala or insula (Ward, 2013). Our findings in the thalamus can be linked to a previous report showing hyper-reactivity of the thalamus when performing emotional and visual tasks (Brühl et al., 2011). Structural abnormalities of the thalamus in patients with SAD have also been reported in other studies (Meng et al., 2013; Tadayonnejad, Klumpp, Ajilore, Leow, & Phan, 2016). Furthermore, our study demonstrated that the thalamic area showing additionally increased activity after VRS was the mediodorsal region (coordinates: −10/−14/10). Given that the mediodorsal thalamic nucleus plays a role in the modulation of fear extinction (Lee et al., 2012) and reward devaluation (Mitchell, Browning, & Baxter, 2007), enhanced function of this region by VRS may reflect an improvement in the ability to positively modulate emotional signals.
Another hypothesis was that abnormal ToM-related activity while processing emotional information of faces would be restored after VRS. During the distress task, ToM-related activation was not observed in the superior temporal sulcus and temporoparietal junction, but was seen in the dorsomedial prefrontal cortex, which is involved in mentalization (Amodio & Frith, 2006). This cortex is also a critical region for self-referential processing (Ochsner et al., 2005). It has been reported that abnormal hyperactivation of this region in patients with SAD reflects the self-focused pathophysiology of SAD (Yoon et al., 2016). Taken together, the dorsomedial prefrontal activation observed in our study may have arisen because self-focused patients over-recognized others' evaluation of themselves. However, as with the biased perception mediated by the insula, there is no evidence that neural processing for this over-recognition is improved by VRS.
The visual cortices, such as the lingual gyrus and lateral occipital cortex, are of interest in SAD in that structural changes in those regions have been associated with symptom severity and self-focused attention (Frick et al., 2014). In our study, various visual cortices were activated when participants imagined making a speech in front of audiences and watched images of other people's faces. Among these regions, right lingual gyrus activity was inversely correlated with the initial LSAS avoidance score, suggesting that the more prone a person is to avoid social situations, the lower their visual activity. This may be because patients with more severe symptoms avoid gazing at facial stimuli. It has previously been reported that patients with SAD show weaker activity than healthy controls in visual cortices such as the fusiform gyrus and intraparietal sulcus when processing facial images (Gentili et al., 2008). In addition, among the activated regions, only the right lateral occipital gyrus showed a close association between changes in regional activity and changes in social anxiety after VRS in the regression analysis. This finding supports our hypothesis that activity of the visual cortices would increase with improved attention to faces after VRS. Therefore, the greatest effect of short-term training in terms of brain changes is a reduction in the tendency to avoid social stimuli.
The VRS group showed a significantly decreased negative evaluation index for the self in the follow-up assessment compared with the initial assessment, but the WL group did not, suggesting that VRS may be effective at decreasing negative self-beliefs. In terms of speech evaluation, we hypothesized that this decrease in negative evaluation after VRS would be associated with changes in activity in various regions related to negative self-beliefs. In the imaging results, however, patients with SAD showed increased activity only in the bilateral ACC in response to negative self-evaluation of their own speech in the initial assessment, and this activity was significantly correlated with the LSAS anxiety and avoidance scores. These findings may reflect the conflicting emotions of negative evaluation, since ACC hyperactivity during negative evaluation of the self has also been observed in adults without SAD (Longe et al., 2010; Miedl et al., 2016). The ACC plays a crucial role in affective evaluation of performance monitoring and control-demanding processes for aversive signals (Braem et al., 2017). Because of the characteristic aspects of the ACC in SAD, activity in this region has been suggested as a candidate biomarker for treatment selection (Frick et al., 2018). In our study, however, normalization of increased ACC activity following VRS was not observed despite the decreased negative evaluation index for the self after VRS. In terms of negative self-beliefs, a 2-week self-training may have been too short for a change in subjective assessment to lead to a change in ACC function. In that sense, the correction of negative self-beliefs seems to be less useful than a decrease in social anxiety as an indicator of a short-term treatment mechanism in SAD.
Limitations
First, although the initial number of study applicants was large, the final sample was small because of strict selection criteria and high dropout rates. Second, we did not include a healthy control group, and thus it was not possible to know whether the behavioral or neuronal features seen at baseline were unique to patients with SAD or to confirm whether the neuronal changes after VRS reached a normal state. Third, the participants' initial distress index in the distress task was above 0.5 even though only happy and neutral expressions were used, a choice that reflected the typical expressions of an audience listening to a speech; however, because valence ratings were not obtained for the faces used in the task, there was insufficient evidence to support an interpretation of negative bias. Fourth, since the order of the two experimental tasks was not counterbalanced, order effects may have influenced the results. Finally, although VRS was developed for patients to carry out by themselves at home, they visited the VR clinic eight times to ensure correct performance during the study. Although there was no intervention by a therapist, it is not possible to rule out that the regular visits had an effect on the results.
Conclusion
In terms of behavioral measurements, VRS decreased the levels of social anxiety and avoidance behavior and weakened negative self-beliefs in patients despite a short training period. In terms of the neural basis of the training effect, however, the reductions in social anxiety and negative self-beliefs were not supported by the fMRI results because VRS induced no changes in the limbic- and ToM-related regions, such as the insula and dorsomedial prefrontal cortex, or in regions related to negative self-beliefs such as the ACC. In contrast, VRS-induced improvements in the ability to pay attention to social stimuli without avoidance, and even to positively modulate emotional cues, were supported by functional changes in the visual cortices and thalamus. These short-term neuronal changes provide justification for VRS as a first interim intervention option for patients who are reluctant to receive formal treatment despite severe social anxiety.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291720003098.
Acknowledgements
The authors would like to thank Dr Kang Joon Yoon and radiologic technologist Sang Il Kim from St. Peter's Hospital for their valuable technical support. The authors would also like to thank So Hyun Kyeong, Sa Rang Min, and Young Hoon Jung for help with data collection.
Financial support
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2016R1A2A2A10921744).
Conflict of interest
None.