Introduction
Major challenges in current psychiatry include poor understanding of the neuronal basis of psychiatric disorders and the lack of reliable biomarkers for diagnostic classification (Howes & Kapur, 2014). To be clinically useful, such biomarkers should work at the level of an individual patient. Functional brain alterations have been identified across the brain at the group level in psychotic disorders, but distributions of these functional findings in patients and control subjects overlap too much to be used in personalized medicine. Multivariate methods, supported by machine-learning algorithms, identify not only robust differences but also more fine-grained patterns present in a group or condition and therefore provide more sensitive information. This approach holds potential for identifying an individual patient with a disorder or with a certain risk. Even more importantly, these methods may shed light on the neuronal basis of such differences (Howes & Kapur, 2014).
Machine-learning-based classification has been used with brain imaging data to differentiate patients with psychotic disorders from healthy control subjects with considerable success, reaching accuracies of 81–90% (Orrù et al. 2012). Earlier psychosis-related classification studies, both structural and functional, have mainly focused on patients with chronic schizophrenia (Davatzikos et al. 2005; Calhoun et al. 2008; Sun et al. 2009; Shen et al. 2010; Costafreda et al. 2011; Karageorgiou et al. 2011; Koutsouleris et al. 2012; Arbabshirani et al. 2014; Bleich-Cohen et al. 2014), and the functional studies have typically used resting-state data (Shen et al. 2010; Arbabshirani et al. 2014) or working-memory and verbal-learning tasks (Calhoun et al. 2008; Costafreda et al. 2011; Bleich-Cohen et al. 2014). During resting state, brain activation patterns reflect, at least in part, ongoing thoughts and experiences (Andrews-Hanna et al. 2010), which are likely to differ between patients and healthy control subjects in many uncontrolled ways. During tasks, the thoughts and experiences of patients and control subjects may be more similar than during resting state, but a narrower part of brain function is captured.
Until now, only a few classification studies have concentrated on first-episode psychosis (FEP) patients, in whom the effects of long-term illness are not yet present, or on the acute psychotic state, which is the most characteristic feature of psychotic disorders. In FEP patients, the discriminative structural brain patterns have been noticeably scattered, without implicating any specific brain regions (Borgwardt et al. 2013; Pettersson-Yeo et al. 2013) or, when T1 and diffusion tensor imaging data were combined, have involved numerous regions, such as the amygdala, middle and superior frontal gyrus, parahippocampal gyrus, bilateral globus pallidus, uncinate fascicles, and cingulum (Peruzzo et al. 2014). The findings have been more focused in functional magnetic resonance imaging (fMRI) studies during verbal-learning tasks, involving, among others, altered activation of regions overlapping with the default-mode network (Pettersson-Yeo et al. 2013). The default-mode network is a highly interconnected set of brain regions that is active during wakeful rest and includes the posterior cingulate cortex/precuneus, medial prefrontal cortex and bilateral inferior parietal regions (Raichle & Snyder, 2007).
Psychotic disorders have been suggested to derive from dysfunctional integration of different brain regions (Bleuler, 1911; Friston & Frith, 1995; Friston, 1998), and recent studies have also indicated that central hub nodes (i.e. regions with high connectivity to other brain regions) play a crucial role in the patients’ altered brain-network organization (Collin et al. 2014; Crossley et al. 2014; van den Heuvel & Fornito, 2014). Therefore, when attempting to understand the complex dysfunction of information processing (Savla et al. 2013; Fatouros-Bergman et al. 2014), more complex stimuli may be useful.
As it is not possible to bring realistic everyday situations into a brain-imaging laboratory, we mimicked them by letting the participants view a movie. Hypothetically, a movie stimulus makes the experiences and thoughts of patients and control subjects more similar than during resting state while still simulating the richness of everyday information processing. Consistent with this view, movie stimuli have been shown to evoke high inter-subject synchronization of sensory and attention-related networks and to be associated with more sensitive and specific functional connections than resting state without any stimuli (Hasson et al. 2004; Bartels & Zeki, 2005; Malinen et al. 2007; Nummenmaa et al. 2012; Lahnakoski et al. 2014).
We hypothesized that compared to healthy control subjects, integrative processing of a multimodal movie stimulus would be altered in FEP patients. Assuming that such a difference is related to the neuronal basis of reality distortion that defines psychosis (i.e. the diminished ability to understand what is real and what is not), we also expected severity of positive psychotic symptoms and fantasy content of the movie to be associated with these alterations.
Method and materials
Subjects
We included 78 participants (46 patients, 32 controls) from the Helsinki Early Psychosis Study. The patient group consisted of patients treated for FEP in hospitals and outpatient clinics of the Helsinki University Hospital. Control subjects were recruited using the Finnish Population Information System. Psychosis was defined as psychotic symptoms lasting over 24 h, with scores of ⩾4 points on the Unusual thought content or Hallucinations items of the Brief Psychiatric Rating Scale Extended (BPRS-E; Ventura et al. 1993). The BPRS-E was conducted as a semi-structured interview that covered the first occurrence of the symptoms. This information was also checked against the medical records and the Structured Clinical Interview for DSM-IV Disorders – Axis I (SCID-I) interview. We excluded patients with previous psychotic episodes or neurological disorders, subjects with substance-induced psychotic disorders, and subjects not eligible for magnetic resonance imaging. Diagnoses were based on the Research Version of SCID-I (First et al. 2007). Diagnostic assessments were done at baseline for control subjects and at 2 months for patients. For patients, medical records were also used in the diagnostic assessment.
Functional magnetic resonance images were collected at baseline for all subjects. The BPRS-E interview was done at baseline for all patients and at the 2-month follow-up for those patients who remained in the study (n = 36). At baseline, symptom severity was assessed for the worst lifetime period and for the past week. At 2 months, symptom severity was assessed for the past week and for the worst period after baseline measurement (i.e. worst period during the previous 2 months). The study protocol was approved by the Ethics Committee of the Hospital District of Helsinki and Uusimaa, and written informed consent was obtained from each subject before participation.
Image acquisition
We conducted the fMRI recordings at the AMI Centre of Aalto NeuroImaging, Aalto University. Because of a prescheduled change of the AMI Centre scanner in the middle of the study, 26 subjects (14 patients, 12 controls) were scanned using a GE Signa VH/i 3-T scanner (scanner 1) with a 16-channel head coil and 54 subjects (33 patients, 21 controls) using a Siemens Skyra 3-T scanner (scanner 2) with a 32-channel head coil. For both scanners, the parameters for functional blood-oxygenation-level-dependent (BOLD) imaging were 36 slices, slice thickness 4 mm, matrix size 64 × 64, echo time 30 ms, repetition time 1.8 s, flip angle 75°, and field of view 24 cm. Slices were aligned according to the line connecting the anterior and posterior commissures. T1- and T2-weighted structural images were also collected (see Supplementary material for imaging parameters).
Stimuli
During the fMRI recording, the participants watched five episodes with both realistic and fantasy content from the fantasy film Alice in Wonderland (Tim Burton, Walt Disney Pictures, 2010; Finnish soundtrack) for 7 min 20 s (245 volumes). This film uses a combination of live action and modern computer animation. Alice, played by the actress Mia Wasikowska, was present in each of the selected scenes. Although there were animated characters in each scene as well, the movie clip started from a scene that was filmed in real surroundings and continued in Wonderland, where the background was animated (see Supplementary Table S1 for detailed descriptions and durations of the scenes). The scenes were projected without breaks, using Presentation software (Neurobehavioral Systems Inc., USA), onto a semi-transparent screen that was visible to the subject via a mirror placed on the head coil. The soundtrack was delivered to the subject via plastic tubes connected to earplugs. Foam pads were added in and around the head coil for noise cancellation, and the sound level was adjusted to be comfortable for each subject.
Data preprocessing
An experienced radiologist examined all structural MRIs. One control subject was excluded due to neurological findings. The fMRI data were preprocessed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/). To correct for movement, each subject's images were realigned to the first image in the series using rigid-body transformations. Realigned images were normalized using the Montreal Neurological Institute template to allow between-subject comparison. To reduce the effect of individual functional differences and to better fit the assumption of normal distribution, images were smoothed with an 8-mm full-width-at-half-maximum Gaussian kernel.
Multiple linear regression was used to remove the constant offset, linear drift, and components correlating with the subject's motion parameters. After the regression, we employed the novel maxCorr method (Pamilo et al. 2015) to further remove artifacts from the data. From group fMRI data, maxCorr finds subject-specific components that are typically artifacts related to subject movement and physiological noise, such as cardiac cycle and respiration. The subject-specific components are found in the maxCorr analysis by using the data of the other subjects (patients and control subjects) as a reference. We removed a constant number of 10 maxCorr components per subject to ensure equal processing across subjects and to allow thorough removal of artifacts and other subject-specific (i.e. not shared) components. The preprocessed and cleaned data were then amplitude-normalized (i.e. the L2-norm was set to 1) and convolved with a filter W = [–1 –1 –1 –1 –1 –1 6]/6 in the time domain. The values in the convolved time-courses can be interpreted as measuring the change from a baseline defined by the mean of the six preceding time-points. It should be noted that maxCorr was run separately for each cross-validation fold, as explained later in more detail, to guarantee that it would not bring information from the test set into the learning phase.
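As a rough illustration of the regression, normalization, and filtering steps (not of maxCorr itself, which is omitted here), the following NumPy sketch may help. It assumes that the L2 normalization is applied to each voxel time course separately and that the filter is applied so that the weight 6/6 falls on the current sample; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def regress_out(ts, motion):
    """Remove constant offset, linear drift, and motion-related components.

    ts: (n_time, n_vox) voxel time courses; motion: (n_time, n_params).
    """
    n_time = ts.shape[0]
    design = np.column_stack(
        [np.ones(n_time), np.linspace(-1.0, 1.0, n_time), motion]
    )
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    return ts - design @ beta

def normalize_and_filter(ts):
    """Scale each voxel time course to unit L2 norm, then apply W."""
    norms = np.linalg.norm(ts, axis=0, keepdims=True)
    unit = ts / np.where(norms == 0, 1.0, norms)
    w = np.array([-1, -1, -1, -1, -1, -1, 6]) / 6.0
    # np.correlate slides w without reversing it, so each output equals the
    # current sample minus the mean of the six preceding samples.
    return np.apply_along_axis(
        lambda x: np.correlate(x, w, mode="valid"), 0, unit
    )

rng = np.random.default_rng(0)
ts = rng.standard_normal((245, 4))       # synthetic stand-in for BOLD data
motion = rng.standard_normal((245, 6))   # synthetic realignment parameters
filtered = normalize_and_filter(regress_out(ts, motion))
print(filtered.shape)  # (239, 4)
```

Under these assumptions, the first six volumes have no complete baseline, so 245 volumes yield 239 filtered time-points.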
Machine-learning overview
In machine-learning analysis, a classifier learns a set of best-classifying parameters from a training data set. After learning these parameters, the classifier can predict the class in a novel sample. In other words, the classifier is a model of the relationship between multiple data features and the class labels in the training set. In the case of fMRI, the input data, called a feature vector, are typically voxel values and the output classes are different groups of subjects such as patients and control subjects. After the learning phase, the performance of the classifier is evaluated using a test set that is independent of the training set.
To provide independent test and training samples, we divided the subjects into 32 groups (2 or 3 subjects per group, each containing at least one patient and one control subject), and repeated the classification process 32 times. Each group served once as a test set and was otherwise included in the training set. This is a widely used procedure, called stratified k-fold cross-validation (Kotsiantis, 2007), with k = 32 in our case. An alternative to cross-validation would be to split the sample once, using the first part for training and the second for testing. However, in our case this would be unlikely to bring any advantage and would only decrease the training-set size.
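The grouping can be sketched in a few lines of Python. The text does not describe how subjects were assigned to groups, so the shuffled round-robin dealing below is only one assumed way of satisfying the stated constraints; the function name and subject IDs are hypothetical.

```python
import random

def make_folds(patients, controls, n_folds, seed=0):
    """Deal shuffled subject IDs round-robin so every fold mixes both classes."""
    if min(len(patients), len(controls)) < n_folds:
        raise ValueError("each fold needs at least one patient and one control")
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for group in (list(controls), list(patients)):
        rng.shuffle(group)
        for i, subject in enumerate(group):
            folds[i % n_folds].append(subject)
    return folds

patients = [f"P{i:02d}" for i in range(46)]   # hypothetical subject IDs
controls = [f"C{i:02d}" for i in range(32)]
folds = make_folds(patients, controls, n_folds=32)
print(sorted({len(f) for f in folds}))  # [2, 3]
```

With 46 patients and 32 controls, this yields 32 groups of 2 or 3 subjects, each with exactly one control and one or two patients.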
In each repetition, the feature selection and the best-classifying parameters were based on the training-set data only, and then used to predict the classes of the subjects in the test set (Fig. 1). The training and test data were independent.
Formation of feature vectors
From the convolved data, a two-sample t value (unequal sample sizes, unequal variances) was calculated across subjects in the training set to compare patients with control subjects for each time-point and voxel. These t values produced an n_conv × n_vox matrix, where n_conv is the number of time-points after the time-domain convolution and n_vox is the number of voxels. The numbers of most significant time-points and voxels to retain were chosen with the commonly used heuristic of taking the square root of the total number available. For each time-point, the maximum absolute t value was found across all voxels, and the top N = round(√n_conv) = 15 time-points were then picked for further processing. Next, a similar selection was performed for the voxels by finding the maximum absolute t value within the selected N time-points for each voxel and then picking the top M = round(√n_vox) = 463 voxels. The selected N time-points and M voxels were used to read values from the convolved time series, creating a single feature vector of N × M = 6945 samples for each subject. These feature vectors were then given as input to the classifiers in the machine-learning stage. Note that the selected time-points and voxels could vary from one cross-validation fold to another, although their number remained the same.
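The two-stage selection can be sketched as follows, assuming the training data are held in an array of shape (subjects, time-points, voxels) and that Welch's t test from SciPy stands in for the unequal-variance t value. The function names and the small synthetic data set are hypothetical.

```python
import numpy as np
from scipy import stats

def select_features(train_data, train_labels):
    """train_data: (n_subj, n_conv, n_vox); labels: 1 = patient, 0 = control."""
    patients = train_data[train_labels == 1]
    controls = train_data[train_labels == 0]
    # Welch's t test (unequal variances) for every time-point and voxel.
    t, _ = stats.ttest_ind(patients, controls, axis=0, equal_var=False)
    abs_t = np.abs(t)                                  # (n_conv, n_vox)
    n_conv, n_vox = abs_t.shape
    n_tp = int(round(np.sqrt(n_conv)))
    n_vx = int(round(np.sqrt(n_vox)))
    # Top time-points by their maximum |t| over all voxels.
    tp_idx = np.sort(np.argsort(abs_t.max(axis=1))[-n_tp:])
    # Top voxels by their maximum |t| within the selected time-points.
    vx_idx = np.sort(np.argsort(abs_t[tp_idx].max(axis=0))[-n_vx:])
    return tp_idx, vx_idx

def build_feature_vector(subject_data, tp_idx, vx_idx):
    """Read the selected time-points and voxels into one flat vector."""
    return subject_data[np.ix_(tp_idx, vx_idx)].ravel()

rng = np.random.default_rng(1)
labels = np.array([1] * 10 + [0] * 10)
data = rng.standard_normal((20, 16, 25))
data[labels == 1, 3, 7] += 5.0            # planted group difference
tp_idx, vx_idx = select_features(data, labels)
vec = build_feature_vector(data[0], tp_idx, vx_idx)
print(vec.shape)  # (20,)
```

In the toy example, 16 time-points and 25 voxels give 4 × 5 = 20 features. If 239 time-points survive the filter, as 245 volumes would imply, round(√239) = 15, which agrees with the reported N.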
Classification
For classification, we employed the scikit-learn package (Pedregosa et al. 2011), written in Python. The procedure consisted of feature compression and the actual machine-learning classification. We employed 5-component principal component analysis (PCA) for feature compression and compared several classifiers to ensure that the results generalized and that our approach did not work well for just one type of classifier by chance. We evaluated the performances of logistic regression, support-vector classifier (linear kernel, parameter C = 0.1, 1.0, or 10.0), decision-tree classifier (max. depth 3), linear discriminant analysis classifier, and quadratic discriminant analysis classifier. p Values for accuracy were calculated against a chance level of max(P1, P2) ⩾ 0.5, where P1 is the proportion of patients and P2 = (1 – P1) is the proportion of control subjects.
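A minimal scikit-learn sketch of this setup is given below. Only the settings named in the text (5 PCA components, linear kernel, the three C values, maximum depth 3) are taken from it; all other hyperparameters are library defaults and therefore assumptions, as are the dictionary keys and the synthetic data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def build_classifiers():
    """Return one PCA(5) + classifier pipeline per evaluated classifier."""
    classifiers = {
        "LR": LogisticRegression(),
        "SVM (C=0.1)": SVC(kernel="linear", C=0.1),
        "SVM (C=1.0)": SVC(kernel="linear", C=1.0),
        "SVM (C=10.0)": SVC(kernel="linear", C=10.0),
        "DT": DecisionTreeClassifier(max_depth=3, random_state=0),
        "LDA": LinearDiscriminantAnalysis(),
        "QDA": QuadraticDiscriminantAnalysis(),
    }
    return {
        name: make_pipeline(PCA(n_components=5), clf)
        for name, clf in classifiers.items()
    }

# Synthetic, well-separated feature vectors (not real data).
rng = np.random.default_rng(2)
y = np.array([0, 1] * 20)
X = rng.standard_normal((40, 50)) + 5.0 * y[:, None]
scores = {
    name: model.fit(X, y).score(X, y)
    for name, model in build_classifiers().items()
}
```

Wrapping PCA and the classifier in one pipeline means the compression is refitted on whatever training set the pipeline is given, which keeps it inside each cross-validation fold.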
Cross-validation
For each cross-validation fold, the training set was separately processed without the knowledge of the test-set data. The processing started with maxCorr, followed by selection of the time-point and voxel indices, as described above, to form the feature vectors. These feature vectors were then used for training the classifiers, i.e. to obtain the classifier-specific parameters to replicate the known subject classes of the training set with the best possible accuracy.
After the training was finished, the feature selection and the classifier parameters were fixed and then employed to predict the classes of the test-set subjects, which were unknown to the algorithm. The test-set subjects were first processed individually, with the training-set data serving as a reference for the maxCorr computation to remove the subject-specific components. Note that this preprocessing affects the data of only one test-set subject at a time, and hence the test-set subjects cannot affect each other. Furthermore, this processing cannot affect the training stage, as the training had already been performed and the classifiers were fixed. Then, the indices selected from the training set were employed to form the feature vector and, finally, the class was predicted by the classifiers trained and fixed in the earlier training stage.
For simple illustration, one could consider all of the above processing as a single ‘black box’. First, the training data enter with the known class labels and the black box learns what it can. Next, the data of a single test-set subject are fed into the black box, without the correct class, and the box outputs its prediction. Finally, this prediction is compared with the known class outside the box.
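The ‘black box’ logic can be sketched as a loop in which everything that learns from data sits inside the fold. This is a simplified stand-in rather than the full pipeline: maxCorr is omitted, the feature selection is reduced to one t-test ranking over an already flattened feature matrix, and a single classifier (LDA) is used. All names and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

def cross_validate(features, labels, folds):
    """features: (n_subj, n_feat); folds: list of test-index arrays."""
    predictions = np.empty(len(labels), dtype=int)
    n_keep = int(round(np.sqrt(features.shape[1])))
    for test_idx in folds:
        train = np.ones(len(labels), dtype=bool)
        train[test_idx] = False
        # Feature selection sees the training subjects only.
        t, _ = stats.ttest_ind(
            features[train & (labels == 1)],
            features[train & (labels == 0)],
            axis=0,
            equal_var=False,
        )
        keep = np.argsort(np.abs(t))[-n_keep:]
        model = make_pipeline(PCA(n_components=5), LinearDiscriminantAnalysis())
        model.fit(features[train][:, keep], labels[train])
        # The fixed selection and classifier are applied to held-out subjects.
        predictions[test_idx] = model.predict(features[test_idx][:, keep])
    return predictions

rng = np.random.default_rng(3)
labels = np.array([0, 1] * 15)
features = rng.standard_normal((30, 200))
features[labels == 1, :5] += 3.0          # planted group difference
folds = np.array_split(np.arange(30), 10)
pred = cross_validate(features, labels, folds)
accuracy = (pred == labels).mean()
```

The prediction for each subject is compared with the true label only after the loop, outside the box.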
Analysis of association between classification sensitivity and severity of positive symptoms
To test the hypothesis that classification sensitivity was related to the severity of positive symptoms, we calculated for each patient sum scores of the following BPRS-E items: 10 (hallucinations), 11 (unusual thought content, i.e. delusions), 12 (bizarre behaviour), and 15 (conceptual disorganization, i.e. positive formal thought disorder). We then calculated a sum score of correct classifications for each subject in the training set (see Supplementary material for a detailed description of the formation of symptom and classification scores) and used Spearman rank correlation to test for the association between classification sensitivity and positive symptoms. Analysis was done using SPSS version 22.0 (IBM Corp., USA). Because the patient and control groups differed significantly in age, age was controlled for by calculating partial correlation in SPSS according to Conover (1999).
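The analysis itself was run in SPSS, but a rank-based partial correlation can be sketched in Python for readers who want to reproduce the idea. The sketch assumes that the procedure amounts to rank-transforming all three variables and computing a first-order partial Pearson correlation on the ranks, with a two-sided p value from a t distribution with n − 3 degrees of freedom. The function name and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats

def partial_spearman(x, y, covar):
    """Spearman correlation of x and y, controlling for one covariate."""
    rx, ry, rz = (stats.rankdata(v) for v in (x, y, covar))
    r_xy = np.corrcoef(rx, ry)[0, 1]
    r_xz = np.corrcoef(rx, rz)[0, 1]
    r_yz = np.corrcoef(ry, rz)[0, 1]
    rho = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
    df = len(rx) - 3
    t = rho * np.sqrt(df / (1 - rho**2))
    p = 2 * stats.t.sf(abs(t), df)        # two-sided (assumption)
    return rho, p

# Synthetic example: symptom score, classification score, and age.
rng = np.random.default_rng(4)
age = rng.normal(26, 5, size=46)
symptoms = rng.normal(10, 3, size=46) + 0.1 * age
hits = 0.5 * symptoms + rng.normal(0, 2, size=46)
rho, p = partial_spearman(hits, symptoms, age)
```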
Acquisition of fantasy ratings
To acquire an online evaluation of the fantasy content of the movie, we asked an independent sample of 17 healthy control subjects (seven female, mean age 26.5 years) to rate the events in the movie. Subjects were sent a link to web-based movie rating software (https://git.becs.aalto.fi/eglerean/dynamicannotations/tree/master) (Nummenmaa et al. 2012) and asked to rate continuously how likely it was that the events in the movie would happen in real life. The rating was done by sliding a bar on a scale visible next to the movie. Ratings were recorded on a scale of 0 to 1 every 200 ms and converted to match the imaging repetition time (TR). Mean values over all subjects were then calculated, resulting in a mean fantasy content score for each volume.
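The text does not state how the 200-ms ratings were converted to the TR. One simple possibility, shown here purely as an assumption, is to average the nine consecutive samples that fall within each 1.8-s volume and then average across raters; the function name and the random ratings are hypothetical.

```python
import numpy as np

def ratings_to_volumes(ratings, n_volumes=245, tr=1.8, step=0.2):
    """Average rating samples within each TR; the last axis is time."""
    samples_per_tr = int(round(tr / step))             # 9 samples per volume
    ratings = np.asarray(ratings, dtype=float)
    needed = n_volumes * samples_per_tr
    if ratings.shape[-1] < needed:
        raise ValueError("rating series is shorter than the scan")
    blocks = ratings[..., :needed].reshape(
        *ratings.shape[:-1], n_volumes, samples_per_tr
    )
    return blocks.mean(axis=-1)

rng = np.random.default_rng(6)
raters = rng.uniform(0.0, 1.0, size=(17, 245 * 9))    # synthetic ratings
per_rater = ratings_to_volumes(raters)                # (17, 245)
fantasy = per_rater.mean(axis=0)                      # one score per volume
print(fantasy.shape)  # (245,)
```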
Post-hoc analysis of association between functioning of discriminative voxels and fantasy ratings
We extracted subject-level parameter estimates from SPM's general linear model for fantasy-related activation from the voxels identified as discriminative features in the classification process. To control for some visual changes in the movie, mean optical flux, computed on the basis of movement of the objects in the movie (see Viinikainen et al. 2012), was entered into the analysis as a nuisance covariate. The extracted estimates were entered into one-sample t tests, separately for the patient and control subject groups, to test for the association of fantasy ratings with precuneus functioning. Between-group differences were tested with a two-sample t test. As the machine-learning features were selected from data combined from both scanners, we also used combined data in this post-hoc analysis.
Post-hoc analysis of the contribution of time points and voxels to classification
To better understand how the different time points and voxels contribute to the classification, we computed time-point-wise spatial maps of the most discriminative voxels. In each fold, the feature vector was reshaped into an N × M matrix. The minimum absolute t value at which a voxel was included in the feature vector was used as a threshold to form the spatial maps for the discriminating time points. These time-point-wise spatial maps were then visually compared to the map of all discriminative voxels.
Control analysis for movement
We used framewise displacement (Jenkinson et al. 2002) to calculate a movement score for each time point (1 time point = 1 repetition time) and a mean score across all time points. We used the Mann–Whitney U test to compare movement scores between the patient and control subject groups for each time point and for the mean movement across all time points.
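Given precomputed framewise-displacement scores, the group comparison can be sketched with SciPy as below. The computation of framewise displacement itself is not shown, and the significance threshold of 0.05 for flagging individual time points is an assumption, since the text does not state one. The function name and the synthetic scores are hypothetical.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def compare_movement(fd_patients, fd_controls, alpha=0.05):
    """fd_*: (n_subj, n_time) framewise-displacement scores per group."""
    # Compare the per-subject mean movement between the groups.
    mean_test = mannwhitneyu(
        fd_patients.mean(axis=1), fd_controls.mean(axis=1),
        alternative="two-sided",
    )
    # Repeat the comparison separately for every time point.
    p_time = np.array([
        mannwhitneyu(fd_patients[:, t], fd_controls[:, t],
                     alternative="two-sided").pvalue
        for t in range(fd_patients.shape[1])
    ])
    flagged = np.flatnonzero(p_time < alpha)
    return mean_test.statistic, mean_test.pvalue, flagged

rng = np.random.default_rng(5)
fd_pat = rng.gamma(2.0, 0.05, size=(46, 245))   # synthetic scores
fd_con = rng.gamma(2.0, 0.05, size=(32, 245))
fd_pat[:, 5] += 1.0                             # planted difference
u, p, flagged = compare_movement(fd_pat, fd_con)
```

Time points returned in `flagged` can then be checked against the time-point indices selected into the feature vectors.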
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Results
Descriptives
Table 1 presents detailed descriptions of all participants and the mean positive symptom scores for the patients. The control subjects were on average 2.7 years older than the patients (p = 0.046). Supplementary Table S2 gives the patients’ antipsychotic medication. Four (8.7%) patients were not using antipsychotics at the time of baseline assessment.
Classification
Table 2 shows classification accuracies of all classifiers. To demonstrate the effects of the maxCorr artifact-removal method, we present the classification results with and without maxCorr cleaning. With maxCorr, we reached a mean accuracy of 77.5% (p = 1.23 × 10^−6), and the highest accuracy was obtained with quadratic discriminant analysis (QDA) (79.5%, p = 5.69 × 10^−8). maxCorr improved the classification accuracy in all classifiers (on average by 31.7%) from non-significant to significant.
Discriminative voxels
Fig. 2a shows a map of all voxels selected into the feature vectors across all folds. Of the voxels that best discriminated patients from control subjects, 194 were selected into the feature vectors in all 32 folds. Out of these 194 voxels, 136 formed consistent bilateral clusters in the ventral precuneus (Fig. 2b). The remaining voxels consisted of single voxels near the precuneus clusters, a two-voxel cluster in the cingulate gyrus and a four-voxel cluster in the paracentral lobule. Time-point-wise analysis showed that the precuneus voxels that best discriminated the groups were most clearly involved in discrimination during the three consecutive time points 211–213. During these time points Alice is talking to the floating, detached head of the Cheshire Cat, which then starts to spin and grow a body (Supplementary Table S1, scene 5).
Clinical data
Successfully fitted classification frequencies were positively correlated with positive symptom scores for current symptoms at baseline (ρ = 0.30, p = 0.024), for current symptoms at 2 months (ρ = 0.32, p = 0.032), and for the worst period after baseline, assessed at 2 months (ρ = 0.32, p = 0.031); i.e. the more severe a patient's symptoms were, the more often the patient was correctly classified over all 32 sets. We found no correlation between classification success and lifetime worst-period symptom scores.
Fantasy content
The fantasy content of the movie was related to the functioning of the precuneus both in patients (t = 5.183, p < 0.001) and in control subjects (t = 7.331, p < 0.001). Between-group comparison showed the relationship to be stronger in control subjects than in patients (t = 2.130, p = 0.036). Adding optical flux into the model as a nuisance covariate had no effect on the findings. The mean movement did not differ between patients and control subjects (U = 742, p = 0.951). However, movement differed statistically significantly between the groups at six time points out of 245; none of these time points was included in the feature vectors.
Discussion
To our knowledge, this is the first study to use machine-learning classification with fMRI data collected during naturalistic stimulation in an attempt to distinguish FEP patients, or any psychiatric patient group, from healthy control subjects. Although the classification accuracy reached here did not outperform earlier machine-learning findings, the present results on the role of the precuneus in the natural-stimulus-related discrimination are valuable for understanding functional brain alterations in psychotic disorders. By using a movie stimulus with high validity for natural information processing (Hasson & Honey, 2012) and adding a novel data-cleaning method (Pamilo et al. 2015), we found this brain region to make a major contribution to classification and to be associated with both the fantasy content of the movie and the severity of positive symptoms. These findings add to earlier resting-state (Shen et al. 2010; Arbabshirani et al. 2014) and task-related (Pettersson-Yeo et al. 2013) machine-learning studies on FEP patients. While resting-state studies reveal activation of the idle brain, and tasks successfully capture differences in task-specific brain function (Goghari et al. 2010), studies using movie stimuli may help to unravel the brain systems that are most involved in everyday integrative information processing. Selection of movie contents relevant for the study question, such as fantasy to discriminate psychotic patients, may further advance this understanding.
The precuneus is considered to be a rich club hub node (van den Heuvel & Sporns, 2011) and therefore to play a crucial role in integrative information processing. It is located on the medial surface of the parietal lobule and can be divided into dorsal-anterior, dorsal-posterior, and ventral parts (Margulies et al. 2009; Zhang & Li, 2012), the ventral part overlapping with the discriminative region identified in our results. Together with the posterior cingulate cortex, the precuneus is a part of the posterior area of the default-mode network (Raichle & Snyder, 2007), considered to have complex interactions with several other functional networks (Leech et al. 2012; Utevsky et al. 2014), and to act as a central hub in the brain's large-scale information processing (Leech & Sharp, 2014).
Functionally, the precuneus is involved in high-level cognitive processes, such as episodic memory retrieval (Shallice et al. 1994; Tulving et al. 1994), self-related processing (Kircher et al. 2000), and the processing of meaning and knowledge acquired through experience (Mestres-Missé et al. 2008; Binder et al. 2009). In a review summarizing both the anatomy and the functionality of the precuneus, Cavanna & Trimble (2006) proposed the anterior part of the region to be involved in mental imagery, a view later supported by a meta-analysis of brain-imaging studies of false beliefs and visual perspective taking (Schurz et al. 2013). The precuneus on its own and as a part of the default-mode network has shown altered metabolism, structure, functioning, and connectivity in patients with psychotic disorders (Lynall et al. 2010; Bora et al. 2011; González-Hernández et al. 2014; Mashal et al. 2014; van den Heuvel & Fornito, 2014; Zhang et al. 2014) and during an induced psychedelic state in healthy subjects (Carhart-Harris et al. 2012, 2013; Palhano-Fontes et al. 2015). A role of the precuneus in psychosis-relevant integrative processing is consistent with long-held views of psychoses as disintegrative disorders (Friston & Frith, 1995; Friston, 1998). Earlier research has shown that during movie viewing the brain activity across subjects is highly synchronized, especially in early visual and auditory regions (Hasson & Honey, 2012). Had the discrimination resulted from the physical features of the visual and auditory stimuli, these regions would have been expected to be involved in the classification.
The correlation between positive symptom severity and classification success suggests that naturalistic information processing in the precuneus is related to the most typical symptom class of early psychosis. The relationship between fantasy ratings and the functioning of the discriminative voxels further suggests that our results are associated with (reality-distortion-related) stimulus content. Considering the known functional roles of the precuneus in episodic memory discussed above, the association with fantasy might reflect an evaluation of ‘could this happen in real life’ in the context of personal experience and meaning. Moreover, patients might relate to the events and characters in the movie differently, for example in relation to their own hallucinations and delusions.
By using maxCorr to remove further artifacts from the fMRI data after standard preprocessing steps, we were able to increase classification accuracy from statistically non-significant to statistically highly significant. The classification accuracy was consistent across all tested classifiers. We take the result to imply that maxCorr provides a useful preprocessing step to remove artifacts and thereby to diminish across-subjects variance. Such preprocessing seems to be especially beneficial before machine-learning classification and is well suited for group analysis of fMRI data in general. Similar accuracy across classifiers suggests that our results are independent of the applied classification method.
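What it means for a classification accuracy to move from statistically non-significant to significant can be illustrated with a one-sided binomial test against chance level. The sketch below is only an illustration under stated assumptions: the subject counts are hypothetical, chance level is taken as 0.5 (which presumes balanced groups), and the binomial test is a textbook stand-in rather than the significance test used in this study (permutation tests are a common alternative for cross-validated accuracies).

```python
from math import comb


def binomial_p_value(n_correct: int, n_subjects: int, chance: float = 0.5) -> float:
    """One-sided probability of at least n_correct hits if the classifier guessed at chance."""
    return sum(
        comb(n_subjects, k) * chance**k * (1.0 - chance) ** (n_subjects - k)
        for k in range(n_correct, n_subjects + 1)
    )


# Hypothetical numbers, NOT taken from the study: 40 subjects in total.
N_SUBJECTS = 40

# 22 of 40 correct (55%): compatible with guessing.
p_before = binomial_p_value(22, N_SUBJECTS)

# 30 of 40 correct (75%): very unlikely under guessing.
p_after = binomial_p_value(30, N_SUBJECTS)

print(f"55% accuracy: p = {p_before:.3f}")
print(f"75% accuracy: p = {p_after:.4f}")
```

With unequal group sizes, a more conservative choice is to set `chance` to the proportion of the larger group, or to estimate the null distribution by permuting the group labels.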
Some limitations of the study should be discussed. The control subjects were on average 2.7 years older than the patients. Although this difference was statistically significant, it was relatively small, and it is unlikely to have had a marked impact on the observed classification accuracy of 74.4–79.5%. Age was controlled for in our correlation analysis, and it did not correlate with symptom scores or classification success. Medication was not controlled for in our analysis, but according to current literature (Abbott et al. 2013; Röder et al. 2013), medication effects on the BOLD signal are largely restricted to normalizing the signal in patients with respect to the control subjects. Thus, we could expect the classification accuracy to be better in an unmedicated patient group. Although the fantasy content of the movie was associated with the activation of the discriminative voxels, other characteristics might vary together with fantasy and therefore affect the results. For example, animated characters were introduced at the same time as the events in the movie became increasingly unrealistic and, in principle, patients might attend to these characters differently. The present data do not allow us to exclude such a contribution with certainty, although the precuneus cluster did not coincide with typical activation nodes of the fronto-parietal attention networks (Vossel et al. 2014), nor with the visual cortices that typically show attentional modulation during visual stimulation (Wojciulik & Kanwisher, 2000). Basic sensory processing of animated characters could also differ independently of attention, although one would then expect discriminative voxels in the visual cortex. Even though such voxels were not found in the present study, better controlled studies are needed to disentangle the contributions of the different factors that might explain why the precuneus differentiates FEP patients and healthy control subjects during naturalistic stimulation. Although we did identify a scene during which the differentiation appeared most prominent, the exact contribution of this part of the movie remains to be studied.
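The statement above that age was controlled for in the correlation analysis can be made concrete with a first-order partial correlation, one standard way of removing the linear influence of a covariate. This is a minimal sketch, not the authors' analysis code: the function names and the toy data are hypothetical, and the study may have controlled for age differently (for example, by regression).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of covariate z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))


# Synthetic toy data (illustration only): both scores share the covariate,
# which produces a raw correlation of 0.5 that vanishes once the covariate
# is partialled out.
covariate = [1.0, 1.0, -1.0, -1.0]
score_a = [2.0, 0.0, 0.0, -2.0]
score_b = [2.0, 0.0, -2.0, 0.0]

raw_r = pearson(score_a, score_b)
adjusted_r = partial_corr(score_a, score_b, covariate)
print(f"raw r = {raw_r:.2f}, partial r = {abs(adjusted_r):.2f}")
```

If a correlation of interest survives such an adjustment, it cannot be explained by a linear dependence of both variables on the covariate alone.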
Our findings are the first to show abnormalities in precuneus functioning during naturalistic information processing in FEP patients. In comparison with structural data or simplistic tasks, naturalistic stimuli elicit integrative and ecologically valid processing that is of high interest in research on psychotic disorders. Furthermore, the findings suggest an association between the precuneus alteration and both the evaluation of reality and reality-distortion symptoms. The precuneus is known to be a central hub for the integration of self- and episodic-memory-related information, and thus its dysfunction might give insights into the neuronal basis of psychotic symptoms. Our findings indicate the usefulness of naturalistic stimuli, such as movies, in machine-learning classification analysis based on brain-imaging data after sufficient preprocessing, and they call for future research on the role of the precuneus in psychosis.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291716002609.
Acknowledgements
We thank all participants. We thank Tuula Mononen and Sanna Järvinen for coordinating the data collection and interviews, Marjut Grainger for data management and Marita Kattelus for technical assistance. We also thank Enrico Glerean and Lauri Nummenmaa for assistance with acquiring movie ratings and Juho Kettunen for calculating the optic flow.
This work was supported by the Sigrid Jusélius Foundation (J.S.), the Finnish Cultural Foundation (J.S. and T.M.), the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no. 602478 (J.S.), the European Research Council (Advanced Grant no. 232946 to R.H.), the Jalmari and Rauha Ahokas Foundation (T.M.), the Doctoral Program Brain and Mind of the University of Helsinki (T.M. and S.P.), the Academy of Finland (no. 278171 to J.S. and no. 251155 to T.T.R.), the Finnish Medical Foundation (T.T.R.), and the Louis-Jeantét Foundation (R.H.).
Declaration of Interest
None.