Introduction
People born in winter and early spring are at an elevated risk for schizophrenia (Bradbury & Miller, 1985; Boyd et al. 1986; Baron & Gruen, 1988; Mortensen et al. 1999). Although the effect is small, with an increase in the risk of about 10%, this ‘season of birth’ effect is one of the most robust findings in schizophrenia epidemiology (Davies et al. 2003). It has also been influential in developing hypotheses of schizophrenia pathogenesis, forming one of the central tenets of the viral infection hypothesis of the disorder (Torrey et al. 1977), as well as other less intensively investigated putative mechanisms such as vitamin D deficiency during foetal development (McGrath, 1999). However, at present, the mechanisms underpinning the season of birth effect are not known.
Schizophrenia is also known to be highly heritable (Cardno & Gottesman, 2000; Sullivan et al. 2003) and polygenic (International Schizophrenia et al. 2009; Purcell et al. 2014). Genomic studies have identified large numbers of risk alleles, with risk being conferred by variants spanning the full spectrum of population frequencies (Rees et al. 2014; Schizophrenia Working Group of the Psychiatric Genomics, 2014; Singh et al. 2016). The relative contributions of alleles of various frequencies are not fully resolved, but recent studies estimate that common alleles, captured by genome-wide association study (GWAS) arrays, account for between a third and one half of the genetic variance in liability (Schizophrenia Working Group of the Psychiatric Genomics, 2014).
A key assumption underlying research into the causes of the season of birth effect is that season acts as a proxy for one or more environmental exposures (e.g. virus infection). While an environmental origin for the season of birth effect seems the most plausible explanation, it is also possible that the effect is the result of gene–environment correlation. There are already examples where correlation of genetic and apparent environmental risks has been observed in psychiatric research; one such example, the link between maternal smoking and attention-deficit hyperactivity disorder (ADHD), is at least partly driven by shared genetic liability to both ADHD and smoking rather than by exposure to smoking per se (Thapar et al. 2009). The link between obstetric complications and schizophrenia may be confounded by the correlation between genetic liability to both (Ellman et al. 2007). This is also likely to apply to cannabis and psychosis (Power et al. 2014; Vaucher et al. 2017) and indeed to the link between all substance use disorders and psychosis (Adan et al. 2017).
Gene–environment correlations are sometimes classified as passive, active and evocative; regardless of the nature of the correlation, each predicts that people born in the winter should have elevated genetic liability to the disorder, even those who do not manifest the disorder. In contrast, if the association between winter birth and schizophrenia is not the result of gene–environment correlation, there should be no link between winter birth and genetic liability to the disorder. Evocative correlation is said to occur when an individual's behaviour evokes an environmental response and is therefore not clearly applicable to the phenomenon under investigation. Passive gene–environment correlation could, in principle, operate if parents with elevated trait liability have seasonal patterns of sexual activity that favour winter birth. If this is the case, then their offspring are expected to have elevated liability to the disorder (on average the mean liability of the parents, which will be higher than that of the population as a whole) and liability to winter birth. While this seems a priori unlikely, previous studies have shown that high schizophrenia trait liability is associated with a wide range of behaviours, amongst which of potential relevance are seasonal fluctuations in mood and activity levels (Byrne et al. 2015). Confounding through active gene–environment correlation is also possible if enhanced genetic risk for the disorder impacts upon seasonal patterns of foetal loss. Given the potential impacts on prevention and treatment of detecting modifiable environmental exposures, it is important to rigorously exclude alternative explanations for the season of birth effect on schizophrenia risk.
Population studies using family history have generally suggested that the season of birth effect is not a manifestation of genetic liability (Hettema et al. 1996; Suvisaari et al. 2004). However, the majority of people who develop schizophrenia do not have a history of the disorder in a close relative (Svensson et al. 2012). Advances in molecular genetics now allow genetic liability to be directly estimated for individuals regardless of their affected status or family history. Liability conferred by common risk alleles can be estimated through a process known as polygenic risk scoring; in a given individual, their polygenic risk score (PRS) represents the burden of common risk alleles carried by that individual. PRS have repeatedly been demonstrated to provide a useful index of genetic liability to the disorder (International Schizophrenia et al. 2009; Ripke et al. 2013; Schizophrenia Working Group of the Psychiatric Genomics, 2014), and are increasingly being applied to investigate whether environmental risk factors operate independently of genetic liability (Power et al. 2014).
Here, we implement PRS analysis of the UK Biobank (UKBB) (Sudlow et al. 2015) to determine whether season and month of birth are associated with genetic risk for schizophrenia. The UKBB is a large prospective cohort of more than half a million residents of the UK for which genetic data and seasonality of birth data are available. We generated schizophrenia PRS for every participant in the UKBB cohort (June 2015 release, N = 136 538) and tested whether this score was associated with month or season of birth. As a secondary test of the plausibility of the season of birth being a genetically correlated confound, we also conducted a GWAS of the season of birth, aiming to estimate whether this phenotype is heritable.
Rare alleles in the form of pathogenic copy number variants (CNVs) are also known to make a contribution to schizophrenia. Although the contribution to liability from CNVs at a population level (Purcell et al. 2014) is much smaller than that of common alleles, for completeness we also tested whether the frequencies of 93 pathogenic CNVs that have been linked with neurodevelopmental disorders (Dittwald et al. 2013; Coe et al. 2014; Kendall et al. 2017) were associated with season of birth.
UKBB obtained informed consent from all participants and this study was conducted under generic approval from the NHS National Research Ethics Service (approval letter dated 13 May 2016, Ref 16/NW/0274) and under UKBB approvals for applications #6553 (Smith) and #14421 (Kirov).
Material and methods
To define risk alleles, we used the largest available schizophrenia GWAS comprising a meta-analysis of two large studies (Pardinas 2017, Schizophrenia Working Group of the Psychiatric Genomics, 2014) which included 40 675 schizophrenia cases and 64 643 controls (Pardiñas et al. 2018). As PRS requires genome-wide data, we did not include the extension dataset provided to the PGC for selected SNPs by deCODE genetics. Both schizophrenia datasets were imputed using the SHAPEIT/IMPUTE2 software (Howie et al. 2012; Delaneau et al. 2013) with a combination of the 1000 Genomes phase 3 (1KGPp3) and UK10K datasets as a reference panel.
For the UKBB study, given that the schizophrenia GWAS we used to define risk alleles was of primarily European ancestry, we restricted the sample to those who self-reported as being of white UK ancestry (n = 136 538). For constructing PRS, we downloaded data that are publicly available (https://www.med.unc.edu/pgc/results-and-downloads). We included autosomal SNPs that passed stringent quality control criteria [minor allele frequency (MAF) ⩾0.01 and imputation quality score ⩾0.9]. This resulted in 5 471 613 SNPs. Using the UKBB genotypes, we pruned the SNPs, keeping those most significantly associated with schizophrenia in each region while excluding SNPs whose genotypes were correlated with the selected SNPs at r² ⩾ 0.2. The physical distance threshold for pruning SNPs was set to 1 Mb and the p value threshold was 0.5. After pruning, 118 302 independent SNPs remained. We selected markers, based upon significance thresholds, to construct a polygenic score in the UKBB data. The PRS was calculated from the effect size-weighted sum of associated alleles within each subject. PRS was standardised by subtracting the population mean for PRS and dividing by the standard deviation.
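The scoring and standardisation steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy genotypes and the toy weights are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev

def polygenic_scores(allele_counts, weights):
    """Effect size-weighted sum of allele counts for each individual.

    allele_counts: one list per individual, holding 0/1/2 counts per SNP.
    weights: per-SNP effect sizes (e.g. log odds ratios from the discovery GWAS).
    """
    return [sum(w * g for w, g in zip(weights, person)) for person in allele_counts]

def standardise(scores):
    """Subtract the mean and divide by the standard deviation.

    Assumption: population SD (pstdev); the source does not specify.
    """
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Toy data (hypothetical): 4 individuals, 3 SNPs.
toy_weights = [0.05, -0.02, 0.10]
toy_counts = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 2]]

raw_prs = polygenic_scores(toy_counts, toy_weights)
std_prs = standardise(raw_prs)
```

After standardisation the scores have mean 0 and standard deviation 1, so regression coefficients on PRS can be read in standard deviation units.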
As a test of robustness, we constructed PRS based on risk alleles passing a range of schizophrenia association thresholds in the PGC2 + CLOZUK data (e.g. significant at p ⩽ 0.01, 0.05, 0.1, …, 0.5). The primary analysis was based on p < 0.05 as this is the threshold that currently maximally captures polygenic risk (Schizophrenia Working Group of the Psychiatric Genomics, 2014). The PRSs of people born in each month/season were compared to those born in January/winter (baseline) using linear regression analysis with season/month coded as a factor (glm() function in R). All analyses in the UKBB were adjusted for the array (the UKBB used two different arrays) and the first eight principal components (PCs), reflecting underlying stratification in the sample due to population and/or genotyping differences. The first eight PCs, out of 15 available in the Biobank, were selected after visual inspection of each pair of PCs, taking forward only those that resulted in multiple clusters of individuals [see Smith et al. (2016) for detail].
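To make the factor coding concrete, the following Python sketch shows what the season coefficients mean in the simplest case. When season is the only predictor, the least-squares coefficient for each non-baseline level equals the difference between that season's mean PRS and the baseline mean. This is a simplification under stated assumptions: it omits the array and principal component covariates used in the actual analysis, and the function name and toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def season_effects(prs, seasons, baseline="winter"):
    """Unadjusted coefficient (B) for each season relative to the baseline.

    Equivalent to OLS with a dummy-coded season factor and no covariates.
    """
    groups = defaultdict(list)
    for score, season in zip(prs, seasons):
        groups[season].append(score)
    base_mean = mean(groups[baseline])
    return {s: mean(v) - base_mean for s, v in groups.items() if s != baseline}

# Toy data (hypothetical standardised PRS values).
toy_prs = [0.1, -0.1, 0.3, 0.5, 0.0, 0.2]
toy_seasons = ["winter", "winter", "spring", "spring", "summer", "autumn"]

effects = season_effects(toy_prs, toy_seasons)
```

A negative B for a season then means a lower mean PRS in that season than in winter. Adjusting for covariates requires a full multiple regression fit, which this sketch does not attempt.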
AVENGEME provides a set of R functions that allow power for PRS analyses to be calculated (Dudbridge, 2013). To be applicable, this method assumes a priori that (1) there is a non-zero SNP-heritability for the season of birth and (2) there is a genetic correlation between schizophrenia and the season of birth. As we show in the present study that both assumptions are violated, this widely used approach is not applicable. However, to illustrate that we had high power to detect the sought-after effects, we calculated the power of our PRS analysis for the reported effect size OR 1.07 (Davies et al. 2003) comparing winter/spring v. summer/autumn births with the pwr.norm.test() function in R.
Genome-wide association analysis (GWA) of the season of birth was performed using a binary phenotype defined as ‘0’ for individuals born in winter and spring (December–May) and ‘1’ for those born in summer and autumn (June–November). Association analysis was conducted using logistic regression with array and the first eight principal components as covariates (as described above). Genotype dosage data were initially converted to the most probable genotype format (with probability >0.9), then filtered by removal of SNPs with Hardy–Weinberg equilibrium p < 10⁻⁶, MAF <0.01, info <0.4, or data on <95% of the sample after excluding genotype calls made with less than 90% posterior probability, after which 8 989 945 variants were retained. LD-Score regression analysis (Bulik-Sullivan et al. 2015) was employed to estimate SNP-based heritability in this dataset.
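The phenotype coding and SNP filters described above can be written down compactly. The Python sketch below is illustrative only; the function and argument names are hypothetical, while the month ranges and thresholds are taken from the text.

```python
# Months counted as winter/spring in the binary GWAS phenotype (December-May).
WINTER_SPRING_MONTHS = {12, 1, 2, 3, 4, 5}

def season_phenotype(birth_month):
    """Return 0 for winter/spring births and 1 for summer/autumn births."""
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be in 1..12")
    return 0 if birth_month in WINTER_SPRING_MONTHS else 1

def passes_snp_qc(hwe_p, maf, info, call_rate):
    """Keep a SNP only if it survives all four exclusion criteria.

    SNPs are removed when HWE p < 1e-6, MAF < 0.01, info < 0.4,
    or genotype data are present for < 95% of the sample.
    """
    return hwe_p >= 1e-6 and maf >= 0.01 and info >= 0.4 and call_rate >= 0.95

example_codes = [season_phenotype(m) for m in range(1, 13)]
```

With this coding, a positive logistic regression coefficient for an allele corresponds to higher odds of a summer or autumn birth.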
The numbers of subjects with and without pathogenic CNVs born in winter/spring v. summer/autumn were compared with a χ² test. The list of 93 pathogenic CNVs was compiled from two widely accepted sources (Dittwald et al. 2013; Coe et al. 2014) as we have previously reported (Kendall et al. 2017). The CNV calls for UKBB participants were made in-house as we have previously reported in a UKBB CNV study (Kendall et al. 2017).
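For readers unfamiliar with the test, a 2 × 2 Pearson χ² comparison of carrier status by birth season can be sketched in Python as follows. The counts are invented for illustration and are not the study's data; the sketch also assumes no continuity correction, which the text does not specify. With one degree of freedom the p value can be obtained from the complementary error function.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    table: [[a, b], [c, d]], e.g. rows = birth season group,
    columns = CNV carrier / non-carrier.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))

# Hypothetical counts, chosen only to exercise the function.
stat, p_value = chi_square_2x2([[10, 20], [20, 10]])
```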
Results
The frequencies of birth by month are presented in Fig. 1. The distribution of frequencies in our study is similar to the distribution in the whole UKBB population (N = 502 632; biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=52).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200219143539548-0433:S0033291718000454:S0033291718000454_fig1g.gif?pub-status=live)
Fig. 1. Frequency of birth by month in the UK Biobank population.
Correlations between the timing of birth and polygenic risk
We found no association between schizophrenia PRS and season of birth in the UKBB sample, with winter as the baseline. Figure 2 shows the distribution of the standardised PRS for each season. The figure clearly demonstrates no difference in mean or variance of PRS. Detailed results are presented in Table 1, with winter as the baseline season of birth. Similarly, we found no differences in the PRS by month of birth with January as the baseline (Fig. 3; online Supplementary Table S1). Note that the specification of the baseline season is arbitrary; the results remain the same regardless of which season is used.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200219143539548-0433:S0033291718000454:S0033291718000454_fig2g.gif?pub-status=live)
Fig. 2. Distribution of schizophrenia polygenic risk scores (PRS) of 136 538 UK Biobank participants with respect to their season of birth. Schizophrenia polygenic risk scores were constructed using schizophrenia risk SNPs with association p value ⩽0.05.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200219143539548-0433:S0033291718000454:S0033291718000454_fig3g.gif?pub-status=live)
Fig. 3. Distribution of schizophrenia polygenic risk scores of 136 538 UK Biobank participants with respect to their month of birth. Schizophrenia polygenic risk scores were constructed using schizophrenia risk SNPs with association p value ⩽0.05.
Table 1. Comparison of schizophrenia PRS of individuals in the UK Biobank sample split by season of birth
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200219143539548-0433:S0033291718000454:S0033291718000454_tab1.gif?pub-status=live)
The schizophrenia PRS are generated using schizophrenia risk SNPs at different thresholds for the association. The baseline category of the analysis is winter birth; a negative B therefore indicates a lower schizophrenia PRS in those born in the season shown in the header row than in those born in winter.
The first column shows the p value threshold for SNP selection from the GWAS discovery. The numbers of SNPs that passed the selection criterion are shown in the second column. The following three sections of the table present effect sizes (B) and p values comparing each season with winter, estimated simultaneously in a single nominal regression model for each SNP selection threshold. The row in bold represents the primary analysis (P threshold for risk SNPs p⩽0.05); other rows are exploratory.
Given that previous studies have frequently identified both winter and spring as the season of elevated risk, we also collapsed those born in winter and spring into a single group and compared their PRS with those born in summer or autumn. Our findings were not consistent with a season of birth–PRS correlation; indeed those born in winter and spring actually had a slightly lower PRS than those born in summer or autumn, and this reached nominal significance for some of the secondary tests (Table 2).
Table 2. Comparison of schizophrenia PRS of individuals in the UK Biobank sample, born in winter–spring v. summer–autumn (‘Whole sample’ section) and bottom v. top deciles of schizophrenia PRS of individuals in the UK Biobank sample (‘Bottom v. Top deciles of schizophrenia PRS’ section)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200219143539548-0433:S0033291718000454:S0033291718000454_tab2.gif?pub-status=live)
The baseline of the analysis is winter–spring birth combined. The schizophrenia PRS are generated using schizophrenia risk SNPs at different thresholds for association (column 1). OR (column 2) is the exponentiation of the B-coefficient provided by logistic regression.
To check whether this sample may be underpowered to detect small effect sizes, we calculated the power of the PRS analysis for the reported effect size OR 1.07 (Davies et al. 2003). The effect size for power calculation is usually estimated as (B₀ − B)/σ, where B₀ is the effect size under the null hypothesis [in our case B₀ = log(1) = 0], B is the effect size under the alternative hypothesis [in our case B = log(1.07) = 0.068] and σ is the standard deviation. Since the PRS were standardised in our analyses, σ = 1 under both null and alternative hypotheses. Thus, the effect size for the power calculation was d = 0.068/sqrt(2) = 0.048 and the significance level was set to α = 10⁻⁴, to account for multiple testing. The power to detect an association between PRS and season of birth was >99.9%. We also estimated d = 0.0128 as the smallest effect size that our sample is capable of detecting with 80% power, which corresponds to an OR 1.018 (B = 0.018) excess for winter/spring births compared to summer/autumn births. As the power to detect an association between PRS and season of birth is 80% even for an association effect size as small as B = 0.018 at the 10⁻⁴ significance level (accounting for all PRS tests), we exclude seasonal variation in PRS as being correlated with a season of birth effect.
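The arithmetic of this power calculation can be retraced with a short Python sketch. It assumes a two-sided, one-sample normal (z) test with n = 136 538, which is our reading of how pwr.norm.test() was applied rather than something the text states; the function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_two_sided(d, n, alpha):
    """Power of a two-sided one-sample z test for standardised effect d."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)

def min_detectable_d(n, alpha, power):
    """Smallest d detectable at the given power (far rejection tail ignored)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return (z_crit + STD_NORMAL.inv_cdf(power)) / sqrt(n)

n = 136538        # sample size from the text
alpha = 1e-4      # significance level from the text

d_reported = log(1.07) / sqrt(2)           # approx. 0.048
power_reported = power_two_sided(d_reported, n, alpha)

d_min = min_detectable_d(n, alpha, 0.80)   # approx. 0.0128
or_min = exp(d_min * sqrt(2))              # approx. 1.018
```

Under these assumptions the sketch gives power above 99.9% for OR 1.07 and a minimum detectable d of about 0.0128 (OR about 1.018), in line with the figures quoted above.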
To evaluate effects that might only be observed at the extremes of liability, we also compared ‘winter and spring’ v. ‘summer and autumn’ births for the top and bottom deciles of the PRS distribution; again we found no evidence for association at the primary testing threshold for risk alleles from the PGC study. We did observe a nominally significant association (p = 0.02) when risk alleles were selected from the PGC GWAS at p < 0.1, but again the effect was inconsistent with that expected for a genotype–phenotype correlation, with people in the top 10% for genetic loading to schizophrenia having a slightly higher chance of being born in summer or autumn compared with the bottom 10% of people (Table 2).
Genome-wide association study of the timing of birth
A Manhattan plot of the GWAS of the season of birth is given as online Supplementary Fig. S1. No genome-wide significant associations were identified; there was no evidence for inflation in the test statistics indicative of polygenic inheritance [genomic control (Devlin & Roeder, 1999) λ = 0.992; see QQ-plot in Supplementary Figure 2]; and there was no evidence that SNP heritability contributed to this phenotype (total liability scale h² = −0.002, s.e. = 0.0052) as estimated by LD-score regression.
CNVs and timing of birth
There was no difference in the frequencies of pathogenic CNVs between groups of UKBB participants born in winter and spring compared with those born in summer or autumn [frequencies 0.017 (95% CI 0.016–0.018) and 0.016 (95% CI 0.015–0.017), respectively; χ² test p = 0.137].
Conclusion
An excess of winter and spring births in people with schizophrenia is one of the most robustly supported and influential epidemiological findings in psychiatry (1–5). Here, we test and fail to support the hypothesis that the excess of winter and spring births in people with schizophrenia is an effect of gene–environment correlation. Not only do we fail to find evidence that schizophrenia liability in the form of common alleles or rare CNVs is associated with season of birth, our GWAS suggests that season of birth is not even a heritable trait, which alone makes such a correlation an untenable explanation for the season of birth effect (with respect to common variation). Rejecting a genetic–environmental correlation, we conclude that our study strongly supports the widely held view that the excess of winter births in schizophrenia is the result of an as yet unknown environmental risk exposure with a seasonal gradient.
Although not a direct aim of our study, we also note that the absence of heritability effects on season of birth implies that other traits that exhibit relative age effects, for example, personality traits, sporting ability and general academic performance (Jeronimus et al. 2015), are similarly unlikely to represent gene–environment correlation; rather, as widely interpreted, they most likely result from differential levels of maturity in school and other cohort intakes. However, some cognitive phenotypes exhibit season of birth fluctuations beyond effects attributable simply to the timing of school intake (Grootendorst-van Mil et al. 2017). It is therefore not possible to exclude the possibility that an as yet undiscovered, seasonally fluctuating risk factor that increases the risk of schizophrenia also contributes to the season of birth effects which have been observed for these other cognitive phenotypes.
Our study has a number of major strengths: the use of the largest schizophrenia dataset to date to identify genetic risk loci; the use of one of the largest available genotyped population cohorts to generate schizophrenia PRS; and the ability to measure liability to schizophrenia directly at a molecular level. Another strength conferred by the large population sample is the power to test the sample month by month. This is important given that many studies vary in their definition of winter–spring and in the month for which risk is maximal. Together, these strengths allow us to test the genetic confounding hypothesis with extremely high power; as a result, our failure to find evidence in support of that hypothesis allows us to refute it as an explanation. In doing so, our results are consistent with, and complementary to, studies that have indirectly measured genetic liability based on family history (Hettema et al. 1996; Suvisaari et al. 2004; Svensson et al. 2012).
Our study has a number of limitations. One potential limitation is that, as with all environmental exposures, possible variance in the exposure rate to the pathogenic agent might mean our conclusion could be country or birth cohort specific. However, the season of birth effects (and therefore exposure to the putative pathogenic environmental exposures) have been widely documented in Northern European samples (5). They have also been shown to operate in the UK from at least 1921 up to the modern era, with the most recent study in the UK suggesting January births are associated with an OR for schizophrenia of 1.17 (Disanto et al. 2012). Another potential limitation is that our analyses did not account for the possibility that an individual's circadian biology or chronotype might interact with season of birth effects (Natale & Adan, 1999; Natale et al. 2009).
Finally, our PRS analysis and heritability estimates were based upon common SNPs, and do not include a possible contribution from rare SNPs. However, the frequencies of rare CNVs linked to neurodevelopmental disorders also did not differ by season of birth, and it seems unlikely that rare mutations that increase liability to schizophrenia would have different effects on mating behaviours than the burden of common alleles. Nevertheless, when sufficient data become available, it may be useful to test for seasonal burdens of rare and de novo mutation events.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291718000454
Acknowledgements
This research was conducted using the UK Biobank resource. UK Biobank was established by the Wellcome Trust, Medical Research Council, Department of Health, Scottish Government and Northwest Regional Development Agency. UK Biobank has also had funding from the Welsh Assembly Government and the British Heart Foundation. Data collection was funded by UK Biobank. The work at Cardiff University was supported by Medical Research Council (MRC) Centre (MR/L010305/1) and Program Grants (G0800509). The CLOZUK sample was genotyped with funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement n° 279227 (CRESTAR Consortium; http://www.crestar-project.eu/). JW is supported by a JMAS Sim Fellowship from the Royal College of Physicians of Edinburgh and DJS is supported by a Lister Institute Prize Fellowship. KK is supported by a Wellcome Trust Clinical Research Fellowship.