Recent advances in molecular data acquisition from archaeological remains have revolutionized the genetic study of human population history. Evidence suggests that polygenic selection, particularly during the last 10,000–15,000 years encompassing the Holocene, has influenced genetic markers linked to phenotypic traits such as height, skin pigmentation, and psycho-social characteristics, including educational attainment (EA), intelligence (IQ), and autism. These findings support a gene-culture co-evolutionary model, where sociocultural transformations following the agricultural revolution reshaped selective pressures on human populations (Kuijpers et al., Reference Kuijpers, Domínguez-Andrés, Bakker, Gupta, Grasshoff, Xu, Joosten, Bertranpetit, Netea and Li2022; Piffer & Kirkegaard, Reference Piffer and Kirkegaard2024a; Woodley et al., Reference Woodley, Younuskunju, Balan and Piffer2017).
A recent analysis of 2500 ancient genomes identified significant positive temporal trends for traits such as height, EA, IQ, and, to a lesser extent, autism and socioeconomic status (SES), along with a negative trend for schizophrenia and depression (Piffer & Kirkegaard, Reference Piffer and Kirkegaard2024a). However, this research was limited to samples from Europe and parts of the Middle East.
This article aims to replicate these findings using samples from populations of Eastern Eurasian origins. The dataset covers a diverse range of geographical regions, including the equatorial zone (e.g., Malaysia and Indonesia), tropical regions (e.g., Vietnam, Laos, Thailand), temperate areas (e.g., central and northern China), and extends into Siberia and the Arctic. It also includes Central Asia (e.g., Mongolia, Xinjiang), Tibet, and regions as far east as Japan.
A limitation of this approach is that the predictive validity of polygenic scores (PGSs) declines with increasing genetic distance from the genomewide association study (GWAS) reference population. Consequently, PGSs derived from European GWASs may exhibit lower predictive validity for African populations compared to Europeans (Martin et al., Reference Martin, Gignoux, Walters, Wojcik, Neale, Gravel, Daly, Bustamante and Kenny2017). This issue also affects East Asian populations, albeit to a lesser degree. For instance, a recent GWAS on EA conducted in East Asian cohorts reported significant genetic correlations and transferability of findings between East Asian and European populations (genetic correlation between GWASs = .87), alongside similar patterns of gene expression and cross-trait genetic correlations (Chen et al., Reference Chen, Kim, Lam, Chuang, Chiu, Lin, Jung, Kim, Kim, Cho, Shim, Park, Ahn, Okbay, Jang, Kim, Seo, Park, Ge, Huang and Won2024). Furthermore, Piffer and Kirkegaard (Reference Piffer and Kirkegaard2024b) demonstrated that PGSs for IQ, EA, and height derived from European-ancestry GWASs predict differences in mean phenotypic traits across contemporary Chinese provinces, achieving a correlation of approximately .7 between height PGS and average height. These findings support the utility of European-derived PGSs for predicting variation within East Asian populations.
Our model predicts that PGSs associated with cognitive abilities relevant to academic and intellectual achievement, including IQ and EA, will show a positive trend over time (indicating a negative correlation with years before present [BP]) due to the adaptive advantages conferred by such traits in complex societies. This hypothesis is supported by findings that East Asians, whose populations have historically developed highly organized, large-scale societies, tend to score higher in average IQ and indices of EA compared to other global populations (Lynn & Vanhanen, Reference Lynn and Vanhanen2012) and to have higher EA PGSs than other ancestral groups (Piffer, Reference Piffer2015, Reference Piffer2019). Such patterns suggest that prolonged exposure to the selective pressures of complex hierarchical societies may have amplified traits associated with cognitive performance and academic success. Conversely, given the generally lower average height observed in East Asian populations (NCD Risk Factor Collaboration [NCD-RisC]), 2020) and recent findings from a global study on the genetic determinants of height (Piffer & Kirkegaard, Reference Piffer and Kirkegaard2024c), we do not expect to observe a positive temporal trend in height PGSs.
We predict a positive correlation between latitude and height PGS, supported by findings from a recent study reporting a correlation of .72 between height PGS and the average height of Chinese provinces (Piffer & Kirkegaard, Reference Piffer and Kirkegaard2024b), as well as the positive correlation observed at the phenotypic level (Lu et al., Reference Lu, Hu, Yang, Zhang, Lu, Gong, Li, Shen, Zhang and Zhuang2022).
In contrast, anxiety and depression are predicted to exhibit negative selection due to their negative correlation with intelligence—as reported by a recent large meta-analysis (Anglim et al., Reference Anglim, Dunlop, Wee, Horwood, Wood and Marty2022). In fact, the Smoke Detector principle posits that natural selection favors systems that are hypersensitive to potential threats, even at the cost of frequent false alarms, because the cost of missing a real danger (a false negative) is much higher than the cost of responding to a nonthreat (a false positive). In ancestral settings, heightened sensitivity to threats would have increased survival chances by prompting individuals to avoid dangers. However, in contemporary urban environments, where immediate physical threats are less prevalent, this heightened sensitivity can lead to maladaptive responses, manifesting as anxiety and depression (Nesse, Reference Nesse2001). We predict negative selection on schizophrenia, supported by evidence of negative correlations between schizophrenia and intelligence at both genetic and phenotypic levels (Comes et al., Reference Comes, Senner, Budde, Adorjan, Anderson-Schmidt, Andlauer, Gade, Hake, Heilbronner, Kalman, Reich-Erkelenz, Klöhn-Saghatolislam, Schaupp, Schulte, Juckel, Dannlowski, Schmauß, Zimmermann, Reimer and Papiol2019; Hill et al., Reference Hill, Davies, Liewald, McIntosh and Deary2016; Lam et al., Reference Lam, Trampush, Yu, Knowles, Davies, Liewald, Starr, Djurovic, Melle, Sundet, Christoforou, Reinvang, DeRosse, Lundervold, Steen, Espeseth, Räikkönen, Widen, Palotie and Lencz2017; Lencz et al., Reference Lencz, Knowles, Davies, Guha, Liewald, Starr, Djurovic, Melle, Sundet, Christoforou, Reinvang, Mukherjee, DeRosse, Lundervold, Steen, John, Espeseth, Räikkönen, Widen and Malhotra2014; Smeland et al., Reference Smeland, Frei, Kauppi, Hill, Li, Wang, Krull, Bettella, Eriksen, Witoelar, Davies, Fan, Thompson, Lam, Lencz, Chen, Ueland, Jönsson, Djurovic and Andreassen2017). Conversely, we anticipate positive selection on autism spectrum disorder (ASD). This contrast may reflect differing adaptive contexts: psychotic tendencies might have conferred advantages in traditional societies where mystical beliefs and hallucinatory experiences were integrated into cultural norms. In contrast, modern societies, which emphasize sustained focus, specialization and structured cognition, may favor traits associated with autism.
Methods
Polygenic Scores
PGSs were calculated using the most recent and extensive GWAS for each trait. To filter variants, we applied clumping and thresholding (C + P) using PLINK 1.9 (Chang et al., Reference Chang, Chow, Tellier, Vattikuti, Purcell and Lee2015), setting a standard GWAS p-value threshold of p < 5 × 10^-8 with a linkage disequilibrium (LD) of r^2 < 0.1. Allele frequencies were computed using PLINK 2.0 (Chang et al., Reference Chang, Chow, Tellier, Purcell and Lee2020). The frequency files were loaded into R (version 4.4.1; R Core Team, 2023) and merged with the GWAS summary files to compute the PGSs.
Educational attainment (EA). we used PGS derived from two major European ancestry GWAS datasets: (1) The multitrait analysis of European genomewide association summary statistics, which includes years of education, cognitive performance, self-reported math ability, and highest math class taken, encompassing approximately 1.1 million individuals (Lee et al., Reference Lee, Wedow, Okbay, Kong, Maghzian, Zacher, Nguyen-Viet, Bowers, Sidorenko, Karlsson Linnér, Fontana, Kundu, Lee, Li, Li, Royer, Timshel, Walters, Willoughby, Yengo and Cesarini2018), referred to here as ‘EA3’; (2) The largest European-based GWAS to date, involving about 3 million individuals (Okbay et al., Reference Okbay, Wu, Wang, Jayashankar, Bennett, Nehzati, Sidorenko, Kweon, Goldman, Gjorgjieva, Jiang, Hicks, Tian, Hinds, Ahlskog, Magnusson, Oskarsson, Hayward, Campbell, Porteous and Young2022), referred to as ‘EA4’. To enhance robustness and minimize error, we averaged EA3 and EA4, yielding a more stable indicator.
EUR-IQ. The most recent and largest GWAS of general cognitive function (Davies et al., Reference Davies, Lam, Harris, Trampush, Luciano, Hill, Hagenaars, Ritchie, Marioni, Fawns-Ritchie, Liewald, Okely, Ahola-Olli, Barnes, Bertram, Bis, Burdick, Christoforou, DeRosse and Deary2018) identified 434 independent SNPs at p < 5 × 10-8.
EAS-EA1. The GWAS of EA trained on East Asians (T. T. Chen et al., Reference Chen, Kim, Lam, Chuang, Chiu, Lin, Jung, Kim, Kim, Cho, Shim, Park, Ahn, Okbay, Jang, Kim, Seo, Park, Ge, Huang and Won2024) relied on 180k samples from the Taiwan Biobank (TWB; Feng et al., Reference Feng, Chen, Chen, Kuo, Hsu, Yang, Chen, Su, Chu, Shen, Ge, Huang and Lin2022) and Korean Genome and Epidemiology Study (KoGES; Y. Kim et al., Reference Kim and Han2017).
MIX-Height. For height, we used the significant SNPs from the largest GWAS to date (Yengo et al., Reference Yengo, Vedantam, Marouli, Sidorenko, Bartell, Sakaue, Graff, Eliasen, Jiang, Raghavan, Miao, Arias, Graham, Mukamel, Spracklen, Yin, Chen, Ferreira, Highland and Hirschhorn2022), which comprised a multi-ancestry sample (after LD pruning with a threshold of r 2 < .1).
EAS-Height. Akiyama et al. (Reference Akiyama, Ishigaki, Sakaue, Momozawa, Horikoshi, Hirata, Matsuda, Ikegawa, Takahashi, Kanai, Suzuki, Matsui, Naito, Yamaji, Iwasaki, Sawada, Tanno, Sasaki, Hozawa, Minegishi and Kamatani2019) identified 573 independent variants with p < 5 × 10-8 using a relatively large sample (>190K) of Japanese individuals.
Height-direct. Tan et al. (Reference Tan, Jayashankar, Guan, Nehzati, Mir, Bennett, Agerbo, Ahlskog, Pinto de Andrade Anapaz, Åsvold, Benonisdottir, Bhatta, Boomsma, Brumpton, Campbell, Chabris, Cheesman, Chen and Young2024) carried a family-based GWAS (FGWAS) on 34 phenotypes, including body height. We used the within-family (‘direct’) beta of 565 independent SNPs significant at p < 5 × 10^-8.
Schizophrenia (SCZ). A recent schizophrenia GWAS by Trubetskoy et al. (Reference Trubetskoy, Pardiñas, Qi, Panagiotaropoulou, Awasthi, Bigdeli, Bryois, Chen, Dennison, Hall, Lam, Watanabe, Frei, Ge, Harwood, Koopmans, Magnusson, Richards and Sidorenko2022) identified 342 independent SNPs in a combined, multi-ancestry GWAS that were significant at a genomewide level (p < 5 × 10-8) with a LD of r 2 < .1.
Autism spectrum disorder (ASD). The largest GWAS of ASD was employed (Grove et al., Reference Grove, Ripke, Als, Mattheisen, Walters, Won, Pallesen, Agerbo, Andreassen, Anney, Awashti, Belliveau, Bettella, Buxbaum, Bybjerg-Grauholm, Bækvad-Hansen, Cerrato, Chambert, Christensen, Churchhouse and Børglum2019), which identified 88 top loci, 69 of which had p < 5 × 10^−8.
Depression. The most recent and extensive meta-analysis was used (Als et al., Reference Als, Kurki, Grove, Voloudakis, Therrien, Tasanko, Nielsen, Naamanka, Veerapen, Levey, Bendl, Bybjerg-Grauholm, Zeng, Demontis, Rosengren, Athanasiadis, Bækved-Hansen, Qvist, Bragi Walters, Thorgeirsson and Børglum2023). After executing C + T, 305 SNPs remained.
Anxiety. A recent GWAS of anxiety disorders based on a large (∼1.2 million), multi-ancestry sample identified 55 significant independent loci (Friligkou et al., Reference Friligkou, Løkhammer, Cabrera-Mendoza, Shen, He, Deiana, Zanoaga, Asgel, Pilcher, Di Lascio, Makharashvili, Koller, Tylee, Pathak and Polimanti2024).
Skin Color-UKBB. We utilized the GWAS on skin color from the pan-UKBB dataset (https://pan.ukbb.broadinstitute.org/GWAS; phenocode: 1717). After applying clumping and thresholding (C + T) for SNP selection, 1323 SNPs remained.
Light Skin-EAS. This analysis employed findings from a recent GWAS on skin color conducted by B. Kim et al. (Reference Kim, Kim, Shin, Leem, Cho, Kim, Gu, Seo, You, Martin, Park, Kim, Jeong, Kang and Won2024) in an East Asian sample. The study identified 26 independent SNPs associated with skin luminance, as well as red/green and yellow/blue components of skin color. For this study, we selected the 15 SNPs significantly associated with skin luminance.
Datasets
Ancient DNA data was obtained from the Genome Sequence Archive for Human (GSA-Human; National Genomics Data Center, & Partners, 2017) at the National Genomics Data Center (NGDC), Beijing Institute of Genomics, Chinese Academy of Sciences (https://ngdc.cncb.ac.cn/gsa-human) and from the European Nucleotide Archive (ENA) (2022) at the European Bioinformatics Institute (EBI) (https://www.ebi.ac.uk/ena)
Accession numbers were the following: HRA000123 (M. A. Yang et al., Reference Yang, Fan, Sun, Chen, Lang, Ko, Tsang, Chiu, Wang, Bao, Wu, Hajdinjak, Ko, Ding, Cao, Yang, Liu, Nickel, Dai, Feng and Fu2020); HRA000411 (Mao et al., Reference Mao, Zhang, Qiao, Liu, Chang, Xie, Zhang, Wang, Li, Cao, Yang, Liu, Dai, Feng, Ping, Lei, Olsen, Bennett and Fu2021); HRA000451 (C. C. Wang et al., Reference Wang, Yeh, Popov, Zhang, Matsumura, Sirak, Cheronet, Kovalev, Rohland, Kim, Mallick, Bernardos, Tumen, Zhao, Liu, Liu, Mah, Wang, Zhang, Adamski and Reich2021); HRA001777 (Kumar et al., Reference Kumar, Wang, Zhang, Wang, Ruan, Yu, Wu, Hu, Wu, Guo, Wang, Niyazi, Lv, Tang, Cao, Liu, Dai, Yang, Feng, Ping and Fu2022); HRA002378 (H. Wang et al., Reference Wang, Yang, Wangdue, Lu, Chen, Li, Dong, Tsring, Yuan, He, Ding, Wu, Li, Tashi, Yang, Yang, Tong, Chen, He and Fu2023); HRA004375 (Xiong et al., Reference Xiong, Wang, Chen, Yang, Du, Meng, Ma, Allen, Tao, Wang, Jin, Wang and Wen2024); HRA005606 (Bai et al., Reference Bai, Liu, Wangdue, Wang, He, Xi, Tsho, Tsering, Cao, Dai, Liu, Feng, Zhang, Ran, Ping, Payon, Mao, Tong, Tsring, Chen and Fu2024); HRA005681 (Guo et al., Reference Guo, He, Xie, Tao, Mai, Zhu, Qin, Yang, Xie, Wang, Ma, Zhao, Li, Gong and Wang2024); HRA005990 (Ma et al., Reference Ma, Zhou, Wang, Yan, Chen, Qiu, Zhao, Jin and Wang2024); HRA006574 (Shen et al., Reference Shen, Wu, Zan, Yang, Guo, Ji, Wang, Liu, Mao, Wang, Zou, Zhou, Peng, Ma, He, Bai, Xu, Wen, Jin, Zhang and Wang2024); HRA007236 (Zhang et al., Reference Zhang, Zhang, Bai, Hu, Duan, Yuan, Zhang, Ma, Zhou and Ning2024); HRA007527 (Li et al., Reference Li, Wang, Ma, Tu, Qiu, Chen, Jiang, Geng, Liu, Wang, Shen, Jin, Li, Wang and Wei2024); PRJEB2671 (McColl et al., Reference McColl, Racimo, Vinner, Demeter, Gakuhari, Moreno-Mayar, van Driem, Gram Wilken, Seguin-Orlando, de la Fuente Castro, Wasef, Shoocongdej, Souksavatdy, Sayavongkhamdy, Saidin, Allentoft, Sato, Malaspinas, Aghakhanian, Korneliussen and Willerslev2018); PRJEB29700 (Sikora et al., Reference Sikora, Pitulko, Sousa, Allentoft, Vinner, Rasmussen, Margaryan, de Barros Damgaard, de la Fuente, Renaud, Yang, Fu, Dupanloup, Giampoudakis, Nogués-Bravo, Rahbek, Kroonen, Peyrot, McColl, Vasilyev and Willerslev2019); PRJEB30575 (Flegontov et al., Reference Flegontov, Altınışık, Changmai, Rohland, Mallick, Adamski, Bolnick, Broomandkhoshbacht, Candilio, Culleton, Flegontova, Friesen, Jeong, Harper, Keating, Kennett, Kim, Lamnidis, Lawson and Schiffels2019); PRJEB35748 (Jeong et al., Reference Jeong, Wang, Wilkin, Taylor, Miller, Bemmann, Stahl, Chiovelli, Knolle, Ulziibayar, Khatanbaatar, Erdenebaatar, Erdenebat, Ochir, Ankhsanaa, Vanchigdash, Ochir, Munkhbayar, Tumen, Kovalev and Warinner2020); PRJEB36297 (Ning et al., Reference Ning, Li, Wang, Zhang, Li, Wu, Gao, Zhang, Zhang, Hudson, Dong, Wu, Fang, Liu, Feng, Li, Han, Li, Wei, Zhu and Cui2020); PRJEB41752 (Liu et al., Reference Liu, Witonsky, Gosling, Lee, Ringbauer, Hagan, Patel, Stahl, Novembre, Aldenderfer, Warinner, Di Rienzo and Jeong2022); PRJEB42781 (Wang et al., Reference Wang, Wang, Xie, Li, Fan, Yang, Wu, Cao, Liu, Yang, Liu, Dai, Feng, Wu, Qin, Li, Ping, Zhang, Zhang, Liu and Fu2021); PRJEB43762 (Cooke et al., Reference Cooke, Mattiangeli, Cassidy, Okazaki, Stokes, Onbe, Hatakeyama, Machida, Kasai, Tomioka, Matsumoto, Ito, Kojima, Bradley, Gakuhari and Nakagome2021); PRJEB45573 (Gelabert et al., Reference Gelabert, Blazyte, Chang, Fernandes, Jeon, Hong, Yoon, Ko, Oberreiter, Cheronet, Özdoğan, Sawyer, Yang, Greytak, Choi, Kim, Kim, Jeong, Bae, Bhak and Pinhasi2022); PRJEB55185 (Lee et al., Reference Lee, Miller, Bayarsaikhan, Johannesson, Ventresca Miller, Warinner and Jeong2023); PRJEB72297 (Lee et al., Reference Lee, Sato, Tajima, Amgalantugs, Tsogtbaatar, Nakagome, Miyake, Shiraishi, Jeong and Gakuhari2024); PRJEB20217 (M. A. Yang et al., Reference Yang, Gao, Theunert, Tong, Aximu-Petri, Nickel, Slatkin, Meyer, Pääbo, Kelso and Fu2017).
For contemporary genomes, we used a sample of 383 individuals from Han Chinese, Tibeto-Burman, Tai-Kadai, Austroasiatic, Mongolic, Turkic, Tungusic, and Indo-European speaking groups from China and Nepal, published by T. Wang et al. (Reference Wang, Wang, Xie, Li, Fan, Yang, Wu, Cao, Liu, Yang, Liu, Dai, Feng, Wu, Qin, Li, Ping, Zhang, Zhang, Liu and Fu2021); samples from 4 native North American and 12 north Asian populations published by Rasmussen et al. (Reference Rasmussen, Li, Lindgreen, Pedersen, Albrechtsen, Moltke, Metspalu, Metspalu, Kivisild, Gupta, Bertalan, Nielsen, Gilbert, Wang, Raghavan, Campos, Kamp, Wilson, Gledhill, Tridico and Willerslev2010); 42 Mongols from Inner Mongolia (C. C. Yang et al., Reference Yang, Sarengaowa, He, Guo, Zhu, Ma, Zhao, Yang, Chen, Zhang, Tao, Liu, Zhang and Wang2021); 41 Tai–Kadai-speaking Maonan people (J. Chen et al., Reference Chen, He, Ren, Wang, Liu, Zhang, Yang, Zhang, Ji, Zhao, Guo, Chen, Zhu, Yang, Wang, Ma, Tao, Liu, Shen and Huang2022); and 157 individuals from four Tibeto-Burman-speaking groups from the Guizhou province in Southwest China (J. Chen et al., Reference Chen, Zhang, Yang, Wang, Zhang, Ren, Wang, Liu, Chen, Ji, Zhao, He, Guo, Zhu, Yang, Ma, Wang and Huang2023). Our sample also comprised 1104 individuals indigenous to the high-altitude regions in the Himalayas in Nepal, including 344 ethnic Tibetans and 103 Sherpa (Jeong et al., Reference Jeong, Witonsky, Basnyat, Neupane, Beall, Childs, Craig, Novembre and Di Rienzo2018).
We used the TOPMed Imputation Server to perform genotype imputation, leveraging the TOPMed reference panel (Taliun et al., Reference Taliun, Harris, Kessler, Carlson, Szpiech, Torres, Taliun, Corvelo, Gogarten, Kang, Pitsillides, LeFaive, Lee, Tian, Browning, Das, Emde, Clarke, Loesch and Abecasis2021).
Merging
For each sample, metadata were collected from the supplementary material of the relative publication, and the following variables were extracted (if present) from the metadata: Population/Culture (‘pop’); Years Before Present (Years BP); Autosomal Coverage; Latitude and Longitude.
Genome Imputation
The ancient genomes were imputed using GLIMPSE2 (Rubinacci et al., Reference Rubinacci, Hofmeister, Sousa da Mota and Delaneau2023), using 1000 Genomes Project phase3 as reference panel (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). For this, raw sequence data in.bam format were processed to generate imputed genotypes. The imputation process aimed to infer missing genotypic information and provide a more comprehensive set of genetic variants for each individual. The output BCF files were then merged with BCF tools (Danecek et al., Reference Danecek, Bonfield, Liddle, Marshall, Ohan, Pollard, Whitwham, Keane, McCarthy, Davies and Li2021).
Data Transformation
In our analysis, we combined GWAS summary statistics with allele frequency information based on SNP identifiers (chr:pos). SNPs that did not overlap between datasets were excluded. We then compared the GWAS effect allele (A1) with the allele data to ascertain whether it corresponded with the reference (REF) or alternative (ALT) allele in the 1000 Genomes Project (1KG; 1000 Genomes Project Consortium; 2015). We computed a weighted PGS, which we refer to as the Genetic Value Score (GVS), for each SNP, taking into account its allele frequency and effect size (β). Following this, a merged dataset was constructed containing one PGS value for each individual by taking the average GVS across all SNPs. The individual’s data were then augmented with additional attributes, including their dataset source and other relevant metadata, extracted using a custom string parsing function.
Admixture Analysis
Admixture components for the samples were computed using ADMIXTURE software (Alexander et al., Reference Alexander, Novembre and Lange2009), a powerful tool for estimating individual ancestries from multilocus SNP genotype datasets. The ADMIXTURE analysis works by decomposing genotype data of each individual into fractions representing potential ancestral populations. Admixture data was merged with the PGS dataset in R by organizing individuals based on their FAM identifiers.
Statistical Analysis
A correlation analysis was conducted to examine the relationship between the PGS values and their corresponding dates, expressed as Years BP.
Regression analyses were carried out to investigate the effects of various predictors on PGS values. The first regression model included Coverage as a covariate, the second introduced geographic variables (Latitude and Longitude) to the predictors, and the fourth introduced ancestry components estimated with ADMIXTURE.
In our regression analysis, we aimed to include the eight ancestry components estimated by ADMIXTURE as predictors. However, since the ancestry proportions output by ADMIXTURE sum to 1 for each individual, including all eight components introduces a situation of perfect multicollinearity. This means that the value of one component can be perfectly predicted from the values of the others, leading to computational difficulties and unstable coefficient estimates in the regression model. To circumvent this issue, and based on standard analytical practices in such situations, we opted to include only seven of the eight components in the model. By doing this, the effect of the dropped component is effectively absorbed into the intercept of the regression, and the coefficients of the included components are interpreted relative to it. This adjustment ensures a more stable and interpretable model without compromising the integrity of our analysis.
Results
In total, there were 1245 individuals in the merged dataset. After excluding individuals with no metadata and duplicate samples, there were 1133 individuals.
Dates (in Years BP) for these samples spanned a wide historical range. The minimum date was 141.5, with a mean date of 3101.8. The oldest sample was 40,000 years old and belonged to the Tianyuan individual. Finally, 181 samples lacked date information.
The PGSs were computed for each individual for different traits. The percentage of SNPs matching between the GWAS and the samples ranged from 77.2% for Skin Color-UKBB to 100% for IQ, EA3, MIX-Height, Anxiety and Depression, Light Skin-EAS (Table 1).
Table 1. Characteristics of the polygenic scores
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_tab1.png?pub-status=live)
Note: GWAS, genomewide association study; SNPs, single nucleotide polymorphisms; ASD, autism spectrum disorder.
Sample locations were globally dispersed. Latitude ranged from 3.533 to 70.700, with a mean of 40.4, centering around mid-latitude regions, and the median latitude was slightly higher at 43.08. Longitude spanned from -170.10 to 159.10, with a median at 100.46. The lowest longitude belonged to the Old Bering Sea samples found at Ekven. Because a negative value would not make sense in this context, we compute longitude east on a 0−360 scale. The lowest latitude samples (3.5−5.47) came from Indonesian and Malaysian genomes.
Depth of coverage, which indicates the number of times a specific base (nucleotide) in the DNA is read during the sequencing process, ranged from 0.001x to 32.29x, with a mean of 1.75x.
Admixture
The analysis explored various models, and the one with the lowest cross-validation (CV) error incorporated eight components. The selection of eight ancestral populations (K) was based on rigorous cross-validation. The eight-component model exhibited the lowest CV error, suggesting that it most accurately captures the genetic structure of our dataset without overfitting. Consequently, we employed this eight-component model to interpret the genetic ancestry of our population in the subsequent analysis (see Figure 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig1.png?pub-status=live)
Figure 1. ADMIXTURE plot showing ancestry components (K = 8) by group.
Based on the distribution of these components among different cultural groups, we identified their most probable ancestral origins. For instance, the second component (V2) stood out distinctly, comprising over 90% of the ancestry in the Afanasievo and Finnish individuals, while representing only a small proportion of the East Asian samples. It accounted for more than 50% of the ancestry in the Xinjiang samples and exceeded 25% in the Turkic and Upper Paleolithic Siberian samples. Given these patterns, we labeled this component as ‘Afanasievo-Yamnaya’. Similarly, the fifth component comprised about 90% of the ancestry of the Jomon samples, but was present at around 10% among Koreans, and was much smaller among the other mainland East Eurasian populations. Hence, we labeled this component as ‘Jomon’.
Temporal Trends
We computed the correlation between Years BP and the PGS to identify any potential temporal trends, such as an increase or decrease in the PGS over time. This method allows for a straightforward and intuitive visualization of the temporal trend by plotting the PGS values against the corresponding time points, making it easy to observe any general upward or downward movement in the data.
However, this approach has notable limitations due to the variability in the ethnic composition of our sample over time. Specifically, different geographical regions and ethnic groups were not uniformly represented across all periods, which introduces biases into our analysis. For example, certain timeframes may have predominantly sampled individuals from one region or ethnic group, while other periods may have samples with different regional or ethnic backgrounds. This inconsistency means that any detected trend might be influenced by underlying shifts in the population structure rather than a genuine temporal change in the PGS. Thus, interpreting the correlation without considering these compositional differences could lead to misleading conclusions about the temporal dynamics of polygenic traits.
To circumvent these limitations, we employed regression models that included Admixture Components, Longitude, Latitude, and Coverage as co-predictors. By incorporating these additional variables, we aimed to account for the effects of population structure, spatial distribution, and sampling coverage, thereby providing a more accurate and nuanced analysis of the temporal trends in PGSs. This approach helps to mitigate the confounding effects of uneven sampling across time and ensures that the observed trends are not merely artifacts of changing ethnic compositions or sampling biases.
We performed linear regression using the full dataset, but for the spline regression model incorporating Date, we restricted the analysis to samples younger than 12,000 years. This adjustment was necessary due to the sparse data available for older samples (N = 8), which could have obscured more nuanced temporal trends. To fully utilize the dataset, we applied spline effects for Latitude, including the oldest samples.
The correlations and the p-values are reported in Table 2 and the temporal trend is visualized in Figures S1 to S11.
Table 2. Pearson’s correlation between Years BP and PGS *
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_tab2.png?pub-status=live)
Note: BP, before present; PGS, polygenic score; EA, educational attainmennt; ASD, autism spectrum disorder.
* Significant correlations are shown in bold type.
Regression Models
EA Analysis. The effects of Years BP and Arctic admixture were significantly negative (β = -0.274 and -0.239 respectively). The other significant effects were positive: Northeast Chinese admixture (.348); Afanasievo (.153); Siberian (.215); Southeast Asian (.199) and coverage (.079). The results are visualized in Figure 2.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig2.png?pub-status=live)
Figure 2. Educational attainment (EA): standardized beta coefficients.
Nonlinear trends with Latitude and Years BP. The scatterplot inspection suggested that the temporal trend deviated from linearity, prompting the use of a spline function to capture potential nonlinear patterns between PGSs and Years BP. This was implemented using the splines package in R, utilizing natural splines (ns) with three degrees of freedom for flexibility. By incorporating a natural spline with three degrees of freedom for Date, we aimed to model complex, nonlinear associations that could elucidate spatial relationships more effectively. Regression results showed significant nonlinear effects for the first, second and third spline components of Date (p < .001, p < .01, and p < .001 respectively). For latitude, a significant nonlinear effect was found for the first spline component (p < .001). To facilitate interpretation, we generated partial effect plots for Latitude and Date from the fitted model, displaying the predicted effects on PGSs with confidence intervals (Figure 3). These plots provide a clearer view of the nonlinear influence of both latitude and temporal trends on PGSs.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig3.png?pub-status=live)
Figure 3. Partial effect of latitude on educational attainment (EA) and partial effect of Years Before Present (BP) on EA.
Comparison with modern genomes from Eastern Eurasia. To confirm the positive trend in recent times, we computed PGSs from several samples of contemporary Eastern Eurasians. We then carried a regression model with EA as the outcome and with period (modern vs. ancient) and geographical region as predictors.
The main effect of the modern period (relative to ancient) was positive and significant (β = 0.398, p = .032), indicating an increase in EA scores in modern times. Central Asia, Mongolia, NE China, North Siberia, South China, SE Asia, and Tibetan regions all showed positive and significant effects on EA scores compared to the baseline region, suggesting higher EA PGSs in these regions.
The interaction between the modern period and Mongolia (β = 0.874; p < .001), South China (β = 1.120, p < .001) and SE Asia (β = 0.590, p < .001) is positive and significant. This suggests that, in the modern period, EA scores have increased notably in these regions, especially in South China. The presence of an interaction between region and period can be seen in Figure 4.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig4.png?pub-status=live)
Figure 4. Interaction effect: Educational attainment (EA) by period for each region.
Since the average age of the samples from each period and geographical region differed, we added Date (Years BP) as a covariate, and assigned a value of zero to the modern samples. The interaction results were very similar, but Date replaced Period as a significant predictor (β = -0.157, p < .001).
IQ Analysis
Regression model results. The regression results indicated several significant effects (Figure 5). Date had a significant negative effect (β = -0.096, p = .012), indicating a trend of lower PGSs further back in time. Northeast Chinese ancestry showed a significant positive effect (β = 0.146, p = .001), suggesting a positive contribution to the outcome variable. Afanasievo-Yamnaya and Siberian ancestry components did not show significant effects (p = .180 and p = .280 respectively). The Arctic component also showed no significant influence (p = .920). Longitude and Latitude were not significant predictors (p = .613 and p = .077 respectively), although latitude approached significance. Coverage did not have a significant effect (p = .636), and SE Asian, Tibetan, and Jomon components were also nonsignificant, with p-values ranging from .127 to .850.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig5.png?pub-status=live)
Figure 5. IQ: Standardized beta coefficients.
Nonlinear trends with Years BP. The regression model results indicated a significant nonlinear effect for the first spline component of Date (p < .001; Figure 6), while the spline components for Latitude did not show significant effects.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig6.png?pub-status=live)
Figure 6. Partial effect of Years Before Present (BP) on IQ.
Height Analysis
MIX-Height. Years BP did not have a significant effect on MIX-Height (p = .485). Afanasievo-Yamnaya admixture had a strong positive effect on it (β = 0.556), whereas Jomon and Southeast Asian had negative coefficients (−.20 and −.144, respectively)
Nonlinear trends with latitude and Years BP. The results from the regression model indicated significant nonlinear effects for the first spline component of latitude (p < .01) and the second spline of Date (p < .01) (Figure 7). The effect of the second spline of Date was negative, implying negative selection on height.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig7.png?pub-status=live)
Figure 7. Partial effect of latitude on MIX-Height and partial effect of Years Before Present (BP) on MX-Height.
EAS-Height. The regression results indicated several significant effects. The influence of Date on the outcome was not significant (p = .077). Northeast Chinese admixture had a negative effect on the outcome (β = −0.083, p = .048), suggesting a significant but relatively small influence. The Afanasievo-Yamnaya component did not significantly affect the outcome (p = .188). In contrast, Jomon admixture showed a strong negative effect (β = −0.283, p < .001), as did Siberian (β = −0.152, p = .002), Arctic (β = -0.255, p = .001), and Southeast Asian (β = −0.135, p = .009) components. Longitude had a significant positive effect (β = 0.185, p = .018). Other predictors, such as Latitude and Coverage, did not show significant effects (p = .511 and p = .526 respectively). The effect of latitude on EAS-Height had a similar curvilinear trend to MIX-Height, and the first spline component of latitude was barely significant (p = .042).
DIR-Height. The regression results indicated several significant effects. Date had a significant negative effect (β = -0.080, p = .027), suggesting a higher PGS in more recent times. Northeast Chinese ancestry also showed a significant negative effect on the outcome (β = −0.125, p = .002). Afanasievo-Yamnaya admixture had a strong positive effect (β = 0.354, p < .001), indicating a substantial contribution to the outcome variable. Jomon ancestry showed a negative effect (β = −0.109, p = .003), while Arctic ancestry also had a significant negative impact (β = −0.192, p = .008). Longitude exhibited a positive effect (β = 0.217, p = 0.004), indicating a spatial trend across longitudinal gradients. Other predictors, such as Tibetan, Siberian, SE Asian, Latitude, and Coverage, did not show significant effects (p-values ranging from .088 to .850).
The spline regression revealed no statistically significant effects for either Date or Latitude.
The standardized coefficients of the regression models with MIX-Height, EAS-Height and DIR-Height are visualized in Figure 8.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig8.png?pub-status=live)
Figure 8. MIX-Height, EAS-Height and DIR-Height: Standardized beta coefficients.
Psychiatric Phenotypes Analysis
Schizophrenia. The regression results for schizophrenia revealed several significant effects (Figure 9). Afanasievo-Yamnaya admixture had a substantial positive impact (β = 0.288, p < .001), indicating a strong contribution to the outcome. Jomon and Siberian admixtures also had significant positive effects (β = 0.130, p = .001, and β = 0.151, p = .002 respectively). Southeast Asian admixture showed a significant positive effect (β = 0.108, p = .041). The effect of Tibetan admixture was close to significance (β = 0.115, p = .054), suggesting a potential influence that could warrant further exploration. Other variables, such as Date, Northeast Chinese, Arctic, Latitude, Longitude, and Coverage, did not show significant effects, with p-values ranging from .133 to .744. The spline regression revealed no statistically significant effects for either Date or Latitude.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig9.png?pub-status=live)
Figure 9. Partial effect of Years Before Present (BP) on autism spectrum disorder (ASD).
Autism spectrum disorder (ASD). The regression results for ASD indicated multiple significant findings (Figure 9). Date had a significant negative effect (β = −0.126, p = .001), pointing to lower PGSs further back in time. Afanasievo-Yamnaya admixture had a significant positive impact (β = 0.192, p = .003). Jomon admixture exhibited a significant negative effect (β = −0.088, p = .021), while Arctic admixture also demonstrated a significant negative influence (β = −0.251, p = .001). Latitude had a significant positive effect (β = 0.170, p = .017), indicating the impact of geographic variation.
The other predictors, including Northeast Chinese, Tibetan, Siberian, SE Asian, Longitude and Coverage, did not show significant effects, with p values between .107 and .488.
Nonlinear trends with Years BP. The regression model results indicated significant nonlinear effects for the first and second spline components of Date (p < .001 and p < .041, respectively; Figure 9). In contrast, the spline components for Latitude were not significant.
Depression. The regression analysis for Depression showed several significant outcomes (Figure 9). Date had a significant positive effect (β = 0.080, p = .033), indicating higher PGSs in ancient times. Tibetan admixture had a significant negative impact (β = −0.158, p = .007), while Jomon admixture showed a significant positive effect (β = 0.080, p = .038). Arctic admixture also had a significant positive effect (β = 0.171, p = .023).
Other predictors, such as Northeast Chinese, Afanasievo-Yamnaya, Siberian, SE Asian, Latitude, Longitude, and Coverage, did not demonstrate significant effects, with p-values ranging from .068 to .922.
Nonlinear trends with Years BP. The regression model results showed a significant nonlinear effect for the first spline component of Date (p < .001; Figure 10), while the spline components for Latitude were not significant.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig10.png?pub-status=live)
Figure 10. Partial effect of Years Before Present (BP) on depression.
Anxiety. The regression results for Anxiety highlighted several significant effects (Figure 11). Date had a significant positive effect (β = 0.080, p = .032), which indicates a reduction in PGSs over time. Northeast Chinese admixture showed a significant negative impact (β = −0.099, p = .019). Arctic admixture was also positively significant (β = 0.167, p < .024), while Coverage had a significant negative effect (β = −0.098, p < .014).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig11.png?pub-status=live)
Figure 11. Schizophrenia, depression, autism spectrum disorder (ASD) and anxiety: Standardized beta coefficients.
The remaining predictors, including Afanasievo-Yamnaya, Tibetan, Jomon, Siberian, SE Asian, Latitude and Longitude, did not show significant effects, with p-values ranging from 0.055 to 0.474.
The spline regression revealed no statistically significant effects for either Date or Latitude.
Skin Color
Skin Colour-UKBB. The regression results for Skin Colour-UKBB indicated several significant effects (Figure 12). Years BP (Date) had a significant positive effect (β = 0.102, p < .000), suggesting higher PGSs for darker skin color in more ancient populations. Northeast Chinese ancestry also had a significant positive effect (β = 0.392, p < .001), indicating a substantial contribution to darker skin pigmentation. Tibetan ancestry showed a significant positive effect (β = 0.378, p < .001), as did the Jomon (β = 0.178, p < .001), Siberian (β = 0.343, p < .001), Arctic (β = 0.332, p < .001), and Southeast Asian (β = 0.223, p < .001) components, all indicating their contributions to darker skin pigmentation.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig12.png?pub-status=live)
Figure 12. Dark skin color and light skin color: Standardized beta coefficients.
Latitude showed a significant negative effect (β = −0.111, p < .041), suggesting that populations further from the equator tend to have lower PGSs for darker skin. In contrast, Afanasievo-Yamnaya ancestry, Longitude, and Coverage did not show significant effects, with p-values of .148, .818, and .704 respectively.
Nonlinear trends with Latitude. The regression model results indicated significant nonlinear effects for the first and second spline components of Latitude (p < .001 and p < .05, respectively), suggesting a complex, nonlinear influence of Latitude on the PGS that a linear model could not capture (Figure 13). In contrast, the spline components for Date were not statistically significant.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig13.png?pub-status=live)
Figure 13. Partial effect of latitude on Skin Color-UKBB.
Light Skin-EAS. The regression results for Light Skin-WAS indicated several significant effects. Years BP (Date) had a significant negative effect β = −0.175, p < .001), suggesting lighter skin pigmentation in more recent populations. Northeast Chinese ancestry had a significant positive effect (β = 0.173, p < .001), indicating a contribution to lighter skin pigmentation. Similarly, Tibetan ancestry showed a significant positive effect (β = 0.102, p < .046), as did the Jomon (β = 0.284, p < .001), Siberian (β = 0.437, p < .001), Arctic (β = 0.173, p < .008), and Southeast Asian (β = 0.253, p < .001) components, all pointing to their roles in lighter skin pigmentation.
Longitude did not demonstrate a significant effect (p < .176). While the effect of Latitude did not reach statistical significance (p < .084), it trended in the expected positive direction. Similarly, Coverage showed no significant effect (β = 0.038, p = .279).
Nonlinear trends with Latitude and Years BP. The regression model results indicated significant nonlinear effects for the second and third spline components of Latitude (p < .001 and p < .01 respectively), suggesting a complex, nonlinear influence of Latitude on the PGS that a linear model could not capture (Figure 14). The three spline components of Date were significant (p < .01, .01 and .001 respectively).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig14.png?pub-status=live)
Figure 14. Partial effect of latitude on Light Skin-EAS and partial effect of Years Before Present (BP) on Light Skin-EAS.
Correlation Between PGS at the Group Level
The mean PGS for the traits were computed for each historical culture group and the group-level correlations were computed (Figure 15). Cognitive-related PGS (EA and IQ) had significant negative correlations with Depression (r = −.66 and −.62 respectively). EA also had a strong negative correlation with Anxiety (r = −.83). In turn, Depression and Anxiety were positively correlated (r = 0.64). ASD had a negative correlation with Depression (r = −085) and a positive one with Schizophrenia (r = .61).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250128182440718-0141:S1832427424000495:S1832427424000495_fig15.png?pub-status=live)
Figure 15. Group-level polygenic score correlations.
Skin Colour-UKBB PGS had negative correlations with the Height PGS, and the correlation reached significance with DIR-Height (r = −.73; p < .01). IQ PGS had significant positive correlations with EAS-Height and MIX-Height.
Discussion
The temporal analysis of PGSs revealed significant patterns linking sample age (Years BP) to evolutionary pressures on various traits. EA, IQ, and ASD displayed negative correlations with sample age, indicating increases over time likely driven by positive directional selection. In contrast, PGSs of anxiety, depression, and (dark) skin color PGSs showed positive correlations with Years BP, consistent with decreases over time through negative directional selection. These results replicate findings from prior research on ancient West Eurasian genomes (Piffer & Kirkegaard, Reference Piffer and Kirkegaard2024a). The temporal trend of the schizophrenia PGS was in the predicted direction (negative) but it did not reach significance. Similarly, the strong positive directional selection for Height previously reported in West Eurasian samples was not confirmed. Positive selection was detected in only one of the three PGSs for Height (DIR-Height), while the effect of the second spline of Date on MIX-Height was negative, indicating negative selection on height. This suggests that the relatively shorter average stature of East Asians may be attributable to negative selection or the absence of positive selection on height-increasing alleles. A comparison between the ancient genomes and modern samples revealed that ancient samples had higher PGS for MIX-Height (Figure S12).
Importantly, these trends persisted after controlling for potential confounding variables such as ancestry, latitude, and longitude. In some cases, controlling for these variables strengthened the observed associations; for example, the effect size for EA increased significantly from .195 to .274 (Figure 2). This enhanced association suggests that the temporal trends for EA are particularly robust, implying positive direction selection for cognitive and noncognitive abilities related to education. Other correlations remained consistent in magnitude, underscoring the stability of these findings.
Anxiety and depression showed negative temporal trends, aligning with the Smoke Detector principle, which posits that natural selection favors hypersensitivity to potential threats as a survival mechanism. This hypersensitivity reduces the risk of missing real dangers (false negatives) by tolerating frequent false alarms (false positives), an adaptive strategy in ancestral environments where immediate physical threats were prevalent. However, in contemporary urban settings with fewer immediate dangers, this heightened sensitivity becomes maladaptive, manifesting as anxiety and depression (Nesse, Reference Nesse2001). These findings suggest that while these traits may have been beneficial in the past, they are subject to negative selection in modern contexts.
Latitude emerged as an important factor influencing PGS. Traits such as EA and ASD were positively associated with latitude (Figures 2 and 11), while (darker) skin color showed a negative correlation. These patterns align with broader evolutionary and environmental considerations. For instance, the relationship between latitude and skin color reflects the well-documented adaptation to UV exposure and the dietary shift associated with agriculture, which reduced vitamin D intake and reinforced the selection for lighter skin in higher latitudes (Deng & Xu, Reference Deng and Xu2017; Lucock, Reference Lucock2023). Interestingly, the partial effect of latitude on Skin Colour-UKBB revealed a nonlinear trend for the UKBB-derived PGS (indicating darker pigmentation), with scores decreasing until approximately 48° latitude before increasing in northern regions. This pattern corresponds to the darker pigmentation observed among Arctic populations, such as the Inuit, and likely reflects adaptations to the unique UV exposure and dietary conditions in those regions. The extended daylight and snow-reflected UV in the Arctic make darker skin advantageous for protection, while traditional diets rich in vitamin D reduce the evolutionary pressure for lighter skin. Moreover, although this PGS was based on a sample composed mostly of European ancestry individuals with a much smaller share of other ancestries, we were able to detect significant temporal and spatial (i.e., with latitude) associations. This is remarkable because unlike other polygenic traits, the genetic loci associated with skin pigmentation are known to differ among populations and to show substantial divergence between East Asians and Europeans (B. Kim et al., Reference Kim, Kim, Shin, Leem, Cho, Kim, Gu, Seo, You, Martin, Park, Kim, Jeong, Kang and Won2024).
The PGS for light skin pigmentation (Light Skin-EAS), derived from a GWAS conducted on individuals of East Asian ancestry (B. Kim et al., Reference Kim, Kim, Shin, Leem, Cho, Kim, Gu, Seo, You, Martin, Park, Kim, Jeong, Kang and Won2024), confirmed evidence of negative selection for darker skin (or positive selection for lighter skin) pigmentation. This conclusion is supported by the positive effect of Years BP on Light Skin-EAS (Figure 14).
B. Kim et al. (Reference Kim, Kim, Shin, Leem, Cho, Kim, Gu, Seo, You, Martin, Park, Kim, Jeong, Kang and Won2024) reported correlations between the PGS for skin luminance and geographic factors such as absolute latitude and mean annual solar radiation, with correlation coefficients around r = .5 for individual populations from the 1000 Genomes Project phase 3. We successfully replicated this correlation in our ancient sample. However, the positive effect of latitude was significant only in the spline regression, suggesting nonlinear relationships (Figure 14).
While the effect of latitude on MIX-height was in the predicted direction, it did not reach statistical significance in the linear regression analysis. However, using spline regression, the effect became significant for both MIX-Height (p < .01) and EAS-Height (p < .05). This nonlinear relationship, which peaks at around 50° latitude before decreasing, suggests a complex interplay between environmental factors and human physiology.
One potential explanation for this curvilinear pattern involves the trade-offs described by Bergmann’s and Allen’s rules (Allen, Reference Allen1877; Bergmann, Reference Bergmann1847). Bergmann’s rule states that individuals in colder climates tend to have larger body masses to conserve heat, while Allen’s rule suggests that shorter limbs are advantageous in extreme cold to minimize heat loss. Therefore, populations living near 50° latitude—where the climate is temperate—may have evolved to be taller with larger body masses, as larger size is beneficial for thermoregulation without the extreme pressures of Arctic conditions.
In contrast, populations in Arctic regions beyond 50° latitude might have developed a stockier build with shorter limbs to cope with severe cold, resulting in a decrease in overall height. This adaptation helps reduce surface area and conserve body heat, aligning with Allen’s rule. Thus, the observed peak in height at mid latitudes followed by a decrease in higher latitudes could reflect evolutionary adaptations to varying climatic pressures, balancing the need for heat conservation with other physiological demands.
Ancestral components further highlighted key associations between genetic ancestry and phenotypic traits. Afanasievo-Yamnaya ancestry showed a strong positive association with height, aligning with previous findings of selection for tall stature among Steppe populations (Mathieson et al., Reference Mathieson, Lazaridis, Rohland, Mallick, Patterson, Roodenberg, Harney, Stewardson, Fernandes, Novak, Sirak, Gamba, Jones, Llamas, Dryomov, Pickrell, Arsuaga, de Castro, Carbonell, Gerritsen and Reich2015). Conversely, Jomon ancestry was negatively associated with height, consistent with the shorter stature of modern Japanese populations compared to other Northeast Asians, likely reflecting admixture with the aboriginal Jomon (Figure 8). For cognitive traits, Northeast Chinese ancestry showed the strongest effects on EA and IQ, reinforcing the regional importance of this ancestry in shaping these phenotypes (Figures 2 and 5). Psychiatric traits also exhibited ancestry-specific effects; for instance, Afanasievo-Yamnaya ancestry increased susceptibility to schizophrenia and ASD, while Tibetan ancestry had a protective effect against depression. Arctic ancestry reduced susceptibility to ASD, illustrating the diversity of ancestry effects across phenotypes (Figure 11).
Regional and temporal interactions further nuanced the observed trends. Modern genomes exhibited significantly higher PGS for EA compared to ancient samples, although the magnitude of these increases varied by region (Figure 4). Notably, interaction effects suggested that positive directional selection on EA was particularly intense in Mongolia, Southern China, and Southeast Asia. These findings underscore the dynamic interplay between environmental, cultural, and genetic factors across different regions and time periods.
Correlations among PGS provided additional insights into potential shared selection pressures (Figure 15). Cognitive traits, such as IQ and EA, were negatively associated with psychiatric traits like depression and anxiety, while IQ showed positive correlations with height PGS. Anxiety and depression were positively correlated, but ASD exhibited a strong negative correlation with depression and a positive correlation with schizophrenia. These relationships suggest that selection pressures on cognitive and psychiatric traits may have operated in complex, sometimes opposing directions.
Temporal effects on PGS, modeled with splines, revealed nonlinear patterns that further contextualize these evolutionary trends. For example, EA and IQ exhibited modest growth between 12,000 and 6000 BP, followed by an accelerated increase peaking around 1500 BP. IQ, however, showed a distinct decline between 8000 and 6000 BP, a finding that may indicate a meaningful historical trend requiring further investigation. While ASD PGS demonstrated general increases across time periods, the Depression PGS exhibited a marked decline between 6000 and 2000 BP before sharply increasing. These temporal dynamics highlight the complex interplay of environmental, cultural, and genetic factors influencing the evolution of these traits.
Overall, the findings provide robust evidence for evolutionary changes in PGSs over time, shaped by both directional selection and regional interactions. The intricate relationships between ancestry, latitude, and time underscore the importance of considering nonlinear and context-dependent factors in understanding the genetic architecture of human traits. Future research should build on these results, exploring additional datasets and refining models to further elucidate the evolutionary forces shaping polygenic traits.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/thg.2024.49.