Integrative analysis of plastome, single-copy nuclear gene Pgk1 and SLAF-seq data uncovers multiple-origin and introgression history in polyploid Agropyron cristatum
Hao Yan (颜浩)a, Yihao Zhang (张一昊)a, Hailun Shi (石海伦)a, Xuande Xu (许宣德)a, Shuangbing Yu (余双炳)a, Lijun Yan (闫利军)b, Yan Zhao (赵彦)c, Dandan Wu (吴丹丹)a, Yue Zhang (张越)a, Yiran Cheng (程怡然)a, Yi Wang (王益)a, Houyang Kang (康厚扬)a, Xiao Ma (马啸)d, Haiqin Zhang (张海琴)d, Yonghong Zhou (周永红)a, Wenjie Chen (陈文杰)e,*, Lina Sha (沙莉娜)d,**, Xing Fan (凡星)a,***     
a. Triticeae Research Institute, Sichuan Agricultural University, Chengdu, Sichuan, China;
b. Sichuan Academy of Grassland Science, Chengdu, Sichuan, China;
c. Key Laboratory of Grassland Resources, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China;
d. College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China;
e. Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, Qinghai, China
Abstract: Elucidating the origins and mechanisms of polyploidization requires tracing the evolutionary history of polyploid species, particularly those with complex origins. Agropyron cristatum, traditionally regarded as an autopolyploid, exhibits characteristics indicative of a segmental allopolyploid. Here, we used phylogenetic analysis based on a low-copy nuclear gene (i.e., Pgk1), SLAF-seq, and plastome data from 20 diploid and 120 tetraploid Agropyron individuals to determine whether tetraploid A. cristatum arose from an allopolyploid or autopolyploid event. Phylogenetic analyses based on Pgk1 and SLAF-seq data identified two distinct A. cristatum lineages that corresponded to the two main Agropyron habitats in Central Asia–Europe and East Asia–Qinghai-Tibet Plateau. These findings, taken together with molecular dating and gene flow analyses, suggest that the East Asian tetraploid A. cristatum originated via both autopolyploidy from A. cristatum and hybridization between diploid A. cristatum and A. mongolicum, with each diploid cytotype acting as a maternal donor. Furthermore, the Central Asia–Europe tetraploid A. cristatum originated solely via autopolyploidy of diploid A. cristatum. Our findings also indicate that rapid diversification of Agropyron was likely driven by climate oscillations, geographic isolation, introgressive hybridization, and chloroplast capture. These findings challenge simplistic views of autopolyploids and underscore substantial potential for achieving high levels of genetic and adaptive diversity through recurrent hybridization and reticulate evolution.
Keywords: Polyploidy    Agropyron    SLAF-seq    Plastome    Diversification    Pgk1    
1. Introduction

Polyploidy is generally viewed as an important driver of plant evolution and diversification. Between 30% and 70% of angiosperms are polyploids or have undergone polyploid events (Stebbins, 1950; Grant, 1963; Masterson, 1994). Furthermore, recent studies have proposed that polyploidization has been a major driver of speciation in all flowering plants (Jiao et al., 2011; Soltis et al., 2018). Two types of polyploidization have been identified (i.e., autopolyploidy and allopolyploidy). Autopolyploidy results from whole genome duplication within a single species, whereas allopolyploidy results from hybridization between distinct species. The role of autopolyploids in the evolution of natural plant populations has traditionally been overlooked, as their morphology is often similar to their diploid progenitors, their genomes have, until recently, been too large to sample, and multivalent pairing can lead to deleterious effects and polysomic inheritance (Soltis et al., 2007; Spoelhof et al., 2017). However, understanding how autopolyploidy affects plant evolution may help disentangle the effects of whole genome duplication from hybridization.

One approach to clarifying patterns of evolution following autopolyploidy is to compare an autopolyploid lineage with its diploid progenitors (Kreiner et al., 2017; Spoelhof et al., 2017). This would require examining an autopolyploid lineage with multiple extant cytotypes. A good model system that meets these requirements is the crested wheat grass genus Agropyron. Agropyron consists of 3–10 species (Tzvelev, 1976; Dewey, 1986; Yen et al., 2013) and includes diploid (2n = 2x = 14), tetraploid (2n = 4x = 28), and hexaploid (2n = 6x = 42) species. In addition, all the species within Agropyron, regardless of ploidy level, contain the same basic genome (Dewey, 1984; Jensen et al., 2005). Cytogenetic studies have shown that Agropyron species form an autopolyploid series based on the only two diploid cytotypes in the genus, Agropyron cristatum and A. mongolicum (Dewey 1984; Löve 1982; Jensen et al., 2006; Yen et al., 2013).

Agropyron cristatum comprises a polyploid complex of diploid, tetraploid and hexaploid cytotypes that shows geographical and morphological variation. Specifically, A. cristatum tetraploids occur commonly in Europe, the Middle East, Central Asia, and East Asia, A. cristatum diploids are distributed sporadically from Europe to Mongolia, and A. cristatum hexaploids are rare but have been reported in Turkey and Iran (Dewey and Asay, 1982). A. cristatum spikes vary from broad, pubescent, pectinate spikes to narrow, linear, glabrous spikes (Dewey and Asay, 1982; Yen et al., 2013). Such extensive variation in spike morphology has been attributed to their putative progenitors, including the narrow-spiked (A. mongolicum) and the broad-spiked (diploid cytotype of A. cristatum) (Dewey, 1986; Hsiao et al., 1989). However, the ancestry of A. cristatum remains controversial.

Cytogenetical analysis of Agropyron cristatum has yielded conflicting interpretations of the origin of the tetraploid A. cristatum. Some evidence has indicated that the two genomes in A. cristatum exhibit signs of segmental alloploidy (Stebbins, 1947; Schulz-Schaeffer et al., 1963) and are distinguished from each other by structural rearrangements (Hsiao et al., 1989). Other studies have suggested that A. cristatum is an autopolyploid formed by genome duplication from its diploid cytotype (Dewey, 1961; Taylor and McCoy, 1973). Studies of genetic diversity show that significant genetic variation occurs among different individuals within A. cristatum populations (Mellish et al., 2002; Chen et al., 2013; Che et al., 2015), indicating large genetic differences between A. cristatum and their different putative progenitors (Said et al., 2018). Given that females play an important role of in establishment of polyploid richness (Kinoshita et al., 2008; Chen et al., 2020), determining the origin of the maternal donors should help identify the mode of polyploid events in particular groups (Sha et al., 2010; Chen et al., 2020).

Phylogenetic analysis is the ideal tool for identifying the mode of polyploid events and the maternal parents of polyploids (e.g., Sang, 2002; Gornicki et al., 2014; Ma et al., 2014; Brassac and Blattner, 2015; Sha et al., 2022). Low-copy genes, e.g., Pgk1, have been used to estimate phylogenies in grass species (Huang et al., 2002; Fan et al., 2013; Hu et al., 2013; Sha et al., 2017), specific-locus amplified fragment sequencing (SLAF-seq; Sun et al., 2013) has been used to generate reduced representation genomic data in the absence of a reference genome (Yang et al., 2018; Liang et al., 2022; Jia et al., 2023), and plastome data have been used to identify sources of maternal inheritance (Bock, 2007). Here, we aim to reach a greater understanding of the role of autopolyploidy and/or allopolyploidy in Agropyron cristatum evolution. For this purpose, we used phylogenetic analysis based on Pgk1, SLAF-seq, and plastome data for 120 tetraploids and 20 diploids from Agropyron to determine whether the tetraploid A. cristatum originated from autopolyploid or allopolyploid events. Specifically, we reconstructed a phylogenetic framework that identified diploid–tetraploid relationships in A. cristatum and then documented diversification of A. cristatum populations across their evolutionary history.

2. Materials and methods 2.1. Plant sampling, DNA extraction, Pgk1 cloning and plastome sequencing

In this study, we collected 140 Agropyron samples, encompassing nearly all natural distribution areas (Table S1). Seed materials with PI numbers were generously provided by the American National Plant Germplasm System (Pullman, Washington, USA), while other seed materials were collected by our team. Plants used for sequencing and voucher specimens have been deposited at the Herbarium of Triticeae Research Institute, Sichuan Agricultural University, China (SAUTI).

Total genomic DNA was extracted from fresh plant leaves using the CTAB (cetyl trimethyl ammonium bromide) method (Doyle, 1990). Genomic data from 67 Agropyron cristatum samples was used to newly sequence 122 Pgk1 sequences. Primers for Pgk1 and PCR conditions were adopted from Fan et al. (2013). The Pgk1 sequence for A. mongolicum was obtained from published data (Fan et al., 2013). Pgk1 genes were manually annotated using the A. mongolicum sequence as a reference (GenBank No. FJ711024).

We newly sequenced plastomes from five Agropyron mongolicum samples and sixty-eight A. cristatum samples (including seven diploid cytotypes). Plastomes were amplified in overlapping fragments via the long-range PCR method (Yang et al., 2014). PCR products were fragmented to construct short insert (approximately 500 bp) libraries following the NEBNext® protocol. Individual DNA samples were indexed with tags and pooled for sequencing on a single lane of an Illumina Hiseq 4000 PE150 platform. All plastome sequences were assembled de novo according to the procedure outlined by Yang et al. (2014). The plastome of A. cristatum (GenBank No. KY126307) served as the reference genome for annotating all newly sequenced accessions by using the CPGAVAS2 (Shi et al., 2019). All plastomes were subsequently manually proofread. The circular annotation map of the Agropyron plastome was generated using OGDRAW (Greiner et al., 2019). All sampling maps were created using QGIS (https://qgis.org/). Nine additional Agropyron plastomes were sourced from the published data of Chen et al. (2018, 2021).

2.2. Sequence alignment and analysis

Whole plastome and Pgk1 sequences were aligned using MAFFT v.7.505 (Katoh and Standley, 2013) with default settings. MEGA-X (Kumar et al., 2018) was employed to calculate sequence characteristics, including parsimony-informative sites, variable sites, length of sequences, and nucleotide content. To avoid accidental sequencing errors, parsimony-informative sites were treated as SNPs (single nucleotide polymorphisms) for subsequent analysis, and the VCF file was converted using the software SNP-sites (Page et al., 2016). The original Pgk1 sequences were processed using the filtering steps outlined by Fan et al. (2013), with additional manual checks to exclude singletons. To obtain Pgk1 allele genotype data, the Pgk1 SNP dataset underwent a more complex analysis process. Specifically, we first converted all sequences retained in the previous filtering program into SNP data, then manually merged different allele SNP sites to acquire genotype information for each sample. Plastome and Pgk1 Fst analyses were performed using vcfpop (Huang et al., 2022), and the Fst matrix was illustrated using the R package ‘ggplot2’. The R package ‘ggsignif’ was used to calculate significant differences in the length of sequences.

2.3. SLAF-seq sequencing, assembly and analysis

Genomic materials from 63 Agropyron cristatum and three A. mongolicum samples were used for SLAF-seq sequencing. SLAF-seq library was constructed by BioMarker (Beijing) following the protocol of Sun et al. (2013). Paired-end sequencing data with a read length of 150 base pairs were generated using the Illumina HiSeq 2500 platform. The raw dataset was filtered through fastp (Chen, 2023) to remove adapters and low-quality reads. Subsequently, the clean data underwent de novo assembly into loci, followed by SNP identification using ipyrad (Eaton and Overcast, 2020). A clustering threshold of 0.85 was applied, along with a minimum sample requirement of 40 per locus, ensuring that the output included 60% of the analyzed samples. To accommodate tetraploid datasets, the maximum number of alleles was set to 4, while all other parameters remained at their default settings. After ipyrad SNP calling, VCFtools (Danecek et al., 2011) was used to exclude SNPs with minor allele counts (MAC) below 3 and missing data in over 40% of samples. Further refinement was achieved using PLINK 1.9 (Purcell et al., 2007), which eliminated linkage disequilibrium (LD) sites. PCA analysis was also implemented using PLINK. Fst calculations and UPGMA cluster analysis were conducted using vcfpop. For comparative purposes with diploid species within the Triticeae, loci were split into individual sequences, and those connected by “NNNN” were separated into distinct loci. Thereafter, minimap2 (Li, 2021) was employed to perform alignment between the consensus sequences of each split locus and the Triticeae diploid genome (see Table S1 for details). Successfully mapped loci were filtered by removing those with MAPping Quality (MAPQ) < 20. For duplicated mappings, only one target with the highest MAPQ value was retained. Subsequently, Triticeae sequences were extracted using bedtools (Quinlan and Hall, 2010) and incorporated into the original Agropyron loci files, prompting a renewed multiple sequence alignment facilitated by MAFFT. Based on the genome of Pseudoroegneria libanotica (which acquired most successful mapped loci counts), loci that successfully mapped were extracted, and the positional information of SNPs on these loci was converted to the P. libanotica genome coordinates. Then, R package ‘Cmplot’ (Yin et al., 2021) was used to generate the SNP density plot.

2.4. Phylogenetic analysis 2.4.1. Plastome and Pgk1 phylogenetic analysis

For plastome and Pgk1 data, phylogenetic analyses were carried out using maximum likelihood (ML) and Bayesian inference (BI). The DNA evolutionary model was determined by ModelFinder (Kalyaanamoorthy et al., 2017) with the Bayesian Information Criterion (BIC) implemented in IQ-TREE (Minh et al., 2020). The optimal model for plastome was GTR + I + G for both ML and BI. For Pgk1 sequences, the optimal model was GTR + G for ML and HKY + G for BI. Prior to the phylogenetic analysis of plastomes, we compared the topology and support values for complete plastome, coding sequences (CDS) alone, and noncoding regions. However, the CDS and noncoding regions were excluded due to their low statistical support values. Ultrafast bootstrap (Hoang et al., 2018) was employed to identify the best-scoring ML tree by running 10,000 replicates. The BI analysis was conducted using MrBayes v.3.2.7 (Ronquist et al., 2012). Four Markov chain Monte Carlo (MCMC) chains (one cold and three heated), applying the default heating values (t = 0.2) in MrBayes, were run for 1,000,000 generations, with sampling every 1000 generations. Convergence was assessed by ensuring the average standard deviation of split frequencies (ASDSF) was less than 0.01, and additional generations were appended if necessary. The first 25% of the total trees were discarded as burn-in. Node support was estimated by posterior probabilities. For the Pgk1 dataset, a Svdquartets analysis was also performed in PAUP 4 (Swofford, 2003), allowing us to specify the origin of the sequences and conduct quartet coalescence analysis. This approach enabled the obtention of a genotype-based phylogenetic topology rather than a sequence-based phylogenetic tree.

We used the R package ‘phytools’ (Revell, 2024) to compare the phylogenetic trees of Pgk1 and plastome. To ensure sample consistency, we utilized R package ‘ape’ (Paradis and Schliep, 2019) to extract samples common to both datasets for analysis. Additionally, to enhance visual appeal, the tree topology was automatically rotated.

2.4.2. SLAF-seq phylogenetic analysis

The SLAF-seq dataset was partitioned into two distinct subsets: the first comprised the original ipyrad-generated loci exclusive to Agropyron sequences, while the second constituted a mapping dataset aligned with diploid Triticeae and outgroup taxa genomes. PhyloSuite (Zhang et al., 2020, Zhang et al., 2020) facilitated the separate concatenation of loci from each subset. Given the prohibitive escalation in computational time with data volume expansion, our phylogenetic inference strategy relied solely on IQ-TREE for ML tree construction, employing ultrafast bootstrapping across 1000 iterations to gauge statistical confidence.

IQ-TREE further served to generate individual gene trees, which were integral to the coalescent analysis. The resultant coalescent tree was then inferred using weighted ASTRAL (wASTRAL) (Zhang and Mirarab, 2022) under default settings. Building upon the outcomes of wASTRAL and fineRADstructure (refer to sections 3.4 Network and phylogenetic analyses of SLAF-seq data, 3.6 Population structure), Agropyron was stratified into eight distinct groups. Leveraging this classification, we proceeded to reconstruct the species tree in BPP (Flouri et al., 2018), adopting the A01 algorithm. Subsequently, species delimitation within Agropyron was executed via the A10 algorithm, guided by the user specific tree inferred through the A01 framework. Both A01 and A10 algorithms implement the default JC69 mutation model for 100,000 generations, sampling every 2 generations, and specified a burn-in of first 8000 generations. The priors for theta and tau were set to inverse-gamma (3, 0.02) and inverse-gamma (3, 0.03) to fit the estimate mutation rate of 2.3e-9, effective population size of (Ne) of 1e6 and root age of approximately 3.3 Million years ago (Mya) (see MCMCTree and Stairway Plot result in section 3.10). Each BPP analysis was run independently in two replicates to compare results for consistency and assess convergence.

2.5. Network analysis

To analyze genetic diversity of Agropyron populations, haplotype networks based on plastome and Pgk1 sequence data were constructed using PopART v.1.7 (Leigh and Bryant, 2015) with the Median-Joining (MJ) method (Bandelt et al., 1999). SNP data served as input for network analysis. However, given the substantial number of SNP sites, the PopART proved unsuitable for network analysis of SLAF-seq data. Consequently, we adopted SplitsTree APP (Huson and Bryant, 2024) to infer network structures utilizing the Neighbor-Net algorithm.

2.6. Population genetic structure analysis

We analyzed plastome, Pgk1, and SLAF-seq data in ADMIXTURE to characterize the structure of Agropyron populations (Alexander et al., 2009). The optimal number of genetic clusters (K) was determined based on the cross-validation error (CV error) obtained with the “--cv” flag in ADMIXTURE, varying K from 1 to 10. Population structure visualization was achieved using the R package ‘ggplot2’. In the case of SLAF-seq data, we additionally employed fineRADstructure (Malinsky et al., 2018a) for population structure inference. The fineRADstructure is specialized software for analyzing population genetic structures in RAD (Restriction-site Associated DNA) datasets. Beyond generating heatmaps of coancestry, it also reconstructs topological relationships among individuals based on shared coancestry site information.

2.7. Migration history estimates

The potential migration history of Agropyron was inferred using TreeMix v.1.13 (Pickrell and Pritchard, 2012). The program PLINK was utilised to convert the VCF file into a binary SNP file for TreeMix analysis. TreeMix was run 50 times for migration edges from 1 to 6 using the flags ‘-noss -global -tf -bootstrap’. The Evanno method implemented in the R package ‘OptM’ (Fitak, 2021) was utilized to determine the optimal number of migration events. Finally, based on the best-identified migration edges, TreeMix was independently executed 100 times with the flags ‘-noss -global -tf -se’, and the most consistent result was adopted as the final outcome.

For genomic SLAF-seq data, an additional flag ‘-k 1000’ was incorporated into the analysis to mitigate the potential influence of LD. Dsuite (Malinsky et al., 2021) was also used to infer potential events of gene flow. The statistical analysis of Patterson’s D and f4-ratio was conducted utilising the subprogram Dtrios in Dsuite. The Fbranch program in Dsuite has the capacity to reflect excess allele sharing with each valid population by outputting a matrix with fb statistic value. This capability enables the disentangling of correlated f4-ratio results and the assignment of gene flow evidence to specific, possibly internal, branches on a phylogeny (Malinsky et al., 2018b). The fb result was plotted by means of the “dtools.py” script, which was contributed by Malinsky et al. (2021). The tree file set to TreeMix and Dsuite was inferred by SVDquartets (Pgk1) and BPP (SLAF-seq). The grouping of samples followed the structures of population genetics and phylogenetic analysis, and was further subdivided based on ploidy levels to explore potential introgression events associated with different ploidy levels.

Genetic admixture is typically ascribed to two primary factors: gene flow and/or incomplete lineage sorting (ILS). The QuIBL (Edelman et al., 2019) program was used to determine whether genetic admixture in Agropyron was best explained by the ILS model or the ILS + introgression model. QuIBL uses evolutionary branch length information to assess the discordance between gene trees and the species tree in triplets, estimating the introgression proportion under the introgression model based on the ratio of conflicting loci. It is important to note that this analysis requires multiple gene trees for comparison with a hypothetical species tree. Hence, plastome and Pgk1 data were excluded from consideration.

We applied QuIBL to test whether previously inferred admixture signals (from TreeMix, ADMIXTURE, and Dsuite) involving the pairs MIX-EA, G1-G2, and G4–G6 were caused primarily by introgression or ILS. Three phylogenetic datasets with the topologies (((CA, MIX), EA), O), (((G1, AM), G2), O) and (((G4, G5), G6), O) were constructed for this purpose. For each dataset, individuals were randomly sampled within each predefined group. Any individual with missing data was replaced by another from the same group to maximize the number of loci. Ten independent analyses per dataset were performed to mitigate sampling instability. The optimal model for gene tree conflicts was selected using the delta BIC value (ΔBIC), with a threshold of ΔBIC < −10. The introgression proportion within the introgression model was inferred by calculating the ratio of introgression loci to the total number of triplet loci.

2.8. Divergence time, demographic history and ancestral area estimates

Divergence time estimation was performed using SLAF-seq and plastome data in MCMCTree (Yang, 2007). To enhance computational efficiency during the Markov chain Monte Carlo (MCMC) analysis, an approximate likelihood approach was implemented. For the Agropyron population, we employed a two-step divergence time inference approach, as commonly applied in plant phylogenetics (e.g., Marcussen et al., 2014; Zhang et al., 2020, Zhang et al., 2020). First, the crown age of the Agropyron genus was estimated using conserved, non-recombining, and monophyletic plastome data. This crown age was then used to calibrate the inference of intra-population divergence times based on nuclear data.

For plastome data analysis, secondary calibration points from Zhang et al. (2022) were applied: specifically, 39.9–40.4 Mya for the most recent common ancestor (MRCA) of Brachypodium and Triticeae, and 15.9–16.3 Mya for the MRCA of Triticeae. The HKY85 substitution model (model = 4) and independent-rates log-normal relaxed-clock (ILN) model were employed. Due to the lack of prior information on substitution rates for plastomes in our study taxa, an approximate mean rate was estimated using a strict molecular clock in BASEML. Based on this preliminary result (base substitution rate = 5.7e-10), the prior rgene_gamma was set to (10, 87.71). For shallow divergence nodes, sigma2_gamma was set to (10, 200), and the birth-death sampling process was defined as Bdparas = (1, 1, 0.1, m). MCMCTree analysis of SLAF-seq data also utilized the HKY85 substitution model. However, for closely related taxa (for only one genus), a strict molecular clock was deemed more appropriate than a relaxed clock (dos Reis et al., 2016). Consequently, the analysis was calibrated using the inferred Agropyron crown age, with rgene_gamma set to (10, 43.48), thereby rendering sigma2_gamma specification unnecessary. All MCMCTree analyses comprised two independent MCMC chains per run. Each chain ran for 2,000,000 generations, with the initial 200,000 generations discarded as burn-in. Convergence was evaluated using Tracer v.1.7.1 (Rambaut et al., 2018), confirming effective sample sizes (ESS) > 200.

To investigate the potential influence of climatic fluctuations on Agropyron’s divergence, we depicted relative temperature variations over the past 5 million years, drawing upon paleoclimate reconstruction data reported by Rohling et al. (2021).

Effective population size (Ne) trajectories over time were estimated utilizing Stairway Plot 2 (Liu and Fu, 2020), with the mutation rate per year set at 2.3e-9, as inferred from prior MCMCTree analyses. The site frequency spectrum (SFS) file required for this analysis was generated from SLAF-seq SNP data through the implementation of easySFS (https://github.com/isaacovercast/easySFS). Median Ne values across distinct lineages were aggregated and visually represented employing R package ‘ggplot2’.

To reconstruct the biogeographical history of Agropyron, we first estimated the ancestral ranges of Agropyron populations in RASP 4 (Yu et al., 2020). The AICc_wt was utilized to identify the most suitable model in BioGeoBEARS (Matzke, 2014) and infer the ancestral ranges throughout the phylogeny. Based on Liu’s floristic map of the world (Liu et al., 2023), the habitat of A. cristatum can be divided into three main regions: Europe (Europe–Turkey), Asia (Central Asia-East Asia), and Iran–Pakistan. Given that A. cristatum is differentiated into Central Asia and East Asia lineages (see sections 3.4–3.6) and the unique plateau climate of the Qinghai-Tibet Plateau, five geographical groupings were included in the RASP analysis: A:, Central Asia [excluding Xinjiang]; B, Iran–Pakistan; C, Europe–Turkey; D, East Asia [including Xinjiang]; E, Qinghai-Tibet Plateau. The classification of Xinjiang within East Asia merits clarification. This designation was established based on both the phylogenetic clustering of Xinjiang A. cristatum samples with East Asian specimens and the geographical isolation imposed by mountain barriers (e.g., Tianshan, Altai, and Karakoram Mountains) along Xinjiang’s western border (Fig. S1).

2.9. Morphological variations analysis

To elucidate morphological variation across Agropyron lineages, we analysed 16 spike traits from 35 Agropyron samples. Spike traits are stable during flowering, easy to measure in the field, and have been reliably used in Triticeae taxonomy, particularly for Agropyron delineation (Yen et al., 2013). For each sample, five healthy spikes were measured during the flowering stage. PCA was performed using the R package ‘FactoMineR’. The top six quantitative traits with the highest contribution values in the PCA underwent Kruskal–Wallis analysis of variance (ANOVA) and Dunn post-hoc pairwise tests using the R package ‘ggstatplot’ (Patil, 2021). Lineage grouping was performed based on the results of the previous SLAF-seq analysis.

3. Results 3.1. Characteristics of plastomes, Pgk1 gene and SLAF-seq assembly

A total of 84 plastomes and 123 Pgk1 sequences were analyzed (Table S1). The lengths of Pgk1 sequences range from 1341 (C19) to 1493 (C26) base pairs (bp), with an average GC content of 42.2%. Pgk1 genes comprise 5 exons and 4 introns. The genome structure and gene content of all sequenced plastomes of Agropyron are similar to published plastomes of Triticeae (Bernhardt et al., 2017; Chen et al., 2020; Gornicki et al., 2014). The plastome sizes range from 135,167 (C45) to 135,554 (C51) bp. The GC content of all Agropyron sequences is approximately 38.3%. All plastomes are divided into a large single-copy (LSC) region and a small single-copy (SSC) region by two inverted repeat (IR) regions, comprising a total of 109 genes (including 76 protein-coding genes, 29 tRNA genes, and 4 rRNA genes) (Fig. S2). The lengths of CDS genes range from 87 (petN) to 4409 (rpoC2) bp. The proportion of variable sites (variable sites/total sites, V/T) varies from 0 to 1.10% (rps16), and the parsimony-informative sites (parsimony-informative sites/total sites, Pi/T) range from 0 to 1.08% (petL) (Table S2).

In the ipyrad pipeline, we identified 3120 loci across 66 individuals. After applying missing data filters and LD pruning, 16,779 SNPs were retained from a total of 81,536 sites. To mitigate assembly gaps (“NNNN”), the initial set of 3120 loci was split into 6232 loci. Subsequently, alignments were performed using minimap2 with diploid genomes in Triticeae and outgroups, resulting in the successful mapping of 3563 target loci. The number of successfully mapped loci varied from 155 (in the outgroup Brachypodium distachyon) to 1577 (in the Triticeae species Pseudoroegneria libanotica). The SNP density map, generated using the reference genome of P. libanotica, showed that SNPs identified via SLAF-seq were uniformly distributed across all seven chromosomes, with no significant bias (Fig. S3).

3.2. Network and phylogenetic analyses of plastomes

The MJ network of haplotypes derived from the plastome SNP dataset identified eleven shared haplotypes (Fig. 1A). Four distinct clusters of haplotypes were recognized (plastome-cluster 1–4). Plastome-cluster 1 encompasses all East Asian samples, four accessions from the Qinghai-Tibet Plateau, and three from Central Asia. Plastome-cluster 2 includes nearly all Central Asian and Middle Eastern samples, one Mediterranean accession, and three European accessions. Plastome-cluster 3 primarily comprises Mediterranean samples, with minor contributions from Central Asia, the Qinghai-Tibet Plateau, and Europe. Plastome-cluster 4 consists solely of samples from the Qinghai-Tibet Plateau, forming a distinct cluster separated, most likely, by numerous mutational steps.

Fig. 1 Phylogenetic and network analyses of plastome relationships in Agropyron cristatum. (A) Plastome network analysis. Black lines represent mutational steps between haplotypes; black dots indicate inferred ancestral nodes; circle size is proportional to haplotype frequency; colors represent sample origins. Distinct clusters are delineated with black boxes. Shared haplotypes are labeled H1 to H11. Dashed-line boxes indicate subclusters within cluster 1. (B) Phylogenetic tree inferred using maximum likelihood and Bayesian inference. Branch support values (posterior probability/ultrafast bootstrap; PP/UB) are shown above nodes. The four major clades are color-coded. Diploid accessions are marked with an asterisk (★). Colored dots preceding sample names denote geographic origins. (C) Geographic distribution of samples. Colors and groups correspond to the clade color scheme in panel (B).

Both ML and BI generated phylogenetic trees with nearly identical topologies (Fig. 1B). Here, we limit our discussion to the ML tree. In line with the network analysis results, the phylogenetic analysis grouped all samples into four clades with consistent statistical support (UB > 90%; PP > 0.90), which were recognized as plastome-clade 1 to 4. Plastome-clade 1 contains two subclades (with UB = 75% and PP = 1.00). One subclade (plastome-subclade 1) contains all sampled Agropyron mongolicum and five tetraploid A. cristatum from East Asia, however, this subclade also includes one diploid A. cristatum from Kazakhstan, three tetraploid A. cristatum from the Qinghai-Tibet Plateau, and five tetraploid A. cristatum from Xinjiang. The other subclade (plastome-subclade 2) contains one diploid and five tetraploid A. cristatum from East Asia, six tetraploid A. cristatum from the Qinghai-Tibet Plateau, and three tetraploid A. cristatum from Xinjiang (China). Plastome-clade 2 is composed of five diploid A. cristatum, which are joined by the majority of tetraploids from Central Asia and the Middle East with high statistical support (UB = 99% and PP = 1.00). Specifically, one diploid A. cristatum from the Middle East was clustered with four sympatric tetraploid A. cristatum with consistent statistical support; two diploid A. cristatum from Central Asia were scattered with their sympatric tetraploids; and two diploid A. cristatum and one tetraploid A. cristatum from Europe were also scattered together. Plastome-clade 3 comprises a mixture of samples from Central Asia, the Qinghai-Tibet Plateau, Europe, and the Mediterranean region. In this clade, three diploid A. cristatum from different regions were clustered together with a tetraploid A. cristatum with high statistical support (UB = 100% and PP = 1.00). Plastome-clade 4 consists of one diploid and four tetraploid samples from the Qinghai-Tibet Plateau (UB = 100% and PP = 1.00).

3.3. Network and phylogenetic analyses of Pgk1 gene sequences

In contrast to the plastome network results, the Pgk1 network exhibits only three clusters (Fig. 2A). Pgk1-cluster 1 showed similarities to plastome-cluster 1, encompassing material from East Asia, the Qinghai-Tibet Plateau, and Central Asia. Notably, all Central Asian samples in this cluster originate from Xinjiang (China), with the exception of a single tetraploid from Novosibirsk (Russia). All Pgk1 sequences of A. mongolicum were clustered in the upper half of Pgk1-cluster 1 (Pgk1-subcluster 1), which includes three shared haplotypes. The remaining Pgk1-cluster 1 samples included two tetraploids from Xinjiang, one diploid, and two tetraploids from Qinghai. As with plastome-cluster 2, the majority of materials in Pgk1-cluster 2 were from Central Asia, with the remaining four samples consisting of two Turkish tetraploids, one East Asian tetraploid, and one Middle Eastern sample. The geographic origin of materials in Pgk1-cluster 3 was diverse, spanning the Middle East, Central Asia, East Asia, the Mediterranean region, and the Qinghai-Tibet Plateau.

Fig. 2 Phylogenetic and network analyses of Pgk1 relationships in Agropyron cristatum. (A) Haplotype network of Pgk1 sequences. Black lines represent mutational steps between haplotypes; black dots indicate inferred ancestral nodes; circle size is proportional to haplotype frequency; colors represent sample origins. Distinct clusters are delineated with black boxes. Shared haplotypes are labeled H1 to H6. Dashed-line boxes indicate subclusters within cluster 1. (B) Phylogenetic tree inferred using maximum likelihood and Bayesian inference. Branch support values (posterior probability/ultrafast bootstrap; PP/UB) are shown above nodes. The four major clades are color-coded. Diploid accessions are marked with an asterisk (★). Colored dots preceding sample names denote geographic origins. Numbers in brackets indicate distinct Pgk1 alleles within individual samples. (C) Geographic distribution of samples. Colors and groups correspond to the clade color scheme in panel (B).

Due to the high conservation of single-copy genes, we relaxed the statistical support criteria for phylogenetic analyses to UB > 80% and PP > 0.85. The phylogenetic analysis based on Pgk1 were closely aligned with the network analysis (Fig. 2B). Pgk1-clades 1–2 correspond to network Pgk1-clusters 1–2. Pgk1-clade 1 (UB = 99% and PP = 1.00) encompasses most East Asian, Qinghai-Tibet Plateau, and all Xinjiang samples. Diploid allele 2 (C58) clustered with tetraploid allele 1 (C78) with strong support (BS = 84% and PP = 0.88), whereas diploid allele 1 (C54) grouped with allele 2 (C75) and allele 2 (C33) (UB = 99% and PP = 1.00). Similar to network Pgk1-cluster 2, Pgk1-clade 2 (UB = 89% and PP = 1.00) predominantly includes Central Asian materials. Diploid C51 formed a tight cluster with C38, C15, C12, C11, and C23 (UB = 89% and PP = 0.93). Diploid allele 1 (C60) was robustly aggregated with C11 and C19 into a single branch (UB = 99% and PP = 1.00). Pgk1-clades 3–6 occupy basal positions in the phylogenetic tree, matching Pgk1-cluster 3 in the network analysis. Pgk1-clade 3 (UB = 96% and PP = 0.99) consists entirely of tetraploid accessions, except for C43 from the Qinghai-Tibet Plateau, which originates from Central Asia. Pgk1-clade 4 includes only one Central Asian tetraploid (C27). Pgk1-clade 5 (UB = 93% and PP = 0.98) represents a mixed origin, comprising samples from the Qinghai-Tibet Plateau, Middle East, Mediterranean, and Central Asia, along with three diploids from the Mediterranean and Central Asia. Pgk1-clade 6 (UB = 82% and PP = 0.95) is composed of two Central Asian tetraploids.

The genotype-based (GT) phylogenetic tree, constructed using Pgk1 allele genotypes (Svdquatets tree), categorized all samples into two distinct clades. Nonetheless, the integration of degenerate bases and the consolidation of alleles, which diminished the number of variable sites, led to a decrease in the statistical support for the GT phylogenetic tree. Our main objective was to elucidate the phylogenetic relationship between samples, rather than differentiating among various Pgk1 allele types. Despite the absence of robust statistical backing, the tree’s topology offers significant clues regarding the genealogical ties among the samples. GT-clade 1 was subdivided into two subclades. GT-subclade 1 is uniquely occupied by A. mongolicum, with no other diploid specimen present. This subclade also encompasses tetraploid materials originating from East Asia and the Qinghai-Tibet Plateau. Another subgroup (GT-subclade 2) emerged from the clustering of two diploid A. cristatum individuals (one each from Mongolia and Qinghai-Tibet Plateau) with tetraploid samples sourced from the Qinghai-Tibet Plateau, East Asia, and Xinjiang. GT-clade 2, also, split into two subclades—GT-subclade 3 and GT-subclade 4. GT-subclade 3 predominantly consists of Central Asian materials, featuring five diploid accessions from this region. Conversely, GT-subclade 4 is a compact cluster containing merely four accessions, three of which are diploid, along with the solitary Central Asian tetraploid sample, C41, which grouped together with them (Fig. S4).

3.4. Network and phylogenetic analyses of SLAF-seq data

SplitsTree analysis partitioned most samples into two distinct groups: one cluster predominantly consisted of samples from East Asia, the Qinghai-Tibet Plateau, and Xinjiang (left panel, Fig. S5), and another cluster mainly included samples from Central Asia and the Middle East (right panel, Fig. S5). Five samples occupied intermediate positions between the two clusters (central panel, Fig. S5). Similarly, the supermatrix ML tree (Fig. S6) grouped the samples into three clusters: East Asia lineage (EA lineage), Central Asia lineage (CA lineage), and four samples with unresolved phylogenetic relationships (designated as MIX population). These results were consistent with the SplitsTree analysis. The UPGMA clustering based on genetic distance also divided all samples into two groups, thereby supporting the classification of the two lineages (Fig. S7).

The phylogenetic topology of Agropyron was more effectively resolved using wASTRAL (Fig. 3B). Samples were again primarily divided into two lineages: the EA lineage and the CA lineage. Six intermediate samples remained poorly resolved phylogenetically. Additionally, wASTRAL recovered population subdivisions within these lineages, a finding strongly supported by subsequent fineRADstructure analysis (see section 3.6). All samples except C124 were classified into eight populations. SNP-based PCA analysis corroborated this classification: PC1 and PC2 separated the two lineages, with further subdivisions observed in the EA lineage, while PC3 resolved additional structure within the CA lineage (Fig. 3A).

Fig. 3 Genealogical relationships revealed by SLAF-seq analysis through PCA and phylogenetic tree reconstruction. (A) Principal component analysis (PCA) of SLAF-seq SNP data. Materials from divergent lineages and admixed populations (MIX) are delineated with dashed boundaries, with distinct color codes assigned to each population. The upper panel (PC1 vs. PC2) demonstrates inter-lineage divergence and intra-group differentiation within the EA lineage, while the lower panel (PC1 vs. PC3) elucidates sub-lineage variation in the CA lineage. (B) Unrooted phylogenetic tree constructed using wASTRAL. Branches are color-coded according to the eight defined populations. Asterisks (★) preceding sample names denote diploid accessions. (C) Geographic distribution of SLAF-seq sampling localities color-coded by lineage affiliation.

The phylogenetic analysis based on 3563 mapped loci supported the division of Agropyron into two major lineages. However, samples from the MIX population were classified into both lineages (Fig. S8). Despite this, all MIX materials clustered at the basal positions of the clades, indicating poor resolution of its phylogenetic relationships within the MIX population.

3.5. Phylogenetic cytonuclear discordance based on Pgk1 genotype and plastomes

Extensive phylogenetic tree conflicts among clades were observed between the Pgk1 genotype dataset and the plastome dataset (Fig. 4). Five Qinghai-Tibet Plateau samples were situated in the EA lineage according to the Pgk1 genotype dataset. However, they clustered into a specific clade (plastome-clade 4) in the plastome phylogenetic tree. There appears to be no evidence indicating the existence of this specific clade in the Pgk1 genotype phylogenetic results. This particular clade in the plastome tree may represent an unknown plastome lineage (UN lineage). The CA lineage and the EA lineage were named based on the primary origin of the materials within them, and it was also observed in SLAF-seq phylogenetic tree. These two lineages were geographically separated by the Altai Mountains, Karakoram Mountains, Tianshan Mountains, Himalaya Mountains, and Pamir Plateau (Fig. S1). Four samples (C18, C37, C43 and C53) exhibited phylogenetic conflicts associated with the two lineages, possibly suggesting limited geographic isolation. Although plastome-clade 3 did not cluster with plastome-clade 2, the Pgk1 and SLAF-seq data strongly supported the grouping of the two lineages, corresponding to GT-clade 2 in the Pgk1 genotype dataset. Phylogenetic conflicts among C81, C50, C38, and C40 revealed historical gene flow within the CA lineage. Additionally, conflicts between two subclades in the EA lineage further highlighted introgression events associated with two distinct parental origins.

Fig. 4 Cytonuclear phylogenetic discordance and population structure analysis. Left: plastome phylogeny versus right: nuclear Pgk1 genotype phylogeny, with corresponding samples connected by color lines. Inter-lineage phylogenetic conflicts are denoted by solid purple lines, whereas intra-lineage topological discordance within subclades is indicated by dashed purple lines. Other samples were linked by green lines. Vertical bar plots adjacent to each phylogeny display population structure inferred through ADMIXTURE analysis, with optimal cluster numbers determined by cross-validation (K = 4 for plastome data, K = 3 for nuclear Pgk1 data).
3.6. Population structure

For plastomes, within the range of K values from 4 to 6, CV error decreases. As the K value increases, there is no longer a significant improvement in the CV error (Fig. S9A). Genetic structure analysis with ADMIXTURE partitioned plastomes into four distinct populations when K = 4, closely mirroring the primary branches of the plastome phylogenetic tree (Fig. 4 and Fig. S10). Notably, nine samples exhibited genetic admixture, with six instances in plastome-clade 1 and three in plastome-clade 3. When K was increased to 5, plastome-clade 1 was further subdivided into two genetic groups, each aligning with plastome-subclade 1 and plastome-subclade 2 in the phylogeny.

In the Pgk1 genotype dataset, the structure analysis revealed three groups that aligned with the phylogenetic structure at K = 3 (Fig. 4 and Fig. S4). Sixteen samples displayed genetic admixture between the CA lineage and EA lineage. When K was increased to 5, the analysis additionally uncovered a genetic component corresponding to the GT-subclade 1. Furthermore, among these samples, six individuals were identified as having an admixture between A. mongolicum and A. cristatum within the EA lineage, suggesting a significant history of hybridization events involving both lineages. Although the CV error was lowest at K = 7 (Fig. S9B), the results were inconsistent with the phylogenetic analysis, so it was excluded.

In the SLAF-seq dataset, ADMIXTURE analysis at K = 2 showed the lowest CV error (Fig. S9C), and divided all samples into two distinct lineages. Moreover, the MIX population with mixed components from both sides was observed. When K increased to 3, Agropyron mongolicum was distinguished, and a population (G1) with mixed components of A. mongolicum and East Asian A. cristatum was identified (Fig. 5A). The fineRADstructure results revealed similar lineage differentiation, however, they indicated more population subdivisions within the lineages. The EA lineage was divided into AM, G1, and G2, while the CA lineage was split into G3–G6. Topological clustering based on shared sites showed that the MIX population belonged to the CA lineage, and its comparable number of shared sites in both lineages suggested historical admixture between the two lineages (Fig. 5B). The inference of the BPP species tree from these eight populations corroborates the classification of the MIX group within the CA lineage, while concurrently supporting the delineation of these eight groups into discrete categories (PP = 1.00) (Fig. S11).

Fig. 5 Genomic population stratification revealed by SLAF-seq analysis. (A) Population structure inferred by ADMIXTURE, with K = 2 at the top and K = 3 at the bottom. (B) Clustered fineRADstructure coancestry matrix. Gradient colors quantify pairwise coalescent unit distances, while the dendrogram topology (upper margin) reflects shared ancestral sites among populations.
3.7. Genetic divergence analysis

To investigate the genetic diversity among lineages and populations, we conducted Fst analysis. The groupings for the Fst analyses were based on cytological ploidy levels and the results of the subdivisions deduced from population structure and phylogenetic analysis.

Fst analysis of plastome data revealed the main level of genetic divergence among different lineages, with the highest genetic divergence values (0.52–0.81) observed between the UN lineage and the other two lineages. For both diploid (Fst = 0.18) and tetraploid (Fst = 0.33) Agropyron cristatum, comparisons between the EA lineage and the CA lineage showed moderate genetic differentiation. Smaller genetic divergence (Fst = 0.03–0.16) between different ploidy groups within the same lineage indicated a close relationship between diploid and tetraploid individuals. The minimal divergence (Fst = 0.03) between A. mongolicum and tetraploid A. cristatum in the EA lineage further supported their close relationship (Fig. S12A).

In the Pgk1 dataset, diploid and tetraploid Agropyron cristatum within the same lineage exhibited negligible differentiation (Fst = 0.01–0.04). Both diploid (Fst = 0.38) and tetraploid (Fst = 0.42) individuals from different lineages showed moderate genetic differentiation. Consistent with the plastome Fst results, minimal genetic divergence was observed between A. mongolicum and the EA lineage, highlighting their close relationship. However, moderate genetic divergence (Fst = 0.18) between diploid A. cristatum in the EA lineage and A. mongolicum suggested that A. mongolicum is more closely related to tetraploid A. cristatum in the EA lineage (Fst = 0.03) (Fig. S12B).

In the SLAF-seq data, a pattern analogous to that observed in the plastome and Pgk1 datasets was uncovered. Within each lineage, genetic differentiation between diploid and tetraploid individuals was relatively low (Fst = 0.03–0.08), whereas differentiation across lineages was markedly higher (Fst = 0.29 for diploids; Fst = 0.36 for tetraploids). Remarkably, Agropyron mongolicum displayed the lowest genetic divergence from the tetraploid EA lineage but the greatest divergence from the diploid EA lineage. Furthermore, the MIX population was incorporated into the SLAF-seq analysis to examine genetic differentiation between admixed groups and these two lineages. The findings revealed reduced genetic differentiation between the MIX group and the CA lineage (Fst = 0.07–0.12) compared to the pronounced differentiation with the EA lineage (Fst = 0.23–0.33) (Fig. S12C).

3.8. Significance analysis of differences in plastome size and Pgk1 gene length

We also examined the variations in plastome size across four groups: Agropyron mongolicum, the EA lineage, CA lineage and UN lineage. Our findings suggested no substantial differences in plastome length among the EA lineage, CA lineage, or A. mongolicum. Notably, the UN lineage displayed a highly significant deviation in plastome size compared to the other three groups (Fig. S13A). Further investigation attributed this discrepancy to a 346 bp deletion within the intergenic spacer (IGS) region situated between atpI and atpH (spanning positions 31,768bp to 32,114bp, referencing the KY126307 plastome). A BLAST search against the Triticum aestivum reference genome (IWGSC CS RefSeq v.2.1) revealed striking similarity (Percent identity 99.14%, Query cover 100%) between the deleted sequence and the atpA (encoding ATP synthase subunit alpha, chloroplastic-like) gene located on chromosomes 2D, 3A, 5A, and 5B.

The amplified Pgk1 sequences varied in length from 1341 to 1491 bp. T-test results revealed a significant difference between the EA lineage (encompassing both Pgk1-subclade 1 and Pgk1-subclade 2) and the CA lineage (Fig. S13B). An insertion of approximately 80 bp was consistently observed in all Agropyron mongolicum specimens and a subset of EA lineage tetraploid A. cristatum individuals. This finding aligns with prior research by Fan et al. (2012), which documented the presence of this insertion in A. mongolicum, Qinghai-Tibet Plateau A. cristatum, and some Kengyilia species. Moreover, analysis using the Transposable Elements Platform (TREP) database classified this segment as a transposon sequence.

3.9. Migration history

A total of 126 Pgk1 SNPs and 16,779 SLAF-seq SNPs were utilized for TreeMix and Dsuite analyses. TreeMix analysis revealed two migration events in the Pgk1 dataset (Fig. 6A). Both gene flow events were associated with diploid individuals from the GT-subclade 4 of the CA lineage. The first event exhibited a migration weight of approximately 0.26 within the CA lineage, specifically from the GT-subclade 4 diploids to the GT-subclade 3 tetraploids. The second event showed a migration weight of approximately 0.11 between the CA and EA lineages, from the GT-subclade 4 diploids to the GT-subclade 2 tetraploids. D statistics from Dsuite analyses supported evidence of gene flow, identifying additional potential gene flows involving tetraploid A. cristatum in the CA lineage, beyond the TreeMix findings (Fig. 7B). Furthermore, more substantial gene flows were detected in the EA lineage, particularly involving diploids from GT-subclade 2. The highest fb value was observed within the EA lineage, between diploids of GT-subclade 2 and tetraploids of GT-subclade 1 (fb = 0.80). Another relatively large fb value of 0.51 was detected between diploids of GT-subclade 2 and A. mongolicum.

Fig. 6 Introgression analysis across populations. (A) TreeMix-derived phylogenomic networks with putative introgression events based on Pgk1 gene. (B) Dsuite Fbranch statistics (fb) quantifying introgression signals from nuclear Pgk1 data. (C) Whole–genome scale introgression topology reconstructed from SLAF-seq SNPs using TreeMix. (D) Genome-wide introgression heatmap generated by Dsuite Fbranch analysis. Migration edges in TreeMix plots (A and C) are annotated with arrow directionality, color-coded migration weights, and numerical parameters. The values in the parentheses represent jackknife estimate of the standard error, and the p-values respectively. Dsuite results (B and D) display Fbranch statistics through chromatic gradients, with exact fb values superimposed on internal box.

Fig. 7 Integrated reconstruction of demographic history and paleoclimatic drivers. (A) Time-aware demographic trajectories inferred through Stairway Plot analysis. The global effective population size (Ne) trajectory (black) is contrasted with lineage-specific dynamics: CA lineage (blue) and EA lineage (red). Vertical dashed lines mark critical Ne inflection points dated. (B) Time-calibrated phylogeny with ancestral range reconstruction by MCMCtree and RASP. Node annotations combine: probabilistic ancestral distributions (pie charts), mean divergence times (Mya, 95% HPD blue bars). Pie chart abbreviations (A, Central Asia [excluding Xinjiang]; B, Iran–Pakistan; C, Europe–Turkey; D, East Asia [including Xinjiang]; E, Qinghai-Tibet Plateau) annotate the inferred most probable ancestral distribution regions. Paleotemperature every 1000 years (lightgrey) and Gaussia-fitted smooth curve (grey) modified from Rohling et al. (2021) spanning 5 Mya climate cycles.

TreeMix analysis of SLAF-seq data identified three gene flow events (Fig. 7C). The first occurred between the EA lineage and MIX population (from the ancestral node of G2 to MIX) with a migration weight of approximately 0.46. The second occurred within the EA lineage, with gene flow from G2 tetraploids to G1 tetraploids (migration weight: 0.47). The final introgression event occurred in the CA lineage, from G4 diploids to the ancestral node of G6, with a migration weight of 0.34. Dsuite analysis detected strong introgression between G1 and G2 tetraploids in the EA lineage (fb = 0.45), and introgression events between the MIX population and all EA lineage populations (fb: 0.16–0.28) (Fig. 7D). The highest fb value (0.88) was observed in the CA lineage, between G4 and G6 tetraploids. To assess how much of the phylogenetic conflict could be explained by introgression, we conducted a QuIBL test. The results of the QuIBL test showed that both genetic component mixing events revealed by the SLAF-seq data TreeMix analysis could be explained by the ILS + introgression model (ΔBIC < −10). Analysis of ten independent replicates showed averaged 3.5% introgression between the EA lineage and MIX population, averaged 3.6% between G1 and G2 populations, and averaged 1.8% between G4 and G6 (Table S4).

3.10. Divergence time estimation and ancestral area reconstruction

The optimal DIVALIKE + J model (Table S5), implemented in RASP was integrated with divergence time estimates derived from SLAF-seq and plastome datasets to reconstruct the ancestral areas of Agropyron cristatum and its primary lineages. This approach generated a time-calibrated phylogenetic tree with ancestral node area reconstructions (Fig. 7B and Fig. S14).

Based on the plastome dataset analysis, the crown age of Agropyron was estimated at 3.32 Mya (95% highest posterior density [HPD]: 2.1–4.7 Mya), and the most likely ancestral area was reconstructed as the combined Central Asia, Europe and Qinghai-Tibet Plateau. Among the major lineages, the MRCA of the EA lineage diverged at 1.22 Mya (95% HPD: 0.80–1.57 Mya), with East Asia identified as its probable ancestral region. This EA lineage subsequently diversified around 1.05 Mya (95% HPD: 0.69–1.44 Mya) and 1.09 Mya (95% HPD: 0.70–1.53 Mya), though the ancestral regions for both subgroups could not be reliably determined in RASP. Divergence within the CA lineage occurred earlier at 1.52 Mya (95% HPD: 1.00–2.14 Mya; ancestral region: Europe), followed by a later event at 0.98 Mya (95% HPD: 0.60–1.42 Mya; ancestral region: Central Asia). Notably, the UN lineage exhibited the most recent MRCA divergence within Agropyron, estimated at 0.79 Mya (95% HPD: 0.33–1.38 Mya), for which the Qinghai-Tibet Plateau was inferred as the ancestral area (Fig. S14).

Analysis based on the SLAF-seq dataset indicated that the MRCA of Agropyron diverged around 3.56 Mya (95% HPD: 2.35–4.75 Mya). These data also identified the ancestral area as Asia, encompassing Central Asia, East Asia, and Qinghai-Tibet Plateau. For the CA lineage, the MRCA was dated to approximately 1.09 Mya (95% HPD: 0.91–1.87 Mya), suggesting a likely origin in Central Asia–Europe. Within this lineage, all populations originated between 0.84 and 0.86 Mya, except for G5, which emerged later at about 0.48 Mya. Specifically, the most likely ancestral region of G3 was inferred as Central Asia, whereas that of G5 was Europe. The remaining populations exhibited mixed Central Asia and Europe affinities. The MRCA of the EA lineage was estimated at 1.66 Mya, predating that of the CA lineage, with East Asia and Qinghai-Tibet Plateau identified as its probable source. Notably, the origin of A. mongolicum closely aligned with that of G5, dating to 0.57 Mya (95% HPD: 0.37–0.77 Mya). Population G2 emerged at 1.01 Mya (95% HPD: 0.66–1.35 Mya), and G1, the earliest among Agropyron populations, had an MRCA of 1.42 Mya (95% HPD: 0.92–1.89 Mya). Both G1 and G2 displayed distributions spanning East Asia and Qinghai-Tibet Plateau, whereas the most likely ancestral region of A. mongolicum was confined solely to East Asia (Fig. 7B).

The Stairway Plot (Fig. 7A and Fig. S15) of Agropyron cristatum indicated a decline in Ne from 1.5 to 0.87 Mya, culminating in a bottleneck event at approximately 0.87 Mya. A brief phase of rapid Ne decline occurred, and then the species underwent a substantial population expansion. Lineage-specific dynamics indicate that the CA lineage displayed continuous population growth until approximately 0.3 Mya. Conversely, the EA lineage remained relatively stable until 0.7 Mya, after which it experienced a short phase of rapid expansion before stabilizing again.

3.11. Morphological variation among Agropyron lineages

Morphological variations in spikes effectively distinguished Agropyron cristatum from A. mongolicum. PCA of 16 spike morphological characteristics demonstrated clear separation between these two species, thereby validating the morphological basis for their species delineation (Fig. S16A). Furthermore, PCA uncovered notable phylogenetic distinctions within A. cristatum, aligning with its morphological classification into two distinct lineages. ANOVA revealed significant differences among groups for six traits (p < 0.05). However, post-hoc tests detected significant differences only between EA and CA lineages in three traits: spike length, spikelet length, and floret number (Fig. S16C–H).

4. Discussion 4.1. Speciation pattern of tetraploid Agropyron cristatum

Agropyron cristatum comprises a polyploid complex of three cytotypes, with main focus on the formation of tetraploid cytotype being still in dispute. Early studies, based on phenolic profiles and the assumption of a single basic genome, suggested an autopolyploid origin (Dewey, 1969; Taylor and McCoy, 1973). In contrast, other researchers, citing high morphological variation and karyotype differences, proposed a segmental allopolyploid origin (Sarkar, 1956; Schulz-Schaeffer et al., 1963). A recent molecular karyotyping study rejected the autopolyploid nature of tetraploid A. cristatum (Zhao et al., 2017). Since A. cristatum and A. mongolicum are the only diploid taxa in Agropyron, phylogenetic comparison across cytotypes can be used to test if these two diploid cytotypes, both as parents, contributed genome to polyploid A. cristatum, which provides clues for identifying polyploidy pattern in this species.

Our analysis of Pgk1 sequences and SLAF-seq data indicate that tetraploid A. cristatum from East Asia and the Qinghai-Tibet Plateau are closely related to A. mongolicum and diploid A. cristatum from East Asia. In contrast, tetraploid A. cristatum from other regions are distinct from A. mongolicum but closely related to diploid A. cristatum. Our finding that the ancestral genetic structure of Pgk1 sequences in the tetraploid A. cristatum contains both heterozygous and homozygous elements suggest that the tetraploid A. cristatum is derived from two distinct diploid cytotypes from East Asia and Qinghai-Tibet Plateau. The transposon insertion in the Pgk1 gene also distinguishes the two diploid species; furthermore, sequences with and without this insertion were observed in the tetraploids of the EA lineage. Our plastome data further elucidate the maternal origin of the tetraploid A. cristatum in EA lineage. Phylogenetic analysis based on plastome data grouped both the diploid A. cristatum and diploid A. mongoliucm into the same clade. This suggests that both diploid cytotypes served as maternal donors in the polyploidization of tetraploid A. cristatum.

One problem with this interpretation is that phylogenetic topologies inferred from the Pgk1, SLAF-seq, and plastome data suggested that the EA lineage did not have distinct diploid parental cytotypes. An additional problem is that the population structure inferred from SLAF-seq data indicates that the CA lineage comprises only a single genetic component. A possible explanation of these findings is that the autopolyploidization of the diploid A. cristatum included regional genetic exchange. Specifically, we would expect tetraploid variations to include different genome combinations, e.g., two genomes of the diploid A. cristatum (PcPc), a genome A. mongolicum and a genome of the diploid A. cristatum (PcPm), and two genomes of A. mongolicum (PmPm). Morphological and cytological evidence suggests PmPm genomic composition of A. cristatum is unlikely. Furthermore, SLAF-seq data indicate that there are no pure PmPm tetraploids in the EA lineage, and only one hybrid population was found (Fig. 5).

Polyploids often have multiple origins, arising from independent populations of their progenitors (Soltis and Soltis, 1999). Our genetic analyses consistently indicate that the EA and CA lineages of the tetraploid A. cristatum have distinct evolutionary origins (Fig. 5A and Figs. S2-S5). Furthermore, our morphological analysis of A. cristatum lineages showed that three quantitative morphological traits (i.e., spike length, spikelet length and floret number) differed significantly between the EA and CA lineages of A. cristatum. Taken together, these findings lead us to propose that two evolutionarily significant units (ESUs) within A. cristatum be recognized: ESU-EA (EA lineage) and ESU-CA (CA lineage). Critically, the magnitude of these differences (e.g., spike length: 24.43–34.04 mm in ESU-EA vs. 26.56–64.35 mm in ESU-CA) is substantially smaller than the differences that distinguish A. cristatum from its close relative A. mongolicum (spike length: 87.92–104.51 mm). The overlapping ranges of these traits preclude identification based on a single character, necessitating a comprehensive assessment for diagnosis.

Coalescence analysis and fineRADstructure identified eight distinct genetic clusters within the EA and CA lineages, revealing greater complexity in their population history (Figs. 3B and 5B). BPP analysis also distinguished these eight populations as distinct groups nested within the broader EA and CA lineages. The distinct genetic clusters identified by BPP may constitute important management units (MUs) or incipient lineages within these ESUs. Given the relatively recent estimated crown age of the genus Agropyron (~3 Mya), these deeply divergent lineages and their internal substructure likely represent early stages of speciation.

In this study, we apply the terms ESUs and MUs to categorize lineages based on hierarchical genetic divergence and evolutionary independence, emphasizing their role in describing incipient speciation rather than conservation management. If future research provides conclusive evidence for reproductive isolation or identifies stable diagnostic characteristics at any hierarchical level (e.g., between the two ESUs or among the distinct genetic clusters within them), the taxonomic classification of A. cristatum should be revised.

These results, in combination with the groupings of Agropyron mongolicum and tetraploid A. cristatum (in EA lineage), indicate that different tetraploid A. cristatum with the same genotypes could be derived from different parental donors via independent hybridization events. This would promote rapid adaptation of this species to different ecological habitats, as partly reflected by high phenotypic diversity.

4.2. Diversification of Agropyron

Ancestral area reconstruction indicated that the two proposed lineages of tetraploid Agropyron cristatum had distinct ancestral regions: Central Asia–Europe (Europe, Turkey, Central Asia) and East Asia (East Asia, Qinghai-Tibet Plateau, Xinjiang). Xinjiang (China), which is geographically and culturally part of Central Asia, is proposed as a convergence zone for the two lineages. However, this hypothesis seems unlikely in our reconstruction. Although Xinjiang is often regarded as a geographical link between Central Asia and East Asia, western Xinjiang is geographically separated from Central Asia by a continuous mountain barrier comprising the Tianshan Mountains, Altai Mountains, Karakoram Mountains, and Pamir Plateau. Consequently, the CA and EA lineages were likely geographically isolated early in their evolutionary history. Despite this geographical isolation, our analysis of gene flow between A. cristatum lineages indicates that introgression has occurred between the CA and EA lineages. One potential explanation for the weak geographical isolation erected by the mountain barrier is that polyploidy facilitates plant adaptation to high-altitudes and cold climates (Brochmann et al., 2004; Folk et al., 2020). If so, we propose A. cristatum overcame cold climatic barriers in Xianjiang through polyploidization, enabling genetic exchange between the two lineages during secondary contact.

Phylogenetic analysis based on nuclear data identified the CA lineage of Agropyron cristatum; plastome data indicated that the CA lineage diverged into two distinct clades. Ancestral area reconstruction demonstrated that these two clades occupy adjacent geographic regions. Combined with evidence of gene flow within this lineage, these findings suggest that sympatric lineages can fuse through introgression, resulting in a single nuclear lineage while retaining divergent haplotypes in plastomes. A second, phylogenetically independent conflict suggests the potential presence of an unknown maternal donor (denoted as the UN lineage) in the history of A. cristatum. Analysis of Fst and deletion events suggests a distinct evolutionary history for the proposed UN lineage. Fst levels are higher in the UN lineage compared to other clades. Notably, a unique deletion within the IGS region of the UN lineage exhibits a high degree of similarity to the nuclear atpA gene, which encodes the chloroplast ATP synthase CF1 subunit alpha in wheat (McCarty et al., 2000). We thus speculate that the nuclear–cytoplasmic interactions brought about by intracellular gene transfer may have led to plastome gene pseudogenization and random loss in an unknown (UN) lineage, as has been documented in other cases (Sloan et al., 2018). Our analyses of effective population size indicate that the UN lineage itself may have undergone a bottleneck event. We propose that plastomes from the small remnant population of the UN lineage introgressed into the EA lineage via chloroplast capture. Subsequent to introgression, genetic drift resulted in the coalescent time of the captured plastome haplotype aligning approximately with this bottleneck period.

Demographic structure and molecular dating illustrate the relationship between climatic oscillations and population divergence. Changes in effective population size indicate that Agropyron populations encountered environmental pressure from 1.5 Mya to 0.87 Mya, which corresponds to the Middle Pleistocene Transition (about 0.7–1.2 Mya). This period is characterized by intensification of cold climates and increased global ice sheet coverage. Pollen studies have confirmed that during this period, there was a rapid decline of forest and a significant increase of grassland areas in temperate regions (Zhou et al., 2018). The climatic shift likely contributed to the rapid expansion of the CA lineage during this period and dramatically impacted the modern distribution of A. cristatum. During this period, A. cristatum populations, including three autopolyploid groups, underwent significant differentiation (Fig. 7B). The period during which the population of the EA lineage expanded (about 0.4–0.7 Mya) is close to the estimated time of origin for A. mongolicum.

The evolutionary mechanisms that enabled Agropyron populations to successfully recover after bottleneck events were likely hybridization and secondary contact. Post-polyploidization gene flow analyses revealed secondary contact events between Agropyron populations (Fig. 6). Our estimation of the most recent common ancestor of the Agropyron hybrid group (G1) predates the rapid expansion of Agropyron population sizes. Conventional divergence time estimation assumes an absence of large-scale introgression. This assumption may have biased time estimates of the origin of the Agropyron hybrid group (G1), as a hybrid species cannot chronologically precede its parental species in origin. Finally, elevated Fst values between polyploid lineages compared to diploid counterparts indicate that hybridization and secondary contacts likely substantially augmented genetic diversity of Agropyron populations.

Multiple origins of tetraploid Agropyron cristatum via independent hybridization events in CA and EA lineages would generate a diverse array of A. cristatum genotypes. Considering the gene flow that has happened within and between lineages, rapid diversification could also be enhanced by ancestral hybridization events. Potential cytonuclear integration and co-evolution, along with chloroplast capture, may preserve greater diversity in plastomes. Taken together, it is possible that climatic variation, geographic isolation, hybridization and chloroplast capture act to drive rapid diversification during the evolution of A. cristatum, promoting the occurrence of highly diverse phenotypes and genetic diversity.

5. Conclusions

Our study reveals that the evolutionary history of Agropyron cristatum, traditionally classified as an autopolyploid, is considerably more complex. We provide evidence indicating contributions from divergent progenitors, multiple independent origins, and extensive introgression among populations. This intricate reticulate evolutionary pattern, combined with processes of geographical isolation and historical climatic oscillations inferred from our analyses, has significantly contributed to the remarkable speciation, diversification, and broad Eurasian distribution observed in A. cristatum. Such complexity challenges simplistic views of autopolyploids and underscores their substantial potential for achieving high levels of genetic and adaptive diversity through recurrent hybridization and reticulate evolution in Agropyron. Critically, our findings demonstrate that evolutionary trajectories in tetraploid A. cristatum are often obscured by morphology-based studies. Even populations exhibiting minimal morphological divergence may represent distinct lineages shaped by divergent processes. Therefore, future research aimed at unraveling the mechanisms governing the evolution and diversification of A. cristatum, and potentially other autopolyploid complexes, should focus on high-resolution genomic analyses at the population level. Integrating cytogenetic, ecological, and advanced genomic data is essential to fully delineate population structures, identify cryptic diversity, and reconstruct the detailed evolutionary history.

Acknowledgements

This work was funded by the following projects: the National Natural Science Foundation of China (Nos. 31870360 and 32171603); Xining Science and Technology Major Project (2023-Z-13); Qinghai Provincial Science and Technology Major Project (2023-SF-A5); Youth Innovation Promotion Association (Chinese Academy of Sciences, Grant/Award Number: Y2023116) and the Qinghai Provincial central guide local science and technology development funds project (2025ZY002). We are grateful to American National Plant Germplasm System (Pullman, Washington, USA) providing the part seed materials for this study.

CRediT authorship contribution statement

Xing Fan: Conceptualization, Funding acquisition, Writing - Review & Editing. Lina Sha: Conceptualization, Writing - Review & Editing. Wenjie Chen: Funding acquisition, Writing - Review & Editing. Hao Yan: Writing - Original Draft, Investigation, Visualization. Yihao Zhang: Investigation. Hailun Shi: Investigation. Xuande Xu: Investigation. Shuangbing Yu: Investigation. Yue Zhang: Investigation. Lijun Yan: Resources. Yan Zhao: Resources. Dandan Wu: Resources. Yiran Cheng: Resources. Yi Wang: Resources. Xiao Ma: Resources. Haiqin Zhang: Resources. Houyang Kang: Resources. Yonghong Zhou: Resources.

Data accessibility statement

All the new Pgk1 and plastome data were deposited to GenBase database in NGDC (National Genomics Data Center, China) and Nucleotide database in NCBI (National Center for Biotechnology Information, America). SLAF-seq raw data were deposited to GSA database (CRA025368) in NGDC. Specimen ID, Database Accession ID, ploidy, sampling country information can be found in Table S1.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.pld.2025.10.002.

References
Alexander, D.H., Novembre, J., Lange, K., 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res., 19: 1655-1664. DOI:10.1101/gr.094052.109
Bandelt, H.J., Forster, P., Rohl, A., 1999. Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol., 16: 37-48. DOI:10.1093/oxfordjournals.molbev.a026036
Bernhardt, N., Brassac, J., Kilian, B., et al., 2017. Dated tribe-wide whole chloroplast genome phylogeny indicates recurrent hybridizations within Triticeae. BMC Evol. Biol., 17: 141. DOI:10.1186/s12862-017-0989-9
Bock, R., 2007. Structure, function, and inheritance of plastid genomes. In: Bock, R. (Ed.), Cell and Molecular Biology of Plastids, Topics in Current Genetics. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 29–63.
Brassac, J., Blattner, F.R., 2015. Species-level phylogeny and polyploid relationships in Hordeum (Poaceae) inferred by next-generation sequencing and in silico cloning of multiple nuclear loci. Syst. Biol., 64: 792-808. DOI:10.1093/sysbio/syv035
Brochmann, C., Brysting, A.K., Alsos, I.G., et al., 2004. Polyploidy in arctic plants: polyploidy in arctic plants. Biol. J. Linn. Soc., 82: 521-536. DOI:10.1111/j.1095-8312.2004.00337.x
Che, Y., Yang, Y., Yang, X., et al., 2015. Phylogenetic relationship and diversity among Agropyron Gaertn. germplasm using SSRs markers. Plant Syst. Evol., 301: 163-170. DOI:10.1007/s00606-014-1062-4
Chen, S., 2023. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta, 2: e107. DOI:10.1002/imt2.107
Chen, S.Y., Ma, X., Zhang, X.Q., et al., 2013. Genetic diversity and relationships among accessions of five crested wheatgrass species (Poaceae: Agropyron) based on gliadin analysis. Genet. Mol. Res., 12: 5704-5713. DOI:10.4238/2013.November.18.19
Chen, N., Sha, L.N., Dong, Z.Z., et al., 2018. Complete structure and variation of the chloroplast genome of Agropyron cristatum (L.) Gaertn. Gene, 640: 86-96. DOI:10.1016/j.gene.2017.10.009
Chen, N., Chen, W.J., Yan, H., et al., 2020. Evolutionary patterns of plastome uncover diploid-polyploid maternal relationships in Triticeae. Mol. Phylogenet. Evol., 149: 106838. DOI:10.1016/j.ympev.2020.106838
Chen, S.Y., Yan, H., Sha, L.N., et al., 2021. Chloroplast phylogenomic analyses resolve multiple origins of the Kengyilia species (Poaceae: Triticeae) via independent polyploidization events. Front. Plant Sci., 12: 682040. DOI:10.3389/fpls.2021.682040
Danecek, P., Auton, A., Abecasis, G., et al., 2011. The variant call format and VCFtools. Bioinformatics, 27: 2156-2158. DOI:10.1093/bioinformatics/btr330
Dewey, D.R., 1961. Hybrids between Agropyron repens and Agropyron desertorum. J. Hered., 52: 13-21. DOI:10.1093/oxfordjournals.jhered.a107013
Dewey, D.R., 1969. Inbreeding depression in diploid and induced-autotetraploid crested wheatgrass. Crop Sci., 9: 592-595. DOI:10.2135/cropsci1969.0011183X000900050023x
Dewey, D.R., 1984. The genomic system of classification as a guide to intergeneric hybridization with the perennial Triticeae. In: Gustafson, J.P. (Ed.), Gene Manipulation in Plant Improvement: 16Th Stadler Genetics Symposium. Springer, US, Boston, MA, pp. 209–279.
Dewey, D.R., 1986. Taxonomy of the crested wheatgrasses (Agropyron). In: Johnson, K.L. (Ed.), Crested Wheatgrass: Its Values, Problems and Myths. Utah State University, Logan, Utah, pp. 31–44.
Dewey, D.R., Asay, K.H., 1982. Cytogenetic and taxonomic relationships among three diploid crested wheatgrasses. Crop Sci., 22: 645-650. DOI:10.2135/cropsci1982.0011183X002200030052x
Dos Reis, M., Donoghue, P.C.J., Yang, Z., 2016. Bayesian molecular clock dating of species divergences in the genomics era. Nat. Rev. Genet., 17: 71-80. DOI:10.1038/nrg.2015.8
Doyle, J.J., 1990. Isolation of plant DNA from fresh tissue. Focus, 12: 13-15.
Eaton, D.A.R., Overcast, I., 2020. Ipyrad: interactive assembly and analysis of RADseq datasets. Bioinformatics, 36: 2592-2594. DOI:10.1093/bioinformatics/btz966
Edelman, N.B., Frandsen, P.B., Miyagi, M., et al., 2019. Genomic architecture and introgression shape a butterfly radiation. Science, 366: 594-599. DOI:10.1126/science.aaw2090
Fan, X., Sha, L.N., Zeng, J., et al., 2012. Evolutionary dynamics of the Pgk1 gene in the polyploid genus Kengyilia (Triticeae: Poaceae) and its diploid relatives. PLoS One, 7: e31122. DOI:10.1371/journal.pone.0031122
Fan, X., Sha, L.N., Dong, Z.Z., et al., 2013. Phylogenetic relationships and Y genome origin in Elymus L. sensu lato (Triticeae; Poaceae) based on single-copy nuclear Acc1 and Pgk1 gene sequences. Mol. Phylogenet. Evol., 69: 919-928. DOI:10.1016/j.ympev.2013.06.012
Fitak, R.R., 2021. OptM: estimating the optimal number of migration edges on population trees using Treemix. Biol. Methods Protoc., 6: bpab017. DOI:10.1093/biomethods/bpab017
Flouri, T., Jiao, X., Rannala, B., et al., 2018. Species tree inference with BPP using genomic sequences and the multispecies coalescent. Mol. Biol. Evol., 35: 2585-2593. DOI:10.1093/molbev/msy147
Folk, R.A., Siniscalchi, C.M., Soltis, D.E., 2020. Angiosperms at the edge: extremity, diversity, and phylogeny. Plant Cell Environ., 43: 2871-2893. DOI:10.1111/pce.13887
Gornicki, P., Zhu, H., Wang, J., et al., 2014. The chloroplast view of the evolution of polyploid wheat. New Phytol., 204: 704-714. DOI:10.1111/nph.12931
Grant, V., 1963. The Origin of Adaptations. New York: Columbia University Press.
Greiner, S., Lehwark, P., Bock, R., 2019. Organellar Genome DRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res., 47: W59-W64. DOI:10.1093/nar/gkz238
Hoang, D.T., Chernomor, O., von Haeseler, A., et al., 2018. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol., 35: 518-522. DOI:10.1093/molbev/msx281
Hsiao, C., Asay, K.H., Dewey, D.R., 1989. Cytogenetic analysis of interspecific hybrids and amphiploids between two diploid crested wheatgrasses, Agropyron mongolicum and A. cristatum. Genome, 32: 1079-1084. DOI:10.1139/g89-557
Hu, Q., Yan, C., Sun, G., 2013. Phylogenetic analysis revealed reticulate evolution of allotetraploid Elymus ciliaris. Mol. Phylogenet. Evol., 69: 805-813. DOI:10.1016/j.ympev.2013.06.023
Huang, S., Sirikhachornkit, A., Su, X., et al., 2002. Genes encoding plastid acetyl-CoA carboxylase and 3-phosphoglycerate kinase of the Triticum/Aegilops complex and the evolutionary history of polyploid wheat. Proc. Natl. Acad. Sci. U.S.A., 99: 8133-8138. DOI:10.1073/pnas.072223799
Huang, K., Li, W., Yang, B., et al., 2022. Vcfpop: performing population genetics analyses for autopolyploids and aneuploids based on next-generation sequencing data sets. Mol. Ecol. Resour., 25: e13744.
Huson, D.H., Bryant, D., 2024. The SplitsTree App: interactive analysis and visualization using phylogenetic trees and networks. Nat. Method., 21: 1773-1774. DOI:10.1038/s41592-024-02406-3
Jensen, K.B., Larson, S.R., Waldron, B.L., et al., 2005. Characterization of hybrids from induced × natural tetraploids of Russian wildrye. Crop Sci., 45: 1305-1311. DOI:10.2135/cropsci2004.0455
Jensen, K.B., Larson, S.R., Waldron, B.L., et al., 2006. Cytogenetic and molecular characterization of hybrids between 6x, 4x, and 2x ploidy levels in crested wheatgrass. Crop Sci., 46: 105-112. DOI:10.2135/cropsci2005.0148
Jia, Y., Liu, M.L., López-Pujol, J., et al., 2023. The hybridization origin of the Chinese endemic herb genus Notopterygium (Apiaceae): evidence from population genomics and ecological niche analysis. Mol. Phylogenet. Evol., 182: 107736. DOI:10.1016/j.ympev.2023.107736
Jiao, Y., Wickett, N.J., Ayyampalayam, S., et al., 2011. Ancestral polyploidy in seed plants and angiosperms. Nature, 473: 97-100. DOI:10.1038/nature09916
Kalyaanamoorthy, S., Minh, B.Q., Wong, T.K.F., et al., 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Method., 14: 587-589. DOI:10.1038/nmeth.4285
Katoh, K., Standley, D.M., 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol., 30: 772-780. DOI:10.1093/molbev/mst010
Kinoshita, T., Ikeda, Y., Ishikawa, R., 2008. Genomic imprinting: a balance between antagonistic roles of parental chromosomes. Semin. Cell Dev. Biol., 19: 574-579. DOI:10.1016/j.semcdb.2008.07.018
Kreiner, J.M., Kron, P., Husband, B.C., 2017. Frequency and maintenance of unreduced gametes in natural plant populations: associations with reproductive mode, life history and genome size. New Phytol., 214: 879-889. DOI:10.1111/nph.14423
Kumar, S., Stecher, G., Li, M., et al., 2018. Mega X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol., 35: 1547-1549. DOI:10.1093/molbev/msy096
Leigh, J.W., Bryant, D., 2015. Popart: full-feature software for haplotype network construction. Methods Ecol. Evol., 6: 1110-1116. DOI:10.1111/2041-210X.12410
Li, H., 2021. New strategies to improve minimap2 alignment accuracy. Bioinformatics, 37: 4572-4574. DOI:10.1093/bioinformatics/btab705
Liang, S., Zhang, X., Wei, R., 2022. Ecological adaptation shaped the genetic structure of homoploid ferns against strong dispersal capacity. Mol. Ecol., 31: 2679-2697. DOI:10.1111/mec.16420
Liu, X., Fu, Y.X., 2020. Stairway Plot 2: demographic history inference with folded SNP frequency spectra. Genome Biol., 21: 280. DOI:10.1007/s12250-020-00244-z
Liu, Y., Xu, X., Dimitrov, D., et al., 2023. An updated floristic map of the world. Nat. Commun., 14: 2990. DOI:10.3390/molecules28072990
Löve, Á., 1982. Generic evolution of the wheatgrass. Biol. Zbl., 101: 199-212.
Ma, P.F., Zhang, Y.X., Zeng, C.X., et al., 2014. Chloroplast phylogenomic analyses resolve deep-level relationships of an intractable bamboo tribe Arundinarieae (Poaceae). Syst. Biol., 63: 933-950. DOI:10.1093/sysbio/syu054
Malinsky, M., Svardal, H., Tyers, A.M., et al., 2018a. Whole-genome sequences of Malawi cichlids reveal multiple radiations interconnected by gene flow. Nat. Ecol. Evol., 2: 1940-1955. DOI:10.1038/s41559-018-0717-x
Malinsky, M., Trucchi, E., Lawson, D.J., et al., 2018b. RADpainter and fineRADstructure: population inference from RADseq data. Mol. Biol. Evol., 35: 1284-1290. DOI:10.1093/molbev/msy023
Malinsky, M., Matschiner, M., Svardal, H., 2021. Dsuite - fast D-statistics and related admixture evidence from VCF files. Mol. Ecol. Resour., 21: 584-595. DOI:10.1111/1755-0998.13265
Marcussen, T., Sandve, S.R., Heier, L., et al., 2014. Ancient hybridizations among the ancestral genomes of bread wheat. Science, 345: 1250092. DOI:10.1126/science.1250092
Masterson, J., 1994. Stomatal size in fossil plants: evidence for polyploidy in majority of angiosperms. Science, 264: 421-424. DOI:10.1126/science.264.5157.421
Matzke, N.J., 2014. Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Syst. Biol., 63: 951-970. DOI:10.1093/sysbio/syu056
McCarty, R.E., Evron, Y., Johnson, E.A., 2000. The chloroplast ATP synthase: a rotary enzyme?. Annu. Rev. Plant Physiol. Plant Mol. Biol., 51: 83-109. DOI:10.1146/annurev.arplant.51.1.83
Mellish, A., Coulman, B., Ferdinandez, Y., 2002. Genetic relationships among selected crested wheatgrass cultivars and species determined on the basis of AFLP markers. Crop Sci., 42: 1662-1668. DOI:10.2135/cropsci2002.1662
Minh, B.Q., Schmidt, H.A., Chernomor, O., et al., 2020. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol., 37: 1530-1534. DOI:10.1093/molbev/msaa015
Page, A.J., Taylor, B., Delaney, A.J., et al., 2016. SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments. Microb. Genom., 2: e000056.
Paradis, E., Schliep, K., 2019. Ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics, 35: 526-528. DOI:10.1093/bioinformatics/bty633
Patil, I., 2021. Visualizations with statistical details: the “ggstatsplot” approach. JOSS, 6: 3167. DOI:10.21105/joss.03167
Pickrell, J.K., Pritchard, J.K., 2012. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genetics, 8: e1002967. DOI:10.1371/journal.pgen.1002967
Purcell, S., Neale, B., Todd-Brown, K., et al., 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet., 81: 559-575. DOI:10.1086/519795
Quinlan, A.R., Hall, I.M., 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26: 841-842. DOI:10.1093/bioinformatics/btq033
Rambaut, A., Drummond, A.J., Xie, D., et al., 2018. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst. Biol., 67: 901-904. DOI:10.1093/sysbio/syy032
Revell, L.J., 2024. Phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ, 12: e16505. DOI:10.7717/peerj.16505
Rohling, E.J., Yu, J., Heslop, D., et al., 2021. Sea level and deep-sea temperature reconstructions suggest quasi-stable states and critical transitions over the past 40 million years. Sci. Adv., 7: eabf5326. DOI:10.1126/sciadv.abf5326
Ronquist, F., Teslenko, M., van der Mark, P., et al., 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol., 61: 539-542. DOI:10.1093/sysbio/sys029
Said, M., Hřibová, E., Danilova, T.V., et al., 2018. The Agropyron cristatum karyotype, chromosome structure and cross-genome homoeology as revealed by fluorescence in situ hybridization with tandem repeats and wheat single-gene probes. Theor. Appl. Genet., 131: 2213-2227. DOI:10.1007/s00122-018-3148-9
Sang, T., 2002. Utility of low-copy nuclear gene sequences in plant phylogenetics. Crit. Rev. Biochem. Mol. Biol., 37: 121-147. DOI:10.1080/10409230290771474
Sarkar, R., 1956. Crested wheatgrass complex. Can. J. Bot., 34: 328-345. DOI:10.1139/b56-026
Schulz-Schaeffer, J., Allderdice, P.W., Creel, G.C., 1963. Segmental allopolyploidy in tetraploid and hexaploid Agropyron species of the crested wheatgrass complex (section Agropyron). Crop Sci., 3: 525-530. DOI:10.2135/cropsci1963.0011183X000300060021x
Sha, L.N., Fan, X., Yang, R.W., et al., 2010. Phylogenetic relationships between Hystrix and its closely related genera (Triticeae; Poaceae) based on nuclear Acc1, DMC1 and chloroplast trnL-F sequences. Mol. Phylogenet. Evol., 54: 327-335. DOI:10.1016/j.ympev.2009.05.005
Sha, L.N., Fan, X., Li, J., et al., 2017. Contrasting evolutionary patterns of multiple loci uncover new aspects in the genome origin and evolutionary history of Leymus (Triticeae; Poaceae). Mol. Phylogenet. Evol., 114: 175-188. DOI:10.1016/j.ympev.2017.05.015
Sha, L.N., Liang, X., Tang, Y., et al., 2022. Evolutionary patterns of plastome resolve multiple origins of the Ns-containing polyploid species in Triticeae. Mol. Phylogenet. Evol., 175: 107591. DOI:10.1016/j.ympev.2022.107591
Shi, L., Chen, H., Jiang, M., et al., 2019. CPGAVAS2, an integrated plastome sequence annotator and analyzer. Nucleic Acids Res., 47: W65-W73. DOI:10.1093/nar/gkz345
Sloan, D.B., Warren, J.M., Williams, A.M., et al., 2018. Cytonuclear integration and co-evolution. Nat. Rev. Genet., 19: 635-648. DOI:10.1038/s41576-018-0035-9
Soltis, D.E., Soltis, P.S., 1999. Polyploidy: recurrent formation and genome evolution. Trends Ecol. Evol., 14: 348-352. DOI:10.1016/S0169-5347(99)01638-9
Soltis, D.E., Soltis, P.S., Schemske, D.W., et al., 2007. Autopolyploidy in angiosperms: have we grossly underestimated the number of species?. Taxon, 56: 13-30. DOI:10.2307/25065732
Soltis, D.E., Soltis, P.S., Endress, P.K., et al., 2018. Phylogeny and Evolution of the Angiosperms: Revised and Updated Edition. Chicago: The University of Chicago Press.
Spoelhof, J.P., Soltis, P.S., Soltis, D.E., 2017. Pure polyploidy: closing the gaps in autopolyploid research. J. Syst. Evol., 55: 340-352. DOI:10.1111/jse.12253
Stebbins, G.L., 1947. Types of polyploids: their classification and significance. In: Demerec, M. (Ed.), Advances in Genetics. Academic Press, New York, pp. 403–429.
Stebbins, G.L., 1950. Variation and Evolution in Plants. New York: Columbia University Press.
Sun, X., Liu, D., Zhang, X., et al., 2013. SLAF-seq: an efficient method of large-scale de novo SNP discovery and genotyping using high-throughput sequencing. PLoS One, 8: e58700. DOI:10.1371/journal.pone.0058700
Swofford, D.L., 2003. PAUP*. Phylogenetic Analysis Using Parsimony (*And Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.
Taylor, R.J., McCoy, G.A., 1973. Proposed origin of tetraploid species of crested wheatgrass based on chromatographic and karyotypic analyses. Am. J. Bot., 60: 576-583. DOI:10.1002/j.1537-2197.1973.tb05960.x
Tzvelev, N.N., 1976. Zlaki SSSR (Grasses of the Soviet Union). Nauka, Leningrad.
Yang, Z., 2007. Paml 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol., 24: 1586-1591. DOI:10.1093/molbev/msm088
Yang, J.B., Li, D.Z., Li, H.T., 2014. Highly effective sequencing whole chloroplast genomes of angiosperms by nine novel universal primer pairs. Mol. Ecol. Resour., 14: 1024-1031. DOI:10.1111/1755-0998.12251
Yang, J., Zhang, C., Zhao, N., et al., 2018. Chinese root-type mustard provides phylogenomic insights into the evolution of the multi-use diversified allopolyploid Brassica juncea. Mol. Plant, 11: 512-514. DOI:10.1016/j.molp.2017.11.007
Yen, C., Yang, J.L., Baum, B.R., 2013. Biosystematics on Triticeae (Poaceae). vol. 3. Beijing: Chinese Agricultural Press.
Yin, L., Zhang, H., Tang, Z., et al., 2021. rMVP: a memory-efficient, visualization-enhanced, and parallel-accelerated tool for genome-wide association study. Genom. Proteom. Bioinform., 19: 619-628. DOI:10.1016/j.gpb.2020.10.007
Yu, Y., Blair, C., He, X., 2020. Rasp 4: ancestral state reconstruction tool for multiple genes and characters. Mol. Biol. Evol., 37: 604-606. DOI:10.1093/molbev/msz257
Zhang, C., Mirarab, S., 2022. Weighting by gene tree uncertainty improves accuracy of quartet-based species trees. Mol. Biol. Evol., 39: msac215. DOI:10.1093/molbev/msac215
Zhang, D., Gao, F., Jakovlić, I., et al., 2020. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol. Ecol. Resour., 20: 348-355. DOI:10.1111/1755-0998.13096
Zhang, X., Sun, Y., Landis, J.B., et al., 2020. Genomic insights into adaptation to heterogeneous environments for the ancient relictual Circaeaster agrestis (Circaeasteraceae, Ranunculales). New Phytol., 228: 285-301. DOI:10.1111/nph.16669
Zhang, L., Zhu, X., Zhao, Y., et al., 2022. Phylotranscriptomics resolves the phylogeny of Pooideae and uncovers factors for their adaptive evolution. Mol. Biol. Evol., 39: msac026. DOI:10.1093/molbev/msac026
Zhao, Y., Xie, J., Dou, Q., et al., 2017. Diversification of the P genome among Agropyron Gaertn. (Poaceae) species detected by FISH. Comp. Cytogenet., 11: 495-509. DOI:10.3897/CompCytogen.v11i3.13124
Zhou, X., Yang, J., Wang, S., et al., 2018. Vegetation change and evolutionary response of large mammal fauna during the Mid-Pleistocene transition in temperate northern East Asia. Palaeogeogr. Palaeoclimatol. Palaeoecol., 505: 287-294. DOI:10.1016/j.palaeo.2018.06.007