Genomic analyses provide insights into the genetic basis of quality traits in Amomum tsaoko
Yingmin Zhang a,b, Congwei Yang a, Jiahong Dong a, Jinyu Zhang c, Ticao Zhang d,**, Guodong Li a,b,*
a. Faculty of Traditional Chinese Pharmacy, Yunnan University of Chinese Medicine, Kunming 650500, China;
b. Yunnan Key Laboratory of Dai and Yi Medicines, Yunnan University of Chinese Medicine, Kunming 650500, China;
c. Medicinal Plants Research Institute, Yunnan Academy of Agricultural Sciences, Kunming 650205, China;
d. State Key Laboratory of Phytochemistry and Natural Medicines, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
Keywords: Amomum tsaoko; Quality traits; GWAS; Transcriptomic; Metabolomic

Amomum tsaoko Crevost & Lemarie is a perennial herb belonging to the genus Amomum in the family Zingiberaceae (Fig. S1; Wu and Kai, 2000). It is primarily distributed in southern China, with smaller populations occurring in northern Vietnam (Yang et al., 2022). The dried fruit of A. tsaoko is widely used in traditional Chinese medicine for treating gastrointestinal disorders, vomiting, and malaria (Chinese Pharmacopoeia Commission, 2025). Additionally, it is commonly used as a culinary spice and has been approved as a medicine-food homology species in China (Zhang et al., 2014). Recent studies identified essential oils as the primary bioactive components of A. tsaoko, predominantly monoterpene hydrocarbons and oxygenated monoterpenes, including 1,8-cineole, α-pinene, β-pinene, terpineol, geraniol, and geranial (Cui et al., 2017; Li et al., 2021). Due to these medicinal and culinary benefits, A. tsaoko has substantial economic significance, leading to extensive commercial cultivation.

China accounts for approximately 80% of the global cultivation area and production of Amomum tsaoko. Within China, Yunnan Province is the leading cultivation region, contributing more than 90% of the national production (Zhang et al., 2019). However, the development of the A. tsaoko industry has been hampered by germplasm confusion, a lack of superior cultivars, and severe germplasm degradation (Yang et al., 2020). Currently, no distinct wild populations have been identified, and all available germplasm originates from long-term cultivation (Yang et al., 2024). These issues primarily arise from the limited understanding of genetic diversity and the genetic basis underlying important quality traits, hindering breeding programs and germplasm improvement. Therefore, understanding genetic diversity and identifying candidate genes associated with important quality traits through genomic studies are essential to overcome these challenges. In this study, we sequenced the whole genome of A. tsaoko, providing insights into its genetic characteristics and evolutionary background. We also conducted whole-genome resequencing on 152 germplasm accessions collected from major cultivation regions to capture most of the genetic diversity within the species. Genome-wide association studies (GWAS) were subsequently conducted to identify genetic loci influencing key quality traits relevant to its medicinal and culinary properties. Additionally, transcriptome sequencing and metabolomic analyses were carried out on diverse fruit types of A. tsaoko to reveal the molecular mechanisms underlying trait differences.

The genome size of Amomum tsaoko was estimated to be 2.062 Gb with a heterozygosity rate of 0.99%, based on K-mer analysis (Table S1 and Fig. S2). Utilizing Nanopore sequencing and high-throughput chromosome conformation capture (Hi-C) technologies, we generated a near-complete diploid genome assembly of 1.967 Gb with a contig N50 of 29.73 Mb (Fig. S3). Over 99.82% of the genome sequences were successfully anchored to 24 pseudo-chromosomes (Fig. 1A and Tables S2–S4). BUSCO analysis indicated 93.9% genome completeness (Table S5). This assembly is more complete than the previously reported A. tsaoko genomes (Table S6; Li et al., 2022; Chen et al., 2025). A total of 35,162 protein-coding genes were annotated (Table S7), along with 7,126 non-coding RNA genes (Table S8). BUSCO analysis revealed 95.3% completeness for protein-coding genes, with 97.39% functional annotation (Tables S9 and S10). Repetitive sequences account for 85.23% of the genome, with LTR/Copia and LTR/Gypsy bursts estimated at approximately 1.2 and 1.5 million years ago (Mya), respectively (Table S11 and Fig. S4).
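
For readers less familiar with the contig N50 statistic quoted above, the following minimal Python sketch shows how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name and the toy contig lengths are illustrative assumptions and are not taken from the A. tsaoko assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to >= 50% defines N50.
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy lengths (hypothetical, not from the assembly): total = 300, half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```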

Fig. 1 Multi-omics characterization of Amomum tsaoko. A: Chromosomal features of the A. tsaoko genome, with tracks indicating chromosomes, gene density, repeat sequence density, GC content, and chromosome collinearity. B: Phylogenetic tree of 14 plant species showing estimated divergence times, with pink bars at nodes representing the 95% highest posterior density. Red and green numbers indicate gene family expansion and contraction in each species and their common ancestor. C: Phylogenetic tree of A. tsaoko germplasm resources constructed based on SNP markers. D: Three-dimensional PCA clustering plot of 152 A. tsaoko germplasm accessions. E: Population structure analysis of 152 A. tsaoko germplasm accessions. F: Manhattan plot for fruit length association analysis. G: Quantile–quantile plot for fruit length association analysis. H: Local Manhattan plot and linkage disequilibrium heatmap for the candidate gene. I: Schematic representation of candidate genes, including candidate causal SNPs. J: Candidate genes linked to fruit length, with phenotypic significance testing. K: Terpenoid and monoterpenoid biosynthesis pathways in A. tsaoko. L: Heatmap of gene expression clusters for fruit length, with red asterisks marking significantly associated candidate genes.

Comparative genomic analyses across 14 species identified 30,632 orthologous gene families, including 419 single-copy genes. Among these, 11,688 gene families were shared across Zingiberaceae species, while 720 gene families were unique to Amomum tsaoko (Fig. S5). Compared to other Zingiberaceae species, the A. tsaoko genome exhibited an expansion in 894 gene families and a contraction in 1,536 gene families (Fig. 1B). Notably, the expanded gene families were significantly enriched in KEGG functional categories such as “Terpenoid backbone biosynthesis,” “Tryptophan metabolism,” and “Monoterpenoid biosynthesis” (Fig. S6). The expansion of terpene synthase families may enhance the production of terpenoid volatiles (Zhang et al., 2024), potentially contributing to the accumulation of medicinal components in the fruits of A. tsaoko. MCMCtree analysis estimated the divergence time between A. tsaoko and Wurfbainia species at approximately 11.96 Mya, with Zingiberaceae diversification occurring around 20.89 Mya (Fig. S7).
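
The enrichment test behind the KEGG result is not specified in this text. A common choice for this kind of analysis is a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of pathway genes in the selected set by chance. The sketch below is a minimal illustration under that assumption; the function name, its parameters, and the toy counts are hypothetical and do not come from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric P(X >= n_overlap).

    n_background: all annotated genes considered
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the set of interest (e.g. expanded families)
    n_overlap:    selected genes that are also in the pathway
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy counts (hypothetical): 10 genes, 3 in the pathway, 4 selected, 2 overlap.
print(hypergeom_enrichment_p(10, 3, 4, 2))
```

In practice, such p-values are computed for every pathway and then corrected for multiple testing before a category is called significantly enriched.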

Whole-genome resequencing was performed on 152 cultivated Amomum tsaoko accessions collected from various regions within Yunnan Province (Table S12), with an average genome coverage of 20.55× per accession. A total of 21.92 million single nucleotide polymorphisms (SNPs) were identified (Table S13). Phylogenetic, principal component (PCA), and population structure (K = 2) analyses classified the accessions into two subpopulations: “cultigen 1,” comprising 81 accessions from Southeast Yunnan, and “cultigen 2,” including 14 accessions from Western Yunnan and 57 accessions from Northwest Yunnan (Fig. 1C–E). Both subpopulations exhibited similar nucleotide diversity (2.93 × 10⁻³ for “cultigen 1” and 2.96 × 10⁻³ for “cultigen 2”) and Tajima’s D values (1.59 and 1.65, respectively), with low genetic differentiation (FST = 0.0238). Linkage disequilibrium decay was comparable between the two subpopulations, with decay distances of 41 kb in “cultigen 1” and 48 kb in “cultigen 2” (Fig. S8). Historical evidence indicates cultivation in Southeast Yunnan for over 300 years, with germplasm from Northwest and Western Yunnan likely derived from this region (Wen et al., 2024). These results suggest that the current population structure was influenced by a founder effect, with limited genetic differentiation between subpopulations.
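
To make the diversity statistics above concrete, the following Python sketch computes nucleotide diversity (π) and Tajima’s D from allele counts at segregating sites, using the standard formulas of Tajima (1989). A positive D, as reported for both subpopulations, indicates an excess of intermediate-frequency variants relative to the neutral expectation. The function names and the toy allele counts are illustrative assumptions; the software and settings used in the study are not described in this text.

```python
from math import sqrt


def nucleotide_diversity(derived_counts, n_chromosomes):
    """Mean pairwise differences summed over segregating sites.

    derived_counts: derived-allele count at each segregating site
    n_chromosomes:  sampled chromosomes (2 x accessions for diploids)
    Divide the result by the number of sites surveyed to get per-site pi.
    """
    n = n_chromosomes
    pairs = n * (n - 1) / 2
    return sum(k * (n - k) for k in derived_counts) / pairs


def tajimas_d(derived_counts, n_chromosomes):
    """Tajima's D from allele counts at segregating sites."""
    n = n_chromosomes
    s = len(derived_counts)
    if n < 4 or s < 1:
        raise ValueError("need at least 4 chromosomes and 1 segregating site")
    if any(k <= 0 or k >= n for k in derived_counts):
        raise ValueError("every site must be segregating (0 < count < n)")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    pi = nucleotide_diversity(derived_counts, n)
    theta_w = s / a1  # Watterson's estimator
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): 4 chromosomes, two sites at intermediate frequency.
print(tajimas_d([2, 2], 4) > 0)  # intermediate-frequency sites give D > 0
```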

Statistical and comparative analyses were conducted on Amomum tsaoko accessions to evaluate important fruit quality traits, including fruit length and volatile oil content (Table S14; Figs. S9 and S10). GWAS identified 580 significant SNPs associated with nine fruit quality traits (Figs. 1F, G and S11–S17). The candidate genes associated with these traits were primarily involved in plant hormone regulation (Table S15). A key fruit length-associated gene, Ats20G00291450, encodes a zinc finger protein and carries a nonsynonymous SNP (Chr20: 65842538, G/T) in an exon, leading to an alanine-to-serine substitution (Fig. 1I). Based on this SNP, accessions were grouped into two haplotypes, with Hap.1 being significantly associated with longer fruit length compared to Hap.2 (Fig. 1J). Similar findings have been reported in tomato, indicating that zinc finger proteins such as SlPZF1 regulate fruit size through modulation of pericarp cell size (Zhao et al., 2021). Additionally, ZmZFP2 has been demonstrated to regulate kernel size and weight in maize (Zhang et al., 2025), further supporting a potentially conserved function of zinc finger proteins in fruit and seed development. Another significant gene associated with fruit length, Ats21G00302410, encodes an AP2/EREBP family transcription factor. Two polymorphisms, SNP1 (Chr21: 63193976, G/C) and SNP2 (Chr21: 63193984, A/C), are located upstream of Ats21G00302410 and form three haplotypes (Fig. 1I). Accessions with Hap.3 exhibited significantly longer fruit lengths compared to those with Hap.1 and Hap.2 (Fig. 1J). Genes of the AP2 clade are known to primarily regulate plant developmental processes. For example, the AP2/ERF transcription factor gene ENO regulates tomato fruit size via the floral meristem developmental network (Yuste-Lisbona et al., 2020). Similar regulatory mechanisms involving AP2/ERF family genes have also been reported in cucumber fruit development (Wang et al., 2017).
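
The statistical test used for the haplotype comparisons in Fig. 1J is not specified in this text. As one illustration of how fruit length could be compared between two haplotype groups, the sketch below runs a two-sided permutation test on the difference in group means. The function name, the parameter defaults, and the fruit length values are hypothetical and are not data from the study.

```python
import random


def permutation_test_mean_diff(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    n_a = len(group_a)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign haplotype labels while keeping group sizes fixed.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical fruit lengths (cm) for accessions carrying each haplotype.
hap1 = [3.1, 3.3, 3.4, 3.6, 3.2, 3.5]
hap2 = [2.4, 2.6, 2.5, 2.7, 2.3, 2.8]
print(permutation_test_mean_diff(hap1, hap2))
```

A permutation test is chosen here only because it makes no distributional assumption and needs nothing beyond the standard library; a t-test or a non-parametric rank test would be equally plausible choices.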

Transcriptomic analyses of spherical, elliptical, and elongated Amomum tsaoko fruits generated 57.71 Gb of sequence reads (Table S16 and Fig. S18). Differentially expressed genes (DEGs) identified among the fruit shapes were significantly enriched in pathways such as “Plant-pathogen interaction,” “Plant hormone signal transduction,” “MAPK signaling,” and “Terpenoid backbone biosynthesis” (Fig. S19). Metabolomic analyses revealed that spherical fruits accumulated the highest levels of terpenoid compounds, which are crucial bioactive constituents of A. tsaoko (Fig. S20). Integrative metabolomic and transcriptomic analyses identified 21 DEGs associated with terpenoid biosynthesis, with DXS (Ats08G00118320) strongly correlated with 1,8-cineole synthesis (Figs. 1K and S21). Additionally, the integration of RNA-Seq data with GWAS results highlighted regulatory genes involved in plant hormone pathways, including AP2 (Ats21G00302410), AUX1 (Ats13G00186730), and SAUR (Ats16G00235120), as significantly associated with fruit shape (Figs. 1L and S22). Notably, the consistent expression patterns of these candidate genes across transcriptomic data reinforce their identification through GWAS, supporting their roles as regulators of plant development and fruit morphology (Bao et al., 2024; Mao et al., 2025). Collectively, these integrative analyses support previous findings that the plant hormone signaling pathway plays an essential role in determining fruit morphology (Zheng et al., 2022; Yisilam et al., 2025).
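
The correlation measure linking gene expression to metabolite abundance is not named in this text. A Pearson correlation coefficient across samples is one common choice, and the minimal sketch below shows how it could be computed for a single gene and a single metabolite. The function name and the toy expression and abundance values are hypothetical and are not measurements from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: gene expression and metabolite abundance in six samples.
expression = [12.0, 15.5, 9.8, 20.1, 18.3, 11.2]
abundance = [1.1, 1.6, 0.9, 2.2, 1.9, 1.0]
print(pearson_r(expression, abundance))
```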

In conclusion, we generated a high-quality, chromosome-level genome assembly of Amomum tsaoko, offering a comprehensive understanding of its genetic background. Notably, genes involved in fruit type regulation and terpenoid biosynthesis were identified, and multi-omics data shed light on the mechanisms underlying quality trait formation. Analyses of genetic variation in core germplasm identified molecular markers and candidate genes linked to key quality traits, providing targets for precision breeding. Our findings therefore form a solid foundation for the genetic improvement and sustainable utilization of A. tsaoko.

Acknowledgements

This work was supported by the National Key R&D Program of China (2024YFF1306700), the National Natural Science Foundation of China (82260739), the Yunnan Provincial Science and Technology Department-Applied Basic Research Joint Special Funds of Yunnan University of Chinese Medicine (202101AZ070001-005, 202101AZ070001-166), the Yunnan Revitalization Talent Support Program “Young Talent” Project (YNWR-QNBJ-2020-278), and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB1230000).

CRediT authorship contribution statement

Yingmin Zhang: Writing – original draft, Visualization, Software, Investigation, Formal analysis, Data curation. Congwei Yang: Visualization, Resources, Formal analysis, Data curation. Jiahong Dong: Resources, Investigation, Formal analysis. Jinyu Zhang: Resources, Formal analysis. Ticao Zhang: Writing – review & editing, Conceptualization. Guodong Li: Writing – review & editing, Conceptualization.

Data availability

The genome assembly and raw sequence data have been submitted to the National Center for Biotechnology Information (NCBI) database under accession number JBJXVZ000000000. The annotations of Amomum tsaoko have been uploaded to Figshare (https://doi.org/10.6084/m9.figshare.29194664).

Declaration of competing interest

The authors declare that they have no conflict of interest.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.pld.2025.07.006.

References
Bao, D.F., Chang, S.Q., Li, X.D., et al., 2024. Advances in the study of auxin early response genes: Aux/IAA, GH3, and SAUR. Crop J., 12: 964-978. DOI:10.1016/j.cj.2024.06.011
Chen, S.S., Peng, X.C., Xie, Z.Y., et al., 2025. Genetic and genomic insights into dichogamy in Zingiberaceae. Plant Commun., 6: 101352. DOI:10.1016/j.xplc.2025.101352
Chinese Pharmacopoeia Commission, 2025. Pharmacopoeia of the People's Republic of China (Part Ⅰ). Beijing: Chinese Medicine and Technology Press: pp. 257-258.
Cui, Q., Wang, L.T., Liu, J.Z., et al., 2017. Rapid extraction of Amomum tsaoko essential oil and determination of its chemical composition, antioxidant and antimicrobial activities. J. Chromatogr. B, 1061: 364-371. DOI:10.1016/j.jchromb.2017.08.001
Li, G.D., Lu, Q.W., Wang, J.J., et al., 2021. Correlation analysis of compounds in essential oil of Amomum tsaoko seed and fruit morphological characteristics, geographical conditions, locality of growth. Agronomy, 11: 744. DOI:10.3390/agronomy11040744
Li, P., Bai, G.X., He, J.B., et al., 2022. Chromosome-level genome assembly of Amomum tsao-ko provides insights into the biosynthesis of flavor compounds. Hortic. Res., 9: uhac211. DOI:10.1093/hr/uhac211
Mao, L., Guo, C., Niu, L.Z., et al., 2025. Subgenome asymmetry of gibberellins-related genes plays important roles in regulating rapid growth of bamboos. Plant Divers., 47: 68-81. DOI:10.1016/j.pld.2024.10.004
Wang, C.H., Xin, M., Zhou, X.Y., et al., 2017. The novel ethylene-responsive factor CsERF025 affects the development of fruit bending in cucumber. Plant Mol. Biol., 95: 519-531. DOI:10.1007/s11103-017-0671-z
Wen, H., Yang, M.Q., Yang, T.M., et al., 2024. Herbal textual research on Tsaoko Fructus in famous classical formulas. Chin. J. Exp. Tradit. Med. Formulae, 30: 89-99.
Wu, D.L., Kai, L., 2000. In: Wu, Z.Y., Raven, P.H., Hong, D.Y. (Eds.), Flora of China, 24. Science Press, Beijing, China, pp. 322–377.
Yang, Q., Chen, D.J., Yang, L.Y., et al., 2020. Current status and development strategies for the Amomum tsaoko industry in Yunnan Province. Mod. Agric. Sci. Technol., 1: 245-247.
Yang, S.Y., Xue, Y.F., Chen, D.J., et al., 2022. Amomum tsao-ko Crevost & Lemarie: a comprehensive review on traditional uses, botany, phytochemistry, and pharmacology. Phytochem. Rev., 21: 1487-1521. DOI:10.1007/s11101-021-09793-x
Yang, Y.W., Li, G.D., He, J.C., et al., 2024. Biological and Resource Research of Amomum tsaoko. Kunming, China: Yunnan Science and Technology Press: p. 85.
Yisilam, G., Zheng, E.T., Li, C.N., et al., 2025. The chromosome-scale genome of black wolfberry (Lycium ruthenicum) provides useful genomic resources for identifying genes related to anthocyanin biosynthesis and disease resistance. Plant Divers., 47: 201-213. DOI:10.1016/j.pld.2025.01.001
Yuste-Lisbona, F.J., Fernández-Lozano, A., Pineda, B., et al., 2020. ENO regulates tomato fruit size through the floral meristem development network. Proc. Natl. Acad. Sci. U.S.A., 117: 8187-8195. DOI:10.1073/pnas.1913688117
Zhang, T.T., Lu, C.L., Jiang, J.G., 2014. Bioactivity evaluation of ingredients identified from the fruits of Amomum tsaoko Crevost et Lemaire, a Chinese spice. Food Funct., 5: 1747-1754. DOI:10.1039/C4FO00169A
Zhang, W., Lu, B.Y., Meng, H.L., et al., 2019. Phenotypic diversity analysis of the fruit of Amomum tsao-ko Crevost et Lemarie, an important medicinal plant in Yunnan, China. Genet. Resour. Crop Evol., 66: 1145-1154. DOI:10.1007/s10722-019-00765-x
Zhang, Z.Y., Xia, H.X., Yuan, M.J., et al., 2024. Multi-omics analyses provide insights into the evolutionary history and the synthesis of medicinal components of the Chinese wingnut. Plant Divers., 46: 309-320. DOI:10.1016/j.pld.2024.03.010
Zhang, L., Wang, Q.L., Li, W.Y., et al., 2025. ZmZFP2 encoding a C4HC3-type RING zinc finger protein regulates kernel size and weight in maize. Crop J., 13: 418-431. DOI:10.1016/j.cj.2025.01.013
Zhao, F.F., Zhang, J.J., Weng, L., et al., 2021. Fruit size control by a zinc finger protein regulating pericarp cell size in tomato. Mol. Hortic., 1: 1-16. DOI:10.1186/s43897-021-00009-6
Zheng, H., Dong, Y., Nong, H.L., et al., 2022. VvSUN may act in the auxin pathway to regulate fruit shape in grape. Hortic. Res., 9: uhac200. DOI:10.1093/hr/uhac200