The Method of Genomic Selection by Integrating Biological Prior Information and Its Application in Livestock Breeding

Cite this article

YUAN Zehu, GE Ling, LI Fadi, YUE Xiangpeng, SUN Wei. The Method of Genomic Selection by Integrating Biological Prior Information and Its Application in Livestock Breeding[J]. Acta Veterinaria et Zootechnica Sinica, 2021, 52(12): 3323-3334.

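YUAN Zehu^1,2, GE Ling^3, LI Fadi^2, YUE Xiangpeng^2, SUN Wei^1,3

1. Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education, Yangzhou University, Yangzhou 225000, China;
2. Grassland Agriculture Engineering Center of Ministry of Education, Key Laboratory of Grassland Livestock Industry Innovation of Ministry of Agriculture and Rural Affairs, State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China;
3. College of Animal Science and Technology, Yangzhou University, Yangzhou 225000, China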

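Received: 2021-04-06

Funding: International Cooperation Project of the National Natural Science Foundation of China (32061143036); General Program of the National Natural Science Foundation of China (31872333; 32172689); Jiangsu Provincial Major New Agricultural Variety Creation Project (PZCZ201739); Jiangsu Provincial Key Research and Development Program (Modern Agriculture) (BE2018354); Jiangsu Agricultural Science and Technology Independent Innovation Fund (CX(18)2003); Youth Fund of the Natural Science Foundation of Jiangsu Province (BK20210811); Yangzhou Key Research and Development Program (YZ2021055)

About the author: YUAN Zehu (1988-), male, from Qijiang, Chongqing, assistant researcher, Ph.D., mainly engaged in research on animal genetics and breeding, E-mail: yuanzehu@yzu.edu.cn.

Corresponding authors: YUE Xiangpeng, mainly engaged in research on animal genetics and breeding, E-mail: lexp@lzu.edu.cn; SUN Wei, mainly engaged in research on animal genetics, breeding and reproduction, E-mail: dkxmsunwei@163.com.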

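Abstract: Compared with traditional breeding methods, genomic selection (GS) accelerates genetic progress by enabling early selection of candidate animals and by increasing the accuracy of selection. Because improving GS methods cannot shorten the generation interval any further, how to raise the accuracy of GS in order to obtain additional genetic gain has always been the core question of GS research. Omics technologies continue to mature, and biological prior information is now relatively easy to obtain from public resources or from earlier studies. How to integrate known prior information into GS models to improve accuracy, and thereby obtain additional genetic gain, has therefore become a hot topic in current breeding research. This paper reviews the types of biological prior information and the GS methods that integrate it, and discusses the application and prospects of these methods in livestock breeding, with the aim of providing a reference for GS studies in livestock that integrate biological prior information.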

Keywords: biological prior information; genomic selection; livestock breeding; multi-omics


| Formula of weighted index ($d_i$) for G matrix | Explanation of formula | Model name | Reference |
| --- | --- | --- | --- |
| $2p_i(1-p_i)b_i^2$ | $p_i$ is the allele frequency; $b_i$ is the marker effect | TA-BLUP | [19] |
| $\frac{\widehat{v_{qj}}}{\mathrm{mean}\left(\widehat{v_{qj}}\right)}$ | $\widehat{v_{qj}}$ is the posterior marker variance estimated by a Bayesian mixture-distribution model | | [18] |
| $\frac{\widehat{q_j}^2}{\mathrm{mean}\left(\widehat{q_j}^2\right)}$ | $\widehat{q_j}$ is the posterior marker effect estimated by a Bayesian mixture-distribution model | | [18] |
| $\frac{2p_j(1-p_j)\widehat{b_j^2}}{\mathrm{mean}\left(2p_j(1-p_j)\widehat{b_j^2}\right)}$ | $p_j$ is the allele frequency; $\widehat{b_j}$ is the marker effect estimated by GWAS | | [18] |
| $\frac{-2p_j(1-p_j)\log_{10}(P_{bj})}{\mathrm{mean}\left(-2p_j(1-p_j)\log_{10}(P_{bj})\right)}$ | $p_j$ is the allele frequency; $P_{bj}$ is the GWAS P value | | [18] |
| $b_i^2+\max(b_i^2)$ | $b_i$ is the marker effect | | [39] |
| $b_i^2/\nu_i^{\lvert s-2\rvert}$ | $b_i$ is the marker effect; $\nu$ is the scale parameter; $S_i$ is the standard deviation | | [39] |
| $\frac{\sum\nolimits_{j=1}^{m}2p_j(1-p_j)}{\sigma_a^2}$ | $b_i$ is the marker effect; $p_j$ is the allele frequency; $\sigma_a^2$ is the additive genetic variance | | [20] |
| $\frac{\mathrm{abs}(b_i)\,m}{\sum\nolimits_{i=1}^{m}\mathrm{abs}(b_i)}$ | $b_i$ is the marker effect; $m$ is the number of markers | AEWGBLUP | [40] |
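As a rough illustration of how such weights might be used, the sketch below computes the TA-BLUP weight [19] and the AEWGBLUP weight [40] and places them on the diagonal of a weighted genomic relationship matrix. The construction G = ZDZ'/Σ2p_j(1-p_j), with Z the genotype matrix centered by twice the allele frequency, is the widely used VanRaden-style form and is an assumption here, since the weights above are defined without it. The function names, the toy genotypes, and the marker effects are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def allele_frequencies(genotypes: List[List[int]]) -> List[float]:
    """Frequency of the counted allele per marker, from 0/1/2 genotype codes."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]


def ta_blup_weights(p: List[float], b: List[float]) -> List[float]:
    """d_i = 2 * p_i * (1 - p_i) * b_i^2 (TA-BLUP weight)."""
    return [2.0 * pj * (1.0 - pj) * bj ** 2 for pj, bj in zip(p, b)]


def abs_effect_weights(b: List[float]) -> List[float]:
    """d_i = abs(b_i) * m / sum(abs(b_i)) (AEWGBLUP weight); sums to m."""
    m = len(b)
    total = sum(abs(bj) for bj in b)
    return [abs(bj) * m / total for bj in b]


def weighted_g(genotypes: List[List[int]], p: List[float],
               d: List[float]) -> Matrix:
    """Assumed form: G = Z D Z' / sum_j 2 p_j (1 - p_j), with Z = M - 2p."""
    n, m = len(genotypes), len(p)
    z = [[row[j] - 2.0 * p[j] for j in range(m)] for row in genotypes]
    denom = sum(2.0 * pj * (1.0 - pj) for pj in p)
    return [[sum(z[a][j] * d[j] * z[c][j] for j in range(m)) / denom
             for c in range(n)] for a in range(n)]


# Hypothetical toy data: 4 animals x 3 markers, and made-up marker effects.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
effects = [0.5, -0.1, 0.2]

p = allele_frequencies(genotypes)
g_ta = weighted_g(genotypes, p, ta_blup_weights(p, effects))
g_abs = weighted_g(genotypes, p, abs_effect_weights(effects))
```

Setting every weight to 1 recovers the unweighted matrix. The sketch applies each weight exactly as written and does not rescale the resulting matrix.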

[1]	MEUWISSEN T H E, HAYES B J, GODDARD M E. Prediction of total genetic value using genome-wide dense marker maps[J]. Genetics, 2001, 157(4): 1819-1829. DOI:10.1093/genetics/157.4.1819
[2]	GEORGES M, CHARLIER C, HAYES B. Harnessing genomic information for livestock improvement[J]. Nat Rev Genet, 2019, 20(3): 135-156.
[3]	HICKEY J M. Sequencing millions of animals for genomic selection 2.0[J]. J Anim Breed Genet, 2013, 130(5): 331-332. DOI:10.1111/jbg.12054
[4]	PÉREZ-ENCISO M, RINCON J C, LEGARRA A. Sequence- vs. chip-assisted genomic selection: accurate biological information is advised[J]. Genet Sel Evol, 2015, 47(1): 43. DOI:10.1186/s12711-015-0117-5
[5]	HEIDARITABAR M, CALUS M P L, MEGENS H J, et al. Accuracy of genomic prediction using imputed whole-genome sequence data in white layers[J]. J Anim Breed Genet, 2016, 133(3): 167-179. DOI:10.1111/jbg.12199
[6]	VAN BINSBERGEN R, CALUS M P L, BINK M C A M, et al. Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle[J]. Genet Sel Evol, 2015, 47(1): 71. DOI:10.1186/s12711-015-0149-x
[7]	ABDOLLAHI-ARPANAHI R, MOROTA G, PEÑAGARICANO F. Predicting bull fertility using genomic data and biological information[J]. J Dairy Sci, 2017, 100(12): 9656-9666. DOI:10.3168/jds.2017-13288
[8]	YE S P, LI J Q, ZHANG Z. Multi-omics-data-assisted genomic feature markers preselection improves the accuracy of genomic prediction[J]. J Anim Sci Biotechnol, 2020, 11(1): 109. DOI:10.1186/s40104-020-00515-5
[9]	DE LAS HERAS-SALDANA S, LOPEZ B I, MOGHADDAR N, et al. Use of gene expression and whole-genome sequence information to improve the accuracy of genomic prediction for carcass traits in Hanwoo cattle[J]. Genet Sel Evol, 2020, 52(1): 54. DOI:10.1186/s12711-020-00574-2
[10]	GEBREYESUS G, BOVENHUIS H, LUND M S, et al. Reliability of genomic prediction for milk fatty acid composition by using a multi-population reference and incorporating GWAS results[J]. Genet Sel Evol, 2019, 51(1): 16. DOI:10.1186/s12711-019-0460-z
[11]	YUAN Z H. The prior information from GWAS and eQTL increase the accuracy of genomic selection in several sheep meat traits[D]. Lanzhou: Lanzhou University, 2020. (in Chinese)
[12]	BOTELHO M E, LOPES M S, MATHUR P K, et al. Applying an association weight matrix in weighted genomic prediction of boar taint compounds[J]. J Anim Breed Genet, 2021, 138(4): 442-453. DOI:10.1111/jbg.12528
[13]	MOGHADDAR N, KHANSEFID M, VAN DER WERF J H J, et al. Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations[J]. Genet Sel Evol, 2019, 51(1): 72. DOI:10.1186/s12711-019-0514-2
[14]	MACLEOD I M, BOWMAN P J, JAGT C J V, et al. Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits[J]. BMC Genomics, 2016, 17(1): 144. DOI:10.1186/s12864-016-2443-6
[15]	HOFF J L, DECKER J E, SCHNABEL R D, et al. QTL-mapping and genomic prediction for bovine respiratory disease in U.S. Holsteins using sequence imputation and feature selection[J]. BMC Genomics, 2019, 20(1): 555. DOI:10.1186/s12864-019-5941-5
[16]	NANI J P, REZENDE F M, PEÑAGARICANO F. Predicting male fertility in dairy cattle using markers with large effect and functional annotation data[J]. BMC Genomics, 2019, 20(1): 258. DOI:10.1186/s12864-019-5644-y
[17]	LIU A X, LUND M S, BOICHARD D, et al. Improvement of genomic prediction by integrating additional single nucleotide polymorphisms selected from imputed whole genome sequencing data[J]. Heredity, 2020, 124(1): 37-49. DOI:10.1038/s41437-019-0246-7
[18]	SU G, CHRISTENSEN O F, JANSS L, et al. Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances[J]. J Dairy Sci, 2014, 97(10): 6547-6559. DOI:10.3168/jds.2014-8210
[19]	ZHANG Z, LIU J F, DING X D, et al. Best linear unbiased prediction of genomic breeding values using a trait-specific marker-derived relationship matrix[J]. PLoS One, 2010, 5(9): e12648. DOI:10.1371/journal.pone.0012648
[20]	FRAGOMENI B O, LOURENCO D A L, MASUDA Y, et al. Incorporation of causative quantitative trait nucleotides in single-step GBLUP[J]. Genet Sel Evol, 2017, 49(1): 59. DOI:10.1186/s12711-017-0335-0
[21]	MA P P, LUND M S, AAMAND G P, et al. Use of a Bayesian model including QTL markers increases prediction reliability when test animals are distant from the reference population[J]. J Dairy Sci, 2019, 102(8): 7237-7247. DOI:10.3168/jds.2018-15815
[22]	CHANG L Y, TOGHIANI S, LING A, et al. High density marker panels, SNPs prioritizing and accuracy of genomic selection[J]. BMC Genet, 2018, 19(1): 4. DOI:10.1186/s12863-017-0595-2
[23]	TOGHIANI S, CHANG L Y, LING A, et al. Genomic differentiation as a tool for single nucleotide polymorphism prioritization for genome wide association and phenotype prediction in livestock[J]. Livest Sci, 2017, 205: 24-30. DOI:10.1016/j.livsci.2017.09.007
[24]	BOUWMAN A C, DAETWYLER H D, CHAMBERLAIN A J, et al. Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals[J]. Nat Genet, 2018, 50(3): 362-367. DOI:10.1038/s41588-018-0056-5
[25]	XIANG R D, VAN DEN BERG I, MACLEOD I M, et al. Quantifying the contribution of sequence variants with regulatory and evolutionary significance to 34 bovine complex traits[J]. Proc Natl Acad Sci U S A, 2019, 116(39): 19398-19408. DOI:10.1073/pnas.1904159116
[26]	FANG L Z, SAHANA G, MA P P, et al. Use of biological priors enhances understanding of genetic architecture and genomic prediction of complex traits within and between dairy cattle breeds[J]. BMC Genomics, 2017, 18(1): 604. DOI:10.1186/s12864-017-4004-z
[27]	EDWARDS S M, SØRENSEN I F, SARUP P, et al. Genomic prediction for quantitative traits is improved by mapping variants to gene ontology categories in Drosophila melanogaster[J]. Genetics, 2016, 203(4): 1871-1883. DOI:10.1534/genetics.116.187161
[28]	SARUP P, JENSEN J, OSTERSEN T, et al. Increased prediction accuracy using a genomic feature model including prior information on quantitative trait locus regions in purebred Danish Duroc pigs[J]. BMC Genet, 2016, 17: 11.
[29]	SONG H L, YE S P, JIANG Y F, et al. Using imputation-based whole-genome sequencing data to improve the accuracy of genomic prediction for combined populations in pigs[J]. Genet Sel Evol, 2019, 51(1): 58. DOI:10.1186/s12711-019-0500-8
[30]	HAO X J. Incorporating functional annotation to develop genomic selection and genome-wide association study method[D]. Wuhan: Huazhong Agricultural University, 2018. (in Chinese)
[31]	XIANG R D, MACLEOD I M, DAETWYLER H D, et al. Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations[J]. Nat Commun, 2021, 12(1): 860. DOI:10.1038/s41467-021-21001-0
[32]	ZHANG Q Q, SAHANA G, SU G S, et al. Impact of rare and low-frequency sequence variants on reliability of genomic prediction in dairy cattle[J]. Genet Sel Evol, 2018, 50(1): 62. DOI:10.1186/s12711-018-0432-8
[33]	XU L, GAO N, WANG Z Z, et al. Incorporating genome annotation into genomic prediction for carcass traits in Chinese Simmental beef cattle[J]. Front Genet, 2020, 11: 481. DOI:10.3389/fgene.2020.00481
[34]	GAO N, MARTINI J W R, ZHANG Z, et al. Incorporating gene annotation into genomic prediction of complex phenotypes[J]. Genetics, 2017, 207(2): 489-501. DOI:10.1534/genetics.117.300198
[35]	LEE H J, CHUNG Y J, JANG S, et al. Genome-wide identification of major genes and genomic prediction using high-density and text-mined gene-based SNP panels in Hanwoo (Korean cattle)[J]. PLoS One, 2020, 15(12): e0241848. DOI:10.1371/journal.pone.0241848
[36]	VEERKAMP R F, BOUWMAN A C, SCHROOTEN C, et al. Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein-Friesian cattle[J]. Genet Sel Evol, 2016, 48(1): 95. DOI:10.1186/s12711-016-0274-1
[37]	VANRADEN P M. Efficient methods to compute genomic predictions[J]. J Dairy Sci, 2008, 91(11): 4414-4423. DOI:10.3168/jds.2007-0980
[38]	LEGARRA A, AGUILAR I, MISZTAL I. A relationship matrix including full pedigree and genomic information[J]. J Dairy Sci, 2009, 92(9): 4656-4663. DOI:10.3168/jds.2009-2061
[39]	ZHANG X Y, LOURENCO D, AGUILAR I, et al. Weighting strategies for single-step genomic BLUP: An iterative approach for accurate calculation of GEBV and GWAS[J]. Front Genet, 2016, 7: 151.
[40]	REN D Y, AN L X, LI B J, et al. Efficient weighting methods for genomic best linear-unbiased prediction (BLUP) adapted to the genetic architectures of quantitative traits[J]. Heredity, 2021, 126(2): 320-334. DOI:10.1038/s41437-020-00372-y
[41]	GIANOLA D, FERNANDO R L, SCHÖN C C. Inferring trait-specific similarity among individuals from molecular markers and phenotypes with Bayesian regression[J]. Theor Popul Biol, 2020, 132: 47-59. DOI:10.1016/j.tpb.2019.11.008
[42]	ZHANG Z, OBER U, ERBE M, et al. Improving the accuracy of whole genome prediction for complex traits using the results of genome wide association studies[J]. PLoS One, 2014, 9(3): e93017. DOI:10.1371/journal.pone.0093017
[43]	ZHANG Z, ERBE M, HE J L, et al. Accuracy of whole-genome prediction using a genetic architecture-enhanced variance-covariance matrix[J]. G3 Genes Genom Genet, 2015, 5(4): 615-627.
[44]	MOURESAN E F, SELLE M, RÖNNEGÅRD L. Genomic prediction including SNP-specific variance predictors[J]. G3 Genes Genom Genet, 2019, 9(10): 3333-3343.
[45]	GAO N, TENG J Y, YE S P, et al. Genomic prediction of complex phenotypes using genic similarity based relatedness matrix[J]. Front Genet, 2018, 9: 364. DOI:10.3389/fgene.2018.00364
[46]	ZHU B, WANG Y H, NIU H, et al. The strategy of parameter optimization of Bayesian methods for genomic selection in livestock[J]. Scientia Agricultura Sinica, 2014, 47(22): 4495-4505. DOI:10.3864/j.issn.0578-1752.2014.22.015 (in Chinese)
[47]	YIN L L, MA Y L, XIANG T, et al. The progress and prospect of genomic selection models[J]. Acta Veterinaria et Zootechnica Sinica, 2019, 50(2): 233-242. (in Chinese)
[48]	GIANOLA D. Priors in whole-genome regression: the Bayesian alphabet returns[J]. Genetics, 2013, 194(3): 573-596. DOI:10.1534/genetics.113.151753
[49]	GAO N, LI J Q, HE J L, et al. Improving accuracy of genomic prediction by genetic architecture based priors in a Bayesian model[J]. BMC Genet, 2015, 16: 120.
[50]	KADARMIDEEN H N. Genomics to systems biology in animal and veterinary sciences: progress, lessons and opportunities[J]. Livest Sci, 2014, 166: 232-248. DOI:10.1016/j.livsci.2014.04.028
[51]	SPEED D, BALDING D J. MultiBLUP: improved SNP-based prediction for complex traits[J]. Genome Res, 2014, 24(9): 1550-1557. DOI:10.1101/gr.169375.113
[52]	LLOYD-JONES L R, ZENG J, SIDORENKO J, et al. Improved polygenic prediction by Bayesian multiple regression on summary statistics[J]. Nat Commun, 2019, 10(1): 5086. DOI:10.1038/s41467-019-12653-0
[53]	BRØNDUM R F, SU G S, LUND M S, et al. Genome position specific priors for genomic prediction[J]. BMC Genomics, 2012, 13: 543. DOI:10.1186/1471-2164-13-543
[54]	MOORE J K, MANMATHAN H K, ANDERSON V A, et al. Improving genomic prediction for pre-harvest sprouting tolerance in wheat by weighting large-effect quantitative trait loci[J]. Crop Sci, 2017, 57(3): 1315-1324. DOI:10.2135/cropsci2016.06.0453
[55]	YIN L L, ZHANG H H, ZHOU X, et al. KAML: improving genomic prediction accuracy of complex traits using machine learning determined parameters[J]. Genome Biol, 2020, 21(1): 146. DOI:10.1186/s13059-020-02052-w
[56]	MOURESAN E F, SELLE M, RÖNNEGÅRD L. Genomic prediction including SNP-specific variance predictors[J]. G3 Genes Genom Genet, 2019, 9(10): 3333-3343.
[57]	FANG L Z, SAHANA G, MA P P, et al. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection[J]. Genet Sel Evol, 2017, 49(1): 44. DOI:10.1186/s12711-017-0319-0
[58]	FANG L Z. Genomic feature model analysis of complex traits in dairy cattle[D]. Beijing: China Agricultural University, 2017. (in Chinese)
[59]	MA P P, LUND M S, AAMAND G P, et al. Use of a Bayesian model including QTL markers increases prediction reliability when test animals are distant from the reference population[J]. J Dairy Sci, 2019, 102(8): 7237-7247. DOI:10.3168/jds.2018-15815
[60]	MEHRBAN H, NASERKHEIL M, LEE D H, et al. Genomic prediction using alternative strategies of weighted single-step genomic BLUP for yearling weight and carcass traits in Hanwoo beef cattle[J]. Genes, 2021, 12(2): 266. DOI:10.3390/genes12020266
[61]	TAN C, BIAN C, YANG D, et al. Application of genomic selection in farm animal breeding[J]. Hereditas (Beijing), 2017, 39(11): 1033-1045. (in Chinese)
[62]	ZHOU L, YANG H W, ZHAO Z K, et al. The application of genomic selection in Chinese pig breeding industry[J]. Chinese Journal of Animal Science, 2018, 54(3): 4-8. (in Chinese)
[63]	ZHANG C, KEMP R A, STOTHARD P, et al. Genomic evaluation of feed efficiency component traits in Duroc pigs using 80k, 650k and whole-genome sequence variants[J]. Genet Sel Evol, 2018, 50(1): 14. DOI:10.1186/s12711-018-0387-9
[64]	CLARK E L, ARCHIBALD A L, DAETWYLER H D, et al. From FAANG to fork: application of highly annotated genomes to improve farmed animal production[J]. Genome Biol, 2020, 21(1): 285. DOI:10.1186/s13059-020-02197-8


Acta Veterinaria et Zootechnica Sinica 2021, Vol. 52 Issue (12): 3323-3334. DOI: 10.11843/j.issn.0366-6964.2021.012.001