Journal of Zhejiang University (Agriculture and Life Sciences)  2017, Vol. 43 Issue (2): 164-152
Association analysis revealed importance of dominance effects on days to silk of maize nested association mapping (NAM) population
MONIR Md Mamun, Jun ZHU
Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China
Summary: A full model and a multi-locus additive model were used to analyze days to silk (DS, female flowering) in the maize nested association mapping (NAM) population. Analysis with the full model revealed that DS in the maize NAM population was controlled by many loci with small additive, dominance, and epistasis effects and their environmental interactions. Dominance-related effects had large impacts on the trait: the estimated total heritability was 79.86%, with 50.52% attributable to dominance-related effects. Environment-specific genetic effects were also important for DS, explaining 27.31% of the phenotypic variation. The full model identified 50 highly significant (-log10PEW>5) quantitative trait SNPs (QTSs), whereas the additive model identified 47 with low heritability (31.65%). Based on the association analysis results for DS, the genotypes and total genetic effects of superior lines and superior hybrids were predicted, which could be useful for future breeding programs.
Keywords: genome-wide association study; maize; days to silk; dominance effects

Flowering time is an important trait that reflects the capability of plants to adapt to local environments[1-2]. The transition from vegetative growth to flowering, which integrates different environmental cues, is crucial for plant reproductive success[3]. Flowering time is considered a major selection criterion in plant breeding[4]. Maize originated from Balsas teosinte (Zea mays ssp. parviglumis) in the Mexican highlands approximately 9 000 years ago and has since adapted to diverse ecological conditions[1]. Dissecting the genetic mechanisms of maize flowering time is crucial for evolutionary analysis and future breeding programs. Several studies have investigated the genetic architecture underlying maize flowering time by using quantitative trait locus (QTL) mapping and genome-wide association study (GWAS)[1-2, 5].

Dominance and epistasis are important phenomena in quantitative genetics. The complexity of genetic architecture can be largely attributed to epistasis, which plays a significant role in heterosis, inbreeding depression, adaptation, reproductive isolation, and speciation[6]. However, most GWAS in different organisms have ignored the impacts of dominance, epistasis, and environmental interaction, and ignoring these factors could be a major cause of the missing heritability in GWAS. Heterozygous genotypes generally occur at high proportions in random-mating and other specially designed populations. However, in whole-genome sequencing data with a large number of single nucleotide polymorphisms (SNPs), a small proportion of heterozygous genotypes can also be found in inbred lines of animals and crops, and these could have large impacts on phenotypic traits[7-8]. In this study, we attempted to assess the impacts of heterozygous genotypes on days to silk (DS) in the maize nested association mapping (NAM) population. To this end, a full model with additive, dominance, and epistasis effects and their environmental interactions was fitted with QTXNetwork[9] to dissect the genetic architecture of DS. The maize NAM population was constructed with only five generations of selfing within 25 diverse families[1, 5, 10]. The genotype data contained no heterozygous genotypes, only a small proportion of missing genotypes, and these missing genotypes were replaced by heterozygous genotypes in this study. An additive model with only additive (a) and additive-by-environment interaction (ae) effects was also analyzed for comparison. The genotypes and total genetic effects of the best line (BL), superior line (SL), and superior hybrid (SH) were compiled to assess the scope for improvement in future maize breeding.

1 Materials and methods

1.1 Genotype and phenotype data

The maize nested association mapping population developed in the United States (US-NAM) was used in this study. It was derived by crossing 25 diverse lines with B73, followed by self-pollination for five generations[5, 10]. Days to silk (DS) were scored in nine environments; to reduce computational complexity, data from four of them were analyzed. The genotype and phenotype data sets were downloaded from http://www.panzea.org/.

1.2 Statistical analysis

A recently developed association mapping approach, implemented in QTXNetwork, was used in this study. The approach has two distinct parts. First, the generalized multifactor dimensionality reduction (GMDR) method was used to scan SNPs in 1D for main effects and in 2D and 3D for epistatic interactions with the GMDR-GPU module[11] of QTXNetwork. Second, association mapping was conducted on the detected SNPs with the quantitative trait SNP (QTS) module of QTXNetwork. Two models were used for association mapping: a full genetic model and a multi-locus additive model. The full genetic model includes SNP locus effects (a, d, aa, ad, da, dd) as fixed effects, and environment (e) and locus-by-environment interactions (ae, de, aae, ade, dae, dde) as random effects for four environments (1 for E1, 2 for E2, 3 for E4, and 4 for E9). The multi-locus additive model includes only the a and ae effects. The statistical approaches of the full and additive models[12] were used to conduct the association analyses.
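The effect terms listed above can be illustrated with a minimal Python sketch of one common genotype coding (QQ = +1, Qq = 0, qq = -1 for the additive term; Qq = 1 and homozygotes = 0 for the dominance term; epistasis terms as products). The integer genotype codes, the function names, and the coding itself are assumptions made for illustration; the source does not state how QTXNetwork encodes genotypes internally.

```python
import numpy as np

# Hypothetical input coding (an assumption, not from the paper):
# 2 = QQ, 1 = Qq, 0 = qq, -1 = missing call.

def recode_missing_as_het(geno):
    """Replace missing calls (-1) with the heterozygote code (1),
    mirroring the replacement of missing genotypes described in the text."""
    geno = np.asarray(geno)
    return np.where(geno == -1, 1, geno)

def additive_code(geno):
    """Additive covariate: QQ -> +1, Qq -> 0, qq -> -1."""
    return np.asarray(geno) - 1

def dominance_code(geno):
    """Dominance covariate: Qq -> 1, homozygotes -> 0."""
    return (np.asarray(geno) == 1).astype(int)

def epistasis_codes(g1, g2):
    """Pairwise epistasis covariates as element-wise products."""
    a1, d1 = additive_code(g1), dominance_code(g1)
    a2, d2 = additive_code(g2), dominance_code(g2)
    return {"aa": a1 * a2, "ad": a1 * d2, "da": d1 * a2, "dd": d1 * d2}

if __name__ == "__main__":
    locus1 = recode_missing_as_het([2, -1, 0, 2])
    locus2 = recode_missing_as_het([0, 1, -1, 2])
    print(additive_code(locus1), dominance_code(locus1))
    print(epistasis_codes(locus1, locus2))
```

Under this coding, dominance and dominance-related epistasis covariates are non-zero only for individuals that are heterozygous at the locus, so the low heterozygote frequencies in an inbred population limit the information available for those terms.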

Henderson method Ⅲ[13] was used to calculate the F-statistic for association analysis. A total of 2 000 permutations were conducted to calculate the critical F-value for controlling the experiment-wise type Ⅰ error (αEW < 0.05). Parameters were estimated by using the Markov chain Monte Carlo (MCMC) algorithm with 20 000 Gibbs sampling iterations[9, 14-16]. The experiment-wise critical P value (PEW-value) was calculated by controlling the experiment-wise type Ⅰ error (PEW < 0.05).
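The idea of a permutation-based experiment-wise critical F-value can be sketched as follows. This is an illustrative simplification: it substitutes a plain single-marker regression F-statistic for the Henderson method Ⅲ mixed-model test, and it uses the maximum statistic over all markers in each permutation, which is one common way to control the experiment-wise error. Whether QTXNetwork proceeds in exactly this way is an assumption; the function names and the genotype matrix `X` are hypothetical.

```python
import numpy as np

def single_marker_f(x, y):
    """F-statistic (1 and n-2 df) for a simple linear regression of y on x."""
    x = x - x.mean()
    y = y - y.mean()
    sxx = np.dot(x, x)
    if sxx == 0:  # monomorphic marker carries no information
        return 0.0
    ss_reg = np.dot(x, y) ** 2 / sxx
    ss_res = np.dot(y, y) - ss_reg
    return ss_reg / (ss_res / (len(y) - 2))

def permutation_critical_f(X, y, n_perm=2000, alpha=0.05, seed=0):
    """Experiment-wise critical F: the (1 - alpha) quantile of the
    maximum F over all markers across phenotype permutations."""
    rng = np.random.default_rng(seed)
    max_f = np.empty(n_perm)
    for i in range(n_perm):
        y_perm = rng.permutation(y)  # break the genotype-phenotype link
        max_f[i] = max(single_marker_f(X[:, j], y_perm)
                       for j in range(X.shape[1]))
    return np.quantile(max_f, 1 - alpha)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.integers(0, 3, size=(200, 50)).astype(float)  # simulated genotypes
    y = rng.normal(size=200)                               # simulated phenotype
    print(permutation_critical_f(X, y, n_perm=200))
```

A marker is then declared significant only if its observed F exceeds this threshold, so the chance of any false positive across the whole scan is held near alpha.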

2 Results

2.1 Estimated heritability using full model

DS in the maize NAM population is a highly heritable trait[5]. The total heritability estimated with the full model was 79.86% for DS, mostly due to dominance and dominance-related epistasis effects ($h_{D+}^2 \,\hat{=}\, 50.52\%$) (Table 1), which indicates the importance of analyzing dominance-related effects even in inbred lines. A recent study showed that environment-specific effects are relatively unimportant for leaf orientation traits in the maize NAM population, contributing only 4.98%-7.32% of the phenotypic variation[7]. Unlike those traits, a large share of the heritability for DS was due to environment-specific effects ($h_{GE}^2 \,\hat{=}\, 27.31\%$), indicating that the genetic effects varied across environments.

Table 1 Estimated heritability (%) of genetic effects for days to silk using full model and additive model

2.2 Genetic architecture of DS

Association analyses for DS identified multiple loci with different genetic effects. The full model identified a total of 50 highly significant (-log10 PEW>5) QTSs (Fig. 1, Table S1 available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459). The identified QTSs had 64 genetic main effects and 54 environment-specific effects; environment-specific effects of QTSs therefore play important roles in DS in the NAM population. Despite the low frequency of heterozygous genotypes at the identified loci (8.21%-9.24% for loci with dominance effects, and 3.51%-9.03% for loci with dominance-related epistasis interactions), we observed large impacts of dominance-related effects on DS: although only three QTSs had highly significant dominance effects, five pairs of QTSs had highly significant dominance-related epistasis interactions (Table S1). Flowering time in plants results from interacting molecular pathways[17], and epistasis effects have been observed in Arabidopsis[18] and rice[19]. In this study, the full model identified a total of 24 pairs of QTSs with highly significant epistasis effects for DS in the NAM population. In contrast to self-fertilizing crop species, QTL mapping in the maize NAM population showed that flowering time is controlled by many loci with small effects[5]. Consistent with that QTL mapping of DS, association analysis with the full model estimated small genetic effects for the DS QTSs. The QTS with the largest positive individual effect (S10_113745101) had a dominance effect of only 1.43 days (-log10 PEW = 47.3), which could explain 2.92% of the phenotypic variation.
Likewise, the QTS with the largest negative individual effect (S1_172281879) had an additive × environment 1 (ae1) effect of -0.912 day (-log10 PEW = 51.5), which contributed 0.85% of the phenotypic variation, although the total additive effect of this QTS in environment 1 (a + ae1) was only -0.559 day. As with the individual genetic effects of loci, the estimated epistasis effects were also small. The largest epistasis effect, between QTSs S4_53677782 and S8_37237820, was a dominance × dominance (dd) effect of only 2.688 days (-log10 PEW = 22.3), which could explain 10.31% of the phenotypic variation. QTS S3_159869611 had the largest positive additive effect ($a \,\hat{=}\, 0.486$ day, -log10 PEW = 61.1), and QTS S2_109001252 had the largest negative additive effect ($a \,\hat{=}\, -0.408$ day, -log10 PEW = 43.3).
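As a quick check on the figures quoted for S1_172281879, and assuming the reported environment-1 total is the plain sum of the general additive effect and the ae1 interaction, the general additive effect implied by these rounded values (it is not reported directly in the text) is:

$$a = (a + ae_1) - ae_1 = -0.559 - (-0.912) = 0.353 \text{ day}$$

Under that assumption, the general additive effect and the environment-1 interaction act in opposite directions at this locus, which is why the net effect in environment 1 is smaller in magnitude than the interaction alone.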

Circle: QTS with additive effect; Square: QTS with dominant effect; Line between two QTSs: Epistasis effect; Red: QTS with general effects for two environments; Green: QTS with environment-specific effects; Blue: QTS with both general and environment-specific effects; Black: QTS with significant epistasis effects but without detected individual effects.
Fig. 1 G × G plot of detected significant QTSs (PEW < 0.05) for DS by using full model (DS_ADI) and additive model (DS_A) approaches

2.3 Candidate gene annotation

Candidate genes corresponding to the DS QTSs were collected from the Gramene database (http://ensembl.gramene.org/Zea_mays/). Functions of the candidate genes were searched in UniProt (http://www.uniprot.org/uniprot/) by using the accession numbers obtained from Gramene. Descriptions of some candidate genes were collected from the NCBI Gene database, and further functional information was gathered through literature searches on Google. Functions of some candidate genes are tabulated in Table S2 (available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459). Some of the candidate genes were members of well-known gene families with crucial functions in plant life. For example, QTS S1_172281879 is a variant near the C3HC4-type RING finger family protein gene GRMZM2G116714. C3HC4-type RING finger genes play important roles in various physiological processes, including growth, development, and stress responses[20]. QTS S3_54472637 is a variant of the MYB transcription factor protein gene GRMZM2G051256. MYB transcription factor proteins play regulatory roles in developmental processes and defense responses in plants[21]. The functions of most of the candidate genes are still unknown.

Table 2 Prediction of total genetic effects of days to silk

2.4 Prediction of best line, superior line, and superior hybrid for DS

Based on the association mapping results, the best line (BL), superior line (SL), and superior hybrid (SH) can be predicted for DS, which may help breeders in future breeding programs (Table 2). The overall total genetic effect of the non-B73 allele homozygous (QQ) combination was 2.25 days across environments, but it varied from 0.20 to 4.18 days among the four environments. The predicted total genetic effect of the F1 hybrid (1.95 days) was smaller than that of the non-B73 allele homozygous (QQ) genotype.

The maximum positive total genetic effect across environments was found for line Z012E0020 (6.83 days), referred to as the positive best line (best line (+)), whereas the environment-specific positive best lines were Z008E0050 (9.89 days) in environment 1, Z012E0124 (9.72 days) in environment 2, Z007E0043 (6.89 days) in environment 3, and Z012E0058 (9.27 days) in environment 4 (Table S3 available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459). The maximum negative total genetic effect across environments was found for line Z019E0177 (-5.72 days), referred to as the negative best line (best line (-)); its total genetic values ranged from -1.87 to -8.56 days in the four environments. The environment-specific negative best lines were Z024E0182 (-9.05 days) in environment 1, Z024E0114 (-6.16 days) in environment 2, Z010E0020 (-5.48 days) in environment 3, and Z024E0094 (-8.69 days) in environment 4. The total genetic values of these environment-specific best lines varied widely: from -2.50 to -9.05 days for line Z024E0182, from -2.57 to -7.36 days for line Z024E0114, from -2.11 to -5.48 days for line Z010E0020, and from -1.41 to -8.69 days for line Z024E0094. Therefore, no single line was the best across all environments for DS.

The predicted negative superior line (superior line (-)), together with its optimum combination of homozygous genotypes (QQ, qq), could provide insight for crop improvement. The overall total genetic effect of the predicted superior line was -7.11 days, which was 1.39 days earlier than that of the existing best line Z019E0177 (-5.72 days).

Similarly, the total genetic effect of the negative superior hybrid, which uses the optimum combination of homozygous (QQ, qq) and heterozygous (Qq) genotypes, was -11.80 days, 6.08 days earlier than the existing line Z019E0177, indicating that the predicted superior hybrid offers greater scope for further improvement than the predicted superior line. The optimum genotypes at each locus for the predicted lines are tabulated in Table S4 (available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459) and could help breeders in further crop improvement.
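The reason a superior hybrid can never do worse than a superior line may be seen in a minimal Python sketch. It assumes a purely additive-dominance coding per locus (QQ = +a, Qq = d, qq = -a) and deliberately ignores the epistasis and environment-interaction terms that the full model includes, so it is not the prediction procedure used by QTXNetwork. The function names and the effect values in the usage example are hypothetical.

```python
import numpy as np

def locus_values(a, d):
    """Genotypic value per locus for QQ, Qq, and qq (columns in that order)
    under an additive-dominance coding: QQ = +a, Qq = d, qq = -a."""
    a = np.asarray(a, dtype=float)
    d = np.asarray(d, dtype=float)
    return np.stack([a, d, -a], axis=1)

def superior_line(a, d, direction=-1):
    """Best total effect using homozygous genotypes only (QQ or qq).
    direction=-1 seeks the earliest flowering, +1 the latest."""
    vals = locus_values(a, d)[:, [0, 2]] * direction
    return direction * vals.max(axis=1).sum()

def superior_hybrid(a, d, direction=-1):
    """Best total effect when heterozygotes (Qq) are also allowed."""
    vals = locus_values(a, d) * direction
    return direction * vals.max(axis=1).sum()

if __name__ == "__main__":
    # Hypothetical effects (days) at two loci, for illustration only.
    a = [0.5, -0.4]
    d = [-1.0, 0.2]
    print("superior line (-):  ", superior_line(a, d))
    print("superior hybrid (-):", superior_hybrid(a, d))
```

Because the hybrid chooses from a superset of the genotypes available to the line at every locus, its optimum is at least as extreme; the gain comes from loci whose dominance effect exceeds the additive effect in the desired direction.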

2.5 Association mapping with additive model

The additive model identified 47 highly significant QTSs, of which 31 were also identified by the full model (Fig. 1). As with the full model, the effects estimated with the additive model were small. The total heritability estimated with the additive model was 31.65%, less than half of that estimated with the full model (Table 1), illustrating the problem of missing heritability when only an additive model is used. Therefore, ignoring dominance and epistasis interactions may lead to substantial underestimation of the heritability of complex traits.
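The comparison can be made explicit with the heritability estimates quoted in the text. The ratios below are simple arithmetic on those rounded values, not additional estimates:

$$\frac{h^2_{\text{additive model}}}{h^2_{\text{full model}}} = \frac{31.65}{79.86} \approx 0.396, \qquad \frac{h^2_{D+}}{h^2_{\text{full model}}} = \frac{50.52}{79.86} \approx 0.633$$

That is, the additive model recovered about 40% of the heritability estimated with the full model, and dominance-related effects made up about 63% of the full-model total.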

3 Discussion