Extreme plastid RNA editing may confound phylogenetic reconstruction: A case study of Selaginella (lycophytes)
Xin-Yu Du a,b, Jin-Mei Lu a, De-Zhu Li a,b
a. Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, 132 Lanhei Road, Kunming, Yunnan 650201, China;
b. Kunming College of Life Science, University of Chinese Academy of Sciences, 19 Qingsong Road, Kunming, Yunnan 650201, China
Abstract: Cytidine-to-uridine (C-to-U) RNA editing is common in coding regions of organellar genomes throughout land plants. In most cases RNA editing alters translated amino acids or creates new start codons, potentially confounding phylogenetic reconstructions. In this study, we used the spike moss genus Selaginella (lycophytes), which has the highest frequency of RNA editing, as a model to test the effects of extreme RNA editing on phylogenetic reconstruction. We predicted the C-to-U RNA editing sites in coding regions of 18 Selaginella plastomes, and reconstructed the phylogenetic relationships within Selaginella based on three data set pairs consisting of plastome or RNA-edited coding sequences, first and second codon positions, and translated amino acid sequences, respectively. We predicted between 400 and 3100 RNA editing sites per plastome in the 18 Selaginella plastomes. The numbers of RNA editing sites in plastomes were highly correlated with the GC content of first and second codon positions, but not correlated with the GC content of plastomes as a whole. Contrasting phylogenetic analyses showed that there were substantial differences (e.g., the placement of clade B in Selaginella) between the phylogenies generated by the plastome and RNA-edited data sets. This empirical study provides evidence that extreme C-to-U RNA editing in the coding regions of organellar genomes alters the sequences used for phylogenetic reconstruction, and might even confound phylogenetic reconstruction. Therefore, RNA editing sites should be corrected when plastid or mitochondrial genes are used for phylogenetic studies, particularly in those lineages with abundant organellar RNA editing sites, such as hornworts, quillworts, spike mosses, and some seed plants.
Keywords: GC content    Land plants    Organellar genome    Phylogenomics    RNA editing    
1. Introduction

RNA editing is a post-transcriptional modification of RNA that occurs in nuclear and/or organellar genomes of some organisms (Maier et al., 1996; Gott and Emeson, 2000). In land plants, the most prevalent type of RNA editing in organellar genomes is cytidine-to-uridine (C-to-U) editing, while U-to-C editing only occurs in some plants, e.g., hornworts, lycophytes, and ferns (Chateigner-Boutin and Small, 2011; Guo et al., 2015; Knie et al., 2016; Ichinose and Sugita, 2017). The frequency of organellar RNA editing in plants varies from zero to thousands of sites (Ichinose and Sugita, 2017; Lenz et al., 2018). Most RNA editing events serve to restore the evolutionarily conserved amino acid residues in mRNAs or to create start and stop codons (Chateigner-Boutin and Small, 2011; Takenaka et al., 2013). Therefore, RNA editing is an essential process to correct defective genes and maintain genetic information (Knoop, 2011; Small et al., 2020). For plant molecular phylogeny, plastid DNA sequences are the primary data source (Gitzendanner et al., 2018), and over the last 30 years, protein-coding sequences or translated amino acid sequences have been the most commonly used sequences (Hasebe et al., 1995; Jansen et al., 2007; Ruhfel et al., 2014; Nie et al., 2020).

The abundance of RNA editing sites in DNA sequences raises the possibility that these sites bias phylogenetic analysis. Some early studies tested the effects of organellar RNA editing on phylogenetic reconstruction by comparing phylogenies generated from DNA and cDNA sequences. These studies generally suggested that RNA editing has little or no effect on DNA-based phylogenetic reconstruction (Bowe and dePamphilis, 1996; Varigerow et al., 1999; Szmidt et al., 2001; Petersen et al., 2006, 2013). However, these studies used only a few organellar genes (mostly mitochondrial genes), and some sampled lineages had relatively low RNA editing frequency (Ichinose and Sugita, 2017; Lenz et al., 2018). Therefore, a reevaluation of the effect of RNA editing on phylogenetic reconstruction at a genomic level is essential, especially in highly RNA-edited lineages, such as hornworts, lycophytes, and ferns (Ichinose and Sugita, 2017; Lenz et al., 2018).

The frequency of RNA editing in plastid genomes (3415 editing sites) is highest in Selaginella uncinata (Desv. ex Poir.) Spring (Oldenkott et al., 2014); for mitochondrial genomes the highest frequency of RNA editing (2152 editing sites) is found in another species of Selaginella P. Beauv., Selaginella moellendorffii Hieron. (Hecht et al., 2011). In both species, RNAs are exclusively modified from "C" to "U" and editing mainly occurs in protein-coding regions (Hecht et al., 2011; Oldenkott et al., 2014). Selaginella, the single genus in Selaginellaceae (Selaginellales or spike moss), is distributed worldwide and contains about 700–800 species (PPG, 2016; Weststrand and Korall, 2016). All sequenced Selaginella plastomes have a remarkably high GC content (50.7–56.5%) (Zhang et al., 2020), and because high GC content is thought to be correlated with organellar RNA editing, Selaginella species are expected to have high levels of RNA editing (Malek et al., 1996; Tsuji et al., 2007; Smith, 2009; Hecht et al., 2011; Oldenkott et al., 2014).

Previous studies have shown that RNA editing sites can be verified by comparing genomic DNA sequences with cDNA sequences (Wolf et al., 2004; Oldenkott et al., 2014), or can be predicted by comparing genomic DNA sequences with verified DNA or amino acid sequences among closely related species (Lenz et al., 2018). Prediction of C-to-U RNA editing in coding regions is relatively reliable because most editing events tend to restore conserved codons (Oldenkott et al., 2014; Lenz et al., 2018; Small et al., 2020; Gerke et al., 2020). In this study, we used Selaginella as a research model to test the effect of extreme plastid RNA editing on phylogenetic reconstruction. We first predicted RNA editing sites in plastid protein-coding sequences and then compared phylogenetic reconstructions of Selaginella derived from plastome and RNA-edited DNA/amino acid sequences.

2. Materials and methods
2.1. Taxon sampling and predicting RNA editing sites

We downloaded 18 Selaginella plastome sequences from GenBank (Table 1), and then predicted the RNA editing sites in protein-coding regions of these plastomes using the online plant RNA editing prediction and analysis tool PREPACT 3 (Lenz et al., 2018). Predictions were conducted under BLASTX mode with default parameters, and the plastid protein database of S. uncinata (modified from AB197035.2, 76 protein sequences, 3488 RNA editing records) was selected as reference.

Table 1 Taxon sampling and number of predicted RNA editing sites for Selaginella plastomes, as well as the GC content in different data sets and percentage of altered amino acids (AA) due to RNA editing events.
No. | Taxon | GenBank accession No. | Size (bp) | No. of RNA editing sites | GC% | GC% (corrected) | GC% of CDS | GC% of CDSr | GC% of codon1-2 | GC% of codon1r-2r | Altered AA (%)
1 | Selaginella bisulcata Spring | NC041640 | 140,509 | 2876 | 52.8 | 50.8 | 54.0 | 49.5 | 56.5 | 49.9 | 12.8
2 | Selaginella doederleinii Hieron. | NC041641 | 142,752 | 2680 | 51.1 | 49.2 | 50.8 | 46.8 | 54.0 | 48.0 | 11.6
3 | Selaginella exaltata (Kunze) Spring | MN427927 | 117,523 | 1089 | 52.0 | 50.9 | 50.9 | 48.7 | 52.5 | 49.2 | 7.4
4 | Selaginella hainanensis X.C. Zhang & Noot. | NC041642 | 144,201 | 3100 | 54.8 | 52.7 | 54.2 | 49.7 | 56.7 | 50.0 | 13.0
5 | Selaginella indica (Milde) R.M. Tryon | NC041098 | 122,460 | 1130 | 53.6 | 52.7 | 51.9 | 49.9 | 52.9 | 49.8 | 6.0
6 | Selaginella kraussiana (Kunze) A. Braun | NC040926 | 129,971 | 1256 | 52.3 | 51.3 | 51.5 | 49.7 | 52.2 | 49.5 | 5.2
7 | Selaginella lepidophylla (Hook. & Grev.) Spring | NC040927 | 114,693 | 761 | 51.9 | 51.2 | 50.2 | 48.8 | 51.1 | 49.0 | 4.1
8 | Selaginella lyallii (Hook. & Grev.) Spring | NC041556 | 110,411 | 405 | 50.7 | 50.3 | 49.4 | 48.7 | 50.1 | 49.0 | 2.2
9 | Selaginella moellendorffii Hieron. | MG272484 | 143,525 | 2357 | 51.0 | 49.4 | 50.4 | 46.9 | 53.3 | 48.1 | 10.1
10 | Selaginella nummularifolia Ching | MK622381 | 148,924 | 2267 | 50.0 | 48.7 | 49.7 | 46.4 | 52.7 | 47.8 | 10.1
11 | Selaginella pennata (D. Don) Spring | NC041643 | 138,024 | 3075 | 52.9 | 50.7 | 54.2 | 49.4 | 56.8 | 49.6 | 13.8
12 | Selaginella remotifolia Spring | NC041644 | 131,867 | 1474 | 56.5 | 55.4 | 54.6 | 52.4 | 54.2 | 50.8 | 6.5
13 | Selaginella rossii (Baker) Warb. | MK622382 | 146,469 | 2813 | 51.0 | 48.8 | 50.9 | 46.6 | 54.4 | 48.0 | 13.0
14 | Selaginella sanguinolenta (L.) Spring | NC041645 | 147,148 | 2866 | 50.8 | 48.9 | 50.8 | 46.5 | 54.3 | 47.9 | 12.3
15 | Selaginella stauntoniana Spring | MK622384 | 126,835 | 1858 | 54.0 | 52.7 | 52.7 | 49.4 | 54.4 | 49.5 | 10.2
16 | Selaginella tamariscina (P. Beauv.) Spring | NC041646 | 126,700 | 1776 | 54.1 | 52.7 | 52.7 | 49.5 | 54.3 | 49.5 | 9.2
17 | Selaginella uncinata (Desv. ex Poir.) Spring | AB197035 | 144,170 | 3079 | 54.8 | 52.8 | 54.2 | 49.1 | 56.7 | 49.8 | 13.0
18 | Selaginella vardei H. Lév. | NC041099 | 121,254 | 1098 | 53.2 | 52.3 | 51.7 | 49.7 | 52.7 | 49.7 | 5.8
19 | Dendrolycopodium obscurum (L.) A. Haines | NC040923 | 160,877 | \ | 35.0 | \ | 36.3 | \ | 42.2 | \ | \
20 | Diphasiastrum digitatum (Dill. ex A. Braun) Holub | NC040993 | 159,614 | \ | 35.7 | \ | 36.8 | \ | 42.5 | \ | \
21 | Huperzia serrata (Thunb.) Trevis. | NC033874 | 154,176 | \ | 36.3 | \ | 37.0 | \ | 42.7 | \ | \
22 | Lycopodium clavatum L. | NC040994 | 151,819 | \ | 34.5 | \ | 35.4 | \ | 41.6 | \ | \
23 | Isoëtes piedmontana (N. Pfeiff.) C.F. Reed | NC040925 | 145,030 | \ | 38.0 | \ | 39.3 | \ | 44.5 | \ | \
24 | Isoëtes yunguiensis Q.F. Wang & W.C. Taylor | NC041146 | 145,355 | \ | 38.0 | \ | 39.3 | \ | 44.5 | \ | \
2.2. Data set construction

Predicted editing information for each plastome was transformed into feature file format using Notepad++ (https://notepad-plus-plus.org/), and was then added to the original GenBank flat file using Sequin (NCBI). Six lycophyte plastomes (two species of Isoëtaceae or quillworts and four species of Lycopodiaceae or club mosses) were downloaded as outgroups (Table 1). Coding sequences of each gene were extracted using Geneious 8 (Kearse et al., 2012). Genes missing from more than half of the sampled Selaginella species were excluded from phylogenetic analyses. Coding sequences of each gene were aligned based on translated protein sequences, and obvious alignment gaps (occurring in > 50% of samples) were trimmed by codon using Geneious 8 (Kearse et al., 2012). Seventy-five genes were finally aligned and concatenated to produce a coding sequence (CDS) matrix (Table 2). To test the effect of plastid RNA editing on phylogenetic reconstruction, we constructed three contrasting data set pairs, each consisting of the genomic DNA/amino acid sequences and the corresponding RNA-edited (or corrected) sequences. Specifically, to produce the contrasting CDSr matrix, the "C" at each predicted editing site was replaced by "T" using Geneious 8 (Kearse et al., 2012) (Table 2). To produce two new data sets named codon1-2 and codon1r-2r, first and second codon positions of the CDS and CDSr matrices, respectively, were extracted using Geneious 8 (Kearse et al., 2012) (Table 2). Finally, the CDS and CDSr matrices were translated into amino acid matrices named AA and AAr, respectively (Table 2), using MEGA 6 (Tamura et al., 2013).
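The relationship between the three data set pairs can be illustrated with a minimal Python sketch. It assumes a single ungapped, in-frame coding sequence, 0-based editing positions, and the standard genetic code; the toy sequence, the site list, and all function names are hypothetical, and the actual matrices were built with Geneious 8 and MEGA 6 rather than with this code.

```python
# Toy illustration of how CDSr, codon1r-2r and AAr derive from CDS (hypothetical data).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def apply_edits(cds, sites):
    """Replace 'C' with 'T' at each predicted editing site (0-based positions)."""
    seq = list(cds)
    for pos in sites:
        if seq[pos] != "C":
            raise ValueError(f"position {pos} is {seq[pos]!r}, not 'C'")
        seq[pos] = "T"
    return "".join(seq)


def codon12(cds):
    """Keep only the first and second position of every complete codon."""
    end = len(cds) - len(cds) % 3
    return "".join(cds[i:i + 2] for i in range(0, end, 3))


def translate(cds):
    """Translate every complete codon; '*' marks a stop codon."""
    end = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, end, 3))


cds = "ACGCCATCACGA"          # hypothetical genomic coding sequence
edit_sites = [1, 4, 7, 9]     # hypothetical predicted C-to-U sites

cds_r = apply_edits(cds, edit_sites)
print(cds, codon12(cds), translate(cds))        # CDS, codon1-2, AA
print(cds_r, codon12(cds_r), translate(cds_r))  # CDSr, codon1r-2r, AAr
```

In this toy case the edits create a start codon (ACG to ATG) and a stop codon (CGA to TGA), the two kinds of codon-creating events mentioned in the Introduction.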

Table 2 Characteristics of data sets applied in the phylogenetic analyses and tree length of resultant trees.
Data set | No. of taxa | Length (nt/aa) | RNA editing sites (nt/aa) | Pairwise identity (%) | GC content (outgroup - ingroup) (%) | Tree length
CDS | 24 | 64,452 | 35,285 | 62.8 | 48.1 (37.4–51.9) | 3.260380
CDSr | 24 | 64,395 | \ | 65.0 | 45.7 (37.4–48.7) | 3.248279
codon1-2 | 24 | 42,968 | 34,889 | 67.4 | 51.0 (43.0–53.9) | 1.995459
codon1r-2r | 24 | 42,930 | \ | 70.6 | 47.5 (43.0–49.1) | 1.655210
AA | 24 | 21,484 | \ | 57.5 | \ | 3.552255
AAr | 24 | 21,465 | \ | 63.4 | \ | 2.872170
Abbreviations: nt: nucleotides, aa: amino acids.
2.3. Phylogenetic analyses

The best-fit substitution model for all DNA matrices, as determined by jModelTest 2 (Posada, 2008), was GTR + I + G, and all DNA matrices were partitioned by codon position. For the amino acid matrices, the best-fit substitution models were determined automatically by the phylogenetic inference software, i.e., the substitution model was set to PROTGAMMAAUTO. The maximum likelihood (ML) method, implemented in RAxML 8.2.10 (Stamatakis, 2014), was used to infer the phylogenetic relationships within Selaginella. A thorough tree search for the best ML tree was performed, and bootstrap analyses were performed with 1000 replicates; bipartition information from the bootstrap trees was drawn on the best ML tree. Possible correlations between the number of plastid RNA editing sites, plastome GC content, CDS GC content, and the GC content of the codon1-2 data set were tested using the ape package in R (https://www.r-project.org/). Non-independence among species due to phylogenetic signal was corrected using the phylogenetically independent contrasts (PIC) method (Felsenstein, 1985).
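For readers unfamiliar with PIC, the following Python sketch computes Felsenstein's (1985) contrasts on a fully bifurcating tree and then correlates two traits through the origin. The three-taxon tree, its branch lengths, the trait values, the nested-tuple tree encoding, and the function names are all hypothetical and chosen for brevity; the analyses reported here were run with the ape package in R, not with this code.

```python
import math


def pic(node, trait):
    """Return (ancestral value, extra branch length, contrasts) for a node.

    A tip is a string name. An internal node is a pair
    ((left_subtree, left_branch_length), (right_subtree, right_branch_length)).
    """
    if isinstance(node, str):
        return trait[node], 0.0, []
    (left, v_left), (right, v_right) = node
    x_l, extra_l, c_l = pic(left, trait)
    x_r, extra_r, c_r = pic(right, trait)
    v_l = v_left + extra_l    # branch lengthened for estimation uncertainty
    v_r = v_right + extra_r
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)          # standardized contrast
    value = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)  # weighted ancestral value
    extra = v_l * v_r / (v_l + v_r)
    return value, extra, c_l + c_r + [contrast]


def origin_correlation(a, b):
    """Correlation of two contrast vectors, forced through the origin."""
    num = sum(x * y for x, y in zip(a, b))
    return num / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


# Hypothetical tree ((A:1,B:1):1,C:2) and hypothetical trait values.
tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))
editing_sites = {"A": 1000.0, "B": 3000.0, "C": 2000.0}
gc_codon12 = {"A": 51.0, "B": 56.0, "C": 53.0}

_, _, site_contrasts = pic(tree, editing_sites)
_, _, gc_contrasts = pic(tree, gc_codon12)
r = origin_correlation(site_contrasts, gc_contrasts)
print(site_contrasts, gc_contrasts, r)
```

A tree with n tips yields n - 1 contrasts, which are treated as independent observations; this is why the correlation is computed on contrasts rather than on the raw species values.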

3. Results
3.1. Predicted RNA editing sites

The numbers of predicted C-to-U RNA editing sites in protein-coding regions of the 18 Selaginella plastomes are shown in Table 1. Detailed information on the predicted RNA editing sites is provided in the Appendix. Of the 18 plastomes examined, S. lyallii had the fewest predicted editing sites (405), whereas S. hainanensis, Selaginella pennata, and S. uncinata had the most, ranging from 3075 to 3100 (Table 1). Selaginella uncinata was predicted to have 3079 editing sites, compared with 3415 sites verified by transcriptome data (Oldenkott et al., 2014). Selaginella lepidophylla and S. kraussiana were predicted to have 761 and 1256 editing sites in our study, respectively, compared with 720 and 1353 sites verified by transcriptome data, respectively (Smith, 2020).
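The agreement between predicted and verified counts can be made explicit with a short calculation. The sketch below uses only the counts quoted above; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Predicted (this study) versus transcriptome-verified editing site counts.
counts = {
    "S. uncinata": (3079, 3415),
    "S. lepidophylla": (761, 720),
    "S. kraussiana": (1256, 1353),
}


def relative_deviation(predicted, verified):
    """Signed deviation of the prediction from the verified count, in percent."""
    return 100.0 * (predicted - verified) / verified


deviations = {sp: relative_deviation(p, v) for sp, (p, v) in counts.items()}
for species, dev in deviations.items():
    print(f"{species}: {dev:+.1f}%")
```

For all three species the prediction lies within 10% of the verified count, with under-prediction for S. uncinata and S. kraussiana and slight over-prediction for S. lepidophylla.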

The characteristics of the three data set pairs that were used in phylogenetic analyses are shown in Table 2. The CDS matrix has an aligned length of 64,452 bp and contains 35,285 editing sites. The codon1-2 matrix has an aligned length of 42,968 bp and contains 34,889 editing sites.

3.2. Correlations between RNA editing and GC content

The GC content of the 18 Selaginella plastomes varied from 50.0% to 56.5% (Table 1). The corrected data sets CDSr and codon1r-2r had higher pairwise identity and lower GC content than their uncorrected counterparts (Table 2). For the CDS and CDSr data set pair, GC content of the ingroup dropped from 51.9% to 48.7% (Table 2). For the codon1-2 and codon1r-2r data set pair, GC content of the ingroup dropped from 53.9% to 49.1% (Table 2). Correlation analysis of RNA editing sites and GC content in different data sets is shown in Table 3.
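The size of these GC drops can be cross-checked with simple arithmetic: each C-to-T correction removes exactly one G/C base, so the expected drop in GC percentage is roughly the number of editing sites divided by the number of ingroup nucleotides. The Python sketch below applies this to the Table 2 values. It assumes 18 ingroup taxa and ignores alignment gaps, so it should slightly underestimate the drop; the function name and data layout are illustrative.

```python
N_INGROUP = 18  # Selaginella plastomes in the matrices

# data set: (aligned length in nt, editing sites, ingroup GC% before, ingroup GC% after)
matrices = {
    "CDS": (64452, 35285, 51.9, 48.7),
    "codon1-2": (42968, 34889, 53.9, 49.1),
}


def expected_gc_drop(aligned_length, n_edits, n_taxa):
    """Approximate drop in GC percentage if every edit removes one G/C base."""
    return 100.0 * n_edits / (aligned_length * n_taxa)


estimates = {}
for name, (length, edits, gc_before, gc_after) in matrices.items():
    estimates[name] = expected_gc_drop(length, edits, N_INGROUP)
    reported = gc_before - gc_after
    print(f"{name}: expected drop ~{estimates[name]:.2f}, reported {reported:.1f}")
```

Under these assumptions the estimated drops are close to, and slightly below, the reported differences, which is consistent with the GC decrease being driven by the C-to-T corrections themselves.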

Table 3 The significance of correlations between number of RNA editing sites in Selaginella plastomes and GC content of different data sets using PIC.
P value | RNAe | GCpt | GCcds | GCcodon1-2
RNAe | / | 0.2744 | 0.0120* | 1.58E-05**
GCpt | 0.2744 | / | 2.41E-04** | 0.01662*
GCcds | 0.0120* | 2.41E-04** | / | 3.97E-07**
GCcodon1-2 | 1.58E-05** | 0.01662* | 3.97E-07** | /
Note: *: P < 0.05, **: P < 0.01. Abbreviations: RNAe: number of RNA editing sites in plastomes, GCpt: GC content of plastomes, GCcds: GC content of CDS regions, GCcodon1-2: GC content of first and second codon positions, PIC: the phylogenetically independent contrasts method.

The number of RNA editing sites in Selaginella plastomes was not correlated (P > 0.05) with the GC content of plastomes, but was highly correlated (P < 0.01) with the GC content of the first and second codon positions (Table 3). For example, the plastomes of the related species Selaginella remotifolia and S. kraussiana were similar in size (131,867 bp versus 129,971 bp) and number of RNA editing sites (1474 versus 1256), but had very different GC content (56.5% versus 52.3%). In fact, S. remotifolia had the highest GC content among all 18 sampled Selaginella species, even higher than the most RNA-edited species examined, i.e., S. hainanensis (Table 1). The plastomes of S. lyallii, S. moellendorffii, and S. hainanensis have 405, 2357, and 3100 predicted editing sites, respectively, and the GC contents of their first and second codon positions are 50.1%, 53.3%, and 56.7%, respectively (Table 1). Interestingly, species of clade D in Selaginella (Fig. 1) had more editing sites (2267–2866) and lower GC contents (50.2–50.8%), whereas species of clade A in Selaginella (Fig. 1) had fewer editing sites (1776–1858) and higher GC contents (54.0–54.1%), although the two clades share very similar GC contents at the first and second codon positions (52.7–54.4%) (Table 1). The present results also show that both the GC content of plastomes and the GC content of first and second codon positions are highly correlated (P < 0.01) with the GC content of CDS regions (Table 3).
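As a rough cross-check of this pattern, the Python sketch below computes plain Pearson correlations directly from the Table 1 values for the 18 Selaginella plastomes. This is deliberately not the PIC analysis behind Table 3: it ignores phylogenetic non-independence and produces no P values, so it only indicates the direction and approximate strength of the raw associations. Variable and function names are illustrative.

```python
import math

# Values transcribed from Table 1, taxa 1-18 in table order.
sites = [2876, 2680, 1089, 3100, 1130, 1256, 761, 405, 2357,
         2267, 3075, 1474, 2813, 2866, 1858, 1776, 3079, 1098]
gc_plastome = [52.8, 51.1, 52.0, 54.8, 53.6, 52.3, 51.9, 50.7, 51.0,
               50.0, 52.9, 56.5, 51.0, 50.8, 54.0, 54.1, 54.8, 53.2]
gc_codon12 = [56.5, 54.0, 52.5, 56.7, 52.9, 52.2, 51.1, 50.1, 53.3,
              52.7, 56.8, 54.2, 54.4, 54.3, 54.4, 54.3, 56.7, 52.7]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


r_plastome = pearson(sites, gc_plastome)
r_codon12 = pearson(sites, gc_codon12)
print(f"editing sites vs plastome GC:  r = {r_plastome:.2f}")
print(f"editing sites vs codon1-2 GC:  r = {r_codon12:.2f}")
```

Even without the phylogenetic correction, the raw association with whole-plastome GC content is weak, whereas the association with GC content at first and second codon positions is strong, in line with Table 3.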

Fig. 1 Phylogenetic trees derived from three contrasting data set pairs, with a histogram showing the number of RNA editing sites in each Selaginella plastome. Note: Branches with 100% bootstrap support are indicated by "*"; otherwise, values are indicated along the branches. Names of clades (A–E) in Selaginella are indicated in panels a–f. Abbreviations: S. = Selaginella, De. = Dendrolycopodium, Di. = Diphasiastrum, H. = Huperzia, L. = Lycopodium, I. = Isoëtes.
3.3. Phylogenetic results

Phylogenetic trees resulting from the three data set pairs are shown in Fig. 1a–f. Numbers of RNA editing sites in Selaginella plastomes were mapped on the phylogenetic tree of codon1r-2r (Fig. 1f). The main clades of Selaginella are indicated as clades A–E in Fig. 1a–f. The phylogenetic trees derived from corrected data sets were shorter than those derived from plastome data sets (Table 2). For example, compared with the tree length of 3.552255 derived from data set AA, that from data set AAr decreased to 2.872170 (Table 2, Fig. 1b, e). The phylogenetic results based on all three plastome data sets support the sister relationship between clades B and C (ML bootstrap support, MLBS = 55/100/100; Fig. 1a–c), while the results based on all three corrected data sets support the sister relationship between clades B and A with high bootstrap values (all MLBS = 100; Fig. 1d–f). Five of six analyses strongly support the sister relationship of clade D and clade A + B + C (all MLBS = 100; Fig. 1a–c, e, f), but the analysis based on the CDSr matrix supports the sister relationship of clade D and all other Selaginella species (MLBS = 100; Fig. 1d). The results of the three analyses based on plastome data sets (CDS, codon1-2, and AA) in the present study (Fig. 1a–c) were mostly consistent with the results of Zhang et al. (2020).

4. Discussion
4.1. Variation of predicted RNA editing in Selaginella plastomes

The predicted numbers of C-to-U RNA editing sites in the plastomes of three Selaginella species (S. kraussiana, S. lepidophylla and S. uncinata) were close to the numbers verified with transcriptome data in previous studies, differing by less than 10% in each case (Oldenkott et al., 2014; Smith, 2020). Therefore, the predictions appear to be largely accurate and reliable. Generally, the number of RNA editing sites was relatively uniform among species within each of the Selaginella clades A–E (e.g., 2876–3100 in the four species of clade B; Fig. 1f, Table 1), but varied greatly between clades; e.g., the numbers of editing sites in the two species of clade A are 1776 and 1858, which are much lower than those of its sister group, clade B (2876–3100) (Fig. 1f, Table 1). Zhang et al. (2019, 2020) suggested that species in clade E have lower substitution rates and simpler plastome structures, whereas species in clades A–D have higher substitution rates and more complex plastome structures. Our analyses indicate that the frequency of RNA editing in Selaginella plastomes may be positively correlated with the variation in plastome structure and the substitution rate. However, the two species of clade A are an exception. Compared with species in clade D, species in clade A had fewer RNA editing sites (Fig. 1f, Table 1), but more complex plastome structures and an accelerated substitution rate (Zhang et al., 2019, 2020). More comprehensive taxon sampling and transcriptomic evidence are needed to illuminate the evolutionary patterns of RNA editing in Selaginella.

4.2. Correlations between RNA editing and GC content

Correlation analysis indicated that the number of RNA editing sites in Selaginella plastomes was highly correlated with the GC content of the first and second codon positions, but not correlated with the GC content of plastomes (Table 3). The GC content of plastomes was only moderately correlated with the GC content of first and second codon positions (Table 3). A similar pattern has been observed in the mitochondrial genomes of ferns and gymnosperms (Guo et al., 2016, 2017). The GC content of the plastid genomes of lycophytes varies extensively, with ca. 35–36% in Lycopodiales, ca. 38% in Isoëtales, and ca. 51–57% in Selaginellales (Mower et al., 2019; Zhang et al., 2019, 2020). The present study showed that even for the corrected data set consisting of the most frequently edited first and second codon positions (codon1r-2r), the average GC content of the sampled Selaginella species (49.1%) was still higher than those of club mosses (41.6–42.7%) and quillworts (44.5%) (Table 1, Table 2). The GC content of non-coding sequences, which usually make up about half of the plastome, is generally higher than 50%; however, RNA editing is rarely found in non-coding sequences (Oldenkott et al., 2014). Our study implies that there may be other mechanism(s) associated with the GC content of both coding and non-coding regions of plastome sequences.

4.3. Impact of RNA editing on phylogeny

The prediction of RNA editing sites in plastomes of Selaginella showed that, on average, about 10% of the translated amino acids (2.2–13.8% per plastome) were altered by RNA editing (Table 1). Previous studies have indicated that genomic DNA sequences may carry considerable homoplasy but have a greater number of variable sites; in contrast, RNA-edited DNA sequences may be more phylogenetically informative, although they have fewer variable sites because C-to-U editing tends to restore conserved codons (Bowe and dePamphilis, 1996; Varigerow et al., 1999; Petersen et al., 2006). Our results show that all the corrected data sets had higher pairwise identity and thus contained less phylogenetic information (Table 2); however, the bootstrap values did not decrease significantly (Fig. 1), although the phylogenetic trees derived from corrected data sets were shorter (Table 2).

Contrasting analyses showed that extreme C-to-U RNA editing in coding regions of plastomes clearly altered the outcome of phylogenetic reconstruction. There were some substantial differences in the relationships among clades A–E in Selaginella between the results of the plastome and corrected data sets (Fig. 1). The three analyses based on the plastome data sets (CDS, codon1-2, and AA) (Fig. 1a–c) supported the sister relationship between clades B and C, which is consistent with the results of Zhang et al. (2020), which were obtained using similar data sets and tree inference strategies. However, all three analyses based on corrected data sets (CDSr, codon1r-2r, and AAr) strongly supported the sister relationship between clades A and B (Fig. 1d–f). Moreover, the results based on the corrected data set CDSr supported the sister relationship of clade D and all other Selaginella species (Fig. 1d), which is similar to the results of Zhou et al. (2016); however, the five other analyses supported the sister relationship between clade D and the aggregate of clade A + B + C. Overall, the infrageneric relationships within the large genus Selaginella still need further research with adequate sampling of both taxa and molecular characters, as well as correction of RNA editing sites.

5. Conclusions

In this study, we found that extreme RNA editing in plastomes can substantially affect or even alter the results of phylogenetic reconstruction. Specifically, when we compared phylogenetic reconstructions of Selaginella based on plastome and RNA-edited data sets, we found that the inferred relationships among Selaginella clades differed. Therefore, we recommend that researchers correct RNA editing sites when using plastid or mitochondrial genes in phylogenetic analyses, especially in those lineages with abundant organellar RNA editing sites, such as hornworts, quillworts, spike mosses, and some seed plants.

Author contributions

J.-M.L. and D.-Z.L. designed research; X.-Y.D. performed research and analyzed data; and X.-Y.D., J.-M.L., and D.-Z.L. wrote the paper.

Declaration of Competing Interest

The authors declare no conflicts of interest.

Acknowledgements

We thank Dr. Peng-Fei Ma for improving the manuscript. We also thank the two anonymous reviewers for their constructive comments and suggestions. The study was supported by the Strategic Priority Research Program, Chinese Academy of Sciences, China (XDB 31000000); the National Natural Science Foundation of China (31970232); the Large-scale Scientific Facilities of the Chinese Academy of Sciences, China (2017-LSF-GBOWS-02); and the technological leading talent project of Yunnan, China (2017HA014).

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.pld.2020.06.009.

References
Bowe L.M., dePamphilis C.W., 1996. Effects of RNA editing and gene processing on phylogenetic reconstruction. Mol. Biol. Evol, 13: 1159-1166. DOI:10.1093/oxfordjournals.molbev.a025680
Chateigner-Boutin A.L., Small I., 2011. Organellar RNA editing. Wiley Interdisciplinary Reviews-RNA, 2: 493-506. DOI:10.1002/wrna.72
Felsenstein J., 1985. Phylogenies and the comparative method. Am. Nat, 125: 1-15. DOI:10.1086/284325
Gerke P., Szovenyi P., Neubauer A., et al, 2020. Towards a plant model for enigmatic U-to-C RNA editing: the organelle genomes, transcriptomes, editomes and candidate RNA editing factors in the hornwort Anthoceros agrestis. New Phytol, 225: 1974-1992. DOI:10.1111/nph.16297
Gitzendanner M.A., Soltis P.S., Yi T.S., et al, 2018. Plastome phylogenetics: 30 years of inferences into plant evolution. Adv. Bot. Res, 85: 293-313. DOI:10.1016/bs.abr.2017.11.016
Gott J.M., Emeson R.B., 2000. Functions and mechanisms of RNA editing. Annu. Rev. Genet, 34: 499-531. DOI:10.1146/annurev.genet.34.1.499
Guo W., Grewe F., Fan W., et al, 2016. Ginkgo and Welwitschia mitogenomes reveal extreme contrasts in gymnosperm mitochondrial evolution. Mol. Biol. Evol, 33: 1448-1460. DOI:10.1093/molbev/msw024
Guo W., Grewe F., Mower J.P., 2015. Variable frequency of plastid RNA editing among ferns and repeated loss of Uridine-to-Cytidine editing from vascular plants. PloS One, 10: e0117075. DOI:10.1371/journal.pone.0117075
Guo W., Zhu A., Fan W., et al, 2017. Complete mitochondrial genomes from the ferns Ophioglossum californicum and Psilotum nudum are highly repetitive with the largest organellar introns. New Phytol, 213: 391-403. DOI:10.1111/nph.14135
Hasebe M., Wolf P.G., Pryer K.M., et al, 1995. Fern phylogeny based on rbcL nucleotide sequences. Am. Fern J, 85: 134-181. DOI:10.2307/1547807
Hecht J., Grewe F., Knoop V., 2011. Extreme RNA editing in coding islands and abundant microsatellites in repeat sequences of Selaginella moellendorffii mitochondria: the root of frequent plant mtDNA recombination in early tracheophytes. Genome Biol. Evol, 3: 344-358. DOI:10.1093/gbe/evr027
Ichinose M., Sugita M., 2017. RNA editing and its molecular mechanism in plant organelles. Genes, 8: 5.
Jansen R.K., Cai Z., Raubeson L.A., et al, 2007. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. U.S.A, 104: 19369-19374. DOI:10.1073/pnas.0709121104
Kearse M., Moir R., Wilson A., et al, 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28: 1647-1649. DOI:10.1093/bioinformatics/bts199
Knie N., Grewe F., Fischer S., et al, 2016. Reverse U-to-C editing exceeds C-to-U RNA editing in some ferns – a monilophyte-wide comparison of chloroplast and mitochondrial RNA editing suggests independent evolution of the two processes in both organelles. BMC Evol. Biol, 16: 134. DOI:10.1186/s12862-016-0707-z
Knoop V., 2011. When you can't trust the DNA: RNA editing changes transcript sequences. Cell. Mol. Life Sci, 68: 567-586. DOI:10.1007/s00018-010-0538-9
Lenz H., Hein A., Knoop V., 2018. Plant organelle RNA editing and its specificity factors: enhancements of analyses and new database features in PREPACT 3.0. BMC Bioinf, 19: 255. DOI:10.1186/s12859-018-2244-9
Maier R.M., Zeltz P., Kössel H., et al, 1996. RNA editing in plant mitochondria and chloroplasts. Plant Mol. Biol, 32: 343-365. DOI:10.1007/BF00039390
Malek O., Lättig K., Hiesel R., et al, 1996. RNA editing in bryophytes and a molecular phylogeny of land plants. EMBO J, 15: 1403-1411. DOI:10.1002/j.1460-2075.1996.tb00482.x
Mower J.P., Ma P.F., Grewe F., et al, 2019. Lycophyte plastid genomics: extreme variation in GC, gene and intron content and multiple inversions between a direct and inverted orientation of the rRNA repeat. New Phytol, 222: 1061-1075. DOI:10.1111/nph.15650
Nie Y., Foster C.S.P., Zhu T., et al, 2020. Accounting for uncertainty in the evolutionary timescale of green plants through clock-partitioning and fossil calibration strategies. Syst. Biol, 69: 1-16.
Oldenkott B., Yamaguchi K., Tsuji-Tsukinoki S., et al, 2014. Chloroplast RNA editing going extreme: more than 3400 events of C-to-U editing in the chloroplast transcriptome of the lycophyte Selaginella uncinata. RNA, 20: 1499-1506. DOI:10.1261/rna.045575.114
Petersen G., Seberg O., Davis J.I., et al, 2006. RNA editing and phylogenetic reconstruction in two monocot mitochondrial genes. Taxon, 55: 871-886. DOI:10.2307/25065682
Petersen G., Seberg O., Davis J.I., 2013. Phylogeny of the Liliales (Monocotyledons) with special emphasis on data partition congruence and RNA editing. Cladistics, 29: 274-295. DOI:10.1111/j.1096-0031.2012.00427.x
Posada D., 2008. jModelTest: phylogenetic model averaging. Mol. Biol. Evol, 25: 1253-1256. DOI:10.1093/molbev/msn083
PPG, 2016. A community-derived classification for extant lycophytes and ferns. J. Systemat. Evol, 54: 563-603. DOI:10.1111/jse.12229
Ruhfel B.R., Gitzendanner M.A., Soltis P.S., et al, 2014. From algae to angiosperms-inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes. BMC Evol. Biol, 14: 23. DOI:10.1186/1471-2148-14-23
Small I.D., Schallenberg-Rudinger M., Takenaka M., et al, 2020. Plant organellar RNA editing: what 30 years of research has revealed. Plant J, 101: 1040-1056. DOI:10.1111/tpj.14578
Smith D.R., 2009. Unparalleled GC content in the plastid DNA of Selaginella. Plant Mol. Biol, 71: 627-639. DOI:10.1007/s11103-009-9545-3
Smith D.R., 2020. Unparalleled variation in RNA editing among Selaginella plastomes. Plant Physiol, 182: 12-14.
Stamatakis A., 2014. RAxML version 8: a tool for phylogenetic analysis and postanalysis of large phylogenies. Bioinformatics, 30: 1312-1313. DOI:10.1093/bioinformatics/btu033
Szmidt A.E., Lu M.Z., Wang X.R., 2001. Effects of RNA editing on the coxI evolution and phylogeny reconstruction. Euphytica, 118: 9-18. DOI:10.1023/A:1004046220115
Takenaka M., Zehrmann A., Verbitskiy D., et al, 2013. RNA editing in plants and its evolution. Annu. Rev. Genet, 47: 335-352. DOI:10.1146/annurev-genet-111212-133519
Tamura K., Stecher G., Peterson D., et al, 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol. Biol. Evol, 30: 2725-2729. DOI:10.1093/molbev/mst197
Tsuji S., Ueda K., Nishiyama T., et al, 2007. The chloroplast genome from a lycophyte (microphyllophyte), Selaginella uncinata, has a unique inversion, transpositions and many gene losses. J. Plant Res, 120: 281-290. DOI:10.1007/s10265-006-0055-y
Varigerow S., Teerkorn T., Knoop V., 1999. Phylogenetic information in the mitochondrial nad5 gene of pteridophytes: RNA editing and intron sequences. Plant Biol, 1: 235-243. DOI:10.1111/j.1438-8677.1999.tb00249.x
Weststrand S., Korall P., 2016. A subgeneric classification of Selaginella (Selaginellaceae). Am. J. Bot, 103: 2160-2169. DOI:10.3732/ajb.1600288
Wolf P.G., Rowe C.A., Hasebe M., 2004. High levels of RNA editing in a vascular plant chloroplast genome: analysis of transcripts from the fern Adiantum capillus-veneris. Gene, 339: 89-97. DOI:10.1016/j.gene.2004.06.018
Zhang H.R., Wei R., Xiang Q.P., et al, 2020. Plastome-based phylogenomics resolves the placement of the sanguinolenta group in the spikemoss of lycophyte (Selaginellaceae). Mol. Phylogenet. Evol. DOI:10.1016/j.ympev.2020.106788
Zhang H.R., Xiang Q.P., Zhang X.C., 2019. The unique evolutionary trajectory and dynamic conformations of DR and IR/DR-coexisting plastomes of the early vascular plant Selaginellaceae (Lycophyte). Genome Biol. Evol, 11: 1258-1274. DOI:10.1093/gbe/evz073
Zhou X.M., Gao X.F., Zhang L.B., 2016. A large-scale phylogeny of the lycophyte genus Selaginella based on plastid and nuclear loci. Cladistics, 32: 360-389. DOI:10.1111/cla.12136