b. Guangdong Laboratory for Lingnan Modern Agriculture, College of Life Sciences, South China Agricultural University, Guangzhou 510642, China;
c. School of Life Sciences, Southwest University, Chongqing 400715, China;
d. Department of Biology, Colorado State University, Fort Collins, CO 80523, USA;
e. School of Medical, Molecular and Forensic Sciences, Murdoch University, Murdoch, WA 6149, Australia;
f. State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China;
g. Shenzhen Research Institute of Henan University, Shenzhen 518000, China
Mitochondria (mt) and plastids (pt) are semi-autonomous organelles, each possessing an independent genome. Many nuclear-originated genes interact with these organelles to maintain normal cellular metabolism (Danecek and McCarthy, 2017). In particular, organellar genome replication, recombination and repair (RRR) are mediated by nuclear-encoded proteins, such as DNA helicase (RECG), MutS Homolog 1 (MSH1) and monokaryotic chloroplast 1 (MOC1) (Gualberto and Newton, 2017; Chevigny et al., 2020). SSB1/SSB2, the organellar DNA polymerase (DNAP) paralogs (POLIA/POLIB), RecA-like recombinases (RECA) and organellar single-stranded DNA binding proteins (OSB) all belong to multi-copy gene families, indicating the possibility of functional redundancy and/or functional partitioning (Gualberto and Newton, 2017; Fuchs et al., 2020). In plants, the models of pt genome replication include double D-loop replication, rolling-circle replication and recombination-dependent replication (Morley et al., 2019a, b). However, the replication mechanism of plant mtDNA remains largely unexplored due to the complex structure of mt genomes (Wu et al., 2020a, b). Several proteins are involved in plant mtDNA replication, including SSBs of bacterial origin, a bacteriophage-related primase-helicase called TWINKLE, and Plant Organellar DNA polymerases (POPs), which are more similar to eukaryotic polymerases. POPs participate in replication and can bypass DNA lesions to allow replication to continue, maintaining the stability and integrity of organellar genomes (Baruch-Torres and Brieba, 2017). There may be broader evolutionary relationships between components of the plant organellar DNA replication machinery (Morley et al., 2019a, b; Czernecki et al., 2023). The copy number of mtDNA can be dynamic during its replication (Ayabe et al., 2023).
For example, in Arabidopsis thaliana, the copy number of mtDNA was increased in the respiratory chain complex I NADH dehydrogenase subunit7 (nad7) mutant and msh1 and recA3 mutants, while both mtDNA and ptDNA copy numbers were decreased in polIa and polIb mutants (Shedge et al., 2010; Cupp and Nielsen, 2013; Morley and Nielsen, 2016; Ayabe et al., 2023). To comprehensively understand and quantify the mechanisms of mtDNA variation controlled by nuclear coding genes, further exploring them in combination with high-fidelity sequencing technology is necessary.
Plant mt genomes undergo extensive structural rearrangements, while maintaining low nucleotide substitution rates of coding sequences in angiosperms (Wolfe et al., 1987; Sloan et al., 2012). The main reason for this apparent paradox is the existence of numerous repeats in plant mt genomes, including large repeat sequences (≥ 1000 bp), intermediate-sized repeats (50–1000 bp) and small repeats (< 50 bp) (Gualberto and Newton, 2017; Wynn and Christensen, 2019). Repeat-mediated rearrangements in mt genomes are suppressed by nuclear genes and mutations in these genes can cause significant structural instability (Gualberto and Newton, 2017). High-fidelity (HiFi) reads allowed for global detection of mt genome recombination and quantification of recombination products in A. thaliana msh1 mutants (Zou et al., 2022). In the recG1 mutant of A. thaliana, crossover products of the repeats D, F and X were increased (Wallet et al., 2015). The A. thaliana polIb mutant accumulated more double-strand breaks (DSBs) under ciprofloxacin treatment, indicating its role in DSB repair (DSBR) (Cupp and Nielsen, 2013). Additionally, the MSH1 gene has been demonstrated to maintain a low mutation rate in A. thaliana organellar genomes (Wu et al., 2020a, b). Recent studies have shown that a decrease in mtDNA copy number accelerates mtDNA substitution rates, as low-copy mtDNA causes insufficient templates for homologous recombination repair (HR) (Zwonitzer et al., 2024). To more completely understand how organellar DNA is maintained by all components of the RRR mechanism, each molecular component needs to be thoroughly examined.
SSB genes are known to play important roles in mtDNA replication, recombination and repair (Shereda et al., 2008). SSB proteins bind to single-stranded DNA exposed at the replication fork, preventing replication fork collapse and recruiting replication-related enzymes to the fork (Garcia-Medel et al., 2019). The role of SSB in DNA replication has been extensively studied in Escherichia coli and bacteriophage T7. The E. coli SSB protein consists of an N-terminal oligonucleotide/oligosaccharide-binding (OB)-fold domain, which mediates tetramerization of the SSB protein and serves as the primary binding site for single-stranded DNA (ssDNA), and a disordered C-terminal tail, which mediates protein–protein interactions and is involved in DNA replication, recombination and repair (Antony and Lohman, 2019). Additionally, a non-conserved, intrinsically disordered linker (IDL) exists between the N- and C-termini, and complete deletion of the IDL eliminates the highly cooperative binding of SSB to ssDNA (Antony and Lohman, 2019). The SSB homolog of bacteriophage T7 (gp2.5) contains an OB-fold domain, which binds and stabilizes ssDNA and promotes DNA replication. In gp2.5, the acidic C-terminal tail mediates interactions with T7 DNA polymerase (gp5) and helicase-primase (gp4), coordinating the initiation and progression of DNA replication. gp2.5 is also involved in the recombination, repair and packaging of T7 DNA, and is essential for the growth of the bacteriophage (Hernandez and Richardson, 2019). Knowledge about plant SSBs is limited. In Arabidopsis thaliana, both SSB1 and SSB2 contain OB-fold domains. The C-terminal tail of SSB1 is acidic, as in bacteria, whereas that of SSB2 is aromatic, suggesting potential functional divergence (Brieba et al., 2019). The A. thaliana SSB1 protein interacts with TWINKLE and with the functional domains of POLIA/B, which together constitute the minimal machinery for DNA replication (Morley et al., 2019a, b).
Overexpression of A. thaliana SSB1 resulted in increased mtDNA copy numbers and a lower respiratory oxygen consumption rate (OCR), indicating mitochondrial dysfunction (Zhang et al., 2022). Knocking out A. thaliana SSB1 and SSB2 resulted in increased asymmetric recombination of intermediate-sized repeats (such as repeats F, L and X) (Qian et al., 2022). The frequency of illegitimate recombination among three 49 bp dispersed repeats increased in rice ssb1 mutants (Li et al., 2021). The A. thaliana SSB1 protein was also found to be involved in DSBR, interacting with POLIB to initiate microhomology-mediated end joining (MMEJ) (Garcia-Medel et al., 2019). The evolution and functional divergence of the SSB genes in plants have not been systematically analyzed. Whether SSB regulates mtDNA replication directly by affecting mtDNA copy number, indirectly by interacting with other RRR proteins, or both is not yet comprehensively understood.
In the current study, we aimed to investigate the roles of SSB1 and SSB2 in the replication and mutation of organellar genomes quantitatively, taking advantage of the HiFi sequencing technology. We analyzed the evolution of SSB genes in 37 representative species and found that the divergence of SSB genes occurred in the ancestors of modern gymnosperms and angiosperms. The ratios of mt/pt genome-mapped reads versus nuclear genome-mapped reads were decreased in ssb1 mutants. However, the decrease in copy number was observed only for the pt genome but not for the mt genome in ssb2 mutants. Structural variation analysis exhibited a slight increase in the recombination frequency within the mt genome of ssb1 and ssb2 mutants. Additionally, the frequency of base-level mutations in both mt and pt genomes was increased in ssb1 and ssb2 mutants. Our results suggest that SSB1 mainly plays a role in replication and indirectly affects the frequencies of recombination and mutation.
2. Materials and methods
2.1. Data sources and molecular analyses
To understand the evolution of SSB genes in plants, we first selected 37 plant species, spanning algae (6), bryophytes (3), lycophytes (3), ferns (3), gymnosperms (2) and angiosperms (20), for comparison. In addition, SSB protein sequences from human, mouse, fruit fly, zebrafish, yeast and four bacteria were downloaded from NCBI. The genomic information and gene annotation files for these 37 species were downloaded from the databases indicated in Table S1–1. Using Arabidopsis thaliana SSB1 and SSB2 proteins as queries, we searched the protein sequences of each species using BLASTP (e-value cutoff of 0.001) and searched the coding sequences (CDSs) and whole-genome sequences of each species using TBLASTN (e-value cutoff of 0.001) to obtain the CDSs and protein sequences of candidate SSB genes. The presence of the oligonucleotide/oligosaccharide-binding (OB)-fold domain (PF00436) in candidate proteins was analyzed using the Conserved Domain Database (Marchler-Bauer and Bryant, 2004; Lu et al., 2020). All full-length protein sequences obtained were aligned using MAFFT (Normal alignment mode, auto strategy) and trimmed manually, and a maximum-likelihood phylogenetic tree was then constructed with IQ-TREE in PhyloSuite v.1.2.3 (Figs. 1A and S2) (Zhang et al., 2020). Logos of conserved amino acid sequences were created using TBtools-II (Chen et al., 2023). The three-dimensional structures of SSB proteins were modeled with AlphaFold 3 (Abramson et al., 2024) and displayed with PyMOL (http://www.pymol.org).
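The e-value filtering step can be illustrated with a minimal sketch. It assumes the BLASTP/TBLASTN hits were exported in the default BLAST+ tabular layout (`-outfmt 6`); the function name and the sample identifiers are hypothetical and are not taken from the study.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def candidate_hits(tabular_text, max_evalue=1e-3):
    """Group subject IDs by query, keeping hits with e-value <= max_evalue.

    The 0.001 default mirrors the cutoff stated in the text; everything
    else here is an illustrative assumption.
    """
    hits = {}
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=OUTFMT6_FIELDS, delimiter="\t"
    )
    for row in reader:
        if float(row["evalue"]) <= max_evalue:
            hits.setdefault(row["qseqid"], set()).add(row["sseqid"])
    # Sort subject IDs so the result is deterministic.
    return {query: sorted(subjects) for query, subjects in hits.items()}


if __name__ == "__main__":
    # Made-up rows: two pass the cutoff, one (e-value 0.5) does not.
    sample = "\n".join([
        "AtSSB1\tsp1_g1\t62.5\t200\t75\t0\t1\t200\t1\t200\t1e-50\t250",
        "AtSSB1\tsp1_g9\t25.0\t40\t30\t0\t1\t40\t1\t40\t0.5\t20",
        "AtSSB2\tsp1_g2\t58.0\t210\t88\t0\t1\t210\t1\t210\t2e-40\t200",
    ])
    print(candidate_hits(sample))
```

Candidates retained this way would still need the OB-fold domain check described above before being treated as SSB homologs.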
Fig. 1 Phylogeny of SSB proteins, protein structure, and distribution of exon numbers in SSB genes in plants. A. Phylogeny of SSB proteins reconstructed from their full-length protein sequences. Multiple sequence alignment was performed in PhyloSuite using MAFFT and trimmed manually, and the tree was built by IQ-TREE2 using the maximum-likelihood method with LG + I + G4 as the best-fitting amino acid replacement model; 1000 bootstrap replicates were used to assess clade support. Bootstrap support values > 90% are marked with black circles, 70%–90% with dark purple circles, and < 70% with gray circles. B. The number and proportion of SSB genes with different exon numbers among the species sampled in this study. The x-axis refers to different exon numbers, and the y-axis represents the number of SSB genes and their proportion in our sample set. C. The three-dimensional structures of Arabidopsis thaliana SSB1 and SSB2 proteins, displayed with PyMOL (http://www.pymol.org). The amino acids 1–28 in the AtSSB1 protein sequence and 1–23 in the AtSSB2 protein sequence. The cyan area indicates the DNA binding sites homologous to those of Escherichia coli SSB (W54, F60), namely AtSSB1 (W122, H128) and AtSSB2 (W137, Y143). The blue area represents the C-terminus, with AtSSB1 having an acidic terminus and AtSSB2 having an aromatic terminus. D. SSB gene structures from different lineages. SSB1 contains three arrangements with 5, 6, and 7 exons; SSB2 contains four arrangements with 2, 4, 5, and 7 exons; the early SSB clade contains three arrangements with 1, 4, and 5 exons. The blue boxes represent exons, the black solid lines represent introns, and the black dotted lines between exons link those with similar sequences.
The Arabidopsis thaliana T-DNA insertion mutant line ssb1-1 (SALK_203889) and the CRISPR-Cas9-generated knockout lines ssb1-2 and ssb2-1 were obtained from Southwest University in Chongqing, China (Qian et al., 2021, 2022). Rosette leaves from one or two 6-week-old individuals were collected for DNA extraction and PacBio HiFi sequencing, which generated a total of 26.25 Gb of data (Table S2–1). For the analysis of structural variations, we merged the data from different samples of the same mutant because of the limited amount of data per individual. For the analysis of mutation frequencies, however, we did not merge samples, in order to retain sample-specific effects.
2.3. Read coverage calculations
All reads were mapped to mt and pt reference genomes using PacBio pbmm2 (v.1.3.0; parameters: --sort --preset CCS), and per-base coverage was calculated by the depth subcommand of samtools (v.1.7) (Danecek and McCarthy, 2017), normalized by the nuclear coverage and average coverage (Zou et al., 2022).
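The nuclear normalization can be sketched as follows. This is only an illustration, assuming the three-column text output of `samtools depth` (contig, position, depth) and dividing each organellar depth by the mean nuclear depth; the contig names and function names are hypothetical, and the additional normalization by average coverage mentioned above is not reproduced here.

```python
from statistics import fmean


def parse_depth(text):
    """Parse `samtools depth` text (contig, 1-based position, depth)."""
    depths = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        contig, pos, depth = line.split("\t")
        depths.setdefault(contig, []).append((int(pos), int(depth)))
    return depths


def mean_depth(records):
    """Average depth over the positions reported for one contig."""
    return fmean(depth for _, depth in records)


def normalize_to_nuclear(records, nuclear_mean):
    """Express per-base organellar depth as a multiple of nuclear depth."""
    return [(pos, depth / nuclear_mean) for pos, depth in records]


if __name__ == "__main__":
    # Made-up depths: a nuclear contig at 20x and an mt contig at 100-200x.
    sample = "Chr1\t1\t20\nChr1\t2\t20\nmt\t1\t100\nmt\t2\t200\n"
    by_contig = parse_depth(sample)
    nuclear = mean_depth(by_contig["Chr1"])
    print(normalize_to_nuclear(by_contig["mt"], nuclear))
```

Dividing by the nuclear mean makes samples sequenced to different depths comparable, which is the point of the normalization described above.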
2.4. Identification of rearrangements in mt and pt genomes
HiFi reads were aligned to the Arabidopsis thaliana Columbia (Col-0) reference genome (NC_037304.1, used for mt genome; NC_000932.1, for pt genome; and Col-PEK (GCA_020911765.2) for nuclear genome) using winnowmap (v.2.03) (Jain et al., 2020; Hou et al., 2022). Organellar reads were used to identify recombination in the mt and pt genomes, and recombination frequencies were quantified by HiFiSr (Zou et al., 2022).
2.5. Identification of SNVs and InDels in the mt and pt genomes
To identify SNVs and InDels in the mt and pt genomes, zero-rearrangement reads were realigned to the mt and pt reference genomes using winnowmap (v.2.03) (Jain et al., 2020). Variants were called using bcftools (v.1.18) (Danecek and McCarthy, 2017). Variant frequencies and sequencing depths for each SNV or InDel called by bcftools were calculated from the raw BAM files using samtools (v.1.7) (Danecek and McCarthy, 2017).
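As a rough illustration of how a per-site variant frequency follows from read counts, the sketch below divides the number of variant-supporting reads by the sequencing depth and applies two filters. The thresholds follow the Fig. 5 legend (frequency ceiling of 0.1 for the mt genome, at least 3 reads); reading "at least 3 reads" as variant-supporting reads is an assumption, and the class and function names are hypothetical rather than part of the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Site:
    pos: int        # 1-based reference position
    alt_reads: int  # reads supporting the SNV or InDel
    depth: int      # total reads covering the position


def variant_frequency(site):
    """Fraction of covering reads that carry the variant."""
    return site.alt_reads / site.depth if site.depth else 0.0


def keep_low_frequency_sites(sites, max_freq, min_alt_reads=3):
    """Drop sites above the frequency ceiling or with too few supporting reads."""
    return [
        site for site in sites
        if site.alt_reads >= min_alt_reads and variant_frequency(site) <= max_freq
    ]


if __name__ == "__main__":
    # Made-up sites: only the first passes both filters at max_freq=0.1.
    sites = [Site(10, 5, 100), Site(20, 2, 100), Site(30, 50, 100)]
    for site in keep_low_frequency_sites(sites, max_freq=0.1):
        print(site.pos, variant_frequency(site))
```

For the pt genome the same function would be called with `max_freq=0.01`, matching the ceiling given in the figure legend.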
3. Results
3.1. The evolution of SSB genes in plants
The SSB genes in Arabidopsis thaliana are similar to the bacterial homolog (Edmondson et al., 2005). However, their evolution in plants remains less clear. To investigate the evolutionary divergence of SSB proteins, we first collected all the candidate sequences using BLASTP in 37 plant species from algae (6 species), bryophytes (3 species), lycophytes (3 species), ferns (3 species), gymnosperms (2 species) and angiosperms (20 species), and downloaded SSB proteins from NCBI for human, mouse, fruit fly, zebrafish, yeast and four bacteria for phylogenetic analysis (Table S1–1). Conserved domain analysis showed that all candidate proteins contain the OB-fold domain (Fig. S1A; Table S1–2). A phylogeny spanning bacteria, animals, yeast and plants showed that plant SSB genes are homologous to those in bacteria and animals (Fig. S2). There was only one copy of SSB in bacteria and animals, but one to four copies of SSB in plants (Fig. S2). The evolutionary tree was divided into three main clades, and the clades SSB1 and SSB2 were found in both gymnosperms and angiosperms (Fig. S2). We then focused on the evolution of SSB genes within plants; the evolutionary relationships of SSB proteins in 20 plant species are shown in Fig. 1A. The topology of the tree suggested three distinguishable groups: the SSB1 clade, the SSB2 clade, and an early diverging clade (early SSB) containing algae and bryophytes. In gymnosperms and angiosperms, both SSB1 and SSB2 were present. In the lycophyte Selaginella moellendorffii, only SSB2 was present (although in 4 copies), suggesting that the divergence of SSB1 and SSB2 occurred after the divergence of seed plants from early vascular plants (Fig. 1A).
Multiple sequence alignment showed that most of the SSB1 (11/91.7%) and SSB2 (17/85%) proteins contained GTGG domains (Fig. S1B). Both SSB1 (12/100%) and SSB2 (18/90%) proteins contain QWHR domains, where W is the DNA binding site homologous to bacterial W54 (Fig. 1C; Fig. S1B and Fig. S3) (Brieba, 2019). These two domains were also found in Marchantia polymorpha and Physcomitrium patens, but not in algae. A 6-amino-acid (aa) deletion in all SSB1 proteins distinguished them from SSB2 proteins (Fig. S1B and Fig. S3). Another DNA binding site, F60 in bacteria, corresponds to H in SSB1 and Y in SSB2 (Fig. S1B and Fig. S3). A structural comparison between E. coli SSB and Arabidopsis thaliana SSB1 and SSB2 revealed high similarity in the OB-fold domain. However, the N-terminal and C-terminal regions are not similar (Fig. S1C, gray and red). The AtSSB1 protein has an acidic C-terminal tail, whereas AtSSB2 has an aromatic C-terminal tail (Fig. 1C). The similarities and differences in the protein sequences suggest that SSB1 and SSB2 may share some redundant functions while each also has specific functions of its own.
We further compared the gene structures of SSB1 and SSB2 in our sampled lineages. Exon numbers from 1 to 7 were found among the SSB genes in the selected 20 plant species. Among them, the proportions were as follows: nine of the SSB1 genes contained six exons (75%), eight genes contained four exons (50%) and four genes contained one exon in early SSB (40%) (Fig. 1B and Table S1–3). We selected several examples from SSB1 and SSB2 with different numbers of exons to show how related sequences were positioned around introns (Fig. 1D). In SSB1 genes, there was high similarity among flowering plants, such as A. thaliana, Oryza sativa and Brassica rapa: OsSSB1 contains 7 exons, with exons 1, 2, 3 and 4 corresponding to AtSSB1 exons 6, 5, 4 and 3, which in turn correspond to BrSSB1 exons 5, 4, 3 and 2 (Fig. 1D). In SSB2 genes, the similarity between species in the same phylum with different numbers of exons was high; for example, exons 1 and 3 of the gymnosperm GbSSB2 correspond to exons 3 and 4 of CpSSB2 (Fig. 1D). In the early SSB clade, different copies within the same species were highly similar; for example, exons 1–4 of PpSSB-1 in P. patens (excluding exon 5) can be matched with the four exons of PpSSB-2 (Fig. 1D). The single exon of CcSSB-1 from the alga Chondrus crispus corresponds to exon 3 of MpSSB (Fig. 1D). Our data indicate that the exon sequences of SSB1, SSB2, and early SSB were similar, whereas introns appear to be highly variable in presence, length and nucleotide content.
We compared the expression profiles of SSB1 and SSB2 obtained from the Arabidopsis eFP Browser and found that during early seed germination, SSB2 exhibits higher expression levels than SSB1 (Figs. S4A and S4B). Published RNA-seq data also indicated that both SSB1 and SSB2 of Arabidopsis thaliana were highly expressed at the unicellular microspore (UNM) and bicellular pollen (BCP) stages, while their expression decreased at subsequent stages such as the tricellular pollen (TCP) and mature pollen grain (MPG) stages (Table S1–4) (Klodova et al., 2023). Within the mature pollen, the expression of SSB2 in sperm cells was higher than that of SSB1 (Table S1–4) (Klodova et al., 2023). Additionally, meiotic cells showed a 10-fold higher expression of SSB1 compared to SSB2, suggesting a more important role of the SSB1 gene in meiotic division (Table S1–4) (Misra et al., 2023). Together, the distinct evolutionary trajectories and differences in expression of SSB1 and SSB2 suggest that these proteins may have divergent functions during plant growth and development.
3.2. Reduced organellar DNA content in ssb1 and ssb2 mutants
To investigate the function of SSB genes in organellar genome replication, all the reads were aligned to the reference genome of Col-0. The average HiFi read depth ranged from 13.21 × to 32.43 × for the nuclear genome, 523.13 × to 1340.67 × for the pt genome, and 45.77 × to 154.52 × for the mt genome (Table S2–1). We found that the proportions of mt (0.80–2.02%) and pt (3.32–8.76%) genome-mapped reads in ssb1-1, ssb1-2 and ssb2-1 mutants were lower than those of the wild-type (WT) plants, indicating that SSB1 and SSB2 were involved in maintaining organellar DNA abundance (Fig. 2 and Fig. S5A; Table S2–2). In contrast to the ssb1-1 and ssb1-2 mutants, the mtDNA copy number in the ssb2-1 mutant was only slightly decreased, suggesting that SSB2 may have a limited impact on mtDNA replication (Fig. 2). Organellar coverage normalized to nuclear coverage was likewise decreased in ssb mutants (Fig. 2C and D). Despite this, the overall patterns of HiFi read coverage of mt and pt genomes in the mutants were similar to WT, including the elevation at large repeats (Figs. S5B and S5C).
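The two quantities reported here, the percentage of reads mapped to an organellar genome and its multiple relative to wild type, can be sketched in a few lines. The read counts in the example are invented for illustration, and the function names are hypothetical.

```python
def mapped_percent(organelle_reads, total_reads):
    """Percentage of all reads in a sample that map to one organellar genome."""
    return 100.0 * organelle_reads / total_reads


def fold_vs_wild_type(mutant_percents, wt_percents):
    """Ratio of the replicate means (mutant over wild type)."""
    mutant_mean = sum(mutant_percents) / len(mutant_percents)
    wt_mean = sum(wt_percents) / len(wt_percents)
    return mutant_mean / wt_mean


if __name__ == "__main__":
    # Invented counts for three replicates each: (mt-mapped reads, total reads).
    wt = [mapped_percent(m, t) for m, t in [(40, 1000), (40, 1000), (40, 1000)]]
    mutant = [mapped_percent(m, t) for m, t in [(20, 1000), (20, 1000), (20, 1000)]]
    print(wt, mutant, fold_vs_wild_type(mutant, wt))
```

A fold value below 1 corresponds to the reduced organellar DNA content described for the mutants; assessing significance would additionally require a test on the replicate values, such as the Student's t-test named in the Fig. 2 legend.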
Fig. 2 Changes in organellar DNA content in ssb1 and ssb2 mutants and coverage across the mt and pt genomes after nuclear normalization. A. The mtDNA was decreased in ssb1-1, ssb1-2 and ssb2-1 compared to WT. B. The ptDNA was decreased in ssb1-1, ssb1-2 and ssb2-1 compared to WT. The bars represent the average, and the three black circles represent the three replicates; the number on the right side of each bar represents the average of the three replicates; the numbers in parentheses represent multiples relative to wild-type; mt, mitochondrial; pt, plastid. Asterisks indicate a significant difference using Student's t-test (*P < 0.05, ns: not significant). C and D. The organellar coverage normalized to the nucleus. Plots show the nucleus-normalized per-base average read coverage of three samples each of WT, ssb1-1, ssb1-2 and ssb2-1. Light red, light purple and dark blue lines denote the normalized per-base average read coverage of ssb1-1, ssb1-2 and ssb2-1 mutants, respectively. Light green lines denote the normalized per-base average read coverage of WT (W3-5-2, Col-XJTU and Col-0).
We then analyzed the structural rearrangements in the mt genomes of ssb1-1, ssb1-2 and ssb2-1 mutants, which were mostly one-rearrangement events (Fig. S6; Table S3–1). In ssb1-1, ssb1-2 and ssb2-1 mutants, the global patterns of one-rearrangement events were similar to wild-type, but the recombination frequency of some intermediate-sized repeats was slightly increased (Fig. 3; Table S4). For example, repeats MMJS, A, B, F and L in ssb1-1, ssb1-2 and ssb2-1 were slightly increased in frequency (Fig. 3, Fig. 4A; Table S5). Except for the large repeats, the proportion of the sum of the frequencies of MMJS, A, B, F and L in one rearrangement was higher in ssb1-1 (53%), ssb1-2 (51%) and ssb2-1 (53%) mutants than in the wild-type (40%) (Fig. 4A; Table S5). Compared with msh1 mutants, the lower frequency of rearrangements in ssb1-1, ssb1-2 and ssb2-1 mutants (Fig. S6 and Fig. S7) indicated that the effect of SSB1 and SSB2 deficiency on mt genome rearrangement frequencies might be indirect. In contrast to mt genomes, we did not observe a significant difference in the structural variation in the pt genome between ssb mutants and wild-type (Fig. S8; Table S6).
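The percentages quoted above are shares of summed recombination frequencies after the large repeats are set aside. A minimal sketch of that arithmetic, together with the per-10,000-read normalization used for the dot sizes in Fig. 3, is given below; the repeat frequencies in the example are invented, and the function names are hypothetical rather than part of HiFiSr.

```python
def per_10k(read_count, total_mt_reads):
    """Scale a junction read count to 10,000 mt genome-mapped reads."""
    return 10000 * read_count / total_mt_reads


def repeat_share(freqs, selected, excluded=("Large1", "Large2")):
    """Share of summed frequency contributed by `selected` repeats.

    Repeats named in `excluded` are removed from both numerator and
    denominator before the share is computed.
    """
    considered = {name: f for name, f in freqs.items() if name not in excluded}
    total = sum(considered.values())
    if total == 0:
        return 0.0
    return sum(considered.get(name, 0) for name in selected) / total


if __name__ == "__main__":
    # Invented one-rearrangement frequencies per repeat.
    freqs = {"Large1": 50, "Large2": 30, "MMJS": 4, "A": 2,
             "B": 2, "F": 1, "L": 1, "other": 10}
    print(repeat_share(freqs, ["MMJS", "A", "B", "F", "L"]))
    print(per_10k(5, 20000))
```

Excluding the large repeats first keeps their high baseline recombination from masking changes at the intermediate-sized repeats.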
Fig. 3 Patterns of one-rearrangement mt reads detected in mt genomes of WT, ssb1-1, ssb1-2 and ssb2-1. Dot plots show global patterns of the junctions and relative read counts of mitochondrial one-rearrangement events in different samples. Colors represent categories of alignment overlap length (AO length). The sizes of dots denote read counts normalized to 10,000 total mt genome-mapped reads per sample.
Fig. 4 Structural variants in the mt genome detected by HiFi reads. A. Excluding recombination mediated by the repeats Large1 and Large2, the slight increase in recombination frequency among mutants was primarily attributed to the top five repeats (MMJS, A, B, F and L). B to F. The recombination products of repeats MMJS, A, B, F, and L in ssb1-1, ssb1-2 and ssb2-1 mutants. G and H. Recombination products mediated by repeats F and L, and the genes located at the repeat boundaries.
Using long reads spanning the full length of repeat sequences, we analyzed the proportions of recombination products of the repeats with higher recombination activity (MMJS, A, B, F and L) in ssb1-1, ssb1-2 and ssb2-1 (Fig. 4 and Fig. S9; Table S7). Different repeats showed varying proportions of recombination products: the frequency of type 2-1 was higher than that of type 1-2 in MMJS and L, whereas the opposite was true for repeat F (Fig. 4B, E, and F). However, repeats A and B exhibited no significant difference in frequency between wild-type and mutants (Fig. 4C and D). The small circle (40,671 bp) formed by recombination at repeat F contained three genes, with atp8 located at the boundary of repeat F-2, and the small circle (61,102 bp) formed by recombination at repeat L contained seven genes, with atp9 located at the boundary of repeat L-1 (Fig. 4G and H). We also analyzed the frequencies of SNVs and small InDels and found that the average frequencies of InDels in both mt and pt genomes were significantly higher in ssb1 and ssb2 mutants than in WT samples, as were the frequencies of SNVs in the pt genome, which suggested that loss of SSB1 or SSB2 function could also affect the accumulation of small-scale variants (Fig. 5A, B and C, Tables S8–S11).
Fig. 5 The nucleotide variants in ssb1-1, ssb1-2 and ssb2-1 mutants compared to WT. A. The frequency of InDels in the mt genome. Sites with frequencies higher than 0.1 and sites supported by fewer than 3 reads were excluded from the figure; mutation frequencies for all sites are detailed in Tables S8 and S9. B. The frequency of InDels in the pt genome. Sites with frequencies higher than 0.01 and sites supported by fewer than 3 reads were excluded from the figure; mutation frequencies for all sites are detailed in Tables S10 and S11. C. The frequency of SNVs in the pt genome. Sites with frequencies higher than 0.01 and sites supported by fewer than 3 reads were excluded from the figure; mutation frequencies for all sites are detailed in Tables S10 and S11. Asterisks indicate a significant difference using Student's t-test (*P < 0.05, **P < 0.01, and ***P < 0.001, ns: not significant).
Many genes involved in organellar genome replication, recombination and repair (RRR) belong to multi-copy gene families with redundant functions (Cupp and Nielsen, 2014; Fuchs et al., 2020). However, the functional divergence of different members was less explored. In this study, phylogenetic analysis revealed that SSB proteins diverged into two major groups in seed plants. In all gymnosperms and angiosperms, both SSB1 and SSB2 were present (Fig. 1A; Fig. S2). The early land plants have only a single (Marchantia polymorpha) or multiple (Physcomitrium patens) very similar copies of the SSB gene (Fig. 1A; Fig. S2). To fully understand the evolution of SSB genes in early land plants, more species need to be sequenced such that the copy number and type of the SSB genes can be more clearly mapped onto the phylogenetic tree.
Based on the differences in gene expression levels and protein structures, SSB1 and SSB2 have likely undergone functional divergence (neofunctionalization). Previous studies have indicated that the moss Physcomitrium patens underwent two rounds of whole-genome duplication (WGD), which probably explains the presence of multiple closely related SSB copies in this species (Lang et al., 2018; Bi et al., 2024). During DNA replication in E. coli, the SSB protein recruits and interacts with at least 17 other proteins through the 9 acidic amino acids at its C-terminus (Shinn et al., 2019). In A. thaliana, the C-terminal sequences of SSB1 (acidic) and SSB2 (aromatic) proteins differ (Fig. 1C), suggesting that they may interact with different proteins (Brieba, 2019). In our results, genes in different groups contained different exon numbers. An earlier study suggested that during the evolutionary divergence of AtSSB1 and AtSSB2, AtSSB1 acquired two additional introns in a stepwise manner (Edmondson et al., 2005). The A. thaliana SSB1 and SSB2 genes were involved in seed germination (Qian et al., 2022), with SSB1 showing lower expression levels than SSB2 (Figs. S4B and S4C). In ssb1 and ssb2 mutants, about 20% of seeds were shrunken and showed germination defects (Qian et al., 2022). The expression level of SSB2 in sperm cells was higher than that of SSB1 in mature pollen (Klodova et al., 2023; Misra et al., 2023). Additionally, the expression level of SSB1 was ten times higher than that of SSB2 in post-meiotic cells, which might imply a more important role of SSB1 during meiosis. Whether there is a functional difference between SSB1 and SSB2 in maintaining organelle genome stability still needs to be proven experimentally.
4.2. SSB1 and SSB2 regulate the replication of organellar genomes and indirectly affect DNA recombination and repair
Plant organellar DNA replication machinery is composed of TWINKLE, POLI, SSB and other proteins (Morley et al., 2019a, b). The mtDNA copy number in plants (except mosses) is very low compared with that in animals; for example, in nondividing tissues of Arabidopsis thaliana, mtDNA is often found at less than one copy per mitochondrion (Zhang et al., 2022). Changes in the copy number of organellar DNA correlate with different stages of cell development in different tissues (Zhang et al., 2022). DNA copy numbers of mt and pt genomes were reduced in polIa and polIb mutants with slight growth delay (Cupp and Nielsen, 2013; Morley and Nielsen, 2016). When SSB is overexpressed, mtDNA copy number increases, which leads to mitochondrial dysfunction and abnormal plant respiration, indicating that an appropriate copy number is necessary for proper cellular function in plants (Zhang et al., 2022). In a previous study, knocking out SSB1 and SSB2 resulted in an increased expression level of mitochondrial genes (Qian et al., 2022). However, according to the ratios of HiFi reads mapped to reference genomes, the DNA copy numbers of the mt and pt genomes were reduced in both ssb1-1 and ssb1-2 mutants (Fig. 2A and B; Table S2–2). When the three mutants were compared, the mtDNA content of ssb2-1 was only slightly decreased, in contrast to ssb1-1 and ssb1-2 (Fig. 2A); however, there were no significant differences in germination and growth retardation (Qian et al., 2022). Previous studies reported that SSB1 and SSB2 were localized to mitochondria (Edmondson et al., 2005; Qian et al., 2022; Zhang et al., 2022). However, no studies have explicitly reported their localization in chloroplasts. Although the available evidence mainly focuses on mitochondria, their role in chloroplasts should not be completely ruled out.
We found that mt genome rearrangements were slightly elevated in ssb1 and ssb2 mutants, as evidenced by an increase in the frequency of recombination mediated by several repeats (MMJS, A, B, F and L). These recombination signals might be indirectly affected by copy number reduction. A recent study on 60 species showed that a lower copy number of organellar genomes could impair the efficiency of homologous recombination, possibly because reduced mtDNA copy number led to fewer templates for DNA damage repair (Zwonitzer et al., 2024). The correlation between mtDNA changes and variations is likely caused by the interaction of multiple factors, rather than being dominated by any single factor (Wang et al., 2024). Compared to the mt genome, the pt genome contains very few intermediate repeats and its structure is relatively stable (Zou et al., 2022). Thus, in ssb1 and ssb2 mutants, the pt DNA copy numbers were decreased, but no effect on pt genome rearrangements was observed.
An in vitro complementation fidelity assay showed that Arabidopsis thaliana SSB1 and SSB2 did not change the fidelity of POLIA and POLIB during organellar DNA replication (Ayala-Garcia et al., 2018). Furthermore, Sanger sequencing of PCR products failed to show changes in mutation frequencies in ssb1 and ssb2 mutants (Qian et al., 2022). By contrast, our data showed that the frequencies of SNV and InDel variants in organellar genomes were slightly increased in ssb1-1 and ssb1-2 mutants (Fig. 5), indicating that SSB1 was involved in regulating mutation frequencies of organellar genomes, perhaps indirectly. The recombination frequency and InDel frequency were slightly increased in the ssb2-1 mutant, although there was no significant change in mtDNA copy number. The observed increase in the frequency of small-scale mutations may be an indirect result of the reduced DNA copy number.
5. Conclusion
Using HiFi sequencing, we demonstrate that SSB1 plays an important role in organellar DNA replication. Loss of function of SSB1 led to decreased copy numbers of the mt and pt genomes, whereas there was only a minor change in mtDNA copy number in ssb2 mutants. Additionally, the frequencies of structural and small-scale variants were slightly increased in ssb1 and ssb2 mutants. Further research is needed to explore the distinct roles of SSB1 and SSB2 in maintaining organellar genome stability. Importantly, our findings indicate that variation in organellar genome copy number can affect DNA replication, recombination and repair in plants.
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (32170238, 32400191), Guangdong Basic and Applied Basic Research Foundation (2023A1515111029), the Science, Technology and Innovation Commission of Shenzhen Municipality (RCYX20200714114538196), the Chinese Academy of Agricultural Sciences Elite Youth Program (grant 110243160001007) and the Guangdong Pearl River Talent Program (2021QN02N792). We also acknowledge the advice of Dr. Daniel B. Sloan during manuscript preparation.
CRediT authorship contribution statement
Weidong Zhu: Writing – review & editing, Writing – original draft, Visualization, Validation, Investigation, Formal analysis, Data curation. Jie Qian: Writing – review & editing, Resources, Investigation. Yingke Hou: Writing – review & editing, Visualization. Luke R. Tembrock: Writing – review & editing. Liyun Nie: Validation. Yi-Feng Hsu: Resources. Yong Xiang: Writing – review & editing. Yi Zou: Writing – review & editing, Project administration, Methodology, Investigation, Funding acquisition, Conceptualization. Zhiqiang Wu: Writing – review & editing, Supervision, Resources, Project administration, Funding acquisition, Conceptualization.
Data accessibility statement
All relevant data can be found within the manuscript or supplementary information. The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (
Declaration of competing interest
The authors have no competing interest to declare.
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.pld.2024.11.001.
References
Abramson, J., Adler, J., Dunger, J., et al., 2024. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630: 493-500. DOI:10.1038/s41586-024-07487-w
Ayabe, H., Toyoda, A., Iwamoto, A., et al., 2023. Mitochondrial gene defects in Arabidopsis can broadly affect mitochondrial gene expression through copy number. Plant Physiol., 191: 2256-2275. DOI:10.1093/plphys/kiad024
Ayala-Garcia, V.M., Baruch-Torres, N., Garcia-Medel, P.L., et al., 2018. Plant organellar DNA polymerases paralogs exhibit dissimilar nucleotide incorporation fidelity. FEBS J., 285: 4005-4018. DOI:10.1111/febs.14645
Baruch-Torres, N., Brieba, L.G., 2017. Plant organellar DNA polymerases are replicative and translesion DNA synthesis polymerases. Nucleic Acids Res., 45: 10751-10763. DOI:10.1093/nar/gkx744
Bi, G.Q., Zhao, S.J., Yao, J.W., et al., 2024. Near telomere-to-telomere genome of the model plant Physcomitrium patens. Nat. Plants, 10: 327-343. DOI:10.1038/s41477-023-01614-7
Brieba, L.G., 2019. Structure-function analysis reveals the singularity of plant mitochondrial DNA replication components: a Mosaic and Redundant System. Plants, 8: 533. DOI:10.3390/plants8120533
Chen, C.J., Wu, Y., Li, J.W., et al., 2023. TBtools-II: a "one for all, all for one" bioinformatics platform for biological big-data mining. Mol. Plant, 16: 1733-1742. DOI:10.1016/j.molp.2023.09.010
Chen, T.T., Chen, X., Zhang, S.S., et al., 2021. The genome sequence archive Family: toward explosive data growth and diverse data types. Dev. Reprod. Biol., 19: 578-583. DOI:10.1016/j.gpb.2021.08.001
Chevigny, N., Schatz-Daas, D., Lotfi, F., et al., 2020. DNA repair and the stability of the plant mitochondrial genome. Int. J. Mol. Sci., 21: 328. DOI:10.3390/ijms21010328
Cupp, J.D., Nielsen, B.L., 2013. Arabidopsis thaliana organellar DNA polymerase IB mutants exhibit reduced mtDNA levels with a decrease in mitochondrial area density. Physiol. Plantarum, 149: 91-103. DOI:10.1111/ppl.12009
Cupp, J.D., Nielsen, B.L., 2014. Minireview: DNA replication in plant mitochondria. Mitochondrion, 19: 231-237. DOI:10.1016/j.mito.2014.03.008
Czernecki, D., Nourisson, A., Legrand, P., et al., 2023. Reclassification of family A DNA polymerases reveals novel functional subfamilies and distinctive structural features. Nucleic Acids Res., 51: 4488-4507. DOI:10.1093/nar/gkad242
Danecek, P., McCarthy, S.A., 2017. BCFtools/csq: haplotype-aware variant consequences. Bioinformatics, 33: 2037-2039. DOI:10.1093/bioinformatics/btx100
Edmondson, A.C., Song, D., Alvarez, L.A., et al., 2005. Characterization of a mitochondrially targeted single-stranded DNA-binding protein in Arabidopsis thaliana. Mol. Genet. Genom., 273: 115-122. DOI:10.1007/s00438-004-1106-5
Fuchs, P., Rugen, N., Carrie, C., et al., 2020. Single organelle function and organization as estimated from Arabidopsis mitochondrial proteomics. Plant J., 101: 420-441. DOI:10.1111/tpj.14534
Garcia-Medel, P.L., Baruch-Torres, N., Peralta-Castro, A., 2019. Plant organellar DNA polymerases repair double-stranded breaks by microhomology-mediated end-joining. Nucleic Acids Res., 47: 3028-3044. DOI:10.1093/nar/gkz039
Gualberto, J.M., Newton, K.J., 2017. Plant mitochondrial genomes: dynamics and mechanisms of mutation. Annu. Rev. Plant Biol., 68: 225-252. DOI:10.1146/annurev-arplant-043015-112232
Hernandez, A.J., Richardson, C.C., 2019. Gp2.5, the multifunctional bacteriophage T7 single-stranded DNA binding protein. Semin. Cell Dev. Biol., 86: 92-101. DOI:10.1016/j.semcdb.2018.03.018
Hou, X.R., Wang, D.P., Cheng, Z.K., et al., 2022. A near-complete assembly of an Arabidopsis thaliana genome. Mol. Plant, 15: 1247-1250. DOI:10.1016/j.molp.2022.05.014
Jain, C., Rhie, A., Zhang, H.W., Chu, C., et al., 2020. Weighted minimizer sampling improves long read mapping. Bioinformatics, 36: i111-i118. DOI:10.1093/bioinformatics/btaa435
Klodova, B., Potesil, D., Steinbachova, L., et al., 2023. Regulatory dynamics of gene expression in the developing male gametophyte of Arabidopsis. Plant Reprod., 36: 213-241. DOI:10.1007/s00497-022-00452-5
Lang, D., Ullrich, K.K., Murat, F., et al., 2018. The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution. Plant J., 93: 515-533. DOI:10.1111/tpj.13801
Li, D.Q., Wu, X.B., Wang, H.F., et al., 2021. Defective mitochondrial function by mutation in THICK ALEURONE 1 encoding a mitochondrion-targeted single-stranded DNA-binding protein leads to increased aleurone cell layers and improved nutrition in rice. Mol. Plant, 14: 1343-1361. DOI:10.1016/j.molp.2021.05.016
Lu, S.N., Wang, J.Y., Chitsaz, F., et al., 2020. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res., 48: D265-D268. DOI:10.1093/nar/gkz991
Marchler-Bauer, A., Bryant, S.H., 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Res., 32: W327-W331. DOI:10.1093/nar/gkh454
Misra, C.S., Sousa, A.G.G., Barros, P.M., et al., 2023. Cell-type-specific alternative splicing in the Arabidopsis germline. Plant Physiol., 192: 85-101. DOI:10.1093/plphys/kiac574
Morley, S.A., Ahmad, N., Nielsen, B.L., 2019a. Plant organelle genome replication. Plants, 8: 358. DOI:10.3390/plants8100358
Morley, S.A., Nielsen, B.L., 2016. Chloroplast DNA copy number changes during plant development in organelle DNA polymerase mutants. Front. Plant Sci., 7: 57. http://www.onacademic.com/detail/journal_1000040532009110_073a.html.
Morley, S.A., Peralta-Castro, A., Brieba, L.G., et al., 2019b. Arabidopsis thaliana organelles mimic the T7 phage DNA replisome with specific interactions between Twinkle protein and DNA polymerases Pol1A and Pol1B. BMC Plant Biol., 19: 241. DOI:10.1186/s12870-019-1854-3
Qian, J., Li, M., Zheng, M., et al., 2021. Arabidopsis SSB1, a mitochondrial single-stranded DNA-binding protein, is involved in ABA response and mitochondrial RNA splicing. Plant Cell Physiol., 62: 1321-1334. DOI:10.1093/pcp/pcab097
Qian, J., Zheng, M., Wang, L.G., et al., 2022. Arabidopsis mitochondrial single-stranded DNA-binding proteins SSB1 and SSB2 are essential regulators of mtDNA replication and homologous recombination. J. Integr. Plant Biol., 64: 1952-1965. DOI:10.1111/jipb.13338
Shedge, V., Davila, J., Arrieta-Montiel, M.P., et al., 2010. Extensive rearrangement of the Arabidopsis mitochondrial genome elicits cellular conditions for thermotolerance. Plant Physiol., 152: 1960-1970. DOI:10.1104/pp.109.152827
Shereda, R.D., Kozlov, A.G., Lohman, T.M., et al., 2008. SSB as an organizer/mobilizer of genome maintenance complexes. Crit. Rev. Biochem. Mol. Biol., 43: 289-318. DOI:10.1080/10409230802341296
Shinn, M.K., Kozlov, A.G., Nguyen, B., et al., 2019. Are the intrinsically disordered linkers involved in SSB binding to accessory proteins? Nucleic Acids Res., 47: 8581-8594. http://www.xueshufan.com/publication/2964319670.
Sloan, D.B., Alverson, A.J., Chuckalovcak, J.P., et al., 2012. Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates. PLoS Biology, 10: e1001241. DOI:10.1371/journal.pbio.1001241
Wallet, C., Ret, M.L., Bergdoll, M., et al., 2015. The RECG1 DNA translocase is a key factor in recombination surveillance, repair, and segregation of the mitochondrial DNA in Arabidopsis. Plant Cell, 27: 2907-2925. http://www.plantcell.org/content/27/10/2907.full-text.pdf.
Wang, J., Zou, Y., Mower, J.P., et al., 2024. Rethinking the mutation hypotheses of plant organellar DNA. Genom. Commun., 1: e003.
Wolfe, K.H., Li, W.H., Sharp, P.M., 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. U.S.A., 84: 9054-9058. DOI:10.1073/pnas.84.24.9054
Wu, Z.Q., Waneka, G., Broz, A.K., et al., 2020a. MSH1 is required for maintenance of the low mutation rates in plant mitochondrial and plastid genomes. Proc. Natl. Acad. Sci. U.S.A., 117: 16448-16455. DOI:10.1073/pnas.2001998117
Wu, Z.Q., Liao, X.Z., Zhang, X.N., et al., 2020b. Genomic architectural variation of plant mitochondria—a review of multichromosomal structuring. J. Systemat. Evol., 60: 160-168. DOI:10.1504/ijmpt.2020.113194
Wynn, E.L., Christensen, A.C., et al., 2019. Repeats of unusual size in plant mitochondrial genomes: identification, incidence and evolution. G3 Genes Genomes Genet., 9: 549-559. DOI:10.1534/g3.118.200948
Xue, Y.B., Bao, Y.M., Zhang, Z., et al., 2022. Database resources of the national genomics data center, China national center for bioinformation in 2022. Nucleic Acids Res., 50: D27-D38. DOI:10.1093/nar/gkab951
Zhang, D., Gao, F.L., Jakovlic, I., 2020. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol. Ecol. Resour., 20: 348-355. DOI:10.1111/1755-0998.13096
Zhang, L.G., Ma, J., Shen, Z.R., et al., 2022. Low copy numbers for mitochondrial DNA moderates the strength of nuclear-cytoplasmic incompatibility in plants. J. Integr. Plant Biol., 65: 739-754. DOI:10.3390/photonics9100739
Zou, Y., Zhu, W.D., Sloan, D.B., et al., 2022. Long-read sequencing characterizes mitochondrial and plastid genome variants in Arabidopsis msh1 mutants. Plant J., 112: 738-755. DOI:10.1111/tpj.15976
Zwonitzer, K.D., Tressel, L.G., Wu, Z.Q., et al., 2024. Genome copy number predicts extreme evolutionary rate variation in plant mitochondrial DNA. Proc. Natl. Acad. Sci. U.S.A., 121: e2317240121. DOI:10.1073/pnas.2317240121