Plastid phylogenomic analyses of the Selaginella sanguinolenta group (Selaginellaceae) reveal conflict signatures resulting from sequence types, outlier genes, and pervasive RNA editing


  Authors: Zhang MH, Xiang QP*, Zhang XC*


  Journal: Molecular Phylogenetics and Evolution


  Volume: 173  Issue:   Pages: 107507


  Unlike the generally conserved plastomes (plastid genomes) of most land plants, Selaginellaceae plastomes exhibit dynamic structure, high GC content, and high substitution rates. Previous plastome analyses identified strong conflict at several clades in Selaginella; however, the factors causing these conflicts and their impact on phylogenetic inference have not been sufficiently investigated. Here, we dissect the distribution of phylogenetic signal and conflict in the Selaginella sanguinolenta group, whose plastomes have a direct-repeat (DR) structure and genome-wide RNA editing. We analyzed data sets comprising 22 plastomes representing all species of the S. sanguinolenta group and covering its entire geographical distribution, from the Himalayas to Siberia and the Russian Far East. We recovered four different topologies by applying multispecies coalescent (ASTRAL) and concatenation (IQ-TREE and RAxML) methods to four data sets: PC (protein-coding genes), NC (non-coding sequences), PCN (PC and NC concatenated), and RC (in which predicted RNA editing sites “C” were corrected to “T”). Six monophyletic clades (the S. nummularifolia, S. rossii, S. sajanensis, S. sanguinolenta I, S. sanguinolenta II, and S. sanguinolenta III clades) were consistently resolved and were supported by GC content, RNA editing frequency, and gene content. However, the relationships among these clades varied across the four topologies. To explore the underlying causes of this uncertainty, we compared the phylogenetic signals of the four topologies. We found that sequence type (coding versus non-coding), outlier genes (genes with extremely high |ΔGLS| values), and C-to-U RNA editing frequency in the protein-coding genes were responsible for the unstable phylogenomic relationships.
We further revealed a significant positive correlation between |ΔGLS| values and the coefficient of variation of the number of RNA editing sites. Our results demonstrate that the coalescent method performed better than the concatenation method in overcoming the problems caused by outlier genes and extreme RNA editing events. Our study highlights the importance of exploring plastid phylogenomic conflicts and suggests that concatenated analyses of organelle genome data be conducted with caution.
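
The relationship between |ΔGLS| and RNA editing variation described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: ΔGLS is taken to be the per-gene difference in log-likelihood between two competing topologies, the coefficient of variation is the population standard deviation divided by the mean of per-taxon editing counts, and the correlation is Pearson's r. All gene names and numbers are made up for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, pstdev


def abs_delta_gls(lnl_t1, lnl_t2):
    """Per-gene |ΔGLS|, assumed here to be |lnL(gene | T1) - lnL(gene | T2)|."""
    return {gene: abs(lnl_t1[gene] - lnl_t2[gene]) for gene in lnl_t1}


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(counts)
    return pstdev(counts) / m if m else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-gene log-likelihoods under two candidate topologies.
lnl_t1 = {"geneA": -1200.0, "geneB": -980.5, "geneC": -1510.2, "geneD": -760.0}
lnl_t2 = {"geneA": -1201.5, "geneB": -981.0, "geneC": -1530.2, "geneD": -760.3}

# Hypothetical numbers of predicted RNA editing sites per gene, one value per taxon.
editing_counts = {
    "geneA": [10, 12, 11, 13],
    "geneB": [5, 5, 6, 5],
    "geneC": [2, 30, 8, 40],
    "geneD": [7, 7, 7, 7],
}

dgls = abs_delta_gls(lnl_t1, lnl_t2)
genes = sorted(dgls)
cv = {g: coefficient_of_variation(editing_counts[g]) for g in genes}

# The gene with the largest |ΔGLS| is the candidate outlier in this toy data set.
outlier = max(dgls, key=dgls.get)
r = pearson([dgls[g] for g in genes], [cv[g] for g in genes])

print("candidate outlier gene:", outlier)
print("Pearson r between |ΔGLS| and editing CV:", round(r, 3))
```

In practice the per-gene log-likelihoods would come from a likelihood program evaluated on fixed topologies, and a significance test would accompany the correlation; both are omitted from this sketch.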




