Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement

2023-07-04

  Authors: Li KP, Xu P, Wang JP, Yi X, Jiao YN*

  Impact factor: 16.6

  Journal: Nature Communications

  Year of publication: 2023

  Volume: 14 Issue: 1 Pages: 6556

  Abstract:

  Assembly of a high-quality genome is important for downstream comparative and functional genomic studies. However, most tools for genome assembly assessment only give qualitative reports, which do not pinpoint assembly errors at specific regions. Here, we develop a new reference-free tool, Clipping information for Revealing Assembly Quality (CRAQ), which maps raw reads back to assembled sequences to identify regional and structural assembly errors based on effective clipped alignment information. Error counts are transformed into corresponding assembly evaluation indexes to reflect the assembly quality at single-nucleotide resolution. Notably, CRAQ distinguishes assembly errors from heterozygous sites or structural differences between haplotypes. This tool can clearly indicate low-quality regions and potential structural error breakpoints; thus, it can identify misjoined regions that should be split for further scaffold building and improvement of the assembly. We have benchmarked CRAQ on multiple genomes assembled using different strategies, and demonstrated the misjoin correction for improving the constructed pseudomolecules.
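
  The clipped-alignment idea in the abstract can be illustrated with a small sketch. This is not CRAQ's implementation: the function names, the `min_support` threshold, and the toy alignments are hypothetical and chosen only for illustration. The sketch assumes alignments are given as (contig, POS, CIGAR) values in SAM conventions, and reports reference positions where several reads are soft- or hard-clipped at the same coordinate, which is one plausible signal of a candidate breakpoint.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_breakpoints(pos, cigar):
    """Return 1-based reference coordinates where an alignment is clipped.

    pos   -- 1-based leftmost mapped reference position (SAM POS field)
    cigar -- SAM CIGAR string, e.g. "50M30S"
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    breakpoints = []
    # A leading S/H clip puts the breakpoint at the first aligned base.
    if ops and ops[0][1] in "SH":
        breakpoints.append(pos)
    # Only M, D, N, = and X consume reference bases.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    # A trailing S/H clip puts the breakpoint at the last aligned base.
    if len(ops) > 1 and ops[-1][1] in "SH":
        breakpoints.append(pos + ref_len - 1)
    return breakpoints


def clipped_sites(alignments, min_support=2):
    """Count clipped reads per (contig, position); keep well-supported sites.

    min_support is a hypothetical cutoff, not a value taken from the paper.
    """
    counts = Counter()
    for contig, pos, cigar in alignments:
        for bp in clip_breakpoints(pos, cigar):
            counts[(contig, bp)] += 1
    return {site: n for site, n in counts.items() if n >= min_support}


# Toy input: two reads are clipped at the same position on ctg1.
example_alignments = [
    ("ctg1", 100, "50M30S"),
    ("ctg1", 120, "30M25S"),
    ("ctg1", 150, "20S40M"),
    ("ctg1", 10, "60M"),
]
candidate_sites = clipped_sites(example_alignments)
```

  In practice a real tool would also need to weigh clipping against local read depth and heterozygosity, as the abstract notes; this sketch deliberately leaves those steps out.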

  Full text: https://www.nature.com/articles/s41467-023-42336-w

