Long-read sequencing for de novo genome assembly in bioeconomic context

Vogel, Alexander; Usadel, Björn (Thesis advisor); Kurth, Ingo (Thesis advisor); Schaffrath, Ulrich (Thesis advisor)

Aachen : RWTH Aachen University (2020, 2021)
Dissertation / PhD Thesis

Dissertation, RWTH Aachen University, 2020


Recent advances in third-generation sequencing technologies offer unprecedented possibilities for DNA sequencing. Here I present an optimized protocol for the enrichment of long reads for nanopore sequencing, together with adaptations of it to the characteristics of different chemistry versions and to the project requirements of whole-genome de novo assembly, genome scaffolding and structural variant detection. The resulting data, spanning the infancy of the technology to the most recent chemistry versions, were applied to organisms of varying genome complexity, including bacteria, algae and complex plants. Supplemented by, and compared with, conventional next-generation sequencing and alternative long-read sequencing and scaffolding approaches, the work resulted in high-quality genome sequences of the engineered bacterium Gluconobacter oxydans IK003.1, five Chlorella algae strains, and the plants Cuscuta campestris, Solanum pennellii LYC1722 and Solanum lycopersicoides. Of these, S. pennellii marks the first complete plant genome sequenced using nanopore technology and provides a valuable public dataset for the development of novel algorithms and assembly tools for this emerging data source. Similarly, the genomes of Chlorella sorokiniana 211-8k and Solanum lycopersicoides presented here are among the most contiguous long-read sequenced algal and plant genomes to date. Furthermore, taking into account recent developments in the fast-evolving field of nanopore sequencing, this thesis provides detailed evidence of current limitations of the workflows and identifies potential bottlenecks to the concurrent increase of read length and sequencing yield, while providing a comprehensive snapshot of most current state-of-the-art technologies and their combination for various genome complexities.