
Viruses display high genetic diversity both within and among viral species, as well as within and among infected hosts. The composition of mixed samples can be assessed by metagenomic approaches, such as taxonomic classification of sequence reads against existing reference genomes and databases. However, for the majority of novel viral sequences encountered in biodiversity studies (WP 1.1), no reference genome or homolog is known, and the state-of-the-art genome assemblers used for de novo discovery of viral species ignore low-frequency variants and technical errors.
Low-frequency variants are of particular interest because they can harbour drug resistance mutations or affect virulence. In this context, the interdisciplinary approach of VIROINF makes it possible to address the following questions:
- How can we distinguish viral haplotypes in RNA-Seq data and characterise sequence-based evolution (ESR 3, ESR 8, ESR 4)?
- What is the role of quasispecies in virus pathogenesis and evolution (ESR 14, ESR 4)?
- How do intra-viral (ESR 1, ESR 2) and virus-host (ESR 6, ESR 15) selective pressures shape short-term evolution?
- Do RNA modifications also exert selective pressure (ESR 15)?
- Can we predict and design the fittest virus within a viral quasispecies (ESR 1)?
These questions will mainly be addressed by integrating virus evolution experiments that generate high-resolution second- and third-generation sequencing data (ESR 4, ESR 9) with the development of novel bioinformatics tools that resolve quasispecies structures from the resulting data (ESR 8, ESR 3).
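One recurring difficulty behind these questions is telling a genuine low-frequency variant apart from technical sequencing error. The following is a minimal sketch of that idea only, not the method of any VIROINF tool: it assumes a single, position-independent error rate and asks whether the number of reads supporting a variant is larger than errors alone would plausibly produce. The function names, the default error rate of 1% and the significance threshold are all illustrative assumptions.

```python
import math


def binomial_tail(k, n, error_rate):
    """P(X >= k) for X ~ Binomial(n, error_rate), summed in log space."""
    log_e = math.log(error_rate)
    log_not_e = math.log1p(-error_rate)
    total = 0.0
    for i in range(k, n + 1):
        # log of the binomial probability mass at i, via log-gamma
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_e + (n - i) * log_not_e
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exceeds_error_rate(alt_reads, depth, error_rate=0.01, alpha=1e-3):
    """True if alt_reads out of depth is unlikely under sequencing error alone.

    error_rate and alpha are assumed values chosen for illustration.
    """
    return binomial_tail(alt_reads, depth, error_rate) < alpha


# Hypothetical pileup counts at 1000x coverage.
print(exceeds_error_rate(30, 1000))  # variant supported by 3% of reads
print(exceeds_error_rate(12, 1000))  # variant supported by 1.2% of reads
```

A per-site test like this ignores base qualities, strand bias and linkage between sites, which is precisely the information that haplotype-aware tools try to exploit.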
VIROINF will search for patterns that emerge when host affiliation is projected onto viral taxonomy. Branch permutation techniques will be used to determine statistically at which level viruses and hosts co-evolve. These patterns will reveal the evolution of virus-host associations for different classes of viruses. The associations will be validated by analysing known and predicted host associations independently (see WP 1.2). Moreover, virus-host associations will be used to validate the co-evolutionary signal.
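The logic of a permutation test on host affiliation can be illustrated with a small sketch. Note that this is a simple label-permutation test, used here as a stand-in for the branch permutation techniques mentioned above, and not the project's actual procedure. It assumes each virus has exactly one host and measures, for a given taxonomic level, how often a virus shares the majority host of its clade. The statistic (`clade_purity`), the function names and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def clade_purity(clades, hosts):
    """Fraction of viruses whose host is the majority host of their clade."""
    hosts_by_clade = defaultdict(list)
    for clade, host in zip(clades, hosts):
        hosts_by_clade[clade].append(host)
    majority_total = sum(
        Counter(members).most_common(1)[0][1]
        for members in hosts_by_clade.values()
    )
    return majority_total / len(hosts)


def permutation_test(clades, hosts, n_perm=2000, seed=0):
    """Return (observed purity, one-sided p-value) under shuffled host labels."""
    rng = random.Random(seed)
    observed = clade_purity(clades, hosts)
    shuffled = list(hosts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if clade_purity(clades, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 12 viruses, two taxonomic levels, one host each.
genus = ["g1"] * 4 + ["g2"] * 4 + ["g3"] * 4
family = ["f1"] * 8 + ["f2"] * 4
hosts = ["bat"] * 4 + ["bird"] * 4 + ["pig"] * 4

for level_name, level in (("genus", genus), ("family", family)):
    purity, p_value = permutation_test(level, hosts)
    print(f"{level_name}: purity={purity:.2f}, p={p_value:.4f}")
```

Running the same test at several taxonomic levels and comparing the resulting p-values gives a rough indication of the level at which host association is stronger than expected by chance.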
PhD Projects
Main projects:
Side projects:
Journal Articles
V-pipe 3.0: a sustainable pipeline for within-sample viral genetic diversity estimation
In: GigaScience, vol. 13, pp. giae065, 2024.
VILOCA: Sequencing quality-aware haplotype reconstruction and mutation calling for short- and long-read data
In: bioRxiv, 2024.
FrameRate: learning the coding potential of unassembled metagenomic reads
In: bioRxiv, 2022.
ITN -- VIROINF: Understanding (Harmful) Virus-Host Interactions by Linking Virology and Bioinformatics
In: Viruses, vol. 13, no. 5, pp. 766, 2021.
Presentations
Systematic benchmarking and error evaluation of basecallers for Nanopore Direct RNA-seq
Poster at 21st European Conference on Computational Biology 2022, 20.09.2022.
FrameRate: predicting coding frames direct from unassembled DNA reads
Poster at 21st European Conference on Computational Biology 2022, 19.09.2022.
Read error correction for heterogeneous NGS samples using Dirichlet mixture models
Poster at Asonca 2022 - Biological systems: from first principles to data-driven modelling and back, 27.03.2022.
Error correction and local haplotype reconstruction for NGS data
Poster at International Virus Bioinformatics Meeting 2022, 24.03.2022.