ESR 14: Computational methods for the analysis of metagenomic datasets to extract viral sequences within the context of commercial data-mining

Field: Bioinformatics
Host institution: University of Vienna (UV), Austria
Local supervisor: Prof. Dr. Thomas Rattei (UV, Computational Systems Biology)
Local co-supervisor: Prof. Dr. Matthias Horn (UV, Microbiology and Ecosystem Science)
Project partner: ESR 7

Bacteriophages are increasingly being developed for the treatment of bacterial infections, and their use as modulators of the microbiome is a growing area of application. The sheer number and diversity of phage particles make it impossible to isolate and test each of them individually. This project aims to identify and harness phage sequences for commercial use by developing a method for large-scale identification of phage sequences, functional annotation, and in vitro validation of findings in a bottom-up approach.

The specific aims are:

  • Identification of phage sequences within >50 metagenomic datasets (including upstream generation of >5 metagenomic datasets by partner UZH and meta-analysis of >45 publicly available datasets). This comprises reconstruction of viral genome sequences from metagenomic data (assembly, binning, phage classification); reconstruction of cellular genome sequences (metagenome-assembled genomes, MAGs) from the same or related metagenomic datasets, in order to extract potential phage-host pairs from co-occurrence patterns; and prediction of prophage sequences in MAGs to facilitate host prediction of phage sequences (in collaboration with ESR 12).
  • Extraction and testing of sequences to validate WP 1 findings, i.e., hypothesis testing by either isolating the phages themselves or having their genomes synthesised (in collaboration with ESR 7).
  • Sub-analysis of geographical and temporal variation in metagenomic datasets that could justify product composition or modification: analysis of geographical and temporal variation of phage sequences according to the local epidemiology, and relating these changes to geographical and temporal variation in related microbiome data (in collaboration with ESR 7).
  • Selection of sequences with developmental potential (e.g. phage sequences with greater pH stability or expanded host range).
  • Computational modelling of phage stability and host range (in collaboration with ESR 7).
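
To illustrate what extracting potential phage-host pairs from co-occurrence patterns (first aim) could look like in its simplest form, the following is a minimal sketch, not the project's actual method. It assumes that each phage contig and each MAG has already been reduced to the set of datasets in which it was detected, and it ranks pairs by the Jaccard index of those sets. The function names, the 0.5 threshold, and the toy data are all hypothetical.

```python
from itertools import product


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets of dataset IDs (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(phage_presence, mag_presence, min_score=0.5):
    """Rank phage-MAG pairs by how often they are detected in the same datasets.

    phage_presence / mag_presence map a sequence name to the set of dataset
    IDs in which it was detected. min_score is an arbitrary, assumed cut-off.
    """
    pairs = []
    for (phage, p_sets), (mag, m_sets) in product(
        phage_presence.items(), mag_presence.items()
    ):
        score = jaccard(p_sets, m_sets)
        if score >= min_score:
            pairs.append((phage, mag, score))
    # Highest score first; names break ties so the order is deterministic.
    return sorted(pairs, key=lambda t: (-t[2], t[0], t[1]))


# Hypothetical toy data: detection of two phage contigs and two MAGs
# across four datasets d1..d4.
phage_presence = {"phage_A": {"d1", "d2", "d3"}, "phage_B": {"d4"}}
mag_presence = {"MAG_1": {"d1", "d2", "d3", "d4"}, "MAG_2": {"d2", "d3"}}

pairs = candidate_pairs(phage_presence, mag_presence)
for phage, mag, score in pairs:
    print(f"{phage}\t{mag}\t{score:.2f}")
```

Co-occurrence alone yields only candidate pairs; in the aims above it is complemented by prophage prediction in MAGs and by in vitro validation.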