ESR 8: Genotype-Phenotype mapping and inference of epistatic interactions driving adaptation in viral hosts

Liuwei Wang

I grew up on the tropical Hainan island in southern China, next to beautiful beaches and coconut trees. I studied in Hong Kong (another island!) for my bachelor’s degree with a double major in Ecology and Computer Science, where my interest in Bioinformatics was first sparked. After I graduated from HKU, I found myself by the coast of the Red Sea, at the King Abdullah University of Science and Technology (KAUST), a graduate university then less than 10 years old, where I eventually completed a Master’s degree in Statistics.

I’ve always liked the intersection of Biology, Computer Science, Statistics, and Mathematics, and the VIROINF ITN has just the right amount of each. I am sure that my PhD studies will help me grow as a scientist and learn many cool things along the way!

Berlin will also be my first inland and non-tropical place to call home, and I look forward to enjoying and enduring this thing called seasons 🙂 During my free time, I enjoy hiking, tennis, swimming, and taking lots of naps.

Host institution:
Robert Koch Institute (RKI), Germany
Local supervisor:
Dr. Max von Kleist (RKI)
Local co-supervisor:
Prof. Dr. Dino McMahon (FU Berlin)
Project partner:
Harshit Prajapati (ESR4)
Work packages:
WP 1.3 Virus-host interactions
WP 2.1 Microevolution: Virus quasispecies
WP 2.2 Macroevolution: Natural selection of viruses


Project description

Viruses hijack the host cellular machinery for replication. This hijacking is driven by the interaction of viral proteins and non-coding RNAs with host-cellular components. Viral genomic sites instrumental to these interactions are thus conserved through evolution, yet bioinformatics analysis of evolutionary conservation alone can neither distinguish nor quantify the functional relevance of genomic sites under selection. We have previously shown using the Mutational Interference Mapping Experiment (MIME) that in vitro and in vivo evolution experiments followed by next-generation sequencing (NGS) generate datasets that allow the phenotype of every nucleotide to be quantified in a single experiment.

The goal of this project is to improve and tailor mathematical and computational methods to identify and functionally characterise domains in the viral genome under evolutionary selection and to apply these tools to data from ESR 4, ESR 9, ESR 2 and our secondment.

  • On the technical side, we will learn maximum entropy models (direct coupling analysis, DCA) from nucleotide abundances in functionally selected and de-selected virus quasispecies to identify single sites and interacting sites.
  • As an alternative approach and a means to validate the above, we will investigate deep learning approaches to identify higher-order epistasis, in collaboration with Bernhard Renard (HPI, secondment).
  • We will then derive algebraic expressions from kinetic modelling of the investigated selection process to interpret the derived coupling terms mechanistically (phenotypically). This allows us to characterise the complex (and possibly highly constraining) fitness landscape on which adaptation takes place.
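
To make the idea of a coupling term concrete, the following is a minimal sketch and not the project's actual method. It assumes aligned reads of equal length, reduces each site to a binary state (wild type or mutated), and considers only two sites at a time. Under those assumptions the maximum entropy model has the closed-form coupling J = log(p11 p00 / (p10 p01)). Full DCA instead fits all sites and all nucleotide states jointly. The function names, the toy reads, and the choice to compare pools by subtracting couplings are all hypothetical.

```python
import math
from collections import Counter

STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def pair_counts(reads, i, j, wild_type):
    """Count binary (mutated at i, mutated at j) states across aligned reads."""
    counts = Counter()
    for read in reads:
        state = (int(read[i] != wild_type[i]), int(read[j] != wild_type[j]))
        counts[state] += 1
    return counts

def coupling(counts, pseudocount=0.5):
    """Two-site maximum entropy coupling J = log(p11 * p00 / (p10 * p01)).

    The normalisation cancels in the ratio, so raw counts can be used.
    The pseudocount (an assumption) avoids division by zero for unseen states.
    """
    n = {s: counts[s] + pseudocount for s in STATES}
    return math.log(n[(1, 1)] * n[(0, 0)] / (n[(1, 0)] * n[(0, 1)]))

def selection_coupling(selected, unselected, i, j, wild_type, pseudocount=0.5):
    """Difference in coupling between the selected and unselected pools."""
    return (coupling(pair_counts(selected, i, j, wild_type), pseudocount)
            - coupling(pair_counts(unselected, i, j, wild_type), pseudocount))

# Toy data (hypothetical): mutations at sites 0 and 3 are independent in the
# unselected pool, but the double mutant is enriched after selection.
wild_type = "ACGU"
unselected = ["ACGU"] * 4 + ["GCGU"] * 2 + ["ACGA"] * 2 + ["GCGA"] * 1
selected = ["ACGU"] * 4 + ["GCGU"] * 1 + ["ACGA"] * 1 + ["GCGA"] * 4

delta_j = selection_coupling(selected, unselected, 0, 3, wild_type, pseudocount=0.0)
print(delta_j)  # log(16), i.e. positive epistasis between sites 0 and 3
```

A positive value means the double mutant survives selection more often than the two single mutants would suggest on their own; a value near zero means the sites contribute independently.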

In the development phase, our computational methods will be benchmarked on simulated selection experiments and the corresponding simulated NGS datasets, as well as on existing NGS datasets and phenotypic endpoints from previous work (e.g. HIV genome packaging). Subsequently, we will apply the tools to NGS data generated by in vitro evolution experiments of ESR 2 / secondment R. Smyth (HIRI) to investigate domains in the influenza genomic RNA that are responsible for viral packaging/re-assortment, in collaboration with AllGenetics. Moreover, we will apply the methods to in vivo evolution experiments conducted by ESR 4 (deformed wing virus) and ESR 9 (Drosophila virus) to predict phenotypic contributions to virus growth, mortality and host transcriptional responses.
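
As an illustration of what benchmarking on a simulated selection experiment could look like, here is a minimal sketch under strong simplifying assumptions: a single round of selection, independent per-site effects on log-fitness (no epistasis), non-positive effects so that survival probabilities stay at or below 1, and uniform sequencing errors. All function names and parameter values are hypothetical and are not taken from the project.

```python
import math
import random

BASES = "ACGU"

def mutate(seq, rate, rng):
    """Replace each base by a different random base with probability `rate`."""
    return "".join(
        rng.choice([b for b in BASES if b != base]) if rng.random() < rate else base
        for base in seq
    )

def simulate_pools(wild_type, effects, n_reads, mut_rate, seq_error, rng):
    """Return (unselected, selected) read pools for one round of selection."""
    library = [mutate(wild_type, mut_rate, rng) for _ in range(n_reads)]
    survivors = []
    for genome in library:
        # Log-fitness is the sum of effects of mutated sites (effects must be <= 0).
        log_fitness = sum(e for base, wt, e in zip(genome, wild_type, effects) if base != wt)
        if rng.random() < math.exp(log_fitness):
            survivors.append(genome)
    # Sequencing adds independent base-call errors to both pools.
    unselected = [mutate(g, seq_error, rng) for g in library]
    selected = [mutate(g, seq_error, rng) for g in survivors]
    return unselected, selected

def log_enrichment(unselected, selected, wild_type, site, pseudocount=0.5):
    """Log ratio of mutant-to-wild-type odds at `site`, selected vs. unselected."""
    def mutant_odds(pool):
        mutants = sum(read[site] != wild_type[site] for read in pool)
        return (mutants + pseudocount) / (len(pool) - mutants + pseudocount)
    return math.log(mutant_odds(selected) / mutant_odds(unselected))

rng = random.Random(0)  # fixed seed so the simulation is reproducible
wild_type = "ACGUAC"
effects = [0.0, 0.0, -2.0, 0.0, 0.0, 0.0]  # only site 2 is functionally relevant
unselected, selected = simulate_pools(wild_type, effects, 20000, 0.1, 0.001, rng)
estimates = [log_enrichment(unselected, selected, wild_type, s)
             for s in range(len(wild_type))]
print([round(e, 2) for e in estimates])
```

Because the true effects are known in the simulation, the estimates can be compared against them directly: the estimate at site 2 should land near the simulated effect, while the neutral sites should stay close to zero.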


Liu-Wei, Wang; Bohn, Patrick; van der Toorn, Wiep; Smyth, Redmond P.; von Kleist, Max

Systematic benchmarking and error evaluation of basecallers for Nanopore Direct RNA-seq

Poster at 21st European Conference on Computational Biology 2022, 20.09.2022.


Liu-Wei, Wang; Hoehndorf, Robert; Aubrey, Wayne; Creevey, Christopher; Clare, Amanda; Dimonaco, Nicholas

FrameRate: predicting coding frames direct from unassembled DNA reads

Poster at 21st European Conference on Computational Biology 2022, 19.09.2022.