I grew up on the tropical island of Hainan in southern China, next to beautiful beaches and coconut trees. I studied in Hong Kong (another island!) for my bachelor’s degree with a double major in Ecology and Computer Science, where my interest in Bioinformatics was first sparked. After graduating from HKU, I found myself on the coast of the Red Sea, at the King Abdullah University of Science and Technology (KAUST), a graduate university then less than 10 years old, where I eventually completed a Master’s degree in Statistics.
I’ve always liked the intersection of Biology, Computer Science, Statistics, and Mathematics, and the VIROINF ITN has just the right amount of each. I am sure that my PhD studies will help me grow as a scientist and learn many cool things along the way!
Berlin will also be my first inland and non-tropical place to call home, and I look forward to enjoying and enduring this thing called seasons 🙂 During my free time, I enjoy hiking, tennis, swimming, and taking lots of naps.
Robert Koch Institute (RKI), Germany
Dr. Max von Kleist (RKI)
Prof. Dr. Dino McMahon (FU Berlin)
Harshit Prajapati (ESR4)
WP 1.3 Virus-host interactions
WP 2.1 Microevolution: Virus quasispecies
WP 2.2 Macroevolution: Natural selection of viruses
Viruses hijack the host cellular machinery for replication. This hijacking is driven by the interaction of viral proteins and non-coding RNAs with host-cellular components. Viral genomic sites instrumental to these interactions are thus conserved through evolution, yet bioinformatics analysis of evolutionary conservation alone can neither distinguish nor quantify the functional relevance of genomic sites under selection. Using the Mutational Interference Mapping Experiment (MIME), we have previously shown that in vitro and in vivo evolution experiments followed by next-generation sequencing (NGS) generate data sets that allow the phenotype of every nucleotide to be quantified in a single experiment.
The goal of this project is to improve and tailor mathematical and computational methods to identify and functionally characterise domains in the viral genome under evolutionary selection and to apply these tools to data from ESR 4, ESR 9, ESR 2 and our secondment.
- On the technical side, we will learn maximum entropy models (direct coupling analysis, DCA) from nucleotide abundances in functionally selected and de-selected virus quasispecies to identify both individual sites and interacting sites.
- As an alternative approach and a means to validate the above, we will investigate deep learning methods to identify higher-order epistasis, in collaboration with our secondment host Bernhard Renard (HPI).
- We will then derive algebraic expressions from kinetic modelling of the investigated selection process to interpret the derived coupling terms mechanistically (phenotypically). This allows us to characterise the complex (and possibly highly constraining) fitness landscape on which adaptation takes place.
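
As a rough illustration of the kind of input such models start from, the sketch below computes, for each genome position, the log2 ratio of smoothed nucleotide frequencies between a selected and a de-selected pool. It is neither DCA nor the MIME analysis itself, only a toy per-site enrichment score. The function names, the pseudocount and the count data are all hypothetical.

```python
import math

NUCLEOTIDES = "ACGU"


def site_frequencies(counts, pseudocount=1.0):
    """Turn raw nucleotide counts at one site into smoothed frequencies.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for nucleotides that were never observed.
    """
    total = sum(counts.get(n, 0) for n in NUCLEOTIDES) + pseudocount * len(NUCLEOTIDES)
    return {n: (counts.get(n, 0) + pseudocount) / total for n in NUCLEOTIDES}


def log_enrichment(selected, unselected, pseudocount=1.0):
    """Per-site, per-nucleotide log2 ratio of selected vs. de-selected frequencies.

    Positive values mean the nucleotide is enriched in the selected pool,
    negative values mean it is depleted.
    """
    scores = []
    for sel_counts, unsel_counts in zip(selected, unselected):
        f_sel = site_frequencies(sel_counts, pseudocount)
        f_unsel = site_frequencies(unsel_counts, pseudocount)
        scores.append({n: math.log2(f_sel[n] / f_unsel[n]) for n in NUCLEOTIDES})
    return scores


# Hypothetical toy counts for two genome positions (not real data).
selected = [
    {"A": 98, "C": 0, "G": 0, "U": 0},      # site 0: only A survives selection
    {"A": 24, "C": 24, "G": 24, "U": 24},   # site 1: no preference
]
unselected = [
    {"A": 48, "C": 48, "G": 0, "U": 0},
    {"A": 24, "C": 24, "G": 24, "U": 24},
]

scores = log_enrichment(selected, unselected)
for position, site_scores in enumerate(scores):
    print(position, {n: round(s, 2) for n, s in site_scores.items()})
```

In this toy example, site 0 shows A enriched and C depleted after selection, while site 1 scores zero throughout. Pairwise models such as DCA go further by also using how often nucleotides at two sites occur together.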
In the development phase, our computational methods will be benchmarked by simulating the selection experiments and the corresponding NGS data sets, as well as with existing NGS data sets and phenotypic endpoints from previous work (e.g. HIV genome packaging). Subsequently, we will apply the tools to NGS data generated by in vitro evolution experiments of ESR 2 / secondment R. Smyth (HIRI) to investigate domains in the influenza genomic RNA that are responsible for viral packaging/re-assortment, in collaboration with AllGenetics. Moreover, we will apply the methods to in vivo evolution experiments conducted by ESR 4 (deformed wing virus) and ESR 9 (Drosophila virus) to predict phenotypic contributions to virus growth, mortality and host transcriptional responses.
Systematic benchmarking and error evaluation of basecallers for Nanopore Direct RNA-seq
Poster presentation at the 21st European Conference on Computational Biology 2022, 20.09.2022.
FrameRate: predicting coding frames direct from unassembled DNA reads
Poster presentation at the 21st European Conference on Computational Biology 2022, 19.09.2022.