Blog

How to start reading about Genomic Language Models!

March 3, 2026

The first step is to familiarize yourself with the key concepts and literature. Here is a list of review papers and book chapters to get you started.

  • Deep learning: new computational modelling techniques for genomics. Nature Reviews Genetics, 2019, Paper.
  • Transformers and genome language models. Nature Machine Intelligence, 2025, Paper.
  • Genomic language models: opportunities and challenges. Trends in Genetics, 2025, Paper.

And these are some introductory YouTube videos:

  • Large Language Models in Computational Biology by Jian Ma, Link, 43 mins.

  • MIA Primer: Gokcen Eraslan, A Primer on DNA Foundation Modeling, Link, 61 mins.

Next, we need to get to know the methods literature. These are some of the latest methods papers:

  • Species-aware DNA language models capture regulatory elements and their evolution. Genome Biology, 2024, Paper.
  • A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nature Biotechnology, 2025, Paper.
  • Predicting functional constraints across evolutionary timescales with phylogeny-informed genomic language models. Paper.

Next steps include identifying the knowledge gaps and areas for improvement. It is also important to run the tools on small datasets and review their outputs.
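To make the "small datasets" step concrete, here is a minimal sketch of cutting a FASTA file down to a few short sequences before handing it to a model. It assumes plain FASTA text as input. The helper names (`read_fasta`, `take_subset`, `write_fasta`) and the toy sequences are made up for illustration and do not come from any of the tools above.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def take_subset(records, max_records, max_length):
    """Keep the first max_records sequences, each cut to max_length bases."""
    return [(h, s[:max_length]) for h, s in records[:max_records]]


def write_fasta(records):
    """Serialize (header, sequence) pairs back to FASTA text."""
    return "".join(f">{h}\n{s}\n" for h, s in records)


# Toy input (hypothetical sequences, not real data).
toy = ">seq1\nACGTACGT\nACGT\n>seq2\nGGGCCC\n>seq3\nTTTT\n"
subset = take_subset(read_fasta(toy), max_records=2, max_length=10)
print(write_fasta(subset), end="")
```

Taking the first few records keeps the sketch simple. For a real test set, a random or stratified sample is probably a better choice.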

How to start working in orthology inference!

March 2, 2026

The first step is to familiarize yourself with the key concepts and literature. Here is a list of review papers and book chapters to get you started.

  • Orthologs, Paralogs, and Evolutionary Genomics. Paper.
  • Orthology: Promises and Challenges. Paper.

Watch this YouTube video: “Introduction to molecular evolution & phylogenetics”, Link, starting at 1:28:00.

Next, we need to get to know the methods literature. These are some of the latest methods papers:

  • Orthology inference at scale with FastOMA. Paper.
  • OrthoFinder: phylogenetic orthology inference for comparative genomics. Paper.
  • SonicParanoid: fast, accurate and easy orthology inference. Paper.

Then, the goal is to identify the knowledge gaps and areas for improvement. An important step is to run the tools on small datasets and check their outputs.
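One simple way to check the outputs is to compare the ortholog pairs that two tools report on the same small dataset. The sketch below assumes you have already parsed each tool's output into a list of gene-ID pairs. That parsing is not shown, because every tool has its own output format. The function names and gene IDs are hypothetical.

```python
def normalize_pairs(pairs):
    """Turn (gene_a, gene_b) tuples into an order-independent set of pairs."""
    # frozenset makes (a, b) and (b, a) the same pair; self-pairs are dropped.
    return {frozenset(p) for p in pairs if p[0] != p[1]}


def compare_pair_sets(pairs_a, pairs_b):
    """Summarize the overlap between two lists of predicted ortholog pairs."""
    a, b = normalize_pairs(pairs_a), normalize_pairs(pairs_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        # Jaccard index: shared pairs divided by all distinct pairs.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Toy predictions from two hypothetical tools (made-up gene IDs).
tool_a = [("human_G1", "mouse_G1"), ("human_G2", "mouse_G2"), ("human_G3", "mouse_G3")]
tool_b = [("mouse_G1", "human_G1"), ("human_G2", "mouse_G2"), ("human_G4", "mouse_G4")]

summary = compare_pair_sets(tool_a, tool_b)
print(summary)
```

The pairs that only one tool reports are a good place to start looking for knowledge gaps, since they point at cases where the methods disagree.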

Intro to Machine Learning!

March 1, 2026

There are many online courses on machine learning. I would suggest the following:

1- MIT Introduction to Deep Learning 6.S191, Link, 10 videos, each around 50 mins.

2- Stanford CS229: Machine Learning by Andrew Ng, Autumn 2018, Link. A similar one is available on Coursera.

3- Hugging Face Course, Link, 80 videos and accompanying material. It covers the more practical side of LLMs.

Depending on your goal, the following courses are useful:

  • Machine Learning Undergraduate Course by Kasper Green Larsen (Aarhus University), Link, 59 videos. A more mathematical course.

  • Stanford CS224W: Machine Learning with Graphs by Jure Leskovec Link, 60 videos.