Genomics Data Scientist (Fixed Term)

University of Cambridge - Department of Genetics

The research group of Al-Kindi Professor Richard Durbin is in the Department of Genetics, University of Cambridge. We are seeking a talented genomics data scientist or bioinformatician to contribute to data analysis and the development of methods and software. Current research focuses on the development of new data structures and algorithms for large-scale sequencing data, and their application to areas such as vertebrate genome assembly, human demography, and the genome and species evolution of Lake Malawi cichlid fish. We are also involved in the Vertebrate Genomes Project, which has the initial goal of producing reference-quality genome assemblies for one species from every vertebrate order using cutting-edge long-read DNA sequencing technologies such as PacBio, BioNano and 10X Genomics.

Primary responsibilities:

  • Develop, maintain, and run pipelines and processes for the QC and analysis of high-throughput sequencing data.
  • Evaluate and compare new tools and technologies, such as new assembly programs, for inclusion in the pipeline.
  • Develop and maintain a system for tracking data sets and their analysis progress against team projects.
  • Participate in the development of novel bioinformatics software tools and techniques for high-throughput sequencing and assembly.
  • Contribute to scientific projects and publications.
  • Help to make our data and resources available to a wide community of biologists and geneticists.

This role would suit somebody with some previous experience in bioinformatics or other large-scale scientific data analysis, or a newly qualified graduate student with data science skills and an interest in DNA sequence data. Previous experience with DNA sequencing data is not strictly necessary for the position. We have a strong publication record and a culture of producing open data resources and open-source software. The group also retains an affiliation with the Wellcome Trust Sanger Institute, so we will be working closely with many of the production and research groups there.


Essential skills and experience:

  • Advanced degree in a scientific discipline, or equivalent experience
  • Record of multiple years of computational scientific data analysis
  • Familiarity with the Unix computing environment
  • Proficiency in one or more scripting languages, preferably Python or Perl
  • Excellent critical-thinking and problem-solving skills
  • Attention to detail and the ability to work to agreed timelines
  • Strong communication skills, with the ability to elicit complex requirements from, and convey complex information to, groups with different levels of technical knowledge
  • Ability to quickly adapt to new problems and ideas
  • Experience with database management in MySQL or similar


Desirable skills and experience:

  • Knowledge of DNA sequencing data and technologies
  • Experience with the git version control system
  • Experience with running software on a compute farm or cluster
  • Previous experience with managing large volumes of data
  • Web development experience


Fixed-term: The funds for this post are available for 2 years in the first instance.

To apply online for this vacancy and to view further information about the role, please visit:

For further information or questions about this post, please contact Shane McCarthy or Richard Durbin.

Please quote reference PC15132 on your application and in any correspondence about this vacancy.

The University values diversity and is committed to equality of opportunity.

The University has a responsibility to ensure that all employees are eligible to live and work in the UK.
