Computational Proteomics Scientist
EMBL-EBI - European Bioinformatics Institute
Hinxton, United Kingdom
Are you a motivated computational scientist passionate about developing algorithms for cutting-edge proteomics? We are looking for an enthusiastic Computational Proteomics Scientist to join the Proteomics and Metabolomics Team at the European Bioinformatics Institute (EMBL-EBI). The successful candidate will contribute to the development of algorithms, computational resources, and community standards for single-cell proteomics, working at the interface of data science, software engineering, and biological discovery.
Our team is responsible for the development and maintenance of world-leading proteomics resources, including PRIDE, a founding member of the ProteomeXchange Consortium, which captures and disseminates large-scale proteomics data from the global scientific community. We are also major contributors to international standardization efforts through the Proteomics Standards Initiative (PSI). Single-cell proteomics is an emerging and rapidly evolving field, generating complex, sparse, and large-scale datasets. You will play a key role in designing scalable computational methods and standards to support robust analysis, reproducibility, and FAIR data sharing for next-generation proteomics experiments.
Your role:
As part of a multidisciplinary and highly international team, you will contribute to the development of novel computational methods and software infrastructure for single-cell proteomics. You will:
- Design and develop algorithms for single-cell proteomics data analysis, including peptide/protein quantification, normalization, missing value handling, batch correction, and quality control;
- Develop scalable and high-performance software components in Python and C++ for processing large-scale and high-dimensional proteomics datasets;
- Contribute to machine learning and deep learning approaches for proteomics data analysis, including representation learning, denoising, feature selection, and integrative multi-omics analysis;
- Participate in the definition and implementation of community data standards, formats, and APIs for single-cell proteomics under the umbrella of the Proteomics Standards Initiative;
- Work closely with experimental scientists and software engineers to integrate algorithms into production-ready workflows, databases, and cloud-based infrastructures;
- Support the transformation of proteomics resources into AI-ready datasets, enabling downstream use by advanced analytics and large-scale machine learning systems;
- Collaborate with other EMBL-EBI teams and international partners to integrate single-cell proteomics data with resources such as Expression Atlas and UniProt.
You have:
You should have a strong background in computational proteomics, bioinformatics, computer science, or a related quantitative discipline, typically with 3+ years of relevant research or industry experience. You will thrive in a collaborative, interdisciplinary environment and communicate effectively with both computational and experimental scientists.
Essential skills and experience:
- Strong experience in mass spectrometry-based proteomics, ideally including single-cell or low-input proteomics;
- Proven experience developing algorithms for proteomics data analysis;
- Excellent programming skills in Python and C++, with a focus on performance, maintainability, and reproducibility;
- Experience with machine learning and/or deep learning frameworks (e.g. PyTorch, TensorFlow, JAX, scikit-learn);
- Solid understanding of statistical methods for high-dimensional and sparse biological data;
- Experience working with large-scale datasets, including efficient I/O and memory-aware data processing;
- Proficiency with version control systems such as Git;
- Ability to work independently, manage multiple priorities, and communicate results clearly in an international environment.
You may also have:
- Experience with proteomics data formats and standards (e.g. mzML, mzIdentML, mzTab, Parquet-based formats);
- Familiarity with high-performance computing (HPC), GPU acceleration, or parallel computing;
- Experience with workflow systems (e.g. Nextflow) and containerized environments (Docker, Singularity);
- Knowledge of cloud-based infrastructures and scalable data processing frameworks;
- Interest in FAIR data principles, open science, and community-driven standardization;
- Experience integrating proteomics data with other omics modalities (e.g. transcriptomics, metabolomics).