Job Purpose 

Congenica is applying machine learning to genomic data to advance the capability of its clinical decision software, Sapientia™. A new machine learning team is being formed to deliver automation that will have a huge impact on patients’ lives.

Genomics Data Curator is a new role within the machine learning team. You will be one of the earliest members of this new team, using your skills to improve genomic healthcare for people worldwide.

You will be accountable for delivering clean, well-prepared data sets to our statisticians and engineers so that they can be productive immediately. You will be the primary person for all data preparation and selection requirements for the machine learning team, setting the highest possible standards.

Main Responsibilities

  • Prepare, integrate and manipulate large data sets in order to accelerate the activities of the machine learning team
  • Analyse data to inform the team of the viability and suitability of data for certain challenges
  • Research and recommend new data sets that can help us solve specific challenges
  • Build exceptional knowledge about different data sets and their nuances
  • Work with the Head of A.I. & Data Strategy to inform and realise the overall strategy
  • Build trusted relations with colleagues, partners and stakeholders

Employee Profile

Essential – Attributes candidate must have on entering the role

Knowledge, Skills & Abilities

  • Skilled in preparing, integrating and manipulating high-quality, very large (TBs) genomic data sets for further analysis.
  • Deep understanding of the field of genomics and the data gathered.
  • Skilled in researching and reviewing suitable data sources to solve a challenge.
  • Experience with PostgreSQL or other relational databases.
  • Knowledge of statistics.
  • Exceptional attention to detail.

Related Experience

  • Skilled in preparing, integrating and manipulating very large (TBs) data sets (non-genomic)

Behavioural Qualities

  • Energised by being part of a new team with a specific commercial remit to achieve.
  • Able to work under time pressure, clearly prioritising work to achieve goals.
  • Self-motivated, results-driven problem-solver.
  • Friendly and approachable; builds positive personal and organisational relationships.
  • Excellent written and verbal communication.
  • Enthusiastic, hardworking, well organised and able to prioritise.
  • Committed to continuous learning, and to applying and sharing new skills and knowledge.
  • Professional in appearance and able to produce work of a high standard.

Desirable – Attributes already held or to be developed to perform the role


  • Master’s degree in genomics or similar scientific field

Knowledge, Skills & Abilities

  • Knowledge of the machine learning and statistical tools that will use the prepared data (e.g. TensorFlow, R).
  • Working with AWS.
  • Programming in Python.
  • Experience of data visualisation.
  • Knowledge of data privacy and security requirements.

Related Experience

  • Start-up or scale-up experience.

To apply for this position, please send your CV and a covering letter to