Machine Learning Infrastructure Engineer - #4694
Our mission is to detect cancer early, when it can be cured. We are working to change the trajectory of cancer mortality and bring stakeholders together to adopt innovative, safe, and effective technologies that can transform cancer care. We are a healthcare company pioneering new technologies to advance early cancer detection. We have built a multi-disciplinary organization of scientists, engineers, and physicians, and we are using the power of next-generation sequencing (NGS), population-scale clinical studies, and state-of-the-art computer science and data science to overcome one of medicine’s greatest challenges. GRAIL is headquartered in the Bay Area of California, with locations in Washington, D.C., North Carolina, and the United Kingdom. It is supported by leading global investors and pharmaceutical, technology, and healthcare companies. For more information, please visit grail.com.
The expected, full-time, annual base pay scale for this position is $190k-$255k.
Responsibilities
Partner with research teams to identify computational pain points or limitations in performing computational experiments and analyses.
Design, build, and evolve software which usefully extends research capabilities, including infrastructure for distributed ML training and evaluation on large controlled genomic datasets.
Develop tools and processes that ensure GxP-compliant testing, patchability, and inference reproducibility for classifiers that are promoted to production use.
Develop and maintain the research team’s software environment, including tools to assess the health, performance, and cost of the system.
These summarize the role’s primary responsibilities and are not an exhaustive list. They may change at the company’s discretion.
Required Qualifications
5+ years of experience developing software supporting machine learning, scientific computing, or large-scale data processing systems
Strong programming skills in Python and a systems-level language such as Golang (preferred), Java, C#, or C++
Experience working with modern machine learning frameworks such as PyTorch or TensorFlow
Experience with distributed computing frameworks (Spark, Ray, Flink, Beam, etc.)
A commitment to high-quality, professionally engineered software
Strong communication skills with the ability to help developers from a wide range of software development backgrounds
BS in Computer Science, Engineering, Bioinformatics, or a related field, or equivalent practical experience
Preferred Qualifications
Good understanding of containerization and orchestration with Docker and cloud technologies
Experience with scientific computing tools: NumPy, Jupyter, R Notebook, etc.
Experience with techniques used in modern AI (including LLM) training
Experience with whole genome sequencing, whole exome sequencing, bisulfite sequencing, and/or whole transcriptome sequencing data
Practical experience setting up continuous integration systems, along with expertise in at least one build tool (e.g. Bazel (preferred), Buck, Maven, Gradle)
Familiarity with AWS services, best practices, and security
Advanced degree (MS or PhD) in Computer Science, Engineering, Bioinformatics, or a related discipline