source · wttj·req · jb_66660c8835·listed 2d ago
Staff Software Engineer (Simulation ML Infrastructure)
Waymo·London, United Kingdom·Hybrid·Full-time
Sourced listing · wttj · Salary disclosed
compensation · disclosed
£150,000 — £162,000
Summary
the pitch
Join Waymo, a leader in autonomous driving technology. As a Staff Software Engineer, you will lead the development of advanced AI/ML infrastructure for multi-billion parameter foundation models. You will collaborate closely with core teams, provide technical leadership, design and scale large distributed systems, and mentor junior engineers. Enjoy a comprehensive benefits package, including medical, dental, and vision insurance, competitive compensation, and a hybrid work model.
Role
posted by company
- BS in Computer Science, Robotics, similar technical field of study, or equivalent practical experience
- 5+ years of professional software engineering experience, with at least 3 years in machine learning infrastructure such as developing, scaling, training, deploying, and optimizing large-scale machine learning systems from data to model
- MS in Computer Science, Robotics, similar technical field of study, or equivalent practical experience
- 10+ years of professional software engineering experience, with at least 5 years in machine learning infrastructure such as developing, designing, scaling, training, deploying, and optimizing large-scale machine learning systems from data to model
- Solid experience in the development and optimization of machine learning infrastructure tools like DeepSpeed, PyTorch, TensorFlow, or similar frameworks
- Strong expertise in distributed training techniques, including gradient sharding and optimization strategies for scaling large models across ML accelerators, and in using profiling tools to uncover performance bottlenecks
- Deep understanding of state-of-the-art machine learning models such as auto-regressive transformers, and familiarity with custom kernels for compute efficiency across diverse hardware
- Practical familiarity with autonomous driving, simulation, and ML accelerators is a huge plus
Key responsibilities
- Lead the development of advanced AI/ML infrastructure for multi-billion parameter foundation models in ML accelerator-friendly simulations.
- Design and scale large distributed systems covering the ML lifecycle, supporting planet-scale dataset generation and model training.
- Provide deep technical leadership on large-scale ML model architectures, especially for autonomous vehicle models, and mentor junior engineers.