Google DeepMind, a leading force in artificial intelligence research, is seeking a Senior Software Engineer to join their Pre-Training team. This role represents a unique opportunity to shape the future of large language models (LLMs) by optimizing their training and inference capabilities at an unprecedented scale.
The position sits at the intersection of cutting-edge ML model development and hardware optimization, with a specific focus on TPU architecture. You'll co-design models and implement crucial components across the stack, from model architecture to custom kernels, to deliver frontier models with maximum efficiency.
As part of this role, you'll work with state-of-the-art LLMs, optimizing their performance on Google's advanced hardware accelerators throughout the entire lifecycle, from research to deployment. Your expertise will be crucial in developing custom kernels, collaborating with compiler and framework teams, and ensuring efficient training at industry-leading scales.
The ideal candidate brings deep expertise in distributed training of LLMs, particularly at the 1e25 FLOPs scale on modern GPU/TPU clusters. Your experience with ML frameworks like JAX and PyTorch, combined with low-level programming expertise in CUDA and OpenCL, will be essential for success in this role.
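To give a flavor of the kind of work involved, the sketch below shows how JAX's sharding API distributes a computation across accelerators; this is a minimal illustrative example, not taken from the role itself, and the mesh axis name and shapes are hypothetical.

```python
# Minimal sketch of data-parallel sharding in JAX (hypothetical example;
# shapes, axis names, and the matmul workload are illustrative only).
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P
from jax.experimental import mesh_utils

# Build a 1D mesh over whatever devices are available (TPU cores in
# production; falls back to a single CPU device here).
devices = mesh_utils.create_device_mesh((jax.device_count(),))
mesh = Mesh(devices, axis_names=("data",))

# Shard the batch dimension of the activations across the "data" axis;
# replicate the weights on every device.
x = jax.device_put(jnp.ones((8, 128)), NamedSharding(mesh, P("data", None)))
w = jnp.ones((128, 64))

@jax.jit
def forward(x, w):
    # XLA inserts any needed collectives based on the input shardings.
    return x @ w

y = forward(x, w)
print(y.shape)  # (8, 64)
```

At scale, the same mechanism extends to multi-dimensional meshes that combine data, tensor, and pipeline parallelism across thousands of chips.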
At Google DeepMind, you'll be part of a diverse team of scientists, engineers, and ML experts working together to advance artificial intelligence for widespread public benefit. The company offers competitive compensation, including a base salary range of $235,000 - $350,000, plus bonus, equity, and comprehensive benefits.
This is an opportunity to work on some of the most challenging and impactful problems in AI, while contributing to technologies that could rank among humanity's most useful inventions. Join a team that prioritizes safety and ethics while pushing the boundaries of what's possible in artificial intelligence.