Anyscale, backed by Andreessen Horowitz, NEA, and Addition with more than $250 million in funding, is revolutionizing distributed computing through Ray, its open-source project. As a Deep Learning Performance Engineer, you'll be at the forefront of optimizing cutting-edge ML models, working with companies like OpenAI, Uber, and Spotify. This role is crucial for maintaining Anyscale's market-leading performance in AI infrastructure.
The position demands expertise in GPU/CUDA programming, deep learning frameworks, and system-level optimization. You'll collaborate with product and research teams, implementing state-of-the-art practices in LLM engines. The role offers opportunities to work with advanced technologies like vLLM and TensorRT-LLM.
The company provides comprehensive benefits, including competitive equity, healthcare coverage, and various stipends for wellness and education. The role is based in San Francisco, where you'll join a team dedicated to democratizing distributed computing and making it accessible to developers of all skill levels. This is an excellent opportunity for those passionate about performance engineering in AI and distributed systems.
Additional valued skills include knowledge of ML systems, experience with deep learning model training, and contributions to frameworks like PyTorch or TensorFlow. Experience with Ray or deep learning compilers would be advantageous. The role offers competitive compensation ranging from $170,112 to $237,000, reflecting Anyscale's data-driven and transparent approach to compensation.