Model Optimization Engineer

World Labs

World Labs develops Large World Models, advancing AI beyond language and 2D visuals into complex 3D environments, both virtual and real.

San Francisco, CA, USA

Machine Learning

Senior Software Engineer

In-Person

3+ years of experience

Description For Model Optimization Engineer

World Labs is at the forefront of artificial intelligence innovation, developing Large World Models that extend AI capabilities beyond traditional language and 2D visual processing into complex 3D environments. As a Model Optimization Engineer, you'll play a crucial role in bridging the gap between cutting-edge research and production-ready systems. The position demands expertise in PyTorch, CUDA programming, and deep learning model optimization techniques.

You'll be responsible for transforming research code into highly optimized, low-latency inference solutions, working with state-of-the-art models and technologies. The role requires both technical excellence and collaborative skills, as you'll work closely with research teams to implement optimizations while maintaining model accuracy.

The ideal candidate brings 3+ years of experience in deep learning model optimization, with expert-level knowledge of PyTorch and CUDA programming. You should be comfortable with model quantization, inference frameworks, and multi-GPU deployment strategies. Additional experience with custom CUDA kernel development and high-performance serving frameworks like Triton or vLLM is highly valued.

World Labs offers an environment where fearless innovation, resilience, and collaboration are celebrated. You'll be part of a global team working on technology that will fundamentally change how machines perceive and interact with the world. This is an opportunity to make a significant impact in the field of AI while working with some of the brightest minds in the industry.

If you're passionate about pushing the boundaries of AI technology and want to contribute to developing spatially intelligent AI systems that will shape the future, this role at World Labs could be your next career-defining move.

Last updated 18 days ago

Responsibilities For Model Optimization Engineer

Optimize neural network models for inference through quantization, pruning, and architectural modifications while maintaining accuracy
Profile and benchmark model performance to identify computational bottlenecks
Implement optimizations using torch.compile, custom CUDA kernels, and specialized inference frameworks
Deploy multi-GPU inference solutions with efficient model parallelism and serving architectures
Collaborate with research teams to ensure optimization techniques integrate smoothly with model development workflows
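
To give a flavor of the profiling and benchmarking work listed above, here is a minimal latency-measurement sketch. It is an illustration only, not World Labs code: it uses just the Python standard library, and `benchmark` and `fake_model` are hypothetical names standing in for a real harness and a real model forward pass.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time repeated calls to fn() and return latency statistics in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are discarded (caches, lazy initialization)

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)

    samples.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last sample.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "mean_ms": statistics.fmean(samples),
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }


def fake_model() -> int:
    # Hypothetical stand-in for a model forward pass.
    return sum(i * i for i in range(10_000))


stats = benchmark(fake_model)
print(stats)
```

When the workload runs on a GPU, kernel launches are asynchronous, so a real harness would also need to synchronize the device (in PyTorch, `torch.cuda.synchronize()`) before reading the clock. Reporting tail latency such as p95 alongside the mean tends to matter for low-latency serving, since a good average can hide slow outliers.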

Requirements For Model Optimization Engineer

Python

3+ years optimizing deep learning models for production inference
Expert-level PyTorch and CUDA programming experience
Hands-on experience with model quantization (INT8/FP16) and inference frameworks (TensorRT, ONNX Runtime)
Proficiency in GPU profiling tools and performance analysis
Experience with multi-GPU inference and model serving at scale
Strong understanding of transformer architectures and modern ML model optimization techniques
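
For readers less familiar with the INT8 quantization mentioned in the requirements, the idea can be sketched in a few lines. This is a simplified illustration, assuming symmetric per-tensor quantization on a plain Python list; the function names are hypothetical, and production frameworks such as TensorRT or ONNX Runtime add calibration, per-channel scales, and fused kernels on top of this.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One shared scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integers and the shared scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # hypothetical weight values
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Rounding to the nearest integer bounds the per-value error by half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The round trip loses at most half a quantization step per value, which is why small weights (like 0.003 here) can collapse to zero. Checking that this error does not degrade model accuracy is the "while maintaining accuracy" part of the role.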