Meta is seeking an experienced Production Systems Engineer to join its Release to Production (RTP) team, focusing on AI/ML initiatives. The role is central to Meta's AI infrastructure and involves working with the hardware and software systems that power the company's AI capabilities.
The position involves managing the end-to-end hardware lifecycle of Meta's servers, including prototyping experimental hardware, debugging pre-production systems, and implementing automated system monitoring. It requires expertise in network technologies, including NICs, switches, optics, and various protocols, with a focus on supporting Meta's AI systems at scale.
As a Production Systems Engineer, you'll work closely with cross-functional teams, including hardware designers, networking teams, and system manufacturers. You'll be responsible for driving the integration of new AI platforms, creating diagnostic tools, and developing solutions for hardware health issues. The role combines hands-on technical work with strategic system planning and optimization.
The ideal candidate has strong experience with Linux systems, network technologies, and troubleshooting complex systems. Knowledge of AI workload requirements and experience with large-scale deployments are highly valuable. The position offers competitive compensation ($132,000-$191,000/year) plus bonus and equity, along with comprehensive benefits.
This is an excellent opportunity for someone passionate about infrastructure and AI systems to help build and maintain the systems that power Meta's AI initiatives. The role offers exposure to cutting-edge technology and the chance to work at massive scale, with a direct impact on Meta's AI infrastructure.