Production Systems Engineer

Meta builds technologies that help people connect, find communities, and grow businesses.
DevOps
Senior Software Engineer
In-Person
5,000+ Employees
Enterprise SaaS

Description For Production Systems Engineer

Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team in Dublin. Our servers and data centers are the foundation upon which our rapidly scaling infrastructure operates efficiently to deliver our innovative services. The RTP team is responsible for the end-to-end Hardware Lifecycle of all Meta servers, from exploration and development to production health. RTP Engineers work closely with Production Engineering teams, Enterprise Networking, Hardware Designers, Networking Teams, Manufacturers, Vendors, Datacenter Operation teams and New Product Introduction teams to ensure the smooth operation of systems across the planet.

We encounter problems at every scale, from the very smallest (errors occurring at the microscopic scale, within single registers of a CPU) to the very largest: deploying solutions to our entire millions-strong fleet. We look for people with curiosity and drive, who want to tackle the hardest problems in the domain.

We typically hire engineers from backgrounds such as Site Reliability Engineer (SRE), Software Engineer, Systems Engineer, Systems Development Engineer, DevOps Engineer, Systems Administrator, or similar. You will have a demonstrated ability to drive projects to successful business outcomes. Your previous experience will always be less important than demonstrated problem-solving abilities and attitude.

Responsibilities:

  • Build and develop tooling solutions to automate business-critical processes in service of managing the health of the Meta production fleet
  • Troubleshoot, diagnose and root-cause system failures, working with key partners to identify and deliver solutions
  • Proactively identify opportunities to fix or enhance tooling, hardware and processes
  • Build subject matter expertise in one or more of the specialist areas covered by the RTP team in Dublin: Firmware Deployment, Edge/CDN hardware, or Silicon Sustaining

Minimum Qualifications:

  • Bachelor's degree in Computer Science, related technical discipline, or equivalent work experience
  • Experience coding in a higher-level language (Python, PHP, Java, Go, Rust, C++)
  • Experience building, maintaining and debugging production services or platforms, usually (but not necessarily) in a Linux/Unix environment
  • Knowledge of server architecture and components across Compute/Storage/AI Systems/Networking
  • Scientific approach to troubleshooting, root-cause analysis and investigation
  • Good communication skills, able to collaborate easily with others

Preferred Qualifications:

  • An interest in data center server hardware
  • Experience in programming and/or tooling development

Join Meta and help shape the future of social technology beyond the constraints of screens, distance, and even the rules of physics.

Last updated 3 months ago
