Staff Site Reliability Engineer

Moloco is a machine learning company empowering organizations to grow and unlock the full value of their unique first-party data, elevating the traditional path to performance advertising.
Site Reliability
Staff Software Engineer
In-Person
501 - 1,000 Employees
12+ years of experience
AI · Enterprise SaaS

Description For Staff Site Reliability Engineer

Moloco, a machine learning company at the forefront of performance advertising, is seeking a Staff Site Reliability Engineer to join their team in Seoul, Korea. This role is crucial in building and evolving a highly scalable, cost-efficient, and low-latency platform that leverages state-of-the-art Cloud technologies to sustain their growing mobile ads ecosystem.

The platform handles more than 7 million requests per second, about 80 times the volume of Google search traffic. This volume presents unique scalability and reliability challenges. As a member of the SRE team, you'll solve these challenges alongside the engineers who built the platform, using cutting-edge technologies.

Key responsibilities include:

  • Designing and implementing strategies to achieve higher levels of reliability and availability
  • Leading multi-team infrastructure projects
  • Designing & implementing scalable & reliable service infrastructure
  • Creating service and infrastructure release pipelines
  • Developing tools to support efficient operations and improve business scalability
  • Leading and assisting in troubleshooting production issues
  • Proactively identifying opportunities to improve reliability, finding risks, and creating solutions to mitigate them
  • Establishing and evangelizing the highest level of operational excellence

The ideal candidate will have:

  • 12+ years of SRE/DevOps development experience in IT-related areas
  • A Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
  • Strong knowledge of SRE/DevOps best practices
  • Experience with tools such as Kubernetes, Terraform, Jenkins, GitHub Actions, etc.
  • Software development skills in at least one programming language
  • Outstanding problem-solving skills
  • Experience working with real-time systems and highly scalable software architectures
  • Experience with Cloud service providers such as GCP, AWS, or Azure

Moloco offers a comprehensive benefits package and promotes a culture of inclusion and belonging. They value diversity and see it as crucial to their success. The company has received several recognitions, including ranking in the top 10% of Inc. 5000 fastest-growing private companies for 2023 and receiving Google's Cloud DevOps Dreamers Award in 2023.

Join Moloco to tackle challenging real-world problems, make a positive impact on millions of mobile users worldwide, and grow alongside top-notch colleagues in an exciting period of growth.


Responsibilities For Staff Site Reliability Engineer

  • Design and implement strategies to achieve higher levels of reliability and availability
  • Lead multi-team infrastructure projects
  • Design & implement scalable & reliable service infrastructure
  • Design & implement service and infrastructure release pipelines
  • Design & implement tools to support efficient operations and improve business scalability
  • Lead and assist in troubleshooting production issues
  • Proactively identify opportunities to improve reliability, continuously find risks, and create solutions to mitigate them
  • Establish and evangelize the highest level of operational excellence by supporting internal reliability solution onboarding

Requirements For Staff Site Reliability Engineer

  • 12+ years of SRE/DevOps development experience in IT-related areas
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or related major
  • Knowledge of SRE/DevOps best practices
  • Experience with tools such as Kubernetes, Terraform, Jenkins, GitHub Actions, etc.
  • Software development skills in at least one programming language
  • Outstanding problem-solving skills
  • Experience working with real-time systems and highly scalable software architectures
  • Experience with Cloud service providers such as GCP, AWS, or Azure

Benefits For Staff Site Reliability Engineer

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • 401(k)
  • Parental leave
  • Education budget
  • Comprehensive benefits package
  • Innovative benefits that empower employees to take care of themselves and their families
  • Conditions for employees to do the best work of their career


Jobs Related To Moloco Staff Site Reliability Engineer

Site Reliability Engineer (L5) - Security Engineering

Netflix seeks a Site Reliability Engineer (L5) for Security Engineering to enhance critical infrastructure reliability and support business growth in LIVE streaming, Gaming, and Ads.

Staff Software Engineer, Reliability Engineering

Staff Software Engineer for Site Reliability Engineering at Airbnb, developing tools and systems for service reliability and incident management.

Engineering Manager, Reliability Engineering

Airbnb seeks an Engineering Manager for Site Reliability to drive long-term strategy and ensure infrastructure performance.

Site Reliability Developer 4

Site Reliability Developer 4 at Oracle in Bengaluru, India. Design and deliver a mission-critical stack with a focus on security, resiliency, scale, and performance.