
Staff Site Reliability Engineer

Moloco is a machine learning company empowering organizations to grow and unlock the full value of their unique first-party data, elevating the traditional path to performance advertising.
Site Reliability
Staff Software Engineer
In-Person
501 - 1,000 Employees
12+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Staff Site Reliability Engineer

Moloco, a machine learning company at the forefront of performance advertising, is seeking a Staff Site Reliability Engineer to join its team in Seoul, Korea. This role is crucial in building and evolving a highly scalable, cost-efficient, low-latency platform that uses state-of-the-art cloud technologies to sustain the company's growing mobile ads ecosystem.

The platform handles more than 7 million requests per second, about 80 times the volume of Google search traffic. This volume presents unique and challenging problems in scalability and reliability. As a member of the SRE team, you'll have the opportunity to solve these challenges alongside the engineers who built the platform, using cutting-edge technologies.

Key responsibilities include:

  • Designing and implementing strategies to achieve higher levels of reliability and availability
  • Leading multi-team infrastructure projects
  • Designing & implementing scalable & reliable service infrastructure
  • Creating service and infrastructure release pipelines
  • Developing tools to support efficient operations and improve business scalability
  • Leading and assisting in troubleshooting production issues
  • Proactively identifying opportunities to improve reliability, finding risks, and creating solutions to mitigate them
  • Establishing and evangelizing the highest level of operational excellence

The ideal candidate will have:

  • 12 years of SRE/DevOps development experience in IT-related areas
  • A Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
  • Strong knowledge of SRE/DevOps best practices
  • Experience with tools such as Kubernetes, Terraform, Jenkins, GitHub Actions, etc.
  • Software development skills in at least one programming language
  • Outstanding problem-solving skills
  • Experience working with real-time systems and highly scalable software architectures
  • Experience with Cloud service providers such as GCP, AWS, or Azure

Moloco offers a comprehensive benefits package and promotes a culture of inclusion and belonging. They value diversity and see it as crucial to their success. The company has received several recognitions, including ranking in the top 10% of Inc. 5000 fastest-growing private companies for 2023 and receiving Google's Cloud DevOps Dreamers Award in 2023.

Join Moloco to tackle challenging real-world problems, make a positive impact on millions of mobile users worldwide, and grow alongside top-notch colleagues in an exciting period of growth.

Last updated 9 months ago

Responsibilities For Staff Site Reliability Engineer

  • Design and implement strategies to achieve higher levels of reliability and availability
  • Lead multi-team infrastructure projects
  • Design & implement scalable & reliable service infrastructure
  • Design & implement service and infrastructure release pipelines
  • Design & implement tools to support efficient operations and improve business scalability
  • Lead and assist in troubleshooting production issues
  • Proactively identify opportunities to improve reliability, continuously find risks, and create solutions to mitigate them
  • Establish and evangelize the highest level of operational excellence by supporting internal reliability solution onboarding

Requirements For Staff Site Reliability Engineer

  • 12 years of SRE/DevOps development experience in IT-related areas
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or related major
  • Knowledge of SRE/DevOps best practices
  • Experience with tools such as Kubernetes, Terraform, Jenkins, GitHub Actions, etc.
  • Software development skills in at least one programming language
  • Outstanding problem-solving skills
  • Experience working with real-time systems and highly scalable software architectures
  • Experience with Cloud service providers such as GCP, AWS, or Azure

Benefits For Staff Site Reliability Engineer

Medical Insurance
Dental Insurance
Vision Insurance
401k
Parental Leave
Education Budget
  • Comprehensive benefits package
  • Innovative benefits that empower employees to take care of themselves and their families
  • Conditions for employees to do the best work of their career
