Site Reliability Engineer

A late-stage logtech startup disrupting Southeast Asia's express logistics industry, delivering to 100 million customers with predictability, flexibility and convenience.
Ho Chi Minh City, Vietnam
Site Reliability
Senior Software Engineer
Hybrid
1,000 - 5,000 Employees
Logistics

Description For Site Reliability Engineer

Ninja Van, a leading Southeast Asian logistics technology company, is seeking a Site Reliability Engineer to join its team. Founded in 2014, the company has grown to become a major player in the region's express logistics sector, processing 250 million API requests and delivering over 1.5 million parcels daily across six Southeast Asian markets.

The role combines technical expertise with operational excellence, focusing on maintaining and improving critical infrastructure services. As an SRE, you'll be responsible for managing services written in Go and Java, implementing robust monitoring systems, and ensuring high availability of critical services. You'll work with infrastructure-as-code tools, automate processes, and participate in on-call rotations to maintain service reliability.

The ideal candidate should have strong experience with Go and Java services, cloud platforms, and monitoring tools. Knowledge of Linux/Unix systems, networking, and CI/CD pipelines is essential. Additional experience with Terraform, Kubernetes, and scripting languages would be advantageous.

This position offers the opportunity to work with a rapidly growing company that has raised over US$500 million in funding and serves 100 million customers. You'll be part of a lean team where your contributions will have a direct impact on the company's success. The role combines technical challenges with the satisfaction of improving infrastructure that serves 600,000 active shippers across all e-commerce segments.

Working at Ninja Van means joining a company that values initiative, team-first mentality, and personal responsibility. The hybrid work environment allows for flexibility while maintaining collaborative relationships with development and operations teams. If you're passionate about creating reliable, scalable infrastructure and want to be part of Southeast Asia's logistics revolution, this role offers an excellent opportunity to make a significant impact.

Last updated 13 days ago

Responsibilities For Site Reliability Engineer

  • Manage and maintain horizontal services primarily written in Go and Java
  • Participate in on-call rotations to ensure high availability and reliability of critical services
  • Implement strategies for handling outages and minimizing downtime
  • Use and maintain infrastructure-as-code tools to provision and manage resources
  • Automate manual processes to improve infrastructure efficiency
  • Develop and maintain monitoring, metrics, and alerting systems
  • Implement and fine-tune dashboards for system performance
  • Collaborate with development teams on metrics and monitoring
  • Document processes, systems, and solutions
  • Participate in regular reviews and audits of systems

Requirements For Site Reliability Engineer

Skills: Go, Java, Kubernetes, Linux

  • Experience with services written in Go and Java
  • Familiarity with monitoring and observability tools
  • Experience with building and maintaining CI/CD pipelines
  • Strong knowledge of Linux/Unix systems and networking
  • Experience with cloud platforms (AWS, GCP, Azure)
  • Strong problem-solving and cross-functional communication skills
  • Experience with infrastructure reliability and system performance
