Site Reliability Engineer

Pure Storage

Pure Storage redefines the storage experience and empowers innovators by simplifying how people consume and interact with data.

Bengaluru, Karnataka, India

Site Reliability

Mid-Level Software Engineer

In-Person

5+ years of experience

Enterprise SaaS

Description For Site Reliability Engineer

Pure Storage is seeking a Site Reliability Engineer to join their Infrastructure Shared Service (ISS) team in Bengaluru, India. As an SRE, you'll work on improving the reliability and performance of Pure Storage's critical infrastructure applications. You'll be responsible for setting and owning SLO goals for uptime and latency, as well as helping colleagues leverage available features and workflows. The role involves working with backend web servers, load balancers, and database servers to ensure they run smoothly.

Key responsibilities include:

Engaging in the entire lifecycle of services from design to operation
Designing, operating, and troubleshooting enterprise systems
Establishing sustainable incident response and blameless postmortems
Supporting services pre-launch through system design and capacity planning
Scaling systems through automation and scripting
Collaborating with development teams and stakeholders across time zones
Ensuring hardware design meets business and technical requirements
Maintaining documentation on system configurations and procedures
Performing day-to-day server, storage, and network administration
Deploying infrastructure manually and via automation platforms
Troubleshooting and resolving hardware, software, and network issues

The ideal candidate should have:

5+ years of experience as an SRE, DevOps Engineer, or Infrastructure Engineer
Strong programming skills in Python or other languages
Experience with distributed systems, Linux environments, and VMware
Familiarity with observability platforms such as Elastic or Datadog
Knowledge of Infrastructure as Code tools (Ansible, Terraform)
Experience with containerization and cloud environments (AWS & Azure)

This role offers the opportunity to work on cutting-edge technology in a fast-paced environment, contributing to the success of a company that's revolutionizing data storage and management. Join Pure Storage to be part of building the future of data infrastructure.

Last updated 8 months ago

Responsibilities For Site Reliability Engineer

Engage in and improve the whole lifecycle of services, from inception and design through deployment and operation
Design, operate, maintain, and troubleshoot enterprise systems such as databases, message queues, APIs, and distributed applications
Establish and practice sustainable incident response and blameless postmortems to prevent problem recurrence
Support services before they go live through activities such as system design, developing software platforms and frameworks, capacity planning, and launch reviews
Scale systems sustainably through mechanisms like scripting and automation
Work closely with development teams, infrastructure teams, and business stakeholders across multiple time zones
Ensure that hardware design meets business and technical requirements, including performance, scalability, and reliability
Create and maintain detailed documentation on system configurations, procedures, and operational policies
Perform day-to-day server administration (physical and virtual), storage administration, network configuration, and application support
Deploy infrastructure manually and via configuration management and automation platforms
Troubleshoot hardware, software, and network-related issues, resolve them quickly, and perform root cause analysis

Requirements For Site Reliability Engineer

Python

Linux

Kubernetes

Experience programming in Python or other languages
Experience in designing, analysing, and troubleshooting large-scale distributed systems
Able to work in a 24x7 on-call rotation (approx. 1 week every 2 months)
Systematic problem-solving approach, strong communication skills, and a sense of ownership and drive
Working experience with observability platforms such as Elastic or Datadog
Experience deploying and troubleshooting Linux systems (Red Hat, CentOS, Ubuntu) as well as VMware environments (ESXi, NSX, vSAN)
Experience working directly with end users to determine deployment and configuration requirements
Ability to lift 15+ kilograms when working with storage equipment

Benefits For Site Reliability Engineer

Flexible time off
Wellness resources
Company-sponsored team events
