Senior Site Reliability Developer

Oracle is a world leader in cloud solutions, using tomorrow's technology to tackle today's problems. The company has partnered with industry leaders in almost every sector and continues to thrive after 40+ years of change by operating with integrity.
$79,000 - $158,200
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
6+ years of experience
Cloud

Description For Senior Site Reliability Developer

Oracle's Cloud Infrastructure (OCI) team is seeking Site Reliability Engineers (SREs) to build highly distributed systems, platform services, and tools for a massive-scale, multi-tenant cloud environment. This role sits within the OCI USGOVOPS Tier3 team, which engineers infrastructure solutions for scale and performance while ensuring high availability for customers. SREs split their time between operational work and software engineering tasks, working closely with service owners on all aspects of service operations. The ideal candidate is a proficient programmer with broad knowledge of areas such as networking, internet protocols, and Linux systems. This position offers the opportunity to work on cutting-edge cloud technology and contribute directly to customer success.

Last updated 18 days ago

Responsibilities For Senior Site Reliability Developer

  • Work with the SRE team on shared full-stack ownership of services and technology areas
  • Understand end-to-end configuration, technical dependencies, and behavioral characteristics of production services
  • Design and deliver a mission-critical stack with a focus on security, resiliency, scale, and performance
  • Partner with development teams to improve service architecture
  • Communicate technical characteristics of services and guide teams in adding capabilities to Oracle Cloud
  • Understand and communicate scale, capacity, security, and performance attributes of services
  • Act as the ultimate escalation point for complex or critical issues
  • Troubleshoot issues and define mitigations using a deep understanding of service topology
  • Explain the effect of product architecture decisions on distributed systems

Requirements For Senior Site Reliability Developer

  • BS degree in Computer Science or related technical field, or equivalent practical experience
  • Proficiency in writing services/task automation in Python, Bash, Ruby, Perl, JavaScript, or Java
  • Strong communication skills
  • Familiarity with core protocols (DNS, DHCP, HTTP, TCP)
  • Deep knowledge of Linux internals and host-based networking
  • Expert Linux/Unix performance and stability troubleshooting skills
  • Experience with monitoring solutions for large-scale environments
  • Systematic problem-solving approach and sense of ownership
  • Experience working with mission-critical, tier-one services and the associated pager duty
  • 5+ years managing large-scale, highly distributed service infrastructures
  • 2+ years managing host virtualization technologies
  • US Citizenship required

Benefits For Senior Site Reliability Developer

  • Equity
  • Medical, dental, and vision insurance
  • Short-term and long-term disability
  • Life insurance and AD&D
  • Health care and dependent care Flexible Spending Accounts
  • Pre-tax commuter and parking benefits
  • 401(k) Savings and Investment Plan with company match
  • Flexible vacation policy
  • 11 paid holidays
  • Paid sick leave
  • Paid parental leave
  • Adoption assistance
  • Employee Stock Purchase Plan
  • Financial planning and group legal

