Site Reliability Developer 4

As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's problems. True innovation starts with diverse perspectives and various abilities and backgrounds. When everyone's voice is heard, we're inspired to go beyond what's been done before. It's why we're committed to expanding our inclusive workforce that promotes diverse insights and perspectives. We've partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity. Oracle careers open the door to global opportunities where work-life balance flourishes. We offer a highly competitive suite of employee benefits designed on the principles of parity and consistency. We put our people first with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
7+ years of experience
Enterprise SaaS · Cloud

Description For Site Reliability Developer 4

Tackle sophisticated problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Craft and develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

You will work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. You will be responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance, and you will have authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.

Additionally, you will contribute to the OHAI roadmap, for example by helping drive the future state of our products as we begin to migrate them to OCI platforms.

Last updated a day ago

Responsibilities For Site Reliability Developer 4

  • Drive projects to improve the availability, scalability, security, latency, and efficiency of our cloud service.
  • Drive and actively participate in the resolution of complex technical issues spanning multiple Cloud services and work towards ensuring service availability goals remain intact by developing solutions to complex problems and incidents.
  • Act as a trusted technical advisor to customers and solve complex Infrastructure and DevOps challenges.
  • Create and deliver best-practice recommendations, sample code, and technical documents.
  • Contribute to making our infrastructure simple, secure, reliable, and easy to operate.
  • Define and develop monitoring infrastructure criteria (SLIs, SLOs).
  • Solve complex and difficult problems and build automation to prevent problem recurrence.
  • Participate in incident management, operational/security maintenance, software performance analysis, and system tuning.

Requirements For Site Reliability Developer 4

Python
Linux
Kubernetes
  • 7+ years of professional experience as a Site Reliability Engineer or equivalent experience.
  • 3+ years of Linux experience.
  • Bachelor's or master's degree (Information Technology / Computer System Engineering).
  • 3+ years of experience and working knowledge of Python, Perl, and/or shell scripting.
  • Experience in cloud migration and build projects.
  • Experience managing production systems running on UNIX flavors (RHEL, OEL).
  • Knowledge of cloud Infrastructure as Code (IaC) tools such as Shepherd and Terraform.
  • Knowledge of CI/CD platforms and components such as OKE, Jenkins, and Splat.
  • Knowledge of source control systems.

Site Reliability Developer 4

Site Reliability Developer 4 at Oracle in Bengaluru, India. Design and deliver mission-critical stack with focus on security, resiliency, scale, and performance.