Adobe Connect is a leading platform for web conferencing and virtual collaboration, trusted by organizations worldwide to deliver impactful online experiences. The Adobe Connect Team is dedicated to ensuring that our platform is reliable, scalable, and high-performing. As the Senior Site Reliability Engineering (SRE) Manager, you will play a crucial role in driving the success of Adobe Connect by leading a team of talented engineers and ensuring the stability and resilience of our services.
Key Responsibilities:
- Leadership and Management:
  - Lead, mentor, and manage a team of SREs, fostering a culture of collaboration, innovation, and continuous improvement.
  - Set strategic direction for the SRE team, aligning with business goals and ensuring the reliability and scalability of Adobe Connect services.
  - Drive the professional development of team members, providing coaching, feedback, and growth opportunities.
- Operational Excellence:
  - Oversee the management of Tier-1 monitoring/alerts, incident response, and post-mortem analysis, ensuring timely resolution and learning from incidents.
  - Develop and implement strategies for improving system reliability, including automation, performance tuning, and capacity planning.
  - Ensure robust disaster recovery and business continuity plans are in place and regularly tested.
- Collaboration and Communication:
  - Collaborate closely with engineering, product, and infrastructure teams to ensure seamless integration and deployment of new features and updates.
  - Act as a key stakeholder in product development, advocating for reliability, scalability, and operational efficiency from the early stages of design.
  - Communicate effectively with cross-functional teams and executive leadership, providing updates on system performance, reliability metrics, and ongoing projects.
- Continuous Improvement and Innovation:
  - Drive automation initiatives to reduce manual intervention, improve efficiency, and minimize downtime.
  - Identify and implement best practices in SRE, staying ahead of industry trends and emerging technologies.
  - Foster a culture of continuous improvement, encouraging the team to experiment, learn, and iterate on processes and tools.
- Resource Planning and Allocation:
  - Manage team resources effectively, balancing operational tasks with project work to ensure the team can meet both short-term and long-term objectives.
  - Participate in hiring, onboarding, and training new team members, ensuring the SRE team is well-equipped to handle the demands of the Adobe Connect platform.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10+ years of experience in site reliability engineering, infrastructure engineering, or a related role, with at least 5 years in a leadership or management position.
- Proven experience managing large-scale, distributed systems in a cloud environment (AWS, Azure, or GCP).
- Strong expertise in automation, monitoring, and incident management tools and practices.
- Excellent problem-solving and analytical skills, with a focus on delivering high-quality, reliable services.
- Strong communication and interpersonal skills, with the ability to lead and inspire a team.
- Experience with Agile methodologies and a solid understanding of DevOps practices.
Preferred Qualifications:
- Experience working with Adobe Connect or other web conferencing platforms.
- Certifications in cloud platforms (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).
- Knowledge of security and compliance standards (ISO 27001, SOC 2, GDPR, etc.).
- Experience in performance tuning, capacity planning, and cost optimization in cloud environments.
Adobe is proud to be an Equal Employment Opportunity and affirmative action employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.