Microsoft Cloud Hardware Infrastructure Engineering (CHIE) is the team behind Microsoft's expanding cloud infrastructure and is responsible for powering Microsoft's "Intelligent Cloud" mission. CHIE delivers the core infrastructure and foundational technologies for more than 200 Microsoft online businesses, including Bing, MSN, Office 365, Xbox Live, Skype, OneDrive, and the Microsoft Azure platform, globally through our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide, and we are looking for passionate, high-energy engineers to help achieve that mission.
As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality, and at the lowest cost is of paramount importance. To achieve this goal, the Cloud Hardware Infrastructure Engineering (CHIE) team is instrumental in defining and delivering measures of success for hardware design, qualification, fleet support, scale, and sustainability related to Microsoft cloud hardware.
The Azure Memory and Storage Center of Excellence (AMS COE) is part of the CHIE organization and focuses on the memory and storage devices that go into cloud hardware servers. AMS provides memory and storage solutions to Azure and drives memory and storage suppliers to deliver high-quality products that meet our requirements.
We are looking for an experienced, hands-on Software Engineer in SSD/HDD solutions for fleet health, with a strong passion for customer-focused solutions and the insight and industry knowledge to architect and specify hardware storage solutions that optimize for quality, reliability, cost, and performance.
Key Responsibilities:
• Design and build infrastructure for storage devices at scale
• Develop scalable live monitoring capabilities, failure detection and prediction algorithms for storage devices
• Investigate, triage and root cause SSD/HDD related failures in Azure solutions
• Build automation for operations of storage devices
• Collaborate with suppliers to design reliable, high performance and quality storage devices
• Develop ML algorithms for failure prediction
• Analyze data to identify, prototype, and drive the implementation of technical and process improvements to increase the predictability, agility, and quality of Azure systems
• Actively support Azure service stakeholders
This role offers the opportunity to work on cutting-edge cloud storage technologies, contribute to the reliability and performance of Microsoft's global cloud infrastructure, and collaborate with industry-leading experts in the field of storage and cloud computing.