Aethir is the only enterprise-grade, AI-focused GPU-as-a-service provider in the market. Its decentralized cloud computing infrastructure connects GPU providers (containers) with enterprise clients who need powerful GPUs for professional AI/ML workloads. With a network of over 40,000 high-end GPUs, including 3,000 NVIDIA H100s, Aethir delivers enterprise-grade GPU computing at scale.
We are seeking a Site Reliability Engineer (SRE) for our new headquarters in Kuala Lumpur, Malaysia. This role is crucial to monitoring, troubleshooting, and optimizing our production system to ensure high performance and stability for our AI and gaming customers worldwide.
Key responsibilities include:
Requirements:
We offer a hypergrowth startup environment, fantastic career progression opportunities, and a collaborative, innovative workplace. Join us in shaping the future of decentralized computing!