System Development Engineer, Alexa Language and Data Ops

Amazon is a global technology company that develops and maintains industry-leading multi-modal and multi-lingual large language models (LLMs) through its Artificial General Intelligence (AGI) team.
DevOps
Mid-Level Software Engineer
Contact Company
5,000+ Employees
3+ years of experience
AI

Description For System Development Engineer, Alexa Language and Data Ops

The Artificial General Intelligence (AGI) team at Amazon is seeking passionate, talented, and inventive engineers to play a pivotal role in the development and maintenance of industry-leading multi-modal and multi-lingual large language models (LLMs). The AGI team's mission is to leverage hyper-scalable, general-purpose large model training and inference systems to develop and deploy cutting-edge sensory AI foundational models that revolutionize machine perception, interpretation, and interaction with humans and the physical world.

Key responsibilities include:

  • Providing support for cluster and node management to ensure smooth operation of LLM infrastructure
  • Continuously improving and automating cluster/capacity/maintenance upgrades
  • Developing automation tools for improving operational excellence
  • Working on operations- and maintenance-driven coding projects, primarily in Ruby, Rails, Java, Python, or shell scripts, using AWS and web technologies
  • Applying hands-on Kubernetes experience and expertise in different AWS services
  • Driving company-wide campaigns with Support and Engineering teams
  • Participating in design and code reviews and identifying bottlenecks
  • Troubleshooting and researching root causes thoroughly to resolve defects

The ideal candidate should have:

  • 3+ years of administrative experience in networking, storage systems, operating systems, and hands-on systems engineering
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, or Rust
  • Experience with Linux/Unix
  • Experience with CI/CD pipelines and build processes
  • Preferred: Experience with distributed systems at scale

Amazon values a "Work Hard. Have Fun. Make History" approach, with a strong focus on sharing learning experiences from the front line with development teams. The role offers various opportunities for growth and specialization, whether you prefer mastering a domain, juggling multiple tasks, implementing process improvements, or focusing on coding.

Join the AGI team at Amazon to be at the forefront of AI innovation and contribute to the development of cutting-edge language models and AI technologies.

Last updated a month ago

Responsibilities For System Development Engineer, Alexa Language and Data Ops

  • Provide support for cluster and node management, ensuring smooth operation of LLM infrastructure
  • Continuously improve and automate cluster/capacity/maintenance upgrades
  • Develop automation tools for improving operational excellence
  • Work on operations- and maintenance-driven coding projects
  • Drive company-wide campaigns with Support and Engineering teams
  • Participate in design and code reviews and identify bottlenecks
  • Troubleshoot and research root causes thoroughly and resolve defects

Requirements For System Development Engineer, Alexa Language and Data Ops

Python
Ruby
Java
Kubernetes
  • 3+ years of administrative experience in networking, storage systems, operating systems, and hands-on systems engineering
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, or Rust
  • Experience with Linux/Unix
  • Experience with CI/CD pipelines and build processes
  • Hands-on experience with Kubernetes
  • Expertise in different AWS services
