Core Pipeline is a globally distributed team that develops tools and workflows that act as the foundational infrastructure for ILM's creative departments at all of our locations around the world. We work closely with CG Production to create innovative tooling for media management, publishing and asset management, developer platforms, and other core services. We work with all levels of artists and supervisors to create efficient, resilient, and artist-friendly software.
Within Core Pipeline, the Platform team provides the services, tools and applications that underpin the VFX pipeline. The Staff Production Engineer for Platform is engaged in architecting, developing, and maintaining the environment for the application developers to operate in. They will be a strong individual contributor and teammate, with an eye towards security, scalability, and reliability.
As a Staff Production Engineer, you may be required to:
• Work closely with development teams to ensure applications are designed for scalability, reliability, and performance.
• Design, deploy, and maintain distributed, multi-region, highly scalable, and reliable services to improve developer productivity and experience.
• Build and maintain effective monitoring, logging, and alerting systems.
• Implement Site Reliability Engineering (SRE) best practices.
• Continuously improve our application infrastructure and processes.
• Drive discussions around the future of the developer platform at ILM.
• Provide support in the event of critical service downtime.
Required skills:
• Expert knowledge of Python and the Linux environment.
• Experience working with network and application protocols such as NFS, TCP, gRPC, and HTTP.
• Experience with several relational or NoSQL technologies such as MySQL, PostgreSQL, MongoDB, Redis, Cassandra, or Elasticsearch.
• Experience building and managing Kubernetes clusters in production.
• Experience with infrastructure-as-code tools such as Terraform or Ansible.
• Experience with CI/CD pipelines.
• Experience with monitoring and logging tools such as the ELK stack, Prometheus, Grafana, and Datadog.
• Knowledge of or expertise with SRE practices.
• Experience designing and implementing distributed systems in multi-region and hybrid environments.
• Strong problem-solving and troubleshooting skills.
• Bachelor's degree, M.Sc., or Ph.D. in Computing Science, or equivalent professional experience.
Additional qualifications (a plus):
• Experience developing with C, C++, or Rust.
• Experience with message queuing systems such as RabbitMQ or Kafka.
• Experience with cloud platforms such as AWS, Azure, or Google Cloud Platform.
• VFX, Feature Animation, or Episodic production experience.
• Experience developing on Windows.
If you are a Staff Production Engineer with a strong background in data-intensive systems and are looking for an exciting opportunity to make movie magic, we encourage you to apply.