Axelera AI, a European Series B startup building an in-memory computing platform for AI, is seeking an AI Research Engineer specializing in model compression. The role focuses on developing compression techniques for Generative AI models and optimizing them for real-time inference in environments ranging from edge computing to server-side deployments.
The position sits at the intersection of advanced machine learning, in-memory computing, and high-performance AI inference. The ideal candidate will develop and implement model compression techniques that maintain or improve model accuracy, and will work closely with cross-functional teams to integrate these optimizations into the AI platform.
Key responsibilities include designing compression techniques such as pruning and quantization, tuning performance for high-throughput inference, and staying current with the latest research developments. The role requires expertise in deep learning frameworks, experience with model optimization, and a strong understanding of AI/ML concepts.
The position is based in Italy, with options to work from Milan, Florence, or Bologna. Axelera AI offers competitive compensation, including equity options, and supports relocation for international talent. The company promotes a diverse, inclusive environment and provides significant growth opportunities as part of a fast-growing Series B startup.