Offer summary

Qualifications:

Strong background in deep learning frameworks like TensorFlow or PyTorch.

Experience in developing and deploying multimodal AI models involving text, vision, or audio.

Knowledge of model optimization techniques such as quantization, pruning, and distillation.

Advanced degree in Computer Science, Machine Learning, or related fields, with 5+ years of relevant experience.

Key responsibilities:

Design, develop, and optimize multimodal AI models for real-time inference.

Collaborate with cross-functional teams to integrate AI models into the platform.

Optimize models for memory efficiency, low latency, and high throughput.

Stay updated with the latest research and implement innovative techniques in generative AI.

Job description

Company Overview
Axelera is a European, high-growth Series B startup revolutionizing the AI landscape with our in-memory computing platform. We specialize in creating AI hardware and software optimized for high-performance inference, catering to cutting-edge use cases across high-end edge computing, embodied AI, and server-side AI deployments. We are looking for passionate, innovative research engineers to join our team and help drive the future of AI.

Role Overview
We are seeking an AI Research Engineer with expertise in developing and optimizing multimodal AI models. The role will be central to advancing our platform’s capabilities in inference for Generative AI, working on state-of-the-art models that integrate multiple data modalities (e.g., text, vision, and audio) for a broad range of applications.

This is an exciting opportunity to work at the intersection of advanced machine learning, in-memory computing, and high-performance AI inference on cutting-edge hardware architectures.

Responsibilities:

Model Development: Design, develop, and optimize multimodal AI models for real-time, high-efficiency inference across a variety of deployment environments (edge, server-side, and embodied AI).
Collaboration: Work closely with cross-functional teams, including AI researchers, hardware engineers, and software engineers, to integrate AI models into the broader platform.
Scalability and Optimization: Focus on optimizing models for memory efficiency, low-latency inference, and high throughput.
Innovation: Stay up-to-date with the latest research in multimodal AI, proposing and implementing new techniques to push the boundaries of what’s possible in generative AI.
Deployment & Testing: Implement best practices for model testing, deployment, and continuous improvement to ensure models scale effectively in production environments.