Software Engineer ML Platform Infrastructure (all genders)

Berlin, full-time, no home office possible
Zalando GmbH

As part of our team, you’ll play a key role in developing and maintaining the ML Platform that powers machine learning at Zalando. We own a suite of critical tools and services—including the model registry, AI inventory, our internal ML infrastructure framework (MLI), and the central feature store—that support 35+ ML teams and over 300 Applied Scientists across the company.

Your work will directly impact how teams across Zalando build, manage, and scale their ML solutions. From enabling reproducible experimentation to managing production-ready features and models, you’ll contribute to systems that ensure our ML practitioners can move fast without compromising on quality, safety, or reliability.

This is a highly collaborative role where you’ll interface with a broad range of product domains, providing the tooling and infrastructure that help ML Engineers focus on what they do best: building models that enhance our customer experience. You’ll work with a modern tech stack (including Python, Kubernetes, Airflow, Ray, and distributed computing technologies) on infrastructure that supports thousands of models and workflows company-wide.

Joining us means being part of a team that values operational excellence, innovation, and developer experience. You’ll help shape the future of ML at Zalando, building robust platforms that scale with the company and unlock the full potential of AI.

This role offers a unique opportunity to work on systems that power critical customer-facing services, ensuring efficient and secure traffic delivery at massive scale. You’ll collaborate with experienced engineers, continuously improving performance and operational excellence across our service landscape.

Be part of a team where your contributions drive significant technological advancements, impacting millions of customers across Europe.

WHAT WE’D LOVE YOU TO DO (AND LOVE DOING)

  1. Design, build, and maintain critical components of Zalando’s ML platform, ensuring it is scalable, reliable, and efficient for 35+ ML teams and 300+ Applied Scientists.
  2. Develop and enhance the model registry, AI inventory, central feature store, and ML infrastructure framework (MLI) to streamline the entire ML lifecycle—from experimentation to production.
  3. Optimize the platform for performance, scalability, and operational excellence, reducing complexity and improving developer experience.
  4. Take ownership of key infrastructure projects, leading design discussions, mentoring team members, and driving cross-functional initiatives.
  5. Promote best practices in CI/CD, observability, monitoring, and infrastructure-as-code, ensuring smooth and reliable ML model deployment.
  6. Actively contribute to an engineering culture of learning, innovation, and knowledge sharing, collaborating across teams to improve the overall ML ecosystem at Zalando.

WE’D LOVE TO MEET YOU IF YOU

  1. Have a solid foundation in Python, with a desire to expand your skills in designing and developing production-ready applications, especially those that support MLOps.
  2. Have a foundational understanding of Kubernetes concepts and a strong interest in learning to design, deploy, and manage applications within clusters.
  3. Have a good understanding of AWS services (including but not limited to SageMaker, Bedrock, EC2, Lambda, CloudFormation) and experience designing solutions that involve these services.
  4. Write clean, simple, readable, testable, and maintainable code, with a focus on performance and scalability.
  5. Have experience in CI/CD practices, logging, and monitoring, and are committed to operational excellence in large-scale environments.
  6. Exhibit a strong desire to learn and grow in the field of infrastructure technologies, with a specific interest in MLOps and machine learning workflows.
  7. Are an excellent communicator with strong verbal and written communication skills in English, able to explain complex technical concepts to both technical and non-technical audiences.

Bonus: Experience with Ray.io or distributed systems.

Bonus: Previous experience with platform/infrastructure teams.

If you think you have what it takes, we encourage you to apply even if you don’t meet every single requirement. You may just be the right candidate for this or other roles.

OUR OFFER

Zalando provides a range of benefits. Here is an overview of what you can expect; ask your Talent Acquisition Partner to learn more about what we offer:

  • Employee shares program
  • 40% off fashion and beauty products sold and shipped by Zalando, 30% off Zalando Lounge, discounts from external partners
  • 2 paid volunteering days a year
  • Hybrid working model with 60% remote per week; the actual practice is up to each team, to best support their collaboration
  • Work from abroad for up to 30 working days a year
  • 27 days of vacation a year to start
  • Relocation assistance available (subject to prior agreement)
  • Family services, including counselling and support
  • Health and wellbeing options (including Gympass)
  • Mental health support and coaching available

Contact person: Zalando GmbH HR Team
