Your Profile
- 6+ years in Site Reliability Engineering, DevOps, or Infrastructure roles, with hands-on experience managing cloud-native production environments on AWS or GCP.
- Strong expertise with Kubernetes and Helm for orchestration and deployment.
- Solid experience in Terraform for infrastructure as code.
- Deep understanding of Linux systems and containerization with Docker.
- Proficiency in cloud platforms (AWS or GCP), including IAM, networking, and storage.
- Strong observability mindset with experience in Prometheus, Grafana, and Loki.
- Experience with CI/CD pipelines, preferably using GitHub Actions or GitLab CI/CD.
- Familiarity with monitoring and incident response, including alerting, SLOs/SLIs, and on-call rotations.
Bonus:
- Experience with Argo Workflows.
- Knowledge of authentication and authorization tools (e.g., OIDC, JWT, AWS Cognito).
- Familiarity with tracing systems such as Grafana Tempo.
- Understanding of security tooling and practices (e.g., WAF, firewalls, secrets management).
- Exposure to financial services or insurance environments.
Why Us?
- A challenging and supportive environment with motivated colleagues, where you help define entirely new products and features from scratch, learn a lot, and have fun
- Attractive compensation
- A long-term development perspective in a growing company
- A friendly team where you are welcome as a person and where enjoyment of the work and development prospects come first
Our Values
Application Process
About Us
Your Mission
- Design, implement, and maintain infrastructure as code using Terraform across AWS and GCP environments.
- Manage and optimize Kubernetes clusters, using Helm for packaging and deployment of applications.
- Build and maintain observability stacks using Grafana, Prometheus, Loki, and tracing tools like Grafana Tempo.
- Ensure high availability, scalability, and resilience of production systems.
- Improve deployment processes with CI/CD pipelines (e.g., GitHub Actions), enabling safe and fast delivery of software.
- Support internal teams by providing reliable, well-documented, and secure infrastructure.
- Troubleshoot production incidents, perform root cause analysis, and implement postmortem processes.
- Maintain and harden Docker-based environments and cloud systems.
- Champion best practices in monitoring, incident management, performance tuning, and infrastructure automation.
- Collaborate with development, product, and security teams to ensure infrastructure supports business needs.
Your Tasks
Site Reliability Engineer (m/f/d) Employer: Baobab, Inc.
As an employer, we offer a challenging and supportive environment in which motivated colleagues work together on developing new products. Our company culture fosters not only personal and professional growth but also enjoyment of the work. With attractive compensation and long-term prospects in a growing company, we are the ideal place for Site Reliability Engineers who want to contribute their skills in a dynamic team.
Contact person:
Baobab, Inc. HR Team