Senior HPC AI Engineer

Sachseln, full-time, 54000 - 84000 € / year (estimated), no remote work possible

At a glance

  • Tasks: Design and maintain large scale HPC/AI clusters while collaborating with researchers and developers.
  • Employer: Join NVIDIA, a leader in groundbreaking computing technologies and AI advancements.
  • Employee benefits: Enjoy a diverse workplace with opportunities for growth and innovation in cutting-edge tech.
  • Why this job: Be at the forefront of AI and HPC, contributing to revolutionary solutions and workflows.
  • Desired qualifications: Requires a degree in Computer Science or Engineering and 5+ years of relevant experience.
  • Other information: We value diversity and provide accommodations for individuals with disabilities.

The expected salary is between 54000 and 84000 € per year.

NVIDIA is looking for an experienced HPC Engineer to join the E2E software verification HPC/AI Infrastructure team. We are focused on building supercomputers and HPC clusters based on groundbreaking technologies. We are looking for an outstanding architect for a senior HPC position, to be a key player in the most exciting computing hardware and software, contributing to the latest breakthroughs in artificial intelligence and GPU computing. You will provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest Accelerated computing and Deep Learning software and hardware platforms, collaborating with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large scale performance platforms.

What you will be doing:

  • Design, implement and maintain large scale HPC/AI clusters with monitoring, logging and alerting.
  • Manage Linux job/workload schedules and orchestration tools.
  • Develop and maintain continuous integration and delivery pipelines.
  • Develop tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
  • Deploy monitoring solutions for the servers, network, and storage.
  • Perform troubleshooting from bare metal, operating system, software stack, and application level.
  • Being a technical resource, develop, redefine, and document standard methodologies to share with internal teams.
  • Support Research & Development activities and engage in POCs/POVs for future improvements.

What we need to see:

  • A degree in Computer Science, Engineering, or a related field and 5+ years of experience.
  • Knowledge of HPC and AI solution technologies from CPUs and GPUs to high speed interconnects and supporting software.
  • Experience with job scheduling workloads and orchestration tools such as Slurm, K8s.
  • Excellent knowledge of Windows and Linux (Redhat/CentOS and Ubuntu) networking (sockets, firewalld, iptables, wireshark, etc.) and internals, ACLs and OS level security protection and common protocols e.g. TCP, DHCP, DNS, etc.
  • Experience with multiple storage solutions such as Lustre, GPFS, zfs and xfs. Familiarity with newer and emerging storage technologies.
  • Python programming and bash scripting experience.
  • Comfortable with automation and configuration management tools such as Jenkins, Ansible, Puppet/Chef.
  • Deep knowledge of Networking Protocols like InfiniBand, Ethernet.
  • Deep understanding and experience with virtual systems (for example VMware, Hyper-V, KVM, or Citrix).
  • Familiarity with cloud computing platforms (e.g. AWS, Azure, Google Cloud).

Ways to stand out from the crowd:

  • Knowledge of CPU and/or GPU architecture.
  • Knowledge of Kubernetes, container-related microservice technologies.
  • Experience with GPU-focused hardware/software (DGX, Cuda).
  • Background with RDMA (InfiniBand or RoCE) fabrics.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Senior HPC AI Engineer Employer: TN Switzerland

At NVIDIA, we pride ourselves on being a leader in the HPC and AI space, offering our employees the opportunity to work with cutting-edge technologies that are shaping the future of computing. Our collaborative work culture fosters innovation and creativity, providing ample opportunities for professional growth and development. Located in a vibrant tech hub, we offer competitive benefits and a commitment to diversity, ensuring that every team member feels valued and empowered to contribute to groundbreaking advancements in artificial intelligence and GPU computing.

Contact person:

TN Switzerland HR Team

StudySmarter application tips 🤫

How to get the job: Senior HPC AI Engineer

✨Tip Number 1

Make sure to showcase your experience with HPC and AI technologies during the interview. Be prepared to discuss specific projects where you've designed or managed large-scale HPC clusters, as this will demonstrate your hands-on expertise.

✨Tip Number 2

Familiarize yourself with the latest trends in GPU computing and accelerated computing. Being able to discuss recent advancements or breakthroughs in these areas can set you apart from other candidates.

✨Tip Number 3

Highlight your proficiency with orchestration tools like Slurm and Kubernetes. Prepare examples of how you've used these tools to optimize job scheduling and workload management in previous roles.

✨Tip Number 4

Demonstrate your problem-solving skills by preparing for technical questions related to troubleshooting at various levels, from bare metal to application level. This will show your depth of knowledge and readiness for the challenges of the role.

These skills make you a top applicant for the position: Senior HPC AI Engineer

HPC and AI solution technologies
Job scheduling workloads and orchestration tools (e.g., Slurm, K8s)
Linux (Redhat/CentOS and Ubuntu) networking and internals
Windows networking and OS level security
Storage solutions (Lustre, GPFS, zfs, xfs)
Python programming
Bash scripting
Automation and configuration management tools (Jenkins, Ansible, Puppet/Chef)
Networking Protocols (InfiniBand, Ethernet)
Virtual systems (VMware, Hyper-V, KVM, Citrix)
Cloud computing platforms (AWS, Azure, Google Cloud)
CPU and/or GPU architecture knowledge
Kubernetes and container-related microservice technologies
GPU-focused hardware/software (DGX, Cuda)
Background with RDMA (InfiniBand or RoCE) fabrics

Tips for your application 🫡

Tailor Your CV: Make sure your CV highlights relevant experience in HPC and AI technologies. Emphasize your knowledge of job scheduling tools like Slurm and Kubernetes, as well as your programming skills in Python and bash scripting.

Craft a Strong Cover Letter: In your cover letter, express your passion for HPC and AI. Discuss specific projects or experiences that demonstrate your ability to design and maintain large-scale HPC/AI clusters, and how you can contribute to NVIDIA's innovative environment.

Showcase Relevant Projects: Include examples of past projects where you developed or managed HPC systems. Highlight any experience with cloud computing platforms and automation tools, as these are key aspects of the role.

Highlight Collaboration Skills: Since the role involves working with researchers and developers, emphasize your teamwork and communication skills. Mention any collaborative projects you've been part of, especially those that required cross-functional cooperation.

How to prepare for an interview at TN Switzerland

✨Showcase Your Technical Expertise

Be prepared to discuss your experience with HPC and AI technologies in detail. Highlight specific projects where you've designed or maintained large-scale HPC/AI clusters, and be ready to explain the challenges you faced and how you overcame them.

✨Demonstrate Problem-Solving Skills

Expect technical questions that assess your troubleshooting abilities. Prepare examples of how you've diagnosed and resolved issues at various levels, from bare metal to application level, and be ready to walk through your thought process.

✨Familiarize Yourself with Relevant Tools

Make sure you are well-versed in job scheduling and orchestration tools like Slurm and Kubernetes. Be ready to discuss your experience with automation tools such as Jenkins and Ansible, and how you've used them to streamline processes.

✨Engage with the Interviewers

During the interview, ask insightful questions about the team's current projects and future goals. This shows your genuine interest in the role and helps you understand how you can contribute to their success.

Application deadline: 2027-03-08