Posted Aug 16

NVIDIA is hiring a
Senior AI Network System Architect

Israel, Tel Aviv • Israel, Yokneam • Israel, Raanana
Full time

Our technology has no boundaries! NVIDIA is building the world’s most groundbreaking and state-of-the-art accelerated computing platforms. Because of our work, scientists, researchers, and engineers can advance their ideas. We pioneered a supercharged form of computing loved by the fastest-paced computer users in the world - scientists, designers, artists, and gamers.

We seek a highly motivated Senior AI Network System Architect to join our team of experts and help shape the future of high-performance and ML / AI computing. Our next-generation InfiniBand, NVLink, and Ethernet systems will be at the forefront of connecting and powering the world's most advanced AI clusters. As an AI system architect at NVIDIA, you will work on cutting-edge technology and help drive the innovation of our next-generation networks, which top researchers and engineers worldwide will use.

What you’ll be doing:

  • Exploring new technologies and workloads in machine learning and artificial intelligence, understanding how they interact with the network.

  • Running workloads on AI systems, analyzing bottlenecks and possible enhancements.

  • Performing research and optimizations for communications libraries such as NCCL and UCX.

  • Defining the next generation of networking products to support and accelerate cutting-edge ML workloads.

  • Developing models for simulations, analyzing simulation results, and developing optimization algorithms.

  • Collaborating with multi-functional teams, including other architecture teams, logic design, system software, firmware, and ML research teams, to ensure the successful execution of the project.

What we need to see:

  • M.Sc. or Ph.D. degree in Computer Science, Computer Engineering, or Electrical Engineering.

  • At least 3 years of industry or research experience in computer networks.

  • Extensive knowledge of ML / AI workloads, specifically in distributed training.

  • Excellent understanding of large-scale network behavior and the effect of distributed computing workloads on the network.

  • Experience in the development of simulation environments.

  • Strong problem-solving and critical-thinking skills.

  • Ability to operate in a highly dynamic environment.

  • Ability to work concurrently with multiple groups in the organization.

Ways to stand out from the crowd:

  • Knowledge of communication libraries such as NCCL, UCX, and UCC.

  • Good knowledge of network protocols such as InfiniBand, IP, TCP, and RoCE, and of network topologies.

  • Experience with Python, C++, and Docker.

NVIDIA has some of the most forward-thinking and hardworking people in the world working for us, and due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

We are committed to fostering a diverse work environment and proud to be an equal-opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.