Posted Aug 16

NVIDIA is hiring a
Senior AI Platform Engineer

US, CA, Santa Clara • US, TX, Austin
Full time

At NVIDIA, we pride ourselves on data-driven decision-making, and the data science platform team is at the heart of this initiative! We are looking for an excellent Principal ML/AI Engineer to enhance our AI platform, which processes 10+ trillion events per month. This role will work closely with data science, data engineering, and product teams to ensure delivery of reliable services. This is an excellent opportunity to architect state-of-the-art tech, including machine learning at scale, containerized environments, and GPU-based inferencing. The candidate will have the opportunity to optimize and expand our data science platform and organization for model generation, orchestration, deployment, health, operations, and metrics.
 

What you’ll be doing:

  • Design and architect the MLOps lifecycle for our data science platform

  • Own production machine learning pipelines, both in-house and in the cloud

  • Design and architect our GPU inferencing service, both offline and online

  • Implement machine learning deployment tools, observability, and APIs to build self-serve ML pipelines

  • Work closely with data scientists and data engineers to advise on implementation and provide training

  • Work with internal and external product teams to enable new features for GPU-based inferencing

What we need to see:

  • Bachelor’s or Master’s degree in Computer Science or a related technical field, or equivalent experience

  • 10+ years of software engineering experience

  • Proficient understanding of distributed computing principles

  • Excellent understanding of MLOps best practices, with the ability to design, measure, track, validate, and orchestrate models

  • Expert-level software development skills in one or more of Java, Scala, Python, or Go

  • Strong background in deploying ML/DL models at scale and in inferencing stacks

  • Strong background in state-of-the-art neural network architectures (CNNs, RNNs, GANs, LLMs, etc.) and experience developing or using major deep learning frameworks (e.g., TensorFlow, PyTorch)

  • Experience designing and implementing low-latency, high-throughput applications on Kubernetes (K8s)

  • Excellent communication skills, including the ability to identify and communicate data-driven insights

Ways to stand out from the crowd:

  • Knowledge of building K8s custom operators

  • Experience with SLURM

  • Contributions to open source

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

The base salary range is $216,000 - $414,000. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
