Posted Aug 16

NVIDIA is hiring a
Senior System Engineer - DevOps

China, Shanghai • 2 Locations • 4 Locations
Full time

We are now looking for Senior DevOps Software Engineers. You will work on open-source technologies and their enterprise adoption, such as:

  • GPU-accelerated Apache Spark (Spark-RAPIDS) that dramatically speeds up data processing and machine learning

  • Medical deep learning framework (project MONAI) that revolutionizes healthcare AI solutions worldwide

  • Federated learning technology (NVFlare) that builds generalizable AI models from diverse data sources while ensuring data security and privacy

What you'll be doing:

  • Serve as a technical lead in defining, designing, developing, and maintaining DevOps tools, frameworks, and platforms

  • Implement, advocate for, and carry out CI/CD conventions, and write tools to automate the steps involved in this process

  • Develop and maintain Build, Deployment, and Continuous Integration infrastructure

  • Enable the development team by providing automated build and test solutions using Docker, Kubernetes/YARN, and on-prem/CSPs

  • Work with open source communities, including RAPIDS, Spark, MONAI, and NVFlare, on CI/CD

  • Work closely with Development and QA teams to help ensure end-to-end quality

  • Full stack development opportunities depending on the candidate's capabilities

What we need to see:

  • BS or MS in Computer Science, Computer Engineering, or a closely related field

  • 5+ years of working experience in software development

  • 2+ years of experience with CI/CD systems; strong programming and debugging skills in Python/Java/C++, with extensive bash scripting experience

  • Excellent knowledge of GitLab/GitHub or other version control systems

  • Experience configuring, maintaining, and building upon deployments of industry-standard tools (e.g., Jenkins, Kubernetes, Docker)

  • Strong experience with build tools such as Maven, setuptools, and CMake, as well as unit-testing and code-coverage tools

  • Strong skills in the software release process (Maven repositories, PyPI, Conda)

  • Familiarity with Linux distributions such as Ubuntu, CentOS, and Rocky Linux

  • Familiarity with cloud services such as AWS, Azure, and GCP

  • Good knowledge of open-source big-data technologies (Spark, Hadoop) and/or ML/DL frameworks (TensorFlow, PyTorch)

Ways to stand out from the crowd:

  • Good open-source project management skills

  • Kubernetes, YARN, Spark, or Ray experience

  • Experience with configuration management tools such as Ansible and Terraform

  • Knowledge of monitoring systems (Prometheus, Grafana)

  • Experience with CUDA would be a huge plus

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you.

Please mention that you found the job on ARVR OK. Thanks.