NVIDIA is hiring a
Senior System Engineer - DevOps
We are now looking for Senior DevOps Software Engineers. You will work on open-source technologies and their enterprise adoption, such as:
Accelerating Apache Spark with GPUs (Spark-RAPIDS) to dramatically speed up data processing and machine learning
Medical deep learning framework (project MONAI) that revolutionizes healthcare AI solutions worldwide
Federated learning technology (NVFlare) that builds generalizable AI models from diverse data sources while ensuring data security and privacy
What you'll be doing:
Serve as a technical lead in defining, designing, developing, and maintaining the DevOps tools, frameworks & platforms
Implement, advocate, and carry out CI/CD conventions and write tools to automate various steps involved in this process
Develop and maintain Build, Deployment, and Continuous Integration infrastructure
Enable the development team by providing automated build and test solutions using Docker, Kubernetes/YARN, and on-prem/CSPs
Work with open source communities, including RAPIDS, Spark, MONAI, and NVFlare, on CI/CD
Work closely with Development and QA teams to help ensure end-to-end quality
Full stack development opportunities depending on the candidate's capabilities
What we need to see:
BS or MS in Computer Science, Computer Engineering, or closely related fields
5+ years of working experience in software development
2+ years of experience with CI/CD systems
Strong programming and debugging skills in Python/Java/C++, with extensive Bash scripting experience
Excellent knowledge of GitLab/GitHub or other source version control systems
Experience configuring, maintaining, and building upon deployments of industry-standard tools (e.g., Jenkins, Kubernetes, Docker)
Strong experience with build tools such as Maven, setuptools, and CMake, as well as unit-testing and code-coverage tools
Strong skills in software release processes (Maven repositories, PyPI, Conda)
Familiarity with Linux distributions such as Ubuntu, CentOS, and Rocky Linux
Familiarity with cloud services such as AWS, Azure, and GCP
Good knowledge of open-source big-data technologies (Spark, Hadoop) and/or ML/DL frameworks (TensorFlow, PyTorch)
Good open-source project management skills
Ways to stand out from the crowd:
Kubernetes, YARN, Spark, or Ray experience
Experience with configuration management and infrastructure-as-code tools such as Ansible and Terraform
Knowledge of monitoring systems (Prometheus, Grafana)
Experience with CUDA would be a huge plus
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you.