NVIDIA is hiring a Senior Software Engineer - Infrastructure
NVIDIA is searching for a highly motivated software engineer to join the team building a next-generation network management and telemetry system in the cloud, using modern design principles at internet scale. It will be a highly scalable, modern network operations toolset that provides visibility, troubleshooting, validation and telemetry for Ethernet and InfiniBand networks.
You will be part of the team building the network management platform, in an application infrastructure role. The focus is on efficiency through automating repetitive workflows, and the role calls for excellent troubleshooting skills.
What you'll be doing:
Focusing on efficiency by automating repetitive workflows.
Working on a microservices-based architecture.
Deploying and troubleshooting non-disruptive cloud operations with an emphasis on secure production infrastructure.
Continuously evaluating existing systems and driving improvements.
Managing deployments and upgrades of operating systems and Kubernetes clusters.
Providing day-to-day support for engineering activities with CI/CD tools like Git and Jenkins.
Contributing to applications such as data ingestion, distributed computing, near-real-time analytics engines, RESTful APIs and user interfaces.
What we need to see:
5+ years of experience in complex microservices-based architectures.
Highly skilled in Kubernetes and containerd.
Experience with modern deployment architectures for non-disruptive cloud operations, including blue-green and canary rollouts.
Automation expert with hands-on skills in frameworks like Ansible and Terraform.
Strong knowledge of NoSQL databases (preferably Cassandra), Kafka/Kafka Streams and Nginx.
Expert in AWS, Azure or GCP.
Good programming background in languages like Scala or Python.
Knowledge of the best practices and discipline of managing a highly available and secure production infrastructure.
Ways to stand out from the crowd:
Experience with APM tools like Dynatrace, Datadog, AppDynamics or New Relic.
Linux/Unix administration skills.
Experience with Prometheus/Grafana.
Past experience implementing highly scalable log aggregation systems using the ELK stack or similar.
Experience implementing robust metrics collection and alerting infrastructure.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative, passionate and self-motivated, we want to hear from you!