Posted Aug 16

Nvidia is hiring a
Distinguished Engineer - Apache Spark

China, Shanghai
Full time

NVIDIA is seeking a Distinguished Engineer for the Apache Spark Acceleration group. Data scientists spend a considerable amount of time exploring data and iterating over machine learning (ML) experiments. Every hour of compute required to sort through datasets, extract features and fit ML algorithms impedes an efficient business workflow. NVIDIA believes that data science workflows can benefit tremendously from being accelerated, to enable data scientists to explore many more and larger datasets to drive towards their business goals, faster and more efficiently.

Apache Spark is the most popular data processing engine in data centers. We strive to accelerate Apache Spark 3.x use cases without application code changes. You will work with the open source community to accelerate Apache Spark with GPUs. You will engage in open source projects such as Apache Spark, RAPIDS Spark, RAPIDS, Apache Iceberg, Delta Lake, Hudi, Apache Kyuubi and more. The RAPIDS Spark library can be used both on premises and in cloud services. We strive to be available in cloud services including Tencent Cloud EMR, Alibaba Cloud EMR, Databricks, AWS EMR, Google Dataproc, Microsoft Azure Synapse and Oracle OCI, among others.

What you'll be doing:

  • Lead the architecture, design and implementation of accelerated Apache Spark and related big-data frameworks

  • Work with a team of engineers including PMC members and committers of Apache Spark, Apache Hadoop, Apache Hive, and Apache Arrow

  • Engage open source communities (including Apache Spark, RAPIDS) for technical discussion and contribution, and engage new communities where we may not have a strong presence yet

  • Represent NVIDIA in customer technical discussions and collaborations, within the China market and globally

  • Work with NVIDIA strategic partners to deploy advanced machine learning and data analytics solutions in public cloud or on-premises clusters

  • Present technical solutions at industry conferences and meetups

  • Collaborate with distributed systems teams to craft solutions to distributed processing challenges at large scale

  • Provide recommendations and feedback to teams regarding decisions surrounding topics such as infrastructure, continuous integration and testing strategy

  • Build, test and optimize CUDA/C++ libraries across different platforms

  • Mentor a team of talented engineers building distributed systems. Provide guidance on design, architecture and algorithms. Be active in code reviews.

What we need to see:

  • BS, MS, or PhD in Computer Science, Computer Engineering, or closely related field

  • 15+ years of work or research experience in software development

  • Outstanding technical skills in designing and implementing high-quality distributed systems

  • Excellent programming skills in C++, Java, Scala, Python

  • Strong verbal and written communication and interpersonal skills

  • Excellent knowledge of distributed system schedulers: Kubernetes, Hadoop YARN, Spark Standalone

  • Able to delve into a new area and quickly come up to speed

  • Able to work with teams across boundaries and geographies

Ways to stand out from the crowd:

  • Hands-on experience with key open source big-data projects as a contributor or committer to Apache Spark, Apache Hadoop, Apache Flink, Apache Kafka, Apache Hive, Apache Arrow, Delta Lake

  • Experience with design and implementation of SQL engines 

We are an AA/EEO/Disabled employer with highly competitive salaries and a comprehensive benefits package. NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, contact us!
