NVIDIA is hiring a
Distinguished Engineer - Apache Spark
NVIDIA is seeking a Distinguished Engineer for the Apache Spark Acceleration group. Data scientists spend a considerable amount of time exploring data and iterating over machine learning (ML) experiments. Every hour of compute required to sort through datasets, extract features and fit ML algorithms impedes an efficient business workflow. NVIDIA believes that data science workflows can benefit tremendously from being accelerated, to enable data scientists to explore many more and larger datasets to drive towards their business goals, faster and more efficiently.
Apache Spark is the most popular data processing engine in data centers. We strive to accelerate Apache Spark 3.x use cases without application code changes. You will work with the open source community to accelerate Apache Spark with GPUs. You will engage in open source projects such as Apache Spark, RAPIDS Spark, RAPIDS, Apache Iceberg, Delta Lake, Hudi, Apache Kyuubi and more. The RAPIDS Spark library can be used both on-premises and in cloud services. We strive to be available in cloud services including Tencent Cloud EMR, Alibaba Cloud EMR, Databricks, AWS EMR, Google Dataproc, Microsoft Azure Synapse and Oracle OCI, among others.
What you'll be doing:
Lead the architecture, design and implementation of accelerated Apache Spark and related big-data frameworks
Work with a team of engineers including PMC members and committers of Apache Spark, Apache Hadoop, Apache Hive, and Apache Arrow
Engage open source communities (including Apache Spark, RAPIDS) for technical discussion and contribution, and engage new communities where we may not have a strong presence yet
Represent NVIDIA in customer technical discussions and collaborations, within the China market and globally
Work with NVIDIA strategic partners to deploy advanced machine learning and data analytics solutions in public cloud or on-premises clusters
Present technical solutions at industry conferences and meetups
Collaborate with distributed systems teams to craft solutions to distributed processing challenges at large scale
Provide recommendations and feedback to teams regarding decisions surrounding topics such as infrastructure, continuous integration and testing strategy
Build, test and optimize CUDA/C++ libraries across different platforms
Mentor a team of talented engineers building distributed systems. Provide guidance on design, architecture and algorithms. Be active in code reviews.
What we need to see:
BS, MS, or PhD in Computer Science, Computer Engineering, or closely related field
15+ years of work or research experience in software development
Outstanding technical skills in designing and implementing high-quality distributed systems
Excellent programming skills in C++, Java, Scala, Python
Strong verbal and written communication and interpersonal skills
Excellent knowledge of distributed system schedulers: Kubernetes, Hadoop YARN, Spark Standalone
Able to delve into a new area and quickly come up to speed
Able to work with teams across boundaries and geographies
Ways to stand out from the crowd:
Hands-on experience with key open source big-data projects as a contributor or committer to Apache Spark, Apache Hadoop, Apache Flink, Apache Kafka, Apache Hive, Apache Arrow, Delta Lake
Experience with design and implementation of SQL engines
We are an AA/EEO/Disabled employer with highly competitive salaries and a comprehensive benefits package. NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, contact us!