
Senior Compiler Optimization Engineer - MLIR


Smart Summary (Powered by Roshi)
Join our team to analyze and improve the performance of application code on NVIDIA GPUs. We are looking for someone with strong experience in compiler optimizations, C++ programming, and a solid grasp of software engineering principles. This role involves working with geographically distributed teams at the forefront of deep-learning compiler technology. Apply now for an opportunity to contribute to architecture design and work with higher-level programming languages.

What you will be doing:

  • Analyze the performance of application code running on NVIDIA GPUs with the aid of profiling tools.
  • Identify opportunities for performance improvements in the LLVM-based compiler middle-end optimizer.
  • Design and develop new compiler passes and optimizations to produce a best-in-class, robust, supportable compiler and tools.
  • Interact with the open-source LLVM community to ensure tighter integration.
  • Work with geographically distributed compiler, hardware, and application teams to oversee improvements and problem resolutions.
  • Be part of a team at the center of deep-learning compiler technology, spanning architecture design and support through higher-level languages.

 

What we need to see:

  • B.S., M.S., or Ph.D. in Computer Science, Computer Engineering, or related fields (or equivalent experience).
  • 5+ years of experience in compiler optimizations such as loop optimizations, inter-procedural optimizations, and global optimizations.
  • Excellent hands-on C++ programming skills.
  • Understanding of any processor ISA (GPU ISA would be a plus).
  • Strong background in software engineering principles with a focus on crafting robust and maintainable solutions to challenging problems.
  • Good communication and documentation skills; self-motivated.

 

Ways to stand out from the crowd:

  • Master's or Ph.D. preferred.
  • Experience developing applications in CUDA or another parallel programming language.
  • Deep understanding of parallel programming concepts.
  • LLVM and/or Clang compiler development experience.
  • Familiarity with deep learning frameworks and NVIDIA GPUs.


 


Company

NVIDIA

Job Posted

a year ago

Job Type

Full-time

Work Mode

On-site

Experience Level

3-7 Years

Category

Engineering

Locations

Santa Clara, California, United States

Redmond, Washington, United States

Qualification

Bachelor


Related Jobs


Senior Deep Learning Compiler Engineer - MLIR

NVIDIA

Austin, Texas, United States

+2 more

Posted: a year ago

Analyzing deep learning networks and developing compiler optimization algorithms. Collaborating with teams to accelerate deep learning software. Defining APIs, performance tuning, and crafting and implementing compiler techniques.


Senior Performance Engineer

NVIDIA

Santa Clara, California, United States

Posted: a year ago

What you'll be doing:

  • Lead all aspects of implementing performance practices in large-scale infrastructure; deliver powerful tools, methodologies, and flows to validate and improve several datacenter products in parallel.
  • Accelerate strategic customer deployments and ensure speed-of-light bring-up and deployment of ground-breaking AI infrastructure by working hand in hand with customers, tailoring designs and faster processes to their needs.
  • Own the architecture of performance design and settings for datacenter-at-scale products, implemented in both FW and SW components, to ensure velocity and scale while using resources efficiently. This involves early engagement with internal and customer HW/FW/SW/platform teams, and other groups, to build end-to-end solutions and optimize datacenter product designs.
  • Contribute to the architecture of server- and rack-level telemetry, and collaborate to establish continuous improvements in our design flows.
  • Participate in engagements with various SW and FW teams (BMC/SBIOS/OS/drivers, etc.) to develop best-in-class practices and tools; analyze, debug, and resolve critical firmware and software issues for the best AI workload performance at scale.
  • Provide engineering solutions to enable large-scale performance strategies for datacenter GPU computing products and software stacks, maintain technical relationships with internal and external engineering teams, and assist systems engineers in building creative solutions based on NVIDIA technology.
  • Be an internal reference for firmware and at-scale deployment of datacenter and large-scale GPU-accelerated system solutions within the NVIDIA technical community.

What we need to see:

  • 5+ years of experience using accelerated computing for datacenter container computing solutions.
  • Strong knowledge of accelerated computing software stacks (CUDA).
  • Experience using and managing modern cloud- and container-based enterprise computing architectures.
  • C/C++/Python/Bash programming/scripting experience.
  • Experience with CPU architecture.
  • Experience with container technology and Linux-based OSes.
  • Experience working with engineering or academic research communities supporting high-performance computing or deep learning.
  • Strong verbal and written communication skills; strong teamwork and social skills.
  • Ability to multitask effectively in a dynamic environment; action-driven with strong analytical and troubleshooting skills.
  • Desire to be involved in multiple diverse and creative projects.
  • BS in Engineering, Mathematics, Physics, or Computer Science (or equivalent experience); MS or PhD desirable.

Ways to stand out from the crowd:

  • Deep learning framework skills; DL and graph compiler programming skills.
  • Exposure to virtualization techniques and cloud platform solutions.
  • Exposure to scheduling and resource management systems.
  • Experience with high-performance or large-scale computing environments.


Senior HPC Scheduler Engineer

NVIDIA

Santa Clara, California, United States

Posted: a year ago

What you'll be doing:

  • Provide engineering solutions and prototypes to enable efficient resource management and job scheduling for large-scale clusters; maintain technical relationships with internal and external engineering teams; assist system architects and machine learning/deep learning engineers in building creative solutions based on NVIDIA technology.
  • Be an internal reference for scheduling and resource management concepts and methodologies within the NVIDIA technical community.
  • Test, evaluate, and benchmark new technologies and products, and work with vendors, partners, and peers to improve functionality and optimize performance.

What we need to see:

  • 5+ years of experience designing and running scheduling and resource management systems in large datacenter/AI/HPC environments.
  • Knowledge of and experience with resource management/scheduling code bases: SLURM preferred; other implementations (LSF, SGE, Torque, ...) also valued.
  • Proven understanding of performance clusters, infrastructure, and workload patterns.
  • Experience using and installing Linux-based server platforms.
  • C/Python/Bash/Lua programming/scripting experience.
  • Experience working with engineering or academic research communities supporting HPC or deep learning.
  • Strong teamwork and both verbal and written communication skills.
  • Ability to multitask efficiently in a very dynamic environment; action-driven with strong analytical and troubleshooting skills.
  • Desire to be involved in multiple diverse and innovative projects.
  • BS in Engineering, Mathematics, Physics, or Computer Science, or equivalent experience; MS or PhD desirable.

Ways to stand out from the crowd:

  • Experience with HPC cluster administration for AI.
  • Experience deploying containerized services.
  • Experience with orchestrators (e.g., Kubernetes).
  • Demonstrated work with open-source software: building, debugging, patching, and contributing code.
  • Experience tuning memory, storage, and networking settings for performance on Linux systems.
  • Exposure to monitoring and telemetry systems.