Kernel Engineer, Modular

$166.5-242k

The estimated base salary range for this role, when performed in Canada (regardless of province), is $157,500.00 - $228,500.00 CAD. Total compensation for a candidate will also include an annual target bonus, equity, and benefits, with equity making up a significant portion of total compensation.

C++

Mid and Senior level

San Francisco Bay Area

Remote from Canada, US

Fast, scalable Gen AI inference platform

Job no longer available

201-500 employees

B2B, Artificial Intelligence, Machine Learning

Company mission

To have real, positive impact in the world by reinventing the way AI technology is developed and deployed into production with a next-generation developer platform.

Role

Who you are

In-depth knowledge of C++ and low-level (micro)architectural performance is required
4+ years of experience working on complex code and systems
Experience with performance modeling and performance data analysis
Understanding of parallelization techniques for ML / HPC acceleration
Deep interest in machine learning technologies and use cases
Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture

Desirable

Some knowledge of compiler fundamentals is valuable, as is familiarity with kernel authoring paradigms (e.g., OpenMP, CUDA, Halide, Rise/Lift, or others)
Experience with performance profilers, performance data analysis tools, visualization tools, and debugging, or experience working with embedded systems
Experience working with distributed/parallel programming models and an understanding of parallel hardware
Experience developing firmware for accelerators and embedded programming
Experience with HPC programming and accelerator languages such as CUDA, OpenCL, or SYCL

What the job involves

ML developers today face significant friction in taking trained models into deployment
They work in a highly fragmented space, with incomplete and patchwork solutions that require significant performance tuning and non-generalizable, model-specific enhancements
At Modular, we are building the next generation AI platform that will radically improve the way developers build and deploy AI models
A core part of this offering is providing a platform that allows developers to reuse deployment-specific tuning and enhancements across model families and frameworks
As an AI Kernel Engineer, you will own the development and tuning of performance libraries for AI models
You will develop kernels and algorithms that increase kernel performance, reduce activation volumes, speed up data pre- and post-processing, and in general increase the end-to-end performance of the model
Design and optimize high-performance ML numeric and data manipulation kernels/operators
Utilize low-level C/C++/Assembly programming to achieve state-of-the-art performance. Your work may also entail introducing new compiler and tools support
Work with compiler, framework, runtime and performance teams to deliver end-to-end performance that fully utilizes today’s complex server and mobile systems
Collaborate with architects and hardware engineers to co-design future accelerators, including designing ISAs for new hardware features and evolving existing ISAs
Collaborate with machine learning researchers to guide system development for future ML trends

Our take

AI development is booming, but the tools behind it? Not so much. Developers are often stuck dealing with messy, disconnected systems that slow everything down and bump up costs.

To address this, Modular has developed a unified AI platform that streamlines the development process. Its AI Engine enhances the performance of models on CPUs and GPUs, supporting popular frameworks like TensorFlow and PyTorch. On top of that, it has built Mojo, a programming language that aims to be as easy as Python but much faster (35,000 times faster, the company claims).

In 2023, the company raised an impressive $100M to power its mission. It's been using this funding to grow its team, support more hardware, and push Mojo even further, in the hope of making AI development faster, smoother, and a whole lot more fun for everyone.

Kirsty

Company Specialist at Welcome to the Jungle

Company

Funding (2 rounds)

Aug 2023: $100m (Late VC)

Jun 2022: $30m (Early VC)

Total funding: $130m

Company benefits

A variety of fantastic health benefits (health, dental, and vision insurance; life insurance; etc.) are available
A 401(k) plan with up to 5% match
Free tax advice on Carta
Generous work-from-home stipend of $1,500 to help you improve your home office
Unlimited paid time off and flexible work hours

Leadership

Chris Lattner

(Co-Founder & CEO)

Founder, Architect, Engineer and BDFL for the LLVM Organization. Former President of Engineering and Product at SiFive, and previously held various roles at Google, Tesla, and Apple.

Tim Davis

(Co-Founder & President)

Worked at Google for 5 years, ultimately as Group Product Lead for Google ML. Previously founded Fluc Inc, developing its entire ML-powered delivery system with a co-founder, as well as the mapping software system CrowdSend.
