Staff Cloud Software Engineer, Tenstorrent

Cloud Infrastructure

$100-500k

OTE

Range inclusive of base and variable compensation targets

Kubernetes
Python
Go
Ansible
Prometheus
Grafana
Git
REST API
Loki
Expert level
Austin
San Francisco Bay Area
Tenstorrent

Building computers for AI

501-1000 employees

B2B, Artificial Intelligence, Enterprise, Manufacturing, Deep Tech, Machine Learning

Company mission

To bring unique solutions to the problems facing AI and machine learning.

Role

Who you are

  • 10+ years of hands-on software engineering experience working with distributed systems in Cloud and/or HPC environments
  • 5+ years of experience working with clustered (multi-host) AI hardware and applications for training and inference
  • 5+ years of experience with Kubernetes clusters, including cluster and application deployment (e.g., CNI, CSI, Helm), operations, and development of extensions (e.g., Device plugins, Operators)
  • Strong working knowledge of Python and Go
  • Experience treating Infrastructure as Code as a first-class practice (e.g., Ansible)
  • Strong Git, GitOps, and CI/CD experience
  • Familiarity with the performance requirements of AI/ML workloads, both inference and training
  • Familiarity with virtualization technologies and platforms
  • Hands-on experience with MLOps concepts and frameworks for end-to-end model training pipelines
  • Strong understanding of networking concepts – experience with network hardware configuration and management is a plus
  • Familiarity with the security implications of multi-tenant environments at the hardware, software, and networking levels
  • Familiarity with observability, monitoring, and alerting tools (e.g., Grafana, Prometheus, Loki)
  • Agile / lean software project management experience
  • Strong programming skills with years of experience in various programming languages; familiarity with both object-oriented and functional programming
  • REST API development and integration experience – full-stack web development experience is a plus

What the job involves

  • This Staff Cloud Software Engineer position brings new specialized expertise to the team in distributed high-performance and AI computing, especially in Kubernetes-based cloud-native environments
  • You will be driving design, implementation, and integration of systems to support scaling compute capabilities seamlessly from single-host systems into exaflop-scale clusters
  • Design and drive implementation of distributed systems for AI computing applications in Cloud and novel supercomputing cluster environments
  • Hands-on software development, testing, integration, operations, and support
  • Closely collaborate with the team through the full stack and life cycle of AI data center applications, from data center design and rollout to MLOps
  • Operate within on-premises data centers and public cloud environments
  • Drive projects through their whole software development lifecycle, on both the technical and non-technical sides
  • Collaborate with both highly technical and non-technical stakeholders from differing backgrounds, communicating highly complex topics to diverse audiences
  • Continuous improvement of engineering practices through code reviews and adoption of relevant techniques and technologies

Insights

52% employee growth in 12 months

Company

Funding (last 2 of 7 rounds)

Dec 2024: $700m, Series D

Jun 2024: $300m, Late VC

Total funding: $1.3bn

Our take

Advances in artificial intelligence apps, combined with a global chip shortage, are strong tailwinds for chipmaker Tenstorrent. The semiconductor developer is designing specialized chips for running artificial intelligence applications, building chips for what it calls "software 2.0".

The company’s unique approach is to build for this future software, which it believes will be written at a higher level. Combining this with conditional computing, Tenstorrent supports a scalable distributed network approach that is analogous to a human brain.

While the market for specialised AI hardware is destined to be gigantic, the company has to compete with a plethora of hopeful startups (as well as major tech companies like NVIDIA and Qualcomm). That said, Tenstorrent boasts an excellent trajectory so far, with its latest moves including unveiling its high-end Wormhole AI processors, and a partnership with LG which will see the development of chips for "Affectionate Intelligence".

Kirsty

Company Specialist at Welcome to the Jungle