Senior Engineer, Illumio

Cloud, Observability Lead

$161-185k

AWS
GCP
Python
Go
Terraform
Azure
Prometheus
Grafana
Datadog
Senior level
San Francisco Bay Area

5 days a week in office (Sunnyvale, CA)

Illumio

Automated enforcement against cyberattacks

Open for applications

501-1000 employees

B2B, Enterprise, SaaS, Cyber Security, Automation

Company mission

To enable every organization to realize a future without high-profile breaches.

Role

Who you are

  • Proven experience in a DevOps or observability-focused role, concentrating on production service management and operational excellence
  • Prior experience working with microservices in a production environment is a must
  • At least 5 years of experience managing large numbers of instances in public clouds such as AWS, Azure, and GCP
  • Strong expertise in observability practices and tools (e.g., Prometheus, Grafana, Datadog)
  • Experience enhancing logging, reducing log noise, and integrating critical metrics into services
  • Proficiency in building and managing dashboards and monitoring tools
  • Expertise in setting up and managing PagerDuty alerts, with on-call rotation and escalation management knowledge
  • Strong collaboration skills to work closely with engineering teams, advocating for observability best practices
  • Familiarity with cloud platforms (AWS, GCP, Azure) and modern CI/CD processes
  • Automation scripting or coding experience (Python, Go, or similar)
  • Knowledge of infrastructure-as-code tools (e.g., Terraform, CloudFormation)
  • Excellent problem-solving skills and attention to detail in managing complex systems

What the job involves

  • We are seeking a Senior Engineer with a strong focus on observability to join our Cloud team as the Observability Lead
  • In this role, you will champion initiatives to enhance the reliability, visibility, and operational readiness of our production systems
  • You will collaborate closely with engineers to catalog services, improve logging practices, reduce log noise, and integrate additional metrics across all applications
  • Additionally, you will develop runbooks, build dashboards, and manage PagerDuty configurations and escalation workflows
  • Serve as an advocate for observability practices within the engineering team, promoting operational best practices and reliability
  • Catalog all production services, documenting critical details for operational visibility and management
  • Collaborate with engineering teams to develop and implement a comprehensive observability plan, ensuring metrics are integrated into all services
  • Enhance logging practices where needed, reduce log noise, and ensure meaningful insights are captured
  • Add and refine metrics across applications to improve operational visibility and performance tracking
  • Develop detailed runbooks for critical alerts and incidents, facilitating efficient response processes
  • Build and maintain dashboards that offer insights into SLAs, performance, and business metrics for engineering and product teams
  • Set up and manage PagerDuty alerts, define on-call duties, and establish incident escalation paths
  • Continuously improve alerting, logging, and monitoring processes to enhance service reliability and reduce unnecessary noise

Insights

1% employee growth in 12 months

Company

Company benefits

  • Medical, Dental, Vision Coverage
  • Health and Dependent Savings Accounts
  • Life and Disability Programs
  • Paid Parental Leave
  • Voluntary Benefit Programs
  • Company Sponsored Wellness Program
  • Wellness Reimbursement Program
  • Retirement Savings
  • Equity Opportunities
  • Paid time off and Paid Holidays
  • Employee Incentive Program

Funding (last 2 of 6 rounds)

Jun 2021

$225m

SERIES F

Feb 2019

$65m

SERIES E

Total funding: $562.1m

Our take

Cybersecurity remains an ongoing battle against evolving cyberthreats and ever-more determined cybercriminals, and major data breaches such as the SolarWinds hack have the potential to cause widespread disruption. Illumio takes a different approach from most mainstream security companies, using a process called "microsegmentation". This makes it easier to guard companies against security breaches and to contain any breaches that occur.

Detection of viruses is no longer considered sufficient to keep companies and individuals safe. Illumio takes a "zero trust" approach, in which everything is viewed as suspicious, similar to the approach used by the White House. The company operates at the cutting edge of microsegmentation, focusing on protecting cloud datacenters and workloads. With more and more companies moving to the cloud, this approach is crucial.

Illumio also protects endpoints, partnering with big players such as Citrix, Amazon Web Services, and Splunk. A collaboration with Appgate, a Zero Trust secure access company, delivers the industry's first integrated Zero Trust Network Access and Segmentation solution, which can reduce risk across hybrid infrastructure and demonstrates Illumio's dedication to innovating in the space. Additionally, the company announced a Federal Risk and Authorization Management Program (FedRAMP) that is tailored to assist federal agencies in reducing risk.

Kirsty

Company Specialist at Welcome to the Jungle