Site Reliability Engineer, PEXA

£70-75k

AWS
Kubernetes
Terraform
Splunk
Azure
Prometheus
Grafana
Git
Mid and Senior level
Remote from UK
PEXA

Online property settlement, tracking and insights

Job no longer available

21-100 employees

Fintech, B2B, Real Estate, SaaS

Company mission

To transform property experiences.

Role

Who you are

  • Experience with distributed systems in AWS and/or Azure cloud environments
  • Bring a developer mindset to platform challenges, understanding how software and infrastructure are designed, implemented, and integrated
  • Strong knowledge of container orchestration and scaling, with experience in managing and troubleshooting workloads
  • Experience of managing Kubernetes clusters, service mesh and hosted workloads
  • Proficient in observability and monitoring tools, including configuring alerts, creating dashboards, and conducting root cause analysis. Some of the tools we use are: Grafana, Prometheus, Elastic, Splunk
  • Experience configuring incident management platforms such as PagerDuty
  • Hands-on experience with Infrastructure-as-Code (IaC) and automation to improve operational efficiency, using tools like Terraform, Bicep or CloudFormation
  • Strong understanding of modern SDLC and CI/CD processes, with experience in scripting, automation and version control systems such as Git
  • Experience collaborating in a DevSecOps setting, upholding security best practices and compliance standards, with an understanding of security frameworks such as the Azure or AWS Well-Architected Frameworks
  • Experience in high availability (HA) and disaster recovery (DR) strategies and execution
  • Adept at collaborating with diverse teams across cultures and working effectively under pressure
  • Empathetic team player who builds strong relationships, tackles challenges, and delivers results while maintaining quality and team morale
  • Strong understanding of Agile principles, excellent communication skills, and a customer-centric mindset

What the job involves

  • The Site Reliability Engineer is responsible for the technical support and operation of UK Platforms (from both an application and an infrastructure perspective) by actively managing all incidents to resolution and supporting software releases
  • The role endeavours to make sure that PEXA Group’s support offering for our platform adheres to the highest level of operational and security requirements while delivering a seamless and secure support experience to our customers
  • The role is also responsible for additional activities including (but not limited to) application (e.g. SWIFT SILs), OS and infrastructure patching, DR testing, creation of alerting and monitoring, and service transition activities such as knowledge transfer and updates to operational playbooks and knowledge articles
  • The SRE will collaborate closely with the customer support team and the product development squads in various global locations to achieve the best outcome for the technical support of PEXA’s customers and integrated partners. The SRE will also work closely with PEXA AU run teams to stay aligned with PEXA’s strategic direction of creating a consistent, “best in class” support experience for PEXA’s customers globally
  • Overall, this role follows through on the vision and execution of the technical support function, and is the contact point for technical incidents and for the support teams
  • Ensure high availability and reliability of UK platforms with day-to-day support
  • Manage incidents with rapid resolution, root cause analysis, and post-mortems to prevent recurrence
  • Optimise monitoring and alerting to enable proactive issue detection and fast response
  • Identify process improvements and suggest service management enhancements for long-term stability
  • Report problems, risks, issues, and change requests to minimise downtime
  • Coordinate resolution and escalation of Platform Services issues, fostering cross-team collaboration
  • Manage the Production environment, overseeing incidents, fixes, performance, and stability
  • Drive continuous improvement by automating processes and enhancing operational performance
  • Help define the cloud platform service roadmap to enhance system reliability
  • Collaborate with UK Support and Delivery Squads to address pain points and add value
  • Assist squads in estimating and resolving Platform Defects that cause incidents
  • Oversee application, OS, and infrastructure patching, DR testing, monitoring setup, knowledge transfer (KT), and updates to operational playbooks and knowledge articles

Company

Our take

Even as industries embrace the digital revolution, some practices lag behind. For example, property exchange and remortgaging still involve a host of paperwork and back-and-forth between parties. PEXA supplies an eConveyancing platform to automate and simplify these processes, improving accuracy and efficiency while reducing costs.

PEXA was founded in 2010 to bring automation to the Australian property market, before expanding to the UK. The company created its eConveyancing service from scratch, including developing the technology behind it, and can serve property exchange, purchase, development and more. The PEXA UK division, however, is currently focussed solely on its novel remortgaging solution, as the volume of UK remortgages is expected to escalate rapidly with rising interest rates.

PEXA has benefitted from the pandemic-instigated digital transformation, as well as the similarly fuelled property market boom, which allowed the company to grow rapidly in Australia. Having brought its platform to the much larger UK property market, the company has a chance to demonstrate the value of its digital property settlement solution to a new range of customers, and the potential for massive growth.

Freddie

Company Specialist at Welcome to the Jungle