Site Reliability Engineering Lead - London

Greater London, South East, England

Apply by 29 Apr 2026

£90,000 - £110,000 per annum

Job Ref.: BH-56921-1

Job Description

We’re looking for a true SRE leader with a strong software engineering background. This isn’t a DevOps “on-call only” role — you’ll need to be comfortable reading and writing production code, deeply understanding application behaviour, and working alongside developers as a technical peer.

You’ll lead and mentor the SRE team, setting direction and raising the bar for reliability across our systems. You’ll take end-to-end ownership of production, ensuring availability, performance, and effective incident response, while defining SLIs and partnering with Product on meaningful SLOs and error budgets.

In practice, that means you’ll:

Own production systems (availability, performance, incident response)
Define SLIs/SLOs and use error budgets to guide decisions
Run incident management, on-call, and blameless postmortems
Get hands-on with code (PHP, Java/.NET) to troubleshoot and improve reliability
Drive automation and reduce operational toil
Build observability that gives real insight into system health
Partner with engineers to embed reliability into the SDLC

A big part of the role is shaping culture — creating a blameless environment, improving how we respond to incidents, and driving continuous, systemic improvements. You’ll also lead on capacity planning, performance optimisation, and cost efficiency as the platform scales.

We’re looking for someone who brings strong technical leadership, communicates clearly (especially during incidents), and takes real ownership of problems through to resolution. You should be comfortable operating at scale, have deep experience with SLIs/SLOs, incident management, and observability tooling, and be at home working with Linux, databases, cloud platforms (ideally Azure), Kubernetes, and Infrastructure as Code.

Just as importantly, you should enjoy tackling complex, imperfect systems and turning them into something reliable, scalable, and well-understood.

