SRE | Permanent | London, Hybrid, AWS

London, United Kingdom (Britain / UK)

Apply by 24 Aug 2026

UK £90,000

Job Ref.: 56925

Job Type: Permanent

Job Description

About the job

  • Role: Site Reliability Engineer
  • Type: Full-time permanent role
  • Location: Hybrid, London City - 3 days per week on-site
  • Salary: £90,000 per annum
  • Industry: Technology - Gaming Platforms 

Our client is a premier provider of high-volume software solutions for the global iGaming and predictive analytics sector. With a footprint spanning the USA, UK, and Europe, they partner with industry leaders to engineer sophisticated platforms for sports wagering, prize-based systems, and complex market simulation environments. Their vision is to lead the evolution of interactive technology through intelligent, data-driven architecture that ensures seamless user experiences. The firm is driven by a culture of teamwork, transparency, and technical excellence.

The role

You will help shape and drive how the firm builds and operates reliable, observable, secure, and cost-efficient systems on AWS. Working closely with development, platform, and incident management teams, you will define reliability in measurable terms and build the tooling and processes to achieve it, improving platform speed, stability, and scalability.

Key responsibilities

  • Partner with engineering teams to define, measure, and manage SLOs/SLIs, using error budgets to guide delivery decisions.
  • Enhance observability across services (metrics, logs, traces) to detect and resolve issues proactively.
  • Lead cost optimisation: monitor spend, right-size workloads, tune autoscaling, and improve infrastructure efficiency.
  • Improve production readiness via pre-deployment checks, post-release validation, and robust platform guardrails.
  • Introduce and run chaos engineering experiments to strengthen resilience and recovery.
  • Automate operational processes to reduce manual intervention and toil across the stack.
  • Support major incident response, root-cause analysis, and continual improvement actions.
  • Collaborate cross-functionally to raise standards for stability, security, performance, and compliance.

Required skills & experience

  • 3+ years’ experience in SRE, Platform, or DevOps roles within production environments.
  • Strong Kubernetes operational experience (on-prem and AWS EKS).
  • Hands-on experience defining and operating SLOs/SLIs, alerting, and incident workflows.
  • Deep understanding of observability and telemetry (monitoring, logging, tracing).
  • Infrastructure as Code with Terraform; experience with GitOps workflows and CI/CD.
  • Scripting proficiency in Python, Bash, or Go.
  • Proven ability to balance cost efficiency with reliability and performance.
  • Excellent communication skills and the ability to work effectively across multiple teams.

Strong desirables for this role

  • Experience running chaos engineering experiments.
  • Exposure to high-throughput, low-latency systems.
  • FinOps knowledge or cost management practices.
  • AWS certifications (e.g., Solutions Architect, DevOps Engineer).