A successful Site Reliability Engineer will have:
Experience
• At least 3 years of hands-on experience running AWS production systems at
scale
• Proven expertise with AWS EKS (Elastic Kubernetes Service) or similar and MSK
(Managed Streaming for Apache Kafka) in production environments, as well as
database performance diagnostics (MySQL, Postgres, MongoDB) on multi-TB databases
• Strong background in Infrastructure as Code, preferably with Pulumi using
TypeScript or equivalent Terraform experience
• Demonstrated experience participating in incident management (ideally as an
incident commander with a track record of leading post-mortem processes)
• Experience with high-volume data processing systems, ideally IoT telemetry or
streaming pipelines processing ≥50k messages per second
• Background in implementing and maintaining observability solutions using
Prometh...