Duration: 12 months
Location: Toronto
Hybrid: 2 days in office a week
Lead Site Reliability Engineering (SRE) initiatives and manage the SRE WCCS roadmap, ensuring end‑to‑end system stability and performance across on‑premise and cloud‑native services.
Deep application and system‑level knowledge across complex end‑to‑end environments with tightly integrated services, supporting large‑scale, multi‑tier transaction flows.
Hands‑on role from day 1, including instrumenting, analyzing, and troubleshooting complex distributed applications using APM and observability platforms (Dynatrace or comparable tools).
Provide assessments of current capabilities, identify gaps, and contribute to the SRE WCCS roadmap.
Navigate multi‑team SRE and IT Ops to drive results and provide creative workarounds and solutio...