About the Role
This role focuses on designing, building, and operating secure, scalable cloud infrastructure to support data platforms across AWS, Databricks, and Snowflake. You will work closely with engineering and security teams to enable reliable, observable, and cost‑efficient systems. The position plays a key role in improving platform performance, resilience, and operational excellence.
Responsibilities
- Design, build, and manage cloud infrastructure using Infrastructure as Code to keep environments consistent and auditable
- Operate and enhance observability using Grafana, including dashboards, alerts, and noise reduction
- Implement and validate backup and disaster recovery processes, ensuring recovery readiness and alignment with RPO/RTO targets
- Troubleshoot incidents across infrastructure and data platforms, leading root cause analysis and mitigation
- Optimize performance and cost through usage analysis, rightsizing, and architectural...