Business Unit
What the Role Entails
You will support the reliability, scalability, and security of Tencent’s business-critical systems in a cloud-native environment:
System Monitoring & Incident Response
- Monitor production systems using tools like Prometheus/Grafana; identify and troubleshoot outages.
- Participate in on-call rotations to resolve real-time incidents (with mentor guidance).
Automation & DevOps Practices
- Develop scripts (Python/Shell) to automate deployment, scaling, and recovery tasks.
- Assist in CI/CD pipeline optimization using GitLab, Docker, and Kubernetes.
Infrastructure Optimization
- Analyze system performance metrics; propose solutions to enhance reliability and cost efficiency.
- Support cloud infrastructure management (Tencent Cloud/AWS/Azure).
Collaboration & Documentation