Binary Technology Development Pte. Ltd. | Singapore, Singapore | Posted June 05, 2026
Position Overview
Responsibilities:
- Handle production incidents and conduct post-mortem analysis to improve system stability
- Design, deploy, monitor, and troubleshoot Kafka and Redis clusters in the production environment, ensuring optimal performance and reliability
- Work closely with development teams to ensure seamless deployment of applications or systems
- Manage and optimize cloud infrastructure (AWS, Alicloud) for performance, cost, and reliability
- Develop DevOps platforms such as online load testing and change management systems
- Leverage LLMs or AI frameworks (OpenAI, Dify, Agno, LangChain) to enhance automation in infrastructure operations, including intelligent alert triage, RCA (Root Cause Analysis), and chat-based operations (ChatOps)
- Continuously explore and integrate AI-driven insights into operational processes to improve reliability, reduce noise, and empower engineering teams with intelligent decision-making

Qualifications:
- 5+ years of hands-on experience in Kafka and Redis operations in large-scale ...