- Work on GenAI, large‑scale model training, and GPU performance optimization
- Exposure to multi‑node, multi‑GPU systems and low‑level optimization
About Our Client
The client is a global technology manufacturer recognized for innovation in advanced systems, AI, and high‑performance computing. With a strong commitment to research, sustainability, and engineering excellence, they provide an environment where highly technical engineers can solve complex, real‑world problems at scale.
Job Description
- Architect and execute large‑scale model training and fine‑tuning on multi‑node, multi‑GPU clusters
- Optimize training and inference performance using distributed strategies (DDP, FSDP, DeepSpeed, Megatron‑LM)
- Design and develop autonomous AI agents for complex, multi‑step manufacturing workflows
- Profile and analyze GPU‑intensive workloads to identify compute, memory, and latency bottl...