Position Overview:
We are seeking an experienced AI Network Engineer to support and optimize high-performance infrastructure powering AI/ML workloads. This role focuses on designing and maintaining GPU-accelerated environments leveraging NVIDIA technologies, high-throughput networking, and low-latency architectures.
Key Responsibilities:
+ Design, implement, and support high-performance networks for AI/ML workloads, including GPU clusters and distributed training environments
+ Deploy and optimize NVIDIA-based infrastructure (DGX systems, HGX platforms, or GPU clusters)
+ Configure and manage high-speed networking technologies such as InfiniBand, RoCE, and 100/200/400 Gb/s Ethernet
+ Optimize network performance for east-west traffic, delivering the low latency and high throughput that AI model training requires
+ Integrate the NVIDIA software stack (CUDA, NCCL, GPU Cloud, AI Enterprise) with networking and compute environment...