Description: We are looking for a network performance engineer capable of configuring and monitoring RoCEv2 systems. The candidate needs strong technical knowledge of and experience with RoCEv2, TCP, congestion control protocols, PFC, ECN, etc.
The candidate will be required to configure and monitor RoCEv2 networks on both the NIC and switch sides. The candidate needs to compare different RoCEv2 congestion control and flow control options using benchmarks and workloads (which will be provided), assess the impact of each option, explain the observed behavior, and automate performance monitoring and reporting. Experience with NVIDIA GPUs, GPUDirect RDMA, etc. is a plus.
What are the top non-negotiable skill sets required for this position?
Deep hands-on experience with setting up RoCEv2 networks
Automating performance monitoring and reporting