[updated] latency characteristics for the SDR Mellanox card MT25204

[update] This was a PCI contention issue. The customer's original code and test cases did not tickle this performance feature. Their next code did. ConnectX was designed to handle codes of the latter type.
It's also quite dangerous to take as gospel any of the output of diagnostic programs without their context. And if you are a vendor, and you have a customer reporting things like this, have a careful look at what they are doing. There may be a signal there … just don’t get caught up in the number without its context. I allowed that to happen to me today without realizing it. We serve our customers best when we bring a critical mind and rigor to their issues. I did that with most of our customers today …
In short: The Mellanox SDR cards are fine. No issues. They work as intended.
[end update]

  1. Sounds very strange to me. Never saw it on server platforms. It is probably a bad system setting.

  2. @Gilad
    We are seeing it quite consistently across a wide range of hardware (4 different platforms). About to check a 5th out.

  3. Ok, we understand it now. It's a PCIe contention issue. These cards weren’t designed for many threads contending for the resources at once. Quadrupling the contention results in 25x worse latency, and an order of magnitude less bandwidth.
    Cards worked fine for their other app (less contention).

