Simple question, but I can't find the answer. Maybe somebody knows?
The documentation says that each VL has its own send/recv buffers.
Does this mean that when I use 1, 2 or more VLs, the buffers of the other, unused VLs sit idle (are not used at all)?
Or are the VL buffers a shared resource that is divided among the VLs in use?
I tried to test this with 4 parallel runs (qperf rdma_rc_write).
I ran 4 tests without specifying an SL, and then 4 tests where each qperf instance used its own SL (SL0-3).
I did not find any performance difference.
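For anyone who wants to repeat the comparison, here is a rough sketch of how the two sets of client commands could be generated. It assumes one qperf server instance per client, each listening on its own port, and it assumes that this qperf build accepts -lp (listen port), -t (duration) and -sl (RDMA service level); please check the qperf man page before relying on those flags. The host name ib-server, the base port and the test name rc_rdma_write_bw are placeholders, not values from my setup.

```python
def qperf_client_commands(server, sls, base_port=19765,
                          test="rc_rdma_write_bw", seconds=10):
    """Build one qperf client command per entry in `sls`.

    An entry of None means "do not pass a service level", i.e. use the default.
    Flag names (-lp, -t, -sl) are assumptions; verify them for your qperf build.
    """
    cmds = []
    for i, sl in enumerate(sls):
        # Each parallel client talks to its own server instance on its own port.
        cmd = ["qperf", server, "-lp", str(base_port + i), "-t", str(seconds)]
        if sl is not None:
            cmd += ["-sl", str(sl)]
        cmd.append(test)
        cmds.append(cmd)
    return cmds


if __name__ == "__main__":
    # Run A: four clients, no SL specified.
    for cmd in qperf_client_commands("ib-server", [None] * 4):
        print(" ".join(cmd))
    # Run B: four clients, one SL each (SL0-3).
    for cmd in qperf_client_commands("ib-server", [0, 1, 2, 3]):
        print(" ".join(cmd))
```

The script only prints the command lines; launching them in parallel (and starting the matching server instances) is left to the shell.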
Can someone explain whether there should be a difference?
Or how the VL buffers are implemented at the hardware level?
I read this article. It is a good article, but I want to clarify: do VLs have dedicated buffers, or is it more like a shared register file partitioned into ranges for the VLs? Do the other VL buffers sit idle if I use only 1 VL?
I think I have now found an indirect answer in this presentation (slide 37, 4.2 Study of switch buffer size).
The presentation studies dedicated per-VL buffers. That indirectly tells me that each VL has its own dedicated buffer, and that if I use only one SL/VL, the buffers of the other VLs are idle.
1 Analyzing InfiniBand Packets https://www.openfabrics.org/images/eventpresos/workshops2015/UGWorkshop/Thursday/thursday_09.pdf
Quick answer to your question: yes, each VL has its own dedicated, pre-allocated buffers.
You can consult this document, which explains in detail the concept of SL/VL and QoS (QoS Architecture) based on our IB Spec.
Our Mellanox OFED User Manual also has a section on QoS Architecture (Quality of Service).
By default, if QoS is not configured, all traffic is sent on SL0 (one buffer).
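To make the dedicated-buffer answer concrete, here is a toy Python model. It is only a sketch: the class names, the buffer size in "credits" and the SL-to-VL tables are all made up for illustration and do not describe any real HCA or switch. It assumes that every SL maps to VL0 when QoS is not configured, and that a packet can be accepted only if the buffer of its own VL has room.

```python
from dataclasses import dataclass


@dataclass
class VLBuffer:
    """Toy receive buffer for one virtual lane (sizes are illustrative)."""
    capacity: int   # buffer size in credits (hypothetical unit)
    used: int = 0

    def try_accept(self, credits: int) -> bool:
        # A packet fits only if this VL's own buffer has room;
        # free space in other VLs' buffers cannot be borrowed.
        if self.used + credits > self.capacity:
            return False
        self.used += credits
        return True


class Port:
    """Toy port with one dedicated buffer per data VL."""

    def __init__(self, num_vls, credits_per_vl, sl2vl=None):
        self.buffers = [VLBuffer(credits_per_vl) for _ in range(num_vls)]
        # Assumed default (no QoS): every SL lands on VL0.
        self.sl2vl = sl2vl or {sl: 0 for sl in range(16)}

    def receive(self, sl, credits=1):
        return self.buffers[self.sl2vl[sl]].try_accept(credits)

    def idle_vls(self):
        return [vl for vl, buf in enumerate(self.buffers) if buf.used == 0]


if __name__ == "__main__":
    # No QoS: 12 packets on SL0 compete for VL0's buffer only.
    default_port = Port(num_vls=4, credits_per_vl=8)
    accepted = sum(default_port.receive(sl=0) for _ in range(12))
    print(accepted, default_port.idle_vls())   # 8 [1, 2, 3]

    # Hypothetical QoS mapping SL0-3 -> VL0-3: the same 12 packets spread out.
    qos_port = Port(4, 8, sl2vl={0: 0, 1: 1, 2: 2, 3: 3})
    accepted = sum(qos_port.receive(sl=i % 4) for i in range(12))
    print(accepted, qos_port.idle_vls())       # 12 []
```

In this model the buffers of VL1-3 stay empty as long as everything is sent on SL0, which matches the "one buffer" default described above. The model never drains the buffers, so it says nothing about throughput; it only shows which buffers are touched.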