Suggested cable to link two Sparks?

Raphael – Thanks for that.

I’ve actually seen several postings, whether on Medium or the Nvidia forums, that advise on how to hook up two Sparks and/or run vLLM on Sparks. For example:

This last one worked for me pretty much out of the box, but token generation ran at only about 3 tokens per second (tps). Mark Ramsey, the author, makes contradictory statements about speed in several files in the repository: in one place he says 3 tps is the best you can do on Llama3.3-70B, and in another he says 50-100 tps. In the main README he says:

Performance expectations:

  • With InfiniBand/RoCE: 50-100 tokens/s for Llama-3.3-70B

  • With Ethernet fallback: <10 tokens/s (indicates configuration problem)
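
As a rough way to compare a measured rate against those README figures, here is a minimal Python sketch. The function names and labels are hypothetical (they are not from the repository), and the thresholds are simply the two numbers quoted above; the README says nothing about rates between 10 and 50 tps, so the sketch reports that band as unspecified.

```python
def tokens_per_second(num_tokens: int, elapsed_s: float) -> float:
    """Generation rate from a token count and wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return num_tokens / elapsed_s


def classify(tps: float) -> str:
    """Label a rate using only the two thresholds quoted from the README.

    The labels are illustrative names chosen for this sketch.
    """
    if tps < 10:
        # README: "<10 tokens/s (indicates configuration problem)"
        return "ethernet-fallback"
    if tps < 50:
        # The README gives no expectation for this band.
        return "unspecified"
    if tps <= 100:
        # README: "50-100 tokens/s" with InfiniBand/RoCE
        return "rdma-range"
    return "above-range"


if __name__ == "__main__":
    # Example: 300 generated tokens in 100 seconds is 3 tps.
    rate = tokens_per_second(300, 100.0)
    print(rate, classify(rate))
```

By these numbers, the roughly 3 tps I measured falls in the band the README attributes to Ethernet fallback.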

I filed an issue on the GitHub repository asking for clarification, but so far there has been no response. I’m trying to determine whether I have a configuration problem caused by having tried several of these postings at one time or another; perhaps they gave inconsistent advice.

I will investigate Doran Gao’s instructions next.