Hello,
We’re trying to use the cuTENSOR cutensorContractTrinary function in our Python program, and we wrote a wrapper for it. We have a non-standard contraction pattern, and we noticed a problem with the workspace size estimation.
Here’s our minimal example (please remove the .txt extension, this forum doesn’t allow me to upload .py files):
trinary_contraction.py.txt (6.1 KB)
It fails with RuntimeError: cutensorContractTrinary failed. err=19, and we believe this error code means CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE. When we set ws_size to a much higher value, the error went away, so we think the estimateWorkspaceSize call did not return a correct workspace size estimate.
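To make the workaround concrete, here is a minimal sketch of the "retry with a larger workspace" idea. It is not our actual wrapper: `contract` is a hypothetical callable that takes a workspace size in bytes, runs the contraction, and returns the integer status code, and the value 19 for the insufficient-workspace status is taken from the error we observe, so please treat it as an assumption.

```python
# Status codes as we understand them (assumption based on the err=19 we see).
SUCCESS = 0
INSUFFICIENT_WORKSPACE = 19


def contract_with_growing_workspace(contract, estimated_size, max_size, factor=2):
    """Retry a contraction with a growing workspace.

    `contract` is a hypothetical callable: contract(ws_size) -> int status.
    Starts at the estimated size and multiplies it by `factor` until the
    call succeeds, fails for another reason, or `max_size` is reached.
    Returns the workspace size (in bytes) that worked.
    """
    ws_size = max(int(estimated_size), 1)
    while True:
        status = contract(ws_size)
        if status == SUCCESS:
            return ws_size
        if status != INSUFFICIENT_WORKSPACE:
            # Any other error is not something a larger workspace can fix.
            raise RuntimeError(f"cutensorContractTrinary failed. err={status}")
        if ws_size >= max_size:
            raise RuntimeError(f"workspace still insufficient at {ws_size} bytes")
        ws_size = min(ws_size * factor, max_size)
```

This only hides the symptom, of course. We would much rather get a workspace size estimate that is sufficient in the first place.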
In one of my pip environments I have cutensor-cu12==2.3.1 and cupy-cuda12x==13.6.0. We have tried on an A100 and on a 5080 card, and both reproduce the same problem.
We are not sure whether we are using the cuTENSOR functions correctly. Any advice will be greatly appreciated.
Thanks!