Hello,
I am using a Quadro RTX 6000, and I have a problem with generative AI workloads.
I have run into at least two cases where my card could not handle things that other cards with less VRAM handle fine.
My card has 24 GB of VRAM, yet it cannot handle BF16 and therefore cannot run some scripts.
I am specifically using the ComfyUI program.
Many current models are distributed in BF16 or require BF16, for example the “PULID FLUX” model and the latest AI video generation model “MOCHI”.
I talked with the creator of ComfyUI, and he suggested modifying the ComfyUI code to allow float16, so we commented out the original line and added float16 to the list of supported inference dtypes:
# original line, commented out:
#supported_inference_dtypes = [torch.bfloat16, torch.float32]
# modified line, with torch.float16 added:
supported_inference_dtypes = [torch.float16, torch.bfloat16, torch.float32]
The result was that ComfyUI was able to run the BF16 model (in float16), but the generated video was corrupted at the end.
__
My guess: since the card cannot handle BF16, it falls back to FP32, which makes the generation extremely slow and memory-inefficient, and ultimately the card fails at the generation, whereas a lower-VRAM card that can handle BF16 natively does not fail (see the quick check sketched below).
Which is a shame, isn't it?
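For what it's worth, here is how I checked what the card itself reports from PyTorch. This is just a generic PyTorch sketch, not anything from the ComfyUI code, and as far as I understand hardware BF16 only starts with Ampere (compute capability 8.0), while the Quadro RTX 6000 is Turing (7.5):

import torch

# Sketch only: ask PyTorch what this GPU reports (assumes a CUDA build of PyTorch)
dev = torch.cuda.current_device()
major, minor = torch.cuda.get_device_capability(dev)
print("GPU:", torch.cuda.get_device_name(dev))
print("Compute capability:", f"{major}.{minor}")               # Turing cards like the Quadro RTX 6000 report 7.5
print("BF16 usable per PyTorch:", torch.cuda.is_bf16_supported())  # result can depend on PyTorch version; hardware BF16 starts with Ampere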
__
Is there an upgrade, driver, or workaround that can make my card work with BF16? Can you do that, NVIDIA, please?
I mean, this is not a bad card, and it is still in use. It has 24 GB of VRAM after all. It deserves an update to handle this problem.
__
As for the other AI model, “PULID FLUX”, ComfyUI simply shows this error:
RuntimeError: expected scalar type Half but found BFloat16
Please help.
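From what I understand, that error just means two tensors with different dtypes (float16 “Half” and bfloat16) meet in the same operation. Here is a minimal sketch of my understanding, not actual PuLID or ComfyUI code; the tensors a and b are made up for illustration:

import torch

# Sketch only: mixing Half and BFloat16 in one op fails
a = torch.randn(4, 4, dtype=torch.float16)   # "Half"
b = torch.randn(4, 4, dtype=torch.bfloat16)  # "BFloat16"

try:
    a @ b                                    # matmul refuses mixed dtypes; the exact message varies by op and PyTorch version
except RuntimeError as e:
    print(e)

# Casting everything to one dtype (float16, which my card does support) avoids the error
out = a @ b.to(torch.float16)
print(out.dtype)                             # torch.float16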
By the way, I don’t know if this is the right place to post this or not.
I tried “pip install nvidia-cudnn-cu12”, and my card still hits the same restrictions with the Mochi AI model generation (the screenshot above).
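In case it helps with diagnosing this, these are the plain PyTorch calls I used to see which CUDA/cuDNN build my environment is actually using (nothing ComfyUI-specific):

import torch

print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("GPU:", torch.cuda.get_device_name(0))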
What can be done, please? And please tell me if there are other subforums where I should post this. Thanks.