Description
Hi, I tried running the Stable Diffusion demo on a V100 16G. The following error occurs:
python3 demo_txt2img.py "a beautiful photograph of Mt. Fuji during cherry blossom" --hf-token=$HF_TOKEN -v
Loading TensorRT engine: engine/vae.plan
[I] Loading bytes from engine/vae.plan
[E] 1: [defaultAllocator.cpp::allocate::21] Error Code 1: Cuda Runtime (out of memory)
[W] Requested amount of GPU memory (20401098752 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[E] 2: [executionContext.cpp::ExecutionContext::436] Error Code 2: OutOfMemory (no further information)
Traceback (most recent call last):
  File "demo_txt2img.py", line 83, in <module>
    demo.loadResources(image_height, image_width, batch_size, args.seed)
  File "/workspace/mydata/tensorrt-sd/TensorRT/demo/Diffusion/stable_diffusion_pipeline.py", line 151, in loadResources
    self.engine[model_name].allocate_buffers(shape_dict=obj.get_shape_dict(batch_size, image_height, image_width), device=self.device)
  File "/workspace/mydata/tensorrt-sd/TensorRT/demo/Diffusion/utilities.py", line 234, in allocate_buffers
    self.context.set_binding_shape(idx, shape)
AttributeError: 'NoneType' object has no attribute 'set_binding_shape'
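For scale, the allocation size in the warning above can be converted to GiB with a quick sketch. The 16 GiB figure is an assumption based on the card's nominal capacity; the memory actually free at load time will be lower.

```python
# Convert the allocation size from the TensorRT warning into GiB.
requested_bytes = 20401098752  # from the "[W] Requested amount of GPU memory" line
nominal_gpu_gib = 16           # assumption: nominal V100 16G capacity, not measured free memory

GIB = 1024 ** 3
requested_gib = requested_bytes / GIB
shortfall_gib = requested_gib - nominal_gpu_gib

print(f"requested: {requested_gib:.2f} GiB")                 # requested: 19.00 GiB
print(f"over nominal capacity by: {shortfall_gib:.2f} GiB")  # over nominal capacity by: 3.00 GiB
```

So the VAE engine asks for about 19 GiB in a single allocation, which cannot fit on this card even when nothing else is loaded.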
Then I added the --build-static-batch flag at the end of the command. The program ran without errors, but I got an all-black image.
python3 demo_txt2img.py "a beautiful photograph of Mt. Fuji during cherry blossom" --hf-token=$HF_TOKEN -v --build-static-batch
Does the SD demo not work on V100 now? Could you give me some suggestions?
Thank you!!
Environment
TensorRT Version: 8.6
GPU Type: V100 16G
Nvidia Driver Version: 530.30
CUDA Version: 12.1
CUDNN Version: 8
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Container (docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.02-py3 /bin/bash)