CUDA Error with VSS + CV Pipeline on 4x L40S

Please provide the following information when creating a topic:

  • Hardware Platform (GPU model and numbers)

4x L40S GPUs

  • CPU Specs

2x Intel(R) Xeon(R) Gold 6448Y

  • System Memory

500 GB

  • Ubuntu Version

22.04.5 LTS

  • Kubernetes Version

MicroK8s v1.32.3 revision 8148

  • NVIDIA GPU Driver Version (valid for GPU only)

565.57.01

  • Nvidia GPU Operator Version

24.6.2

  • Issue Type( questions, new requirements, bugs)

CUDA error; please see the logs attached below.
error_memory_custom_cv.txt (11.3 KB)

  • How to reproduce the issue ? (This is for bugs. Including the command line used and other details for reproducing)

I have attached the overrides_cv.yaml file used to recreate the setup.
overrides_cv.txt (4.8 KB)
When we run summarization on a 20 sec video with 5 sec chunking, it runs fine.
But if we use a 1 min video with any chunking, it fails with a CUDA error (see error_memory_custom_cv.txt).
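For reference, the summarization request we run looks roughly like the sketch below. The endpoint and field names (notably chunk_duration and the placeholder model name) are assumptions about the VSS REST API and may differ per deployment:

  curl -s -X POST http://<vss-host>:<port>/summarize \
    -H "Content-Type: application/json" \
    -d '{"id": "<uploaded-file-id>", "prompt": "Summarize the video", "model": "<model-name>", "chunk_duration": 5}'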

  • Requirement details (This is for new requirement. Including the logs for the pods, the description for the pods)

Please look into the issue.

Could you first try disabling CV and running again?

Hi yuweiw,

Everything works fine with CV disabled.
Everything works fine with CV enabled and a small 20 sec video.

The out-of-memory error only occurs with CV enabled and a 1 min video.

We require the CV pipeline feature, as it gives better results for our use case.

The CV pipeline requires more models and GPU resources. Could you try using 8x L40S or 4x H100 (80 GB)?

Hi yuweiw,

We can’t increase the compute; we are limited to 4x L40S.

Can you take a look at the overrides_cv.yaml file and suggest any optimizations that could help us run the CV pipeline on 4x L40S?

You can try the approaches below. However, there is no guarantee that the deployment will succeed.

  1. Add the environment variables below to the NIM configuration:

  - name: NIM_LOW_MEMORY_MODE
    value: "1"
  - name: NIM_RELAX_MEM_CONSTRAINTS
    value: "1"

  2. Run the nvidia-smi command to check the utilization of resources, then allocate the resources more reasonably (see the monitoring example below).
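For example, the following refreshes per-GPU memory and utilization every 2 seconds while a request is being processed (standard nvidia-smi query options only):

  watch -n 2 nvidia-smi --query-gpu=index,name,memory.used,memory.total,utilization.gpu --format=csv

This shows which of the four L40S GPUs (48 GB each) is running closest to its memory limit when the CV pipeline is active.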

Even if it can be successfully deployed, the service will be extremely slow.

Hi yuweiw,

We reduced NUM_CV_CHUNKS_PER_GPU from 2 to 1, which let us run larger videos without hitting the CUDA out-of-memory error.

vss:
  applicationSpecs:
    vss-deployment:
      containers:
        vss:
          env:
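          # Reduced from 2 to 1; this avoids the CUDA out-of-memory error on 4x L40S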
          - name: NUM_CV_CHUNKS_PER_GPU
            value: "1"