Explanation of contact parameters and relation to GPU memory usage


I am using ASE/AMP for a grasping task. My problem is that the GPU (11 GB) runs out of VRAM very quickly: usage grows from 4 GB initially to more than 11 GB.

  1. Why does the memory increase over time? My guess is that as training progresses the model learns to grasp, which causes more contacts/collisions and in turn increases GPU memory usage. Is that correct?
  2. In general, how can I find out why the GPU is running out of memory?
    • For instance, I have been logging the total GPU memory and torch.cuda.memory_allocated. I observed that torch.cuda.memory_allocated stayed constant, so I assume it must be Isaac Gym that is using more memory. Can I somehow check what causes this memory increase?
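
To make that comparison explicit, here is a minimal sketch of the kind of logging I mean. It assumes a PyTorch version that provides torch.cuda.mem_get_info; the helper names non_torch_bytes and gpu_memory_snapshot are made up for illustration. The device-wide number also includes the CUDA context and any other process on the GPU, so the remainder is only an upper bound on what the simulator itself holds.

```python
def non_torch_bytes(device_used_bytes, torch_reserved_bytes):
    """Bytes in use on the device that PyTorch's caching allocator does not hold."""
    return max(device_used_bytes - torch_reserved_bytes, 0)


def gpu_memory_snapshot(device=0):
    """Return a dict of memory counters (in bytes) for one CUDA device."""
    import torch  # imported lazily so non_torch_bytes works without a GPU

    # Device-wide view: free and total memory as reported by the CUDA driver.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    used_bytes = total_bytes - free_bytes

    # PyTorch view: live tensors vs. everything the caching allocator holds.
    allocated = torch.cuda.memory_allocated(device)
    reserved = torch.cuda.memory_reserved(device)

    return {
        "device_used": used_bytes,
        "torch_allocated": allocated,
        "torch_reserved": reserved,
        "outside_torch": non_torch_bytes(used_bytes, reserved),
    }
```

Calling gpu_memory_snapshot() every few training iterations and plotting "outside_torch" over time should show whether the growth happens outside PyTorch's allocator.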

Additionally, I struggle to fully understand the following contact parameters, as the documentation is very brief.

  1. contact_offset - shapes whose distance is less than the sum of their contactOffset values will generate contacts.
  2. max_gpu_contact_pairs - Maximum number of contact pairs
    • How much memory does one contact pair use? How can I compute the GPU memory used for contact pairs based on this parameter?
    • Is the entire memory for contact pairs preallocated? When I increase this parameter and then start the simulation, the GPU uses more memory from the start, yet memory still increases during training.
  3. default_buffer_size_multiplier - will scale additional buffers used by PhysX.
    • What kind of buffers?
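
Lacking documentation, the only way I can think of to estimate the per-pair cost is empirical: start the simulation twice with different max_gpu_contact_pairs values, read the GPU memory right after startup, and divide the difference. The sketch below is just that arithmetic. It assumes the preallocated memory scales linearly with the parameter, which I have not confirmed, and the function names are made up for illustration.

```python
def bytes_per_contact_pair(pairs_a, startup_bytes_a, pairs_b, startup_bytes_b):
    """Estimate bytes per contact pair from two startup memory readings.

    Assumes (unverified) that startup memory grows linearly with
    max_gpu_contact_pairs and that nothing else changed between runs.
    """
    if pairs_a == pairs_b:
        raise ValueError("the two runs must use different max_gpu_contact_pairs")
    return (startup_bytes_b - startup_bytes_a) / (pairs_b - pairs_a)


def contact_buffer_bytes(max_gpu_contact_pairs, per_pair_bytes):
    """Projected contact-pair buffer size for a given parameter value."""
    return max_gpu_contact_pairs * per_pair_bytes
```

If the estimate turns out to be stable across several pairs of runs, contact_buffer_bytes would give the expected startup cost of a new setting; it would not explain the growth during training.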

I would appreciate any help.