How to realloc memory in GPU?

Greetings to the programming community,
I am writing to seek your assistance and expertise on a topic related to CUDA programming. I have been exploring the CUDA platform and am currently working with an NVIDIA GeForce GTX 1060 with Max-Q Design GPU on my laptop.

How to realloc memory in GPU? Is it possible?
For instance, I have some code:
import numpy as np
from numba import cuda

@cuda.jit
def kernel_plus_1(d_inp, d_out):
    nr = cuda.blockIdx.x
    nc = cuda.threadIdx.x
    d_out[nr, nc] = d_inp[nr, nc] + 1

qr = 512
qc = 512
arr = np.arange(qr * qc, dtype='int64').reshape(qr, qc)

d_inp = cuda.to_device(arr)
out = np.zeros_like(arr)
d_out = cuda.to_device(out)

blocks = int(arr.shape[0])
threads = int(arr.shape[1])
kernel_plus_1[blocks, threads](d_inp, d_out)
out = d_out.copy_to_host()

I wish to change the dimensions of the array d_out on the GPU. What should I use to do that?
def kernel_func(d_inp, d_out):

I would greatly appreciate any insights, explanations, or guidance you could provide regarding the maximum thread and memory limits of the device, and any functions or commands that can help determine them.


It’s not possible

In the future, when posting code, please format it properly. You can do this by editing your post by clicking the pencil icon below it, then select the code, then click the </> button at the top of the edit window, then save your changes. It’s also preferred that post subject titles be somehow representative of the topic of your post, not a linkedin URL.

I assume we are talking about using reallocation to extend the size of an existing allocation rather than shrinking it.

You will need to do manually what realloc() does behind the scenes when there is no adjacent free block of sufficient size available to extend the existing allocation: malloc() a new, larger, block, memcpy() the content of the currently allocated block to the new block, then free() the old block. Obviously, in the context of CUDA, that corresponds to a sequence of cudaMalloc(), cudaMemcpy(), and cudaFree().

Generally speaking, I consider reallocation to be of questionable utility, and I cannot recall an instance of using it in production code over forty years of programming with C/C++. That is anecdotal evidence, so there may be use cases where reallocation is very much appropriate and common.

Which simply cannot be done by code that begins at this location:

I am not disagreeing, and my post was meant to augment rather than to contradict. I am simply focusing on what can be done instead of what cannot be done, or putting it a different way, the “next best thing”.

Thanks a lot, your advice helped me a great deal.
By the way, perhaps you know whether it is possible to download CUDA on macOS?
I visited the page:

There are no options to download CUDA for macOS on that page.
Regards, Yury

CUDA stopped supporting macOS a number of years ago (somewhere around 2018). See here.