The limits on matrix dimensions are explicitly published in the specification of the relevant interfaces, via the types of their arguments, e.g.:
cublasOperation_t transa, cublasOperation_t transb,
int m, int n, int k, ...
On all platforms supported by CUDA and CUBLAS, int is a signed 32-bit integer type that can represent values in [-2^31, 2^31-1].
Unless the maximum matrix dimensions are exceeded, or there is not enough GPU memory to hold the matrix, CUBLAS should work fine with a matrix of more than 2^32-1 elements, though I haven't personally tried that, as I don't have a GPU with enough memory to hold an 8+GB matrix. If there is evidence to the contrary, I would consider that a bug, in which case you would want to file a bug report with NVIDIA.
Which CUBLAS function in particular do you observe failing for a matrix with more than 2^32-1 elements, and what are the actual dimensions of that matrix?