Hello, I am learning CUDA and trying to build neural networks with computation graphs (CUDA graphs).
I have an NVIDIA GeForce GT 710 GPU with 2 GB of memory, compute capability 3.5 (sm_35).
I built a simple network with layer sizes 11-6-6 (plus one bias neuron per layer), and it works well.
Now I want to do MNIST digit recognition with a 784-800-10 network (again plus one bias neuron per layer), but I get this error:
code=701(cudaErrorLaunchOutOfResources) "cudaGraphInstantiate(&graphExec, graph, NULL, NULL, 0)"
With 784-256-10 the graph is created successfully, even if I add many hidden layers, e.g. 784-256-256-256-256-256-10.
But as soon as I try 784-257-10, I get the same error.
How can I find out exactly which resource is exhausted, and where?
Environment: Ubuntu 20.04 LTS, CUDA 11, Qt Creator.
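
To make the question concrete, here is a rough sketch of the check I imagine doing by hand for each kernel node before instantiating the graph. It is plain host-side C++ and only an illustration: the struct and function names (`DeviceLimits`, `LaunchRequest`, `firstExceededLimit`) are hypothetical, and I am assuming the per-block limits would be read with `cudaGetDeviceProperties` and the per-kernel register and static shared memory usage with `cudaFuncGetAttributes`. I do not know whether one of these three limits is the actual cause of my error.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Per-block limits of the device. Assumption: in a real program these are
// copied from cudaDeviceProp (maxThreadsPerBlock, regsPerBlock, sharedMemPerBlock).
struct DeviceLimits {
    int maxThreadsPerBlock;
    int regsPerBlock;
    std::size_t sharedMemPerBlock;
};

// What a single kernel launch asks for. Assumption: regsPerThread and
// staticSharedBytes come from cudaFuncAttributes (numRegs, sharedSizeBytes);
// the other two fields are the launch configuration of the graph node.
struct LaunchRequest {
    int threadsPerBlock;
    int regsPerThread;
    std::size_t staticSharedBytes;
    std::size_t dynamicSharedBytes;
};

// Returns the name of the first per-block limit the request exceeds,
// or an empty string if it fits all three checked limits.
// Simplification: register allocation granularity is ignored, so a launch
// that passes this check can still fail on a real device.
std::string firstExceededLimit(const DeviceLimits& dev, const LaunchRequest& req) {
    assert(req.threadsPerBlock > 0 && req.regsPerThread >= 0);

    if (req.threadsPerBlock > dev.maxThreadsPerBlock) {
        return "threads per block";
    }

    // Use a wide type so the product cannot overflow int.
    const long long regs =
        static_cast<long long>(req.threadsPerBlock) * req.regsPerThread;
    if (regs > dev.regsPerBlock) {
        return "registers per block";
    }

    const std::size_t shared = req.staticSharedBytes + req.dynamicSharedBytes;
    if (shared > dev.sharedMemPerBlock) {
        return "shared memory per block";
    }

    return "";
}
```

For example, with limits of 1024 threads, 65536 registers and 49152 bytes of shared memory per block (the values I believe apply to compute capability 3.5, still to be confirmed on my card), a request for 1024 threads at 128 registers per thread would be reported as exceeding "registers per block". Is there a way to get this kind of answer directly from the runtime for a graph node?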