I think your main problem is the “invalid argument” error. When you launch with 512 threads per block, the number of registers nvcc assigned to each thread, multiplied by 512, comes to more than the 8192 registers available to a block, so the launch fails with “too many resources requested for launch”.
Why don’t you post the relevant piece of code so that people here can look at it and correct it?
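For what it's worth, here is a rough sketch of how the register budget could be checked from the host before launching. `myKernel` is only a hypothetical stand-in for your kernel and `fitsRegisterBudget` is a helper name I made up; the check is simplified, since the hardware allocates registers in rounded-up chunks, so treat the result as an estimate.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel under discussion.
__global__ void myKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Simplified check: registers per thread times threads per block must not
// exceed the per-block register limit (allocation granularity is ignored).
bool fitsRegisterBudget(int regsPerThread, int threadsPerBlock, int regsPerBlock)
{
    return regsPerThread * threadsPerBlock <= regsPerBlock;
}

int main()
{
    const int threadsPerBlock = 512;
    const int n = 1 << 20;

    // Ask the runtime how many registers nvcc gave each thread of the kernel.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, myKernel);
    if (err != cudaSuccess) {
        printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Ask the device for its per-block register limit.
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("registers per thread: %d, per-block limit: %d, fits: %s\n",
           attr.numRegs, prop.regsPerBlock,
           fitsRegisterBudget(attr.numRegs, threadsPerBlock, prop.regsPerBlock)
               ? "yes" : "no");

    float *d_data = 0;
    err = cudaMalloc((void **)&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Launch, then read the launch status immediately so a bad configuration
    // is reported here instead of at some later, unrelated call.
    myKernel<<<(n + threadsPerBlock - 1) / threadsPerBlock, threadsPerBlock>>>(d_data, n);
    err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("launch failed: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return 0;
}
```

Compiling with `nvcc --ptxas-options=-v` also prints the register count per thread at compile time, which gives you the same number without running anything.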
It is very hard to find the problem in your code.
However, I have a suggestion.
First, comment out all the code inside the kernel and compile again to see whether the error still occurs.
If the error disappears, restore only the first line of the kernel, keep everything else commented out, and recompile.
Continue this way, restoring one line at a time, until you find the line that causes the error.
How many threads and blocks are you using?
Make sure that the total number of registers used by a block is no more than 8192 and that each block uses no more than 16 KB of shared memory.
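To put numbers on those limits, a small sketch like the one below could read them from the device rather than hard-coding them. `maxRegsPerThread` is a helper name I made up, and the division ignores the hardware's allocation granularity, so the real ceiling may be slightly lower.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Largest register count per thread that still fits a block of the given
// size (simplified: allocation granularity is ignored).
int maxRegsPerThread(int regsPerBlock, int threadsPerBlock)
{
    return regsPerBlock / threadsPerBlock;
}

int main()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("device: %s\n", prop.name);
    printf("registers per block:     %d\n", prop.regsPerBlock);
    printf("shared memory per block: %lu bytes\n",
           (unsigned long)prop.sharedMemPerBlock);
    printf("max threads per block:   %d\n", prop.maxThreadsPerBlock);

    // Show how the per-thread register ceiling shrinks as the block grows.
    const int blockSizes[] = {128, 256, 512};
    for (int i = 0; i < 3; ++i)
        printf("%3d threads/block -> at most %d registers per thread\n",
               blockSizes[i], maxRegsPerThread(prop.regsPerBlock, blockSizes[i]));
    return 0;
}
```

With a limit of 8192 registers, a 512-thread block leaves 8192 / 512 = 16 registers per thread. If the kernel needs more than that, either use a smaller block or cap the compiler with nvcc's `-maxrregcount` option, at the cost of possible spilling to local memory.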
I think it has to do with the number of pointers that I am passing to the kernel. Even if I comment out the whole kernel body, it still gives the same error, but if I pass only a few variables (of course the same code can't stay in the kernel then), at least the error doesn't show up. Could it be due to an error in some separate function? I have noticed some strange errors with nvcc; e.g., in one of my functions cublasAlloc was failing because I had passed an array of the wrong size in a completely different function.
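If the argument list really is the culprit, one possible workaround is to bundle the pointers into a struct, copy it to device memory once, and pass a single pointer to the kernel. This is only a sketch under assumptions: `KernelArgs` and `addKernel` are hypothetical names, and the 256-byte figure is the kernel parameter limit I remember for compute capability 1.x devices, so check it against the programming guide for your card.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed kernel parameter limit for compute capability 1.x hardware.
const size_t kAssumedParamLimitBytes = 256;

// Hypothetical bundle of everything the kernel needs.
struct KernelArgs {
    float *a;
    float *b;
    float *out;
    int n;
};

// The kernel takes one pointer instead of a long list of arguments.
__global__ void addKernel(const KernelArgs *args)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < args->n)
        args->out[i] = args->a[i] + args->b[i];
}

int main()
{
    const int n = 1024;
    const size_t bytes = n * sizeof(float);

    // Allocate the arrays and record their device addresses in a host copy.
    KernelArgs h_args;
    h_args.n = n;
    cudaMalloc((void **)&h_args.a, bytes);
    cudaMalloc((void **)&h_args.b, bytes);
    cudaMalloc((void **)&h_args.out, bytes);
    cudaMemset(h_args.a, 0, bytes);
    cudaMemset(h_args.b, 0, bytes);

    // Copy the struct itself to the device.
    KernelArgs *d_args = 0;
    cudaMalloc((void **)&d_args, sizeof(KernelArgs));
    cudaMemcpy(d_args, &h_args, sizeof(KernelArgs), cudaMemcpyHostToDevice);

    addKernel<<<(n + 255) / 256, 256>>>(d_args);

    // Check the launch itself, then any error raised while the kernel ran.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(d_args);
    cudaFree(h_args.a);
    cudaFree(h_args.b);
    cudaFree(h_args.out);
    return err == cudaSuccess ? 0 : 1;
}
```

Checking `cudaGetLastError()` right after every launch and every allocation also helps with the "error shows up in a different function" symptom, because an earlier failure is otherwise reported by whichever call happens to run next.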