Chapter 6 of the programming guide (version 2.1) has a source code listing for basic matrix multiplication. I am trying to add a main() function that calls the Mul(float *A, …) function. I want to explicitly declare the arrays inside main and then pass them to the function. The problem is that I am new to C++ and new to CUDA, and I haven’t found this as easy as expected. Could someone suggest a simple main function that would do this?
What I had tried to do was
float *A[3][2];
…
A[0][0] = 1; A[0][1] = 0;
…
This definitely doesn’t work. If anyone can tolerate dealing with someone who doesn’t know enough C++, please help me out.
Of course, with such a small matrix you will not see any performance improvement, but it should compile. The code requires the matrix dimensions to be a multiple of BLOCK_SIZE, but you should be able to set BLOCK_SIZE to 1, and then it should execute.
CORRECTION: The above comment did not completely solve the issue, please read the problem below.
That makes a lot of sense (I feel dumb). Thank you for replying.
I had set it up as a float *A just because that’s what the function was expecting, but I suppose I don’t need to do that as long as I pass it correctly.