Urgent help with threads please!

DenisR · March 5, 2008, 8:40pm

I found one site that says it has 16 shaders; that would mean 2 multiprocessors, if I am not mistaken. If you copy the G84 data in the GPU sheet one column to the right and change the name and the number of multiprocessors to 2, you will probably get valid output.

Note that the occupancy is the same for all versions; only the Maximum Simultaneous Blocks per GPU output differs between them. That output is not really very important: occupancy is the most important output of the sheet.
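
If the card is installed, the multiprocessor count can also be read from the runtime instead of being inferred from a shader count. The following is a minimal sketch, not part of the original post. The helper names `multiprocessors_from_shaders` and `query_multiprocessor_count` are made up for illustration, and the figure of 8 scalar cores per multiprocessor is an assumption that holds for G8x-class parts such as G84.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: derive the multiprocessor count from a shader count.
// ASSUMPTION: 8 scalar cores per multiprocessor (G8x-class hardware).
int multiprocessors_from_shaders(int shaders, int shaders_per_mp = 8)
{
    return shaders / shaders_per_mp;
}

// Hypothetical helper: ask the runtime how many multiprocessors a device has.
// Returns -1 if the device cannot be queried.
int query_multiprocessor_count(int device)
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device >= device_count)
    {
        return -1;
    }

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
    {
        return -1;
    }

    printf("%s: %d multiprocessors\n", prop.name, prop.multiProcessorCount);
    return prop.multiProcessorCount;
}
```

Calling `multiprocessors_from_shaders(16)` gives 2, the value suggested above for the spreadsheet, and `query_multiprocessor_count(0)` reports what the first device actually has.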

mpags2001 · March 6, 2008, 6:54pm

Why is the value displayed for nNum 19? I am trying to add the indices of all the threads to nNum, so the value should be 0+1+2+3…+19. Instead, the value displayed is just the index of the last thread added to nNum.

This is my code.

main.cpp:
#include <stdio.h>
#include <stdlib.h>
#define NUM 20

extern "C" void test (int* nNum, int nSize);

int main(int argc, char *argv[])
{
    int temp = 0;
    test(&temp, NUM);
    printf ("%d\n", temp);
    return 0;
}

threadstest.cu:
__global__ void compute_testd(int* nNum, int nSize)
{
    unsigned int index = threadIdx.x + blockIdx.x * blockDim.x;

    if (index < nSize)
    {
        *nNum += index;
    }
}

extern "C" void test (int* nNum, int nSize)
{
    int* nNumd;

    cudaMalloc((void**)&nNumd, sizeof(int));
    cudaMemcpy(nNumd, nNum, sizeof(int), cudaMemcpyHostToDevice);

    compute_testd<<< ceil((float)nSize/256.0f), 256>>> (nNumd, nSize);

    cudaMemcpy(nNum, nNumd, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(nNumd);
}
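
The result described above is consistent with a data race. The statement `*nNum += index` is a separate read, add, and write, so all 20 threads can read the initial 0 and then each write back only its own index. One of those writes survives, and which one is not defined. A minimal sketch of one possible fix uses `atomicAdd`, which makes the update indivisible. The names `compute_testd_atomic` and `test_atomic` are hypothetical, and integer `atomicAdd` on global memory is assumed to be available (compute capability 1.1 or later).

```cuda
#include <cuda_runtime.h>

// Each thread adds its global index to *nNum.
// atomicAdd performs the read-modify-write as one indivisible operation,
// so no thread's contribution is overwritten by another.
__global__ void compute_testd_atomic(int* nNum, int nSize)
{
    int index = threadIdx.x + blockIdx.x * blockDim.x;

    if (index < nSize)
    {
        atomicAdd(nNum, index);
    }
}

// Hypothetical replacement for test(), with the same signature as in the post.
extern "C" void test_atomic(int* nNum, int nSize)
{
    int* nNumd = 0;
    const int threads = 256;
    const int blocks = (nSize + threads - 1) / threads;  // integer ceiling

    cudaMalloc((void**)&nNumd, sizeof(int));
    cudaMemcpy(nNumd, nNum, sizeof(int), cudaMemcpyHostToDevice);

    compute_testd_atomic<<<blocks, threads>>>(nNumd, nSize);

    // This blocking copy runs after the kernel has finished.
    cudaMemcpy(nNum, nNumd, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(nNumd);
}
```

With `temp` starting at 0 and `NUM` set to 20, calling `test_atomic(&temp, NUM)` should leave 190 in `temp`. Atomics serialize the updates, so for large inputs a parallel reduction is usually the faster choice; for 20 values the difference does not matter.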

And is there a way for us to put this loop into CUDA:

for (int i = 0; i < 20; i++)
{
    for (int j = 0; j < 20; j++)
    {
        array[i][j] = 3;
    }
}
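
One common way to move a nested loop like this onto the GPU is to launch a 2D grid with one thread per element, so that the thread's y and x coordinates take the place of `i` and `j`. The sketch below is an illustration under assumptions, not a definitive implementation: `fill_kernel` and `fill_on_gpu` are hypothetical names, and the array is assumed to be stored contiguously in row-major order, as a C `int array[20][20]` is.

```cuda
#include <cuda_runtime.h>

// One thread per element. The 2D array is treated as a flat row-major buffer,
// so element (i, j) lives at offset i * cols + j.
__global__ void fill_kernel(int* array, int rows, int cols, int value)
{
    int j = threadIdx.x + blockIdx.x * blockDim.x;  // column, replaces inner loop
    int i = threadIdx.y + blockIdx.y * blockDim.y;  // row, replaces outer loop

    // The grid is rounded up to whole blocks, so guard against extra threads.
    if (i < rows && j < cols)
    {
        array[i * cols + j] = value;
    }
}

// Hypothetical host wrapper: fills a rows x cols host array on the GPU
// and copies the result back into host_array.
extern "C" void fill_on_gpu(int* host_array, int rows, int cols, int value)
{
    int* dev_array = 0;
    const size_t bytes = (size_t)rows * cols * sizeof(int);

    dim3 threads(16, 16);
    dim3 blocks((cols + threads.x - 1) / threads.x,
                (rows + threads.y - 1) / threads.y);

    cudaMalloc((void**)&dev_array, bytes);

    fill_kernel<<<blocks, threads>>>(dev_array, rows, cols, value);

    // This blocking copy runs after the kernel has finished.
    cudaMemcpy(host_array, dev_array, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev_array);
}
```

For the loop above, the call would be `fill_on_gpu(&array[0][0], 20, 20, 3);`. Each element is written by exactly one thread, so no atomics are needed here.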
