Hello there,
I’ve been trying to port a parallel search algorithm that requires dynamic memory allocation to CUDA, so I want to use the device-side malloc() function introduced in CUDA 3.2 (described in section B.15.1 of the CUDA C Programming Guide). I am currently developing on a MacBook Pro that doesn’t have the required Fermi hardware, but I will be able to run the code on my desktop. In any case, the example malloc code from the programming guide does compile on my laptop with the following command line:
sirius:test malfunct$ nvcc -I/sw/include -arch compute_20 malloc_test.cu -o malloc_test
I assume it will also work on the GTX 480. However, there are two problems. First, the printf function (described in section B.14 of the CUDA C Programming Guide) isn’t found, so I had to comment it out:
sirius:test malfunct$ cat malloc_test.cu
//-*-c++-*-x
__device__ __host__ void mallocTest2()
{
    char* ptr = (char*)malloc(123);
}
__global__ void mallocTest()
{
    char* ptr = (char*)malloc(123);
    mallocTest2();
    //printf("Thread %d got pointer: %p\n", threadIdx.x, ptr);
}
int main()
{
    // Set a heap size of 128 megabytes. Note that this must
    // be done before any kernel is launched.
    cudaThreadSetLimit(cudaLimitMallocHeapSize, 128*1024*1024);
    mallocTest<<<1, 5>>>();
    cudaThreadSynchronize();
    return 0;
}
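For comparison, here is a minimal sketch of the variant I would expect to behave better, assuming the missing printf is only a missing declaration: as far as I can tell, the programming guide's own example includes stdio.h and stdlib.h and also frees the pointer. The helper name mallocAndReport and the qualifier fallback macros are my own additions, not from the guide. The macros let the same file build as plain C++ for a quick host-side check; I have not yet been able to confirm the device side on Fermi hardware.

```cpp
// Sketch only: builds as plain C++ for a host-side check, and is meant to
// build under nvcc for compute capability 2.x as well (untested there).
#include <stdio.h>   // declares printf
#include <stdlib.h>  // declares malloc and free

#ifndef __CUDACC__
// Outside nvcc the CUDA qualifiers do not exist, so define them away.
#define __host__
#define __device__
#endif

// Hypothetical helper (not from the programming guide): allocates `size`
// bytes, prints the pointer, frees it, and returns 1 on success, 0 on failure.
__host__ __device__ int mallocAndReport(int id, size_t size)
{
    char* ptr = (char*)malloc(size);
    if (ptr == NULL) {
        return 0;  // the heap could not satisfy the request
    }
    printf("Thread %d got pointer: %p\n", id, (void*)ptr);
    free(ptr);  // the listing above never releases its allocations
    return 1;
}

#ifdef __CUDACC__
// Same kernel shape as in the listing above, now delegating to the helper.
__global__ void mallocTest()
{
    mallocAndReport(threadIdx.x, 123);
}
#endif
```

The main() function from the listing above would stay unchanged. On the host, calling mallocAndReport(0, 123) should return 1 unless the allocation fails.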
By the way, this is the version of nvcc I am using:
sirius:test malfunct$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Thu_Nov_11_15:26:50_PST_2010
Cuda compilation tools, release 3.2, V0.2.1221
The second, and bigger, problem is that when I call malloc in a header file named dynarray.h, which is itself included from other header files, I get the following errors:
nvcc -I/sw/include -arch compute_20 syntax_tree.cu -o test_syntaxtree
./dynarray.h(175): error: identifier "malloc" is undefined
./dynarray.h(164): error: identifier "malloc" is undefined
detected during:
instantiation of "void Dynarray<A>::init_vec(int) [with A=char]"
(24): here
instantiation of "Dynarray<A>::Dynarray() [with A=char]"
./string.h(16): here
2 errors detected in the compilation of "/tmp/tmpxft_00014dd8_00000000-4_syntax_tree.cpp1.ii".
make: *** [test_syntaxtree] Error 2
This is unexpected to me. The functions in question are qualified as __device__ __host__, but so is the mallocTest2() function I added to the example malloc code, and that one causes no problems. When I copy that function into dynarray.h, however, the compiler complains that malloc is undefined. Here is the copied function as it appears in dynarray.h, with the malloc call on line 175:
__device__ __host__ void mallocTest3()
{
    char* ptr = (char*)malloc(123);
}
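One thing I can still try is making the header declare everything it uses, instead of relying on whatever happens to be included before it. Below is a minimal sketch of that idea, assuming the error comes from malloc not being declared at the point where dynarray.h is parsed; that assumption is unverified, and I do not know yet whether it changes the nvcc diagnostics. DynarraySketch, the file name, and the default capacity of 4 are hypothetical stand-ins, not my real Dynarray<A> class.

```cpp
// dynarray_sketch.h (hypothetical file name, not the real dynarray.h)
#ifndef DYNARRAY_SKETCH_H
#define DYNARRAY_SKETCH_H

#include <stdlib.h>  // malloc and free are declared before any use below

#ifndef __CUDACC__
// Outside nvcc the CUDA qualifiers do not exist, so define them away.
#define __host__
#define __device__
#endif

// Hypothetical stand-in for Dynarray<A>: owns one malloc'ed buffer.
template <typename A>
struct DynarraySketch {
    A*  data;
    int capacity;

    __host__ __device__ DynarraySketch() : data(NULL), capacity(0)
    {
        init_vec(4);  // assumed default capacity, for illustration only
    }

    __host__ __device__ ~DynarraySketch() { free(data); }

    // Called once from the constructor: allocates room for n elements and
    // leaves the array empty (capacity 0) if the allocation fails.
    __host__ __device__ void init_vec(int n)
    {
        data = (A*)malloc(n * sizeof(A));
        capacity = (data != NULL) ? n : 0;
    }

private:
    // Copying is not handled in this sketch, so it is disabled.
    DynarraySketch(const DynarraySketch&);
    DynarraySketch& operator=(const DynarraySketch&);
};

#endif  // DYNARRAY_SKETCH_H
```

With this layout, a translation unit that includes only this header can construct a DynarraySketch<char> on the host without any other includes.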
The verbose output of the failing compiler driver invocation:
sirius:cuda malfunct$ nvcc -I/sw/include -arch compute_20 syntax_tree.cu -o test_syntaxtree --verbose
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/usr/local/cuda/bin
#$ _THERE_=/usr/local/cuda/bin
#$ _TARGET_SIZE_=
#$ TOP=/usr/local/cuda/bin/..
#$ PATH=/usr/local/cuda/bin/../open64/bin:/usr/local/cuda/bin:/Library/Frameworks/Python.framework/Versions/2.6/bin:/usr/local/cuda/bin:/sw/bin:/sw/sbin:/usr/local/bin:/Users/malfunct/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
#$ INCLUDES="-I/usr/local/cuda/bin/../include"
#$ LIBRARIES= "-L/usr/local/cuda/bin/../lib" -lcudart
#$ CUDAFE_FLAGS=
#$ OPENCC_FLAGS=
#$ PTXAS_FLAGS=
#$ gcc -D__CUDA_ARCH__=200 -E -x c++ -DCUDA_DOUBLE_MATH_FUNCTIONS "-I/usr/local/cuda/bin/../include" -I. -D__CUDACC__ -C -I"/sw/include" -include "cuda_runtime.h" -m32 -malign-double -o "/tmp/tmpxft_00014e5e_00000000-4_syntax_tree.cpp1.ii" "syntax_tree.cu"
#$ cudafe --m32 --gnu_version=40201 -tused --no_remove_unneeded_entities --gen_c_file_name "/tmp/tmpxft_00014e5e_00000000-1_syntax_tree.cudafe1.c" --stub_file_name "/tmp/tmpxft_00014e5e_00000000-1_syntax_tree.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_00014e5e_00000000-1_syntax_tree.cudafe1.gpu" --include_file_name "/tmp/tmpxft_00014e5e_00000000-3_syntax_tree.fatbin.c" "/tmp/tmpxft_00014e5e_00000000-4_syntax_tree.cpp1.ii"
./dynarray.h(175): error: identifier "malloc" is undefined
./dynarray.h(164): error: identifier "malloc" is undefined
detected during:
instantiation of "void Dynarray<A>::init_vec(int) [with A=char]"
(24): here
instantiation of "Dynarray<A>::Dynarray() [with A=char]"
./string.h(16): here
2 errors detected in the compilation of "/tmp/tmpxft_00014e5e_00000000-4_syntax_tree.cpp1.ii".
# --error 0x2 --
Could you please help me resolve these problems? I am hoping perhaps this is due to my inexperience with the nvcc toolkit. Thanks in advance!
Best Regards,
–
Eray Ozkural
PS: I didn’t consider this an OS X issue, but since I haven’t tried it on Linux yet, it may well be one. If you think that is the case, please move this post to the OS X section or remove it.