Can anybody run my program on an 8800 GTX? Trying to compare performance

Hi,

I would like to compare the performance I get on my 8800 GTX when running a program I’m writing. If you have an 8800 GTX, I would encourage you to run it on your computer (I’ve attached the file). You can run it with the following command:

./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

The first and second arguments are input files provided in the tar.gz, and output.feat is the output file.

I would like to compare the performance of the system using the “Summary Table” of the CUDA Visual Profiler.
In my case this table looks like:

Method          #calls   GPU usec   CPU usec   %GPU
r2c_radix4_sp   580      18.5556    26.0966    76.73
cu_powerSpec    580      1.95802    18.9086    8.09
memcopy         580      3.66681    -          15.16

Note that the GPU and CPU times are set to be averaged (“Session View Settings”).
A second experiment would be to change the fftSize to 512. In this case the table looks like:

Method          #calls   GPU usec   CPU usec   %GPU
r2c_radix2_sp   580      13.9933    17.9517    73.15
cu_powerSpec    580      2.07906    17.2483    10.86
memcopy         580      3.0564     -          15.97

Thanks in advance!!!
forum.tar.gz (1.62 MB)

Instead of just dumping binaries, you might want to provide detailed instructions on how to build and run your code. I just kicked it off on my sandbox machine (8800 GTX as requested, SuSE 10.2 32-bit, driver 169.09):

blablablamymachine: /home/user/ABCDEF/temp/forum[62]>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.002284 seconds elapsed

0.570475 seconds elapsed

*** glibc detected *** ./feature-extraction_cuda2: munmap_chunk(): invalid pointer: 0x080498fd ***

======= Backtrace: =========

/lib/libc.so.6[0xb7b346e1]

./feature-extraction_cuda2[0x804a9a9]

/lib/libc.so.6(__libc_start_main+0xdc)[0xb7ae5f9c]

./feature-extraction_cuda2(__gxx_personality_v0+0xc1)[0x8049e51]

Plus, how do I change the fftSize to 512? Sorry to be such a showstopper, but if you want people to run your code, you might want to improve the way you explain what needs to be done. I tried make clean; make, but compilation failed on the first file:

g++  -Wall -pg -O2  -I. -I./math  -I./lib -I/usr/include/libxml2 -I/usr/local/cuda/2.0beta2/include -I. -c ./DynamicDataBuffer.cpp

./DynamicDataBuffer.cpp: In member function 'void FeatWarper::doWarping(std::string)':

./DynamicDataBuffer.cpp:69: warning: comparison between signed and unsigned integer expressions

./DynamicDataBuffer.cpp: In member function 'void FeatWarper::runFeatWarper()':

./DynamicDataBuffer.cpp:198: error: jump to label 'LASTSTEP'

./DynamicDataBuffer.cpp:174: error:   from here

./DynamicDataBuffer.cpp:178: error:   crosses initialization of 'long int refindex'

./DynamicDataBuffer.cpp:177: error:   crosses initialization of 'long int circIndex'

./DynamicDataBuffer.cpp:232: error: jump to label 'FINISH'

./DynamicDataBuffer.cpp:173: error:   from here

./DynamicDataBuffer.cpp:178: error:   crosses initialization of 'long int refindex'

./DynamicDataBuffer.cpp:177: error:   crosses initialization of 'long int circIndex'

./DynamicDataBuffer.cpp: In member function 'void DynamicDataBuffer<T>::creaSCFilter(short int) [with T = float]':

./DynamicDataBuffer.cpp:390: warning: unused variable 'accelLength'

./DynamicDataBuffer.cpp:390: warning: unused variable 'tmpAccLength'

./DynamicDataBuffer.cpp: In member function 'void DynamicDataBuffer<T>::saveHTKFeatures(const char*, bool) [with T = float]':

./DynamicDataBuffer.cpp:970: warning: converting to 'int' from 'double'

./DynamicDataBuffer.cpp: In member function 'void DynamicDataBuffer<T>::saveSelFFtoHTK(const char*, bool) [with T = float]':

./DynamicDataBuffer.cpp:1092: warning: converting to 'int' from 'double'

./DynamicDataBuffer.cpp: In member function 'void DynamicDataBuffer<T>::saveHTK_SDC_Features(const char*, bool, int, int, int) [with T = float]':

./DynamicDataBuffer.cpp:1203: warning: converting to 'int' from 'double'

./DynamicDataBuffer.cpp: In member function 'void DynamicDataBuffer<T>::saveEnergy(const char*, bool, bool, std::string) [with T = float]':

./DynamicDataBuffer.cpp:1278: warning: converting to 'int' from 'double'

make: *** [DynamicDataBuffer.o] Error 1

Note that these errors are not a CUDA issue, although your Makefile also hard-codes a CUDA path that does not exist on my sandbox (I use modules).

Hi Dominik,

first of all thanks for your collaboration. I appreciate it, and I’m sorry for all the trouble with the lack of information.

I have attached a new Makefile. You have to change the parameter
CUDA = /usr/local/cuda/2.0beta2 so that it points to the CUDA installation on your machine.

  1. To compile the file:

rm *.o ; make feature-extraction_cuda2

Then you will get the same error you posted. It seems that my compiler, gcc version 3.4.6 20060404 (Red Hat 3.4.6-9), does not flag this kind of error, whereas on my laptop gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7) does. The fastest way to work around it is to open “DynamicDataBuffer.cpp” and comment out the whole body of the function FeatWarper::runFeatWarper(),
since it is not used when running the program in the current configuration.
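A less invasive alternative to commenting the function out would be to put the offending declarations in their own block, so that no goto jumps over an initialization. This is only a minimal sketch, assuming that circIndex and refindex are not needed after the labels; warpStep and the values in it are hypothetical stand-ins, not the real code of runFeatWarper():

```cpp
#include <cassert>

// Hypothetical stand-in for the control flow in FeatWarper::runFeatWarper().
// The real function body is not shown in this thread.
long warpStep(bool skipToLastStep) {
    long result = -1;

    if (skipToLastStep) {
        goto LASTSTEP;  // legal: nothing initialized is still in scope at the label
    }

    {
        // The braces end the scope of circIndex and refindex before the label,
        // so the goto above no longer "crosses initialization" of either one.
        long circIndex = 3;             // placeholder value
        long refindex = circIndex + 4;  // placeholder computation
        result = refindex;
    }

LASTSTEP:
    return result;
}
```

If the variables are needed after the labels, the other option is to declare them (without an initializer) above the first goto and assign them later.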

I hope you don’t run into any other problems.

  2. The parameter fftSize is set in the configuration file mfcc-12_baseline.mycfg (change the value to 512 for the second experiment). Right now it is set to 1024:

fftSize = 1024

I’ll keep an eye on the forum, so if you have any other problem, please, don’t hesitate to write again.

You have to rename Makefile.txt to Makefile (mv Makefile.txt Makefile).
Makefile.txt (1.46 KB)

Hi folks,

I’d appreciate your help with this. Let’s see whether the start of the week brings some more enthusiastic people!! =)

I hacked the Makefile. I even hacked the code as you suggested (even though this is a lot to ask from folks like me if you ask for the favor of running your code). Then I ran the binary.

INVALID-POINTER, crash.

./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.003126 seconds elapsed

0.594487 seconds elapsed

*** glibc detected *** ./feature-extraction_cuda2: free(): invalid pointer: 0x0804a330 ***

======= Backtrace: =========

/lib/libc.so.6[0xb7b446e1]

/lib/libc.so.6(cfree+0x89)[0xb7b45d79]

./feature-extraction_cuda2[0x804a952]

/lib/libc.so.6(__libc_start_main+0xdc)[0xb7af5f9c]

./feature-extraction_cuda2(__gxx_personality_v0+0xdd)[0x804a1d1]
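For what it is worth, both glibc messages mean that free() was handed an address that malloc() never returned. The reported pointer 0x0804a330 lies in the same range as the frames of the binary itself in the backtrace, not on the heap, which would be consistent with freeing a string literal or a static buffer. Without a debug build I cannot tell where this happens, so the following is only a sketch of the pattern and two ways to avoid it; ownedCopy and Settings are hypothetical names, not taken from your code:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// BUG pattern (shown only as a comment): the pointer refers to a string
// literal inside the executable image, so free() rejects it.
//   const char* name = "output.feat";
//   free((void*)name);   // glibc: "free(): invalid pointer"

// Hypothetical helper: returns a malloc()ed copy, so the caller may free() it.
char* ownedCopy(const char* text) {
    const std::size_t size = std::strlen(text) + 1;  // include the terminator
    char* copy = static_cast<char*>(std::malloc(size));
    assert(copy != NULL);
    std::memcpy(copy, text, size);
    return copy;
}

// Hypothetical settings holder: std::string manages its own storage,
// so there is no manual free() to get wrong.
struct Settings {
    std::string outputFile;
    explicit Settings(const char* path) : outputFile(path) {}
};
```

Running the binary under valgrind, or building with -g and looking at the backtrace in gdb, should point at the exact free() call.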

I tried a 64-bit machine (recompiled, naturally; exactly the same GPU):

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.002309 seconds elapsed

0.442330 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001783 seconds elapsed

0.456662 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001827 seconds elapsed

0.458294 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001835 seconds elapsed

0.458607 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001646 seconds elapsed

0.461911 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001731 seconds elapsed

0.452933 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001121 seconds elapsed

0.290782 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001386 seconds elapsed

0.317559 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001401 seconds elapsed

0.376975 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001224 seconds elapsed

0.292013 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001364 seconds elapsed

0.406416 seconds elapsed

>./feature-extraction_cuda2 mfcc-12_baseline.mycfg SA027RS173.US3 output.feat

0.001334 seconds elapsed

0.291411 seconds elapsed

You might want to read up on how to get proper timings with CUDA. Your results vary so much from run to run (25%!!!) that I don’t trust them at all.
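To illustrate what I mean, here is a minimal host-side sketch of a more robust measurement: a few untimed warm-up runs, then several timed repetitions, reporting the minimum, the median and the relative spread instead of a single number. The names TimingSummary, summarize and timeRepeated are made up for this post. Keep in mind that CUDA kernel launches return before the kernel has finished, so whatever you pass in as the work has to synchronize (cudaThreadSynchronize() in the 2.0 toolkit) before it returns, otherwise you only time the launch.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical summary of repeated timings, all in seconds.
struct TimingSummary {
    double minSeconds;
    double medianSeconds;
    double relativeSpread;  // (max - min) / median, 0 if the median is 0
};

// Sorts a copy of the samples and derives min, median and relative spread.
TimingSummary summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    const double median = (n % 2 == 1)
        ? samples[n / 2]
        : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
    TimingSummary s;
    s.minSeconds = samples.front();
    s.medianSeconds = median;
    s.relativeSpread =
        (median > 0.0) ? (samples.back() - samples.front()) / median : 0.0;
    return s;
}

// Runs `work` a few times untimed (warm-up), then times each repetition.
// For GPU code, `work` must block until the device has finished.
template <typename Work>
TimingSummary timeRepeated(Work work, int warmups, int repetitions) {
    for (int i = 0; i < warmups; ++i) {
        work();
    }
    std::vector<double> samples;
    samples.reserve(repetitions);
    for (int i = 0; i < repetitions; ++i) {
        const std::chrono::steady_clock::time_point start =
            std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        samples.push_back(elapsed.count());
    }
    return summarize(samples);
}
```

If the relative spread is still large after warm-up, something else on the machine is probably interfering, and the numbers should not be compared across systems.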