CUDA SDK 2.3 bug report: vectorAddDrv (driver API), findModulePath

Operating System

Win XP64, but compiling for 32-bit and with CUDA 32-bit tools

  • Synopsis description of the problem

ptx parsing by ptxas fails

  • Detailed description of the problem

In the vectorAddDrv sample, the inline function findModulePath does not zero-terminate the string into which the ptx source is read.

Please describe the conditions when the problem was observed. Screen messages, code snippets and anything else that will help in duplicating the problem should be provided here.

  • Code Snippet

[codebox]bool inline
findModulePath(const char *module_file, string & module_path, char **argv, string & ptx_source)
{
    module_path = cutFindFilePath(module_file, argv[0]);
    if (module_path.empty()) {
        printf("> findModulePath could not find file: <%s> \n", module_file);
        return false;
    } else {
        printf("> findModulePath found file at <%s>\n", module_path.c_str());
        if (module_path.rfind(".ptx") != string::npos) {
            FILE *fp = fopen(module_path.c_str(), "rb");
            fseek(fp, 0, SEEK_END);
            int file_size = ftell(fp);
            ptx_source.reserve(file_size+512);
            fseek(fp, 0, SEEK_SET);
            fread(&ptx_source[0], sizeof(char), file_size, fp);
            fclose(fp);
        }
        return true;
    }
}

// in main:
    error = cuModuleLoadDataEx(&cuModule, ptx_source.c_str(), jitNumOptions, jitOptions, (void **)jitOptVals);
…[/codebox]

As you can see, file_size+512 bytes are reserved, but reserve() only changes the capacity of the string, not its length, and fread() then writes straight into that raw buffer. The ptx file ends at EOF without a trailing null, and the .c_str() call passed to cuModuleLoadDataEx is not guaranteed to place a null character directly after the file_size bytes that were read; if a terminator is found at all, it is wherever the memory behind the data happens to contain a zero. Any spurious characters in memory at that location get passed to the JIT compiler, which promptly issues a syntax error (see my [url="http://forums.nvidia.com/index.php?showtopic=150486&hl="]previous post[/url]).
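The difference between capacity and length is the core of the problem, so here is a minimal sketch of it in isolation. The function name reserveLeavesLengthUnchanged is made up for this illustration and is not part of the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only (not SDK code).
// Returns true when reserve() has grown the capacity while the string
// itself is still empty, i.e. c_str() still yields "".
bool reserveLeavesLengthUnchanged(std::size_t n)
{
    std::string s;
    s.reserve(n);                      // allocates storage, length stays 0
    return s.size() == 0               // no characters belong to the string
        && s.capacity() >= n           // but the buffer is at least n wide
        && s.c_str()[0] == '\0';       // and the string still reads as empty
}
```

Anything written into that reserved storage behind the string's back (as the fread in the sample does) is not tracked by the string, so it cannot know where a terminator belongs.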

My suggested remedy is to rewrite findModulePath along these lines:

[codebox]…
   if (module_path.rfind(".ptx") != string::npos) {
       fp = fopen(module_path.c_str(), "rb");
       fseek(fp, 0, SEEK_END);
       int file_size = ftell(fp);
       char *buf = new char[file_size+1];        // one extra byte for the terminator
       fseek(fp, 0, SEEK_SET);
       fread(buf, sizeof(char), file_size, fp);
       fclose(fp);
       buf[file_size] = '\0';                    // terminate right after the file contents
       ptx_source = buf;                         // copies up to the terminator
       delete[] buf;
   }
[/codebox]
I tested this in my code and it fixed the crash.
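For comparison, a variant without the temporary buffer might look like the sketch below. It uses resize() instead of reserve(), so the string's length matches the number of bytes read and c_str() is terminated right after them. The helper name readTextFile is hypothetical, and the sketch assumes the string's storage is contiguous (guaranteed since C++11 and true in practice for the common implementations).

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the SDK): reads a whole file into `out`.
// Returns false if the file cannot be opened or is not read completely.
bool readTextFile(const char *path, std::string &out)
{
    FILE *fp = std::fopen(path, "rb");
    if (fp == NULL)
        return false;

    std::fseek(fp, 0, SEEK_END);
    long file_size = std::ftell(fp);
    std::fseek(fp, 0, SEEK_SET);
    if (file_size < 0) {
        std::fclose(fp);
        return false;
    }

    // resize() sets the length (reserve() would only set the capacity),
    // so the terminator ends up directly after the last byte read.
    out.resize(static_cast<std::size_t>(file_size));
    std::size_t got = 0;
    if (file_size > 0)
        got = std::fread(&out[0], sizeof(char), out.size(), fp);
    std::fclose(fp);

    if (got != out.size()) {
        out.clear();
        return false;
    }
    return true;
}
```

With something like this, findModulePath could call readTextFile(module_path.c_str(), ptx_source) and pass ptx_source.c_str() to cuModuleLoadDataEx as before.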

  • CUDA toolkit release version

2.3

  • SDK release version

2.3

  • Compiler for CPU host code

MS VC++ 9.0 (2008) Professional

  • System description including:

2-socket Intel Nehalem 3 GHz, 12 GB, Colfax, 2x Tesla C1060, 1x Quadro FX 3700

Thanks for the comprehensive bug report! Fixed.