Simple surface compilation problem on CUDA

durf_za · July 8, 2010, 1:28pm

I’m trying to compile the example code from Section 3.2.5 of the CUDA reference manual, but get the following error from ptxas:

[i]1>ptxas C:/Temp/tmpxft_00002200_00000000-7_surfTest.compute_10.ptx, line 48; Ã‡=8Ã¼: Parsing error near ‘.surf’: syntax error

1>ptxas Ã‡=8Ã¼: Ptx assembly aborted due to errors[/i]

I get the same errors from my other code when I simply add the line

surface<void,2> surfRef;

System spec:

Windows XP

CUDA +SDK v3.1

Quadro fx 3700 (Compute Compatibility 1.1)

Visual Studio 2008 Express

DisplayDriver 257.21

Code snippet below:

[codebox]#include <stdio.h>

#include <string.h>

#include <cutil_inline.h>

#include <cuda_runtime_api.h>

// 2D surfaces

surface<void, 2> inputSurfRef;

surface<void, 2> outputSurfRef;

// Simple copy kernel

global void copyKernel(int width, int height) {

// Calculate surface coordinates 

unsigned int x = blockIdx.x * blockDim.x + threadIdx.x; 

unsigned int y = blockIdx.y * blockDim.y + threadIdx.y; 

if (x < width && y < height) { 

	uchar4 data; 

	// Read from input surface 

	surf2Dread(&data, inputSurfRef, x * 4, y); 

	// Write to output surface 

	surf2Dwrite(data, outputSurfRef, x * 4, y); 

}

}

extern “C” void surfTest() {

int width = 10;

int height = 10;

int size = width * height * sizeof(float);

float *h_data = (float*) malloc(size);

// Allocate CUDA arrays in device memory 

cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(8, 8, 8, 8, cudaChannelFormatKindUnsigned); 

cudaArray* cuInputArray; 

cudaMallocArray(&cuInputArray, &channelDesc, width, height, cudaArraySurfaceLoadStore); 

cudaArray* cuOutputArray; 

cudaMallocArray(&cuOutputArray, &channelDesc, width, height, cudaArraySurfaceLoadStore); 

// Copy to device memory some data located at address h_data 

// in host memory 

cudaMemcpyToArray(cuInputArray, 0, 0, h_data, size, cudaMemcpyHostToDevice); 

// Bind the arrays to the surface references 

cudaBindSurfaceToArray(inputSurfRef, cuInputArray); 

cudaBindSurfaceToArray(outputSurfRef, cuOutputArray); 

// Invoke kernel 

dim3 dimBlock(16, 16); 

dim3 dimGrid((width + dimBlock.x - 1) / dimBlock.x, 

			 (height + dimBlock.y - 1) / dimBlock.y); 

copyKernel<<<dimGrid, dimBlock>>>(width, height); 

// Free device memory 

cudaFreeArray(cuInputArray); 

cudaFreeArray(cuOutputArray);

}

[/codebox]

Unfortunately there are no examples that I could find in the SDK which use surfaces, so I can’t test whether these compile.

Any ideas?

Richard

bitminer · July 8, 2010, 3:47pm

I am getting a similar result on Windows Server 2008 x64 using nvcc 3.1

I have traced it to array indexing in my code (which I cannot post), but will describe here. My compile output is:

2>C:/CUDA/bin64/nvcc.exe C:/Users/bdavis5/Documents/QS/QS-NIH/source/branches/trunk/source/Matlab/lib/3rdParty/Siemens/gpu/PMatrixRecon/gpuPMatrixRecon.cu -arch sm_11 --ptxas-options=-v --keep -maxrregcount=32 -Xcompiler /EHsc,/W3,/nologo,/Od,/Zi,/MTd -m64 -DgpuPMatrixRecon_EXPORTS -ccbin "C:/Program Files (x86)/Microsoft Visual Studio 9.0/VC/bin" -DNVCC -c -o C:/Users/bdavis5/Documents/QS/QS-NIH/source/branches/trunk/build/dvip4-Win64/source/Matlab/lib/3rdParty/Siemens/gpu/PMatrixRecon/Debug/gpuPMatrixRecon_generated_gpuPMatrixRecon.cu.obj -IC:/CUDA/include "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/common/inc" -IC:/Users/bdavis5/Documents/QS/QS-NIH/source/branches/trunk/source/cpp/lib/3rdParty/Win/boost-cmake-1_41_0 -IC:/Users/bdavis5/Documents/QS/QS-NIH/source/branches/trunk/build/Windows-6.0/install/include/vtk-5.6 -IC:/Users/bdavis5/Documents/QS/QS-NIH/source/branches/trunk/build/Windows-6.0/install/include/VTKEdge "-IC:/Program Files/MATLAB/R2010a/extern/include" -IC:/Users/bdavis5/Documents/QS/QS-NIH/source/branches/trunk/source/Matlab/lib/3rdParty/Siemens/gpu/PMatrixRecon/. -IC:/CUDA/include

2>gpuPMatrixRecon.cu

...

...

2>gpuPMatrixRecon.cudafe2.gpu

2>ptxas gpuPMatrixRecon.ptx, line 108; error   : Arguments mismatch for instruction 'and'

2>ptxas gpuPMatrixRecon.ptx, line 109; error   : Arguments mismatch for instruction 'and'

2>ptxas gpuPMatrixRecon.ptx, line 110; error   : Arguments mismatch for instruction 'shl'

2>ptxas gpuPMatrixRecon.ptx, line 111; error   : Arguments mismatch for instruction 'or'

2>ptxas gpuPMatrixRecon.ptx, line 146; error   : Arguments mismatch for instruction 'and'

2>ptxas gpuPMatrixRecon.ptx, line 147; error   : Arguments mismatch for instruction 'and'

2>ptxas gpuPMatrixRecon.ptx, line 148; error   : Arguments mismatch for instruction 'shl'

2>ptxas gpuPMatrixRecon.ptx, line 149; error   : Arguments mismatch for instruction 'or'

2>ptxas gpuPMatrixRecon.ptx, line 191; error   : Arguments mismatch for instruction 'and'

2>ptxas gpuPMatrixRecon.ptx, line 192; error   : Arguments mismatch for instruction 'and'

2>ptxas gpuPMatrixRecon.ptx, line 193; error   : Arguments mismatch for instruction 'shl'

2>ptxas gpuPMatrixRecon.ptx, line 194; error   : Arguments mismatch for instruction 'or'

2>ptxas gpuPMatrixRecon.ptx, line 266; warning : Double is not supported. Demoting to float

2>ptxas HÃ¢âˆž(Ã‡=Â¥â•”R: Ptx assembly aborted due to errors

The code is that seemingly causes this (I have verified this by commenting in/out the code relates to array indexing into a pointer to floats. In my code I have a structure:

typedef struct Matrix

{

	size_t width;

	size_t height;

	float* data;

	bool column_major;

} Matrix_t;

And the code that causes my issue where z is of type Matrix in kernel code:

z.data[z_index] = z.data[z_index] + x.data[x_index] * y.data[y_index];

//				z.data[z_index] = z.data[z_index] + x.data[x_index] * y.data[y_index];

//				*(z.data + z_index) = *(z.data + z_index) + *(x.data + x_index ) * *(y.data + y_index);

I have tried some variants both array indexing and pointer arithmetic differencing to float. The odd thing I was experiencing that this code would compile and run. The really odd thing is on my Laptop Win7 x64 I could get it to break, then rebuild and it would compile and run I would then later run into the same problem even though I was making unrelated changes to the code. I changed something which now causes this error consistently. If I figure it out I will post or if someone has had the same issue…please post.

I am very curious to know if:

is Unicode Swahili for “I took your goat” or if converted to decimal winning lottery ticket numbers (which state I am not sure. I need more go go on here.). I have seen cryptic error messages, but NVIDIA has really out done themselves here.

Simon_Green · July 8, 2010, 4:37pm

Sorry, surfaces aren’t very well documented yet. They are only supported on Fermi (compute capability 2.0 devices).

I agree the error reporting should be better!

bitminer · July 9, 2010, 1:26am

First off I wanted to clarify why I posted to this thread. I had to do with error code generated by nvcc

ptxas Ã‡=8Ã¼: Ptx assembly aborted due to errors

Which after searching google for “ptxas Ptx assembly aborted due to errors” led me here. My posts have nothing to do with surfaces, but everything to do with ptxas cryptic errors.

There must be a random assembly generator inside nvcc as my code(in the kernel):

Matrix_t v = {0};

Now generates:

1>ptxas C:/Users/bdavis/AppData/Local/Temp/tmpxft_000010f0_00000000-4_gpuPMatrixRecon.ptx, line 144; error   : Arguments mismatch for instruction 'and'

1>ptxas C:/Users/bdavis/AppData/Local/Temp/tmpxft_000010f0_00000000-4_gpuPMatrixRecon.ptx, line 145; error   : Arguments mismatch for instruction 'and'

1>ptxas C:/Users/bdavis/AppData/Local/Temp/tmpxft_000010f0_00000000-4_gpuPMatrixRecon.ptx, line 146; error   : Arguments mismatch for instruction 'shl'

1>ptxas C:/Users/bdavis/AppData/Local/Temp/tmpxft_000010f0_00000000-4_gpuPMatrixRecon.ptx, line 147; error   : Arguments mismatch for instruction 'or'

1>ptxas C:/Users/bdavis/AppData/Local/Temp/tmpxft_000010f0_00000000-4_gpuPMatrixRecon.ptx, line 250; warning : Double is not supported. Demoting to float

1>ptxas fatal   : Ptx assembly aborted due to errors

but

Matrix_t v;

does not. I am certain this compiled and ran on the gpu.

And the code:

z.data[z_index] = z.data[z_index] + x.data[x_index] * y.data[y_index];

Which I had commented out now compiles in. This is frustrating and obscure. So my code is compiling again, but I haven’t a clue why and I am just waiting for the random assembly generator to tell me I have another cryptic obscure error. This smells of data corruption or invalid memory access inside nvcc IMHO.

durf_za · July 9, 2010, 11:00am

Thanks for the fast response Simon.

It would be really useful if surface support (along with any other unsupported functions / types) could be included in Appendix G of the CUDA programming guide.

Richard

Garthof · March 23, 2011, 7:47pm

I had exactly the same issue in my Linux computer. I’ve solved it by compiling with the -arch=sm_20 option:

nvcc -arch=sm_20 myFile.out myFile.cu

I didn’t know that surfaces were an only-Fermi feature, so that explains why you have to target platforms with compute capability 2.0 or higher, as currently the compiler defaults to compute capability 1.0.

Marcelo_Arruda · May 25, 2012, 8:25pm

I had exactly the same issue in my Linux computer. I’ve solved it by compiling with the -arch=sm_20 option:
nvcc -arch=sm_20 myFile.out myFile.cu
I didn’t know that surfaces were an only-Fermi feature, so that explains why you have to target platforms with compute capability 2.0 or higher, as currently the compiler defaults to compute capability 1.0.

Thank you, I compiled without problems with -arch=sm_20.

Topic		Replies	Views
Help me understand "-Xptxas -dlcm=cg" (take 2) CUDA Programming and Performance	1	6926	November 24, 2010
CUDA 1.1 Bug - Compiler crash (ptxas) w/repro CUDA Programming and Performance	16	8598	May 19, 2008
Peculiar Shared Memory Behavior - NEED HELP! A test program appears to have two arrays sharing v CUDA Programming and Performance	21	2156	June 5, 2011
Error in my code... CUDA Programming and Performance	11	2531	December 19, 2014
compile segmentation fault CUDA Programming and Performance	14	14539	September 2, 2010
How to debug CUDA? CUDA Programming and Performance	2	1183	November 25, 2023
Memory Corruption on a Fermi-Class GPU Error only on Fermis, program works on non-Fermis. CUDA Programming and Performance	18	7107	July 22, 2011
inconsistent results on every run CUDA Programming and Performance	15	5452	October 17, 2011
CUDA V7.0 Release Mode Compile error: nvcc error : 'ptxas' died with status 0xC0000005 (ACCESS_VIOL CUDA Programming and Performance	7	3429	March 29, 2017
[HELP] CUDA 3.2 on Windows 7 x64 with VC++ 2008 Express CUDA Programming and Performance	11	22329	December 12, 2010

Simple surface compilation problem on CUDA

Related topics