Problem when defining surface reference

Hi,

i want to bind a 3D CUarray (a 3D texture of half precision floats) to a surface reference, to set its values. There’s a dedicated kernel for this. But as soon as i declare the surface reference in the kernel, the JIT compiler aborts compiling the PTX file with error message

ptxas application ptx input, line 65; fatal   : Parsing error near '.surf': syntax error

.

This is the relevant parts from the kernel (other stuff stripped)

#include <cuda.h>

surface<void, 3> volume_surface_reference;

__global__ void loader(int param, int *ptr) {

//    surf3Dwrite(0, volume_surface_reference, 2, 1, 1, cudaBoundaryModeTrap);

}

and this from the PTX (error caused by last line)

.entry _Z6loaderiPi (

                .param .s32 __cudaparm__Z6loaderiPi_param,

                .param .u64 __cudaparm__Z6loaderiPi_ptr)

        {

        .reg .u32 %r<4>;

        .reg .u64 %rd<3>;

        .loc    14      19      0

$LDWbegin__Z6loaderiPi:

        .loc    14      27      0

        ld.param.s32    %r1, [__cudaparm__Z6loaderiPi_param];

        add.s32         %r2, %r1, 1;

        ld.param.u64    %rd1, [__cudaparm__Z6loaderiPi_ptr];

        st.global.s32   [%rd1+0], %r2;

        .loc    14      28      0

        exit;

$LDWend__Z6loaderiPi:

        } // _Z6loaderiPi

        .surf .u64 volume_surface_reference;

.

Also, if i additionally try to compile with surf3Dwrite() uncommented, nvcc aborts with

Error: External calls are not supported (found non-inlined call to __surf3Dwriteu1)

.

I’m using the driver API version 4.2 and a GTX 570 with capability 2.0. The 3D CUarray will also be used by another kernel by texture reference and it is defined like this

CUDA_ARRAY3D_DESCRIPTOR array3d_desc;

array3d_desc.Format = CU_AD_FORMAT_HALF;

array3d_desc.NumChannels = 1;

array3d_desc.Width  = mVolume->getExtent().x;

array3d_desc.Height = mVolume->getExtent().y;

array3d_desc.Depth  = mVolume->getExtent().z;

array3d_desc.Flags  = CUDA_ARRAY3D_SURFACE_LDST;

if (!Cutil::safeSyncCall(cuArray3DCreate(&volume_array, &array3d_desc))) {

    return false;

}

if (!Cutil::safeSyncCall(cuModuleGetTexRef(&volume_texture_reference, dvrModule, "volume_texture_reference"))) {

    return false;

}

// Disabled, because not working by now.

//if (!Cutil::safeSyncCall(cuModuleGetSurfRef(&volume_surface_reference, loaderModule, "volume_surface_reference"))) {

//    return false;

//}

Cutil::safeSyncCall(cuTexRefSetFilterMode(volume_texture_reference, CU_TR_FILTER_MODE_LINEAR));

Cutil::safeSyncCall(cuTexRefSetAddressMode(volume_texture_reference, 0, CU_TR_ADDRESS_MODE_CLAMP));

Cutil::safeSyncCall(cuTexRefSetAddressMode(volume_texture_reference, 1, CU_TR_ADDRESS_MODE_CLAMP));

Cutil::safeSyncCall(cuTexRefSetAddressMode(volume_texture_reference, 2, CU_TR_ADDRESS_MODE_CLAMP));

Cutil::safeSyncCall(cuTexRefSetFlags(volume_texture_reference, 0));

Cutil::safeSyncCall(cuTexRefSetFormat(volume_texture_reference, CU_AD_FORMAT_HALF, 1));

if (!Cutil::safeSyncCall(cuTexRefSetArray(volume_texture_reference, volume_array, CU_TRSA_OVERRIDE_FORMAT))) {

    return false;

}

// Disabled, because not working by now.

//if (!Cutil::safeSyncCall(cuSurfRefSetArray(volume_surface_reference, volume_array, 0))) {

//    return false;

//}

.

Any ideas?

Cheers, Oliver

The simpleSurfaceWrite SDK sample compiles and runs fine. Maybe i’m missing something obvious?

It’s working. I just did not realize that nvcc does not automatically assume SM 20 when you use surfaces. You have to add it explicitly to the compiler options: -arch sm_20.