nvopencc internal error when using textures

I tried to use textures recently and was stopped by the following compiler error:

#$ nvopencc  -TARG:sm_10  -O0 -m32 "simpletest.cpp3.i"  -o "simpletest.ptx"

### Assertion failure at line 783 of ../../be/cg/NVISA/exp_loadstore.cxx:

### Compiler Error in file simpletest.cpp3.i during Code_Expansion phase:

### texture variable NYI

nvopencc INTERNAL ERROR: [...]/cuda/toolkit-1.0-i386/open64/lib//be returned non-zero status 1

# --error 0x100 --

make: *** [obj/debug/simpletest.cu_o] Error 255

The following pretty useless code will trigger it:

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <cutil.h>

texture<unsigned int, 1, cudaReadModeElementType> tex;

__global__ void tex_kern( unsigned int *g_out, unsigned int *g_in ) {
    g_out[threadIdx.x] = tex1Dfetch( tex, threadIdx.x );
//    g_out[threadIdx.x] = g_in[threadIdx.x];
}

int main(void) {
    const int num_threads = 32;
    unsigned int *g_out, *g_in;
    unsigned int h_in[num_threads], h_out[num_threads];

    // allocate global memory for input and output
    CUDA_SAFE_CALL( cudaMalloc((void**)&g_out, sizeof(h_in) ));
    CUDA_SAFE_CALL( cudaMalloc((void**)&g_in , sizeof(h_in) ));

    // fill in some values
    for( int i = 0; i < num_threads; i++ )
        h_in[i] = rand();

    // copy memory to device and bind the texture to it
    CUDA_SAFE_CALL( cudaMemcpy( g_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice ));
    CUDA_SAFE_CALL( cudaBindTexture( 0, tex, g_in, sizeof(h_in) ));

    dim3 grid( 1, 1, 1 );
    dim3 threads( num_threads, 1, 1 );
    tex_kern<<< grid, threads >>>( g_out, g_in );
    CUDA_SAFE_CALL( cudaThreadSynchronize( ));

    // fetch & output result ("! " marks a mismatch)
    CUDA_SAFE_CALL( cudaMemcpy( h_out, g_out, sizeof(h_out), cudaMemcpyDeviceToHost ));
    for( int i = 0; i < num_threads; i++ )
        printf( "%s%u\t%u\n", (h_in[i] == h_out[i])? "  " : "! ", h_in[i], h_out[i] );

    // clean up
    CUDA_SAFE_CALL( cudaFree( g_in ));
    CUDA_SAFE_CALL( cudaFree( g_out ));

    return 0;
}

Am I missing something important here?

My CUDA tool versions:

nvcc: NVIDIA (R) Cuda compiler driver

Built on Wed_Jun_20_21:03:10_PDT_2007

Cuda compilation tools, release 1.0, V0.2.1221

NVIDIA (R) CUDA Open64 Compiler

Cuda compilation tools, release 1.0, V0.2.1221

Built on 2007-06-20 20:57:00 -0700

My system: Ubuntu Feisty Fawn, x86_64 arch.

Can anyone reproduce that error, or (even better) tell me what’s wrong? Any help appreciated!

Thanks,

Robert

For what it is worth, I could not reproduce the problem on an openSUSE 10.2 x86_64 system. :(

I noticed our tools differ by build date but not by version… probably not significant, though.

projects/rsz> uname -a

Linux linux 2.6.18.8-0.5-default #1 SMP Fri Jun 22 12:17:53 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux

projects/rsz> nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver

Copyright (c) 2005-2006 NVIDIA Corporation

Built on Thu_Jun_28_03:22:27_PDT_2007

Cuda compilation tools, release 1.0, V0.2.1221

projects/rsz> /usr/local/cuda/open64/bin/nvopencc -V

NVIDIA (R) CUDA Open64 Compiler

Cuda compilation tools, release 1.0, V0.2.1221

Built on 2007-06-28 03:14:45 -0700

Portions Copyright (c) 2005-2007 NVIDIA Corporation

Portions Copyright (c) 2002-2005 PathScale, Inc.

Portions Copyright (c) 2000-2001 Silicon Graphics, Inc.

All Rights Reserved.

projects/rsz> nvcc rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/

projects/rsz>

You need at least -O2 for nvopencc to compile this correctly, i.e. optimization can’t be disabled in nvopencc.

Wow! Sure enough… a slightly more descriptive error message would be useful :)

These fail:

nvcc -Xopencc -O0 rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/

nvcc -Xopencc -O1 rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/

nvcc -Xopencc -O4 rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/

These work just fine:

nvcc -Xopencc -O2 rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/

nvcc -Xopencc -O3 rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/
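
In case anyone wants to repeat the check on their own install, here is a rough sketch of how the five runs above could be scripted. The function name probe_opt_levels and the NVCC override variable are my own invention for illustration, not part of the toolkit; the only nvcc option it relies on is the -Xopencc pass-through used in the commands above.

```shell
#!/bin/sh
# Sketch only: try each nvopencc optimization level and report which ones compile.
# probe_opt_levels and the NVCC variable are hypothetical names, not toolkit features.
# Usage: probe_opt_levels <source.cu> [extra nvcc arguments...]
probe_opt_levels() {
    src=$1
    shift
    for level in 0 1 2 3 4; do
        # -Xopencc hands the next option (-O<level>) through to nvopencc.
        if "${NVCC:-nvcc}" -Xopencc "-O$level" "$src" "$@" >/dev/null 2>&1; then
            echo "-O$level: ok"
        else
            echo "-O$level: FAILED"
        fi
    done
}
```

Called as `probe_opt_levels rsz.cu -I ~/NVIDIA_CUDA_SDK/common/inc/`, it prints one line per level. Compiler output is thrown away, so rerun a failing level by hand to see the actual error.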

Fully agreed. Works like a charm with -O2…

Thanks for the hint, guys!