~/NVIDIA_CUDA_SDK/bin/linux/release/fluidsGL
cutilCheckMsg() CUTIL CUDA error: cudaBindTexture failed in file <fluidsGL_kernels.cu>, line 54 : invalid texture reference.
And for simpleGL, see the attached screenshot: in the middle of the black background there seems to be a little red point.
Here are some other runs:
$ cd ~/NVIDIA_CUDA_SDK/bin/linux/release/
$ ./alignedTypes
Allocating memory...
Generating host input data array...
Uploading input data to GPU memory...
Testing misaligned types...
uint8...
cutilCheckMsg() CUTIL CUDA error: testKernel() execution failed
in file <alignedTypes.cu>, line 223 : invalid device function .
$ ./template
cutilCheckMsg() CUTIL CUDA error: Kernel execution failed in file <template.cu>, line 117 : invalid device function .
$ ls | grep est
bandwidthTest
$ ./bandwidthTest
Running on......
device 0:GeForce 8600M GT
Quick Mode
Host to Device Bandwidth for Pageable memory
.
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 1055.7
Quick Mode
Device to Host Bandwidth for Pageable memory
.
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 565.8
Quick Mode
Device to Device Bandwidth
.
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 8681.5
&&&& Test PASSED
Press ENTER to exit...
$ ./BlackScholes
Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.
Executing Black-Scholes GPU kernel (512 iterations)...
cutilCheckMsg() CUTIL CUDA error: BlackScholesGPU() execution failed in file <BlackScholes.cu>, line 195 : invalid device function .
$ ./clock
Test PASSED
time = -1020950724
Press ENTER to exit...
$ ./convolutionFFT2D
Input data size : 1000 x 1000
Convolution kernel size : 7 x 7
Padded image size : 1006 x 1006
Aligned padded image size : 1024 x 1024
Allocating memory...
Generating random input data...
Creating FFT plan for 1024 x 1024...
Uploading to GPU and padding convolution kernel and input data...
...initializing padded kernel and data storage with zeroes...
...copying input data and convolution kernel from host to CUDA arrays
...binding CUDA arrays to texture references
cudaSafeCall() Runtime API error in file <convolutionFFT2D.cu>, line 241 : invalid texture reference.
$ ./dct8x8
CUDA sample DCT/IDCT implementation
===================================
Loading test image: barbara.bmp... [512 x 512]... Success
Running Gold 1 (CPU) version... Success
Running Gold 2 (CPU) version... Success
cudaSafeCall() Runtime API error in file <dct8x8.cu>, line 245 : invalid texture reference.
Running CUDA 1 (GPU) version...
And after commenting out the line that triggers `Kernel execution failed in file <bitonic.cu>, line 79 : invalid device function`,
I get
$ bin/linux/release/bitonic
Test FAILED
Press ENTER to exit...
So, the question is: is my computer unable to do CUDA? Is it a driver problem? Some info on my system:
I think I can install the 180 driver; the problem is that when a kernel update comes in through “updates” and I don’t notice it, it will break the system.
Also, I don’t know the “correct” way of doing this and avoiding breakage on a kernel update.
I mean, for now I will do something like:
1.- Disable 177 from restricted drivers
2.- sudo /etc/init.d/gdm stop
3.- sudo ./NVIDIA-Linux-x86-180.06-pkg1.run
4.- restart the system
and when a kernel update comes, I will
1.- Re-enable the default 177 restricted driver from the repo (I hope this overwrites the manually installed 180)
After trying to install 180, I found that I was using gcc 4.1, but 180 complained about the kernel being compiled with gcc 4.3. I can relink gcc to point to 4.3, but the question that raises is: if the kernel’s interface is built for gcc 4.3 and SDK 2.1 doesn’t work OK with 4.3… will this setup just work?
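(For what it’s worth, nvcc can be pointed at a specific host compiler instead of relinking `/usr/bin/gcc` system-wide, so the kernel module can be built with gcc 4.3 while the SDK keeps using gcc 4.1. A hedged fragment for the SDK’s common makefile; the exact variable name in your `common.mk` may differ, and `/usr/local/gcc41` is a hypothetical directory you would create holding a symlink `gcc -> /usr/bin/gcc-4.1`:

```make
# Tell nvcc which host compiler to use for host code, independent of the
# system default gcc. --compiler-bindir expects a directory that contains
# an executable named "gcc".
NVCCFLAGS += --compiler-bindir=/usr/local/gcc41
```

That way a kernel update rebuilt with gcc 4.3 and the CUDA SDK builds don’t have to share one `gcc` symlink.)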
So, after installing the updates, should I edit xorg.conf to put vesa as the driver, and then, after checking that everything runs OK with the default vesa driver, reinstall the newest drivers?
By the way, the examples work after updating to:
$ glxinfo |grep NVIDIA
server glx vendor string: NVIDIA Corporation
client glx vendor string: NVIDIA Corporation
OpenGL vendor string: NVIDIA Corporation
OpenGL version string: 2.1.2 NVIDIA 180.06
OpenGL shading language version string: 1.20 NVIDIA via Cg compiler
So after this, hehe, where should I start? I want to first do some point interpolations and things like that, so any pointers are welcome.
I think I have understood the idea of writing code for one thread that then runs n times, but the partitioning seems to be something I need to test by hand.
Also I need to “model” my mind to write things in parallel… ummm, for example if I need to calculate the Fibonacci function (I write as I go, so I haven’t tried searching yet), or other functions that depend on previous data where I don’t have an input array or anything like that, how do I approach this type of problem? (At the moment I think of them as problems that generate content [no initial array] and problems that process content [have an initial array].)
By the way, does something like #cuda exist on IRC? Or should I stick to “General CUDA GPU Computing Discussion” and “CUDA Programming and Development” (though I don’t see the line that separates them… ummm :wacko:).
I don’t know of better sources than here. General is about hardware and such, I think; Programming & Development is more about things encountered while programming. But that is my idea ;)
It takes a while to ‘think parallel’, in my experience. I think I really got it after about 4 months, during my second project (the first project was embarrassingly parallel).
I don’t really understand what you meant, but you can have a kernel that generates data, and after that another kernel that works on the data generated by the first kernel, if that was your question ;)
I’m referring to the first part (when, at least for the moment, the problem seems too linear or dependent on calculating everything that came before, and can’t be broken into independent blocks or units of processing)…
So, I know for example that the Fibonacci sequence is recursive and can be changed to be iterative, but if we follow the normal way (generating the sequence from the beginning up to n), how can this generation be parallelised?
I mean, “my problem” is that I want to generate, for example, just 1, 1, 2, 3, 5, 8, 13, 21. That is 8 numbers, so I divide the problem into a: 1 to 3 and b: 5 to 21. I still need to generate a before starting with b, but in the long run, thinking that I would let one part generate 512 numbers and the next part the other 512 and so on, how can this type of problem (which looks very linear to me and depends on calculating previous data) be done in parallel?
So generating the sequence from the first two elements 1, 1 and then continuing to add doesn’t seem suitable for parallelising. I could also take Pascal’s triangle (not a single line) and generate the pyramid with each line depending on the previous one (I mean, the calculation seems sequential and doesn’t look like it can be done in parallel).
Sorry for not being able to explain this correctly, but I’m not a native English speaker and I’m new to this way of solving problems in parallel; I hope I managed it this time.
mmm, I’m doing this for the “sake” of curiosity… :blink:
I don’t think you can easily generate Fibonacci numbers in parallel. Those you would generate on the CPU, move to the GPU, and process afterwards. There are some things not easily done on a GPU, I’m afraid ;)