What are .ppm and .raw files for?

I am new to CUDA and I took a look at the samples that came with the install kit. I have played a little with the project from NVIDIA Corporation\CUDA Samples\v8.0\2_Graphics\simpleTexture3D and I have seen that a function

reads from the file data/Bucky.raw, and that all of the projects have data folders.
I tried to see what is in there, but I think it is binary, so I can't make sense of it.
Can anyone tell me what those files are for?

Thanks in advance,

.ppm is the suffix used for images stored in a simple, unsophisticated format (see Wikipedia: Image file format - Wikipedia).

.raw is likely a data file stored as pure binary data (that is, without any meta data). It may be an image retrieved directly from a camera, but could be any sort of binary data file.

Thanks :)

I have studied this file for a while and I don't understand how it was created, because I tried to convert it to jpg or png but converters don't recognize the file.
I don't understand what the file v8.0\2_Graphics\simpleTexture3D\data\Bucky.raw looks like exactly, and how the output from v8.0\2_Graphics\simpleTexture3D is obtained.
It only opens in a hex or binary viewer.

As I stated, the .raw file does not have to be an image, and it can be stored in any proprietary format the creator chose. If you want to understand what is stored in this file, I would suggest taking a look at the details of loadVolumeData(). I guess that function builds some sort of data structure from the file contents. The file itself would have been created by an inverse process that stores out such a data structure using a suitable algorithm.

Function loadVolumeData() indeed has a call to a function named loadRawFile(), but that function reads the file into a buffer, and that's all it does.

// Load raw data from disk
uchar *loadRawFile(const char *filename, size_t size)
{
    FILE *fp = fopen(filename, "rb");

    if (!fp)
    {
        fprintf(stderr, "Error opening file '%s'\n", filename);
        return 0;
    }

    uchar *data = (uchar *) malloc(size);
    size_t read = fread(data, 1, size, fp);
    fclose(fp);

    printf("Read '%s', %zu bytes\n", filename, read);

    return data;
}

Not very surprising, given that this is raw data. So loadRawFile() reads the data into an array of bytes, and a hypothetical corresponding storeRawFile() would store an array of bytes to a file.

To find out what this array of bytes represents semantically, you will have to dig through the code and see how the bytes from the array are being used in the code. The function name “loadVolumeData” suggests that the array contains volume data, e.g. density values on a rectangular grid, sometimes referred to as a “map”.

My hypothesis is that this question is an XY-problem, and the actual question is: “I want to modify the simpleTexture3D example code so it uses my own volume data instead of the Bucky dataset. How can I generate my own volume data in the format expected by the simpleTexture3D code?”
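If that is the actual question: since the file carries no header, creating a compatible dataset just means writing width × height × depth unsigned bytes in x-fastest order. A minimal host-side sketch (the radial density used here is an arbitrary example, not what Bucky.raw contains):

```cpp
#include <cmath>
#include <cstdio>

// Sketch: generate a 32 x 32 x 32 volume of 8-bit density values and store
// it as a headerless .raw file, in the layout loadRawFile() expects
// (x varies fastest, then y, then z). The radial falloff below is an
// arbitrary example density, not what Bucky.raw actually contains.
static bool writeRawVolume(const char *filename)
{
    const int N = 32;
    static unsigned char volume[N * N * N];

    for (int z = 0; z < N; z++)
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++) {
                float dx = x - N / 2.0f, dy = y - N / 2.0f, dz = z - N / 2.0f;
                float r = sqrtf(dx * dx + dy * dy + dz * dz) / (N / 2.0f);
                float d = (r < 1.0f) ? (1.0f - r) : 0.0f;       // density in [0,1]
                volume[(z * N + y) * N + x] = (unsigned char)(d * 255.0f);
            }

    FILE *fp = fopen(filename, "wb");
    if (!fp)
        return false;
    size_t written = fwrite(volume, 1, sizeof(volume), fp);
    fclose(fp);
    return written == sizeof(volume);
}
```

Pointing volumeFilename at the resulting file (and keeping volumeSize at 32, 32, 32) should then be all the sample needs.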

I am not familiar with this CUDA example app, or the Bucky data set. Is it possible that Bucky is the same 32 x 32 x 32 volume of 8-bit data available from The Volume Library (Das CMS-System des RRZE › RRZE CMS) as “Bucky Ball”?

This item may be of interest:

c++ - What is .raw file format in cuda sample code? - Stack Overflow

Seems like just what the doctor ordered :-) Upvote on SO.

It is exactly the same file as the Bucky Ball from Das CMS-System des RRZE › RRZE CMS.
The code reads from the file and creates voxels, and it defines a variable volumeSize, something like this:

bool linearFiltering = true;
bool animate = true;
int *pArgc = nullptr;
char **pArgv = nullptr;
const char *sSDKsample = "simpleTexture3D";
const char *volumeFilename = "Bucky.raw";
const cudaExtent volumeSize = make_cudaExtent(32, 32, 32);
const uint width = 512;
const uint height = 512;
const dim3 blockSize(16, 16, 1);  // here
const dim3 gridSize(width / blockSize.x, height / blockSize.y);  // here
float w = 0.5;  // texture coordinate in z
GLuint pbo;     // OpenGL pixel buffer object
struct cudaGraphicsResource *cuda_pbo_resource; // CUDA Graphics Resource (to transfer PBO)

And in the .cu file I think it multiplies each pixel by 255, something like this:

texture<uchar, 3, cudaReadModeNormalizedFloat> tex;  // 3D texture

cudaArray *d_volumeArray = 0;

__global__ void
d_render(uint *d_output, uint imageW, uint imageH, float w)
{
    uint x = __umul24(blockIdx.x, blockDim.x) + threadIdx.x;
    uint y = __umul24(blockIdx.y, blockDim.y) + threadIdx.y;

    float u = x / (float)imageW;
    float v = y / (float)imageH;

    // read from 3D texture
    float voxel = tex3D(tex, u, v, w);

    if ((x < imageW) && (y < imageH))
    {
        // write output color
        uint i = __umul24(y, imageW) + x;
        d_output[i] = voxel * 255;
    }
}

I have pasted above all the declarations of the variables, and I have width, height, and w = 0.5 (could w be the coordinate on the z axis?)

Note that the texture read mode is cudaReadModeNormalizedFloat, so the texture access returns a float in [0,1]. The multiplication with 255 then turns this back into an 8-bit grey-scale value. Since the source data is provided as 8-bit data in the case of “Bucky”, this dual conversion probably isn’t necessary, but it adds flexibility to the code, as it can be trivially modified to deal with different data formats than single-byte for input and output.
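On the CPU, that dual conversion can be sketched like this (assuming the usual unorm8 convention, texel / 255.0):

```cpp
// Sketch of what cudaReadModeNormalizedFloat does to an 8-bit texel
// (maps v to v / 255.0f in [0,1]), and what the kernel's "* 255" does
// afterwards. Assumes the standard unorm8 convention.
static float normalizeTexel(unsigned char v)
{
    return v / 255.0f;          // texture read with normalized-float mode
}

static unsigned char denormalizeTexel(float f)
{
    return (unsigned char)(f * 255.0f);   // kernel's conversion back (truncating)
}
```

Note the kernel truncates rather than rounds; the sketch mirrors that.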

I have studied it a little bit further and I understand how it works; what I still don't understand is this part:

uint x = __umul24(blockIdx.x, blockDim.x) + threadIdx.x;
uint y = __umul24(blockIdx.y, blockDim.y) + threadIdx.y;

float u = x / (float)imageW;
float v = y / (float)imageH;

// read from 3D texture
float voxel = tex3D(tex, u, v, w);

x and y will each be a number (the thread's global index), and then u and v store that number divided by the image width and height. This confuses me a little.
And this line also:

uint i = __umul24(y, imageW) + x;

What is the point of storing in “i” something like thread_number * image_width + thread_number?

i seems to be an element offset into the current image slice at z coordinate “w”.
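To see the arithmetic without the CUDA machinery, here is a plain C++ sketch of the same index math: each thread derives one (x, y) pixel coordinate from its block and thread indices, and y * imageW + x then converts that 2D position into a 1D offset into the row-major image buffer:

```cpp
// Mirror of the kernel's index math. Each thread computes its pixel
// coordinate from its block and thread indices...
static unsigned int pixelCoord(unsigned int blockIdx, unsigned int blockDim,
                               unsigned int threadIdx)
{
    return blockIdx * blockDim + threadIdx;   // what __umul24(...) + threadIdx yields
}

// ...and then flattens (x, y) into a 1D offset: skip y complete rows of
// imageW pixels, then move x pixels into the current row.
static unsigned int flattenIndex(unsigned int x, unsigned int y,
                                 unsigned int imageW)
{
    return y * imageW + x;
}
```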

Note that nowadays the use of __umul24() is not encouraged. Only compute capability 1.x devices lacked a 32-bit integer multiply unit, which is why the FP unit’s 24-bit multiplier was used. Hence on such devices the 24-bit multiplication had about twice the arithmetic throughput of a 32-bit multiplication.

But on today’s GPUs the 24 bit multiplication needs to be emulated with extra instructions, making it actually slower than 32 bit multiplications.


I don’t know whether __umul24() is a “well-known” operation to the compiler or is treated as a black box. It is likely the latter, in which case its use can interfere with optimizations such as strength reduction and induction variable creation.
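For reference, the semantics of __umul24() (multiply the low 24 bits of each operand, keep the low 32 bits of the product) can be sketched in plain C++ as:

```cpp
// Sketch of __umul24() semantics: multiply the low 24 bits of each
// operand, keeping the low 32 bits of the product. Any bits above
// bit 23 of the inputs are simply ignored.
static unsigned int umul24_emulated(unsigned int a, unsigned int b)
{
    return (a & 0xFFFFFFu) * (b & 0xFFFFFFu);
}
```

As long as both operands fit in 24 bits, which is true here since image and grid dimensions are far below 2^24, the result matches an ordinary 32-bit multiply.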

Ok, I understand. I have been digging for about 5 hours and things are clearer now.
Just one more question, guys, if possible.

I still don't understand why cudaExtent volumeSize is (32, 32, 32). Are those the number of bits/bytes read?

const cudaExtent volumeSize = make_cudaExtent(32, 32, 32);

And I also have some problems with these two declarations:

GLuint pbo;     // OpenGL pixel buffer object
struct cudaGraphicsResource *cuda_pbo_resource; // CUDA Graphics Resource (to transfer PBO)

Is the PBO (pixel buffer object) read pixel by pixel and saved in cuda_pbo_resource? Because there is a function named render that does something like this:

// render image using CUDA
void render()
{
    // map PBO to get CUDA device pointer
    checkCudaErrors(cudaGraphicsMapResources(1, &cuda_pbo_resource, 0));
    size_t num_bytes;
    checkCudaErrors(cudaGraphicsResourceGetMappedPointer((void **)&d_output, &num_bytes, cuda_pbo_resource));
    //printf("CUDA mapped PBO: May access %ld bytes\n", num_bytes);

    // call CUDA kernel, writing results to PBO
    render_kernel(gridSize, blockSize, d_output, width, height, w);

    getLastCudaError("render_kernel failed");

    if (g_GraphicsMapFlag)
    {
        checkCudaErrors(cudaGraphicsUnmapResources(1, &cuda_pbo_resource, 0));
    }
}

Here may be an explanation for that strange division:

float u = x / (float)imageW;
float v = y / (float)imageH;

Here :
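In short: the division is what decouples the 512 × 512 output image from the 32 × 32 × 32 volume. u = x / imageW lands in [0, 1), and because the texture uses normalized coordinates, tex3D() scales that back onto the volume, whatever its dimensions. A nearest-neighbor sketch of the mapping (the real texture unit can additionally interpolate):

```cpp
// Sketch of normalized texture addressing: pixel x of an imageW-wide
// image becomes a coordinate u in [0,1), which the texture unit scales
// by the volume dimension to select a texel. 512 screen pixels thus map
// onto 32 voxels without the kernel knowing the volume size.
static unsigned int texelFromPixel(unsigned int x, unsigned int imageW,
                                   unsigned int volumeDim)
{
    float u = x / (float)imageW;            // normalized coordinate in [0,1)
    return (unsigned int)(u * volumeDim);   // nearest texel (no filtering)
}
```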

Thanks all for the explanations. I am mainly using CUDA for some physical simulations, not for 3D rendering.

However, I want to plot the results of the simulation (3D volumetric data) after each simulation iteration (compute grid execution). Suppose I have a dev_ans array in device memory; how do I modify this example to plot it? I would really appreciate it if anyone has some time to point me somewhere. Thanks in advance.

If you want to use the methodology being discussed here, you should probably start by understanding how the simpleTexture3D sample code works.

After that, you will note that the “texture” in question is based on a cudaArray. So if your data is in an “ordinary” device allocation dev_ans, then the first step could be to convert it to a cudaArray allocation using e.g. a cudaMemcpy2DToArray or cudaMemcpy3D operation with the cudaMemcpyDeviceToDevice token. After that you should be able to use the structure of the sample code more or less directly.
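A sketch of that conversion step, assuming dev_ans holds 32 × 32 × 32 unsigned bytes in x-fastest order (untested here; the helper name volumeArrayFromLinear is made up for illustration, but the runtime calls are standard CUDA API):

```cuda
#include <cuda_runtime.h>

// Sketch: copy a linear device buffer (uchar voxels, x varies fastest)
// into a cudaArray suitable for use as a 3D texture. Error checking
// omitted for brevity; the dev_ans layout is an assumption.
cudaArray *volumeArrayFromLinear(const unsigned char *dev_ans, cudaExtent extent)
{
    cudaArray *array = 0;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<unsigned char>();
    cudaMalloc3DArray(&array, &desc, extent);

    cudaMemcpy3DParms p = {0};
    p.srcPtr = make_cudaPitchedPtr((void *)dev_ans,
                                   extent.width * sizeof(unsigned char),
                                   extent.width, extent.height);
    p.dstArray = array;
    p.extent = extent;
    p.kind = cudaMemcpyDeviceToDevice;   // device linear -> device array
    cudaMemcpy3D(&p);

    return array;
}
```

After such a copy (repeated each iteration), the sample's texture setup and render path should apply unchanged.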

This also presumes you have understood the concept of “transmissivity” that is referred to in this thread, and that that representation makes sense for your viewing. I’m personally not optimistic that this represents a flexible “viewer” for a wide range of use cases, but to each their own. For a more general approach, although it may be a fair amount of work to learn how to use it, a tool intended for this task such as ParaView might be a worthwhile option.
