Documentation bugs

Can we have a sticky to list documentation bugs? There are a lot of them, and they are not worth cluttering up the forum with a topic for each. One that I noticed in 0.9 is the expanded cudaGetDeviceProperties(), where I think clockRate is in kHz, not Hz. If it really were in Hz as documented, it would very soon turn negative!
Eric
PS Thanks for the enhancement here.
(another obvious one that has never been corrected is p44 in 0.9: “struct(16)” in the code boxes)
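
On the clockRate point above, here is a minimal sketch of why the unit matters. It assumes clockRate is held in a 32-bit signed int, and it uses 1.35 GHz purely as an illustrative clock. The helper names are made up for this example and are not part of the CUDA API.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: does a clock value fit in a 32-bit signed int?
bool fits_in_int32(std::int64_t value) {
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: convert a clock in kHz to Hz using 64-bit math,
// so the conversion itself cannot overflow for realistic clocks.
std::int64_t khz_to_hz(std::int64_t khz) {
    assert(khz >= 0);
    return khz * 1000;
}
```

A 1.35 GHz clock is 1,350,000 in kHz and 1,350,000,000 in Hz, and both still fit. Anything above roughly 2.147 GHz no longer fits when expressed in Hz, which is where a 32-bit int would typically show up as a negative number, while the same clock in kHz stays comfortably small.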

0.9 guide p63:

The number of multiprocessors appears not to be available anywhere. The other stats are in sections D.1.2 and E.2.6; the clock rate in E.2.6 should also be in kHz (I only mentioned D.1.2 last time). Also, is there an extra entry “textureAlign” in both properties structs???

p47 in 1.0 guide:

The first time I read that in 0.8, I had to write a test program just to make sure this was not some sort of funny C compiler. It should say 8 bytes, and since it is the same as the example above it, it could be deleted.

PTX manual:

  • all examples throughout section 5.3 show initialization of variables in the .shared or .local state space, which is wrong because these state spaces do not allow initialization

  • the statement in section 5.3.1 “texture variables do not have a type” and the examples in section 5.1.8 are wrong. ptxas does not accept .tex without type

  • section 6.4.5 mentions immediate value operands. To avoid confusing the programmer, it should also state that the ALUs use big-endian format.

  • section 7.7.5 is not fully documented

  • section 7.7.6: what is the difference between ret at top level and exit? Why is there no exit.uni?

Peter

I don’t understand why this one was not reported earlier… In section 5.2 of the guide the formula “R / (B * ceil(T, 32))” should read “R / (B * ceil(T, 64))”. I found this as soon as I tried benchmarking real code, and checked it for 1, 3, and 5 warp blocks. This first appeared in the 0.81 manual - perhaps it is only broken for the 64-bit driver that I am using.
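
To make the difference between the two formulas concrete, here is a small sketch. It assumes the guide's meaning of the symbols as I read it (R is the number of registers per multiprocessor, B the number of active blocks per multiprocessor, T the threads per block, and ceil(T, n) is T rounded up to a multiple of n). The function names and the value R = 8192 are assumptions chosen for illustration.

```cpp
#include <cassert>

// Round t up to the nearest multiple of granularity; this is how the
// guide's ceil(T, 32) notation is interpreted in this sketch.
int round_up(int t, int granularity) {
    assert(t > 0 && granularity > 0);
    return ((t + granularity - 1) / granularity) * granularity;
}

// Registers available per thread: R / (B * ceil(T, granularity)),
// using integer division (a thread cannot use a fraction of a register).
int regs_per_thread(int R, int B, int T, int granularity) {
    assert(R > 0 && B > 0);
    return R / (B * round_up(T, granularity));
}
```

With R = 8192 and B = 1, a 1-warp block (T = 32) gives 256 registers per thread at granularity 32 but 128 at granularity 64; 3 warps give 85 versus 64, and 5 warps give 51 versus 42. Blocks with an even number of warps give identical results under both formulas, so only odd warp counts can tell them apart.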

It is not so much a bug in the documentation, but as far as I’m concerned it deserves an annotation in bold red letters in the programming guide:

When using a texture to exploit the linear interpolation capacity and the caching of CUDA, the coordinates used to read information from the texture are not the exact coordinates in the CUDA array. For example, when you configure linear interpolation for texture fetches and you write a value to index 2 of your CUDA array, reading it using tex1D(texture, 2.0f) will return the linear interpolation between index 1 of the CUDA array and index 2 of the CUDA array instead of the value at index 2 of the CUDA array.

This may be familiar to everyday graphics users, but it would be nice to mention it explicitly for the other programmers using CUDA.
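
The behaviour described above can be reproduced with a host-side emulation. This is only a sketch in plain C++, not device code: it assumes unnormalized coordinates, clamp addressing, and the filtering rule where texel centers sit at i + 0.5, and it ignores the reduced precision the hardware uses for the interpolation weight. The function name is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Host-side emulation (not CUDA code) of a 1D linearly filtered fetch
// with unnormalized coordinates and clamp addressing.
float sample_linear_1d(const std::vector<float>& texels, float x) {
    assert(!texels.empty());
    const int last = static_cast<int>(texels.size()) - 1;

    const float xb = x - 0.5f;            // shift so texel centers are at i + 0.5
    const float fl = std::floor(xb);
    const float alpha = xb - fl;          // interpolation weight in [0, 1)

    // Clamp both neighbouring indices to the valid range.
    const int i0 = std::min(std::max(static_cast<int>(fl), 0), last);
    const int i1 = std::min(std::max(static_cast<int>(fl) + 1, 0), last);

    return (1.0f - alpha) * texels[i0] + alpha * texels[i1];
}
```

For an array holding {0, 10, 20, 30}, sampling at 2.0f returns 15 (halfway between index 1 and index 2), while sampling at 2.5f returns exactly 20, the value stored at index 2. Under these assumptions, adding 0.5f to the integer index is what recovers the stored value.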

Btw. Section F2 on page 111 of Programming Guide 1.0 says: ‘Figure F-2 illustrates nearest-point sampling for a one-dimensional texture with N=4’, but it should be ‘Figure F-2 illustrates linear filtering for a one-dimensional texture with N=4’.