Is there any way to check whether a pointer points onto the device or into the host?
No, there is not.
I guess there probably is, but it isn’t “non-destructive” :lol:
I am wondering how or when this would arise in practice. After all, pointers don’t fall from heaven, they get values in a context which you should be able to track implicitly. If you need to pass them to a “swiss army knife” subroutine, then create a flag to signal the context to the subroutine. That is how the CUDA memcpy functions work…
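To make the “flag” idea concrete, here is a minimal sketch in plain C++. All names in it (MemSpace, TaggedPtr, CopyKind, copy_kind) are made up for illustration and are not part of CUDA; the point is only that the origin travels with the pointer, so the subroutine never has to guess.

```cpp
// Hypothetical tag recording where a pointer was allocated.
enum class MemSpace { Host, Device };

// Hypothetical wrapper: the pointer and its origin are passed together.
struct TaggedPtr {
    void*    ptr;
    MemSpace space;
};

// Hypothetical stand-in for the direction flag a caller passes to cudaMemcpy.
enum class CopyKind { HostToHost, HostToDevice, DeviceToHost, DeviceToDevice };

// Derive the copy direction from the two tags instead of inspecting addresses.
inline CopyKind copy_kind(const TaggedPtr& dst, const TaggedPtr& src) {
    if (src.space == MemSpace::Host) {
        return dst.space == MemSpace::Host ? CopyKind::HostToHost
                                           : CopyKind::HostToDevice;
    }
    return dst.space == MemSpace::Host ? CopyKind::DeviceToHost
                                       : CopyKind::DeviceToDevice;
}
```

A “swiss army knife” routine would then take TaggedPtr arguments and branch on the result of copy_kind, with the tag set once at allocation time.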
Just looking at them, CPU and GPU pointers are very different: for example, written in hex they never contain more than one letter, and they start and end with one or more zeros…
Obviously this is with my PC, my OS, my drivers and my applications…
At this level it’s difficult to make assumptions that always work.
Maybe you could wrap cudaMalloc and store all the pointers somewhere, they shouldn’t be too many.
Or maybe you could try to use a non-destructive memory function and check if it worked?
At the end of the day, pointers are just 32-bit values (64-bit on a 64-bit system).
A valid GPU pointer could just as well be a valid CPU pointer, depending on the application. So you just can’t predict in a fool-proof way where a pointer leads.
However, I have an idea – which I myself have not tried out:
Use a “try”/“catch” structure while dereferencing the pointer on the host. If it catches an exception, it is most likely a GPU pointer.
However, this is NOT fool-proof, and I am not sure try-catch will even work in this scenario. Hmm… I guess someone needs to “throw” for the exception to be caught, and I am not sure the OS throws an exception to the application on an invalid page fault. Maybe it’s just my C++ ignorance.
Launch a CUDA kernel that writes something to this pointer and then try to “cudaMemcpy” it out. If that fails, the pointer does NOT belong to that CUDA context for sure. I am just hoping that CUDA allows only valid “cudaMalloc” pointers to be accessible from CUDA kernels.
This is also not fool-proof, because a valid GPU pointer could be a valid CPU pointer as well…
And that will result in an unspecified launch failure and cause you to have to tear down your context. Doesn’t work at all.
In this context, I would like to talk about “cudaThreadExit”. I am raising a new topic for it.
Appreciate if you could leave your views there. Thanks!
Edit: here is the link to the new topic: http://forums.nvidia.com/index.php?showtopic=97490
OK, if I don’t launch the kernel and merely attempt to “cudaMemcpy” 1 byte of data from that pointer, will the error show up neatly?
I am hoping that “cudaMemcpy” would check for validity of the pointer before proceeding. So, it would just fail saying that the pointer is bad.
cudaMemcpy is just as much in the dark as we are. I don’t think it can check validity, it can just try to copy where you asked and hope the GPU’s memory protection doesn’t yell at it.
Knowing where a pointer points to is a problem more general than just CUDA or whichever functions the API exposes. There’s no reliable way of telling whether the number 0xe7d5dd00 is an address in device code, host code, or just the number 3’889’552’640 you use as a constant somewhere in your computations. Unless you’re using some smart pointers or a lookup allocation table, which have their own problems (for example with pointer arithmetic)
thejfasi, is the original post dealing with determining, in your code, where a pointer points to, or when you are debugging it yourself?
cudaMemcpy goes via the driver, and I bet the driver won’t put a bad address on the bus. The driver knows the context and all the memory allocations made in it, so it can easily tell you if a pointer is bad.
Apart from that, as you said, there is no fool-proof way to tell. Nothing prevents a GPU pointer from being valid in the CPU address space as well (some weird coincidence in some application is a possibility).
cudaErrorInvalidDevicePointer is one of the error codes defined in “driver_types.h”.
Thus cudaMemcpy can really find if a pointer is bad.
Or it might segfault by trying to read from the host side. Seriously, there is no way to do this right now.
I hope it refers to “cudaMemcpy”.
It all depends on the direction of the copy. We could always use “cudaMemcpyDeviceToHost”, passing the pointer in question as the source and a valid host pointer as the destination. If the GPU pointer is bad, cudaMemcpy (of 1 byte) would return a “bad address” error.
So at least we can verify whether a GPU pointer is valid or not. But finding out in a fool-proof way whether a pointer is host or GPU is not possible.
Yeah, but setting up a memcpy surely has a high overhead; for example, on my system it can’t go lower than 200-300 µs…
So it doesn’t look like a good way to do a simple check.
IMHO, it would be better to wrap the device in a class and keep a list of the allocated pointers in it; they are few and can be kept ordered, so checking whether a 32-bit value is in the list is fast.
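A rough sketch of such a list, in plain C++ so it compiles without CUDA. DeviceAllocRegistry and its methods are hypothetical names; the assumption is that record() is called right after each successful cudaMalloc and forget() right before the matching cudaFree. Storing the size as well as the start address lets the lookup also recognize pointers into the middle of an allocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical registry of device allocations, ordered by start address.
class DeviceAllocRegistry {
public:
    // Assumed to be called after a successful cudaMalloc(&base, bytes).
    void record(const void* base, std::size_t bytes) {
        ranges_[reinterpret_cast<std::uintptr_t>(base)] = bytes;
    }

    // Assumed to be called before cudaFree(base).
    void forget(const void* base) {
        ranges_.erase(reinterpret_cast<std::uintptr_t>(base));
    }

    // True if p lies inside any recorded allocation. O(log n) in the
    // number of live allocations.
    bool contains(const void* p) const {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto it = ranges_.upper_bound(addr);  // first range starting after addr
        if (it == ranges_.begin()) return false;
        --it;                                 // last range starting at or before addr
        return addr - it->first < it->second;
    }

private:
    std::map<std::uintptr_t, std::size_t> ranges_;  // start address -> size in bytes
};
```

This only answers “did I allocate this through the wrapper?”. It says nothing about pointers that came from anywhere else, so it is a bookkeeping aid rather than a real host/device test.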
Otherwise you could wrap all of the malloc-ed GPU memory into smart pointers…
It is the “context” and not the “device”.
Two pointers with the same value can exist in two different contexts and could physically refer to different memory regions (at least I know you can’t share pointers between contexts).
Also, if you look at the PCI-E memory space occupied by Tesla cards, it is very small (it does not expose the entire 4 GB). So I think the driver has to switch banks and then do the memory copy, all of which are non-cached writes resulting in pipeline stalls. That’s probably why you can’t get it below 200 microseconds or so.
The validity check itself should be as simple as you have indicated.