The runtime API is thread safe - there are plenty of examples floating around using either pthreads or Boost threads which work perfectly. The only thing you have to be careful about is that contexts are bound to both devices and threads and have the lifetime of the thread that established them. So you need to make sure each thread only ever interacts with its own GPU, and that you do something to keep the threads alive for as long as you need the context the thread holds.
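For what it's worth, here is a minimal sketch of the one-thread-per-GPU pattern described above. It uses std::thread rather than pthreads or Boost, and it assumes a toolkit recent enough to expose cudaMemGetInfo() in the runtime API. The worker() function is a made-up name for illustration, and calling cudaFree(0) to force context creation is a common idiom rather than anything the documentation mandates. It is host-only code, so it should build with nvcc or with a regular C++ compiler linked against cudart.

```cuda
#include <cuda_runtime.h>

#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical worker: each host thread owns exactly one device
// and only ever talks to that device.
static void worker(int device)
{
    // Bind this host thread to its device before any other runtime call.
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "device %d: %s\n", device, cudaGetErrorString(err));
        return;
    }

    // Assumption: cudaFree(0) is used here only as a harmless call
    // that makes the runtime establish the context.
    err = cudaFree(0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "device %d: %s\n", device, cudaGetErrorString(err));
        return;
    }

    size_t freeBytes = 0, totalBytes = 0;
    err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "device %d: %s\n", device, cudaGetErrorString(err));
        return;
    }
    std::printf("device %d: %zu of %zu bytes free\n", device, freeBytes, totalBytes);

    // Any further work for this device belongs here, while the thread
    // (and therefore its context) is still alive.
}

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    // One thread per device; join() keeps main alive until all are done.
    std::vector<std::thread> threads;
    for (int d = 0; d < count; ++d)
        threads.emplace_back(worker, d);
    for (auto& t : threads)
        t.join();
    return 0;
}
```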
I am not familiar with the Qt threading API so it is pretty hard to parse what the start() and wait() methods your code calls do (or where the run() method which interacts with the GPU is actually called from).
I guess this establishes the thread context appropriately and then cudaMemGetInfo() can do its job.
The only question that remains is why cudaMemGetInfo() doesn’t establish the context itself.
For those struggling with the threading API for some reason: run() is the actual code that the thread runs; start(), surprisingly enough, starts the thread; and wait() waits for it to finish.
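To make that concrete, here is a rough sketch of a QThread subclass whose run() does the GPU query. This is not the code from the original post: the class name MemQueryThread and its members are invented for illustration, and cudaFree(0) is again only an assumed way of forcing the context into existence before cudaMemGetInfo() is called.

```cuda
#include <QCoreApplication>
#include <QThread>
#include <cuda_runtime.h>

#include <cstdio>

// Hypothetical example class, not taken from the original poster's code.
class MemQueryThread : public QThread
{
public:
    explicit MemQueryThread(int device) : device_(device) {}

    cudaError_t status() const { return status_; }
    size_t freeBytes() const { return free_; }
    size_t totalBytes() const { return total_; }

protected:
    // run() executes in the new thread, so every CUDA call below
    // is made from the thread that ends up owning the context.
    void run() override
    {
        status_ = cudaSetDevice(device_);
        if (status_ == cudaSuccess)
            status_ = cudaFree(0);  // assumed idiom to force context creation
        if (status_ == cudaSuccess)
            status_ = cudaMemGetInfo(&free_, &total_);
    }

private:
    int device_;
    cudaError_t status_ = cudaSuccess;
    size_t free_ = 0;
    size_t total_ = 0;
};

int main(int argc, char** argv)
{
    QCoreApplication app(argc, argv);

    MemQueryThread t(0);
    t.start();  // spawns the thread, which then calls run()
    t.wait();   // blocks until run() has returned

    if (t.status() != cudaSuccess) {
        std::fprintf(stderr, "query failed: %s\n", cudaGetErrorString(t.status()));
        return 1;
    }
    std::printf("%zu of %zu bytes free\n", t.freeBytes(), t.totalBytes());
    return 0;
}
```

Note that the results are only read after wait() has returned, so the main thread never touches the GPU itself.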
cudaMemGetInfo() is not in fact a function of cutil. It is in the cudart DLL and is documented in the official CUDA runtime documentation.
That documentation says nothing about needing a valid context, and in fact, for a user of the cudart API, the concept of contexts should be transparent.
This is clearly a bug in cudart.