I noticed my little CUDA memory benchmark is capable of “halting” the graphics system, or other parts of the graphics card (perhaps the memory system), while it runs. This means any running game will temporarily freeze up… This worries me a bit. It’s a modest GT 520.
So apparently it’s possible to somehow freeze up other apps/games/etc.
My question is basically:
Are there any plans to make future graphics cards “throttled” somehow? A bit like multi-threading on a CPU, where applications/threads/processes get a time slice from the operating system in an attempt to keep everything running smoothly.
Perhaps CUDA/the driver already does some modest threading and tries to share the graphics card with other apps. However, apparently the graphics card’s memory system is not “time sliced” or “tokenized”? Perhaps this could even be a “vector of attack” on multi-user systems.
Are there plans to somehow “tokenize” or “throttle” the graphics card’s memory system among multiple applications, to make sharing a bit fairer? Perhaps the user should get some control over which apps get more, and which get less, “graphics card bandwidth”.
“A bit like multi-threading on a CPU, where applications/threads/processes get a time slice from the operating system in an attempt to keep everything running smoothly.”
would certainly have some similarities with dynamic parallelism: pushing the current kernel state into global memory. And DP is not always that cheap, is it?
I don’t think so. My request is to “bandwidth throttle the GPU/gfx card” based on applications, which for now are tied to the CPU. Dynamic parallelism happens within the GPU itself, within a single application/CUDA kernel launch. I don’t want dynamic parallelism throttling itself or anything like that; I want the applications to be throttled. If you meant “context switches” being pushed onto a stack and dynamic parallelism being pushed onto a stack as the similarity… then “stacking similarity” is not much of a similarity. I couldn’t care less what’s pushed to global memory. Whatever is being pushed or popped… I want it throttled :)
" I want it throttled"
install more gpus, and consider it throttled
in your view, what would be the cost of throttling, and would it not be expensive?
“bandwidth throttle the GPU/gfx card” based on applications, which for now are tied to the CPU
and how much would that cost?
“then “stacking similarity” is not much of a similarity”
well, that is debatable
Not much throttling is easy, when the token bucket is empty, stall memory requests.
yes, we can call the token bucket, a ‘Skybuck-et’
the statement/conclusion “Not much throttling is easy” or “not much; throttling is easy”
really depends on the premise
stall memory requests == negligible overhead
it is the time of merrymaking, hence i am going to overlook details and technicalities, and actually agree