For the “driver space” solution you might have to work together with nvidia driver programmers, maybe they would be interested in trying to help you out with that.
Yeah, rather unfortunately NVIDIA is not open source :-(
I know there are some open source nVidia GPU drivers out there because I read about them once… (I think the same applies to ATI/AMD)
A quick Google search turns this up:
https://nouveau.freedesktop.org/wiki/
Not sure if this is also suited for CUDA/PTX/Compute Clusters.
There is also this:
And at the bottom it says:
"
Open source drivers for NVIDIA nForce hardware are included in the standard Linux kernel and leading Linux distributions. This page includes information on open source drivers, and driver disks for older Linux distributions including 32-bit and 64-bit versions of Linux.
"
Kinda weird how it mixes unix and linux info ? :) I am not sure if unix drivers work on linux or vice versa ! ;) :)
Yeah, the thing about these real world examples really has to do with trying to get some work published. I’ve never published any work before and this has to do with adding to the credibility and showing that there are real applications that take this much time. My supervisor has been around academics all his life and supervised many students. I get the feeling that him recommending I find “real world” examples is coming more from his experience.
Why is publishing important ? and for who is it important ? For you, for him or for academy ?
What is gained from it ? Attention ? Money ? Fame ? Will this help your career in the future ? (I guess so ;))
Personally I think the requirement of “real-world examples” might be a bit silly, though real-world examples might contain things we did not yet think about, for example the timer/timing related things… there may be others.
Also what if the “real world” has to adapt to new coding techniques to make it possible ? ;)
Personally if I were interested in “checkpoint start method” for GPUs I would already be happy with a document describing a technique for “synthetic examples”.
Once the technique works for “synthetic examples” it can then be tried on “real-world examples”. This also offers you the possibility of publishing twice.
First for “easy/simplistic/short” examples which might already be very helpful for readers.
Second for “complex examples” and deeply interested individuals or organizations.
As you already note further down below, one has to start somewhere. I’ve read some academic papers myself or skimmed them… most are probably way too thick/too deep/too complex to be of any use to me.
The best one I came across so far was “fast grid traversal”… that one was short and to the point, some info was lacking but could be derived.
As you also noted, you already tried a bigger solution and it was tough and complex ? Why make it even more complex with a “real-world” example ? :):):)
It gives some insight for gamers, no published work exists as far as I know of :) Again I find this “need for proof” weird. Does your teacher not believe that real-world long-running kernels exist ?
Oh, he does believe. It’s just like I said before, for strengthening the publication we need to point to some existing work that uses long-running GPU kernels.
Ok, now I understand why your teacher wants this. Your teacher believes that if your document is linked to existing work that this will give your document more credibility.
One possible problem with this is that compute clusters/CUDA/PTX, etc. are still kind of a “new world”, let alone “well known”, so there may not be much existing work to point to yet.
This is new stuff, this is the frontier :) You may have to pave the way yourself ! =D That’s the fun part about it =D
Also see it as an opportunity… “the frontier legend” ! =D
What do you mean by “pro” and “consumer”?
nVidia sells two different kinds of cards/brands: Quadro is for “pro” users and GeForce is for “consumers/gamers”. The GeForce cards may have some features disabled or may lack certain API support.
Yeah, this will be an “application space” solution.
So what is the plan here ? Do you intend to “transform/modify” an existing “real world example” ? Or do you intend to try and find a “generic solution” to “real world examples” ?
The last one will get tricky, and will thus probably require GPU driver support.
I can understand that you might want to go with the first one “transform/modify” to see what can be done to the real world examples at application space.
However describing this in a document would get a bit tricky… people would then need to understand the real world example and then “admire” the “transformed/modified” example.
For the reader this might be a bit much if it’s a big real world example. Perhaps you want to try and see what “real world examples” run into… to get a taste of what the problem entails (?)
Then again it is CUDA C, and by now CUDA C++, that we are discussing, and perhaps C++ has some features that I don’t know about… perhaps you could come up with some template or some kind of C++ trick which could create a “checkpoint” for any arbitrary piece of software…
Or perhaps you might even need to write some kind of c++ interpreter/compiler and analyze the cuda c/++ code/kernel and then generate some automatic checkpointing software/c/c++ code which can then be embedded into any “real world example” :)
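To make the template idea a bit more concrete, here is a minimal sketch of what an “application space” checkpoint could look like. It is plain host-side C++ with no CUDA calls at all, so it is an assumption about the general shape and not a working GPU solution: the long-running loop is cut into chunks, and between chunks the state is consistent and could be saved. The names `Checkpoint`, `run_resumable`, `step` and `should_stop` are all made up for this example. On a real GPU each chunk would presumably be one shorter kernel launch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical checkpoint record: everything needed to resume the loop later.
struct Checkpoint {
    std::size_t next_index = 0;     // first iteration that has not run yet
    std::uint64_t accumulator = 0;  // partial result computed so far
};

// Runs iterations [cp.next_index, total) in chunks of `chunk` iterations.
// After each chunk the checkpoint is consistent and should_stop() is polled.
// Returns true when all work is done, false when the run was suspended.
template <typename Step, typename StopFn>
bool run_resumable(Checkpoint& cp, std::size_t total, std::size_t chunk,
                   Step step, StopFn should_stop) {
    assert(chunk > 0);
    while (cp.next_index < total) {
        const std::size_t end = std::min(total, cp.next_index + chunk);
        for (std::size_t i = cp.next_index; i < end; ++i) {
            cp.accumulator = step(cp.accumulator, i);  // the "real" work
        }
        cp.next_index = end;  // safe point: cp could be written to disk here
        if (cp.next_index < total && should_stop()) {
            return false;  // suspended; call again with the same cp to resume
        }
    }
    return true;
}
```

Calling `run_resumable` again with the same `Checkpoint` continues where the previous call stopped. The hard part for arbitrary “real world” kernels is of course deciding what has to go into the checkpoint, which is exactly what this sketch sidesteps.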
The current solution I’m working on aims to be independent of the kind of work being done. I just aim to be able to suspend and resume long-running kernels. For simplicity of testing however, a single kernel executing on one device would be ideal.
;) Hmmm did I guess it right that you might be working on some cuda c/c++ compiler/interpreter ? :)
Haha, a full compute cluster example would be the holy grail for me indeed! but small steps. I’ve started on a few big attempts before and things very quickly got complex and messy without me even knowing if the approach even works at all. So now I do small steps and if they work then I think about going bigger.
Yes, for now it looks like it’s pretty messy out there… Could be nice if the CUDA API was expanded to support compute clusters and all kinds of network/computer communication would be done below that CUDA API ;) to relieve CUDA/parallel programmers from having to deal with all that plumbing ;)
Lol, good suggestion! Is there a formal description of this game? Because he might have questions I don’t know how to answer. Also this sounds a bit like Battleships? is it?
The game is called “World of Warships”, for more information about it try and google it and youtube it then you will get a sense of what it is.
The comparison to Battleships is kinda interesting though ! WOWS could be reduced to such an easier/simpler version which includes position (for my kernel, currently it ignores position).
However consider World of Warships a slight variation on “Battleships”. As far as I know, in “Battleships”, the board game I once played, each of the two players gets to fire a single shot at the other each turn.
The “World of Warships Battleships” board game would work as follows:
Each ship that is still alive gets to fire one shot at the enemy each turn. However the bigger ships might get 2 or 3 shots… perhaps this is a bit much though.
However to make things more fair… the “shells” could already be considered “in flight”.
So for example during round 1, each player gets to fire something like 10 shells… interleaved with the enemy shots.
After both players have fired their 10 shots… the damage is assessed. Sunk ships are removed from the game.
The new amount of shells that each player is allowed to fire is calculated, and then round 2 starts.
Also it would be cool to simply give each other a list of “fired upon” positions and then verify if they actually hit anything… (after round 1) to prevent leaking info to enemy during the shots. (Instead of saying “splash” or “boom” after each shot which would allow “next shot” adjustment)
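The round rules above could be sketched roughly like this in C++. It is only my own illustration of the board game variant, not anything taken from World of Warships itself: the types `Cell`, `Ship`, `Fleet` and the functions `shells_allowed`, `resolve_incoming` and `play_round` are invented for this example, and the shots-per-ship numbers are assumptions. Both shot lists are handed over first and only then resolved, so shells that are already “in flight” still land even if the ship that fired them sinks in the same round.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Cell = std::pair<int, int>;  // (row, column) on the board

struct Ship {
    std::set<Cell> intact;  // cells of this ship that have not been hit yet
    int shots_per_round;    // e.g. 1 for small ships, 2 or 3 for bigger ones
};

struct Fleet {
    std::vector<Ship> ships;  // only ships that are still afloat
};

// Shells a player may fire this round: the sum over all surviving ships.
int shells_allowed(const Fleet& fleet) {
    int total = 0;
    for (const Ship& s : fleet.ships) total += s.shots_per_round;
    return total;
}

// Applies a whole list of incoming shots at once, then removes sunk ships.
// Returns the number of hits, which is only revealed after the round.
int resolve_incoming(Fleet& fleet, const std::vector<Cell>& shots) {
    int hits = 0;
    for (const Cell& c : shots) {
        for (Ship& s : fleet.ships) hits += static_cast<int>(s.intact.erase(c));
    }
    std::vector<Ship> afloat;
    for (Ship& s : fleet.ships) {
        if (!s.intact.empty()) afloat.push_back(std::move(s));
    }
    fleet.ships = std::move(afloat);
    return hits;
}

// One round: both lists were written down before any result was revealed.
void play_round(Fleet& a, Fleet& b, const std::vector<Cell>& shots_by_a,
                const std::vector<Cell>& shots_by_b) {
    assert(static_cast<int>(shots_by_a.size()) <= shells_allowed(a));
    assert(static_cast<int>(shots_by_b.size()) <= shells_allowed(b));
    resolve_incoming(b, shots_by_a);
    resolve_incoming(a, shots_by_b);
}
```

After `play_round`, calling `shells_allowed` on each fleet gives the number of shells for the next round.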
If I ever play Battleships again with somebody… I will try and play it like this !
Should be much fun ! And makes the game more realistic and much more complex ! =D
(To make it even more “insane”/“realistic” allow ships to move 1 square or so each round ! haha ! :) or perhaps even a different amount of squares, little ships can move 3 or 4 and bigger ships only 2 or 1)
Thanks for writing… perhaps I gave you some ideas… perhaps not… perhaps you already had them…
I wish you good luck and lots of fun talking to your teacher about all of this.
And to you I also write: do write again if you feel like it ! =D