I've seen a lot of material on how to program GPUs, but I'm looking for a guide to verification. Traditionally I just run my program and verify the output against a golden data set. Is there a tool or guide on best practices for verification? Thanks.

What leads you to believe functional verification works any differently between GPU-based computation and CPU-based computation? In general all the same techniques are used. The only practical difference I can think of is that on GPUs there may be a higher chance that your computation would use non-deterministic order of computation (e.g. atomic floating-point summation) on purpose.
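A minimal sketch of why summation order matters, in plain Python on the CPU rather than on a GPU. The helper name `sum_in_order` and the three input values are made up for illustration; the point is only that floating-point addition is not associative, so an atomic summation whose order varies from run to run can legitimately return different results for the same inputs.

```python
import math

def sum_in_order(values):
    """Accumulate left to right, the way a sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same three numbers, added in two different orders.
order_a = [1e16, 1.0, -1e16]   # 1.0 is absorbed when added to 1e16
order_b = [1e16, -1e16, 1.0]   # the large terms cancel first, so 1.0 survives

result_a = sum_in_order(order_a)
result_b = sum_in_order(order_b)

# math.fsum tracks the rounding error and returns the correctly rounded sum.
exact = math.fsum(order_a)

print(result_a, result_b, exact)
```

Both orders are "correct" floating-point evaluations, yet they disagree, which is why a test for such code should compare against a reference with a tolerance instead of expecting identical bits.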

For what it is worth, in my experience using a fixed set of golden reference data is usually not a best practice. I have spent too many hours “re-gilding” such data for other people’s test framework after some parameter of the computation changed by design, e.g. introduction of mixed precision. It is usually best to generate reference data dynamically at run time using a high-quality reference computation and then compute an appropriate error metric, such as maximum relative error, maximum ulp error, root mean square error, or residual.

While this generally makes for longer run time of tests, it offers much more flexibility both in terms of changes to design and requirements and the number of test vectors generated. It also helps avoid the all too common fallacy of “my GPU results do not match my CPU results bit-for-bit, therefore there must be a problem with my GPU computation”.
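As a rough sketch of the approach described above, the following Python generates test vectors at run time, computes a double-precision reference, and reports three of the error metrics mentioned. Everything here is an assumption made for illustration: `math.sin` stands in for the function under test, the `device` list fakes GPU output by rounding the reference to single precision, and the helper names (`to_float32`, `ulp32`, `error_metrics`) are hypothetical. In a real test the `device` values would be copied back from the GPU.

```python
import math
import struct

def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def ulp32(x):
    """Spacing of single-precision numbers near x (normal range only)."""
    _, exponent = math.frexp(abs(x))
    return math.ldexp(1.0, exponent - 24)

def error_metrics(test, ref):
    """Return (max relative error, max float32-ulp error, RMS error)."""
    diffs = [abs(t - r) for t, r in zip(test, ref)]
    max_rel = max(d / abs(r) for d, r in zip(diffs, ref) if r != 0.0)
    max_ulp = max(d / ulp32(r) for d, r in zip(diffs, ref) if r != 0.0)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return max_rel, max_ulp, rms

# Test vectors are generated on the fly instead of loaded from stored golden data.
inputs = [0.1 + 0.01 * i for i in range(300)]
reference = [math.sin(x) for x in inputs]            # double-precision reference

# Hypothetical stand-in for single-precision results copied back from the GPU.
device = [to_float32(r) for r in reference]

max_rel, max_ulp, rms = error_metrics(device, reference)
print(f"max rel = {max_rel:.3e}, max ulp = {max_ulp:.3f}, rms = {rms:.3e}")
```

A test would then pass or fail on a bound such as `max_ulp <= 2.0`, where the bound itself is a design decision that depends on the accuracy the computation is supposed to deliver. Changing the precision or the number of test vectors only requires changing the generator or the bound, not regenerating stored data.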

Well, for certain test sets and functions I use golden data, mainly because it's easier. For dynamically generated tests I throw in a mean square error check and the like.

Also, is there any way for MATLAB to call CUDA/GPU code? The input data would be in MATLAB, the GPU would produce the output data, and MATLAB would then use that output for visual analysis.