Hi,
has anyone compared the performance of Caffe2 against a hand-tuned C/C++ CUDA production engine (i.e. a program written in C/C++ directly against the CUDA runtime or driver API)?
It would be great if you could share the results.
Thanks