How to produce an OpenCL executable on an NVIDIA card?

I recently started using the pgcc compiler, and I have two questions:

  1. Is it possible to produce OpenCL code for an NVIDIA card?
  2. Is it possible to obtain the kernel code (.cu or .cl file) generated by pgcc?

Hi Jing Li,

For NVIDIA devices, we only target CUDA C or LLVM. For AMD devices, we target OpenCL or LLVM.

To see the generated device code, use the “keep” sub-option: “-ta=tesla:keep”.

The kernels will be located in the “filename.*.gpu” files.

Hope this helps,
Mat

FYI, here’s the current 14.6 list of “-ta” sub-options:

% pgfortran -help -ta
-ta=tesla:{[no]autocollapse|[no]fma|[no]flushz|keep|llvm|loadcache:{L1|L2}|[no]unroll|maxregcount:<n>|[no]rdc|[no]required|cc1x|tesla|cc1+|tesla+|cc2x|fermi|cc2+|fermi+|cc3x|kepler|cc3+|kepler+|fastmath|pin|cuda5.5|cuda6.0}|nvidia|radeon:{keep|llvm|[no]unroll|[no]required|tahiti|capeverde|spectre|buffercount:<n>}|host
                    Choose target accelerator
    tesla           Select NVIDIA Tesla accelerator target
     [no]autocollapse
                    Automatically collapse tightly nested loops
     [no]fma        Generate fused mul-add instructions (default at -O3)
     [no]flushz     Enable flush-to-zero mode on the GPU
     keep           Keep kernel files
     llvm           Use LLVM back end; disables cc1x
     loadcache      Choose what hardware level cache to use for global memory loads
      L1            Use L1 cache
      L2            Use L2 cache
     [no]unroll     Enable automatic inner loop unrolling (default at -O3)
     maxregcount:<n>
                    Set maximum number of registers to use on the GPU
     [no]rdc        Generate relocatable device code
     [no]required   Issue compiler error if the compute regions fail to accelerate
     cc1x|tesla     Compile for compute capability 1.x
     cc1+|tesla+    Compile for compute capability 1.x and above
     cc2x|fermi     Compile for compute capability 2.x
     cc2+|fermi+    Compile for compute capability 2.x and above (default)
     cc3x|kepler    Compile for compute capability 3.x
     cc3+|kepler+   Compile for compute capability 3.x and above
     fastmath       Use fast math library
     pin            Set default to pin host memory
     cuda5.5        Use CUDA 5.5 Toolkit compatibility
     cuda6.0        Use CUDA 6.0 Toolkit compatibility
    nvidia          nvidia is a synonym for tesla
    radeon          Select AMD Radeon GPU accelerator target
     keep           Keep kernel source files
     llvm           Use LLVM/SPIR back end
     [no]unroll     Enable automatic inner loop unrolling (default at -O3)
     [no]required   Issue compiler error if the compute regions fail to accelerate
     tahiti         Compile for Radeon Tahiti architecture (default)
     capeverde      Compile for Radeon Capeverde architecture
     spectre        Compile for Radeon Spectre architecture
     buffercount:<n>
                    Set max number of device buffers used by OpenCL kernel
    host            Compile for the host, i.e., no accelerator target

Hi Mat,
Thanks for your quick response. So the PGI compiler does not support generating OpenCL code that can run on a CUDA-enabled device, am I correct about this?

Correct. Since CUDA is available on NVIDIA devices and the underlying OpenACC device code should be transparent to the user, there was no reason to support an OpenCL target for them.

  • Mat