After profiling my application with Nsight Systems, I see the “op_generic_tensor_kernel” function show up frequently. What does this function do?
Hi @soohyung.zhang ,
It’s a ReLU kernel. The same kernel template is used for all the pointwise binary/unary ops.
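To illustrate the "one template, many ops" idea, here is a minimal host-side C++ sketch. It is not the library's actual source: the names `ReluOp`, `AddOp`, `pointwise_unary`, and `pointwise_binary` are hypothetical, and a plain loop stands in for what a GPU kernel would spread across threads. It only shows why a single generically named function can appear in a profile for ReLU, add, and other elementwise ops.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical op functors (illustrative names, not real library symbols).
struct ReluOp {
    float operator()(float x) const { return x > 0.0f ? x : 0.0f; }
};

struct AddOp {
    float operator()(float a, float b) const { return a + b; }
};

// One template body serves every unary pointwise op.
// On a GPU, each loop iteration would typically map to one thread.
template <typename Op>
std::vector<float> pointwise_unary(const std::vector<float>& in, Op op) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = op(in[i]);
    }
    return out;
}

// Same idea for binary ops; this sketch assumes a and b have equal length.
template <typename Op>
std::vector<float> pointwise_binary(const std::vector<float>& a,
                                    const std::vector<float>& b, Op op) {
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = op(a[i], b[i]);
    }
    return out;
}
```

Calling `pointwise_unary(x, ReluOp{})` applies ReLU, while `pointwise_binary(a, b, AddOp{})` reuses the same loop structure for addition. Only the functor changes, so the profiler reports the same generic name regardless of the op.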
Thanks
Thank you!