No kernel info from Nsight Systems (nsys nvprof)

When I profile GPU applications with Nsight Systems, the output contains no kernel performance information, although CUDA memory-related information is reported.

Background:
I’m using an RTX 3060 Ti on Windows 11 with WSL (Ubuntu 18.04).
The CUDA driver version is 511.79 (installer: 511.79-desktop-win10-win11-64bit-international-dch-whql.exe).
The Nsight Systems version is 2022.1.1.
I reproduced the problem with a TensorFlow Python script: nsys nvprof python matmul_tf.py.
matmul_tf.py runs a TensorFlow matmul.
Other CUDA applications can’t be profiled either.
Could someone help me solve this problem?

matmul_tf.py script:

import tensorflow as tf
import numpy as np

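# Use TF1-style graph mode with a fresh default graph.
tf.compat.v1.reset_default_graph()
tf.compat.v1.disable_eager_execution()
# Confirm that TensorFlow can see the GPU before profiling.
print("support gpu:", tf.test.is_gpu_available())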

m = 512
k = 512
n = 512

a_shape = [m, k]
b_shape = [k, n]

np.random.seed(0)
kernel_np = np.random.uniform(low=0.0, high=1.0, size=b_shape).astype("float32")
input_np = np.random.uniform(low=0.0, high=1.0, size=a_shape).astype("float32")

pld1 = tf.compat.v1.placeholder(dtype="float32", shape=a_shape, name="input1")
kernel = tf.constant(kernel_np, dtype="float32")

feed_dict = {pld1: input_np}

result_tf = tf.raw_ops.MatMul(a=pld1, b=kernel, transpose_a=False, transpose_b=False)

with tf.compat.v1.Session() as sess:
    # Run the matmul 10 times so the kernel should appear repeatedly in the profile.
    for i in range(10):
        result = sess.run(result_tf, feed_dict=feed_dict)
    print("result shape:", result.shape)
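
For reference, here is a minimal NumPy-only sketch of the same product, which may help confirm what the script is expected to compute independently of the GPU and the profiler. The shapes, seed, and draw order mirror matmul_tf.py; the name `reference` is just something I introduced for illustration.

```python
import numpy as np

# Same shapes and seed as matmul_tf.py.
m, k, n = 512, 512, 512

np.random.seed(0)
# Drawn in the same order as the script: kernel first, then input.
kernel_np = np.random.uniform(low=0.0, high=1.0, size=[k, n]).astype("float32")
input_np = np.random.uniform(low=0.0, high=1.0, size=[m, k]).astype("float32")

# CPU reference for the product computed by tf.raw_ops.MatMul
# (hypothetical name, not part of the original script).
reference = input_np @ kernel_np
print("reference shape:", reference.shape)
```

If `result` from the TensorFlow session is kept, comparing it against `reference` with np.allclose and a loose tolerance (rtol=1e-2 is my assumption, since the GPU may use reduced-precision math) should show whether the matmul really ran and produced sensible values.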