Error running train.py

I installed Isaac Gym following the installation instructions, but when I run python train.py --task Cartpole, I get the following errors:

"Box bound precision lowered by casting to {}".format(self.dtype)
RL device:  cuda:0
Sequential(
  (0): Linear(in_features=4, out_features=32, bias=True)
  (1): ELU(alpha=1.0)
  (2): Linear(in_features=32, out_features=32, bias=True)
  (3): ELU(alpha=1.0)
  (4): Linear(in_features=32, out_features=1, bias=True)
)
Sequential(
  (0): Linear(in_features=4, out_features=32, bias=True)
  (1): ELU(alpha=1.0)
  (2): Linear(in_features=32, out_features=32, bias=True)
  (3): ELU(alpha=1.0)
  (4): Linear(in_features=32, out_features=1, bias=True)
)
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 990
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 906
[Error] [carb.gym.plugin] Failed to fill rigid body state tensor
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 981
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 971
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 990
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 906

[Error] [carb.gym.plugin] Failed to fill rigid body state tensor

Despite these errors, training still runs and prints logs like this:

[Error] [carb.gym.plugin] Failed to fill rigid body state tensor
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 971
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 990
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 906
[Error] [carb.gym.plugin] Failed to fill rigid body state tensor
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 971
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 990
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 906
[Error] [carb.gym.plugin] Failed to fill rigid body state tensor
[Error] [carb.gym.plugin] Gym cuda error: no kernel image is available for execution on the device: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 971
################################################################################
                        Learning iteration 0/500                        

                       Computation: 11204 steps/s (collection: 0.615s, learning 0.116s)
               Value function loss: 47.9882
                    Surrogate loss: -0.0001
             Mean action noise std: 1.00
                  Mean reward/step: 0.99
       Mean episode length/episode: 16.00
--------------------------------------------------------------------------------
                   Total timesteps: 8192
                    Iteration time: 0.73s
                        Total time: 0.73s
                               ETA: 365.6s

Settings:
GPU: TITAN X
Driver Version: 460.91
Local Machine CUDA Version: 9.1
Conda Cudatoolkit Version: 11.1
I wonder whether the CUDA version on the local machine itself needs to be >= 11.1, and having only the Conda cudatoolkit at >= 11.1 is not enough.
Can anyone help me solve this problem? Thank you.
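
For anyone debugging the same message: "no kernel image is available for execution on the device" is the CUDA runtime's way of saying the binary contains no code that can run on this GPU's compute capability, so it usually points at the GPU architecture rather than at the toolkit version number. The sketch below illustrates that compatibility rule in plain Python. The helper name has_kernel_image and the architecture lists are hypothetical (the thread does not say which architectures the Isaac Gym binaries were built for), and the optional part at the end assumes PyTorch is installed and only reports what PyTorch itself sees.

```python
import re


def has_kernel_image(device_cc, arch_list):
    """Return True if a binary built for arch_list could run on device_cc.

    device_cc is a (major, minor) tuple such as (5, 2).
    arch_list holds strings such as "sm_70" (compiled machine code) or
    "compute_70" (PTX that the driver can JIT-compile).
    Hypothetical helper, written only to illustrate the rule.
    """
    major, minor = device_cc
    for arch in arch_list:
        m = re.fullmatch(r"(sm|compute)_(\d+)(\d)", arch)
        if not m:
            continue  # skip entries this sketch does not understand
        kind, a_major, a_minor = m.group(1), int(m.group(2)), int(m.group(3))
        # Machine code runs only within the same major version,
        # on a device whose minor version is equal or higher.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any equal or newer capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


# Illustrative values only: (5, 2) is the Maxwell GTX TITAN X, (6, 1) the
# Pascal TITAN X; the arch lists are made up for the example.
print(has_kernel_image((5, 2), ["sm_60", "sm_70", "sm_80"]))
print(has_kernel_image((6, 1), ["sm_60", "sm_70", "sm_80"]))

# Optional: report what PyTorch sees on the local machine (assumes PyTorch).
try:
    import torch
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    print("Device capability:", torch.cuda.get_device_capability(0))
    print("PyTorch CUDA runtime:", torch.version.cuda)
    print("PyTorch arch list:", torch.cuda.get_arch_list())
```

Note that the arch list printed at the end describes the PyTorch build, not the Isaac Gym plugin that raises the error, so it only helps to confirm the GPU's compute capability and which CUDA runtime the Conda environment is actually using.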

Hi,
My CUDA version is 11.2:

Fri Oct 29 10:32:18 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  RTX A6000           Off  | 00000000:01:00.0  On |                  Off |
| 70%   74C    P2   166W / 300W |  10545MiB / 48676MiB |     94%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+