CUDA Initialization Error when importing a model

Hi all,

I am trying to run a simple script to analyze the attention scores of the Llama-2-7b-32k model on the Jetstream2 cluster. Other scripts ran fine on the same cluster, but I suddenly started seeing the following error when importing a model from the Hugging Face transformers library:

Traceback (most recent call last):
  File "/home/exouser/Squeezed-Attention/offline_clustering.py", line 8, in <module>
    from utils.model_parse import (
  File "/home/exouser/Squeezed-Attention/utils/model_parse.py", line 1, in <module>
    from transformers import AutoModelForCausalLM, LlamaForCausalLM, OPTForCausalLM
  File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
  File "/home/exouser/Squeezed-Attention/transformers/src/transformers/utils/import_utils.py", line 1500, in __getattr__
    value = getattr(module, name)
  File "/home/exouser/Squeezed-Attention/transformers/src/transformers/utils/import_utils.py", line 1499, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/exouser/Squeezed-Attention/transformers/src/transformers/utils/import_utils.py", line 1511, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
cudaErrorInitializationError: initialization error

My current setup is below (output of nvidia-smi and nvcc --version):

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:04:00.0 Off |                    0 |
| N/A   25C    P0             52W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0
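
One thing visible above is that nvcc reports CUDA 12.5 while the driver reports CUDA 12.4, and the traceback shows transformers being loaded from a local source checkout. I have not confirmed whether either of these matters. To separate a PyTorch/CUDA problem from a transformers problem, a minimal check along these lines could be run in the same environment. This is only a sketch: it assumes PyTorch is the backend in use, and the function name `cuda_report` is made up for illustration.

```python
import importlib.util
import os


def cuda_report():
    """Collect basic facts about the CUDA environment as seen from Python."""
    report = {
        # If this is set to an empty or invalid value, no GPU is visible.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if not report["torch_installed"]:
        return report

    import torch

    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    report["torch_built_for_cuda"] = torch.version.cuda
    try:
        report["cuda_available"] = torch.cuda.is_available()
        report["device_count"] = torch.cuda.device_count()
        if report["cuda_available"]:
            # Allocating a tensor forces CUDA initialization, which is
            # the step that appears to fail in the traceback.
            torch.zeros(1, device="cuda")
            report["device_name"] = torch.cuda.get_device_name(0)
    except RuntimeError as exc:
        report["cuda_error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If this already fails or reports a `cuda_error` without transformers being imported, the problem would seem to lie in the PyTorch/driver setup rather than in the model code.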

I am quite new to this area, but if anyone has any ideas about the root cause, please let me know.