ComfyUI: Anyone else having problems with LOW-pass CPU offloading?

nvidia3953 · January 17, 2026, 8:38am

For some reason, my LOW-pass WanVideo Sampler is running on the CPU instead of the GPU, which makes generation take longer than necessary. Everything else is fine.

Is anyone else having this issue? If so, what did you do to resolve it?

NVES · January 17, 2026, 5:08pm

Can you share more context? For example, is your library GPU-aware? If you are using a Docker container, is it set up to pass through the GPU? Even if the library is GPU-aware, you may need the latest version to take advantage of the SM* version for Blackwell.

nvidia3953 · January 18, 2026, 11:20am

Sure. Since this is an application environment (ComfyUI), I’m asking those with experience testing ComfyUI on Spark whether they’ve tested enough to see this problem, or whether this is more of an interaction between ComfyUI and the default Spark environment. For example, this may be a caching issue where ComfyUI holds on to a previous state that it doesn’t recover from when restarted. The only caches I’m finding are Python-related, but maybe there’s a Spark state that gets messed up?

This feels like a state problem. It works on the GPU for long stretches, then seems to get stuck in CPU mode and doesn’t go back to the GPU.

However, this may also be a problem with the Spark drivers or hardware that makes GPU resources unavailable, so the application falls back to the CPU. There appear to be some performance bugs with Spark that I’ve been reading about, with fixes not yet released.
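One way to test the fallback theory is to check where the sampler's weights actually live while a slow run is in progress. Below is a minimal sketch, assuming you can get a reference to the underlying `torch.nn.Module`; how to reach it inside the WanVideo wrapper isn't covered in this thread, and the names `summarize_devices` and `looks_cpu_bound` are made up for illustration. It only relies on `.parameters()`, `.device.type`, and `.numel()`, which torch modules and tensors provide.

```python
from collections import Counter


def summarize_devices(model):
    """Return {device_type: parameter_count} for an object exposing .parameters()."""
    counts = Counter()
    for param in model.parameters():
        # For torch tensors, .device.type is a string such as "cuda" or "cpu".
        counts[param.device.type] += param.numel()
    return dict(counts)


def looks_cpu_bound(model, threshold=0.5):
    """True when more than `threshold` of all parameters currently sit on the CPU."""
    counts = summarize_devices(model)
    total = sum(counts.values())
    return total > 0 and counts.get("cpu", 0) / total > threshold
```

Keep in mind that a node which deliberately offloads or swaps blocks will also show CPU-resident parameters, so this only tells you where the weights are at the moment of the call, not why they are there.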

This is running ComfyUI from the console within an SSH session on bare metal: no Docker, VMs, etc. There’s a venv for Python, but that’s it. Everything is updated to its latest version. I’ll provide some inventory lists. It doesn’t look like I can attach files, so I’m including them inline:

COMMAND: nvidia-smi

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            3526      G   /usr/lib/xorg/Xorg                       18MiB |
|    0   N/A  N/A            3555      G   /usr/bin/gnome-shell                      6MiB |
|    0   N/A  N/A            4099      C   …spark01/comfy-env/bin/python3        21607MiB |
+-----------------------------------------------------------------------------------------+
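The process table above can also be captured in machine-readable form, which makes it easier to log the ComfyUI process's GPU memory over a long session and spot the moment it changes. Below is a rough sketch, assuming the CSV output of `nvidia-smi --query-compute-apps`; the helper names are hypothetical, and a memory field that is not a plain number (for example `[N/A]`) is mapped to `None`.

```python
import subprocess

# Query only compute (type "C") processes; graphics clients such as Xorg are not listed.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Parse 'pid, name, used_mib' CSV lines into a list of dicts."""
    apps = []
    for line in text.strip().splitlines():
        try:
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
            pid = int(pid.strip())
        except ValueError:
            continue  # skip lines that do not match the expected shape
        mem = mem.strip()
        apps.append({
            "pid": pid,
            "name": name.strip(),
            "used_mib": int(mem) if mem.isdigit() else None,
        })
    return apps


def gpu_memory_for(pid, text):
    """Return the MiB reported for `pid`, or None if absent or not reported."""
    for app in parse_compute_apps(text):
        if app["pid"] == pid:
            return app["used_mib"]
    return None


def query_nvidia_smi():
    """Run nvidia-smi and return its raw CSV output (requires the NVIDIA driver)."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
```

Calling `gpu_memory_for(pid, query_nvidia_smi())` every few seconds during a slow LOW-pass run would show whether the Python process still holds GPU memory at that point.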

COMMAND: /home/spark01/comfy-env/bin/python /home/spark01/ComfyUI/_support_output_20260118_030309/pip_spotlight.py

{
"spotlight_versions": {
"accelerate": "1.12.0",
"diffusers": "0.36.0",
"numpy": "2.2.6",
"opencv-python": "4.12.0.88",
"pillow": "12.0.0",
"safetensors": "0.7.0",
"torch": "2.9.1+cu130",
"torchaudio": "2.9.1",
"torchvision": "0.24.1",
"transformers": "4.57.3",
"triton": "3.5.1",
"xformers": null
}
}

COMMAND: /home/spark01/comfy-env/bin/python /home/spark01/ComfyUI/_support_output_20260118_030309/key_libs_and_cuda_via_torch.py

/home/spark01/comfy-env/lib/python3.12/site-packages/torch/cuda/__init__.py:283: UserWarning:
Found GPU0 NVIDIA GB10 which is of cuda capability 12.1.
Minimum and Maximum cuda capability supported by this version of PyTorch is
(8.0) - (12.0)

warnings.warn(
{
"modules": {
"PIL": {
"imported": true,
"version": "12.0.0"
},
"accelerate": {
"imported": true,
"version": "1.12.0"
},
"cv2": {
"imported": true,
"version": "4.12.0"
},
"diffusers": {
"imported": true,
"version": "0.36.0"
},
"einops": {
"imported": true,
"version": "0.8.1"
},
"numpy": {
"imported": true,
"version": "2.2.6"
},
"safetensors": {
"imported": true,
"version": "0.7.0"
},
"torch": {
"imported": true,
"version": "2.9.1+cu130"
},
"torchaudio": {
"imported": true,
"version": "2.9.1"
},
"torchvision": {
"imported": true,
"version": "0.24.1"
},
"tqdm": {
"imported": true,
"version": "4.67.1"
},
"transformers": {
"imported": true,
"version": "4.57.3"
},
"triton": {
"imported": true,
"version": "3.5.1"
},
"xformers": {
"error": "No module named 'xformers'",
"imported": false
}
},
"torch": {
"build_config_lines": [
"PyTorch built with:",
" - GCC 13.3",
" - C++ Version: 201703",
" - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d)",
" - OpenMP 201511 (a.k.a. OpenMP 4.5)",
" - LAPACK is enabled (usually provided by MKL)",
" - NNPACK is enabled",
" - CPU capability usage: DEFAULT",
" - CUDA Runtime 13.0",
" - NVCC architecture flags: -gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_100,code=sm_100;-gencode;arch=compute_110,code=sm_110;-gencode;arch=compute_120,code=sm_120;-gencode;arch=compute_120,code=compute_120",
" - CuDNN 91.3",
" - Build settings: BLAS_INFO=nvpl, BUILD_TYPE=Release, COMMIT_SHA=5811a8d7da873dd699ff6687092c225caffcf1bb, CUDA_VERSION=13.0, CUDNN_VERSION=9.13.0, CXX_COMPILER=/opt/rh/gcc-toolset-13/root/usr/bin/c++, CXX_FLAGS=-ffunction-sections -fdata-sections -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_PYTORCH_QNNPACK -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-dangling-reference -Wno-error=dangling-reference -Wno-stringop-overflow, LAPACK_INFO=nvpl, TORCH_VERSION=2.9.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, "
],
"cuda": {
"allow_tf32_matmul": false,
"cudnn_enabled": true,
"cudnn_version": 91300,
"device_count": 1,
"is_available": true,
"torch_cuda_version": "13.0"
},
"devices": [
{
"index": 0,
"major": 12,
"minor": 1,
"multi_processor_count": 48,
"name": "NVIDIA GB10",
"total_memory_bytes": 128499458048
}
],
"torch_git_version": "5811a8d7da873dd699ff6687092c225caffcf1bb",
"torch_version": "2.9.1+cu130"
}
}
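The capability warning (device 12.1, supported range 8.0 to 12.0) can be cross-checked against the NVCC architecture flags in the same dump. The sketch below assumes CUDA's usual compatibility rules: a binary built for `sm_XY` runs on devices with the same major version and an equal or higher minor version, and embedded `compute_XY` PTX can be JIT-compiled for any equal or newer capability. The function names are hypothetical, and architecture-specific targets with an `a` or `f` suffix are skipped because they are not forward-compatible.

```python
import re

# NVCC architecture flags copied from the build_config_lines output above.
ARCH_FLAGS = (
    "-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_90,code=sm_90;"
    "-gencode;arch=compute_100,code=sm_100;-gencode;arch=compute_110,code=sm_110;"
    "-gencode;arch=compute_120,code=sm_120;-gencode;arch=compute_120,code=compute_120"
)


def parse_arch_flags(flags):
    """Split a -gencode string into native (sm_XY) and PTX (compute_XY) capabilities."""
    native, ptx = set(), set()
    for kind, digits, suffix in re.findall(r"code=(sm|compute)_(\d+)([af]?)", flags):
        if suffix:
            continue  # arch-specific targets only run on their exact architecture
        cap = (int(digits[:-1]), int(digits[-1]))  # "120" -> (12, 0)
        (native if kind == "sm" else ptx).add(cap)
    return native, ptx


def coverage(device_cap, flags=ARCH_FLAGS):
    """Classify how a device capability is served by the listed targets."""
    native, ptx = parse_arch_flags(flags)
    major, minor = device_cap
    if any(M == major and m <= minor for M, m in native):
        return "native"       # a same-major binary with a lower or equal minor exists
    if any(cap <= device_cap for cap in ptx):
        return "ptx-jit"      # only PTX is available, compiled at load time
    return "unsupported"


print(coverage((12, 1)))  # GB10 as reported in the devices list; prints "native"
```

Under those assumptions the `sm_120` entry would cover the GB10's 12.1, which would make the warning by itself an unlikely cause of a CPU fallback. It says nothing about whether individual custom nodes or extensions ship kernels for this architecture.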

COMMAND: /home/spark01/comfy-env/bin/python /home/spark01/ComfyUI/_support_output_20260118_030309/python_runtime_inventory.py

{
"env": {
"PATH": "/home/spark01/comfy-env/bin:/home/spark01/comfy-env/bin:/usr/local/cuda/bin:/opt/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin",
"VIRTUAL_ENV": "/home/spark01/comfy-env"
},
"python": {
"base_prefix": "/usr",
"executable": "/home/spark01/comfy-env/bin/python",
"implementation": "CPython",
"platform": "Linux-6.14.0-1015-nvidia-aarch64-with-glibc2.39",
"prefix": "/home/spark01/comfy-env",
"version": "3.12.3 (main, Jan 8 2026, 11:30:50) [GCC 13.3.0]"
},
"site": {
"getsitepackages": [
"/home/spark01/comfy-env/lib/python3.12/site-packages",
"/home/spark01/comfy-env/local/lib/python3.12/dist-packages",
"/home/spark01/comfy-env/lib/python3/dist-packages",
"/home/spark01/comfy-env/lib/python3.12/dist-packages"
],
"getusersitepackages": "/home/spark01/.local/lib/python3.12/site-packages"
},
"sys_path": [
"/home/spark01/ComfyUI/_support_output_20260118_030309",
"/usr/lib/python312.zip",
"/usr/lib/python3.12",
"/usr/lib/python3.12/lib-dynload",
"/home/spark01/comfy-env/lib/python3.12/site-packages"
],
"sysconfig": {
"paths": {
"data": "/home/spark01/comfy-env",
"include": "/usr/include/python3.12",
"platinclude": "/usr/include/python3.12",
"platlib": "/home/spark01/comfy-env/lib/python3.12/site-packages",
"platstdlib": "/home/spark01/comfy-env/lib/python3.12",
"purelib": "/home/spark01/comfy-env/lib/python3.12/site-packages",
"scripts": "/home/spark01/comfy-env/bin",
"stdlib": "/usr/lib/python3.12"
},
"platform": "linux-aarch64"
}
}

Import sanity checks using venv python:
python: Python 3.12.3

Test: import torch / print key attributes
torch: 2.9.1+cu130
cuda available: True
torch.version.cuda: 13.0

COMMAND: /home/spark01/comfy-env/bin/pip freeze --all

absl-py==2.3.1
accelerate==1.12.0
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aiohttp_socks==0.11.0
aioice==0.10.2
aiortc==1.14.0
aiosignal==1.4.0
alembic==1.18.0
aliyun-python-sdk-core==2.16.0
aliyun-python-sdk-kms==2.16.5
annotated-doc==0.0.4
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
anyascii==0.3.3
anyio==4.12.1
argbind==0.3.9
arrow==1.4.0
asgiref==3.11.0
asttokens==3.0.1
attrs==25.4.0
audio-separator==0.41.0
audioread==3.1.0
av==16.1.0
babel==2.17.0
beartype==0.18.5
binaryornot==0.4.4
bitsandbytes==0.49.1
boto3==1.34.86
botocore==1.34.162
brotli==1.2.0
cached_path==1.8.1
certifi==2026.1.4
cffi==2.0.0
cfgv==3.5.0
chardet==5.2.0
charset-normalizer==3.4.4
click==8.1.8
cn2an==0.5.23
color-matcher==0.6.0
coloredlogs==15.0.1
comfy-cli==1.5.4
comfy-kitchen==0.2.6
comfyui-embedded-docs==0.4.0
comfyui-workflow-templates-core==0.3.88
comfyui-workflow-templates-media-api==0.3.39
comfyui-workflow-templates-media-image==0.3.55
comfyui-workflow-templates-media-other==0.3.80
comfyui-workflow-templates-media-video==0.3.38
comfyui_frontend_package==1.36.14
comfyui_workflow_templates==0.8.4
conformer==0.3.2
contourpy==1.3.3
contractions==0.1.73
cookiecutter==2.6.0
crcmod==1.7
cryptography==46.0.3
csvw==3.7.0
cycler==0.12.1
Cython==3.2.4
dacite==1.9.2
datasets==4.5.0
ddt==1.7.2
decorator==5.2.1
descript-audio-codec==1.0.0
descript-audiotools==0.7.2
diffq==0.2.4
diffusers==0.36.0
dill==0.4.0
Distance==0.1.3
distlib==0.4.0
dlinfo==2.0.0
dnspython==2.8.0
docstring_parser==0.17.0
docutils==0.22.4
editdistance==0.8.1
einops==0.8.1
einx==0.3.0
ema-pytorch==0.7.9
encodec==0.1.1
executing==2.2.1
faiss-cpu==1.13.2
fal_client==0.11.0
fastapi==0.128.0
ffmpy==1.0.0
filelock==3.20.3
fire==0.7.1
flatbuffers==25.12.19
flatten-dict==0.4.2
fonttools==4.61.1
frozendict==2.4.7
frozenlist==1.8.0
fsspec==2025.10.0
ftfy==6.3.1
funasr==1.3.0
g2p-en==2.1.0
gguf==0.17.1
gitdb==4.0.12
GitPython==3.1.46
google-api-core==2.29.0
google-auth==2.47.0
google-cloud-core==2.5.0
google-cloud-storage==3.8.0
google-crc32c==1.8.0
google-resumable-media==2.8.0
googleapis-common-protos==1.72.0
gradio==6.3.0
gradio_client==1.13.3
greenlet==3.3.0
groovy==0.1.2
grpcio==1.76.0
h11==0.16.0
h2==4.3.0
h5py==3.15.1
hf-xet==1.2.0
hpack==4.1.0
httpcore==1.0.9
httpx==0.28.1
httpx-sse==0.4.3
huggingface-hub==0.36.0
humanfriendly==10.0
hydra-core==1.3.2
hyper-connections==0.4.6
hyperframe==6.1.0
HyperPyYAML==1.2.3
identify==2.6.16
idna==3.11
ifaddr==0.2.0
ImageIO==2.37.2
imageio-ffmpeg==0.6.0
importlib_metadata==8.7.1
importlib_resources==6.5.2
inflect==7.5.0
iopath==0.1.10
ipython==9.9.0
ipython_pygments_lexers==1.1.1
isodate==0.7.2
jaconv==0.4.1
jamo==0.4.1
jedi==0.19.2
jieba==0.42.1
Jinja2==3.1.6
jmespath==0.10.0
joblib==1.5.3
json-logic==0.7.0a0
json5==0.13.0
jsonschema==4.26.0
jsonschema-specifications==2025.9.1
julius==0.2.7
kaldifst==1.7.17
kaldiio==2.18.1
keras==3.13.1
kiwisolver==1.4.9
kornia==0.8.2
kornia_rs==0.1.10
language-tags==1.2.0
lazy_loader==0.4
librosa==0.11.0
llvmlite==0.46.0
local-attention==1.11.2
loguru==0.7.3
Mako==1.3.10
Markdown==3.10
markdown-it-py==4.0.0
markdown2==2.5.4
MarkupSafe==2.1.5
matplotlib==3.10.8
matplotlib-inline==0.2.1
matrix-nio==0.25.2
mdurl==0.1.2
mixpanel==5.1.0
ml_collections==1.1.0
ml_dtypes==0.5.4
modelscope==1.33.0
monotonic-alignment-search==0.2.1
more-itertools==10.8.0
mpmath==1.3.0
msgpack==1.1.2
mss==10.1.0
multidict==6.7.0
multiprocess==0.70.18
munch==4.0.0
namex==0.1.0
networkx==3.6.1
ninja==1.11.1.4
nltk==3.9.2
nodeenv==1.10.0
numba==0.63.1
numpy==2.2.6
nvidia-cublas==13.0.0.19
nvidia-cuda-cupti==13.0.48
nvidia-cuda-nvrtc==13.0.48
nvidia-cuda-runtime==13.0.48
nvidia-cudnn-cu13==9.13.0.50
nvidia-cufft==12.0.0.15
nvidia-cufile==1.15.0.42
nvidia-curand==10.4.0.35
nvidia-cusolver==12.0.3.29
nvidia-cusparse==12.6.2.49
nvidia-cusparselt-cu13==0.8.0
nvidia-nccl-cu13==2.27.7
nvidia-nvjitlink==13.0.39
nvidia-nvshmem-cu13==3.3.24
nvidia-nvtx==13.0.39
omegaconf==2.3.0
onnx==1.20.1
onnx-weekly==1.21.0.dev20260112
onnx2torch-py313==1.6.0
onnxruntime==1.23.2
open_clip_torch==3.2.0
openai-whisper==20250625
opencv-python==4.12.0.88
opencv-python-headless==4.12.0.88
optree==0.18.0
orjson==3.11.5
oss2==2.19.1
packaging==25.0
pandas==2.3.3
parso==0.8.5
pathspec==1.0.3
peft==0.18.1
pexpect==4.9.0
phonemizer==3.3.0
piexif==1.1.3
pillow==12.0.0
pip==25.3
platformdirs==4.5.1
pooch==1.8.2
portalocker==3.2.0
praat-parselmouth==0.4.7
pre_commit==4.5.1
proces==0.1.7
prompt_toolkit==3.0.52
propcache==0.4.1
proto-plus==1.27.0
protobuf==6.33.4
psutil==7.2.1
ptyprocess==0.7.0
pure_eval==0.2.3
pyahocorasick==2.3.0
pyarrow==22.0.0
pyasn1==0.6.2
pyasn1_modules==0.4.2
pycparser==2.23
pycryptodome==3.23.0
pydantic==2.12.5
pydantic-settings==2.12.0
pydantic_core==2.41.5
pydub==0.25.1
pyee==13.0.0
PyGithub==2.8.1
Pygments==2.19.2
PyJWT==2.10.1
pylibsrtp==1.0.0
pyloudnorm==0.2.0
PyNaCl==1.6.2
pynndescent==0.6.0
pyOpenSSL==25.3.0
pyparsing==3.3.1
pyphen==0.17.2
pypinyin==0.55.0
pystoi==0.4.1
python-dateutil==2.9.0.post0
python-dotenv==1.2.1
python-multipart==0.0.21
python-slugify==8.0.4
python-socks==2.8.0
pytorch-wpe==0.0.1
pytz==2025.2
PyWavelets==1.9.0
pyworld==0.3.5
PyYAML==6.0.3
questionary==2.1.1
randomname==0.2.1
rdflib==7.5.0
redis==7.1.0
referencing==0.37.0
regex==2025.11.3
replicate==1.0.7
requests==2.32.5
resampy==0.4.3
resemble-perth==1.0.1
rfc3986==1.5.0
rich==14.2.0
rotary-embedding-torch==0.6.5
rpds-py==0.30.0
rsa==4.9.1
ruamel.yaml==0.18.17
ruamel.yaml.clib==0.2.15
ruff==0.14.11
s3tokenizer==0.3.0
s3transfer==0.10.4
safehttpx==0.1.7
safetensors==0.7.0
SAM-2 @ git+https://github.com/facebookresearch/sam2@2b90b9f5ceec907a1c18123530e92e794ad901a4
samplerate==0.1.0
scikit-image==0.26.0
scikit-learn==1.8.0
scipy==1.17.0
segment-anything==1.0
segments==2.3.0
semantic-version==2.10.0
semver==3.0.4
sentencepiece==0.2.1
sentry-sdk==2.49.0
setuptools==70.2.0
shellingham==1.5.4
six==1.17.0
smmap==5.0.2
sounddevice==0.5.3
soundfile==0.13.1
soxr==1.0.0
spandrel==0.4.1
SQLAlchemy==2.0.45
stack-data==0.6.3
starlette==0.49.3
surrealist==1.1.2
sympy==1.14.0
tensorboard==2.20.0
tensorboard-data-server==0.7.2
tensorboardX==2.6.4
termcolor==3.3.0
text-unidecode==1.3
textsearch==0.0.24
textstat==0.7.12
threadpoolctl==3.6.0
tifffile==2025.12.20
tiktoken==0.12.0
timm==1.0.19
tokenizers==0.22.2
toml==0.10.2
tomlkit==0.13.3
torch==2.9.1+cu130
torch-complex==0.4.4
torch-stoi==0.2.3
torchaudio==2.9.1
torchcrepe==0.0.24
torchdiffeq==0.2.5
torchfcpe==0.0.4
torchsde==0.2.6
torchvision==0.24.1
tqdm==4.67.1
traitlets==5.14.3
trampoline==0.1.2
transformers==4.57.3
triton==3.5.1
typeguard==4.4.4
typer==0.21.1
typer-slim==0.21.1
typing-inspection==0.4.2
typing_extensions==4.15.0
tzdata==2025.3
umap-learn==0.5.11
Unidecode==1.4.0
unpaddedbase64==2.1.0
uritemplate==4.2.0
urllib3==2.6.3
uv==0.9.24
uvicorn==0.40.0
vector-quantize-pytorch==1.27.19
vibevoice @ git+https://github.com/FushionHub/VibeVoice.git@3dd860579757310343749cba9623afde9e69c657
virtualenv==20.36.1
vocos==0.1.0
wandb==0.24.0
wcwidth==0.2.14
websocket-client==1.9.0
websockets==15.0.1
Werkzeug==3.1.5
wetext==0.1.2
wheel==0.45.1
x-transformers==2.14.2
xxhash==3.6.0
yarl==1.22.0
zipp==3.23.0

haidij · January 18, 2026, 12:40pm

Have you tried --gpu-only when starting ComfyUI?

I use python3 main.py --listen 127.0.0.1 --disable-mmap --use-sage-attention --supports-fp8-compute --gpu-only --cache-none --port 8188
