NAME                                                  READY   STATUS             RESTARTS   AGE
etcd-etcd-deployment-6745896d58-nz5c4                 1/1     Running            0          4m39s
milvus-milvus-deployment-7bfd5c795b-r9f6c             1/1     Running            0          4m39s
minio-minio-deployment-7cf966bb89-c4jpx               1/1     Running            0          4m39s
nemo-embedding-embedding-deployment-689d64765-tbd5q   0/1     ImagePullBackOff   0          4m39s
nemo-rerank-ranking-deployment-865fdd9c67-w76lz       0/1     ImagePullBackOff   0          4m39s
neo4j-neo4j-deployment-5cdf686bcb-96vxp               1/1     Running            0          4m39s
vss-blueprint-0                                       1/1     Running            0          4m39s
vss-vss-deployment-55fb8cf6d8-p6t9x                   0/1     Pending            0          4m38s
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    1   N/A  N/A     91866      C   /opt/nim/llm/.venv/bin/python3            77712MiB |
|    2   N/A  N/A     91867      C   /opt/nim/llm/.venv/bin/python3            77718MiB |
+---------------------------------------------------------------------------------------+
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  91866 centific  20   0  308.1g  41.9g   2.6g S   0.0   4.2   1:11.37 pt_main_thread
  91867 centific  20   0  301.1g  41.9g   2.6g R 200.3   4.2   9:28.43 pt_main_thread
  90560 7474      20   0   52.1g   1.5g  22460 S   0.3   0.1   0:25.71 java
   3359 root      20   0 2300708 954716  97888 S   2.7   0.1  17:09.27 kubelite
   2058 root      20   0 2020812 594992  20036 S   1.0   0.1   1:30.55 k8s-dqlite
  89304 root      20   0 5197820 278344 150760 S   1.0   0.0   0:13.19 milvus
  89234 root      20   0 3736424 184792  48068 S   0.0   0.0   0:00.82 minio
  23310 root      20   0 3670996 172808  30052 S   0.0   0.0   0:07.21 dcgm-exporter
   1452 root      19  -1  173580  94912  92776 S   0.0   0.0   0:01.07 systemd-journal
   2261 root      20   0 3688208  76988  54664 S   0.0   0.0   0:00.42 dockerd
kubectl logs vss-vss-deployment-55fb8cf6d8-p6t9x
Defaulted container "vss" out of: vss, check-milvus-up (init), check-neo4j-up (init), check-llm-up (init)
Name: vss-vss-deployment-55fb8cf6d8-p6t9x
Namespace: default
Priority: 0
Service Account: default
Node:
Labels: app=vss-vss-deployment
app.kubernetes.io/instance=vss-blueprint
app.kubernetes.io/name=vss
generated_with=helm_builder
hb_version=1.0.0
microservice_version=0.0.1
msb_version=2.5.0
pod-template-hash=55fb8cf6d8
Annotations: checksum/vss-configs-cm: 8a3bc5b52a74ba5abf15b4261a6e084ba8ca97e2742ca11d446ec09f6cdef4d5
checksum/vss-external-files-cm: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
checksum/vss-scripts-cm: 81ae92a369cf8a653a2cc7136469aebb35734cd7f678d13ba3b8a2bfe97f204a
checksum/vss-workload-cm: 08bd480e12de5c6d8e1e298b2769bd8457a92cd2b01881cd1271fc20d92256e8
Status: Pending
IP:
IPs:
Controlled By: ReplicaSet/vss-vss-deployment-55fb8cf6d8
Init Containers:
check-milvus-up:
Image: busybox:1.28
Port: <none>
Host Port: <none>
Command:
sh
-c
until nc -z -w 2 milvus-milvus-deployment-milvus-service 19530; do echo waiting for milvus; sleep 2; done
Limits:
nvidia.com/gpu: 1
Requests:
nvidia.com/gpu: 1
Environment: <none>
Mounts:
/opt/configs from configs-volume (rw)
/opt/scripts from scripts-cm-volume (rw)
/opt/workload-config from workload-cm-volume (rw)
/secrets/graph-db-password from secret-graph-db-password-volume (ro,path="graph-db-password")
/secrets/graph-db-username from secret-graph-db-username-volume (ro,path="graph-db-username")
/secrets/ngc-api-key from secret-ngc-api-key-volume (ro,path="ngc-api-key")
/secrets/openai-api-key from secret-openai-api-key-volume (ro,path="openai-api-key")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-97g2h (ro)
check-neo4j-up:
Image: busybox:1.28
Port: <none>
Host Port: <none>
Command:
sh
-c
until nc -z -w 2 neo-4-j-service 7687; do echo waiting for neo4j; sleep 2; done
Limits:
nvidia.com/gpu: 1
Requests:
nvidia.com/gpu: 1
Environment: <none>
Mounts:
/opt/configs from configs-volume (rw)
/opt/scripts from scripts-cm-volume (rw)
/opt/workload-config from workload-cm-volume (rw)
/secrets/graph-db-password from secret-graph-db-password-volume (ro,path="graph-db-password")
/secrets/graph-db-username from secret-graph-db-username-volume (ro,path="graph-db-username")
/secrets/ngc-api-key from secret-ngc-api-key-volume (ro,path="ngc-api-key")
/secrets/openai-api-key from secret-openai-api-key-volume (ro,path="openai-api-key")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-97g2h (ro)
check-llm-up:
Image: curlimages/curl:latest
Port: <none>
Host Port: <none>
Command:
sh
-c
Args:
while ! curl -s -f -o /dev/null http://llm-nim-svc:8000/v1/health/live; do
echo "Waiting for LLM..."
sleep 2
done
Limits:
nvidia.com/gpu: 1
Requests:
nvidia.com/gpu: 1
Environment: <none>
Mounts:
/opt/configs from configs-volume (rw)
/opt/scripts from scripts-cm-volume (rw)
/opt/workload-config from workload-cm-volume (rw)
/secrets/graph-db-password from secret-graph-db-password-volume (ro,path="graph-db-password")
/secrets/graph-db-username from secret-graph-db-username-volume (ro,path="graph-db-username")
/secrets/ngc-api-key from secret-ngc-api-key-volume (ro,path="ngc-api-key")
/secrets/openai-api-key from secret-openai-api-key-volume (ro,path="openai-api-key")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-97g2h (ro)
Containers:
vss:
Image: nvcr.io/nvidia/blueprint/vss-engine:2.0-ea
Port: 8000/TCP
Host Port: 0/TCP
Command:
bash
/opt/scripts/start.sh
Limits:
nvidia.com/gpu: 1
Requests:
nvidia.com/gpu: 1
Liveness: http-get http://:http-api/health/live delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:http-api/health/ready delay=5s timeout=1s period=5s #success=1 #failure=3
Startup: http-get http://:http-api/health/ready delay=0s timeout=1s period=10s #success=1 #failure=180
Environment:
VLM_MODEL_TO_USE: vila-1.5
MODEL_PATH:
DISABLE_GUARDRAILS: false
TRT_LLM_MODE:
VLM_BATCH_SIZE:
VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME:
VIA_VLM_ENDPOINT:
VIA_VLM_API_KEY:
OPENAI_API_VERSION:
AZURE_OPENAI_API_VERSION:
Mounts:
/opt/configs from configs-volume (rw)
/opt/scripts from scripts-cm-volume (rw)
/opt/workload-config from workload-cm-volume (rw)
/secrets/graph-db-password from secret-graph-db-password-volume (ro,path="graph-db-password")
/secrets/graph-db-username from secret-graph-db-username-volume (ro,path="graph-db-username")
/secrets/ngc-api-key from secret-ngc-api-key-volume (ro,path="ngc-api-key")
/secrets/openai-api-key from secret-openai-api-key-volume (ro,path="openai-api-key")
/tmp/via-ngc-model-cache from ngc-model-cache-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-97g2h (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
ngc-model-cache-volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: vss-ngc-model-cache-pvc
ReadOnly: false
workload-cm-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vss-workload-cm
Optional: false
configs-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vss-configs-cm
Optional: false
scripts-cm-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vss-scripts-cm
Optional: false
secret-openai-api-key-volume:
Type: Secret (a volume populated by a Secret)
SecretName: openai-api-key-secret
Optional: false
secret-ngc-api-key-volume:
Type: Secret (a volume populated by a Secret)
SecretName: ngc-api-key-secret
Optional: false
secret-graph-db-username-volume:
Type: Secret (a volume populated by a Secret)
SecretName: graph-db-creds-secret
Optional: false
secret-graph-db-password-volume:
Type: Secret (a volume populated by a Secret)
SecretName: graph-db-creds-secret
Optional: false
kube-api-access-97g2h:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
Warning FailedScheduling 71s default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
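
The FailedScheduling event above reports that the only node has no free nvidia.com/gpu for this pod. As a rough sketch of the arithmetic behind that message: Kubernetes takes a pod's effective request for a resource to be the larger of its biggest init-container request and the sum of its app-container requests. The three init containers requesting one GPU each therefore do not add up, and the pod needs one GPU in total. The function names and the node figures below are hypothetical and not taken from this cluster; the real figures appear under Allocatable and Allocated resources in `kubectl describe node`.

```python
def effective_request(init_requests, app_requests):
    """Effective pod request for one resource (ignores pod overhead and sidecars)."""
    # Init containers run one at a time, so only the largest one counts;
    # app containers run together, so their requests are summed.
    return max(max(init_requests, default=0), sum(app_requests))


def fits(allocatable, already_requested, pod_request):
    """True if the node's unreserved capacity covers the pod's request."""
    return allocatable - already_requested >= pod_request


# nvidia.com/gpu requests as listed in the describe output above.
init_gpu = [1, 1, 1]  # check-milvus-up, check-neo4j-up, check-llm-up
app_gpu = [1]         # vss

needed = effective_request(init_gpu, app_gpu)
print("GPUs the pod needs:", needed)

# HYPOTHETICAL node figures, for illustration only.
node_allocatable = 2
node_already_requested = 2
print("fits:", fits(node_allocatable, node_already_requested, needed))
```

If the node's allocatable GPU count minus the GPUs already requested by other pods is below the value computed here, the scheduler reports Insufficient nvidia.com/gpu, as in the event above.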