txt2kg Knowledge Graph Triple Extraction is slow (more than 10 minutes) using the existing system prompt and LLM model, Ollama Qwen3 1.7B. My sample text is tiny, only 130 kB. Any idea why?
Could you add some more details of what you’ve set up and how it is supposed to work?
Thanks.
Git clone and then run ./start.sh; see below. Thanks, mate.
./start.sh
Checking for GPU support…
✓ NVIDIA GPU detected
GPU: NVIDIA GB10, [N/A]
Using Docker Compose V2
Checking Docker permissions…
✓ Docker permissions OK
Using ArangoDB + Ollama configuration…
Starting services…
Running: docker compose -f /home/brianho/project/dgx-spark-playbooks/nvidia/txt2kg/assets/deploy/compose/docker-compose.yml up -d
[+] Running 8/8
✔ Network compose_txt2kg-network Created 0.0s
✔ Network qdrant-network Created 0.0s
✔ Network compose_default Created 0.0s
✔ Container ollama-compose Started 0.3s
✔ Container compose-arangodb-1 Started 0.2s
✔ Container compose-arangodb-init-1 Started 0.3s
✔ Container compose-backend-1 Started 0.3s
✔ Container compose-app-1 Started 0.4s
==========================================
txt2kg is now running!
Core Services:
• Web UI: http://localhost:3001
• ArangoDB: http://localhost:8529
• Ollama API: http://localhost:11434
Next steps:
- Pull an Ollama model (if not already done):
  docker exec ollama-compose ollama pull llama3.1:8b
- Open http://localhost:3001 in your browser
- Upload documents and start building your knowledge graph!
Other options:
• Stop services: ./stop.sh
• Run frontend in dev mode: ./start.sh --dev-frontend
• Use vLLM (GPU): ./start.sh --vllm
• Add vector search: ./start.sh --vector-search
• View logs: docker compose logs -f
Viewing the Docker logs while the text is processing may help you see if something is slowing down the process.
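For example, you can tail the two relevant containers side by side (container names taken from your start.sh output above; adjust if yours differ):

```shell
# Run each in its own terminal window.
# Backend log: shows chunking, prompting, and per-request timings.
docker logs -f compose-backend-1

# Ollama log: shows model load, GPU offload, and inference timings.
docker logs -f ollama-compose
```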
Also, you can view the “Troubleshooting” tab in the playbook and try out those suggestions.
For example, you could try setting these environment variables to improve performance:
OLLAMA_FLASH_ATTENTION=1 (enables flash attention for better performance)
OLLAMA_KEEP_ALIVE=30m (keeps model loaded for 30 minutes)
OLLAMA_MAX_LOADED_MODELS=1 (avoids VRAM contention)
OLLAMA_KV_CACHE_TYPE=q8_0 (reduces KV cache VRAM with minimal performance impact)
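A quick way to confirm those variables actually made it into the container (a sketch; assumes the container is named ollama-compose, as in your start.sh output):

```shell
# Print the OLLAMA_* tuning variables the server process actually sees
docker exec ollama-compose env | grep '^OLLAMA_'
# You should see lines like OLLAMA_FLASH_ATTENTION=1 and OLLAMA_KV_CACHE_TYPE=q8_0
```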
Thanks, Ani.
Those env params have already been set in the docker-compose.yml; see below. Please confirm this is indeed correct.
  # Ollama - Local LLM inference
  ollama:
    build:
      context: ../services/ollama
      dockerfile: Dockerfile
    image: ollama-custom:latest
    container_name: ollama-compose
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
      - CUDA_VISIBLE_DEVICES=0
      - OLLAMA_FLASH_ATTENTION=1
      - OLLAMA_KEEP_ALIVE=30m
      - OLLAMA_NUM_PARALLEL=4
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_KV_CACHE_TYPE=q8_0
      - OLLAMA_GPU_LAYERS=-1
      - OLLAMA_LLM_LIBRARY=cuda_v13
    networks:
      - default
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
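While a job is running, I can also check whether the model is actually executing on the GPU rather than the CPU (a sketch using the standard ollama ps command; any CPU percentage in the PROCESSOR column would mean partial offload and a big slowdown):

```shell
# Show loaded models and where they run; PROCESSOR should read "100% GPU"
docker exec ollama-compose ollama ps

# Watch GPU utilization on the host while extraction is in progress
nvidia-smi
```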
I will look into the Docker logs and see if I can spot something.
