Fine-tuning FLUX.1-dev (Dockerized SimpleTuner) on DGX Spark

Hi everyone!

After going through some of the playbooks, I wanted to experiment further with fine-tuning on the DGX Spark. I came across SimpleTuner, and what was supposed to be a quick test turned into many hours of trying to get it to work: the usual friction with the arm64 architecture, plus getting OpenCV to build against CUDA 13.0.

Since I’ve already spent the hours fixing the problems I ran into, I packaged everything into a Docker-based workflow so others can use it if they want. You can find the repo here: https://github.com/provos/dgx-spark-fine-tuning-workflow

It includes tools for downloading regularization images, captioning images, fine-tuning, and running inference.

Hope this saves someone some time :)

PS: With my current settings (2000 steps, LoRA rank 256, Prodigy optimizer, gradient accumulation of 2), training takes about 10 hours. I noticed the official NVIDIA DreamBooth example runs in about 4 hours but uses a gradient accumulation of 6. I'm not sure what explains the discrepancy.


Thanks!

Any performance metrics for this?

2000 steps with gradient accumulation of 2 took about 10 hours. The DreamBooth script from one of NVIDIA's example playbooks took about 4 hours with gradient accumulation of 6; I don’t know what the step equivalent would be. That said, here is an image at step 1500 for my cat/dog example. Earlier steps already looked pretty good, too. So if you do 1000 steps with gradient accumulation of 1 (half the steps and half the accumulation), you might get decent results in a quarter of the time, i.e. about 2.5 hours.
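The quarter-of-the-time estimate can be sketched as a quick calculation. This is only a rough model: it assumes training time scales linearly with the number of forward/backward passes (steps × gradient accumulation) and that everything else (batch size, resolution, LoRA rank, optimizer) stays the same. The function name `estimate_hours` is made up for illustration; the reference numbers are the 2000 steps, accumulation of 2, and 10 hours reported above.

```python
def estimate_hours(steps, grad_accum,
                   ref_steps=2000, ref_grad_accum=2, ref_hours=10.0):
    """Rough training-time estimate, scaled from a reference run.

    Assumption: wall-clock time is proportional to the number of
    forward/backward passes, i.e. steps * grad_accum.
    """
    ref_passes = ref_steps * ref_grad_accum   # 4000 passes in the reference run
    passes = steps * grad_accum
    return ref_hours * passes / ref_passes


# 1000 steps with gradient accumulation of 1: a quarter of the passes
print(estimate_hours(1000, 1))  # 2.5
```

This says nothing about why the NVIDIA example finishes in about 4 hours, since its step count and other settings are not known here.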

On the other hand, you could just go to Nano Banana Pro and get results in a minute :-)


This is really good stuff @provos. Thanks for sharing this one. Appreciated.


I want to generate 1000 images for personal use.

I need a local system. The Spark seems like a good fit for that, but it doesn't seem to work as expected…

It depends on the size of the images and the number of steps; this is all tunable. You can generate images pretty fast with this playbook:

What is “pretty fast” for you?

Locally? 30 s to 1 min per image, at 20-50 steps.

Potentially you could generate 1000 images in roughly 16 hours (at about 1 minute per image), depending on the quality settings.
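For a rough sense of how the per-image time adds up, here is a minimal sketch. It assumes images are generated one after another with a constant time per image and ignores model loading and any batching; the function name `batch_hours` is just for illustration.

```python
def batch_hours(num_images, seconds_per_image):
    """Total wall-clock hours for sequential generation at a fixed per-image time."""
    return num_images * seconds_per_image / 3600


# The 30 s to 1 min range mentioned above, for 1000 images
for secs in (30, 60):
    print(f"{secs} s/image -> {batch_hours(1000, secs):.1f} h")
# 30 s/image -> 8.3 h
# 60 s/image -> 16.7 h
```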

ChatGPT 5.1 Plus takes roughly 1 min per image.

Thank you, @provos. I really appreciate your work. Could you tell me how fast inference is? How long does it take to generate one image? Thank you!

Around 80 sec per image (with 30 steps)! Happy now!
