I have a question about how batch processing works in a Multi-LoRA deployment using NIM. Specifically, I would like to understand how incoming tasks are scheduled and executed in this setup.
Is task loading and execution handled asynchronously, i.e., can the system stream outputs for completed tasks while accepting new incoming tasks? Or is it strictly synchronous, where all inputs in a batch are processed in parallel and their outputs are returned at the same time?
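To make the distinction concrete, here is a toy simulation of the two behaviors I am asking about. It is only a sketch of my mental model and does not use or describe any NIM API: the `Task` class, both function names, and the step-based clock are hypothetical, and each task is assumed to need a fixed number of decode steps.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    arrival: int  # clock step at which the task arrives
    steps: int    # decode steps the task needs to finish


def static_batching(tasks, batch_size):
    """Synchronous model: a batch runs until its longest task is done,
    and every task in the batch is returned at that same moment."""
    finish = {}
    clock = 0
    pending = sorted(tasks, key=lambda t: t.arrival)
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        # The batch cannot start before its last member has arrived.
        start = max(clock, max(t.arrival for t in batch))
        clock = start + max(t.steps for t in batch)
        for t in batch:
            finish[t.name] = clock
    return finish


def continuous_batching(tasks, batch_size):
    """Asynchronous model: after every step, finished tasks leave the
    batch and waiting tasks are admitted into the freed slots."""
    finish = {}
    waiting = sorted(tasks, key=lambda t: t.arrival)
    running = {}  # task name -> remaining steps
    clock = 0
    while waiting or running:
        # Admit tasks that have arrived, as long as a slot is free.
        while (waiting and len(running) < batch_size
               and waiting[0].arrival <= clock):
            t = waiting.pop(0)
            running[t.name] = t.steps
        clock += 1  # one decode step for every running task
        for name in list(running):
            running[name] -= 1
            if running[name] == 0:
                finish[name] = clock  # output is available immediately
                del running[name]
    return finish
```

With `Task("A", 0, 2)`, `Task("B", 0, 6)`, `Task("C", 1, 2)` and `batch_size=2`, the synchronous model finishes A and B together at step 6 and C at step 8, while the asynchronous model finishes A at step 2, C at step 4, and B at step 6. My question is which of these two patterns is closer to what NIM does when the tasks target different LoRA adapters.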
Any clarification on how NIM handles this kind of workload distribution and scheduling in a Multi-LoRA context would be greatly appreciated.