I’m wondering if it is possible to specify and use a general network or host storage location, NFS mount, host folder or something else within a pipeline definition such that it could be mounted at a specified location in one or more operator containers? I’m working with very large (multi-TB) image datasets that already exist on fast storage that I’d like to directly mount into an operator container to use rather than going through the payload mechanism. Instead I’d like to just use payloads for managing job parameters and meta-data etc.
If our data were in DICOM format or held in a PACS system, the supported mechanisms would probably be sufficient; however, this is not applicable or viable in our case.
I’d be happy to hear of any possible workarounds or hacks etc.
There’s nothing preventing this from happening other than the manner in which your network, and the Kubernetes cluster Clara Deploy is deployed into, are configured. A pipeline-job operator is just another containerized process running on the cluster and can have access to anything you grant it access to.
The only officially supported method is to use the payloads system. This is because Clara Platform can ensure that the data is reachable by pipeline-job operators before deploying them.
Unlike a Pipeline Service definition, which can use $(NVIDIA_CLARA_SERVICE_DATA_PATH) to reference the /clara/model folder on the host, an operator definition in the Pipeline Definition file cannot currently use such a common folder. We are sorry about that.
One workaround you can use is as follows.
It assumes that you have root permission on the machine where Clara Platform is running.
1. Create pipeline and job as usual (but without input image data)
With the PAYLOAD_ID, you can find where the input is available on the host machine (/clara/payloads/<PAYLOAD_ID> by default).
In the following example, we upload a payload containing only a configs folder.
$ clara create jobs -n ai-test -p ead19e59043c4fa4b1e9b6dafeee040b -f input/
Payload uploaded successfully.
$ tree /clara/payloads/c0ee2b25b0b7400d90a44e0f45853059/
│ ├── configs
│ │ ├── config.txt
│ │ └── metadata.txt
2. Bind input data to the payload folder
Let’s assume that the large input image data is available at /ssd/data. You can bind-mount that folder into the payload folder (a symbolic link won’t work).
This assumes that the first operator uses the /input/data folder in the container for loading input image data:
sudo mkdir -p /clara/payloads/c0ee2b25b0b7400d90a44e0f45853059/input/data
sudo mount --bind /ssd/data /clara/payloads/c0ee2b25b0b7400d90a44e0f45853059/input/data
3. Start job
clara start job -j 2ec2af3c08e5459ea77363dbd64e40b5
4. Unbind folder
Don’t forget to unbind the folder once the job has finished.
sudo umount /clara/payloads/c0ee2b25b0b7400d90a44e0f45853059/input/data
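Steps 2 and 4 above can be wrapped in two small helper functions so that the bind and unbind always target the same folder. This is only a sketch: the function names, the PAYLOAD_ROOT variable and the DRY_RUN switch are assumptions made for illustration and are not part of Clara. With DRY_RUN=1 (the default here) the commands are only printed, which lets you check the paths before running anything as root.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, PAYLOAD_ROOT and DRY_RUN are illustrative
# assumptions, not Clara features.

PAYLOAD_ROOT="${PAYLOAD_ROOT:-/clara/payloads}"  # default payload location on the host
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print commands, 0 = execute them

# Print the host folder that the first operator sees as /input/data.
payload_input_dir() {
    printf '%s/%s/input/data\n' "$PAYLOAD_ROOT" "$1"
}

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: bind_input <PAYLOAD_ID> <SOURCE_DIR>
bind_input() {
    target="$(payload_input_dir "$1")"
    run sudo mkdir -p "$target"
    run sudo mount --bind "$2" "$target"
}

# Usage: unbind_input <PAYLOAD_ID>   (call only after the job has finished)
unbind_input() {
    run sudo umount "$(payload_input_dir "$1")"
}
```

For the example above you would call `bind_input c0ee2b25b0b7400d90a44e0f45853059 /ssd/data`, start the job, and call `unbind_input c0ee2b25b0b7400d90a44e0f45853059` once it has completed.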
Thanks @gigony and @jwyman,
I did discover $(NVIDIA_CLARA_SERVICE_DATA_PATH) yesterday and tried (unsuccessfully) to use it in both the operator container and the pipeline definition file. Thanks for confirming that this is not an option.
The approach of binding folders is interesting; however, in the solution I am currently developing, all job creation and control is done via gRPC API calls from a remote client, so I’m a bit constrained as to what extra operations I can do in this context.
It appears as though I may be able to use a Kubernetes persistent volume in some way, however my current work with Clara is my first exposure to Kubernetes (I’ve been using Docker for years) and I am a bit unsure of how to configure this specifically for my pipeline operator containers.
I also thought I might be able to achieve what I was looking for via some use of Clara “Services”; however, it seems that these are in the process of being phased out.
I’ll keep thinking and trying out some of the things you have suggested.
I am unsure of your actual use case here, but if you’re looking to avoid uploading the same input multiple times, I recommend that you look at our reusable payload feature. Reusable payloads support publishing of data once and then inclusion in any number of pipeline-jobs as input.
I’m having a play with this now, it may be one way to get around my issue.
The main thing I’m trying to do is minimize the number of large file transfers from one system to another. Our datasets (HDF5 files) are already sitting in well-organised and referenced locations, and I was hoping to just refer to them in a “job definition” file that gets submitted via the payload feature. Ideally, I would then like to access them directly from the operator containers.
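To make the idea concrete, here is a rough sketch of what such a payload might look like: a folder holding nothing but a small job definition that points at the datasets. The file name job.json, its fields, the HDF5 file names and the PAYLOAD_DIR variable are all hypothetical, and the paths assume the storage is somehow visible at /input/data inside the operator container.

```shell
#!/usr/bin/env bash
# Sketch only: job.json, its fields and the dataset paths are hypothetical.

# Folder that will be uploaded as the payload (a temp folder by default).
PAYLOAD_DIR="${PAYLOAD_DIR:-$(mktemp -d)}"
mkdir -p "$PAYLOAD_DIR/configs"

# The payload carries only parameters and references, not the image data.
cat > "$PAYLOAD_DIR/configs/job.json" <<'EOF'
{
  "datasets": [
    "/input/data/scan_0001.h5",
    "/input/data/scan_0002.h5"
  ],
  "parameters": { "threshold": 0.5 }
}
EOF

echo "Job definition written to $PAYLOAD_DIR/configs/job.json"
```

The folder would then be passed to the job creation call as the payload, in the same way as the input/ folder in the CLI example earlier in the thread, and the operator would read job.json to find out which files to open.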
BTW, is there any actual difference in the two PayloadsCreateRequest messages defined in sections 17.52 & 17.60 of the online API docs? They seem to be exactly the same to me.
Adding to the message above, if the docs are correct I can’t currently see how to create a payload of type PAYLOAD_TYPE_REUSABLE as there doesn’t appear to be a mechanism to differentiate the creation of different types of payloads.