Please provide the following information when requesting support.
• Hardware: AWS
• Network Type: LPD/LPRNet (the models I want to use)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
I followed the steps here: Setup - NVIDIA Docs
I am doing this in a SageMaker Studio image terminal. For step 1, I installed the NGC CLI first.
For step 3 in the Deployment section: Setup - NVIDIA Docs
- I don't add anything to the optional API params.
For step 4:
- I customised the set-up.sh file (attached) to remove sudo and make other minor changes, such as hardcoding the Terraform location.
setup-no-sudo (1).txt (14.3 KB)
When I run the command to install with the .sh file, here are my inputs:
(image removed)
I used ssh-keygen to generate the SSH public key.
For the API chart values, the file is empty.
I am met with this error:
What is the issue here? I've attached the main.tf file:
main_tf.txt (4.8 KB)
In AWS, I see the cluster active with 1 node. In S3, I see a folder in the bucket containing the cluster and config.
Can someone please provide some visibility into what's going on? I could not get the Python wheels TAO installation to work on SageMaker Studio, and this isn't working either. I'm keen to try out your models and the TAO Toolkit, but I may need to move to something else.