Hi,
I am getting a RTX 3090, and I am wondering if all the relevant libraries support the new Ampere architecture (so that the Clara AIAA-server can be run on the machine)?
Hi
Thanks for your interest in Clara Train and AIAA. Yes, new GPUs should work fine as long as you have the latest drivers.
Please let us know if you face any issues.
Also FYI we will be releasing Clara Train V3.1 soon so stay tuned
Hi aharouni,
thanks for the reply! I tried running AIAA on the RTX 3090 the same way that I successfully ran it on an NVIDIA RTX 2080 Ti. However, when trying to upload the model with
curl -X PUT "http://172.17.0.2/admin/model/deepGrow" -F "config=@deepGrow/config/config_aiaa.json;type=application/json" -F "data=@deepGrow/models/network.pt"
I get
{"error":{"message":["5","Failed to export model to TRTIS"],"type":"AIAAException"},"success":false}
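For anyone scripting the upload, here is a minimal sketch of how a reply with this shape could be checked in Python. The names check_aiaa_reply and AIAAUploadError are hypothetical helpers of mine, not part of the AIAA API, and the only reply format assumed is the one pasted above.

```python
import json


class AIAAUploadError(RuntimeError):
    """Hypothetical exception: raised when a reply reports success=false."""


def check_aiaa_reply(body: str) -> dict:
    """Parse a JSON reply and raise if it has the error shape shown above."""
    reply = json.loads(body)
    if reply.get("success") is False:
        error = reply.get("error", {})
        # In the reply above, "message" is a list of [code, text].
        message = error.get("message", [])
        if isinstance(message, list):
            text = ": ".join(str(part) for part in message)
        else:
            text = str(message)
        raise AIAAUploadError(f"{error.get('type', 'UnknownError')}: {text}")
    return reply


if __name__ == "__main__":
    # The exact reply from the failed upload above.
    sample = (
        '{"error":{"message":["5","Failed to export model to TRTIS"],'
        '"type":"AIAAException"},"success":false}'
    )
    try:
        check_aiaa_reply(sample)
    except AIAAUploadError as exc:
        print(exc)
```

This only makes the failure explicit in a script. It does not explain why the export fails.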
Also, upon the launch of the docker with Clara docker-image (clara-train-sdk:v3.0) I get
WARNING: Detected NVIDIA GeForce RTX 3090 GPU, which is not yet supported in this version of the container
ERROR: No supported GPU(s) detected to run this container
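In case it helps others spot this early, below is a rough Python sketch that scans captured container startup output for these two messages. The function name scan_startup_output is hypothetical, and the patterns are taken only from the wording of the two lines above, so other container versions may phrase them differently.

```python
import re

# Patterns based on the two startup lines quoted above (assumed wording).
UNSUPPORTED_GPU = re.compile(
    r"WARNING: Detected (?P<gpu>.+?) GPU, which is not yet supported"
)
NO_SUPPORTED_GPU = "ERROR: No supported GPU(s) detected"


def scan_startup_output(text: str) -> dict:
    """Return the GPUs flagged as unsupported and whether the fatal line appears."""
    unsupported = [m.group("gpu") for m in UNSUPPORTED_GPU.finditer(text)]
    return {
        "unsupported_gpus": unsupported,
        "no_supported_gpu": NO_SUPPORTED_GPU in text,
    }


if __name__ == "__main__":
    banner = (
        "WARNING: Detected NVIDIA GeForce RTX 3090 GPU, which is not yet "
        "supported in this version of the container\n"
        "ERROR: No supported GPU(s) detected to run this container\n"
    )
    print(scan_startup_output(banner))
```

The text could come from the saved output of the container start, for example a log file captured when launching the image.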
As you can see, nvidia-smi shows no “trtserver” process, although there should be one, since the AIAA server is currently running.
This error message is similar to the one at the end of this thread: Covid19 model download - "Invalid model config"
Hi
Could you clarify/verify whether you have tried this on the new Clara Train V3.1 that was released on Friday? https://ngc.nvidia.com/catalog/containers/nvidia:clara-train-sdk
It makes sense to see this error with V3.0, since the RTX 3090 was not out at that time.
Thanks
Hi,
I have now tested it with V3.1, and it works without problems. Thanks for the good and quick work!
Getting the same error on V3.1
================
== TensorFlow ==
NVIDIA Release 20.08-tf1 (build 15440644)
TensorFlow Version 1.15.3
Container image Copyright © 2020, NVIDIA CORPORATION. All rights reserved.
Copyright 2017-2020 The TensorFlow Authors. All rights reserved.
NVIDIA Deep Learning Profiler (dlprof) Copyright © 2020, NVIDIA CORPORATION. All rights reserved.
Various files include modifications © NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
WARNING: Detected NVIDIA GeForce RTX 3090 GPU, which is not yet supported in this version of the container
ERROR: No supported GPU(s) detected to run this container
NOTE: MOFED driver for multi-node communication was not detected.
Multi-node communication performance may be reduced.
root@139940a9e943:/opt/nvidia# start_aas.sh --workspace /aiaa-experiments/aiaa-1/aiaa-experiments/aiaa-1/aiaa-launch-config.json
ENGINE:: engine=TRITON
TRITON:: Backend is enabled
TRITON:: triton_ip=localhost
TRITON:: Will setup TRITON Server on localhost
TRITON:: triton_http_port=8000
TRITON:: triton_grpc_port=8001
TRITON:: triton_metrics_port=8002
TRITON:: triton_proto=http
TRITON:: triton_shmem=no
TRITON:: triton_model_path=/aiaa-experiments/aiaa-1/triton_models
TRITON:: triton_verbose=false
TRITON:: triton_log=/aiaa-experiments/aiaa-1/logs/0/triton.log
TRITON:: triton_start_timeout=120
TRITON:: triton_model_timeout=30
TRITON is already stopped
TRITON:: Waiting 1 seconds to fully up…
TRITON:: Server started with pid: 540
AIAA:: aiaa_log_file=/aiaa-experiments/aiaa-1/logs/0/aiaa.log
AIAA:: aiaa_log_dir=/aiaa-experiments/aiaa-1/logs/0
AIAA:: aiaa_workspace=/aiaa-experiments/aiaa-1
AIAA:: aiaa_ssl=false
AIAA:: aiaa_ssl_cert_file=null
AIAA:: aiaa_ssl_pkey_file=null
AIAA:: aiaa_background=false
mazino@cardinal:~$ sudo docker images
REPOSITORY                       TAG         IMAGE ID       CREATED         SIZE
nvcr.io/nvidia/clara-train-sdk   v3.1.01     a0db52064cbc   2 months ago    20.6GB
nvidia/cuda                      11.0-base   2ec708416bb8   5 months ago    122MB
hello-world                      latest      bf756fb1ae65   13 months ago   13.3kB
I have the same question with my RTX 3070; it works well with a GTX 1660.