RTX 3090 capable of running Clara Train SDK?

Hi,

I am getting an RTX 3090. Do all the relevant libraries support the new Ampere architecture, so that the Clara AIAA server can be run on the machine?

Hi

Thanks for your interest in Clara Train and AIAA. Yes, new GPUs should work fine as long as you have the latest drivers.
Please let us know if you face any issues.
Also, FYI, we will be releasing Clara Train V3.1 soon, so stay tuned.

Hi aharouni,

Thanks for the reply! I tried running AIAA on the RTX 3090 the same way that I successfully ran it on an NVIDIA RTX 2080 Ti. However, when trying to upload the model with

curl -X PUT "http://172.17.0.2/admin/model/deepGrow" -F "config=@deepGrow/config/config_aiaa.json;type=application/json" -F "data=@deepGrow/models/network.pt"

I get

{"error":{"message":["5","Failed to export model to TRTIS"],"type":"AIAAException"},"success":false}
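A failure like this is easy to miss in a script, because curl exits with status 0 whenever it receives a reply. Below is a minimal sketch that checks the reply body instead. `aiaa_reply_failed` is a hypothetical helper name, and it assumes the reply keeps the JSON shape shown in the error above:

```shell
#!/bin/sh
# Hypothetical helper: reads an AIAA JSON reply on stdin and succeeds
# (exit status 0) if the reply reports "success":false.
# Assumption: the reply uses the JSON shape shown in the error above.
aiaa_reply_failed() {
    grep -Eq '"success"[[:space:]]*:[[:space:]]*false'
}

# Sample reply copied from the error above.
reply='{"error":{"message":["5","Failed to export model to TRTIS"],"type":"AIAAException"},"success":false}'

if printf '%s\n' "$reply" | aiaa_reply_failed; then
    echo "upload failed"
fi
```

In practice the output of the curl command above (with `-s` added to silence the progress meter) could be piped into the helper instead of the sample string.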

Also, when launching the Docker container from the Clara image (clara-train-sdk:v3.0), I get

 WARNING: Detected NVIDIA GeForce RTX 3090 GPU, which is not yet supported in this version of the container
 ERROR: No supported GPU(s) detected to run this container

As you can see, nvidia-smi shows no “trtserver” process, although there should be one while the AIAA server is running.
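The same check can be scripted. This is a minimal sketch, assuming `nvidia-smi` is on the PATH and that the inference server appears under a process name containing `trtserver`, as mentioned above. `lists_process` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: reads a process list on stdin and succeeds if any
# line contains the given name as a fixed string.
lists_process() {
    grep -Fq -- "$1"
}

# Query the compute processes currently using the GPU, if nvidia-smi exists.
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-compute-apps=pid,process_name --format=csv,noheader \
        | lists_process trtserver; then
        echo "trtserver is using the GPU"
    else
        echo "trtserver not found among GPU processes"
    fi
fi
```

Matching on a fixed string keeps the check independent of the full path that nvidia-smi may print for the process.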

This error message is similar to the one at the end of this thread: Covid19 model download - "Invalid model config"

Hi
Could you verify whether you have tried this on the new Clara Train V3.1 that was released on Friday? https://ngc.nvidia.com/catalog/containers/nvidia:clara-train-sdk
It makes sense to see this error with V3.0, since the RTX 3090 was not out at that time.
Thanks

Hi,

I have now tested it with V3.1, and it works without problems. Thanks for the good and quick work!

Getting the same error on V3.1

================
== TensorFlow ==

NVIDIA Release 20.08-tf1 (build 15440644)
TensorFlow Version 1.15.3

Container image Copyright © 2020, NVIDIA CORPORATION. All rights reserved.
Copyright 2017-2020 The TensorFlow Authors. All rights reserved.

NVIDIA Deep Learning Profiler (dlprof) Copyright © 2020, NVIDIA CORPORATION. All rights reserved.

Various files include modifications © NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
WARNING: Detected NVIDIA GeForce RTX 3090 GPU, which is not yet supported in this version of the container
ERROR: No supported GPU(s) detected to run this container

NOTE: MOFED driver for multi-node communication was not detected.
Multi-node communication performance may be reduced.

root@139940a9e943:/opt/nvidia# start_aas.sh --workspace /aiaa-experiments/aiaa-1/aiaa-experiments/aiaa-1/aiaa-launch-config.json

ENGINE:: engine=TRITON
TRITON:: Backend is enabled

TRITON:: triton_ip=localhost
TRITON:: Will setup TRITON Server on localhost

TRITON:: triton_http_port=8000
TRITON:: triton_grpc_port=8001
TRITON:: triton_metrics_port=8002
TRITON:: triton_proto=http
TRITON:: triton_shmem=no
TRITON:: triton_model_path=/aiaa-experiments/aiaa-1/triton_models
TRITON:: triton_verbose=false
TRITON:: triton_log=/aiaa-experiments/aiaa-1/logs/0/triton.log
TRITON:: triton_start_timeout=120
TRITON:: triton_model_timeout=30

TRITON is already stopped
TRITON:: Waiting 1 seconds to fully up…
TRITON:: Server started with pid: 540

AIAA:: aiaa_log_file=/aiaa-experiments/aiaa-1/logs/0/aiaa.log
AIAA:: aiaa_log_dir=/aiaa-experiments/aiaa-1/logs/0
AIAA:: aiaa_workspace=/aiaa-experiments/aiaa-1
AIAA:: aiaa_ssl=false
AIAA:: aiaa_ssl_cert_file=null
AIAA:: aiaa_ssl_pkey_file=null
AIAA:: aiaa_background=false

  • Stopping Apache httpd web server apache2 *
    Site AIAA disabled.
    To activate the new configuration, you need to run:
    service apache2 reload
    Site AIAA-ssl already disabled
    Enabling AIAA site
    Enabling site AIAA.
    To activate the new configuration, you need to run:
    service apache2 reload
    +++++++++++ Starting AIAA Server (press Ctrl+C to stop)…
    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
    AIAA Server stopped!!!
  • Stopping Apache httpd web server apache2 *
    Site AIAA already enabled
    Considering dependency setenvif for ssl:
    Module setenvif already enabled
    Considering dependency mime for ssl:
    Module mime already enabled
    Considering dependency socache_shmcb for ssl:
    Enabling module socache_shmcb.
    Enabling module ssl.
    See /usr/share/doc/apache2/README.Debian.gz on how to configure SSL and create self-signed certificates.
    To activate the new configuration, you need to run:
    service apache2 restart
    Enabling site AIAA-ssl.
    To activate the new configuration, you need to run:
    service apache2 reload
mazino@cardinal:~$ sudo docker images
REPOSITORY                       TAG         IMAGE ID       CREATED         SIZE
nvcr.io/nvidia/clara-train-sdk   v3.1.01     a0db52064cbc   2 months ago    20.6GB
nvidia/cuda                      11.0-base   2ec708416bb8   5 months ago    122MB
hello-world                      latest      bf756fb1ae65   13 months ago   13.3kB
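To confirm which Clara Train SDK tags are present locally without reading the whole table, the listing can be filtered. This is a minimal sketch. `clara_tags` is a hypothetical helper name, and it expects `repository:tag` lines on stdin, such as those printed by `docker images --format '{{.Repository}}:{{.Tag}}'`:

```shell
#!/bin/sh
# Hypothetical helper: reads "repository:tag" lines on stdin and prints
# only the tags that belong to the Clara Train SDK repository.
clara_tags() {
    sed -n 's|^nvcr\.io/nvidia/clara-train-sdk:||p'
}

# Sample lines built from the listing above.
printf '%s\n' \
    'nvcr.io/nvidia/clara-train-sdk:v3.1.01' \
    'nvidia/cuda:11.0-base' \
    'hello-world:latest' | clara_tags
# prints: v3.1.01
```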

I have the same problem with my RTX 3070, while it works well with a GTX 1660.