Question about AODT with RTX3000 series

I have a question about the minimum GPU specification for Nvidia AODT.
Currently, I have 2 x RTX 3090 (24 GB VRAM). Is there any way to run a simple configuration of AODT in my system?

With a glimmer of hope, I tried it on my system: I successfully installed AODT on my workstation; however, attaching a worker is not possible.


Thank you !

@jgjang0123
Looks like the backend services are up and running.
Can you please send:

  1. Are you able to connect to the database from the UI? e.g. you should see this
  2. The output of this command (run on the backend machine):

netstat -an | grep 9000 | grep -i

Hello, @kpasad1.

Thank you for your reply, and apologies for the late response.

I would like to answer on behalf of @jgjang0123.

Regarding the first question, the UI fails to connect to the database. You can see the problem in the attached image.

For the second question, when I ran the following command, the output was blank:

netstat -an | grep 9000

Additionally, the grep -i command is not working either.
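For reference, grep -i on its own only enables case-insensitive matching and still requires a pattern argument, which is likely why the trailing grep -i failed. A plausible intended check (an assumption on our part) filters for a LISTEN socket on port 9000; the sketch below runs it against a simulated netstat line, since real output depends on the machine:

```shell
# Simulated netstat output line (illustrative only, not real AODT output).
# The likely intended filter is "grep -i listen" to keep listening sockets.
printf 'tcp        0      0 0.0.0.0:9000    0.0.0.0:*       LISTEN\n' \
  | grep 9000 | grep -i listen
```

On the real machine, an empty result from netstat -an | grep 9000 simply means nothing is listening on port 9000, which is consistent with the ClickHouse container being down.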

Can you advise us on how to connect the database?

I will reply promptly from now on.

Thank you!

@jgjang0123
Can you run docker ps on the backend terminal and check if there is a clickhouse process running?

@kpasad1 @jgjang0123
We ran

docker ps -a

to investigate the exited containers. Upon inspection, we found that the ClickHouse server had been shut down.

To resolve this, we executed the command:

docker start [container id]

This restarted the database server, and we then verified the connectivity to the UI.

However, we encountered an issue where we were unable to deploy the “attach worker.”

Moreover, we have 2 x RTX 3080 GPUs (12 GB VRAM each), not 2 x RTX 3090s (24 GB VRAM each).

We have deployed one GPU for the frontend and another for the backend. It seems that the backend is experiencing insufficient VRAM, as it requires more than 40 GB of VRAM.

We also plan to install AODT on a server equipped with 2 x RTX 3090 GPUs (24 GB VRAM each) by tomorrow.

By the way, could you suggest any alternative methods to address this issue?

@sjh1753 There are two backend docker containers running. Can you kill both and then start a new one?

@kpasad1 @jgjang0123

We have attempted to remove both containers and restart one of them, but unfortunately, the backend Docker container shuts down after only a few seconds of running.

The issue persists regardless of the configuration, whether using 2 x RTX 3080 GPUs (12 GB VRAM) or a single one.

We have identified the following backend containers that are not functioning in our setup:

We would appreciate any guidance or suggestions for resolving this issue. Thank you for your assistance.

@sjh1753 The backend needs an A100, A10, or L40. We are unable to debug backend issues with the RTX 3090. Can you set up the backend on a qualified system and retry?

@kpasad1 @jgjang0123

Sadly, we currently don’t have a qualified GPU. After testing with the RTX 3090s, we plan to buy suitable GPUs :D Thank you for your kind reply.

Did you use docker-compose to restart the backend containers when switching from SM89 to SM86?

@guofachang

No, we just remove the SM89 container and then use the command below:

docker run -d --gpus '"device=0, 1"' [backend-container-name]

or

docker run -d --gpus '"device=1"' [backend-container-name]

We verified that this command successfully deploys the GPUs in the created container.

Maybe you could try the commands below in the directory of backend_bundle/docker-compose.yml:

docker compose down

Then, generate the file docker-compose-sm86.yml, copied from docker-compose.yml, with the
connector’s image modified from nvcr.io/esee5uzbruax/aodt-sim:1.1.0_runtime_$GEN_CODE

to nvcr.io/esee5uzbruax/aodt-sim:1.1.0_runtime_SM86

docker compose -f docker-compose-sm86.yml up
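If editing the file by hand is inconvenient, the copy-and-substitute step can also be scripted. The sketch below makes an assumption about the file layout: it creates a minimal stand-in docker-compose.yml (the real bundle has more services) and rewrites the $GEN_CODE placeholder to SM86 with sed:

```shell
# Create an illustrative fragment standing in for the real docker-compose.yml.
# (The quoted heredoc delimiter keeps $GEN_CODE literal.)
cat > docker-compose.yml <<'EOF'
services:
  connector:
    image: nvcr.io/esee5uzbruax/aodt-sim:1.1.0_runtime_$GEN_CODE
EOF

# Copy the file, pinning the image tag to the SM86 (RTX 30-series) build.
sed 's/\$GEN_CODE/SM86/' docker-compose.yml > docker-compose-sm86.yml

# Inspect the result: the image line should now end in 1.1.0_runtime_SM86.
grep 'image:' docker-compose-sm86.yml
```

After generating the file, the same docker compose -f docker-compose-sm86.yml up invocation applies.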


@guofachang

Thanks to your guidance, we successfully resolved the issue. We hadn’t realized that the backend container configuration was in the backend_bundle folder, and we had also been mishandling the Docker containers individually.

We’re going to explore AODT with pleasure. Have a nice day!