I have a SKU 12, the vehicle version of the DRIVE Thor, and I was wondering what the correct way to power it is.
I am using a 600 W, 16-pin PCIe 5.0 power cable connected to one of the power connectors on the Thor.
Right now the device boots up and runs. However, the performance feels very bad. I also saw that there are power options I can set for the Thor, but I was not able to set them using nvpmodel.
I contacted Arrow for this power question and they guided me here. Thanks!
May I know why you feel that way? Do you see any issue with your application?
Are you talking about SC7 mode? Please use the MCU console to enter SC7 mode. nvpmodel is used for Jetson devices, not DRIVE Thor.
You can use any video card connector. One end is connected to the DC power connector port, and the other end is connected to the battery using a custom cable. Please check Table 3-4, DC Power Connector Pinning, and section 5.1.2, In-Vehicle Power Input, in the Mechanical Installation Guide for pin connection details.
I was benchmarking Qwen2-VL performance on the Thor using the DriveOS LLM SDK. I followed the exact guidelines here to benchmark FP16 Qwen2-VL-2B-Instruct. The result was that one iteration took about 3 seconds, which is really bad. Hence my doubts about the power supply.
The power mode I talked about is actually the option in the desktop UI. It can be switched between Regular and Power Saving in the top right corner. However, as soon as I open "Settings", the Settings UI crashes.
Currently, we are powering the SKU 12 with a regular PC power supply that provides a 600 W PCIe 5.0 16-pin output. The cable we use looks like this. It seems that connecting either of the two power connectors powers on the device. I was wondering whether connecting both power connectors would make any difference.
The cable is supposed to handle 600 W. I now have two PSUs connected to the Thor and they seem to be working. However, I cannot observe any difference. For benchmarking Qwen2-VL, here is the command I ran and the result I got:
./build/examples/multimodal/vlm_benchmark --llmEnginePath=/media/sda2/thor/thor-ws/fp16/trt_engines/llm.engine --visualEnginePath=/media/sda2/thor/thor-ws/fp16/trt_engines/visual_enc_fp16.engine --modelType="qwen2_vl" --textTokenLength=512 --imageTokenLength=512 --outputLength=256 --warmUp=2 --numRuns=10
[INFO]: ATTENTION_PLUGIN_PATH variable is not set. Default to build/libAttentionPlugin.so
[INFO]: INT4_GEMM_PLUGIN_PATH variable is not set. Default to build/libInt4GemmPlugin.so
[INFO]: Loaded engine size: 1277 MiB
[INFO]: [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +67, now: CPU 0, GPU 1338 (MiB)
[INFO]: Loaded engine size: 2954 MiB
[INFO]: [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +74, now: CPU 0, GPU 4356 (MiB)
[INFO]: [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +73, now: CPU 0, GPU 4429 (MiB)
[INFO]: Switching optimization profile from: 0 to 1. Please ensure there are no enqueued operations pending in this context prior to switching profiles
[INFO]: ========================================================
[INFO]: Benchmarking done. Decoder setup: 4.29s, Iteration: 10, GPU Time: 33.48s.
[INFO]: batch_size: 1
[INFO]: text_token_length per batch: 512
[INFO]: visual_token_length per batch: 512
[INFO]: input_length per batch: 1024
[INFO]: output_length per batch: 256
[INFO]: pipeline_latency(ms): 3392.99
[INFO]: visual_encoder_latency(ms): 45.41
[INFO]: llm_latency(ms): 3347.57
[INFO]: llm_first_token_latency(ms): 33.92
[INFO]: llm_tokens_per_sec: 76.47
[INFO]: llm_generation_latency(ms): 3313.63
[INFO]: llm_generation_tokens_per_sec: 77.26
[INFO]: cpu_peak_mem(GiB): 0.23
[INFO]: gpu_peak_mem(GiB): 4.68
[INFO]: execution_context_mem(GiB): 0.07
[INFO]: ========================================================
You can see that the pipeline latency is more than 3 seconds.
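To rule out a reporting glitch, I also cross-checked the logged throughput numbers against the logged latencies and output length; a minimal check (numbers copied from the log above):

```python
# Sanity-check: the reported tokens/sec should follow directly from the
# logged latencies and the 256-token output length. If they match, the
# benchmark is internally consistent and the slowness is real decode speed.
output_tokens = 256
llm_latency_s = 3.34757          # llm_latency(ms) / 1000
generation_latency_s = 3.31363   # llm_generation_latency(ms) / 1000

tokens_per_sec = output_tokens / llm_latency_s
gen_tokens_per_sec = output_tokens / generation_latency_s

print(f"{tokens_per_sec:.2f}")      # matches logged llm_tokens_per_sec: 76.47
print(f"{gen_tokens_per_sec:.2f}")  # matches logged llm_generation_tokens_per_sec: 77.26
```

So the numbers are self-consistent: at ~76 tokens/s, generating 256 tokens necessarily takes over 3 seconds.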
I remember playing with TensorRT-LLM, where there was a parameter to change the KV cache size. Is there something similar in the DriveOS LLM SDK?
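For context, here is a rough estimate of the FP16 KV cache footprint for this exact run. The model shape values are my assumptions based on the published Qwen2-VL-2B-Instruct config (28 layers, 2 KV heads, head dim 128), so adjust them if the engine was built differently:

```python
# Back-of-the-envelope FP16 KV-cache size for one sequence.
# ASSUMED model shape (from the Qwen2-VL-2B-Instruct config):
#   num_hidden_layers=28, num_key_value_heads=2, head_dim=128.
num_layers = 28
num_kv_heads = 2
head_dim = 128
bytes_per_elem = 2           # FP16
seq_len = 1024 + 256         # input_length + output_length from the benchmark

# Factor of 2 for the separate K and V tensors per layer.
kv_bytes = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem * seq_len
print(f"{kv_bytes / 2**20:.1f} MiB")  # 35.0 MiB
```

If these assumptions hold, the KV cache for this sequence length is only tens of MiB, so a KV-cache size limit alone would not explain the ~3 s latency.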