Aurix CAN communication performance issues

Software Version: NVIDIA DRIVE™ Software 10.0 (Linux)
Target Operating System: Linux
Hardware Platform: NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
SDK Manager Version: not marked (template option: 1.8.0.10363)
Host Machine Version: native Ubuntu 18.04

Hi everyone,

We are using the DRIVE AGX Xavier platform, and our application requires 4 CAN buses to communicate with the vehicle.

As mentioned in the documentation, only CAN 2 and CAN 6 are directly accessible from the Xavier processors, while the remaining 4 channels are accessed through the Aurix MCU.

Since our system requires 4 CAN buses, we opted for the Aurix CAN configuration and are currently running the first tests. During testing we noticed that the Aurix CAN buses perform poorly: the message period is not always steady at 10 ms.

The plot below shows the performance of Aurix CAN when running the default sample with the command:

./sample_canbus_logger --driver=can.aurix --params=ip=10.42.0.146,bus=a,config-file=/usr/local/driveworks/data/samples/sensors/can/EasyCanConfigFile.conf --send_i_understand_implications=10 --send_id=150 --send_size=8

As you can see, across the 4000 recorded samples the period is almost always 10 ms, but every now and then it spikes to 20, 30, 40 and even 100 or 110 ms, which is not acceptable for robust CAN communication.
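One way to quantify such spikes from logged receive timestamps is to compute the inter-arrival periods and flag the ones that deviate from the nominal 10 ms. The following is only a minimal sketch: the function name `period_stats`, the 50% tolerance, and the timestamps are illustrative assumptions, not data from the measurement above.

```python
from statistics import mean


def period_stats(timestamps_ms, nominal_ms=10.0, tolerance=0.5):
    """Summarize inter-arrival periods of a message stream.

    timestamps_ms: receive times in milliseconds, in arrival order.
    tolerance: allowed deviation as a fraction of nominal_ms (assumed value).
    """
    # Period i is the gap between message i and message i + 1.
    periods = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    # Record (index of the late message, period) for every out-of-tolerance gap.
    outliers = [
        (i + 1, p)
        for i, p in enumerate(periods)
        if abs(p - nominal_ms) > tolerance * nominal_ms
    ]
    return {"mean_ms": mean(periods), "max_ms": max(periods), "outliers": outliers}


# Synthetic timestamps for illustration only (not measured data).
example = [0.0, 10.0, 20.0, 50.0, 60.0, 70.0]
stats = period_stats(example)
print(stats)
```

Running the same function on the logs of both drivers would give comparable numbers (mean, maximum, count of outliers) rather than a visual impression from the plots.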

For comparison, when running the exact same sample with the same arguments over SocketCAN, the performance is perfect.

./sample_canbus_logger --driver=can.socket --params=device=can0 --send_i_understand_implications=10 --send_id=150 --send_size=8

As you can see in the capture below, for 4000 samples the period stays at 10 ms, with a very small error of about 5%.
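For reference, the SocketCAN path can also be exercised without DriveWorks. The sketch below assumes Linux raw CAN sockets through Python's standard `socket` module and an interface named `can0`, as in the command above. The helper names `pack_frame` and `send_periodic`, the ID `0x150`, and the payload are illustrative assumptions, not part of the sample.

```python
import socket
import struct
import time

# Layout of a classical SocketCAN frame: 32-bit CAN ID, 1-byte data length,
# 3 padding bytes, 8 data bytes (16 bytes in total).
CAN_FRAME_FMT = "=IB3x8s"


def pack_frame(can_id, data):
    """Pack a classical CAN frame for a raw SocketCAN socket."""
    if len(data) > 8:
        raise ValueError("classical CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def send_periodic(interface, can_id, data, period_s, count):
    """Send `count` frames at a fixed period (Linux only)."""
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as sock:
        sock.bind((interface,))
        frame = pack_frame(can_id, data)
        # Schedule against absolute deadlines so sleep errors do not accumulate.
        next_deadline = time.monotonic()
        for _ in range(count):
            sock.send(frame)
            next_deadline += period_s
            time.sleep(max(0.0, next_deadline - time.monotonic()))


# Example call (requires a configured CAN interface, so it is left commented out):
# send_periodic("can0", 0x150, bytes(8), 0.010, 4000)
```

Scheduling against absolute deadlines keeps the average period at the requested value even when individual sleeps overshoot, which makes the sender a fair baseline when comparing jitter between the two drivers.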

What is going wrong here? Is there an inherent problem with the Aurix CAN interface that creates these delays, and can we configure Aurix CAN in a more robust way?

Since our application requires 4 CAN buses this is a major issue for us, so please advise us on what to do.

Dear @aathanas1,
We will check internally and update you.

Hi,

Do you have any update on the topic? We are quite stuck at this point, since we cannot establish vehicle communication. Please let me know if there is anything else I can do to support your analysis.

Dear @aathanas1,
The EasyCAN interface (in DRIVE OS 5.1), i.e. an Ethernet packet sent from Xavier to Aurix, which Aurix then forwards to CAN, is intended only for demos and not for high-performance use cases.

We deprecated EasyCAN when we moved to the Vector Aurix firmware (CAN Gateway in DRIVE OS 5.2).

Hi and many thanks for your answer.

Is it possible to update only the AURIX MCU Firmware without updating DRIVE OS + DW? We would like to still use DRIVE SW 10, so please let us know if it is possible.

Additionally, after updating to the Vector FW, which is intended for high-performance use cases, will we be able to use 4 CAN buses at 500 kbps? I guess that depends on the details of the Gateway solution you have used.

I found documentation for the EasyCAN solution (DRIVE-V5.1.6-E3550-EB-Aurix). Can you please point me to the corresponding documentation for the Vector Aurix FW?

It would also be great if you could point me to the documentation for updating DRIVE SW 10 to DRIVE OS 5.2 + DW 4.0 and for downgrading back to DRIVE SW 10, in case that is needed.

Many thanks in advance.

Hi @SivaRamaKrishnaNV

We are about to flash NVIDIA DRIVE to the newest DW 4.0 + DRIVE OS 5.2.6. From what I see in the documentation (DriveWorks SDK Reference: Receiving and Sending Data), in order to access Aurix CAN we need to set a PDU ID to CAN ID mapping that matches the compiled “security model”. I do not understand how to construct this mapping, or how to check which security model will be flashed. Apart from this difference in sensor initialization (the CAN sensor “params”), the remaining utilities should be the same as what we already know, e.g. dwSensorCAN_readMessage() to receive, dwSensorCAN_sendMessage() to transmit, etc.

Can you please provide some feedback on our questions?

Dear @aathanas1,
Please see if the discussion at Compiling Aurix CAN security model helps. If not, let us know what needs to be clarified.

@SivaRamaKrishnaNV
Many thanks for your reply. Unfortunately, I read that the updated Vector FW is fixed to one specific CAN ID per CAN channel, using PDUR to forward the data between the MCU and Xavier A over Ethernet more efficiently.

This is not useful since we need to communicate multiple CAN IDs of various ID numbers. Unfortunately, my understanding is that this does not solve the problem.

Probably the only option to use the system would be to modify and re-flash the Vector FW in order to define our own CAN ID-to-PDU relations. How can we do that?

It doesn’t look like the DRIVE OS upgrade will solve our problems since the bus will be useless again.

From all the documentation I have read, I fear there is no simple way to use multiple CAN interfaces out of the box at all, which seems very strange.

If the only way forward is to reflash the Vector FW, please provide us with some guidelines on how to set this up.

Hi @SivaRamaKrishnaNV

Can we follow up on this please? Since the Vector FW solution in DRIVE OS 5.2 is fixed to transmitting only a single, predefined CAN ID per channel, how can we set up CAN to transmit/receive multiple CAN IDs from the Aurix processor?

Dear @aathanas1,
You need to rebuild the Vector FW for your requirement, so you have to license Vector MICROSAR and create your own FW. Could you contact Vector about this issue?

Hi @SivaRamaKrishnaNV and thanks for your reply.

Can you give me any Vector contacts that I will need for this (PM me if necessary)? I suppose this is not just “Vector MICROSAR”, but that it also contains customizations for the NVIDIA platform on top of their MICROSAR solution, which I will probably need to convey to them.

As far as I understand, you (as NVIDIA), do NOT handle any kind of support for this issue and the only option would be direct communication with your supplier (Vector).

Before we jump into something like this, which will cost both time and money, can you tell us if there is any evidence that it will work in the end? Has any of your customers successfully used CAN communication from AURIX at a realistic CAN bus load (e.g. >5 CAN IDs), or have you (as NVIDIA) tested this solution? To be precise, I mean testing the Vector FW solution with a CAN load of >5 CAN IDs at periods of 10 - 20 ms. Can you confirm (as NVIDIA) that this works and has been tested?

I fear that the design of forwarding the messages between Xavier A and Aurix may not be sufficient for a load of a few tens of CAN IDs, even if we do modify the FW.
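To put rough numbers on such a load, the raw bus utilization can be estimated from the frame length and the message rate. This is only a back-of-the-envelope sketch under stated assumptions: classical CAN with 11-bit IDs, 8 data bytes per frame, worst-case bit stuffing, and every ID sent at the same period. The function names are illustrative, and the result says nothing about the Ethernet forwarding path between Aurix and Xavier.

```python
def worst_case_frame_bits(data_bytes):
    """Worst-case length of a classical CAN frame with an 11-bit ID.

    47 bits of fixed overhead (including the 3-bit interframe space) plus the
    data bits, plus the maximum number of stuff bits over the 34 + 8n bits
    that are subject to stuffing.
    """
    if not 0 <= data_bytes <= 8:
        raise ValueError("classical CAN carries 0 to 8 data bytes")
    stuffable_bits = 34 + 8 * data_bytes
    return 47 + 8 * data_bytes + (stuffable_bits - 1) // 4


def bus_load(num_ids, period_ms, bitrate_bps, data_bytes=8):
    """Fraction of bus capacity used when every ID is sent once per period."""
    frames_per_second = num_ids * 1000.0 / period_ms
    return frames_per_second * worst_case_frame_bits(data_bytes) / bitrate_bps


# 5 IDs every 10 ms on a 500 kbps bus.
print(f"{bus_load(5, 10.0, 500_000):.1%}")
# 30 IDs every 10 ms on the same bus.
print(f"{bus_load(30, 10.0, 500_000):.1%}")
```

Under these assumptions, 5 IDs at 10 ms use about 13.5% of a 500 kbps bus and 30 IDs about 81%, so the CAN bus itself is not the bottleneck at the low end of this range. Whether the gateway path can sustain the same rate is a separate question.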

Dear @aathanas1,
Unfortunately, I don’t know of any customer use cases similar to your request. I would suggest checking with Vector on this performance question; they are familiar with our DRIVE AGX platform.
I will share the Vector contact details via PM.