Sorry for being so brief regarding the whole setup.
I have from Stereolabs:
a ZED X camera
a ZED link DUO capture card
It is very good that they provide the camera and a fine GMSL2 capture card.
For rapid tests I use the SDK from Stereolabs, particularly ZED_Explorer, which lets you test the camera through a GUI. The SDK must be installed first and then the drivers for the capture card; afterwards I can use ZED_Explorer.
Stereolabs offers another more complete solution:
ZED X + ZED BOX ORiN (Orin nx 16GB & capture card inside)
It wasn’t my choice, because of the price point and because opening the case voids the warranty.
My solution was a SEEED J4012 + some CSI cables + capture card + ZED X + custom case. Less money, more research on my part, and some cool discoveries :D
Here is something caught on camera in my lab:
I need to be sure that my Jetson is properly set up because it is the big brain of my little robot, not just the solution for vision. This is why I fight with these pesky drivers.
I’m still a novice, but Docker seems like the idealized case of a physics problem: it is too encapsulated, good for software development, but hard to apply to hardware.
This is very important: How does the capture card connect? Is it via USB? PCIe? Are the drivers available in source code format? Is there a support page URL you can provide for the capture card itself?
Have you been trying to use any software from ZED Box Orin on the Seeed board? I ask because ZED provides a lot of software, and most of what I’m familiar with is just a user space app. If you try to use ZED Box Orin on the Seeed board, I need to know why…what was it you needed from this, and what form is the software in?
The above would change how to work with this. Everything said previously though about using the Seeed device tree is still exactly needed. How the device tree is affected and what works or fails using non-Seeed software on the Seeed board will probably change with the above information. Device tree adaptation is still valid, but there wouldn’t be any need for device tree when using something like USB instead of integrated parts of the Jetson.
Cool pic btw. I wish I had a stereo headset monitor, there’s a lot that could be fun with stereo beyond feeding it to AI for depth finding and mapping. I do see what looks like a low light condition noise in that picture. Is that graininess what you were speaking of when you mentioned noise or artifacts? If so, I think that’s just low lighting for the sensor.
The drivers from Stereolabs are DEB files. Install is sudo dpkg -i stereo_XXXXX
I’ve talked to them about the source code. This will involve signing some NDA.
The card is the DUO version from here:
Install instructions:
Drivers for ZED link capture card
Regarding image noise, it could be an RF problem and/or low lighting. It definitely increases with resolution and frame rate. After testing JP6 with the Seeed method, I’ll go back to JP5.1.1 to be sure.
Because the connection is via CSI to the capture card, it means the device tree related to CSI must be for the Seeed carrier board, and the driver must be for the ZED capture card.
If the issue is reduced quality at higher resolutions, then it is possible that CSI and device tree and drivers have been correct the entire time, but wiring might be an issue.
Other quality issues might occur from dropped frames, which are probably different than signal quality issues. Even so, one would have to first make sure cabling and signal quality are correct before looking at missing frames and counting this as something the CPU/GPU just cannot keep up with. I would not be surprised though if higher frame rates at higher resolution would drop frames; the frames that remain should be of full quality.
A lot depends on the nature of the failure at higher resolutions. You might want to closely describe this.
There is some missing or incorrect ZED driver installation. FYI, you can think of a “symbol” as some feature. Also, “version of symbol” tends to imply that the driver was compiled for a different kernel, and not the running kernel (often a dynamically loaded module needs a specific binary “shape” to load into other software; basically, the function signature has to be correct to load into the kernel). I see this:
# Excerpt of SEEED_like_UART.log
[ 9.465638] imx219 9-0010: imx219_board_setup: error during i2c read probe (-121)
[ 9.470756] imx219 9-0010: board setup failed
[ 9.470819] imx219: probe of 9-0010 failed with error -121
[ 9.493027] imx219 10-0010: pixel_clock.val = 182400000
[ 9.493140] imx219 10-0010: serdes_link_freq err: -61
[ 9.493142] grees about version of symbol tegracam_device_unregister
[ 12.331374] sl_zedxone_uhd: Unknown symbol tegracam_device_unregister (err -22)
[ 12.331409] sl_zedxone_uhd: disagrees about version of symbol tegracam_get_privdata
[ 12.331410] sl_zedxone_uhd: Unknown symbol tegracam_get_privdata (err -22)
[ 12.331429] sl_zedxone_uhd: disagrees about version of symbol camera_common_mclk_enable
[ 12.331430] sl_zedxone_uhd: Unknown symbol camera_common_mclk_enable (err -22)
[ 12.331441] sl_zedxone_uhd: disagrees about version of symbol camera_common_mclk_disable
[ 12.331442] sl_zedxone_uhd: Unknown symbol camera_common_mclk_disable (err -22)
[ 12.331450] sl_zedxone_uhd: disagrees about version of symbol tegracam_set_privdata
[ 12.331451] sl_zedxone_uhd: Unknown symbol tegracam_set_privdata (err -22)
[ 12.331459] sl_zedxone_uhd: disagrees about version of symbol tegracam_device_register
[ 12.33mbol tegracam_device_register (err -22)
The above is a driver issue. The kernel either cannot load the module, or else what it loads is an incorrect version. In this particular case the versioning issue is between kernel and module, and is not due to device tree. When a module loads correctly the hardware using that module might then depend on device tree for some types of hardware, but module load did not get far enough to see if that would have worked or not worked.
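As a rough aid for reading a log like the excerpt above, the repeated “disagrees about version of symbol” lines can be reduced to a short list of module and symbol pairs. This is only a sketch: the helper name list_version_mismatches is made up for illustration, and it assumes the kernel log lines have the same layout as the excerpt.

```shell
# List unique "module symbol" pairs for which the kernel reported
# "disagrees about version of symbol". Reads log text on stdin.
# (list_version_mismatches is a hypothetical helper name.)
list_version_mismatches() {
    sed -n 's/^.*] *\([^ :]*\): disagrees about version of symbol \([A-Za-z0-9_]*\).*$/\1 \2/p' |
        LC_ALL=C sort -u
}

# Typical use on the Jetson (dmesg may require sudo):
#   sudo dmesg | list_version_mismatches
```

Every symbol listed this way is a function the module expected with one signature while the running kernel exports it with another.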
You can ignore the CUPS error; it belongs to printer setup and is unrelated to this problem.
The first log is the flash log, flash_3-1_SEEED_like_method.log. If we need to know about the device tree, then this would be useful. However, we need the kernel module to load before we know more. The camera and/or CSI capture card (some combination) is not going to work until that module loads. Once the module loads you will know if errors are related to the device tree.
On this Jetson, as flashed, what is the output of “uname -r”? Is there a Stereolabs URL for support of this exact camera and/or CSI capture card? I’d like to see what is available for drivers (there might be more information than just having a .deb file). Also, for this current flash, add a reminder of the output of “head -n 1 /etc/nv_tegra_release”.
I do not have the Jetson nearby right now. But before posting on this forum I remember running some tests, and the driver and kernel versions were the same.
Please, I do not understand what to do with “head -n 1 /etc/nv_tegra_release”.
I’ll try to compare the Linux_for_Tegra folders from SEEED and Stereolabs. Do you know some software for comparing folders/files in Linux? On Windows I play with Total Commander & Notepad++…
Driver and kernel version is insufficient to make the module work; it is also a question of what configuration the kernel was in as the module was compiled to work with that configuration and that configuration only. Every time you change a non-module feature in a kernel, you will typically invalidate all modules; the “lock and key” format of module interface to the kernel changes when the configuration of the base kernel changes. They no longer fit together. So the first step is to know the kernel version releases matched. Many steps beyond this are still required to know if the module will work with the kernel.
On a command line, type this command to find out some information, and post the information here: head -n 1 /etc/nv_tegra_release
(this does nothing except provide the current running kernel’s “symbol” CONFIG_LOCALVERSION in combination with source code release…a subset of seeing if things “match” to allow the module insert)
If the module is compiled against an exact combination of proper source code matching, including CONFIG_LOCALVERSION, and source release, then it can insert.
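A quick first check of that match is to compare the module’s “vermagic” string with the running kernel release. The sketch below is only a first-pass test: the helper name vermagic_matches_release is invented here, and it assumes the first field of the vermagic string is the kernel release the module was built for. A match here is necessary but not sufficient, since symbol versions can still disagree.

```shell
# Compare a module's vermagic string with a kernel release string.
# $1: vermagic, as printed by "modinfo -F vermagic <module.ko>"
# $2: kernel release, as printed by "uname -r"
# (vermagic_matches_release is a hypothetical helper name.)
vermagic_matches_release() {
    # The first whitespace-separated field of vermagic is the release.
    [ "${1%% *}" = "$2" ]
}

# Typical use on the Jetson (the module path is a placeholder):
#   if vermagic_matches_release "$(modinfo -F vermagic /path/to/module.ko)" "$(uname -r)"; then
#       echo "release matches"
#   else
#       echo "release differs"
#   fi
```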
There is still a possible question of other kernel symbols (basically, function signatures) being required from a module. There is a reason why a dependency-aware symbol/config editor is needed in many cases to make a feature change (such as using the menuconfig or nconfig target): If your module is in fact correct, but one of the symbols it depends on is missing, then the module still will not be able to function. Or if a dependency’s API is a wrong version, then your module still cannot load even if it “seems” like the dependency is met (the API version of a module also counts…this is a function signature, not just a function name).
We have to know about the exact kernel plus configuration matching what the module was compiled against. If there is something different, then perhaps ZED would compile their source for the module against your configuration (but they would have to have your exact configuration; not too bad if you use the official kernel from NVIDIA on JetPack 5.x or the mainline kernel from JetPack 6.x, but they would need exact details).
That is flash information, and is useful. This means matching documentation and software (if this were a dev kit; however, this is a Seeed carrier board, and so the flash information only partially matches the dev kit content) can be found here (for the parts which match):
Any difference from what is there is basically what Seeed has customized for the carrier board (the module part of this would be the same; most likely the kernel source is the same, although configuration might differ).
If the “uname -r” output is still 5.15.136-tegra, then it means that kernel was compiled while the symbol “CONFIG_LOCALVERSION” was set to the string -tegra: CONFIG_LOCALVERSION="-tegra"
On this exact Jetson (implying that if you flash it again or change the kernel, then the following is no longer valid) you can copy this file somewhere for safekeeping which will represent this kernel’s configuration (except for CONFIG_LOCALVERSION, this is an exact and perfect match to the running kernel): /proc/config.gz
To get the actual file (using your copy; the original is actually a driver in the kernel pretending to be a file, so you can’t change the original file) you could: gunzip config.gz
If you were to then edit this file (not normally recommended, but it is plain text, and CONFIG_LOCALVERSION does not have any dependencies), then you could find the line with CONFIG_LOCALVERSION and edit that line to look like this, and it would become a fully perfect and exact copy of the kernel which now runs: CONFIG_LOCALVERSION="-tegra"
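The copy, decompress, and edit steps above could be scripted roughly as follows. This is a sketch under assumptions: set_localversion is a made-up helper name, the file paths in the usage comment are illustrative, and the suffix is assumed not to contain characters that are special to sed.

```shell
# Set CONFIG_LOCALVERSION in a plain-text copy of a kernel config.
# $1: path to the config copy (never /proc/config.gz itself, which is read-only)
# $2: suffix to store, for example -tegra (assumed to contain no "|" or "&")
# (set_localversion is a hypothetical helper name.)
set_localversion() {
    cfg="$1"
    suffix="$2"
    if grep -q '^CONFIG_LOCALVERSION=' "$cfg"; then
        # Rewrite the existing line; a temporary file avoids relying on "sed -i".
        sed "s|^CONFIG_LOCALVERSION=.*|CONFIG_LOCALVERSION=\"$suffix\"|" "$cfg" > "$cfg.tmp" &&
            mv "$cfg.tmp" "$cfg"
    else
        # No such line yet: append one.
        printf 'CONFIG_LOCALVERSION="%s"\n' "$suffix" >> "$cfg"
    fi
}

# Typical use on the Jetson (paths are illustrative):
#   zcat /proc/config.gz > "$HOME/config-running"
#   set_localversion "$HOME/config-running" "-tegra"
```

Working on a copy keeps the original record of the running kernel intact in case the edit goes wrong.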
The reason this is so very important is that if you take the kernel source from the L4T release you are using, and set the configuration up using this config file (renamed to “.config”), then compiling that kernel would result in your new kernel being an exact match to the current kernel. That means that any module you build against this configuration can properly load and work in your running kernel without building the kernel itself; you would build only modules which is easier.
The previous paragraph tells you how to set up to build a module feature that can “fit” correctly in your running system just by copying that file to the right place. What you have from ZED was created just this way (they have provided a kernel module). The problem is that apparently that ZED kernel module (driver) perhaps does not match this kernel configuration. You (or the people from ZED) would need to compile against that exact .config for it to “just work” by file copy. The ZED people might actually be willing to do this for you and provide the module if you give them this information (this would make their product work on the newer release which you are using). Summary of details:
Your config file, once it has CONFIG_LOCALVERSION set to “-tegra” is what one would compile the ZED kernel module against.
If the ZED people build against this kernel, then they can send you the relevant file. If there is some dependency, and thus more than one module is required, then the ZED people would need to send more than one kernel module, but all else would be the same.
One would copy any modules to the correct place, and run the command “sudo depmod -a”, and probably reboot. Everything that is not broken in the device tree should just suddenly start working. Keep in mind that the kernel module(s) must load prior to being able to tell if the device tree is valid. Based on the current errors though, and some partial response from the camera capture, odds are good that you only need the kernel module(s).
Without these kernel modules you cannot really guarantee that the device tree is valid or not for the CSI part.
What I’ve done: I tried to mimic the Stereolabs method, with a twist.
1. Download and prepare the Linux_for_Tegra source code
wget https://developer.nvidia.com/downloads/embedded/l4t/r36_release_v3.0/release/jetson_linux_r36.3.0_aarch64.tbz2
tar xf jetson_linux_r36.3.0_aarch64.tbz2
Download and prepare sample root file system
wget https://developer.nvidia.com/downloads/embedded/l4t/r36_release_v3.0/release/tegra_linux_sample-root-filesystem_r36.3.0_aarch64.tbz2
sudo tar xpf tegra_linux_sample-root-filesystem_r36.3.0_aarch64.tbz2 -C Linux_for_Tegra/rootfs/
Stereolabs device tree
wget https://stereolabs.sfo2.cdn.digitaloceanspaces.com/utils/zed_boxes/zedbox_device_trees_363.tar
tar -xf zedbox_device_trees_363.tar
Stereolabs rootfs
wget https://stereolabs.sfo2.cdn.digitaloceanspaces.com/utils/zed_boxes/zedbox_rootfs_363.tar
sudo tar -xf zedbox_rootfs_363.tar
At the end I can boot and log in if I use a serial connection. If I connect the monitor and try to log in normally, the module boots up and hangs… I cannot log in.
@linuxdev after banging my head against kernel modules, I now realize that I have to take baby steps again.
To do what you’ve suggested, are steps 1-6 OK?
Just adding some notes as I go, comments will probably change later. I will quote your content and comment as I go. This goes beyond asking about kernel modules, so don’t panic (that other content is just to take notes of things I am wondering about; comments on setting CONFIG_LOCALVERSION follow). You might want to skip through to the end the first time you read this for some summary and questions.
This is the correct location for NVIDIA content of L4T R36.3.0. So far this is good. However, consider that most of the time people use JetPack/SDK Manager to install this.
Without JetPack combining NVIDIA content for you (into the rootfs), the implication is that the rootfs is not ready until you run the “sudo ./apply_binaries.sh” step from the “Linux_for_Tegra/” location (this is never needed twice). For licensing reasons NVIDIA ships a purely Ubuntu rootfs. The NVIDIA content is added via the apply_binaries.sh step. This is in fact when the rootfs changes from “purely Ubuntu” to “L4T” (Linux for Tegra). Without the apply_binaries.sh step (which is automatic when using JetPack/SDKM) you would pretty much have no hope of a working system due to missing drivers and content.
Question: Did you run “apply_binaries.sh”?
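One rough way to answer that question after the fact is to look inside the rootfs directory on the host PC. The sketch below assumes that apply_binaries.sh leaves an etc/nv_tegra_release file in the rootfs (the same file queried on a running Jetson); treat that as a heuristic and not a guarantee. The helper name rootfs_has_l4t_release is invented for illustration.

```shell
# Heuristic: does a rootfs directory look like NVIDIA content was applied?
# $1: path to the rootfs, for example Linux_for_Tegra/rootfs
# Assumption: apply_binaries.sh installs etc/nv_tegra_release into the rootfs.
# (rootfs_has_l4t_release is a hypothetical helper name.)
rootfs_has_l4t_release() {
    [ -s "$1/etc/nv_tegra_release" ]
}

# Typical use on the host PC:
#   if rootfs_has_l4t_release Linux_for_Tegra/rootfs; then
#       head -n 1 Linux_for_Tegra/rootfs/etc/nv_tegra_release
#   else
#       echo "no nv_tegra_release found; apply_binaries.sh may not have run"
#   fi
```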
A device tree from Stereolabs for the ZED could be either of:
A fragment somehow merged in to the rest of the tree.
A whole tree which is an edited version of the NVIDIA tree.
In the first case one would merge into the Seeed device tree. In the latter case, this would be valid only if the Stereolabs CSI adapter part of the tree is hand edited into the Seeed tree. The Seeed tree itself is a modified NVIDIA tree, and so if Stereolabs has provided a “full/complete” tree, and you just replace the Seeed tree, then it means you’ve lost Seeed’s modifications to the stock tree. You’ll get the correct CSI tree, and the parts of Seeed that were unmodified compared to NVIDIA would be correct, but anything Seeed had done as a modification of NVIDIA’s content due to customization of their carrier board would now be broken. If everything functions, then probably it is ok.
There isn’t really a need for both the Stereolabs rootfs and another. It is quite reasonable though that you’ve installed the NVIDIA flash content and you would overwrite the default NVIDIA rootfs with the Stereolabs rootfs. On the other hand, if that content were from Seeed for their carrier board, then you’ve probably used the wrong rootfs. If lucky, then maybe the Stereolabs rootfs is only an overlay onto the original rootfs and only adds content rather than replacing content. I don’t know. If the Seeed carrier works, then it probably isn’t a problem, but you might end up with small pieces that don’t work and won’t notice for some time to come.
Once again, I don’t know all that Seeed has in its git repository, but it appears that you are copying from their git repository by adding (or overlaying) onto the existing rootfs. Which rootfs was at “Linux_for_Tegra/rootfs/” when apply_binaries.sh was performed? How much does the Stereolabs content add to or edit that location? Maybe all three are correct with each succeeding operation adding into this. If it works, then it is probably correct.
None of the above is for compiling a kernel. I have no idea if Seeed or Stereolabs installs any modification of the kernel or the kernel modules. This is a big question. I will comment on the build process for modules and on who provides that.
Every integrated kernel (the “/boot/Image”) is a combination of features which were named as “symbols” in the kernel config. An example symbol is “CONFIG_LOCALVERSION”. In a context-aware editor, e.g., when building your own kernel and running either the make target of nconfig or menuconfig (I like nconfig which is the same as menuconfig, except it has a symbol search function), there are these possible answers to any symbol to enable or not, if they are simply being selected or not:
y (“=y”) (this integrates the feature into the base Image file)
n (either “=n” or commented out) (this is left out from build)
m (“=m”) (this builds a separate module capable of loading into the kernel)
Some features accept strings or numbers. The CONFIG_LOCALVERSION example is: CONFIG_LOCALVERSION="-tegra"
For this to matter you have to build the kernel; the rootfs might contain a kernel, but this in no way builds or configures a kernel.
When you build a kernel you must first configure it. Despite using the exact same kernel source between two builds, there can be thousands of different configurations, and if you want the modules to load, they have to be built while the kernel is configured for the kernel they will load into. The features (symbols) which actually get loaded into the kernel directly are the “=y” features. The features (symbols) which can load into that particular kernel are selected with “=m” (while selecting in a dependency-aware editor such as nconfig).
If you have a kernel you like, and all you want to do is add a module and not replace all modules and the base kernel Image, then you start with the same source code which created the original Image file. You’ve got that by going to the L4T R36.3.0 URL (it might be a whole set of source, but the kernel would be within that source).
Have you built a kernel before? If so these notes get you where you need to go:
The “make defconfig” build target will set up the mainline kernel default configuration (this is what L4T R36.x uses; R35.x does not use mainline, and its default config is “make tegra_defconfig”).
That config is a default with the exception that it won’t correctly set CONFIG_LOCALVERSION. It happens that CONFIG_LOCALVERSION has many ways to set it, so I’m just showing you some examples:
Due to no dependencies, you could edit the .config file of the kernel source directly (this is the file the make targets of either menuconfig or nconfig edits): CONFIG_LOCALVERSION="-tegra"
You could use nconfig and tell the symbol search to find localversion, and then set the string in the editor: -tegra
Sometimes people name an environment variable for this in the terminal that will compile, and the build picks this up.
Now I’ll show you something that will make a bit more sense about what CONFIG_LOCALVERSION is about…
Earlier you ran the command “uname -r”. You came up with 5.15.136-tegra. This was the kernel itself telling you that information. The 5.15.136 came from the source used to build the kernel and is not controlled by the build process. The “-tegra” suffix on this came from the string set up via CONFIG_LOCALVERSION. This is how the kernel finds modules to load.
As the kernel starts it looks for modules located here: /lib/modules/$(uname -r)/kernel/
(the “$(uname -r)” substitutes what the command replies with; try it: “cd /lib/modules/$(uname -r)/kernel”…you’ll end up where the modules are for that exact kernel)
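To make the two parts of the release string concrete, here is a small sketch that splits a “uname -r” style string at the first dash. The helper name split_release is made up, and the split is an approximation: the suffix can also be influenced by other build settings, so it is only a hint at what CONFIG_LOCALVERSION was.

```shell
# Split a kernel release string into base version and local-version suffix.
# $1: release string, as printed by "uname -r"
# (split_release is a hypothetical helper name.)
split_release() {
    release="$1"
    base="${release%%-*}"        # text before the first "-"
    suffix="${release#"$base"}"  # remainder, including the leading "-"
    printf 'base=%s localversion=%s\n' "$base" "$suffix"
}

# Typical use on the Jetson:
#   split_release "$(uname -r)"
#   ls "/lib/modules/$(uname -r)/kernel"
```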
If a kernel has a given set of integrated features (“=y” symbols), and you do not change them, then you can add modules to that kernel at any time for either (A) a replacement module or (B) a new module that was previously =n. That module, placed in the correct location, adds that feature which the symbol is designed for.
If you compile against a kernel with the wrong CONFIG_LOCALVERSION, then chances are the module will fail to load.
If you compile against a kernel configured with a different set of integrated features (the =y symbols), then there is a very high chance that the modules compiled against the different integrated features will fail to load in the different kernel. Think of the “=y” features forming a kind of lock, and a module compiled against the exact “=y” set forming a key (reminds me of enzymes…often described as a “lock and key” shape).
Imagine you have one kernel source, but you want to be able to boot to two or more versions of that kernel for some sort of development. They would use different modules due to different =y features. They would break if they tried to use the same modules. What you’d do in that case is to compile with a different CONFIG_LOCALVERSION symbol string. Then you could name the Image files something like this:
Then, for that example, you could have boot targets naming Image-default, Image-test, and Image-test2. If the base kernel source is 5.15.136, then you could have complete and different sets of modules here:
/lib/modules/5.15.136-default/kernel/
/lib/modules/5.15.136-test/kernel/
/lib/modules/5.15.136-test2/kernel/
(all of the kernels would load and find their specific set of modules)
The implication is that if you simply want to add a new kernel module to an existing kernel without reinstalling all of it, then you simply need to run uname -r to get CONFIG_LOCALVERSION, obtain the original configuration, set the kernel source to that, edit with something like nconfig to add your module, and build the modules. After that it is a simple file copy.
Here’s the problem: Maybe you don’t have the kernel source. You’re at the point where you need the module to load before you really know if your device tree is valid. The ZED/Stereolabs people have provided a binary module that fits only in certain kernel configurations. To adapt to a new configuration the module must be built against your kernel config, including CONFIG_LOCALVERSION. I think you said you do not have the CSI capture card driver source (or maybe it was from the camera itself; maybe it is both). Without the source you cannot build this.
If you are using an unmodified kernel with the -tegra CONFIG_LOCALVERSION (meaning all is default), then it is trivial for the ZED/Stereolabs people to build against that and provide a binary driver module(s). I doubt they would do this for custom configurations, but if you have installed the Seeed software, and you have a default kernel and CONFIG_LOCALVERSION, then this is no different from what NVIDIA would have provided, and Stereolabs can provide you with a module to load (Stereolabs would eventually get around to doing this anyway).
Summary and questions:
Does “uname -r” still return 5.15.136-tegra?
Do you still find this is L4T R36.3.0 from “head -n 1 /etc/nv_tegra_release”?
Comment: L4T R36.4.0 is now out. Don’t know what that would change, but probably it would be good to wait to change unless Seeed and Stereolabs both support this.
This log indicates it is a ZED module failing to load, but it doesn’t say if it is compiled against an incorrect release, or if instead it is due to needing yet another module loaded prior to this module; the functions failing are listed under: sl_zedx
(thus it is a ZED kernel module or module dependency which you must get from Stereolabs; the alternative is that if you have the source it is easy to build a version for your specific kernel).
I’m sorry for this long and confusing reply, but there is a lot of mixing of Seeed content, Stereolabs content, and NVIDIA content. I don’t know which pieces are present due to having multiple rootfs. I can’t exactly build this myself because I don’t have the hardware to test on, and if I did, I might still come up with the comment that we need Stereolabs’ help or source code. The bottom line though is that the kernel module won’t load correctly if it is missing another module or compiled against the wrong config. Tell me more about what you know from Stereolabs.
@linuxdev thank you again for joining this back & forth situation.
What I have done:
I wanted to redo the Stereolabs flash method, but the J4012 (Orin NX 16GB onboard) cannot boot with a GUI. I can log in only with SSH.
Maybe it has something to do with the newly installed SSD? I’ve had some pain setting this up. I decided to give up on this method.
I have installed JP6.0 on the J4012, as SEEED (the manufacturer) suggested on GitHub.
The ZED X camera does not work, but since this is the manufacturer’s method, the computer has to be functional with all HW modules.
Finally I have installed with Nvidia SDK. After several tries I ended up getting a full correct install.
With this I had the camera working for nearly an hour (while making person recognition)
I’ll post a summary section, but I need to verify right away if this is from the JP6.0 flash, or if this is from the Stereolabs flash? Please note that if the Seeed carrier board instructions are for JP6.0, then that should be completed first before ever touching any of the Stereolabs content. We can work on video failures if we know the install is correct. Are you unable to log in via anything other than ssh or serial console still? Or did you work around that with the Stereolabs flash? I think it is important to get the Seeed content working in JP6.0 if this is not already JP6.0.
Tip: Usually an NVMe install has different flash instructions (an initrd flash is typical).
The summary is for other people starting to read this. You can search for “DISCUSSION” if you already know this summary information. Or you could go straight to the “Questions” section to get an idea of what I am after.
Technical Fact Summary:
Some info in the “VIOseed_github_zed_rootfs_bf.log” of the previous post which I’m replying to:
“uname -r” is 5.15.136-tegra.
L4T release is R36.3.0.
Install is to an NVMe of a carrier board from Seeed Studios:
CSI module at fault: sl_max96712 (alias i2c:sl_max96712) (module attempts to load, but seems to have unmet dependencies; version magic does not appear to be the cause of load failure).
CSI module seems to have code in conflict with symbols “tegracam_device_*”. I am guessing that, despite version magic not being the problem, the module was designed against a different NVIDIA module which would provide a different set of tegracam* function signatures.
CSI module insert error excerpt:
Sep 28 22:28:17 jetson1 ZEDX_Daemon[1060]: [ "Sat Sep 28 22:28:17 2024" ] Process "rmmod sl_zedxone_uhd" outputs "rmmod: ERROR: Module sl_zedxone_uhd is not currently loaded\n"
Sep 28 22:28:17 jetson1 ZEDX_Daemon[1060]: [ "Sat Sep 28 22:28:17 2024" ] Process "rmmod sl_zedx" outputs "rmmod: ERROR: Module sl_zedx is not currently loaded\n"
Sep 28 22:28:17 jetson1 ZEDX_Daemon[1060]: [ "Sat Sep 28 22:28:17 2024" ] Process "rmmod sl_max9295" outputs "rmmod: ERROR: Module sl_max9295 is not currently loaded\n"
Sep 28 22:28:17 jetson1 ZEDX_Daemon[1060]: [ "Sat Sep 28 22:28:17 2024" ] Process "rmmod sl_max96712" outputs "rmmod: ERROR: Module sl_max96712 is not currently loaded\n"
Sep 28 22:28:17 jetson1 ZEDX_Daemon[1060]: [ "Sat Sep 28 22:28:17 2024" ] Process "insmod /usr/lib/modules/5.15.136-tegra/kernel/drivers/stereolabs/zedx/sl_zedx.ko" outputs "insmod: ERROR: could not insert module /usr/lib/modules/5.15.136-tegra/kernel/drivers/stereolabs/zedx/sl_zedx.ko: Invalid parameters\n"
Sep 28 22:28:17 jetson1 ZEDX_Daemon[1060]: [ "Sat Sep 28 22:28:17 2024" ] Process "insmod /usr/lib/modules/5.15.136-tegra/kernel/drivers/stereolabs/zedone4k/sl_zedxone_uhd.ko" outputs "insmod: ERROR: could not insert module /usr/lib/modules/5.15.136-tegra/kernel/drivers/stereolabs/zedone4k/sl_zedxone_uhd.ko: Invalid parameters\n"
Conflict (??): Stereolabs has a “flash method” apparently designed for R35.x (maybe?). Seeed has what “seems” to be a compatibility with JP6.0 device tree. The current kernel seems to be from R36.3, but because the CSI module appears to be compiled against the running kernel it is hard to say exactly why insmod is failing (a missing dependency and code written for some other kernel are both possibilities).
The two boot entries in extlinux.conf both use the same kernel Image. Both use the same initrd. It is probably critical to know if that kernel comes from JP6.0, versus something Stereolabs might have added. It does appear that this kernel is stock/original, but I can’t confirm that. Somehow the faulty CSI module has correct version magic, but incorrect function signatures.
The boot entry label “primary kernel” does not specify a device tree. This means that the device tree in use is in a partition for that entry. The “Stereolabs kernel” label specifies a device tree which appears to be an NVIDIA tree (“kernel_tegra234-p3768-0000+p3767-0000-nv.dtb”), plus an overlay (“tegra234-p3768-camera-zedbox-onx16-sl-overlay.dtbo”).
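To see at a glance which device tree each boot entry names, the extlinux.conf file can be summarized per entry. This is a sketch with assumptions: summarize_boot_entries is an invented helper name, the file is assumed to use LABEL, FDT, and OVERLAYS keywords one per line, and an entry without an FDT line is reported as “(none)”.

```shell
# Summarize each boot entry in an extlinux.conf: label, FDT, and OVERLAYS.
# $1: path to the file (on a Jetson usually /boot/extlinux/extlinux.conf)
# (summarize_boot_entries is a hypothetical helper name.)
summarize_boot_entries() {
    awk '
        # Print the entry collected so far, if any.
        function emit() {
            if (label != "")
                printf "%s | FDT=%s | OVERLAYS=%s\n", label, fdt, ovl
        }
        $1 == "LABEL"    { emit(); label = $2; fdt = "(none)"; ovl = "(none)" }
        $1 == "FDT"      { fdt = $2 }
        $1 == "OVERLAYS" { ovl = $2 }
        END              { emit() }
    ' "$1"
}

# Typical use on the Jetson:
#   summarize_boot_entries /boot/extlinux/extlinux.conf
```

The output uses the short LABEL name of each entry, not the longer MENU LABEL text shown in the boot menu.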
DISCUSSION
If (and only if) Seeed said that JP6.0 is valid on their carrier board (without patches or modifications), then it implies the device tree is not modified between their carrier board and the reference board (the dev kit carrier board has the reference schematic). There is still a question related to the CSI board: If the layout does not still require something like an i2c change for control of that CSI board, only then would you be able to work with a fully “stock and unmodified” device tree in JP6.0. I just don’t know what is being modified by Stereolabs when you use their flash method. That unknown reference in an earlier log suggests there is a missing dependency or design issue in the sl_max96712 module.
You should start with the Seeed instructions. If you cannot get video, then you really must consider flashing with those instructions and posting a full serial console boot log. That log can be used to help figure out what is wrong with video. I wouldn’t even try to work on the camera until you have a working JP6.0 install without the Stereolabs camera software. Serial console is critical, and if ssh works, it is quite helpful, but it can’t give you any logs prior to Linux loading (we’re interested in what goes on in boot stages too, not just when the Linux kernel is running things).
The only difference between the two boot entries is the device tree.
Questions:
(there is so much going on, please bear with the repetitive questions since it is important to know exactly which boot entry is used with a given question and/or answer, and I don’t know that)
Of the two boot entries, “primary kernel” versus “Stereolabs kernel”, which boot entry was used for the logs?
Is the current flash/install that of JP6, or did this involve the Stereolabs flash method? I’m hoping it is purely JP6, although addition of the device tree overlay would be ok.
Of the two boot entries, is the “original” the one with no GUI and only ssh access? Do you have serial console access?
There is a lot going on, and I’m sure you’ve posted URLs already for some of the downloads, but I am wondering if you have a URL you can post again which is specific to the content which added the CSI kernel module? Something is wrong with this module, but it appears to be compiled against this kernel. I need to find out why this conflicts with the “tegracam_*” function signatures from NVIDIA (module version magic seems ok, but what the module is calling seems to be a wrong version). The end problem is that the Stereolabs CSI module cannot load despite having the right version magic. So this is what we’re concentrating on one step at a time until we find out why it fails. I think there is a mixing of versions or a missing dependency.
Let’s stick to first getting the Seeed carrier board and flash method completely working before working on device tree or Stereolabs content.
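For picking the relevant lines out of a long dmesg, here is a rough Python sketch. It assumes the two message formats which, as far as I know, the mainline module loader prints when a symbol is missing or when its version CRC does not match ("Unknown symbol ... (err N)" and "disagrees about version of symbol ..."). The function name and the default module name are just for illustration. If your kernel words these messages differently, the patterns would need adjusting.

```python
import re

# Message formats assumed from the mainline kernel module loader.
UNKNOWN = re.compile(r"(\S+): Unknown symbol (\S+) \(err (-?\d+)\)")
DISAGREES = re.compile(r"(\S+): disagrees about version of symbol (\S+)")


def symbol_problems(dmesg_text, module="sl_max96712"):
    """Return (kind, symbol) tuples for symbol errors logged for one module."""
    problems = []
    for line in dmesg_text.splitlines():
        m = UNKNOWN.search(line)
        if m and m.group(1) == module:
            problems.append(("unknown", m.group(2)))
            continue
        m = DISAGREES.search(line)
        if m and m.group(1) == module:
            problems.append(("version_mismatch", m.group(2)))
    return problems
```

An "unknown" result would point at a missing dependency (something which has to be loaded first, or which is not built at all), while a "version_mismatch" would point at the module having been built against different headers than the running kernel.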
Unfortunately now I have just two workable methods for JP6.0.
We have these situations:
A1. I have tried a pure SEEED (from their github) install of JP6.0 - WORKS
A2. I install STEREOLABS SDK and capture card drivers - Camera DOES NOT WORK
B1. SEEED (from their github) install of JP6.0 with rootfs from Stereolabs - WORKS
B2. I install STEREOLABS SDK and capture card drivers - Camera DOES NOT WORK
So we can have a JP6.0 install that works, but not with Stereolabs’ ZED camera.
The best situation for now is:
C1. Nvidia SDK manager install JP6.0 - WORKS
A2. I install STEREOLABS SDK and capture card drivers - WORKS
So, my assumption is that Seeed somehow has a modified kernel or something. On a pure NVIDIA JetPack install everything seems to work.
Oh, but there is also something related to the Stereolabs drivers. The normal drivers do not work, only the ones for their Jetson, which has comparable specs to the J4012 (Orin NX 16GB with HDMI).
So, maybe the Stereolabs drivers are made for a specific DTB. JP6.0 is still young and they are still learning about it.
For the moment, ignore the Stereolabs install method. Stick to the Seeed method. If Seeed does not use a modified device tree, then it should be equivalent to the NVIDIA installer method. If Seeed does use a modified device tree, then the Seeed method would give you the device tree you needed and the NVIDIA installer would be missing some subset. In the case of Seeed providing their modifications purely with a device tree overlay, then either method would work so long as that overlay is loaded prior to running driver loads. However, the errors in the log do not (at least not yet) relate to an incorrect device tree.
Note in your extlinux.conf that in one of the entries (“Stereolabs”) it names both a device tree (.dtb file) and a device tree overlay (.dtbo file). Overlays are intended to modify existing subsets of a tree. Overlays are not intended to add “new” tree branches. One of your entries in extlinux.conf (“Stereolabs”) has an overlay, the other does not (“primary”). We don’t know if the base .dtb file is the same or not because the “primary” entry does not have the FDT line to name a device tree file; instead, that entry without the FDT is taking the device tree from a partition. We don’t know if the partition content is the same or different compared to the FDT file of the “Stereolabs” extlinux.conf boot entry.
If an overlay finds the intended tree nodes, then it should “edit” those nodes. If the overlay does not find the correct node(s), then it should in theory “do nothing” and leave the original tree (I’m not sure but there should probably be an error message about loading an overlay if the overlay cannot do what it is intended to do).
I’m thinking of you adding another extlinux.conf boot entry. Explanation follows. I’m going to reemphasize a quote from one of my earlier posts:
Two things are critical at this point: The Seeed method likely names a device tree file; the other method does not. This means your base device tree for the Seeed method is that .dtb file, modified by the overlay .dtbo file; your other entry gets its device tree from a partition, not from a file, and if I recall correctly, does not have the .dtbo file. You need to keep a working entry in case something goes wrong, but you need to try an extlinux.conf boot entry which is the “primary” entry, remains without the FDT line from Stereolabs, but does include the part about the overlay .dtbo file. Post a copy of the new extlinux.conf for reference. Try this, and save a full serial console boot log (a dmesg may be less informative regarding device tree logs prior to the Linux kernel loading). Attach a serial console boot log and emphasize the boot entry this goes with for people just joining the thread.
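To avoid hand-editing mistakes when creating that extra entry, something like the following Python sketch could generate the text of the new block from a copy of the “primary” block. This is an assumption-laden illustration: the function name is mine, the overlay path in the usage note is a placeholder, and the OVERLAYS key is simply the one I would expect your existing Stereolabs entry to use (copy the exact key and path from that entry). It only builds text; it does not write to extlinux.conf.

```python
def clone_entry_with_overlay(entry_text, new_label, overlay_path):
    """Copy one LABEL block, rename it, and add an OVERLAYS line.

    Any FDT or OVERLAYS line in the copied block is dropped, so the new
    entry keeps taking its base device tree from the partition.
    """
    out = []
    for raw in entry_text.splitlines():
        parts = raw.split(None, 1)
        if not parts:
            continue  # skip blank lines
        key = parts[0].upper()
        if key == "LABEL":
            out.append("LABEL " + new_label)
        elif key == "MENU" and len(parts) > 1 and parts[1].upper().startswith("LABEL"):
            out.append("      MENU LABEL " + new_label)
        elif key in ("FDT", "OVERLAYS"):
            continue
        else:
            out.append(raw.rstrip())
    # overlay_path is supplied by the caller; take it from the working entry.
    out.append("      OVERLAYS " + overlay_path)
    return "\n".join(out) + "\n"
```

For example, clone_entry_with_overlay(primary_block, "primary-overlay", "/boot/your-overlay.dtbo") returns a block you can append to extlinux.conf after the existing entries, where "/boot/your-overlay.dtbo" stands in for the real overlay path. Leave the DEFAULT line alone so the known-good entry still boots if you do nothing at the serial console menu.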
The module which fails due to apparently missing dependencies is the sl_max96712 module. Run lsmod and verify whether sl_max96712 is loaded. Probably it is not. If not loaded, then use sudo modprobe <module name> on the module, but start monitoring “dmesg --follow” prior to attempting the module load. We are interested only in the new log lines which occur as a result of the attempt to modprobe, plus any kind of message which occurs from the modprobe line itself.
If anything fails in the modprobe (assuming the command itself is correct and the module refuses to load), you must attach file “/proc/config.gz” here to the forum.
Also, if there is a failure, attach a copy of the .dtbo file named in your extlinux.conf.
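If you would rather script the “is it loaded?” check than read lsmod by eye, here is a minimal Python sketch. It parses the text of /proc/modules, which is where lsmod gets its information. The function names are mine. The format assumed is the standard one: name, size, reference count, then a comma-separated list of modules using it (or “-”).

```python
def normalize(name):
    # The kernel lists module names with underscores, even when the
    # .ko file name uses dashes.
    return name.replace("-", "_")


def loaded_modules(proc_modules_text):
    """Map each loaded module name to the list of modules that use it."""
    mods = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        users = [u for u in fields[3].split(",") if u and u != "-"]
        mods[fields[0]] = users
    return mods


def is_loaded(name, proc_modules_text):
    """True if the named module appears in the /proc/modules text."""
    return normalize(name) in loaded_modules(proc_modules_text)
```

Call is_loaded("sl_max96712", open("/proc/modules").read()) before and after the modprobe attempt. The “used by” lists are also handy in the working case, since they show which other modules sit underneath the camera driver.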
What this should accomplish:
You’ll have the Stereolabs device tree, so the devices themselves can be found by a valid driver.
We will know about the load of the driver. Paired with config.gz we can perhaps guess at what is missing or suspicious about the pairing of kernel module and kernel configuration (remember, loading a module requires that (A) the module must be compiled against that kernel config, and (B) any functions/signatures required by the module are implemented either in the kernel Image or another module loaded prior to loading your module).
For good measure, show your new extlinux.conf and make sure that in any log we know which boot entry you were in.
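Once there is a config.gz from both a working install and a failing one, comparing them by hand is tedious. A short Python sketch of how that comparison could be done follows. The function names are my own, and treating an absent option the same as “is not set” is a simplification I am assuming is good enough for a first pass.

```python
import gzip


def parse_kernel_config(gz_bytes):
    """Parse the bytes of a /proc/config.gz into a dict of CONFIG_* -> value."""
    text = gzip.decompress(gz_bytes).decode("utf-8", errors="replace")
    suffix = " is not set"
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            config[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, _, value = line.partition("=")
            config[key] = value
    return config


def diff_configs(working, failing):
    """List (option, working_value, failing_value) where the two differ.

    An option missing from one side is treated as "n".
    """
    keys = sorted(set(working) | set(failing))
    return [
        (k, working.get(k, "n"), failing.get(k, "n"))
        for k in keys
        if working.get(k, "n") != failing.get(k, "n")
    ]
```

Read each file in binary mode (for example, a copy of /proc/config.gz saved from each install), parse both, and print the result of diff_configs(). An empty list would suggest the two kernels were configured identically, which would shift suspicion toward source patches or the module build itself.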
I am interested in your observation:
This suggests that the above combination has a driver which loads into the kernel Image of the NVIDIA install method. Perhaps the NVIDIA kernel install is slightly different when using the SEEED method. In that case we have a hint that we should use the Seeed method, but with the NVIDIA kernel, and to use the device tree and device tree overlay of the Stereolabs entry.
Another way of saying this same thing:
A module only loads to a compatible kernel.
Your capture card is not plug-n-play, which means the device tree is required for the driver to find the capture card. Perhaps this is in the Stereolabs device tree, or perhaps it is in that tree after the overlay is installed. This more or less defines the binary interface of the Stereolabs capture card which the driver can work with. The error you saw for loading the driver was not from lack of finding Stereolabs because the driver never loaded. It is possible that once the module loads, then there could be a failure if device tree or overlay are incorrect, but we’d make progress because the kernel module would load. Since you have a working case, I’m assuming that case has a correct device tree and overlay.
We are going to try to get the Seeed method (which might be the same as the NVIDIA method if their carrier board layout is the same as the reference board) with the Stereolabs driver module, and so it is loading of that module we are experimenting with.
Basically, I think your observation that Seeed may have modified the kernel could be correct. However, it might just be that the Seeed kernel needs a kernel module added. Alternatively, something quite different could have occurred, e.g., some patch changed the source code itself, in which case we’d have more problems (to be discussed if we get there).
The device tree does name where to find a device when that device is not plug-n-play. Additionally, there would be a “compatible” entry which tells the name(s) of a driver(s) which are capable of working with that hardware. Be sure to attach a copy of the device tree and the overlay named in your extlinux.conf so that we can reverse compile that and see what it actually says.
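After reverse compiling (for example with “dtc -I dtb -O dts -o extracted.dts file.dtb”, which also accepts a .dtbo), the “compatible” strings can be pulled out of the resulting text for a quick look. A minimal Python sketch, with function names of my own choosing; the regular expression is deliberately simple and would also catch property names which merely end in “compatible”.

```python
import re

# Matches: compatible = "a", "b";
COMPATIBLE = re.compile(r'compatible\s*=\s*([^;]+);')


def compatible_strings(dts_text):
    """Return the sorted, de-duplicated compatible strings in a .dts text."""
    found = set()
    for m in COMPATIBLE.finditer(dts_text):
        found.update(re.findall(r'"([^"]+)"', m.group(1)))
    return sorted(found)


def matching(dts_text, needle):
    """Compatible strings which contain the given substring."""
    return [c for c in compatible_strings(dts_text) if needle in c]
```

Running matching() with a fragment of the deserializer or camera name against both the base tree and the overlay would show whether the node the driver binds to is present at all, and which file supplies it.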
Be sure to include the /proc/config.gz for kernel configuration. Both of your existing boot entries use the same kernel I think, “/boot/Image”, and so it won’t matter which you boot to (if there is not a line entry in extlinux.conf naming the same LINUX /boot/Image, then it is possible one of them uses a partition which has a different kernel in it compared to the one in /boot; we must know exactly which kernel is used, so make sure to include that extlinux.conf after you modify it, and name which entry is booted whenever you add information like the result of attempting modprobe).
The part which is most different about the JP6 kernel from NVIDIA is that it is a mainline kernel. In the past NVIDIA has produced a modified source just for the Jetson, and the source would have been tightly controlled by NVIDIA. By switching to mainline there may be differences which NVIDIA did not create.
Now I have JP6.0 installed with the NVIDIA SDK Manager. I’ll try what you’ve requested with this configuration because I need to make some tests with the camera.
A more detailed description here is needed, if I may:
the capture card has two sets of drivers
driver designed specifically for the STEREOLABS computer based on ORIN NX 16GB
with this there are no errors and the ZED X camera works
driver designed for JP6.0 in general
with this there are no VISIBLE errors and the ZED X camera does not start
For this configuration, the following is needed:
serial startup log
extlinux.conf
/proc/config.gz
lsmod, to verify whether sl_max96712 is loaded. Probably it is not loaded. If not loaded, then use sudo modprobe <module name> on the module, but monitor “dmesg --follow” prior to attempting the module load. We are interested only in the new log which occurs as a result of the attempt to modprobe, plus any kind of message which occurs from the modprobe line itself.
a test of both the general Stereolabs driver and the specific one.
That is correct. I will add that in the JP6.0 case, when you test the camera, also monitor “dmesg --follow”. Then see if running any program to use the camera works as you intended, but also see if logs are created from the attempt to run the program (which would be visible as new log lines in “dmesg --follow”).
One thing I had forgotten to ask: In your previous logs I had asked you for the output of “lsmod”. You had some of that output. Was this the case of the camera working, or was it the case of not working in JP6? If at any point you are able to identify specifically the “working” case (even if that case is on a desktop PC running Ubuntu…it doesn’t have to be the Jetson), post your “lsmod” (the set of modules which load which are related to the camera when it works is half of what will be combined with the config.gz to see the complete kernel configuration for a working case).
Almost forgot: If you post an lsmod for a working case of the camera, we also need to know the “uname -r” during that lsmod.