I was also very new to embedded Linux when I started my multi-camera MAX9286 project, so even though there is a lot of low-level technical information from NVIDIA, as well as device tree code and source code from vendors and other posts in this forum, I did not know what to do first. Since a high-level overview of bringing up a GMSL camera seems to be missing, I'll add my thoughts on that here.
First, to answer your question about what to start with: it is quite challenging to bring up a new sensor and serializer/deserializer pair even if you are already skilled in kernel development. If you are comfortable with Linux I2C drivers, understand the device tree well, and have the MAX9286 and MAX96705 datasheets, you may use the max9296 and max9295 drivers as references and succeed. After a successful ser-des link the sensor driver can be addressed somewhat on its own, but in my experience the 9286 driver will require refinement and adjustments specific to your sensor to work with two or four sensors (see my other posts). If you are not familiar with kernel work, or time is critical, then the fastest route to a solution is to contact a software vendor like RidgeRun, who has MAX9286 and MAX96705 drivers ready to ship. You will still have to modify the device tree and source code to match your hardware, including virtual channel IDs for multi-cam, but buying drivers should get you close enough to start debugging settings rather than fighting basic driver concepts.
When you start developing your system, the first step is getting your Jetson talking to the deserializer. Assuming your 9286 hardware is known to be working, it is a simple I2C interface that you can implement in your own kernel driver or reach by modifying the device addresses in your purchased driver. In fact, the 9286 is simple enough that you could set it up from a user-space bash script with i2c-tools, because once it is configured you do not need to change any of its settings to use the camera. Once you can read and set the 9286 registers you are ready to talk to the serializer over the coax cable.
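As a concrete illustration of that user-space approach, here is a minimal sketch using the i2c-tools package (i2cset). The bus number, device address, and every register/value pair below are placeholders of mine, not a working MAX9286 init sequence; the real sequence comes from the datasheet and your schematic:

```shell
#!/bin/sh
# Sketch of user-space MAX9286 setup with i2c-tools. Registers and values
# here are PLACEHOLDERS -- take the real init sequence from the datasheet.
: "${DRY_RUN=1}"    # default: print commands; set DRY_RUN= (empty) to write
BUS=2               # I2C bus the deserializer sits on (see i2cdetect -l)
DES_ADDR=0x48       # 7-bit MAX9286 address (often 0x90 >> 1; verify yours)

i2c_wr() {          # usage: i2c_wr <reg> <val>
    if [ -n "$DRY_RUN" ]; then
        echo "i2cset -y $BUS $DES_ADDR $1 $2"
    else
        i2cset -y "$BUS" "$DES_ADDR" "$1" "$2"
    fi
}

# Example writes only -- NOT a working init sequence:
i2c_wr 0x0a 0xf1    # e.g. enable reverse control channel (placeholder)
i2c_wr 0x15 0x0b    # e.g. CSI output configuration (placeholder)
```

The dry-run default lets you review the exact i2cset commands before they touch hardware, which is handy while you are still matching the datasheet's init table to your board.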
The serializer is also a simple I2C device; once it is configured it likewise does not need to be changed during camera usage. So once again a simple I2C kernel driver can be written or purchased, or you can start in user space as the easiest path. The serializer and deserializer settings must be matched to establish reliable communication over the low-speed configuration link and reliable streaming over the high-speed video link. Read the deserializer datasheet very closely, especially the initialization section, which gives step-by-step instructions for what to do with each device to establish a video link. Do this initialization with a single serializer attached to your deserializer first to keep it simple, then add more. The I2C address translation is very important. There are some serializer settings that are specific to your image sensor and ISP, but you don't need to get those right just to establish the control link and video link. Also read the deserializer register table details closely for the status and error bits you should watch while debugging.
Once you have the serializer address translated and the video link active, with forward and reverse control channels enabled, you can start configuring your sensor. For the sensor you will need support from the sensor vendor or the camera module vendor to get the proper mode tables: most sensor datasheets do not contain enough information to create them yourself, and there are thousands of registers to set correctly, so it would be nearly impossible even if the datasheets were detailed enough. Leopard Imaging has very good sensor driver support if you buy cameras from them. Once you have the mode tables you can use an I2C write loop or regmap to set all the registers and the sensor should be ready. The camera sensor driver itself, however, needs to be a kernel module with a proper V4L2 interface implementation, and it must have a proper device tree with CSI and VI4 settings so the sensor can be probed, exposed as a /dev/video* device, and used by the V4L2 subsystem to capture frames. This is not trivial, but an existing sensor driver (e.g. imx390.c) can serve as a starting point.
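A minimal sketch of such an I2C write loop, assuming the vendor mode table is a text file of "reg val" hex pairs (that format, the bus, and the sensor address are my assumptions). Note that many sensors use 16-bit register addresses, which plain i2cset cannot express; for those you would need i2ctransfer or a kernel regmap. This dry-run version just prints the commands it would issue:

```shell
#!/bin/sh
# Sketch: blast a vendor mode table into the sensor through the GMSL link.
# Table format assumed: one "reg val" hex pair per line, '#' for comments.
: "${DRY_RUN=1}"
BUS=2
SENSOR_ADDR=0x1a    # sensor's 7-bit address as seen through the link

write_table() {     # usage: write_table <table-file>
    while read -r reg val; do
        case "$reg" in ''|'#'*) continue ;; esac   # skip blanks/comments
        if [ -n "$DRY_RUN" ]; then
            echo "i2cset -y $BUS $SENSOR_ADDR $reg $val"
        else
            i2cset -y "$BUS" "$SENSOR_ADDR" "$reg" "$val"
        fi
    done < "$1"
}
```

With DRY_RUN left at its default this only prints the generated commands, so you can sanity-check a thousand-line table against the vendor's documentation before touching the hardware.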
After your sensor is configured, debug a single camera sensor per deserializer and get it working well before trying to add a second sensor using virtual channel IDs. There are many other posts in this forum that cover that addition. From there you can start adding more sensor resolutions or modes, or more V4L2 commands to make the camera more usable from user space.
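For orientation when you do reach the second sensor: the virtual channel is selected in the device tree. The fragment below is a hedged sketch modeled on NVIDIA's GMSL reference trees (e.g. the IMX390 ones); the node name, labels, and exact property set are assumptions, so compare against a known-good dtsi for your L4T release:

```dts
ser_b_out: endpoint {
    /* second camera behind the same deserializer */
    vc-id = <1>;          /* CSI virtual channel for this sensor */
    port-index = <0>;     /* same CSI brick as camera 0 */
    bus-width = <4>;
    remote-endpoint = <&csi_in1>;
};
```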
In my opinion GMSL2 deserializers like the MAX9296 are a much more robust approach to multi-cam architectures since they can be configured to split their CSI lanes to separate cameras, but they are difficult to buy so here we are with the MAX9286 : )