A longer answer is here:
https://forums.developer.nvidia.com/t/real-time-kernel-package/232296/6
A more intuitive answer is that realtime is about how well timing can be guaranteed. Let’s say you have a computer which is very fast on average, but every once in a while performance lags. Someone who plays computer games will be quite familiar with reaching some scene in the game where the load goes up and the frame rate suddenly drops. The average throughput does not matter when discussing realtime, but a guarantee of some minimum performance does matter.
I think one of the most famous realtime systems was in the Apollo moon lander. There are a lot of videos on this, but an error incident really shows where hard realtime shines:
Incidentally, during that Apollo moon landing there was an error code (the famous 1202 program alarm). The computer could not give a detailed message upon error because there wasn’t enough memory. It turned out later that the alarm was due to the workload going too high, and the computer was dropping low-priority tasks in order to keep the flight computer from lagging as the lander approached the moon.
This was the first miniature solid-state hard realtime computer. The moon lander’s sensors and controls had to respond flawlessly during landing to avoid being destroyed, and all of it had to fit in a tiny, lightweight computer. The computer had multiple tasks, and although there were other small computers at the time, none had been programmed quite like this. The control systems were guaranteed never to lag. If the load got high enough, the lower-priority routines would be skipped (an example of “scheduling”). Not even a single clock cycle was missed, skipped, or lagged for the critical systems. Imagine your computer game never lagging, with everything in the game occurring at exactly the same time, every time. That’s hard realtime.
There are actually different levels of realtime. In your desktop PC hardware, and in the ARM Cortex-A series of CPUs (which is what you see in Jetsons and many smartphones), the goal is increased performance without increasing clock speed (which would also increase power consumption and the size needed to dissipate heat). Caching and buffering are used extensively. However, a cache miss requires a slow memory operation to fill the cache before the cache can feed the CPU, so you get a burst of lag whenever you need memory that was not recently accessed. Buffering is another way to fetch memory ahead of time to prevent a burst of lag, but it also adds latency (although the standard deviation of timing will probably go down on average). You will find that hardware designed for realtime does not use caching or buffering on any critical path.
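To see the difference between average speed and guaranteed timing for yourself, here is a minimal sketch (ordinary Linux C, nothing Jetson-specific, and only an illustration rather than a proper realtime test) that asks for a 1 ms sleep repeatedly and reports the average versus the worst case. On a general-purpose kernel the worst case is typically many times the average; that jitter is exactly what hard realtime hardware and scheduling are designed to bound.

```c
/* jitter.c: measure actual vs. requested sleep time.
 * Build: gcc -O2 jitter.c -o jitter
 */
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

static long long diff_ns(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000000LL + (b.tv_nsec - a.tv_nsec);
}

int main(void)
{
    const struct timespec req = { 0, 1000000 };   /* ask for 1 ms */
    const int iterations = 1000;
    long long total = 0, worst = 0;

    for (int i = 0; i < iterations; i++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        long long elapsed = diff_ns(t0, t1);
        total += elapsed;
        if (elapsed > worst)
            worst = elapsed;
    }

    printf("average: %lld us, worst case: %lld us\n",
           total / iterations / 1000, worst / 1000);
    return 0;
}
```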
If you read the ARM documents you’ll find they actually have a lot of different levels of “hard” realtime. Functional safety (mentioned for vehicles and aircraft) includes redundant CPU cores which shadow the running core, and if there is a hardware failure, the shadow core takes over without missing a single instruction. This tends to be the realm of the ARM Cortex-R series.
The Cortex-M series is sort of a halfway point between the Cortex-A (which cannot perform hard realtime) and the Cortex-R. Cortex-M is not available for functional safety since it has no concept of shadow cores.
The scheduling load of multiple processes in a hard realtime environment goes up exponentially as the number of processes increases. A Cortex-M would become overwhelmed rather quickly in a complex environment, but the Cortex-R actually has hardware-assisted scheduling. A Cortex-M can do hard realtime if you don’t need functional safety and the software is not too complicated. No desktop PC architecture is capable of hard realtime. The Cortex-R is ideal for hard realtime if (A) the complexity is high, or (B) functional safety is needed (Cortex-R cores can also be used individually; they don’t have to shadow, but the option is there).
Hardware is half of realtime; software scheduling is the other half, regardless of whether or not functional safety is needed.
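To give a feel for the software half, here is a minimal sketch of how a Linux process can ask for a realtime scheduling class and lock its memory so page faults don’t add latency. This assumes a kernel that exposes the standard realtime scheduling classes (the PREEMPT_RT kernel improves how strictly those priorities are honored); on Cortex-A hardware this gets you soft realtime at best, since the CPU still caches and buffers underneath.

```c
/* rt_setup.c: request SCHED_FIFO and lock memory to reduce latency.
 * Build: gcc -O2 rt_setup.c -o rt_setup
 * Usually requires root, CAP_SYS_NICE, or an rtprio limit granted
 * in /etc/security/limits.conf.
 */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>

int main(void)
{
    struct sched_param sp;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = 80;            /* 1..99; higher preempts lower */

    /* Put this process in the FIFO realtime scheduling class. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));
        return 1;
    }

    /* Lock current and future pages so page faults cannot stall us. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall: %s\n", strerror(errno));
        return 1;
    }

    printf("running as SCHED_FIFO priority %d\n", sp.sched_priority);
    /* ...time-critical work would go here... */
    return 0;
}
```

You can get the same effect without changing the program by launching it with `chrt -f 80 ./myprogram`, which runs it under SCHED_FIFO at priority 80.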
There is also “soft” realtime. Audio on a desktop PC is a good example. Audio is typically given a higher priority, and there may be cases where caching or buffering is handled differently for audio. Linux logs will often have a note after initial boot about “rate limiting” something related to audio. That’s a case where the OS cannot keep running everything else while giving audio the low latency it needs, and the result is dropped audio frames. Something similar exists for dropped video frames. Hard realtime never drops a frame. Inadequacies in CPU hardware, in combination with scheduling, make it impossible for the average computer to never drop audio or video frames without lowering the rate quite a bit.
You cannot normally get to it, but the Image Signal Processing (ISP) unit of Jetsons tends to be an ARM Cortex-R5, and the same is true of the Audio Processing Engine (APE).
Imagine you are flying a drone at about one meter per second. If there is a problem, there isn’t much of an issue; e.g., if you use video to see the flight path, it doesn’t much matter if a frame is dropped here or there. Now instead consider that you’re in a jumbo jet landing at 180 knots, you’re very close to the ground, and the software that keeps you on the glide slope in zero visibility decides to lag a couple of seconds while cache misses are being filled. It won’t work out too well. It gets even more intense when you use a phased array radar (controlled by computer) on a fighter jet, and you are moving at around the speed of sound, perhaps with an autopilot holding you fifty feet above the terrain (that would be quite an adrenaline rush when you realize the computer might need an extra half second before it can change the controls to avoid hitting that mountain).
Automobiles need hard realtime with functional safety. They don’t need this to the same degree as a fighter jet or spacecraft, but as soon as the vehicle is controlled by anything other than a human (or a human using “fly by wire”), there is a requirement that hardware failure not kill everyone.
Hard realtime is actually one of the most interesting topics.