CPU scheduling looks at an affinity mask (a bitmask with one bit per core, where a set bit means the task is allowed on that core). One can assign (or more correctly, provide a strong suggestion) to use or not use given cores. There are multiple ways to set this, e.g., from within the program itself, or from outside settings (some of which would require root permissions). Affinity can also be set at a finer granularity, per thread instead of per process.
See:
man sched_setaffinity
man pthread_setaffinity_np
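As a rough illustration of setting affinity from within the program itself, here is a minimal Python sketch built on os.sched_getaffinity and os.sched_setaffinity (Linux-only wrappers around the calls in the man pages above). The helper name without_cpu is made up for this example, and the choice to exclude CPU0 is only an assumption about what you might want.

```python
import os


def without_cpu(allowed, cpu=0):
    """Return the allowed set minus one core.

    If removing the core would leave nothing to run on, the original
    set is returned unchanged (hypothetical helper for illustration).
    """
    remaining = set(allowed) - {cpu}
    return remaining if remaining else set(allowed)


if __name__ == "__main__":
    # A pid of 0 means "the caller".
    current = os.sched_getaffinity(0)
    target = without_cpu(current, cpu=0)
    os.sched_setaffinity(0, target)
    print("was:", sorted(current), "now:", sorted(os.sched_getaffinity(0)))
```

Shrinking your own mask like this does not need root; changing another user's process does.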
You have access to “soft” affinity…and if allowed, this will result in the same “hard” affinity. If one divides a driver up correctly, then the smallest amount of work will be done with hardware I/O, and then the rest migrated to another CPU core (ksoftirqd). You don’t have control over other people’s driver designs, but many of the drivers used at the core of the system are well-optimized. You do have access to priorities of PIDs…be careful about adjusting other people’s PIDs or increasing your own PID’s priority (too negative of a “nice” on your process could starve the rest of the system to the point of a lockup…it’d be rare that going more negative than “-2” would improve something without a risk of breaking something you never intended to change).
If you run this you’ll see affinity is related to IRQ (it is the IRQ which triggers the running of a driver or other code):
find /proc -name "*affinity*"
You should always see “/proc/irq/1/”. If you “cat” the “affinity_hint” file you will see a hexadecimal mask. A value of all zeros means no hint has been set, so no core is removed from the list (and there is a strong emphasis on “_hint” for most cases). In fact if you run this you’ll likely see the default that any IRQ can run on any core:
cat `find /proc/irq -name "affinity_hint"`
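To read those hexadecimal masks as core numbers, a small decoder helps. This is a minimal sketch, assuming the usual /proc/irq format (hex digits, with wide masks printed as comma-separated 32-bit groups, most significant first); mask_to_cpus is a name invented for this example.

```python
def mask_to_cpus(mask_text):
    """Decode a hex affinity mask such as '3f' into a sorted list of CPU numbers."""
    # Join comma-separated 32-bit groups into one hex number.
    value = int(mask_text.strip().replace(",", ""), 16)
    # Bit N set means CPU N is included in the mask.
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    for text in ("00", "01", "3f"):
        print(text, "->", mask_to_cpus(text))
```

The same decoder works on the “smp_affinity” files, which hold the mask actually in effect rather than the hint.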
See:
https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
https://www.linuxjournal.com/article/6799
You might find this more interesting (the “-c” counts lines of occurrence):
cat `find /proc/irq -name 'smp_affinity_list'` | sort | uniq -c
…these correspond to cores. Most are “0-5” (all 6 cores being available). The ones corresponding to just “0” are likely drivers doing hardware I/O.
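The same tally can be done in Python if you want the core lists as numbers rather than strings. This is only a sketch: parse_cpu_list and tally are hypothetical helper names, and the glob pattern assumes the standard /proc/irq layout.

```python
import glob
from collections import Counter


def parse_cpu_list(text):
    """Turn a list such as '0-5' or '0,2-3' into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        cpus.update(range(int(low), int(high or low) + 1))
    return sorted(cpus)


def tally(entries):
    """Count how many IRQs share each affinity list (like 'sort | uniq -c')."""
    return Counter(entry.strip() for entry in entries)


if __name__ == "__main__":
    entries = []
    for path in glob.glob("/proc/irq/*/smp_affinity_list"):
        try:
            with open(path) as handle:
                entries.append(handle.read())
        except OSError:
            continue  # an IRQ can disappear while we are reading
    for cpu_list, count in tally(entries).most_common():
        print(count, cpu_list, parse_cpu_list(cpu_list))
```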
You’ll probably ask how to map an IRQ to a PID, but I think this cannot be done…the listing of IRQ “/proc” files above is for illustration. However, you can sort of do this via:
cat /proc/interrupts
If you do this, you’ll see a very large number of interrupts on CPU0, since this is the only core with wiring to many of the controllers you see raising interrupts there. If your application can run without access to those hardware controllers, then it is a good candidate for a mask which disallows CPU0. If your program does both hardware I/O and software computation which needs no hardware controller, then you would be advised to split the program into two threads and set the affinity of the software portion to disallow CPU0. If the process does mainly hardware I/O, then that work will end up on CPU0 no matter what affinity you set (you could tell it to go to CPU1, but it would migrate back to CPU0). This isn’t always going to do what you want, since cache plays a part in performance and splitting work across cores can result in cache misses. On the other hand, CPU0 is a very critical resource, so reducing time spent on this core can be a very important benefit even if a given program or driver is slightly slower as a result.
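To put a number on how lopsided the load is, you can sum the per-core columns of /proc/interrupts. The sketch below assumes the usual layout (a header row of CPU names, then one row per interrupt with one count per core); per_cpu_totals is a name made up for this example, and rows without a count for every core (such as an error summary line) are skipped.

```python
def per_cpu_totals(text):
    """Sum the interrupt counts in /proc/interrupts text for each CPU column."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = dict.fromkeys(cpus, 0)
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue
        counts = fields[1:1 + len(cpus)]
        # Skip rows that do not carry one numeric count per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    with open("/proc/interrupts") as handle:
        for cpu, total in per_cpu_totals(handle.read()).items():
            print(cpu, total)
```

Running it twice a few seconds apart and comparing the totals shows which cores are servicing interrupts right now, rather than since boot.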
I don’t know if it is purely part of the ARM architecture or if it is specific to this particular SoC, but if there were a mechanism to distribute interrupts over all cores, then this same SoC would have a massive performance boost. Many IRQs would have very slightly increased latency, but the overall system would become much better behaved even under the heaviest of loads. On Intel x86_64 the CPU has support for working with the I/O APIC to do this…this SoC has no such support, it simply has an aggregator which sends much of the work only to CPU0.