I am building a small cluster of TK1s. They are all still in their boxes and flashed with the initial 19.2 release from April 2014. Would it be best to reflash them all to 21.4? Are there significant benefits to this? We will be running CUDA-aware MPI applications.
R21.4 is definitely a very good update that fixes a lot of issues. You could update one TK1, tweak it until you are content, clone its root partition to your host, and then simply flash that clone onto each out-of-date TK1. The cloned partition image is loopback mountable, so you could mount it and edit something (e.g., the host name or ssh keys) before pushing the image onto the next TK1. See:
If you want to skip cloning, you could also use the flash option that re-uses the existing image, which saves a lot of time on every flash after the first.
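For the cloning route, the per-node edit could be scripted once the image is loopback mounted on the host (for example with `sudo mount -o loop` on the cloned image file). The following is only a minimal sketch: the function name `set_hostname` and the mount point are hypothetical, and it assumes the root filesystem keeps its name in `etc/hostname` and `etc/hosts`, as Ubuntu-based images normally do.

```python
import re
from pathlib import Path


def set_hostname(rootfs: Path, new_name: str) -> None:
    """Set the host name inside a mounted root filesystem (hypothetical helper).

    rootfs is the directory where the cloned image is loopback mounted.
    """
    hostname_file = rootfs / "etc" / "hostname"
    old_name = hostname_file.read_text().strip()
    hostname_file.write_text(new_name + "\n")

    hosts_file = rootfs / "etc" / "hosts"
    if not old_name or not hosts_file.exists():
        return

    def swap(match: re.Match) -> str:
        # Replace only whole whitespace-delimited tokens equal to the old name.
        return new_name if match.group(0) == old_name else match.group(0)

    updated = [re.sub(r"\S+", swap, line)
               for line in hosts_file.read_text().splitlines()]
    hosts_file.write_text("\n".join(updated) + "\n")


if __name__ == "__main__":
    import sys

    # Usage (as root): python set_hostname.py /mnt/tk1-rootfs node01
    set_hostname(Path(sys.argv[1]), sys.argv[2])
```

Run it once per node with a different name, unmount the image, and then push the image to the next board. Regenerating ssh host keys is a separate step that this sketch does not cover.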
Great! Thanks a lot for the reply. On another note, I have not found any build logs from other people constructing NVIDIA Jetson clusters, just pictures of 5 or 6 Jetsons stacked on top of each other. Is there some technical limitation of these little machines that I am not aware of?
At SC '14 Orange Silicon Valley (telecoms giant) showed 96 Jetsons plumbed together:
They stated at the time that they could put 4,000 TK1s per rack unit. Yes, there are technical limitations, but they probably have more to do with communication bandwidth, wiring, and concern over one's wallet.
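To get a rough feel for the bandwidth concern, here is a back-of-the-envelope sketch using the common latency-plus-bandwidth model for a single point-to-point message. The TK1's on-board network port is Gigabit Ethernet; the 100 microsecond latency default is purely an assumed placeholder, not a measurement, and the function name `transfer_time_s` is hypothetical.

```python
def transfer_time_s(message_bytes: int,
                    bandwidth_bits_per_s: float = 1e9,  # Gigabit Ethernet line rate
                    latency_s: float = 100e-6) -> float:  # assumed, not measured
    """Estimate the time to send one message: latency + size / bandwidth."""
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s


if __name__ == "__main__":
    for size in (1024, 1024 ** 2, 64 * 1024 ** 2):
        print(f"{size:>10d} bytes: {transfer_time_s(size) * 1e3:.3f} ms")
```

Real MPI traffic will be slower than this line-rate estimate because of protocol overhead and contention at the switch, so treat the numbers as a lower bound.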
Limitations depend greatly on what you're doing. The one I am aware of is that the TK1 is 32-bit and thus limited to CUDA 6.5, even though CUDA 7+ is out. Going to CUDA 7+ requires a 64-bit platform.