Board Choice: TK1 vs TX1 vs TX2 for CNN image processing

New here — I'm an undergrad doing a hand posture recognition project.
I'm planning to implement it with TensorFlow in Python.
My advisor suggested I implement my project on a TK1.
However, NVIDIA discontinued the TK1 in April 2018.
Is there any advantage to using the TK1, given that it has been around for a few years?
Or does the TX2 outperform the TK1 in every way?
What about TX1 vs TX2?

I can't see good reasons for choosing TK1 over TX1, or TX1 over TX2, except price or a requirement to use a binary bound to a specific release.

The TK1 has 4 32-bit cores (armhf), 2 GB RAM, and its CUDA version will never be higher than 6.5.
I'm not sure you can get TensorFlow for that CUDA version.

TX1 has 4 64-bit cores (aarch64) and 4GB RAM.
TX2 has 6 64-bit cores (aarch64) and 8GB RAM.
So if your application needs more than 3 GB, you would have to select the TX2.
GPU performance also improves with each newer module.
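To make the board selection concrete, here is a small sketch based on the specs quoted above. The `headroom_gb` figure for OS overhead and the example footprint are assumptions for illustration, not measured values:

```python
# Rough sketch: pick the first Jetson module whose RAM covers your
# application's estimated footprint. Board specs are the ones quoted
# in the post above; the headroom figure is an illustrative assumption.

BOARDS = [
    # (name, CPU cores, arch, RAM in GB)
    ("TK1", 4, "armhf", 2),
    ("TX1", 4, "aarch64", 4),
    ("TX2", 6, "aarch64", 8),
]

def smallest_board_for(required_gb, headroom_gb=1.0):
    """Return the first board whose RAM covers the app plus OS headroom."""
    for name, _cores, _arch, ram_gb in BOARDS:
        if ram_gb >= required_gb + headroom_gb:
            return name
    return None  # nothing in the lineup fits

# An application needing more than 3 GB pushes you to the TX2, as noted above:
print(smallest_board_for(3.5))  # -> TX2
```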

I wish to emphasize something @Honey_Patouceul mentioned: CUDA version 6.5 is the newest a TK1 can use. Development stopped on the TK1, and essentially none of the currently published deep learning code still targets CUDA that old. Although the TX1 is not actively developed now, it has much more recent CUDA available (I'm not sure which, but probably CUDA 9…someone who knows for sure may want to answer). The TX2 and Xavier are both actively developed (although L4T R28.2 is the last version listed for a TX1, R28.2.1 works on a TX1 as well…and this happens to be the most recent release available on a TX2…but at the next release, e.g., early in 2019, the TX1 should be expected to be incompatible with the TX2 release).

My thought is that if you can get a TX1 on a developer carrier board (a separate module on a third party carrier is probably not practical for you) it would be superior to working on a TK1.

As an alternative to a TK1 developer kit, you might use the Toradex module with one of their carrier boards (not all carrier boards support the same I/O connectors, be careful to check):

Please note that Toradex uses a different device tree, so although stock TK1 releases can be made to work on this module, there are some differences and you'd end up wanting to use the software released by Toradex (the device tree accounts for most of the difference).

Maybe I was wrong in my post. At the CNN phase, I will train in TensorFlow. Once I'm satisfied with my CNN results, I will deploy the model onto a Jetson board.
From a few posts I've read, is it right to say that developers commonly do training on a PC with a GPU and then deploy on a Jetson board [1]?

[1] Is it possible to train CNN on jetson tx2? Or jetson is used only for a pre-trained neural network?

So in that case, should I hold off the decision on which board to pick until my CNN training is complete, so I can see how much RAM I would need?

You may train on a host with its GPU, and then deploy to the Jetson with TensorRT.
So be sure to check which CUDA version is expected by the TensorRT version you plan to use on the Jetson.
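That compatibility check could be sketched like this. The version numbers are only the ones mentioned in this thread (6.5 as the TK1's ceiling, CUDA 9 for the TX1/TX2 era); treat them as illustrative, not an authoritative support matrix:

```python
# Illustrative compatibility check: does the CUDA version a given
# TensorRT build expects fall within what the module can run?
# Versions are taken from this thread (TK1 caps at CUDA 6.5; TX1/TX2
# on L4T R28.2.x were said to ship CUDA 9), not an official matrix.

MAX_CUDA = {
    "TK1": (6, 5),   # development stopped; 6.5 is the ceiling
    "TX1": (9, 0),   # "probably CUDA 9" per the thread
    "TX2": (9, 0),   # current as of this thread
}

def can_run(board, required_cuda):
    """True if the board's newest CUDA is at least the required version."""
    return MAX_CUDA[board] >= required_cuda

# A TensorRT build expecting CUDA 9.0 rules the TK1 out:
print(can_run("TK1", (9, 0)))  # -> False
print(can_run("TX2", (9, 0)))  # -> True
```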

Jetsons are always slow with training. Not all training software may be available on a Jetson, but in cases where it is possible you will find it slow. You are correct that people train on the PC and then deploy on the Jetson. Video RAM size does matter (I think it has been said before that a lot can work with a 6GB VRAM, but 3GB will often be insufficient…more VRAM is better, e.g., a 12GB Titan Xp or a Titan RTX would be even better…a good excuse for getting the ultimate gamer card).
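To make the VRAM point concrete, here is a back-of-envelope estimate. The formula (weights + gradients + optimizer state + stored activations, all in FP32) is a common rough approximation, and the parameter and activation counts are invented for illustration:

```python
# Back-of-envelope training-VRAM estimate for a CNN, assuming FP32
# (4 bytes per value), SGD with momentum (one extra state per weight),
# and activations kept for the backward pass. All counts are illustrative.

BYTES = 4  # FP32

def training_vram_gb(n_params, n_activations_per_sample, batch_size):
    weights  = n_params * BYTES                                # model weights
    grads    = n_params * BYTES                                # one gradient per weight
    momentum = n_params * BYTES                                # SGD-momentum state
    acts     = n_activations_per_sample * batch_size * BYTES   # stored activations
    return (weights + grads + momentum + acts) / 1024**3

# A hypothetical 25M-parameter CNN with 20M activation values per sample:
est = training_vram_gb(25_000_000, 20_000_000, batch_size=32)
print(f"{est:.1f} GB")
```

For these made-up numbers the estimate lands above 3 GB but under 6 GB, which is consistent with the rule of thumb above; a larger batch size or deeper network pushes it higher, which is why more VRAM is better for training.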