How many physical blocks can the TX2 compute concurrently?

I don’t know how much shared memory the TX2 has.

Should be 8 GB. You can check with:
cat /proc/meminfo

I mean the shared memory that threads use between themselves, 64 KB per block. But I don’t know how many of these 64 KB shared memory regions the TX2 has.

What particular kind of “block” are you talking about? CUDA blocks? What kind of threads are you talking about? CPU or GPU threads?
The Jetson uses “unified memory”: all memory is potentially usable both by the CPU cores and by the GPU.
When it comes to CUDA scheduling, though, there are really only two top-level execution units (streaming multiprocessors, or SMs) AFAICT.
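
For a rough idea of how many CUDA blocks can be resident at the same time, here is a back-of-the-envelope sketch in Python. The function name and the per-SM limits below are my assumptions for the TX2 GPU (2 SMs, 64 KB shared memory per SM, 2048 threads per SM, 32 blocks per SM); please verify them on your own board with the deviceQuery sample or cudaGetDeviceProperties. The sketch ignores register usage, which can lower the real number further.

```python
# Rough estimate of how many CUDA blocks can be resident on the GPU at once.
# All hardware limits here are ASSUMED values for the TX2; check them with
# deviceQuery or cudaGetDeviceProperties before relying on them.

ASSUMED_NUM_SMS = 2                 # streaming multiprocessors
ASSUMED_SMEM_PER_SM = 64 * 1024     # bytes of shared memory per SM
ASSUMED_MAX_THREADS_PER_SM = 2048   # resident threads per SM
ASSUMED_MAX_BLOCKS_PER_SM = 32      # resident blocks per SM


def resident_blocks(threads_per_block, smem_per_block,
                    num_sms=ASSUMED_NUM_SMS,
                    smem_per_sm=ASSUMED_SMEM_PER_SM,
                    max_threads_per_sm=ASSUMED_MAX_THREADS_PER_SM,
                    max_blocks_per_sm=ASSUMED_MAX_BLOCKS_PER_SM):
    """Hypothetical helper: blocks that fit on the whole GPU at once.

    Register pressure is not modelled, so treat the result as an upper bound.
    """
    # Limit imposed by the number of threads an SM can keep resident.
    limit_by_threads = max_threads_per_sm // threads_per_block
    # Limit imposed by shared memory; a block using none is not limited by it.
    if smem_per_block > 0:
        limit_by_smem = smem_per_sm // smem_per_block
    else:
        limit_by_smem = max_blocks_per_sm
    per_sm = min(max_blocks_per_sm, limit_by_threads, limit_by_smem)
    return num_sms * per_sm


if __name__ == "__main__":
    # 256 threads and 16 KB of shared memory per block
    print(resident_blocks(256, 16 * 1024))
    # 256 threads and 48 KB of shared memory per block
    print(resident_blocks(256, 48 * 1024))
```

Under these assumed limits, the more shared memory each block requests, the fewer blocks fit on an SM, so shared memory is often the limiting factor rather than the thread count.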