Tensor cores of Volta: no way to test/tune performance

Tensor cores in the Volta GPU are a really big step (up to 120 TFLOPS). However, many developers have no way to test/tune performance on them, because the V100 is too expensive.

Does NVIDIA have plans to produce a cheaper version of the V100? Any ideas?

For example, a smaller chip (1/2 of a V100) with only 320 tensor cores (400 mm^2 on 16nm), using cheap GDDR5X (e.g. 547 GB/s like the Titan Xp) and priced in the $2000-$3000 range, would be a really great buying option!

Any ideas on where to test/tune the performance of tensor cores are really welcome. CUDA 9 supports tensor cores, but without access to the hardware there is no way to test/tune performance or to write optimal code for them :(.

The Titan V is now available at a ~$3000 price point.

Lower-level programming details for Tensor Cores are now available in this blog post:

https://devblogs.nvidia.com/parallelforall/programming-tensor-cores-cuda-9/
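
For anyone who does get access to a Volta card, here is a rough sketch of a micro-benchmark built on the CUDA 9 WMMA API (`nvcuda::wmma` from `mma.h`) that the blog post describes. The kernel names (`fill_half`, `wmma_loop`), the launch shape, and the iteration count are my own placeholders, not anything from NVIDIA. Each warp just repeats a 16x16x16 multiply-accumulate on the same tiles, so this is not a real GEMM. Because every MMA depends on the previous accumulator, the printed figure should be read as a lower bound, not the card's peak.

```cuda
// Assumed build line: nvcc -arch=sm_70 wmma_bench.cu -o wmma_bench  (CUDA 9+, Volta-class GPU)
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <mma.h>

using namespace nvcuda;

// Fill a half-precision buffer on the device, avoiding host-side half conversion.
__global__ void fill_half(half *p, int n, float v) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] = __float2half(v);
}

// Each warp loads one 16x16 tile of A and B, then issues `iters` dependent
// 16x16x16 MMAs into an FP32 accumulator and stores its own result tile.
__global__ void wmma_loop(const half *a, const half *b, float *c, int iters) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);
    wmma::load_matrix_sync(a_frag, a, 16);
    wmma::load_matrix_sync(b_frag, b, 16);

    for (int i = 0; i < iters; ++i)
        wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);

    // Storing the result keeps the compiler from discarding the loop.
    int warp = (blockIdx.x * blockDim.x + threadIdx.x) / 32;
    wmma::store_matrix_sync(c + (size_t)warp * 256, c_frag, 16, wmma::mem_row_major);
}

int main() {
    const int iters = 1 << 16;               // placeholder; tune for your card
    const int blocks = 640, threads = 256;   // placeholder launch shape
    const int warps = blocks * threads / 32;

    half *a, *b;
    float *c;
    cudaMalloc(&a, 256 * sizeof(half));
    cudaMalloc(&b, 256 * sizeof(half));
    cudaMalloc(&c, (size_t)warps * 256 * sizeof(float));
    fill_half<<<1, 256>>>(a, 256, 1.0f);
    fill_half<<<1, 256>>>(b, 256, 1.0f);

    wmma_loop<<<blocks, threads>>>(a, b, c, iters);  // warm-up launch

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    cudaEventRecord(t0);
    wmma_loop<<<blocks, threads>>>(a, b, c, iters);  // timed launch
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);

    // With A and B all ones, each MMA adds 16 to every accumulator element.
    float c0 = 0.0f;
    cudaMemcpy(&c0, c, sizeof(float), cudaMemcpyDeviceToHost);
    printf("c[0] = %.0f (expected %.0f)\n", c0, 16.0f * iters);

    // One 16x16x16 MMA is 2*16*16*16 = 8192 floating-point operations.
    double flops = (double)warps * iters * 2.0 * 16 * 16 * 16;
    printf("%.3f ms, ~%.1f TFLOPS (dependent-chain lower bound)\n",
           ms, flops / (ms * 1e-3) / 1e12);

    cudaEventDestroy(t0);
    cudaEventDestroy(t1);
    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

To get closer to peak you would probably want several independent accumulator fragments per warp, and then experiment with the number of warps per block. For real workloads cuBLAS with tensor-op math enabled is likely the better baseline to compare against.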