Just got my 2 Titan Vs today and have tested them on a few kernels. The results are good, but the boost clock seems overstated: in my tests the GPU boost clock only reaches 1355 MHz, far lower than my GP102, which can reach 1850+ MHz.
The most interesting part is the GEMM test with CUBLAS_TENSOR_OP_MATH enabled:
With tensor cores enabled for GEMM with fp16 x fp16 = fp32 (fp16 inputs, fp32 output), the Titan V can reach 83 TFLOPS, which is quite impressive, especially considering it has only 3/4 of the memory bandwidth of the V100.
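For anyone who wants to reproduce the TFLOPS figure from their own timings, here is a minimal sketch of the usual conversion. It assumes the standard 2*M*N*K operation count for a GEMM; the function name and the 8192 matrix size and 13.25 ms timing are hypothetical and are not taken from my runs.

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved throughput of one (m x k) @ (k x n) GEMM, in TFLOPS."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Each of the m*n outputs needs k multiplies and k adds.
    flops = 2.0 * m * n * k
    return flops / seconds / 1e12


# Hypothetical example: an 8192 x 8192 x 8192 GEMM timed at 13.25 ms.
print(f"{gemm_tflops(8192, 8192, 8192, 13.25e-3):.1f} TFLOPS")
```

The time should be the average over many kernel launches after a warm-up run, otherwise launch overhead and clock ramp-up skew the result.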
And the most unexpected result is that enabling tensor cores seems to accelerate SGEMM as well, for a reason I don't yet know:
Without tensor cores, the SGEMM test on the Titan V gets just ~12 TFLOPS.
But with tensor cores enabled, SGEMM on the Titan V reaches 30-40 TFLOPS.
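A rough back-of-the-envelope check on these numbers, as a sketch only. The core counts and the 64 FMA per tensor core per clock are assumptions taken from NVIDIA's published Titan V / Volta specifications, not something I measured; the clock is the 1355 MHz I observed.

```python
# Assumptions (NVIDIA's published Titan V / Volta specs, not measured here):
CUDA_CORES = 5120                    # FP32 lanes on the chip
TENSOR_CORES = 640                   # tensor cores on the chip
FMA_PER_TENSOR_CORE_PER_CLOCK = 64   # one 4x4x4 matrix FMA per clock


def peak_fp32_tflops(clock_mhz: float) -> float:
    """Theoretical FP32 peak: one FMA (2 flops) per CUDA core per clock."""
    return CUDA_CORES * 2 * clock_mhz * 1e6 / 1e12


def peak_tensor_tflops(clock_mhz: float) -> float:
    """Theoretical tensor-core peak: 64 FMAs (128 flops) per core per clock."""
    return TENSOR_CORES * FMA_PER_TENSOR_CORE_PER_CLOCK * 2 * clock_mhz * 1e6 / 1e12


clock = 1355  # MHz, the boost clock observed in my tests
print(f"FP32 peak:   {peak_fp32_tflops(clock):.1f} TFLOPS")
print(f"Tensor peak: {peak_tensor_tflops(clock):.1f} TFLOPS")
```

Under these assumptions the FP32 peak at 1355 MHz is about 13.9 TFLOPS and the tensor-core peak is about 111 TFLOPS. So ~12 TFLOPS for plain SGEMM is close to the FP32 limit, while 30-40 TFLOPS is well above it, which means that result cannot come from the regular FP32 units alone.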
I don't know how this is possible. Maybe Nvidia forgot to mention that their tensor cores can accelerate SGEMM as well? I just hope this is a hidden feature rather than a bug in CUDA 9.1.