Nvidia Volta Speculation Thread

That's really nuts, and it bodes well for gaming Volta. Or Ampere, or whatever the next gaming generation is.

AMD is probably quaking in their boots seeing this...
Not really surprising considering the size of the chip. Adding all those cores should significantly reduce power for similar performance and there is a newer, more efficient process on top of it. HBM2 surely helps as well with the bandwidth involved.

AMD's alternative would be Navi's likely multi-core solution which could throw even more cores at the problem with even more efficiency. That's based on Epyc numbers and Nvidia's own MCM studies with multi-core being significantly more effective.
 
AMD's alternative would be Navi's likely multi-core solution which could throw even more cores at the problem with even more efficiency. That's based on Epyc numbers and Nvidia's own MCM studies with multi-core being significantly more effective.
What kind of efficiency do you mean? It's completely counterintuitive that, given X number of cores, a single die would be less efficient than a multi-die design.
 
Not really surprising considering the size of the chip. Adding all those cores should significantly reduce power for similar performance and there is a newer, more efficient process on top of it.
The size of the chip is what makes this impressive. Power could have exploded, but it didn't: it's running some compute jobs faster than previous chips at lower power consumption. I'd have a hard time believing that's all down to HBM and a new process that is really just a tweak of an existing process.

AMD's alternative would be Navi's likely multi-core solution which could throw even more cores at the problem with even more efficiency. That's based on Epyc numbers and Nvidia's own MCM studies with multi-core being significantly more effective.
Maybe we should wait for a multi-chip GPU to make its appearance first before declaring victory, and besides, Epyc isn't a GPU either so I don't really see how it is applicable.
 
AMD's alternative would be Navi's likely multi-core solution which could throw even more cores at the problem with even more efficiency. That's based on Epyc numbers and Nvidia's own MCM studies with multi-core being significantly more effective.
"Likely MC" you say. Where is a single patent, leaked schematic or a leaked sample photo? Mind you, this solution should be available in a year or so.
 
"Likely MC" you say. Where is a single patent, leaked schematic or a leaked sample photo? Mind you, this solution should be available in a year or so.
By that logic, where were any of the patents for the new things Zen implemented a year out from its release? Only a few precursor patents can be found, like the "stack cache", but the memfile/store-to-load forwarding looks nothing like what was described in that patent.
 
"Likely MC" you say. Where is a single patent, leaked schematic or a leaked sample photo? Mind you, this solution should be available in a year or so.

Probably year and a half; June 2019 launch window seems most logical.
 
Direct3D 12 Feature Checker on NVIDIA TITAN V

Post said:
ADAPTER 0
"NVIDIA TITAN V"
VEN_10DE, DEV_1D81, SUBSYS_121810DE, REV_A1
Video
Total video memory: 28448.44 MB (29822961664 bytes)
Video driver version: 23.21.13.8859
Maximum feature level: D3D_FEATURE_LEVEL_12_1 (0xc100)
DoublePrecisionFloatShaderOps: 1
OutputMergerLogicOp: 1
MinPrecisionSupport: D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
TiledResourcesTier: D3D12_TILED_RESOURCES_TIER_3 (3)
ResourceBindingTier: D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported: 0
TypedUAVLoadAdditionalFormats: 1
ROVsSupported: 1
ConservativeRasterizationTier: D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
StandardSwizzle64KBSupported: 0
CrossNodeSharingTier: D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported: 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation: 1
ResourceHeapTier: D3D12_RESOURCE_HEAP_TIER_1 (1)
MaxGPUVirtualAddressBitsPerResource: 40
MaxGPUVirtualAddressBitsPerProcess: 40
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1
HighestShaderModel: D3D12_SHADER_MODEL_6_0 (0x0060)
WaveOps: 1
WaveLaneCountMin: 32
WaveLaneCountMax: 32
TotalLaneCount: 163840
ExpandedComputeResourceStates: 1
Int64ShaderOps: 1
RootSignature.HighestVersion: D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
DepthBoundsTestSupported: 1
ProgrammableSamplePositionsTier: D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
ShaderCache.SupportFlags: D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
CopyQueueTimestampQueriesSupported: 1
CastingFullyTypedFormatSupported: 1
WriteBufferImmediateSupportFlags: D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY (15) (0b0000'1111)
ViewInstancingTier: D3D12_VIEW_INSTANCING_TIER_NOT_SUPPORTED (0)
BarycentricsSupported: 0
ExistingHeaps.Supported: 1
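
For anyone wondering where the TotalLaneCount figure in the dump comes from, here is a rough sanity check in Python. The lane counts are copied from the dump; the SM count (80) and the per-SM thread limit (2048) are my own assumptions about Titan V and Volta, not something the feature checker reports, so treat this as a sketch rather than an official breakdown.

```python
# Values copied from the feature dump above.
TOTAL_LANE_COUNT = 163840
WAVE_LANE_COUNT = 32  # WaveLaneCountMin == WaveLaneCountMax == 32

# Assumptions, NOT part of the dump: 80 SMs enabled on Titan V and
# up to 2048 resident threads per Volta SM.
ASSUMED_SM_COUNT = 80
ASSUMED_MAX_THREADS_PER_SM = 2048


def resident_waves(total_lanes: int, lanes_per_wave: int) -> int:
    """Number of full waves that fit into the reported lane count."""
    return total_lanes // lanes_per_wave


waves = resident_waves(TOTAL_LANE_COUNT, WAVE_LANE_COUNT)
matches = ASSUMED_SM_COUNT * ASSUMED_MAX_THREADS_PER_SM == TOTAL_LANE_COUNT

print(f"waves in flight: {waves}")
print(f"matches assumed SMs x threads per SM: {matches}")
```

If those assumptions hold, the reported number looks like the count of threads that can be resident at once rather than the count of physical ALUs.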
 
NVidia Titan V GPU Core vs. HBM2 Overclocking
Titan V behaves a bit differently from the other recent HBM2 launch (Vega), primarily in that it appears less memory-constrained than Vega. The Titan V card responds better to core overclocks, in some cases, where other benchmarks produce roughly equal uplift from both core and memory overclocking. Based on our thermal and gaming benchmarks from earlier, it would appear that the Titan first needs a core OC, or at least a power offset and improved cooling solution, as performance grows significantly with core overclocks alone. Once that’s solved for, memory is actually providing meaningful uplift in some of these applications; it’s not like other GPU architectures where memory OCs can sometimes appear not worthwhile.

https://www.gamersnexus.net/guides/3172-nvidia-titana-v-gpu-core-vs-hbm2-memory-overclocking
 
These features are worse on Titan V compared to Vega

MaxGPUVirtualAddressBitsPerResource: 40
MaxGPUVirtualAddressBitsPerProcess: 40
MinPrecisionSupport: D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
PSSpecifiedStencilRefSupported: 0
ResourceHeapTier: D3D12_RESOURCE_HEAP_TIER_1 (1)

Don't have a reference to Vega for the newer features DmitryKo added, last 6 lines.
 
These features are worse on Titan V compared to Vega
You forgot standard swizzle, which is not supported on the Titan V according to the dump above.

edit - I just quickly checked the feature level thread and the post on Vega doesn't seem to support standard swizzle either. Isn't Vega supposed to support standard swizzle?
 
Titan V is 80% faster than TitanXP in Superposition 4K extreme benchmark:

TitanV @1920/1000: Score 2490
TitanXP @2088/3234: Score 1396

https://forums.overclockers.co.uk/posts/31427769/
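
For reference, the "80% faster" figure can be checked from the two scores quoted above. A minimal sketch (the function name is just something I made up for illustration):

```python
def speedup_percent(new_score: float, old_score: float) -> float:
    """How much faster new_score is than old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


# Superposition 4K extreme scores quoted in the post.
TITAN_V_SCORE = 2490
TITAN_XP_SCORE = 1396

gain = speedup_percent(TITAN_V_SCORE, TITAN_XP_SCORE)
print(f"Titan V is about {gain:.0f}% faster than Titan Xp")
```

That works out to a bit under 80%, so the headline number is a rounded-up version of the actual ratio.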

So Superposition with extreme shaders is quite the outlier; probably the changed SM of Volta helps there bigly. The Vega64 results are surprising as well. Maybe something else is limiting the GP102 cards in it, because Unigine benchmarks are usually better on Nvidia and Superposition is no exception.


Looks pretty bad for gaming performance here, even overclocking doesn't help. The SM change doesn't look like it will make it into the gaming cards; another Pascal-like iteration would be better, especially if they can increase the clocks even further.
 
Going by the ETH mining rates hothardware.com achieves, the web service whattomine gives a profit (after an expensive electricity cost of 28 US cents) of >6.5 US-$ per day for 82 MH/s. That's 195 US-$ per month and 2,340 per year. And after that you have a nice, fast graphics card that cost you merely 650 bucks. Yeah, I know, trying to justify by all means possible... :D :D :D and a Vega is at 4.35 US-$ per day with XMR right now, so...
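
The back-of-the-envelope math above can be written out as a small sketch. The daily profit is the figure from the post; the 30-day month and 12-month year follow the post's own rounding, and the 2,999 US-$ card price is my assumption (it is not stated in the post, but it is what makes the "650 bucks" come out).

```python
DAILY_PROFIT_USD = 6.5    # from the post, after electricity
DAYS_PER_MONTH = 30       # rounding used in the post
MONTHS_PER_YEAR = 12

# Assumption, NOT stated in the post: Titan V purchase price in US-$.
ASSUMED_CARD_PRICE_USD = 2999


def mining_profit(daily_usd: float) -> tuple[float, float]:
    """Return (monthly, yearly) profit using 30-day months."""
    monthly = daily_usd * DAYS_PER_MONTH
    yearly = monthly * MONTHS_PER_YEAR
    return monthly, yearly


monthly, yearly = mining_profit(DAILY_PROFIT_USD)
net_cost = ASSUMED_CARD_PRICE_USD - yearly

print(f"monthly: {monthly:.0f} US-$, yearly: {yearly:.0f} US-$")
print(f"card cost after one year of mining: {net_cost:.0f} US-$")
```

This ignores difficulty changes and coin price swings, so it only shows that the numbers in the post are internally consistent.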
 
Looks pretty bad for gaming performance here, even overclocking doesn't help. The SM change doesn't look like it will make it into the gaming cards; another Pascal-like iteration would be better, especially if they can increase the clocks even further.
Can you come up with a good reason not to use the Volta SM going forward in gaming GPUs?

I’m not aware of anything in Volta that is different from Pascal in terms of feeding the SMs (geometry handling etc.)

So as long as the new SMs don’t regress in anything compared to the old ones, I don’t see why Nvidia would choose Pascal.

The new SMs seem to be similar in terms of area (if you ignore the extraneous stuff), they are more power efficient, they have much better caches, they are so much better for integer stuff, and the clocks are in the same ballpark as well as long as it doesn’t get throttled due to power limits, which could be explained by the leakage of large, idle FP64 and tensor cores.

So just strip the FP64 units, replace the FP16 tensor cores by INT, remove whatever ECC stuff is in there, and done!
 
I'm pretty sure the Titan V is somewhat impacted by its lower ROP and geometry rates compared to the Xp and 1080 Ti, at least in some games.

And given the small number of games tested so far, a significant driver improvement in such games isn't out of the question IMO, which would dramatically improve the current perception of how fast it is.
 