Infinisearch
Veteran
It seems it's a good overclocker, but it has a bad cooler holding it back.
That's really nuts. And bodes well for gaming Volta. Or Ampere, or whatever the next gaming generation is.
Not really surprising considering the size of the chip. Adding all those cores should significantly reduce power for similar performance, and there is a newer, more efficient process on top of it. HBM2 surely helps as well with the bandwidth involved.
AMD is probably quaking in their boots seeing this...
AMD's alternative would be Navi's likely multi-core solution, which could throw even more cores at the problem with even more efficiency. That's based on Epyc numbers and Nvidia's own MCM studies, with multi-core being significantly more effective.
What kind of efficiency do you mean? It's completely counterintuitive that, given X number of cores, a single die would be less efficient than a multi-die.
The size of the chip is what makes this impressive. Power could have exploded, but it didn't; it's running some compute jobs faster than previous chips at lower power consumption. I'd have a hard time believing that's all down to HBM and a new process, which is really just a tweak of an existing process.
Maybe we should wait for a multi-chip GPU to make its appearance first before declaring victory, and besides, Epyc isn't a GPU either, so I don't really see how it is applicable.
"Likely MC" you say. Where is a single patent, leaked schematic or a leaked sample photo? Mind you, this solution should be available in a year or so.
By that logic, where were any of the patents for the new things Zen implemented a year out from its release? Only a few precursor patents can be found, like the "stack cache" one, but the memfile/store-to-load forwarding looks nothing like what was described in that patent.
ADAPTER 0
"NVIDIA TITAN V"
VEN_10DE, DEV_1D81, SUBSYS_121810DE, REV_A1
Video
Total video memory: 28448.44 MB (29822961664 bytes)
Video driver version: 23.21.13.8859
Maximum feature level: D3D_FEATURE_LEVEL_12_1 (0xc100)
DoublePrecisionFloatShaderOps 1
OutputMergerLogicOp: 1
MinPrecisionSupport: D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
TiledResourcesTier: D3D12_TILED_RESOURCES_TIER_3 (3)
ResourceBindingTier: D3D12_RESOURCE_BINDING_TIER_3 (3)
PSSpecifiedStencilRefSupported: 0
TypedUAVLoadAdditionalFormats: 1
ROVsSupported: 1
ConservativeRasterizationTier: D3D12_CONSERVATIVE_RASTERIZATION_TIER_3 (3)
StandardSwizzle64KBSupported: 0
CrossNodeSharingTier: D3D12_CROSS_NODE_SHARING_TIER_NOT_SUPPORTED (0)
CrossAdapterRowMajorTextureSupported: 0
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation: 1
ResourceHeapTier: D3D12_RESOURCE_HEAP_TIER_1 (1)
MaxGPUVirtualAddressBitsPerResource: 40
MaxGPUVirtualAddressBitsPerProcess: 40
Adapter Node 0: TileBasedRenderer: 0, UMA: 0, CacheCoherentUMA: 0, IsolatedMMU: 1
HighestShaderModel: D3D12_SHADER_MODEL_6_0 (0x0060)
WaveOps: 1
WaveLaneCountMin: 32
WaveLaneCountMax: 32
TotalLaneCount: 163840
ExpandedComputeResourceStates: 1
Int64ShaderOps: 1
RootSignature.HighestVersion: D3D_ROOT_SIGNATURE_VERSION_1_1 (2)
DepthBoundsTestSupported: 1
ProgrammableSamplePositionsTier: D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_2 (2)
ShaderCache.SupportFlags: D3D12_SHADER_CACHE_SUPPORT_SINGLE_PSO | LIBRARY (3) (0b0000'0011)
CopyQueueTimestampQueriesSupported: 1
CastingFullyTypedFormatSupported: 1
WriteBufferImmediateSupportFlags: D3D12_COMMAND_LIST_SUPPORT_FLAG_DIRECT | BUNDLE | COMPUTE | COPY (15) (0b0000'1111)
ViewInstancingTier: D3D12_VIEW_INSTANCING_TIER_NOT_SUPPORTED (0)
BarycentricsSupported: 0
ExistingHeaps.Supported: 1
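For anyone who wants to sanity-check a few of the numbers in that dump, here is a rough Python sketch. The values are copied from the dump; the flag bit values are how I remember them from d3d12.h, so treat them as an assumption, and `decode_flags` and `wave_slots` are just names made up for this example. Dividing TotalLaneCount by the wave size is only a back-of-the-envelope reading of what that field means.

```python
# Values copied from the feature dump above.
WAVE_LANE_COUNT_MAX = 32
TOTAL_LANE_COUNT = 163840
VA_BITS_PER_PROCESS = 40
WRITE_BUFFER_IMMEDIATE_FLAGS = 15

# Bit values as remembered from d3d12.h (assumption, not checked against the SDK).
COMMAND_LIST_SUPPORT_FLAGS = {
    1: "DIRECT",
    2: "BUNDLE",
    4: "COMPUTE",
    8: "COPY",
}


def decode_flags(value, names):
    """Return the names of all bits set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & bit]


# How many 32-wide waves the reported lane count corresponds to.
wave_slots = TOTAL_LANE_COUNT // WAVE_LANE_COUNT_MAX

# 40 virtual address bits expressed in TiB.
address_space_tib = 2 ** VA_BITS_PER_PROCESS / 2 ** 40

print(wave_slots)         # 5120
print(address_space_tib)  # 1.0
print(decode_flags(WRITE_BUFFER_IMMEDIATE_FLAGS, COMMAND_LIST_SUPPORT_FLAGS))
# ['DIRECT', 'BUNDLE', 'COMPUTE', 'COPY']
```

The 163840 figure would also match 80 SMs times 2048 resident threads each, if I have the Volta numbers right.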
Titan V behaves a bit differently from the other recent HBM2 launch (Vega), primarily in that it appears less memory-constrained than Vega. The Titan V card responds better to core overclocks in some cases, where other benchmarks produce roughly equal uplift from both core and memory overclocking. Based on our thermal and gaming benchmarks from earlier, it would appear that the Titan first needs a core OC, or at least a power offset and improved cooling solution, as performance grows significantly with core overclocks alone. Once that’s solved for, memory is actually providing meaningful uplift in some of these applications; it’s not like other GPU architectures where memory OCs can sometimes appear not worthwhile.
MaxGPUVirtualAddressBitsPerResource: 40
MaxGPUVirtualAddressBitsPerProcess: 40
MinPrecisionSupport: D3D12_SHADER_MIN_PRECISION_SUPPORT_NONE (0) (0b0000'0000)
PSSpecifiedStencilRefSupported: 0
ResourceHeapTier: D3D12_RESOURCE_HEAP_TIER_1 (1)
These features are worse on Titan V compared to Vega.
You forgot standard swizzle, which according to the above is not supported on the Titan V either.
Titan V is nearly 80% faster than TitanXP in the Superposition 4K Extreme benchmark (2490 vs. 1396, roughly 78%):
TitanV @1920/1000: Score 2490
TitanXP @2088/3234: Score 1396
https://forums.overclockers.co.uk/posts/31427769/
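If you want to check the percentage yourself, a minimal Python sketch using the two scores quoted above does it. `uplift_percent` is just a helper name made up for this post, and it ignores the different clocks the two cards were running at.

```python
def uplift_percent(new_score, old_score):
    """Relative gain of new_score over old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


titan_v_score = 2490   # TitanV @1920/1000
titan_xp_score = 1396  # TitanXP @2088/3234

print(f"{uplift_percent(titan_v_score, titan_xp_score):.1f}%")  # 78.4%
```

So "80% faster" is a slight round-up, but close enough for forum purposes.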
HotHardware's Titan V review:
NVIDIA TITAN V Review: Volta Compute, Mining, And Gaming Performance Explored
Looks pretty bad for gaming performance here; even overclocking doesn't help. The SM change doesn't look like it will make it into the gaming cards. Another Pascal-like iteration would be better, especially if they can increase the clocks even further.
Can you come up with a good reason not to use the Volta SM going forward in gaming GPUs?