https://www.nvidia.com/content/dam/...pere-GA102-GPU-Architecture-Whitepaper-V1.pdf
Look at Table 3 in the GA102 whitepaper.
There is a feature called "Instance Transform Acceleration". It's related to BVH building (it probably costs nothing for the SIMDs in GA102 to apply the instance transformation, i.e. to move/rotate a box or a model).
Do we know whether RDNA2 supports this feature?
It accelerates traversal in the presence of instanced geometry (e.g. building a forest by reusing the same tree many times with different poses).
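To make the forest example concrete, here is a minimal sketch of instancing. The `Blas` and `Instance` classes, the `translation` helper and the triangle count are all hypothetical and only loosely modelled on how DXR-style instance descriptors pair a 3x4 transform with a BLAS reference; this is not any vendor's actual layout.

```python
from dataclasses import dataclass


@dataclass
class Blas:
    """Hypothetical bottom-level structure: geometry stored once, in object space."""
    name: str
    triangle_count: int


@dataclass
class Instance:
    """Hypothetical TLAS entry: a 3x4 object-to-world transform plus a BLAS reference."""
    transform: list  # 3 rows of 4 floats, row-major
    blas: Blas


def translation(x, y, z):
    """Build a 3x4 transform that only translates."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z]]


# One tree model (made-up size), placed 1000 times with different poses.
tree = Blas("tree", triangle_count=50_000)
forest = [Instance(translation(10.0 * i, 0.0, 0.0), tree) for i in range(1000)]

# With instancing, the geometry is stored once per unique BLAS.
unique_blas = {id(inst.blas): inst.blas for inst in forest}
stored_triangles = sum(b.triangle_count for b in unique_blas.values())

# Without instancing, every placement would need its own copy of the geometry.
flattened_triangles = sum(inst.blas.triangle_count for inst in forest)
```

The point of the sketch is only the ratio: the TLAS holds 1000 small entries, while the tree geometry exists once instead of 1000 times.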
TLAS contains instances for every object of a scene, which are stored in BLASes. If different instances refer to the same BLAS, that's instancing.
Not sure why the "Instance Transform Acceleration" should refer just to instancing; it may as well be referring to instance and BLAS transformations in general.
By accelerating AABB transformations, a lot of optimisations become possible at BLAS build time - faster refitting, better AABB alignment for geometry, etc.
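For reference, this is roughly what transforming an AABB involves when done in software. It is a sketch of the well-known per-axis min/max approach (often attributed to Arvo), assuming a 3x4 row-major affine matrix; the function name `transform_aabb` is made up here, and nothing below claims to describe what the GA102 hardware actually does.

```python
def transform_aabb(m, lo, hi):
    """Return the axis-aligned box enclosing the box (lo, hi) after applying
    the 3x4 row-major affine transform m (assumed layout: rotation/scale in
    columns 0..2, translation in column 3)."""
    # Start from the translation, then add the extreme contribution per axis.
    new_lo = [m[r][3] for r in range(3)]
    new_hi = [m[r][3] for r in range(3)]
    for r in range(3):
        for c in range(3):
            a = m[r][c] * lo[c]
            b = m[r][c] * hi[c]
            new_lo[r] += min(a, b)
            new_hi[r] += max(a, b)
    return new_lo, new_hi


# Example: rotate 90 degrees about Z, then translate by (5, 0, 0).
rot_z_90 = [[0.0, -1.0, 0.0, 5.0],
            [1.0,  0.0, 0.0, 0.0],
            [0.0,  0.0, 1.0, 0.0]]
box_lo, box_hi = transform_aabb(rot_z_90, [0.0, 0.0, 0.0], [2.0, 1.0, 1.0])
```

This avoids transforming all eight corners, but it is still 18 multiplies plus the min/max work per box, which is why doing it for many boxes is a plausible thing to accelerate.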
My understanding is that instance transforms are done just in time during intersection testing.
What would happen if hardware doesn't support instance transforms?
Following the description here - "This data structure is used in GPU memory during acceleration structure build" and "Per customer request, clarified for D3D12_RAYTRACING_INSTANCE_DESC that implementations transform rays as opposed to transforming all geometry/AABBs."
You might be right that with HW acceleration it can happen during intersection testing, but some BVH builder assistance might still be required for cases without HW acceleration.
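The "transform rays as opposed to transforming all geometry/AABBs" wording can be illustrated with a small sketch. It assumes the instance transform is rigid (rotation plus translation only), so the inverse is just the transposed rotation; real instance matrices can also contain scale and shear and would need a general affine inverse. All function names here are hypothetical.

```python
def invert_rigid(m):
    """Invert a 3x4 row-major object-to-world transform.
    Assumption: m is rotation + translation only, so inverse = (R^T, -R^T t)."""
    rt = [[m[c][r] for c in range(3)] for r in range(3)]  # transposed rotation
    t = [m[r][3] for r in range(3)]
    inv_t = [-sum(rt[r][c] * t[c] for c in range(3)) for r in range(3)]
    return [rt[r] + [inv_t[r]] for r in range(3)]


def transform_point(m, p):
    """Apply rotation and translation (for ray origins)."""
    return [sum(m[r][c] * p[c] for c in range(3)) + m[r][3] for r in range(3)]


def transform_vector(m, v):
    """Apply rotation only (for ray directions)."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def ray_to_object_space(object_to_world, origin, direction):
    """Move a world-space ray into the instance's object space, so the
    untouched BLAS can be traversed as-is."""
    world_to_object = invert_rigid(object_to_world)
    return (transform_point(world_to_object, origin),
            transform_vector(world_to_object, direction))


# Instance rotated 90 degrees about Z and moved to x = 5.
instance_xform = [[0.0, -1.0, 0.0, 5.0],
                  [1.0,  0.0, 0.0, 0.0],
                  [0.0,  0.0, 1.0, 0.0]]
obj_origin, obj_dir = ray_to_object_space(instance_xform,
                                          [5.0, 3.0, 0.0],
                                          [0.0, -1.0, 0.0])
```

In a real implementation the inverse would presumably be computed once per instance rather than per ray; the per-ray cost is then one point transform and one vector transform each time a ray enters an instance.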
The alternative is to create unique BLAS entries for each instance during BVH build, but that would likely be very wasteful.
Yep, I thought about this variant, but doing transforms per ray on SIMD is probably cheaper. I have no idea, to be honest.
The Computerbase benches for Black Ops and Watch Dogs are old and used broken AMD drivers; the difference is rather large in these titles with proper drivers.
https://www.computerbase.de/2020-12...itt_benchmarks_in_sieben_topaktuellen_spielen
Cold War RT @4K: the 3090 is 66% faster than the 6900XT, the 3080 is 50% faster. The 1440p results are not logical, as they only have the 3090 being 18% faster than the 3070, suggesting a different bottleneck in the scene they selected.
The Computerbase test is actually for the 6900XT and from the eighth of December, so it's more than two weeks more recent than that video you posted of Black Ops.
Anyway, Computerbase.de once again makes a point of RDNA2 competing better in the recently released Black Ops and Watch Dogs: Legion. Black Ops even has better results in the 0.2% values on RDNA2, except in the 3840 x 2160 test.
Both Cold War and Legion are running better than average on AMD h/w without RT, and this likely skews the RT results in AMD's favor as well.
I.e. "it's just plain DXR so there's no reason to believe this game whose RT implementation was co-developed by nvidia would be favoring one architecture over the other".
I guess this is just empirical proof of that.
Actually, there's a similar proposition by Agner Fog for a hybrid CISC/RISC forward-compatible ISA: https://www.forwardcom.info/
But I would prefer that MS actively invent a forward-looking ISA that can be extended and/or has optional parts (like SSE, AVX). Or a consortium would. Or AMD's involvement with Samsung could basically lead to a situation like x86, where multiple vendors co-develop the ISA.