I have strong doubts that anyone knows the engineering choices of both chips.
At least it is supposed to be the case that the information is kept separate. The tiny doubt at this point stems from the comparative leaks in 2013 that hinted that AMD's compartments weren't airtight.
I had thought things would be more refined this time around, so I suppose comparing the alleged leaks we have now with whatever comes out will be the test for that.
But that's exactly what Sony have done with their custom ID Buffer in order to implement their own method of CBR or improved TAA. They came up with their own hardware solution, tailored to their specific needs, to solve a specific problem: how to display relatively sharp 4K with only 2x the pixels of 1080p.
And I actually expect that Sony will use some RDNA2 features ported to their custom RDNA GPU (the same way they used Vega features on their GCN GPU). But I won't be surprised if their solution is totally custom, because they have already done that in the past.
The ID buffer itself operates like a tweaked depth unit from the existing RBEs. Its output is written in parallel with the Z-buffer, and it updates a given pixel in step with the Z-buffer.
I think it's an example of how externally different features can be created by modifying or repurposing similar starting elements.
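To make that concrete, here's a toy sketch of how a per-pixel ID buffer can drive a checkerboard resolve. This is purely illustrative Python with made-up array names, not Sony's actual algorithm, and it skips motion-vector reprojection entirely; the point is just that an object ID tells you when temporal reuse of last frame's pixel is safe and when to fall back to spatial interpolation.

```python
import numpy as np

def cbr_resolve(cur_color, cur_id, prev_color, prev_id, cur_mask):
    """Toy checkerboard resolve: fill in the pixels not shaded this frame.

    cur_color / prev_color: (H, W, 3) float arrays, current / previous frame.
    cur_id / prev_id:       (H, W) int object-ID buffers (the 'ID buffer').
    cur_mask:               (H, W) bool, True where a pixel WAS shaded this frame.
    All names and the reuse heuristic are assumptions for illustration.
    """
    out = cur_color.copy()
    h, w, _ = cur_color.shape
    for y in range(h):
        for x in range(w):
            if cur_mask[y, x]:
                continue  # freshly shaded this frame, keep as-is
            if cur_id[y, x] == prev_id[y, x]:
                # Same object ID as last frame: the surface is likely unchanged,
                # so temporal reuse of the previous frame's pixel is reasonable.
                out[y, x] = prev_color[y, x]
            else:
                # ID mismatch (disocclusion or edge): average the horizontally
                # adjacent pixels that were shaded this frame instead.
                left = cur_color[y, max(x - 1, 0)]
                right = cur_color[y, min(x + 1, w - 1)]
                out[y, x] = 0.5 * (left + right)
    return out
```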
AMD's patent is a broad set of claims that doesn't commit to a single implementation, and it's still not guaranteed that the implementation we know about will reflect what actually gets used.
If it's used as a starting point, there are sub-elements that clients could modify or replace, such as the algorithms used by the node-traversal hardware or the implementation (or presence) of dedicated intersection hardware. Since the hardware sits in a sub-block of a compute unit, there may also be implementation choices about what kind of CU those blocks are attached to, and where they might be placed or reserved. The patent also doesn't define what level of exposure the RT hardware has to developers, and varying what is available for coding to the metal versus custom microcoded tasks or API calls can significantly change what the solution looks like to the programmer.
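As a rough illustration of how those sub-elements separate, here's a toy stack-based traversal in Python. Every name here is my own invention rather than anything from the patent: the slab test stands in for a fixed-function intersection unit, the loop stands in for the node-traversal logic, and a licensee could in principle swap either piece independently.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class BVHNode:
    bounds_min: Vec3
    bounds_max: Vec3
    children: List["BVHNode"] = field(default_factory=list)  # empty list => leaf
    primitives: List[int] = field(default_factory=list)      # primitive indices in a leaf

def ray_box_hit(origin: Vec3, inv_dir: Vec3, bmin: Vec3, bmax: Vec3) -> bool:
    """Slab test: the kind of fixed-function check an intersection unit performs."""
    tmin, tmax = 0.0, float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, bmin, bmax):
        t0, t1 = (lo - o) * inv, (hi - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        tmin, tmax = max(tmin, t0), min(tmax, t1)
        if tmin > tmax:
            return False
    return True

def traverse(root: BVHNode, origin: Vec3, direction: Vec3,
             intersect_leaf: Callable[[List[int]], Optional[float]]) -> Optional[float]:
    """Depth-first, stack-based traversal. The traversal policy (this loop) and the
    leaf test (intersect_leaf) are the two pieces that could be varied independently."""
    inv_dir = tuple(1.0 / d if abs(d) > 1e-12 else 1e12 for d in direction)
    best = None
    stack = [root]
    while stack:
        node = stack.pop()
        if not ray_box_hit(origin, inv_dir, node.bounds_min, node.bounds_max):
            continue
        if node.children:
            stack.extend(node.children)          # node-traversal logic lives here
        else:
            t = intersect_leaf(node.primitives)  # 'intersection hardware' lives here
            if t is not None and (best is None or t < best):
                best = t
    return best
```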
The PS5 using a separate chip for RT would be much nicer, as long as said RT chip could take on the majority of the ray-tracing overhead.
However, in games where RT isn't needed the SeX would have a sizeable performance advantage. Plus, we'd also need to know whether this same RT chip would process ray-traced audio, or whether the GPU would still have to carve part of those 9.2 TF out for audio.
One significant source of overhead is the compute devoted to building the acceleration structures, which Nvidia's driver currently manages using the SM hardware. If part of that were offloadable, then a separate chip would need some decent general compute capability of its own, perhaps for a subset of asynchronous compute. That might put more pressure on bandwidth if it's reliant on the GPU's memory pool, and could complicate the geometry phase or the transition to pixel shading. A separate memory pool would add complexity of the kind Sony said it wanted to avoid earlier.
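For a feel of what "building the acceleration structure" costs in general compute, here's a toy top-down median-split build in Python. The structure and names are illustrative assumptions; real drivers use far more sophisticated builders (SAH, LBVH and the like) on the SMs/CUs, but this per-frame sort-and-partition work over dynamic geometry is the kind of load that could conceivably be moved to another chip.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[Tuple[float, float, float], Tuple[float, float, float]]  # (min corner, max corner)

@dataclass
class BuildNode:
    bounds: Box
    left: Optional["BuildNode"] = None
    right: Optional["BuildNode"] = None
    prims: Optional[List[int]] = None  # set only on leaves

def merge(a: Box, b: Box) -> Box:
    """Union of two axis-aligned boxes."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))

def build(prim_boxes: List[Box], indices: Optional[List[int]] = None,
          leaf_size: int = 4) -> BuildNode:
    """Top-down median split: sort primitives on the widest axis, halve, recurse.
    Redoing this every frame for dynamic geometry is the compute cost in question."""
    if indices is None:
        indices = list(range(len(prim_boxes)))
    bounds = prim_boxes[indices[0]]
    for i in indices[1:]:
        bounds = merge(bounds, prim_boxes[i])
    if len(indices) <= leaf_size:
        return BuildNode(bounds, prims=indices)
    extent = [bounds[1][k] - bounds[0][k] for k in range(3)]
    axis = extent.index(max(extent))
    indices.sort(key=lambda i: prim_boxes[i][0][axis] + prim_boxes[i][1][axis])
    mid = len(indices) // 2
    return BuildNode(bounds,
                     left=build(prim_boxes, indices[:mid], leaf_size),
                     right=build(prim_boxes, indices[mid:], leaf_size))
```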
Traversal and intersection testing would be another set of tasks for a separate chip; those are more modest in bandwidth terms and could work across a more modest inter-chip link.
The latency question for a separate solution would rear its head. I'm not as concerned about link latency as about the fact that GPUs are weakly synchronized and can take a significant amount of time before work done in one portion of the chip is visible to another, and there's usually a cost in throughput or bandwidth to make results more readily visible.
I think there could be a more complex set of trade-offs in terms of flexibility, ease of use, latency, and performance impact that the console vendors would have to make decisions on, even with hardware that has strong similarities at the sub-unit level.
Besides avoiding lower yields from having a larger 400mm^2 chip?
Even more so if you consider that the PS5 was initially thought to launch this year, on 2019's 7nm yields.
If that were the case, there's a higher chance Sony was overly pessimistic about the yield picture in 2020, since TSMC has been touting much-improved yields now, versus earlier fears that the improvement curve would be longer.
Even without that, Sony's position as market leader could in a way make the PS5 a victim of the PS4's success. Projected sales for Sony versus Microsoft put more weight on per-die cost for the volume leader than for the minority player. If there's an expectation of selling 100 million or more PS5 chips, even a modest extra die cost adds up to a big number that Sony may have decided wasn't worth the upside.
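To put rough numbers on that (entirely made-up figures, just to show the shape of the math):

```python
# Hypothetical numbers for illustration only, not actual wafer costs or yields.
wafer_cost = 9000.0           # assumed cost of one 7nm wafer, USD
good_dies_small_chip = 180    # assumed good dies per wafer for a ~300 mm^2 chip
good_dies_large_chip = 130    # assumed good dies per wafer for a ~400 mm^2 chip

cost_small = wafer_cost / good_dies_small_chip   # ~$50 per die
cost_large = wafer_cost / good_dies_large_chip   # ~$69 per die
extra_per_die = cost_large - cost_small          # ~$19 per die

lifetime_units = 100_000_000
print(f"Extra cost per die: ${extra_per_die:.2f}")
print(f"Over {lifetime_units:,} consoles: ${extra_per_die * lifetime_units / 1e9:.2f} billion")
```

Under those assumed numbers, a roughly $20-per-die difference compounds to about two billion dollars over the generation, which is the scale at which "not worth the upside" becomes a plausible call.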
If only AMD had released, over the past couple of years, several solutions with high-bandwidth / low-latency communication between chiplets and I/O chips using Infinity Fabric over a substrate...?
That may depend on what workloads would straddle the link. Within the SOC we could be talking about hundreds of GB/s, and within the GPU domain several TB/s.
An array of Tensilica cores with additional custom instructions specific to RT would be very flexible, both for RT tasks and for audio. Cadence sells this IP as a semi-custom business and allows significant customization.
It does cost extra money to license, though, which is why I could see the console makers exploring options to replace it. AMD likely wouldn't mind cutting Cadence out and pocketing the difference.
Some of AMD's patents on modified CUs with different programming models and hardware layouts do point to a possible desire to create a set of standard parts that could serve in this capacity.
Some of the more advanced chiplet 3D integration or MCM integration patents similarly hint at a desire to have AMD sub-blocks that can be inserted for client needs in custom solutions.