Next Generation Hardware Speculation with a Technical Spin [post E3 2019, pre GDC 2020] [XBSX, PS5]

The DDR3 pool's primary reason for existing is to serve as RAM for the separate OS running on the southbridge.
Whether that particular platform quirk will continue on hasn't been discussed to my knowledge.
Right, I forgot about that. It was not a big success: the goal was to allow standby network operations, yet for some reason it still needs to power on for downloading. I wonder if they will double down and do it better, or forget about it completely.
 
It's one of those patents that includes a million "embodiments". They describe so many different methods that it was probably written by an R&D team looking at different ways to improve cooling for multiple reasons (thickness, layout, efficiency, etc.), rather than for any particular product.

Using filled vias to conduct heat through a PCB has been done forever, and it's been more than enough for anything in the range of portable electronics. The only thing from the patent that catches my eye is the tetris-like pieces going through the PCB while leaving enough room for the PCB traces to route around them. It could be used to increase cooling of specific hot spots (are there difficult hot spots on each WGP? on Zen cores?), so there's potential for increasing the overall cooling efficiency. The hard part is going through the package substrate too, but substrates have copper-filled vias, and the underfill could possibly be some heat-conducting epoxy. I don't know how these things are built.
The key is CTE matching the materials. You can’t use a material wildly mismatched to the intrinsic silicon unless you want warped die, cracked bumps, etc. Unless of course you want Red Ring 2: Electric Boogaloo.
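To put a rough number on that (the CTE figures below are typical handbook values, not anything from the patent), the mismatch strain scales with the CTE difference times the temperature swing:

# Rough CTE-mismatch strain estimate between silicon and a via/insert material.
# CTE values are typical handbook figures (ppm/K), not taken from the patent.
CTE_PPM_PER_K = {
    "silicon": 2.6,
    "copper": 17.0,
    "cu_w_composite": 7.0,   # copper-tungsten heat-spreader alloy (approximate)
}

def mismatch_strain(material_a, material_b, delta_t_kelvin):
    """Free thermal-expansion mismatch strain (dimensionless) over a temperature swing."""
    delta_cte = abs(CTE_PPM_PER_K[material_a] - CTE_PPM_PER_K[material_b]) * 1e-6
    return delta_cte * delta_t_kelvin

# Assume a ~60 K swing from power-on to full load.
for m in ("copper", "cu_w_composite"):
    strain = mismatch_strain("silicon", m, 60)
    print(f"Si vs {m}: {strain * 100:.3f}% mismatch strain over 60 K")

Roughly a 3x difference between the two, which is part of why heat spreaders and inserts against the die tend to be CTE-matched composites rather than plain copper.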

I will say that most of Sony’s physical design patents end up in shipped products. There are several patents regarding the cooling and EMI mitigation features of the PS4.
 
If AMD has improved their power management enough over the last several years, it should be redundant to have a secondary low-power processor in the system.
 
GDDR6 also added a lot more power management: very low clocks and low-power modes.
 
I think you can make a case that AMD has. Look at the PPW improvements of Zen 2 over Zen+, the power savings in the new Vega variant in APUs, and the claimed gains of RDNA 2 over RDNA 1 without a node jump.

GDDR6 has a lot of nice enhancements, including the 16b addressable blocks.
 
On the subject of GDDR6, am I right to think that the move from a single 16/32-bit channel connection to each chip to 2 x 8/16-bit channels should help with GPU/CPU memory contention?
 
Edit: Each channel has the same bandwidth as a GDDR5 channel, so it's effectively twice the number of channels, like a 512-bit GDDR5 bus at 7/8 Gbps. But the latency didn't improve, and the total data demand increased; I don't know if that means twice the cache is required... It might be status quo.
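To put numbers on that (the 7 and 14 Gbps data rates are just example figures, not confirmed specs for either console), here's the channel arithmetic for the same 256-bit bus in both generations:

# Illustrative channel/bandwidth comparison between GDDR5 and GDDR6 on the same bus width.
# Data rates (7 and 14 Gbps) are example figures, not confirmed console specs.
def memory_channels(bus_width_bits, channel_width_bits, gbps_per_pin):
    channels = bus_width_bits // channel_width_bits
    bw_per_channel_gbs = channel_width_bits * gbps_per_pin / 8   # GB/s per channel
    return channels, bw_per_channel_gbs, channels * bw_per_channel_gbs

for name, ch_width, rate in (("GDDR5", 32, 7), ("GDDR6", 16, 14)):
    ch, per_ch, total = memory_channels(256, ch_width, rate)
    print(f"{name}: {ch} channels x {per_ch:.0f} GB/s = {total:.0f} GB/s total")

Same 28 GB/s per channel in both cases, just twice as many of them on the GDDR6 side.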
 
I was wondering if having multiple data paths to each chip would lower the odds that either the CPU or GPU would be completely locked out from the data on a chip during a cycle due to the access demands of the other.
 
It does, but since the time lost from a collision is the same, and there are twice as many requests in flight, doubling the number of channels should barely compensate.
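A quick toy Monte Carlo backs that up. It assumes requests pick channels uniformly at random each cycle, which is a big simplification of a real memory controller, but it shows the per-request conflict odds staying roughly flat when you double both the channels and the requests in flight:

import random

# Toy model: each in-flight request picks a channel uniformly at random each "cycle".
# We measure how often a given request finds another request on the same channel.
def conflict_rate(channels, requests, trials=100_000):
    conflicts = 0
    for _ in range(trials):
        picks = [random.randrange(channels) for _ in range(requests)]
        if picks.count(picks[0]) > 1:           # request 0 shares its channel with someone
            conflicts += 1
    return conflicts / trials

# GDDR5-like: 8 channels with n requests in flight; GDDR6-like: 16 channels with 2n.
for n in (2, 4, 8):
    a = conflict_rate(8, n)
    b = conflict_rate(16, 2 * n)
    print(f"{n} req / 8 ch: {a:.3f}    {2 * n} req / 16 ch: {b:.3f}")

The two columns come out nearly identical, which is the "barely compensates" result.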
 
So, over time, the advantage that might be realized at any one moment doesn't hold up as it is mitigated by these other factors. Got it. Thanks!
 
Interesting patent here from SIE. Looks like an optical/holographic medium can have different types of data on different layers, which can be read simultaneously or interleaved in its read patterns.

http://www.freepatentsonline.com/10586566.html

It references this technical research, which includes a DNN analyzing the optical information read back.

https://www.photonics.com/Article.aspx?AID=63751

“This is intuitively like a very complex maze of glass and mirrors. The light enters a diffractive network and bounces around the maze until it exits. The system determines what the object is by where most of the light ends up exiting,” said UCLA professor Aydogan Ozcan.

In experiments, researchers placed images in front of a THz light source. The D2NN viewed the images through optical diffraction. Researchers found that the device could accurately identify handwritten numbers and items of clothing — both of which are commonly used in artificial intelligence studies.

It’s a giga-scale game of Plinko.
 
I think that’s what MS is doing with XSX. No need for clamshell though. 16Gb chips are readily available.
I think so too. Any reason why they would be doing this though? Is it strictly a function of cost? Would it still be a unified system? Or is there any advantage to separating the OS memory space or CPU from the GPU if that is what they are doing?
 
This is most probably the 3D audio hardware of the PS5:

13ea Ariel HD Audio Controller
13eb Ariel HD Audio Coprocessor

For reference, Arden has no audio coprocessor; maybe they are using some kind of AMD TrueAudio integrated into the GPU:

1637 Renoir HD Audio Controller

It was under our noses since the beginning:
https://pci-ids.ucw.cz/read/PC/1022
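If anyone wants to dig through that database locally, here's a small sketch that pulls those device IDs out of a pci.ids file (the path is just whatever your distro ships with pciutils; the IDs are the ones quoted above):

# Grep the flat pci.ids database (as served by pci-ids.ucw.cz / shipped with pciutils)
# for the device IDs mentioned above under vendor 1022 (AMD).
WANTED = {"13ea", "13eb", "1637"}

def find_devices(path="/usr/share/misc/pci.ids", vendor="1022"):
    hits, in_vendor = [], False
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            if not line.startswith("\t"):                      # vendor line (column 0)
                in_vendor = line.split()[0].lower() == vendor
            elif in_vendor and not line.startswith("\t\t"):    # device line under that vendor
                dev_id, _, name = line.strip().partition("  ")
                if dev_id.lower() in WANTED:
                    hits.append((dev_id, name))
    return hits

for dev_id, name in find_devices():
    print(dev_id, name)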
 
I think so too. Any reason why they would be doing this though? Is it strictly a function of cost? Would it still be a unified system? Or is there any advantage to separating the OS memory space or CPU from the GPU if that is what they are doing?
Cost is the only reason I can think of. It would still be unified because it’s a contiguous physical address space (and virtual, for that matter). You just have some memory accesses that won’t get the benefit of parallel access across all modules, lowering your theoretical peak data throughput.
 
Yeah, if they have the 320-bit bus necessary to feed the 12 TF GPU, they end up with either 10GB (not enough) or 20GB (too expensive), so the cost compromise is a mix, where the upper address space above 10GB would be slower but used just for the OS anyway. If the OS does almost nothing during gameplay it has no negative impact, but it still leaves only 10GB of the address space at full speed.

For a 256-bit bus with 16GB, the entire space is at full speed, but if there's 4GB reserved for the OS it's just 12GB for games anyway.

So it's pretty much four quarters for a dollar.
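Rough arithmetic for those two options (14 Gbps per pin and the exact chip mix are assumptions for illustration, not confirmed specs):

# Illustrative bandwidth math for the bus-width / capacity trade-off discussed above.
# 14 Gbps per pin and the specific chip mixes are assumptions, not confirmed specs.
GBPS_PER_PIN = 14
BITS_PER_CHIP = 32          # one GDDR6 package = 32 data pins (2 x 16-bit channels)

def bandwidth_gbs(active_chips):
    return active_chips * BITS_PER_CHIP * GBPS_PER_PIN / 8

# 320-bit bus, mix of 4 x 1GB + 6 x 2GB chips = 16GB total:
#   the first 10GB stripes across all 10 chips, the last 6GB only across the six 2GB chips.
print("320-bit, fast 10GB region:", bandwidth_gbs(10), "GB/s")
print("320-bit, slow  6GB region:", bandwidth_gbs(6), "GB/s")

# 256-bit bus, 8 x 2GB chips = 16GB, uniform speed everywhere:
print("256-bit, all 16GB        :", bandwidth_gbs(8), "GB/s")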

I wish the memory manufacturers made the 1.5GB chips that are in the JEDEC spec; that would have been ideal here.
 
15GB would have indeed been ideal. You’d think the lifetime of a console would be enough to justify it. It’s hundreds of millions of chips.
 
Don't suppose they'll be able to optimize the OS/dashboard use down to just 2GB of RAM? Maybe there are too many buffers still in use for video and image sharing or streaming. Or would they use any freed-up memory space, and maybe more, for reading from the host-controlled SSDs? Or does it read directly from SSD into the program address space?
 
Cost is the only reason I can think of. It would still be unified because it’s a contiguous physical address space (and virtual, for that matter). You just have some memory accesses they won’t get the benefit of parallel access across all modules, lowering your theoretical peak data throughput.
Sounds complicated. Will that be another headache for developers?
 