Next Generation Hardware Speculation with a Technical Spin [pre E3 2019]

Status
Not open for further replies.
But Cerny didn't say PS5 has an SSD, so it is not at odds with anything. To assume things that were not explicitly said or tested can make an ass out of you.
 
Is foveated rendering something that can be used on a TV, or is it a VR-only thing? Not by eye tracking, but by detecting and reducing resolution in, say, darker areas of a scene? Or is variable rate shading a better (or the only) option here? I guess we don't know if Navi will support VRS.

I think foveated rendering uses variable rate shading. This is also something the Switch already makes use of, IIRC. I think I read somewhere that Navi may support it, as AMD filed a patent on it.
 
I already quoted that part. And once again... Those are not Cerny's words. That is only showing fast-travel once the game was loaded and not new game boot-up.
If you don't believe the reporter said PS5 devkit has SSD, how do you believe other info like 7nm navi or 8-core CPU which are said by the reporter?
 
Let's try to put some numbers around things to see what's reasonable and what isn't, to baseline expectations ...

What's Known
Time used: 0.8 seconds

What's Unknown
Amount of data read
Read speed of storage

What are reasonable ranges?
Amount of data read is less than console memory, but the max memory size for PS5 is unknown.
Can we assume data read was less than PS4 game memory of 5GB?

PCIExpress 4.0, 1x Lane bandwidth 1 * 1.969 GB/s * 0.8 seconds = 1.5752 GB
PCIExpress 4.0, 2x Lane bandwidth 2 * 1.969 GB/s * 0.8 seconds = 3.1504 GB
PCIExpress 4.0, 4x Lane bandwidth 4 * 1.969 GB/s * 0.8 seconds = 6.3008 GB
PCIExpress 4.0, 8x Lane bandwidth 8 * 1.969 GB/s * 0.8 seconds = 12.6016 GB

Samsung 970 Evo Plus NVME, sequential read 3500 MB/s * 0.8 seconds = 2800 MB
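
The upper bounds above are just link bandwidth multiplied by elapsed time. A minimal Python sketch of the same arithmetic, using only the figures quoted in this post; the function name `max_data_read_gb` is made up for illustration, and real transfers would land below these ceilings:

```python
# Upper bound on data read = link bandwidth * elapsed time.
# All figures are the ones quoted in the post; names are illustrative only.

PCIE4_LANE_GB_S = 1.969      # GB/s per PCIe 4.0 lane, as quoted
LOAD_TIME_S = 0.8            # fast-travel time reported for the devkit demo
NVME_SEQ_READ_GB_S = 3.5     # Samsung 970 Evo Plus sequential read, as quoted


def max_data_read_gb(bandwidth_gb_s: float, seconds: float) -> float:
    """Return the most data (in GB) that could cross a link in `seconds`."""
    return bandwidth_gb_s * seconds


for lanes in (1, 2, 4, 8):
    gb = max_data_read_gb(lanes * PCIE4_LANE_GB_S, LOAD_TIME_S)
    print(f"PCIe 4.0 x{lanes}: {gb:.4f} GB")

nvme_gb = max_data_read_gb(NVME_SEQ_READ_GB_S, LOAD_TIME_S)
print(f"970 Evo Plus: {nvme_gb:.1f} GB")
```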


If we assume memory read is less than PS4 game memory, then it is possible that all 5GB of the new fast-travel destination was already loaded into PS5 game memory, which should definitely be more than 12GB. Now I'm not saying that is the case, but it is possible.

To be clear, I truly want nextgen consoles to have amazing storage subsystems and memory subsystems. I'm just playing devil's advocate and assuming the worst to not be disappointed.
 
If you don't believe the reporter said PS5 devkit has SSD, how do you believe other info like 7nm navi or 8-core CPU which are said by the reporter?
They were explicitly stated that the PS5 had these. Nowhere does it say, "PS5 has an SSD." Everyone thinks it does when they read it, but when you really look at it, it doesn't and only implies heavily. But what's shown could be achieved by a cache. Might be a 64 GB flash cache between HDD and RAM. Might be a 1 TB SSD. Might be a 256 GB SSD and a large HDD. We don't know.

Cerny said he had an SSD in his laptop and Excel takes 15 seconds to load. Then we have a line saying, "What’s built into Sony’s next-gen console is something a little more specialized." So it's more specialised than an SSD. Does that make it a different type of SSD, or a different storage solution?

There's no attempt to be deceitful on the part of the article, and all the info in it is accurate, but the info is buried under subjective prose. That's the nature of non-technical writing, and because of that, it's not possible to be sure of the specifics. We cannot say categorically that PS5 has any form of hardware raytracing acceleration even though it was hinted at because it wasn't explicitly stated (unlike the CPU being Zen 2), and we cannot say categorically that PS5 has an SSD even though it was hinted at.

There's a qualitative difference between the types of info present. Us 'doubters' are just pointing that out, not to blow cold on anything, but to keep the technical, fact-based discussion technical. If it's later shown PS5 has HWRT and a special SSD, great - changes nothing. But if we limit discussion now to a theory that assumes these, and it later proves they're not in, then discussion now is focussed in the wrong direction. We can confidently accept it's x64 and not ARM, at least. :yep2:
 
A caching solution would be fine.

It'd work just as well with external HDD storage (like an enormous number of people already use and will want to continue using), and when GB/$$ is in favour of SSD you can just go completely with SSD inside the console.

It's a solution that would work in the present and be ready for the future.
 
I should have included numbers from the PS4 using a mechanical drive; that would have given a more limiting range for how much was loaded. I don't recall the bus speed limit of the PS4 I/O subsystem, but it seemed to be around the physical limits of mechanical hard drives, so let's use some ranges.

What is Known
Time taken for PS4 Fast Travel: 15 seconds

What are Reasonable Ranges?
Reasonable speed of mechanical drive from 2013: 100 MB/s
Upper end of Mechanical drives from 2013: 150 MB/s
Extreme end of Mechanical drives even now: 200 MB/s

100 MB/s * 15 seconds = 1500 MB
150 MB/s * 15 seconds = 2250 MB
200 MB/s * 15 seconds = 3000 MB

Now the speed of an NVMe drive:
Samsung 970 Evo Plus NVME, sequential read 3500 MB/s * 0.8 seconds = 2800 MB

The 2250 MB read is right around the amount an NVMe drive could read in 0.8 seconds, assuming sequential read burst speed, since that's their fastest.

I think that would reasonably fit into ram cache on even the current gen One X, and more than reasonable to fit into ram cache on nextgen PS5 or Xbox Next.
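
The comparison can also be turned around: given what a mechanical drive could read in 15 seconds, how fast would storage need to be to read the same amount in 0.8 seconds? A rough Python sketch using the estimates above (the drive speeds are guesses from this post, not measurements, and the function name is made up for illustration):

```python
# How fast must storage be to read, in 0.8 s, what an HDD reads in 15 s?
# HDD speeds and both timings are the post's estimates, not measurements.

PS4_LOAD_S = 15.0
PS5_LOAD_S = 0.8
NVME_SEQ_READ_MB_S = 3500  # Samsung 970 Evo Plus sequential read, as quoted

HDD_SPEEDS_MB_S = {"typical 2013": 100, "upper 2013": 150, "extreme": 200}


def required_speed_mb_s(hdd_mb_s, hdd_seconds, target_seconds):
    """Return (data read in MB, MB/s needed to read it in target_seconds)."""
    data_mb = hdd_mb_s * hdd_seconds
    return data_mb, data_mb / target_seconds


for label, speed in HDD_SPEEDS_MB_S.items():
    data_mb, needed = required_speed_mb_s(speed, PS4_LOAD_S, PS5_LOAD_S)
    verdict = "within" if needed <= NVME_SEQ_READ_MB_S else "above"
    print(f"{label}: {data_mb:.0f} MB -> {needed:.1f} MB/s "
          f"({verdict} the quoted NVMe sequential speed)")
```

Under these assumptions only the 200 MB/s case asks for more than the quoted 3500 MB/s, and all three ignore any time not spent reading.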
 
For these calculations Brit, doesn't the CPU have a big factor to play here as well?

This is the problem with assuming things; only Sony knows. If the data is packed in a way not directly supported by any of the PS4 hardware, then the CPU is unpacking data before it is usable by other CPU tasks (world generation) or the GPU for rendering. If that's not the case, then it's largely an I/O issue and the CPU is irrelevant.

As has been demonstrated through PS4's eerily fast install process, Sony have some great tech for making games quickly playable, but to what extent this tech is leveraged in fully installed games themselves is not known.
 
Oh it absolutely does, but for analyzing absolute upper bounds to range out how much data could possibly be read we can ignore CPU. Naturally the actual amount of data read would be below these numbers because not 100% of the 15 seconds or 0.8 seconds was spent reading.

But overall, it looks like a maximum of 2250 MB to 3000 MB was read during this test which is around the current sequential read limits of NVME devices and under the theoretical max bus speed of 4 lane PCIExpress 4.0 subsystem.

Not a bad bounding box given the extreme lack of details actually presented.
 
What is the minimum quantity of flash that could deliver > current consumer SSD bandwidth based on the number of chips used?
 
I try not to crosspost between forums, but I felt this was important to share. This is regarding the reddit rumor saying PS5 will have HBM2.

Another interesting point to make regarding this rumor is that TSMC actually consulted with Sony (among others) during the development of InFO_MS, which would make Sony familiar with it and probably in the back of their mind when developing consoles. I'd also like to point out that Sony is not a stranger to being inventive with memory. The Vita had a special packaging method in use to get very high bandwidth to the Vita's GPU.


Also, let's assume for a second there was a typo in the original rumor, because it doesn't quite make sense as it is. It says:


allowed them to go below ~50 GFLOPs per GB/sec. bandwidth but still keep above 40 GFLOPs per GB/sec.


Which doesn't make sense. Typically when we talk about GPUs, we talk about GB/s per GF/TF, not the other way around. However, 40 GB/s per GF would imply a stupidly low compute figure, so let's now assume they meant TF. We need a second part of the comment for further information:


InFO_MS allows them to drive their 1.6 Gbps chips @ 1.7 Gbps (435 GB/sec.) without having to increase the voltage above 1.2v


1.6 Gbps per pin would result in 409.6 GB/s in a 2-stack config. 1.7 Gbps bumps it to 435.2 GB/s, which is competitive with 256-bit GDDR6 solutions. If we're now assuming the 40-50 GB/s per TF is valid, this gives us a range of 8.7 TF to 10.88 TF for PS5.
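
For anyone who wants to check the arithmetic, here is a small Python sketch. It assumes the standard 1024-bit interface per HBM2 stack and reads the rumor's ratio as GB/s per TFLOP, which is this post's interpretation rather than anything confirmed; the function names are made up for illustration.

```python
# HBM2 bandwidth from per-pin data rate, then the TFLOPs range implied by
# reading the rumor's ratio as "GB/s per TFLOP" (an interpretation only).

HBM2_BUS_BITS_PER_STACK = 1024  # standard HBM2 interface width per stack


def hbm2_bandwidth_gb_s(pin_rate_gbps: float, stacks: int = 2) -> float:
    """Peak bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return pin_rate_gbps * HBM2_BUS_BITS_PER_STACK * stacks / 8


def tflops_range(bandwidth_gb_s: float, low_ratio: float = 40.0,
                 high_ratio: float = 50.0) -> tuple:
    """Return (min, max) TFLOPs if the GPU gets low..high GB/s per TFLOP."""
    return bandwidth_gb_s / high_ratio, bandwidth_gb_s / low_ratio


for rate in (1.6, 1.7):
    print(f"{rate} Gbps/pin, 2 stacks: {hbm2_bandwidth_gb_s(rate):.1f} GB/s")

low_tf, high_tf = tflops_range(hbm2_bandwidth_gb_s(1.7))
print(f"Implied range: {low_tf:.2f} to {high_tf:.2f} TFLOPs")
```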


Finally, regarding my skepticism around HBM2 supply, some key things have happened since Samsung's comments about low capacity. SK Hynix and Micron have both entered the market in full force (after the latter abandoned HMC development). And the crypto market crashed. With DRAM and NAND markets easing up, that capacity has to shift somewhere.


Regarding HBM pricing, it's hard to know how much it has eased over the past few years (we do have some rumored Radeon VII costs for reference), but I think it may be possible to get it down to less than 50% more than GDDR6 per GB, perhaps even just 35% to 40% higher. When you consider that they just need 8GB instead of 16GB of GDDR6, their solution is extremely cost competitive. At that point, it becomes a lot more attractive. HBM is also done on contract pricing (i.e. not floating with market costs), so a big order from Sony locks that factor in and sets up a mutually beneficial relationship with that partner to help them build up their own capacity.


The only rumor around this giving me pause is the digitimes rumor that stated ASE will do the packaging. Other than that, a lot of this rumor makes sense the more I dig into it.


Also, I imagine if that PS4 rev mentioned is coming, it's definitely this Fall. Since it's a console rev, it may not get cracked open to confirm the 7nm EUV from Samsung part, but the timing makes so much sense with MS pushing costs down with the SAD model and the rumored E-revision of the device internals.


Finally, here's the rumor in its entirety for posterity:


PS4 refresh

  • sometime between september and november
  • 199
  • fabbed on samsung 7nm EUV
  • best wafer pricing in the industry
  • die size 110mm²
  • no PRO refresh, financially not viable yet
  • too close to PS5 as well
PS5 memory and storage systems

  • 24 GB RAM in total (20 GB usable by games)
  • 8 GB in form of 2 * 4-Hi stacks HBM2
  • Sony got "amazing" deal for HBM
  • in part due to them buying up bad chips from other customers which can't run higher then 1.6 Gbps while keeping 1.2v.
  • HBM is expected to scale down in price a lot more than GDDR6 over the console lifetime
  • Samsung, Micron and SK Hynix already shifting part of their capacity towards HBM due to falling NAND prices
  • Sony will be one of the first high volume customers of TSMCs InFO_MS when mass production starts later this year (normal InFo already used by Apple in their iPhone)
  • InFO_MS brings down the cost compared to traditional silicon interposers - has thermal and performance advantage as well
  • InFO_MS allows them to drive their 1.6 Gbps chips @ 1.7 Gbps (435 GB/sec.) without having to increase the voltage above 1.2v
  • HBM is more power efficient compared to GDDR6 - the savings were invested into more GPU power
  • additional 16 GB in form of DDR4 @ 256 bit for 102.4 GB/sec.
  • 4 GB reserved for OS, the remaining 12 GB usable by games
  • memory automatically managed by HBCC and appears as 20 GB to the developers
  • HBCC manages streaming of game data from storage as well
  • developers can use the API to take control if they choose and manage the memory and storage streaming themselves
  • memory solution alleviates problems found in PS4
  • namely that CPU bandwidth reduces GPU bandwidth disproportionately
  • 2 stacks of HBM have 512 banks (more banks = fewer conflicts and higher utilization)
  • GDDR6 better than GDDR5 and GDDR5x in that regard but still less banks than HBM
  • at the same time trying to keep CPU memory access to slower DDR4
  • very satisfied with decision to use two kinds of memory for price to performance reasons
  • allowed them to go below ~50 GFLOPs per GB/sec. bandwidth but still keep above 40 GFLOPs per GB/sec.
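
The rumor's memory figures can at least be checked for internal consistency. A short Python sketch, using only numbers from the list above; the helper name `implied_transfer_rate_mts` is made up for illustration, and passing these checks says nothing about whether the rumor is true:

```python
# Internal consistency check of the rumored memory configuration.
# Every input is taken from the rumor text; nothing here is confirmed.

HBM_GB = 8
DDR4_GB = 16
OS_RESERVED_GB = 4
DDR4_BUS_BITS = 256
DDR4_BANDWIDTH_GB_S = 102.4


def implied_transfer_rate_mts(bandwidth_gb_s: float, bus_bits: int) -> float:
    """Transfer rate (MT/s) needed to reach a bandwidth on a given bus."""
    bytes_per_transfer = bus_bits / 8
    return bandwidth_gb_s * 1000 / bytes_per_transfer


total_gb = HBM_GB + DDR4_GB
game_gb = total_gb - OS_RESERVED_GB
ddr4_rate = implied_transfer_rate_mts(DDR4_BANDWIDTH_GB_S, DDR4_BUS_BITS)

print(f"Total RAM: {total_gb} GB, usable by games: {game_gb} GB")
print(f"DDR4 rate implied by 102.4 GB/s on 256 bits: {ddr4_rate:.0f} MT/s")
```

The totals match the 24 GB / 20 GB bullet, and the DDR4 bandwidth works out to a 3200 MT/s transfer rate on a 256-bit bus.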
 
We are not moving away from 8 cores on the PC yet. Intel has barely started to push those on the mainstream, AMD still mostly wants you to buy 4-6c below $300, and PC is still mostly 4c-6c dominated.

[...]

Why do you come to that conclusion? From what I see, they have pushed towards more cores since Bulldozer, or at least since Zen. With Ryzen they were the first to introduce 6 and 8 cores to a mainstream socket, while Intel kept the mainstream at 4 cores since the Core 2 Quad (and only increased the count after AMD returned to relevance). Since AMD's multithreaded performance is better than Intel's while its single-threaded performance is slightly worse (at the same clock, not to mention that Intel can also clock higher), they profit more from a push towards more cores than Intel does.

Another Digital Foundry video discussing the PS5 was published now

What's up with their memory allocation slide at 0:31? Is the 18 GB they show a typo? 16 GB would make more sense unless they know something (like 24 GB in total but 6 GB reserved for OS).


Thanks for bringing this up. I hope it's okay to piggyback on this and to point to the post from @DmitryKo again, since it was not only buried under comments when I posted it on reddit, but thanks to my luck it was buried here as well. :D

I disagree. Vega includes a lot of useful new features which Navi would definitely use and expand on. For a start, HBCC virtual memory paging capabilities look especially promising in a game console.

Consider an IO die with a HBCC derived memory controller, and HBM3 die on the package in a high-end SKU. This configuration could give you:
* 4-8 GB of local HBM3 memory - 512 GByte/s;
* 8-16 GB of DDR5 system memory - 30-50 GByte/s;
* 30-60 GB of NVRAM scratchpad memory - 3-5 GByte/s.
All this memory would be connected directly to the crossbar memory/cache controller and mapped into virtual address space, with the ability to detect and unload idle pages from local memory to another partition.

That would be a cross between complicated high-speed memory subsystems of PS3 and Xbox One, but without the burden of manual memory management. Just load your assets all at once, and the OS will move them between memory partitions as necessary.
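
To make the idea of automatic paging between partitions a bit more concrete, here is a toy Python model. It is not how HBCC works internally: the least-recently-used demotion policy, the class name `TieredMemory`, and the tiny page counts are all assumptions chosen only to illustrate idle pages sliding down to slower tiers.

```python
from collections import OrderedDict

# Toy model of tiered memory: touching a page moves it to the fastest tier,
# and the least recently used page in a full tier is demoted one tier down.
# The LRU policy and the capacities are assumptions, not HBCC behaviour.


class TieredMemory:
    def __init__(self, capacities):
        # capacities: list of (tier_name, max_pages), fastest tier first
        self.names = [name for name, _ in capacities]
        self.limits = [limit for _, limit in capacities]
        self.tiers = [OrderedDict() for _ in capacities]

    def touch(self, page):
        """Access a page: remove it from wherever it is, put it in tier 0."""
        for tier in self.tiers:
            tier.pop(page, None)
        self._place(0, page)

    def _place(self, level, page):
        tier = self.tiers[level]
        tier[page] = True  # newest entries sit at the end
        if len(tier) > self.limits[level]:
            victim, _ = tier.popitem(last=False)  # least recently used
            if level + 1 < len(self.tiers):
                self._place(level + 1, victim)
            # otherwise the page leaves memory and must be re-read later

    def location(self, page):
        for name, tier in zip(self.names, self.tiers):
            if page in tier:
                return name
        return None


mem = TieredMemory([("HBM", 2), ("DDR", 2), ("NVRAM", 2)])
for page in ("a", "b", "c"):
    mem.touch(page)
print({p: mem.location(p) for p in ("a", "b", "c")})
```

With two pages per tier, touching a third page pushes the oldest one out of the fast tier without the caller doing any manual management, which is the behaviour the quoted post is hoping for.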

So my question is: if we replace HBM3 with HBM2 and DDR5 with DDR4 in @DmitryKo's example, does the inclusion of such fast NAND storage (probably ~3 GB/s or faster) give the rumor more credibility?


Another question: would such a setup affect GPU decompression which was mentioned multiple times over the last months?
 
1.6 Gbps per pin would result in 409.6 GB/s in a 2-stack config. 1.7 Gbps bumps it to 435.2 GB/s, which is competitive with 256-bit GDDR6 solutions. If we're now assuming the 40-50 GB/s per TF is valid, this gives us a range of 8.7 TF to 10.88 TF for PS5.
435.2 GB/s x 40 GFLOPs per GB/sec = 17.41 Tflops
435.2 GB/s x 50 GFLOPs per GB/sec = 21.76 Tflops

He may mean that PS5 RAM bandwidth can meet the requirement of a 17.41~21.76 Tflops GPU.

It's not that surprising since RTX 2080 only has 448GB/s bandwidth.
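
The two readings of the rumor's ratio give very different results, so it may help to see them side by side. A quick Python sketch, assuming the 435.2 GB/s figure from the posts above; both function names are made up for illustration, and neither reading is confirmed:

```python
# Two ways to read "40 to 50" in the rumor, applied to 435.2 GB/s.
# Reading A: GFLOPs per GB/s (as literally written in the rumor).
# Reading B: GB/s per TFLOP (the interpretation in the quoted post).

BANDWIDTH_GB_S = 435.2


def tflops_from_gflops_per_gb_s(bandwidth_gb_s: float, ratio: float) -> float:
    """Reading A: compute = bandwidth * ratio, converted GFLOPs -> TFLOPs."""
    return bandwidth_gb_s * ratio / 1000


def tflops_from_gb_s_per_tflop(bandwidth_gb_s: float, ratio: float) -> float:
    """Reading B: compute = bandwidth / ratio, already in TFLOPs."""
    return bandwidth_gb_s / ratio


for ratio in (40, 50):
    a = tflops_from_gflops_per_gb_s(BANDWIDTH_GB_S, ratio)
    b = tflops_from_gb_s_per_tflop(BANDWIDTH_GB_S, ratio)
    print(f"ratio {ratio}: reading A = {a:.2f} TF, reading B = {b:.2f} TF")
```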
 