Next Generation Hardware Speculation with a Technical Spin [pre E3 2019]

Status: Not open for further replies.
256Gbit TLC chips are now $2 apiece (for entire wafers). Adding a terabyte of NAND storage is now completely economically viable. All the major flash producers have bigger chips in the pipeline, which will put further downward pressure on cost.

I don't think this core flash storage will be replaceable. Sony hinted that the performance of their storage solution will "exceed all NVMe solutions"; they might have integrated the storage controller into the main SoC, circumventing the four-lane PCIe bottleneck. They could operate 8 NAND channels at ONFI 4 speeds (800 MT/s), exceeding 6 GB/s.
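A quick sanity check on that 8-channel figure (a sketch assuming 8-bit-wide channels and one byte per transfer; sustained throughput would land lower after ECC and protocol overhead):

```python
# Peak raw NAND bandwidth for the speculated controller configuration.
# Assumes 8-bit ONFI 4 channels (one byte per transfer); ECC and protocol
# overhead would reduce sustained throughput below this.
channels = 8
transfer_rate_mt_s = 800      # ONFI 4.0 NV-DDR2, megatransfers per second
bytes_per_transfer = 1        # 8-bit channel width

peak_gb_s = channels * transfer_rate_mt_s * bytes_per_transfer / 1000
print(f"peak raw bandwidth: {peak_gb_s:.1f} GB/s")  # 6.4 GB/s
```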

Storage expansion would just be handled by USB 3.0/3.1.

Cheers
 
The Raven Ridge slides make it clear that the coherent part of the uncore (of course, in Raven Ridge all access is coherent) gets memory clock × 32 bytes of bandwidth.

[Image: ryzendataflow.jpg - Ryzen data flow diagram]
Well, the latest Navi rumour puts the GDDR6 speed at 12.8 Gbps, and another rumour has the PS5 GPU running at 1.8 GHz. Of course this is not official data and we don't know the true speed of the PS5 GPU, but I noticed something interesting. If GDDR6 is used, then everything except the GPU would use only 1/8 of the total bandwidth, and the GPU 7/8. Using the same bandwidth-to-fillrate metric as the PS4 and PS4 Pro, we get 115.2 Gpixels/s and a bandwidth of 460.8 GB/s (4 bytes per pixel).

Well, the only configuration where this condition is met is:

  • 12.8 Gbps GDDR6
  • 320-bit bus
  • 512 GB/s total bandwidth
  • 51.2 GB/s for the coherent access
  • 460.8 GB/s for the non-coherent access (GPU and some accelerators)
  • 20 GB RAM
I know this is a bizarre configuration, but it is not impossible. I believe we will see a split of 16 GB RAM for the gaming environment and 4 GB for the OS environment, instead of the current 5 GB + 3 GB. The 16 GB figure is interesting because you could refill the entire RAM in about one second using two M.2 modules, each connected to its own PCIe 4.0 x4 link (8 GB/s each) for a total of 16 GB/s. The console would come with both M.2 modules, with the option to add one or replace both. The operating system and firmware would have their own NVM soldered on the board, with private access.
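For what it's worth, the figures in that list hang together arithmetically. A sketch using the rumoured numbers plus the Raven Ridge coherent-bandwidth rule quoted earlier (the 1600 MHz memory clock is my own inference from the 12.8 Gbps pin rate, not a stated figure):

```python
# Sanity check of the speculated 320-bit GDDR6 configuration.
# All inputs are rumoured figures, not official specs.
pin_speed_mbps = 12_800        # GDDR6 data rate per pin (12.8 Gbps)
bus_width_bits = 320

total_mb_s = pin_speed_mbps * bus_width_bits // 8       # MB/s across the bus
# Coherent clients get memory_clock * 32 bytes (per the Raven Ridge slides);
# a 12.8 Gbps pin rate is assumed to imply a 1600 MHz memory clock here.
mem_clock_mhz = pin_speed_mbps // 8
coherent_mb_s = mem_clock_mhz * 32                      # MB/s for CPU et al.
noncoherent_mb_s = total_mb_s - coherent_mb_s           # MB/s left for the GPU

print(total_mb_s / 1000, coherent_mb_s / 1000, noncoherent_mb_s / 1000)
# -> 512.0 51.2 460.8
```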
 
I really hope that having Cerny on the team means Sony will hit the sweet spot again. But isn't 1.8 GHz too high? Digital Foundry, if I recall, estimated 1.4 GHz as the achievable frequency, considering it is a console that needs to be cooled and priced attractively.
But maybe they will surprise us just like last gen. It is more challenging this time, though, because they have to consider both memory and GPU to surpass expectations of what can be achieved.

I suspect that MS this time is planning either to take a small financial hit for a noticeable performance boost as standard, or to squeeze Sony in the middle by simultaneously offering both a cheaper model and a more powerful, more expensive one, removing Sony's ability to market their console as either the cheaper or the more powerful option.
I suspect MS will hit 1.8 GHz and above 20 GB of memory.
 
Clock speeds depend not only on the process, but also on the microarchitecture, physical layout, power delivery, and thermal handling.

We know it will be a 7 nm design, but we know nothing about the rest of the design. I would guess 1.8 GHz would be too high for a Vega/Polaris design; for Navi, no idea.
 
Clock speeds depend not only on the process, but also on the microarchitecture, physical layout, power delivery, and thermal handling.

We know it will be a 7 nm design, but we know nothing about the rest of the design. I would guess 1.8 GHz would be too high for a Vega/Polaris design; for Navi, no idea.

I have a question about that: how much can MS and Sony customize the design? Specifically, I mean the following: AMD seems to use more automated layout (place-and-route) compared to Nvidia, which may use a more hand-crafted layout - at least that was brought up on multiple occasions in the Vega architecture thread as a possible difference between Nvidia and AMD cards.

So, assuming AMD's Navi design intended for discrete GPUs uses mostly automated layout, could MS and Sony take that design and redo it with hand-drawn optimizations? And if so, is something like that worth spending money on, given that automated layout probably does a decent job and that AMD has most likely already done some hand optimization itself?
 
Isn't QLC substantially slower than TLC? And because of the increased number of states to resolve per cell, it's a physical limitation - not a question of product segmentation or yield.

Yes. This is why you cannot make a small, fast pool out of it. The bandwidth per chip is lower, which is why you need >512 GB to come close to exceeding 4 GB/s.

I was under the impression that SLC has 100 times or more the write endurance of QLC, which would mean 64 GB of SLC would have several times more write endurance than an entire 1 TB QLC drive. Even a small 16 GB SLC cache would have more, though I don't know the prices. Something like the Samsung 970 Evo has an SLC cache for its TLC, but that might be more about write performance (less important for consoles) than write endurance. The case is less compelling for MLC, thinking about it.

You are right, I was a bit off on the SLC endurance. SLC currently costs ~$4/GB on the spot market and promises ~100,000 writes per cell. So assuming 500-write endurance for the 1 TB drive, you'd need ~5 GB at ~$20 (currently) to match it. So there is still a ~4x cost advantage for writes in SLC.
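Putting those figures together (spot prices as quoted above; they move constantly):

```python
# Worked version of the endurance comparison above.
slc_price_per_gb = 4.0          # $/GB, quoted spot price
slc_writes_per_cell = 100_000
qlc_drive_gb = 1000             # 1 TB drive
qlc_writes_per_cell = 500

# Total write volume the QLC drive can absorb, in GB written:
qlc_total_writes_gb = qlc_drive_gb * qlc_writes_per_cell       # 500,000 GB

# SLC capacity needed to absorb the same volume, and what it costs:
slc_gb_needed = qlc_total_writes_gb / slc_writes_per_cell      # 5 GB
slc_cost = slc_gb_needed * slc_price_per_gb                    # ~$20
print(slc_gb_needed, slc_cost)  # -> 5.0 20.0
```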

Could I ask about your figures for "daily suspend of 32GB" being just 0.000007%? I get different ones.

I should post less when tired. You are absolutely correct, a factor of 1000 snuck in there somehow.

It's really difficult to imagine a 2 TB NVMe drive anywhere near $40, considering even Apple paid that much for 128 GB of MMC in their iPhones last year. We're looking at 512 GB being quite possible after the ongoing price drops, 1 TB being optimistic and requiring a BOM compromise, and 2 TB requiring a dramatic move beyond any SSD supplier contract predictions.

Your pricing is way off. Even cheap eMMC costs more than cheap raw flash does, and 128 GB parts are boutique, costing more per GB than even cheap eMMC (on the spot market, 128 GB eMMC currently costs ~4 times as much per GB as raw flash). Since this time last year, flash chip prices have fallen by more than half, and right now analysts predict they will fall by more than half again in the next 12 months. That is not the extreme prediction; that's the middle-of-the-road one.

The annoying part about pricing right now is that QLC does not have a readily available market price. It's going to be even cheaper, and whether 2 TB is possible without major BOM compromises depends on just how low QLC falls once there are multiple competitors.

I don't think this core flash storage will be replaceable. Sony hinted that the performance of their storage solution will "exceed all NVMe solutions"; they might have integrated the storage controller into the main SoC, circumventing the four-lane PCIe bottleneck. They could operate 8 NAND channels at ONFI 4 speeds (800 MT/s), exceeding 6 GB/s.

AMD will very soon ship PCIe 4.0. Four lanes of that gives you ~8 GB/s. PCIe controllers are cheap, and NVMe is simply too good a solution, technically and operationally, to pass up. Having the only meaningfully wearable part be user-replaceable, and a commodity you can buy anywhere, just makes sense.

I think they can "beat anything available on the PC" while using the same interface. Firstly, because right now there are no PCIe 4.0 NVMe drives, so it's strictly truth in advertising (even though such drives will exist when the PS5 launches); and secondly, because they can plan the entire storage subsystem of their hardware and OS around being able to stream into memory at above 4 GB/s, which is something Windows probably won't do nearly as well even years after the PS5 launches. Inefficiencies anywhere get amplified when the raw bandwidth goes up that much.
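The ~8 GB/s figure for four PCIe 4.0 lanes follows from the raw signalling rate minus 128b/130b line-encoding overhead (packet/TLP overhead on top of that is ignored here):

```python
# Effective PCIe 4.0 x4 bandwidth, per direction, before TLP overhead.
lanes = 4
gt_per_s = 16.0            # PCIe 4.0 raw rate per lane
encoding = 128 / 130       # 128b/130b line encoding

gb_per_s = lanes * gt_per_s * encoding / 8
print(f"{gb_per_s:.2f} GB/s")  # -> 7.88 GB/s
```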

Dah-yum. Intel 660p QLC SSD. Nice and affordable and all, but the 1TB version has a total stated lifetime of 200 TB of writes.

Note that Intel is sandbagging a little there. The chips in that drive have a stated endurance of 500 writes, but Intel has a reputation to maintain for their consumer drives being more reliable than advertised, so they promised a lower number.

That, or they're afraid wear leveling will burn more than half of their write allotment...
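To put numbers on the sandbagging: assuming the 500-writes-per-cell rating mentioned above, the raw media could absorb well over the rated 200 TBW; that margin is exactly what write amplification from wear leveling would eat into.

```python
# Headroom between the 660p 1 TB drive's rated endurance and the raw
# endurance of its media (figures from the discussion above).
drive_tb = 1
cell_writes = 500            # stated chip endurance, writes per cell
rated_tbw = 200              # Intel's stated drive lifetime, TB written

raw_capacity_tb = drive_tb * cell_writes    # 500 TB of raw write capacity
headroom = raw_capacity_tb / rated_tbw      # 2.5x margin before wear-out
print(headroom)  # -> 2.5
```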
 
AMD will very soon ship PCIe 4.0. Four lanes of that gives you ~8 GB/s. PCIe controllers are cheap, and NVMe is simply too good a solution, technically and operationally, to pass up. Having the only meaningfully wearable part be user-replaceable, and a commodity you can buy anywhere, just makes sense.
Could I get your insight on these results:
Motherboard: Asus Z370-G Gaming
CPU: i7 8700K
GPU: MSI Trio X GTX 1080 Ti
RAM: 32GB DDR4 3000MHz
SSD: Crucial MX300
NVME: Samsung 960 Evo
PSU: EVGA 750W Power Supply (Gold)
Cooler: Be Quiet Silent Loop 360
Monitor: Acer Predator XB271HU (165Hz - 1440p)
Monitor 2: Acer Predator XB271HK (60 Hz - 4K)


There is such a massive gulf in bandwidth between NVMe and SATA SSDs, but loading times are coming out the same for the most part.

What are the reasons, in your opinion, that could be causing this?
Where is the bottleneck holding NVMe back? Is it in the game code, in the drivers (the OS side of things), in transferring data to the GPU, in having to go through system memory first, etc.?

I suspect the reasons may provide a lot of insight on what Sony is doing/trying to resolve here.
 
Could I get your insight on these results:

There is such a massive gulf in bandwidth between NVMe and SATA SSDs, but loading times are coming out the same for the most part.

What are the reasons, in your opinion, that could be causing this?
Where is the bottleneck holding NVMe back? Is it in the game code, in the drivers (the OS side of things), in transferring data to the GPU, in having to go through system memory first, etc.?

I suspect the reasons may provide a lot of insight on what Sony is doing/trying to resolve here.
Decompression and load times also depend heavily on the CPU, so that could be the primary bottleneck in like-for-like comparisons. Just a hunch.
 
ToTTenTranz posted about that a couple of days ago in the Next Generation Hardware Speculation with a Technical Spin [2019] thread.

Back then I thought that if I ran my games from the RAM drive, I'd get instantaneous loading times. In practice, I didn't. At least not with the games I tried back then, which were AFAIR Witcher 3, Dragon Age Inquisition and Dying Light, among others.

Turns out most games, when paired with very fast storage, seem to be bottlenecked by the CPU instead. Even worse, some games like Dying Light seemed to use a hopeless single CPU thread to decompress/decrypt the data into RAM, so when playing multiplayer I'd often wait longer than my friends who had faster CPUs and slower SATA SSDs.
 
Decompression and load times also depend heavily on the CPU, so that could be the primary bottleneck in like-for-like comparisons. Just a hunch.
Right, sorry, I just posted the specs above:
CPU: the i7 8700K is pretty beefy, but I guess not enough.

So what is happening here? We have all these cores to decompress data; is decompression not done in parallel? How could loading be done faster if the CPU is the bottleneck, and is this something a GPU could do much better than a CPU?
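On the "is decompression not done in parallel?" point: nothing prevents it in principle, but the archive has to be packaged as independently compressed chunks rather than one serial stream. A minimal sketch (hypothetical chunk layout, not any shipping engine's actual format):

```python
# Chunk-parallel decompression sketch. One serial compressed stream can
# only be inflated on one core; independently compressed chunks let every
# core help with loading.
import zlib
from concurrent.futures import ThreadPoolExecutor

def pack(chunks):
    # Compress each chunk on its own so they can be expanded in parallel.
    return [zlib.compress(c) for c in chunks]

def unpack_parallel(blobs, workers=4):
    # zlib releases the GIL while inflating, so even Python threads get
    # real parallelism on the decompression itself.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, blobs))

chunks = [bytes([i]) * 1_000_000 for i in range(8)]  # 8 fake asset chunks
assert unpack_parallel(pack(chunks)) == chunks
```

The trade-off is a slightly worse compression ratio, since each chunk carries its own context; the chunk size is chosen to balance ratio against core count.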
 
Some new info on HDD shipments: https://www.storagenewsletter.com/2...down-to-around-77-million-in-1q19-trendfocus/

Right, sorry, I just posted the specs above:
CPU: the i7 8700K is pretty beefy, but I guess not enough.

So what is happening here? We have all these cores to decompress data; is decompression not done in parallel? How could loading be done faster if the CPU is the bottleneck, and is this something a GPU could do much better than a CPU?

Since at least December 2018, the Xbox insider on Resetera has mentioned GPU decompression on multiple occasions with regard to better loading times. Apparently MS is already showcasing the enhanced loading times to their partners. But of course, for all we know the person could be a hired influencer.
 
It's not Cerny that's misleading. We're the ones trying to reverse engineer a non-technical editorial piece to figure out what the PS5 is. It's our own arguments being presented that I'm casting doubt on.

I'm just conservative, because the release spec is unlikely to match some of these 'guess specs' at a reasonable price point. It's a lesson well learned from this generation and the mid-gen refresh.

Your assumption is that Sony/Cerny allowed the (potential) misinformation out? I mean, you know Sony would have had to OK what Wired wrote... and that 0.8-second figure very likely came from Sony/Cerny, not from the interviewer holding a stopwatch. My guess is Cerny gave those specific examples for a reason.

Again, and this really shouldn't need repeating at this point: this wasn't a technical exposé. It was a first look. Absolutely nothing shown was false, but that doesn't mean the end result is going to perform the same, just like God knows how many demos and technical showcases!

It's just a simple statement of fact that, between a preview and the final product, things don't necessarily look the same without any deceit being involved (sometimes there is deceit, due to marketing, but sometimes there isn't).

Cerny is excited about his storage solution, and fast loading, and showcased a great example that, running on the devkit, shows how a good storage solution can be way faster. If PS5 ends up taking 2.2 seconds to load instead of 0.8, it wasn't a lie, because PS5 wasn't shown - only the devkit and the demo. And if PS5 takes 2.2 seconds to load instead of 0.8, that's still bloody brilliant as opposed to 15 seconds!

I don't know. I'm a big Sony fan, and if the loading time is over double what Cerny implied, I'd be upset at having been misled that badly. That's really my point. I remember the whole Xbox One X 6 TF debate back in the day, over whether the 6 TF referred to the machine as a whole or just the GPU; this is a similar debate. I'm not saying people should take it as gospel, but there seems to be too much negativity against the PS5: the 'lack of news from Sony' thread was a bit OTT, and now that we finally have something from Sony it's 'well, it might not be true'!? lol

No-one should be arguing that we take these figures as gospel, especially in a technical prediction thread. We should still be discussing possibilities and scenarios, like NAND costs. If you aren't happy with ideas being questioned, please just refrain from conversing and let those who do want to discuss the viability of different solutions and interpretations engage in their discourse without interruption.

You're right - as ever, I lack the technical knowledge and input you guys have; for that I apologise. I'll stay out of this from now on.
 
Your assumption is that Sony/Cerny allowed the (potential) misinformation out?
No, that's not what I'm assuming. They released information and that's what was presented. His story ends there.
What we do with the article is where I'm applying more rigour. We were coming up with different ideas on how those numbers could be achieved, and I was casting doubt on those ideas as we brought them forward. I was just not satisfied that an SSD was the only plausible solution, so I have been looking for information to refute that it is.

The solution was demoed on a PC build of the PS5. I think that's an important footnote I wanted to get across: the idea that a PC can't do it is ironic, because a PC is clearly doing it. So what is it that they are doing? I'm hoping to see the discussion branch from there, instead of blindly accepting this black box of 19x faster loading times and extrapolating it to "Sony has developed technology that allows for no-wait load times". I will question until I'm satisfied with what's presented.

As we discuss, we learn. It's likely we're asking the wrong questions, which is why you're interpreting this as being negative for the PS5. That's not what this is.
 
Cerny tried to explain that there's more to achieving his result than raw bandwidth. It's the journalist who said it's more bandwidth than any PC. We are missing a clear question and answer. Cerny is extremely precise and accurate in his wording whenever he is interviewed. I hate that format; it should have been a filmed interview, or at least a proper Q&A.

For all we know, the statement could have been about the highest bandwidth of any PC in practice, in the context of game engine architectures, not theoretical raw bandwidth. Or maybe he said faster than any SSD he could put in the laptop he used as his example (SATA at 500 MB/s). I'm being a bit cynical because of precedents with journalists paraphrasing where they shouldn't have.

It has flash. We don't know how much. My theory so far is that they skipped the file system entirely and have decompression/crypto in hardware matching whatever bandwidth they designed the storage for. Possibly a lighter compression. Possibly memory mapping like FusionIO, which bypasses a truckload of underlying OS storage bottlenecks and some useless copying and moving of data in memory.

That would explain why we are not seeing this gain even on the most expensive PCs. It requires work from devs, and they would have to design their whole storage layer around this capability, which makes a lot of sense in a console, however.
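A toy illustration of the memory-mapping idea (plain OS mmap here, not whatever Sony's hardware actually does): instead of read() copying data through a user buffer, mapping exposes the file's pages directly, and the "load" becomes page faults serviced on demand.

```python
# Memory-mapped file access: no explicit read() call, no user-space copy
# loop; pages fault in from storage when first touched.
import mmap
import os
import tempfile

# Hypothetical asset file, just for illustration.
fd, path = tempfile.mkstemp()
os.write(fd, b"ASSETDATA" * 1000)
os.close(fd)

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        header = m[:9]   # touching the mapping pulls the page in
        assert header == b"ASSETDATA"

os.remove(path)
```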
 
Cerny tried to explain that there's more to achieving his result than raw bandwidth. It's the journalist who said it's more bandwidth than any PC. We are missing a clear question and answer. Cerny is extremely accurate in his wording whenever he is interviewed. I hate that format; it should have been a filmed interview, or at least a proper Q&A.

For all we know, the statement could have been about the highest bandwidth of any PC in practice, in the context of game engine architectures, not theoretical raw bandwidth. Or maybe he said faster than any SSD he could put in the laptop he used as his example (SATA at 500 MB/s). I'm being a bit cynical because of precedents with journalists paraphrasing where they shouldn't have.

It has flash. We don't know how much. My theory so far is that they skipped the file system entirely and have decompression/crypto in hardware matching whatever bandwidth they designed the storage for. Possibly a lighter compression. Possibly memory mapping like FusionIO, which bypasses a truckload of underlying OS storage bottlenecks and some useless copying and moving of data in memory.

That would explain why we are not seeing this gain even on the most expensive PCs. It requires work from devs, and they would have to design their whole storage layer around this capability, which makes a lot of sense in a console, however.

"precise" --- Sorry... pet peeve.
 
Resetera's hmqgg, using iroboto's post, seems to suggest that Arden (Lockhart) will be an SoC that uses HBM.
hmqgg then invites people to speculate on what Argalus (Anaconda) is.

My speculation: discrete CPU and GPU instead of an SoC.
 
Resetera's hmqgg, using iroboto's post, seems to suggest that Arden (Lockhart) will be an SoC that uses HBM.
hmqgg then invites people to speculate on what Argalus (Anaconda) is.

My speculation: discrete CPU and GPU instead of an SoC.

CPU, GPU, HBM, IO die on MCM?
 