AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Status
Not open for further replies.
I'm still skeptical about this magical cache. If it was so easy to just add more cache, why hasn't it been done before, even on a smaller scale? But at the same time, what I find more intriguing is why Sony is being so extremely secretive about its GPU. We have no pics of the chip, or even the PCB or anything. Maybe it's AMD prohibiting its reveal until they officially launch the architecture? Or maybe these specs are AMD doing the same as Nvidia, "leaking" false specs to hide the real ones.

For me, I just want to be able to play at 1440p at 120 FPS for less than 400 bucks.
I am so far surprised by the PS5's RT capabilities, and still wondering if the SRAM in the IO complex is the so-called Infinity Cache. That would explain why they went with 448 GB/s of bandwidth.
 
I am so far surprised by the PS5's RT capabilities, and still wondering if the SRAM in the IO complex is the so-called Infinity Cache. That would explain why they went with 448 GB/s of bandwidth.
I am also pretty surprised. However, to be fair to the general discussion, we've had little to no way to compare Nvidia's RT results except against Nvidia itself. Our understanding of what can be done with RT and what to expect from RT performance has largely been guided by Nvidia's hand here. Once RDNA 2 and both consoles are out pushing boundaries, we'll get a better idea of what we can actually expect from RT.
 
@Love_In_Rio
@iroboto
I think the excellent RT we have seen so far in a few PS5 games could mostly be explained by easy and mature APIs. Sony has been thinking about RT since at least 2013, as Cerny was already considering it for the PS4 (but couldn't do it because of how expensive it was).
 
Nvidia aren't really in the business of making the fastest GPUs, they're in the business of making the most profitable ones. And they excel at it.
If Nvidia really wanted to bring the fastest gaming GPU they could possibly make to the PC market, they wouldn't use Samsung's 8nm node; they'd go with TSMC's N7+.
They also wouldn't use GDDR6X, they'd use HBM2.
They can't sell GA100-class GPUs for $10,000 in the gaming market in enough quantities to justify the investment, so there's no such product.


Putting a lot of cache into a GPU or SoC makes it considerably larger. If your economics dictate a maximum die area, using lots of cache will eat away at the available area for execution resources and significantly lower the perf-per-mm^2 compared to a competitor that just uses faster external memory (e.g. the Xbox One with eSRAM vs. the PS4 with a wider GPU).

My point is that lots of cache isn't a magic bullet. If true, it's just a means to compensate for not using a wider VRAM bus, at the cost of having a larger chip to fabricate.
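The trade-off argued here can be put in numbers. A minimal sketch with purely illustrative bandwidth figures and hit rates (none of these are AMD's actual specs), showing how a large on-die cache can substitute for a wider VRAM bus:

```python
# Sketch: effective bandwidth with an on-die cache in front of VRAM.
# All numbers below are illustrative assumptions, not real GPU specs.

def effective_bandwidth(mem_bw_gbs, cache_bw_gbs, hit_rate):
    """Average bandwidth seen by the GPU when a fraction `hit_rate`
    of accesses is served from the on-die cache."""
    return hit_rate * cache_bw_gbs + (1.0 - hit_rate) * mem_bw_gbs

narrow_bus = 448.0   # GB/s, e.g. a 256-bit GDDR6 setup
wide_bus = 768.0     # GB/s, e.g. a hypothetical much wider bus

# With an assumed 1.5 TB/s cache and a 50% hit rate, the narrow bus
# already averages more than the wide bus with no cache:
print(effective_bandwidth(narrow_bus, 1500.0, 0.5))  # 974.0
print(effective_bandwidth(wide_bus, 0.0, 0.0))       # 768.0
```

The flip side, as noted above, is that the cache pays for that bandwidth in die area instead of PCB and memory-package cost.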

Every business has the goal of maximizing revenue; it's the same for AMD, or even more so considering its smaller size. If Nvidia didn't use TSMC, it's because they couldn't, that's for sure. And this "Infinity Cache" would probably allow them to save a lot on PCB design and manufacturing cost, which would at least reduce the impact of the die's price. We will see in a month what the final product is.


They didn't move to chiplets because it would be magically better, they moved to chiplets because of cost and the bad scaling of analog parts like memory PHYs on smaller processes.
Moving to chiplets allowed them to easily fulfill part of their obligations to GloFo by producing the IO die at 12nm and 14nm (for the X570 chipset), as well as keep the expensive 7nm chips as small as possible. Being able to scale server parts without adding a ton of CPU chiplets when the customer only wants huge bandwidth didn't hurt either.
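The cost side of that argument can be illustrated with a standard yield model. A sketch using a Poisson yield model, with assumed defect density and die areas (not AMD's real numbers), showing why several small dies are cheaper than one big one:

```python
import math

# Sketch of why small chiplets beat one big monolithic die on cost.
# Defect density and die areas are illustrative assumptions.

def yield_poisson(area_mm2, defects_per_mm2):
    """Poisson yield model: probability a die has zero killer defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

def relative_cost_per_good_die(area_mm2, defects_per_mm2):
    """Silicon cost scales roughly linearly with area, divided by yield."""
    return area_mm2 / yield_poisson(area_mm2, defects_per_mm2)

D = 0.002  # defects per mm^2 (assumed)

mono = relative_cost_per_good_die(600, D)        # one 600 mm^2 die
chiplets = 8 * relative_cost_per_good_die(75, D)  # eight 75 mm^2 dies

print(mono / chiplets)  # monolithic silicon costs ~2.9x as much
```

Real cost models also include packaging and interconnect overhead for the chiplet approach, which eats into (but under these assumptions does not erase) the yield advantage.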

You completely missed my point :D
 
Every business has the goal of maximizing revenue; it's the same for AMD, or even more so considering its smaller size. If Nvidia didn't use TSMC, it's because they couldn't, that's for sure. And this "Infinity Cache" would probably allow them to save a lot on PCB design and manufacturing cost, which would at least reduce the impact of the die's price. We will see in a month what the final product is.

He failed to notice they have the fastest GPUs out there, probably even compared to RDNA2 in raw rasterization performance. Then we have all the other advanced features like ray tracing, DLSS etc. All the while they came down in price and added RTX IO, which is better specced than the PS5's solution.

I won't complain, I'm just very interested in what AMD will do; it looks very promising so far. This competition is very much needed. AMD is on a roll with Zen 3 leaving Zen 2 in the dust.
 
All the while they came down in price and added RTX IO, which is better specced than the PS5's solution.
Which is nothing new compared to what AMD has had since Fiji (HBCC); the only thing that's changed is that there's now a Microsoft API coming to actually make use of it.
 
Aren't caches fairly large energy consumers?

Yes and no. Large caches were large consumers when transistor leakage current was a major problem, but FinFETs can push leakage really low, leaving only the active switching as the energy cost, and the only part of a cache that switches on any given clock cycle is the part that is accessed, plus the access path. As caches grow in size, their energy consumption is almost flat. So while caches consume a lot, making caches larger makes them consume less per area. Improving the hit rate of course increases energy consumption (as the cache is hit, and therefore accessed, more often), but linearly increasing cache size doesn't linearly increase hit rate. So again you are winning on power per area.
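A toy model of that argument, with entirely made-up constants (log-ish growth in per-access energy, sqrt-ish growth in hit rate), just to show the shape of the curves rather than any real silicon figures:

```python
import math

# Toy model: per-access energy grows far slower than capacity
# (only the accessed set/way and the access path switch), while
# hit rate also grows sublinearly. Constants are illustrative.

def access_energy(capacity_mb):
    """Assumed: energy per access grows ~logarithmically with
    capacity (longer wires, wider decoders on the access path)."""
    return 1.0 + 0.3 * math.log2(capacity_mb)

def hit_rate(capacity_mb):
    """Assumed sqrt-style rule of thumb: doubling capacity gives
    diminishing hit-rate returns, capped below 1.0."""
    return min(0.95, 0.08 * math.sqrt(capacity_mb))

for mb in (16, 32, 64, 128):
    # 8x the capacity costs only ~1.4x the per-access energy here,
    # while hit rate climbs from ~0.32 to ~0.91.
    print(mb, round(access_energy(mb), 2), round(hit_rate(mb), 2))
```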
 
And you know that because...? Get ready for a surprise.
It can theoretically go way faster than the PS5's SSD allows. Of course, we have no clue how fast their controller might be with a faster SSD, but considering that they could have done it on the GPU too, there must have been a reason for the proprietary controller.
 
Yes and no. Large caches were large consumers when transistor leakage current was a major problem, but FinFETs can push leakage really low, leaving only the active switching as the energy cost, and the only part of a cache that switches on any given clock cycle is the part that is accessed, plus the access path. As caches grow in size, their energy consumption is almost flat. So while caches consume a lot, making caches larger makes them consume less per area. Improving the hit rate of course increases energy consumption (as the cache is hit, and therefore accessed, more often), but linearly increasing cache size doesn't linearly increase hit rate. So again you are winning on power per area.
While per-transistor leakage is getting lower, total static leakage is still growing with each generation and with growing caches. Fortunately, it still looks like the dynamic power advantage outweighs the static leakage cost, at least in conventional high-perf devices; battery-powered devices have to use more aggressive power-gating techniques on idle hardware.
 
Started and ended?
We don't know whether the same capabilities are included in RDNA memory controllers or not, but quite surely it (or equivalent tech) is included in RDNA2 for DirectStorage, even if there's no such functionality in RDNA.
 
We don't know whether the same capabilities are included in RDNA memory controllers or not, but quite surely it (or equivalent tech) is included in RDNA2 for DirectStorage, even if there's no such functionality in RDNA.
AFAICR, no RDNA products have ever mentioned it. Also, I don't believe DirectStorage is the same technology as HBCC, with the latter being attached storage that is directly memory addressable.
 
AFAICR, no RDNA products have ever mentioned it. Also, I don't believe DirectStorage is the same technology as HBCC, with the latter being attached storage that is directly memory addressable.
HBCC allows connection to "anything", i.e. direct-attached storage, network storage, NVRAM etc. I can't see an option where it wouldn't cover DirectStorage too.
 
HBCC allows connection to "anything", i.e. direct-attached storage, network storage, NVRAM etc. I can't see an option where it wouldn't cover DirectStorage too.
Sure, but it extends beyond DirectStorage in that it can be used as directly assigned VRAM. I don't believe there's been anything to indicate DS has that capability.
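The distinction drawn here (storage that is directly memory addressable vs. an explicit fast-read API) is roughly the difference between memory-mapping a file and reading it. A Python analogy, purely illustrative and with a hypothetical backing file, not how either GPU feature is actually implemented:

```python
import mmap
import os
import tempfile

# Create a hypothetical 4 KiB backing file standing in for storage.
path = os.path.join(tempfile.mkdtemp(), "vram_backing.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

with open(path, "r+b") as f:
    # "HBCC-style": the file *is* the address space; touching an
    # offset demand-pages it in, with no explicit read/write call.
    mem = mmap.mmap(f.fileno(), 4096)
    mem[0:4] = b"ABCD"
    print(mem[0:4])  # b'ABCD'
    mem.flush()
    mem.close()

with open(path, "rb") as f:
    # "DirectStorage-style": an explicit bulk read into a buffer.
    buf = f.read(4)
    print(buf)  # b'ABCD'
```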
 