Next Generation Hardware Speculation with a Technical Spin [pre E3 2019]

I'm not talking about that much variability. I mean boost clock based on operational TDP limits, not temperature or binning. The same code would behave exactly the same on any console.

I know it's a tradition to lock the clocks but are we ready to move past that?
 
I'm not talking about that much variability. I mean boost clock based on operational TDP limits, not temperature or binning. The same code would behave exactly the same on any console.

I know it's a tradition to lock the clocks but are we ready to move past that?
You have to have a give and take. Maybe limiting CPU clocks to raise GPU clocks, or vice versa. Else, why not just make the ‘boost’ clock stock and downclock the whole system where possible.
 
Maybe boost clocks could be enabled when variable refresh rate is enabled. That would be nice and would also drive consumer adoption of better TVs/displays.

Or perhaps there could be some way for the game engine to choose clocks from a few options, allowing game/level/install/load-time specific optimizations.
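Something like this, as a very rough sketch (the profile names and numbers are purely hypothetical): base clocks guarantee the designed 30/60 fps target on a fixed-refresh TV, a VRR display unlocks a higher fixed profile automatically, and a developer hint can override either.

```python
# Hypothetical clock profiles; all names and figures are invented.
PROFILES = {
    "base":      {"cpu_mhz": 3200, "gpu_mhz": 1600},  # guaranteed design target
    "vrr_boost": {"cpu_mhz": 3200, "gpu_mhz": 1800},  # extra fps absorbed by VRR
    "cpu_heavy": {"cpu_mhz": 3500, "gpu_mhz": 1500},  # developer-selected trade-off
}

def select_profile(display_supports_vrr, developer_hint=None):
    if developer_hint in PROFILES:   # per-game/level/load-time override
        return PROFILES[developer_hint]
    if display_supports_vrr:         # automatic, no developer work needed
        return PROFILES["vrr_boost"]
    return PROFILES["base"]

print(select_profile(display_supports_vrr=True))                              # vrr_boost
print(select_profile(display_supports_vrr=False, developer_hint="cpu_heavy"))  # cpu_heavy
```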
 
I'm not talking about that much variability. I mean boost clock based on operational TDP limits, not temperature or binning. The same code would behave exactly the same on any console.

I know it's a tradition to lock the clocks but are we ready to move past that?

I'm not sure I understand what you mean by "boost clock based on operational TDP limits". There is a large variance in power consumption between chips at the same clocks across the clock range. One chip can hit 150 MHz more at the same power consumption as another sample.
 
Boost clock based on occupancy or other metrics which keep the whole chip within a predetermined TDP margin. Like if half the CUs are idle, the clock is allowed xx% higher. Same for CPU cores.

I thought that was how TDP-limited boost worked? Did I get it wrong?
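Roughly what I have in mind, as a sketch (every number here is made up; the point is just that the chosen clock depends only on occupancy, so identical code gets identical clocks on every console, regardless of silicon quality or temperature):

```python
# Hypothetical sketch of deterministic, occupancy-based boost.  All
# numbers are invented; the clock choice depends only on the workload
# (how many CUs are active), never on temperature or binning.

TDP_BUDGET_W = 160.0           # fixed GPU power budget (assumed)
POWER_PER_CU_PER_MHZ = 0.0025  # assumed per-CU dynamic-power coefficient
CLOCK_STEPS_MHZ = [1400, 1500, 1600, 1700, 1800]  # discrete, validated steps

def gpu_clock_for(active_cus: int) -> int:
    """Highest validated clock whose estimated power for the given
    number of active CUs stays within the fixed TDP budget."""
    best = CLOCK_STEPS_MHZ[0]
    for clk in CLOCK_STEPS_MHZ:
        if active_cus * clk * POWER_PER_CU_PER_MHZ <= TDP_BUDGET_W:
            best = clk
    return best

print(gpu_clock_for(40))  # all 40 CUs busy   -> 1600 MHz
print(gpu_clock_for(20))  # half the CUs idle -> 1800 MHz
```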
 
Boost clock based on occupancy or other metrics which keep the whole chip within a predetermined TDP margin. Like if half the CUs are idle, the clock is allowed xx% higher. Same for CPU cores.

I thought that was how TDP-limited boost worked? Did I get it wrong?

I think you see that more with the CPUs when all cores are not loaded or the load is not fully taxing the CPU. Maybe next-gen consoles could have, let's say, a 4-core mode with higher clocks, and that would be a good choice for some games and up to the developer. I could see that happening. On the GPU side I think there is less chance for something like that. Graphics are very parallel and the limiting factor is usually power or temperature. If the load were low enough to allow higher clocks, there probably wouldn't be a need for higher boost in the first place. I think the GPUs are running pretty much full tilt anyway in a console.
 
The only kind of boosting I see happening is negative boost. :LOL:

More advanced power gating and power states, to lower power and performance when applicable, not boosting beyond the normal performance profile.
 
I think you see that more with the CPUs when all cores are not loaded or the load is not fully taxing the CPU. Maybe next-gen consoles could have, let's say, a 4-core mode with higher clocks, and that would be a good choice for some games and up to the developer. I could see that happening.

Sounds good in theory considering the variability of engines/CPU-side workloads. On the other hand, I wonder if it'd be too much extra work/time to validate the chips for these different (sustained) operational modes :?:

Would there be sensitivity regarding core-to-core/cache latencies :?: (Although Zen 2's chiplet setup might make it a moot point with highest common latency. j/k :V)
 
Sounds good in theory considering the variability of engines/CPU-side workloads. On the other hand, I wonder if it'd be too much extra work/time to validate the chips for these different (sustained) operational modes :?:

Would there be sensitivity regarding core-to-core/cache latencies :?: (Although Zen 2's chiplet setup might make it a moot point with highest common latency. j/k :V)

I don't really know, but I'd think that the clock rates would still be conservative enough compared to the desktop parts that it might be doable and feasible. Say 8 cores at 2.8 GHz and 4 cores at 3.5 GHz, or something along those lines...
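Very back-of-the-envelope, but if dynamic power goes roughly as cores × f × V², and voltage scales more or less with frequency over that range (both assumptions, not data), those two modes come out close to iso-power:

```python
# Crude iso-power check; assumes P ~ cores * f^3 (voltage roughly
# tracking frequency), which is only a ballpark model.
def relative_power(cores, freq_ghz):
    return cores * freq_ghz ** 3

full = relative_power(8, 2.8)    # 8 cores at 2.8 GHz -> ~175.6 (arbitrary units)
half = relative_power(4, 3.5)    # 4 cores at 3.5 GHz -> ~171.5
print(full, half, half / full)   # ratio ~0.98, so roughly the same budget
```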
 
I'm not talking about that much variability. I mean boost clock based on operational TDP limits, not temperature or binning. The same code would behave exactly the same on any console.

I know it's a tradition to lock the clocks but are we ready to move past that?

I've thought the same thing myself. If you bin for GPU and CPU frequencies that are - when combined - beyond your intended power and thermal limits, and also test for the usual 'optimal' default balance, then you have the option of robbing Peter to pay Paul. Make it all entirely deterministic (not based on chip lottery or environmental factors).

Have modes that are: default (locked), developer selected, or applied to the game automatically by the system based on some kind of load-based algorithm. Good options might include: lower the CPU clock and boost GPU, or sleep x number of CPU cores and boost CPU clock (many games are still limited by a primary thread). Stuff like that.

Games have variability anyway based on things like: map location, explosion alpha, enemy number and positions, where the camera is pointing, etc. This would at least allow every system to minimise drops by diverting underutilised resources to where they are most needed. If your frame rate is tanking down to 14 fps because a single CPU core is limited (lol PUBG) you might as well drop your beast of a GPU to a lower power mode and disable six of your game cores and max out the CPU frequency. It's not like those resources are doing you any good where they are.

Even on an extremely crude level it would be great for BC. Zen 2 is a beast compared to Jaguar. You could halve CPU clocks, save 75% of your CPU cores' power (hell, conservatively say 50% to cover any level of individual SoC variability) and boost GPU clocks to allow up-resing and/or apply additional machine learning enhancements.

Whatever the case, locked clocks in a world where we can determine - however conservatively - that you can save power in one place and use it in another is leaving an awful lot of performance on the table.

Power and bottleneck balancing between components is long overdue in console land. I think it'd be entirely doable if you're prepared to invest the money in the chip, and entirely fair if done predictably and consistently and the developer can take control if they choose.
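As a sketch of what I mean (every figure below is invented and the mode names are just placeholders): bin and validate a handful of fixed operating points up front, each under the same SoC budget, and let the default, the developer, or a system heuristic pick between them.

```python
# Hypothetical, deterministic mode table; worst-case power for every
# mode is validated at manufacturing, so there is no chip lottery.
SOC_BUDGET_W = 200.0

# mode -> (cpu_cores, cpu_mhz, gpu_mhz, validated worst-case watts)
VALIDATED_MODES = {
    "default":   (8, 3200, 1700, 195.0),  # balanced launch configuration
    "gpu_heavy": (8, 2800, 1850, 198.0),  # trade CPU clock for GPU clock
    "cpu_heavy": (4, 3600, 1600, 192.0),  # sleep 4 cores, chase a hot main thread
    "bc_boost":  (8, 1600, 1900, 190.0),  # halved CPU clocks for Jaguar-era BC
}

def select_mode(requested):
    cores, cpu_mhz, gpu_mhz, watts = VALIDATED_MODES[requested]
    assert watts <= SOC_BUDGET_W, "mode would exceed the SoC power budget"
    return cores, cpu_mhz, gpu_mhz

print(select_mode("gpu_heavy"))  # developer-selected
print(select_mode("bc_boost"))   # system-selected for back-compat titles
```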
 
Read this document, "Fully parallelized LZW decompression for CUDA-enabled GPU". Sorry, I can't give a link on my phone; it's a PDF.

The MS insider's rant about GPU decompression is easy to prove. It is interesting: from 2016, a GTX 980 against an i7 using an SSD, and the GPU destroys the CPU. And it shows UMA is an advantage: moving data from RAM to VRAM is slower than GPU decompression.

Edit: It is only 9 pages and the comparison tables are on page 8.
 
Read this document, "Fully parallelized LZW decompression for CUDA-enabled GPU". Sorry, I can't give a link on my phone; it's a PDF.

The MS insider's rant about GPU decompression is easy to prove. It is interesting: from 2016, a GTX 980 against an i7 using an SSD, and the GPU destroys the CPU. And it shows UMA is an advantage: moving data from RAM to VRAM is slower than GPU decompression.

Edit: It is only 9 pages and the comparison tables are on page 8.

Kind of a given when the conduit between memory pools is the PCIe bus, as it is in this case.

Edit: Ideally, though, if you had multiple memory pools you'd want to do moves as little as possible and instead tie the move to some processing operation. So, in this case, if you could read from the SSD into main memory and then have the GPU access the data in that RAM directly and decompress into its local RAM, you'd get a further speedup, and if your platform had a more optimal path to main memory than through PCIe via the CPU, it could be faster still.
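To put rough numbers on the two paths (all of these throughputs are placeholder assumptions, not figures from the paper; they're just there to show the shape of the win):

```python
# Rough model of the two paths being compared:
#  A) SSD -> RAM, CPU decompresses, decompressed data copied over PCIe to VRAM
#  B) SSD -> RAM, compressed data copied over PCIe, GPU decompresses into VRAM

ASSET_GB        = 4.0    # decompressed size
RATIO           = 2.0    # compression ratio, so 2.0 GB on disk
SSD_GBPS        = 2.5    # SSD read throughput (assumed)
CPU_DECOMP_GBPS = 1.0    # CPU LZ-style decode rate (assumed)
GPU_DECOMP_GBPS = 8.0    # parallel GPU decode rate (assumed)
PCIE_GBPS       = 12.0   # effective PCIe copy throughput (assumed)

compressed_gb = ASSET_GB / RATIO

path_a = compressed_gb / SSD_GBPS + ASSET_GB / CPU_DECOMP_GBPS + ASSET_GB / PCIE_GBPS
path_b = compressed_gb / SSD_GBPS + compressed_gb / PCIE_GBPS + ASSET_GB / GPU_DECOMP_GBPS

print(f"CPU path: {path_a:.2f} s, GPU path: {path_b:.2f} s")
# ~5.1 s vs ~1.5 s with these placeholders; the GPU path also ships half
# the bytes across PCIe, and on a UMA console even that copy disappears.
```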
 
In Sony's latest financial report they mention "Reduce PS4 hardware cost", which might lend credence to the 8GB HBM + 16GB DDR4 leak? Didn't that leak also mention a 7nm PS4 coming at the end of the year?
 
In Sony's latest financial report they mention "Reduce PS4 hardware cost", which might lend credence to the 8GB HBM + 16GB DDR4 leak? Didn't that leak also mention a 7nm PS4 coming at the end of the year?
Yes

PS4 refresh
  • sometime between September and November
  • 199
  • fabbed on Samsung 7nm EUV
  • best wafer pricing in the industry
  • die size 110mm²
  • no PRO refresh, financially not viable yet
  • too close to PS5 as well

I've been wondering whether 8GB of HBM2 would really be viable though. How many GDDR6 chips would it take to be equivalent?
 
They could maybe get away with either 8 x 8Gbit chips on a clamshelled 128-bit bus or a 4 x 16Gbit setup at 11Gbps, barring any memory channel quirks.

Would be neat if they tossed in DCC while they’re doing the revision.
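Rough arithmetic on those options (the HBM2 figures are typical spec-sheet numbers, not anything from the leak): both GDDR6 layouts land on 8 GB, and the bandwidth comparison looks like this:

```python
def bandwidth_gbs(bus_bits, gbps_per_pin):
    return bus_bits * gbps_per_pin / 8   # GB/s

def capacity_gb(chips, gbit_per_chip):
    return chips * gbit_per_chip / 8     # GB

# 8 x 8Gbit clamshelled, or 4 x 16Gbit, both on a 128-bit bus at 11 Gbps
print(capacity_gb(8, 8), capacity_gb(4, 16))   # 8.0 and 8.0 GB either way
print(bandwidth_gbs(128, 11.0))                # 176 GB/s

# One 8GB HBM2 stack: 1024-bit interface at ~2 Gbps per pin (typical)
print(bandwidth_gbs(1024, 2.0))                # 256 GB/s
```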
 
I’m fine using boost clocks if they’re indefinite boosts. If they’re not indefinite, it just sounds maddeningly frustrating because your “bin” of console will determine your sustained FPS. People will return consoles and repastes will become a lot more common.
I think boost clocks would make sense to use with variable refresh rate.
Base clocks would always be used on normal TVs to guarantee the 30/60 FPS target by design. TVs with VRR could make use of boost clocks to bring free framerate enhancements. If implemented as an automatic feature, devs wouldn't even need to do anything for this to work.


I don't know about statistics, but many/most of my friends have external HDDs with their PS4s. I would not say that the reason is that we don't want to download games again and again because of the time it takes, as internet speeds are fast in Finland, and 100/10 connections cost only 9.90€/month at the lowest, with no data caps.
I guess if I was limited to 100/10 I might be using an external HDD too…
Me and my console friends all have 1Gb down / 200Mb up. I think the HDD's write speed is the bottleneck in my PS4 Pro, but 50GB games take less than 10min to download. With the play while downloading feature, we can usually start playing just 2 minutes after starting the download.​
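Quick sanity check on those numbers (the HDD write speed below is my assumption for a typical 2.5" drive):

```python
game_gb = 50
line_gbps = 1.0       # 1 Gb/s connection
hdd_write_mbs = 120   # assumed 2.5" HDD sequential write speed

download_min = game_gb * 8 / line_gbps / 60
write_min = game_gb * 1000 / hdd_write_mbs / 60
print(f"line-limited: {download_min:.1f} min, HDD-limited: {write_min:.1f} min")
# ~6.7 min vs ~6.9 min, so at gigabit speeds the HDD's write speed really
# is about the bottleneck.
```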
 
I think boost clocks would make sense to use with variable refresh rate.
Base clocks would always be used on normal TVs to guarantee the 30/60 FPS target by design. TVs with VRR could make use of boost clocks to bring free framerate enhancements. If implemented as an automatic feature, devs wouldn't even need to do anything for this to work.



I guess if I was limited to 100/10 I might be using an external HDD too…
Me and my console friends all have 1Gb down / 200Mb up. I think the HDD's write speed is the bottleneck in my PS4 Pro, but 50GB games take less than 10min to download. With the play while downloading feature, we can usually start playing just 2 minutes after starting the download.
Could you please remove the formatting from half of your message? (probably caused by the infamous Xenforo-copy/paste-bug)
 
Me and my console friends all have 1Gb down / 200Mb up. I think the HDD's write speed is the bottleneck in my PS4 Pro, but 50GB games take less than 10min to download. With the play while downloading feature, we can usually start playing just 2 minutes after starting the download.
Wow that's fast. I don't think many people will have access to such speeds in a very long time, if not ever, in most parts of the world. Heck, my 'super fast' fiber optics connection, in central-ish London, is a 70 down 20 up kind of deal, and it's not cheap.
 
Wow that's fast. I don't think many people will have access to such speeds in a very long time, if not ever, in most parts of the world. Heck, my 'super fast' fiber optics connection, in central-ish London, is a 70 down 20 up kind of deal, and it's not cheap.
That's the standard fiber to the cabinet that people receive when they opt for 80Mbps fiber.
Like you say, I don't see it changing much anytime soon.
I'm personally on 1Gbit down and up, in London. But that's far from the norm.
 