Predict: The Next Generation Console Tech

I think one of the most important things we need to understand about Durango is the memory architecture. Assuming a 256-bit interface to DDR3/DDR4, the lower and upper bounds for the main memory pool should be 59 GB/s at 1866 MT/s and 77 GB/s at 2400 MT/s, respectively. The big question mark is the embedded memory.
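As a sanity check, here's that peak-bandwidth arithmetic as a quick Python sketch; the 256-bit width and the two transfer rates are just the assumed/rumored inputs, nothing official:

```python
# Peak bandwidth = bus width in bytes x transfer rate (GT/s).
BUS_WIDTH_BITS = 256  # assumed, per the rumored interface

for mts in (1866, 2400):
    gb_per_s = (BUS_WIDTH_BITS / 8) * (mts / 1000)
    print(f"{mts} MT/s -> {gb_per_s:.1f} GB/s")
# 1866 MT/s -> 59.7 GB/s (~59), 2400 MT/s -> 76.8 GB/s (~77)
```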

My very optimistic wish: 128 MB at 1 TB/s, accessible for all intents and purposes by any component in the SoC, connected to the other components through a crossbar or ring bus.

My most realistic wish: 64 MB at 512 GB/s, accessible for all intents and purposes by any component in the SoC, connected to the other components through a crossbar or ring bus.

My guess based on the current rumors: 32 MB on a daughter die (in the same package) with the ROPs; internal bandwidth 819.2 GB/s*; bandwidth to the main die 102.4 GB/s. Maybe a smaller pool of ESRAM (8-16 MB) could be used in the main die to facilitate data sharing between the CPU and the GPU.

* The Xbox 360 has 8 ROPs at 500 MHz and 256 GB/s of internal bandwidth in the daughter die. If Durango has 16 ROPs at 800 MHz, scaling the bandwidth gives: (16/8) * (800 / 500) * 256 GB/s = 819.2 GB/s, which is incidentally 8x the rumored 102.4 GB/s, which I believe is only the bandwidth to the compute die.
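To make the footnote reproducible, here's the same scaling as a tiny Python sketch; the Durango ROP count and clock are the assumptions stated above, not confirmed figures:

```python
# Scale Xenos' daughter-die bandwidth by ROP count and clock ratios.
xenos_rops, xenos_mhz, xenos_bw = 8, 500, 256        # Xbox 360, GB/s
durango_rops, durango_mhz = 16, 800                  # assumed for Durango

scaled = (durango_rops / xenos_rops) * (durango_mhz / xenos_mhz) * xenos_bw
print(f"Scaled internal bandwidth: {scaled:.1f} GB/s")                 # 819.2
print(f"Ratio to the rumored 102.4 GB/s link: {scaled / 102.4:.0f}x")  # 8x
```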
This is one of the most interesting posts I've recently read on the subject of next gen consoles.

Your theory is very thoughtful, and offers a good deal of common sense and interesting info.

It explains what might be the missing link in some crucial ways, at least to me, if the data and numbers we have been given are to be believed.

The 102.4 GB/s bandwidth for the main RAM isn't bad, but it's relatively meagre these days. Your most realistic wish explains the part of the equation that people seemed to be missing until now.

If we consider that this might be the bandwidth between the main RAM die and the SRAM, yet the SRAM has a good internal speed of its own, then it sounds very interesting!

If not, 60 fps at 1080p (the holy grail of gaming for me) might not be achievable often. 819 GB/s or 512 GB/s is a very respectable speed if you ask me, provided you can read from and write to the SRAM.

But I am not sold on the current specifications yet. There are many guarded secrets in these consoles.

The only thing that I can figure out, and that I am basically 100% sure of now, is that the X720 will feature an audio DSP. It seems pretty obvious to me from the words and conversations of a certain person here.

PS4, on the other hand, seems to have some customized and interesting hardware, but the project is also very secretive.

The secret ingredient of these consoles is yet to be known, and that's why none of the specifications so far makes the slightest impression on me.

I might still want to do a freewheeling search for the fun of it, and listen to everybody, but I want definitive data. :p
 
Everyone, stop saying "special sauce," for the love of God!

Also, I'd still treat these rumours as rumours and not bullet-proof fact.
I wouldn't :p No really, the "down to earth" rumors are almost 100% legit. PS4 is definitely the console DF wrote about yesterday, Durango is also there, you just have to pick the right rumors.
 
Sony's ex-CTO (Makimoto) seems to be a big fan of FPGAs. I saw some reference to Makimoto's Wave. Are there any advantages to FPGAs vs ASICs in cost and other production factors?

I could see him being a big fan of FPGAs in general, and I could see some excitement concerning their applicability in things like BD players and smart TVs. As it applies to the PS4, though, my issue is this: unless Sony has seen fit to build in low-power (as in power draw) functionality outside of the processing offered by its primary silicon, is there really a need for anything *more* than an audio DSP, for example? And if the answer is yes, that it might assist with physics operations, heavier encryption needs, depth-of-field operations for a new EyeToy variant... whatever... then, with the size and transistor count it begins to warrant, at a certain point why not just put in a Cell/SPURS?

I understand the appeal of FPGAs, but it'll be the kind of thing I need to dissect after the fact rather than get excited about in advance, because the needs seem so "extra" compared to what's already there.
 
I wouldn't :p No really, the "down to earth" rumors are almost 100% legit. PS4 is definitely the console DF wrote about yesterday, Durango is also there, you just have to pick the right rumors.

We won't know whether any of the rumours are 100% legit until the boxes are out and people are allowed to talk about them. They could be 80% right. They might be 100% right. Maybe 100% right for one and 75% right for another. People might be wasting a lot of their time arguing over specs they believe are 100% accurate, but aren't. Discussing them is fine, but the tone of this thread has become a little argumentative, and I wouldn't want to be the person with egg on their face.
 
This is one of the most interesting posts I've recently read on the subject of next gen consoles.
Well, no offense, but it's not *that* interesting.
You can't compare the bandwidth available to the simplistic ROPs in the Xenos daughter die with the bandwidth available in GPUs that include more complex ROPs, with compression schemes, caches, etc.
Even without looking at modern GPUs, which do a hell of a job with reasonable bandwidth, look at RSX: do its ROPs operate anywhere close to the 10x ratio there is between the bandwidth available to them and the bandwidth within the Xenos daughter die? It is nowhere near that bad.

Modern ROPs have a lot of bandwidth to their color and Z cache, use data compression, etc.

ERP made an interesting calculation: 102 GB/s is enough to make sure that 12 ROPs operate at their best all the time.

Kind of related is the situation with Pitcairn and Tahiti. Pitcairn has enough bandwidth (by AMD's own statement) to make the most of its ROPs; that is why they did not add ROPs in Tahiti but added a lot of bandwidth (a wider bus) instead. Though now I wonder if it is not unbalanced in the other direction (which could still be useful when ROP performance is out of the picture, say for compute performance, which was a strong point of the HD 79xx cards).
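ERP's actual sums aren't reproduced in this thread, but here's a rough Python sketch of the kind of arithmetic involved; the 12 ROPs at 800 MHz follow the rumors above, and the per-pixel byte counts are my own illustrative assumptions, not his:

```python
# Bandwidth a ROP array can demand, assuming an RGBA8 target with 32-bit Z.
ROPS, CLOCK_GHZ = 12, 0.8  # 9.6 Gpixels/s peak fill rate

scenarios = {
    "color write only":           4,       # 4 B color write
    "color + Z read/write":       4 + 8,   # plus 4 B Z read, 4 B Z write
    "alpha blend + Z read/write": 8 + 8,   # plus 4 B color read for blending
}
for name, bytes_per_pixel in scenarios.items():
    gbps = ROPS * CLOCK_GHZ * bytes_per_pixel  # Gpixels/s x B/pixel = GB/s
    print(f"{name}: {gbps:.1f} GB/s")
# ~38 GB/s for plain writes but ~154 GB/s with blending, and MSAA multiplies
# these further -- hence the caveat that the internal interface must be faster.
```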

-----------------------------------------------

Another thing I don't get is why people keep speaking about ESRAM as something kind of magical.
It seems to me that it is the standard type of memory cell you find in most CPUs, GPUs, etc.

A more practical way to speak about it is just to say that the GPU should have access to a pool (on die, I think, but that is just rumors for now) of 32 MB (/xx MB, rumors again) of scratchpad memory; that's all it is.
As an aside, I think it was mentioned somewhere on this board that the density you can achieve for memory cells on TSMC's process is damned high. To me that could be a reason why MSFT stuck with the most "standard" type of memory cell: they are not designing a cache, with all the "overhead" that implies for power consumption and die size; they just want a pool of scratchpad memory. It should be lower power than a cache, and significantly smaller. EDRAM may not have been worse once everything is taken into account.
 
We won't know whether any of the rumours are 100% legit until the boxes are out and people are allowed to talk about them. They could be 80% right. They might be 100% right. Maybe 100% right for one and 75% right for another. People might be wasting a lot of their time arguing over specs they believe are 100% accurate, but aren't. Discussing them is fine, but the tone of this thread has become a little argumentative, and I wouldn't want to be the person with egg on their face.
There is a certain person who said that Durango specs are out there; you just need to pick the right ones. We know the PS4 target specs, and the rumored specs basically just added 2 GB and dropped Steamroller for an 8-core Jaguar (like Sweetvar26 said 7 months ago).
 
This is all quite fascinating, so I thought I would chime in and try to add a little strategic analysis to the discussion. Fair warning, I am not a technical person. However, I am a strategy professional.

I will take Orbis first.

The latest information and rumors line up fairly well with Sony's vision, capabilities, and limitations. The only questionable item is the cost of the memory, which I do believe was raised due to competitive pressure from MS and developer requests.

Durango is far more interesting since there are such wildly divergent rumors/information out there. I will first lay out “facts” and assumptions and follow that with analysis/predictions.

Facts:
Powerpoint Leak
MS has publicly discussed the idea of forward compatibility
MS wants/needs to dominate the living room
Apple and Google are the true competitors
Gaming will drive early adoption and provide a differentiator to above competitors
Multiple media outlets have written articles about Xbox Surface tablets supported by multiple MS sources; see the Verge article http://www.theverge.com/2012/11/6/3608432/xbox-surface-xbox-tablet-7-inch.
The same sources indicated that the June surface leaks did have the correct specs.


Assumptions:
Initial plans called for at least 680 level performance
Many, if not most, of the rumors with concrete numbers are true, for certain values of true.
Lots of others that I don’t have time to reference.

Analysis/prediction:
1. All software for the next box is scalable through forward compatibility
2. There are at least 4 SKUs being prepared
a. Set-top xbox
b. Xbox surface (if the 24th meeting confirms this, then the probability of the rest of this shoots through the roof)
c. Xbox next
d. Xbox Pro/server
3. The leaked powerpoint has most of the major initiatives in it

Description of SKUs
Set-top box – either a super-slim 360 SoC at 28 nm or some Jaguar APU. This is the basic cable-box alternative.

Xbox surface – 7-inch tablet that plays Windows mobile games and can act as a terminal for streamed content/games from Xbox next and the server. Has a Jaguar APU.

Xbox next – base games machine. Has most of the leaked specs – 1.2 TF GPU, Jaguar APU, 8 GB, with secret sauce creating something roughly equal to Orbis. Target: 1080p, 30 FPS, high image quality. Can play games and handle multimedia simultaneously, including streaming to 1 surface. May have HW BC initially.

Xbox pro – enhanced Xbox with more memory, hardware BC to 360, and 2 GPU units; can play graphically enhanced versions of next-box games at 1080p+, 60 FPS, with enhanced IQ, and stream to multiple surfaces.

This structure best fits the available data and provides a framework for achieving MS's goal of living room dominance. Could they go in another direction? Absolutely, but given the available information, this seems to be the best reading. The meeting on the 24th should be very telling.
 
Modern ROPs have a lot of bandwidth to their color and Z cache, use data compression, etc.

The rationale behind the daughter die is explained in the Xbox 360 System Architecture article in IEEE Micro.

HD, alpha blending, z-buffering, antialiasing, and HDR pixels take a heavy toll on memory bandwidth.
...
One approach to solving this problem is to use a wide external memory interface. This limits the ability to use higher-density memory technology as it becomes available, as well as requiring compression. Unfortunately, any compression technique must be lossless, which means unpredictable—generally not good for game optimization.

ERP made an interesting calculation: 102 GB/s is enough to make sure that 12 ROPs operate at their best all the time.

ERP explicitly stated that 102 GB/s is enough *if the internal interface is faster*. Otherwise you cannot guarantee to sustain the ROPs' throughput with alpha blending, MSAA, etc.

If it's similar to 360, and the ESRAM is exclusively for render targets and the MSAA/blending is dealt with by the ROPs' interface to the memory, which is faster than the GPU's connection, then 100 GB/s sounds about right.

So 102.4 GB/s sounds right for the GPU connection. But the internal bandwidth should be on the order of 819.2 GB/s to sustain the same kind of ROP operations that Xenos did, without resorting to compression (which, according to the Xbox 360 designers, is undesirable because unpredictable).
 
If you're asking me this, then you didn't see the post I was responding to and the post that person was referring to. Lherre also initially said Xbox 3 "was a beast".

So you are saying that Lherre has told you that it isn't so anymore? Because I don't remember him saying anything like that. All he has said of late, with regard to the leaked specs, is that some details are wrong and that both consoles will be very close. So unless he has told you otherwise, his original statement does not contradict anything. It is possible for the forthcoming consoles to be "beasts", relatively speaking, while not being as powerful as a current PC. "Beast" could be in relation to the current gen and Wii U.
 
There isn't going to be a blitter. Something that started off as a bit of a joke has become a hopeful rumor.

Nice, how do you know this?

The rumor started here. A week ago, Shifty started this thread as a speculation thread because of the rumors that Durango has 3 custom logic blocks. In it, he listed some things custom logic has been used for in the past, and asked whether it would be a good idea to use them in the future. Note that this is pure, honest speculation -- no one here claimed that anything *will* have a blitter next gen. There was just talk about whether or not it made sense. And the conclusion is that it probably wouldn't.

Apparently a lot of rumormongers read B3D, and read it badly, because within about 8 hours of Shifty posting that, the web was full of rumors about how the next gen will have blitters.

All in all, it's a very interesting exercise, and shows well how little actual real information there is out there. Every time you read a new rumor, remember blitters. Just because you hear something a lot does not mean it's true. And every time you hear something new, you have to consider the possibility that it was not just badly relayed information, it was in fact completely made up on the spot.
 
Please, not that again. :rolleyes:
Both companies will spend based on what makes the most business sense. If GDDR5 allows Sony to sell more consoles, keeps their fanbase happy, solidifies the PS brand, and yields more profit, that's what they'll do. Just like they spent billions to build CMOS sensor fabs and buy back Ericsson (both of which have been very good moves). If they think they can have an edge, they'll spend the money in gaming too.
All I'm saying is that their position is not exactly flattering, yet they will outspend their direct rival on a product that will be out for ~7 years.
 
So you are saying that Lherre has told you that it isn't so anymore? Because I don't remember him saying anything like that. All he has said of late, with regard to the leaked specs, is that some details are wrong and that both consoles will be very close. So unless he has told you otherwise, his original statement does not contradict anything. It is possible for the forthcoming consoles to be "beasts", relatively speaking, while not being as powerful as a current PC. "Beast" could be in relation to the current gen and Wii U.

You must have missed or forgotten the posts about Lherre's comment, and if you noticed, the other "hype" comments were from around or before that time as well. No one is saying the same thing now. That's what I said before and that's what I'm saying now. You're only getting (or focusing on) part of the context, namely the recent comments.

Let's go back to June 2012:

http://forum.beyond3d.com/showpost.php?p=1649097&postcount=12175
http://forum.beyond3d.com/showpost.php?p=1649106&postcount=12182
 
I could see him being a big fan of FPGAs in general, and I could see some excitement concerning their applicability in things like BD players and smart TVs. As it applies to the PS4, though, my issue is this: unless Sony has seen fit to build in low-power (as in power draw) functionality outside of the processing offered by its primary silicon, is there really a need for anything *more* than an audio DSP, for example? And if the answer is yes, that it might assist with physics operations, heavier encryption needs, depth-of-field operations for a new EyeToy variant... whatever... then, with the size and transistor count it begins to warrant, at a certain point why not just put in a Cell/SPURS?

I understand the appeal of FPGAs, but it'll be the kind of thing I need to dissect after the fact rather than get excited about in advance, because the needs seem so "extra" compared to what's already there.

Ha ha, perhaps they just want to factor commonly used high-level functions into the smallest IP blocks possible for developers to use. The compute units would be reserved for app logic like physics calculations and AI?

Things like H.26* decode are kinda complex and may not fit nicely into the vector engines. You'll need extra working data in main memory too. Will these audio, Zlib and video decode functions mess up the cache of the compute unit if they are mixed with regular code?
 
Hornet said:
Interesting fact: the Xbox 360 has 32 GB/s of interconnect bandwidth between the GPU die and the daughter die. Scaling that bandwidth by the ratio of the sizes of the EDRAM pools gives us exactly the rumored bandwidth: 32 GB/s * (32/10) = 102.4 GB/s. It fits too perfectly to be just a coincidence. So, either the rumor was pulled out of thin air by doing the same calculation that I'm doing, or we are indeed seeing the comeback of the daughter die. If it were just speculation, though, they would state clearly that it is the external bandwidth and not the internal one. A scaled-up daughter die, with 32 MB, 819.2 GB/s internal bandwidth and 102.4 GB/s interconnect bandwidth, seems a very reasonable setup to me.

If MS cares about BC more than launching a new platform that refines and corrects past mistakes with an eye toward a better design, then sure, that makes a lot of sense. The eDRAM in Xenos, while helpful, also caused its own set of problems, and modern GPUs--8 years later--have significantly shifted the targets and bottlenecks.

Going with a Xenos style eDRAM setup would be a major mistake IMO and a complete mismanagement of silicon resources.
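As an aside, the interconnect scaling in Hornet's quote above does check out arithmetically; a minimal sketch, with the 32 MB pool being the rumored figure:

```python
# Xbox 360 daughter-die link (known) scaled by the ratio of the pool sizes.
x360_link_bw, x360_edram_mb = 32, 10   # GB/s, MB
durango_esram_mb = 32                  # rumored

print(f"{x360_link_bw * durango_esram_mb / x360_edram_mb:.1f} GB/s")  # 102.4
```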
 
with the size and transistor count it begins to warrant, at a certain point why not just put in a Cell/SPURS?

I understand the appeal of FPGAs, but it'll be the kind of thing I need to dissect after the fact rather than get excited about in advance, because the needs seem so "extra" compared to what's already there.

Because it's potentially attractive enough from a cost per calc/watt perspective compared to CPU/GPU today in some areas, whereas ages ago FPGAs were monolithic only and slow. Haven't consoles always been trying to get ahead of the curve as much as possible?

And wouldn't programming for FPGAs be more open and widely known? People are using OpenCL on them, for example. Cell may be small now at 2x nm, but that doesn't mean something else isn't a better choice to embed in a SoC, especially when it has many other uses (medical imaging, CMOS sensors, etc.). Cell had its chance.
 
If you're asking me this, then you didn't see the post I was responding to and the post that person was referring to. Lherre also initially said Xbox 3 "was a beast".

If he was referring to the alpha kits that were housed in server towers, he would have been very impressed. Didn't they have a 780-watt PSU?
 
The rationale behind the daughter die is explained in the Xbox 360 System Architecture article in IEEE Micro.

ERP explicitly stated that 102 GB/s is enough *if the internal interface is faster*. Otherwise you cannot guarantee to sustain the ROPs' throughput with alpha blending, MSAA, etc.

So 102.4 GB/s sounds right for the GPU connection. But the internal bandwidth should be on the order of 819.2 GB/s to sustain the same kind of ROP operations that Xenos did, without resorting to compression (which, according to the Xbox 360 designers, is undesirable because unpredictable).

How do you come up with the 819.2 GB/s number?
 
Because it's potentially attractive enough from a cost per calc/watt perspective compared to CPU/GPU today...

Absolutely, in those areas where you would be able to sub it in for said CPU/GPU - but we're talking in addition to those chips. Is this CPU/GPU combo going to be unable to decode AVCHD streams, however inefficiently? I don't think so. So for me, now it's about - are these tasks expected to be taking place while the user is engaged in gameplay? I wouldn't think so there either. Which begs the question: would the entire idea be redundant for a console?

And wouldn't programming for FPGAs be more open and widely known? People are using OpenCL on them, for example. Cell may be small now at 2x nm, but that doesn't mean something else isn't a better choice to embed in a SoC, especially when it has many other uses (medical imaging, CMOS sensors, etc.).

I don't think there's much in PS4 that's going to be ported to CMOS sensors outside of a pure R&D angle, do you? An FPGA can be a totally valid form of IC to include in any number of CE devices... when it supplants other more inefficient silicon. When said silicon is there regardless though, it's just an extra chip.

(OpenCL can be used for Cell as well, but that's neither here nor there, as I do not expect Cell to be present either)
 
If MS cares about BC more than launching a new platform that refines and corrects past mistakes with an eye toward a better design, then sure, that makes a lot of sense. The eDRAM in Xenos, while helpful, also caused its own set of problems, and modern GPUs--8 years later--have significantly shifted the targets and bottlenecks.

Going with a Xenos style eDRAM setup would be a major mistake IMO and a complete mismanagement of silicon resources.

If MS goes with a Xenos-style eDRAM setup, it most likely means those horrible yield rumors were true and the only way to get them under control was to remove the eDRAM from the big chip. Not sure if that'd be a mistake per se, but definitely a setback.
 