Predict: The Next Generation Console Tech

It's because mm^2 is a significant factor in the manufacturing cost.

Acert93 was using it as some kind of performance measurement (?) though.
I like the number of transistors a lot more for that kind of comparison (it still doesn't really relate to performance that much, if at all).

edit: maybe someone has a list of "fps per mm^2" or "transistors per fps" for a number of GPUs, then you would see how much sense that makes (or doesn't make) when comparing architectures.
 
Yes, I got that. I'm just playing with the fact that they said the final specs would be 10X the PS3.


Cell is 204 GFLOPS in the PS3 using the 6 SPEs, and RSX is 400 GFLOPS (well, that's what Wikipedia said).

If the "final specs are 10X PS3" part is to be believed, then the CPU would have to be around 2 TFLOPS, which is about the same as the APU with the 1.8 TFLOPS GPU.

And 10X the RSX's 400 GFLOPS is 4 TFLOPS, which is about the same as the Radeon HD 7970's 3.79 TFLOPS.


I know all this is crazy, but that's what 10X the PS3 is in my book. Maybe the person who wrote "10X PS3" has a different way of calculating it?

Edit: does anyone have the real RSX specs? 400 GFLOPS seems to be fake.
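
For anyone who wants to check my arithmetic, here it is as a couple of lines of Python; the input figures are just the Wikipedia numbers quoted above, including the disputed RSX one.

# Sanity check of the "10X PS3" arithmetic above, using the quoted figures.
cell_gflops = 204.0   # PS3 Cell with 6 SPEs, per the Wikipedia figure above
rsx_gflops = 400.0    # RSX, per the (disputed) Wikipedia figure above

print(f"10x Cell: {cell_gflops * 10 / 1000:.2f} TFLOPS")  # ~2.04 TFLOPS
print(f"10x RSX:  {rsx_gflops * 10 / 1000:.2f} TFLOPS")   # 4.00 TFLOPS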

I don't see it as straight-up raw power. Someone earlier was talking about MS's multiplier and what it possibly meant, and I see it similarly for the PS4: this "10x" includes the efficiency of modern hardware along with raw power to achieve that target.
 
Back when the target specs of the PS4 were leaked, people thought it was going to have an APU + discrete GPU. That would have been nice for the PS4. If it is only going to have the APU, with a mid-range GPU and 2 GB of fast RAM, then the specs appear to be quite weak IMO (without knowing much about hardware). Maybe the only hope is what lherre said about the specs still being open to change. But if what sweetavatar said at neogaf is true, and Sony has swapped Steamroller cores for Jaguar ones, this would mean they are targeting a mid-to-low range APU.

I hope at least one system offers interesting performance. These last months we've been reading interesting things about the 720 (something that has not happened with the PS4); I hope MS comes out with good hardware at least.
 
RSX is 400 GFLOPS (well, that's what Wikipedia said)
That Wikipedia number is misleading because it seems to be calculated in a weird way.

As far as my own math goes, RSX is a ~250 GFLOPS chip [(24*2*4-way pixel ALUs + 8*5-way vertex ALUs) @ 550 MHz].

Also, consider that Xenos is rated at 240 GFLOPS [48*5-way unified ALUs @ 500 MHz] and is generally considered faster than RSX.

Curiously enough, Pitcairn XT offers ~2,500 GFLOPS. So that should basically be the target spec (although, as I mentioned earlier, Pitcairn @ 1 GHz is way too power hungry to make its way into a console). An optimized and underclocked Oland derivative should probably be capable of reaching those target numbers within a reasonable power budget, though.

They'd roughly need 24 CUs running @ 800 MHz to end up with the target of ~2,500 GFLOPS.
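
To spell out the arithmetic behind those peak numbers, here is the same math as a short Python sketch (ALU counts as listed above, MADD counted as 2 FLOPs per lane per clock; the 64 ALUs per CU is the standard GCN figure, which this post doesn't state explicitly):

# Peak-FLOPS math behind the figures above (MADD = 2 FLOPs per lane per clock).
def peak_gflops(lanes, clock_ghz, flops_per_lane=2):
    return lanes * flops_per_lane * clock_ghz

rsx   = peak_gflops(24 * 2 * 4 + 8 * 5, 0.55)  # ~255 GFLOPS (at 500 MHz it would be ~232)
xenos = peak_gflops(48 * 5, 0.50)              # 240 GFLOPS
gcn24 = peak_gflops(24 * 64, 0.80)             # 24 CUs x 64 ALUs @ 800 MHz ~= 2458 GFLOPS
print(rsx, xenos, gcn24)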
 
I notice quite a lot of "mm^2" or transistor numbers being thrown around. Why is that?
"more is better"

It is a popular belief that RSX pales in comparison to Xenos.
But the transistor numbers and the mm^2 figures are in favor of RSX.
Xenos and RSX work in fundamentally different ways, giving Xenos a significant efficiency advantage. All things being equal, two GPUs built around the same sort of design will have their performance defined by the number of transistors (summarised as mm^2 at the same manufacturing process) and clock. Where the compared GPUs differ in design, it's impossible to gauge performance by counting mm^2, transistors, FLOPs, or anything else, other than to have them broadly identify the performance brackets they come under.
 
Cell is 204 GFLOPS in the PS3 using the 6 SPEs, and RSX is 400 GFLOPS (well, that's what Wikipedia said).

That's the Nvidia marketing number; it bears about as much relation to reality as if Intel marketed Ivy Bridge as a 10+ TeraOp processor (which it is, for 1-bit ANDs and ORs!).
 
Xenos and RSX work in fundamentally different ways, giving Xenos a significant efficiency advantage. All things being equal, two GPUs built around the same sort of design will have their performance defined by the number of transistors (summarised as mm^2 at the same manufacturing process) and clock. Where the compared GPUs differ in design, it's impossible to gauge performance by counting mm^2, transistors, FLOPs, or anything else, other than to have them broadly identify the performance brackets they come under.

Great explanation!
I guess that comparing 2005 and 2012 GPU designs on the basis of mm2 is pretty useless then.
 
How often are Dev-Kits updated?

It depends.
With 360 we had Macs, got a graphics card update, and then final boxes.
There were probably also very small runs of near-final kits that we never saw.

Generally they get updated if there is value in the update. The Macs were not good indicators of anything in the final hardware, but they ran the pre-release OS; the 9600s were underpowered and did affect development, so they were changed later. After that the leap is to final hardware.
On PS3 it was basically some variation of the final hardware all the way to release; I only remember 2 kits, but I might be forgetting one.

It should also be noted that traditionally MS has based its dev machines on retail hardware; Sony hasn't.

Also, these upgrades aren't dropped on a whim. There is a roadmap based on the release schedule, any change to any of the hardware requires additional work by the OS team, and the schedules are usually very tight to ship anything.
 
It depends.
With 360 we had Macs, got a graphics card update, and then final boxes.
There were probably also very small runs of near-final kits that we never saw.

Generally they get updated if there is value in the update. The Macs were not good indicators of anything in the final hardware, but they ran the pre-release OS; the 9600s were underpowered and did affect development, so they were changed later. After that the leap is to final hardware.
On PS3 it was basically some variation of the final hardware all the way to release; I only remember 2 kits, but I might be forgetting one.

It should also be noted that traditionally MS has based its dev machines on retail hardware; Sony hasn't.

Also, these upgrades aren't dropped on a whim. There is a roadmap based on the release schedule, any change to any of the hardware requires additional work by the OS team, and the schedules are usually very tight to ship anything.

In other words, don't expect large scale changes once dev-kits have gone out unless already telegraphed by the platform holder.
 
Things do change, but it's not usually radical.

The most common case is that for whatever reason they can't manufacture the original design at the original specs, or a part provided by a vendor doesn't meet spec.
But things change for other reasons too, like reactions to a competitor: MS doubled the memory on 360.
3DO probably made the biggest change I've ever seen when they added a second CPU to M2, but that never had a release date.
There are rumors that N64 had a fairly significant change (downgrade), but the only teams affected were the "dream team" members.

Things are a lot more stable now than they used to be. The original Saturn devkits were the size of a small fridge, were missing half the hardware, ran really hot, and tended to last about 5 minutes before dying; but that was back when all the console manufacturer did was ship hardware with badly translated register documentation.

It should also be noted that not all developers get initial devkits at the same time, and not all developers see all of the devkits that are produced. For example, there will probably be a small run of pre-release hardware used by the OS team, since you can't ship devkits without the OS.
 
That Wikipedia number is misleading because it seems to be calculated in a weird way.

As far as my own math goes, RSX is a ~250 GFLOPS chip [(24*2*4-way pixel ALUs + 8*5-way vertex ALUs) @ 550 MHz].

Also, consider that Xenos is rated at 240 GFLOPS [48*5-way unified ALUs @ 500 MHz] and is generally considered faster than RSX.

Curiously enough, Pitcairn XT offers ~2,500 GFLOPS. So that should basically be the target spec (although, as I mentioned earlier, Pitcairn @ 1 GHz is way too power hungry to make its way into a console). An optimized and underclocked Oland derivative should probably be capable of reaching those target numbers within a reasonable power budget, though.

They'd roughly need 24 CUs running @ 800 MHz to end up with the target of ~2,500 GFLOPS.
The Xbox GPU (according to a presentation the Xbox guys gave us when we were starting HD DVD development) can and does routinely achieve max throughput, and apparently RSX doesn't. So in real terms, the stated max numbers are misleading. I don't know if current gen GPUs are similarly misleading.
 
The Xbox GPU (according to a presentation the Xbox guys gave us when we were starting HD DVD development) can and does routinely achieve max throughput, and apparently RSX doesn't. So in real terms, the stated max numbers are misleading. I don't know if current gen GPUs are similarly misleading.

Do you have a source/citation for that?
 
A number of developers have been open about stating that they would have preferred the eDRAM budget be dedicated to the GPU proper, and I think that would be the major conclusion if MS presented a Cape Verde + Xenos eDRAM setup.

I thought the biggest complaint regarding eDRAM in the 360 was how the frame buffer had to reside in the eDRAM without any way to bypass it. Isn't there some way to use eDRAM for bandwidth without forcing the frame buffer to sit on the eDRAM? Apologies if this is a dumb question but some of the dev complaints seem to indicate the eDRAM could have been implemented differently in the 360.

That Wikipedia number is misleading because it seems to be calculated in a weird way.

As far as my own math goes, RSX is a ~250 GFLOPS chip [(24*2*4-way pixel ALUs + 8*5-way vertex ALUs) @ 550 MHz].

Not a big deal, but RSX is 500 MHz.

Do you have a source/citation for that?

I'm guessing he's estimating this from the efficiencies gained through a unified shader architecture versus a more discrete shader model, where you can't tailor your game's shader load to your GPU spec 100% of the time. Meaning at some point your GPU may be underutilized. I could be wrong of course. :p
 
Meaning at some point your GPU may be underutilized. I could be wrong of course. :p

On a non-unified GPU it's underutilized at all points: you're either pushing simply shaded tris, in which case the pixel shaders are underutilized, or you're doing complex shading, in which case your vertex shaders are underutilized.
Real games do both at different points in a frame: drawing shadows you're not doing any pixel shading, while when you're running the pretty lighting model, pixel shading is dominant.

On 360 the eDRAM also makes a difference, since it means the frame buffer memory is never the bottleneck.

Having said that, I would be surprised if real games saw 100% utilization on a 360 for a significant portion of a frame; there are still other things that gate throughput and cause ALUs to sit idle: texture fetches, triangle setup, number of ROPs, etc.
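
As a toy illustration of that point (the ALU split is the RSX-style one from the math earlier in the thread; the 30/70 frame split is made up for the example, not measured):

# Toy model of average ALU utilisation over a frame: fixed pixel/vertex pools
# vs an idealised unified pool. All numbers here are illustrative only.
pixel_lanes, vertex_lanes = 192, 40              # RSX-style split (24*2*4 and 8*5)
total_lanes = pixel_lanes + vertex_lanes

shadow_frac, lighting_frac = 0.3, 0.7            # invented fraction of frame time per pass

# Fixed split: the shadow pass keeps only the vertex pool busy, lighting only the pixel pool.
fixed_util = shadow_frac * vertex_lanes / total_lanes + lighting_frac * pixel_lanes / total_lanes

print(f"fixed split:     {fixed_util:.0%} of ALUs busy on average")   # ~63%
print("unified (ideal): 100% if either kind of work can fill every lane")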
 
Great explanation!
I guess that comparing 2005 and 2012 GPU designs on the basis of mm2 is pretty useless then.
Depends what you're comparing. For cost purposes, taking 1 mm^2 of transistors to be about the same price no matter what year it's made, a console with a given silicon area will cost about the same. Hence the idea that if 300 mm^2 was the cost-effective limit for a $300 console in year A, 300 mm^2 would also be the limit for a $300 console in year B. That might be an unrealistic assumption and other factors are at play, but it seems an okay ballpark reference to me.

Another reference point is Moore's law. If transistor density increases 2x every 18 months, then a 10x increase in power in the same chip area happens every 5-ish years, which is the expected length of a console generation, and the OP's (and most of our) original expectation.
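
The arithmetic behind that: with an 18-month doubling period, five years gives 2^(60/18), or roughly 10x. A quick check in Python:

# Moore's-law arithmetic behind the "10x in ~5 years" expectation above.
months = 5 * 12
doubling_period_months = 18
factor = 2 ** (months / doubling_period_months)
print(f"{factor:.1f}x transistors in the same area after 5 years")  # ~10.1x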
 
First: how do you know they have eDRAM? Do you actually know stuff, or are you frequently tossing out predictions as facts? Just curious, because you cut off the discussion with these "facts", and if they are facts, great, as there is no point chasing pointless rabbit trails. But if not...

Second: 205mm^2 (Xenos logic) versus 135mm^2 is a LOT. 52% more silicon real estate, to be exact, and based on the area "sunk" into basic architecture (see above on these supposed scalings; you get more performance out of additional space), that is going to be substantially faster (looking at the 40-60% range).

I guess the issue is you have chosen to select Cape Verde as a baseline, and then anything above that is impressive, tasty, etc. ;) To review:

G71 @ 90nm: 196mm^2
RSX @ 90nm: 240-258mm^2 (depending on source)
Xenos @ 90nm: 180mm^2 (Mintmaster)
eDRAM @ 90nm: 70mm^2 (~105m transistors, of which about 80m eDRAM and 25m ROP logic -- so 205mm^2 logic)

Cape Verde @ 28nm: 123mm^2
Pitcairn @ 28nm: 212mm^2
Oland @ 28nm: 235mm^2
Mars @ 28nm: 135mm^2

Some comparisons in silicon budget (a quick script to reproduce these percentages follows the list):

RSX: Cape Verde: -53%
RSX: Pitcairn: -18%
RSX: Oland: -9%
RSX: Mars: -48%

Xenos : Cape Verde: -40%
Xenos : Pitcairn: +3%
Xenos : Oland: +14%
Xenos : Mars: -35%
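
For anyone who wants to reproduce those percentages, they are just (new - old) / old over the die sizes listed above, with RSX taken at the 258 mm^2 end of its range and Xenos at the 205 mm^2 logic figure; the results land within a point of the list due to rounding.

# Die-size deltas from the list above: (new - old) / old, in percent.
last_gen = {"RSX": 258, "Xenos": 205}       # mm^2; RSX at the high end of 240-258
new_gpus = {"Cape Verde": 123, "Pitcairn": 212, "Oland": 235, "Mars": 135}

for old_name, old_mm2 in last_gen.items():
    for new_name, new_mm2 in new_gpus.items():
        delta = (new_mm2 - old_mm2) / old_mm2
        print(f"{old_name} -> {new_name}: {delta:+.0%}")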

My premise has continued to be comparing to the relative real estate of last generation. Given that as a general ballpark baseline (for all the reasons I have cited in this thread), the GPUs you mention (Cape Verde, this supposed Mars) are major, substantial drop-offs from last generation in terms of space. They also fall far short of desktop discrete GPUs from 4-5 years *before* the launch of the new consoles.

Sure, if this was 2010 maybe Cape Verde would be a nice upgrade, but we are talking about 2013+. And there is no other way to put this: putting in a GPU that is 40-50% smaller than RSX and Xenos is making a HUGE cut in silicon real estate. And if we are to buy the GPU industry's major push toward GPGPU, which supposedly has the GPU taking on more of the sort of tasks a Cell processor does, the GPU will be asked to do more with less total space.

There is nothing exciting about this. Heck, we will be seeing AMD/Intel SoCs in the next couple of years with better-than-console performance.

Also, I don't see 28nm as an excuse. Node transitions are nothing new, but there is always someone whining (in this case NV). 28nm, which had select parts rolling out in late 2011, is going to be very mature by late 2013. Layout, yield, power, and pricing are all going to be at a more mature point than 90nm was in 2005.

If MS/Sony are going with chips that are nearly 50% smaller, it has a lot to do with their visions for their platforms (read: general entertainment devices that are equally focused on core gaming, casual gaming, new-peripheral gaming, media streaming, services, and digital distribution) instead of a core gaming experience first (which has tens of millions of customers out there) with those other features coming along as a robust package. I have no problem with that, but I also don't think it should be sugar-coated with low-end hardware like Cape Verde, or by throwing out "Kinect 2 will be so much more accurate, and default hardware, so it will work with core" while pretending the basics aren't a problem: it lacks proper input for core genres (like FPS, driving, sports), and regardless of precision, Kinect is a total non-starter in an FPS or the like because you cannot move. The only solution is rails, which is a huge step back. That may appeal to casuals but, again, is a major demand for concessions from core gamers.

TL;DR: Cape Verde/Mars are a huge reduction in hardware from last gen. HUGE.

Sorry, but I don't quite get the logic:
You are comparing (estimated) die sizes of 90nm 2005 GPU architectures against 28nm 2012 ones. The 28nm fab process allows for a lot more transistors in the same die size, not to mention that the GPUs themselves have improved vastly; to (over)simplify it, even the same number of transistors would yield a lot more performance in the 2012 GPUs.
So I believe your conclusion based on the comparison of the die sizes is not correct.

A chip that is 50% smaller on the outside could have 5 times the performance (2012 vs 2005).
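
The rough scaling behind that claim, assuming ideal geometric area scaling between nodes (real processes don't scale quite this well, so treat it as an upper bound):

# Ideal density scaling from 90 nm to 28 nm, and the transistor budget of a die
# half the size of its 90 nm counterpart. Upper-bound estimate only.
density_gain = (90 / 28) ** 2                 # ~10.3x more transistors per mm^2
half_size_budget = 0.5 * density_gain         # a half-size die still has ~5.2x the transistors
print(f"density gain: {density_gain:.1f}x, half-size die: {half_size_budget:.1f}x transistors")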
 
Sorry, but I don't quite get the logic:
You are comparing (estimated) die sizes of 90nm 2005 GPU architectures against 28nm 2012 ones. The 28nm fab process allows for a lot more transistors in the same die size, not to mention that the GPUs themselves have improved vastly; to (over)simplify it, even the same number of transistors would yield a lot more performance in the 2012 GPUs.
So I believe your conclusion based on the comparison of the die sizes is not correct.

A chip that is 50% smaller on the outside could have 5 times the performance (2012 vs 2005).

5 times the performance isn't going to cut it, and I think that is part of Acert's point. If they can go bigger and still be profitable, then they should do it, as that will help them in the long run. It will be entirely sad to see these companies release sub-par hardware, especially in the graphics department, and then get rocked when their biggest competitor comes in with overpriced hardware and still manages to sell millions in minutes on day one.

But for real, any new details about the PS4 APU? The current rumors are that the cores have changed from Bulldozer to Jaguar. Are Jaguar cores significantly less powerful than Bulldozer cores?
 