Nintendo's hardware philosophy: Always old, outdated tech?

MLAA is a compute shader (GPGPU related) and Cypress and later are much improved over R700 in that area. This technology is definitely useful for games. MLAA actually first arrived with Barts (68x0) and was then backported to Cypress and newer. Perhaps R700 could do it, but it's not unlikely that it has some sort of deficiency that would cripple performance. Remember that RV670 was nearly worthless for GPGPU due to architectural drawbacks, and that RV770 is just a refinement of that architecture.

Again, it makes no sense for Nintendo to use a 3-year-old PC architecture. RV740 entered development probably 5 years ago now; it's retro tech. If they don't do something more with Hollywood tech (yuck), they will probably go with a fairly customized design based upon ATI's current, Cypress-level technology.

BTW, Llano is destined to be a budget CPU so it will sell for cheap. It's an Athlon II replacement with spiffy graphics.
 
I found this in regard to UVD2 in the R7xx series (from hardware.fr's test of the HD 4670):
[image: UVD2 capability chart from hardware.fr's HD 4670 review]

I also read some other material (The Tech Report) on how, for instance, the 4670 compares to the 5670. Some differences may explain Nintendo's choices:
*The 4670 wins hands down in texturing power.
*The SIMD arrays are narrower, so branching should be less costly than on the 5670 (see the sketch after this list). The 4670 has 8 SIMD arrays of 40 SPs each, whereas the 5670 is made of 5 arrays of 80 SPs each.
*The 5670 benefits from twice the bandwidth of the 4670.
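A minimal sketch of why narrower SIMDs diverge less often on branchy shaders; the wavefront widths (32 threads for RV730, 64 for Redwood) and the independent-branch model are my assumptions:

```python
# Crude model of branch cost on SIMD hardware: a wavefront containing both
# taken and not-taken threads must execute both sides of the branch, so the
# narrower the wavefront, the less often you pay that penalty.

def divergence_probability(wavefront_width, p_taken):
    """Probability a wavefront mixes both branch outcomes, assuming each
    thread takes the branch independently with probability p_taken."""
    all_taken = p_taken ** wavefront_width
    none_taken = (1.0 - p_taken) ** wavefront_width
    return 1.0 - all_taken - none_taken

# Example: 2% of pixels take the expensive side of a branch
for width, chip in [(32, "HD 4670 (RV730)"), (64, "HD 5670 (Redwood)")]:
    print(f"{chip}: P(divergent wavefront) = {divergence_probability(width, 0.02):.2f}")
```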

I found a comparison here; too bad I can't find a more in-depth comparison, as I wonder how bandwidth-constrained the 4670 is in those tests.

Some other facts and various thoughts:
The 400 SPs in Llano clocked at ~600 MHz offer the same throughput as the 320 SPs of the 4670 at its base clock (750 MHz), i.e. 480 GFLOPS.
A stock 4670 (at 750 MHz, then) should provide superior texturing power to the GPU in Llano.
A 4670 has better branching performance.
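To make those numbers concrete, a quick sketch of the arithmetic (the SP counts and clocks are the figures above; the 32 TMUs are the 4670's stock count, and the 20 TMUs for Llano are my assumption of a Redwood-like layout):

```python
# VLIW5 shader throughput: each SP does one MAD (2 FLOPs) per clock
def gflops(sp_count, clock_ghz):
    return sp_count * 2 * clock_ghz

# Bilinear texel rate: one texel per TMU per clock
def gtexels(tmu_count, clock_ghz):
    return tmu_count * clock_ghz

print(f"Llano GPU, 400 SP @ ~600 MHz: {gflops(400, 0.600):.0f} GFLOPS")   # 480
print(f"HD 4670,  320 SP @ 750 MHz:   {gflops(320, 0.750):.0f} GFLOPS")   # 480
print(f"Texturing: 4670 (32 TMUs) {gtexels(32, 0.750):.0f} GTexels/s vs "
      f"Llano (20 TMUs, assumed) {gtexels(20, 0.600):.0f} GTexels/s")
```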

N is not likely to use a straight off-the-shelf GPU.
*Properly clocked, the difference between 400 SPs and 320 is a moot point. Still, if constrained by power, more units clocked lower would be better (so 10 SIMD arrays, hence 400 SPs); see the sketch after this list.
*I wonder if they could include more RBEs, why not 12? (so 2x128KB of L2)
*Keeping early Llano TDP figures in mind, and AMD CPUs' terrible power consumption (even at 32nm I don't expect miracles), I wonder if N could have room to increase clock speeds. For the highest-grade Llano the TDP is 100 W, and the clock speed of its GPU is unknown (it should be higher than ~600 MHz though). A "Xenon+" at 32nm should consume significantly less than even the two-core Llano SKU, so there's some hope.
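Here's the first-order reasoning behind "more units clocked lower" under a power cap, as a sketch; the linear voltage-frequency scaling is a deliberate simplification:

```python
# Dynamic power scales roughly with units * V^2 * f, and V has to rise with f,
# so per-unit power grows superlinearly with clock while throughput is linear.

def relative_power(units, clock_ghz, v_per_ghz=1.0):
    voltage = v_per_ghz * clock_ghz   # crude V-f scaling assumption
    return units * voltage**2 * clock_ghz

def relative_throughput(units, clock_ghz):
    return units * clock_ghz

# 8 SIMDs (320 SP) at 750 MHz vs 10 SIMDs (400 SP) at 600 MHz
for units, clock in [(8, 0.750), (10, 0.600)]:
    print(f"{units} SIMDs @ {clock * 1000:.0f} MHz: "
          f"throughput = {relative_throughput(units, clock):.1f}, "
          f"power ~ {relative_power(units, clock):.2f}")
```

Same throughput either way, but the wider, slower configuration comes out roughly a third lower on this toy power model.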

Swaaye, MLAA is possible on the 360; I can't see why it would be troublesome on a far newer architecture. R7xx is way closer to Barts, for example, than to either Xenos or R6xx.
For GPGPU, RV730 doesn't support DP, but that's a moot point for a console game. Yes, it's missing features, but I'm not sure it's that bad.

I was not interested at first when the rumours spread, but I realize now that June 7th can't come fast enough :LOL:
 
MLAA is a compute shader (GPGPU related)

MLAA is such a crappy term. In the end it's just edge detect and blur/blend, in whatever way best suits the game's performance. Just look at what solutions are already on the 360.

AAA, DLAA, ML AAAAAAAAAAAAAAAAAAAAAA

Anyways, it's not impossible to do post-process edge filtering without DX11, which is all that really matters in this context.
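For illustration, a toy CPU-side version of that edge detect + blend idea (a sketch only; the luma weights and the 0.1 threshold are arbitrary choices, and a real implementation would run as a pixel or compute shader with much smarter blend weights):

```python
import numpy as np

def edge_blur_aa(rgb):
    """Toy post-process AA: find luma discontinuities, then blend each edge
    pixel with its 3x3 neighborhood. rgb is an (H, W, 3) float array in [0, 1]."""
    luma = rgb @ np.array([0.299, 0.587, 0.114])
    # Mark pixels whose horizontal or vertical luma delta exceeds a threshold
    edge = np.zeros(luma.shape, dtype=bool)
    edge[:, 1:] |= np.abs(np.diff(luma, axis=1)) > 0.1
    edge[1:, :] |= np.abs(np.diff(luma, axis=0)) > 0.1
    # 3x3 box blur (wrap-around at borders, which is fine for a toy)
    blurred = sum(np.roll(np.roll(rgb, dy, axis=0), dx, axis=1)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    out = rgb.copy()
    out[edge] = blurred[edge]
    return out
```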
 
I just want to backtrack slightly and say that the Gamecube was not so outdated for its time, but still a little conservative (RAM amount and clock speeds).

As to RV770 being used, it makes good sense: good performance per transistor, and it won't have DX overhead to contend with. The tessellator and other features can be exposed directly "to the metal"... no idea what Nintendo will actually use though. It is all speculation at the moment (unless there has been some announcement?)
 
Did ATI/AMD not add any additional functionality or performance to it over Xenos?
I guess clock speeds are going to be similar in any case (500 MHz vs maybe 600 MHz)?
I did think there were several generations of the tessellator between product cycles.
 
It basically remained the same until DX11 since there wasn't really much incentive to do much more without a ratified standard on Windows. They had more important things to improve between R6xx and R7xx anyway. :p
 
Xenos had the same tessellation hardware as R600 and I think that hardware was improved in R700, though not significantly.
 
Yeah, not much.

For R7xx-generation hardware, and therefore RV740, ATI claims to have tweaked the tessellator by adding interaction with geometry shading. We can only assume, based on how ATI represents the tessellator in its newer presentations, that this means it is now possible to modify the tessellated mesh via a GS program, because verifying it is impossible: at the moment, only the DX9 tessellation libraries and programming examples are available, so the interaction with D3D10+ and OpenGL (and indeed compute, since the tessellator is a legit avenue for compute acceleration) is anyone's guess.
Obviously there's the geometry shader added from DX10. (Big Whoop)
 
[image: AMD HD 6800 presentation slide on tessellator generations]

AMD claims the tessellators are 3 generations apart between Xenos and R700.

The HD 2xxx and HD 3xxx tessellators seem to have been almost untouched, with no additional performance or features over the tessellator in Xenos.
That probably means the tessellators in Xenos and R700 are only "1.5" generations apart: "1" for geometry shader compatibility and "0.5" for the supposed added performance.
 
A Radeon 8500 would be a nice upgrade from Hollywood. How small would that be on 40nm? ;)
 

Yes, I think it's becoming more and more clear to me that we'll be looking at a 4670. I don't know why it didn't hit me before, but RV770 has a 256-bit bus, making it even more unlikely. You're realistically limited then to RV740 or RV730, the two with a 128-bit bus. Of course, for other reasons my money is heavily on the 730.

I also believe that for cost reasons, Nintendo will leave the chip as close to stock as possible. Nintendo doesn't spend a dollar on hardware they don't have to. They won't spend the $ to rejigger anything.

BTW, I'm hoping we'll know more details before June 7. IGN said they have another Wii 2 tech article coming "this month".
 

Your reasoning is sound; however, I'm a bit wary of this becoming "internet truth". It's all based on the "R700" rumour, and if that is incorrect or misinterpreted, along with the assumptions on power draw et cetera, the whole intellectual house of cards falls down.
It is vexing that we may never know the specs. (Those of us not under NDA.) In a sense I sympathize with that, since the proof of the pudding is in the eating, and throwing specs around is generally a sign that there is nothing particularly worth mentioning anywhere else. But on the other hand, I feel that the engineers doing the design deserve a bit more recognition for their work than they can get if it is all under wraps. The GC wasn't very secret though, and since the Wii 2 will probably outperform the PS360, they may disclose a bit more this time around in order to drive that point home. Let's hope for that.
 
I think we'll find out, thanks to IGN's and the rest of the internet's sleuthing. At least we will end up with a pretty good idea. Worst comes to worst, there are teardowns after it's out too. And of course, the games at E3 will give us an idea.

But yeah, can't wait for E3.

And yup, it is all based on the R700 rumor, because that was pretty specific, and it's all we have to go on.

Well, I also think a tri-core CPU (Xenon-based apparently, but even if not) signifies we're not dealing with anything cutting edge here. Cutting edge would be 6, or at least 4, cores. And you're not going to pair a monster GPU with an underpowered CPU; they'll be balanced.
 
By what means would the various components be identified? How certain could you be of the specifications if you were working on a NES 6 game right now? Maybe they call it an R700 because certain parts of the architecture haven't been exposed yet? I'm not sure; maybe someone can enlighten me here...
 
I also believe that for cost reasons, Nintendo will leave the chip as close to stock as possible. Nintendo doesn't spend a dollar on hardware they don't have to. They won't spend the $ to rejigger anything.

Stock RV730 would be great; it would mean Nintendo didn't castrate the GPU somewhere to save costs, and PC ports would be a breeze.

But I think a GPU without eDRAM may have some trouble emulating some Wii titles, unless they go with a UMA of 128-bit GDDR5, or at least 128-bit >1600 MHz DDR3 (assuming the eDRAM in Napa may go up to ~27 GB/s, given the Gamecube's peak of 18 GB/s, the sum of the texture path's 10.4 GB/s and the framebuffer's 7.6 GB/s).
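The bandwidth arithmetic behind that, as a sketch (the ~27 GB/s eDRAM figure is the estimate above; the memory speeds are the ones named in the post):

```python
def peak_gbs(bus_bits, transfer_mtps):
    """Peak bandwidth in GB/s for a bus of the given width and transfer rate (MT/s)."""
    return bus_bits / 8 * transfer_mtps / 1000

gamecube_edram = 10.4 + 7.6   # texture path + framebuffer path, GB/s
print(f"Gamecube eDRAM peak:      {gamecube_edram:.1f} GB/s")
print(f"128-bit DDR3-1600:        {peak_gbs(128, 1600):.1f} GB/s")   # 25.6
print(f"128-bit GDDR5 @ 4 GT/s:   {peak_gbs(128, 4000):.1f} GB/s")   # 64.0
```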


Although I wonder if the 1MB texture eDRAM in the Wii was ever really used by Wii titles, or whether it was only there for Gamecube compatibility, as it's really very small and developers had the 64MB of GDDR3 at their disposal for texture buffering.
For some reason, the Dolphin emulator allows many Gamecube games to be played at full speed with a 2.0 GHz C2D + GMA X4500 + 128-bit DDR2/3, and many Wii games to be played with a Phenom II X2 + Radeon 4200 IGP + 128-bit DDR3.
 

1MB isn't small for a texture cache (what was the Xbox's texture cache, 128 KB?). It's several times faster than the 64MB pool, in both pure bandwidth and latency, so there's no reason not to use it.

I also think the GPU will be modified; since when have Nintendo gone with stock PC parts? Nintendo likes modified/custom parts, and this won't be any different IMO. Also, IGN claims it's a revamped R7xx-based chip.
 
By what means would the various components be identified? How certain could you be of the specifications if you were working on a NES 6 game right now? Maybe they call it an R700 because certain parts of the architecture haven't been exposed yet? I'm not sure; maybe someone can enlighten me here...

Well, let's work from the rumors and the common prejudice re: Nintendo.
The CPU is rumoured to be a tri-core. The most obvious tri-core in use is the one in the XB360, so let's assume that Nintendo uses exactly that. Let's also assume that the R700 rumour is correct. Tech-forum prejudice regarding Nintendo says that they will go low on power and minimal on engineering effort, which pegs the HD 4670 as a likely candidate for the GPU. Let's also assume that they use GlobalFoundries for fabbing, at 32nm, same as AMD uses for their new CPUs.
That would give us a CPU performing slightly better than the one in the XB360, simply due to advances in memory bandwidth (and probably internal communication with the memory controller/GPU). It would also give us a GPU with twice the ALU horsepower of Xenos, along with some architectural advances and three times the memory bandwidth, assuming 128-bit GDDR5. Let's go cheap and assume 1GB of unified memory.
On the assumed process, this would draw some 50-ish watts, and neatly provide the required step up from the XB360 to render cross-platform games at a step up in resolution.
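Checking those multipliers quickly (a sketch; the Xenos figures are the usual public numbers, and the 4 GT/s GDDR5 speed is my assumption):

```python
# Xenos: 48 unified shaders x 5 ALUs x 2 FLOPs x 500 MHz, 128-bit GDDR3 at 1.4 GT/s
xenos_gflops = 48 * 5 * 2 * 0.500     # 240 GFLOPS
xenos_bw = 128 / 8 * 1.4              # 22.4 GB/s

# HD 4670-class GPU: 320 SPs at 750 MHz, with assumed 128-bit GDDR5 at 4 GT/s
hd4670_gflops = 320 * 2 * 0.750       # 480 GFLOPS
gddr5_bw = 128 / 8 * 4.0              # 64 GB/s

print(f"ALU:       {hd4670_gflops / xenos_gflops:.1f}x Xenos")   # 2.0x
print(f"Bandwidth: {gddr5_bw / xenos_bw:.1f}x Xenos")            # 2.9x
```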

It all fits together. It's all based on conjecture, prejudice and rumors.
IF Nintendo did this, it would actually be kinda neat. I could hope for a bit more in the graphics department (the HD6670 draws roughly the same as the HD4670), but otherwise this would do the job, and the main innovation and extension of the current state of the art would lie in the controller. Do the same thing as the other consoles, but better, and add your own unique twist and selling points. Makes sense.
 