ihamoitc2005 said:
Question is not if texture is needed but what proportion of total shader cycles is required in typical situation. Maybe a developer with real experience can provide us with this answer.
If there is even a single filtered texture in a shader, then RSX fails to sustain peak theoretical shader operations, even at 100% efficiency. Since textures are needed for useful shading power, we should consider real-world performance to lie somewhere between max texture usage and no texture usage, not at either end, correct? (an average, as you've said). However, looking at R520 and G70, how often does the latter actually gain twice as much speed? Even when there are fewer textures, in how many situations can that second ALU actually be used effectively? Some games, like F.E.A.R., already have a higher than 3:1 ratio of arithmetic to texture ops, yet the lead G70 has matches fairly well with its advantage in fragment pipes × clockrate.
ihamoitc2005 said:
My understanding of Aaron's comments was that bandwidth is insufficient for sufficient fill-rate not just for peak fill-rate but for simply rendering at same quality as Xenos. If you look at his post then it seems this is what he is trying to say. ROP count is secondary to his main statement: "The fill-rate of RSX will be severly limited by the memory bandwidth available."
You misinterpreted, then. He said that half of the ROPs were pointless because it doesn't have the bandwidth to support them. This has nothing to do with sufficient fillrate, but rather peak fillrate (and how realistic and approachable it is). And ROP count is integral to his main statement, since it DEFINES what the fillrate is. Do not separate them. The statement that it will be limited has additional and further implications beyond simply the 8 vs. 16 ROPs, but the main point is that there's no way it has the bandwidth to support 16 ROPs in a realistic situation (i.e., with textures and geometry). Once it's down to 8 ROPs there's still the possibility of running into bandwidth issues, but it is far less severe.
ihamoitc2005 said:
With Heavenly Sword and other real-time demos we know fill-rate is sufficient to equal or be better than reference platform so therefore it is not a limitation. As for extra ROPs, if included, maybe it is for SSAA.
Er... sufficient for a videogame != sufficient for peak fillrate (which is what the original argument stemmed from: Xenos vs. RSX fillrate. In which case, Xenos has exactly the amount of bandwidth it needs to support 4 Gigapixels with 4 samples each at 32bpp: 4 billion pixels × 4 samples each × 64 bits (color + Z/stencil) × 2 (read and write) = 256GB/s.)
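To sanity-check that figure, the arithmetic works out exactly (a quick sketch; the 8 bytes/sample assumes 32-bit color plus 32-bit Z/stencil, as above):

```python
# Back-of-the-envelope check of the Xenos eDRAM bandwidth math above.
pixels_per_sec = 4e9      # 4 Gigapixels/s peak fillrate
samples_per_pixel = 4     # 4x MSAA
bytes_per_sample = 8      # 32-bit color + 32-bit Z/stencil
rw_factor = 2             # read-modify-write traffic
bandwidth_gbps = pixels_per_sec * samples_per_pixel * bytes_per_sample * rw_factor / 1e9
print(bandwidth_gbps)     # -> 256.0
```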
ihamoitc2005 said:
What problems do you feel exist with organization of "nV" pipelines? Real world performance of G70 is excellent and many believe it is extremely effective and proven design.
Not necessarily problems, but it goes against what I see as the preferred method: KISS. ATI, in the PC space, now has a decoupled TMU. The ALU setup consists of just one fully capable Vec3+scalar ALU with a mini-ALU (which, according to them, is only still there because they have a good compiler for this arrangement). Then one ROP for that. Xenos takes it further by fully dissociating the TMUs, removing the mini-ALUs (AFAWK), and keeping ROPs in line with what can realistically be supported.
G70 has two ALUs, two mini-ALUs, does partial-precision normalize, and the first ALU is tied to the texture address processor and is consumed by a texture fetch. You always use the two ALUs (hopefully, though not some of the other hardware), but ultimately, how well does it work when using both for arithmetic ops? Despite the first ALU being upgraded to full MADD capability, there hasn't really been a performance jump. So far it seems inefficient, like a waste of hardware. Xenos' design changes in that regard, as well as unified shaders, just define Xenos as the more elegant solution IMHO.
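To make the dual-issue argument concrete, here's a toy cycle-count model (my own simplification, not how NVIDIA's actual scheduler works): ALU1 issues either a texture op or a math op each cycle, ALU2 issues only math, so texture-heavy shaders leave ALU2 idle:

```python
import math

def g70_pipe_cycles(tex_ops, alu_ops):
    # ALU1 must cover all texture ops; remaining math ops split between
    # ALU1's free slots and ALU2. Toy model only, ignores mini-ALUs etc.
    return max(tex_ops, math.ceil((tex_ops + alu_ops) / 2))

print(g70_pipe_cycles(1, 3))  # F.E.A.R.-like 3:1 math:tex -> 2 (both ALUs busy)
print(g70_pipe_cycles(3, 1))  # texture-heavy 1:3 mix      -> 3 (ALU2 mostly idle)
```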
It's powerful, effective, and proven. Doesn't mean it isn't broken, however. Unified shaders wouldn't exist for either nv or ati, otherwise. They both have part of what is probably the right design, however. Units that can serve as fragment ALUs, vertex ALUs, or TMUs. Of course, the latter might be extremely expensive in transistor costs, preventing GPUs from going in that exact direction (especially if texture growth stays the same while shader growth continues to increase relative to that).
ihamoitc2005 said:
I do not know of which charts you are speaking since I have not seen any with incorrect Xenos info, only correct Xenos info, but there is uncertainty of what is RSX precisely due to some incompatible specs with RSX such as dot-product and what changes to pipeline architecture are made as well as precise method of accessing CELL and XDR.
Mostly-correct info, but it's PR-glossed. It misses the fact that each Xenos ROP can support 4 multisamples, so while RSX's fillrate goes to 17.6 Gsamples at 2xAA, Xenos makes a second jump at 4xAA to 16 Gsamples, making it quite comparable to RSX. And saying that RSX has 48GB/s of bandwidth to Xenos' 22.4, while completely ignoring the 256GB/s for all the bandwidth-consuming parts of framebuffer ops? Heh.
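The fillrate comparison above reduces to ROPs × clock × samples per ROP per clock (using the commonly cited 550MHz/16-ROP figure for RSX and 500MHz/8-ROP for Xenos; treat the RSX numbers as the chart's claim, not confirmed specs):

```python
def gsamples_per_sec(rops, clock_mhz, samples_per_rop_per_clock):
    # Peak multisample fillrate in Gsamples/s
    return rops * clock_mhz * samples_per_rop_per_clock / 1000.0

print(gsamples_per_sec(16, 550, 2))  # RSX at 2xAA   -> 17.6
print(gsamples_per_sec(8, 500, 4))   # Xenos at 4xAA -> 16.0
```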
And why should we toss away the other info that chart presents [2x (Vec4+scalar) + fp16 normalize and all the subsequent data] because of one dot-product number, which isn't entirely dependent on GPU calculations? The reasoning just doesn't make much sense to me. As for the precise method of accessing Cell and XDR, well, I'm waiting for all the nitty-gritty on Xenos too. I don't see that coming, even after the console's release (though it's undoubtedly floating around in one of those white papers that most other people seem to have).
ihamoitc2005 said:
How are you certain that "development time" is restriction on Xenos output quality? Do you have information that it is not memory bandwidth restriction or other hardware limitation such as unified shader inefficiency that causes low output for games such as PGR3 with false 720P with upscaling and 30fps?
Perhaps because it is? PGR3's team wanted to rewrite the engine once they got further hardware. Obviously there were problems or extra room there. And why shouldn't one of the problems be one that we've known about since Dave's article (which it seems you haven't read)? Tiled rendering really needs to be designed in while you're making the game, rather than tacked on at the end (if that would even be realistically possible, considering everything). Without tiling, you're limited to 720p, or lower with ANY level of AA on it. And, as far as everyone has said, that is precisely why PGR3 renders at the lower resolution (which just happens to be the right amount for 2xAA in the eDRAM framebuffer? Heh). Why would it ever be a memory bandwidth restriction?
Sorry, but there is 256GB/s of framebuffer bandwidth. With 2xAA, only half of that can even be used, because fillrate tops out first (at 8 Gsamples/s). So the only possible bottlenecks are fillrate, RAM bandwidth, or something else. Fillrate is unlikely: at 60fps and 720p, that's over 70 pixel writes per output pixel (someone correct me if I'm horribly off). 22.4GB/s to main RAM is a possibility, but if so, then PS3 is in equal trouble, since it has to fit the framebuffer into little over double that. If there's any problem besides tiled rendering, then it's in the computational core. But tell me, why would the unified shaders be inefficient? Why would they have continued development of unified shaders if they couldn't get reasonable performance? And would those performance losses, in the unlikely case they existed, be greater than the performance gains from automatic load balancing?
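The "over 70 pixel writes per output pixel" figure checks out (rough arithmetic, assuming Xenos' 4 Gpixels/s peak and ignoring overdraw patterns):

```python
fillrate = 4e9                     # Xenos peak fillrate, pixels/s
fps, width, height = 60, 1280, 720 # 720p at 60fps
writes_per_output_pixel = fillrate / (fps * width * height)
print(round(writes_per_output_pixel, 1))  # -> 72.3
```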
ihamoitc2005 said:
If Xenos is super efficient and capable as some say then 60fps graphics output for games like PGR3 should be "piece of cake" no? Remember at false-720P resolution, inexperience with tiling method cannot explain poor performance because no tiling is required and eDRAM is enough for entire frame.
You ignore the system as a whole to focus on the GPU, despite earlier saying CELL could help RSX do more vertex processing? What was Carmack's comment on the new CPUs? I believe he said Xenon was comparable to a 1.6GHz OoOE CPU if you just took your old code and put it on there. And how long did PGR3 have with final hardware? There's also the issue that the game uses extremely high-quality textures in a very large world, and Xenos' filtered texture fetch rate is outpaced by the X1800XT and 7800GTX 512MB (peak), which could potentially be a problem. Also, doesn't PGR3 use rendertargets for the cubemaps used in reflections? Since you might use them on more than just one car, these would multiply the number of times you're rendering geometry and such by a huge number. Of course, I don't know exactly how they're doing everything (and if that's correct), or how bad that would be if that's what they're doing.
ihamoitc2005 said:
He is referring to it but he does not know it. Entire purpose of LS design is to avoid "hitting" RAM. He is saying LS based design prevents SPU from "hitting" RAM, I am saying that is a good thing he is saying that is a bad thing.
No. You misunderstand what he's referring to.
ihamoitc2005 said:
Original Xbox had 6.4GB/s unified memory and can perform 720P at 60fps so what is your estimate of PS3 capability and effectiveness of compression with 48GB/s available? Also this is not my proposal, it is PS3 design. CELL can access GDDR3 and GPU can access XDR.
Think of it as 48GB/s unified memory.
Then you remove your fillrate argument presented earlier. Good to hear; it was somewhat out there in the first place. (But it's PS3's design to split the framebuffer in half across two separate memory pools? Just because the GPU can render into XDR memory doesn't mean you want it to, especially for framebuffer ops! Not only that, but last I heard, Xbox had problems with some game with rain in it, due to having far less framebuffer bandwidth than PS2. Particles = fillrate. Remember that.)