PS3 GPU not fast enough.. yet?

It comes from the Inq, sorry about that... but IIRC, there was some talk about a "spec change" for the PS3 and that something might have been "upped" or "downed".
Anyway, here goes...

SOME VERY INTERESTING tidbits have emerged about the PS3 GPU during my flight to Japan yesterday. It seems like the second gen dev kits were running nowhere close to full speed or spec. How close were they? Fairly.

The disturbing part is that the slide I was shown had "Current DEH's aren't final spec or speed" in bold letters. Speed, OK, but not final spec at this point in time leaves precious little room for debugging before the console release. On a different note the current ones are running the RSX core at 420MHz with 550 expected for launch. Memory is set at 600MHz with 700 hoped for as final.

The rest is here:
http://www.theinquirer.net/?article=32159
 
EndR said:
It comes from the Inq, sorry about that... but IIRC, there was some talk about a "spec change" for the PS3 and that something might have been "upped" or "downed".
The current devkits' RSX chips are indeed not running at 550MHz, but ~100MHz slower, so it's logical to expect the final parts not to share the devkits' clockspeeds.
The part about the specification is just the Inq passing off speculation as if it were an actual rumor. In other words, it's just the usual Inq stuff.
 
This is really a filler story... so what if pre-release dev kits are not running at full speed? This happens all the time.

[/Yawns at Inq]
 
Meh. The memory speed is known (i.e., from what I have read it currently is 600MHz), but 700MHz GDDR3 won't be an issue, seeing as the 360 has it and it was released in 2005. As long as they have good ventilation they should be fine.
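Quick back-of-envelope in Python on those memory clocks (the 128-bit bus is my assumption, not something from the thread):

[code]
# Peak GDDR3 bandwidth = clock * 2 (DDR) * bus width in bytes.
# The 128-bit bus is an assumption; only the clocks come from the rumor.
BUS_BYTES = 128 // 8  # 16 bytes per transfer

for mhz in (600, 700):
    gb_per_s = mhz * 1e6 * 2 * BUS_BYTES / 1e9
    print(f"{mhz}MHz GDDR3: {gb_per_s:.1f} GB/s peak")
# -> 600MHz: 19.2 GB/s, 700MHz: 22.4 GB/s
[/code]

So the hoped-for bump is worth roughly 3GB/s of peak bandwidth, if those clocks hold.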

As for the GPU clock, if the rumor is true, it is not a big deal. The initial Xenos chips were around 350MHz in June/July '05, and 550MHz chips with working eDRAM were not delivered to developers until August '05. PS3 devs have had access to NV's SM3.0 tech for a long time now, and on the PC side they do have access to 550MHz chips to benchmark.

While interesting, I would not associate it with any upgrade/downgrade talk. There were people rumoring more memory months ago, but everything seems pretty locked in now. The "downgrades" are also known (fewer Ethernet ports, fewer USB ports, only 1 HDMI, etc.) and did not impact the core performance parts. Cell, RSX, and the memory all look to be on track for the E3 2005 targets.

I would be shocked if Cell, RSX, or the memory underwent any changes.
 
Hey Vysez, do devs code for the current 400+MHz RSX speeds, with higher framerates to come, or do they code for the final 550MHz speeds and accept lower framerates while in development?

For example, would the Evolution devs making Motorstorm or NT making Heavenly Sword expect a boost in graphics, physics, particles, blah blah blah once they receive the real RSX, or are they already coding for it and not expecting any boost anyway?
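(For reference, here is the pure clock math in Python. It assumes a frame that is entirely GPU-clock-bound, which is a big simplification, and uses the rumored clocks from the article:)

[code]
# If a frame were entirely GPU-clock-bound, frame time scales
# inversely with core clock. Clocks are the rumored devkit/launch ones.
current_mhz, final_mhz = 420, 550

frame_ms_now = 33.3  # example: a 30fps frame on today's devkit
frame_ms_final = frame_ms_now * current_mhz / final_mhz
print(f"{frame_ms_now}ms at {current_mhz}MHz -> "
      f"{frame_ms_final:.1f}ms at {final_mhz}MHz")
# -> ~25.4ms, i.e. roughly 31% more GPU headroom at the final clock
[/code]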
 
I think he's just poking fun at the wording in the original article...
SOME VERY INTERESTING tidbits have emerged about the PS3 GPU *during my flight to Japan* yesterday.
(My emphasis.)
 
anyone else see this article?

http://www.theinq.com/?article=32171

let me know if it's posted elsewhere.
This is the important picture.
Memory performance:
Cell (Read) - Local: 4MB/sec

As for triangle setup, that would be down to unified shaders I'd guess. Hmm. Can't find any comparative specs for other desktop hardware. Guess I'm not looking hard enough.

How exactly do you define 'local' memory on the Cell chip anyway? I'm a tad confused.
 
Graham said:
let me know if it's posted elsewhere.

Nope, never seen that! Not even in the PDF none of us have :LOL:

Btw, for that slide:

Main Memory = XDR
Local Memory = GDDR3

EDIT: Btw, you flip flopped the numbers to a degree.

CELL (Read) - Local Memory: ~16MB/s
CELL (Write) - Local Memory: ~4GB/s

You got the 4 and 16 flip flopped, but got the MB part right.

As for the triangle setup rate, it is not due to the unified shaders but is more related to the setup engine. Xenos does 1 vertex per cycle (one every 2 cycles when tessellating, I believe); RSX seems to do one every 2 cycles. The peak triangle rate for the vertex shaders on each is higher than their setup rate, though (of course vertex shaders can do more than just that). I am sure at some point the PDF I don't have will become more public knowledge and people will begin talking about it. Until then you can ignore me...
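To put rough numbers on that in Python (the per-cycle rates are my reading, not confirmed spec):

[code]
# Peak triangle setup = core clock * vertices set up per cycle.
# Per-cycle rates below are my reading, not confirmed spec.
rates = [
    ("Xenos, 500MHz, 1 vert/clk", 500e6, 1.0),
    ("Xenos tessellating, 1 vert/2clk", 500e6, 0.5),
    ("RSX, 550MHz, 1 vert/2clk", 550e6, 0.5),
]
for name, clock, per_cycle in rates:
    print(f"{name}: {clock * per_cycle / 1e6:.0f}M setups/s")
# -> Xenos 500M vs RSX 275M, hence the "half the setup rate" line
[/code]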

For those of you that believe in religions with karmic tendencies, scoops like this meant one of two things: the wings of the plane are about to fall off and I am going to die in a fiery ball, or worse yet, the movie selection will be worrisome. Cell memory access appears to be broken, RSX has half the triangle setup rate of the ATI chip in XBox360, and the true horror, Big Momma's House 2 and a Queen Latifah movie.

:LOL: at the funny plane humor. As for the "Cell is broken" :rolleyes:
 
I would guess it's because the 'Local' RAM is the GDDR3 (like was pointed out), and for Cell to read and write from it, it basically has to go through two memory controllers... first its own... then make the request to the RSX's memory controller... then the RSX has to fulfill it...

The 16MB/s seems really low to me though...
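To put that ~16MB/s in perspective, in Python (buffer sizes below are just examples):

[code]
# Wall-clock time for Cell to read buffers out of GDDR3 at ~16MB/s.
# Buffer sizes are illustrative, not from the slide.
READ_MB_S = 16
buffers = [
    ("64KB chunk", 64 / 1024),
    ("1MB texture", 1.0),
    ("720p 32-bit framebuffer", 1280 * 720 * 4 / 2**20),
]
for name, mb in buffers:
    print(f"{name}: {mb / READ_MB_S * 1000:.1f} ms")
# -> ~3.9ms, ~62.5ms, ~219.7ms: the framebuffer alone costs several
#    whole frames of time, so per-frame reads are out of the question
[/code]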
 
Acert93 said:
Nope, never seen that! Not even in the PDF none of us have :LOL:
CELL (Read) - Local Memory: ~16MB/s
CELL (Write) - Local Memory: ~4GB/s

:LOL: at the funny plane humor. As for the "Cell is broken" :rolleyes:

Those numbers to me look like something from a broken design.
 
mozmo, Local Memory != Local Store. Local Store would be the 256K that each SPE has; the "Local Memory" in that slide is RSX's local memory, i.e. the 256MB of GDDR3. At least I believe that is the terminology being used there.

EDIT: What happened to the rest of your comment! Bah, anyhow...
 
Hehe, I changed it 'cause the slides are confusing. I think it is referring to Cell being able to read the GDDR memory pool; still, that's pretty crappy. Looks like you gotta do a lot of redundant memory copying back and forth from system memory to leverage Cell for advanced read-from-GPU framebuffer/depth buffer effects. Same goes for procedural textures/models being generated on Cell and fed into the RSX. Looks like a PC, where the system memory is the bottleneck when it comes to the CPU/GPU working together.
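The write direction does look workable for the procedural stuff, though. A rough per-frame budget in Python, taking the slide's ~4GB/s write figure at face value:

[code]
# Per-frame budget for Cell pushing procedural data into GDDR3,
# assuming the slide's ~4GB/s write figure (a peak, not sustained).
WRITE_GB_S = 4.0
for fps in (30, 60):
    mb_per_frame = WRITE_GB_S * 1024 / fps  # treating 1GB as 1024MB
    print(f"{fps}fps: ~{mb_per_frame:.0f}MB of Cell->GDDR3 writes per frame")
# -> ~137MB at 30fps, ~68MB at 60fps, before RSX's own framebuffer
#    traffic starts eating into the same bus
[/code]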
 
Acert93 said:
mozmo, Local Memory != Local Store. Local Store would be the 256K that each SPE has; the "Local Memory" in that slide is RSX's local memory, i.e. the 256MB of GDDR3. At least I believe that is the terminology being used there.
Acert, any ideas about the inclusion of the "no, this isn't a typo..." comment on the slide?

-aldo

Edit: My bad. Just Sony emphasizing that RSX should be focused on the Local Memory.
 
aldo said:
Acert, any ideas about the inclusion of the "no, this isn't a typo..." comment on the slide?

Well, the "this isn't a typo" is to make sure people know they did mean MB instead of GB. The comment is just pointing out that you really shouldn't try reading directly from the GDDR3 with Cell. But there are other ways around this (possibly RSX rendering to a target in XDR?). I am sure the devs and more technically inclined here will have some substantial comments later today.
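To illustrate why that workaround would matter, here is a Python sketch comparing the two read-back paths. The 25.6GB/s is XDR's well-known total bandwidth, so treat the second number as an upper bound (the real share any one client gets is less):

[code]
# Reading back a 720p 32-bit render target with Cell, two ways.
TARGET_MB = 1280 * 720 * 4 / 2**20  # ~3.5MB

direct_ms = TARGET_MB / 16 * 1000  # Cell reads GDDR3 at ~16MB/s
# Workaround: RSX renders/copies the target into XDR first, then Cell
# reads it there. 25.6GB/s is XDR's total, so this is an upper bound.
via_xdr_ms = TARGET_MB / (25.6 * 1024) * 1000

print(f"direct GDDR3 read:     {direct_ms:.0f} ms")
print(f"via XDR render target: {via_xdr_ms:.2f} ms (upper bound)")
[/code]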

mozmo said:
Looks like a PC, where the system memory is the bottleneck when it comes to the CPU/GPU working together.

I don't think it is that bad. RSX can read/write from/to XDR fairly fast (as expected), and Cell can write to RSX as well. The only surprise is the limited bandwidth Cell has to the GDDR3. Yet in context it is not surprising (to me at least): the GDDR3 is going to be pretty saturated with the framebuffer, so Cell reading from GDDR3 doesn't make a lot of sense as a common operation. Between Cell being able to write 4GB/s to the GDDR3 (for things like geometry and procedurally created geometry and textures) and Cell being able to write to RSX directly, it does seem to work fine for the more commonly expected tasks. I am sure some were getting their hopes up for some new/unique ideas, but it looks like the memory allocation fits well with the idea that GDDR3 is for graphics and XDR can be used for Cell and some graphics.

But like I said, we will probably get a lot better/accurate comments later.
 
Acert93 said:
EDIT: Btw, you flip flopped the numbers to a degree.

You got the 4 and 16 flip flopped, but got the MB part right.

Oops. Yeah, put that down to my wandering head today.

So would that mean copying in vertex/texture/whatnot data isn't as straightforward as one would hope?
How, for example, does Warhawk get the ray-traced clouds into the GPU for rendering?
[edit]
OK, never mind, I just needed to get over my head cold and think :p
 
Graham said:
How, for example, does Warhawk get the ray-traced clouds into the GPU for rendering?
How about the same way everything else gets there - namely, by the GPU initiating a read from memory.
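(Back-of-envelope on that path in Python: the texture size is a made-up example, and the fractions of XDR's 25.6GB/s that RSX actually gets are guesses:)

[code]
# A hypothetical 512x512 RGBA cloud texture that Cell writes into XDR
# each frame and RSX then reads from XDR while texturing.
TEX_MB = 512 * 512 * 4 / 2**20  # 1MB

# Assume RSX only gets some fraction of XDR's 25.6GB/s total.
for share in (0.25, 0.5, 1.0):
    ms = TEX_MB / (25.6 * 1024 * share) * 1000
    print(f"RSX read at {share:.0%} of XDR bw: {ms:.3f} ms per frame")
# Even at a quarter of the bus, the fetch is a fraction of a millisecond.
[/code]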
 