G70 Benchmarks @500/350

My apologies, I was assuming we were just using local SRAM here.

What do you mean by local SRAM?


The RSX has a 128-bit bus to 700 MHz GDDR RAM. This gives it the same bandwidth the G70 has when its RAM is clocked at 350 MHz.
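For reference, here's the arithmetic behind that claim as a minimal C sketch (the bus widths and clocks are the commonly quoted figures, treated here as assumptions; GDDR transfers twice per clock):

```c
#include <stdio.h>

/* Peak bandwidth in GB/s for double-data-rate memory:
   (bus width in bytes) * clock in MHz * 2 transfers/clock / 1000 */
static double peak_gbps(int bus_bits, int clock_mhz)
{
    return (bus_bits / 8.0) * clock_mhz * 2.0 / 1000.0;
}

int main(void)
{
    printf("RSX, 128-bit @ 700 MHz: %.1f GB/s\n", peak_gbps(128, 700));
    printf("G70, 256-bit @ 350 MHz: %.1f GB/s\n", peak_gbps(256, 350));
    /* Both print 22.4 GB/s: underclocking the G70's RAM to 350 MHz
       cancels out its 256-bit bus advantage, matching the RSX. */
    return 0;
}
```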

People are saying it's not a great test because the PS3 has another bus to the RSX.

On the other side of this bus you find the Cell processor, and on the other side of Cell is the XDR RAM. However, Cell needs that XDR RAM itself, so of course you will never get its full bandwidth for use with the RSX.

So they've taken to claiming that they will use the FlexIO bus between Cell and the RSX to tile the framebuffer into Cell's cache. However, this will end up starving the Cell processor.
 
DaveBaumann said:
Titanio said:
I was talking about multiplatform console games - console-only, as mentioned above. A multiplatform game with a PC version is little better, if not worse - well, depending on which platform was driving development, at least. If it was the PC, it's certainly at least as bad for this kind of analysis.

So, anything that's been seen using the UE3 engine isn't going to be particularly good at utilising next-gen console hardware?

I think it'll be significantly outclassed by some games by the end of these consoles' lifecycles, unless they put some serious effort and time into optimising for each platform - and perhaps they will, more than the usual multiplatform title, given that it's more than "just" a game: it's an engine they want to sell, and performance on each platform is a selling point. That aside, I think there will be better indicators of technical capability later, at least.
 
nAo said:
Oh my god, I don't know how many times I have to repeat it :)
RSX DOES NOT NEED TO ACCESS XDR RAM IN ORDER TO USE FLEXIO BANDWIDTH.
There are other memory-like resources on the CELL CPU that RSX can exploit (and I'm not talking about procedural stuff, but just plain standard rendering).
Are you serious? Well, I suppose if you're going to bend over backwards, you might as well play Beethoven's Fifth while you're at it.
 
jvd said:
So they've taken to claiming that they will use the FlexIO bus between Cell and the RSX to tile the framebuffer into Cell's cache. However, this will end up starving the Cell processor.

I always considered this in terms of data being fed directly to the SPEs' local SRAM, not the PPE cache; that's what I meant. Both the cache and the SPEs can snoop data off the EIB.

Inane_Dork said:
Are you serious? Well, I suppose if you're going to bend over backwards, you might as well play Beethoven's Fifth while you're at it.

Console developers often do :LOL: It'll be interesting to see what Sony and Nvidia have actually done to facilitate CPU<->GPU sharing... they've put it out there seemingly as one main focus of the system.
 
jvd said:
How is this going to help with the buffers? The FlexIO won't help at all unless you use the cache to do that, but then again you're taking the wind out of Cell to feed the RSX.
When I'm the dev, I'm also the one who decides how to allocate caches and local stores, and how to design every aspect of my game routines according to my choices.
SPEs each have 256 KB of local store, and not every kind of algorithm needs it all; when you're doing streaming-like processing, all you need is enough memory to prefetch data so as not to stall the SPEs.

Going to take a huge amount of tiles to fit the buffers into that cache
I should remind you that most of CELL's on-chip memory is not cache.
and of course you're going to be taking the wind out of Cell's sails. Those caches were made to keep the Cell CPU fed. Taking those away to feed the RSX will greatly reduce the performance of the Cell chip
Those memories are there to keep the PPE and SPEs fed according to what developers design to run on them.
Most of the time you don't need the full 256 KB of local store just for some T&L/vertex shading on SPEs; you can live with much less than that.
(this is just an example)
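To make the "enough mem to prefetch" point concrete, here's a minimal double-buffered SPE streaming sketch, assuming the Cell SDK's spu_mfcio.h MFC intrinsics (the 16 KB chunk size, effective address, and process callback are all illustrative): two small buffers are enough to overlap DMA with compute, leaving most of the 256 KB local store free.

```c
#include <stdint.h>
#include <spu_mfcio.h>

#define CHUNK 16384                          /* bytes per DMA transfer */
static char buf[2][CHUNK] __attribute__((aligned(128)));

static void wait_tag(unsigned tag)
{
    mfc_write_tag_mask(1u << tag);           /* select this tag group    */
    mfc_read_tag_status_all();               /* block until DMA complete */
}

/* Stream nchunks * CHUNK bytes from effective address ea, processing
   one buffer while the MFC fetches the next. */
void stream(uint64_t ea, unsigned nchunks, void (*process)(char *, int))
{
    unsigned cur = 0;
    mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);          /* prime first chunk */
    for (unsigned i = 0; i < nchunks; ++i) {
        unsigned next = cur ^ 1;
        if (i + 1 < nchunks)                           /* prefetch next */
            mfc_get(buf[next], ea + (uint64_t)(i + 1) * CHUNK,
                    CHUNK, next, 0, 0);
        wait_tag(cur);                                 /* wait for current */
        process(buf[cur], CHUNK);                      /* compute overlaps DMA */
        cur = next;
    }
}
```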

Acert93 said:
And you are still talking about sacrificing the CELL CPU (if possible in real world scenarios) so you can make up a feature the 360 gets for free.
As I already explained to Jvd, there are cases where you're not sacrificing that much, if much at all.
This does not play to the PS3's strengths and you are talking about lobotomizing the CELL. What good is the extra bandwidth if you are going to be CPU limited?
How do you know if you're "lobotomizing" CELL if you don't even know what kind of work it's going to do?
CELL is designed to be good at a wide range of applications, so it tries to satisfy a lot of different requirements. You have tons of registers, tons of ultra-fast local store, tons of bandwidth. Not everything is going to use 100% of your feature set even if it runs 100% efficiently.
 
Acert93 said:
And you are still talking about sacrificing the CELL CPU (if possible in real world scenarios) so you can make up a feature the 360 gets for free.

This does not play to the PS3's strengths and you are talking about lobotomizing the CELL. What good is the extra bandwidth if you are going to be CPU limited?

What is the point of a powerful CPU if it has to babysit GPU tasks all day?
You shouldn't consider an embedded system as a CPU + a GPU, but as a whole system.

A few years ago developers didn't even imagine what they could do with 8 (or 32) threads. And to this day, not a lot of them know what to do with more than 2 (real) threads.

Actually, it wouldn't surprise me if a lot of first-generation (and multiplatform) PS3 games have 4 or more SPEs idling.

Therefore, if any developer comes up with interesting methods to run on those SPEs, I wouldn't see that as a bad thing.

For instance, to use a concrete analogy: the PS2 GS has a broken implementation of mip-mapping, the same mip-mapping that was free on Xbox/GC/DC. If developers had followed your POV, all PS2 games would be a shimmering mess to this day.
Acert93 said:
Seems pretty clear to me that PS3 developers would be better off spending their time playing to their strengths.
And that's what they'll do if they're using Cell's available raw power for visual calculations (HOS, geometry shading, post-filtering and other framebuffer effects...).
 
ninelven said:
jvd said:
You'd be wrong. There is no graphics chip on the market that can do HDR + AA. All we really have is the G70 and NV40, which can do HDR, and both take huge hits with HDR on at 1024x768.
Well, judging from Dave's benchmarks, you'd be wrong, or did you mean 1280x1024?

There is no GPU that does HDR and FSAA. Dave's benchmarks clearly show HDR benchmarks and then FSAA benchmarks; note that the two are never both on in the benchmarks, because neither the G70 nor the NV40 can do HDR + FSAA. We really don't know if the RSX can do both at the same time.

What you do see, however, is Splinter Cell at 1024x768 (note: fewer pixels than 720p) dropping from 81.7 fps to 64.3 just by turning on HDR. The hit would only increase if you could turn on FSAA as well.

The problem is that neither of these games is pushing next-gen-level graphics. They should only be used as a baseline; the RSX will most likely perform better, as it most likely has more tweaks, has 50 MHz more clock speed, and can use some bandwidth from the other pool of memory. But at the same time, the actual level of graphics will only get more demanding.
 
nAo said:
Of course I am; tell me why I shouldn't be
Because rendering to LS might not even work, to say nothing of how constricting it makes a previously straightforward task. If it takes as much work as I'm thinking, I would expect you to cut your losses and go with the way it's intended to be used.
 
I always considered this in terms of data being fed directly to the SPEs' local SRAM, not the PPE cache; that's what I meant. Both the cache and the SPEs can snoop data off the EIB.
You're still going to starve a part of the Cell chip, and the more tiles you need, the more you're going to starve the bus between the RSX and Cell; the less bandwidth there is, the fewer tasks you're going to be able to do.

When I'm the dev, I'm also the one who decides how to allocate caches and local stores, and how to design every aspect of my game routines according to my choices.
SPEs each have 256 KB of local store, and not every kind of algorithm needs it all; when you're doing streaming-like processing, all you need is enough memory to prefetch data so as not to stall the SPEs.
That's great. However, you're going to need a lot of tiles to make a buffer fit in there, you're still starving the SPEs, and you're still going to need to use this cache for each and every frame of your game, thus limiting the power of Cell by limiting the tasks you can run.

Not only that, but as demands shift in your engine, you may lose that cache to the SPEs needing it all for a task that comes up.

Those memories are there to keep the PPE and SPEs fed according to what developers design to run on them.
Most of the time you don't need the full 256 KB of local store just for some T&L/vertex shading on SPEs; you can live with much less than that.
(this is just an example)
So we have this huge Cell, which Sony tells us "Behold the power of the Cell", and the devs are going to use it for T&L and vertex shading? All of which the GPU can already do, just so we can do more FSAA on the image?

As I already explained to Jvd, there are cases where you're not sacrificing that much, if much at all.
Yet all you're telling me is that you're going to limit what you can do to make up for a feature that was better designed in another console.

How do you know if you're "lobotomizing" CELL if you don't even know what kind of work it's going to do?
CELL is designed to be good at a wide range of applications, so it tries to satisfy a lot of different requirements. You have tons of registers, tons of ultra-fast local store, tons of bandwidth. Not everything is going to use 100% of your feature set even if it runs 100% efficiently.
Of course not, but demands change from scene to scene, and the amount of cache and RAM you need is going to change just the same. The same goes for the tiles you need to send from the RSX. Both will change each and every frame.
 
And please note when you read: that is almost twice the bandwidth Cell will have.

600 MHz RAM on a 256-bit bus is like 1200 MHz RAM on a 128-bit bus. The RSX has 700 MHz RAM on a 128-bit bus.
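As a quick check on those numbers, the same back-of-envelope formula as the earlier sketch (clocks and widths assumed, with Cell's commonly quoted XDR figure noted for comparison):

```c
#include <stdio.h>

/* Peak bandwidth for double-data-rate memory, as before. */
static double peak_gbps(int bus_bits, int clock_mhz)
{
    return (bus_bits / 8.0) * clock_mhz * 2.0 / 1000.0;
}

int main(void)
{
    printf("G70, 256-bit @ 600 MHz: %.1f GB/s\n", peak_gbps(256, 600)); /* 38.4 */
    printf("RSX, 128-bit @ 700 MHz: %.1f GB/s\n", peak_gbps(128, 700)); /* 22.4 */
    /* 38.4 GB/s is about 1.7x the RSX's 22.4 GB/s; Cell's XDR pool
       is commonly quoted at 25.6 GB/s. */
    return 0;
}
```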

Look at the other benchmarks.

Also remember that none of these tests have HDR + FSAA enabled at the same time.
 
ninelven said:
And please note when you read: I wasn't talking about Cell or RSX or PS3 (and neither were you in that sentence).

And note that 1024x768 is not 720p; it falls between the two. It's also not FSAA + aniso, and you're looking at a game that is CPU-limited at those resolutions.
 
jvd said:
So we have this huge Cell, which Sony tells us "Behold the power of the Cell", and the devs are going to use it for T&L and vertex shading? All of which the GPU can already do, just so we can do more FSAA on the image?
Why not? In terms of IQ, if the PS3 is to match its rival XB360, it needs to use its resources to generate comparable games. XB360 gets 720p + 4xAA from a system of CPU (100 GFLOPs) and GPU + eDRAM. Sony uses a system of CPU (200 GFLOPs) and GPU. Producing AA gobbles up half of Cell's resources, leaving 100 GFLOPs for other things... the same as XB360 ;)

Of course, these are only rough illustrative figures. It's up to the devs to use the resources of a system however they choose. E.g. in CON on the PS2, we've got full-screen AA; those resources could have been put elsewhere. On PS3, devs can either have 10,000 rocks at 1080p, or 5,000 rocks at 1080p + AA, perhaps.

Nothing wrong with an open and versatile system, I'm sure you'll agree. Though personally, I'll be gobsmacked if someone uses SPE LS for backbuffer tiling!! :oops:
 
Inane_Dork said:
nAo said:
Of course I am; tell me why I shouldn't be
Because rendering to LS might not even work, to say nothing of how constricting it makes a previously straightforward task.
This is a piece of cake; c'mon, it can't be worse than coding on the PS2 ;)
If it takes as much work as I'm thinking, I would expect you to cut your losses and go with the way it's intended to be used.
I don't think it takes as much work as you're thinking; if you can write a split-screen multiplayer game, you can do this too. Dunno why people get so scared every time they read about doing tiled rendering.
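For a sense of scale on the tile counts being argued about, a back-of-envelope sketch (all figures are illustrative assumptions: a 1280x720 target, 4-byte color plus 4-byte depth per pixel, and half of one 256 KB local store spent on the tile):

```c
#include <stdio.h>

int main(void)
{
    const int w = 1280, h = 720;
    const int bpp = 4 + 4;                 /* color + depth per pixel */
    const int ls_budget = 128 * 1024;      /* half of a 256 KB LS     */

    const int pixels_per_tile = ls_budget / bpp;         /* 16384     */
    const int total_pixels = w * h;                      /* 921600    */
    const int tiles = (total_pixels + pixels_per_tile - 1)
                      / pixels_per_tile;                 /* 57 tiles  */

    printf("%d tiles of %d pixels each (e.g. 128x128)\n",
           tiles, pixels_per_tile);
    return 0;
}
```

Whether 57-odd passes per frame counts as "a huge amount of tiles" or as routine tiled rendering is exactly the disagreement here.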

Jvd said:
However, you're going to need a lot of tiles to make a buffer fit in there and you're still starving the SPEs
Maybe I'm going to starve some SPEs, or maybe I'm not, you can't know in advance.
and you're still going to need to use this cache for each and every frame of your game, thus limiting the power of Cell by limiting the tasks you can run.
No offense, Jvd, but I'm not sure you're completely grasping what I'm talking about. Please, can you elaborate on how I'm going to limit 'the power of Cell'? I mean, it's time to get some detail here.
(and..IT'S NOT A CACHE!)
Not only that, but as demands shift in your engine, you may lose that cache to the SPEs needing it all for a task that comes up
I may, or I may not, it DEPENDS.
So we have this huge Cell, which Sony tells us "Behold the power of the Cell", and the devs are going to use it for T&L and vertex shading?
I don't care what Sony says, maybe you do, I don't.
C'mon, I just gave an example of an application that does not need 100% of your local store to run fast.
All of which the GPU can already do, just so we can do more FSAA on the image
Ehm... I'm not talking about AA at all; I don't know how you got this idea.
Yet all you're telling me is that you're going to limit what you can do to make up for a feature that was better designed in another console
What's the problem with that? I'm not here to push Sony's or MS's agenda; I'm here to discuss technology.
As I've already stated many times on this board, I'm very excited about Xenos and I believe it's a much more interesting part. Too bad I'm talking about RSX and CELL at this very moment :)

The same goes for the tiles you need to send from the RSX. Both will change each and every frame.
What do you mean by 'tiles you need to send from the RSX'? What are you 'sending'?
 
Does the daughter die/eDRAM do anything else on the Xenos besides being used for backbuffering?

If not, how can the rest of the Xenos (which is what, 230-odd million transistors) be as powerful as the RSX? Does the added eDRAM sacrifice further pixel/vertex power in favor of a free way to always enable AA?

Or do I have no idea what I'm talking about? :?
 
ninelven said:
You might have a point there except that you said 1024x768:
jvd said:
both take huge hits with HDR on at 1024x768

jvd said:
It's also not FSAA + aniso, and you're looking at a game that is CPU-limited at those resolutions
Did I ever say otherwise?

And I showed you the game I was talking about, which is clearly not CPU-limited at 1024x768. It is you who brought up the game that was.
 