Deferred Rendering on PS3 (Was KZ2 Tech Discussion)

This generation you have certain highly popular middleware engines that can chalk up 10k+ dynamic allocations per frame all on their own.
Indeed.. dynamic allocations are typically only bad if you don't reduce fragmentation by utilising sensible heap partitioning strategies, and appropriate allocation schemes.

IMO, anyone doing thousands of dynamic allocations per frame via the wonders of a general 'malloc' implementation (dlmalloc etc) gets everything they deserve.

Dean
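
To make that concrete, here's roughly what people mean by an "appropriate allocation scheme" for that kind of churn: a per-frame linear ("bump") allocator that grabs one block up front, hands out pointers by bumping an offset, and throws everything away at the end of the frame. This is just a generic sketch with invented names, not code from any actual engine:

Code:
// Bare-bones per-frame linear ("bump") allocator: soaks up large numbers of
// short-lived allocations without ever touching a general-purpose malloc.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

class FrameAllocator
{
public:
    explicit FrameAllocator(size_t capacity)
        : m_base(static_cast<uint8_t*>(std::malloc(capacity)))  // one block, grabbed once
        , m_capacity(capacity)
        , m_offset(0)
    {
        assert(m_base != nullptr);
    }

    ~FrameAllocator() { std::free(m_base); }

    // O(1) allocation: align the current offset and bump it.
    // 'align' must be a power of two. There is no per-allocation free.
    void* alloc(size_t size, size_t align = 16)
    {
        size_t aligned = (m_offset + (align - 1)) & ~(align - 1);
        if (aligned + size > m_capacity)
            return nullptr;                       // out of frame memory
        m_offset = aligned + size;
        return m_base + aligned;
    }

    // Called once per frame: everything allocated last frame is discarded wholesale.
    void reset() { m_offset = 0; }

private:
    uint8_t* m_base;
    size_t   m_capacity;
    size_t   m_offset;
};

Thousands of allocations per frame through something like this cost a handful of adds each, and it can never fragment because the whole region is reset in one go.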
 

So anyway, you haven't updated your blog in a while... Again!

:LOL:
 
From what I read, it almost seems that there is still a reserve of 32MB of VRAM for the OS???

Here is a huge ugly ***assumption warning***

Video Memory
~256MB = 12MB render buffer + 23MB VRAM free + 190MB VRAM used + 32MB reserve
 
Well, the framebuffer is 36MB for starters... Throw in a few megs for the front buffer, so ~40MB instead of 12MB.
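
For what it's worth, ~36MB is consistent with the commonly quoted KZ2 G-buffer setup, four 32-bit render targets plus a 32-bit depth/stencil buffer at 1280x720 with 2x multisampling (I'm assuming that exact layout here):

Code:
1280 x 720 pixels x 2 samples x 4 bytes      = ~7.4MB per buffer
~7.4MB x 5 buffers (4 MRTs + depth/stencil)  = ~36.9MB  ->  the quoted ~36MB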
 

Taking that at face value (will say no more ;)):

~40MB render buffers + ~23MB VRAM free + ~190MB VRAM used = ~253MB
 
I know I REALLY shouldn't go there...
Mod : Then don't ;)
*snip*

Although I also think that, with the seemingly amazing culling and the deferred rendering, KZ2 is pushing boundaries in more ways than just post-processing on SPUs.
 
GG seem to imply in their interviews that they can actually push the tech further (i.e. there's still Cell headroom).
 
These kinds of comments are to be expected and commonplace. No one wants to hear that a platform has no possible room for improvement. I'm pretty sure Sony would rip them a new one if they said as such. ;)
 

Considering that syncing with the GPU took a significant amount of CPU time (38%), I dare say it's more than hyperbole.
 
These kinds of comments are to be expected and commonplace.
They're also invariably true. I don't think a developer has ever existed, or ever will, who understands a piece of hardware so well that their first product maxes it out completely. The only way that could happen is if they never released anything until they'd created their third game! Or perhaps the hardware is so generic, or so progressive and well understood, that they can apply years of prior knowledge to it.
 
Yes, the comments were definitely to be expected.

But it does seem very true. Looking at the way they're putting processes on the SPEs, how many they're using at any given time, and for what, it will be very exciting to see the many different combinations and implementations that can be done in future games.
 
These kinds of comments are to be expected and commonplace. No one wants to hear that a platform has no possible room for improvement. I'm pretty sure Sony would rip them a new one if they said as such. ;)

I wonder, among all the questionable statements there, why have you picked the one that's most definitely true?
:cool:
 
GG seem to imply in their interviews that they can actually push the tech further (i.e. there's still Cell headroom).

Likely they can. Are we to assume that all SPU jobs are fully optimized, either in terms of algorithm or in terms of low-level optimization? My guess is that they are tuned only enough to ship the product (limited by time), and next time around they will apply what they have learned and things will be even better.
 

I agree: they can take everything they've discovered and apply it to the next game. With even more time they can further refine their code, and I believe they'll also understand what headroom they have left, together with what limitations the hardware has. Lastly, they may even have ideas on better ways of approaching a task that suit the hardware's strengths.

I'm not sure if it was multiplayer only (the two-to-six-SPE usage), but in another interview they mentioned a 60 percent "max use" on each (I would imagine this was measured rather than an arbitrary number). If that's the case, then this extra headroom may be used for other things (more advanced AI, as discussed in the interview) or possibly further GPU assistance.

The only question is whether, when all six SPEs are in use feeding the GPU, there are any stalls due to bandwidth, GPU I/O cache limits, etc.
 
Often, the more cores you're using, the more complex it becomes to orchestrate processes between them, and the more factors such as waiting on memory transfers come into play. Of course, time and R&D can potentially lead to higher efficiency with more clever, fine-tuned techniques.
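
The classic trick for the memory-transfer waits is double buffering: while the core works on one chunk in local memory, the transfer for the next chunk is already in flight. Below is a generic sketch of the pattern; start_fetch/wait_fetch are made-up stand-ins for real async DMA primitives (modelled with memcpy so it compiles anywhere), not any actual SDK API:

Code:
#include <cstddef>
#include <cstring>

static const size_t CHUNK = 256;              // elements per local buffer

struct Element { float x, y, z, w; };

// Stand-ins for asynchronous get/wait primitives (here just a plain memcpy
// so the sketch builds and runs on any machine).
static void start_fetch(Element* local, const Element* remote, size_t n)
{
    std::memcpy(local, remote, n * sizeof(Element));
}
static void wait_fetch() {}

void process(Element* data, size_t count)
{
    Element bufA[CHUNK], bufB[CHUNK];
    Element* current  = bufA;
    Element* incoming = bufB;

    size_t first = (count < CHUNK) ? count : CHUNK;
    if (first > 0)
        start_fetch(current, data, first);    // prime the first chunk

    for (size_t base = 0; base < count; base += CHUNK)
    {
        wait_fetch();                         // chunk we're about to use has arrived

        // Kick off the transfer for the *next* chunk before doing any work,
        // so the transfer overlaps with the computation below.
        size_t nextBase = base + CHUNK;
        if (nextBase < count)
        {
            size_t nextN = (count - nextBase < CHUNK) ? count - nextBase : CHUNK;
            start_fetch(incoming, data + nextBase, nextN);
        }

        // "Work" on the current chunk.
        size_t n = (count - base < CHUNK) ? count - base : CHUNK;
        for (size_t i = 0; i < n; ++i)
            current[i].w = current[i].x + current[i].y + current[i].z;

        std::memcpy(data + base, current, n * sizeof(Element));  // write results back

        Element* tmp = current;               // swap buffers for the next iteration
        current  = incoming;
        incoming = tmp;
    }
}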
 

So in the case of all six SPEs working together, it's highly unlikely that they do a straight SPE > GPU transfer. Couldn't they set up a program to do a direct SPE > GPU transfer using queue flags for urgency (or some other clever bit of programming)?

i.e. each SPE does, say, 4 graphical tasks, sends them to the GPU, and flags the importance of each task for processing order, something like how networks use flags for data prioritisation?
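
Purely as an illustration of the "flag each task with an urgency and consume the most important first" idea, here's a toy version using a plain priority queue. Nothing here reflects how KZ2 (or any real SPU/RSX setup) actually feeds the GPU; all the names are made up:

Code:
#include <cstdio>
#include <queue>
#include <vector>

struct GpuTask
{
    int producer;    // which worker core produced it
    int urgency;     // higher = should be consumed sooner
    int payloadId;   // stand-in for a command buffer / data handle
};

struct ByUrgency
{
    bool operator()(const GpuTask& a, const GpuTask& b) const
    {
        return a.urgency < b.urgency;   // max-heap on urgency
    }
};

int main()
{
    std::priority_queue<GpuTask, std::vector<GpuTask>, ByUrgency> queue;

    // Each of six hypothetical workers submits four tasks with varying urgencies.
    for (int worker = 0; worker < 6; ++worker)
        for (int t = 0; t < 4; ++t)
            queue.push(GpuTask{ worker, (worker + t) % 5, worker * 10 + t });

    // The consumer (standing in for the GPU feed) drains tasks most-urgent first.
    while (!queue.empty())
    {
        GpuTask task = queue.top();
        queue.pop();
        std::printf("consume task %d from worker %d (urgency %d)\n",
                    task.payloadId, task.producer, task.urgency);
    }
    return 0;
}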
 