seismologist said:PC-Engine said:What happens if you're vertex bound?
allocate another SPE.
If you're vertex bound then all of your SPEs are already being used up.
seismologist said:PC-Engine said:What happens if you're vertex bound?
allocate another SPE.
Titanio said:Presuming you can, is it generally practical with the kind of bandwidth in the evaluation system? I think that whilst you may be able to do it, having a lot more bandwidth may change your approach to accessing main memory compared with the evaluation system. I think that's probably the author's main point: devs probably are not going to be using the XDR memory from the GPU in the same way they will be able to in the final box.
PC-Engine said:If you're vertex bound then all of your SPEs are already being used up.
seismologist said:PC-Engine said:If you're vertex bound then all of your SPEs are already being used up.
Then you're CPU bound.
PC-Engine said:seismologist said:PC-Engine said:If you're vertex bound then all of your SPEs are already being used up.
Then you're CPU bound.
...due to the non load balancing setup that you've proposed....
so a performance advantage in every situation that isn't the oddball case of being extremely vertex bound to the point where the CPU can no longer keep up.
By the way, next time you respond please back it up with actual numbers as evidence.
PC-Engine said:so a performance advantage in every situation that isn't the oddball case of being extremely vertex bound to the point where the CPU can no longer keep up.
In a unified shader architecture, the CPU doesn't need to keep up since everything is already being load balanced by the GPU.
By the way, next time you respond please back it up with actual numbers as evidence.
Numbers are not required in this kind of simple comparison.
seismologist said:Each SPE is capable of 32 GFLOPS, plus an additional 32 for the GPU vertex shaders.
If these numbers sound right I'll let you do the math from here.
Titanio said:The SPEs provide peak performance of 25.6Gflops/s each. The vertex shaders in RSX should provide 44Gflops/s, assuming it's the same configuration as the RSX.
Total peak vertex processing capability: 223.2Gflops/s (7 SPEs + VS)
Total peak pixel processing capability: 264Gflops/s
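For anyone who wants to check the vertex total, here is a minimal sketch of the arithmetic in Python. It uses only the peak figures quoted in the post (25.6 per SPE, 7 SPEs, 44 for the RSX vertex shaders); the function and constant names are made up for illustration, and peak figures say nothing about sustained throughput.

```python
# Peak-throughput arithmetic using the figures quoted in the post.
# All names here are hypothetical; the numbers are theoretical peaks only.
SPE_GFLOPS = 25.6      # per-SPE peak, as quoted
NUM_SPES = 7           # SPE count used in the quoted total
RSX_VS_GFLOPS = 44.0   # RSX vertex shader peak, as quoted


def total_vertex_gflops(num_spes=NUM_SPES,
                        spe_gflops=SPE_GFLOPS,
                        vs_gflops=RSX_VS_GFLOPS):
    """Sum the SPE peak and the GPU vertex shader peak."""
    return num_spes * spe_gflops + vs_gflops


if __name__ == "__main__":
    # 7 * 25.6 + 44 = 223.2
    print(round(total_vertex_gflops(), 1))
```

Passing a smaller `num_spes` shows how the total drops if some SPEs are reserved for game code instead of vertex work.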
Snyder said:Titanio said:The SPEs provide peak performance of 25.6Gflops/s each. The vertex shaders in RSX should provide 44Gflops/s, assuming it's the same configuration as the RSX.
Total peak vertex processing capability: 223.2Gflops/s (7 SPEs + VS)
Total peak pixel processing capability: 264Gflops/s
Sorry for this OT...but: You shouldn't take "3D acceleration" that literally.
FLOPS = floating-point operations per second, so no /s is needed. Sorry, I'm in some anal mood today...
seismologist said:PC-Engine said:so a performance advantage in every situation that isn't the oddball case of being extremely vertex bound to the point where the CPU can no longer keep up.
In a unified shader architecture, the CPU doesn't need to keep up since everything is already being load balanced by the GPU.
By the way, next time you respond please back it up with actual numbers as evidence.
Numbers are not required in this kind of simple comparison.
Though I'm not supposed to be working on Sunday, I went ahead and quickly dug up the numbers. It appears that the combined unified shaders of Xenos are capable of 240 GFLOPS.
Each SPE is capable of 32 GFLOPS, plus an additional 32 for the GPU vertex shaders.
If these numbers sound right I'll let you do the math from here.
Neeyik said:I was wondering how long it would be before we started to see threads degenerate into tit-for-tat arguing. For God's sake - pack it in!
randycat99 said:Neeyik said:I was wondering how long it would be before we started to see threads degenerate into tit-for-tat arguing. For God's sake - pack it in!
Perhaps, if you PM that person to rein it in, maybe we'll get another week or so of good behavior outta him? Given the sheer number of jabs he has left around in less than a week, it's amazing the rest of us have done so well to largely ignore him this far. What'dya say?
scificube said:If I understand things correctly, there are two schools of thought here...
For the X360:
1. You can let Xenos handle all the load balancing
2. If you let the X360's CPU handle all the vertex processing and dedicate the GPU to only pixel processing, you may see a net gain in the overall amount of processing power available to you.
caveat: If you become vertex limited, you've nowhere to shift the load, whereas if the CPU were not already doing vertex operations it could take some of that load.
If you allow Xenos to do all the load balancing, then when it gets limited beyond what it can internally balance, you could pull from the CPU. I don't understand why it's not better to have the CPU dedicated to that portion of the vertex load that would cripple Xenos, so that this never occurs. Meanwhile, when Xenos is not approaching this limit it will automatically dedicate itself to more pixel processing, and you could take advantage of this.
Is what I'm thinking in error somehow?
With the Cell:
The balancing act will really occur on the Cell. It could dynamically allocate more or less of its resources to vertex processing, but really only in addition to the vertex processing capabilities of the RSX, as removing the load from the RSX only ensures its vertex pipelines sit idle.
Basically, this approach can be used dynamically to remove the portion of a vertex load that exceeds what the RSX could do, or you could use Cell in combination with RSX to handle a constant vertex load greater than what the RSX can do alone, efficiently enough that it is worth taking advantage of.
I hope I'm on track with that.
It would seem a good thing to always have the CPUs doing some vertex processing if the resources could be spared. Did I miss something?
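To make the "overflow" idea above concrete, here is a rough sketch in Python of how a per-frame vertex load might be split: fill the GPU's vertex capacity first, then hand only the excess to whatever CPU resources can be spared. This is purely illustrative; the function name, the arbitrary work units, and the idea that the load can be split this cleanly are all assumptions, not how any real console scheduler works.

```python
def split_vertex_load(load, gpu_capacity, cpu_spare):
    """Split one frame's vertex work between GPU and CPU (hypothetical model).

    All arguments are in the same arbitrary work units.
    Returns (gpu_share, cpu_share, unserved).
    """
    # Keep the GPU's vertex pipelines busy first so they never sit idle.
    gpu_share = min(load, gpu_capacity)
    # Only the excess beyond the GPU's capacity goes to the CPU.
    overflow = load - gpu_share
    cpu_share = min(overflow, cpu_spare)
    # Anything left over is work neither side can absorb this frame.
    unserved = overflow - cpu_share
    return gpu_share, cpu_share, unserved


if __name__ == "__main__":
    for load in (30, 60, 120):
        print(load, split_vertex_load(load, gpu_capacity=44, cpu_spare=50))
```

With a load below the GPU's capacity the CPU gets nothing, which matches the point that taking work away from the RSX would only leave its vertex pipelines idle.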