Mintmaster said:Downsides compared to what?
From what we've heard, a unified design takes up more die space than one that isn't unified. The performance could be better, worse, or the same, depending on the workload. Consequently, performance per mm² of die size is just as much up in the air.
So your question is essentially unanswerable.
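To make the "depends on the workload" point concrete, here is a minimal toy model, not a description of any real GPU. The unit counts, the work amounts, and the assumption that a unified design gives up a few units' worth of area to extra control logic are all hypothetical and chosen only for illustration.

```python
def frame_time_split(vertex_work, pixel_work, vertex_units, pixel_units):
    """Dedicated pools: each pool only runs its own work, so the slower pool sets the pace."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def frame_time_unified(vertex_work, pixel_work, units):
    """Unified pool: any unit can take either kind of work."""
    return (vertex_work + pixel_work) / units


# Hypothetical budget: 8 vertex + 24 pixel units for the split design,
# versus 28 unified units (assuming 4 units' worth of area goes to control).
workloads = {
    "balanced": (8, 24),
    "pixel-heavy": (2, 30),
    "vertex-heavy": (20, 12),
}

for name, (vertex_work, pixel_work) in workloads.items():
    split = frame_time_split(vertex_work, pixel_work, 8, 24)
    unified = frame_time_unified(vertex_work, pixel_work, 28)
    print(f"{name}: split={split:.3f} unified={unified:.3f}")
```

With these made-up numbers the split design is faster on the balanced mix and slower on the two skewed ones, which is why a single performance-per-area figure cannot be given without fixing the workload first.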
Shifty Geezer said:US looks a smarter solution
Except that doesn't really mesh neatly with the notion of implementing them in very die size/power/performance critical implementations such as handheld devices, yet PowerVR SGX is unified and all indications are that ATI is taking a Xenos like architecture to handhelds this or next year as well. The strongest proponent of this line of argumentation is the company that doesn't yet have a unified design...
Mintmaster said:From what we've heard a unified design takes up more die space than one that isn't unified.
MBDF said:Thanks... Are there any downsides to Xenos's unified architecture... or are they minimal?
ERP said:NV obviously currently don't think so.
Dave Baumann said:Except that doesn't really mesh neatly with the notion of implementing them in very die size/power/performance critical implementations such as handheld devices, yet PowerVR SGX is unified and all indications are that ATI is taking a Xenos like architecture to handhelds this or next year as well. The strongest proponent of this line of argumentation is the company that doesn't yet have a unified design...
Edge said:What is the die size for the performance of those parts? You can hardly claim the benefits of something that does not exist yet, and cannot be used for comparison purposes. Saying they are going to use unified parts for that sector is not good enough, if performance is lacking due to a lower number of execution units, and corresponding data lines and associated registers.
While we are at it, how can anyone claim the superiority of Xenos, if no benchmarking metrics, or even game to game comparisons can accurately be made in the console sector? Yes, I realize that's what's being discussed here, but the end result from what I see is similar performance, with each part having different strength attributes for different circumstances.
Xenos is hardly a huge win because of unified shaders over a discrete part like Nvidia's 7900 series, and we all know that the 7900 series is meeting excellent die size and power issue requirements for the console space.
Dave Baumann said:Except that doesn't really mesh neatly with the notion of implementing them in very die size/power/performance critical implementations such as handheld devices, yet PowerVR SGX is unified and all indications are that ATI is taking a Xenos like architecture to handhelds this or next year as well. The strongest proponent of this line of argumentation is the company that doesn't yet have a unified design...
GB123 said:Nobody is claiming one is better than the other, it's a matter of which is more flexible.
These parts are low on execution units because of the die sizes/costs/power metrics they have to eat, ergo it's completely counterproductive to waste transistors on control if you could just end up with more units in place of those extra controls required. To make any sense in this market the cost of implementing the unified architecture has to provide more benefit than it takes away - and it's hardly as though this market is crying out for complex shader architectures yet.
Edge said:What is the die size for the performance of those parts? You can hardly claim the benefits of something that does not exist yet, and cannot be used for comparison purposes. Saying they are going to use unified parts for that sector is not good enough, if performance is lacking due to a lower number of execution units, and corresponding data lines and associated registers.
It's impossible to estimate these things. Not only that, but you're getting things fed through for marketing that obviously has a particular agenda.
xbdestroya said:Dave I hear what you're saying here and it's completely logical to point these facts out, but at the same time it kind of puts a white elephant in the room in that ironically there're probably few people more qualified to answer the question indirectly posed than you yourself.
And I mean if there are NDA issues that prevent you from answering I understand, but I just have to ask: what's your own estimate on the transistor budget allocated on Xenos to control/management logic? SGX and R600 and the rest of it aside, I imagine you must have a sense of what these transistor allocations are within Xenos.
Dave Baumann said:It's impossible to estimate these things. Not only that, but you're getting things fed through for marketing that obviously has a particular agenda.
However, the more I look at it the more I believe the notion of actual unification is secondary in terms of costs - it pretty much does the same things as a traditional architecture and it follows the same path, except that the shader elements move to the same hardware element but then diverge again when they pop out the back. What's more important with an architecture like Xenos is actually the command control - i.e. batch handling/juggling/sizes. Xenos here bears many similarities with R520/580's architecture. It's impossible to tell if there is that much difference between a unified shader architecture for being unified, or just having that level of batch handling capabilities.
One thing that I do know is that there are deep divisions in ATI as to whether the R520 architecture should have gone unified or not.
No, I think they are more similar than they are different - they both handle many threads in flight at any point in time that can either be executed or slept dependent on whether data is ready, in order to (a.) handle latencies well whilst still (b.) providing low enough granularity to allow for good dynamic branching and small triangle sizes (and not impact vertex performance much in the case of Xenos).
xbdestroya said:The bolded portion of your reply was more what I was speaking to with my comment on 'control/management' logic; wondering how you thought the dispatch logic might compare transistor-wise to something like the R580. But now that I think of it knowing explicitly R580's situation wouldn't really give any direct insights into Xenos since at the end of the day, they're still more different than they are similar.
Dave Baumann said:No, I think they are more similar than they are different - they both handle many threads in flight at any point in time that can either be executed or slept dependent on whether data is ready, in order to (a.) handle latencies well whilst still (b.) providing low enough granularity to allow for good dynamic branching and small triangle sizes (and not impact vertex performance much in the case of Xenos).
Again, I wonder if the actual "unified" control element is that costly at all - in fact, rather than separate command processors for Pixel Shaders and Vertex Shaders, a unified architecture has a single command processor that covers both shader types. Xenos has control elements per shader array, but then R580 has control elements for each of its 4 arrays of 12 pixel shaders.
xbdestroya said:That almost leads me back to my original question then, but if we don't know the transistor cost for dispatch we just don't know.
Of interest, PowerVR's site indicates that the lowest performance version of SGX can fit into a 90nm die size of less than 2x2mm! That's obviously not any kind of comparison as it has got far less in there and it will have smaller control elements because it has less to control.
In that context, your previous post on the SGX lends more insight
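The latency-hiding idea mentioned above (many threads in flight, each either executed or slept depending on whether its data is ready) can be sketched with a toy scheduler. This is an illustration under stated assumptions, not a model of Xenos or R580: a single ALU, threads that do one cycle of work and then wait a fixed `fetch_latency` for data, and a lowest-index-first pick are all hypothetical simplifications.

```python
def alu_utilization(num_threads, fetch_latency, cycles=10_000):
    """Fraction of cycles a single ALU is busy.

    Each thread repeats: one cycle of ALU work, then sleeps for
    `fetch_latency` cycles while its data is fetched.
    """
    ready_at = [0] * num_threads  # cycle at which each thread may run again
    busy = 0
    for cycle in range(cycles):
        for t in range(num_threads):
            if ready_at[t] <= cycle:
                # Issue this thread, then put it to sleep until its data arrives.
                ready_at[t] = cycle + 1 + fetch_latency
                busy += 1
                break  # only one thread issues per cycle
    return busy / cycles


for n in (1, 5, 10, 20):
    print(n, alu_utilization(n, fetch_latency=9))
```

In this sketch utilization climbs as threads are added until there are enough of them to cover the fetch latency, after which extra threads add nothing. How much control logic it takes to juggle those threads is exactly the part the toy leaves out.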