Luminescent
Veteran
The overarching design goal that permeates both architectures, Xenos and R520/R580 alike, is that of dynamically maximizing available resources according to processing load. Both architectures have been designed to realize what I call "dynamic efficiency" in subtly different ways.
Since many of you are familiar with Xenos' and R520/R580's architectural configurations, I ask: how does the design of each allow for "dynamic efficiency", and how do their approaches differ?
Correct me if I'm wrong, but fundamentally, R580 and R520 offer a set of 8 MIMD vertex processors and 4 SIMD pixel processors that operate independently of one another, in MIMD fashion, fed by a load-balancing thread dispatch unit (scheduler). They offer a large register space to hold the many values needed by instructions that are in flight. I'm not sure whether the same scheduler feeds both the vertex and pixel pipes or whether there is a scheduler dedicated to each MIMD node; if it is the latter, R580 offers 4 scheduling/dispatch units for the pixel processors and 8 for the vertex processors (if each vertex unit operates as an independent unit). Within each SIMD pixel processor, R520/R580 has 4 texture samplers and address processors available to it, although I'm not sure whether they can operate independently of the ALUs within the SIMD quad, with instructions issued to them in their own thread (although it would only be 1 thread for all four). In addition, R580 offers a ring bus and a programmable, dynamically adaptable memory controller with 32-bit granularity. The batch size allocated to each pixel thread in flight is relatively small, which I know affects the architecture's ability to make efficient use of its units and handle dynamic branching.
How does Xenos' processing configuration differ from the above? What are the ramifications of Xenos' approach as opposed to R520/R580's?
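To make the batch-size point concrete, here is a rough Python sketch of how I picture it. It is a toy model, not a description of either chip: the function name, the costs, and the batch sizes are all made up for illustration, and it assumes every pixel takes the branch independently with the same probability (real scenes are spatially coherent, so this overstates divergence). It also assumes that a batch whose pixels disagree has to execute both sides of the branch in lockstep.

```python
def expected_batch_cost(batch_size, p_taken, cost_taken, cost_skipped):
    """Expected cycles per batch for one dynamic branch (toy model).

    Assumptions (hypothetical, not measured hardware behaviour):
    - each pixel takes the branch independently with probability p_taken
    - a batch runs in lockstep, so a divergent batch pays for both sides
    """
    p_all_taken = p_taken ** batch_size
    p_all_skipped = (1.0 - p_taken) ** batch_size
    p_divergent = 1.0 - p_all_taken - p_all_skipped
    return (p_all_taken * cost_taken
            + p_all_skipped * cost_skipped
            + p_divergent * (cost_taken + cost_skipped))


if __name__ == "__main__":
    # Illustrative batch sizes only; not claimed to match any real part.
    for batch_size in (16, 64, 1024):
        cost = expected_batch_cost(batch_size, p_taken=0.01,
                                   cost_taken=100, cost_skipped=10)
        print(f"batch of {batch_size:4d} pixels: ~{cost:.1f} cycles expected")
```

Under these assumptions the expected cost climbs quickly with batch size, because one stray pixel is enough to drag the whole batch through the expensive path. That is the trade-off I am asking about: small batches branch more cheaply, while large ones presumably amortize scheduling overhead better.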