XDR sometime?

Discussion in 'Architecture and Products' started by fehu, Nov 17, 2006.

  1. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,460
    Likes Received:
    391
    Location:
    Somewhere over the ocean
    Almost everything on the PC is going serial, but in the graphics realm parallel memory is still king.
    Why?
    Using some XDR variant might be very useful for reducing the bus width, making boards and GPUs that perform well but cost less.

    Is there something I'm missing? Isn't serial the future for graphics cards too?
     
  2. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,489
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    The problem is [almost] not the technology or its implementation, but the "business" politics of RAMBUS as an IP holder. And the current DDR DRAM platform is well [enough] established and still developing, IMHO; just read the GDDR4 specs. Sure, we'd all like a full-blown serial interface to conquer the GFX arena, as PCIe did and does, but for memory tech it takes more time, as always.
     
    #2 fellix, Nov 17, 2006
    Last edited by a moderator: Nov 20, 2006
  3. complexmind

    Newcomer

    Joined:
    Nov 11, 2006
    Messages:
    9
    Likes Received:
    0
    :roll: :roll: No, no, no.
    The latency of XDR is very high. Only at 4 GHz can XDR catch up with DDR-400.
     
  4. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    Cost of memory?
    XDR isn't any more serial than DDR or multi-lane PCIe. It just allows higher clock rates because of differential signaling and per-bit phase compensation, but it's still a completely parallel bus.

    PCB cost would be slightly lower: contrary to DDR, you have differential signalling for the data bus, so you don't reduce the number of pins and traces. But also contrary to DDR, trace lengths only have to be matched per differential pair, so PCB layout is much easier, and a more compact, lower-layer-count PCB should be possible.
    GPU cost could be slightly higher, though. The trace matching on the PCB for the complete data bus in DDR guarantees that arrival times are OK when signals enter the chip, so a single PLL/DLL per interface should be sufficient. For XDR, some kind of magic is required to align the data signals, so you need a per-pin PLL or DLL to align arrival times. I don't know what this will cost in terms of area, but it won't be free.

    Nah. Irrelevant for GPUs: RDRAM had increased latency, but XDR has mostly solved that problem. And latency is only important for CPUs anyway.
     
  5. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,902
    Likes Received:
    218
    Location:
    Seattle, WA
    Latency is important for GPUs, too. You need to add on-die cache to compensate for any increase in latency.
     
  6. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    In the case of XDR, we're talking about 1, maybe 2, core clock cycles, which is only a few percent more than the DDR latency. (*)

    This may sound counterintuitive, but it's almost always a bad idea to increase cache size to compensate for an incremental latency increase: once a cache is already in place, growing it will obviously reduce the miss rate, but you need to grow it significantly for that to have an effect, and even then it will never reduce the fetch latency of data that isn't already in there.
    During the system architecture phase, the cache design and sizing is almost always separated from latency mitigation design, unless your system has single threaded components that block on read.

    In a multi-threaded system, the way to reduce the impact of latency is to increase the number of outstanding reads. In practice, this usually requires little more than increasing the depth of the read data FIFO by the number of additional latency cycles.

    Edit: (*) When I say a few percent, I don't mean as measured on the IO pins of the chip, but the total average latency, as measured from the time the read is issued to the time the data arrives at the place of consumption.
     
    #6 silent_guy, Nov 20, 2006
    Last edited by a moderator: Nov 20, 2006
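The FIFO-depth argument above can be sketched numerically: by Little's law, the number of reads that must be in flight to keep a memory link saturated is bandwidth × latency, so a small latency increase only needs a correspondingly small increase in read-data FIFO depth. The link speed, latencies, and read size below are illustrative assumptions, not figures for any real GPU.

```python
# A sketch of the latency-hiding argument above: by Little's law, the
# number of reads that must be in flight to keep a memory link saturated
# is bandwidth * latency. A small latency increase therefore needs only
# a correspondingly small increase in read-data FIFO depth.
import math

def outstanding_reads(latency_ns: float, bytes_per_ns: float,
                      bytes_per_read: int) -> int:
    """Reads that must be in flight to hide `latency_ns` of latency."""
    return math.ceil(latency_ns * bytes_per_ns / bytes_per_read)

# Illustrative figures (not from any real GPU): 64-byte reads on a
# 25.6 GB/s link (25.6 bytes/ns), with 10 ns of extra PHY latency.
base = outstanding_reads(200.0, 25.6, 64)
worse = outstanding_reads(210.0, 25.6, 64)
extra_fifo_entries = worse - base
print(base, worse, extra_fifo_entries)  # 80 84 4
```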
  7. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,902
    Likes Received:
    218
    Location:
    Seattle, WA
    Well, by cache I really just meant general on-die memory. In order to keep hiding the latency, you need to store more in-flight pixels, as you said, which requires more storage capacity on-die. I suppose my wording wasn't the best.
     
  8. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,805
    Likes Received:
    473
    The increase in context data is a bigger pain than the input data.
     
  9. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    14,889
    Likes Received:
    2,304
    Then again, if you replaced pixel shaders with random number generators you wouldn't need any onboard memory.

    /davros once again solves one of the major problems facing the industry ;)
     
  10. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    When talking about a relatively small increase in latency, not increasing the context data won't be a problem if the fetch pipeline is already saturated with requests. (In a GPU that would probably mean shaders with a low number of registers and a lot of texture fetches?) But yes, with the context size staying the same, the point at which you're going to see bubbles in the execution pipeline will come sooner. I guess there's no free lunch...
     
  11. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    Some 3.33 ns latency at 3.2 GHz for XDR, while we have a minimum of 11.25 ns for DDR2-533. Next time check your sources a bit better before you post :)

    Back on topic, I said that over a year ago. The problem is (besides the politics issues) that it's still too costly in comparison and the availability isn't all that great either.
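For what it's worth, the 11.25 ns DDR2-533 figure quoted above is consistent with a CAS latency of three cycles at DDR2-533's 266.67 MHz command clock; the CL3 assumption is mine, not something stated in the thread.

```python
# A guess at where the 11.25 ns DDR2-533 figure comes from: CAS latency
# in clock cycles divided by the command clock rate. CL3 at 266.67 MHz
# is an assumption, not a figure stated in the thread.
def cas_latency_ns(cl_cycles: int, clock_mhz: float) -> float:
    return cl_cycles / clock_mhz * 1000.0

ddr2_533 = cas_latency_ns(3, 266.67)
print(round(ddr2_533, 2))  # 11.25
```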
     
  12. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,460
    Likes Received:
    391
    Location:
    Somewhere over the ocean
    When PS3 production goes full throttle, XDR production will ramp up along with it, and in a year it will be more affordable.
    Then, if at least one manufacturer decides to support it, production will be increased to an acceptable level.

    I want to believe.
     
  13. nonamer

    Banned

    Joined:
    May 25, 2002
    Messages:
    564
    Likes Received:
    7
    Unless I'm missing something, XDR does in fact seem to be the way to go. It offers basically double the per-pin bandwidth. The PS3's main memory offers 25.6 GB/s at only 64 bits wide, and that's with the lowest-speed version of XDR DRAM available. Potentially, XDR can deliver up to 204.8 GB/s with a 256-bit bus, if I'm reading the specs right.
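These figures check out with simple arithmetic: bandwidth = bus width in bytes × per-pin data rate. In the sketch below, the 3.2 Gb/s per-pin grade for the PS3 and a 6.4 Gb/s grade for the 256-bit case are my reading of the XDR lineup, not numbers stated in the post.

```python
# Quick check of the bandwidth figures above:
# GB/s = (bus width in bits / 8) * per-pin data rate in Gb/s.
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits / 8 * gbps_per_pin

ps3 = bandwidth_gb_s(64, 3.2)    # PS3 main memory: 64-bit, 3.2 Gb/s/pin
wide = bandwidth_gb_s(256, 6.4)  # hypothetical 256-bit bus, 6.4 Gb/s/pin
print(ps3, wide)  # 25.6 204.8
```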
     
  14. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,805
    Likes Received:
    473
    Even column access ain't that fast.

    Can't readily find random access numbers for GDDR3/4, but for XDR you are off by a factor of 10.
     
    #14 MfA, Nov 20, 2006
    Last edited by a moderator: Nov 20, 2006
  15. ShootMyMonkey

    Veteran

    Joined:
    Mar 21, 2005
    Messages:
    1,177
    Likes Received:
    71
    XDR at 4 GHz effective means the DRAMs themselves are at 500 MHz, and when you've got DRAMs clocked that high, latency has to suffer, which is why it's only then equal to DDR1 at 400 MHz effective (200 MHz DRAM). If you notice, DDR-2 at 600 is about where the latencies even out with DDR at 400 for the same reason (that's for average latency, not best or worst-case). Nonetheless, the latencies are still better for XDR vs. say, GDDR-3 or 4, and putting it in the graphics arena was kind of what the OP was asking.

    Ummm... no... First of all, the number is not 3.33 ns for 3.2 GHz; it's 2.5 ns. And secondly, that is not the latency of data reception -- no memory architecture will ever exist that is that fast, at least not until we crack that whole space-time thingy -- it's request latency (which is really more like throughput), which is to say that you can issue new requests with only that little delay (basically one DRAM cycle) between each one. That doesn't mean you'll actually receive the data that fast.
     
  16. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    And that's better with DDR2 how exactly?
     
  17. Basic

    Regular

    Joined:
    Feb 8, 2002
    Messages:
    846
    Likes Received:
    13
    Location:
    Linköping, Sweden
    3.2GHz XDR can transfer 1.6Gb/s per data pin (it's differential).
    900MHz GDDR3 (what's on a GF 8800) can transfer 1.8Gb/s per data pin.
    According to xbitlabs, Samsung has 1.6GHz GDDR4 running in their labs (3.2Gb/s per data pin).

    So GDDR3/4 beats it in bandwidth per data pin.
     
  18. ShootMyMonkey

    Veteran

    Joined:
    Mar 21, 2005
    Messages:
    1,177
    Likes Received:
    71
    Ummm... noooo... 3.2 GHz XDR sends 3.2 Gb/sec per pin. That's not what differential means. Differential means that there are two pins for the same "bit-lane", one sending the data, and one sending the inverse both at the same speed (in this case 3.2 Gb/sec), which means you can verify the correctness by making sure that the "real" pin and the "inverse" pin agree. It's essentially insurance since the voltage swing of XDR is so small (0.2 V).

    GDDR-3 and 4, btw, are also differential signaling. IIRC, so is DDR-2.

    Again, wrong, because this is all very mixed up on which numbers refer to what -- XDR's quoted speed is almost always the *effective* data rate. The bus clock is half that (which is true for XDR as well as ALL DDR mem platforms). With GDDR3, the quoted speed is almost always the bus clock. Rarely does anyone tell you that. And NO, GDDR does NOT beat XDR in data rate per pin.

    I could confuse things further by getting into the DRAM clocks, which on GDDR-3 is half the bus clock, while on XDR it's a quarter of it. So by that, 400 MHz GDDR-3 DRAMs signal at 1.6 Gb/sec per pin, but 400 MHz XDR (and GDDR-4, I think) DRAMs get 3.2 Gb/sec per pin.
     
    #18 ShootMyMonkey, Nov 21, 2006
    Last edited by a moderator: Nov 21, 2006
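The clock-naming tangle in this exchange can be made concrete by computing per-pin rates from the DRAM core clock, using the multipliers as this post states them (GDDR-3: bus clock twice the DRAM clock, two transfers per bus cycle; XDR: eight transfers per DRAM clock). Treat the multipliers as the post's claims, not datasheet facts.

```python
# Per-pin data rate derived from the DRAM core clock, using the
# multipliers claimed in the post above (not checked against datasheets).
def gddr3_gbps_per_pin(dram_mhz: float) -> float:
    bus_mhz = dram_mhz * 2        # bus clock is twice the DRAM clock
    return bus_mhz * 2 / 1000.0   # DDR: two transfers per bus cycle

def xdr_gbps_per_pin(dram_mhz: float) -> float:
    return dram_mhz * 8 / 1000.0  # ODR: eight transfers per DRAM clock

print(gddr3_gbps_per_pin(400))  # 1.6
print(xdr_gbps_per_pin(400))    # 3.2
```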
  19. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    GDDR and friends are not differential.

    XDR has 16 differential pairs at 3.2 Gbps per pin (and up.)
    GDDR4 has 32 pins at 2.2-2.8 Gbps per pin.
    For now, GDDR4 has a slight bandwidth advantage per chip, but the bus is harder to route.

    GDDR4 is/will be high volume and thus cheaper. In the end, that's all that counts.:wink:
     
    Jawed likes this.
  20. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    939
    Likes Received:
    35
    Location:
    LA, California
    silent_guy, what are your thoughts on XDR2? And with regards to volume - if either NV or ATI were to adopt XDR1/2, wouldn't that pretty much turn it into a volume product (who else besides ATI/NV are using GDDRx)?
     