ATI to delay 80nm GPU migration?

Discussion in 'Architecture and Products' started by kemosabe, Mar 17, 2006.

  1. FrameBuffer

    Banned

    Joined:
    Aug 7, 2005
    Messages:
    499
    Likes Received:
    3

    I certainly hope that stays true, ..

    From the very launch of the 1x00 series, three things have always stuck in my mind as I thought, "My ultimate design video card would have these ________ features."

    namely:

    1: 512bit ring bus memory architecture
    2: 3:1 Rv530 shader ratio
    and lastly
    3: Rv515 double Z

    from the very outset one could see that the rv560 (1600)'s 3:1 ratio was the future and it didn't take long to theorize (rightly so) that the 580 would basically be 4 RV530 quads each with 3:1 ratio of shader processors. So that gets us to where we are today .. however that leaves the double Z offered from the Rv515 seemingly abandoned. IMHO the 515 is great in its own right (small package, low power, Dbl-z, 8*32 memory possible).

    Correct me if I am wrong, however iirc the Rv530 (1300) did exceptionally well when 2x FSAA is applied. I believe it was said by some to be "free fsaa".

    so add this "free fsaa" via the dbl-z (correct?) with the 3:1 ratio and ring bus memory controller and it would seem that you have a very forward thinking product that uses tech that is already in place.

    Again purely speculative on my part however I really HOPE to see a higher performance part (250-350 USD- 256bit, 12-16 pipe GDDR3) in a X-X-3-2 config.


    then again I'm still waiting for dongleless 1800 Xfire and x800 series cards.
     
  2. Pete

    Pete Moderate Nuisance
    Moderator Legend

    Joined:
    Feb 7, 2002
    Messages:
    5,777
    Likes Received:
    1,814
    Couple corrections:

    1. It's not so much that the ring bus is 512b (as, AFAIK, all 256b DDR external bus cards have two 256b SDR internal paths) as that it seems to offer an instruction window to reorder memory access requests.

    2. RV530 = 1600, and it has double-Z (4-1-3-2) and the ring bus (albeit just 256-bit, as it's got just a 128-bit external bus). It can't have 8*32b memory crossbar as it's only got 128 external bits (4*32b) to work with.

    3. RV515 = 1300, and I don't think it's got dbl z (4-1-1-1). I know it doesn't have the ring bus. So, it's basically a 9600 but with an SM3 featureset.

    4. I don't think double-Z/-stencil helps with AA, just with stencil shadows and the like. I may be wrong about AA, though, as I don't fully understand MSAA.
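The bus-width arithmetic in corrections 1 and 2 can be sketched roughly as below. This is only an illustration of the numbers quoted above (a DDR bus moving two transfers per clock, 32-bit channels); the function names are made up for the example and say nothing about how the memory controller is actually built.

```python
# Sketch only: relates an external DDR bus width to the equivalent
# internal SDR width, using the figures quoted in the post above.
# Function names are hypothetical, chosen for this example.

DDR_TRANSFERS_PER_CLOCK = 2  # DDR moves data on both clock edges


def internal_sdr_bits(external_ddr_bits: int) -> int:
    """Bits delivered per memory clock, expressed as an SDR bus width."""
    return external_ddr_bits * DDR_TRANSFERS_PER_CLOCK


def channel_count(external_bits: int, channel_bits: int = 32) -> int:
    """Number of channels of `channel_bits` that fit in the external bus."""
    return external_bits // channel_bits


for label, external in [("256-bit external bus", 256),
                        ("128-bit external bus (RV530)", 128)]:
    print(label, "->", internal_sdr_bits(external), "bits per clock,",
          channel_count(external), "x 32b channels")
```

On these assumptions a 128-bit external bus gives only four 32-bit channels, which is why an 8*32b arrangement would need a 256-bit external bus.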

    Dunno about "free" 2xAA with either RV515 or RV530. I'm hoping with you that RV570 will be 12-1-3-2 and so kick ass at $300. A $200 8-1-3-2 RV560 sounds pretty good, too. What'll that do to RV530 and RV515, though? Will we lose the XT from the X1600 series to make room for X1700 Pros and the like, or are we going to see 256MB X1300s at $79, X1600s range from $99 to $179, X1700s range from $199 to $279, and X1900GTOs stay at $299?

    As for dongleless X1800 and X800 Xfire, don't hold your breath too long. In the comments section of FiringSquad's Oblivion benchmarks, Brandon mentioned he asked ATI for this, and they said--and I quote--"no." Maybe they're just throwing him off the scent, though. :)
     
    #62 Pete, Apr 7, 2006
    Last edited by a moderator: Apr 7, 2006
  3. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    maybe the RV570 is first intended for laptops and OEM, like NV41 and NV42.
    I thought NV41 would bring the 256bit bus to the "high midrange" segment, then I thought it could be the NV42, but I was wrong. except for the recent PCIe 6800GS, but it's NV clearing the stocks
     
  4. FrameBuffer

    Banned

    Joined:
    Aug 7, 2005
    Messages:
    499
    Likes Received:
    3
    thanks much for the clarification .. R--Rv--515, 580, 520, 530, 526, 570.. 1300, 1600, 1800, 1900.. Pro, XT, XTX, XL, GTO ... too many codenames and acronyms to remember them all, never mind keeping them straight .. lol

    edit: ok, this appears where I got confused: http://www.beyond3d.com/reviews/ati/rv5xx/index.php?p=07

    ATI's Radeon series since R300 have doubled their Z sample rate when multi-sample anti-aliasing (MSAA) is enabled, such that 2x MSAA is achievable in a single clock cycle, 4x in two and 6x in three. With the X1300 performance we can see this is still the case as the Z fillrate is very similar with 2x FSAA or without. X1600 already features a double Z rate and we can see that this is actually still maintained with 2x FSAA enabled, effectively quadrupling the Z fillrate with MSAA.
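The cycle counts in the quoted passage (2x in one clock, 4x in two, 6x in three) are what you would get from a fixed rate of two samples per clock. A minimal sketch, assuming that rate applies at every MSAA level; the function name is just for illustration:

```python
import math

# Assumption taken from the quoted review text: two MSAA samples per clock.
SAMPLES_PER_CLOCK = 2


def msaa_cycles(samples: int, per_clock: int = SAMPLES_PER_CLOCK) -> int:
    """Clock cycles needed to produce `samples` MSAA samples for a pixel."""
    return math.ceil(samples / per_clock)


for level in (2, 4, 6):
    print(f"{level}x MSAA: {msaa_cycles(level)} cycle(s)")
```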

    I guess what I desire to know is the relation (if any) between Z sampling and MSAA, in particular the effect of dbl-Z and its effect on msaa rates.
     
    #64 FrameBuffer, Apr 7, 2006
    Last edited by a moderator: Apr 7, 2006
  5. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    :lol:
     
  6. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    Out making pix :grin:

    I'm highly disappointed by this rumour.

    Jawed
     
  7. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    215
    Location:
    Uffda-land
    And yet, while I'm not willing to buy off on this unreservedly yet, that xbit report very much had the flavor of reading it right off ATI roadmap docs. And we know those (from both IHVs) don't tend to understate such things.
     
  8. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    Yep, that's the way I read it. ATI has, in my opinion, a yield model where "every die" turns out fully functional with only clock-binning required - which requires multiple designs to cover every price point.

    Expensive to design all these dies. Risky to have so many dies competing with each other for space in the fabs' production lines. Complicated to maintain inventory (though hardly rocket science). All that hassle for good yields? Dare I say it, but if it's worth doing for good yields, then it could well mean that ATI's getting really good yields. Otherwise, well...

    Jawed
     
  9. kemosabe

    Veteran

    Joined:
    Jun 19, 2003
    Messages:
    1,001
    Likes Received:
    16
    Location:
    Montreal, Canada
    [armchair quarterback mode] I would have to agree that there are going to be too many ASICs out there come summer's end. RV515 and R520 would seem like the expendable ones to me.

    The OEMs are increasingly going for the integrated stuff in their systems, and the 4-pipe RV410 core of RS600/690 should satisfy the post-Xpress 200 demand. RV515 as a discrete low-end core could continue to feed the notebook market after migration to 80nm. By the time RV560/570 arrive, RV530 should have also shrunk to 80nm and could get a second life as a great sub-$100 low-end desktop offering. A few RV560 configurations for the mainstream and RV570 takes over the $250-$300 performance segment. Phase out R520 and drop R580 SKUs into $350-450 segment. High-end R580+ refresh for the enthusiast. [/armchair quarterback mode]. :smile:
     
  10. Pete

    Pete Moderate Nuisance
    Moderator Legend

    Joined:
    Feb 7, 2002
    Messages:
    5,777
    Likes Received:
    1,814
    I thought I had this stuff down, but I had to correct myself on the x-x-x-x designations. Friggin' complexity. This is why ppl like saying, "I get 5k 3DMarks." :)

    OK, don't quote me on this (seriously, no quote functions--burn this reply when you read it! ;)), but here's my guess based on re-remembering that MSAA uses the same color value but rejiggers the geometry samples. MSAA uses x times the geometry samples, thus requires x times the Z computations per pass. Both NV's and ATI's current MSAA HW implementations allow for up to 2x MSAA per clock, so higher levels (like 4x MSAA for both IHVs or 6x for ATI) require two or three cycles, respectively, to aggregate enough MSAA samples. Doubling the geometry samples apparently implies doubling the Z samples, so 2x MSAA per clock would seem to mean 2x Z per clock. NV, since the FX series (well, technically since NV2A?), could do double Z without as well as with MSAA. RV530, aka X1600, is the first ATI part (dunno about Xenos) to allow double Z without MSAA. But, according to Dave's fillrate testing, that double Z capability seems to carry over to MSAA rather than be superseded by it, in essence doubling the inherently double-Z 2x MSAA per clock. So X1600 seems to be the first part to yield 4x Z samples per 2x MSAA pass.

    If any of that makes sense to you, then maybe you're thinking this would translate to super D3 performance. I checked the AF/AA % hit breakdown I did for Rage3D's 6800GS review and indeed X1600 takes a roughly 20% performance hit from 4xAA while all the other cards (6800s and X800s) take a 40% hit. That's a very naive interpretation, tho, as I'm not sure if that's due to X1600's apparently AA-independent dbl-Z, its "ring bus" memory controller, or some other factor (like fillrate:bandwidth ratios). I suppose checking X1800 or X1900 Doom 3 AA perf hits from another R3D review (as they're relatively unique in benching AA separately from AF) may help. If they take a 40% hit like the other architectures, then it's likely X1600's dbl-Z that's responsible. If, on the other hand, they drop just 20%, then I'll be revealed as the clueless nincompoop I am.
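For what it's worth, the comparison described above boils down to something like the sketch below. The frame rates are hypothetical placeholders, not benchmark results, and the 30% cut-off is simply an assumed midpoint between the ~20% and ~40% hits mentioned; the function names are invented for the example.

```python
def aa_hit(fps_no_aa: float, fps_aa: float) -> float:
    """Fractional performance lost when AA is enabled (0.2 means a 20% hit)."""
    return 1.0 - fps_aa / fps_no_aa


def likely_cause(other_r5xx_hit: float, cutoff: float = 0.30) -> str:
    """Naive reading of the test proposed above.

    `other_r5xx_hit` is the AA hit measured on X1800 or X1900.
    `cutoff` is an assumed midpoint between the ~20% and ~40% figures.
    """
    if other_r5xx_hit >= cutoff:
        # The bigger chips behave like the older architectures, so the
        # feature X1600 has and they lack (AA-independent double-Z) is suspect.
        return "X1600's double-Z is the likely reason"
    # The bigger chips also take a small hit, so something they share with
    # X1600 (ring bus, or another factor) is the more likely reason.
    return "something shared across the family is the likely reason"


# Hypothetical frame rates, for illustration only.
x1800_hit = aa_hit(fps_no_aa=50.0, fps_aa=30.0)
print(f"X1800 hit: {x1800_hit:.0%} -> {likely_cause(x1800_hit)}")
```

Even then it only narrows things down, since fillrate:bandwidth ratios differ between the cards.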
     
  11. FrameBuffer

    Banned

    Joined:
    Aug 7, 2005
    Messages:
    499
    Likes Received:
    3
    Originally posted by Pete The Wise: But, according to Dave's fillrate testing, that double Z capability seems to carry over to MSAA rather than be superseded by it, in essence doubling the inherently double-Z 2x MSAA per clock. So X1600 seems to be the first part to yield 4x Z samples per 2x MSAA pass.

    So in essence does this mean then that the Rv530 (x1600) delivers fsaa 4x at the performance hit of what would normally be 2x fsaa, or is it a quality issue where 2X FSAA yields higher (4X) AA ?? By the sound of it, Dbl-Z is a performance issue rather than one of delivering higher quality. Sorry if I mis-interpreted your post.

    Looks like I need to do some homework as well..
     
  12. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    The way I understand it, X1600 takes less of a hit from 2xAA in cases that would be Z-limited on, for example, X1800
     
  13. AndrewM

    Newcomer

    Joined:
    May 28, 2003
    Messages:
    219
    Likes Received:
    2
    Location:
    Brisbane, QLD, Australia
    Not sure if NV2A does it, NV2x does not.
    R300+ does double-z when doing MSAA, but not without. NV3x+ does double-z with or without MSAA, but not double-double-z when doing MSAA. So, really, NV3x can utilise its extra Z (stencil) hardware when MSAA is not enabled.

    :)
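Put as a lookup table, the summary above looks something like this. The R300+ and NV3x+ rows follow the description in this post; the RV530 row is taken from the fillrate discussion earlier in the thread. Rates are relative multiples of a single Z sample per clock, and the names are only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZRate:
    """Relative Z rate per clock, as multiples of the single-Z rate."""
    without_msaa: int
    with_msaa: int


# Values as described in the thread, not measured here.
Z_RATES = {
    "R300+": ZRate(without_msaa=1, with_msaa=2),  # double-Z only with MSAA
    "NV3x+": ZRate(without_msaa=2, with_msaa=2),  # double-Z either way
    "RV530": ZRate(without_msaa=2, with_msaa=4),  # double-Z carries over to MSAA
}


def msaa_gain(chip: str) -> float:
    """How much the Z rate grows when MSAA is switched on."""
    rate = Z_RATES[chip]
    return rate.with_msaa / rate.without_msaa


for chip in Z_RATES:
    print(f"{chip}: Z rate x{msaa_gain(chip):g} with MSAA on")
```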
     
  14. Pete

    Pete Moderate Nuisance
    Moderator Legend

    Joined:
    Feb 7, 2002
    Messages:
    5,777
    Likes Received:
    1,814
    Way to set me up for a fall! :lol: Really, it's more like Pete the Sponge.

    Don't be sorry, be suspicious! :) Remember, I'm not sure what I told you was entirely correct. I'm just letting you in on my thought process. I do so with good intentions more than good understanding.

    Edit: Be suspicious of the entire paragraph below. See Xmas', 3dcgi's, and Dave's corrections on the next page.

    I doubt RV530 gives us 4xAA at 2xAA speeds. Remember, 4x MSAA by definition requires an extra clock cycle compared with 2x, another pass through the ROPs (or wherever it is those two samples per clock are hiding), so you're losing some performance right there. I'm sure bandwidth factors in, too. Oh, and to be clear, we're talking about MSAA, not FSAA. MSAA doesn't touch (or at least alter) every pixel in the scene, so I'm not sure I'd call it "full scene/screen" (like straightforward SSAA or the V5's AA).

    Thanks, Kaotik. That makes sense. It's a question of what's the limiting factor, and RV530's double Z can shift the bottleneck to another part of the GPU.

    Well, sure, Andrew, if you want to put it in readable form. Heh. I recall Deano saying that NV2A had double Z, so I thought I'd throw it out there. (I wonder if all MSAA implementations have double Z, starting with GF3/NV20, in which case NV2A would have it by default.)
     
    #74 Pete, Apr 9, 2006
    Last edited by a moderator: Apr 18, 2006
  15. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Apart from stencil-fillrate-limited situations, 2xMSAA has been, in a relative sense, "for free" on GPUs for years now.

    The only other design I'm aware of that is capable of single cycle 4xMSAA is Falanx' Mali for the PDA/mobile market.
     
  16. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    http://www.beyond3d.com/forum/showpost.php?p=663192&postcount=14

    Based on fillrate tests:

    X1600XT

    [IMG]

    X1800XL

    [IMG]

    7800GTX

    [IMG]

    I'm in too much of a rush to gather together the X1900XTX, 7900GTX results - I'm sure someone can manage.

    Also, we should have Colour+Z measured fillrate results for the new GPUs from B3D's reviews.

    Jawed
     
  17. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I'd prefer something stencil fillrate limited like Fablemark.
     
  18. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,493
    Likes Received:
    474
    Correct, it is a performance optimization.
     
  19. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,344
    Likes Received:
    176
    Location:
    On the path to wisdom
    Why would it need an extra clock cycle "by definition"?
     
  20. trumphsiao

    Regular

    Joined:
    Jan 31, 2006
    Messages:
    285
    Likes Received:
    11
    Well, it occurs to me that RV560/RV570 is a 4xMSAA-capable GPU with a dinky performance hit. Damn, I'm eagerly awaiting all three of these SKUs.
     