512MB GeForce 7800 GTX

Discussion in 'Pre-release GPU Speculation' started by KimB, Oct 31, 2005.

Thread Status:
Not open for further replies.
  1. Subtlesnake

    Regular

    Joined:
    Mar 18, 2005
    Messages:
    347
    Likes Received:
    126
    It depends what you consider "working". This is what Dave had to say:

    "The nature of the issue it suffered means that development on X1800 was next to impossible. When I spoke with Huddy just before the launch he bemoaned the fact that I was likely to get a board before his devs were; when I said "Hold on, Remedy at least had one for E3", his reply was "Yep, that went straight back". They haven't given out boards because the issue caused them to be flakey (even ATI haven't been able to do much in the way of driver optimisation) and it would be more frustrating to devs to use them."

    http://www.beyond3d.com/forum/showthread.php?t=24394&highlight=flakey
     
  2. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Mintmaster,

    Shadermark performance could be theoretically fine tuned too, or not?

    I was under the impression that large ISVs had working prototypes (with lower frequencies than after the launch) since spring this year.
     
  3. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    I had two conversations over a couple of weeks. One was with Richard Huddy at the Develop awards, who was moaning at me, saying "the stupid thing is, you're more likely to see boards before my developers are". I said to him, "Well, hold up, devs had them at E3 - Remedy is in your area, they've got one"; "Nope, they went back" was his reply. Subsequently I talked to Eric and Raja at the Ibiza event and said "Well, the driver should be good seeing as you've had so long with them", and they told me that the nature of the problem occurred fairly randomly - once they thought they had workarounds for one app it would suddenly pop back up again, limiting the development - and it hit me that this was also why very few devs had them until recently. The driver guys went back through the R300/R420 code base and optimised that instead, which is why you see a sudden speed boost in X800/X850 performance in the driver set just prior to X1000's release.

    Anyway, Jawed, I doubt optimisations will come from getting to know the scheduler better - or if they do, ironically, I think it's going to come from shorter shaders and optimising the texture access better (or it should make for much better HyperMemory parts). I quizzed Eric about the dispatch processor, asking if this wasn't really a nice PR dress-up, and he came back and said the main performance aspects of it are flow control and texture accesses. Outside of that, the old dispatch processor on their SM2.x parts was relatively fixed function, based on 4 levels of dependency - the new dispatch processor does away with those limitations and can handle a much larger pool of threads without caring why they are sleeping.
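
    The dispatch-processor change described above can be sketched as a toy model (Python, purely illustrative - the `idle_cycles` helper, latencies and thread counts are our own invention, not ATI's design): the scheduler simply runs whichever thread in the pool is ready, without caring why the others are sleeping, and with enough threads resident the ALUs never starve on texture latency.

```python
# Toy model of a "don't care why it's sleeping" thread scheduler.
# All numbers (latency, cycle counts) are invented for illustration.

def idle_cycles(num_threads, tex_latency=100, alu_per_fetch=10, fetches=5):
    """Simulate a thread pool: each thread runs `alu_per_fetch` ALU
    cycles, then sleeps `tex_latency` cycles waiting on a texture,
    `fetches` times over. Returns how many cycles the ALU sat idle."""
    wake_at = [0] * num_threads           # cycle each thread becomes runnable
    remaining = [fetches] * num_threads   # fetches left per thread
    alu_left = [alu_per_fetch] * num_threads
    cycle = idle = 0
    while any(r > 0 for r in remaining):
        ready = [i for i in range(num_threads)
                 if wake_at[i] <= cycle and remaining[i] > 0]
        if not ready:
            idle += 1                     # ALU starved: all threads waiting
        else:
            t = ready[0]                  # run any ready thread; why it
            alu_left[t] -= 1              # slept is irrelevant
            if alu_left[t] == 0:          # issue the next texture fetch
                wake_at[t] = cycle + 1 + tex_latency
                remaining[t] -= 1
                alu_left[t] = alu_per_fetch
        cycle += 1
    return idle

print(idle_cycles(2))    # small pool: ALU starves during fetches
print(idle_cycles(16))   # big pool: latency largely hidden
```

    With only a couple of threads the ALUs sit idle for most of each texture round-trip; a larger pool absorbs it - which is the point of tracking a big pool of sleeping threads rather than a fixed number of dependency levels.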
     
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    Thanks, interesting posting.

    I was comparing X1800XT to X800XT - in a broad sense. There's so many new bits in R520 that it's hard to tease-apart improvements in the pixel shader.

    At the thread level - not the instruction level. (I got bored, ages ago, of writing "out of order batch scheduling"...)

    But they can't predict texturing latency. And it's texturing latency that R5xx's ultra threaded scheduler is there to hide. Because threads are only 16 fragments in size, it's no longer possible for the GPU to rely upon batch size to hide texturing latency.

    So the scheduler is working, pretty hard, the whole time. But is it trimming the fat on texture latency? Is it maximising cache coherency? Is it slickly cooperating with the memory controller?

    Alternatively, it might be that the scheduler is hiding texturing latency simply by, in effect, creating "batches" of 256 (or whatever) fragments. Since there's no dynamic branching in any of these games, it's trivial to make the scheduler just lump fragments together into big batches - and then R520 looks just like R420. Maybe that's what's going on.

    Which would be sad, really. And worse if R520 is creating "fixed-size" batches of this type, analogous to R420's fixed-size batches. What if the right size to hide the current instance of latency is 64 fragments? Or 512? Ah well, maybe we'll find out as time passes.
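
    A back-of-envelope version of the batch-size argument (Python; the cycle counts and the `threads_to_hide` helper are made-up illustrations, not measured R520 figures): with 16-fragment threads, the number of threads that must be resident to cover a given texture latency follows directly from how many ALU cycles each thread can run between fetches.

```python
from math import ceil

def threads_to_hide(latency_cycles, alu_cycles_between_fetches):
    """Minimum number of resident threads so the ALUs never stall on a
    texture fetch of the given latency. Crude model: identical shaders
    on every thread, no cache or bandwidth effects."""
    # one thread per slice of latency we must cover, plus the one running
    return ceil(latency_cycles / alu_cycles_between_fetches) + 1

# e.g. ~200 cycles of latency, 16 ALU cycles between fetches:
n = threads_to_hide(200, 16)
print(n, "threads, i.e.", n * 16, "fragments in flight")   # 14 threads, 224 fragments
```

    14 threads of 16 fragments is 224 fragments in flight - the same ballpark as the ad-hoc "256-fragment batches" above, which is exactly why a scheduler that merely lumped threads into fixed-size super-batches would look no smarter than R420's fixed batches.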

    I was doing that earlier. I noticed that Driverheaven got quite different results for G70. I suspect B3D's results are out of date. Or maybe it's different clocks...

    I've tried this. It gets depressingly meaningless. Well, I've failed to extract any meaning...

    I've highlighted things you've said which I found meaningful - thanks! :
    That's only the case in 4xAA/8xAF though. So that's a function of the use of memory, rather than the pixel shader engine - isn't it?

    In general my hopes have been dashed - at least with R520 (and RV515). RV530 and R580 - now they are different matters. Soon I'm gonna start a thread on this topic - I was hoping B3D would have published a detailed review of RV515 and RV530 by now...

    I've always thought Shadermark is a spectacularly irrelevant benchmark (as far as game performance is concerned). That's the main reason why I rapidly lose interest in pulling apart the results.

    (Aside: I actually think historically NVidia built GPUs by tweaking architectures to suit these kinds of synthetic tests. G70 would appear to be the first GPU where they actually admit to analysing real game code. A corollary for this would appear to be GPUBench - which generally predicts that NVidia architectures are far better - but various GPGPU implementations show the reverse.)

    Render targets also seem much faster in R520 (tests 25 and 26).

    I can't wait to see how that battle shapes up! The question is, are FEAR and CoD2 maths-heavy? Surely someone with the nous could be tempted into extracting the shaders from these games and finding out!!!

    Agreed on all that.

    That's where my lack of game/shader dev skillz catches up with me. I presume you're hypothesizing that the driver is able to insert code that surrounds these lighting terms with a test for the visibility of each light on the pixel being shaded. Ouch.

    Jawed
     
  5. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    Flow control and texture accesses have always been how I've understood the scheduler's purpose - whether in Xenos or R5xx - particularly taking cognisance of the much smaller batches these new architectures work on.

    I'm curious to find out if the 3:1 ratio in RV530 (and R580) actually improves texturing rate above that seen by the 1:1 ratio versions of the architecture (R580 versus R520 - RV530 versus RV515 is much less clear cut). RV530 looks very good against RV410 and NV43 despite its heavy texturing disadvantage.

    I'm not thinking of the minimised texturing latency case which we've been discussing. Instead I'm thinking of the case where texture pipes are idle, and presuming that the 3:1 GPUs minimise the proportion of time that the texture pipes are idle.
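
    The idle-texture-pipe argument can be put in numbers with a crude throughput model (Python; the 12:1 shader mix and the `tex_pipe_utilization` helper are invented examples, not measured workloads): whichever of the maths or texture workload takes longer sets the frame time, and raising the ALU:TEX ratio shrinks the maths time, so the texture pipes spend a larger fraction of that time busy.

```python
def tex_pipe_utilization(math_ops, tex_ops, alu_tex_ratio):
    """Fraction of the time the texture pipes are busy, in a crude
    throughput model: per clock the chip retires `alu_tex_ratio` math
    ops for every 1 texture op. Illustrative numbers only."""
    alu_time = math_ops / alu_tex_ratio   # clocks spent on maths
    tex_time = tex_ops                    # clocks spent on texturing
    total = max(alu_time, tex_time)       # the longer pipe sets the pace
    return tex_time / total

# A maths-heavy shader: 12 math ops per texture fetch.
print(tex_pipe_utilization(12, 1, 1))  # 1:1 part: tex pipes mostly idle
print(tex_pipe_utilization(12, 1, 3))  # 3:1 part: idle fraction shrinks
```

    For that shader the texture pipes are busy only 1/12 of the time on a 1:1 part but 1/4 of the time on a 3:1 part - the same texturing hardware, better amortised.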

    Anyway, I'm hoping B3D will provide some insights soon.

    Jawed
     
  6. SugarCoat

    Veteran

    Joined:
    Jul 17, 2005
    Messages:
    2,091
    Likes Received:
    52
    Location:
    State of Illusionism

    and there's a good reason for that too. Take a look at EVGA's own forums: there's some real hell going on with a lot of GT owners - random lock-ups, artifacts, etc. - and this is straight out of the box. Even EVGA's own guys said that removing the clock domain split, as well as keeping the 2D and 3D frequencies locked the same, promoted stability on the whole. There are literally a mess of BIOSes to choose from on their community forum (great place btw, you can ask for a BIOS build and they'll do it for you, pretty much no problem).

    Now that there's this much higher clocked card, I expect to see one key strength of the GTX and GT (power consumption, and probably heat) go right out the window, both at load and idle. But I do think it was needed for stability.
     
  7. MuFu

    MuFu Chief Spastic Baboon
    Veteran

    Joined:
    Jun 12, 2002
    Messages:
    2,258
    Likes Received:
    51
    Location:
    Location, Location with Kirstie Allsopp
    Hah, totally misread that entire thing. I blame Jim Beam. :D
     
  8. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Well, if new BIOSes fix this, then this is an eVGA issue, not a GeForce 7800 GT issue.
     
  9. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,511
    Likes Received:
    24,411
    Not if they were using the Nvidia Reference BIOS and PCBs.
     
  10. Sxotty

    Legend

    Joined:
    Dec 11, 2002
    Messages:
    5,496
    Likes Received:
    866
    Location:
    PA USA
    Not if they are running the reference PCB out of spec by shipping pre-overclocked cards.
     
  11. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Well, all I know is that I have two 7800 GTs in my system right now that have been running for approximately one week without rebooting or crashing.
     
  12. Arty

    Arty KEPLER
    Veteran

    Joined:
    Jun 16, 2005
    Messages:
    1,906
    Likes Received:
    55
    So any place that has their review up? I wish I could read one before hitting bed.:cool:
     
  13. kemosabe

    Veteran

    Joined:
    Jun 19, 2003
    Messages:
    1,001
    Likes Received:
    16
    Location:
    Montreal, Canada
    According to AT, an R520 AIW board is launching this week. I would expect it to be based on the XL, but is there any reason it couldn't be a pre-overclocked XT core, positioned against the 512MB GTX?
     
  14. AlphaWolf

    AlphaWolf Specious Misanthrope
    Legend

    Joined:
    May 28, 2003
    Messages:
    9,470
    Likes Received:
    1,686
    Location:
    Treading Water
    Perhaps a higher clock and faster memory than a standard XL (550/1200 or something). I doubt they will go for PE-type specs in an AIW package.
     
  15. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Of course, but I don't think it would get as much of a speedup from early out. I can't run it right now due to video card problems, but I don't think the main object is ever lit from behind (and thus showing only ambient lighting), so the N dot L early out won't really help in this benchmark. It's likely to help in real games, though, because there the player and the lit objects are always moving. It won't help a game like Doom3, where the stencil shadow acts as an equivalent early out already, but games like FarCry, FEAR, or COD2 are a different story (assuming that much of the shader load is in lighting).
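
    The N dot L early out can be sketched like this (Python standing in for a pixel shader; the function names, vectors and `heavy_lighting` callback are ours, purely for illustration - in a real shader this would be a dynamic branch): when the surface faces away from the light, the shader returns before any of the expensive lighting maths runs.

```python
def shade(normal, light_dir, albedo, heavy_lighting):
    """Per-pixel lighting with an N.L early out. `heavy_lighting`
    stands in for the expensive specular/shadow maths a real shader
    would do for lit pixels."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    if n_dot_l <= 0.0:
        return (0.0, 0.0, 0.0)         # faces away from the light: skip it all
    diffuse = tuple(a * n_dot_l for a in albedo)
    return heavy_lighting(diffuse)     # only lit pixels pay the full cost

calls = []
expensive = lambda d: (calls.append(1), d)[1]   # counts invocations
shade((0, 0, 1), (0, 0, -1), (1, 1, 1), expensive)  # back-facing: skipped
shade((0, 0, 1), (0, 0, 1), (1, 1, 1), expensive)   # lit: full path runs
print(len(calls))   # the expensive path ran once, not twice
```

    In a stencil-shadowed renderer the shadow volume pass already culls most of these pixels, which is why the branch buys little there.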

    If you're talking about NVidia hand tuning the NV40/G70 driver for Shadermark shaders, that's very possible since Shadermark was an important and widely used benchmark back when NV40 was released. It would also explain why Shadermark results aren't reflected in pixel shader heavy games. But I'm not too sure about this, and it could be just coincidence that the Shadermark shaders suit NV40/G70 or are easier to compile than game shaders.
     
  16. HAL

    HAL
    Newcomer

    Joined:
    Nov 12, 2005
    Messages:
    103
    Likes Received:
    2
  17. HaLDoL

    Newcomer

    Joined:
    Jun 10, 2004
    Messages:
    140
    Likes Received:
    2
    When exactly does the NDA expire?
     
  18. Karma Police

    Regular

    Joined:
    Sep 2, 2004
    Messages:
    433
    Likes Received:
    6
    Location:
    192.168.2.1
    Right................................................about...........................
     
  19. Karma Police

    Regular

    Joined:
    Sep 2, 2004
    Messages:
    433
    Likes Received:
    6
    Location:
    192.168.2.1
  20. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Well, for the most part, you can. Textures are very predictable. Sure, bandwidth sometimes holds you back longer than usual, but that means you're throttled anyway. Dependent textures, on the other hand, are a little tougher, and you can't start fetching way back at the vertex shader. You need a fat FIFO (which is essentially a simple thread manager) to absorb the latency.
    What's so sad about this? It's the best way to do things. Before you go off on your rants, remember that since about the Radeon/Geforce3 timeframe, most video chips from ATI and NVidia have been very good at doing what they do. They use FIFOs to hide the 50-100 cycles of latency, and the memory controller groups requests to obtain high bandwidth utilization. There's no point scheduling in a different way than what's worked nearly ideally for generations.
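
    The FIFO sizing implied here is just Little's law (sketched in Python; the `fifo_depth_needed` helper is ours, though the 50-100 cycle latency range is from the post above): to keep issuing pixels at a given rate while each texture fetch is in flight for L cycles, the FIFO must hold latency times rate entries.

```python
def fifo_depth_needed(latency_cycles, pixels_per_clock=1):
    """Little's-law sizing for a texture-latency FIFO: entries in
    flight = arrival rate * time in flight. Below that depth the
    front of the pipe stalls waiting for fetches to return."""
    return latency_cycles * pixels_per_clock

print(fifo_depth_needed(100))      # 100 entries hide 100 cycles at 1 pixel/clk
print(fifo_depth_needed(100, 4))   # a quad pipeline needs 4x the depth
```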


    Huh? Improving the texture rate?
    That doesn't improve the texture rate, it speeds up the math-limited steps. You have a rather odd way of looking at things. :smile:

    Anyway, I think FEAR is the best case for RV530's 3:1 ratio:
    http://www.driverheaven.net/reviews/X16_GS/fear.htm
    Before exe rename (add ~35% to X1000 cards): http://www.anandtech.com/video/showdoc.aspx?i=2575&p=4
    If you extrapolate this to R580, you're talking about over two times 7800GTX performance. :shock:
     