Nvidia GT300 core: Speculation

Discussion in 'Architecture and Products' started by Shtal, Jul 20, 2008.

Thread Status:
Not open for further replies.
  1. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,062
    Likes Received:
    3,119
    Location:
    New York
    Hmmm in the course of a day we've gone from GT300 showing up in 6 months to GF100 being as fast as RV870X2. Can't wait to see where we end up tomorrow :lol:
     
  2. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    22,146
    Likes Received:
    8,533
    Location:
    ಠ_ಠ
    To nFinity and beyond of course. ;)
     
  3. KonKort

    Newcomer

    Joined:
    Dec 29, 2008
    Messages:
    89
    Likes Received:
    0
    Location:
    Germany, Ennepetal
    Why are you so skeptical about the chip? If this is your opinion, you will be surprised in the coming months.
     
  4. -The_Mask-

    Newcomer

    Joined:
    Sep 20, 2009
    Messages:
    51
    Likes Received:
    0
    Location:
    The Nederlands
    Impossible if you ask me. :p
     
  5. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Sure. Those weren't necessarily either/or scenarios....

    I like VLIW's chances better if it can build an "IW" using work from multiple threads. [My understanding of HP's foray into VLIW was that it was less successful than hoped for.] Mind you, there are probably easier ways of thinking of that kind of architecture than VLIW.... Simultaneous Asymmetric Dispatch or something with less of a, err, sad acronym.

    I'm also wondering if the way that DP works affected the thinking of how MADD might work. Instead of single-cycle MADD, maybe it makes more sense to have two units working on the same piece of data, across two cycles, the results of one feeding the other. Across enough work, it's basically the same speed, but it seems like less work has to happen within a cycle (fewer gates, allowing for higher clocks), and it would seem easier to expand your LIW repertoire. Of course, maybe that's how MADD works now anyway :shrug:
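    A minimal sketch of the "same speed across enough work" point, assuming each stage takes exactly one cycle and every operation is independent of the others. The function names and the one-cycle-per-stage timing are illustrative assumptions, not a description of any real MADD unit.

```python
def cycles_single_unit(n_ops: int) -> int:
    """Assumed baseline: one unit completes a full multiply-add every cycle."""
    return n_ops


def simulate_two_stage(a, b, c):
    """Cycle-by-cycle model of a multiplier feeding an adder one cycle later.

    Returns the results of a[i] * b[i] + c[i] and the number of cycles used.
    """
    results = []
    in_flight = None  # (product, addend) produced last cycle, waiting for the adder
    cycles = 0
    i = 0
    while i < len(a) or in_flight is not None:
        cycles += 1
        # Stage 2: add the product that the multiplier produced last cycle.
        if in_flight is not None:
            product, addend = in_flight
            results.append(product + addend)
            in_flight = None
        # Stage 1: start the next multiply if any work is left.
        if i < len(a):
            in_flight = (a[i] * b[i], c[i])
            i += 1
    return results, cycles


if __name__ == "__main__":
    n = 1000
    _, two_stage_cycles = simulate_two_stage([1] * n, [2] * n, [3] * n)
    print(cycles_single_unit(n), two_stage_cycles)
```

    Under these assumptions the split design costs one extra cycle to fill the pipeline, so the difference vanishes as the amount of independent work grows. Dependent back-to-back MADDs would see the extra latency, which this sketch does not model.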
     
  6. SirPauly

    Regular

    Joined:
    Feb 16, 2002
    Messages:
    491
    Likes Received:
    14
    The beauty of forum conjecture at times! :)
     
  7. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,247
    Likes Received:
    4,465
    Location:
    Finland
    Chips that are month(s) late usually haven't had a good track record :wink:
    But I don't have any real opinions on how it will perform, just keeping that as a possibility too.
    I'd say a good scenario from nV's point of view is that it's as much faster than Cypress as GT200 was faster than RV770
     
  8. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    I still don't think it will ship this year ... hell, I don't think we will get a clear shipping date when the first official information is released.
     
  9. KonKort

    Newcomer

    Joined:
    Dec 29, 2008
    Messages:
    89
    Likes Received:
    0
    Location:
    Germany, Ennepetal
    Why is GF100 late? How can you judge that? I reported in January that Nvidia's next generation chip will come in Q4/2009. So where do you see a delay?

    You cannot say that GF100 is delayed just because AMD got the first DirectX 11 chips out a few weeks earlier.
    But I will not deny that Nvidia's chip had some problems in the summer and could otherwise already be on the market.
     
  10. Slappi

    Newcomer

    Joined:
    Apr 22, 2004
    Messages:
    92
    Likes Received:
    2
    It will be ready for Christmas builds.

    If you are lucky you will get one before Thanksgiving.
     
  11. nutball

    Veteran Subscriber

    Joined:
    Jan 10, 2003
    Messages:
    2,492
    Likes Received:
    979
    Location:
    en.gb.uk
    Anyone else getting the feeling that two parallel Universes have become entangled? With the different names, dates, specs, problems/non-problems it's like we're talking about two different parts from two different companies.

    I need a lie down.
     
  12. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I'm not making any bets anymore. Last time I had to write a public apology to Rys LOL.
     
  13. Chris123234

    Regular

    Joined:
    Jan 22, 2003
    Messages:
    306
    Likes Received:
    0
    It all makes sense now!
     
  14. Slappi

    Newcomer

    Joined:
    Apr 22, 2004
    Messages:
    92
    Likes Received:
    2

    What was the bet?
     
  15. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I started a humble little write up back then with that paragraph.
     
  16. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    D3D10.1 requires 32 vec4 attributes per vertex to be supported, as opposed to 16 in D3D10. So that doubling in interpolation workload might steer the architects in the direction of increasing interpolation rate. Except, of course, that merely by adding ALUs, the increase occurs. So really what it comes down to is rasterisation:interpolation rate.

    The other important question is, what the fuck is "pull model interpolation" in D3D11?

    Jawed
     
  17. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    That's a really bad idea:
    • the techniques for calculating all kinds of transcendentals have common hardware structures, so splitting these structures into distinct units is simply a waste
    • the general trend should be for less acceleration of transcendentals, not more - in general computation transcendentals are much less commonly used (about 5% if I remember right) than the hardware provides for (~25%)
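    A rough back-of-envelope sketch for the two percentages above. It assumes a deliberately simple model: the dedicated units can only execute transcendentals, the remaining lanes can only execute everything else, and both run in parallel. Real designs often let these units do other work as well, so treat the function name and the model as illustrative assumptions only.

```python
def transcendental_unit_utilisation(workload_share: float, hardware_share: float) -> float:
    """Fraction of time the transcendental units are busy.

    workload_share: fraction of executed operations that are transcendentals.
    hardware_share: fraction of issue capacity dedicated to transcendental units.
    Assumes (hypothetically) that the two kinds of unit cannot help each other.
    """
    if not (0.0 <= workload_share <= 1.0 and 0.0 < hardware_share < 1.0):
        raise ValueError("shares must be fractions, with 0 < hardware_share < 1")
    # Time to drain each kind of work on its own units, in arbitrary units.
    general_time = (1.0 - workload_share) / (1.0 - hardware_share)
    transcendental_time = workload_share / hardware_share
    # The slower side determines how long the whole workload takes.
    total_time = max(general_time, transcendental_time)
    return transcendental_time / total_time


if __name__ == "__main__":
    # The figures quoted in the post: about 5% of operations, about 25% of hardware.
    print(f"{transcendental_unit_utilisation(0.05, 0.25):.1%}")
```

    With the quoted figures this model leaves the transcendental units busy for roughly one cycle in six, which is the sense in which the hardware over-provides for them.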
    Jawed
     
  18. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    There are so many ways to make a chip "complex". Do it smartly and you can get way more performance - it's a question of how radical you're prepared to be.

    For example it's like comparing the performance of two GPUs: one with early-Z rejection and one without. How do you "measure" complexity there? All you can do is talk about how the transistor/mm²/power budgets were spent.
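    To make the early-Z comparison concrete, here is a toy sketch that counts shader invocations with and without early rejection. The fragment format, the `shade` stand-in and the draw order are all made-up assumptions for illustration; no real GPU pipeline is this simple.

```python
def shade(material: str) -> str:
    """Stand-in for an expensive pixel shader (hypothetical)."""
    return material.upper()


def render(fragments, early_z: bool):
    """Process (pixel, depth, material) fragments in submission order.

    With early_z=True, occluded fragments are rejected before shading.
    With early_z=False, every fragment is shaded and depth-tested afterwards.
    Returns the final colour per pixel and the number of shader invocations.
    """
    depth = {}
    colour = {}
    shaded = 0
    for pixel, z, material in fragments:
        nearest = depth.get(pixel, float("inf"))
        if early_z and z >= nearest:
            continue  # rejected before any shading work is spent
        shaded += 1
        result = shade(material)
        if z < nearest:  # late depth test: keep only the nearest fragment
            depth[pixel] = z
            colour[pixel] = result
    return colour, shaded


if __name__ == "__main__":
    fragments = [
        ((0, 0), 0.2, "red"),
        ((0, 0), 0.5, "green"),
        ((1, 0), 0.7, "blue"),
        ((0, 0), 0.9, "blue"),
        ((1, 0), 0.3, "red"),
    ]
    print(render(fragments, early_z=False))
    print(render(fragments, early_z=True))
```

    Both runs produce the same image, but the early-Z run shades fewer fragments. How much it saves depends entirely on draw order and overdraw, which is why a single "complexity" number says little about the resulting performance.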

    This is the Larrabee bet: let's say for 5 billion transistors on 28nm for 200W Larrabee overtakes the more traditional GPUs.

    What I find disappointing about R800 is that apart from doubling the RBEs, deleting SPI and tweaking up the GDS size and implementing UAV/append/consume-specific buffers I don't get any real sense of the architecture taking a leap forwards. Of course, apart from the RBEs, it's hard to tell how effective the rest has been (or how well tessellation actually works). And my long-standing argument is that the architecture (underlying design of units) is actually solid enough to work for a long, long time. And there are games whose performance is ~1.8-1.9x HD4890. So it's not even as bad as it sometimes seems. So there's a degree of wait-and-see about it.

    Anyway, I'm still expecting NVidia to be pretty radical. NVidia's RV670->RV770 as it were. Plus some, with a bit of luck.

    Jawed
     
  19. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,249
    Likes Received:
    3,419
    We can certainly hope for something like this. But RV670->RV770 was accomplished by eliminating mostly obvious mistakes in the R600 design and by some magic which allowed them to pack 2.5 times more ALUs in almost the same complexity (transistors and die size). So while we can hope for GF100 to somewhat repeat that success, I'd say that counting on it as a "minimum" is highly unrealistic.

    I expect GF100 to have a bigger performance advantage over Cypress than GT200 had over RV770 while the difference in complexity will be smaller. But I don't think that it's wise to expect Hemlock-level performance from one GF100 chip.
     
  20. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    The arbitrary R/W in the LDS is important too.

    I'm not convinced there are any big architectural leaps left to make, DWF seems something which can be handled in software ... the only important leap left to make IMO is to fold the pixel cache into L2 (making it read/write, with coherency being guaranteed by relatively simple fences ... doesn't give the low latency cross core coherency of Larrabee, but I don't think that's really necessary). After that I don't really see how it will be much more difficult to program than, say, Larrabee. If you want to use the option of using the LDS with its comparatively huge gather bandwidth it will be harder to program ... but it's good to have options.
     
    #2660 MfA, Sep 29, 2009
    Last edited by a moderator: Sep 29, 2009