NVIDIA GF100 & Friends speculation

Discussion in 'Architecture and Products' started by Arty, Oct 1, 2009.

  1. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    13,878
    Likes Received:
    4,727
  2. jimmyjames123

    Regular

    Joined:
    Apr 14, 2004
    Messages:
    810
    Likes Received:
    3
    So what? Where were you to cry "compromising execution" when NV30 came out later and slower than R300, when R600 came out later and slower than G80, and when Larrabee never even came out? Clearly you don't know anything about the history of GPU design cycles for NVIDIA, ATI, and others.

    Are you purposely trying to be dense, or is that simply the norm for you? I never said anything about RV7xx/8xx not meeting "requirements" (whatever that means). All I said was that GF100 has a very strong graphics and compute feature set.

    Again, re-read what I said. I said that the GF100 architecture lends itself more to scaling down to lower-end designs compared to GT200. Anyone who claims otherwise is simply showing ignorance.


    You sound like an armchair critic (or perhaps someone who has an axe to grind?). Clearly you don't seem to know the first thing about new product development, especially with respect to high-end GPUs. If it were so easy to do with flawless execution, then everybody would be doing it, including Intel. ;)
     
    #5002 jimmyjames123, Mar 26, 2010
    Last edited by a moderator: Mar 26, 2010
  3. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Oh, neat. The limitations suggest that they aren't doing what I suggested, so do you think they lengthened the ALU pipeline? I can't imagine that Cypress can do A*B*C with 8 cycles latency.

    Nothing is a major bottleneck right now, but it's still there, particularly during framerate minimums. I also think it might be a bigger problem in the future. I'd say faster backface and frustum culling would be a great first step if not setup.
     
  4. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Well if it's random then cache doesn't really make a difference ;)

    But what I was trying to say was why couldn't ATI use its L2 for R/O buffers? And is L1 only for textures?
     
  5. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Why is that? Before GF100, derivatives had half or one quarter the shading power but just as much geometry throughput. GF100 derivatives will be slower in all aspects, thus a greater step down. I don't see why it's so much easier to execute, either. Before GF100, there was basically no communication between clusters, so you could chop them off quite easily. GF100 has much more communication from the GPCs to the L2 and between the polymorph engines.

    Thinking that GF100 is at least as hard as GT200 to scale down, if not more, is not idiotic at all.
     
  6. pcchen

    pcchen Moderator
    Moderator Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,018
    Likes Received:
    582
    Location:
    Taiwan
    This could be interesting :)
    I ported my old 8-bit histogram code to OpenCL and ran some experiments:

    1. Simply using global atomics, computing an 8-bit histogram of 16 MB of data:

    GTX 285: 0.273 s
    Radeon 5850: 0.043 s

    This means Cypress has insanely fast global atomics (probably with the help of the global data store?) compared to GT200.

    2. Using local atomics, split into 8 banks, computing an 8-bit histogram of 256 MB of data:

    GTX 285: 0.031 s
    Radeon 5850: 0.041 s

    All times are from OpenCL's internal profiler. These are kernel execution times only, i.e. they do not include the time spent copying data to/from host memory.
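
    Because the two experiments use different input sizes (16 MB vs. 256 MB), the raw times are not directly comparable. A minimal sketch that normalizes the quoted figures to throughput is shown below. It assumes both runs are otherwise comparable and that "MB" means the same unit in both; the names `timings`, `throughput_mb_per_s` and `local_speedup` are made up for illustration.

```python
# (input size in MB, kernel time in seconds), copied from the figures quoted above
timings = {
    "GTX 285": {"global": (16, 0.273), "local": (256, 0.031)},
    "Radeon 5850": {"global": (16, 0.043), "local": (256, 0.041)},
}


def throughput_mb_per_s(size_mb, seconds):
    """Input bytes processed per second, in MB/s."""
    return size_mb / seconds


def local_speedup(card):
    """How many times faster the local-atomics run is than the global-atomics
    run on the same card, once both are normalized to MB/s."""
    g = throughput_mb_per_s(*timings[card]["global"])
    l = throughput_mb_per_s(*timings[card]["local"])
    return l / g


for card, runs in timings.items():
    for kind, (size_mb, seconds) in runs.items():
        print(f"{card:12s} {kind:6s} {throughput_mb_per_s(size_mb, seconds):9.1f} MB/s")
    print(f"{card:12s} local vs. global: {local_speedup(card):.1f}x")
```

    Normalized this way, the local-atomics version comes out roughly 140 times faster than the global-atomics version on the GTX 285 and roughly 17 times faster on the Radeon 5850.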

    The "8-bit number overflowing into a 32-bit number" trick does not work on the Radeon 5850 because its OpenCL implementation does not support byte-addressable local memory.
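
    For readers unfamiliar with the two approaches, here is a CPU-side sketch in Python of the data flow only. It is not the OpenCL kernel from this post. In particular, the reading of "split into 8 banks" as eight private sub-histograms per work-group, selected by work-item index, is an assumption, as are the names `histogram_global`, `histogram_banked` and the `group_size` parameter.

```python
NUM_BINS = 256   # one counter per possible 8-bit value
NUM_BANKS = 8    # assumed meaning of "8 banks": 8 sub-histograms per work-group


def histogram_global(data):
    """Approach 1: every element updates one shared table.

    On a GPU each increment would be a global atomic, so all work-items
    contend for the same 256 counters in device memory.
    """
    hist = [0] * NUM_BINS
    for value in data:
        hist[value] += 1          # stands in for a global atomic increment
    return hist


def histogram_banked(data, group_size=1024, num_banks=NUM_BANKS):
    """Approach 2: each work-group counts into local sub-histograms first.

    Work-items are spread over `num_banks` private copies to reduce
    contention, and the copies are merged into the global table once
    per work-group.
    """
    hist = [0] * NUM_BINS
    for start in range(0, len(data), group_size):
        # Emulates the local memory of one work-group.
        banks = [[0] * NUM_BINS for _ in range(num_banks)]
        for lane, value in enumerate(data[start:start + group_size]):
            banks[lane % num_banks][value] += 1   # stands in for a local atomic
        # Merge step: at most one global update per bin per work-group.
        for bin_idx in range(NUM_BINS):
            total = sum(bank[bin_idx] for bank in banks)
            if total:
                hist[bin_idx] += total
    return hist
```

    Both functions return the same counts for the same input; the difference on real hardware is how many updates hit global memory versus on-chip local memory.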
     
  7. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Cypress is very fast at global atomics, but *much* faster at local atomics. I am surprised you don't get any benefit from using the LDS on the 5850.
     
  8. jimmyjames123

    Regular

    Joined:
    Apr 14, 2004
    Messages:
    810
    Likes Received:
    3
    What I meant is that, compared to GT200, the GF100 architecture lends itself more to scaling down to lower-end designs in a timely, cost-effective, and efficient manner once the high-end GPU is ready. After all, each GPC in GF100 is reportedly nearly a full GPU in and of itself, right? Wasn't it ATI/AMD who decided in recent years to move away from monolithic GPUs, in part because time to market for lower-end derivatives was very poor compared to the introduction of new GPUs at the high end? With GF100, it seems (at least on the surface) that once the high-end GPU is ready, time to market for the lower-end derivatives stands to be significantly better than before. Also, if NVIDIA can make a balanced high-end GPU, then by definition the lower-end derivatives should be balanced too. Is it really balanced to have lower-end derivatives with the same geometry throughput as the higher-end models?

    Of course, ATI/AMD's strategy will always have some merit. NVIDIA cannot easily get around the fact that monolithic GPUs take a long time to come to market and are very difficult to engineer. That said, the proof is in the pudding, and the results later this year will speak for themselves. I guess we'll learn a lot more in the coming months in seeing how everything plays out.
     
    #5008 jimmyjames123, Mar 26, 2010
    Last edited by a moderator: Mar 26, 2010
  9. psychoticdream

    Newcomer

    Joined:
    Mar 24, 2010
    Messages:
    24
    Likes Received:
    0
  10. TOAO_Cyrus

    Newcomer

    Joined:
    Feb 16, 2007
    Messages:
    16
    Likes Received:
    0
    When exactly does the NDA expire? Think we will see any early morning reviews?

    Edit: That thing's $15 under MSRP? I'm surprised.
     
  11. psychoticdream

    Newcomer

    Joined:
    Mar 24, 2010
    Messages:
    24
    Likes Received:
    0
    Tomorrow, right after 6 pm or 7 pm. Dunno what time the unveiling at PAX will be.
     
  12. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
    I think it's 6PM Eastern.

    Usually a full review will get leaked from some two-bit website online the night before the NDA expires, though. That's usually in the case of morning NDAs. So maybe something like 12 hours prior for this one.
     
  13. jimmyjames123

    Regular

    Joined:
    Apr 14, 2004
    Messages:
    810
    Likes Received:
    3
    Any chance that we will be able to view the video stream from this GF100 "launch" event?

    According to the provantage.com link above, availability of that PNY GTX 480 card is estimated at 3-10 business days. So that would mean by April 9.

    Edit: Actually, the link says nothing about availability, other than showing the company's average processing time of 3-10 business days. Looks like slim pickings for this card.
     
    #5013 jimmyjames123, Mar 26, 2010
    Last edited by a moderator: Mar 26, 2010
  14. Mindfury

    Newcomer

    Joined:
    Oct 6, 2009
    Messages:
    232
    Likes Received:
    0
  15. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    What I had in mind was that the 5770 and 5750 ship with a 128-bit bus and the 5790 ships with a 192-bit one. The core count will need to change a bit, but that is fixable if this decision is taken early enough in the design phase.
     
  16. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    LDS is definitely more power efficient (and prolly area efficient too) over an equal capacity block of gp cache. There is a definite use case for existence of these things.
     
  17. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    AFAIK, Evergreen has r/w caches attached to each MC. Since it is per MC, there are no issues of coherency. It is used for both write combining and atomics.
     
  18. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    13,878
    Likes Received:
    4,727
    Why would they do that?

    I figure their whole lineup will shift. The 5830 will fill the $200 gap, not a 5790. The 5850 will drop to the $260 price range the 5830 is in. The 5870 will slot into the $350 price tag and the e6 5870 2GB will slot into the $400 price point. They can put out a 5890 in the $400+ range if they want to take the performance crown. I think it's obvious that ATI has high-priced cards because there was no reason not to reap as much money from them as possible, and I think due to the large gaps in pricing we will see them drop down.

    The best thing is the 5870 at $350 is only a $30 drop from its original $379 MSRP. The 5850 at $260 is at its original MSRP.

    ATI shifting pricing like that (which shouldn't be very hard for them when you look at the original MSRPs) will really screw with NVIDIA's rumored $350/$500 price tags. Why buy a 470 when the 5870 is faster at the same price? Why buy a 480 for $500 when the 5870 is 20% slower but $150 less?

    With pricing like that the 5850 will become a really big card and most likely the sweet spot for gamers for the next 3-6 months.
     
  19. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Coz that's an opportunity to set up a 181mm2 chip against a ~330mm2 chip.

    Anyway, enough of this hole/pricing/placement talk. Back to architecture. :cool:
     
    #5020 rpg.314, Mar 26, 2010
    Last edited by a moderator: Mar 26, 2010