NVIDIA Fermi: Architecture discussion

Discussion in 'Architecture and Products' started by Rys, Sep 30, 2009.

  1. nutball

    Veteran Subscriber

    Joined:
    Jan 10, 2003
    Messages:
    2,133
    Likes Received:
    454
    Location:
    en.gb.uk
    Indeed it does, but not for many of the companies on that list.
     
  2. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,489
    Likes Received:
    907
    Sure, but OpenCL isn't just about HPC, and the Khronos group isn't just about OpenCL...
     
  3. nutball

    Veteran Subscriber

    Joined:
    Jan 10, 2003
    Messages:
    2,133
    Likes Received:
    454
    Location:
    en.gb.uk
    I know this. The latter part doesn't bother me; the former does. Maybe I'm just myopic, but the concept of a compute layer that scales from mobile phones to petaflop clusters sounds like a one-size-fits-nobody solution.
     
  4. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
    Logic fails to understand your statement!

    The Johnny-come-lately definitions are those of AMD/Nvidia. The words they are using have been in use for 20+ years and have established definitions and wide understanding within the EE and CE communities.

    So in essence, you need to rationalize your standpoint on the issue. Either it's good to just make up random definitions for words with long-established meanings, which leads to zero semantic content and sense, and Nvidia/AMD are doing a good job; OR it's bad, and Nvidia/AMD's marketing departments should be slapped for confusing the definitions for their benefit.
     
  5. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    I am more than sympathetic. But, to be fair, the "only" thing a standard thread has that an NVthread lacks is that a standard thread is the unit of dispatch. Otherwise, both types of threads contain state of execution (registers) and a PC (even if it isn't the one the warp is running/dispatching). At least, if I understand properly. I am assuming a lot. For example, if a single thread in a warp gains access to an atomic, I'm assuming that doesn't allow every thread in the warp access to it.
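    A rough CUDA sketch of the atomics case I have in mind (the kernel and names are made up, purely illustrative): only the threads whose condition holds perform the atomic, even though the whole warp steps through the branch together.

        __global__ void count_negatives(const float *data, int *count)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            // The whole warp executes this branch in lockstep, but only
            // the threads whose element is negative perform the atomic.
            if (data[i] < 0.0f)
                atomicAdd(count, 1);
        }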

    Perhaps someone else wants to spool up a compelling yarn for the use of thread? ;)
     
  6. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,705
    Likes Received:
    458
    In the context of GPUs (and this is still the 3D architectures and chips forum) Intel is the latecomer to this party ... language is defined in context. Regardless, strands and fibers are entirely new terms with no history and, as I said, counter-intuitive definitions ... and their highly Larrabee-specific use of the term threads is very much debatable.

    What NVIDIA calls threads are threads even in the traditional sense. From the kernel program's point of view they execute independently, branch independently and share a memory space. Their scheduling works wildly differently than on traditional SMP machines, but meh. What Intel calls threads are also threads ... language is flexible. NVIDIA chose that in this context threads would only refer to the threads of execution of the kernel (and not the threads of the SIMD program, nor the different contexts of the SIMD programs the hardware can switch between with vertical multithreading) and Intel did it the other way ... in this context NVIDIA was first with the decision.
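    To illustrate what I mean by the kernel program's point of view, a made-up CUDA kernel (nothing vendor-specific): each thread loops and branches on its own data and writes its own result, and the warp grouping underneath never shows up in the program.

        __global__ void iterate(const int *trip_counts, float *out)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            float x = 1.0f;
            // Each thread takes a data-dependent number of iterations;
            // from the program's perspective they branch independently.
            for (int k = 0; k < trip_counts[i]; ++k)
                x *= 0.5f;
            out[i] = x;
        }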
     
    #746 MfA, Oct 12, 2009
    Last edited by a moderator: Oct 12, 2009
  7. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
    And GPUs are the domain of the EE and CE fields, which actually pre-date them! You seem to be under the misunderstanding that GPUs are something entirely new, when in fact they are just another application of CompArch knowledge to a slightly different problem. Of the three, only Intel has so far used the proper terminology for thread. Instead, both Nvidia and ATI are trying to redefine the concept of a thread to effectively mean a data subset.

    The truth is that neither Nvidia nor ATI supports even a small fraction of the number of "threads" they claim to support.

    They do not execute independently and they do not branch independently. They are merely running 16 datums in parallel and doing conditional updates of registers. If they were real threads then they could be running entirely different instruction streams, which they do not.
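    To spell out what I mean by conditional register updates, a scalar sketch of 16-wide predicated execution (made-up function, not any actual ISA):

        // Every lane evaluates the condition; the destination register is only
        // written where the lane mask is set. No lane runs its own instruction stream.
        void simd16_masked_add(float r[16], const float a[16], const float b[16])
        {
            int mask[16];
            for (int lane = 0; lane < 16; ++lane)
                mask[lane] = (a[lane] < 0.0f);      // the "branch", evaluated per lane
            for (int lane = 0; lane < 16; ++lane)
                if (mask[lane])
                    r[lane] = a[lane] + b[lane];    // conditional update of the register
        }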

    One of the big hype items for G300 is the support of more than one thread active per chip! They call it a kernel, but they really mean thread.


    Sure it does; traditional SMP machines have multiple schedulers, they had one.

    You realize that this also goes back to the whole "we have infinite billions of cores" thing Nvidia tried when their marketing thought it would benefit them too, right?

    Thread has a fairly well defined meaning. Nvidia isn't using that meaning. Nvidia is wrong.
     
  8. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Hmm, I thought warps executed and branched independently; otherwise, what's the point? It's not clear to me what NV is calling a kernel. I see support for C++ method invocation, so we don't seem stuck within a single routine. What leads you to suspect kernel = thread rather than warp = thread and kernel = program?

    If you ignore the execution half of the definition of a term integrally wrapped around the idea of execution, expect negative feedback :)
     
  9. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
  10. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    I tend to agree with the Khronos stance on this point, which is that it's a bad idea to use terms that already have a defined meaning in another domain, particularly as the GPU domain tries to merge into the HPC domain. That's why they called them "work groups" and "work items" rather than warps and threads.

    That's the point though - it's important to create a new term for a new concept rather than confusing it with a pre-existing term. (FWIW though, "fiber" is a pre-existing term in some OSes including Win32, and it's a similar concept to the current usage, albeit not always exactly the same.)

    Furthermore I'm not sure how you can complain about the use of the term "threads" with respect to Larrabee... the definition or usage has not changed at all... it's exactly the same as it has always been, so I'm not sure why you think it is being used in some "Larrabee-specific" way.

    Not true; there are much more complicated rules with respect to "warps" that are semantically important.

    Furthermore I'd argue that the programming semantics here are far less important than the execution semantics, which is typically how "threads" are defined in my experience. In that sense there are perfectly good SIMD and SPMD language and concepts that have already existed for a long time before GPUs that apply perfectly... there's no need to create new terminology just to *seem* different.

    Don't be fooled - the renaming of terminology is pure marketing here and has nothing to do with ease of understanding for programmers (who typically do just fine understanding the hardware concepts). It's just so they can say they run THIRTY THOUSAND threads while "high end multi-socket systems" can only run 16. Yeah, NVIDIA's really taking the high road in terms of helping programmer understanding ;)
     
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,416
    Likes Received:
    178
    Location:
    Chania
    The embedded market could have a dedicated API (like OpenGL_ES <-> OpenGL), that's true. In the end, OpenGL_ES is much less of a mess than OpenGL, which actually indirectly supports DemoCoder's original point.
     
  12. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,528
    Likes Received:
    107
    Charlie, hate on nVidia somewhere else, in another thread intended for that.
     
  13. Bob

    Bob
    Regular Subscriber

    Joined:
    Apr 22, 2004
    Messages:
    424
    Likes Received:
    47
    Except that Nvidia uses the correct meaning, and (despite what others in this thread are claiming) that meaning didn't come from the marketing department. No really, Nvidia knows how to architect chips. For real.
     
  14. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,705
    Likes Received:
    458
    I didn't say it's changed ... but a fiber is also a thread in the classical sense (and what NVIDIA calls threads are threads too in the classical sense, the software sense where the term came from). Intel is calling hardware threads just threads and dropping the hardware bit ... that shorthand, combined with the pre-existing use of the term in this context, and the fact that a fiber made of strands is prima facie ridiculous, is just not conducive to proper understanding.

    Just dumping everything and using the OpenCL terms without using shorthand for hardware threads in a way seemingly almost consciously designed to cause maximum confusion is good too.
     
  15. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Yeah, OpenCL at least brings some sanity.

    A work group is not equivalent to a warp though. A work group is a set of work items that can all share local memory. OpenCL doesn't have a concept like "warp".
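    In CUDA terms (assuming work group ~ block and local memory ~ shared memory, which is roughly how NVIDIA maps them), a small sketch: every thread in the block sees the same shared array and the barrier is block-wide, while the 32-wide warp never appears in the API at all.

        __global__ void block_sum(const float *in, float *out)
        {
            __shared__ float tile[256];     // visible to the whole block ("work group")
            int t = threadIdx.x;            // "work item" index within the group
            tile[t] = in[blockIdx.x * blockDim.x + t];
            __syncthreads();                // block-wide barrier, not warp-wide
            if (t == 0) {
                float s = 0.0f;
                for (int k = 0; k < blockDim.x; ++k)
                    s += tile[k];
                out[blockIdx.x] = s;        // assumes blockDim.x <= 256
            }
        }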

    Jawed
     
  16. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
    Then perhaps you'd like to enlighten us on HOW Nvidia's meaning is correct?
     
  17. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    You and I have different definitions of a "thread in the classical sense". And most of the definitions that I can find online - wikipedia in particular - tend to agree with my definition, but I'm not willing to argue the point. If anything it shows that there's already a lot of confusion surrounding the terms.

    Furthermore fibers are not full "threads" in a typical OS and that's the point - they are coroutines that are cooperatively scheduled by the user application, which is precisely the context in which they are being used for discussions on Larrabee. There's no redefinition going on that I've seen and I don't think you can make a real case for it... these usages are consistent with the previous usage of the terms in all major OSes and CPUs that I know of.
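    For reference, this is the pre-existing Win32 sense of "fiber" I mean; a minimal sketch (error handling omitted): a fiber only runs when the application explicitly switches to it, so there's no preemption involved.

        #include <windows.h>
        #include <stdio.h>

        static LPVOID main_fiber;

        // Fibers are cooperatively scheduled: this one runs only when something
        // switches to it, and it has to hand control back itself.
        static VOID CALLBACK worker(PVOID arg)
        {
            (void)arg;
            printf("worker fiber running\n");
            SwitchToFiber(main_fiber);      // explicit yield, never preempted
        }

        int main(void)
        {
            main_fiber = ConvertThreadToFiber(NULL);
            LPVOID f = CreateFiber(0, worker, NULL);
            SwitchToFiber(f);               // hand off to the worker fiber
            printf("back in the main fiber\n");
            DeleteFiber(f);
            return 0;
        }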

    I think Intel is being pretty clear about "hardware threads", but in the same way as with hyper-threading and other technologies. i.e. the threads are real, 100% OS-controlled, preempted, forkable, etc. POSIX "threads" that have real hardware resources dedicated to them. Nowhere do I see claims that these map 1:1 with "cores" (which is yet another awesome term being thrown around to mean "SIMD lane" in the GPU space, because it allows arbitrary inflation of marketing numbers).

    I don't see how it's confusing at all. It very clearly describes what you're telling the runtime semantically with work items and work groups, with the also-clear implication that these things map to different execution resources on different devices. It's extremely important to not confound the new concepts with existing ones that are *not the same thing*.

    Their meaning is inconsistent with the meaning in the CPU and particularly HPC space that long predated them. Hence the confusion and questioning of why they would deliberately overload the term except to confuse people and inflate numbers.

    Don't get me wrong, I do a lot of GPU computing and have a lot of respect for NVIDIA, but on this front I just can't cut them any slack. It was a bad call to name/rename concepts that already existed in other spaces as they did, and I'm glad to see Khronos taking the high road on this issue. Whether it was due to marketing, ignorance or it was simply misguided, I think it's pretty clear that it causes more confusion than necessary.
     
    #757 Andrew Lauritzen, Oct 13, 2009
    Last edited by a moderator: Oct 13, 2009
  18. Bob

    Bob
    Regular Subscriber

    Joined:
    Apr 22, 2004
    Messages:
    424
    Likes Received:
    47
    There is no way you can define "thread" generally to exclude the claimed "NVIDIA Threads" but also include what is commonly referred to as "thread" on many other architectures.

    NVIDIA engineers might know about HPC too.
     
  19. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    What?? NVIDIA's definition of "threads" doesn't even meet the POSIX "definition" and they certainly don't agree with the majority of the wikipedia article on threads. Conversely, they do agree precisely with a predicated SIMD lane, or more generally the SPMD model. That has been well known for a long time and the more technical reviewers called out NVIDIA for introducing the nonsensical "SIMT" nomenclature when a perfectly valid term already existed. To quote AnandTech:

    Given a definition of thread broad enough to fit what NVIDIA calls a "thread", I might as well start calling the separate bits in each of the ALU's "threads" and multiply the marketing numbers by another 32x and talk about all the fancy atomic *single-cycle* coherent shared memory operations I can do across "threads" like ADD, MUL, etc. It just gets ridiculous if you expand the term to mean "any program written in a scalar fashion that may or may not be run concurrently, predicated, given dedicated hardware resources, in a SIMD lane, ... but really guys you write it like it was an independent thread that gets launched a million times..."

    Uhhh... yeah.

    I know a lot of them and they definitely do (I have nothing but respect for them!), but that's entirely beside the point. By the same token I could just say that "Khronos might know something about standardization of terminology", which is actually much more relevant...
     
  20. Bob

    Bob
    Regular Subscriber

    Joined:
    Apr 22, 2004
    Messages:
    424
    Likes Received:
    47
    *shrug*. What's a thread then? Explain what is missing from the NVIDIA architecture that would make its threads merely "predicated SIMD lanes".
     