Cell vs Tesla 10 vs Firestream 9250

Discussion in 'CellPerformance@B3D' started by randomhack, Jun 22, 2008.

  1. Carl B

    Carl B Friends call me xbd
    Moderator Legend

    Joined:
    Feb 20, 2005
    Messages:
    6,266
    Likes Received:
    63
    There is no 'proof' required. The nuclear simulations that Roadrunner will be working on are just one obvious example of specific applications for which memory availability is a very real and tangible barrier, due to their size and scope. The PCI bottleneck, on the micro scale, comes in when the chip has to swap out data in its already constrained memory footprint. Consider then the effects of several nodes feeding one another processed data, and the breakdown that occurs when processing stalls compound across them. Think of a graphics card needing to go out to main memory because it couldn't fit all its data into the on-card RAM, and you get a very clear idea of what we're discussing here. It doesn't matter that the chip itself is capable of a certain level of performance; as soon as it needs to cross that PCI-e bus, performance goes to hell compared to the theoreticals. Supercomputers are built around their interconnects as much as their architectures; this isn't anything that I'm making up here, this is established.

    I'm not saying that a consumer 'needs' it. I'm not saying either that the SpursEngine is the result of a wise economic decision. What I am saying is that the SpursEngine is more powerful, flexible, and capable than most ASICs across a number of tasks.

    Yes, the Cell competes in the same markets, obviously. It also obviously has certain advantages from an HPC perspective, advantages that I certainly don't have documents on tap to 'prove' anything to you with, but advantages that an understanding of system architecting and needs-profiling should lay bare as obvious to anyone. Your refusal to accept the memory and PCI-e constraints as 'real' sans evidence speaks, frankly, to either a naivete or lack of understanding on your part wrt the issue.
     
  2. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,614
    Likes Received:
    60
    Carl, if they have been in the market for 2 years plus a product rev, then I assume they have learned their lessons even if they went in with a mismatched solution originally. Their first supercomputer benchmark was also a testament to their applicability to problems related to the benchmarks. This is not to say that ClearSpeed will be suitable for all supercomputing applications. To me, it seems to be targeting a very specific and real need within this space. OTOH, Cell (and hence Roadrunner) is more general. We have all seen papers on 50 times speedups from Cell performing tree searches. On the number crunching front, even though it may not perform as well as specialized hardware, it is still applicable to a large number of math problems.

    Yeah, not directly. There are embedded processors supporting "Java instructions", but Cell's ability to run Java programs together with other media processing will help in these hardware units.


    Toshiba already demoed SUC to a panel of journalists in the last few weeks. The differences are noticeable, but naturally they did not compare the superupscaled picture with a true 1080p original image.

    I think someone used this as an example for nextgen TV or video library UI (where you can see live thumbnails of channels during navigation).
     
    #22 patsu, Jun 24, 2008
    Last edited by a moderator: Jun 24, 2008
  3. Carl B

    Carl B Friends call me xbd
    Moderator Legend

    Joined:
    Feb 20, 2005
    Messages:
    6,266
    Likes Received:
    63
    Well that's what I'm saying though; who says their initial product was mismatched? It served the market of add-in HPC quite admirably, which was the market they were targeting, and that's why I think it's easy enough to dub them a qualified 'success.'

    I think it's wrong to reduce it to 'general' vs 'specialized' though. More accurate to say simply that each architecture has certain strengths and weaknesses. Roadrunner was built for a specific purpose; the fact that it is also broad/general in the number of tasks it can address is a secondary benefit rather than the primary driver. And for that primary purpose, Clearspeed's latency and memory disadvantages would have ruled it out as a contender before the questions of programming/generality ever even arose.

    We'll be seeing a supercomputer based on GPGPU in the not-too-distant future, and it should serve as a good test-bench for a lot of the angles we're discussing here.

    http://www.beyond3d.com/content/news/632

    I have a feeling that the GPUs here are going to be acting as selective turbo-chargers, however, with a substantial amount of the work still done on the Intel CPUs. It's worth noting that in Roadrunner, almost all generated Flops come from the SPEs, with the Opterons serving almost entirely in an I/O role.

    But I'm not sure we'll be finding it in any of those hardware units except for the PS3 itself, unless Spurs makes its way as well. Cell is obviously 'better' from a capabilities standpoint, but the Tru2Way manufacturers will opt for lower-power and price savings where possible I think.
     
  4. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,614
    Likes Received:
    60
    I see what you're saying (although my definition of 'success' is different, but I'll ignore that to keep the discussion clean). Yes, the customers will want to make sure that "all" their needs are met adequately before committing to a spanky supercomputer.

    If we get down to this level, I think without concrete examples and customers, it's hard to argue about ClearSpeed's (and Cell's) suitability or unsuitability.

    Point taken. I think it also depends on the business model. I was told 2Wire's BOM cost is prohibitive, but AT&T chose it eventually because of specific built-in customer support features. These advanced features allowed them to conduct their business more cost effectively compared to the usual brands.

    So yes, Cell is expensive. But it depends on what else is in the system and *if* Cell enables/differentiates them. If not, then everyone will go the low cost route.
     
  5. RudeCurve

    Banned

    Joined:
    Jun 1, 2008
    Messages:
    2,831
    Likes Received:
    0
    You've been speaking in generalities yet at the same time making specific claims. Anyone can claim "more memory is better" or "more bandwidth is better", but you've taken it a step further and are claiming 2GB/processor is a "problem" and 16GB/processor is suddenly "not a problem". You know this how? Have you run any real scientific problems using Clearspeed accelerator cards to come to this conclusion? Or are you simply assuming that since 16GB>2GB the Clearspeeds must not be able to do real problem solving at the same level as CELL due to its 2GB/processor "problem"? Since you've already determined that 2GB/processor is a problem, let me ask you this: where is the point of diminishing returns wrt memory capacity per accelerating processor? Is it more/less than 16GB?

    Well that's a reasonable assumption. A DSP is more flexible than an ASIC, yes.
     
  6. Carl B

    Carl B Friends call me xbd
    Moderator Legend

    Joined:
    Feb 20, 2005
    Messages:
    6,266
    Likes Received:
    63
    I didn't say that 2GB is a "problem," I said it was a problem for the computations Roadrunner is targeted towards. This entire concept of 'diminishing returns' of yours works under the premise that there exists only one class/size/scope of scientific computing, and that as such we are on a graded performance slope. For certain problems, 2GB is more than enough. For others, it is not enough. It is not about some linear or asymptotic performance gain here related to memory, latency, and/or bandwidth; it is about a performance cliff that these architectures will fall off of when they are significantly constrained in any one of these areas.
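
    To make the slope-versus-cliff distinction concrete, here is a toy bandwidth model, offered only as a sketch. The function name `effective_gflops` and every number in `CARD` are illustrative assumptions, not measurements of Clearspeed, Cell, or any real card. It assumes a single streaming pass over the data, with whatever does not fit in on-card memory fetched across the host bus.

```python
def effective_gflops(working_set_gb, card_mem_gb, peak_gflops,
                     card_bw_gbs, bus_bw_gbs, flops_per_byte):
    """Toy estimate of sustained GFLOPS for one streaming pass over the data."""
    # Fraction of the working set that stays resident in on-card memory.
    resident = min(1.0, card_mem_gb / working_set_gb)
    # Average seconds per GB touched: resident data moves at card bandwidth,
    # the overflow has to be fetched across the (much slower) host bus.
    seconds_per_gb = resident / card_bw_gbs + (1.0 - resident) / bus_bw_gbs
    sustained_bw_gbs = 1.0 / seconds_per_gb
    # The kernel is capped by either raw compute or data delivery.
    return min(peak_gflops, flops_per_byte * sustained_bw_gbs)


# Hypothetical accelerator; all figures are assumptions chosen for illustration.
CARD = dict(peak_gflops=100.0, card_bw_gbs=25.0, bus_bw_gbs=2.0, flops_per_byte=4.0)

for working_set in (1.0, 2.0, 2.5, 4.0, 16.0):
    small = effective_gflops(working_set, card_mem_gb=2.0, **CARD)
    large = effective_gflops(working_set, card_mem_gb=16.0, **CARD)
    print(f"{working_set:5.1f} GB working set: 2GB card {small:6.1f} GFLOPS, "
          f"16GB card {large:6.1f} GFLOPS")
```

    Under these assumed numbers the 2GB configuration runs at its peak until the working set reaches 2GB, then falls to roughly 30 GFLOPS once the working set is only 25% larger, while the 16GB configuration is unaffected until much later. Real codes with data reuse or overlapped transfers would behave differently; the point is only the shape of the curve.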

    So, for the QS22, it simply is able to handle a larger scope of problems before hitting such a barrier. For the Clearspeed card, that barrier is lower. Now - the majority of simulations one is likely to run would be fine in a 2GB environment, but when we're talking about truly massive data sets... and for Los Alamos' purposes they specifically requested more. We're talking about simulating the direct and secondary effects of a nuclear explosion on surrounding matter and environment here on a second-to-second level. You're being a bit flippant in terms of what you are willing to toss aside in your consideration, to the extent that you don't acknowledge a very material spread in a key point of capacity.

    I'm personally inclined to believe that your thinking on the matter is a product of the quasi-technical articles that float around the Internet, which dismiss 'hyped' specs at the same time as they hail them when convenient. Case in point, the whole notion that Clearspeed has superior DP Flop and wattage counts... and thus these matter... but Cell having greater memory access and bandwidth must conversely mean nothing. Your rather snide 'DSP' remark makes me think you are likely one of those for whom, back in the day when Cell was touting Flop and watt numbers, those numbers were in turn reduced to 'hype'.

    I'm a fan of Clearspeed and what they're looking to achieve in terms of a performance/value proposition, just as I'm a fan of the disruptive power of GPGPU computing and its accessibility. But the Cell-hating gets old, and is wholly unsuited to this sub-forum. The Cell architecture doesn't suck, it's actually quite forward looking, and it's gaining good traction in the HPC space right now. If I've interpreted your tone/purpose here incorrectly, I do apologize, but realize that this sub-forum is an extension of this site here rather than a general sub-forum of Beyond3D, and as such there really is no room for tolerance here in terms of either real or perceived Cell-related trolling. As I mentioned before, this thread itself was a questionable one in terms of whether it should remain here or be moved elsewhere, but... well it's here for now.
     
    #26 Carl B, Jun 25, 2008
    Last edited by a moderator: Jun 25, 2008
  7. Shingoshi

    Newcomer

    Joined:
    Sep 29, 2008
    Messages:
    2
    Likes Received:
    0
    Thank you for this intense discussion...

    I found this as a result of a search on Google to see if AMD/ATI were yet producing a rackmount product similar to the Nvidia Tesla project. When I first became acquainted with the existence of the Cell, I was rather impressed. But I was confused, even from the beginning as to why the price was so high for it (the uninformed perception I speak of below). I'm speaking specifically of the Mercury Cell Board, at a then $8000 MSRP. So when the Nvidia came along, and now the AMD Firestream, just from my limited understanding of all the specifics involved here, it seems indeed gloomy.

    This morning I was just playing with the idea (dream) of building a compute engine inside of a 16U case. I had been looking at different options with regard to backplanes to achieve the highest level of processor density possible. I came across the ClearSpeed, but didn't know anything about it. At first, I saw the 96GFlop number and thought wow! But I wasn't certain if that was impressive or not, because I hadn't really investigated this enough. However, your spirited discussion here leads me to pursue this further to understand more fully the benefits each has to offer.

    Clearly, right now, at $1000 per Firestream 9250 card, it seems very attractive. And based on the fact that it offers a lower initial cost for installation, many people are going to be attracted to it, and the Nvidia. And the problem for these other two is that most of the public on which this competition will depend, will find themselves looking at cheaper solutions. I was thinking that PhysX had missed a great opportunity by limiting the number of cards that could be used, thinking what a difference they could make for small projects. I realize that Supercomputing may have specific needs, but those are beyond the requirements of most small projects. And as those small projects gain popularity, the companies (Nvidia/AMD) will enjoy an increasing public perception of their presumed superiority, regardless of that being true or not.

    The Clearspeed/Cell front (consumer base) will likely diminish in size, even though larger sums will be spent on each installation. The problem is that most of the money will be made (by their competition) in the volume of sales acquired from smaller clients, and they will likely ultimately set the trend for what's to come. And additionally, Nvidia/AMD will have the assets to make whatever adaptation they need to compete with the larger installations. They're not going to sit back and say to themselves, "that's just too large for us to consider."

    I guess what I'm trying to say is that all of this may become moot. Because if more equity is eventually dumped into the cheaper solutions, the larger projects will find fewer opportunities to grow into. They have a very limited set of clientele (for supercomputing), compared to the masses which will adopt the products of the younger upstarts. And then the accountants will of course be involved in decisions which have everything to do with money, and less with performance.

    So time will prove what will come of all this. I do know that a few of those mentioned here will change their models and strategies as required by economic conditions. And whoever makes the best decisions in that regard will be the winner(s).

    Just my uninformed observations.
    Shingoshi
     
  8. Shingoshi

    Newcomer

    Joined:
    Sep 29, 2008
    Messages:
    2
    Likes Received:
    0
    I found this to be interesting...

    http://www.clearspeed.com/acceleration/reliability/ecc/

    Based on this, I feel justified in thinking that all memory should be ECC. Presently, such a move would not be well accepted. But if the computational markets become more of an asset to those manufacturing GPGPUs, ECC will likely become a more standardized feature in more systems. Also add to this that if the reasoning given by Clearspeed holds, concerning the increasingly smaller size of chips and the speed at which they operate, ECC may become a necessity for everyone, regardless of their activities. And when you consider how relatively inexpensive it would be for GPGPUs to add ECC to their foundation, the present advantage of this for Clearspeed, and anyone else currently using ECC, will evaporate. Adding ECC would just be too simple to pass up. Just think of a Firestream/Tesla with either registered ECC DIMMs or FB-DIMMs. Can you see four 4GB DIMMs per card? That would significantly diminish the memory capacity argument made earlier in this thread.

    There's another thing to think about. Any changes in the systems of Nvidia/AMD will generally be seen as advancements, by increasing the functionality of their units. However, I don't think their competition will be viewed so favorably if they now go back and attempt to introduce features to compete with Nvidia/AMD. Maybe I'm wrong here and this might not make any sense. But I do think it will cost less for Nvidia/AMD to enhance their products than it would for the others. And then, there's the all-important factor of name recognition. Very few people know who the other companies are. Yes, they're each major names in their own right, at least for the Cell group. But they're attempting to convince the public to accept platforms with which the masses are unfamiliar. That's not true for Nvidia/AMD. Their collective presence is about as ubiquitous as one could hope for. They don't have to sell themselves to the public; that's already been accomplished.

    I guess I'm thinking that performance will be measured by who has the largest market share. Not who is more efficient at work.

    Maybe? Time will tell.
    Shingoshi
     