Joe DeFuria
Legend
I want more polys, and I still think that TBR's will have problems with high polycounts.
Cue Dave, Simon, Kristof....
I want more polys, and I still think that TBR's will have problems with high polycounts.
Why should I say it again? Life's too short, I've just had to write a stupid doc explaining why patent XYZW is completely unrelated to a patent application of mine (but happened to have some of the same keywords), and it's time to go home.
Joe DeFuria said:I want more polys, and I still think that TBR's will have problems with high polycounts.
Cue Dave, Simon, Kristof....
Joe DeFuria said:Look...let's just cut this short.
1) I mentioned doubling the clock rates as an illustration of doubling fill rate and bandwidth.
2) Whether it's my fault for "confusing you", or your fault for "being confused" is pointless to discuss and was not my intention to raise as an issue. Miscommunications are the fault of both parties.
Now. Start fresh.
Everyone wants to see a TBDR with "similar specs" as a high-end IMR. SIMILAR SPECS meaning similar raw fill-rate, and similar raw bandwidth.
Why?
Because generally speaking, we all pretty much expect that cards with similar raw specs generally cost the same.
demalion said:"fillrate doesn't matter directly for cost, it is the amount of transistors (or die area, as Simon would say, I guess) that achieving that fillrate would require. For example, how much design area do these techniques to approach the efficiency of a DR require, and what could be done instead in that space? Ignoring questions like these is a flawed approach to analyzing the cost benefits (and penalties, though I don't know what those are) that might be associated with the differences in a TBDR, which sort of defeats the point of the comparison of "equivalently costly" designs, IMO.
Joe DeFuria said:500 MHz, 256-bit DDR-II costs "the same" no matter which chip it's paired up with.
demalion said:bandwidth doesn't matter as much (though it should matter more than it has in the past) to a TBDR, so that assumption of equivalence of cost does not seem reasonable. That phrasing seems a bit odd...perhaps it would be better to say a TBDR can achieve more with given bandwidth?
Joe DeFuria said:And the hope is, that a TBDR with the "same specs" (and therefore cost), would significantly outperform the competing IMR.
For cores, it is admittedly less black and white. But I see no reason to suspect anything other than, as a best rough estimate, a TBDR core that puts out 800 MPix/sec (raw) "costs the same" as an 800 MPix/sec IM core.
Can we agree on those assumptions? Before this is taken further, we have to agree on that.
Chalnoth said:Chalnoth: TBR's will have a problem with storage bandwidth. Each triangle takes a significant amount of data to store, meaning that as triangles approach the sub-pixel size, TBR's will start to require more memory bandwidth than a comparable z-buffer, particularly since some triangles will have to be read two or more times (spread across different tiles).
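To put rough numbers on that argument, here is a minimal sketch. Every figure in it is hypothetical (the 32-byte vertex, the 32-bit Z value, the lack of vertex sharing), and it ignores colour traffic and any compression, so treat it as an illustration of the shape of the claim rather than a measurement of any real chip.

```python
# All constants are hypothetical, chosen only to illustrate the argument.
BYTES_PER_VERTEX = 32   # assumed: position plus a few attributes
Z_BYTES = 4             # assumed: 32-bit depth value


def tiler_bytes_per_triangle(tiles_touched: int) -> int:
    """Scene-buffer traffic for one binned triangle:
    written once, then read once for every tile it overlaps."""
    triangle_bytes = 3 * BYTES_PER_VERTEX  # assumes no vertex sharing
    return triangle_bytes * (1 + tiles_touched)


def zbuffer_bytes_per_triangle(pixels_covered: int) -> int:
    """External depth traffic for an immediate-mode renderer:
    one Z read and one Z write per covered pixel."""
    return pixels_covered * 2 * Z_BYTES


for pixels in (1, 16, 256):
    print(pixels, tiler_bytes_per_triangle(1), zbuffer_bytes_per_triangle(pixels))
```

Under these assumptions the tiler's per-triangle cost is fixed while the Z-buffer cost scales with coverage, so the comparison only tips against the tiler once triangles shrink to a handful of pixels.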
And the hope is, that a TBDR with the "same specs" (and therefore cost), would significantly outperform the competing IMR.
Lol...quite a change from the "Hardware T&L makes no difference" mantra back in the Kyro vs. GeForce era.
"fillrate doesn't matter directly for cost,
Ignoring questions like these is a flawed approach to analyzing the cost benefits (and penalties, though I don't know what those are) that might be associated with the differences in a TBDR, which sort of defeats the point of the comparison of "equivalently costly" designs, IMO.
bandwidth doesn't matter as much (though it should matter more than it has in the past) to a TBDR, so that assumption of equivalence of cost does not seem reasonable.
Well, I don't even think such a comparison (excluding design complexity and implementation as a factor in cost and focusing only on fill rate and bandwidth specifications for cost/performance analysis) works between IMRs.
What is this "raw" fillrate based on, and why is it important again?
What about the impact of "effective fillrate" with a feature such as AA turned on?
concentrating on fillrate as a point of equivalence seems to me to be a wrong turn.
MfA said:Chalnoth, your first post said you didn't want TBR to become dominant because they would have trouble with high polygon count ... if they were dominant the 3D pipeline would change to accommodate them.
the option to just transform them multiple times without needing extra storage is always there. This can hardly be counted as a negative without counting all the cases where performance breakdown for a given scene is larger for an IMR than for a tiler as positives ...
And I have shown, based on the last info we have for same-spec'd parts, Kyro vs. TNT2, that the TBDR was able to outperform not only the TNT2 but its "bigger brother" the TNT2 Ultra.
Now does this hold true today? I don't have any idea. Nor do I have proof to say that it would hold true at the high end today if such parts existed. Of course you also don't have any proof to say that it won't.
Then if your games actually use it, the gains are bigger. So can you take a card without TnL, double its FPS scores and say it's the same as a card with TnL with the same specs?
Also, the K2 only has (unless my memory is bad) a single TMU?
These are some of the reasons why I objected to your "doubling method". You cannot take these parts and double their scores as they differ in too many areas.
Edit: just wanted to say that there are many things that affect FPS scores, like drivers, etc., that doubling will not account for...
Chalnoth said:But, as I said, I want to see higher polycounts. I still don't think that significantly more fillrate will really help anywhere close to as much as significantly increasing polycounts.
Yes, but transforming the primitives multiple times is hardly a good option, particularly if vertex program lengths begin to get long.
Joe DeFuria said:why multiply the SDR memory speed by a factor of 2?
It's not DDR???
Um....
Because I DOUBLED the FPS scores of the Kyro-II benchmarks. In order to do that I DOUBLED the pixel fill rate, and DOUBLED the bandwidth.
In short:
Original KYRO-II: 175 MHz clock / 175 MHz 128-bit SDR
"Doubled" KYRO-II, producing FPS comparable to the GeForce4 MX: 350 MHz clock / 350 MHz 128-bit SDR.
GeForce4 MX is running 200 MHz DDR = "effective" 400 MHz 128-bit SDR.
OK?
Kyro II @ 350 MHz = 700 MPixel / 700 MTexel
GF4MX @ 350 MHz = 700 MPixel / 1400 MTexel
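For anyone who wants to check those two lines, here is a small sketch of the arithmetic. The pipe and TMU counts (2 pipes with 1 TMU each for the Kyro II, 2 pipes with 2 TMUs each for the GF4 MX) are assumptions inferred from the figures quoted above, and `fill_rates` is just an illustrative helper name.

```python
def fill_rates(clock_mhz: int, pixel_pipes: int, tmus_per_pipe: int) -> tuple[int, int]:
    """Return (MPixel/s, MTexel/s) as raw, theoretical peak figures."""
    mpixels = clock_mhz * pixel_pipes
    mtexels = mpixels * tmus_per_pipe
    return mpixels, mtexels


# Pipe/TMU counts are assumed, inferred from the quoted fill rates.
kyro2_doubled = fill_rates(350, pixel_pipes=2, tmus_per_pipe=1)
gf4mx_at_350 = fill_rates(350, pixel_pipes=2, tmus_per_pipe=2)

print("Kyro II @ 350 MHz:", kyro2_doubled)
print("GF4MX   @ 350 MHz:", gf4mx_at_350)
```

At the same clock the two parts match on raw pixel rate, and the texel rate differs by the TMU count alone.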
Joe DeFuria said:"fillrate doesn't matter directly for cost,
I disagree. Fillrate is not the only factor of course, but fill rate does relate directly to cost. To get more fill-rate, you either throw more pipelines at the problem, or you design for higher frequencies / lower yields.
Ignoring questions like these is a flawed approach to analyzing the cost benefits (and penalties, though I don't know what those are) that might be associated with the differences in a TBDR, which sort of defeats the point of the comparison of "equivalently costly" designs, IMO.
Demalion, I am not "ignoring" questions like those. I am making assumptions.
I am assuming that all things are about equal. There is no evidence to date, one way or the other, that indicates that given the same raw fill rate target, either TBDR or IM is inherently more costly.
Because of LACK of such evidence, I am making the ASSUMPTION that they cost about the same.
Instead of throwing unanswerable questions around, supply some evidence one way or the other.
bandwidth doesn't matter as much (though it should matter more than it has in the past) to a TBDR, so that assumption of equivalence of cost does not seem reasonable.
I don't get it. Is it cheaper to pair 20 GB/Sec worth of raw bandwidth on a card with a TBDR chip, than it is to pair it with an IMR chip? No.
I know "bandwidth doesn't matter as much". That's not the point. The point is to build two cards with the same spec (same cost), and thus the performance of the TBDR would theoretically be much higher.
And for the record, the Kyro-II employed a bandwidth ratio of 8 bytes per pixel, which is actually just slightly more than the Radeon 9700, and double that of the GeForce FX. So if 8 bytes / pixel is the ideal bandwidth ratio for TBDR implementations, as per Kyro-II, then that again supports my assumption as not only reasonable, but likely.
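The 8 bytes per pixel figure can be reproduced from the Kyro-II numbers given earlier in the thread (175 MHz core, 175 MHz 128-bit SDR). A rough sketch follows. The 2 pixels per clock is an assumption inferred from the "700 MPixel at 350 MHz" figure, and the function names are made up for illustration.

```python
def bandwidth_bytes_per_sec(mem_clock_mhz: int, bus_bits: int,
                            transfers_per_clock: int = 1) -> int:
    """Raw memory bandwidth; use transfers_per_clock=2 for DDR."""
    return mem_clock_mhz * 1_000_000 * (bus_bits // 8) * transfers_per_clock


def bytes_per_pixel(bandwidth: int, core_clock_mhz: int, pixel_pipes: int) -> float:
    """Raw bandwidth divided by raw pixel fill rate."""
    return bandwidth / (core_clock_mhz * 1_000_000 * pixel_pipes)


# Kyro-II as shipped, and the "doubled" version from the thread.
# pixel_pipes=2 is an assumption inferred from the quoted fill rate.
stock = bytes_per_pixel(bandwidth_bytes_per_sec(175, 128), 175, pixel_pipes=2)
doubled = bytes_per_pixel(bandwidth_bytes_per_sec(350, 128), 350, pixel_pipes=2)
print(stock, doubled)

# 200 MHz DDR moves as much data as 400 MHz SDR on the same bus width.
print(bandwidth_bytes_per_sec(200, 128, transfers_per_clock=2)
      == bandwidth_bytes_per_sec(400, 128))
```

Doubling both clocks leaves the ratio unchanged, which is the point of the doubling exercise: the scaled part keeps the same balance between fill rate and bandwidth.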
Well, I don't even think such a comparison (excluding design complexity and implementation as a factor in cost and focusing only on fill rate and bandwidth specifications for cost/performance analysis) works between IMRs.
Care to show evidence where it doesn't?
What is this "raw" fillrate based on, and why is it important again?
Raw fill rate is the number of pixel pipes times the clock rate. (Also, the number of TMUs per pipe is to be considered.) It's important because it is a gross measure of how much actual pixel writing power the chip has.
What about the impact of "effective fillrate" with a feature such as AA turned on?
The impact of EFFECTIVE fill rate will be shown in the performance results! You really do not understand my position here at all.
The goal is to build similarly spec'd parts (cost). And then compare the resultant performance.
concentrating on fillrate as a point of equivalence seems to me to be a wrong turn.
Not at all. We're talking equivalence in COST, not in performance.
2. Doubling fill rate AND doubling bandwidth? Do you really think that'll only yield 2x performance?
I'd say closer to 3x, maybe more... why has nobody else thought of this?!
But you are not basing equivalency on higher frequencies and more pipelines, you are basing it on fill rate. You go further and stipulate raw bandwidth equivalency for determining equivalent cost, when both approaches have very different demands on bandwidth to achieve performance.
You frame this question based on the assumption that we have no evidence that a TBDR with similar raw performance figures would outperform an IMR with similar raw performance figures, and I can't even agree with you that far, as I stated.
Do you think GF FX and R 9700 cards cost the same to make at equivalent raw bandwidth and fillrate specifications?
Do you think they perform the same at equivalent raw bandwidth and fillrate specifications?