View Full Version : R300 more specs (speculations)
RADEON 9x00 will be showcased, according to GZeasy, in May and launched in September. The core will have 8 rendering pipelines with 3 TMU per each one and will be fully DirectX 9 compatible (Pixel Shaders 2.0; Vertex Shaders 2.0). The chip itself will be clocked at 350 MHz and manufactured using 0.13 micron technology. Of course, the faster 128 MB at 400 MHz memory configuration will be involved in the new product.
Seems like someone has been smoking too much of the good stuff.
I can envisage 8 pipelines (but doubt it), but why 3 TMUs? Seems like the same mistake ATI made with RADEON (little support for it).
Sorry, I just want some opinions on what people think the R300 and next-gen hardware will be like.
demalion
17-Feb-2002, 03:21
The 3 TMUs I thought were a mistake only because not many things used it...yet we had games that did utilize it eventually (I believe SS:SE for one). With the upcoming generation of games, why would it be a mistake now?
I can envisage 8 pipelines (but doubt it), but why 3 TMUs? Seems like the same mistake ATI made with RADEON (little support for it).
Radeon was released in 2000 though; back then 3 TUs weren't useful because no games supported them until 2001 with SS (and no other games apart from that). However, that doesn't make 3 or more TUs a useless feature in general. It's 2002 now and it'll nearly be 2003 when this new card is released, so what was useless in 2000 will not be useless in 2003-2005 (the average amount of time a card should last as a decent card).
Although 8 pixel pipes does seem like a total waste to me. The Geforce 4 is barely using 2 of its 4 pixel pipes because of memory bandwidth limits (it hits about 60% of its peak in the SS:SE fillrate test), and at this time it seems to be the most efficient IMR around. The Radeon 8500 seems not to even be using 2 of its pixel pipes (at least in the SS:SE fillrate test), getting only about 40% of its peak fillrate. Why not just increase the actual memory bandwidth or memory bandwidth efficiency and leave the number of pixel pipes at 4? If they increased memory bandwidth or memory bandwidth efficiency by 40% (QDR could maybe do that) over Radeon 8500 and left the pixel pipe config the same, they could save on chip complexity and get the same speed boost as they would if they'd also added more pipes. But of course they'll add the extra pipes so they can advertise a 2.8 Gpixel/s fillrate, even though they'll still likely not even reach the peak fillrate of the original 4 pipes in Radeon 8500.
[ This Message was edited by: Teasy on 2002-02-17 05:50 ]
Ichneumon
17-Feb-2002, 04:32
While I'm not convinced there will be 3 TMUs/pipe on the R300, I'm sure they would not be the same as on the R100 if that is the case. If the 3 TMUs had the same flexibility (or, I would expect, even more) in being combined through multiple cycles like on the R200, then they would potentially have value, far more so than the 3 TMUs/pipe on the R100.
According to today's texture usage (texture map, light map, detail map), 3 TMUs per pipe makes sense.
The only problem I see with the 3 TMU approach is that developers may have to specifically program for it... like Croteam with Serious Sam. I really doubt that happening in anything more than say... 1% of games being developed now.
The only problem I see with the 3 TMU approach is that developers may have to specifically program for it... like Croteam with Serious Sam. I really doubt that happening in anything more than say... 1% of games being developed now.
3 TUs don't have to be specifically programmed for any more than 2 TUs do. With the Radeon's 3 TUs, not many games supported them simply because most games only used 2 texture layers, and for games that used more, only Radeon supported more than 2 in a single pass, so it wasn't worth supporting. However, games move on. We won't still be sitting here 2 years from now playing games that use 2 texture layers or that only support 2 in a single pass, because Geforce 2 will no longer be the weak link in the chain in this instance. Games used to mainly have 1 texture layer back in Voodoo times and then they moved to 2 (with occasional games using 3 and 4, like SS). Soon 3 and 4+ will be normal, and most games being released will probably support 1, 2, 3, 4, 5 or 6 textures in one pass depending on which card you use (Geforce 3 and Radeon 8500 supporting 4 and 6 in a single pass dictate that, and of course Kyro 1 and II, and III when it's released, will support from 8 possibly up to 16 in a single pass).
[ This Message was edited by: Teasy on 2002-02-17 06:42 ]
I'd like to see transistors being used for more features, programmability, or even pixel pipelines, rather than for a third TMU. As pixel shaders get more complex, an extra TMU for each pixel pipeline can be expensive to implement.
Joe DeFuria
18-Feb-2002, 17:23
It's possible there is the same confusion over the number of texture units that there was over the R200. R-200 is able to combine, what, 12 textures "per pass", making many people assume it has 3 TMUs per pipe. (Not 2 TMUs per pipe plus loop-back...)
It may be the same with R-300.
If not, 3 TMUs should still be immediately beneficial in a game like Doom3...because presumably, 3 TMUs per pipe would mean 12 textures applied, per clock cycle. That is, Doom3 would theoretically be able to operate without loop-back at all.
Whether or not this results in real-world performance gains will, as per usual, depend largely on memory bandwidth. Nothing comes for free (TM Kristof), so if clock cycles aren't sacrificed for loop-back, more bandwidth needs to be available per clock cycle to satisfy the TMU pipes.
It's a natural progression IMO. As the pixel pipeline and API interface become more "programmable" (and hopefully effective bandwidth increases too), it gets easier for developers to create multi-textured effects. More TMUs per pipe will be useful for applying more textures without requiring multiple cycles and loop-back.
[ This Message was edited by: Joe DeFuria on 2002-02-18 18:25 ]
Ichneumon
18-Feb-2002, 19:45
On 2002-02-18 18:23, Joe DeFuria wrote:
It's possible there is the same confusion over the number of texture units that there was over the R200. R-200 is able to combine, what, 12 textures "per pass", making many people assume it has 3 TMUs per pipe. (Not 2 TMUs per pipe plus loop-back...)
The R200 can apply 6 textures "per-pass" through 3 loopback cycles per-pipe. Each pipe can apply the 6 textures, there is no pipeline combining to achieve this on the R200 like in the GF2/3/4 to achieve their 4 textures/pass.
The R200 can do 12 texture lookups per pass, which is where I think you got the 12 number...
The R200 can apply 6 textures "per-pass" through 3 loopback cycles per-pipe. Each pipe can apply the 6 textures, there is no pipeline combining to achieve this on the R200 like in the GF2/3/4 to achieve their 4 textures/pass.
umh..it seems to me the nvidia is doing better there. their hw will be more efficient with loopbacks on small polygons not having to make some pipe idle for lack of pixels to work on
ciao,
Marco
Ichneumon
18-Feb-2002, 21:39
On 2002-02-18 22:22, nAo wrote:
umh..it seems to me the nvidia is doing better there. their hw will be more efficient with loopbacks on small polygons not having to make some pipe idle for lack of pixels to work on
ciao,
Marco
That may very well be. I was just clarifying how the R200 works... I don't know enough to be able to make a useful analysis of which way may or may not be better.
Dave Baumann
18-Feb-2002, 21:57
umh..it seems to me the nvidia is doing better there. their hw will be more efficient with loopbacks on small polygons not having to make some pipe idle for lack of pixels to work on
Eh? AFAIK NVIDIA does exactly the same as far as texturing is concerned, except they can only handle one loopback to 4 textures.
Eh? AFAIK NVIDIA does exactly the same as far as texturing is concerned, except they can only handle one loopback to 4 textures.
Are u sure? anyway..it should be better to combine pixel pipes.
ciao,
Marco
Ichneumon
18-Feb-2002, 22:57
Unless something has changed that I haven't heard about, the GF4 (and GF3) functions effectively as a 2 pipe-4 TMU chip when in "quad-texture" mode. The R200 functions differently, and 6 texture layers is achieved by the ability to do up to 3 loopback cycles for each of the 4 pipes.
I'm unclear as to why cutting your pixel fillrate in 1/2 for any reason would be a good thing, so I'm interested to understand under what circumstances combining pipelines would be better.
[ This Message was edited by: Ichneumon on 2002-02-18 23:58 ]
Dave Baumann
18-Feb-2002, 23:19
This is from the age old Abrash Xbox article (http://www.ddj.com/documents/s=882/ddj0008a/0008a.htm) published even before GF3:
"The texture unit supports up to four textures per pixel. Two textures can be handled in a single clock; three or four textures take two clocks"
I believe someone has pointed out that it will resort to pipeline combining if the registers are required, but not, it appears, for straight multitexturing.
I'm unclear as to why cutting your pixel fillrate in 1/2 for any reason would be a good thing, so I'm interested to understand under what circumstances combining pipelines would be better.
u're cutting fillrate on radeon too.
IF (it's a big if) on gf3/4 they do pipeline combining, it cuts the number of pixel pipes in half, but it keeps them busy for half the time cause they're working at the same time on the same pixel(s).
So the advantage would be that on small polygons, when u have a few pixels to fill (with 3 or more textures..), there is a smaller probability that u have some pixel pipe idling cause it has run out of pixels on that small polygon.
Anyway..it's just a hypothesis.
ciao,
Marco
Thowllly
18-Feb-2002, 23:48
Ichneumon, if you use loopback or combine pipes, your fillrate is cut in half either way. Pipe combining can be more effective for small triangles. If you wanted to apply 4 textures to a single-pixel polygon on a theoretical 3D chip with 4 pipes and 2 TMUs per pipe (and no setup time), you could either:
Use loopback. It would take 2 cycles and 3 pipes would be wasted
Combine pipes. It would take one cycle and 1 pipe would be wasted.
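The two options above can be sketched as a toy model in Python. This is only an illustration of the hypothetical chip described (4 pipes, 2 TMUs per pipe, no setup time), not a description of any real hardware. The function name `cycles_and_idle` and its parameters are made up here, and "combining" is assumed to mean pairing pipes so that two pipes act as one pipe with twice the TMUs. When combining, idle pipes are counted in paired (effective) pipes.

```python
import math

def cycles_and_idle(textures, pixels, pipes, tmus_per_pipe, combine):
    """Return (clock cycles, idle pipes) for one small polygon.

    Toy model: every pixel needs `textures` texture applications and
    there is no setup time. With combine=True, pipes are paired up, so
    idle pipes are reported in paired (effective) pipes.
    """
    if combine:
        # Assumption: two pipes merge into one with twice the TMUs.
        pipes //= 2
        tmus_per_pipe *= 2
    # Trips through a pipe needed to apply all textures to one pixel.
    loops_per_pixel = math.ceil(textures / tmus_per_pipe)
    # Pixels are spread over the pipes; surplus pipes have nothing to do.
    batches = math.ceil(pixels / pipes)
    busy = min(pixels, pipes)
    return batches * loops_per_pixel, pipes - busy

loopback = cycles_and_idle(textures=4, pixels=1, pipes=4, tmus_per_pipe=2, combine=False)
combined = cycles_and_idle(textures=4, pixels=1, pipes=4, tmus_per_pipe=2, combine=True)
print("loopback (cycles, idle pipes):", loopback)
print("combined (cycles, idle paired pipes):", combined)
```

For the single-pixel case this gives 2 cycles with 3 idle pipes for loopback, and 1 cycle with 1 idle paired pipe for combining, matching the figures above.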
Edit: I didn't notice nAos post, damn I'm slow today...
[ This Message was edited by: Thowllly on 2002-02-19 00:50 ]
Hellbinder
19-Feb-2002, 08:06
Sorry guys, I know (some) of you will resist this for personal reasons *cough* Nvidia *Cough* supporters *cough* :wink:
Just kidding. Seriously, the R300 has 8 pixel pipelines and 4 geometry pipelines. 3 TMUs per pipe. I have seen the leaked ATI PowerPoint presentation showing the R300. It CLEARLY says 8 pipelines and 4x geometry. This was not a hoax, or the wimpy roadmap that some have shown. I posted a link to the German site that hosted it on the old site. Note, there are two separate presentations shown. One is the one most people have seen. The other one has the info on the R300.
It also clearly showed an Rv250 and there was a diagram that showed an R200 MAXX. I imagine that the Maxx has been canned.
I know for all the Anti ATi people out there the thought of an 8 piped, 24 TMU'd, 4x geometry, 350 MHz, Hyper Z III beast of a card out there as early as September is pretty scary.... :smile:
[ This Message was edited by: Hellbinder[CE] on 2002-02-19 09:13 ]
Hellbinder
19-Feb-2002, 08:12
nAo,
Determined that *no matter what* Nvidia *must* be better regardless of the posted facts.
Read John Carmack's recent post on Radeon 8500 vs. Nvidia (including GF4) texture processing.
Determined that *no matter what* Nvidia *must* be better regardless of the posted facts.
Sorry Hellbinder, but I believe you must carefully read what I wrote before writing a statement like that. I just said that one approach, to me, is better than the other.
I still don't know what ATI and NVIDIA are doing regarding pixel pipe combining.
Read John Carmack's recent post on Radeon 8500 vs. Nvidia (including GF4) texture processing.
I can read and understand his words, don't worry about me.
ciao,
Marco
Livecoma
19-Feb-2002, 09:54
3 TMU's on the R300 sounds weird... I take it everything leaked on the R300 is indeed a rumor? Do we even know if it has 8 pipes?
NV30 has sparked my interest.
There has been NOTHING leaked on it. I'm surprised nobody has made up any rumors about it yet lol.
Might as well start a rumor here...
NV30 Features
16 pixel pipes with 4 TMU's each,
32 geometry pipelines
quanti-deca-turbo-KYBAAE (kick yo butt anti aliasing engine)
NVIDIA spyware marketing engine (SME)
Microsoft spyware marketing engine
DX9 compliant
128 independent memory controllers (LMA4)
128 megs 2ns ddr
an nvidia lollipop
NV30a will be including the flux capacitor required for DX9.1's time travel API's. Oh, you didn't know that yet?
ATi better watch out for NV30!
Clootie
19-Feb-2002, 11:32
3 TMUs I thought were a mistake only because not many things used it...
How about 3D texture sampling with MIP-linear filtering, >8x "native" (non RIP) anisotropic filtering?
demalion
19-Feb-2002, 11:55
On 2002-02-19 12:32, Clootie wrote:
3 TMUs I thought were a mistake only because not many things used it...
How about 3D texture sampling with MIP-linear filtering, >8x "native" (non RIP) anisotropic filtering?
You choosing a section of my text out of all the rest of these posts to reply to is confusing me here...are you implying that those things are mistakes because not many things are using it?
I thought 3 TMUs could be handy for stuff like a slightly transparent pixel over a dual-textured pixel - that's 3 textures on one pixel. I believe John Carmack said Q3 could benefit quite easily from 3 texture units.
multigl
19-Feb-2002, 13:40
That's worth a good laugh. A good laugh only.
Multigl, what are you referring to ? :smile:
I know for all the Anti ATi people out there the thought of an 8 piped, 24 TMU'd, 4x geometry, 350 MHz, Hyper Z III beast of a card out there as early as September is pretty scary....
I'd say it's crazy, not scary, and I'm definitely not anti-ATI either.
As we've seen, a Radeon 8500 doesn't even reach 50% of its peak fillrate, so it doesn't even fully use 2 of its 4 pixel pipes. Now they're moving to 8 pixel pipes. So they're going to have to more than triple memory bandwidth over Radeon 8500 (or at least triple the card's efficiency) to actually get any advantage at all from those 4 extra pipes, and even then they'd get the same advantage with just 2 extra pipes and triple the memory bandwidth. They're going to have to more than quadruple memory bandwidth over Radeon 8500 to actually warrant those 4 extra pipes, so what's the point? Nobody here really believes they're going to do anything more than possibly double memory bandwidth over Radeon 8500, do they (possibly a 256-bit DDR bus or a 128-bit QDR bus)? In that case they could have kept the same 4 pixel pipes and gotten the same performance increase without having to add those 4 pixel pipes. It looks like they're going to add 4 pixel pipes only to completely waste every one of them. Or am I wrong? Will ATI more than double memory bandwidth over Radeon 8500 (or at least more than double rendering efficiency)?
Entropy
19-Feb-2002, 14:20
Everyone who in the discussion of future ATI chips refer to the datapoint of the Radeon 8500 only sustaining throughput equivalent to two theoretical pixel pipes makes a lot of assumptions.
1. It is only a datapoint. Another datapoint may show that the 8500 utilizes its 4 pipes more effectively.
2. In order to achieve the average benchmark throughput equivalent to 2 pixelpipes working at their theoretical maximum, it is likely that in fact more than 2 pipes have to be active for significant parts of the benchmark. Any "real-life" benchmark is sure to show such behaviour.
3. It may well be that future game engines will do more on-chip work per fetched byte of data. Having more pixel pipes may help significantly in this case, particularly with more advanced (and larger sized) on-chip caching of gfx data.
4. We do not know what memory interfaces the next generation parts will use. We have used 128-bit memory paths for an eternity. The step up to 256-bit could well be seen as acceptable in cost by now, at least for a non low-end product. QDR might be an option, Kentron has referred to contacts with gfx companies. And of course we will see evolutional improvements in memory bandwidth as well.
5. Improvements in memory controllers (like the GF2->GF3 step) and various bandwidth saving tricks (HyperZ, HSR) may increase the amount of useful work performed with a given bandwidth budget.
Overall, I'd say that a next-generation product might very well benefit from a larger number of pixel pipes. There is no reason to assume that the rest of the architecture, nor the tasks such a part will work with, will remain unchanged from now.
Entropy
PS. It would be great if this forum didn't post an empty message when you press return after typing your password. :/
[ This Message was edited by: Entropy on 2002-02-19 15:40 ]
[ This Message was edited by: Entropy on 2002-02-19 15:42 ]
Joe DeFuria
19-Feb-2002, 14:37
Ichneumon,
Thanks for the pipeline corrections / clarifications!
Teasy,
As we've seen a Radeon 8500 doesn't even reach 50% of its peak fillrate,
Where has that been shown? (Perhaps only when single texturing?)
Personally, I'd say that IF (big if though) ATI does in fact effectively double the bandwidth over the R8500, then it's logical to assume that doubling the pixel pipes goes along with it.
It's widely understood / speculated that the R300 is a completely different architecture compared to the R200. So aside from any physical increase in memory bandwidth (256 bit bus, QuadMem, eDRAM, etc...) supplied, it is reasonable to speculate that the memory interface also steps up a "notch" in efficiency. So 2.5x the "effective" memory bandwidth isn't completely out of the question.
I'd say the largest hurdle though, is getting that first 2X bump...can they really deliver a 256 bit bus? Or implement a QuadDDR solution at something less than $500?
Livecoma,
I take it everything leaked on the R300 is indeed a rumor?
Yes and no. There were official roadmaps leaked this past summer, which included the "specs" of the R300. The roadmap did state that the R300 was DX9 and 8 pixel pipes. (It did not have the number of TMUs per pipe though.) I believe it also stated geometry power as "4X" that of R200.
So, I'd say that at least as of last summer, the intended specs for R300 did in fact include 8 pixel pipes. Whether the actual product ends up shipping with that configuration is another question altogether. :wink:
Where has that been shown? (Perhaps only when single texturing?)
That was from the SS:SE fillrate test and AFAIK it was multi-texturing, Radeon 8500 reached about 40% of its peak fillrate.
Personally, I'd say that IF (big if though) ATI does in fact effectively double the bandwidth over the R8500, then it's logical to assume that doubling the pixel pipes goes along with it.
It's logical to assume that if ATI double their bandwidth they will double their pixel pipes, because that seems to be the way Nvidia and ATI do things. I'm not doubting they will do it; I'm saying it's silly and pointless to do it.
It's widely understood / speculated that the R300 is a completely different architecture compared to the R200. So aside from any physical increase in memory bandwidth (256 bit bus, QuadMem, eDRAM, etc...) supplied, it is reasonable to speculate that the memory interface also steps up a "notch" in efficiency. So 2.5x the "effective" memory bandwidth isn't completely out of the question.
On R200's current efficiency, 2.5x the effective bandwidth would only fully use 4 pixel pipes, so again going to 8 pixel pipes is a total waste of silicon and nothing but a marketing ploy: "oh look at our massive fillrate, it's much bigger than your fillrate.. I might only hit 50% of it but in theory it's massive". If ATI just upped their bandwidth by 2.5x and left their pixel pipe config the way it is (maybe add a third TU), they could have the same performance increase without doubling the pixel pipes to 8, which would save everyone money. Is nobody here a little pissed off at the idea of paying loads extra just so ATI and Nvidia can have bragging rights on theoretical fillrates that they never intend to even come close to reaching?
[ This Message was edited by: Teasy on 2002-02-19 16:50 ]
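The arithmetic behind the "2.5x only fills 4 pipes" argument can be sketched as below. This is a rough illustration only: it assumes, as the post does, that the chip is purely bandwidth-limited, so achieved fillrate scales linearly with effective bandwidth, and that the clock stays the same. The 40% utilization is the SS:SE figure quoted in this thread; the function name `bandwidth_multiple_needed` is made up for the example.

```python
def bandwidth_multiple_needed(utilization, current_pipes, target_pipes):
    """Effective-bandwidth multiple needed to keep target_pipes fully busy.

    Assumes achieved fillrate scales linearly with effective bandwidth
    (a purely bandwidth-limited chip) and an unchanged core clock.
    """
    # Peak of the target config relative to the current peak, divided
    # by the fraction of the current peak that is actually reached.
    return (target_pipes / current_pipes) / utilization

for pipes in (4, 6, 8):
    multiple = bandwidth_multiple_needed(0.40, current_pipes=4, target_pipes=pipes)
    print(f"{pipes} pipes: about {multiple:.2f}x the effective bandwidth")
```

Under these assumptions, 4 pipes need about 2.5x, 6 pipes about 3.75x and 8 pipes about 5x the Radeon 8500's effective bandwidth. Whether the linear-scaling assumption holds is exactly what is disputed elsewhere in the thread.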
Clashman
19-Feb-2002, 16:02
By the way, I didn't think it was worthy of a new topic so I decided to post in this one: What happened to rage3d? I haven't been able to get on the site in days.
Entropy, didn't see your post before.
Everyone who in the discussion of future ATI chips refer to the datapoint of the Radeon 8500 only sustaining throughput equivalent to two theoretical pixel pipes makes a lot of assumptions.
Not really, no. I'm going by what seems to be the most reliable fillrate test around, and in that fillrate test Radeon 8500 is putting out a fillrate that is less than 2 of its pixel pipes should be capable of. It puts out 470 Mpixels/s; two 275 MHz pixel pipes can theoretically push 550 Mpixels/s.
My only assumption is that Radeon 8500 hasn't got driver problems with SS:SE, and I don't believe it has, even the Geforce 4 only hits 60% of its peak fillrate in that test and it has very good drivers in almost all games.
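For reference, the fillrate figures in this argument work out as follows. This is a minimal sketch assuming one pixel per pipe per clock; the 275 MHz clock, the 470 Mpixels/s result, and the rumoured 8 pipes at 350 MHz are the numbers quoted in this thread, while the names in the code are made up for illustration.

```python
def peak_fillrate(pipes, clock_mhz):
    """Theoretical peak in Mpixels/s, assuming one pixel per pipe per clock."""
    return pipes * clock_mhz

# Radeon 8500 figures as quoted in the thread.
r8500_peak = peak_fillrate(pipes=4, clock_mhz=275)
measured = 470                      # SS:SE fillrate test result, Mpixels/s
utilization = measured / r8500_peak
equivalent_pipes = measured / 275   # how many fully busy pipes that equals

# Rumoured R300 configuration (speculation, not a confirmed spec).
rumoured_r300_peak = peak_fillrate(pipes=8, clock_mhz=350)

print(f"R8500 peak: {r8500_peak} Mpixels/s")
print(f"Utilization: {utilization:.1%}, about {equivalent_pipes:.2f} pipes' worth")
print(f"Rumoured R300 peak: {rumoured_r300_peak} Mpixels/s")
```

The rumoured R300 peak of 2800 Mpixels/s is the 2.8 Gpixel/s figure mentioned earlier in the thread.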
In order to achieve the average benchmark throughput equivalent to 2 pixelpipes working at their theoretical maximum, it is likely that in fact more than 2 pipes have to be active for significant parts of the benchmark. Any "real-life" benchmark is sure to show such behaviour.
Whether more than 2 pipes are being used in any part of the test is irrelevant; the final output is the important part.
It may well be that future game engines will do more on-chip work per fetched byte of data. Having more pixel pipes may help significantly in this case, particularly with more advanced (and larger sized) on-chip cacheing of gfx data.
More yes, but for the most part 8 pipes will still be a waste.
We do not know what memory interfaces the next generation parts will use. We have used 128-bit memory paths for an eternity. The step up to 256-bit could well be seen as acceptable in cost by now, at least for a non low-end product. QDR might be an option, Kentron has referred to contacts with gfx companies. And of course we will see evolutional improvements in memory bandwidth as well.
And that, in your opinion, is going to allow 4 times the effective bandwidth over Radeon 8500? Because that's what it'll need to come close to its peak fillrate.
Improvements in memory controllers (like the GF2->GF3 step) and various bandwidth saving tricks (HyperZ, HSR) may increase the amount of useful work performed with a given bandwidth budget.
Again though, there's no way that any of this stuff plus a 256-bit DDR bus is going to give a bandwidth improvement of even 300%, never mind 400%+, over Radeon 8500, in which case they'd be better off keeping 4 pipes, or even moving to 6 pipes if they feel like wasting pipes, but not 8. It's a massive waste IMO, and a waste that you and I will be paying for. Well, I won't actually, because I'll simply refuse to buy it, but a lot of people will be paying for those wasted pipes, basically paying for a marketing ploy.
[ This Message was edited by: Teasy on 2002-02-19 17:10 ]
It's logical to assume that if ATI double their bandwidth they will double their pixel pipes, because that seems to be the way Nvidia and ATI do things. I'm not doubting they will do it; I'm saying it's silly and pointless to do it.
How do you know that if (I suppose) you know almost nothing about their future architectures? There are many unexplored ways to achieve a better efficiency on memory accesses even on a plain IMR.
Is nobody here a little pissed off at the idea of paying loads extra just so ATI and Nvidia can have bragging rights on theoretical fillrates that they never intend to even come close to reaching?
I'm not pissed at all. If you don't want to buy it for whatever reason just don't buy it. Keep it simple :smile:
ciao,
Marco
[ This Message was edited by: nAo on 2002-02-19 17:09 ]
Joe DeFuria
19-Feb-2002, 16:17
That was from the SS:SE fillrate test and AFAIK it was multi-texturing, Radeon 8500 reached about 40% of its peak fillrate.
Please post a link and show performance relative to other boards.
It's logical to assume that if ATI double their bandwidth they will double their pixel pipes, because that seems to be the way Nvidia and ATI do things. I'm not doubting they will do it; I'm saying it's silly and pointless to do it.
You can't say it's "silly" unless you know exactly what the bottlenecks are, and just as important, where the bottlenecks will lie in the future with heavily multitextured / "pixelshaded" DX9 apps. (See Entropy's post).
On R200's current efficiency, 2.5x the effective bandwidth would only fully use 4 pixel pipes, so again going to 8 pixel pipes is a total waste of silicon and nothing but a marketing ploy
Again, I simply feel you are making this assumption with little / no data to back it up.
If ATI just upped their bandwidth by 2.5x and left their pixel pipe config the way it is (maybe add a third TU), they could have the same performance increase without doubling the pixel pipes to 8, which would save everyone money.
Again, I think you're wildly extrapolating here. At least provide some tests that show increasing core clock on the Radeon 8500 doesn't impact the score at all, and that the scores increase linearly with memory clock.
But even then, we know nothing of the R300 memory interface itself.
Is nobody here a little pissed off at the idea of paying loads extra just so ATI and Nvidia can have bragging rights on theoretical fillrates that they never intend to even come close to reaching?
The proof is in the actual price/performance in real-world apps with the final product. You are going to pay "loads extra" for any of the "highest performance" parts, like the Ti 4600. I'd also wager that, if ATI does double the raw bus and include 8 pipes, the bulk of the "extra" cost is not with the silicon space for the chip, but with the bandwidth architecture. Yes, the additional pipes do come at a cost, but what does the highest performance chip sell for nowadays? $40? What do you think a 256 bit bus will "cost"?
How do you know that if (I suppose) you know almost nothing about their future architectures? There are many unexplored ways to achieve a better efficiency on memory accesses even on a plain IMR.
Well, if I remember correctly I did say in one of my previous posts that adding 8 pixel pipes is pointless unless ATI are going to somehow improve their effective bandwidth by over 400% over Radeon 8500; if they can do that then obviously it's not pointless. What I'm saying is, if they merely use a 256-bit bus and a slightly improved version of HyperZ II and manage a 250% boost in effective memory bandwidth, then there's no point in 8 pixel pipes, as that bandwidth boost will only just manage to fulfil the peak fillrate of 4 of its pipes. Maybe going to 6 would be acceptable, but 8 is just over the top.
I'm not pissed at all. If you don't want to buy it for whatever reason just don't buy it. Keep it simple
That's exactly what I'll be doing, as I said in my last post.
[ This Message was edited by: Teasy on 2002-02-19 17:37 ]
Entropy
19-Feb-2002, 16:22
Teasy, with all due respect, you are making very categorical statements regarding the validity of the rumoured R300 design based on YOUR interpretation of ONE datapoint. And that single datapoint is from another chip!
It doesn't wash.
/speculation mode = "1"
As far as I can see, the most straightforward way of taking a major step up in bandwidth is to double the data path to 2x128-bits. QBM is attractive since it doesn't require increases in chip-connector counts and trace counts. However it does require additional design work, and it may (or may not) introduce unforeseen signaling issues. Needs testing though. Using a wider datapath is relatively straightforward. And not too expensive either. The Voodoo 5500 doubled everything - gpu, memory, traces and used a larger board. With all that, and larger absolute profit margins for 3dfx and retailers over the 4500, it still was only roughly $100 more, typically less. So what was the end user cost of the dual 128-bit paths to memory? $5? $10? Is this a reasonable price to pay for significantly increased real-world performance? Of course it is. It is easily marketable too: "Twice the Bandwidth!" or "Twice the Performance!" or really simple "256-bits!". Plus of course it would actually yield roughly a factor 2 improvement in all high resolution and AA benchmarks, making for nice graphs to put on the box.
So I believe that 2x128-bit datapaths will make it to consumers first. Better to deal with the devil you know first, while getting chummy with the devil you don't providing you with the option of _another_ nice step upwards in performance once the competition in the market requires it.
/speculation mode = "0"
Think of it this way. Regardless of whether you are ATI or nVidia, could you afford _not_ to take such a step, if you have a chip design with a transistor budget and clock such that it would receive a major boost from faster memory? It would only be reasonable to assume that your competitor would be in the same situation, and if they doubled their memory bandwidth, then you would get hammered by roughly a factor of 2 in a LOT of benchmarks. That would be disastrous not only as far as sales are concerned but perhaps more importantly for your public image as a leader.
Given that a high end consumer card retails for $300 and up, once your GPU design requires it, adding bandwidth is straightforward and a no-brainer.
I don't think we'll see 2x128-bit QBM this year though. :smile:
Entropy
Joe DeFuria
19-Feb-2002, 16:30
Well, if I remember correctly I did say in one of my previous posts that adding 8 pixel pipes is pointless unless ATI are going to somehow improve their effective bandwidth by over 400% over Radeon 8500,
And we are saying that you haven't made even a reasonable case to show that even with a "mere" (cough!) bandwidth doubling, over the R200, that R300 wouldn't significantly benefit from a doubling of the pixel pipes as well.
Teasy, underclock the Radeon 8500 50%, and if the benchmark scores don't change, you might have at least a little something to base your argument on.
[ This Message was edited by: Joe DeFuria on 2002-02-19 17:33 ]
Please post a link and show performance relative to other boards.
http://www.anandtech.com/video/showdoc.html?i=1583&p=10
You can't say it's "silly" unless you know exactly what the bottlenecks are, and just as important, where the bottlenecks will lie in the future with heavily multitextured / "pixelshaded" DX9 apps. (See Entropy's post).
I can, I might turn out to be wrong of course but from the info currently available it looks like 8 pixel pipes with only double or 2.5x the bandwidth will be wasting most of the extra pipes most of the time.
Again, I simply feel you are making this assumption with little / no data to back it up.
Clearly I'm not making this assumption with no data to back it up though. The SS fillrate test has shown itself to be quite a reliable benchmark; whether that data is flawed (maybe by driver inefficiency) is anyone's guess, and would be an even bigger assumption than the one I'm making, considering nobody has any info to say that SS has problems with Radeon 8500.
Again, I think you're wildly extrapolating here. At least provide some tests that show increasing core clock on the Radeon 8500 doesn't impact the score at all, and that the scores increase linearly with memory clock.
I suppose I could do these tests, I'd have to download SS:SE first though.. does the demo have the fillrate test in it?
But even then, we know nothing of the R300 memory interface itself
When I said "If ATI just upped their bandwidth by 2.5x" I was including memory efficiency improvements, so what I'm not saying (or not meaning to say) is that 8 pipes will definitely be pointless no matter what, I'm saying if they only increase effective bandwidth by say 2.5 or 3 times over Radeon 8500 then 8 pipes will be pointless considering the current efficiency of Radeon 8500.
I'd also wager that, if ATI does double the raw bus and include 8 pipes, that the bulk of the "extra" cost is not with the silicon space for the chip, but for the bandwidth architecture. Yes, the additional pipes do come at a cost, but what does the highest performance chip sell for nowadays? $40? What do you think a 256 bit bus will "cost"?
Actually I'd say that doubling the pixel pipes (especially with 3 TU's per pipe) would increase the chip cost by quite a high percentage, it might not be a high percentage of overall board cost but if the extra pixel pipes were useless then you have to agree that's sort of annoying that you're paying extra so ATI can shout about its massive peak fillrates.
Well, with 8 pipes and 3 TMU each it means 3 (three) times the peak texel fillrate and probably with current DDR technology it will be a waste of silicon.
Of course more pipes will improve the performance but the improvement will be small because of Amdahl's law.
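The two points above (peak texel rate and Amdahl's law) can be put into rough numbers. This is only a sketch under stated assumptions: the 4 pipes x 2 TMUs baseline and the 275 MHz clock are assumed Radeon 8500-like figures, the clock is held equal on both sides so only the pipe/TMU layout changes, and the 30% fill-limited share of frame time is a made-up value for illustration, not a measurement.

```python
def peak_texel_rate(pipes, tmus_per_pipe, core_mhz):
    """Peak texel fillrate in Mtexels/s, assuming one texel per TMU per clock."""
    return pipes * tmus_per_pipe * core_mhz


def amdahl_speedup(fraction_sped_up, factor):
    """Amdahl's law: overall speedup when only part of the work gets faster."""
    return 1.0 / ((1.0 - fraction_sped_up) + fraction_sped_up / factor)


CLOCK_MHZ = 275  # assumed, and deliberately identical for both configurations

baseline = peak_texel_rate(4, 2, CLOCK_MHZ)   # assumed Radeon 8500-like layout
rumoured = peak_texel_rate(8, 3, CLOCK_MHZ)   # rumoured 8 pipes x 3 TMUs
texel_ratio = rumoured / baseline

# Hypothetical split: 30% of frame time is texel-fill limited, the rest is not.
speedup = amdahl_speedup(0.30, texel_ratio)

print(f"peak texel rate ratio: {texel_ratio:.1f}x")
print(f"overall speedup if 30% of the frame is fill limited: {speedup:.2f}x")
```

The ratio only matches the "3 times" figure because the clock is held constant; a higher core clock on the new part would raise it further. The overall speedup depends entirely on the assumed fill-limited fraction, which nobody in the thread has measured.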
There's still a lot of situations where bandwidth needs aren't very high and pure fillrate is what matters, stencil shadow volumes for instance, which will be important once doom3 arrives.
Another thing to consider, even if they put in 8 pipelines, nothing says they will all be full-featured, they can have 4 full-blown pipes and 4 simpler ones which can be utilized for simple stuff such as pure stencil etc, like the U and V pipes in Pentium processors.
Joe DeFuria
19-Feb-2002, 16:49
http://www.anandtech.com/video/showdoc.html?i=1583&p=10
You'll notice that the Radeon is getting hammered relative to the GeForce*3* in just about every test with that version of SS and the drivers. Clearly, this is not representative of how most games perform on the Radeon relative to the GeForce3.
What about 3D Mark 2001 fill-rate tests?
Clearly I'm not making this assumption with no data to back it up though, the SS fillrate test has shown itself to be quite a reliable benchmark, whether that data is flawed (maybe by driver inefficiency) is anyone's guess...
Reliable compared to what? It certainly doesn't show the Radeon's absolute performance being on par with GeForce3 boards, which is what just about every other game indicates.
My guess is that benchmark + that version of Radeon drivers = atypical results = bad data point to base your theory on.
and would be an even bigger assumption than I'm making considering nobody has any info to say that SS has problems with Radeon 8500.
Huh? Cro-team has themselves acknowledged an issue found with their engine and the Radeon, that they are developing a fix for. (This comes out of all the "High-Poly-Bug" threads over at Rage3D). And on the very page you linked to:
ATI's latest beta drivers, although fixing a number of problems in arguably more important benchmarks (e.g. UPT2002), have reduced performance significantly under Serious Sam. As you can tell by the above performance chart, the Radeon 8500 drops from where it normally resides between the GeForce3 Ti 200 and the GeForce3 down to the level of the GeForce2 Ti 200. ATI is approximately a month away from shipping a WHQL certified version of these drivers (v7.66) which should hopefully have all of the kinks worked out.
What were you saying about nobody having any info related to problems with SS and Radeon? It's right on the page you linked. In short: this is CLEARLY a bad data point to base your theory on.
I'm saying if they only increase effective bandwidth by say 2.5 or 3 times over Radeon 8500 then 8 pipes will be pointless considering the current efficiency of Radeon 8500.
Again, you have no reasonable basis to evaluate the "current efficiency of Radeon 8500". That's the point we're all making. :wink:
if the extra pixel pipes were useless then you have to agree that's sort of annoying that you're paying extra so ATI can shout about its massive peak fillrates.
Yes, if the extra pipes were in fact always just sitting there idle because they can't be fed any data, that would be annoying. You seem to have already reached a conclusion about radeon 8500's bandwidth though, based on a bad data point, which is equally as annoying. :wink:
Teasy, with all due respect, you are making very categorical statements regarding the validity of the rumoured R300 design based on YOUR interpretation of ONE datapoint. And that single datapoint is from another chip!
It doesn't wash.
No I'm not.. I'm not saying R300 will have 8 pixel pipes and that's pointless so I won't buy it. I'm saying if it had 8 pixel pipes and couldn't reach more than a 250% bandwidth improvement over Radeon 8500 then those 4 extra pipes seem sort of pointless.
As for your comments on 2x 128bit DDR buses, I don't disagree with you and never have throughout this thread, my only point is with a 256bit DDR bus the Radeon 8500 would be hitting 80% of its peak in SS:SE, add some nice extra or more advanced bandwidth saving tech and you can have that at 100%, so why double the pixel pipes?.. yeah ok move to 6 pixel pipes, that's reasonable but doubling them IMO just seems like adding pipes for the sake of theoretical fillrate numbers on the back of the box.
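The argument in this post can be sketched as a toy model, under the big assumption (disputed elsewhere in this thread) that the SS:SE fillrate test is purely bandwidth bound. The 40% figure is the one quoted in the thread; the 275 MHz core clock, the one-pixel-per-pipe-per-clock peak, and the function name are assumptions made for illustration.

```python
def modelled_fillrate(pipes, bandwidth_scale, base_pipes=4,
                      core_mhz=275, base_efficiency=0.40):
    """Return (achieved Mpixels/s, fraction of peak) for a purely bandwidth-bound chip.

    The baseline chip reaches `base_efficiency` of its peak; memory bandwidth is
    then scaled by `bandwidth_scale` and the pipe count is changed to `pipes`.
    """
    base_limit = base_efficiency * base_pipes * core_mhz  # what memory can feed today
    limit = base_limit * bandwidth_scale                  # what memory could feed after scaling
    peak = pipes * core_mhz                               # one pixel per pipe per clock
    achieved = min(peak, limit)
    return achieved, achieved / peak


for pipes in (4, 6, 8):
    for scale in (2.0, 2.5, 3.0):
        achieved, fraction = modelled_fillrate(pipes, scale)
        print(f"{pipes} pipes, {scale}x bandwidth: "
              f"{achieved:.0f} Mpixels/s ({fraction:.0%} of peak)")
```

In this model 4 pipes with doubled bandwidth sit at about 80% of peak, while 8 pipes with 2.5x bandwidth only reach about half of their peak. If the test is partly limited by something other than bandwidth (drivers, multitexturing passes), the model simply does not apply.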
Teasy, underclock the Radeon 8500 50%, and if the benchmark scores don't change, you might have at least a little something to base your argument on
Tried that once, never again.. my screen corrupted badly. If I can get the SS:SE fillrate test somehow then I'd be happy to do some tests though, but does the demo include the fillrate test?
Joe DeFuria
19-Feb-2002, 16:53
I'm saying if it had 8 pixel pipes and couldn't reach more than a 250% bandwidth improvement over Radeon 8500 then those 4 extra pipes seem sort of pointless.
Well, at least you've come down from that 400% figure. :wink:
You'll notice that the Radeon is getting hammered relative to the GeForce*3* in just about every test with that version of SS and the drivers. Clearly, this is not representative of how most games perform on the Radeon relative to the GeForce3.
Ok then if we say Radeon 8500 should be matching the Geforce 3 TI500 in that test then we can take its fillrate up to the level of the Geforce 3 TI500, in which case it'd be reaching 50% of its fillrate rather than 40%.. not a huge difference.
What about 3D Mark 2001 fill-rate tests?
That's a bit of a joke test to be honest, it shows pretty much all cards hitting their peak fillrate from what I remember, the SS test seems a whole lot more believable with only Kyro 1 and II hitting 100%.
My guess is that benchmark + that version of Radeon drivers = atypical results = bad data point to base your theory on.
If we match the Radeon 8500 to the Geforce TI500 then it's still only hitting 50% of its fillrate, that doesn't change my opinion that doubling the pixel pipes while slightly more than doubling the bandwidth seems very frivolous (unless they can triple or more the effective mem bandwidth of course, but I doubt that).
Huh? Cro-team has themselves acknowledged an issue found with their engine and the Radeon, that they are developing a fix for. (This comes out of all the "High-Poly-Bug" threads over at Rage3D). And on the very page you linked to:
The actual game tests don't go with the fillrate tests though, whatever is slowing down the game is not hitting the fillrate tests as badly.. look at both tests, in the game test Radeon 8500 is below both the Geforce 4 MX 460 and the Geforce 3 TI 200 and in the fillrate test it's above both of those cards.
What were you saying about nobody having any info related to problems with SS and Radeon? It's right on the page you linked. In short: this is CLEARLY a bad data point to base your theory on.
No it's not because the fillrate tests are not so badly affected like the actual game test is, also even if we assume Radeon 8500 is as fast as Geforce 3 TI500 in SS without any bugs then it's still only hitting 50% of its peak.
Again, you have no reasonable basis to evaluate the "current efficiency of Radeon 8500". That's the point we're all making.
Yes I do.. thats the point I'm making :wink:
You seem to have already reached a conclusion about radeon 8500's bandwidth though, based on a bad data point, which is equally as annoying.
And you seem to have readily jumped to the assumption that this is bad data even though the fillrate tests are obviously either not affected by this bug or only slightly affected by the bug.
Right off to have my dinner and then I'll probably go to bed after I've watched TV for a bit, up at 7am tomorrow.. you'll all just have to wait if you want to argue with me some more :smile:
Sorry to disturb you guys in your discussion :wink: but I have some questions/thoughts.
Didn't someone here some time ago post about a type of AA-ing that burns fillrate but not bandwidth? I can't remember specifics, anyone know more about that?
And I always hear that R300 is an ArtX design, so one should look at what they achieved with Flipper: Virtual Texturing, high bandwidth 1T-SRAM. I would assume both would be incorporated into R300, which would mean the chances of all 8 pixel pipes being in use aren't that bad. :smile:
I think the 3rd TMU would be helpful in many cases where the Radeon does its quick and dirty trilinear filtering. Serious Sam may specifically use the 3rd TMU, but I don't think it's necessary for games to explicitly take advantage of it, perhaps a few extra TMUs to make filtering more efficient isn't that bad an idea.
Joe DeFuria
19-Feb-2002, 18:47
[3D Mark Fillrate test] is a bit of a joke test to be honest, it shows pretty much all cards hitting their peak fillrate from what I remember, the SS test seems a whole lot more believable with only Kyro 1 and II hitting 100%.
Why is it a joke, because something refutes your "theory?" I haven't seen anyone else question the validity of the theoretical fill-rate tests of 3D Mark. That is, given "optimal" fill-rate conditions, how close to "peak" can these cards get?
The question then, is why is 3D Mark's fill-rate test getting much closer to "theoretical" fill rates on all cards than SS is? (Assuming this is the case...I have not seen the 3D Mark data points.)
If the reason is simply and solely "bandwidth", then how did Radeon magically come up with the bandwidth for 3D Mark 2001? Seems to me it might be a question of how they lump passes together, requiring multiple passes for Radeon, GeForce, etc.
also even if we assume Radeon 8500 is as fast as Geforce 3 TI500 in SS without any bugs then it's still only hitting 50% of its peak.
Again, where is the evidence that more memory bandwidth is the culprit behind Radeon only reaching 50% of the peak on the SS test? Why can't it be a fill-rate limitation based on how the engine performs the multitexturing passes? Can the test be run in 16 bit and 32 bit color? Does that drastically increase the fill-rate scores?
There are simply too many unanswered questions for you to make the claims that you are.
Yes I do.. [have a reasonable basis for my argument]
Well, we'll have to agree to disagree. :wink:
And you seem to have readily jumped to the assumption that this is bad data even though the fillrate tests are obviously either not affected by this bug or only slightly affected by the bug.
No, I didn't jump to anything. I gave my opinion which includes the need to run more tests to come to any sort of supportable "theory". That includes my own.
I didn't say anything like 8 pixel pipes would be useless, or useful for that matter. I said I wouldn't rule out their usefulness, even if bandwidth is "only" increased two fold.
Right off to have my dinner and then I'll probably go to bed after I've watched TV for a bit, up at 7am tomorrow.. you'll all just have to wait if you want to argue with me some more
Well, there's not really much more to argue about unless we have more data to work with. :sad:
Just found something (again):
http://www.3d-center.de/images/grafikfilter/1527-1639.gif (from http://www.3dcenter.de/artikel/grafikfilter/, thanks aths!)
Didn't someone here some time ago post about a type of AA-ing that burns fillrate but not bandwidth? I can't remember specifics, anyone know more about that?
Maybe it's the other way around, an AA-ing that burns bandwidth but not fillrate, as on gf3/gf4.
Are you talking about that?
And I always hear that R300 is an ArtX design, so one should look at what they achieved with Flipper: Virtual Texturing, high bandwidth 1T-SRAM. I would assume both would be incorporated into R300, which would mean the chances of all 8 pixel pipes being in use aren't that bad. :smile:
Umh..they'd need a lot of 1T-SRAM...dunno if this is feasible at this time and on a 0.13 micron process.
ciao,
Marco
LeStoffer
19-Feb-2002, 18:55
On 2002-02-19 17:22, Entropy wrote:
As far as I can see, the most straightforward way of taking a major step up in bandwidth is to double the data path to 2x128-bits.
Right on. It might be time to go back to some kind of Voodoo2-way of splitting the memory up on two blocks. Maybe 64 MB for the framebuffer and the vertex-"buffer" and other 64 MB for textures. I know, I know, it's really not flexible, but 2x128 bit should still be a lot cheaper than real 256 bit. And with 2x64 MB things should work nicely. Or am I missing something?
BTW: This is not specific to the R300.
Regards, LeStoffer
On 2002-02-19 19:53, nAo wrote:
Maybe it's the other way around, an AA-ing that burns bandwidth but not fillrate, as on gf3/gf4.
Are you talking about that?
Could be that I confused those things, my brain is a bit flappy lately. :smile: AA-ing burning fillrate but no bandwidth would be more useful these days..
Umh..they'd need a lot of 1T-SRAM...dunno if this is feasible at this time and on a 0.13 micron process.
Maybe with some tricks they won't need as much? I believe R300 will have Virtual Texturing, could a similar technique be applied to the framebuffer? (sounds strange, I know)
Could be that I confused those things, my brain is a bit flappy lately. :smile: AA-ing burning fillrate but no bandwidth would be more useful these days..
What about AA-ing via multisampling without a bandwidth hit and with only a slight fillrate hit? :wink:
Maybe with some tricks they won't need as much? I believe R300 will have Virtual Texturing, could a similar technique be applied to the framebuffer? (sounds strange, I know)
Current (small, just some kbytes) texture caches work as a kind of virtual texturing. Even multiplying the size of these caches by ten will give almost no boost to performance because there is no data locality left to capture. To have a breakthrough in this field one would need a BIG texture cache with virtual addressing to capture inter-object and inter-frame texture locality.
Virtual addressing of frame buffer? I believe nvidia do it since nv10 :smile: Remember, a current GPU has caches for everything: textures, vertexes, frame and z-buffer, states, etc..
ciao,
Marco
[ This Message was edited by: nAo on 2002-02-19 20:47 ]
Joe DeFuria
19-Feb-2002, 20:22
I know, I know, it's really not flexible, but 2x128 bit should still be a lot cheaper than real 256 bit. And with 2x64 MB things should work nicely. Or am I missing something?
The only thing you're missing, is that I believe the "cost" of 2x128 bit isn't much different from "real" 256 bit...if we're talking about single chip solutions here.
One reason why Voodoo SLI was cheap "enough", was that there were multiple chips in action. The original Voodoo2 had a 192 bit bus...but it was spread over three chips....3x64 (Heck, the Quantum3D "Voodoo2 Single Slot SLI" was a 384 bit card!!)
So, while it's cheaper (bus wise) to implement a 2x128 bus than a single chip 256 bus, it's more expensive (chip wise) to implement the 2x128 bit solution. Trade-offs, as with everything.
At this point, I would rather see a single-chip, 256 bit wide bus become the next standard....but in the mean time I would welcome a new SLI solution while they work out single chip 256 bit packaging issues. :wink:
Umh..they'd need a lot of 1T-SRAM...dunno if this is feasible at this time and on a 0.13 micron process.
Just looked it up: Flipper uses NEC's embedded DRAM 0.18 process and has 51 million transistors. It has 3MB 1T-SRAM which takes up ~25 million transistors. NV25 already has 63 million transistors on 0.15 with 144mm², R8500 60 million. So, there is really not much space left for 1T-SRAM onchip, even on 0.13.
For comparison: BitBoys' Glaze3D with 72Mbit/9MB on 0.20 had 150mm² with a 1.54 million transistor rendering core. Sadly I don't know about the transistor count of Infineon's embedded memory.
Oh, and I found a nice quote from the Bitboys: "Others will follow eventually [with embedded memory], but their huge 10-15 million transistor rendering cores are way too large and inefficient to be combined with embedded DRAM."
:smile:
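The Flipper figure quoted above (3MB of 1T-SRAM taking ~25 million transistors) is easy to sanity check. A small sketch, counting storage cells only: one transistor per bit for 1T-SRAM, with conventional 6-transistor SRAM shown for contrast. Decoders, sense amplifiers and control logic are ignored, and the function name is just for illustration.

```python
def storage_transistors(megabytes, transistors_per_bit=1):
    """Transistors needed for the storage cells alone (no decoders or control logic)."""
    bits = megabytes * 1024 * 1024 * 8
    return bits * transistors_per_bit


flipper_cells = storage_transistors(3)       # 1T-SRAM: one transistor per bit
six_t_cells = storage_transistors(3, 6)      # same capacity as conventional 6T SRAM
glaze3d_megabytes = 72 / 8                   # 72 Mbit expressed in MB

print(f"3MB as 1T-SRAM: {flipper_cells / 1e6:.1f} million transistors")
print(f"3MB as 6T SRAM: {six_t_cells / 1e6:.1f} million transistors")
print(f"72 Mbit = {glaze3d_megabytes:.0f} MB")
```

So the ~25 million figure is consistent with one transistor per bit, and the same 3MB as ordinary SRAM would cost roughly six times as many transistors for the cells alone.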
mboeller
19-Feb-2002, 20:51
To give this discussion another twist :
I don't believe for one second that the R300 will have 3TMU's per Pipeline. This would be stupid, because you cannot send enough textures to the TMUs to use them more than 30% (using a normal 128bit DDR interface).
The 8-Pipeline-architecture was mentioned in the old presentations so how about an :
8 x 1 System
used like the 4 x 1 Units the Rampage ( = texture-computer) had. The R300 would then be a chip with twice the speed/cycle of the old Rampage. This would be useful for the up to 8 textures per pass allowed in DX9.
IMHO; multichip with 2x128bit DDR would work, but only when the 2 chips are connected with a high-speed, low-latency interconnect like the one inside AMD's Hammer series (HyperTransport or something similar). This would mean that the chips are connected like the old Voodoos in SLI (or something similar like MAXX) but use only one set of textures and (maybe) two framebuffers and two Z-buffers.
Manfred
[ This Message was edited by: mboeller on 2002-02-20 09:38 ]
LeStoffer
19-Feb-2002, 21:52
Hmmm, actually I wasn't talking about Voodoo2 in regard to the SLI (two chip solution) but in regard to the idea of using segregated memory in a one chip solution. You make two separate 128 bit busses on one card, each of which has 64 MB of DDR memory dedicated. One is used by the chip to hold the framebuffer and let’s say, a cached vertex pool, while the other is used to hold the textures (like the Voodoo had separate memory for framebuffer and textures).
But sorry if I’m babbling too much. :wink:
Regards, LeStoffer
LeStoffer:
That actually wouldn't be a bad idea. Consider that basically the only negative spin put on two separate memory banks when they were used in the Voodoo Line was the fact that during SLI, you had to replicate the textures.
In a one chip solution, that would not be the case.
multigl
19-Feb-2002, 22:38
On 2002-02-19 14:44, nAo wrote:
Multigl, what are you referring to ? :smile:
:smile:
I think separating out the Frame Buffer and texture memory into different banks would be a step backwards.
While it might provide increased bandwidth it would mean lots of messy workarounds when it came to rendering to a texture, and that is becoming more and more pervasive.
On 2002-02-19 19:16, Nexus wrote:
Didn't someone here some time ago post about a type of AA-ing that burns fillrate but not bandwidth?
That's what Kyro2 does...
That's what Kyro2 does...
No, Kyro2 requires more texture bandwidth because, afaik, it uses a supersampling approach.
If I understand the bandwidth thing correctly, the problem is that a 256bit bus has (at least) 256 traces that connect up the GPU and memory, and that because of this the PCB is really hard to design...
Is this correct?
Anyways, assuming you wanted to stay with 128bits, couldn't you use RDRAM? Use say 4 32bit PC1033 (PC1200) RIMMS. That would give you as much as 16.8 (19.2)GB/s of bandwidth. Regarding cost of the RDRAM, would it really be more than that of highly clocked DDR chips?
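The RDRAM numbers above can be checked with the usual peak bandwidth formula (bus width in bytes times transfers per second). This is a rough sketch: it assumes "PC1200" means 1200 million transfers per second per pin, and the 128-bit DDR line at 275 MHz (550 MT/s) is an assumed Radeon 8500-like comparison point, not something stated in this post.

```python
def peak_bandwidth_gb_s(bus_bits, mega_transfers_per_s):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes) for a bus of the given width and rate."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


TOTAL_BITS = 4 * 32  # four 32-bit RIMMs side by side, 128 bits in total

pc1200 = peak_bandwidth_gb_s(TOTAL_BITS, 1200)       # assumed 1200 MT/s parts
ddr_128 = peak_bandwidth_gb_s(128, 550)              # assumed 275 MHz DDR on a 128-bit bus

# Transfer rate that four 32-bit RIMMs would need to deliver the quoted 16.8 GB/s
implied_rate = 16.8e9 / (TOTAL_BITS / 8) / 1e6

print(f"4 x 32-bit at 1200 MT/s: {pc1200:.1f} GB/s")
print(f"128-bit DDR at 550 MT/s: {ddr_128:.1f} GB/s")
print(f"rate implied by 16.8 GB/s: {implied_rate:.0f} MT/s")
```

The 19.2 GB/s figure follows directly from 1200 MT/s; the 16.8 GB/s figure corresponds to roughly 1050 MT/s across the same 128 bits.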
Ascended Saiyan
20-Feb-2002, 00:56
On 2002-02-19 15:37, Joe DeFuria wrote:
Yes and no. There have been official roadmaps leaked this past summer, which included the "specs" of the R300. The roadmap did state that the R300 was DX9 and 8 pixel pipes. (Did not have number of TMUs per pipe though.) I believe it also stated geometry power as "4X" that of R200.
Hi everyone this is my first post. If this is true that would place the triangle count at 300 million since the R8500 is at 75 million currently. My guess is that this will only be achievable with 4 vertex shaders & one DirectX 7 T&L unit since the GeForce 4 has 2 vertex shaders & 1 classic T&L unit & it's capable of 136 million triangles/sec.
[ This Message was edited by: Ascended Saiyan on 2002-02-20 01:59 ]
nooneyouknow
20-Feb-2002, 02:54
GF4 only has 2 Vertex Engines and not an extra 'DX7' one. 2 Total, assuming one or both can operate in programmable or fixed-function mode.
What's the point of such a huge amount of power in the T&L engine if there is a bug somewhere in the drivers or hardware that keeps stalling it?
Just a joke... :razz:
Just thought I better add I own a Radeon 8500 and that's why I am so interested in their next offering, the 'R300.'
No bashing .... been an NVIDIA card owner since the TNT and changed tracks to ATI and the R200 only because of the great performance and features, not to mention lean price !
When Life does not find a singer to sing her heart she produces a philosopher to speak her mind.
[ This Message was edited by: misae on 2002-02-20 07:07 ]
Hellbinder
20-Feb-2002, 07:44
For those of you making assumptions about the R8500 actual fill rate vs. peak theoretical, based on early Serious Sam SE benches.....
The jury is out on that. Croteam worked out a Serious OpenGL bug that was hampering SS and SS:SE (and many other OpenGL apps). It is clear that R8500 numbers in SS will be WAAAYY up after the next beta driver release with its brand spiffy new OpenGL driver.
As far as 8 pixel pipelines and bandwidth issues. (first off it HAS 8 pixel pipelines and 4 geometry pipelines as stated).....
Remember ArtX is the team that designed this chip. They previously designed a wonder of a GPU with a very unique texturing process, embedded RAM, and a powerful T&L engine (can you say GameCube?).
I imagine, and this is pure speculation on my part, that if ArtX put 8 pipes in that sucker, then they have developed a way to USE them. Otherwise it's a waste of time.
Livecoma
20-Feb-2002, 10:06
So what you're basically saying is... First R300, then... the world!! lol j/k
I agree that if the R300 has 8 pipes it will be for a good reason, and not go to waste. What I don't agree with is the speculation and/or implication that it's going to be anything huge relative to what's in store from other companies.
Who exactly knows what's brewing in NVIDIA's kitchen right now?
Tagrineth
21-Feb-2002, 18:00
Multisampled FSAA needs almost no extra bandwidth. It requires extra Z tests, but no extra texture data whatsoever - it repeats the same texel per pixel space, using the extra Z data to determine opacity.
8 pixels per clock would allow some tremendous multisampling usage... depending on whether or not ATi actually uses MS AA... ^_^
Another thing... 1T-SRAM isn't a purely embedded solution. The 'fast' 32MB pool in GameCube is 1T-SRAM and is shared by Flipper and Gekko.
Another thing... 1T-SRAM isn't a purely embedded solution. The 'fast' 32MB pool in GameCube is 1T-SRAM and is shared by Flipper and Gekko.
24MB :smile: Besides..can 1T-SRAM keep up with the current fastest DDR memory? I don't believe it. And DDR latency is not a big problem at all with a well designed GPU..
ciao,
Marco
Ichneumon
21-Feb-2002, 18:48
Rumor has it (well, a bit more than a rumor, but it's unconfirmed) that SmoothVision, at least in its current incarnation, won't be implemented in the r300... the implication being that the r300 will be using multi-sampling AA combined with anisotropic filtering.
_________________
Ichneumon
http://www.rage3d.com
[ This Message was edited by: Ichneumon on 2002-02-21 19:49 ]
Would make sense. Kind of sad , but would make sense... (I personally like SmoothVision)
On 2002-02-21 19:00, Tagrineth wrote:
Multisampled FSAA needs almost no extra bandwidth. It requires extra Z tests, but no extra texture data whatsoever - it repeats the same texel per pixel space, using the extra Z data to determine opacity.
I wouldn't say it needs almost no extra bandwidth, just not as much as super-sampling. nvidia still uses a frame buffer that is 2x or 4x normal size. This means they use extra bandwidth.
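To put a number on the "2x or 4x normal size" point, here is a small sketch of the storage footprint of an oversized colour + Z buffer. The 1024x768 resolution, 32-bit colour and 32-bit Z are assumptions picked for illustration, and this counts bytes stored, not memory traffic per frame.

```python
def aa_buffer_bytes(width, height, samples, color_bytes=4, z_bytes=4):
    """Bytes of colour + Z storage when every pixel keeps `samples` samples."""
    return width * height * samples * (color_bytes + z_bytes)


MIB = 1024 * 1024

for samples in (1, 2, 4):
    size = aa_buffer_bytes(1024, 768, samples)
    print(f"{samples}x samples at 1024x768: {size / MIB:.0f} MiB")
```

Multisampling saves the extra texture fetches that supersampling needs, but the colour and Z samples still have to be written and later read back for the downsample, which is where the extra bandwidth goes.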
I think Smoothvision is visually superior to any other method used by competitors (i.e. Multisampling) but I have not seen the new Accuview in action.
Tagrineth
22-Feb-2002, 17:22
On 2002-02-22 03:15, 3dcgi wrote:
On 2002-02-21 19:00, Tagrineth wrote:
Multisampled FSAA needs almost no extra bandwidth. It requires extra Z tests, but no extra texture data whatsoever - it repeats the same texel per pixel space, using the extra Z data to determine opacity.
I wouldn't say it needs almost no extra bandwidth, just not as much as super-sampling. nvidia still uses a frame buffer that is 2x or 4x normal size. This means they use extra bandwidth.
As I said, it still uses additional Z tests, although I should have also brought up the oversized frame buffer. ^_^;
Ascended Saiyan
22-Feb-2002, 18:42
On 2002-02-22 07:42, misae wrote:
I think Smoothvision is visually superior to any other method used by competitors (i.e. Multisampling) but I have not seen the new Accuview in action.
Here are some shots of Smoothvision & Accuview in 4X & 4XS from the xbit-labs review. It seems to me that 6X Quality Smoothvision actually gets rid of the jaggies better.
Accuview 4X
http://www.xbitlabs.com/video/geforce4/a-accuview4x-s.jpg
http://www.xbitlabs.com/video/geforce4/a-accuview4x-frag.jpg
Accuview 4XS
http://www.xbitlabs.com/video/geforce4/a-accuview4xs-s.jpg
http://www.xbitlabs.com/video/geforce4/a-accuview4xs-frag.jpg
Smoothvision 6X Quality
http://www.xbitlabs.com/video/geforce4/a-smoothvision-s.jpg
http://www.xbitlabs.com/video/geforce4/a-smoothvision-frag.jpg
Thanks
I have seen Accuview on the web in static form. However what I meant was I had not seen Accuview first hand. That's still the best way to judge as it is a bit subjective reviewing visual quality of an already high standard. In moving form however that is when the real differences come to reality and I have already got bad eyesight... maybe when things are moving I personally would not notice the discrepancies present in screenshots! :lol:
And since I do not have a GF4 nearby I would say Smoothvision is the 'best' visual FSAA I have seen moving and in static shots. However the performance drop in some games I play is noticeable but I leave it on because going back to non-FSAA is like playing an entirely different game.
That's just my opinion :eek:
king_iron_fist
23-Feb-2002, 15:57
Just thought I'd point you towards this article at anandtech about the ArtX GPU for the gamecube.
http://www.anandtech.com/showdoc.html?i=1566&p=5
It mentions embedded dram and how useful it would be on the PC platform. Is this how the r300 will achieve its titanic bandwidth?