David Kirk on HDR+AA

Status
Not open for further replies.
As others have iterated: Comparing architectures that are ~2 generations apart is ridiculous.
Why is that? What magic happened between 2002 and 2004?

Let's look at some facts:
- NV30 is comparable to R300
- NV30 is in a TSMC 0.13u process
- NV30 runs at 500 MHz core and memory
- NV30 is 120 million transistors
- NV30 uses DDR-2 memory.
- NV43 is in a TSMC 0.11u process
- The TSMC 0.11u process is a cost-saving node over 0.13u. That is, performance is the same (but cost isn't).
- NV43 is clocked at 500 MHz core and memory
- NV43 is 143 million transistors.
- NV43 uses GDDR-2 memory.

Let's pick up these assumptions along the way:
- Jen-Hsun's 60 million transistors used for SM3.0 on NV40
- That 60 million scales with the number of pipelines.
- GDDR-2 memory is similar enough to DDR-2 that we can interchange them.

Intermediary conclusions:
- NV43 minus SM3.0 is ~115 million transistors (give or take a few million)
- NV43 can be built on TSMC 0.13 at around the same cost as NV30

Conclusions we can draw:
- NV43 was possible to build in the 2002-2003 time-frame.
- We can compare NV43 to R300 or R350.

Feel free to point out any flaws in this argument.
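
The transistor estimate above can be reproduced as a quick sketch. The pipeline counts (16 for NV40, 8 for NV43) are assumptions not stated in the post itself, and the 60 million figure is only the quoted claim, so treat this as a check of the arithmetic rather than of the hardware:

```python
# Rough check of the "NV43 minus SM3.0" estimate.
# Assumptions (not stated in the post): NV40 has 16 pixel pipelines, NV43 has 8.
NV40_PIPES = 16
NV43_PIPES = 8
SM3_COST_NV40_M = 60        # quoted SM3.0 cost on NV40, in millions of transistors
NV43_TRANSISTORS_M = 143    # figure given in the post, in millions


def sm3_cost_m(pipes, ref_pipes=NV40_PIPES, ref_cost_m=SM3_COST_NV40_M):
    """Scale the quoted SM3.0 cost linearly with the number of pipelines."""
    return ref_cost_m * pipes / ref_pipes


nv43_sm3_m = sm3_cost_m(NV43_PIPES)
nv43_without_sm3_m = NV43_TRANSISTORS_M - nv43_sm3_m

print(f"SM3.0 share on NV43: ~{nv43_sm3_m:.0f}M transistors")
print(f"NV43 minus SM3.0:    ~{nv43_without_sm3_m:.0f}M transistors")
```

Under those assumptions the result is about 113 million, which is within the "give or take a few million" of the ~115 million quoted.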


Edit:
There is no case where you say "gosh, I really wish I had less bandwidth". It never hurts.
Sure there is. Otherwise, why not have 1000 bits for your memory bus? At some point, this thing called "money" enters the equation. Wider buses cost more money. Someone at some IHV went "I think we have too much bandwidth. GPU X can be Y dollars cheaper if we use less bandwidth".

Sure, more bandwidth is better. I'm not arguing that bandwidth isn't a good thing.
 
Bob, the whole pipeline configuration is very different between the NV30 and the NV40.

The R300 was made long before the NV40. Comparing the parts is just as stupid as comparing a G70 to a TNT2.
 
Bob said:
As others have iterated: Comparing architectures that are ~2 generations apart is ridiculous.
Why is that? What magic happened between 2002 and 2004?

Let's look at some facts:
- NV30 is comparable to R300
- NV30 is in a TSMC 0.13u process
- NV30 runs at 500 MHz core and memory
- NV30 is 120 million transistors
- NV30 uses DDR-2 memory.
- NV43 is in a TSMC 0.11u process
- The TSMC 0.11u process is a cost-saving node over 0.13u. That is, performance is the same (but cost isn't).
- NV43 is clocked at 500 MHz core and memory
- NV43 is 143 million transistors.
- NV43 uses GDDR-2 memory.

Let's pick up these assumptions along the way:
- Jen-Hsun's 60 million transistors used for SM3.0 on NV40
- That 60 million scales with the number of pipelines.
- GDDR-2 memory is similar enough to DDR-2 that we can interchange them.

Intermediary conclusions:
- NV43 minus SM3.0 is ~115 million transistors (give or take a few million)
- NV43 can be built on TSMC 0.13 at around the same cost as NV30

Conclusions we can draw:
- NV43 was possible to build in the 2002-2003 time-frame.
- We can compare NV43 to R300 or R350.

Feel free to point out any flaws in this argument.

The flaw? Well, only that you need a time machine to take the NV43 back in time to compete with the R300.
 
No mention of any bandwidth saving technologies in that comparison.

Different strokes for different folks. An AMD CPU is in no way comparable to an Intel "NetBurst" CPU in terms of bandwidth dependency. I'd imagine this was the same between different IHVs producing GPUs, and even within one IHV's generational range of GPUs..... :?:
 
If the GeForce 6600GT comparison doesn't work for you, compare the X700XT to a 9800 Pro. You are all splitting hairs. Obviously the NV3x and NV43 are very different. But if the NV3x had had a pipeline comparable to the NV43's from the get-go, the memory bandwidth would not have been that big of a deal because of the targeted clock speeds. That's all he is saying.
 
Maybe people need to look at available bandwidth numbers rather than how wide a bus is whilst disregarding the speed of that bus :rolleyes: I dunno, but if you have a highway that's 3 lanes wide with a 120 km/h speed limit and you compare that to a highway with 6 lanes and a 60 km/h limit... is it the same vehicles-per-hour throughput? Or close, or am I wildly off base? Oh... let's consider passengers-per-hour throughput, given that the 3-lane highway can fit more passengers per car whilst maintaining the same vehicles-per-hour throughput.
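
As a minimal sketch of the analogy: peak memory bandwidth is roughly bus width times effective data rate, just as highway throughput scales with lanes times speed. The bus widths and data rates below are illustrative round numbers, not measurements of any particular card, and real-world throughput also depends on memory efficiency and bandwidth-saving techniques:

```python
def highway_throughput(lanes, speed_kmh):
    """Relative vehicles-per-hour figure: lanes x speed (arbitrary units)."""
    return lanes * speed_kmh


def peak_bandwidth_gb_s(bus_width_bits, effective_rate_mhz):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    effective_rate_mhz is the effective data rate, i.e. already doubled for DDR.
    """
    return bus_width_bits * effective_rate_mhz / 8 / 1000


# The highway from the post: 3 lanes at 120 km/h vs 6 lanes at 60 km/h.
print(highway_throughput(3, 120), highway_throughput(6, 60))

# Illustrative memory equivalent: a narrow fast bus vs a wide slow bus.
print(peak_bandwidth_gb_s(128, 1000), peak_bandwidth_gb_s(256, 500))
```

On paper both pairs come out equal, so bus width alone says little without the speed of the bus.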
 
Bob said:
Intermediary conclusions:
- NV43 minus SM3.0 is ~115 million transistors (give or take a few million)
- NV43 can be built on TSMC 0.13 at around the same cost as NV30

Conclusions we can draw:
- NV43 was possible to build in the 2002-2003 time-frame.
- We can compare NV43 to R300 or R350.

Feel free to point out any flaws in this argument.
Perhaps the NV43 could have been built in that timeframe, but it could not have been clocked as high with acceptable yields, because 0.13u has matured a lot in the meantime. Had they released such a part in 2002-2003, they might have had to clock it around 350-400 MHz or even lower (guesstimate) to get acceptable yields with 8 pipelines.
Now, try clocking the NV43 around 325-375 MHz. That makes it far less superior to even the NV30 and R300, except in shader-limited scenarios, where it will most obviously win. Also, the 60M number is kind of a myth to justify the NV40's higher transistor count; it's just 220-160M, NV40-R420.

Uttar
 
Geeforcer's Tom's Hardware AGP shootout does give a good example of the limitation of a 128-bit bus:

http://graphics.tomshardware.com/graphic/20050705/vga-charts-pcie-04.html

Look at how the 6600GT, and more specifically the 6600, fall off as the resolution goes up. In most resolutions you are OK though.

My own experience with the 6600 is that the GT is not handicapped by its 128-bit bus, but the 6600 and 6200 (with unlocked pipes) NV43s with standard 3.6ns TSOP memory are badly bandwidth limited ONCE YOU START OVERCLOCKING the core. Unfortunately (or fortunately :) ) the NV43 overclocks very well, with most plain 6600s being able to hit 500 MHz on the core. I can tell you from experience that unless you have a very complex DX9 scene, an NV43 at 500/1000 will beat an NV43 at 700/750.

Because the plain 6600 can use faster memory, though, this is a limitation of the memory speed and not of the bus as such: there is memory out there that will remove this bottleneck. It is just up to the card manufacturers to put it on the cards, so that the memory is a better match for people who use the core to its full potential.
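
A rough way to see why the 500/1000 configuration can come out ahead is to look at how many bytes of memory traffic are available per core clock. This sketch assumes the second number in each pair is the effective memory data rate in MHz on a 128-bit bus; that reading of the notation is an assumption, and actual game performance depends on much more than this ratio:

```python
BUS_WIDTH_BITS = 128  # assumed NV43 memory bus width


def bytes_per_core_clock(core_mhz, mem_effective_mhz, bus_width_bits=BUS_WIDTH_BITS):
    """Peak bytes of memory bandwidth available per core clock cycle."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mem_effective_mhz / core_mhz


for core, mem in [(500, 1000), (700, 750)]:
    ratio = bytes_per_core_clock(core, mem)
    print(f"{core}/{mem}: {ratio:.1f} bytes per core clock")
```

The faster core in the 700/750 case has only about half the memory traffic per clock to work with, which fits the observation that it is starved outside of shader-heavy scenes.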

I think I tend to agree with Bob's (and Kirk's) general point that the 128-bit bus was not a limitation for either the FX range or the 6 series. The problem with the 5800/U was that the DDR2 memory was expensive and ran very, very hot because it used 2.5V. The memory was far hotter than the core when you touched the heatsink. When you could easily get 2.5, 2.2 and 2.0ns DDR1 memory that was cheaper and a lot cooler, and you could have a 256-bit bus as well, it made sense for them to jump "back" to this way of doing it. They did this not because 128-bit was a dead-end limitation.

And just jumping back on topic slightly: when Kirk mentions that ATI would like the problem for R520 to be just process, what does he mean? Or rather, what "non-process" problem is he referring to? Is it the SM3 or the new memory bus? Does anyone know?
 
Well, looks like I'm a little bit late to the party.

Anyway, it does make sense that forcing AA on through the driver just won't make sense with HDR rendering.

But it really didn't sound like he was talking about that. It sounded like a total copout on the lack of multisampling support for FP render targets.
 
Ummmm, unless I'm mistaken, a 9700/9800 was designed for games being played 3 years ago, with a projected view of the type of workloads being presented in the future; NV43 (NV4x) was designed for games being played last year, with further projections. The software has moved on in that time period, hence the demands the software places on the architectures have altered. Comparing something as narrow as whether a 128-bit bus or a 256-bit bus is better, on architectures with a generation or two's difference in targets, seems to be an exercise in futility.
 