MfA said:
You guys make it sound like Microsoft settled on 24 bits before any line of HDL was written, I have a hard time believing that ... want to mention some timeframes when ATI "and" Microsoft settled on 24 bit?

ATI had to settle on 24-bit FP about two years before the release of the R300. Microsoft may or may not have had any part in that decision.
nyt said:
Anyone here thinks (hence excluding those that know and won't answer anyway) that ATi will stick with FP24 for R4xx to kill NV4x in speed? It sounds logical to me, unless they didn't find any other use than FP32 for the die space 0.13 buys!
Dave H said:
b) ATI would have been able to accommodate a decision in favor of FP32 without any hit to R300's release schedule or performance (but with the obvious die size cost)

I find that very hard to believe. ATI would have had to be designing both cores simultaneously, and even then there would be significant complications. In particular, ATI was already pushing the limits of the .15 micron process with the R300 core. Any reasonable increase in die size would have been a major setback.
kyleb said:
but most of the r300 core does support 128bit color eh? so then it is just the shader units which would need to be modified to handle higher precision.

No. The vertex units obviously have to support 32-bit FP, but other than that, only the FP texture input and output support 32-bit.
Pete said:
Oh, come on now, Dave. Game pics rendered using ATi's ASCII shader don't count as part of the article text.
Chalnoth said:
No. The vertex units obviously have to support 32-bit FP, but other than that, only the FP texture input and output support 32-bit.

Right, everything but the fragment shader registers and functional units.
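For reference, the precision gap being argued over comes down mostly to mantissa width. A minimal sketch, assuming the usual bit layouts (FP16 as s10e5, FP24 as s16e7 per the DX9 partial/full precision split, FP32 as IEEE s23e8); none of these figures come from this thread:

```python
# Rough comparison of the shader float formats under discussion, assuming
# the usual bit layouts (an assumption, not quoted from any vendor document):
#   FP16: 1 sign, 5 exponent, 10 mantissa bits
#   FP24: 1 sign, 7 exponent, 16 mantissa bits
#   FP32: 1 sign, 8 exponent, 23 mantissa bits
formats = {"FP16": 10, "FP24": 16, "FP32": 23}

for name, mantissa_bits in formats.items():
    eps = 2.0 ** -mantissa_bits           # gap between 1.0 and the next representable value
    digits = mantissa_bits * 0.30103      # roughly log10(2) decimal digits per mantissa bit
    print(f"{name}: eps = {eps:.1e}, ~{digits:.1f} decimal digits")
```

That works out to roughly 3, 5, and 7 decimal digits of mantissa precision, respectively.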
Chalnoth said:
The main problem with switching to FP32 would be that with the pixel shader units increased in size by some amount, everything would have to be shifted around.

Placement and routing is unlikely to be a particularly lengthy process for modern GPU development; in fact, I'd be surprised if it wasn't almost entirely automated. It's an interesting problem in bleeding-edge CPU design, where large pieces of logic have custom-designed circuits that play tricky games to eke out the last bit of clockability. But not in GPUs, which are built (as I understand it) almost entirely with standard cells, run at a significantly lower clock speed (almost an order of magnitude), and have a tremendous number of clock stages, thus significantly reducing the difficulties due to clock skew and making large-scale changes in layout much easier to incorporate without affecting the rest of the design.
Chalnoth said:
In particular, ATI was already pushing the limits of the .15 micron process with the R300 core. Any reasonable increase in die size would have been a major setback.

There are no such "limits" to "push". Rather, the problem is just that, as die size increases, yield per wafer decreases super-linearly, because each die takes up a larger fraction of the wafer, plus each die is more likely to contain a fatal defect due to its larger size. That is, a higher transistor count for R300 would have meant a slightly higher than linear increase in cost per good die. Whoop-dee-doo. This would eat into ATI's margins a bit, certainly, but it would in no way have made R300 impossible as a .15u part.
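To put rough numbers on that, here is a back-of-envelope sketch using a simple Poisson defect model; the wafer size, defect density, and die areas are made-up illustrative figures, not ATI's or TSMC's actual data:

```python
import math

# Back-of-envelope yield sketch for the die-size argument above.
# All numbers are illustrative assumptions, not real fab data.
wafer_diameter_mm = 200.0
defect_density = 0.005                                   # assumed fatal defects per mm^2
wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2

def good_dies_per_wafer(die_area_mm2):
    candidates = wafer_area / die_area_mm2                       # ignores edge losses
    yield_fraction = math.exp(-defect_density * die_area_mm2)    # simple Poisson defect model
    return candidates * yield_fraction

for area in (215.0, 260.0):   # e.g. an FP24-sized die vs. a hypothetically larger FP32 one
    print(f"{area:.0f} mm^2 die: ~{good_dies_per_wafer(area):.0f} good dies per wafer")
```

Cost per good die rises somewhat faster than die area in this model, which matches the "eats into margins a bit" reading rather than a hard process wall.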
Chalnoth said:
I find that very hard to believe.

Well, I was surprised to hear it too, but considering sireric was part of the team that actually designed the damn thing, he presumably knows what he's talking about, wouldn't you think?
Dave H said:
Right, everything but the fragment shader registers and functional units.

No.
Dave H said:
Placement and routing is unlikely to be a particularly lengthy process for modern GPU development; in fact, I'd be surprised if it wasn't almost entirely automated.

Of course it's mostly automated. But we've also heard that the R300 core was "hand tweaked," and you have no idea how much processing power it actually takes to figure out the optimal path. A huge number of variables need to be taken into consideration. With more than a hundred million transistors, the R300 is no simple electronic circuit.
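As a toy illustration of why that optimization is compute-heavy even when fully automated, here is a sketch of a swap-based simulated-annealing placer minimizing total wire length. It is not ATI's flow, just a minimal stand-in; a real tool also has to juggle timing, congestion, and clock distribution across millions of cells rather than a few dozen and one objective:

```python
import math, random

# Toy automated placer: cells sit in fixed random slots and we swap pairs
# under simulated annealing to shrink total Manhattan wire length.
# Purely illustrative of why the search is expensive, nothing more.
random.seed(0)
num_cells = 64
cells = list(range(num_cells))
nets = [random.sample(cells, 2) for _ in range(200)]           # random two-pin nets
slot = {c: (random.random(), random.random()) for c in cells}  # initial placement

def total_wirelength():
    return sum(abs(slot[a][0] - slot[b][0]) + abs(slot[a][1] - slot[b][1])
               for a, b in nets)

temperature = 1.0
cost = total_wirelength()
for _ in range(20000):
    a, b = random.sample(cells, 2)
    slot[a], slot[b] = slot[b], slot[a]        # propose swapping two cells
    new_cost = total_wirelength()
    if new_cost < cost or random.random() < math.exp((cost - new_cost) / temperature):
        cost = new_cost                        # accept (occasionally uphill, early on)
    else:
        slot[a], slot[b] = slot[b], slot[a]    # reject: undo the swap
    temperature *= 0.9997                      # cool slowly

print(f"final total wire length: {cost:.1f}")
```

Even this toy spends nearly all its time re-evaluating the objective after each candidate move; scale the cell count up by five or six orders of magnitude and the "huge number of variables" point is clear.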
Chalnoth said:
While I am not involved in the design of modern processors, I have taken enough physics to understand...

Dave H said:
considering sireric was part of the team that actually designed the damn thing he presumably knows what he's talking about
SvP said:
Dave H said:
considering sireric was part of the team that actually designed the damn thing he presumably knows what he's talking about

And Sireric wasn't the one that just stated that the R300 could have been switched to FP32 with no delay. That was Dave H.
sireric said:
Dave H said:
Or, since you might not be able to speak to Nvidia's design process: was R3x0 already an FP24 design at the point MS made the decision? If they'd gone another way--requiring FP32 as the default precision, say--do you think it would have caused a significant hit to R3x0's release schedule? Or if they'd done something like included a fully fledged int datatype, would it have been worth ATI's while to redesign to incorporate it?

No, it wasn't. I don't think FP32 would have made things much harder, but it would have cost us more in terms of die cost.
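A rough way to see where that die cost comes from: if a mantissa multiplier's area grows roughly with the square of the mantissa width (a common array-multiplier simplification, assumed here rather than taken from ATI), the arithmetic looks like this:

```python
# Rough (assumed) scaling for the die-cost point: treat mantissa multiplier
# area as growing with the square of mantissa width, array-multiplier style.
# FP24 has 16 stored + 1 implicit mantissa bit, FP32 has 23 + 1.
fp24_bits = 17
fp32_bits = 24
ratio = (fp32_bits ** 2) / (fp24_bits ** 2)
print(f"FP32 mantissa multiplier is roughly {ratio:.1f}x the area of an FP24 one")
# ~2x per multiplier, but multipliers are only a slice of a shader pipe and the
# shader pipes only a slice of the chip, so the whole-die growth is much smaller.
```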
Chalnoth said:
In particular, I still think that the R300 itself was already stretching the .15 micron process, so any increase in die size would have made it more challenging to produce the processor in terms of yields and clockspeeds.

No reason why it should have impacted clockspeeds, assuming extra clock stages were added to keep the slightly longer FP32 execution unit path out of the critical path. As I said, it would impact yields modestly (particularly in terms of good dies per wafer), but it's not as if there's some die size "limit" to the .15u process above which yields suddenly fall off a cliff.

Dave H said:
No reason why it should have impacted clockspeeds,

1. Adding more stages would add even more transistors.

Dave H said:
Chalnoth said:
In particular, I still think that the R300 itself was already stretching the .15 micron process, so any increase in die size would have made it more challenging to produce the processor in terms of yields and clockspeeds.

But that is one of the ways yields go down. It's just probability...
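To make the clock-stage exchange concrete, a small sketch with made-up delay figures (none of them are R300 numbers): splitting a slightly longer datapath across one more pipeline stage keeps the per-stage delay, and hence the clock, roughly where it was, at the price of extra pipeline registers, which is exactly the transistor-count objection above.

```python
# Sketch of the clock-stage argument, using made-up delay figures.
register_overhead_ns = 0.5                 # assumed flip-flop setup + clock-to-Q overhead

def fmax_mhz(logic_delay_ns, stages):
    per_stage_ns = logic_delay_ns / stages + register_overhead_ns
    return 1000.0 / per_stage_ns

print(f"shorter (FP24-like) path, 4 stages: {fmax_mhz(10.0, 4):.0f} MHz")
print(f"longer  (FP32-like) path, 4 stages: {fmax_mhz(12.0, 4):.0f} MHz")  # same stages -> lower clock
print(f"longer  (FP32-like) path, 5 stages: {fmax_mhz(12.0, 5):.0f} MHz")  # extra stage restores the clock
```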