NV35 correction

CMKRNL

Newcomer
I had stated earlier that the programmable tessellation unit was going to be in NV35. It's not clear to me anymore that this is the case. Apparently this will be in a 4Q'03 part, which leads me to believe that it will be NV40, not NV35. This part will also contain a completely revamped unified shading model. This means that both vertex and pixel shaders will share the exact same ISA and constructs. In other words, pixel shaders will also have access to constant based/dynamic branching. What's most interesting is that nVidia is not the only company doing these things in that timeframe.

There was some talk in one of the threads about how peak gigaflops were calculated. I think someone already mentioned this, but they base it on MAD which is counted as two flops throughput per cycle per unit. Latency is higher, but it's essentially masked by the pipeline unless there is a dependency.
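To make the throughput-versus-latency point concrete, here is a minimal Python sketch of a fully pipelined MAD unit that can start one instruction per cycle. The 4-cycle latency and all function names are hypothetical, chosen only for illustration; they are not figures for any nVidia part.

```python
FLOPS_PER_MAD = 2  # one multiply plus one add, as described above


def cycles_independent(n_ops, latency):
    """Cycles to finish n_ops independent MADs when one can issue every cycle.

    The first result appears after `latency` cycles; each further result
    follows one cycle later, so the latency is paid only once.
    """
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1)


def cycles_dependent(n_ops, latency):
    """Cycles for a chain where each MAD needs the previous result.

    Nothing can overlap, so the full latency is paid for every operation.
    """
    return n_ops * latency


def flops_per_cycle(n_ops, cycles):
    """Average floating point operations completed per cycle."""
    return FLOPS_PER_MAD * n_ops / cycles


if __name__ == "__main__":
    latency = 4  # hypothetical pipeline depth, for illustration only
    n = 100
    # Independent work approaches the peak of 2 flops per cycle.
    print(flops_per_cycle(n, cycles_independent(n, latency)))
    # A fully dependent chain drops to 2 / latency flops per cycle.
    print(flops_per_cycle(n, cycles_dependent(n, latency)))
```

With enough independent work the unit gets close to two flops per cycle, which is why the peak figure ignores latency.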

I don't have any further info on NV30 other than that it's back and "operational". I don't know if operational implies demo-able at comdex or not.
 
CMKRNL said:
I had stated earlier that the programmable tessellation unit was going to be in NV35. It's not clear to me anymore that this is the case. Apparently this will be in a 4Q'03 part, which leads me to believe that it will be NV40, not NV35.

Based on nVidia's product launch schedules over the past three years, I would expect the NV35 to be a 4Q '03 part.
 
Can someone explain what an fmad instruction is or does? I thought that each unit executed two flops per cycle because the pipeline could do 2 half-floats per cycle.
 
I don't see NV30 getting out in Q1 '03 (as all indications point to), and then both NV35 and NV40 getting out by Q4 '03.

However, since NV30 was obviously originally targeted for Q3/4 '02, it is possible that NV35 was originally scheduled for spring '03, and NV40 for the fall.

As I believe Bigus is suggesting, though, recent history does not support nVidia being able to ship a first-gen product (NV20), followed by a "tweaked" version 6 months later (NV25), followed by a new core 6 months after that (NV30), which is what is being suggested here. It's been more like: first-gen product (NV20) ... speed bin of the same product 6 months later (NV20 Ti) ... tweaked core 6 months later (NV25) ... tweak of the tweaked core ;) 6 months after that (NV25 AGP 8x).

So even if nVidia has been planning on NV35 in the spring and NV40 in the fall, I have strong doubts that they will be able to execute that plan. I expect that we'll have either the NV35 in fall '03, or possibly an NV35 "refresh" at that time. (Speed-binned NV35 Ti, or NV35 with 3GIO?)

On the other hand, it is possible that if the NV40 development remains on track for fall '03, nVidia may opt to do something fairly radical...like drop the NV35 altogether. Let the NV30 and possibly an NV30 "ti" hold the fort for the majority of '03. A lot of that will depend on how well ATI executes their own planned road map.
 
Luminescent said:
Can someone explain what an fmad instruction is or does? I thought that each unit executed two flops per cycle because the pipeline could do 2 half-floats per cycle.

I suppose fmad = mad?
It evaluates an expression like a0 * a1 + a2, Multiply ADd.
 
FMAD is floating point multiply add. As the name implies it is a multiply and an add (or an accumulate), hence two operations per instruction.

FMAD: a=b+c*d
FMAC: a=a+b*c
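As a minimal sketch of the two forms above, here they are in Python applied component-wise to a 4-vector. The function names and the tuple representation are assumptions made for illustration, not any real shader API.

```python
def fmad(b, c, d):
    """Multiply-add: returns b + c * d for each component."""
    return tuple(bi + ci * di for bi, ci, di in zip(b, c, d))


def fmac(a, b, c):
    """Multiply-accumulate: returns a + b * c, where the destination is also the addend."""
    return fmad(a, b, c)


def flops_for(vector):
    """One multiply and one add per component."""
    return 2 * len(vector)


if __name__ == "__main__":
    b = (1, 2, 3, 4)
    c = (2, 2, 2, 2)
    d = (3, 3, 3, 3)
    print(fmad(b, c, d))   # (7, 8, 9, 10)
    print(flops_for(b))    # 8
```

The only difference between the two is which register supplies the addend; the operation count is the same.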

A pixel shader instruction works on a 4-tuple (or 4D vector), thus equalling 8 operations per instruction. At 400 MHz and with 8 pipes you get 8 * 400 * 8 = 25.6 GFLOPS. This indicates that NV30 can issue two ALU shader instructions per pipe per cycle. I wonder if NV30 uses a VLIW (well, LIW anyway) scheme like R9700.
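The arithmetic above can be sketched as a small Python function. The function and parameter names are made up for illustration; the 400 MHz clock and 8 pipes are the figures used in this post, and the two-instructions-per-cycle case corresponds to the roughly 51 GFLOPS figure mentioned elsewhere in this thread.

```python
def peak_gflops(clock_mhz, pipes, instr_per_pipe_per_cycle,
                components=4, flops_per_component=2):
    """Peak GFLOPS assuming every pipe issues vector MADs every cycle.

    components=4 models a 4D vector instruction; flops_per_component=2
    counts the multiply and the add of a MAD separately.
    """
    flops_per_cycle = (pipes * instr_per_pipe_per_cycle
                       * components * flops_per_component)
    # MHz * flops per cycle gives MFLOPS; divide by 1000 for GFLOPS.
    return clock_mhz * flops_per_cycle / 1000


if __name__ == "__main__":
    print(peak_gflops(400, 8, 1))  # 25.6
    print(peak_gflops(400, 8, 2))  # 51.2
```

This is only a peak number: it assumes no stalls, no dependencies, and that every instruction is a full 4-component MAD.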

Cheers
Gubbi
 
I'm guessing the two ALU instructions per cycle which the NV30 is capable of issuing and executing are half-floats? With full floats the pipeline executes 25.6 GFLOPS, and with half-floats (dual-issued ALU instructions) it executes 51 GFLOPS.

What about the TMUs and the texture address unit? Don't those execute some sort of floating point operations (although the NV30 and the R300 cannot filter floating point textures with the TMUs)? Two ALU instructions indicate two pixel program fmads (vector); what about the texture addressing fmov instructions, etc.? Are those not counted as floating point operations?

By the way, do you guys believe the pixel program processor has a vector and scalar unit in parallel (such as the vertex shader in the R300) or just a 4D vector unit?
 
CMKRNL said:
I don't have any further info on NV30 other than that it's back and "operational". I don't know if operational implies demo-able at comdex or not.
That's information! (hmm Mufu ;))

Hope to see something running soon
 
CMKRNL said:
This part will also contain a completely revamped unified shading model. This means that both vertex and pixel shaders will share the exact same ISA and constructs. In other words, pixel shaders will also have access to constant based/dynamic branching. What's most interesting is that nVidia is not the only company doing these things in that timeframe.

Hmmm, good info. Isn't this what is going to be key in DX10: vertex and pixel shaders being more or less seamlessly integrated from a programmability point of view? I cannot remember where I picked this up, though...
 
Evildeus said:
CMKRNL said:
I don't have any further info on NV30 other than that it's back and "operational". I don't know if operational implies demo-able at comdex or not.
That's information! (hmm Mufu ;))

Eh?

I always just "guess" stuff. I thought you knew that... ;)

MuFu
 
My next guess is that I will get spam-filtered out of a certain individual's inbox due to me constantly pestering the poor guy for info. :D

MuFu.
 
So basically, be it NV35 or NV40, this part would have real dynamic branching in PS...
That doesn't really fit with all the DX10 talk I've been hearing lately, since DX10 is more ambitious than that.

It fits better with the PS/VS 3.0 standards, where static and dynamic branching are already included.

Still, rather interesting info, considering that real dynamic branching in PS requires a substantial number of additional transistors (much more so than in VS).

Even more interesting is that NVIDIA is not the only one working on this, which is good to say the least...
 
Really interesting!

Well, if Nvidia is as quick to introduce new GPUs and refreshes as they were in 1997-1999, starting with NV30 coming out in 4Q 2002, then you would expect a full refresh (NV35) by spring 2003, then a new core (NV40) by the fall of 2003, and so forth.

However, it is true that Nvidia has dramatically slowed down the introduction of new cores and even full refreshes, bringing out tweaks and/or speed bins instead. The GeForce3 Ti 500 was an example: it's pretty much the exact same core as the GeForce3, with a small bump in speed. NV25/GeForce4 Ti most definitely should have been out in fall 2001, since it is similar to the Xbox GPU.

I suppose it all really depends on Nvidia getting back on track, also on their desire to push the market again as they did in 1997-1999 when they were HUNGRY. And finally, on the competition from ATI.

I am hoping that ATI will not wait a full year to have a new product. I hope ATI has R350 by spring with DDR-II, 2 TMUs per pipe and other enhancements over R300/Radeon 9700. Perhaps even better shaders. And that it surpasses NV30 in all or most areas, enough to force Nvidia to bring out NV35 in spring (NV being aggressive) or at the latest by fall (NV barely getting back on track).

I'm sure ATI will have R400 by fall 2003. But I doubt Nvidia will have NV40 by fall, even though they should. So that means NV35 will almost have to be a 2003 launch, be it in the spring (some would say that's unlikely) or in the fall. I know there are a lot of people who will disagree with me, saying that this timetable is much too fast. That might be, but I don't think it's out of the question.
 
And BTW, it is this timetable, going forward over the next year or two, that will almost assuredly determine what eventually goes into Xbox 2 by 2005-2006, be it NV40, NV45, NV50, NV55, or ATI R500~R700.
 
While we are all speculating, here's my map for the rest of the year '03. Save this for posterity. ;)

1) NV30 architecture launch at Comdex. (Big stretch there, I know!). Extremely short supply of one NV30 variant (highest performance one) available in December. Quantity starts to ship in Feb '03. This variant of NV30 is demonstrated to beat 9700 in "certain circumstances" and debate ensues as to which one is the "best product". (I know...another big stretch! ;))

2) Other NV30 variants (different speed grades to compete more with the 9700 non pro and 9500 Pro) start to ramp up a month or two later (March/April). About the same time, NV31 (NV30 MX?) starts to ship to compete with the 9000.

3) At the same time NV30 boards start to ship in quantity (February), ATI "launches" the successor to Radeon 9700 Pro. (Starts shipping in March / April.) I really don't see this as anything but a speed-binned 9700 (400+ MHz, though still 0.15 micron, 8x1 architecture) paired with "just fast enough" DDR-II RAM to eliminate the question over which is "faster" in the majority of situations. People might call it the "R350", but IMO, it will turn out to be a faster R300 paired with some form of DDR-II RAM.

4) Also by March of '03, ATI will replace the 9000 with an AGP 8X variant of the 9000...just the current 9000, perhaps a modest speed bump, plus AGP 8X support.

At this point, ATI will have a slight advantage at the high end, and I predict nVidia will have a slight advantage at the lower end, in terms of price / performance.

5) All is quiet until "Fall"

6) In the fall, ATI launches the R-400: 0.13u, DX9 with PS/VS 3.0 support, 175 million transistors, G-DDR III RAM. Possibly 3GIO support. (Is that supposed to be available by then?)

7) nVidia launches a very similar product (NV35) ...and finally adds the 256 bit bus to their architecture. These parts will be even more similar in performance / features than the R-300 and NV30.

8) ATI also introduces 0.13u variants of the R-300 (now the RV-350), and possibly RV-250: cheaper and faster versions of the 9000 and 9500 parts. This will bring a slight price / performance edge back to ATI in the value and mainstream segments.

There you go. Let's just hope that ATI and nVidia both execute this or a similar roadmap!
 
Excellent speculative roadmap, Joe. I like it. It will probably be something along those lines, although I'd prefer a more aggressive roadmap from ATI and Nvidia.

I still think that ATI will have a true R350 (on .15u) with an 8:2 configuration. :)
 
I don't see why ATI would try to put out a new, more complex core in .15 microns. They're already bumping up against the maximum power requirements that a consumer part (even one aimed at enthusiasts) can demand. A new core would be a huge cost for a tiny part of the market, done purely for mind-share reasons; a speed-binned R300 core would be a much smaller cost to meet nearly the same marketing goals.

It would be cool if, instead, they could get the .13 micron chip out earlier next year.
 
It would be cool if, instead, they could get the .13 micron chip out earlier next year.

I agree, that would be great... I just don't expect it to happen. ATI tends to give their cores 9 to 12 months on the market before "replacing" them. They'll have their entire line-up of new cores in full production, all on 0.15u, by December. I don't see much more than new memory speeds and/or configurations for these products for 9-12 months. Then expect ATI to more or less completely phase out the 0.15u parts, with a brand new line-up phasing in between 3Q and 4Q '03.

It's a much different approach than what nVidia tends to do (more or less replacing ONE market segment every 6 months or so). nVidia's one exception to that has been the simultaneous GeForce4 Ti / GeForce4 MX launch.
 