NV-30 to have...

DaveBaumann said:
alexsok said:
Yeah, I just noticed that...

Oh well, 1024 is unnecessary, Nvidia knows better! :D

Ugh, god! :rolleyes:

I mean look, you have 256 instructions plus what you can do with loops & branches, which is no real difference from the 1024 instructions it could have had.

And like I said, the architecture is completely new, so Nvidia knows better :D
 
That's the way I see it:

In a vertex shader you don't need many instructions, as you'll do most of the work in the pixel shader in the next gen (remember, you can now do the WHOLE lighting math in the pixel shader; no need to pass anything from the vertex shader)

In the pixel shader you need a lot, for procedural textures and the like; that takes many instructions. And since you don't have loops in there (it's hard to make that _THAT_ fast while plotting tons of pixels), you need to unroll them manually.
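The manual unrolling described above can be illustrated with a toy sketch (Python standing in for shader code; the "noise tap" expression is a made-up stand-in, not real shader math):

```python
# Toy illustration of manual loop unrolling. Early pixel shaders had no
# loop instructions, so a repeated operation had to be written out once
# per iteration, spending a static instruction slot each time.

def octaves_looped(x, octaves=4):
    # With hardware loop support: one loop body, N iterations.
    total = 0.0
    freq = 1.0
    for _ in range(octaves):
        total += ((x * freq) % 1.0) / freq  # stand-in for one "noise tap"
        freq *= 2.0
    return total

def octaves_unrolled(x):
    # Without loops: the same four taps written out explicitly,
    # consuming roughly four times the instruction slots.
    total = ((x * 1.0) % 1.0) / 1.0
    total += ((x * 2.0) % 1.0) / 2.0
    total += ((x * 4.0) % 1.0) / 4.0
    total += ((x * 8.0) % 1.0) / 8.0
    return total
```

Both versions compute the same value; the unrolled one just burns more of a scarce static instruction budget, which is why a large instruction count matters more in the pixel shader.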

Vertex shaders get rather useless for SHADING, but for transforming and such they get more and more useful. For that they don't need many instructions, just good looping/branching support.

Go to ATI's developer page to see what you can do with simple PS 2.0, and that's with 64 to 160 pixel shader instructions... that'll be fun with 1024.
 
Ascended Saiyan said:
A 256-bit memory interface & DDR-II. I'm not sure anyone has reported this yet, but NVmax is reporting this & a few other tidbits.
A 256-bit bus != a 256-bit interface.

A 256-bit bus = a 128-bit DDR interface.

Edit: unless, of course, NVmax didn't care about DDR. ;)

ta,
-.rb
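A minimal sketch of the arithmetic behind that distinction (DDR transfers data on both clock edges, so a 128-bit physical interface moves 256 bits per clock):

```python
# Effective bits moved per clock cycle for a memory interface.
# DDR transfers on both clock edges, i.e. 2 transfers per clock.

def bits_per_clock(interface_width_bits, transfers_per_clock=2):
    return interface_width_bits * transfers_per_clock

# A 128-bit DDR interface moves 256 bits/clock, which is presumably
# where a "256-bit bus" headline figure could come from; a true
# 256-bit DDR interface would move 512 bits/clock.
```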
 
I just have to say that given that the vertex shaders have always been more capable than the pixel shaders, it doesn't make much sense that the vertex shaders would suddenly become less capable after the release of this new hardware. Not that it's impossible, just sounds strange. And the white paper still says 1024 instructions, 256 constants in the VS.
 
Chalnoth said:
I just have to say that given that the vertex shaders have always been more capable than the pixel shaders, it doesn't make much sense that the vertex shaders would suddenly become less capable after the release of this new hardware. Not that it's impossible, just sounds strange. And the white paper still says 1024 instructions, 256 constants in the VS.

Wait a second, I don't remember: does the paper say 1024 static instructions and 256 constants, or 1024 static instructions and 1024 constants? Either way, the 1024 figure for static instructions is wrong; that's been confirmed by NVIDIA:

Because of looping, a vertex program can execute as many as 64k instructions from a pool of 256 static instructions. There's an error in that document - sorry. I've notified the author, so it should be corrected soon.
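The arithmetic behind that 64K figure works out if each pass over the static program can repeat 256 times (that iteration count is inferred here from the two totals, not stated in the quote):

```python
# Executed-instruction budget from looping over a static program.
STATIC_INSTRUCTIONS = 256   # size of the static instruction pool
LOOP_ITERATIONS = 256       # assumed maximum; inferred from 64K / 256

executed = STATIC_INSTRUCTIONS * LOOP_ITERATIONS  # 65536, i.e. 64K
```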

And it was confirmed to me by the NVNEWS guy:
You are right, and I have changed the article - cheers, NVIDIA had an error
on the document.

Andy

But as I said previously:
In a vertex shader you don't need many instructions, as you'll do most of the work in the pixel shader in the next gen (remember, you can now do the WHOLE lighting math in the pixel shader; no need to pass anything from the vertex shader)

In the pixel shader you need a lot, for procedural textures and the like; that takes many instructions. And since you don't have loops in there (it's hard to make that _THAT_ fast while plotting tons of pixels), you need to unroll them manually.

Vertex shaders get rather useless for SHADING, but for transforming and such they get more and more useful. For that they don't need many instructions, just good looping/branching support.

Go to ATI's developer page to see what you can do with simple PS 2.0, and that's with 64 to 160 pixel shader instructions... that'll be fun with 1024.

With that said, vertex shaders don't require many instructions (only good loop & branching support), but pixel shaders do require many, and considering that NV30 even has some form of flow control in the PS, that should make things even better.
 
Basic,

That's rather disappointing if true. You'd think that for a chip partially aimed at accelerating off-line rendering, multiple full-precision outputs per pixel would be useful. Then again, maybe the idea is that 1024-instruction pixel shaders make this unnecessary?

Regards,
Serge
 
I can't say that I've tried to write any really large shaders, but I think you can fit quite a lot into 256 instructions, especially if you have loops.

Btw, it's likely a 64-bit instruction set, so 1024 instructions would be 8 KB of data that should sit in fast internal memory. And because of the dynamic program flow, they might need to implement the vertex shaders as separate processors, so you might need to multiply that size by the number of vertex shader units. Maybe not a big deal considering how many transistors there are in that chip. But if they've looked at it and found that there's little use for it, then it's too many transistors to just waste.
And if there are a few extreme-case vertex shaders that need more instructions, you could always make a software fallback for them. (That's much harder to do for pixel shaders.)
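Basic's storage estimate can be written out as follows; the 64-bit encoding and the per-unit replication are his assumptions, carried through here:

```python
# Back-of-envelope instruction-memory sizing for the vertex shaders.

def instruction_memory_bytes(num_instructions, instruction_bits=64):
    # Assumed 64-bit encoding, as suggested in the post above.
    return num_instructions * instruction_bits // 8

# 1024 instructions -> 8192 bytes (8 KB) per vertex shader unit;
# 256 instructions  -> 2048 bytes (2 KB). If each of N independent
# units holds its own copy, multiply by N.
```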

I understand DaveBaumann's comment above though.
 
Not sure if these additional factoids have been posted, but it's all public. Since there's been no tapeout, it's obviously the target and not yet achieved :)

- 120M transistors
- 450 MHz clock speed
- power 15W
- 1.5 years in development (same as GF3 vs. 9 months for GF4)
 
DDRII is a go. ;)

oh and 1.5 years is way off. At its heart, it's quite old.
450 MHz is a best-case scenario.
TSMC $ux0rz...
has everyone forgotten about UMC and IBM?
 
Sage said:
DDRII is a go. ;)

oh and 1.5 years is way off. At its heart, it's quite old.
450 MHz is a best-case scenario.
TSMC $ux0rz...
has everyone forgotten about UMC and IBM?

IBM would be really big news. Weren't they one of the first people to do copper interconnects reliably?
 
Yes, IBM was the first to have reliable copper interconnects and, in my opinion, is probably much better equipped to handle the NV30. Hmm... I seem to remember an interview in which Jen-Hsun Huang (NVIDIA's CEO) stated that other foundries are always being considered?
 
Sage said:
DDRII is a go. ;)

oh and 1.5 years is way off. At its heart, it's quite old.
450 MHz is a best-case scenario.
TSMC $ux0rz...
has everyone forgotten about UMC and IBM?

:p I like how you summarize your opinion of TSMC in a two-word sentence! I want to challenge that...

From http://www.eetimes.com/story/OEG20011129S0040,
NVidia reportedly manufactured some NV2A (Xbox GPU) engineering samples at UMC's 0.15 µm fab, so perhaps NVidia already has future plans with UMC. (At this point, I think TSMC has more 0.13 µm capacity than UMC, but UMC is still a worthy alternative.)

IMHO, IBM is a long shot. Although their foundry technology is among the most advanced, here are some practical reasons which make IBM less attractive.

1) IBM's chip-design environment is *very* different from TSMC/UMC's. (TSMC and UMC both export their foundry characteristics to library vendors, who encapsulate them into 'tool views' that can be loaded into 3rd-party EDA packages like Avanti Apollo, Cadence SE/PKS, Synopsys Design Compiler, etc.) In order to re-target the NV30 for IBM's foundry, NVidia's engineers would basically have to re-learn a bunch of IBM's proprietary in-house tools, and/or trust IBM's design-services group to perform a large portion of the back-end layout work. (See http://www-3.ibm.com/chips/products/asics/methodology/design_flow.html - notice how many rows are *exclusively* IBM-only tools?)

Note, this isn't to say that one design flow (customer-owned tooling vs. foundry in-house tooling) is *better* than the other. TSMC/UMC offer *no* in-house design services, so consequently, third-party CAD tools support their entire design process. Companies like AMD, Intel, and IBM use their foundries as a means to make their core products (CPUs, flash memory, etc.). Intel and IBM perform a tremendous amount of basic semiconductor research, and they leverage that investment by writing their own CAD tools/algorithms and deploying them in their production fabs. Because NVidia has been a TSMC customer for some time, they are likely more comfortable with a customer-owned-tooling (COT) design flow.

2) IBM's pricing is less competitive than TSMC/UMC's. IBM's merchant-foundry operation competes with IBM's own internal division customers (like their microprocessor group). IBM generates the highest per-wafer income by selling its *own* silicon products (PowerPC chips, etc.), not by selling manufacturing capacity to third parties. This is just another way of saying that IBM's core business is selling chip products, and its foundry business is a secondary priority. For that reason, IBM's pricing isn't likely to compete with pure-play foundries like TSMC and UMC. (On the other hand, IBM's superior process characteristics might justify the higher cost structure; for example, Xilinx has chosen IBM to manufacture its flagship Virtex-II Pro FPGA.)
Unfortunately, none of the foundries named in this post publicly advertises its pricing structure (you must sign an NDA), so I can't directly prove this.

I can think of 1 other reason, but I can't even justify it to myself, so I'll leave it out.

Here's another relevant article: http://www.eetimes.com/story/OEG20020624S0042. It tries to explain IBM's business direction (in terms of the merchant foundry market).
 