Uttar said:
Well, according to your logic, how much easier is sharing Texture Lookup units?
Well, considering that today I have pretty much alienated the entire known universe (though inadvertently, I say in my own defense), I'm not sure anymore that I even HAVE a sense of logic. But to answer your question - and while having no real experience with these matters - I have to say it seems slightly less difficult to me.
Texture lookups ARE more like a loose pair of scissors, I'd think. Having decoupled texture lookup hardware could work just fine as long as the PS and VS don't both need to do lookups at the same time. With enough shader work in flight, and possibly some FIFOs and thingamajiggies to buffer requests in case of the odd collision, it could be fairly unlikely that both PS and VS need a lookup in exactly the same cycle, and hence parallel execution should be possible... I suppose.
After all, it seems Nvidia intended to implement things this way; they must have had at least a fairly good reason for it, or they wouldn't have done it...
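Just to put some (entirely made-up) numbers behind that hand-waving, here's a little toy Python simulation I threw together: one shared lookup unit fed by two small request FIFOs, with the PS asking for texels far more often than the VS. The probabilities, FIFO depth and arbitration are all invented for illustration, nothing to do with any real chip - it just shows the idea that same-cycle collisions hardly ever turn into actual stalls once you buffer a little:

[code]
import random

# Toy model (entirely made up, no relation to real hardware): one shared
# texture lookup unit, fed by two small request FIFOs (PS and VS).
# Each cycle, each shader issues a lookup request with some probability;
# the shared unit services at most one request per cycle. A stall only
# happens when a FIFO is full and its shader wants to issue anyway.

FIFO_DEPTH = 4
PS_REQ_PROB = 0.30   # pixel shader asks for a texel 30% of cycles (assumed)
VS_REQ_PROB = 0.05   # vertex shader asks far less often (assumed)
CYCLES = 100_000

random.seed(1)
fifos = {"PS": [], "VS": []}
stalls = {"PS": 0, "VS": 0}
collisions = 0

for cycle in range(CYCLES):
    # Shaders try to enqueue lookup requests.
    wants = {"PS": random.random() < PS_REQ_PROB,
             "VS": random.random() < VS_REQ_PROB}
    if wants["PS"] and wants["VS"]:
        collisions += 1  # both want a lookup in the same cycle
    for unit, wanted in wants.items():
        if wanted:
            if len(fifos[unit]) < FIFO_DEPTH:
                fifos[unit].append(cycle)
            else:
                stalls[unit] += 1  # FIFO full: this shader has to wait

    # The single shared lookup unit services one request per cycle,
    # alternating priority so neither side starves.
    order = ("PS", "VS") if cycle % 2 == 0 else ("VS", "PS")
    for unit in order:
        if fifos[unit]:
            fifos[unit].pop(0)
            break

print(f"same-cycle collisions: {collisions} of {CYCLES}")
print(f"actual stalls: PS={stalls['PS']} VS={stalls['VS']}")
[/code]

When I run it, the collision counter sits at a few percent of cycles while the stall counters stay at or near zero - which is roughly the point I was trying to make about the FIFOs soaking up the odd collision.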
That said, a branching unit seems much more integral to the function of the processing core itself to me. A texture lookup thingy delivers a particular set of texels in return for a set of coordinates, right? It's like a delivery boy, if I understand things correctly: it could actually function independently as long as it's not asked to deliver to two customers at the same time!
However, branch hardware actually executes instructions, and hence has to be part of the execution stream, located AT or at least near the ALUs or whatever it is that drives the PS/VS processors, so that instructions can move from the previous execution stages on to the next stages in the pipeline... How do you decouple one part of the pipe and get it to run instructions from two entirely different execution streams, and what would be the real point of it?
I guess it would save some transistors if you wanted to build one really good branch predictor, but with the risk of one stream stalling the other and creating unpredictable pipeline bubbles - and considering that even the best predictors guess wrong at times, making the bubbles in the other pipe all for nothing - would it really be worth the loss of performance?
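Same disclaimer as before, but here's a second throwaway toy model of what I mean: two instruction streams sharing a single branch unit that handles one branch per cycle, where a mispredict ties the unit up for a flush penalty during which the other stream's branches just sit there. All the rates and the penalty are numbers I pulled out of thin air:

[code]
import random

# Another made-up toy model: a single shared branch unit serving two
# instruction streams (call them PS and VS). Branches go through the
# shared unit one at a time, and a mispredicted branch ties it up for a
# flush penalty, during which the *other* stream's branches also have to
# wait. All numbers below are invented for illustration only.

BRANCH_PROB = 0.15       # chance an instruction is a branch (assumed)
MISPREDICT_RATE = 0.10   # i.e. a 90%-accurate predictor (assumed)
FLUSH_PENALTY = 8        # cycles the unit is busy after a mispredict (assumed)
CYCLES = 100_000

random.seed(2)
busy_until = 0           # cycle at which the shared branch unit frees up
bubbles = {"PS": 0, "VS": 0}

for cycle in range(CYCLES):
    for stream in ("PS", "VS"):
        if random.random() < BRANCH_PROB:            # this stream hit a branch
            if cycle < busy_until:
                # Shared unit is still tied up (by the other stream, or by an
                # earlier flush): this stream eats a pipeline bubble.
                bubbles[stream] += 1
            elif random.random() < MISPREDICT_RATE:
                busy_until = cycle + FLUSH_PENALTY   # mispredict: long stall for everyone
            else:
                busy_until = cycle + 1               # correct: unit busy this cycle only

print(f"bubbles injected: PS={bubbles['PS']} VS={bubbles['VS']}")
print(f"that's {(bubbles['PS'] + bubbles['VS']) / (2 * CYCLES):.1%} "
      f"of all issue slots lost to the shared unit being busy")
[/code]

Even with a 90% "predictor" in this toy, a few percent of issue slots get eaten by bubbles, and they show up in both streams even though only one of them mispredicted at a time - which is exactly the kind of loss I'd worry about.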
I admit, I'm guessing as much as I'm speculating here. Just trying my best to make sense, probably not succeeding very well.
Actually, there's some cache required - but for experimentation purposes, they might do whatever they want: use memory and run it at slideshow speeds, put it in the texture cache, or whatever. You get the point.
Uh... No. Not really.
But then again, judging from what I've been told over the last 24 hours or so, I'm one dumb son of a bitch, so that wouldn't be strange.
*G*