I don't know about that. The GTX is less than half of what a full Fermi will be, and it's 30-35% off Cypress (5870) and within 5-10% of the 5850. Half a Fermi would have 256 SPs and GDDR5, probably on a 256-bit bus, with 24 ROPs.
Well it would have considerably less texture capacity when compared even to a Juniper.
It doesn't really ... except for trivial cases it will compile to 4 separate gathers and the assembler has to turn it back into a single instruction (if there are repeated fixed offsets the assembler can recognise that as easily as the compiler). Forcing a decompilation step in the assembler for the hardware for which the HLSL instruction was meant in the first place is absurd. I can come up with reasons to keep it out of the assembler/refrast level, but they aren't technical reasons.
Interesting. So it decomposes into multiple instructions, one per distinct offset. Makes sense I guess.
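To make the decomposition discussed above concrete, here is a minimal CPU-side sketch in Python. It is only an illustration: the function names, the clamp addressing and the component ordering of the quad are assumptions made for this example, not the actual compiler output or hardware behaviour.

```python
# Hypothetical emulation of a four-offset gather built from single-offset gathers.
# All names and the quad component ordering are assumptions for illustration.

def texel(tex, x, y):
    """Fetch one texel from a 2D list, clamping coordinates to the edges."""
    height, width = len(tex), len(tex[0])
    return tex[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

def gather4(tex, x, y, offset=(0, 0)):
    """Return the 2x2 quad whose top-left texel is (x, y) + offset.

    Assumed ordering relative to the top-left texel: (0,1), (1,1), (1,0), (0,0).
    """
    ox, oy = x + offset[0], y + offset[1]
    return (
        texel(tex, ox, oy + 1),
        texel(tex, ox + 1, oy + 1),
        texel(tex, ox + 1, oy),
        texel(tex, ox, oy),
    )

def gather4_four_offsets(tex, x, y, offsets):
    """Emulate the four-offset form as four separate single-offset gathers.

    Each gather contributes a single component (here the last one, the texel
    at exactly (x, y) + offset); the other three fetched texels are discarded.
    """
    return tuple(gather4(tex, x, y, off)[3] for off in offsets)

# Small test texture where the value encodes the position: 10 * row + column.
tex = [[10 * row + col for col in range(8)] for row in range(8)]
quad = gather4(tex, 2, 3)
scattered = gather4_four_offsets(tex, 2, 3, [(0, 0), (3, -1), (-2, 2), (1, 4)])
print(quad, scattered)
```

In this emulation three of the four texels from every gather are thrown away, which is the work a back end would have to recognise and fold back into one instruction on hardware that supports independent offsets natively.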
You don't see why having timely knowledge of the DirectX specification is important to an IHV? Of course AMD might have had better specs than us ... that remains to be seen though.
I don't see why the presence of this function would have influenced AMD's design either way though.
Jittered point sampling is an old technique, but it stopped really making sense for a while when you could gather4 at about the same cost ... it's the changed structure of Fermi texture addressing which makes it relevant again. You'd never use this on Evergreen hardware except if you were lazy or running sponsored code.
It's better to have 16 quads from jittered locations than 16 single texel samples ... that's basically the choice with Evergreen.
Well, you've completely lost me now. Isn't the single-offset gather4 instruction going to return you 4 point samples from a texel quad, which defeats the whole reason for jittering the samples in the first place?
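The trade-off argued over above (the same number of fetches returning either single texels or whole quads) can be sketched as follows. This is a rough illustration under stated assumptions: the stratified jitter pattern, the cell size and all function names are made up for the example and say nothing about how either architecture actually schedules its fetches.

```python
import random

# Hypothetical comparison of texel footprints: 16 jittered fetches that each
# return one texel versus 16 jittered fetches that each return a 2x2 quad.

def jittered_positions(n_side, cell, seed=0):
    """Pick one random integer texel position inside each cell of a grid."""
    rng = random.Random(seed)
    return [
        (cx * cell + rng.randrange(cell), cy * cell + rng.randrange(cell))
        for cy in range(n_side)
        for cx in range(n_side)
    ]

def point_footprint(positions):
    """Texels touched when every fetch returns a single texel."""
    return set(positions)

def quad_footprint(positions):
    """Texels touched when every fetch returns the 2x2 quad anchored at the position."""
    return {(x + dx, y + dy) for x, y in positions for dx in (0, 1) for dy in (0, 1)}

positions = jittered_positions(n_side=4, cell=8)
points = point_footprint(positions)
quads = quad_footprint(positions)
print(len(positions), "fetches:", len(points), "texels as points,", len(quads), "texels as quads")
```

The quads cover more texels per fetch, but the three extra texels of each quad sit right next to the jittered one, so they are not independently jittered samples.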
Why would you not want to go through existing caches for this?
One silicon level architecture question for you guys. Given the 'polymorph engines' are all interconnected across large expanses of the die, and need to be kept in sync, anyone want to hazard a guess on how that affects clock scaling?
-Charlie
... If not, then I personally will stake my bet on NVIDIA having already received the XBOX720 contract.
Conspiracy theory! Let's make one thing crystal clear: even though I let myself get drawn into lengthy arguments on this, it is pretty far out there. If it's true we will probably never hear of it; some people at AMD might get mad at Microsoft, but it still would not be in their best interest to antagonize them in public. If it's false and AMD confirms it was only a public documentation error, I will look foolish and we can all quickly forget about it.
Right, so where are we on this theory of yours then?
There are a couple of components to this ... firstly the instruction itself and the kind of acceleration it can get on Fermi. It's a good instruction, with an underlying texture cache better suited to point sampling than the one in Evergreen. It will be a win in some algorithms (it will also leave some resources poorly used on ATI hardware, so ideally you will have two implementations). So in that sense it's a competitive advantage in a good way: better hardware (arguably depending on cost ... but intuitively I'd say the costs for allowing individually addressed 32-bit samples, as opposed to quads, are small compared to the benefits).
Has NVIDIA gained a competitive advantage over ATI.... Or?
@eastmen
Business is business. Sometimes you get stung but others you might get a good deal with no obvious caveats that is cheaper and faster than the competition. Money talks louder than words.
Why would MS go to NVIDIA after they were royally screwed by them back in the original Xbox days, and why would they leave ATI when ATI delivered a fantastic part in the Xenos that has allowed them to stay competitive with the PS3 despite launching a year earlier?