If Microsoft comes back with the answer "oops, this should be in there too", it would be due diligence to ask AMD whether they were ever made aware of that ahead of time. If not, then I personally will stake my bet on NVIDIA having received the XBOX720 contract already.
Read the bit about the code: it exists in the HLSL (but not in the docs or assembler/refrast, which is relevant because assembler/refrast are more strictly documented, since drivers are written with them).
What are the chances that the actual feature doesn't exist in the DX11 specification and the feature they implemented here is an intercept which nets the same result anyway?
Since it's apparently only using L/S, I'd assume it wouldn't go through Tex-Cache at all, but rather use the (larger) L1-/Shared-Memory-Pool.
It's a good instruction, with an underlying texture cache better suited to point sampling than the one in Evergreen.
I'm not sure, but I think MfA is saying that on ATI hardware you would fetch, say, all 16 samples from a 4x4 footprint using four gather4 instructions, but without gather4 you would use four jittered point sample instructions.
Well, you've completely lost me now. Isn't the single-offset gather4 instruction going to return you 4 point samples from a texel quad, which defeats the whole reason for jittering the samples in the first place?
I still don't get it: why is there a dedicated tessellator for each SM?
I think NVidia is saying that they can do the latter in one or two instructions, or equivalently they can gather 8-16 jittered point samples with the same four instructions. That wouldn't make a difference in a 4x4 sampling footprint, but it would if your samples were farther apart.
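To make the distinction concrete, here is a rough toy model in Python, not real HLSL and not how any hardware does it. The function names (`texel`, `gather4`, `gather4_offsets`) are made up for illustration, the texture is a plain single-channel grid with clamp addressing, and the return order is simplified to row-major rather than the order an actual gather4 returns.

```python
# Toy model only: a single-channel texture stored as a list of rows.
# All names here are hypothetical; nothing below is a real graphics API.

def texel(tex, x, y):
    """Point-sample one texel with clamp-to-edge addressing."""
    h, w = len(tex), len(tex[0])
    return tex[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

def gather4(tex, x, y):
    """Single-offset gather: the 2x2 quad whose top-left texel is (x, y)."""
    return [texel(tex, x + dx, y + dy) for dy in (0, 1) for dx in (0, 1)]

def gather4_offsets(tex, x, y, offsets):
    """Per-sample offsets: four texels, each at its own (dx, dy) from (x, y)."""
    return [texel(tex, x + dx, y + dy) for dx, dy in offsets]

# 8x8 texture where each texel encodes its own position as 10*row + col.
tex = [[10 * r + c for c in range(8)] for r in range(8)]

# Four plain gathers cover a dense 4x4 footprint (16 texels).
footprint = [v for qy in (0, 2) for qx in (0, 2) for v in gather4(tex, qx, qy)]

# One gather with individual offsets picks four scattered (jittered) texels.
jittered = gather4_offsets(tex, 3, 3, [(-3, -1), (2, -2), (-1, 3), (3, 1)])
```

In this sketch the four plain gathers can only ever return adjacent quads, while the offset version returns four texels that share no quad, which is the case where the two approaches would differ in instruction count.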
As you've been shown, ATI handles the individually-offset gather fine. Gather4 is just ATI's optimisation for when the data aligns within 128-bit buckets.
What NVIDIA is saying is that there is an instruction in HLSL which up to this point has remained hidden, and which, if you know it exists, you can decompile from assembler level and design hardware for to make it run efficiently. Knowing it exists is a rather important step though; without that knowledge you simply wouldn't expect those types of assembly instructions. They make absolutely no sense on the original hardware from which gather4 came (HD3/4/5, where you will just take all the samples).
Well, don't forget that NVidia basically tacked on pixel shaders to GF3 (there's a huge difference between PS1.0-1.3 and PS1.4), made a GF4MX which held back game development due to lack of features, did god knows what in their DX9 implementation on NV3x, and also tacked on useless vertex texturing and dynamic branching to NV4x/G7x. And with the exception of only the high end NV3x, these decisions gave them great margins and all the bullet points they needed for marketing.
There's an interesting thing happening in forums with these revelations. Months ago, there was much optimism and props given to AMD for their focus on tessellation in DX11, and from that came the assumption that NVidia put no work into it, and if they supported it at all, it would be some late addition: half-assed, bolted-on, or emulated tessellation that would not perform as well as AMD's. I'll note for the record that much the same story was repeated with Geometry Shaders (speculation that NVidia would suck at it, and that the R600 would be the only 'true' DX10 chip). AMD has had some form of tessellation for several generations, all the way back to N-patches, so there's some logic to these beliefs.
I think it's obvious: B.Jawed, that's not a write-in answer ... that's avoiding the question.
Which is no reason for Microsoft to hide it from the docs and only expose it at the HLSL level (which effectively hides it from the other IHVs unless they bother to reverse engineer the shader compiler). It can all be an unhappy accident of circumstances of course ... which is why I would like to hear from AMD (or any of the other IHVs for that matter) whether the HLSL instruction was made known to them (if not, I personally would call anyone who still thinks an accident is the most likely explanation slightly naive).