> The GTX680 manages 33.3 GPixel/s with RGBA8 blending and 16 GPixel/s with 4xFP16 blending. This requires a bandwidth of 266 GB/s or 256 GB/s, which the GTX680 clearly doesn't have (192 GB/s).

Is this accounting for the effect of the L2 cache?
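A quick sanity check of those figures, assuming blending costs one read plus one write of the render target per pixel (which is where the 266/256 GB/s come from):

```python
# Back-of-the-envelope check (assumes blending reads and writes the full pixel).
def blend_bandwidth_gb_s(gpixels_per_s, bytes_per_pixel):
    # One read plus one write of the render target per blended pixel.
    return gpixels_per_s * bytes_per_pixel * 2

print(blend_bandwidth_gb_s(33.3, 4))  # RGBA8:  266.4 -> ~266 GB/s
print(blend_bandwidth_gb_s(16.0, 8))  # 4xFP16: 256.0 -> ~256 GB/s
# Both are above the GTX 680's ~192 GB/s of raw DRAM bandwidth.
```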
> For non-MSAA render targets? How?

Imagination claims to have lossless color compression with a 2:1 average ratio. Maybe Nvidia does too. It's just the first thing that came to mind other than caching effects.
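Purely as arithmetic, a 2:1 average ratio on the ROP traffic would be enough to bring the earlier estimates under the GTX 680's raw bandwidth (illustrative only, not a claim about what Nvidia actually does):

```python
# If the blended render-target traffic compressed ~2:1 on average,
# the estimates from above would fit within ~192 GB/s:
for required_gb_s in (266.4, 256.0):   # RGBA8 and 4xFP16 cases
    print(required_gb_s / 2)           # ~133 GB/s and 128 GB/s
```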
> Well I guess there are advantages and disadvantages to a low-clocked 512-bit vs. a high-clocked 384-bit interface.

I like to think performance is affected too. Theoretically, a highly clocked interface will not be able to sustain peak bandwidth all the time. DDR tends to cut bandwidth in half for address data, and GDDR5 needs some training to run at max frequency.
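As an illustration of that trade-off (the per-pin data rates below are made up for the example, not leaked specs):

```python
# Peak GDDR5 bandwidth = bus width in bytes * effective data rate per pin.
def peak_bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    return bus_width_bits / 8 * gbps_per_pin

print(peak_bandwidth_gb_s(512, 5.0))  # low-clocked 512-bit:  320 GB/s
print(peak_bandwidth_gb_s(384, 6.0))  # high-clocked 384-bit: 288 GB/s
# The wide, slow bus wins on paper; how much of the peak can be sustained
# (command/address overhead, link training) is the separate question above.
```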
> I think this is a proprietary benchmark they are testing with, and it could be using an optimized data set to align/fit the screen tiles nicely into the cache?

No. AMD would benefit too, at least a bit. It's probably just what I said, a bunch of screen-filling quads.
> Is this accounting for the effect of the L2 cache?

Is nV using the L2 also for caching ROP accesses? Anyway, it should thrash the caches by design of the benchmark, as the render target is much larger than the L2 and every pixel is only written once before the next quad comes. That's how I came to the idea above.
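The size mismatch is easy to put a number on, assuming a single 1080p RGBA8 target and GK104's 512 KB of L2:

```python
# One 1920x1080 RGBA8 render target vs. GK104's 512 KB of L2.
render_target_bytes = 1920 * 1080 * 4          # ~8.3 MB
l2_bytes = 512 * 1024
print(render_target_bytes / l2_bytes)          # ~15.8x the L2 capacity
# If every pixel is written once per full-screen quad, a cache line is evicted
# long before it is touched again, so the L2 mostly just buffers the write stream.
```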
> Imagination claims to have lossless color compression with a 2:1 average ratio. Maybe Nvidia does too. It's just the first thing that came to mind other than caching effects.

They have a TBDR. It's probably much easier to implement lossless compression for a tile there. For an IMR, one wants relatively small tiles to load and store for ROP accesses to reduce the bandwidth overhead. But this makes general compression techniques harder to implement, I guess (and is the reason why compression is used only for MSAA render targets and is based on the simple fact that multiple samples belonging to the same triangle necessarily have the same color). Maybe hardware.fr should run fillrate tests with randomized content.
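To illustrate why the MSAA case is considered the easy one (a toy sketch, not any vendor's actual hardware scheme): when every sample of a pixel comes from the same triangle, one color plus a coverage mask is enough.

```python
# Toy 4xMSAA color compression: pixels fully covered by one triangle store a
# single color instead of four identical sample colors.
def compress_pixel(samples):                    # samples: four RGBA8 tuples
    if len(set(samples)) == 1:                  # all samples share one color
        return ("compressed", samples[0])       # 4 bytes instead of 16
    return ("raw", samples)                     # edge pixel: keep every sample

print(compress_pixel([(255, 0, 0, 255)] * 4))                       # interior
print(compress_pixel([(255, 0, 0, 255)] * 3 + [(0, 0, 255, 255)]))  # edge
```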
> Is nV using the L2 also for caching ROP accesses? Anyway, it should thrash the caches by design of the benchmark, as the render target is much larger than the L2 and every pixel is only written once before the next quad comes. That's how I came to the idea above.

Fermi made the L2 service all memory clients.
> Fermi made the L2 service all memory clients.

The 512 kB of L2 in a GK104 is too small in any case, unless something else also plays into this.
http://www.beyond3d.com/content/reviews/55/9
It is an assumption of mine that the arrangement is unchanged for its successor.
I don't want to break up the discussion, but does this pixel fillrate advantage (at least in some benchmarks/tests) really translate into real game usage? It looks negligible at best to me (outside maybe some specific cases).
High resolutions and/or max/high AA settings.
I'm sure someone else can answer better.
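For a rough sense of scale (illustrative numbers; this ignores extra render targets, post-processing passes and blending):

```python
# Pixels that must be written just to fill the back buffer each frame.
def fill_rate_gpix_s(width, height, fps, overdraw=1.0):
    return width * height * fps * overdraw / 1e9

print(fill_rate_gpix_s(1920, 1080, 60))               # ~0.12 GPixel/s
print(fill_rate_gpix_s(2560, 1600, 120, overdraw=4))  # ~2.0 GPixel/s
# Real frames add G-buffer writes, post-processing and MSAA on top of this,
# which is why fillrate mostly shows up at high resolutions and AA settings.
```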
> Since 4K gaming has come up as an issue with tearing on current cards and drivers with Eyefinity: is the "fix" from AMD to make the new-generation cards the solution for 4K gaming, and not the older gen?
> One reason I haven't gone CrossFire with my 7970 is that the issues with Eyefinity have a tendency to be problematic, to say the least, when things don't work as advertised.
> http://techreport.com/blog/25399/here-why-the-crossfire-eyefinity-4k-story-matters

I think a large part of the problem is that they tried pushing 4K too early; current cards just don't cut it.
http://www.guru3d.com/articles_pages/ultra_high_definition_pc_gaming_benchmark_review_uhd,12.html

> Now frame pacing has improved massively ever since the Catalyst 13.8 beta drivers. AMD has openly stated that Ultra HD is not yet supported for frame pacing; only resolutions up to 2560x1440 will benefit from frame pacing with the latest drivers. With that statement in mind I was a little reluctant to show you the charts below. Then again, it would not be good taste for a journalist to not show them. So here we go, and I have limited this to two benchmark runs.
> http://videocardz.com/45704/amd-radeon-r9-290x-hawaii-gpu-pictured-512-bit-4gb-memory
> PCB shots from DICE, confirming apparently a 512-bit memory bus and a rectangular die rather than a square one.

Looks to be close to that 425 mm² estimate we had already from the "GK110 is 30% larger" quote.
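The arithmetic behind that estimate, using the commonly cited ~550 mm² ballpark for GK110:

```python
# "GK110 is 30% larger" => Hawaii ~= GK110 area / 1.3.
gk110_mm2 = 550                # ballpark figure for GK110
print(round(gk110_mm2 / 1.3))  # ~423 mm², in line with the ~425 mm² estimate
```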
DICE also leaked 7990, didn't they?
The article also suggests Hawaii may be a cut-down version of a full FirePro chip. Do you guys see this as a possibility? AMD hasn't done this for quite a while, and with 28nm mature at this point, I don't see the reasoning unless yields are low.
Would 512-bit also semi-confirm 64 ROPs?
Or they have large orders/contracts to fill and/or the PRO version will compete with Nvidia's offerings.
Tahiti made use of a crossbar to decouple the ROPs from the memory controllers.
I would say 48 ROPs is much more likely than 64 ROPs.
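One way to frame the 48-vs-64 question (Tahiti's figures are known; everything about Hawaii here is speculation): a 512-bit GDDR5 bus is eight 64-bit channels, so 64 ROPs only follows if ROP partitions are tied 1:1 to channels, and the crossbar removes exactly that coupling.

```python
# 64-bit GDDR5 channels per bus width; the ROP count need not track this if a
# crossbar sits between the ROPs and the memory controllers (as on Tahiti).
def channels(bus_width_bits, channel_bits=64):
    return bus_width_bits // channel_bits

print(channels(384), "channels on Tahiti, paired with 32 ROPs")      # 6
print(channels(512), "channels on a 512-bit Hawaii (ROPs unknown)")  # 8
```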