Ah ok. It looked plausible enough.
Sorry, I was wrong.
I hope it's the former, because I'm still not entirely convinced the latter is a hardware solution (as in dedicated logic) rather than GeometryFX-like driver injection.
Performance over a 980 and Fury in graphics test 4 could imply a great tessellation improvement, or that the new triangle culling hardware is really good.
TBH more and more, because more and more cases have either windows or even full glass side panels. (I myself don't have either one, though I think, without checking under the desk, that my side panel does have a grille for a fan.)
What % of PC enthusiasts have clear side panels and a PC in a position to actually see inside as well? What the GPU looks like should only be relevant to those designs specifically designed for that particular consumer market, with LEDs etc. Most GPU designs should be focused on efficiency.
Just knowing that a graphics card has LEDs improves framerates and rendering quality.
May be worth withholding judgment until we see the architecture. Fast scalar processors and the potential ability to regroup waves, while sort of a software solution, would seem to change some of those dynamics significantly. A software solution in that case may be a far more flexible option and maintain performance. The culling process could be occurring in stages as well. Seems likely Nvidia will be doing something similar with Volta, as there are a lot of papers about scalars floating about for both architectures.
FWIW, Fury X with tessellation set to a max of 2x is around 59-ish fps in that GT4, while tessellation switched off entirely allows for some fps north of 70. Note that I am not saying that this is what AMD is/will be doing. I was just curious how much performance impact tessellation has in that particular test and RSCE offers such a convenient way to try it.
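To put those two figures in frame-time terms, here is a rough sketch. The 59 and 70 fps inputs are just the approximate numbers quoted above (70 is a floor, since the result was "north of 70"), and the function names are made up for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def relative_gain(fps_before: float, fps_after: float) -> float:
    """Fractional fps increase going from fps_before to fps_after."""
    return fps_after / fps_before - 1.0


# Approximate figures from the post above (assumed inputs, not new measurements).
tess_2x_fps = 59.0   # tessellation capped at 2x
tess_off_fps = 70.0  # tessellation off; the real figure was somewhat higher

saved_ms = frame_time_ms(tess_2x_fps) - frame_time_ms(tess_off_fps)
gain = relative_gain(tess_2x_fps, tess_off_fps)

print(f"frame time saved: at least {saved_ms:.2f} ms")
print(f"fps gain: at least {gain:.1%}")
```

So even a 2x tessellation cap accounts for roughly a fifth of the frame rate in that test, by these rough numbers.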
NVidia also has a fair few papers (and patents) on Temporal SIMT, in line with what several independent engineers have said, and Bill Dally has talked about it as something for maybe 2018 in relation to their Tesla architecture.
2018 Vision: Echelon Compute Node & System
Key architectural features:
• Malleable memory hierarchy
• Hierarchical register files
• Hierarchical thread scheduling
• Place coherency/consistency
• Temporal SIMT & scalarization
http://www.cs.nyu.edu/courses/spring12/CSCI-GA.3033-012/ieee-micro-echelon.pdf
What's the difference between temporal SIMT and a bunch of independent scalar cores?
Then the reference 480(X) should be pretty ideal. Its design seems a marked improvement over previous AMD designs: putting power delivery at the back of the PCB instead of up front made the PCB shorter and let the blower draw air from both sides of the card. It's disappointing, though, that AMD chose to gimp power delivery so much with just a single auxiliary six-pin power connector, given such a smart cooler design.
I just want a GPU that runs at stock speed and exhausts the heat out of the back of the case.
Well, Nvidia went that route of LIW publicly back in 2011, as shown in the paper linked.
I read it but it doesn't really say how each lane can fetch and decode independent instructions. If each lane can fetch a different instruction that's 8X more ICACHE BW!
Perhaps there is a clever solution for this, but to me it sounds like you've got to have N scalar cores that share a bunch of other logic to amortize their cost. Temporal SIMT sounds better than 'super dumb LIW cores that share some resources'!
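To make the instruction bandwidth worry concrete, here is a toy model of fetch rate. It rests on a simplifying assumption that is mine, not something the paper spells out this way: in temporal SIMT a lane fetches one instruction and replays it over the active threads of a warp on successive cycles, so it only needs a new fetch once per pass. The lane count of 8 matches the "8X" figure above; the function and mode names are hypothetical.

```python
def fetches_per_cycle(mode: str, lanes: int = 8, active_threads: int = 8) -> float:
    """Average instruction fetches per cycle across all lanes (toy model).

    mode:
      "independent_scalar": every lane fetches its own instruction each cycle.
      "spatial_simt": one fetch is broadcast to all lanes each cycle.
      "temporal_simt": each lane reuses one fetched instruction for
          `active_threads` consecutive cycles (assumption of this sketch).
    """
    if lanes < 1 or active_threads < 1:
        raise ValueError("lanes and active_threads must be positive")
    if mode == "independent_scalar":
        return float(lanes)
    if mode == "spatial_simt":
        return 1.0
    if mode == "temporal_simt":
        return lanes / active_threads
    raise ValueError(f"unknown mode: {mode}")


for mode in ("independent_scalar", "spatial_simt", "temporal_simt"):
    print(mode, fetches_per_cycle(mode))

# Fully diverged warp: only one thread active per pass on each lane.
print("temporal_simt, diverged", fetches_per_cycle("temporal_simt", active_threads=1))
```

Under that assumption, coherent code costs the same fetch bandwidth as ordinary SIMT, and only fully diverged code approaches the 8X figure. Whether real hardware would behave like this is exactly the open question here.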
They were still talking about it last year as a requirement for eliminating waste/redundancy and improving efficiency, all of which are critical when looking to evolve towards an exascale-capable Tesla GPU.
Even in this case it's not discussed how instructions are magically fetched from memory without having to provide an insanely large (in terms of area) instruction cache and instruction bandwidth. I am sure you can build it, but is it worth it? I am a bit skeptical.