AMD: Speculation, Rumors, and Discussion (Archive)

Status
Not open for further replies.
Those have been quite improved with Fiji. High tessellation factors are still bottlenecked, but there's very little reason now or in the foreseeable future to change that in any significant way. Tessellation needs a lot of other software and hardware support before you're going to see practical reasons for, say, a 32x tessellation factor. Perhaps just changing to a wider geometry front end would be beneficial, but there's not a huge need to change the design yet again.

Beyond that, relieving register pressure, increasing single-precision compute performance (a 980 Ti can beat it in many tests despite the Fury X's theoretical advantage), and getting better performance per watt would all be priorities. Fury X was simultaneously TDP- and die-size-limited. The die-size limit is surely fixed with the move to a new node, but a new series of GPUs on that node could quickly ramp back up to being TDP limited if the architecture isn't made more efficient.


The tessellation pipeline for Fiji is not that good; tessellation is linked to the input/output of the shader array, which is where the bottleneck occurs, and it's not much improved over Hawaii. They should try to fix that.

Remember the factor doesn't matter; it's how much total procedural geometry is being created by tessellation. So many objects in your FOV with a low tessellation factor will still have the same effect as one object with a high tessellation factor. And nV has been slowly improving their tessellation performance gen to gen; most likely it will go up again with Pascal. Don't see why it wouldn't, since the hull shader unit count seems to be attached to the quads.
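To put numbers on "the total matters, not the factor", here's a quick Python back-of-the-envelope. The f² triangles-per-patch amplification is only a rough rule of thumb for a triangular patch at uniform integer factor f (exact partitioning varies by API and spacing mode), so treat the figures as illustrative:

```python
# Back-of-the-envelope: total generated triangles, not the factor itself, is
# what loads the geometry pipeline. For a triangular patch at a uniform integer
# factor f, tessellation emits roughly f^2 triangles (exact partitioning rules
# vary by API/spacing mode, so this is an estimate, not a spec).

def tessellated_tris(patches: int, factor: int) -> int:
    """Rough triangle count after tessellating `patches` at uniform `factor`."""
    return patches * factor * factor

# Many objects at a low factor...
many_low = tessellated_tris(patches=16, factor=4)   # 16 * 4^2 = 256
# ...produce the same total as one object at a high factor.
one_high = tessellated_tris(patches=1, factor=16)   # 1 * 16^2 = 256
print(many_low, one_high)
```

Either way the pipeline sees the same ~256 triangles, which is the point about low factors on many objects.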

Overall efficiency of the shader array needs to be improved too. Not sure if they can match what nV is doing with fewer units, at least in games; the way AMD's shader units are, coding specifically for the architecture seems to be the best way to get more performance out of them relative to nV. Not sure about this though...
 
I think what will be interesting is how much the new architecture has evolved and the potential impact this may have on software developers/drivers.
One of the best things for AMD was capturing the console market; since then it has been interesting to see a trend for games to be more efficient on AMD's architecture than on NVIDIA's. This is not a criticism or a suggestion that AMD is doing something bad, just that game development for the consoles is very finely tuned and focused toward the GCN design, and for quite a few games this has had a positive effect for AMD on PC.

But it will be interesting to see just how this pans out if the latest architecture starts to evolve much more than previous iterations of GCN, especially relative to what is in the consoles; it could be a mix of good and bad, or just good news for AMD.
Time will tell, as there is a lot at stake going forward for both AMD and NVIDIA.
Cheers
 
New rasterizer and tessellation engine, please. I really do not care if they end up within +/- 10% of Pascal GPUs' TDP...
I think just as important or quite close is that AMD needs to create a hardware solution that can be as quiet as the best from NVIDIA.
IMO this is one reason NVIDIA has been able to corner the market more recently (this is specific to AIB cards rather than their reference design), as more and more people want a quiet PC experience along with, of course, good performance.
One of the closest competitors to the AIB 970, and finally a card that can be recommended on this front, is the Asus 390 DirectCU III, where the cooler design works well (it starts to flag a bit on the 390X from this POV); a possible downside, though, is the length of most of those AMD cards.

The Fury-related cards may be quiet, but those are not really mainstream cards.
Cheers
 
AMD doesn't need to go overboard with reference cards. "They need the best sound", "they need the best temps", "they need to be the smallest" is something I see repeated based on personal preference. IMO they just need to keep their hardware excellence and marry it to reference boards with consistent attributes; "good sound-temp-size" is what's expected of a reference design. The R9 290 surely suffered from its high temperature at launch, which gave an easy reason not to buy it. AIBs can later offer different cards with better sound-temp-size.
 
I think just as important or quite close is that AMD needs to create a hardware solution that can be as quiet as the best from NVIDIA.
IMO this is one reason NVIDIA has been able to corner the market more recently (this is specific to AIB cards rather than their reference design), as more and more people want a quiet PC experience along with, of course, good performance.
One of the closest competitors to the AIB 970, and finally a card that can be recommended on this front, is the Asus 390 DirectCU III, where the cooler design works well (it starts to flag a bit on the 390X from this POV); a possible downside, though, is the length of most of those AMD cards.

The Fury-related cards may be quiet, but those are not really mainstream cards.
Cheers

I'm sorry, but smaller than the Fury Nano? Quieter than a water-cooling solution? With lower temps than the Fury's sub-50°C? ...

Really, you are talking about the 390X, which in reality is a 290X and is years old...

And for the temps, remember that those GPUs were designed to run at that temperature and try to reach it.

Honestly I don't know; I've had all my GPUs under water for so long that I don't even remember the last time I used air cooling for them... the max temp of my GPUs under 100% compute usage is around 38°C (raytracing)...
 
I was talking about the sizes of the 390 & 390X; not everyone can fit these full-length cards into their mid-size and smaller cases.
And I am talking about the 390, not the 390X, because the Asus DirectCU III starts to become less optimal there, and hence that generation of card is louder.
I am sorry, but I thought it was pretty clear I am NOT talking about Fury-type niche cards but mainstream ones, hence why I mentioned the 970 (and specifically AIB), which is at the upper end of mainstream pricing.
So yes, the point regarding the Asus DirectCU III 390 stands: it is one of very few cards from AMD's AIBs that truly fits the quiet-with-performance criteria in a less-than-niche segment.

But as you point out, you are a more niche consumer... guess how many mainstream consumers would even consider water cooling.
Sadly, though, the niche market will not save AMD, and as I said, one reason NVIDIA has managed to gain such an upper hand recently is that their mainstream cards are generally perceived as quiet with good performance; that perception breaks down if one pushes a Maxwell card hard, at which point it behaves more like one expects from AMD.
Anyway, to make my point: in the past HardOCP has been rather scathing about the 3xx product range, but they do recommend the Asus model I mention: http://www.hardocp.com/article/2015/12/22/asus_r9_390_strix_directcu_iii_video_card_review/11

Cheers
 
Really, you are talking about the 390X, which in reality is a 290X and is years old...

Just to add, did AMD also change aspects of the power management/regulation/etc of the 3xx range compared to the 2xx?
I thought they made some power aspects more dynamic, like Maxwell, which would mean it is more efficient than the 2xx models even while using nearly the same GCN revision.
One way to tell is to compare power consumption and clock speeds across each range; this 390 seems pretty good considering its clock speeds, whatever Asus as an AIB has done.
Cheers
 
The tessellation pipeline for Fiji is not that good; tessellation is linked to the input/output of the shader array, which is where the bottleneck occurs, and it's not much improved over Hawaii. They should try to fix that.

Remember the factor doesn't matter; it's how much total procedural geometry is being created by tessellation. So many objects in your FOV with a low tessellation factor will still have the same effect as one object with a high tessellation factor. And nV has been slowly improving their tessellation performance gen to gen; most likely it will go up again with Pascal. Don't see why it wouldn't, since the hull shader unit count seems to be attached to the quads.

Overall efficiency of the shader array needs to be improved too. Not sure if they can match what nV is doing with fewer units, at least in games; the way AMD's shader units are, coding specifically for the architecture seems to be the best way to get more performance out of them relative to nV. Not sure about this though...
Why though? Hawaii's tessellation performance is quite good outside of games that use GameWorks, where NVIDIA forces unrealistic levels of tessellation for next to no visual improvement. Does AMD want to improve that area simply because of terrible market trends, or should they spend effort on areas that will actually improve rendering quality for all games?
 
It's not only about pure raw performance; it's about building something that will not smash performance to pieces when CR (conservative rasterization) is commonly used across modern APIs. This is why I mentioned both the rasterizer and tessellation (or geometry in general... no-one will care much about GS performance in the future thanks to VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation and the like).
 
Why though? Hawaii's tessellation performance is quite good outside of games that use GameWorks, where NVIDIA forces unrealistic levels of tessellation for next to no visual improvement. Does AMD want to improve that area simply because of terrible market trends, or should they spend effort on areas that will actually improve rendering quality for all games?


Newer games will be using more triangles/polygons, and this will automatically increase tessellation output without the need to increase tessellation factors. Without removing that bottleneck, AMD's GPUs will take a double hit when they come across it, because it effectively stalls the shader array.

Just as an example: I was talking to a friend who did some of the artwork for Batman. The characters use 100k-plus triangles, Batman was around 150k, and the FOV shows around 5 million triangles. If that game were to use tessellation, a factor of x4 would be enough to crush an AMD card. Many of the new games are using many more polys before tessellation is even added.
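For rough scale, here's the arithmetic in Python using the post's numbers. The assumption that an "x4" pass roughly quadruples the visible triangle count is my simplification for illustration, not measured data:

```python
# Rough estimate with the numbers from the post: ~5M triangles in view before
# tessellation. Simplifying assumption (mine, not a measurement): an "x4" pass
# multiplies the visible triangle count by 4.
base_tris = 5_000_000
factor_multiplier = 4
amplified = base_tris * factor_multiplier
print(amplified)  # 20000000
```

That lands at the ~20 million procedurally generated triangles discussed below as enough to start bottlenecking current AMD hardware.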
 
Newer games will be using more triangles/polygons, and this will automatically increase tessellation output without the need to increase tessellation factors. Without removing that bottleneck, AMD's GPUs will take a double hit when they come across it, because it effectively stalls the shader array.

Just as an example: I was talking to a friend who did some of the artwork for Batman. The characters use 100k-plus triangles, Batman was around 150k, and the FOV shows around 5 million triangles. If that game were to use tessellation, a factor of x4 would be enough to crush an AMD card. Many of the new games are using many more polys before tessellation is even added.

Again though, tessellation has far too many problems to actually be used right now, and maybe even for this generation. Cracks are chief among them; combined with the fact that the two consoles out now don't have great tessellation performance to begin with, it's not going to be used unless it's deliberately a slight against AMD by NVIDIA's GameWorks program. (No, god rays using tessellation isn't smart or efficient; let's do it anyway because it makes us look better!)

And yes, theoretically the tessellation factor will just expand X control points into Y polys. But looking at TessMark, it gives a strong impression of producing ever more sub-pixel triangles as you raise the tessellation factor, and this isn't the same as scaling more control points across the screen. Besides never wanting sub-pixel triangles to begin with, because they cause horrible geometry aliasing, they're also generally a performance cliff, one that NVIDIA handles better than AMD but not a smart cliff to go off at any time regardless. I'm not 100% sure, but with tessellation the performance cliff should still be there. So in effect the answer is no, the tessellation factor should not scale the same way as tessellated points.

If you really, really thought a REYES solution, of doing micro-triangles anyway, was somehow the answer to image quality, then maybe, maaaaybe getting sub-pixel triangle performance up massively would be important. Frankly, if you were to do that, the first thing I'd think of is oversampling geometry/depth and then filtering and reducing down for shading, not re-engineering a GPU just so it can pull sub-pixel triangle tricks.

But otherwise AMD's bottleneck is geometry for getting super high framerates, sure. Even with improvements, the pipeline hasn't changed drastically from the architecture and silicon in the XBO and PS4, while the ISA, ROP architecture, and number of ALUs have, making geometry the relative bottleneck. But as I suggested, a 6-wide geometry front end on the highest-end cards may be enough to solve that problem without needing some drastic redesign.
 
Cracks? What kind of tessellation are you talking about, terrain? If there are cracks, then the voxel border limitations weren't handled right, or LODs are being switched too soon? There are reasons for cracks, but it comes down to code. Objects made in a 3D modeller shouldn't have cracks, since the output maps come from them.

God rays have to use tessellation; otherwise how are you going to calculate how the ray splits up when hitting object after object? Magic? Or just guessing? We can go back to using planes with textures for god rays if you like; personally I thought those looked like shit, and it was better without that type of visual.

Sub-pixel triangles don't increase aliasing; they decrease the need for hardware-supported AA, and hardware-supported AA will end up becoming too expensive, which is why shader-based AA is better for it. And as you can see from my comments, 20 million procedurally generated triangles is enough to start bottlenecking AMD hardware; at an x4 tessellation factor, some of these new games will do that. You don't need to go very high on the tessellation factor. Again, it's not the factor that matters, it's the total number of generated polys.

20 million depends on the application of course, but that's what I'm looking at in a UE4 game my team and I are making right now.
 
Cracks form most often with displacement-mapped tessellation, and displacement maps are the only way to usefully use tessellation from a non-procedural standpoint. For things like terrain and water it's no problem; you're blending between layers of a heightmap or whatever, and cracks are easily patched. But for skinned models you end up getting control-point cracks, and it's a totally unacceptable artifact that can be quite computationally costly to correct, so much so that a paper about realtime crack-free tessellation on the GPU was probably the most popular thing I tweeted about this year (can't find it, Twitter's profile interface is god damned useless). Anyway:

[Image: cs-354-surfaces-programmable-tessellation-and-npr-graphics-48-728.jpg]

And yes, sub-pixel triangles very much do cause aliasing, as you need to filter your sampled geometry correctly. Right now you naively get centroid sampling, which will be jumping from sub-pixel triangle to triangle quite often if your geometry is too dense, causing obvious jumps over time and movement. The following gives a good example of, say, a sub-pixel phone wire jumping into and out of centroid sampling: https://www.beyond3d.com/content/articles/122/3
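As a toy illustration of that phone-wire case, here's a tiny 1D Python sketch of mine (not the article's code): a wire thinner than a pixel, point-sampled at pixel centers, pops in and out of coverage as it moves instead of fading smoothly.

```python
# Toy illustration of the sampling problem: a "wire" thinner than one pixel,
# point-sampled at pixel centers. As it slides across the screen its coverage
# flickers between 0 and 1 lit pixels rather than shading proportionally.

def lit_pixels(wire_center: float, wire_width: float, num_pixels: int) -> int:
    """Count pixels whose center (x + 0.5) falls inside the wire's span."""
    half = wire_width / 2.0
    return sum(
        1 for x in range(num_pixels)
        if wire_center - half <= x + 0.5 <= wire_center + half
    )

# Sub-pixel wire (0.4 px wide) sliding from x=1.0 to x=6.9 in 0.1 px steps:
samples = [lit_pixels(c / 10.0, 0.4, 8) for c in range(10, 70)]
print(min(samples), max(samples))  # 0 1 -- it pops in and out of the grid
```

With proper filtering (analytic coverage or supersampling) the wire would instead contribute fractional intensity everywhere along its path.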

Oh, and this is how everyone but Nvidia does godrays: http://advances.realtimerendering.com/s2015/Frostbite PB and unified volumetrics.pptx

You raymarch through a shadow map or similar, usually now with some sort of voxelized goodness for easy GPU access. Still geometry, but no tessellation. It's been done so much at this point that it's basically the de facto standard, as used by Ubisoft, EA, Crytek, several other titles (Lords of the Fallen, etc.), and everyone but NVIDIA.
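A minimal sketch of that raymarch in Python (my toy stand-in with a made-up analytic occluder, not Frostbite's actual code): march along the view ray, ask a shadow function whether each sample point sees the light, and accumulate in-scattering.

```python
# Minimal god-ray raymarch sketch (toy stand-in, not any engine's real code):
# step along the view ray, test visibility of the light at each sample, and
# accumulate in-scattered light. No tessellation anywhere in the loop.

def in_shadow(t: float) -> bool:
    """Hypothetical analytic occluder: blocks the light over part of the ray.
    A real renderer would do a shadow-map or voxel-volume lookup here."""
    return 2.0 <= t <= 4.0

def god_ray(ray_length: float, steps: int, scatter: float) -> float:
    """Accumulated in-scattered light along the ray (constant-density medium)."""
    dt = ray_length / steps
    light = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt          # sample at the middle of each step
        if not in_shadow(t):
            light += scatter * dt    # simple constant phase/density term
    return light

print(round(god_ray(10.0, 100, 0.1), 3))  # 0.8 (80 of 100 samples are lit)
```

The cost scales with steps and resolution (shader time plus whatever the shadow/voxel structure occupies in VRAM), which is exactly the trade-off raised in the reply below about VRAM and shader horsepower.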
 
AMD doesn't need to go overboard with reference cards "they need best sound", "they need best temps", "they need to be the smallest" is something I see repeated based on personal preference. IMO they just need to keep their HW excellency and marry with reference boards with constant attributes, "good sound-temp-size" is what's expected for a reference design. R9 290 surely suffered from their high temperature on launch by giving a easy reason for not buying it. AIB can later offer different cards with better sound-temp-size.


Yes, AMD needs to do a much better job with launches than in the past few years. Hawaii reference coolers were a disaster, and Fiji suffered from a lot of pump whine in the first revisions.

First impressions matter. Even if AMD GPUs have aged better, etc., people remember the initial impressions the most. And in the last two launches, NV has done a better job with both the reference design and the availability of AIB GPUs at launch.
 
Cracks form most often with displacement-mapped tessellation, and displacement maps are the only way to usefully use tessellation from a non-procedural standpoint. For things like terrain and water it's no problem; you're blending between layers of a heightmap or whatever, and cracks are easily patched. But for skinned models you end up getting control-point cracks, and it's a totally unacceptable artifact that can be quite computationally costly to correct, so much so that a paper about realtime crack-free tessellation on the GPU was probably the most popular thing I tweeted about this year (can't find it, Twitter's profile interface is god damned useless). Anyway:

[Image: cs-354-surfaces-programmable-tessellation-and-npr-graphics-48-728.jpg]

And yes, sub-pixel triangles very much do cause aliasing, as you need to filter your sampled geometry correctly. Right now you naively get centroid sampling, which will be jumping from sub-pixel triangle to triangle quite often if your geometry is too dense, causing obvious jumps over time and movement. The following gives a good example of, say, a sub-pixel phone wire jumping into and out of centroid sampling: https://www.beyond3d.com/content/articles/122/3

Oh, and this is how everyone but Nvidia does godrays: http://advances.realtimerendering.com/s2015/Frostbite PB and unified volumetrics.pptx

You raymarch through a shadow map or similar, usually now with some sort of voxelized goodness for easy GPU access. Still geometry, but no tessellation. It's been done so much at this point that it's basically the de facto standard, as used by Ubisoft, EA, Crytek, several other titles (Lords of the Fallen, etc.), and everyone but NVIDIA.


Cracks on objects like that mean the LOD isn't being calculated correctly for the patches; it's a problem with the code, which I stated ;).

Btw, they are talking about sub-pixel aliasing, where a geometry edge partially intersects a pixel, and sub-pixel triangles do increase that. But the trade-off with sub-pixel triangles is that you don't get the strong aliasing of many pixels along one edge of a poly; the aliasing is harder to notice than on the original untessellated model.

Good read on god rays, but you are spending VRAM and shader performance, so there is a trade-off between tessellation and horsepower plus VRAM.

And it also seems to work with only one strong light source, whereas with tessellation it can be done with many light sources without extra hits on VRAM and the pixel shaders.
 
About micro-polygons and over-tessellation: someone proposed a solution; unfortunately no-one seems to be interested. Maybe with the advent of CR someone will reconsider the problem...


Adaptive tessellation is being used in most games; the only place it isn't being used right now is nV's HairWorks, and we can see the difference in hair when parts of it aren't tessellated as much as others: the angles are easily seen, since the hair strands are so close together.

LOL, I was only able to get through half of it before I had a meeting. OK, pretty cool paper, especially the second half about micropolygons.
 
That was a hardware and API solution; current adaptive tessellation is software only. But now the situation is a little different: we have a more programmable API (with the sort-of-programmable blend stage) and CR. Maybe when we finally have a programmable/customizable MSAA pattern we will see some changes in the full hardware/API on both the geometry and raster paths...
 
Newer games will be using more triangles/polygons, and this will automatically increase tessellation output without the need to increase tessellation factors. Without removing that bottleneck, AMD's GPUs will take a double hit when they come across it, because it effectively stalls the shader array.

Just as an example: I was talking to a friend who did some of the artwork for Batman. The characters use 100k-plus triangles, Batman was around 150k, and the FOV shows around 5 million triangles. If that game were to use tessellation, a factor of x4 would be enough to crush an AMD card. Many of the new games are using many more polys before tessellation is even added.
First of all, even a 7970's tessellation performance is good enough, even if non-GameWorks games double the average number of polygons. You'd actually have to try to push more than x32-level tessellation on many objects to start crippling a Hawaii card. Objects tessellated to that degree don't really improve image quality, especially when you consider the game in motion. There is no bottleneck unless you are running GameWorks. But GameWorks' over-tessellation actively sacrifices performance and game quality so that AMD performs worse than NVIDIA; the bottleneck is artificial. Investing silicon in an artificial bottleneck might be good for AMD to compete while they are actively getting screwed, but it doesn't help people who want games to look as pretty as possible.

Also, I have a hard time believing those Batman numbers aren't with tessellation already. 5 million triangles on a 1080p screen would be more than 2 triangles per pixel. Do you really think that is a good example? It's a perfect example of over-tessellation for the sake of partnering with NVIDIA.
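The "more than 2 per pixel" figure checks out as rough arithmetic (Python), at least under the naive assumption that all 5 million triangles land on screen:

```python
# Sanity check of the triangles-per-pixel figure: ~5M visible triangles
# spread over a 1080p framebuffer (ignoring off-screen and occluded geometry).
pixels = 1920 * 1080            # 2,073,600 pixels at 1080p
tris = 5_000_000
density = tris / pixels
print(round(density, 2))        # 2.41 triangles per pixel
```

Anything much above ~1 triangle per pixel on average means a lot of triangles are approaching sub-pixel size, which ties back to the aliasing and rasterizer-efficiency points earlier in the thread.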
 
Nope, current games outside of Crysis 3 use around 1-2 million polygons before tessellation; a factor of x4 on top of the increase in base polys is a very large increase. And not all triangles are visible, and then you have to factor in LODs, since we are talking about 3D. But since they are part of one model, they will have to get rendered regardless of whether they are visible or not.

Oh, you'd better believe Batman has those triangle counts. I have the Batman 3D mesh right here; here is a screenshot.

The Batmobile is even higher: 3 times Batman's poly count ;)

http://i.imgur.com/qZ5BOSe.jpg

Let me fix that: it's 315k polys for the character; I was reading Max's poly count rather than the actual triangle count.

http://i.imgur.com/aeYiJyX.jpg

Here is Arkham Knight, 400k.
 