AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

Maybe the HWS ACEs support "Compute Wave Switch" as mentioned on the slide, or in other words support for preemptive scheduling of compute shaders.
HWS could just be another name for the compute wave switch feature (Hardware Wave Switcher?).
Apparently they're "HardWare Schedulers"
 
There is an HWS, or hardware scheduling, mode mentioned for managing compute in the AMD HSA Linux kernel driver. AMD might indicate at some point whether that is coincidental.
The naming similarity may point to it being a path where the hardware has more autonomy in how it receives commands or schedules work.
 
It's no different for a vapor chamber-based cooler or a heatpipe-based cooler; the fan hub is there regardless. The difference is that the vapor chamber has a higher capacity for heat dispersion, and is thus more efficient overall.
There is another difference, and I think that is what you're missing here:
With a heat pipe, you take the heat from one point of the card (usually from above the GPU, where, in the Nano's case, the fan's hub would be as well) and transport it to another location in x/y coordinates (when looking straight at the card). With a vapor chamber, the main direction of heat transfer is along the z-axis, i.e. from one side of the VC to the other, in addition to an x/y spread that is less pronounced than a heat pipe's.

So, you transfer a certain amount of heat directly to where the sun don't shine... err... where the air does not flow (as much). That's why on "traditional" cards with vapor chambers, there's a large set of fins soldered directly onto the chamber and a (blower-type) fan mounted next to the whole construct.
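
To put rough numbers on that geometry difference, here's a quick one-dimensional conduction estimate, dT = Q * L / (k * A). Every value in it is an illustrative assumption (not a measurement of any real cooler), but it shows why the short z-axis path through a vapor chamber is nearly free, and hence why the fins can sit directly on top of it:

Code:
#include <stdio.h>

/* Back-of-the-envelope conduction estimate: dT = Q * L / (k * A).
 * All numbers below are illustrative assumptions, not specs. */
int main(void)
{
    double Q = 175.0; /* assumed heat load in watts */

    /* Heat pipe: carries heat in-plane (x/y) over a long path.
     * Assumed: 10 cm path, 6 pipes of ~30 mm^2 cross-section each,
     * effective conductivity ~20000 W/mK. */
    double hp_dT = Q * 0.10 / (20000.0 * 6.0 * 30e-6);

    /* Vapor chamber: moves heat through its thickness (z) over a few
     * millimetres into fins soldered directly on top. Assumed: 3 mm
     * path, 40 x 40 mm contact area, effective conductivity ~5000 W/mK. */
    double vc_dT = Q * 0.003 / (5000.0 * 0.04 * 0.04);

    printf("heat pipe, in-plane:   dT = %.2f K\n", hp_dT); /* ~4.86 K */
    printf("vapor chamber, z-axis: dT = %.2f K\n", vc_dT); /* ~0.07 K */
    return 0;
}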

But since the alleged AMD slide confirmed a vapor chamber, the discussion seems moot. The only interesting question is whether or not the cooling is sufficiently quiet.
 
Is 0xAF now considered optimum for testing gaming performance?

[Image: AMD-Radeon-R9-Nano-Presentation-19.jpg]
 
Indeed, no AF is a much worse trade-off than having one or two settings turned down from the 'ultra' preset, which is rarely good enough to justify the performance loss anyway.

Comparisons like the Fury X vs. 980 Ti slide only bring scorn upon AMD from those in the know.
 
No need for scorn: usually the Fury X suffers less of a performance hit than a GM200 when going from no AF to 16:1 AF (regardless of whether it's at driver default settings or high-quality settings). AMD might only be hurting themselves here.
 
I don't know the specifics of this, but 4K resolutions are putting the onus back on graphics horsepower, and trade-offs between settings need to be made on many solutions. Like [H]'s reviewing principle, the guide is to provide the settings that will attain "playable" (~30 FPS) at 4K, likely with the primary in-game quality settings being a less favourable break-point than the AA/AF settings. People's preferences may differ, and you are getting into a more subjective range when you are looking to maintain a minimum playability bar; but this is far more interesting than debating graphs with all details set to the max at a silly 100+ FPS.
 
Dave, I'm not sure I like you promoting turning off AF in favor of other settings. Perhaps you prefer CA over AF?
 
I'm not promoting anything; likewise, I can't speak to everyone's preferences. The same goes for a benchmark lab, and things would get even muddier if you start disabling individual graphics settings. In the interest of keeping things as normalised as possible, and for the sake of time, you are likely to look at the big buckets that are tweakable in every title: resolution, the major game-settings buckets (Normal, High, Ultra Quality, etc.), and AA/AF levels.
 
There is an HWS, or hardware scheduling, mode mentioned for managing compute in the AMD HSA Linux kernel driver. AMD might indicate at some point whether that is coincidental.
Same thing. HWS allows some ACEs to be "more equal than others" as Dave said.

The code operates in one of three modes, selectable via the sched_policy module parameter:

- sched_policy=0 uses a hardware scheduler running in the MEC block within CP, and allows oversubscription (more queues than HW slots)
- sched_policy=1 also uses HW scheduling but does not allow oversubscription, so create_queue requests fail when we run out of HW slots
- sched_policy=2 does not use HW scheduling, so the driver manually assigns queues to HW slots by programming registers

The "no HW scheduling" option is for debug & new hardware bringup only, so has
less test coverage than the other options. Default in the current code is "HW
scheduling without oversubscription" since that is where we have the most test
coverage but we expect to change the default to "HW scheduling with
oversubscription" after further testing. This effectively removes the HW limit
on the number of work queues available to applications.

http://lists.freedesktop.org/archives/dri-devel/2014-July/064011.html
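
For anyone who wants to poke at the difference between the three modes, here's a self-contained toy model in C. The struct, enum names, helper logic, and slot counts are my own illustration of the behaviour described above, not actual amdkfd code:

Code:
#include <errno.h>
#include <stdio.h>

/* Toy model of the three sched_policy modes described in the quoted
 * patch notes. Names and numbers are illustrative assumptions. */
enum sched_policy {
    SCHED_HWS = 0,            /* HW scheduling, oversubscription allowed */
    SCHED_HWS_NO_OVERSUB = 1, /* HW scheduling, capped at HW queue slots */
    SCHED_NO_HWS = 2          /* driver programs queue registers (debug) */
};

struct device {
    enum sched_policy policy;
    int hw_slots; /* queue slots the hardware exposes */
    int queues;   /* queues created so far */
};

static int create_queue(struct device *dev)
{
    switch (dev->policy) {
    case SCHED_HWS:
        /* The scheduler running in the MEC multiplexes runlists, so
         * more queues than HW slots is fine. */
        dev->queues++;
        return 0;
    case SCHED_HWS_NO_OVERSUB:
    case SCHED_NO_HWS:
        /* One queue per HW slot; create_queue fails once slots run
         * out. In the NO_HWS case the driver would also program the
         * slot's registers itself. */
        if (dev->queues >= dev->hw_slots)
            return -ENOSPC;
        dev->queues++;
        return 0;
    }
    return -EINVAL;
}

int main(void)
{
    struct device dev = { .policy = SCHED_HWS_NO_OVERSUB, .hw_slots = 2, .queues = 0 };

    /* The third call fails with -ENOSPC; switch the policy to
     * SCHED_HWS (oversubscription) and it would succeed. */
    for (int i = 0; i < 3; i++)
        printf("create_queue #%d -> %d\n", i + 1, create_queue(&dev));
    return 0;
}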
 
Thank you for the information.
With so few letters in common, an unintended name collision seemed possible, but I couldn't cite anything to back up the assumption.
I do note that Sea Islands, or the APU subset of it, is mentioned as the starting IP, which gives context for where HWS may fit in this thread.
 
No need for scorn: usually the Fury X suffers less of a performance hit than a GM200 when going from no AF to 16:1 AF (regardless of whether it's at driver default settings or high-quality settings). AMD might only be hurting themselves here.

Interesting; the one recent review pointing out an AF hit that I could find was HardOCP's Watch Dogs review, where the 290X had a much greater performance loss than the 780 Ti, which barely budged with 16x AF.

As for AMD hurting themselves, the other thing conspicuous by its absence was a DSR vs. VSR showdown in the various reviews, considering AMD had a slide showing how VSR had a negligible performance hit on the Fury X. With the Fury X's abysmal 1080p performance, VSR could have been a saving grace, yet it was nowhere to be found. The funny thing is that Nvidia came up with this idea of rendering at a higher resolution on widespread 1080p monitors, while AMD had more to gain from it.
 
Isn't AMD's base clock a true base, in which clocks can only increase? 1GHz in this case is the boost clock.
It was just the opposite, AFAIK.
At least for the 290X, that was why people got upset: the card often did not reach that number, while Nvidia always specifies a minimum speed plus a boost clock that isn't guaranteed.
 
I think 290X muddied the water because the default BIOS switch was "quiet" mode, i.e. throttle clocks at the merest hint of work. Non-reference coolers were fine.

But I also think AMD changed things. After the original HD7970 (and pals), the base clock disappeared and AMD only gives a boost clock, as you say.

Titan X with its reference cooler is basically as bad as a reference 290X in quiet mode (87% versus 85%, respectively):

[Image: IMG0047707_1.png]


(from http://www.hardware.fr/articles/937-9/protocole-test.html ) There's a smaller set of results for 3840x2160, where Titan X gets worse.

But where's the fuss about the Titan X?

Original HD7970 doesn't throttle.
 
Rumor .... lol
You heard that right, the Radeon R9 Nano will have the same price as the Radeon R9 Fury X, selling at $649 US.
...
The Radeon R9 Nano comes with performance slightly within the range of the Radeon R9 Fury but not close to the Radeon R9 Fury X, which is clocked higher and has higher thermal thresholds, so its clock rate remains more stable. The Radeon R9 Nano can be seen as a premium-grade offering rather than a highly competitive solution aimed at attracting the premium Mini-ITX market.
http://wccftech.com/amd-radeon-r9-n...fiji-gpu-4-gb-hbm-performance-faster-gtx-980/
 