My understanding is that the compression/decompression hardware is indeed between MC and ROPs. The blocks are too big to be decompressed on the fly for individual access (unlike s3tc textures), not to mention it can't really work that way for writing.
I am curious where the compression hardware is in the process. The least disruptive would be to have it on the path between the ROPs and the memory controllers (possibly in the controllers?), although what that does to a possible memory crossbar is unclear.
The compression and decompression process is mostly intended to save bus accesses, although that may not do much good for individual tile export or import from the ROP caches, since those have to be processed anyway and an extra DRAM burst or two is a handful of cycles at most.
The downside to that is that the ROP caches would have uncompressed data, so their hit rates would not improve. HBM's latency could be better than before, but the dominant factor is the DRAM arrays, which have not changed much.
Nvidia uses ROP (raster operations), AMD uses RBE (render backend), Intel uses CC (color calculator, though their docs also use the term "output merger", which is the language used by d3d10).
https://help.netflix.com/en/node/6662 Fwiw though, I used a 7950 with Netflix, and with no problems over HDMI to my Sony XBR6
I thought Netflix streaming and other similar services work through an Internet browser in Windows - I think you'd need a dedicated Win32 or WinRT app to use the protected video path.
4K Netflix and most other legitimate services will require HDCP 2.2 for 4K.
As for the 390X, at least one reviewer other than AMD themselves (so much for transparency) shows it performing near the 980Ti in a few games.
GTAV - 31.3 to 29
Evolve - 39.6 to 37.7
FC4 - 37.9 to 36.4
And above the 980 in most.
http://nl.hardware.info/reviews/613...et-bestaande-chips-benchmarks-alien-isolation.
Dual fragment-cache pixel processing circuit and method therefore
Multiple graphics primitives may be processed in quick succession where each of the multiple graphics primitives produces fragments that correspond to the same pixel location. As such, rather than forcing the render backend block to handle multiple fragments corresponding to the same pixel location, a cache structure can be used to buffer the received fragments prior to providing them to the render backend block. Including a cache structure in the data path for the pixel fragments enables multiple fragments that apply to the same pixel location to be combined prior to presentation to the render backend block. Offloading some of the blending operations from the render backend block can improve overall system performance.
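To make the idea of combining fragments before they reach the render backend a bit more concrete, here is a minimal sketch. It is not taken from the patent: the `FragmentCache` class, its method names, and the choice of premultiplied-alpha "over" blending are all assumptions made for illustration. The point it shows is that "over" is associative, so two fragments for the same pixel can be merged in the cache first and the render backend then only has to blend once against the framebuffer.

```python
from collections import OrderedDict


def over(src, dst):
    """Premultiplied-alpha 'over': composite src on top of dst.

    Colours are (r, g, b, a) tuples with premultiplied alpha (an assumption
    for this sketch, not something the patent text specifies).
    """
    k = 1.0 - src[3]
    return tuple(s + k * d for s, d in zip(src, dst))


class FragmentCache:
    """Hypothetical fragment buffer sitting in front of the render backend."""

    def __init__(self):
        self.pending = OrderedDict()  # (x, y) -> combined fragment

    def submit(self, xy, frag):
        if xy in self.pending:
            # A later fragment lands on top of the one already buffered.
            self.pending[xy] = over(frag, self.pending[xy])
        else:
            self.pending[xy] = frag

    def flush(self, framebuffer):
        # Stand-in for the render backend: one blend per pixel location.
        for xy, frag in self.pending.items():
            framebuffer[xy] = over(frag, framebuffer[xy])
        self.pending.clear()
```

With two overlapping fragments submitted for the same pixel, `flush` performs one framebuffer blend instead of two and ends up with the same colour as blending them one after the other.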
That's quite an optimistic interpretation of those results. In actuality the 980 beats the 390X in every one of the 1080p/max settings tests aside from two: Evolve (which is extremely AMD friendly) and GTAV (which looks to be down to memory limitations, although still a good win for AMD). The two GPUs trade blows at 4K, with the 390X taking 7 and the 980 taking 4, with 2 draws, although often these framerates are too low to be playable anyway.
In comparison to the 980Ti, the 390X is always well behind at any resolution, and arguably the 3 instances above where it comes close are all unplayable anyway (although Evolve is probably okay).
Wccftech has used "the ancient art of math" to calculate the possible FP32 TFLOPS compute performance of R9 Nano [not gaming performance].
http://wccftech.com/fast-amd-radeon-r9-nano-find/
Translating this to PNG style delta-colour filters, a scanline might be 8 or 16 pixels long with a fixed count of 4 or 8 scanlines.
Hi there
What I will say is the Fury is one of the best looking cards ever, in my PC at home it looks pretty awesome with the red Radeon logo and the rev counter LEDs which can be set to be red or blue. Pretty cool!
Also happy to report absolute zero coil whine and pretty much zero noise, never seen it go above 40C under load.
I can't hint at performance, simply because that would land me in trouble, but it's faster than my 290X, which was clocked at 1100/6000.
I don't remember last time I had a GPU without coil whine and it still is annoying after all these years listening to it!
p = a + b - c
pa = abs(p - a)
pb = abs(p - b)
pc = abs(p - c)
if pa <= pb and pa <= pc then Pr = a
else if pb <= pc then Pr = b
else Pr = c
return Pr
0 2 4 6 8 10 12 14
2 4 6 8 10 12 14 16
4 6 8 10 12 14 16 18
6 8 10 12 14 16 18 20
8 10 12 14 16 18 20 22
10 12 14 16 18 20 22 24
12 14 16 18 20 22 24 26
14 16 18 20 22 24 26 28
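As a rough check of how a PNG-style Paeth filter behaves on a smooth gradient like the 8x8 block above, here is a small sketch. It assumes the block is a tile of sample values, uses the PNG convention that neighbours outside the tile count as 0, and skips the modulo-256 byte arithmetic a real PNG encoder would do. The names `paeth`, `paeth_residuals`, `TILE` and `RESIDUALS` are just for this example.

```python
def paeth(a, b, c):
    """Paeth predictor: a = left, b = above, c = upper-left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def paeth_residuals(tile):
    """Return value minus Paeth prediction for every sample in the tile."""
    out = []
    for y, row in enumerate(tile):
        out_row = []
        for x, value in enumerate(row):
            a = row[x - 1] if x > 0 else 0
            b = tile[y - 1][x] if y > 0 else 0
            c = tile[y - 1][x - 1] if x > 0 and y > 0 else 0
            out_row.append(value - paeth(a, b, c))
        out.append(out_row)
    return out


# The gradient tile from the post: value = 2 * (x + y).
TILE = [[2 * (x + y) for x in range(8)] for y in range(8)]
RESIDUALS = paeth_residuals(TILE)
```

Under these assumptions the top-left residual is 0 and every other residual is 2. Paeth always returns one of the three neighbours, so on a constant-slope gradient it leaves a small constant delta rather than zeros.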
There are no electrolytic capacitors to go pop, either!
If it really has no coil whine at all then I'm mega happy!
They claim 7.84 TFLOPs for the Nano, if they use AMD's own numbers for the R9 290X and the performance+power ratios.
60 CUs at 1020MHz would achieve 7.834 TFLOPs.
Having the Nano at 3840 ALUs / 60 CUs and 1020MHz seems feasible. If the TMUs are cut accordingly (1/16th disabled), then there'd be 240 of them.
That said, Nano could be a part with 3840 ALU : 240 TMU : 64 ROP at 1020MHz.
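For anyone who wants to redo the arithmetic, here is a quick sketch. It assumes the usual GCN layout of 64 ALUs and 4 TMUs per CU and counts a fused multiply-add as 2 FLOPs per ALU per clock; the function names are made up for this example.

```python
ALUS_PER_CU = 64             # GCN: 4 SIMDs x 16 lanes per CU
TMUS_PER_CU = 4
FLOPS_PER_ALU_PER_CLOCK = 2  # one FMA counted as two FLOPs


def fp32_tflops(cus, clock_mhz):
    """Peak FP32 throughput in TFLOPS for a given CU count and clock."""
    flops_per_clock = cus * ALUS_PER_CU * FLOPS_PER_ALU_PER_CLOCK
    return flops_per_clock * clock_mhz / 1e6  # MHz * FLOPs/clock -> TFLOPS


def unit_counts(cus):
    """Return (ALUs, TMUs) for a given number of enabled CUs."""
    return cus * ALUS_PER_CU, cus * TMUS_PER_CU


nano_tflops = fp32_tflops(60, 1020)  # about 7.834
nano_alus, nano_tmus = unit_counts(60)
```

With 60 CUs this gives 3840 ALUs and 240 TMUs, and roughly 7.834 TFLOPS at 1020MHz, matching the figures above. At a sustained 900MHz the same part would sit at about 6.9 TFLOPS.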
We don't know if Fiji boosts and this could be a boost number, so maybe the chip stays at e.g. 900MHz but can boost up to 1020MHz in short periods of time.
To be honest, this was the kind of performance I was expecting for the aircooled Fury non-X. I thought the Nano would be much closer to the 290X in absolute performance, but from those performance/power ratios and the 175W TDP it's obvious it would become much more powerful.
... But the damn thing is microscopically cheap.