AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

Infinisearch · Jun 23, 2015

I figured it was backed in VRAM, but my guess was there was an on-chip cache dedicated to it. It just makes sense from a performance perspective... adding an L2 cache access for every few framebuffer accesses doesn't sound very nice.

mczak · Jun 23, 2015

Infinisearch said:
I figured it was backed in VRAM, but my guess was there was an on-chip cache dedicated to it. It just makes sense from a performance perspective... adding an L2 cache access for every few framebuffer accesses doesn't sound very nice.

It is certainly possible (and indeed pretty much a must) that there's some cache dedicated to it in the ROP, but I was just pointing out it wouldn't make sense to store all of it on chip at once (as older chips did), as the max cache size you'd need is simply too big. But we don't know much about how the ROP cache works in general (tile data + metadata), and I have no idea how this works for newer chips (which need to access this from TMUs, so probably indeed through L2). But certainly it could be stuffed into ordinary cache lines: one cache line, no matter where, can of course store information about quite a few tiles. With those mentioned 2 bits per tile, for an rgba8 format the metadata is only 1/1024 of the data you'd need for the uncompressed pixels, so unless you have a really terrible case not achieving any compression it's not a big deal.
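As a rough check of the 1/1024 figure, here is a minimal Python sketch of the arithmetic. The 2 bits per tile and the rgba8 format come from the post; the 8x8 tile size is an assumption inferred from the ratio itself, and the 64-byte cache line and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from math import ceil

META_BITS_PER_TILE = 2   # from the post: 2 bits of state per tile
BYTES_PER_PIXEL = 4      # rgba8: four 8-bit channels
TILE_W = TILE_H = 8      # assumption: 8x8 tiles (this is what yields 1/1024)
CACHE_LINE_BYTES = 64    # assumption: 64-byte cache line


def metadata_ratio():
    """Metadata size as a fraction of the uncompressed tile size."""
    tile_bits = TILE_W * TILE_H * BYTES_PER_PIXEL * 8
    return Fraction(META_BITS_PER_TILE, tile_bits)


def tiles_per_cache_line():
    """How many tiles one cache line of metadata can describe."""
    return CACHE_LINE_BYTES * 8 // META_BITS_PER_TILE


def metadata_bytes(width, height):
    """Total metadata for a render target of the given size in pixels."""
    tiles = ceil(width / TILE_W) * ceil(height / TILE_H)
    return ceil(tiles * META_BITS_PER_TILE / 8)


if __name__ == "__main__":
    print(metadata_ratio())            # 1/1024
    print(tiles_per_cache_line())      # 256
    print(metadata_bytes(3840, 2160))  # 32400
```

Under these assumptions a single cache line covers 256 tiles, and even a 3840x2160 target needs only about 32 KB of metadata in total.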

Infinisearch · Jun 23, 2015

mczak said:
But certainly it could be stuffed into ordinary cache lines: one cache line, no matter where, can of course store information about quite a few tiles. With those mentioned 2 bits per tile, for an rgba8 format the metadata is only 1/1024 of the data you'd need for the uncompressed pixels, so unless you have a really terrible case not achieving any compression it's not a big deal.

You're right, I forgot about Morton ordering... so that would be twice the number of tiles per byte size cache line. But I was told in another thread https://forum.beyond3d.com/threads/gpu-cache-sizes-and-architectures.56731/ that, at least on GCN, the CB and DB caches don't go through the L2... did that change for Tonga/GCN + framebuffer compression?

mczak said:
and I have no idea how this works for newer chips (which need to access this from TMUs, so probably indeed through L2)

Why would TMUs need access to this? If you mean render target to texture, doesn't the driver resolve the compressed RT first manually? At least that's what I assumed.

Miksu · Jun 23, 2015

The Fury X should go on sale tomorrow so I think we should see reviews soon. Any word when exactly reviews are out?

Lightman · Jun 23, 2015

Miksu said:
The Fury X should go on sale tomorrow so I think we should see reviews soon. Any word when exactly reviews are out?

Tomorrow as well. NDA lift and sales will coincide.

Grall · Jun 23, 2015

Further thinking about buffer compression... How do you store compressed pixels efficiently in memory, are they packed together into bundles to fit DRAM burst length, or how does it work? I can only assume it would be very inefficient to store a few bytes at most where a full 8 or 16 byte deep color pixel used to reside, and then repeat for all other pixels.

I'm just thinking that if you re-pack the pixels, you would need some kind of mechanism to quickly isolate an individual pixel inside the bundle when you need to read it back again... *shrug* Instances like this, I really wish I'd had the brains and willpower to have studied graphics engineering stuff in university...

DmitryKo · Jun 23, 2015

Grall said:
How do you store compressed pixels efficiently in memory, are they packed together into bundles to fit DRAM burst length

Yes, fixed-length encoding - probably pixel differences in an 8x8 block.

http://graphics.stanford.edu/~mhous...all/HoKo_compression_in_graphics_pipeline.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.8187&rep=rep1&type=pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.81.412&rep=rep1&type=pdf
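
To make the fixed-length idea concrete, here is a minimal Python sketch, not a description of what AMD actually implements. It assumes a single 8-bit channel, an 8x8 block, and a simplified scheme where each pixel is stored as a 4-bit difference from the block minimum; all names and parameters are hypothetical. Because every difference has the same width, an individual pixel can be located inside the packed block by a simple shift, without decoding its neighbours.

```python
def encode_block(pixels, delta_bits=4):
    """Try to pack one channel of an 8x8 block (64 values, 0..255).

    Returns (base, packed) if every difference from the block minimum
    fits in delta_bits, otherwise None (store the block uncompressed).
    """
    base = min(pixels)
    deltas = [p - base for p in pixels]
    if max(deltas) >= (1 << delta_bits):
        return None
    packed = 0
    for i, d in enumerate(deltas):
        packed |= d << (i * delta_bits)  # fixed-width slot for pixel i
    return base, packed


def decode_pixel(base, packed, index, delta_bits=4):
    """Random access: read one pixel straight out of the packed block."""
    mask = (1 << delta_bits) - 1
    return base + ((packed >> (index * delta_bits)) & mask)


def compressed_size_bytes(n_pixels=64, delta_bits=4, base_bits=8):
    """Size of a compressed block: one base value plus fixed-width deltas."""
    return (base_bits + n_pixels * delta_bits + 7) // 8


if __name__ == "__main__":
    block = [100 + (i % 16) for i in range(64)]  # smooth-ish test data
    base, packed = encode_block(block)
    print(decode_pixel(base, packed, 17) == block[17])
    print(compressed_size_bytes(), "bytes instead of 64")
```

In this toy scheme a compressible block shrinks to a fixed 33 bytes, and a block that does not fit is simply left uncompressed, which is the kind of per-tile state a couple of metadata bits could record.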

I can only assume it would be very inefficient to store a few bytes at most where a full 8 or 16 byte deep color pixel used to reside, and then repeat for all other pixels

Inefficient for what exactly - perceived quality, compression ratio, or decoding complexity?

gongo · Jun 23, 2015

450GB/s, non-overclockable HBM...lolx

dskneo · Jun 23, 2015

rumors were wrong?

Lightman · Jun 23, 2015

Are you laughing because there is an error on the box or because you think Fury X will have 450GB/s memory bandwidth?

Another vendor, for instance, has put the Fury X up for pre-order, but don't buy it as it has GDDR5! (sarcasm)

gongo · Jun 23, 2015

Sapphire is an AMD premium partner and... I think they also assemble some cards for them... it would be a silly printing mistake, no?

Lightman · Jun 23, 2015

gongo said:
Sapphire is an AMD premium partner and... I think they also assemble some cards for them... it would be a silly printing mistake, no?

On the other hand, AMD designs the chip and puts out the specification. All Fury X cards will be AMD made for now (manufactured by a partner, most likely Sapphire). AMD states a 500MHz HBM clock in the official data-sheet, so please do the math and draw your own conclusion. Besides, HBM is overclockable like any other memory, but AMD will not allow overclocking from Overdrive, at least initially. There is no saying 3rd party tools will not break the lock, and TBH there were leaks a couple of months ago claiming HBM1 on Fiji can go as high as 700MHz.

Anyway, this is a non-issue for now, as there is no point having massive bandwidth if your engine can't utilize it. We will have to wait and see if Fiji can be memory bandwidth limited at all.

Xmas · Jun 23, 2015

Infinisearch said:
Why would TMUs need access to this? If you mean render target to texture, doesn't the driver resolve the compressed RT first manually? At least that's what I assumed.

I don't know what AMD implemented, but a resolve pass required on colour render targets might often actually increase total memory bandwidth requirements.

Note that bandwidth savings aren't everything, and it is sometimes possible to achieve higher performance by increasing the total number of memory accesses, if that leads to a more even spread of memory accesses over time. That would be the case if you could do the resolve (which is a streaming operation and thus doesn't need to pollute any cache) while the memory subsystem is mostly idle, but that seems unlikely.

lanek · Jun 23, 2015

dskneo said:
rumors were wrong?

I think the error is on the box, because the 512GB/s is coming from the AMD slides.

Deleted member 2197 · Jun 23, 2015

lanek said:
I think the error is on the box, because the 512GB/s is coming from the AMD slides.

We should know shortly from other AIB packaging.

lanek · Jun 23, 2015

pharma said:
We should know shortly from other AIB packaging.

A faster way: there's the webinar with AMD today about Fury and the rest of the graphics lineup. If someone attends, they can ask them. I'm registered, but not sure I can be online at that time.

HH has his sample, it seems, ready for a review tomorrow.
http://www.guru3d.com/news_story/are_you_ready_for_some_fury_tomorrow.html

Rurouni · Jun 23, 2015

Maybe the HBM is clocked at 440MHz? I think AMD is promising up to 512GB/s, and to achieve that they must use 500MHz. Since it's "up to", I wouldn't be surprised if 450GB/s on early cards is correct. It would still be the highest-bandwidth GPU available.
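
For anyone wanting to check the clock-to-bandwidth math, here is a small Python sketch. The 500MHz, 440MHz, 512GB/s and 450GB/s figures are from the thread; the 4096-bit total bus width (four 1024-bit HBM stacks) and two transfers per clock are assumptions not stated in the thread, and the function names are hypothetical.

```python
def hbm_bandwidth_gb_s(clock_mhz, bus_width_bits=4096, transfers_per_clock=2):
    """Peak bandwidth in GB/s (1 GB = 10^9 bytes).

    bus_width_bits and transfers_per_clock are assumed values for Fiji's
    HBM1 setup; they do not come from the thread.
    """
    megabits_per_second = clock_mhz * transfers_per_clock * bus_width_bits
    return megabits_per_second / 8 / 1000


def clock_for_bandwidth_mhz(gb_s, bus_width_bits=4096, transfers_per_clock=2):
    """Inverse: memory clock needed to reach a given peak bandwidth."""
    return gb_s * 1000 * 8 / (transfers_per_clock * bus_width_bits)


if __name__ == "__main__":
    print(hbm_bandwidth_gb_s(500))       # 512.0
    print(hbm_bandwidth_gb_s(440))       # about 450.56
    print(clock_for_bandwidth_mhz(450))  # about 439.45
```

Under those assumptions, a 500MHz clock gives exactly the 512GB/s from the AMD slides, while the 450GB/s on the box would correspond to a clock of roughly 440MHz.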

Grall · Jun 23, 2015

DmitryKo said:
Yes, fixed-length encoding - probably pixel differences in a 8x8 block.

Thanks for your reply!

Inefficient for what exactly - perceived quality, compression ratio, or decoding complexity?

Pardon my imprecise language - I meant memory bandwidth-wise, as writing a few bytes spread out here and there is quite wasteful.

gongo · Jun 23, 2015

How many hours more before NDA Reviews are up...?
Exciting times...Fury X (450GB/s) or 980Ti G1....i need an upgrade.

Grall · Jun 23, 2015

gongo said:
How many hours more before NDA Reviews are up...?

I know that upgrade demon intimately. It starts haunting me strongly every two years or so!
