AMD: Speculation, Rumors, and Discussion (Archive)

Status
Not open for further replies.
So far, no third-party benchmark has been able to prove the “aggressive primitive culling” that AMD said it introduced in Polaris.
At least strips seem to be much faster. I would assume that strip cut indices have a fast path now. Also degenerate triangles (common in non-indexed strips) should be much faster now (degenerate = zero area -> culled by primitive discard accelerator).
http://techreport.com/review/30328/amd-radeon-rx-480-graphics-card-reviewed/5
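
To make the degenerate-triangle point concrete, here is a minimal Python sketch, not anything taken from AMD's hardware or drivers. It walks a triangle strip given as a list of vertex ids, counts the triangles that repeat a vertex (and therefore have zero area), and shows that a strip cut value avoids generating them at all. The function names and the 0xFFFF cut value are illustrative assumptions, and winding order is ignored.

```python
def strip_triangles(indices, restart=None):
    """Yield (a, b, c) triples for a triangle strip.

    `restart` is an optional strip-cut value: when it appears, the
    current strip ends and a new one begins. Winding order is ignored.
    """
    run = []
    for idx in indices:
        if restart is not None and idx == restart:
            run = []          # strip cut: start a fresh strip
            continue
        run.append(idx)
        if len(run) >= 3:
            yield tuple(run[-3:])


def is_degenerate(tri):
    """A triangle that repeats a vertex has zero area.

    Assumption: only repeated ids are detected; collinear but distinct
    vertices would need a position-based area test.
    """
    a, b, c = tri
    return a == b or b == c or a == c


def count_degenerate(indices, restart=None):
    """Return (degenerate_count, total_triangle_count) for the strip."""
    tris = list(strip_triangles(indices, restart))
    return sum(is_degenerate(t) for t in tris), len(tris)


# Two quads stitched into one strip by repeating vertices 3 and 4.
stitched = [0, 1, 2, 3, 3, 4, 4, 5, 6, 7]
# The same two quads separated by a (hypothetical) 0xFFFF cut value.
with_cut = [0, 1, 2, 3, 0xFFFF, 4, 5, 6, 7]
```

In the stitched version half of the triangles sent down the pipeline are zero-area filler, which is the kind of work a primitive discard stage would be expected to throw away early.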

I remember some polygon throughput benchmark test results (few years back) where Nvidia was beating AMD badly in cases where most of the triangles were zero area or hidden. Unfortunately I can't find the results anymore and I don't remember the name of the benchmark.
So what exactly do you want to see changed? Just more ROPs?
Isn't rasterization something that's entirely hidden?
Yes, the rasterizer is hidden, but it can be a performance bottleneck, especially when there are lots of small triangles hitting the same area. Nvidia's tiled rasterizer is an elegant way to combat this issue. Maxwell was a big efficiency improvement for Nvidia. I would prefer a more efficient rasterizer, but more ROPs wouldn't hurt either. Fury X had much higher bandwidth and more CUs than the high-end Radeon 300 series cards, but an identical number of ROPs. Compute performance was not a problem for Fury X; filling the compute units with work was the real problem. Fixed-function bottlenecks need to be solved in order to utilize the compute units better. Both the geometry pipeline and the rasterizer are still behind Nvidia.

DCC in Tonga (and Polaris DCC improvements) is a nice boost for GPU utilization. GCN 1.0/1.1 had to perform several decompression steps (fast clear elimination, depth decompress, MSAA decompress, etc) during the frame. New AMD cards can directly read DCC data. Decompress steps are awkward, since you need to wait for GPU idle twice (rasterizer work finished, decompress finished) before you can start reading the texture.
 
Wasn't that before 16nm got delayed several years?

20nm was dropped (it should already have been in use in the Maxwell and Hawaii period), but 16nm seemed on track to me. That said, GTC 2013 and 2014 were still talking about Volta; Pascal was slotted in between Maxwell and Volta in 2015.

But as for my post, it was just to show that AMD showing a roadmap with the Polaris and Vega names in December, then giving more details over the following months, is really not a big deal compared to Nvidia, who can "tease and show slides of their future architectures" several years ahead (peaking at GTC in March 2015, which showed Pascal, performance figures and even the "GPU").
 
Can someone wake me up when I can get a 470 or 480 non-ref at MSRP and in stock?
 
When September ends? :LOL:
In your neck of the woods, I only see a RX 470 at the Philly Microcenter store in stock at $219 ... store pickup only.
http://www.microcenter.com/product/468062/Radeon_RX_470_OC_Black_Edition_4GB_GDDR5_Video_Card

According to this review that is the MSRP.
http://www.legitreviews.com/xfx-radeon-rx-470-4gb-black-edition-video-card-review_184908

A custom RX470 with an MSRP $40 above reference is absolutely ridiculous. The price of any custom RX470 should not exceed the reference 4GB RX480 for it to make any sense.
 
I see. Polaris has been in development for four years, give or take, so it is "possible" to tease, but it's not really recommended. I think it is easier for Nvidia since they have more development power and can develop things much faster than AMD. If AMD teased something years ahead, Nvidia could look at it, try to "copy" it, and probably end up launching it before AMD itself.
 
Microcenter has horrible prices on everything except CPUs. They exist in case you need something ASAP and can't wait for shipping. At least near me in North Jersey there is nothing else like it where I can go in and pick from a few dozen CPUs and motherboards.
 
The MSI non-reference 470 4GB was announced at $179. We'll see how long it takes for it to be in stock and available at that price.
 
The Asus Strix non-OC 470 was $185 plus $5 shipping. That's a price I'm willing to pay (import fees go up steeply in my country above $200). BUT there is no stock. I won't buy a 470 for more than $200; it's just not worth it to me, and it's completely stupid that I could get a faster 480 for the same price. AMD, seriously?
 
You may have to wait until the crypto miners have had their fill. Either that or camp on store sites to try to beat them before they order all available stock. Oh, and you have to beat the opportunistic "resellers" who buy at an established storefront and then resell on the eBay/Amazon/Newegg marketplaces.

That last one is happening a lot with both AMD and Nvidia cards currently. It took me a while to get the Asus 1070 at its 409.99 USD retail price because the "resellers" would snatch them all up within 1-3 hours of them being listed at Newegg (you can find them at retail price more easily on Amazon, which collects taxes for most states, making them less attractive to "resellers" as it eats into their "profit margin"). The Amazon effect doesn't deter crypto miners though.

Regards,
SB
 
It's not about Microcenter though; the MSRP is set by XFX. It's ridiculous that they think a 50 MHz core overclock and a custom cooler on the 470 are worth $20 more than a 4 GB RX480, or only $20 less than an 8 GB RX480 (this is also partly due to AMD pricing the 470 and 480 just $20 apart). Same thing with custom RX460s for $139 or GTX 1060s for $329, etc. At that point you might as well spend a bit more and go for a higher-tier card.
 
Right now if I were on a really tight budget the 4GB 470 is probably what I would go for.

I've ordered the MSI RX 470 Gaming X 4GB; it should arrive this week. Since I'm on 1080p and don't plan on paying crazy prices for PC VR headsets, this will serve me fine.
 
If only there were a blog post somewhere that explains how he's doing this. It probably requires messing around with very dangerous substances, but there must be enough people with access to some kind of chemical lab installation to try this as well.

Die shots of Maxwell, Pascal, ... They're all missing.
He went through a few iterations in his process. I think he explained it once in the 3Dcenter forum, but basically the only chemical he uses is water. He really just mechanically grinds down the metal layers after unsoldering the chip from the substrate (that's why you see scratches in the pictures). First with very fine sandpaper, then with polishing tissue, all in a wet environment. To improve the flatness and get even lapping, he does it on a flat glass plate and glues the die to (or even into) a block of epoxy. And he apparently does it manually, not with some fancy polishing machine.

He showed a few steps for a Northwood P4 die:
northwood_lapping_witd1j7p.jpg


And for the photos he doesn't even use a high end camera (just a "reasonably good" one, afaik ~500€ body with a sub 500€ lens, but the positioning of camera and object requires some effort [translation stage with stepping motor, probably costs the same as the camera itself], there are a few pictures on his flickr account showing the setup [the linked one is the first, just scroll to the right]). But that is enough for this kind of detail (I actually downscaled it a bit, the original has an even higher resolution):
polaris10_valuslds_smzssbd.jpg

On top one sees two vALU pairs from neighbouring CUs and underneath the respective LDS arrays. So in total there is 4*64kB vRegs + 2*64kB LDS = 384kB SRAM in this section which measures below 2mm² (a pixel equals about 1.2µm in this shot, which makes a pure 2kB vReg SRAM bank [without the stuff around it] about 1400µm² in size or 0.085µm² per bit in that array if I didn't miscalculate it, high density SRAM cells in Samsung's/GF's 14nm FF process should have a cell size of 0.064µm²; fits together quite well considering the overhead for a functional array I think).
One can even see quite some structure in this tiny bluish thing in between the CUs (that it belongs to the clock distribution was a guess made here in the thread).
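
For anyone who wants to check the density estimate, the arithmetic can be redone in a few lines of Python. This is only a sketch using the figures quoted in the post (1400µm² measured for a 2kB vReg bank, 0.064µm² as the expected high-density cell size); the variable names are made up for illustration, and kB is taken as 1024 bytes.

```python
BITS_PER_BYTE = 8
BYTES_PER_KB = 1024  # assumption: kB means 1024 bytes here

# SRAM visible in the section: 4 vReg files and 2 LDS arrays, 64 kB each.
total_sram_kb = 4 * 64 + 2 * 64

# One vReg SRAM bank: 2 kB measured at about 1400 square micrometres.
bank_bits = 2 * BYTES_PER_KB * BITS_PER_BYTE
bank_area_um2 = 1400.0
area_per_bit_um2 = bank_area_um2 / bank_bits

# Quoted high-density cell size for the 14nm FF process.
cell_area_um2 = 0.064
array_overhead = area_per_bit_um2 / cell_area_um2

print(f"total SRAM in section: {total_sram_kb} kB")
print(f"area per bit: {area_per_bit_um2:.4f} um^2")
print(f"array area vs. bare cell: {array_overhead:.2f}x")
```

With these inputs the measured array comes out roughly a third larger per bit than the bare cell, which is consistent with the "fits together quite well" conclusion once array overhead is allowed for.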

And regarding other GPUs, send him a note and a card with the GPU you want to see. ;)
 
Clearly nobody in their right mind should expect Vega before Q1 2017. AMD has been demoing most of its high-end sponsored games (Deus Ex, Battlefield, etc.) at trade shows on the Fury X (yeah, even they know that the RX480 is not going to cut it there), and the official Gears of War 4 PC specs released today list the Fury X alongside the 980 Ti and 1080 as the "ideal" config GPUs.

LrRyN76.png


And just a few minutes ago HP announced its new OMEN gaming rig, which is set to be released in October 2016:
6th generation Intel Core i5/i7 over-clockable processors and the latest graphics technology, up to dual NVIDIA GeForce GTX 1080 and up to dual AMD Radeon R9 Fury X
The Fury X is here to stay...a bit longer than originally planned I guess..thankfully for AMD DX12 & Vulkan are helping it a bit..there also has been a recent price cut (worldwide it seems) which should help them clear everything before the end of Q4 2016
 
looks like pizza
 
The result of an automated layout of the logic blocks. This tends to create somewhat chaotic looking structures.
Even the look of the old 40nm WiiU GPU (VLIW5) resembles this somewhat (don't have another high res die shot at hand, this is half of the ALUs of an SIMD, but the WiiU had less replication, the blocks differed all a bit):
wiiu_gpu_half_simdler89.jpg
 
Awesome! Glad I asked.

I never considered mechanical grinding as a valid option. The few other decappers on the web are using the nasty stuff.

Maybe I should send him a broken eBay 750Ti. ;)
 
In my case I can't describe how painful it is to see the 4GB 470 Nitro for 220 when the 480 Nitro is 210... I feel robbed.

 