The AMD Execution Thread [2018]

CSI PC · Jan 16, 2018

Gubbi said:
One of the shortcomings of Bitcoin is that it is mined much more effectively with special-purpose ASICs than with general-purpose CPUs or GPUs. The barrier to entry for a miner who wants to mine Bitcoin economically is thus high. That's a problem, because the validity of the blockchain is decided by majority, and when you have only a few big miners, your currency is vulnerable to them pooling their resources to get a majority (a so-called 51% attack).
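To make the concentration argument concrete, here is a minimal sketch that counts how few miners would have to collude to control more than half of the hashrate. The share figures and the function name are purely hypothetical, chosen for illustration and not taken from any real network.

```python
def smallest_majority_coalition(shares):
    """Return the fewest miners (largest first) whose combined share exceeds 50%."""
    total = sum(shares.values())
    coalition, combined = [], 0
    # Greedily add the biggest miners until they hold a strict majority.
    for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        coalition.append(name)
        combined += share
        if combined * 2 > total:
            break
    return coalition


# Hypothetical hashrate shares (arbitrary units), NOT real data.
concentrated = {"pool_a": 30, "pool_b": 25, "pool_c": 20, "others": 25}
spread_out = {f"miner_{i}": 1 for i in range(100)}

print(smallest_majority_coalition(concentrated))     # ['pool_a', 'pool_b']
print(len(smallest_majority_coalition(spread_out)))  # 51
```

With ASIC-style concentration two parties suffice in this toy example, while an evenly spread network would need 51 of 100 participants to cooperate.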

The new currencies try to mitigate this by shaping the workload so that it fits the capabilities of hardware in widespread use. E.g. Ethereum has a big (and growing) memory footprint, which renders ASICs worthless (because the cost is primarily in the memory subsystem). Mining is piggybacked onto existing hardware.

As long as cryptocurrencies are an inflating market, we'll see big demand for hardware that mines well.

We'll just have to wait for the bubble to burst

Cheers

Or wait for the aliens to attack and knock out all electronics

That is why it is important to hide gold with a big X in the garden 4 paces away from a tree.... no one will ever know it's there, and it's totally secure

OK, more seriously: or until China/Russia decide Bitcoin etc. does not fit in with their views and destabilise such systems in multiple ways.

DmitryKo · Jan 17, 2018

ToTTenTranz said:
Making dedicated ASICs is too costly, risky and it would take too much time. Even if they had started to develop dedicated "blockchain-processor-units" (BPUs) back in mid-2017, a real product wouldn't be out until late 2019, and who knows if such a product would have demand by then.
As for SKUs with fewer video outputs and lower clocks, OEMs already have those and I think they weren't terribly successful because they didn't manage to properly optimize the voltages, and they didn't make them significantly cheaper

Sorry, I don't think I'm being entirely clear. I'm not talking about a dedicated ASIC or a new 'BPU' architecture, or OEM cards with a custom board layout.

I'm thinking about a new class of professional PCIe expansion cards solely dedicated to miners - like professional OpenGL cards for CAD/CAM applications, which have higher-end hardware AND a different driver package tailored to the specific requirements of the computer animation, visual design, architecture, etc. crowd.

Professional equipment typically offers additional value and/or specific features, so the users can make money with this equipment in their routine daily work - as opposed to consumer equipment which is not as sturdy and not intended/warrantied for heavy daily work, or does not have specific features required in professional environments. This applies to anything from high-duty power tools to professional video equipment and server processors and motherboards, but somehow does not apply to mining cards.

I can think of a few ways that would make such professional "mining" boards more attractive in comparison to consumer cards:

1) On-die multichip interconnect
Instead of multi-chip or multi-card solutions, place several lower-cost GPUs and shared HBM memory in the same chip package.
This would make professional solutions scale in a cost-effective and power-efficient way.
I believe this is rumored for Navi 10.

2) Binning for low power
Professional chips can be binned for the lowest possible power requirements, improving their cost-efficiency - like the R9 Fury Nano, which offered the same 8 TFLOPS performance at about 1.5x lower power than the regular R9 Fury.
This should also improve failure rates, which is critical for 24/7 operation in datacenters and mining farms.

3) Optimised software
Compute-optimised drivers, additional shader languages targeting SPIR-V bytecode, standard processing library (like CUDA from Nvidia), early preview of Vulkan-based OpenCL, etc.

4) Fast processor interconnect
HyperTransport, PCIe 32x slots with PCIe 4.0/5.0 protocol, etc. to make NUMA nodes from each card.

5) Factory bundles
4-8 cards in one boxed package, intended for small mining farms.

The fact that current professional GPU solutions like the $2100 Radeon WX 9100 and $7000 Radeon Pro SSG do not really offer any additional value for mining/HPC applications in comparison to high-end 'consumer' cards like Vega64 is obviously an indication that we are going through a paradigm shift which GPU companies did not anticipate. Nvidia faces this problem as well - look at their EULA affair with Sakura, where they try to artificially restrict the use of consumer cards in professional environments.

I'd say their focus right now is on Vega M

How many people are actually going to buy an integrated $2000 17" gaming notebook instead of a $1200 gaming PC/27" QHD monitor combo, or a $1500 XBox One X /55" 4K TV combo?

Bondrewd · Jan 18, 2018

DmitryKo said:
PCIe 32x slots with PCIe 4.0/5.0 protocol

Not possible (for anything but 2*x16 risers); but long live CCIX (should AMD implement it)!

Grall · Jan 18, 2018

Bondrewd said:
Not possible

Anything is possible. I've never heard that a 32x socket is even defined in the standard, but standards can be amended... It would be a big brute for sure, though...

Maybe you could do a two-tier pin layout like AGP to shorten it. *shrug*

DmitryKo · Jan 18, 2018

Grall said:
I've never heard that a 32x socket is even defined in the standard

32-lane operation and the PCIe x32 slot were defined in the PCIe 2.0 specification from 2007, though they are not currently used in any product AFAIK.

FYI, AMD Epyc 7000 processors support a total of 128 PCIe 3.0 lanes, Ryzen Threadripper supports 64, and Intel Xeon supports 48.
SuperMicro server motherboards in the proprietary WIO/Ultra form factor, like H11D series (Socket SP3), X9D series (LGA-2011), and X11D series (LGA-3647), include proprietary WIO (x16+x16) and Ultra (x16+x8+x8) riser card connectors.
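For a rough sense of scale, the lane counts and link widths discussed here can be turned into numbers. The sketch below uses the published per-lane signalling rates and line encodings for each PCIe generation and ignores packet/protocol overhead, so real throughput is lower. The function names are illustrative only.

```python
# Per-lane raw rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
}


def link_bandwidth_gb_per_s(generation, lanes):
    """Approximate one-direction link bandwidth in GB/s (no protocol overhead)."""
    rate_gt, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt * efficiency / 8 * lanes


def full_x16_slots(total_lanes):
    """How many x16 links a CPU's lane budget could feed at full width."""
    return total_lanes // 16


for name, lanes in [("Epyc 7000", 128), ("Threadripper", 64), ("Xeon", 48)]:
    print(name, full_x16_slots(lanes), "x16 links")

print(round(link_bandwidth_gb_per_s("3.0", 16), 2))  # about 15.75 GB/s
print(round(link_bandwidth_gb_per_s("4.0", 32), 2))  # about 63.02 GB/s
```

A hypothetical x32 link at PCIe 4.0 rates would therefore carry about four times what a current PCIe 3.0 x16 slot does.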

Bondrewd said:
long live CCIX (should AMD implement it)!

Gen-Z, CCIX, NVLink, OpenCAPI, you name it - though I'd imagine AMD would want to stick with HyperTransport and PCIe 5.0

Bondrewd · Jan 18, 2018

DmitryKo said:
Gen-Z, CCIX, NVLink, OpenCAPI, you name it - though I'd imagine AMD would want to stick with HyperTransport and PCIe 5.0

CCIX looks the most lively though (and can reuse existing PCIe PHYs).

Deleted member 13524 · Jan 18, 2018

DmitryKo said:
How many people are actually going to buy an integrated $2000 17" gaming notebook instead of a $1200 gaming PC/27" QHD monitor combo, or a $1500 XBox One X /55" 4K TV combo?

Vega M is already in Kaby G inside a $1300 16mm thick 2-in-1.
OEMs making gaming laptops with the higher-TDP Vega M are probably going to charge as much as the ones with a GTX 1060, which is still in the ~$1200 range. In my country you can get a 15" 7700HQ with a GTX 1060 for as low as 1050€ - 1100€.
The XboneX + TV combo is a very different value proposition from a laptop, and so is a desktop PC with a 27" monitor.

But let's make the case for that desktop PC with 27" QHD monitor. What can you buy nowadays with $1200?

- $250 Monitor
- $120 motherboard
- $180 CPU
- $150 for 16GB RAM
- $150 for SSD
- $75 for PSU
- $25 for case

That leaves you.. $250 for the graphics card.
What GPU can you buy with $250 nowadays? A cut-down 3GB GTX1060 that will be slower than the full 1060 found in $1200 gaming laptops...
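The arithmetic behind that $250 figure, as a quick sketch. The part prices are the ones listed in the post; the dictionary keys and function name are just illustrative.

```python
BUDGET = 1200

# Part prices in USD, as listed in the post above.
parts = {
    "monitor": 250,
    "motherboard": 120,
    "cpu": 180,
    "ram_16gb": 150,
    "ssd": 150,
    "psu": 75,
    "case": 25,
}


def remaining_for_gpu(budget, parts):
    """Budget left for the graphics card after every other part is bought."""
    return budget - sum(parts.values())


print(sum(parts.values()))               # 950
print(remaining_for_gpu(BUDGET, parts))  # 250
```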

CarstenS · Jan 18, 2018

Is $250 enough to get a nice and shiny glare display?

Deleted member 13524 · Jan 18, 2018

CarstenS said:
Is $250 enough to get a nice and shiny glare display?

It's enough for a 1440p 27" display with a mediocre TN panel...

CarstenS · Jan 18, 2018

ToTTenTranz said:
It's enough for a 1440p 27" display with a mediocre TN panel...

Wouldn't buy.

Deleted member 13524 · Jan 18, 2018

CarstenS said:
Wouldn't buy.

Neither would I, but I wasn't making my case. I was making @DmitryKo 's...

CarstenS · Jan 18, 2018

Apart from me personally not liking/using glare/touch displays on anything other than a smartphone (I was kidding above), I think the Vega M can offer great value propositions for many people. Hopefully this time Intel will have enough leverage to prevent mediocre or worse designs, a (not in-house) problem AMD has had to fight many times in the past.

DrYesterday · Jan 18, 2018

DmitryKo said:
I'd rather think they decided to remain on current process so they don't lose 2-3 months of production capacity while retooling the lines to a new 12nm process.

Please correct me if I am wrong. My understanding is that the GF 12nm process is just a rebranding of the 14nm+ process. It involves only tighter design rules and new constructs, not different fab equipment. So there is no retooling downtime at the fab. AMD would need to make new masks and validate the new wafers, but that doesn't cost much production capacity.

DmitryKo · Jan 18, 2018

ToTTenTranz said:
Vega M is already in Kaby G inside a $1300 16mm thick 2-in-1.
OEMs making gaming laptops with the higher-TDP Vega M are probably going to charge as much as the ones with a GTX 1060, which is still in the ~$1200 range.

A 15.6" notebook with a ~3.5 TFLOPS video card may cost $1200, but one with a 17" display, which I was talking about, would be at least $1800.

And one with a 6-8 TFLOPS GTX 1070/1080 would be $2500.
https://www.notebookcheck.net/Notebookcheck-s-Top-10-Gaming-Laptops.98628.0.html

But let's make the case for that desktop PC with 27" QHD monitor. What can you buy nowadays with $1200 ...
What GPU can you buy with $250 nowadays? A cut-down 3GB GTX1060 that will be slower than the full 1060 found in $1200 gaming laptops...

Yes, with these insane video card prices, you'd actually need ~$1400 for a mid-range gaming desktop.

https://pcpartpicker.com/list/HY8fsJ

The GPU is the most important part of the setup though - so the best you can do to stay within a $1200 budget is spend $450-500 on a mid-range 6 TFLOPS part like the RX 470/570/480/580 or R9 290/390/290X/390X and cut corners elsewhere.

Or just buy a ready-made PC with an RX 570/580 pre-installed (such as the Dell Inspiron Desktop 5675, or the water-cooled Alienware Aurora R7), which would be somewhat cheaper compared with current street prices, and accept the fact that upgrade options will be limited to RAM and a second RX 580.

ToTTenTranz said:
enough for a 1440p 27" display with a mediocre TN panel

Most 15.6" FHD notebook panels do not look any better; they are the same TN type.

27" QHD IPS/PLS/xVA monitors (AOC Q2781PQ, BenQ GW2765HE, Philips 272B4QPJCB etc.) start from $300.

The XboneX + TV combo is a very different value proposition from a laptop, and so is a desktop PC with a 27" monitor.

Exactly. Who would really want to play top-tier desktop games on a tiny 15" laptop? I sold my SyncMaster 15GLe display back in 1999, thank you very much.

3dilettante · Jan 18, 2018

DmitryKo said:
I can think of a few ways that would make such professional "mining" boards more attractive in comparison to consumer cards:

1) On-die multichip interconnect
Instead of multi-chip or multi-card solutions, place several lower-cost GPUs and shared HBM memory in the same chip package.
This would make professional solutions scale in a cost-effective and power-efficient way.
I believe this is rumored for Navi 10.

Are there specific crypto-coins that are limited by the main chip's processing or internal transfers?
The rigs I've seen go out of their way to put as many discrete cards on a board as possible, hooked up to as few PCIe lanes as possible. The GPU is usually clocked low and undervolted if possible, with the memory pushed up as high as is feasible.

I see a possible use case for a way to chain multiple chips from a single PCIe slot without a bridge chip, although the actual speeds needed to get a DAG into a local DRAM pool don't appear to be that limiting if miners are content to plug many cards into risers or PCIe cables dangling from as many motherboard PCIe x2 slots as are available.

Maybe create a minimal mobile/external GPU package using external PCIe connectivity. If on a dedicated board, perhaps give them the ability to daisy-chain from a single PCIe x16 slot.

2) Binning for low power
Professional chips can be binned for the lowest possible power requirements, improving their cost-efficiency - like the R9 Fury Nano, which offered the same 8 TFLOPS performance at about 1.5x lower power than the regular R9 Fury.
This should also improve failure rates, which is critical for 24/7 operation in datacenters and mining farms.

I suppose this depends on the state of the mining market, and may be too late at this point. Right now, the market is so overheated and supply-limited that any chip able to reach the underclocked speeds miners set could be sold at a profit. And if the card can make back its purchase price before it breaks, the reliability standard seems like it could be more relaxed.

3) Optimised software
Compute-optimised drivers, additional shader languages targeting SPIR-V bytecode, standard processing library (like CUDA from Nvidia), early preview of Vulkan-based OpenCL, etc.

This would also depend on whether the market remains overheated, but so long as a card is profitable, does it matter as much? The larger outfits can invest in optimized code, and perhaps just one programming avenue for it could be enough--or none if a card is at least profitable and the situation is supply-limited.

Points 2 and 3, if the mining market remains as speculatively driven and supply-limited as it is (a risky assumption at this point long-term), point to another way to "cater" to them. They're paying ridiculous mark-ups, so the point isn't to make one's product more effective for their dollar; they're obviously being under-charged.

Create instructions that can noticeably help mining performance or efficiency, then throttle them.
Determine what patterns are needed for mining, then determine what lowered voltage+gating+clock levels those need, and then make them unavailable on standard firmware.
Charge extra for a card with the limits lifted, or charge for a driver version+firmware update that makes them available.

Vega's PSP already serves as a barrier AMD can use to restrict things.

4) Fast processor interconnect
HyperTransport, PCIe 32x slots with PCIe 4.0/5.0 protocol, etc. to make NUMA nodes from each card.

This is similar to point 1: is there a mining target that is constrained by that element of the system?
Coins like Ethereum are purposefully bottlenecked by local DRAM bandwidth, in part to make them ASIC-resistant and to prevent what is believed to be an unfair advantage from large SMP setups or clusters, which would use a high-speed system interconnect to get massive numbers of chips and memory to scale in performance.
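As a back-of-the-envelope illustration of that DRAM bottleneck: Ethash performs 64 DAG reads of 128 bytes per hash, so local memory bandwidth sets a hard ceiling on hashrate regardless of interconnect speed. The sketch below assumes every DAG read goes to DRAM at full nominal bandwidth, which real cards only approach. The 256 GB/s and 484 GB/s inputs are nominal figures for RX 580-class and Vega 64-class cards, and the function name is illustrative.

```python
ETHASH_ACCESSES = 64    # DAG reads per hash
ETHASH_MIX_BYTES = 128  # bytes fetched per read


def ethash_ceiling_mh_per_s(bandwidth_gb_per_s):
    """Upper bound on Ethash hashrate (MH/s) from memory bandwidth alone."""
    bytes_per_hash = ETHASH_ACCESSES * ETHASH_MIX_BYTES  # 8192 bytes
    return bandwidth_gb_per_s * 1e9 / bytes_per_hash / 1e6


print(ethash_ceiling_mh_per_s(256))            # 31.25
print(round(ethash_ceiling_mh_per_s(484), 1))  # 59.1
```

Since this bound depends only on the card's own memory bandwidth, a faster host link or NUMA-style fabric would not raise it.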

5) Factory bundles
4-8 cards in one boxed package, intended for small mining farms.

Charge extra for bundles of cards that can have their mining instructions and DVFS levels unlocked with a shared enablement key.

The fact that current professional GPU solutions like the $2100 Radeon WX 9100 and $7000 Radeon Pro SSG do not really offer any additional value for mining/HPC applications in comparison to high-end 'consumer' cards like Vega64 is obviously an indication that we are going through a paradigm shift which GPU companies did not anticipate.

For HPC, the SSG is a step in the direction AMD expects is necessary long-term. The capacity, cost, and static/refresh power consumption of DRAM are expected to scale too poorly, and inter-node communication power costs more if local storage spills. Some kind of non-volatile pool near the GPU is something AMD proposes to compensate, but there are more complex trade-offs based on workload and access patterns.

Nvidia faces this problem as well - look at their EULA affair with Sakura, where they try to artificially restrict the use of consumer cards in professional environments.

Not for mining, that was specifically carved out from the datacenter limitation.

CarstenS · Jan 19, 2018

3dilettante said:
Are there specific crypto-coins that are limited by the main chip's processing or internal transfers?

I don't know all the altcoins, but at least the Equihash algorithm scales quite well with core clock speed (and number of ALUs/amount of FLOPS). This of course limits the undervolting potential a fair bit, and you have to experiment (or google) to get to the sweet spot.

--
An approach like the one AMD and Nvidia took, or still take, with their professional graphics drivers might work, though. And if both companies are serious about mining hurting their "core" business of gaming cards, then we ought to see such diversification in the next round. I could imagine, for example, the easiest way: enable full mining throughput only if no display outputs are present (which might be fusable at configuration point?). This way, you also protect your Pro/Quadro lines from being slaved into mining duty, should the bubble ever grow large enough to make this viable.

Rootax · Jan 19, 2018

But how can you tell the difference between compute tasks needed for games, pro apps, etc., and a miner program (you can't whitelist/blacklist all of them...)?

CarstenS · Jan 19, 2018

I'm pretty sure the drivers can analyze the behaviour accurately enough to do that. After all, it's enough to throttle throughput over time, isn't it? And isn't Nvidia an AI company now?

edit: Most but not all games would use DX/DC anyway.

Lightman · Jan 19, 2018

CarstenS said:
I'm pretty sure the drivers can analyze the behaviour accurately enough to do that. After all, it's enough to throttle throughput over time, isn't it? And isn't Nvidia an AI company now?

edit: Most but not all games would use DX/DC anyway.

It's possible to do that, but I personally am against artificially limiting products. Besides, people will come up with algorithms that work on limited gaming cards regardless, the same way it happened with BTC and ASICs, driving people to ETH.
The only permanent solutions are regulating the crypto market or producing more GPUs and selling them to end users in a more direct fashion.

A1xLLcqAgt0qc2RyMz0y · Jan 20, 2018

Lightman said:
It's possible to do that, but I personally am against artificially limiting products. Besides, people will come up with algorithms that work on limited gaming cards regardless, the same way it happened with BTC and ASICs, driving people to ETH.

The only permanent solutions are regulating the crypto market or producing more GPUs and selling them to end users in a more direct fashion.

Which is what Nvidia is proposing:

NVIDIA Asks Retailers To Stop Selling To Miners & Sell To Gamers Instead
https://wccftech.com/nvidia-instructs-retailers-stop-selling-miners-sell-gamers

The company wants its cards to go to gamers instead and it’s trying to do so by directly selling on its website and asking its retail partners to limit orders to two cards per person. When asked to comment, NVIDIA provided the following statement to ComputerBase.de

Boris Böhles – NVIDIA (translated from German to English)
“For NVIDIA, gamers come first. All activities related to our GeForce product line are targeted at our main audience. To ensure that GeForce gamers continue to have good GeForce graphics card availability in the current situation, we recommend that our trading partners make the appropriate arrangements to meet gamers’ needs as usual.”

NVIDIA states that retailers have the freedom to follow or ignore its recommendations and that it can’t directly intervene in how they choose to run their businesses. This is where direct-selling by NVIDIA on its website plays a key role. The company can implement all the measures it sees fit to make sure that at least graphics cards sold directly by it, go to gamers instead of miners.

https://www.nvidia.com/en-us/geforce/products/10series/geforce-store
