The AMD Execution Thread [2018]

Discussion in 'Graphics and Semiconductor Industry' started by A1xLLcqAgt0qc2RyMz0y, Jan 8, 2018.

Thread Status:
Not open for further replies.
  1. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,485
    Likes Received:
    897
    I shouldn't post when I'm this tired—thanks for pointing it out, it's fixed! :)
     
    digitalwanderer likes this.
  2. digitalwanderer

    Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    17,031
    Likes Received:
    1,605
    Location:
    Winfield, IN USA
    I really, really, REALLY hate to say this but I'm not so sure AMD's capacity will be the limiting factor. I'm terrified that DDR4 prices are going to keep going up, and with all the cryptowhores out there GPUs are gonna be insane for a while. I'm not predicting the death of pc gaming or anything, but I will be betting on a rather huge and continued slump while the price of entry is so damned high.

    There are some absolutely killer AMD cpu/mobo deals out there that would pump my system up to a real gaming rig again for under $250...but it's still gonna cost $180 for f-ing ram that cost $80 2 years ago and that just galls me to the bloody bone! I'm thankful as hell I still have a decent gaming card in my wife's rig I can steal (R390) because if I had to get one it would just TOTALLY kill it. :|

    Please don't get me wrong, AMD is doing great and I'm glad..but there are a whole bunch of other factors at play that I've never seen affect the market this much in so short a period in my lifetime, so I'm hesitant to be too bullish on anything right now. :/
     
  3. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    637
    Likes Received:
    477
    Location:
    55°38′33″ N, 37°28′37″ E
    Exactly. There could be several versions of 'professional' multi-chip cards, with the memory controller and interconnect implementation targeting a specific market.


    An entry-level 'mining' accelerator card would use several low- or mid-range dies with a reduced amount of inter-die communication, maximizing memory bandwidth with HBM stacks on each die. Each such 'professional' accelerator card would thus replace several consumer video cards in the same 8-slot mining setup.
    Such a card would also have value for HPC and/or gaming applications that can utilize multiple adapters and/or multi-adapter nodes.


    A high-end 'HPC' accelerator card would use high-performance dies with an increased number of inter-die links, higher PCIe bandwidth, and the Cache Coherent Interconnect for Accelerators (CCIX) protocol.

    In the current implementation, EPYC CPUs have a shared memory controller and a shared L3 cache across all dies.
    Navi GPUs could instead have a dedicated memory controller with its own HBM stacks on each die - pushing bandwidth past the 512 GB/s of the desktop Vega GPUs.

    I mean HPC workloads - even a 'mining' accelerator card has to be viable for traditional compute tasks.
    BTW ASICs with HBM memory are coming from Samsung, TSMC, and GlobalFoundries.

    Yes, the GCN architecture does implement hash instructions, so AMD could disable that microcode on non-professional cards, but it wouldn't help much if Ethash is memory-bandwidth bound.
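    The bandwidth-bound claim is easy to sanity-check with rough arithmetic: each Ethash hash reads 64 pseudo-random 128-byte pages from the DAG, about 8 KiB of DRAM traffic per hash, so memory bandwidth alone caps the hash rate. A back-of-envelope sketch (real cards land somewhat below the ceiling):

```python
# Rough upper bound on Ethash hash rate from memory bandwidth alone.
# Assumes the standard Ethash parameters: 64 DAG accesses per hash,
# 128 bytes each, i.e. 8192 bytes of random DRAM reads per hash.

BYTES_PER_HASH = 64 * 128  # 8 KiB of DAG traffic per hash

def ethash_ceiling_mhs(bandwidth_gbs: float) -> float:
    """Theoretical hash-rate ceiling in MH/s for a given GB/s of DRAM bandwidth."""
    return bandwidth_gbs * 1e9 / BYTES_PER_HASH / 1e6

print(ethash_ceiling_mhs(256))   # GDDR5 card at 256 GB/s: ~31 MH/s ceiling
print(ethash_ceiling_mhs(484))   # Vega 64's HBM2 at 484 GB/s: ~59 MH/s ceiling
```

    Disabling hash instructions wouldn't move this ceiling at all, which is the point being made.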

    Game engines are increasingly using compute tasks for pre-render computations that do not need to be bound to the graphics pipeline - so these restrictions would just stand in the way of legitimate gamers.

    The PC gaming market will not bear these prices for much longer. It is in the interest of all parties to return pricing to normal as soon as possible, or the ensuing crash will bury desktop hardware makers entirely.

    A 'free market' solution would be to increase production to reduce prices; a 'planned economy' solution would be to fix the price and regulate demand with queuing and rationing. We already know which one works.

    Even wholesale buyers cannot bypass board makers - the latter benefit the most from high prices, but they will also be the first to suffer if this situation continues. And they seem to realize this very well.
     
    #83 DmitryKo, Jan 31, 2018
    Last edited: Jan 31, 2018
    Grall likes this.
  4. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    637
    Likes Received:
    477
    Location:
    55°38′33″ N, 37°28′37″ E
    I can't see how power management counters could be used to reliably discriminate mining workloads.
    It's not practically enforceable by driver software and/or DRM schemes - there would have to be different hardware to prevent driver modding.
    The wealth of the current 'oligarchs' has nothing to do with goods shortages which ended with the Soviet Union.
    Impersonating someone's identity to avoid queuing or online restrictions is not the same as stealing someone's money from a bank account.
    Surely I can see Pacific Gas and Electric Company Police raiding residential buildings with thermal detectors in their hands. The staff will probably come from retired veterans of Florida Bathing Suit Patrol. Their effect on the gaming market would be zero or negative.
     
  5. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,286
    Likes Received:
    385
    Location:
    Australia
    There is supposed to be a massive amount of DDR4 capacity from China coming online this year, so memory prices should go down, not up.
     
    Lightman, eastmen and digitalwanderer like this.
  6. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,485
    Likes Received:
    897
    Could you point me to some additional information on that, please? I haven't heard anything about it.
     
    digitalwanderer likes this.
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,098
    Likes Received:
    2,814
    Location:
    Well within 3d
    While it may be possible to make links that fall short of the memory controller's throughput, the synchronization and routing for the fabric may run into issues if the memory clients responsible for arbitration, snoops, and ordering are overwhelmed by a link.

    The neat thing about those focused on mining is that their profit motive makes them contort their systems in significant ways, beyond the limits of general or professional workloads, and there's no obligation to let the hardware be taken to those points for free.

    The engines aren't going to be optimized for workloads that run the card off a 1x PCIe 2.0 slot, use no graphics buffers, ignore most graphics operations and floating point, do no texturing, run one or two kernel programs for days, use no API resources or operations, and hash through the same pseudo-random memory locations over and over.
    The system and workload fingerprints of mining are quite unique, and while it may be possible to create code that works around some of the checks, it won't come without a cost in hash rate or in the broader appeal of the hash algorithm.
    The profit motive intrudes at that point: work around restrictions at an ongoing penalty, or pay the premium and get on with mining before difficulty rises or the market corrects.

    A game that somehow manages to trip some of these checks at launch would get the same sort of follow-up game-ready hotfix driver that they all get, or dev builds would show low performance and they'd adjust the engine.

    It's in the best interests of the GPU vendors to charge those who will pay the most more than those who will not. Charging miners more would help moderate their demand and provide a financial cushion for the second-hand glut after a likely correction. Mining cards should be priced higher if only to account for the much higher risk a fickle mining market entails.

    Quadro, Titan, FirePro, Xeon, EPYC, i7, R7, Threadripper, and countless other upmarket products indicate that this sort of differentiation on the same hardware is sustainable.
    The mining market takes things to the next level by not having the workload complexity, economic sustainability, long-term thinking, or legal compliance that buyers in less price-sensitive segments have.

    Even if RTG and Nvidia took on the role of "GPU vendor of the miner-folk", their other markets, DRAM vendors, shareholders, and the foundries would not abandon their philosophy of maximizing revenue extraction.

    The hardware knows at a unit granularity what the workload is doing, and crypto algorithms that try to be ASIC-hard make certain choices that lead to discernible patterns. Focusing on local bandwidth as the limiter leads to a lot of pseudo-random accesses to cache and memory, mining focuses on reams of integer and bit operations, and resource allocations are not handled in the common fashion.
    Games are highly variable even within a frame, so the cumulative time a given set of operations takes up on the GPU can be tracked and duty-cycled with limited impact--unlike a workload churning through straightforward math or purposefully scattered accesses, without variation, for hours, days, or months.
    The internal tables are also not entirely visible to software, or the choice exists to give more authority to the internal execution loop. Various thresholds can override clock or voltage settings, and the platform for encrypted payloads and a multitude of keys is in place.
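    As a toy illustration of how counter-based duty-cycling could work in principle - every counter name, ratio, and threshold below is hypothetical, not any vendor's actual firmware logic - a monitor might only flag a workload whose fingerprint never varies across a long window:

```python
# Toy sketch of a counter-based workload classifier. All counter names
# and thresholds are hypothetical illustrations, not real firmware logic.
from collections import deque

class WorkloadMonitor:
    def __init__(self, window=3600):
        # One entry per sampling interval; old samples fall off the window.
        self.samples = deque(maxlen=window)

    def sample(self, int_op_ratio, dram_miss_ratio, gfx_busy_ratio):
        """Record whether one interval's counters look mining-like."""
        mining_like = (int_op_ratio > 0.9 and      # reams of integer ops
                       dram_miss_ratio > 0.8 and   # scattered DRAM misses
                       gfx_busy_ratio < 0.05)      # no graphics pipeline use
        self.samples.append(mining_like)

    def sustained_mining_fingerprint(self) -> bool:
        # Games vary frame to frame; only an unvarying workload keeps the
        # fingerprint over the entire window, so false positives are rare.
        return (len(self.samples) == self.samples.maxlen and
                all(self.samples))

mon = WorkloadMonitor(window=5)
for _ in range(5):
    mon.sample(0.95, 0.9, 0.0)    # crypto-like: heavy integer, no graphics
print(mon.sustained_mining_fingerprint())  # True
```

    A single game frame that dips into compute-heavy work resets the streak, which is why duty-cycling on a long window would bite constant workloads far harder than variable ones.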

    Quadro features and server options for the high-end SKUs have been segmented for decades on the same hardware, and current and future platforms are becoming more capable.

    Despite the fact that so many were either part of the state control apparatus or their cronies?
    How many planners went without the goods the common people couldn't get, and which side of the command structure would the vendors be in this analogy?

    Bank accounts are expected to carry a minimum balance, account creation requires a fair amount of the bank's time and resources to set up and register, and the law takes a dim view of large-scale wire fraud and identity theft.
    If these practical hurdles are somehow not a problem in your scenario, then the overdraft, credit, and loan services of a bank that misses dozens to hundreds of (apparently untraceable?) fake accounts created in a short period can be abused.

    Hobbyist miners won't bother, and large-scale miners don't need it.

    Why do you think it's that hard for electric companies to know how much they need to bill their customers for?
    For that matter, in the case of drug interdiction there are thresholds for review at sufficient levels of consumption.
    Past a certain level of consumption, the utility, local inspectors, and law enforcement can become involved, each with their own thresholds for questioning power delivery to a location.

    In the case of the big mining concerns, they aren't sneaking around. The power hookups and delivery plans are not secret. The delivery level, bulk rates, and service agreements are handled like they would be for any commercial/industrial customer.

    The idea is to make them choose to pay more for a special SKU or hardware unlock, or go without the profit from an optimal hash rate. Charge them less than it takes to make it wholly unprofitable, but up to the limit of overhead and ongoing cost incurred from working around restrictions.
     
  8. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    9,974
    Likes Received:
    1,491
    I was under the impression that the real wall was the RAM and motherboard stability with it. I'm hoping that with the new chipset and new power delivery setup, the new boards will allow more overclocking room, even for the older chips.
     
  9. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,727
    Likes Received:
    4,395
    Lightman likes this.
  10. Pressure

    Veteran Regular

    Joined:
    Mar 30, 2004
    Messages:
    1,317
    Likes Received:
    243
    It seems a single customer bought a lot.
     
  11. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,727
    Likes Received:
    4,395
    Baidu?

    Due to recent news, people were wondering if the Baidu sales included substantial numbers of Instinct MI25s, or if it was just a couple of GPUs with the brunt of it being EPYC CPUs.
    Turns out they bought a lot of MI25s. They're just not in their 1P servers.
     
    Grall likes this.
  12. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    637
    Likes Received:
    477
    Location:
    55°38′33″ N, 37°28′37″ E
    In June 2017, Nvidia Research published a detailed paper on their proposed multi-chip GPU design, which was evaluated in a simulator. Their findings should also be applicable to AMD's multi-die performance.

    MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability
    http://research.nvidia.com/publication/2017-06_MCM-GPU:-Multi-Chip-Module-GPUs

    https://wccftech.com/nvidia-future-gpu-mcm-package/
    https://techreport.com/news/32189/nvidia-explores-ways-of-cramming-many-gpus-onto-one-package
    https://wccftech.com/amd-navi-gpu-launching-siggraph-2018-monolithic-mcm-die-yields-explored/


    The simulated MCM-GPU has 4 GPU modules, each with 64 SM cores and a 768 GB/s link to local HBM. The modules are connected through a ring bus, and the required inter-module bandwidth is studied using real-world CUDA workloads running in the simulator (section 4).

    First they consider an unoptimized MCM-GPU design, with a 16 MB L2 cache directly connected to each module's crossbar/memory controller, and 128 KB of L1 cache per SM core.
    For memory-intensive workloads, an inter-module link bandwidth of 1x the memory bandwidth (768 GB/s) yields 60% of maximum theoretical performance (relative to an 'ideal' 6 TB/s link), while 2x bandwidth (1.5 TB/s) yields 90% and 4x (3 TB/s) yields 97%. For compute-intensive workloads, a 1x link corresponds to 85% of maximum performance, a 2x link to 97%, and 4x to 100% (Figure 4, section 3.3.2). An 'ideal' 6 TB/s link has little to no gain over a 4x (3 TB/s) link.
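    The scaling figures above, tabulated for quick reference (values transcribed from the paper's Figure 4, as fractions of the 'ideal' 6 TB/s configuration):

```python
# Relative performance vs. inter-module link bandwidth, as reported in
# the MCM-GPU paper (Figure 4). Fractions of the 'ideal' 6 TB/s case.
link_scaling = {
    # multiple of 768 GB/s: (memory-intensive, compute-intensive)
    1: (0.60, 0.85),
    2: (0.90, 0.97),
    4: (0.97, 1.00),
}

for mult, (mem, comp) in link_scaling.items():
    print(f"{mult}x ({mult * 768} GB/s): "
          f"mem-bound {mem:.0%}, compute-bound {comp:.0%}")
```

    The takeaway is that memory-intensive workloads are the ones that pay for a thin link, which motivates the cache and scheduling optimizations discussed next.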


    They assume that a 1x (768 GB/s) inter-module link can be easily implemented today, so this is the bandwidth they use in further research. A 1.5 TB/s link is reasonable to achieve as well, but 3 TB/s would require further development of signaling/packaging technology (section 3.3.3).

    Then they consider design optimizations to reduce the performance gap at 768 GB/s, such as adding a 16 MB L1.5 cache to each GPU module in order to cache remote memory accesses (section 5.1, Figure 6), and distributed scheduling of co-operative thread arrays (CTAs) - i.e. re-grouping contiguous, data-local threads to execute on the same GPU module (section 5.2, Figures 9/10).
    These two optimizations result in a 33% reduction in inter-module communication, improving performance by 23.4% for memory-intensive workloads and 1.9% for compute-intensive workloads.


    The final optimization is a first-touch page mapping policy, where virtual pages are mapped to the physical memory of the GPU module which initiated the page load request, further reducing inter-module traffic. When first-touch page mapping is combined with an 8 MB L2 cache, an 8 MB L1.5 cache, and distributed CTA scheduling, real-world performance improves by 51% for memory-intensive workloads and 11.3% for compute-intensive workloads (section 5.3, Figures 13/14).
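    A minimal sketch of the first-touch idea (illustrative Python only; the paper implements the policy in the GPU's page tables, not through an interface like this toy one):

```python
# Toy sketch of a first-touch page mapping policy: a virtual page is
# physically placed in the local HBM of whichever module touches it
# first, so subsequent accesses from that module stay off the link.
class FirstTouchMapper:
    def __init__(self):
        self.page_home = {}          # virtual page -> owning GPU module

    def translate(self, vpage, requesting_module):
        # First toucher becomes the page's home; later requesters from
        # other modules must cross the inter-module link.
        home = self.page_home.setdefault(vpage, requesting_module)
        return home, home == requesting_module   # (home, is_local)

m = FirstTouchMapper()
print(m.translate(0x1000, 2))   # (2, True)  - first touch, stays local
print(m.translate(0x1000, 0))   # (2, False) - remote access over the link
```

    Because CTA scheduling already groups data-local threads on the same module, most touches after the first also come from the home module, which is why the two optimizations compound.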

    Overall, these three optimizations improve performance with the 1x (768 GB/s) inter-module link by 22.8% over the unoptimized design - to about 90% of a similar monolithic GPU or an MCM-GPU with an 'ideal' aggregate bandwidth of 6 TB/s, neither of which can be practically implemented in silicon (section 5.4, Figures 16/17).


    Such an MCM-GPU would also be very power-efficient. At the 28 nm node, on-die and on-package transmission links require 80 fJ/bit and 0.5 pJ/bit respectively, while on-board links require 10 pJ/bit and system-level (inter-slot or inter-processor) links require 250 pJ/bit - each roughly an order of magnitude higher than the preceding one (section 2.1, Table 2, section 6.2).
    The resulting MCM-GPU design is fully transparent to the programmer and behaves like a monolithic GPU (section 7).
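    Those per-bit energies translate directly into link power at the paper's baseline bandwidth, which shows why an on-package link is viable while a system-level one is not. Straightforward arithmetic from Table 2's numbers:

```python
# Link power implied by the paper's per-bit energies (Table 2, 28 nm):
# on-die 80 fJ/bit, on-package 0.5 pJ/bit, on-board 10 pJ/bit,
# system-level (inter-slot/inter-processor) 250 pJ/bit.
def link_power_w(bandwidth_gbs: float, energy_pj_per_bit: float) -> float:
    """Sustained link power in watts for a given GB/s and pJ/bit."""
    bits_per_s = bandwidth_gbs * 1e9 * 8
    return bits_per_s * energy_pj_per_bit * 1e-12

BW = 768  # GB/s, the paper's baseline inter-module link
print(link_power_w(BW, 0.08))   # on-die:     ~0.5 W
print(link_power_w(BW, 0.5))    # on-package: ~3.1 W
print(link_power_w(BW, 10))     # on-board:   ~61 W
print(link_power_w(BW, 250))    # system:     ~1536 W
```

    At 768 GB/s, only the on-die and on-package energy classes stay within a sane power budget, which is the whole argument for keeping the modules on one package.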



    These are products with either significant additional hardware features or application-specific optimized drivers (which, BTW, could easily be modded, if anyone cared to run the CAD-optimized OpenGL path on their consumer gaming cards).

    They just don't have this 'heuristic' dynamic DRM limiting your workloads to arbitrarily 'allowed' ones.


    Whatever, I just want my high-end gaming video card, and they buy it from board vendors.

    I'd rather charge them less for a special mining SKU that offers limited compute performance but the same high memory bandwidth, so they would release their grip on high-end gaming cards.

    It would not prevent a crash of the desktop gaming market either, since gamers will soon just stop buying video cards at these insane prices.


    PS. AMD is actually ramping up GPU production for both GDDR5- and HBM2-based parts:

    http://www.guru3d.com/news-story/am...n-blames-availability-of-graphics-memory.html
    https://wccftech.com/amd-ramping-gpu-production-confirms-memory-behind-shortage/
    etc.
     
    #92 DmitryKo, Feb 2, 2018
    Last edited: Feb 2, 2018
    tunafish, wirtold, Lightman and 3 others like this.
  13. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    637
    Likes Received:
    477
    Location:
    55°38′33″ N, 37°28′37″ E

    Mining workloads are not 'optimized' for 1x PCIe - loading a 3 GB data set in a few minutes instead of a few seconds is hardly an 'optimization'. It's just that 8+ slot motherboards are unable to offer more PCIe lanes with current low-end processors.
    I can understand your description of these high-level tasks, but I can't see a practical approach to implementing such detection logic. Hardware and drivers do not operate on a high level, and the only currently viable way to determine the exact type of workload is to have a graphics programmer analyze annotated C++/HLSL source code in a graphics debugger.
    There may be individual pieces of the jigsaw puzzle, but the big picture just does not add up.

    Why should developers even care to fix something that they didn't break in the first place? Everyone would rather just move to the greener eye-shaped pastures.
    The post-Soviet corporatist oligarchy of the former state control apparatus is not built on exploitation of the planned economy or on wealth accumulated in that period - that wealth would have been eaten wholesale by the hyperinflation of 1992-1993.
    I can't see the analogy with graphics card vendors either.
    You are going to cut off the large-scale mining crowd by enforcing a hard limit on the number of cards sold in each order, and they just won't bother and won't need to evade it? We shall see.
    Why would power companies even care to charge mining customers extra, unless some idiot control-freak politician legislates a theoretical 'mining tax' into an unpleasant reality?

    The only thing this would achieve in the long term is that everyone installs a Tesla Solar Roof with a 15 kWh PowerWall and says 'kiss my ass' to the power grid police, for good.

    A video card pricing model which involves arbitrary additional charges is not going to be sustainable.

    What's that got to do with the maximum memory bandwidth?
     
    Grall likes this.
  14. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,098
    Likes Received:
    2,814
    Location:
    Well within 3d
    My comment was related to the properties of AMD's coherent fabric implementation. The fabric itself is rather bound to the clocks and throughput of the memory controllers. For EPYC, no link exceeds the throughput of a memory controller's interfacing hardware, with certain links like xGMI dropping slightly below the 1:1 match of on-die link to MCM link bandwidth. It's possible there's some nice simplifying property to making sure a link's endpoint is able to service it.

    Any of the CPU products have the same or similar hardware in their client and professional SKUs. Significant amounts of their management hardware can autonomously override inputs for clocks, voltages, and can set feature levels at the factory based on fuses, or in the wild with microcode or firmware updates.
    DVFS for any of these products can willfully override outside commands in specific instances; that they haven't bothered in this way until now is not because the hardware cannot override things further.
    The introduction of new instructions and semantics for Spectre mitigation occurs through microcode updates to the x86 processors, via signed and encrypted blobs.
    AMD already offers, for specific paying clients, interfaces to its PSP for trusted computing and secure software running internally, and that shares infrastructure with DVFS and parts of the IO complex. For semicustom work and the fabric's ability to easily build new products, the fabric's control and data paths run through a secure domain.


    If they bought those cards, they'd take the savings and income from further mining and continue buying the gaming cards, unless demand craters. This enables them to buy up even more cards while robbing the gaming market of supply that could reduce prices--prices the miners are better positioned to pay if they do not fall.

    The paraphrasing from the first link goes:
    "AMD reports they are working closely with memory partners to solve the issue at hand and ramp up production, but also mentioned it is one of the most important factors for the company to achieve. Graphics memory is now getting available in better quantities it seems, and as such AMD will be ramping up GPU production."

    The supply constraint for DRAM is indicative of the overall demand spike for memory, and the batch ordering process for niche memories that usually applies to GDDR and probably HBM.
    Unlike commodity DRAM that had dedicated lines running continuously, the boutique DRAM types are usually made to order--which contributes to their lower volumes and higher cost per device.
    Working with DRAM partners means arranging for more orders, which in a constrained DRAM market means diverting capacity from the in-demand commodity types. In the case of HBM in particular, the DRAM makers' statements to investors indicate they can charge much more. Raising demand for said memory doesn't motivate Samsung and Hynix to charge less, which raises the cost of the final board.
    It's a somewhat better position for AMD than last time, when they were hit by a glut because someone else's memory production didn't stop them from overextending.

    It seems like it would be more effective to price higher the cards whose memory is several times more expensive.
    If there is an interest in getting some cards to gamers at an affordable price, then it might help if those who are willing to pay 2-3x markups subsidize that.
     
    #94 3dilettante, Feb 2, 2018
    Last edited: Feb 2, 2018
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,098
    Likes Received:
    2,814
    Location:
    Well within 3d
    If that mattered significantly, the optimal setup would have fewer GPUs per motherboard. Miners are paying more than the cost of a motherboard+cheap CPU+power supply in the difference between MSRP and market price per GPU.

    This is mostly because many popular mining algorithms purposefully avoid scaling with system-level interconnect bandwidth. A mining workload that scaled with PCIe throughput would also scale with large multi-socket boards and clusters with non-consumer gear like InfiniBand and high-end switches. This is something many of the developers explicitly worked against, as it would allow those who could afford clusters with high-end interconnects to dominate.
    What happened with the first set of ASIC-resistant algorithms is that their creators missed the possibility of cramming a lot of GPUs into one rig. Later algorithms altered the bottleneck, but generally still do not reward systems with enterprise-grade connectivity.
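    A toy sketch of why such algorithms scale with local random-access bandwidth rather than interconnect bandwidth (illustrative only; this is not any real coin's algorithm): each step's address depends on the previous hash, so accesses can't be batched, prefetched, or farmed out over a high-latency link.

```python
# Toy memory-hard hash loop in the spirit of Ethash-style algorithms
# (illustrative only, not a real mining algorithm): each hash chases a
# serial chain of pseudo-random indices through a large dataset, so
# throughput is set by local random-access bandwidth, not by any
# cluster interconnect.
import hashlib

DATASET = bytes(range(256)) * 4096   # 1 MB stand-in for a multi-GB DAG

def toy_hash(nonce: int, rounds: int = 64) -> bytes:
    mix = hashlib.sha256(nonce.to_bytes(8, "little")).digest()
    for _ in range(rounds):
        # The next index depends on the previous mix, so accesses form a
        # serial dependency chain that defeats remote/high-latency memory.
        idx = int.from_bytes(mix[:4], "little") % (len(DATASET) - 32)
        mix = hashlib.sha256(mix + DATASET[idx:idx + 32]).digest()
    return mix

print(toy_hash(0).hex()[:16])
```

    Parallelism across nonces is still trivial, which is why cramming many cheap GPUs into one rig works while fast interconnects between them buy nothing.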

    HBCC has automatic tracking of memory access behavior and heuristics for data movement.
    The intelligent work distributor actively combines instances and schedules calls to avoid context rolls.
    DVFS evaluates hardware events and progress counters hundreds or thousands of times per second.
    The turbo functionality also needs to accumulate a rolling window of hardware events in order to know how long clocks can be pushed for time periods that approach human-relevant lengths.
    Things like Radeon chill, frame limiters, and the new VSYNC options can track progress and utilization at a game-frame level.
    The driver compiler literally pores through every line of the shader code being loaded, and there's dynamic evaluation of what can be cached, precompiled, or combined based on execution history.
    People were complaining about those thick DX11 drivers that got in the way of developers with all of their "managing the resource allocation of a whole application on their own" functionality.
    The behaviors of mining workloads are very inconsistent with gaming workloads. Because their developers actively target a systemic bottleneck to stay ASIC-resistant and cluster-unfriendly, there are some really non-standard elements to an optimal rig's setup, with an optimized rig being multiple times as effective per watt as a baseline install.
    If you don't plan on undervolting to the point of instability, downclocking the GPU while upclocking the RAM, running on a 2x PCIe slot, pushing full crypto throughput every cycle for months on end, with custom DRAM timings and non-gaming drivers, then a gaming card that quietly walls off options at those points won't affect you.
    Even something like throttling a super high rate of random no-graphics misses to DRAM after 12 hours would likely only hurt crypto, since games are nowhere near that consistent or unvarying in behavior.
    What is needed is the choice to be made in design for the system to have those options actively guarded or overridden, or in this case to apply the measures that exist for this purpose.

    High-demand situation with shortages for all vendors. If Nvidia cards are sold out, a miner can either sit and not buy mining equipment, or buy something that is not the most optimal. If they're paying massive markups for even modest cards, I think I know what choice they're making. If the market shifts to where people aren't buying GPUs at stupid prices, then it's likely not worth bothering to cater to them.
    (edit: I misunderstood your point. My response upon re-reading is that developers noticing slowness in some random corner case are just going to see one more of the hundreds of reasons they might see slowness. The GPU isn't going to throw a massive error blaming blockchains. For the throttling to materially affect hash rate, it doesn't need to be crippling for a game anyway. There's no real profit loss if a game drops by a percentage similar to the drops most newly launched games experience anyway. And since games get launch drivers and development effort/non-effort with Nvidia, what's the difference here?)

    I'm asking about the oligarchs now. How did they get early and cheap access to the state interests being sold off, which none of the common people waiting in bread lines had the opportunity to do?
    If you have physical assets, production, or capital, inflation means their price rises. Also, if sufficiently connected or wealthy, you can convert to an asset or currency not hyper-inflating--or just move.
    The GPU vendors are not the little people, and they are not their friends.

    The limit is a voluntary request for retailers, which the big miners likely don't have the time to mess with and might actually want in place.
    They can go around the channel, and these limits can hurt smaller miners that could become realistic competitors if they were able to sustain a higher rate of board purchases. Instead, the intermediate miners wind up losing some of their purchase rate to gamers or part-time miners.
    That means the contribution to the global hash rate for the cards not sold to the biggest miners is less.
    The small-time miners/gamers won't care, and the big miners have an even bigger fraction of the global hash rate.

    Free market says charge what the buyer is willing to pay. I don't follow why you think up-charging someone demonstrably willing to pay more is unthinkable in a capitalist system.
    On top of that, power distribution at higher levels requires more work and investment on the part of the utility, and more care for grid stability. They aren't going to size the power lines in a neighborhood for hundreds of kW or more 24/7, and past a certain point the support and safety considerations require far more due diligence and regulatory review--given the tendency for death, fire, and explosions at those energy levels.
    Past a certain threshold, getting past the up-front complications might give a lower rate, as the predictability and consistency of demand can be beneficial for the grid and for planning power generation.

    This is literally how everything has worked since almost the start of electrification. Almost nobody cares at the residential level, and utilities actively avoid the sorts of infrastructure necessary to deliver multiple times the normal residential hookup.
    As for one buyer that would need to stay on the grid: a miner whose rig exceeds the power delivery of a solar roof and the capacity of a PowerWall--which is one or two multi-GPU rigs in a detached residence with sufficient roof area, in the summer, if that.

    Then the miners don't buy the card, and they hope the next miner won't buy it either. The miners willing to accept a little less additional profit up front will pay, or gamers have a card they can buy as-is. With no crippled mining cards, a market correction will not leave cards sitting in warehouses that nobody will buy.

    Downclock the core clock and undervolt as much as possible, then upclock memory. Put as many cards on a board as possible, and use the power savings to put in as many rigs as possible before the limitations of the wiring or local hookup come into play.
    Maximum bandwidth per watt and volume equals maximum hash rate and profit per location, at least for the algorithms that chose to be limited by local device bandwidth.
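    The tuning logic above is simple arithmetic for a bandwidth-bound algorithm (all card numbers below are hypothetical round figures, not measurements):

```python
# Why miners downclock the core and upclock memory: for a bandwidth-bound
# algorithm, hash rate tracks memory bandwidth, so cutting core power at
# roughly constant bandwidth raises hashes per joule. Card figures below
# are hypothetical illustrations, not benchmarks.
def hashes_per_joule(bandwidth_gbs, board_power_w, bytes_per_hash=8192):
    hash_rate = bandwidth_gbs * 1e9 / bytes_per_hash   # hashes per second
    return hash_rate / board_power_w

stock = hashes_per_joule(484, 290)   # hypothetical stock settings
tuned = hashes_per_joule(500, 180)   # hypothetical undervolt + memory OC
print(f"{tuned / stock:.2f}x hashes per joule after tuning")
```

    More hashes per joule means more cards per wiring limit, which is exactly the "maximum hash rate per location" outcome described.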
     
    #95 3dilettante, Feb 3, 2018
    Last edited: Feb 3, 2018
  16. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    Bound, or just operating that way? It wouldn't seem difficult to make it go faster, but it would burn more energy. There were some multipliers, as I recall, for debugging.
     
  17. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,098
    Likes Received:
    2,814
    Location:
    Well within 3d
    From the testing on Ryzen, the FCLK and DFICLK values are fixed, and are half the MEMCLK.
    GMI is 4x FCLK.

    The controller itself apparently has a debug 1:1 rate relative to the memory speed, but it doesn't appear to be functional or really supported, since setting it causes severe instability.
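    Taking the description above at face value, the ratios reduce to trivial arithmetic ('MEMCLK' here is whatever that measurement reported; only the ratios are the point):

```python
# Clock ratios as described in the post above:
# FCLK = DFICLK = MEMCLK / 2, and GMI runs at 4x FCLK
# (so GMI lands at 2x MEMCLK).
def fabric_clocks(memclk: float) -> dict:
    fclk = dficlk = memclk / 2
    gmi = 4 * fclk
    return {"FCLK": fclk, "DFICLK": dficlk, "GMI": gmi}

print(fabric_clocks(1600))   # {'FCLK': 800.0, 'DFICLK': 800.0, 'GMI': 3200.0}
```

    The debug 1:1 setting mentioned above would double FCLK relative to this table, which fits with it destabilizing a fabric sized for the halved rate.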
     
    #97 3dilettante, Feb 4, 2018
    Last edited: Feb 4, 2018
    Lightman likes this.
  18. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    15,837
    Likes Received:
    4,799
    Yup, year over year numbers are very good. The important part now for AMD is if they can solidify their recovery and move forward. After years of Intel not taking them seriously, they will once again become a focus for Intel competitively in the CPU scene. And in the GPU scene, they are still mostly behind NV.

    Hopefully, they'll continue to execute well on the CPU front and continue to improve on the GPU front.

    NV and Intel not having competition for so long isn't good for anyone (god I really REALLY hate NV's consumer drivers). Things will hopefully get interesting in the next few years.

    Regards,
    SB
     
    Alexko likes this.
  19. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,485
    Likes Received:
    897
    They sure have their work cut out for them on the graphics front, but perhaps they can pull it off. If so, the company should eventually be very successful.
     
  20. DrYesterday

    Newcomer

    Joined:
    Jan 8, 2013
    Messages:
    32
    Likes Received:
    18