You completely missed the point here, which was fairly obvious. High-bandwidth links between interposers have nothing to do with this; high-bandwidth links on an interposer do. By far the biggest limitation of AFR/SFR is moving data between GPUs, which in the past has largely occurred over PCIe or, to some degree, a proprietary link: bandwidth measured in single-digit GB/s, with significant latency. AFR on a single GPU is pipelining. SFR, I guess, would be some form of partitioning, which would likely still work acceptably with some synchronization.
LOL, that is not where the limitation is. It's not the data between the two GPUs, it's the way the rasterizers of each GPU need to communicate with each other, and that can't be done with multiple GPUs right now! And it's all due to the way the current lighting models are done in games. That has to change for mGPU tech to take off, no way around that!
Developers would likely need to change their programming model for a higher number of cores or GPUs. 2-4 GPUs, as has occurred in the past, is possible. Scalability for Navi will likely be more about submitting the scene as a bunch of largely independent tiles.
Again, where is the problem coming from? Look above.
Why would rasterizers working on independent tasks need to share lots of data? Worst case they could just duplicate the work, culling portions that are irrelevant (SFR). A task they are perfectly suited to doing. Best case rasterization hardware on each GPU is interconnected. No different than doubling current hardware units. The simplest solution is just having each GPU rasterize separate frames (AFR) if pipelined appropriately.
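To make that worst-case SFR idea concrete, here's a minimal sketch (Python, purely my own illustration, not real driver or API code) of each GPU duplicating the triangle list and then culling whatever falls outside its assigned strip of the screen:

```python
# Illustrative SFR-style work split: every "GPU" sees the full triangle
# list, but keeps only triangles whose screen-space bounding box overlaps
# its assigned horizontal strip. A triangle spanning the split line is
# kept by both GPUs (duplicated work, but no cross-GPU communication).
# Triangles are ((x, y), (x, y), (x, y)) in screen coordinates.

def bbox(tri):
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return min(xs), min(ys), max(xs), max(ys)

def cull_for_region(triangles, y_top, y_bottom):
    """Keep triangles whose bounding box overlaps [y_top, y_bottom)."""
    kept = []
    for tri in triangles:
        _, ymin, _, ymax = bbox(tri)
        if ymax >= y_top and ymin < y_bottom:
            kept.append(tri)
    return kept

def split_frame(triangles, height, num_gpus=2):
    """Duplicate the triangle list per GPU, then cull per strip."""
    strip = height // num_gpus
    return [cull_for_region(triangles, i * strip, (i + 1) * strip)
            for i in range(num_gpus)]
```

The point of the sketch: a triangle crossing the split shows up in both GPUs' lists, so the cost of independence is redundant geometry work rather than mid-frame communication.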
Because the rasterizers work on all of a model's polygons if the model crosses over a tile, and it will, many times over. That causes problems if they can't communicate with each other, and across two separate GPUs they just can't do it right now. It has NOTHING to do with bandwidth needs; it has everything to do with control silicon and cache. You don't want to be sending this data to outside memory, let alone to another GPU; it causes too much latency for the pixel shaders to do their work, and thus you end up with no gains over the one card (SFR). That is what is happening, isn't it? Sending that data and introducing that latency is pretty much why SFR gives no tangible benefit over a single GPU.
AFR is different: current engines' lighting needs information from previous frames, that information is not saved, so AFR breaks down. There's no such thing as "pipelined differently"; the pipeline doesn't change what it does, what changes is what the program needs. That has nothing to do with the graphics pipeline; the graphics pipeline doesn't change, nor should it. And this is where I am coming from: if they don't make changes to the graphics pipeline stages, the need for mGPU technologies will not just disappear, which is what you are saying is going to happen. It won't happen, because if they do change the graphics pipeline, there will be a major change in the way current and older game engines work. It's going to be a slow process, gen to gen, not something that happens in two generations of cards.
Seems like it. Or at least your interpretation of what he said publicly.
Yes, you are making things up on AMD's behalf. They haven't stated anything like this, nor has nV for that matter, because they know they're nowhere close to doing the things you're describing.
So SLI and CrossFire have never existed without specific dev involvement? It's not typically done, at least not well, because you end up with data dependencies between frames. With no dependencies, AFR is pretty simple to schedule. The biggest hitch has always been the need to move a resource from one adapter to another to be used by the next frame, a pretty simple task for the synchronization primitives in DX12 and Vulkan. They seem to work really well at keeping compute shaders from executing before their resources are ready. The limitations you speak of seem mostly relegated to DX11, when rendering wouldn't be pipelined. If an app held off presenting the next frame until the current one completed, I guess you could be correct.
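A toy timing model makes the dependency argument concrete (Python, my own illustration; the frame time and copy latency numbers are made up, and this is not any real API): with no inter-frame dependency, two GPUs pipeline frames and completions come twice as fast; with a dependency that forces a cross-adapter copy, each frame waits on the previous one and the second GPU buys you nothing.

```python
# Toy AFR scheduler: frame f runs on GPU f % 2. If "dependent" is set,
# frame f cannot start until the previous frame's output has been copied
# across adapters (copy_latency), modeling a cross-frame resource read.
def afr_schedule(num_frames, frame_time, copy_latency, dependent):
    finish = [0.0, 0.0]   # when each GPU last becomes free
    prev_finish = 0.0     # when the previous frame completed
    completions = []
    for f in range(num_frames):
        start = finish[f % 2]
        if dependent and f > 0:
            # wait for previous frame's resource plus the cross-GPU copy
            start = max(start, prev_finish + copy_latency)
        prev_finish = start + frame_time
        finish[f % 2] = prev_finish
        completions.append(prev_finish)
    return completions
```

With `frame_time=10` and `copy_latency=2`, the independent case completes frames at 10, 10, 20, 20 (two per frame time), while the dependent case completes them at 10, 22, 34, 46: an effective 12 per frame, i.e. worse than a single GPU.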
What is that data? Look above. That fix has not been done yet, and it WON'T be done any time soon.
This is why it's obvious you have no clue what you're talking about. You stated that elements of a GPU can't be directly connected to one another, an obviously false supposition, as it's the basis of all chips. SMs/CUs don't communicate wirelessly, last I checked; they are wired together to send signals. Routing through a chip, an interposer, or actual wires is irrelevant outside of communication speeds.
I never stated any such thing, but there you go again with something that's BS.
Where is the freakin' link that says multiple GPUs will work fine without any type of mGPU technology in the next two years? Made up? You stated they are already for sale. It must be a figment of your imagination!
The problem I feel I've pretty well laid out. I'm just not sure you understand what the past limitations actually were or why things were done the way they were.
In your twisted little mind that goes against any sane understanding of why the technologies evolved the way they have.....
Where did I mention compute, or even graphics, in regard to P100? All I indicated was that an interposer was used for connecting HBM memory, and I'm not sure you understand the point of including it. The interposer exists because running that volume of traces through a PCB is prohibitive, the same reason running lots of traces to connect multiple GPUs would likely require an interposer. While I'm not saying this is what Vega is doing, elements of each GPU could be directly connected to each other as if on the same chip.
No, you didn't, nor did I state you did. I just stated that you are lumping graphics and compute together and making assumptions (the first part of "assumptions" is "ass"). Shit, this is simple stuff: the reason it hasn't been done before had nothing to do with the bandwidth between the chips, it had everything to do with the chips communicating with each other. Actually, if this problem were solved, they wouldn't need the extra bandwidth (or reduced bandwidth bottlenecks) that an interposer would provide to see many tangible benefits. It would make programmers' lives that much easier to not have to worry about mGPU technologies.
is this retardville here, I must be in the wrong forum.