It was never their intention to cede anything. I don't think any company would willingly cede a market segment, in any market, if it could avoid it. That said, timelines can shift a little, so there will be times when the others can take advantage. As Geeforcer pointed out with nV's 2xx line, that was a case of stretching a particular architecture for too long and giving the competition a chance to catch up. I don't think nV will let something like that happen again if they can help it. If it does happen, they screwed up somewhere.
If a company has a certain advantage, it's up to them to press that advantage. If they don't, either they are doing something wrong on their end, or they took the wrong direction to begin with and are trying to fix it.
Now maybe that's just what you say when your big chips aren't done yet, but typical GPU vendor PR is, "ours is the best, if not now then very soon."
How well would that work, e.g. if I told you that GP20x was due in Q1 2017 (fictitious example!)?
Could one of you guys take a minute to explain "async time warp" please? I've seen this crop up a couple times now, and the term is new to me...
Thank you in advance!
How well would that work, e.g. if I told you that GP20x was due in Q1 2017 (fictitious example!)?
Correct, the sales of the GP10x series would drop dead, even if the second revision wasn't expected to be much of an improvement in actual performance.
AMD is in a strange position. They want to sell off the remaining Fiji chips, but everyone is already waiting for Vega, and that even before Polaris is officially released. Even if they HAD released a big Polaris now, it would only have worsened the situation, as Vega would then have deprecated not just one but two designs.
For the very same reason I also doubt that AMD will release any dedicated low-end chips based on the Vega architecture before Navi, even if they could. Not with Navi already scheduled for 2018 and Polaris just about to be released.
I seriously doubt that Polaris was ever supposed to be the mere gap filler it has now become. The schedule just looks wrong, or rather too compressed.
Yes, it is. At least that's one possible implementation, as it technically doesn't need geometry of any sort.
Well, it's up to developers to implement whatever they want, and yes, for some scenarios a compute shader will be enough. However, I don't think you'll be able to get by in most cases without redrawing anything. How about objects that are really close to the player, such as the weapon or cockpit? How about GUI elements such as the health bar or crosshair?
With GP100 having fine-grained compute preemption, there's a good chance that graphics preemption will be present as well. If so, then no fundamental benefit here either. So let's defer this one for later...
This does seem possible, although if AMD is to serve as an example, it can take longer to get the graphics pipeline and its larger amount of context amenable to preemption.
What specifically does compute have to do with VR that it doesn't have with the rest of rendering in modern games? Rendering stuff is a graphics task; async time warp for VR is a graphics task.
True, but the timewarp is somewhat time sensitive, and compute should complete more quickly as you wouldn't need to worry about geometry. AMD also favored preempting graphics with compute. How Nvidia implemented it prior to Pascal I'm not exactly sure.
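For anyone wondering what a compute-only timewarp pass could even look like: below is a minimal, purely illustrative CUDA sketch of rotational reprojection, which resamples an already rendered eye buffer with the latest head orientation and touches no geometry at all. The kernel name, the pinhole-projection math, and the nearest-neighbour sampling are assumptions for the example, not any vendor's actual implementation (real ATW also handles lens distortion, filtering, etc.).

```
// Hypothetical sketch of a purely compute-based "async time warp" pass:
// re-project an already rendered eye buffer with the latest head rotation,
// without touching the geometry pipeline. Names and projection math are
// illustrative only.
#include <cuda_runtime.h>

struct Mat3 { float m[9]; };                // row-major 3x3 rotation

__device__ float3 mul(const Mat3& r, float3 v) {
    return make_float3(r.m[0]*v.x + r.m[1]*v.y + r.m[2]*v.z,
                       r.m[3]*v.x + r.m[4]*v.y + r.m[5]*v.z,
                       r.m[6]*v.x + r.m[7]*v.y + r.m[8]*v.z);
}

__global__ void timewarp(const uchar4* src, uchar4* dst,
                         int width, int height,
                         Mat3 deltaRot,          // renderPose^-1 * latestPose
                         float tanHalfFovX, float tanHalfFovY)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Output pixel -> view-space ray (z = -1 forward, simple pinhole model).
    float3 ray = make_float3((2.0f * (x + 0.5f) / width  - 1.0f) * tanHalfFovX,
                             (1.0f - 2.0f * (y + 0.5f) / height) * tanHalfFovY,
                             -1.0f);

    // Rotate the ray by the pose delta and project back into the source image.
    float3 r = mul(deltaRot, ray);
    float u = 0.5f * (-r.x / (r.z * tanHalfFovX) + 1.0f);
    float v = 0.5f * (1.0f - (-r.y / (r.z * tanHalfFovY)));

    int sx = (int)(u * width), sy = (int)(v * height);
    dst[y * width + x] = (sx >= 0 && sx < width && sy >= 0 && sy < height)
                       ? src[sy * width + sx]
                       : make_uchar4(0, 0, 0, 255);  // off-screen: black border
}
```

As the thread notes, a pass like this says nothing about near objects or the HUD; it only re-samples what was already rendered.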
On the topic of HW speculation, do you think Vega will be distinct enough as an architecture to even merit replacing Polaris in its segment? I think we're going to continue to see incremental improvements, as ever with GCN. So in that sense Vega will be big Polaris + tweaks.
Looking at the roadmap it doesn't seem that distinct. HBM2 is added, likely due to availability and it only making sense on higher-end parts. The rest is likely more of the fine-grained power gating that didn't make the cut for Polaris, and maybe added support for additional display outputs VR might use. So yeah, Vega looks like a big, tweaked Polaris, unless some of those patents didn't meet the deadline.
As long as compute can preempt graphics, that's already all that is needed.
But isn't compute preempting graphics the hardest kind of preemption? Because there's a lot of state in the non-shader pipeline that needs to be taken care of one way or another?
But isn't compute preempting graphics the hardest kind of preemption? Because there's a lot of state in the non-shader pipeline that needs to be taken care of one way or another?
Hrm, not if we assume that the preempted warps are guaranteed to be re-instantiated on the very same SMM, and if we allow the entire graphics pipeline to stall as a whole, retaining its state. And given that Nvidia has only announced preemption of graphics in favor of compute, without saying a single word about rescheduling any workload, that sounds plausible enough to me. Yes, it also sounds like a band-aid fix, but I somewhat doubt Nvidia would provide more than that.
But isn't compute preempting graphics the hardest kind of preemption? Because there's a lot of state in the non-shader pipeline that needs to be taken care of one way or another?
What needs to happen to the non-shader pipeline state?
Just freezing it means that if the OS is getting antsy about GPU responsiveness, the graphics pipeline might cause the OS to restart the device.
Yes, I'm suspecting that as well. So using the high-priority context would be highly volatile.
Yes, I'm suspecting that as well. So using the high-priority context would be highly volatile.
I meant varying levels of the OS blanking out the screen, killing the application, restarting the driver, or possibly a hard system crash.
But it's still an improvement over Maxwell, where a high-priority compute context would still need to wait for an SM to ramp down entirely first, at minimum finishing all active draw calls, and hence having the entire graphics pipeline run empty, plus a nondeterministic latency. Freezing the pipeline is definitely better than draining it.
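As an aside on what the application side of that "high-priority context" looks like: with the CUDA runtime you only get to ask for a priority via cudaStreamCreateWithPriority; whether the GPU drains SMs first (Maxwell-style) or preempts at finer granularity (Pascal) is entirely up to the hardware and driver. A minimal sketch, with the stream names and the referenced timewarp kernel being assumptions from the example above:

```
// Requesting a high-priority stream for latency-critical work. The priority
// is only a hint to the scheduler; the preemption mechanism itself is not
// visible or controllable from here.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int lowest = 0, highest = 0;   // numerically lower value = higher priority
    cudaDeviceGetStreamPriorityRange(&lowest, &highest);

    cudaStream_t normalStream, urgentStream;
    cudaStreamCreateWithPriority(&normalStream, cudaStreamNonBlocking, lowest);
    cudaStreamCreateWithPriority(&urgentStream, cudaStreamNonBlocking, highest);

    // Long-running background work would go on normalStream; the
    // latency-critical kernel (e.g. the timewarp pass sketched earlier)
    // would be launched on urgentStream:
    //   timewarp<<<grid, block, 0, urgentStream>>>(...);

    printf("stream priority range: %d (lowest) .. %d (highest)\n", lowest, highest);

    cudaStreamDestroy(normalStream);
    cudaStreamDestroy(urgentStream);
    return 0;
}
```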
What needs to happen to the non-shader pipeline state?
If you want to interrupt the rendering of a huge triangle, you need to save somewhere which parts have already been rendered and which have not. Similarly, you'd need to save the configuration of the ROP blenders. Etc.
then page 30:
Compute Preemption is another important new hardware and software feature added to GP100 that allows compute tasks to be preempted at instruction-level granularity, rather than thread block granularity as in prior Maxwell and Kepler GPU architectures. Compute Preemption prevents long-running applications from either monopolizing the system (preventing other applications from running) or timing out. Programmers no longer need to modify their long-running applications to play nicely with other GPU applications. With Compute Preemption in GP100, applications can run as long as needed to process large datasets or wait for various conditions to occur, while scheduled alongside other tasks. For example, both interactive graphics tasks and interactive debuggers can run in concert with long-running compute tasks.
The new Pascal GP100 Compute Preemption feature allows compute tasks running on the GPU to be interrupted at instruction-level granularity, and their context swapped to GPU DRAM. This permits other applications to be swapped in and run, followed by the original task’s context being swapped back in to continue execution where it left off.
Compute Preemption solves the important problem of long-running or ill-behaved applications that can monopolize a system, causing the system to become unresponsive while it waits for the task to complete, possibly resulting in the task timing out and/or being killed by the OS or CUDA driver. Before Pascal, on systems where compute and display tasks were run on the same GPU, long-running compute kernels could cause the OS and other visual applications to become unresponsive and non-interactive until the kernel timed out. Because of this, programmers had to either install a dedicated compute-only GPU or carefully code their applications around the limitations of prior GPUs, breaking up their workloads into smaller execution timeslices so they would not time out or be killed by the OS.
Indeed, many applications do require long-running processes, and with Compute Preemption in GP100, those applications can now run as long as they need when processing large datasets or waiting for specific conditions to occur, while visual applications remain smooth and interactive—but not at the expense of the programmer struggling to get code to run in small timeslices.
Compute Preemption also permits interactive debugging of compute kernels on single-GPU systems. This is an important capability for developer productivity. In contrast, the Kepler GPU architecture only provided coarser-grained preemption at the level of a block of threads in a compute kernel. This block-level preemption required that all threads of a thread block complete before the hardware can context switch to a different context. However when using a debugger and a GPU breakpoint was hit on an instruction within the thread block, the thread block was not complete, preventing block-level preemption. While Kepler and Maxwell were still able to provide the core functionality of a debugger by adding instrumentation during the compilation process, GP100 is able to support a more robust and lightweight debugger implementation.
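To make the "smaller execution timeslices" workaround mentioned above concrete, here is a rough sketch of how a pre-Pascal application sharing the GPU with a display might chop a long-running job into short launches so each one stays under the watchdog timeout. The kernel, chunk size, and per-element work are made up for illustration:

```
// Pre-Pascal workaround: process a large dataset in small slices, returning
// to the driver between launches so the display watchdog never fires. With
// GP100's instruction-level compute preemption, a single launch over the
// whole range could simply be preempted instead.
#include <cuda_runtime.h>

__global__ void processChunk(float* data, size_t offset, size_t count) {
    size_t i = offset + blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < offset + count)
        data[i] = data[i] * 2.0f + 1.0f;   // stand-in for real per-element work
}

void processLargeDataset(float* d_data, size_t totalElems) {
    const size_t chunk = 1 << 20;          // 1M elements per slice (arbitrary)
    const int    block = 256;

    for (size_t offset = 0; offset < totalElems; offset += chunk) {
        size_t count = (totalElems - offset < chunk) ? totalElems - offset : chunk;
        size_t grid  = (count + block - 1) / block;
        processChunk<<<(unsigned)grid, block>>>(d_data, offset, count);
        cudaDeviceSynchronize();   // yield so the display/other contexts can run
    }
}
```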
If you want to interrupt the rendering of a huge triangle, you need to save somewhere which parts have already been rendered and which have not. Similarly, you'd need to save the configuration of the ROP blenders. Etc.
A compute shader doesn't interact with that stuff, though.