Intel Gen9 Skylake

Arun

Unknown.
Legend
... and if anyone starts talking about numbers of hardware queues and ACEs and whatever else, you can pretty safely ignore that as marketing/fanboy nonsense that just adds more confusion rather than useful information.
Yes, I'm honestly curious what the benefits of having multiple compute kernels in parallel really are (à la AMD's >2 ACEs)... This is beneficial if you cannot overlap an independent graphics workload and you have multiple independent compute workloads to run, but I'm not sure how important that is in practice.

Certainly a lot depends on the workload, the developer, *and* the API's ability to expose that parallelism in the first place. OpenGL ES 3.1's memory barrier mechanism is quite bad at this, which will hide some of the benefit of HW support for parallel graphics & compute. I think DX11 would have the same problem (or worse), while DX12/Vulkan/Mantle should all do a much better job at it, but I don't know enough about non-GLES/Vulkan APIs to be certain.

Another thing to consider is that if you have enough parallelism in one workload, then running a second one at the same time risks thrashing your cache, and arbitration may also be non-trivial. Again, I have never done any performance analysis of GCN, so I don't know how well they handle that, but it's certainly something I expect will benefit from gradual improvement between hardware generations.
 

Andrew Lauritzen

Moderator
Veteran
Yes, I'm honestly curious what the benefits of having multiple compute kernels in parallel really are (à la AMD's >2 ACEs)... This is beneficial if you cannot overlap an independent graphics workload and you have multiple independent compute workloads to run, but I'm not sure how important that is in practice.
Right so the bit people get confused with is that "I want multiple semantically async queues for convenience/middleware in the API" does *not* imply you need some sort of independent hardware queue resources to handle this, or even that they are an advantage. I hate to beat a dead horse here but it really is similar to multithreading and SMT... you don't need one hardware thread per software thread that you want to run - the OS *schedules* the software threads onto the available hardware resources and while there are advantages to hardware-based scheduling at the finer granularity, you're on thin ice arguing that you need any more than 2-3 hardware-backed "queues" here.
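To make the OS-scheduler analogy concrete, here's a toy sketch (my own illustration, not any driver's actual code): many semantically independent software queues can be multiplexed onto a small, fixed number of hardware queues, the same way an OS time-slices software threads onto CPU cores.

```python
from collections import deque

def schedule(software_queues, num_hw_queues):
    """Round-robin many software queues onto num_hw_queues hardware slots.

    Returns, per hardware queue, the ordered list of work items it ran.
    The point: the number of software queues the API exposes does not
    have to match the number of hardware queues underneath.
    """
    hw = [[] for _ in range(num_hw_queues)]
    pending = deque(deque(q) for q in software_queues)
    slot = 0
    while pending:
        q = pending.popleft()
        # Issue the next item from this software queue to a hardware slot.
        hw[slot % num_hw_queues].append(q.popleft())
        slot += 1
        if q:
            pending.append(q)  # queue still has work; rotate it back in
    return hw

# Eight software queues still run fine on just two hardware queues.
runs = schedule([[f"job{i}.{j}" for j in range(3)] for i in range(8)], 2)
```

The queue names and round-robin policy are invented for illustration; real drivers use far more sophisticated arbitration, but the mapping principle is the same.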

Certainly a lot depends on the workload, the developer, *and* the API's ability to expose that parallelism in the first place.
Absolutely, and that's another point that people miss here. GPUs are *heavily* pipelined and already run many things at the same time. Every GPU I know of from quite a while back can run many simultaneous, unique compute kernels at once. You do not need async compute "queues" to expose that - pipelining + appropriate barrier APIs already do that just fine, and without adding the heavyweight synchronization primitives that multiple queues typically require. Most DX11 drivers already make use of parallel hardware engines under the hood, since they need to track dependencies anyway... in fact it would be sort of surprising if AMD were not taking advantage of "async compute" in DX11, as it is certainly quite possible with the API and extensions that they have.
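The dependency-tracking idea above can be sketched as follows (my own toy illustration, not any driver's mechanism): in a single in-order command stream, only a barrier (a dependency edge) forces serialization, so dispatches with no edge between them are free to overlap on parallel engines. The pass names and dependencies below are made up for the example.

```python
def overlap_stages(dispatches, deps):
    """Group dispatches into stages of work that may run concurrently.

    dispatches: ordered list of dispatch names.
    deps: {name: set of prerequisite names} - the "barriers".
    Each returned stage is a set of dispatches with all prerequisites
    satisfied, i.e. work a dependency-tracking driver could issue to
    parallel engines at the same time.
    """
    done, stages = set(), []
    remaining = list(dispatches)
    while remaining:
        stage = {d for d in remaining if deps.get(d, set()) <= done}
        stages.append(stage)
        done |= stage
        remaining = [d for d in remaining if d not in stage]
    return stages

# "gbuffer" and "shadow" share no dependency, so they land in the
# same stage and can overlap - no separate queues required.
stages = overlap_stages(
    ["gbuffer", "shadow", "ssao", "lighting"],
    {"ssao": {"gbuffer"}, "lighting": {"gbuffer", "shadow", "ssao"}},
)
```

This is exactly the information a DX11-style driver already has from its barrier/resource tracking, which is why overlap doesn't require exposing queues in the API.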

Now obviously I'm all for making the API more explicit like they have in DX12. But don't confuse that with mapping one-to-one with some hardware feature on some GPU. That's simply a misunderstanding of how this all works.

Another thing to consider is that if you have enough parallelism in one workload, then running a second one at the same time risks thrashing your cache, and arbitration may also be non-trivial. Again, I have never done any performance analysis of GCN, so I don't know how well they handle that, but it's certainly something I expect will benefit from gradual improvement between hardware generations.
Yes, the scheduling is non-trivial and not really something an application can do well either, but GCN tends to leave a lot of units idle from what I can tell, and thus it needs this sort of mechanism the most. I fully expect applications to tweak themselves for GCN/consoles and then basically have that all undone by the next architectures from each IHV that have different characteristics. If GCN wasn't in the consoles I wouldn't really expect ISVs to care about this very much. Suffice it to say I'm not convinced that it's a magical panacea of portable performance that has just been hiding and waiting for DX12 to expose it.

Anyways this is going a bit off topic so I'll leave it at that :) I think I've answered the question in any case and hopefully made those asking it think a little deeper.
 

Grall

Invisible Member
Legend
Seems like GT4 will be getting the "580" number and it's first mentioned for one of the Workstation parts, missing the "e" for EDRAM.
...So the part will lack the eDRAM die, or it's simply not visible in the model code even though it has it? :p

Maybe this is simply not known at this stage...
 

Paran

Regular
H-Series with 45W and Iris Pro is a GT4e variant, the only option.


 
D

Deleted member 13524

Guest
With the performance of a GM107 I think it'll come a little too late to be called a "gaming PC".
They're better off just supporting an external graphics card through Thunderbolt v3, IMO.


A couple of months after this is out in the shelves, AMD is launching Polaris Mini.
 

CarstenS

Legend
Subscriber
While I think that's true for many hardcore PC gamers, there are a ton of games (even some popular ones) out there that play really nicely on integrated graphics. I use my 3-year-old notebook IGP to play some of these while my main rig is in the waiting queue for graphically intense multiplayer shooters. :)
 
D

Deleted member 13524

Guest
But if all you need for such games is your old notebook's IGP, then you don't need a GT4e either.

The number of cases for which the GT4e is interesting is getting very small, and will be close to zero when the FinFET lower-end graphics cards start appearing.
If it's to play 2D or simple 3D games, then the lower-end IGPs are enough. If you want to play games, then you're probably better off spending that premium on a dedicated Polaris or Pascal graphics card. This will be even more accentuated when Thunderbolt v3 graphics card enclosures are popularized and the smaller, low-power graphics cards offer over twice the performance of a GT4e.
 

CarstenS

Legend
Subscriber
That's true, but the faster the IGP, the bigger the pool of games I can choose from. Another thing, and that's more about the eDRAM and the video blocks: it helps save power.
 
While I think that's true for many hardcore PC gamers, there are a ton of games (even some popular ones) out there that play really nicely on integrated graphics. I use my 3-year-old notebook IGP to play some of these while my main rig is in the waiting queue for graphically intense multiplayer shooters. :)


Or what about a former gamer who fell behind for a while? The truth is that in that case you need a CPU first and a GPU second to play games. E.g. even a Haswell Pentium is amazingly fast and powerful; I'd happily do video editing, work with 20-megapixel pictures, raytrace or whatever on such a CPU. But some games run like crap on it.

If I got a Skylake GT4e or an i5 5675C today, that'd be the most powerful and advanced GPU I have anyway; same for many people : )
Playing at less than native resolution on an LCD wouldn't be that bad, given that 15 years ago we put up with such things as simple bilinear filtering, which gave embarrassing "walking lines" on the screen.

Value is fairly bad in terms of GPU gigaflops per dollar, but the CPU stays at the top of the food chain for many years. You could instead buy a < $100 AMD quad core and something like a GTX 970, 960 or R9 380 to go with it: you get vastly more GPU gigaflops, but a couple of years down the road the CPU is a dud.

Laptops do succeed in playing games due to most often having a 2C/4T Intel CPU (Carrizo may be the first AMD CPU in a long while that's good enough for a laptop).
Non-technical people do value being able to run a game at all. (I do too; if a quad-core AMD Zen CPU were available yesterday, coupled with a GM107 GPU, there would be some good gaming value there.)
 

swaaye

Entirely Suboptimal
Legend
I picked up a second hand, top model Venue 11 Pro with the Core M 5Y70 and the battery keyboard for a lovely price. I gotta say it's somewhat eye opening to see what the fanless thing can play. Playing say Dishonored smoothly on 10W hardware is a curious experience.

I would like to see eDRAM more widely available, because it appears to benefit frame rate consistency in the reviews of the desktop Broadwell.
 
D

Deleted member 13524

Guest
I picked up a second hand, top model Venue 11 Pro with the Core M 5Y70 and the battery keyboard for a lovely price. I gotta say it's somewhat eye opening to see what the fanless thing can play. Playing say Dishonored smoothly on 10W hardware is a curious experience.



Now you need an iPega 9023 and your transition to mobile PC gaming shall be complete.
 

Andrew Lauritzen

Moderator
Veteran
GT4e in particular is interesting in that it gets into roughly the performance territory of the Xbox One and thus games targeted at console minimums will generally work well on it too.

And don't get me started on external graphics - it seems cool in your head, but it's really pointless in practice vs. just having a desktop PC in the bread box. dGPUs are a significant portion of the cost and power use of a modern gaming PC so it makes little sense to slave them to some laptop that has been optimized to be portable and light vs. just having two separate PCs.

Also the Surface Pro 4 i7 w/ Iris graphics (GT3e) is awesome. Quite fast and great cooling solution. Expensive, but you really can't do much better in the form factor currently.
 