AMD GPU14 Tech Day Event - Sept 25'th

3dilettante · Sep 29, 2013

lanek said:
Won't it be easier to implement new features, or even evolve a GCN1-based console, when you have the advantage of low-level API access? And Sony and MS have the possibility of upgrading the APU/GPU in 3 years and releasing a PS4 v2 and XB One v2 with a better GPU. They have already done it in the past (not to mention that with process advances they can get some real gains in TDP and performance).

What happens when the game coded with low-level details of the new revision runs on the old hardware?
Exposing low-level details means there's no intermediary to intercept your mistakes. CPU architectures are rife with low-level details that stick around because you can't wantonly crash software on your platform after you suckered people into using feature X in a certain way.

Maintaining a higher level of abstraction means the software doesn't have details on how the hardware achieves an end result, so if it's a new vector ALU, a re-ordered data path, or turtles, it's none of the software's business.

Consoles already have low-level access, and they barely change the core architecture because they need consistency. The 360 shrink that put the GPU and CPU on the same die actually spent silicon on a fake bus unit that pretended to be a slow external bus so that the chip behaved identically to the old ones.

Davros · Sep 29, 2013

3dilettante said:
What happens when the game coded with low-level details of the new revision runs on the old hardware?

It falls back to DX11 (or whatever), I guess.

3dilettante · Sep 30, 2013

Davros said:
It falls back to DX11 (or whatever), I guess.

The hypothetical is a new revision of a console, which wouldn't have that fallback.
I don't think it would be acceptable even if there were. Unless Mantle brings nothing to the table (and in that case, why bother?), a fallback would lead to older consoles having an inferior experience.

Dave Baumann · Sep 30, 2013

Mantle already deals with a number of architectural revs.

Dominik D · Sep 30, 2013

LeStoffer said:
I don't know how much this applies to real world performance in games, but it is interesting nonetheless.

These numbers are fairly meaningless without CPU usage numbers. Submitting from a single hardware thread, at some point you'll become CPU limited, which is what these numbers indicate. If you calculate the time per draw call, you'll see that somewhere between 300 and 2100 draw calls the dude became CPU-bound.

There's a mostly fixed CPU cost for each draw call, which is indicated by the almost flat line at the bottom. The reason it's not completely flat is that once you push more and more draw calls per frame, you decrease the number of "housekeeping" operations pushed down the pipe (clear, present, etc.), which also take time. So the time saved on, say, fewer presents gives you some extra time to perform draw calls, which shows up on the graph as decreased draw call time.

The bottom line is: yes, draws take time. This time is consumed by pushing data from UMD to KMD and by memory operations (allocations, mapping, building paging buffers and whatnot), and it can't be avoided if you want a multitasking operating system that's responsive and works with more than one GPU type.
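To make the "almost flat line" argument concrete, here is a minimal sketch of the amortization effect, assuming a simple model in which each frame pays a fixed housekeeping cost (clear, present, etc.) plus a constant CPU cost per draw call. The function name and both cost figures are made up for illustration and are not taken from the measurements discussed in this thread.

```python
def time_per_draw_us(draws_per_frame, per_draw_us=20.0, housekeeping_us=2000.0):
    """Apparent CPU time per draw call, in microseconds.

    Assumed model (hypothetical numbers): every frame costs a fixed
    housekeeping_us plus per_draw_us for each draw call submitted.
    """
    frame_us = housekeeping_us + draws_per_frame * per_draw_us
    return frame_us / draws_per_frame


# With more draws per frame, the fixed per-frame cost is spread over more
# draws, so the apparent per-draw time drifts down toward per_draw_us.
for n in (100, 300, 2100, 23000, 100000):
    print(f"{n:>7} draws/frame -> {time_per_draw_us(n):8.2f} us per draw")
```

Under this model the curve never becomes perfectly flat, it only approaches the per-draw cost, which matches the shape described above.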

lanek · Sep 30, 2013

Nemo · Sep 30, 2013

One DP can drive up to three 4K displays via MST?

Davros · Sep 30, 2013

It says "1 DisplayPort with MST".
Does that mean they supply an MST hub (they are $100),
or that there is MST built in and you just need some sort of dumb splitter?
It also says use "any of the outputs". That was not the case with the 6950: you had to use the DisplayPort
with an "active" adapter if you didn't have a DP monitor, which caused a lot of confusion for some people.

PS: are the 2 DVI ports DVI-I, or 1x DVI-I + 1x DVI-D as on the 6950 (bloody penny pinchers)?

Lille · Sep 30, 2013

Davros said:
PS: are the 2 DVI ports DVI-I, or 1x DVI-I + 1x DVI-D as on the 6950 (bloody penny pinchers)?

In a Linus Tech Tips video on YouTube, he mentions that it has 2 dual-link DVI-D ports.

sir doris · Sep 30, 2013

Davros said:
It says "1 DisplayPort with MST".
Does that mean they supply an MST hub (they are $100),
or that there is MST built in and you just need some sort of dumb splitter?
It also says use "any of the outputs". That was not the case with the 6950: you had to use the DisplayPort
with an "active" adapter if you didn't have a DP monitor, which caused a lot of confusion for some people.

PS: are the 2 DVI ports DVI-I, or 1x DVI-I + 1x DVI-D as on the 6950 (bloody penny pinchers)?

Side says 2 x dual-link DVI, so that would surely mean DVI-D.

lanek · Sep 30, 2013

Davros said:
It says "1 DisplayPort with MST".
Does that mean they supply an MST hub (they are $100),
or that there is MST built in and you just need some sort of dumb splitter?

Looking at the number of adapters, including active ones, that you get in the 7900 boxes, I don't think it's a problem for them to include a multi-stream hub or splitters for DP. It should cost even less.

KKRT · Sep 30, 2013

Dominik D said:
These numbers are fairly meaningless without CPU usage numbers. Submitting from a single hardware thread, at some point you'll become CPU limited, which is what these numbers indicate. If you calculate the time per draw call, you'll see that somewhere between 300 and 2100 draw calls the dude became CPU-bound.

There's a mostly fixed CPU cost for each draw call, which is indicated by the almost flat line at the bottom. The reason it's not completely flat is that once you push more and more draw calls per frame, you decrease the number of "housekeeping" operations pushed down the pipe (clear, present, etc.), which also take time. So the time saved on, say, fewer presents gives you some extra time to perform draw calls, which shows up on the graph as decreased draw call time.

The bottom line is: yes, draws take time. This time is consumed by pushing data from UMD to KMD and by memory operations (allocations, mapping, building paging buffers and whatnot), and it can't be avoided if you want a multitasking operating system that's responsive and works with more than one GPU type.

I was not CPU bound.

This is my CPU utilization with less than 100 draw calls
http://i2.minus.com/iOPjPdyK3T9a5.jpg

With 23k draw calls (that spike before the end of the graph)
http://i4.minus.com/iJq4S1WZS18z6.jpg

With 100k draw calls
http://i1.minus.com/ibaWNyioP6DyO5.jpg

Psycho · Sep 30, 2013

KKRT said:
I was not CPU bound.

Yes you were. CPU utilization falling to 25% on a quad core screams being limited by the draw submit thread.
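The arithmetic behind that claim: on a quad core, one fully busy thread shows up as 100 / 4 = 25% total utilization. A minimal sketch of the check follows, assuming utilization is reported as a percentage across all logical cores. The function names and the tolerance are hypothetical, and a match is only a hint, not proof, of a single-threaded bottleneck.

```python
def one_core_share(logical_cores):
    """Total CPU utilization (percent) that one saturated core contributes."""
    return 100.0 / logical_cores


def looks_single_thread_bound(total_util_percent, logical_cores, tolerance=3.0):
    """Heuristic: total utilization sits close to exactly one core's worth.

    The tolerance (in percentage points) is an arbitrary illustrative value.
    """
    return abs(total_util_percent - one_core_share(logical_cores)) <= tolerance


print(one_core_share(4))                      # share of one core on a quad core
print(looks_single_thread_bound(25.0, 4))     # utilization pinned near one core
print(looks_single_thread_bound(60.0, 4))     # load spread over several cores
```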

KKRT · Sep 30, 2013

Psycho said:
Yes you were. CPU utilization falling to 25% on a quad core screams being limited by the draw submit thread.

It fell to 25% because the SDK window was minimized when I took the screenshots of Task Manager.

Dominik D · Sep 30, 2013

You could change core usage to 1 in msconfig and retest stuff with a high-priority process to get better results.

KKRT · Sep 30, 2013

Dominik D said:
You could change core usage to 1 in msconfig and retest stuff with a high-priority process to get better results.

You mean setting Editor to only use one core? Editor is maxing out one core from the get go. With two cores enabled there is no performance increase.

karlotta · Sep 30, 2013

KKRT said:
You mean setting Editor to only use one core? Editor is maxing out one core from the get go. With two cores enabled there is no performance increase.

CoreParking?

Andrew Lauritzen · Sep 30, 2013

If you weren't CPU bound on draw calls/setup (which I agree has yet to be proven), what is the point of the test? What are you even measuring that has any relevance to Mantle?

Also this obviously isn't a great way to measure API overhead, since there's a lot more going on in an engine that relates to objects/instancing than just draw calls/3D API interaction. Typically you want to set up a microbenchmark that changes a specific set of state between each draw call (and the cost of the draw call will vary depending on which state that is!) and ensure that everything is offscreen/culled on the GPU.
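The structure of such a microbenchmark can be sketched roughly as below. This is only an outline under stated assumptions: `set_state` and `draw` are hypothetical callables standing in for real graphics API calls, the stubs do no rendering at all, and in a real test the geometry would be kept offscreen or culled so that the GPU is not the bottleneck. Python's own call overhead would dominate here, so the sketch shows the shape of the loop, not meaningful numbers.

```python
import time


def measure_draw_cost(set_state, draw, num_draws, num_states=2):
    """Average CPU seconds per (state change + draw) pair.

    set_state and draw are hypothetical stand-ins for real API calls.
    One specific piece of state is toggled between every draw, since the
    cost of a draw depends on which state changed before it.
    """
    start = time.perf_counter()
    for i in range(num_draws):
        set_state(i % num_states)
        draw()
    elapsed = time.perf_counter() - start
    return elapsed / num_draws


# Stub "API" that only counts calls, so the harness can run anywhere.
calls = {"state": 0, "draw": 0}


def fake_set_state(index):
    calls["state"] += 1


def fake_draw():
    calls["draw"] += 1


avg_seconds = measure_draw_cost(fake_set_state, fake_draw, 10_000)
print(f"average CPU time per draw: {avg_seconds * 1e6:.3f} us")
```

Running the same loop once per kind of state change (shader, texture, constant buffer, and so on) would give one cost figure per state type rather than a single blended number.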

KKRT · Sep 30, 2013

Andrew Lauritzen said:
If you weren't CPU bound on draw calls/setup (which I agree has yet to be proven), what is the point of the test? What are you even measuring that has any relevance to Mantle?

Also this obviously isn't a great way to measure API overhead, since there's a lot more going on in an engine that relates to objects/instancing than just draw calls/3D API interaction. Typically you want to set up a microbenchmark that changes a specific set of state between each draw call (and the cost of the draw call will vary depending on which state that is!) and ensure that everything is offscreen/culled on the GPU.

That was only a test of how CryEngine 3 reacts to an increasing number of draw calls when nothing else is affected, apart from creating some sprites.

BRiT · Oct 1, 2013

*AHEM* Moved off-topic conversation to here: Logical Fallacies - Cheap Bastards with DSUB Displays with High-End Video Cards
