AMD Mantle API [updating]

UT you are really not getting the message about your signal-to-noise ratio. GCN is a rather different beast from ATI's prior art, your (conspiracy) theory is baseless.
Despite your accusations of noise, he is right: the reason Mantle will not support earlier cards is not technical.
It's not something that can be "separated". It is part and parcel of the API, hence the entire API needs to be written for the architecture.
 
As far as I can tell GameGPU is using MSI Afterburner:

(Google translate)
"All cards were tested at maximum graphics quality using MSI Afterburner."

Like FRAPS, Afterburner is a well-known tool for benchmarking DX and OGL, but how accurate is it for benchmarking Mantle? I am surprised it even produces numbers for Mantle. I don't have the hardware to test this myself. For those who do: do FRAPS and Afterburner show FPS even under Mantle? If they do, how do their numbers differ from BF4's own performance monitor?

As an Official MSI Rep (got this one from overclock.net though) Look at all the 0 under Framerate:FPS!

a4095fd5_MantleTests2.jpeg
 
Has anyone on the internet made an extensive multi-CPU comparison using Mantle?
 
As an Official MSI Rep (got this one from overclock.net though) Look at all the 0 under Framerate:FPS!

a4095fd5_MantleTests2.jpeg

Hi Neliz,

Thanks for clearing that up.

Can we trust Afterburner's memory usage under Mantle? If so, I wonder why it's so high.
 
Has anyone on the internet made an extensive multi-CPU comparison using Mantle?

I'm really waiting for Techreport to actually look at the details (like CPU/memory graphs) instead of the basic "well, here it clearly shows that value X is higher than value Y".

Hi Neliz,

Thanks for clearing that up.

Can we trust Afterburner's memory usage under Mantle? If so, I wonder why it's so high.

I think the memory usage is okay. One of the biggest benefits that repi also describes is better control of memory allocation. So, finally a game that will fill my GDDR5 with useful stuff!

Not sure whether that is also the reason BF4 seems to load faster, or whether that is the result of the update that came along with the Mantle patch.
 
Computerbase.de did a decent job comparing CPU performance between very different clock speeds:

Ncdzhe7.png




PCLab.pl also has a more extensive comparison:
F9Bf3T0.png



130€ CPUs at default speeds land within about 5% of 280€ overclocked CPUs.
This is the thing I like the most about Mantle :)
 
Despite your accusations of noise, he is right: the reason Mantle will not support earlier cards is not technical.

Is there more in-depth information on the technical aspects of the Mantle driver and the hardware it utilizes that says this?

Mantle may be relying on some of GCN's changed memory addressing and software control to implement its low-level control.
The CPU-side optimizations work by putting more onus on the developer, but there may be a system component to it.
GCN is much more flexible with setup and control, while the VLIW designs are more restricted and may not have designs validated for the lower level interaction offered to programs under Mantle.

Some of the shortcuts Mantle relies on don't match up well with AMD's older GPUs. I'd argue that there might be more commonality with low-level access to graphics features, since there are elements that haven't evolved as much with regards to texture and ROP features compared to the execution and software control portions of the design. That wouldn't help with the CPU-side stuff assumed to be more portable.
 
If so, and even older AMD GPUs would be problematic, how do they plan to widen support to Nvidia too?

Anyway, even if their intentions are different, I still have to go and buy a new card. So the end result is the same: it drives sales, even though older configurations may need this boost more.

At this point Mantle is very future-centric. And what will happen with all future AMD architectures?
 
If so, and even older AMD GPUs would be problematic, how do they plan to widen support to Nvidia too?
Nvidia's architectures in general have been very flexible for far longer.
That aside, I don't think AMD needs to do the actual widening for other vendors' architectures.
I'm not sure if they'd feel that bad if some random gotcha made Mantle unusable for much of Nvidia's lineup, but the general programmability of Fermi and beyond coupled with the level of abstraction in Mantle probably allows workarounds for many incompatibilities.
 
As an Official MSI Rep (got this one from overclock.net though) Look at all the 0 under Framerate:FPS!

A huge sorry for the off topic but... you are alive!!!!! :eek:
Had not seen a post from you in ages...

Computerbase.de did a decent job comparing CPU performance between very different clock speeds:

Ncdzhe7.png

Hmm... so it looks like Mantle performance does not improve much with CPU clock speed? From 2 GHz onwards it did not gain much, as long as the CPU has (and is using) 4 cores. Or could that be a GPU bottleneck? From what I could gather from the German, the test is only at 1080p, so probably not?
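One rough way to approach the GPU-bottleneck question is to compare how much the frame rate moves against how much the CPU clock moves. A minimal sketch follows; the helper name is hypothetical and the numbers are made up for illustration, not taken from the Computerbase chart.

```python
def clock_scaling(fps_low, fps_high, ghz_low, ghz_high):
    """Fraction of a CPU clock increase that shows up as extra frame rate.

    Near 1.0: performance tracks the CPU clock (CPU-limited).
    Near 0.0: something else, possibly the GPU, is the limit.
    """
    fps_gain = fps_high / fps_low - 1.0
    clock_gain = ghz_high / ghz_low - 1.0
    return fps_gain / clock_gain


# Hypothetical figures for illustration only.
flat = clock_scaling(60.0, 63.0, 2.0, 4.0)     # clock doubled, FPS barely moved
tracking = clock_scaling(40.0, 76.0, 2.0, 4.0)  # FPS nearly follows the clock
print(f"flat case: {flat:.2f}, tracking case: {tracking:.2f}")
```

A value near zero only says the CPU clock is not the limit. Rerunning at a lower resolution would be the usual cross-check for whether the GPU is.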

130€ CPUs at default speeds land within about 5% of 280€ overclocked CPUs.
This is the thing I like the most about Mantle :)

Yup, me too.
 
To me the 860/2600/3770/4770 aren't really i7s... they are all i5s. Real i7s are LGA1366 and above, damnit!!!
I have a 3770K @ 4.4 (Noctua D14), and to me that's a good upper bound for your "average serious gamer".

My 2600 @ 4.3 kicks the butt of your i920 "real i7".
 
Some of the figures being thrown about suggest that there's more to Mantle than just reductions in API execution overhead. There seems to be some additional concurrency allowed between CPU and GPU with Mantle.

[After hunting around]

Confirmed?

http://techreport.com/review/25683/delving-deeper-into-amd-mantle-api/3

It looks like DX11 makes some games worse than CPU bound. It makes them core or thread bound.

Mantle seems to allow more cores to simultaneously drive the GPU.
 
It looks like DX11 makes some games worse than CPU bound. It makes them core or thread bound.
The term "CPU bound" is overloaded and a bit misleading here. Many "CPU bound" games are bottlenecked on a single threaded API submission thread, but the rest of the CPU is relatively idle. Applications that see massive gains from Mantle typically fall into this category.

Note that even pure reductions in API overhead would show these gains. Mantle also enables better multithreaded submission but IMO that's less important than just the raw overhead reduction.
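To make the distinction concrete, here is a toy model of a frame that is limited either by the GPU or by the API submission work. It is only a sketch: the function name, the per-draw cost, and the draw-call count are all assumptions for illustration, not measurements of DX11 or Mantle.

```python
def frame_time_ms(draw_calls, us_per_draw, gpu_ms, submit_threads=1):
    """Toy model: the frame takes as long as its slowest stage.

    draw_calls     -- draw calls issued per frame (assumed)
    us_per_draw    -- CPU cost per draw call in microseconds (assumed)
    gpu_ms         -- time the GPU needs for the frame
    submit_threads -- threads allowed to submit, assuming a perfect split
    """
    submit_ms = draw_calls * us_per_draw / 1000.0 / submit_threads
    return max(gpu_ms, submit_ms)


# Hypothetical workload: 5000 draws per frame, GPU needs 10 ms.
baseline = frame_time_ms(5000, 4.0, gpu_ms=10.0)                     # one heavy submission thread
cheaper = frame_time_ms(5000, 2.0, gpu_ms=10.0)                      # half the per-draw overhead
threaded = frame_time_ms(5000, 4.0, gpu_ms=10.0, submit_threads=2)   # same overhead, two threads
print(baseline, cheaper, threaded)
```

In this model, halving the per-draw overhead on one thread gives the same frame time as splitting the original cost across two threads. That is why a pure overhead reduction already shows the gains, with the rest of the CPU idle in both cases.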
 
AMD's drivers have historically shown stronger evidence of becoming driver-bound sooner than Nvidia's, and that is before the outright non-implementation of things like multithreaded command lists (debates as to whether it would help aside).

I haven't kept up with the multithreaded render support situation, though I recall in the early days that it took a fair amount of bespoke driver development on a per-application level and hidden driver shenanigans to make Nvidia's implementation worthwhile.
If a more universal method hasn't come to pass, then AMD's failure to implement is somewhat mitigated by the implementation having to be more fragile than it should be.
 
Why are these games bottlenecked to just one thread?
Because that's the entire problem with the graphics APIs right now that Mantle is trying to address... in practice you can basically only submit from one thread and that thread ends up getting pretty heavy with lots of state changes and draw calls.

This is why high IPC dual core machines tend to do just as well or better in game as low IPC quad cores. (And also why six cores are rarely any faster than four in games.) Games aren't generally making much use of the other cores, so lightening that submission thread relieves a huge bottleneck.

It would be great to see more games make use of the extra processing resources for gameplay, AI, physics, etc. but with the relatively weak console CPUs this generation and an increasing push towards running on a wider range of machines and power ranges (ultrabooks, tablets), it may not happen.

I haven't kept up with the multithreaded render support situation, though I recall in the early days that it took a fair amount of bespoke driver development on a per-application level and hidden driver shenanigans to make Nvidia's implementation worthwhile.
It's mostly implemented but doesn't really help (anyone) much outside of toy cases. There are core problems in the design that limit the potential gains quite significantly, so it's not really worth the effort at the moment.
 
I thought Nvidia was able to cobble something together that was at least a performance gain in Civ5.
It was a pretty targeted effort, at least at the time.
 
I thought Nvidia was able to cobble something together that was at least a performance gain in Civ5.
Yeah, you can get some minor gains in a few applications, but it's typically stuff like eating *two* submission threads for barely double-digit performance improvements (in CPU bound cases)... not really that exciting, and it doesn't scale at all beyond 2. It's clearly something that needs a fundamental rethinking, à la Mantle.
 
Yeah, you can get some minor gains in a few applications, but it's typically stuff like eating *two* submission threads for barely double-digit performance improvements (in CPU bound cases)... not really that exciting, and it doesn't scale at all beyond 2. It's clearly something that needs a fundamental rethinking, à la Mantle.

I agree that the gains aren't commensurate with the amount of special-case work and the fragility of the solution. It's just that I believe Civ5 goes beyond a toy case.
 
...


I swear to god people if I hear any more FPS deltas I'm going to start banning people ;) repi already warned you guys! This is Beyond3D, I want to see frame times (in ms) :) And I definitely don't want to see any "minimum frame rate" measured over some interval nonsense. Frame time distributions everyone, say it three times :)
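For anyone who wants to report this way, here is a minimal sketch of summarizing a frame time distribution instead of an average FPS. The helper names and the two sample captures are made up for illustration, and the timestamps would have to come from whatever logging tool actually works with the API being tested.

```python
import math
import statistics


def frame_times_ms(timestamps_s):
    """Convert per-frame present timestamps (seconds) into frame times (ms)."""
    return [(b - a) * 1000.0 for a, b in zip(timestamps_s, timestamps_s[1:])]


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def summarize(times_ms):
    """Mean, median, 99th percentile and worst frame time, all in ms."""
    return {
        "mean_ms": statistics.fmean(times_ms),
        "median_ms": statistics.median(times_ms),
        "p99_ms": percentile(times_ms, 99),
        "max_ms": max(times_ms),
    }


# Two made-up captures of 100 frames each, with the same average frame time.
smooth = [16.0] * 100
spiky = [12.0] * 95 + [92.0] * 5

print(summarize(smooth))
print(summarize(spiky))
```

Both captures would report the same average FPS, but the 99th percentile and the worst frame separate them immediately.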


....

If I only had time to test I would do it properly :cry:
These were just static-scene frame rates, and they fluctuate by +/- 5 FPS, hence the word 'average'. But because it was a static scene, you can calculate the frame time for this specific spot.
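The conversion itself is just frame time in ms = 1000 / FPS. A small sketch, with hypothetical averages rather than the figures from these runs, shows what a +/- 5 FPS fluctuation means in milliseconds:

```python
def fps_to_ms(fps):
    """Frame time in milliseconds for a steady frame rate."""
    return 1000.0 / fps


def frame_time_range_ms(avg_fps, swing_fps):
    """Return (shortest, longest) frame time for avg_fps +/- swing_fps."""
    return fps_to_ms(avg_fps + swing_fps), fps_to_ms(avg_fps - swing_fps)


# Hypothetical averages, each fluctuating by +/- 5 FPS.
for avg in (30.0, 60.0, 120.0):
    best, worst = frame_time_range_ms(avg, 5.0)
    print(f"{avg:.0f} FPS +/- 5 -> {best:.2f} to {worst:.2f} ms "
          f"(spread {worst - best:.2f} ms)")
```

The same 5 FPS swing covers a much wider span of milliseconds at low frame rates than at high ones, which is one reason FPS deltas are hard to compare across setups.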

Maybe tomorrow I will have a bit of time to test Mantle more thoroughly. I definitely have issues with Mantle and CF on my setup: I already managed to try that, and it just crashes within a few seconds.
 