Is UE4 indicative of the sacrifices devs will have to make on consoles next gen?

Andy already debunked that (not to mention that your question is ill-formed). Not in a made-up, hand-wavy way, as seems to be the case with this dumb "magical consoles have magical to-the-metal benefits" meme that refuses to die, but with actual numbers. Are you purposely being dense?

I'm responding to Sebbi's post. So you're calling Sebbi a liar. Got it. Also, there is nothing magical about overhead. The term "magic" is usually used by people who don't understand the science behind such things.
 
I'm responding to Sebbi's post.
Yeah, that's called cherry-picking to support an agenda, which is all you've been doing in this thread. You're not trying to learn; you're trying to collect a comfortable little collection of quotes that you can repeat ad nauseam, often out of context or in irrelevant ways, to support your preconceived notion that the PS4 is God's gift to mankind.

It's tiresome fanboyism, and it has no place on Beyond3D.
 
So, you're calling Sebbi a liar.

I think you need to step back a bit and take a breather. We are primarily talking about the CPU overhead of DirectX; however, due to your lack of understanding, you have assumed this is a discussion of GPU overhead, and thus are getting needlessly passive-aggressive and antagonistic when people try to educate you.
 
Sebbi, could you give some examples of what could be done on a GPU like the PS4's in 1000 cycles? How often would you have to use a mutex lock per frame in a PC game you have developed?

Your question is pointless, because the mutex overhead has no bearing on GPU performance (unless the GPU is starving for data); it just means the CPU is less efficient at feeding data to the GPU.
Much of the benefit Sebbi is talking about also goes away with DX11 and the consoles moving to more modern architectures. PCs also tend to have multithreaded drivers that do a pretty good job of hiding the actual costs involved in a draw call on a modern GPU. I am not talking about submitting data from multiple threads here, but rather about effectively hiding the high latency of certain operations.

There is still a huge benefit to fixed function hardware, and there would be even with identical overhead: you can optimize dataflow to the system's strengths rather than playing the "do generic optimizations and let the fast hardware deal with it" game.
And if you're dynamically creating, reading, or writing GPU resources, there is still a massive difference in performance.
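As a rough sense of scale for the mutex point above, here is a minimal, self-contained C++ sketch (mine, not from the thread) that times an uncontended std::mutex lock/unlock pair. The 1.6 GHz used to convert to cycles is just the commonly quoted Jaguar-class console CPU clock, and a contended lock costs considerably more than this happy path.

// Times an uncontended std::mutex lock/unlock to put the "~1000 cycles" figure
// in context. Results vary by CPU and OS; this only measures the no-contention case.
#include <chrono>
#include <cstdio>
#include <mutex>

int main() {
    std::mutex m;
    constexpr int kIters = 1'000'000;

    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < kIters; ++i) {
        m.lock();    // uncontended acquire
        m.unlock();  // release
    }
    auto end = std::chrono::high_resolution_clock::now();

    double ns = std::chrono::duration<double, std::nano>(end - start).count() / kIters;
    std::printf("avg lock+unlock: %.1f ns (~%.0f cycles at 1.6 GHz)\n",
                ns, ns * 1.6);  // 1.6 GHz is the commonly quoted Jaguar-class clock
    return 0;
}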
 
We say draw call overhead, but we mean state-change and validation overhead (and a few other things, like copies/shadow copies...).

Still, the DX back-end is not the thin layer you can expect on a console.

I've not used bindless on OpenGL, but I do expect it to be close to console performance. (I'd still expect it to be slightly slower, though.)
Also, there are a number of things you can't easily do on PC because of the architecture (PCIe, two memory pools...).
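For what bindless buys you on the GL side, here is a rough C/C++ sketch of how ARB_bindless_texture handles replace per-draw texture binds. It assumes a current GL 4.x context that exposes GL_ARB_bindless_texture, already-created texture objects, and a handleUbo buffer object; uploadBindlessHandles is just an illustrative name, not anything from the thread.

// Sketch only (not a full program). The point: the texture handle is queried once,
// so the per-draw glBindTexture and associated validation work largely disappears.
#include <GL/glew.h>   // or any loader that exposes the ARB_bindless_texture entry points

void uploadBindlessHandles(const GLuint* textures, int count, GLuint handleUbo)
{
    GLuint64 handles[256];                                // assume count <= 256 for the sketch
    for (int i = 0; i < count; ++i) {
        handles[i] = glGetTextureHandleARB(textures[i]);  // one-time handle query
        glMakeTextureHandleResidentARB(handles[i]);       // make it GPU-visible
    }
    // Upload all handles once; shaders then index this buffer per draw/instance
    // instead of the app re-binding a texture before every draw call.
    glBindBuffer(GL_UNIFORM_BUFFER, handleUbo);
    glBufferData(GL_UNIFORM_BUFFER, count * sizeof(GLuint64), handles, GL_STATIC_DRAW);
}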
 
I've not used Direct State Access on OpenGL, but I do expect it to be close to console performance. (I'd still expect it to be slightly slower, though.)
Meh, direct state access on GL makes a really old API somewhat less ancient and crazy. The pointer-based state on DX is efficient enough.

The big overhead is just all the resource management that happens to ensure applications can run alongside one another in Windows (console software doesn't currently have to share), and indeed some design catering to discrete GPUs with separate memory pools. But bindless textures, etc. eliminate most of that overhead, and even without that there are tons of ways to write good code with fewer state changes in DX10+. People just don't do it currently because they port ancient DX9 codebases naively.

Anyway, I really don't see a huge issue with the numbers I posted. Can they be better? Sure, but anything can. Modern engines have demonstrated pretty conclusively that the number of state changes is not the huge bottleneck to visual quality that people once thought it was.

Just stop micromanaging tiny bits of state and feeding them to the GPU like it's still the DX9 days, and start letting it pull some of its own data from memory.
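As one concrete (if simplified) illustration of "fewer state changes", here is a self-contained C++ sketch, not tied to any real API, that sorts a batch of draw submissions by a packed shader/material key and counts how many redundant state transitions disappear; the scene sizes and IDs are made up.

// Sorting submissions by a coarse-to-fine state key so redundant state changes collapse.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <random>
#include <vector>

struct DrawItem {
    uint32_t shaderId;
    uint32_t materialId;
    uint32_t meshId;
    uint64_t key() const {  // coarse-to-fine: shader, then material, then mesh
        return (uint64_t(shaderId) << 40) | (uint64_t(materialId) << 20) | meshId;
    }
};

static int countStateChanges(const std::vector<DrawItem>& items) {
    int changes = 0;
    for (size_t i = 1; i < items.size(); ++i)
        if (items[i].shaderId != items[i - 1].shaderId ||
            items[i].materialId != items[i - 1].materialId)
            ++changes;
    return changes;
}

int main() {
    std::mt19937 rng(42);
    std::vector<DrawItem> items(5000);
    for (auto& it : items)  // made-up scene: 8 shaders, 200 materials, 1000 meshes
        it = {uint32_t(rng() % 8), uint32_t(rng() % 200), uint32_t(rng() % 1000)};

    std::printf("unsorted state changes: %d\n", countStateChanges(items));
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.key() < b.key(); });
    std::printf("sorted   state changes: %d\n", countStateChanges(items));
    return 0;
}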

/rant. Guess I've been looking at too many game traces lately ;)
 
Your question is pointless, because the mutex overhead has no bearing on GPU performance (unless the GPU is starving for data); it just means the CPU is less efficient at feeding data to the GPU.
Much of the benefit Sebbi is talking about also goes away with DX11 and the consoles moving to more modern architectures. PCs also tend to have multithreaded drivers that do a pretty good job of hiding the actual costs involved in a draw call on a modern GPU. I am not talking about submitting data from multiple threads here, but rather about effectively hiding the high latency of certain operations.

There is still a huge benefit to fixed function hardware, and there would be even with identical overhead: you can optimize dataflow to the system's strengths rather than playing the "do generic optimizations and let the fast hardware deal with it" game.
And if you're dynamically creating, reading, or writing GPU resources, there is still a massive difference in performance.
I like the civility in your post. It's refreshing. I still suspect I would have been attacked by three people even if I had changed the "G" in GPU to a "C".

Surely you can see the confusion here. I've shown tech articles, graphs (from a forum member's tests on performance differences), and quoted a statement from Timothy Lottes of Nvidia. However, it's all just tossed aside. I get no articles or graphs in return, just hostility. And I'm supposed to just disregard all that information?

I don't have a problem dropping the information I have in favor of a new theory. It just has to be done scientifically. Provide me with at least as much proof as I've brought to the table. Don't just dismiss it, as a few others have already tried. Don't attack me when I ask for evidence. Show me something.

That scientific principle is what I would like to think B3D is mostly about. Respect is the other thing I would expect on B3D. You have shown it where others have recently failed. Is there any way you can provide the counter-evidence I seek?
 
Don't worry, Lucid, I just saved you from the mistreatment: for at least 6 months you can go hang out in a place that is fit for your "science", where no meanies who actually program these things will hound you with anything that isn't made up, unfounded, or perfectly aligned with how you choose to view the world. That being said, let us get back to what is a rather interesting discussion.
 
Crazy hypothetical question: we now know that console tech and PC tech are closer than ever before for next gen. We also know that a mid-range PC can keep up with next-gen consoles, and it seems rather easy to build a PC with much more performance.

Question: is it possible (from a purely technical side of things), and is it likely, that next gen we will see a hack/emulator which allows console-exclusive titles to run on PC? Say... the new Killzone on my shiny PC. Or what about the new Halo? Just put in the PS4/X720 Blu-ray and play. And spinning this even further... how about the new Killzone with an emulator on PC at 60Hz, or with some AA, or a higher resolution? What about exclusive titles built directly with, e.g., UE4?
 
There are certain things that are still going to be difficult to emulate. Being able to code around them or brute-force them in PC versions doesn't mean an emulator is going to cut it.
 
Crazy hypothetical question: we now know that console tech and PC tech are closer than ever before for next gen. We also know that a mid-range PC can keep up with next-gen consoles, and it seems rather easy to build a PC with much more performance.

Question: is it possible (from a purely technical side of things), and is it likely, that next gen we will see a hack/emulator which allows console-exclusive titles to run on PC? Say... the new Killzone on my shiny PC. Or what about the new Halo? Just put in the PS4/X720 Blu-ray and play. And spinning this even further... how about the new Killzone with an emulator on PC at 60Hz, or with some AA, or a higher resolution? What about exclusive titles built directly with, e.g., UE4?

From what I understand, as the hardware gets more complex, emulation gets almost exponentially more difficult.

Thus an SNES emulator is somewhat easy; a PS3 one, forget it.

It seems like there is a crossover point where emulation of consoles becomes practically impossible: more or less the PS2/Xbox gen.

There is some thinking that since the PS4 is built from PC components, unlike the PS3, it will be easier. But I don't really buy that. The counterargument is that the Xbox (the first one) was PC components and was never emulated. The counterargument to the counterargument is that, apparently, people didn't try very hard to emulate the Xbox.

But my guess is: forget PS4 emulation, more or less forever.

Also, the one area the PS4 really excels in is RAM. I think of it almost as an 8GB video card (not exactly, of course, but I'm guessing the majority of that RAM will be used for graphics). In that case, it will be two generations before PC cards really catch up. They are at 2GB now as standard, and I'm not even sure 4GB will become the standard for the next generation of cards. I'm not sure how a (fairly standard) 8GB DDR3 + 2GB video card PC stacks up to the PS4 in terms of quickly GPU-accessible data, but I'm guessing not really well.
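To put some rough numbers on "quickly GPU-accessible data", here is a small back-of-the-envelope C++ calculation comparing a commonly quoted PCIe 3.0 x16 peak (~15.75 GB/s) against the PS4's often-cited 176 GB/s memory bandwidth. These are theoretical peaks, not benchmarks, so treat the output as illustrative only.

// Rough, illustrative arithmetic: moving 1 GB of assets over the PCIe bus versus
// reading it from local console memory. Real-world numbers are lower than these peaks.
#include <cstdio>

int main() {
    const double gigabytes      = 1.0;
    const double pcie3_x16_gbps = 15.75;  // theoretical peak, PCIe 3.0 x16
    const double ps4_mem_gbps   = 176.0;  // commonly quoted PS4 memory bandwidth

    std::printf("PCIe 3.0 x16 : %.1f ms per GB\n", gigabytes / pcie3_x16_gbps * 1000.0);
    std::printf("local GDDR5  : %.1f ms per GB\n", gigabytes / ps4_mem_gbps * 1000.0);
    return 0;
}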
 
Crazy hypothetical question: we now know that console tech and PC tech are closer than ever before for next gen. We also know that a mid-range PC can keep up with next-gen consoles, and it seems rather easy to build a PC with much more performance.

Question: is it possible (from a purely technical side of things), and is it likely, that next gen we will see a hack/emulator which allows console-exclusive titles to run on PC? Say... the new Killzone on my shiny PC. Or what about the new Halo? Just put in the PS4/X720 Blu-ray and play. And spinning this even further... how about the new Killzone with an emulator on PC at 60Hz, or with some AA, or a higher resolution? What about exclusive titles built directly with, e.g., UE4?

Since PCs can run the OpenGL API, I would think it would be far easier than, say, the PS3 with its exotic hardware, or the PS2 (same there). But I would think some things would be more difficult or require a bit of brute force. Compute processing might be troublesome.

Also, the one area the PS4 really excels in is RAM. I think of it almost as an 8GB video card (not exactly, of course, but I'm guessing the majority of that RAM will be used for graphics). In that case, it will be two generations before PC cards really catch up. They are at 2GB now as standard, and I'm not even sure 4GB will become the standard for the next generation of cards. I'm not sure how a (fairly standard) 8GB DDR3 + 2GB video card PC stacks up to the PS4 in terms of quickly GPU-accessible data, but I'm guessing not really well.

4GB and 6GB are soon going to be more of the standard. There is no way (certainly not with the OS and other game data loaded) that the PS4 will use more than 6GB for graphics data. Heck, I don't even think it will go past 4GB considering what else has to be present in RAM.
 
Crazy hypothetical question: we now know that console tech and PC tech are closer than ever before for next gen.
The original Xbox was closer to a PC (it was a PC ;)).

Question: is it possible (from a purely technical side of things), and is it likely, that next gen we will see a hack/emulator which allows console-exclusive titles to run on PC?
Depends entirely on how the software is written. If it's sitting on an API, potentially (as is true of any device: you can emulate iPad and Android devices on PC, for example, because the hardware is abstracted from the system). Whereas if the developers are using hardware specifics in a way that doesn't map to a PC, then no. I don't think it'll happen with any great accuracy/reliability myself.
 
4GB and 6GB are soon going to be more of the standard. There is no way (certainly not with the OS and other game data loaded) that the PS4 will use more than 6GB for graphics data. Heck, I don't even think it will go past 4GB considering what else has to be present in RAM.

Yep, Nvidia and AMD have pretty much confirmed that 3GB+ will be standard for mid-range GPUs.

--
About draw calls: the CE3 SDK has some interesting info about the current gen:
http://freesdk.crydev.net/display/SDKDOC2/Rendering+Performance+Guidelines

It seems that CE3 games target 2000 draw calls as a maximum, and there is even performance data for the PS3:
"If you want to hit a target of maximum 33 ms (30 fps), a drawcall which is vertex shader bound (Zpass/shadows, part of general pass) on the PS3 can cost about 10 microseconds on the GPU (more or less, depending on the amount of vertices to process), so this means you can estimate approximately 2k * 10us = ~20 ms."
It would be quite interesting to check whether Crysis 3 has overcome that barrier, because that data is mostly from the Crysis 2 development period.
Unfortunately, the sys_enable_budgetmonitoring 1 and even r_displayinfo cvars are currently blocked in Crysis 3 ;\, but SDK 3.5 should be out in a few days/weeks and should allow for some tests on C3 levels.

Still, 2000 is very low in comparison to Andrew Lauritzen's tests, so 20-25k should be enough for almost any next-gen game.
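A quick sanity check of the arithmetic in the quoted guideline, and of what 20-25k draw calls would imply per call at the same 33 ms frame budget (all figures taken only from the posts above):

// Back-of-the-envelope check of the CE3 numbers quoted above, and nothing more.
#include <cstdio>

int main() {
    const double frame_ms        = 33.0;   // 30 fps target
    const double ps3_drawcalls   = 2000.0;
    const double ps3_us_per_call = 10.0;   // figure from the CE3 guidelines

    std::printf("PS3 estimate: %.0f ms of the %.0f ms frame\n",
                ps3_drawcalls * ps3_us_per_call / 1000.0, frame_ms);

    const double next_gen_calls = 25000.0; // upper end mentioned in the post
    std::printf("25k calls fit in %.0f ms only if each costs <= %.2f us\n",
                frame_ms, frame_ms * 1000.0 / next_gen_calls);
    return 0;
}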
 
Confirmation that SVOGI was ditched and that the lighting and shadowing in the Infiltrator demo are prebaked:

http://forums.epicgames.com/threads...ng-system!?p=31241619&viewfull=1#post31241619

Hey guys, rendering team lead from Epic here.

Fully dynamic lighting and precomputed lighting are just two tools in our UE4 toolbox. We have games being made like Fortnite that are using fully dynamic lighting, no lighting build times, and the game has full flexibility to change what it desires at runtime. In the case of Fortnite, this is used to great effect with building and harvesting of resources. We don't yet have a solution for dynamic GI in the fully dynamic lighting path, this is something we hope to address in the future.

On the other hand, using precomputed lighting where you don't need dynamicness frees up a lot of processing power. The infiltrator demo that we released at GDC leverages this heavily. In short: we would have had to scale down Infiltrator massively without precomputing some of the lighting. There are over 1000 lights in some of the scenes, and about half of those cast shadows. Dynamic shadows have a huge cost on graphics hardware, but precomputed shadows are very cheap. Our general purpose reflection method (Reflection Environment) also relies on pre-captured probes. By having the general purpose reflection method be cost efficient, we were able to spend the extra GPU time on Screen Space Reflections, which provides extra detail where it is available (due to screenspace limitations).
https://www.youtube.com/watch?v=-bLOi3mo9NE (watch in HD)
Now there are some workflow costs to precomputed lighting (creating lightmap UVs, build time), and this is something we hope to improve significantly in the future.

Precomputed lighting is also really useful for scaling down, to low end PC, 60fps, mobile, etc.

In summary: UE4 supports multiple tiers of lighting options, and games can use what suits them best. Fully dynamic lighting provides maximum interactivity, editor workflow and game design flexibility; precomputed lighting provides maximum quality and performance.

Let me know if you have any specific questions and I'll do my best to answer them.
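As a purely illustrative gut check of the "dynamic shadows have a huge cost" point in the quote above: the per-light cost below is my own assumption, not an Epic figure, and the light count is simply "about half" of the ~1000 lights mentioned.

// Why ~500 dynamic shadow-casting lights blow a 33 ms frame budget while
// precomputed shadows do not. The per-light cost is hypothetical and optimistic.
#include <cstdio>

int main() {
    const int    shadow_casting_lights = 500;   // about half of the ~1000 lights quoted
    const double ms_per_dynamic_shadow = 0.1;   // assumed, optimistic per-light cost
    const double frame_budget_ms       = 33.0;  // 30 fps

    const double dynamic_ms = shadow_casting_lights * ms_per_dynamic_shadow;
    std::printf("dynamic shadows alone: %.0f ms (frame budget: %.0f ms)\n",
                dynamic_ms, frame_budget_ms);
    return 0;
}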
 
Just saw that as well. Infiltrator is still impressive, but I'm curious to see what that massively scaled-down version without precomputed lighting would've looked like.
 