Love_In_Rio
Am I wrong in understanding that their "general purpose reflection method" is the star of the show and what gives the Infiltrator demo that CGI look?
I think it's more because of the physically based materials and rendering (everything being mostly dark and metallic also helped here, I suppose). That, and the wonderful and copious amounts of blur to trick your eyes even more.
> We say Draw Call overhead but we mean State Changes and Validation overhead. (and a few other things, like copies/shadow copies...)

We simply call it draw call overhead, as we cannot measure the state change overhead directly. The draw calls submit the changed state to the GPU. Without draw calls, the state changes are very cheap indeed (as pretty much nothing happens).
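To make "we can only measure it through draw calls" concrete, here is a minimal D3D11 timing sketch (my own illustration, not code from the thread; `ctx`, `srvA` and `srvB` are assumed to be a valid immediate context and two textures, with the rest of the pipeline already bound). It times CPU-side submission of N draws with and without a state change between them:

```cpp
#include <d3d11.h>
#include <chrono>

// Times CPU-side submission of 'drawCount' draws. With changeState set,
// the bound texture alternates every draw, forcing the driver to
// re-validate state; without it, the draws are submitted back to back.
double TimeDraws(ID3D11DeviceContext* ctx,
                 ID3D11ShaderResourceView* srvA,
                 ID3D11ShaderResourceView* srvB,
                 int drawCount, bool changeState)
{
    auto t0 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < drawCount; ++i) {
        if (changeState) {
            ID3D11ShaderResourceView* srv = (i & 1) ? srvA : srvB;
            ctx->PSSetShaderResources(0, 1, &srv);
        }
        ctx->Draw(3, 0); // one tiny triangle per call
    }
    ctx->Flush(); // hand the accumulated commands to the driver
    auto t1 = std::chrono::high_resolution_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Note that this only captures the CPU-side driver cost; the GPU may still be chewing on the commands afterwards, which is exactly why the state change overhead can't be isolated from the draw call that submits it.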
> This is all easily testable yourselves guys so stop making crap up.

My DX10/11 test results were already posted here in the forums a few years ago. Granted, my old 5870 isn't exactly the same as my new 7970, and AMD's multithreaded rendering has been fixed since. At that time it was slightly slower to use multithreaded rendering than a single thread (no matter how many cores you used). Multithreaded rendering was fixed in AMD drivers after Civ5 launched (it was the first AAA title to use DX11 multithreaded rendering), so the situation is now better.
> You simply don't need to change state as much since GPUs these days can dynamically index/branch a lot better than they used to.

Yes, you can go all the way down to a single (indirect) draw call per viewport (plus some indirect dispatch calls) if you keep the whole scene / mesh / texture data on the GPU and do the whole scene setup / culling on the GPU (negating the draw call cost completely). That's possible (and also saves lots of data traffic from CPU->GPU), but I don't think the majority of devs are going to implement systems like that any time soon (as long as they have to also support current gen DX9+ consoles and DX10 low end PCs with the same code base).
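For reference, a bare-bones sketch of that idea on DX11-class hardware (again my own illustration, not the poster's engine code; `device`, `ctx`, `cullShader` and `numObjects` are assumed to exist): a compute shader culls the scene and writes the draw arguments itself, so the CPU issues a single argument-less draw.

```cpp
#include <d3d11.h>

// Argument buffer the compute shader fills with
// { VertexCountPerInstance, InstanceCount,
//   StartVertexLocation, StartInstanceLocation }.
ID3D11Buffer* argsBuf = nullptr;
D3D11_BUFFER_DESC desc = {};
desc.ByteWidth = 4 * sizeof(UINT);
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
desc.MiscFlags = D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS;
device->CreateBuffer(&desc, nullptr, &argsBuf);

// Per frame: GPU-side culling writes the arguments (the UAV binding
// and the scene buffers are omitted here)...
ctx->CSSetShader(cullShader, nullptr, 0);
ctx->Dispatch((numObjects + 63) / 64, 1, 1);
// ...then the entire visible set goes out in one call.
ctx->DrawInstancedIndirect(argsBuf, 0);
```

The CPU never touches the per-draw data at all, which is where the CPU->GPU traffic saving mentioned above comes from.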
> With a texture change in between, I can do 30k draws at 60fps. Changing a constant buffer reference, 60k. Updating and changing a constant buffer (map_discard, set), 35k.

That's pretty much the same as what a single x360 thread can do (when the low level API is used instead of DirectX). I also stated this a few years ago in my DX11 performance analysis post.
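The "map_discard, set" case being measured corresponds to this familiar D3D11 pattern (sketch only; `cb` is assumed to be a constant buffer created with D3D11_USAGE_DYNAMIC and D3D11_CPU_ACCESS_WRITE, and `perDrawData` some per-draw struct):

```cpp
#include <d3d11.h>
#include <cstring>

// Update the dynamic constant buffer, rebind it, then draw. This whole
// sequence is the per-draw cost behind the 35k figure above.
D3D11_MAPPED_SUBRESOURCE mapped;
if (SUCCEEDED(ctx->Map(cb, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped))) {
    memcpy(mapped.pData, &perDrawData, sizeof(perDrawData));
    ctx->Unmap(cb, 0);
}
ctx->VSSetConstantBuffers(0, 1, &cb);
ctx->DrawIndexed(indexCount, 0, 0);
```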
> So yeah, I'm not buying the argument that you can't do pretty much whatever you want already on PC. Like I said, some of the best looking stuff (Frostbite, etc) uses a few thousand draw calls, and you could use 10x more and still be fine. Sure the overhead will be lower on consoles, but prove to me that it matters.

Agreed, it doesn't matter for current generation games that are designed for eight year old CPUs; those CPUs are less powerful than a single Sandy Bridge core. Current gaming PCs have at least four cores and games don't utilize them all, so the render thread can take a whole core and the GPU driver another without any problems. The driver overhead was a much bigger problem when high end gaming PCs still used Core 2 Duos (and other dual core CPUs). A similar problem will be visible when next generation game ports start to appear on PC: next gen game logic/physics will likely tax a quad core CPU pretty heavily already, so there will no longer be two free CPU cores that can simply be dedicated to draw call / driver overhead.
> We simply call it draw call overhead, as we cannot measure the state change overhead directly.

Sure, I'm just pointing out that the overhead depends on what state has changed between draw calls (and to some extent the total number of resources referenced by a command buffer), so I don't like lumping it all into one basket.
> At that time it was slightly slower to use multithreaded rendering than a single thread (no matter how many cores you used). Multithreaded rendering was fixed in AMD drivers after Civ5 launched (it was the first AAA title to use DX11 multithreaded rendering), so the situation is now better.

My results were pure single-threaded. I still don't heavily recommend multithreaded rendering, for a number of reasons.
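For anyone who wants to test this themselves, the DX11 multithreaded path under discussion is the deferred context API, roughly like this (a minimal sketch; `device`, `immediateCtx`, `cb` and `indexCount` are assumed):

```cpp
#include <d3d11.h>

// Each worker thread records into its own deferred context; the main
// thread executes the finished command lists in order.
ID3D11DeviceContext* deferredCtx = nullptr;
device->CreateDeferredContext(0, &deferredCtx);

// On a worker thread: record draws as usual.
deferredCtx->VSSetConstantBuffers(0, 1, &cb);
deferredCtx->DrawIndexed(indexCount, 0, 0);

ID3D11CommandList* cmdList = nullptr;
deferredCtx->FinishCommandList(FALSE, &cmdList);

// On the main thread: submit to the immediate context.
immediateCtx->ExecuteCommandList(cmdList, FALSE);
cmdList->Release();
```

Whether this is a win depends heavily on the driver, which is exactly the AMD situation described above.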
> That's possible (and also saves lots of data traffic from CPU->GPU), but I don't think the majority of devs are going to implement systems like that any time soon (as long as they have to also support current gen DX9+ consoles and DX10 low end PCs with the same code base).

True, but you don't have to go that far to get the majority of the benefit. Some basic culling, shadowing and instancing support usually gets you below 5000 draw calls in any reasonable scene.
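The instancing part, for illustration (a sketch under assumed names; `Vertex`, `InstanceData`, `meshVB`, `instanceVB`, `indexCount` and `instanceCount` are placeholders, with a matching input layout assumed): a thousand copies of the same mesh become one draw call instead of a thousand.

```cpp
#include <d3d11.h>

struct Vertex       { float pos[3]; float uv[2]; }; // per-vertex data
struct InstanceData { float world[16]; };           // per-instance transform

// Slot 0: the shared mesh. Slot 1: one transform per instance.
ID3D11Buffer* vbs[2] = { meshVB, instanceVB };
UINT strides[2]      = { sizeof(Vertex), sizeof(InstanceData) };
UINT offsets[2]      = { 0, 0 };
ctx->IASetVertexBuffers(0, 2, vbs, strides, offsets);

// All instances in a single call; the input layout marks slot 1's
// elements as D3D11_INPUT_PER_INSTANCE_DATA.
ctx->DrawIndexedInstanced(indexCount, instanceCount, 0, 0, 0);
```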
> Constant updates are much more efficient on consoles, since the GPU can directly fetch the constant data from a memory address. No transfers (or map/unmap synchronization) are needed at all.

To be fair, on PCs with shared memory/integrated graphics you can avoid the copies as well (map/discard semantics are sufficient). There's still a fairly high overhead, but honestly, constants are definitely one place where you can start just fetching them from memory (buffers) if you become CPU-limited. Frankly, I'm not really sure that we need something as specific as constant buffers in the future (caches should be sufficient), but obviously devs will have to deal with them for a couple more years.
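One way to "just fetch constants from memory" on DX11-class PCs, as a hedged sketch (`ObjectConstants`, `maxObjects` and the buffer names are illustrative, not an established API): put per-object data in a structured buffer and index it from the shader, instead of rebinding a constant buffer per draw.

```cpp
#include <d3d11.h>

struct ObjectConstants { float world[16]; }; // illustrative payload

ID3D11Buffer* constantsBuf = nullptr;
ID3D11ShaderResourceView* constantsSRV = nullptr;

D3D11_BUFFER_DESC bd = {};
bd.ByteWidth = maxObjects * sizeof(ObjectConstants);
bd.Usage = D3D11_USAGE_DYNAMIC;
bd.BindFlags = D3D11_BIND_SHADER_RESOURCE;
bd.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
bd.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
bd.StructureByteStride = sizeof(ObjectConstants);
device->CreateBuffer(&bd, nullptr, &constantsBuf);

D3D11_SHADER_RESOURCE_VIEW_DESC sd = {};
sd.Format = DXGI_FORMAT_UNKNOWN; // required for structured buffers
sd.ViewDimension = D3D11_SRV_DIMENSION_BUFFER;
sd.Buffer.FirstElement = 0;
sd.Buffer.NumElements = maxObjects;
device->CreateShaderResourceView(constantsBuf, &sd, &constantsSRV);

// In HLSL: StructuredBuffer<ObjectConstants> gObjects : register(t0);
// the vertex shader then reads gObjects[instanceID] directly.
ctx->VSSetShaderResources(0, 1, &constantsSRV);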
> In our engine we do not change textures at all, as we are using virtual texturing (both on PC and on x360), so I haven't needed to profile texture change performance on any platform. Similar virtualization techniques are possible for meshes and constants (and are performing very well). So I expect the "draw call" overhead to be a solved issue in a few years (when games no longer need to support DX9 and DX10).

Agreed, and this is a good approach. The whole spoon-feeding of every minor change into a serial command buffer has to stop, even on the GPU side. We'll of course keep working to reduce the overhead as much as possible, but it's not a good long-term strategy anyway.
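The indirection at the heart of virtual texturing can be sketched like this (a conceptual toy, not the poster's actual implementation; the page counts and names are assumptions): a page table remaps virtual UVs into a physical texture cache, so the shader never needs a texture rebind.

```cpp
#include <cstdint>

struct PageEntry { uint16_t physX, physY; }; // physical page coords in the cache

// Remaps a virtual UV into the physical texture cache. 'pageTable' has
// pagesPerSide * pagesPerSide entries; 'pageTexels' is the page size and
// 'cacheTexels' the physical cache size, both in texels. In practice
// this logic runs in the pixel shader, with the page table in a small
// indirection texture.
void VirtualToPhysicalUV(const PageEntry* pageTable, int pagesPerSide,
                         int pageTexels, int cacheTexels,
                         float u, float v, float& outU, float& outV)
{
    int px = (int)(u * pagesPerSide);   // which virtual page
    int py = (int)(v * pagesPerSide);
    const PageEntry& e = pageTable[py * pagesPerSide + px];
    float fu = u * pagesPerSide - px;   // position inside the page
    float fv = v * pagesPerSide - py;
    outU = (e.physX + fu) * pageTexels / (float)cacheTexels;
    outV = (e.physY + fv) * pageTexels / (float)cacheTexels;
}
```

Real implementations also handle page borders for filtering and run a feedback pass to decide which pages to stream in, which is where most of the complexity lives.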
> A similar problem will be visible when next generation game ports start to appear on PC: next gen game logic/physics will likely tax a quad core CPU pretty heavily already, so there will no longer be two free CPU cores that can simply be dedicated to draw call / driver overhead.

Possibly, but I hope developers will mostly add complexity to the shaders/GPU work itself rather than to the command interface between the two (which, as mentioned, is a bad long-term plan). Several of the big PC developers have already started to move in this direction, and I imagine it will still be a win even on consoles.
Also, the one area the PS4 really excels in is RAM. I rather think of it almost as an 8GB video card (not exactly, of course, but I'm guessing the majority of that RAM will be used for graphics). In that case, it will be two generations before PC cards really even catch up. They are at 2GB now as standard, and I'm not even sure 4GB will become the standard for the next generation of cards. I'm not sure how a (fairly standard) 8GB DDR3 + 2GB video card PC stacks up to the PS4 in terms of quickly GPU-accessible data, but I'm guessing not really well.
Thus when the PS4 launches, it's entirely possible the top end PC GPUs may be sporting between 8 and 12GB of local memory.
The most widespread video card VRAM amount is 1024 MB, and there isn't any GPU in the Steam top 10 stats that's significantly better than what's in Orbis (1st place is even the Intel HD 3000...). That's PC gaming right now, not some limited edition $500 GPU with double the memory.
Sony's choice of GDDR5 will certainly fuel some evolution for the standard. 8Gb chips by 2015-2016 seem likely now.
> As soon as you throw in some dynamic lighting, less static environments, better shaders, better post, more transparencies and whatnot, the needed memory will rapidly increase.

Fully dynamic lighting actually uses less memory than static. Static lighting for big levels needs baked lightmap data for every surface and/or large grids of precomputed light probes. Fully dynamic lighting, on the other hand, only calculates the lighting for the currently visible viewport. The goal is to process only the things that affect the surfaces seen in the current image (since lighting is recalculated every frame). A good fully dynamic lighting solution therefore doesn't want to generate/use huge data structures (and couldn't even afford the clock cycles and memory bandwidth to regenerate them over and over again every frame).
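To attach some rough, purely illustrative numbers to that (my assumptions, not figures from the post): one uncompressed 4096×4096 RGBA8 lightmap is 4096 × 4096 × 4 bytes = 64 MB, and a big level needs many such maps, so the baked data cost scales with level size. A 1080p RGBA16F lighting buffer, by contrast, is 1920 × 1080 × 8 bytes ≈ 16 MB no matter how large the level gets.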
I agree that less static environments would consume more memory, especially if the environment changes are permanent, and especially if we have more fine-grained control over the environment (being able to break/modify terrain/objects in various ways). Current gen games tend to remove all marks of destruction/bodies/shrapnel very fast. But the biggest thing that more memory brings is ease of development. With a small memory you need to stream everything in/out rapidly (just in time), and it's hard to create algorithms that predict things properly in all scenarios (since HDD seeks are slow and popping isn't desirable). Of course, the bigger memory also allows more variety in the game world (fewer copies of the same model/texture are required, and more unique models/textures are possible).
Currently the top is the GeForce Titan at $999 with 6GB, 2GB less than the PS4.
Then I see on Newegg there are some 670s and 680s with 4GB, though at a significant price premium. Then of course the 7970/7950 at 3GB.
I think it will be a while before the standard enthusiast GPU is sporting 8GB. Again, that "standard" right now is 2GB (7850, 7870, 660 Ti, 660, 650 Ti Boost, most 680s and 670s, etc.).
I'm not convinced the standard mid-class enthusiast GPU will move to 4GB next generation, and there's no clear indication of when the next generation of GPUs is coming either (it may not even be 2013, according to rumors). But even if it does move, that's still half of the PS4.
Though ironically, next gen consoles will likely speed up GPU VRAM increases anyway, because of ports.
> "8 GB GDDR5" seems to have taken over the internet, with hordes of hardware junkies only able to compare GDDR5 with GDDR5, as if memory with less than 176 GB/s of bandwidth can't actually store anything.

Yep. 8GB of GDDR5 isn't as impressive as laypeople think. Sony wanted 8GB, but they were going to release with 4GB, and that would have been plenty. Pushing the memory amount up required another 4GB from somewhere, but that doesn't mean the intended purpose of that extra 4GB necessitated high-bandwidth GDDR (and let's be honest, the bandwidth of the PS4 effectively halved per GB when the RAM was doubled, so how often you can read and write each GB of RAM dropped considerably). But adding DDR3 would have meant a new motherboard design and developer complications, so Sony took an expensive but convenient approach. It's not comparable to 8GB of GDDR5 on someone's GPU though.

I seriously doubt the PS4's unified 8GB will give a remarkable improvement over a fast GPU with plenty of system RAM to preload resources from. Potentially a game could instantly switch from one set of assets to another within that 8GB of RAM where a PC would have to copy them over from DDR3 across the slow PCIe bus, but that's not a likely scenario, especially when playing cross-platform titles designed around Durango and PC limits. And a (much) faster GPU will handle computed effects better anywhere.
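To put the halving in concrete numbers (simple arithmetic on the figures above): at 176 GB/s, 8 GB can be fully read or written at most 176 / 8 = 22 times per second, whereas the originally planned 4 GB could have been traversed 176 / 4 = 44 times per second. Doubling capacity without raising bandwidth halves how often each gigabyte can be touched within a frame budget.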