D3D will fire an assert if you try.
But you can use tile and memexport in the scene, just not at the same time?
(or does the tile bracket contain the entire frame?)
Yes, you can use memexport, but not inside predicated tiling. This is not a hardware limitation. The GPU does not know the "concept" of predicated tiling directly. PTR is only a very clever software technique built upon several hardware features and the API sits at a higher level than things like drawing a primitive.
To cut a long story short, it's just a bit more than recording a command buffer and submitting it to the GPU once per tile (the command buffer may or may not contain the entire scene, it depends on the engine), while predicating away all the primitives outside the current tile.
When it comes to the GPU, if it sees a command to do memexport in the command buffer, it will just execute it: since the command buffer is executed once per tile, that memexport command will be executed once per tile. The GPU doesn't know about predicated tiling, it only sees a stream of commands to execute.
The API limitation that prevents memexport during predicated tiling comes from here, which makes a lot of sense in the context of console programming, where you don't want to add a potentially big performance pitfall behind the back of the developer.
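The replay-per-tile behavior described above can be sketched in a few lines. This is purely illustrative pseudocode in Python, not the real D3D/Xbox 360 API; the command representation and function names are invented for the example:

```python
# Illustrative sketch: predicated tiling replays one recorded command buffer
# once per tile, predicating away draws whose screen bounds miss the tile.
# A memexport-style command has no tile bounds, so it runs every replay.

def rects_overlap(a, b):
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def render_with_predicated_tiling(commands, tiles):
    memexport_runs = 0  # counts how often the export side effect executes
    for tile in tiles:
        for cmd in commands:
            if cmd["type"] == "draw":
                # Predication: skip primitives that don't touch this tile.
                if rects_overlap(cmd["bounds"], tile):
                    pass  # rasterize into eDRAM for this tile
            elif cmd["type"] == "memexport":
                # The GPU just executes it -- once per tile, not once per frame.
                memexport_runs += 1
    return memexport_runs
```

With a frame split into three tiles, a single memexport command in the buffer would execute three times, which is exactly the silent performance (and correctness) pitfall the API assert guards against.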
Why do people think eDRAM is only useful when tiling is used?

I think it's predicated tiled rendering. (shoot me if I'm wrong )
edit: Even though it's a good compromise, I would say the pros will have to be pretty darn amazing to outweigh the cons. I mean, the system has been out for a year now. The best-looking title is a game based on UE3, which doesn't use tiling, and tons of games are multiplatform, of which possibly none will use tiling. It also seems like UE3 will become a very popular engine for the X360, and whether future updates will make tiling possible remains to be seen. So I might be sadly mistaken when I say that it looks like only a handful of first-party titles will actually use tiling, and even in those titles developers will have to do lots of work to get it up and running properly. So often when I think about this, I feel like maybe it could have been more useful if the transistors had been put elsewhere.
What exactly are you talking about here?

The cost of a wide external bus or die area for embedded DRAM are not the only resources to trade against to solve the bandwidth issue. The rendering pipeline could afford some deepening to offer a more optimal approach.
D3D will fire an assert if you try.

Thanks for the explanation. I never actually tried it myself. :smile:
Why do people think eDRAM is only useful when tiling is used?
It eliminates far and away the biggest bandwidth consumer in the pipeline from the GDDR3. It saves them transistors in compression, decompression, and the memory controller. It gives game devs the huge fillrate BW they're used to on the PS2. It lets X360 draw 64 samples per clock for shadow maps. All this for about 15% of the console's total silicon, maybe half that when you subtract the ROPs. In fact, according to NEC's specs, it's only 18 mm2 for the memory cells alone.
EDRAM makes a lot of sense on a console even without tiling.
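The scale of the bandwidth the eDRAM absorbs can be checked with back-of-envelope arithmetic, using the publicly quoted Xenos figures (8 ROPs at 500 MHz, 4x multisampling, 32-bit color plus 32-bit depth/stencil, read-modify-write for blending and Z-test). These are peak estimates, not measurements:

```python
# Peak ROP <-> eDRAM traffic on Xenos, versus the shared 22.4 GB/s GDDR3 bus.
rops = 8
clock_hz = 500e6
samples = 4               # 4x multisampling
bytes_per_sample = 4 + 4  # 32-bit color + 32-bit depth/stencil
rw_factor = 2             # read-modify-write for blend and Z-test

edram_bw = rops * clock_hz * samples * bytes_per_sample * rw_factor
print(edram_bw / 1e9)     # 256.0 GB/s of framebuffer traffic kept off GDDR3
```

That 256 GB/s figure matches the commonly cited internal bandwidth between the ROPs and the eDRAM array, and it is more than ten times what the external GDDR3 bus could supply.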
How do you get that it's 15% of the silicon? The eDRAM die is ~70 mm^2 supposedly (where Xenos is ~180 and RSX ~240). That's a significant chunk of change. Further, the transistor count has to be 80+ million, right? You need 8 bits per byte, and it's 10 megabytes. The eDRAM die is stated at 105M transistors, so the vast majority of it is eDRAM.
Seriously, if eDRAM were that cheap/small, why didn't Microsoft put 30 MB in there so you get 4xAA at 720p without tiling?
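The 30 MB figure falls out of simple buffer arithmetic, assuming the usual 8 bytes per multisample (32-bit color plus 32-bit depth/stencil) held in eDRAM:

```python
import math

# Why 10 MB forces tiling at 720p with 4xAA.
width, height = 1280, 720
samples = 4               # 4x multisampling
bytes_per_sample = 4 + 4  # 32-bit color + 32-bit depth/stencil

buffer_bytes = width * height * samples * bytes_per_sample
print(buffer_bytes / 2**20)   # ~28.1 MB -- nearly 3x the 10 MB of eDRAM

tiles = math.ceil(buffer_bytes / (10 * 2**20))
print(tiles)                  # 3 tiles needed
```

So a 720p 4xAA frame needs roughly three 10 MB tiles, which is why ~30 MB of eDRAM would have made tiling unnecessary at that resolution.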
acert93 said: This gen there doesn't seem to have been a nice solution to fit all needs. Next one, with 64MB and larger eDRAM modules, we may see it being more robust.

Current rate of progression in rendering is making eDRAM (purely for render-buffer usage) less relevant, not more. Not to mention that restricting it to render-buffer usage only is another tradeoff that limits its usefulness further.
How do you get that it's 15% of the silicon? The EDRAM die is ~70mm^2 supposedly (where Xenos is ~180 and RSX ~240).

Read my quote again: total console silicon. You're excluding the CPU and other smaller things too.
Fully basing rendering upon tiling to use a very large amount of tiles saves both the external bus -- using on-chip SRAM whose speed is even faster than eDRAM and makes fully deferred texturing/shading practical -- and also conserves processor die area -- using very small tiles which afford an even more consistent level of image quality through deferred rendering.

That's what I figured you were talking about, but deferred rendering complicates a LOT of things. It's not simply a matter of a "deeper rendering pipeline" as you put it.
Current rate of progression in rendering is making eDRAM (purely for render-buffer usage) less relevant, not more. Not to mention restricting it to render-buffer usage only is another tradeoff that limits its usefulness further.

True, but I don't see this progression continuing for much longer. A lot of the longer shaders today seem to be long simply because they can be, not because they offer much noticeable improvement. I personally think we're going to have to increase the data per pixel more than the math per pixel to get more realistic graphics. Xenos/RSX can do what, ~10,000 fp ops per final pixel at 720p/30fps?
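That ballpark checks out against the commonly quoted Xenos peak shader rate (48 unified ALUs, each issuing a vec4 plus scalar MAD per clock at 500 MHz); the exact ALU accounting here is an assumption based on those published figures, not an official breakdown:

```python
# Sanity check on the "~10,000 fp ops per final pixel" figure for Xenos.
alus = 48
flops_per_alu_clock = (4 + 1) * 2   # vec4 + scalar co-issue, MAD = 2 ops
clock_hz = 500e6

peak_flops = alus * flops_per_alu_clock * clock_hz   # 240 GFLOPS peak
pixels_per_s = 1280 * 720 * 30                       # 720p at 30 fps

ops_per_pixel = peak_flops / pixels_per_s
print(round(ops_per_pixel))   # ~8681 -- on the order of 10^4 per final pixel
```

And that is a theoretical peak shared across vertex and pixel work, so the practical per-pixel budget is lower still.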
It's also likely that memory will grow in capacity much faster than speed, and rendering ability (Gsamples/s) will also increase much faster than memory speed. High res textures make a big difference in realism, and they will increase BW usage. BW per ROP has been decreasing a lot for many years. For cheap consoles, wider buses don't look like an option.