Is HDR "Free" for the Xenos?

scooby_dooby said:
From the horse's mouth, as they say:

"FiringSquad: You said earlier that EDRAM gives you AA for free. Is that 2xAA or 4x?

ATI: Both, and I would encourage all developers to use 4x FSAA. Well I should say there’s a slight penalty, but it’s not what you’d normally associate with 4x multisample AA. We’re at 95-99% efficiency, so it doesn’t degrade it much is what I should say, so I would encourage developers to use it. You’d be crazy not to do it."

http://www.firingsquad.com/features/xbox_360_interview/page4.asp

95-99% makes it sound like a 1% penalty for 4x in some situations. Using fouad's logic, the Xbox 360 could get 8x for 2%, and 16x AA for 3%!! ;)
 
scooby_dooby said:
From the horse's mouth, as they say:

"FiringSquad: You said earlier that EDRAM gives you AA for free. Is that 2xAA or 4x?

ATI: Both, and I would encourage all developers to use 4x FSAA. Well I should say there’s a slight penalty, but it’s not what you’d normally associate with 4x multisample AA. We’re at 95-99% efficiency, so it doesn’t degrade it much is what I should say, so I would encourage developers to use it. You’d be crazy not to do it."

http://www.firingsquad.com/features/xbox_360_interview/page4.asp

So who should I believe?!! Dave on Beyond3D, or this?!!

please tell me... :rolleyes:
 
Believe whatever you want.

Believe they wasted 3 years and 100 million transistors for nothing. Believe that the 256 GB/s of internal bandwidth is meaningless, and that the 192 processing units embedded in the eDRAM are also completely useless.

Believe that the EDRAM is worthless and the ATI reps are lying.

Whatever you want!
 
fouad said:
scooby_dooby said:
From the horse's mouth, as they say:

"FiringSquad: You said earlier that EDRAM gives you AA for free. Is that 2xAA or 4x?

ATI: Both, and I would encourage all developers to use 4x FSAA. Well I should say there’s a slight penalty, but it’s not what you’d normally associate with 4x multisample AA. We’re at 95-99% efficiency, so it doesn’t degrade it much is what I should say, so I would encourage developers to use it. You’d be crazy not to do it."

http://www.firingsquad.com/features/xbox_360_interview/page4.asp

So who should I believe?!! Dave on Beyond3D, or this?!!
please tell me... :rolleyes:

Fouad, the answers are the same.

1% or less hit at 720p with 2x MSAA.

5% (total, not additional) at 720p with 4x MSAA.

It's easy. The X360 only suffers a 1% or 5% (or so) penalty at 720p. Period.
 
Now I'm confused. Jawed said the hits for 2X and 4X were additive for consumer GPUs, since you have to do both if you do 4X, but you guys are saying it's not additive. What's the truth?
 
Jawed said:
ralexand said:
Thanks, realsky. So it looks like the R520 will have a raw shading power advantage over Xenos.

LOL, supposedly Ruby 3, the R520 demo, runs faster on XB360 than it does on R520...

Who knows if that's true though, eh?

Jawed
It would be nice to know where you heard that.
 
ralexand said:
Now I'm confused. Jawed said the hits for 2X and 4X were additive for consumer GPUs, since you have to do both if you do 4X, but you guys are saying it's not additive. What's the truth?
You could see it as "additive" if you think of it as a 4% decrease in efficiency going from 2x MSAA to 4x MSAA...

There you go from 1% to 5%... the overall numbers don't change, which is the point.
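To put the "total, not additional" point in numbers, here is a rough sketch. It assumes "efficiency" simply means the fraction of no-AA frame rate you keep, which is my reading of the ATI quote and not something ATI spelled out. The 60 fps baseline and the function name are made up for illustration.

```python
# Assumption: "efficiency" = fraction of the no-AA frame rate that survives.
HIT_2X_TOTAL = 0.01   # ~1% total hit with 2x MSAA (figure quoted in this thread)
HIT_4X_TOTAL = 0.05   # ~5% total hit with 4x MSAA (figure quoted in this thread)

def fps_with_aa(base_fps, total_hit):
    """Frame rate left after applying a total AA penalty to a no-AA baseline."""
    return base_fps * (1.0 - total_hit)

# The step from 2x to 4x is the difference between the two totals.
extra_hit_2x_to_4x = HIT_4X_TOTAL - HIT_2X_TOTAL

base = 60.0  # hypothetical no-AA frame rate
print(f"2x MSAA: {fps_with_aa(base, HIT_2X_TOTAL):.1f} fps")
print(f"4x MSAA: {fps_with_aa(base, HIT_4X_TOTAL):.1f} fps")
print(f"extra hit going 2x -> 4x: {extra_hit_2x_to_4x:.0%}")
```

Either way you slice it, the 4x number stays at about 5% off the no-AA baseline.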
 
blakjedi said:
ralexand said:
Now I'm confused. Jawed said the hits for 2X and 4X were additive for consumer GPUs, since you have to do both if you do 4X, but you guys are saying it's not additive. What's the truth?
You could see it as "additive" if you think of it as a 4% decrease in efficiency going from 2x MSAA to 4x MSAA...

There you go from 1% to 5%... the overall numbers don't change, which is the point.
That's interesting. I still don't understand why you would need to do 2X and then turn around and do 4X AA. It seems like the 2X AA would be part of the 4X. With 8X AA, do you have to do 2X, then 4X, then 6X, then 8X to get the final image? I thought 4X just meant you took 4 samples, not that you took 2 samples and then another 4 samples.
 
It's my understanding that each "pixel pipeline" is only capable of generating the actual samples in groups of 2, per clock cycle. So per pixel, it takes multiple cycles to generate multiple sets of 2 samples to achieve more than 2xAA.

Jawed
 
Having said that, it's not clear if this limitation exists in Xenos.

It may be that the parent <-> daughter bus cannot transport more than 2x AA samples per clock per pixel, and in fact the full gamut of 4xAA samples can be generated by the GPU parent, but it takes longer to transport them into EDRAM than 2xAA samples.

There's lots of vagueness surrounding Xenos, sadly...

Jawed
 
There's nothing unclear about it; Dave's article addresses this issue:
Despite references to 192 processing elements in to the ROP's within the eDRAM we can actually resolve that to equating to 8 pixels writes per cycle, as well as having the capability to double the Z rate when there are no colour operations. However, as the ROP's have been targeted to provide 4x Multi-Sampling FSAA at no penalty this equates to a total capability of 32 colour samples or 64 Z and stencil operations per cycle.
Each pixel pipe/ROP can produce 4 zixels per clock cycle, so from this standpoint there is no hit when the hardware switches from 2x to 4x AA.
4X AA is a native mode on Xenos.
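The arithmetic in the quoted passage can be checked with a quick sketch. The only inputs are the figures from Dave's article (8 pixel writes per cycle, 4 samples per pixel, double Z rate when there are no colour operations); the variable names are just for illustration.

```python
# Figures taken from the quoted article; names are illustrative only.
PIXELS_PER_CYCLE = 8     # pixel writes per cycle across the ROPs
SAMPLES_PER_PIXEL = 4    # 4x multisampling
Z_RATE_MULTIPLIER = 2    # Z/stencil rate doubles with no colour operations

colour_samples_per_cycle = PIXELS_PER_CYCLE * SAMPLES_PER_PIXEL
z_samples_per_cycle = colour_samples_per_cycle * Z_RATE_MULTIPLIER

print(f"colour samples per cycle: {colour_samples_per_cycle}")
print(f"Z/stencil ops per cycle:  {z_samples_per_cycle}")
```

That gives the 32 colour samples and 64 Z/stencil operations per cycle mentioned in the article.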
 
nAo said:
There's nothing unclear about it; Dave's article addresses this issue:
Despite references to 192 processing elements in to the ROP's within the eDRAM we can actually resolve that to equating to 8 pixels writes per cycle, as well as having the capability to double the Z rate when there are no colour operations. However, as the ROP's have been targeted to provide 4x Multi-Sampling FSAA at no penalty this equates to a total capability of 32 colour samples or 64 Z and stencil operations per cycle.
Each pixel pipe/ROP can produce 4 zixels per clock cycle, so from this standpoint there is no hit when the hardware switches from 2x to 4x AA.
4X AA is a native mode on Xenos.
Thanks for the explanation, guys. Where did fouad go?
 
ralexand said:
Thanks for the explanation, guys. Where did fouad go?
He needed 4 pages to discover something he could have read on the front page :LOL:
 
fouad said:
scooby_dooby said:
ATI: Both, and I would encourage all developers to use 4x FSAA. Well I should say there’s a slight penalty, but it’s not what you’d normally associate with 4x multisample AA. We’re at 95-99% efficiency, so it doesn’t degrade it much is what I should say, so I would encourage developers to use it. You’d be crazy not to do it."

So who should I believe?!! Dave on Beyond3D, or this?!!

please tell me... :rolleyes:
Hopefully you've read all this and are willing to apologise!

The performance hit for 720p with 2xAA is next to nothing compared with no AA. The performance hit for 4xAA is very small compared with 2xAA. The tiles don't have a huge overhead, though there is an overhead of course, so using more tiles is not hugely costly, and it is certainly cheaper than thrashing RAM bandwidth on a conventional GPU without eDRAM.

Before saying it's all lying BS the next time you disagree with something, find out exactly how it works beforehand...
 
Let's try to use some logic and analyse what you are saying:

Tiling mode OFF:
Going from no AA at 720p to 2x AA at 720p, the performance hit is next to nothing (less than 1%), and going from 2x AA at 720p to 4x AA at 720p, the performance hit is very small (less than 5%).

Tiling mode ON:
Going from no AA at 720p to 2x AA at 720p, the performance hit is next to nothing (going from no tiles to 2 tiles) (less than 1%), and going from 2x AA at 720p to 4x AA at 720p, the performance hit is very small (going from 2 tiles to 3 tiles) (less than 5%).

:LOL:



And you want me to agree ?!!!!!!!!!!!!!!!!!!!!!!!! :oops:
Please, use your brains. GOD created brains for humans, so they must use them to arrive at the easy, obvious truth (it's clear that for you it's a difficult truth).
If someone working for ATI or Microsoft says "we have free AA!", then why should everyone believe him?!!!
Because if you believe this, then just believe that since NV30 and the Radeon equivalents we have had free AA (according to the creators, of course), and that the EE of the PS2 could do real-time Toy Story, etc.
 
Since you were wrong before, care to explain why you're right now? :)
You're not proving anything, you know?
 
fouad said:
Let's try to use some logic and analyse what you are saying:

Tiling mode OFF:
Going from no AA at 720p to 2x AA at 720p, the performance hit is next to nothing (less than 1%), and going from 2x AA at 720p to 4x AA at 720p, the performance hit is very small (less than 5%).

Tiling mode ON:
Going from no AA at 720p to 2x AA at 720p, the performance hit is next to nothing (going from no tiles to 2 tiles) (less than 1%), and going from 2x AA at 720p to 4x AA at 720p, the performance hit is very small (going from 2 tiles to 3 tiles) (less than 5%).

:LOL:



And you want me to agree ?!!!!!!!!!!!!!!!!!!!!!!!! :oops:
Please, use your brains. GOD created brains for humans, so they must use them to arrive at the easy, obvious truth (it's clear that for you it's a difficult truth).
If someone working for ATI or Microsoft says "we have free AA!", then why should everyone believe him?!!!
Because if you believe this, then just believe that since NV30 and the Radeon equivalents we have had free AA (according to the creators, of course), and that the EE of the PS2 could do real-time Toy Story, etc.

As usual you have fundamental flaws with your logic.

You are making a comparison between MSAA on the Xenos with tiling on and with tiling off. THAT'S your problem, as there is no such thing as AA on the Xenos with tiling off (not at 720p anyway). 10 MB of eDRAM is insufficient for ANY AA at 720p without tiling.

All the numbers regarding performance hits for AA are with Tiling ON.
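A back-of-the-envelope sketch shows why. It assumes 32-bit colour plus 32-bit Z/stencil per sample (8 bytes) and ignores any tile alignment rules the real hardware may impose, so treat the tile counts as an estimate and not as the actual hardware's tiling scheme. The function name is made up for illustration.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024   # 10 MB of eDRAM
WIDTH, HEIGHT = 1280, 720        # 720p
BYTES_PER_SAMPLE = 4 + 4         # assumption: 32-bit colour + 32-bit Z/stencil

def framebuffer_bytes(samples):
    """Size of the colour + Z buffers for a given MSAA sample count."""
    return WIDTH * HEIGHT * BYTES_PER_SAMPLE * samples

def tiles_needed(samples):
    """Minimum number of eDRAM-sized tiles needed to cover the frame."""
    return math.ceil(framebuffer_bytes(samples) / EDRAM_BYTES)

for samples in (1, 2, 4):
    size_mb = framebuffer_bytes(samples) / (1024 * 1024)
    print(f"{samples}x: {size_mb:5.1f} MB -> {tiles_needed(samples)} tile(s)")
```

Under those assumptions, no AA fits in a single pass (about 7 MB), 2x needs 2 tiles (about 14 MB) and 4x needs 3 tiles (about 28 MB), which lines up with the "2 tiles" and "3 tiles" figures mentioned earlier in the thread.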
 
fouad said:
Let's try to use some logic and analyse what you are saying:

Tiling mode OFF:
Going from no AA at 720p to 2x AA at 720p, the performance hit is next to nothing (less than 1%), and going from 2x AA at 720p to 4x AA at 720p, the performance hit is very small (less than 5%).
If MS can do AA without tiling at no hit, why the hell'd they use tiling then? The reality is that the overhead of tile-based rendering is small. If you have any facts or evidence to the contrary, please present them, but months of discussion here have seen acceptance of this principle.

I was willing to give you the benefit of the doubt when you entered this forum, but I see I was wrong. Regardless of whether you're right or wrong (and you're wrong), your attitude sucks.

"Hi guys. You're all thick for believing technical documents defining hardware implementation and in-depth hardware reviews from knowledgeable members of your community."

How to make friends and influence enemies 101 :D
 
Hi, (I am new on the board!)

Basically, you have to understand that the eDRAM's purpose is to save memory/VRAM bandwidth.

I think it is safe to say that the eDRAM effectively doubles the memory bandwidth and allows 4xAA with (nearly) no bandwidth cost.

Due to tiling, there are minimal costs (even in bandwidth) for AA at 720p or higher resolutions. This is because for every object the corresponding tile has to be determined (during a Z-only pass), and some objects/polygons will lie in more than one tile, which costs extra texture bandwidth.

AFAIK the RSX has nothing comparable to the eDRAM, so the frame buffer lives in VRAM. This effectively halves the memory bandwidth and makes AA cost a lot of memory bandwidth.

If you consider that the memory bandwidth on both GPUs (Xenos/RSX) is roughly what you need to read the whole VRAM once per frame at 60 Hz (I assume 256 MB for both here), which obviously isn't that much, then eDRAM seems to be a really clever solution to avoid a memory bandwidth bottleneck.

Actually, both GPUs probably even have to share memory bandwidth with the CPU, but I am not sure about that.
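As a rough check of the "read the whole VRAM once per frame" estimate, here is a small sketch. The 256 MB and 60 Hz figures are the ones assumed in the post; the 22.4 GB/s value is the commonly quoted GDDR3 bandwidth figure and is treated here purely as an assumed input, not as a measured number.

```python
VRAM_BYTES = 256 * 1024 * 1024   # 256 MB, as assumed in the post
FRAMES_PER_SECOND = 60           # 60 Hz, as assumed in the post
ASSUMED_BANDWIDTH = 22.4e9       # bytes/s; commonly quoted figure, an assumption here

# Bandwidth needed to touch every byte of VRAM exactly once per frame.
bytes_per_second_for_one_read = VRAM_BYTES * FRAMES_PER_SECOND

# How many full passes over VRAM the assumed bandwidth allows per frame.
full_reads_per_frame = ASSUMED_BANDWIDTH / bytes_per_second_for_one_read

print(f"one full read per frame needs {bytes_per_second_for_one_read / 1e9:.1f} GB/s")
print(f"assumed bandwidth allows {full_reads_per_frame:.2f} full reads per frame")
```

So under these assumptions the bus can sweep the whole memory a bit more than once per frame, which is the same ballpark as the estimate above.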
 