onanie said:
I don't think I need to remind you that the bandwidth between the parent and daughter die is 32 GB/s. Compressed data doesn't magically uncompress after crossing that bridge.
I seriously doubt the cost of the hardware for unpacking on the daughter die comes anywhere close to outweighing the benefit of having standard Z/color compression sitting in the pipeline. But, eh, email ATI and ask 'em. Heh.
onanie said:
After all that, I must still ask - does Xenos' eDRAM have redundancy?
No. Ultimately ATI decided it wasn't worth a few extra transistors to increase their yields on the memory; they'd just absorb whatever defects they got. Honestly, if that's not a satisfactory answer, then email ATI. I don't know if their answer will be any clearer than the one they generally give to someone asking exactly how much texture cache they have, or how many transistors each part of the chip consists of.
onanie said:
Getting rather philosophical, no? My question originally was "What kind of visual effects does dynamic branching allow exclusively?"
Er, I'll explicitly state it then: don't expect there to be a multitude of things NOW; ask for a list in a few years. Nevertheless, there's speeding up shadows by branching on how many samples to take - fewer in complete shadow and far more on edges - to get better soft shadows without the pain of uselessly oversampling the entire shadow. There's selectively supersampling in the shader to reduce aliasing, even just in certain sections rather than the entire shader for every pixel. Or parallax occlusion mapping. Exclusive effects? No. But I'd call it BS to say it's not worth quite a lot.
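To make the shadow one concrete, something like this - Python standing in for the shader logic, since I'm not going to pretend it's anyone's actual shader code; the in_penumbra/take_sample names and the 1-vs-16 sample counts are just made up for illustration:

[code]
# Toy model of adaptive shadow sampling via a dynamic branch.
# in_penumbra() and take_sample() are stand-ins for a cheap edge test
# and a shadow-map tap; the 1-vs-16 counts are arbitrary.

def shadow_term(in_penumbra, take_sample):
    if not in_penumbra():
        # Fully lit or fully shadowed: one tap is enough.
        return take_sample(0)
    # Near a shadow edge: spend the extra taps only here, instead of
    # uselessly oversampling every pixel in the whole shadow.
    return sum(take_sample(i) for i in range(16)) / 16.0

# A pixel deep in shadow takes the cheap path entirely.
print(shadow_term(lambda: False, lambda i: 0.0))
[/code]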
onanie said:
You would need to demonstrate the "numerous" situations where the VS load is "very high". While you might accuse one of saying (and I did not) that "we haven't seen it in games, thus there must be no use for it", is it not technically right to respond that in the months that the Xenos hardware has been available to developers, "it" has not been used where it is available? While you would say that "it's because it hasn't been an option that we don't see it, not that there aren't uses", I might just ask again what I have asked before - "show me".
No, it's not right to respond that way. It can be said it hasn't been seen yet because games out now very likely didn't allot time for people to create novel shaders. It can be said we haven't heard anything from developers, many of whom are reluctant to speak, or NDA'd from speaking, about techniques in games currently in development or already out - or who simply don't find it worth discussing with anyone besides other developers. And maybe we haven't seen it simply because it's not something you can see. If something is behind the scenes, would you therefore consider it a wasted effort?
You also mention developers having had hardware for "months" as if that were a long time. If you think I'm arguing it's a godsend, you've formed the wrong impression. I don't expect new techniques to create such giant visual leaps that people will be able to point them out when they see them. But that's not what we're here for: we're talking about the technical merits of the hardware, not how good games look. Developers matter far too much, especially this generation, for the hardware alone to make that big a difference. Perhaps you disagree with that. I don't know.
But in the end, if you can dedicate all your hardware to the vertex load when it's highest, then you've just shortened the rendering time for that section by up to a factor of 5. Meanwhile, you gain a small benefit everywhere else.
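Back-of-envelope, with unit counts I'm pulling out of thin air purely to show where an "up to a factor of 5" could come from (they're not meant to be the real Xenos or RSX numbers):

[code]
# Invented numbers, just to illustrate the scaling.
vertex_work = 1_000_000            # arbitrary units of vertex work in the section
dedicated_vertex_alus = 8          # a fixed vertex array on a split design
unified_alus = 40                  # the whole array on a unified design

split_time = vertex_work / dedicated_vertex_alus
unified_time = vertex_work / unified_alus
print(split_time / unified_time)   # -> 5.0
[/code]

Obviously the exact ratio depends entirely on how a split design divides its ALUs; the point is just that the ceiling is set by that split.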
onanie said:
With any particular method, its advantage might be real if other methods for achieving the same result prove to be less effective. To produce the same effect, a different kind of code might run just as well on different hardware.
Hence the inclusion of "the option available on the other card" in my post. With a wider range of possibilities, based on a more diverse and flexible feature set, it's more likely that you'll find the technique that is 10x faster than on the other card and produces Y visual effect at such-and-such quality. On the other card, you use the technique that's 8x faster on that hardware relative to the first one. But maybe it's not quite as good, or in the end it's still a little more resource-intensive than the technique on the first card. A number of such things add up, and it's not so insignificant anymore. It's not like anyone's trying to say it'll double RSX's performance. But anyone who thinks these things are only going to add up to a 2% gain has, IMO, fallen off their cookie. If such people are right, then the top minds at ATI are utter failures who wasted significant portions of all their latest chips, from R5xx to C1 and soon the R6xx.
onanie said:
Commenting on the same graph, which depicts the work required in one particular frame (chosen by ATI, no less): the ratio of pixel to vertex workload is still maintained overall (AUC). The proportion of this particular frame where vertex workload completely replaces pixel workload is perhaps about 10%. Even so, I might just echo Phil's sentiment in wondering why, in a parallelized pipeline, the pixel pipelines should stop at all while the vertex pipelines are busy.
Overall is a nearly useless metric. Somewhere in there, resources are being wasted. Those unused vertex shaders could go towards giving you an additional 10%, 20%, however much brute force on a regular basis. On the other hand, it could be that the vertex shaders aren't wasted that much, and instead you spend a non-negligible amount of time totally bound by the vertex shaders, because you can't process the data fast enough and can't move on to pixel shading until more of it gets done (for whatever you may be using it all for). Dedicate all your resources to it, and that amount of time becomes insignificant, or at least less significant. Don't we have such high fillrates not because we actually need to write that many pixels all the time, but because we want to be able to write at that rate when it comes time to do the job? Unify your hardware and suddenly doing this wastes less of the hardware you have available.
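As a toy illustration of the point, here's a frame model with invented numbers - a 10% vertex-bound stretch like the one in that graph, plus assumed 5x/1.1x gains for how a unified array helps each part. It only shows the shape of the argument, not a measurement:

[code]
# Toy frame model with invented numbers: one short stretch is purely
# vertex-bound, the rest is pixel-bound with vertex hardware idling.
frame = [
    ("geometry/shadow pass", 1.0, "vertex"),   # (name, ms, what it's bound on)
    ("main shading",         9.0, "pixel"),
]

def unified_speedup(bound_on):
    # Assumed gains: the vertex-bound stretch collapses ~5x when the whole
    # array attacks it; pixel-bound work only picks up a small assist.
    return 5.0 if bound_on == "vertex" else 1.1

split_total   = sum(ms for _, ms, _ in frame)
unified_total = sum(ms / unified_speedup(bound) for _, ms, bound in frame)
print(split_total, round(unified_total, 2))    # 10.0 vs ~8.38 on these numbers
[/code]

Grow or shrink that vertex-bound stretch and the overall win moves accordingly; that's exactly what an "overall AUC" ratio hides.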
And it's simply going to be the case that you'll hit stretches where you can't shade any more pixels until you're done with the current task; you're probably going to finish rendering your shadows before you decide to start pixel shading. And why not cut out the shader instructions that don't matter when there's a shadow on that pixel? Another use for branching.
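A minimal sketch of that early-out, again with Python standing in for shader logic and made-up function names:

[code]
# Rough model of branching past lighting work for fully shadowed pixels.
# ambient() and full_lighting() are hypothetical stand-ins.

def shade(in_full_shadow, ambient, full_lighting):
    if in_full_shadow:
        # Dynamic branch: no specular, no extra texture reads.
        return ambient()
    return ambient() + full_lighting()

print(shade(True, lambda: 0.1, lambda: 0.7))   # shadowed pixel takes the cheap path
[/code]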
Or hey, maybe I'm completely talking out my ass (instead of only partially): RSX has more shader power, Xenos' feature set isn't going to make up for a deficiency nor take a "slight advantage" in, say, shading and turn it into a better-looking gain, MS spent their money on the GPU only to fall short of Sony's machine with the GPU, and Sony's focus on the CPU means the PS3 totally spanks the 360 and developers will really have their work cut out for them to get 360 titles to match PS3 titles.
I honestly don't know.