neliz said:
> X2 was about the shadows, what about X3? X2 was heavily optimized for nV40's architecture, the shadowing etc.

I did not notice any shadow casting in X3: Reunion. It is pretty much about using shiny, reflective metallic surfaces for everything. Sectors have single light sources (stars); weapon fire, engine and beacon lights do not affect surfaces. There is apparently no dynamic lighting; it pretty much reproduces the sharp-contrast, single-light scenario one would experience orbiting Earth. I was wondering myself. I did not play X2, but I've seen pictures, and ships cast shadows on themselves.
Hubert said:
> PS. I can see HDR being used in space, too. Why shouldn't we have dynamic range in space? Lights are lights, and lacking the atmosphere, maybe the bloom effects would be missing too.

Pete said:
> Well, bloom effects come about because the frame is still output on a low dynamic range device. So they're there in an attempt to approximate the effect of bright lights blinding you. Of course it won't look exactly like it does in real life, because it can't actually be that bright. But game developers are going to do the best they can until/unless we get some real HDR displays.

Elaborate, please.
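As a rough illustration of the low-dynamic-range approximation Pete describes, here is a toy 1-D bloom pass in Python (the threshold, blur taps, and strength are all made-up values, not any engine's actual implementation):

```python
def bloom(pixels, threshold=1.0, strength=0.5):
    """Toy bloom: over-bright pixels bleed into their neighbours on an LDR output."""
    # keep only the energy above what the display can show
    bright = [max(p - threshold, 0.0) for p in pixels]
    blurred = []
    for i in range(len(bright)):
        # 3-tap box blur stands in for the real Gaussian blur pass
        left = bright[i - 1] if i > 0 else 0.0
        right = bright[i + 1] if i < len(bright) - 1 else 0.0
        blurred.append((left + bright[i] + right) / 3.0)
    # add the glow back and clamp to the displayable [0, 1] range
    return [min(p + strength * b, 1.0) for p, b in zip(pixels, blurred)]
```

The clamp at the end is the whole point: the over-bright value itself cannot be displayed, so its energy is suggested by the halo it leaves on neighbouring pixels.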
Hubert said:
> PS. I wonder if SM 2.0b and SM 3.0 are not both included for the High setting. That would make SM 2.0a the Medium and SM 1.1 the Low one.

High is PS2.0 as far as I can tell (PS2.0b on my X800XT), with Medium PS1.4 and Low PS1.1.
Mintmaster said:
> R300 shone not only versus NV30, but was a huge improvement over the previous gen. It didn't even have twice the transistors of R200, but had like 3 times the practical shading power, in FP24 to boot, amazing AA, better AF, and the list goes on.

R300 was a huge improvement over the previous gen because R200 was rather mediocre. Now R520 succeeds the R300 architecture, which was very good in its time. Obviously, that makes an equally big step almost impossible.
> R520 had double the transistors of R420 and was maybe 30% faster overall.

That's probably about the margin by which R300 was faster than NV25 when AA/AF were disabled. The real difference only became apparent when you used the main features of R300: PS2.0, AA, and AF.
> You may think a cheap PS3.0 wouldn't sell well, but remember that it would perform much faster. That sells a lot more than claiming you have the most advanced design. Look at NV15 vs R100, or NV25 vs. R200. Performance sells way more than technology for the mass market.

The thing is, I do not believe it would perform much faster. Since you said I shouldn't take your suggestions too literally, it seems to me the only thing you are proposing is to throw out the "good branching" and put in "cheap branching" and "more pipelines" instead. I don't believe this would be enough even for going from 16 to 24 (if you're suggesting just adding ALUs, then it would be better to take R580 for comparison). And it does nothing for bandwidth whatsoever, besides increasing demand.
SugarCoat said:
> yes and almost 2/3 of it was used getting the chip up to date with SM3.0 and the memory controller, which in itself was a long-term investment more than anything.

Again, the memory controller is 8-10% of the die space. It looks like more, but measure it. Getting the chip up to date with PS3.0 didn't take much die space. PS3.0 with fast dynamic branching did, though. I gave proof straight from sireric, who knows more about R5xx than anyone on any forum.
Xmas said:
> R300 was a huge improvement over the previous gen because R200 was rather mediocre.

Was the GeForce4 Ti4600 mediocre? The jump to R300 was still enormous if you use that as a reference, so your point is moot.
> R300 doubled memory bandwidth over R200.

Look at the 9500Pro benchmarks. Pixel shading was almost as fast as the 9700, and gaming wasn't far behind either.
> R520 gives you PS3.0 with good dynamic branching, FP32, better AF, AVIVO, and more. That's a big step as well.

Your arguments here are very weak. Of all the things you mentioned about R520, only dynamic branching is the big die space eater, and the topic of this thread is pixel shader die size. And none of those points sell as well as performance. R300 made a huge leap in PS1.x performance as well, and AA was part of mainstream benchmarking for a while. The games using so-called SM3.0 right now are pretty much only using FP blending, so they're taking far less advantage of R520's hardware than PS1.x games did of R300's.
> That's probably about the margin by which R300 was faster than NV25 when AA/AF were disabled. The real difference only became apparent when you used the main features of R300: PS2.0, AA, and AF.

If you do the same with R520, it can shine as well. And while DX9 wasn't even there when R300 came out, we already have some games taking advantage of PS3.0 now.
> The thing is, I do not believe it would perform much faster. Since you said I shouldn't take your suggestions too literally, it seems to me the only thing you are proposing is to throw out the 'good branching' and put in 'cheap branching' and 'more pipelines' instead. I don't believe this would be enough even for going from 16 to 24 (if you're suggesting just adding ALUs, then it would be better to take R580 for comparison). And it does nothing for bandwidth whatsoever, besides increasing demand.

16 to 24 only? R300 shader pipes, including a ROP and texture unit, were ~5M transistors (see R300/RV410->R420). R520->R580 shows FP32 shader units (w/o a texture unit) are ~2M transistors. ATI could easily have produced a 32-shader, 32-texture-unit, 16-ROP part with under 300 million transistors. Beef up the mini-ALU, or add another stage, and it would easily outperform R580 too when dynamic branching isn't involved. Again you mention bandwidth, but like I said, the memory controller is not huge at all.
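The ~2M-per-shader-unit figure can be sanity-checked with back-of-envelope arithmetic, assuming the commonly cited (approximate) transistor counts of R520 ≈ 321M and R580 ≈ 384M:

```python
# Back-of-envelope check of the per-unit figure quoted above.
# The transistor counts are commonly cited approximations, not exact values.
R520_TRANSISTORS = 321e6
R580_TRANSISTORS = 384e6
ADDED_SHADER_UNITS = 48 - 16  # R580 tripled the pixel shader units over R520

per_unit = (R580_TRANSISTORS - R520_TRANSISTORS) / ADDED_SHADER_UNITS
print(round(per_unit / 1e6, 1))  # millions of transistors per FP32 shader unit
```

The difference divided over the 32 added units comes out just under 2M transistors each, consistent with the estimate in the post.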
Mintmaster said:
> Was the GeForce4 Ti4600 mediocre? The jump to R300 was still enormous if you use that as a reference, so your point is moot.

It became enormous when the R300's strengths were used. It wasn't enormous right from the start. But I repeat myself...
> Look at the 9500Pro benchmarks. Pixel shading was almost as fast as the 9700, and gaming wasn't far behind either.

Actually the 9500Pro supports my point. It was the same chip, but significantly slower because it frequently hit the bandwidth limit.
> Your arguments here are very weak. Of all the things you mentioned about R520, only dynamic branching is the big die space eater, and the topic of this thread is pixel shader die size. And none of those points sell as well as performance. R300 made a huge leap in PS1.x performance as well, and AA was part of mainstream benchmarking for a while. The games using so-called SM3.0 right now are pretty much only using FP blending, so they're taking far less advantage of R520's hardware than PS1.x games did of R300's.

And a big die space eater, relatively speaking, for R300 was FP24, which was completely useless until games using PS2.0 came out.
> 16 to 24 only? R300 shader pipes, including a ROP and texture unit, were ~5M transistors (see R300/RV410->R420). R520->R580 shows FP32 shader units (w/o a texture unit) are ~2M transistors. ATI could easily have produced a 32-shader, 32-texture-unit, 16-ROP part with under 300 million transistors. Beef up the mini-ALU, or add another stage, and it would easily outperform R580 too when dynamic branching isn't involved. Again you mention bandwidth, but like I said, the memory controller is not huge at all.

What does the memory controller have to do with it? What's the point of having 32 TMUs when you simply can't feed them? There is a reason why R580 doesn't have more TMUs either.
> My central point, supported by ample evidence, is this: ATI made a big die space commitment to make dynamic branching fast. That's the sole reason ATI is behind NVidia in performance per clock per transistor.

It certainly is an important part, but most likely not the sole reason.
PurplePigeon said:
> I believe I understand. So, it sounds like the shader compiler could conceivably compile a model 2.0 shader to utilize the model 3.0 hardware resources (i.e. dynamic branching) if it determined that it would ultimately run faster that way. Or, to paraphrase, fast hardware dynamic branching could benefit shader code that does not explicitly use it.

There is no static branching in PS2.0, only in VS2.0.
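PurplePigeon's idea can be sketched in Python (hypothetical stand-in functions, not any real shader compiler): a PS2.0-style shader evaluates both sides of a select and then picks one, while hardware with fast dynamic branching can skip the unused side entirely.

```python
CALLS = {"expensive": 0}  # instrumentation to count how often the long path runs

def cheap(x):
    return 2.0 * x  # stands in for a short shader path

def expensive(x):
    CALLS["expensive"] += 1
    return sum(x / n for n in range(1, 64))  # stands in for a long shader path

def select_ps20(cond, x):
    # PS2.0 style: no dynamic branching, so both paths execute and a
    # compare/select picks the result afterwards
    a = cheap(x)
    b = expensive(x)
    return a if cond else b

def select_ps30(cond, x):
    # PS3.0 style: a dynamic branch skips the expensive path when unneeded
    if cond:
        return cheap(x)
    return expensive(x)
```

Both functions return the same value; the difference is that `select_ps30` never pays for `expensive` when `cond` is true, which is the speedup a branching-aware compiler could recover.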
Mintmaster said:
> Xmas, AA was the GF4's strength prior to R300. R300's strengths were everything except multitexture rate (which wasn't a weakness anyway), including PS 1.1, which used to be GF4's territory.
The 9500Pro does not prove your point - look here, here, here, etc. It's ~10% behind the 256-bit 9700 on average, and miles ahead of previous gen. R300 gave immediate benefits to all games when given similar bandwidth and clock speed to NV25 & R200. Likewise with G70 vs. NV40. Not so with R520 vs. R420 (see X1800XL vs. X800XT).
Unknown Soldier said:
Erm... I thought Dave said the R400 became the R500, and the R500 tech is now Xenos, and part of that tech is in the R580.

The R420 tech comes from the R300.

At least that's what I understood Dave as saying.

US
Xmas said:
> OTOH, if flow control is implemented in a way that allows e.g. one "free" if(x==0) per cycle, this could indeed be useful. It increases code size, however.

Actually, the more I think about it, the more sense it makes. If ATI's flow control unit can execute one FC op per clock and is capable of jz/jnz, that could potentially bring huge savings at no runtime cost, just more work for the compiler.
> There is one thing that could be turned into a dynamic branch easily: alpha test.

And kill, of course.
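The equivalence being pointed at can be sketched as a toy fragment loop in Python (names and the `(color, alpha)` representation are invented for illustration): the fixed-function alpha test is exactly a data-dependent branch that kills the fragment.

```python
def alpha_test(fragments, ref=0.5):
    """Toy fixed-function alpha test over (color, alpha) fragments."""
    surviving = []
    for color, alpha in fragments:
        # the implicit dynamic branch: reject the fragment if alpha fails
        # the test -- equivalent to a shader kill/discard instruction
        if alpha <= ref:
            continue
        surviving.append((color, alpha))
    return surviving
```

Since every fragment takes this per-pixel branch anyway, hardware that branches cheaply can express alpha test (and kill) as ordinary flow control instead of a dedicated fixed-function stage.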
Xmas said:
> Since all sequences
>
>     a = simpleFunc(x);
>     b = incrediblyComplexFunc();
>     c = a * b;
>
> can be rewritten as
>
>     a = simpleFunc(x);
>     if (a) b = incrediblyComplexFunc();
>     c = a * b;

You may want to initialize 'b', in case its previous value was inf or NaN.
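Folding in that initialization, the transformation might look like this (a Python stand-in for the shader pseudocode; the bodies of `simple_func` and `incredibly_complex_func` are invented just to make it runnable):

```python
def simple_func(x):
    return max(x, 0.0)  # invented stand-in for simpleFunc

def incredibly_complex_func():
    return 42.0  # invented stand-in for the expensive computation

def straight_line(x):
    # original sequence: always pays for the expensive call
    a = simple_func(x)
    b = incredibly_complex_func()
    return a * b

def branched(x):
    a = simple_func(x)
    b = 0.0  # initialize: if a == 0, stale register contents (possibly
             # inf or NaN) must not feed into a * b
    if a != 0.0:
        b = incredibly_complex_func()
    return a * b
```

The initialization matters because `0.0 * b` is only guaranteed to be `0.0` when `b` holds a finite value; with an inf or NaN left over in the register, the product would be NaN and the two versions would diverge.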