Running the Toy Shop demo on the 360

Azrael

I emailed Mike Doggett and asked. (The exchange reads from bottom to top.)

That's actually difficult to answer. At lower resolutions, say 640x480 or
HDTV, the Xbox is most likely faster due to its slightly higher
shader power and high framebuffer bandwidth. But at higher resolutions,
1600x1200, it could possibly go to the X1K. They both have similar
performance. In the end it comes down to the type of application.


Mike

-----Original Message-----
From: epsilon9@.net [mailto:epsilon9@.net]
Sent: Tuesday, October 11, 2005 7:04 PM
To: Mike Doggett
Subject: Re: Toy Shop demo and XBOX 360

Ah, I see. Thank you for replying. I have one more question, if you
would, and then I will leave you to your work.
Knowing what you do about ATI hardware, specifically Xenos, would you
say that Xenos, as in the Xbox 360, would run the Toy Shop demo faster
than, equal to, or slower than the X1800 XT?


Mike Doggett wrote:


>>Hi,
>>
>>It's unlikely that we would port the Toy Shop demo to the 360. The X1K
>>and the 360 are different platforms with different development
>>environments, so it takes some time to take the code and recompile it.
>>Also, the 360 will be launching in a month and doesn't need technology
>>demos anymore; soon it will have real games.
>>
>>Mike
>>
>>-----Original Message-----
>>From: epsilon9@.net [mailto:epsilon9@.net]
>>Sent: Tuesday, October 11, 2005 2:41 AM
>>To: Mike Doggett
>>Subject: Toy Shop demo and XBOX 360
>>
>>The Toy Shop demo is a beautiful work of art.
>>I would like to see it run on the Xenos in the 360, as the Assassin
>>demo was.
>>Is there any way it can be arranged?
>>The Toy Shop demo is now widely considered the new bar for
>>real-time graphics.

I bolded the part I found interesting. Since he designed the thing (Xenos), I'm guessing he'd be the one to know just what it's capable of.
 
So Xenos has "slightly" more shading power than the X1800? Then we know roughly how Xenos compares to the X1800, and we have test benchmarks for the X1800 vs the G70....

Out of curiosity, does anyone know how the X1800 stacks up against an overclocked 550 MHz G70? I thought the G70 was supposed to be a shading monster... how does it compare to the X1800?
 
It should be pretty clear why it would lose out at 1600x1200, but it's nice to have an educated guess. :)
 
It's good news, I guess. The X360 will never have to run at a res higher than 720p, so who cares if the X1800 beats it at 1600x1200? It's supposed to! Besides, just the X1800 card will prob cost more than the WHOLE X360!:???:
 
scooby_dooby said:
So Xenos has "slightly" more shading power than the X1800? Then we know roughly how Xenos compares to the X1800, and we have test benchmarks for the X1800 vs the G70....

Out of curiosity, does anyone know how the X1800 stacks up against an overclocked 550 MHz G70? I thought the G70 was supposed to be a shading monster... how does it compare to the X1800?


My original question was regarding the XT, so I would assume he was referring to that in his answer. Are there any third-party benches of the 1800 XT vs the 7800 GTX?
 
scooby_dooby said:
...
Out of curiosity, does anyone know how the X1800 stacks up against an overclocked 550 MHz G70? I thought the G70 was supposed to be a shading monster... how does it compare to the X1800?

Before anyone says GFlops are 'meaningless', blah, blah, blah...

32-bit programmable shading flops, with 'peak' issue rates:

R520 XT, 625 MHz ~ 170 GFlops
G70 GTX, 430 MHz ~ 199 GFlops
Xenos, 500 MHz ~ 216 GFlops (not 240, according to a recent MS 'leak')
RSX, 550 MHz ~ 255 GFlops

Well, these are raw 'peak' shading Flops...so we'll see what transpires in forthcoming games...
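
These numbers fall straight out of per-cycle issue-rate arithmetic. Here's a back-of-the-envelope check in Python; the per-pipe breakdowns are the commonly quoted configurations and are my assumptions for illustration, not vendor-confirmed specs (they just happen to reproduce the totals above):

Code:
# Peak programmable-shading GFlops = flops-per-clock * clock in GHz.
# Per-pipe issue rates below are assumed, commonly quoted figures.

def peak_gflops(flops_per_clock, mhz):
    return flops_per_clock * mhz / 1000.0

# R520: 16 pixel pipes * 12 flops (vec3+scalar ADD plus vec3+scalar MADD)
#       + 8 vertex units * 10 flops (vec4 MADD + scalar MADD)
r520 = 16 * 12 + 8 * 10    # 272 flops/clock
# G70: 24 pixel pipes * 16 flops (two vec4 MADD ALUs) + 8 vertex units * 10
g70 = 24 * 16 + 8 * 10     # 464 flops/clock
# Xenos: 48 unified ALUs * 9 flops (vec4 MADD + non-MADD scalar)
xenos = 48 * 9             # 432 flops/clock

print(peak_gflops(r520, 625))   # 170.0  -> R520 XT
print(peak_gflops(g70, 430))    # 199.52 -> G70 GTX
print(peak_gflops(xenos, 500))  # 216.0  -> Xenos
print(peak_gflops(g70, 550))    # 255.2  -> RSX, if it's G70-like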
 
london-boy said:
It's good news, I guess. The X360 will never have to run at a res higher than 720p, so who cares if the X1800 beats it at 1600x1200? It's supposed to! Besides, just the X1800 card will prob cost more than the WHOLE X360!:???:


well, it might be important if someone well... might want to run at 1920x1080



.....nevermind. :p
 
Jaws said:
Before anyone says GFlops are 'meaningless', blah, blah, blah...

32-bit programmable shading flops, with 'peak' issue rates:

R520 XT, 625 MHz ~ 170 GFlops
G70 GTX, 430 MHz ~ 199 GFlops
Xenos, 500 MHz ~ 216 GFlops (not 240, according to a recent MS 'leak')
RSX, 550 MHz ~ 255 GFlops

Well, these are raw 'peak' shading Flops...so we'll see what transpires in forthcoming games...

I found the math interesting.

The G70 GTX has ~17% more peak FLOPS than the R520 XT (199 vs 170)... and yet the 1800 XT is faster?

and

It's the same case for Xenos and RSX: RSX has ~18% more peak FLOPS (255 vs 216)... and yet Xenos looks more than likely to be the faster solution.

Does this point to more efficient engineering in ATI parts in general, or should it be chalked up in the 'FLOPS are meaningless' column?
 
Jaws said:
Before anyone says GFlops are 'meaningless', blah, blah, blah...

32-bit programmable shading flops, with 'peak' issue rates:

R520 XT, 625 MHz ~ 170 GFlops
G70 GTX, 430 MHz ~ 199 GFlops
Xenos, 500 MHz ~ 216 GFlops (not 240, according to a recent MS 'leak')
RSX, 550 MHz ~ 255 GFlops

Well, these are raw 'peak' shading Flops...so we'll see what transpires in forthcoming games...


And if RSX turns out to have only 16-bit SIMDs, do its flops count as zero? :heh :D
 
Azrael said:
I found the math interesting.

The G70 GTX has ~17% more peak FLOPS than the R520 XT (199 vs 170)... and yet the 1800 XT is faster?

Depends what you're measuring. The XT has more raw vertex shading flops, whilst the GTX has more raw pixel shading flops, and this is reflected in the ShaderMark tests. However, the R520 has spent transistors on improving dynamic branching, and that is reflected in those tests too...
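
To put rough numbers on that split, using the same assumed per-pipe issue rates as the peak figures above (estimates, not vendor specs):

Code:
# VS vs PS peak GFlops split (assumed configs: 8 VS * 10 flops each;
# R520 = 16 PS pipes * 12 flops, G70 = 24 PS pipes * 16 flops)
print(8 * 10 * 0.625, 16 * 12 * 0.625)  # R520 XT: 50.0 VS vs 120.0 PS
print(8 * 10 * 0.430, 24 * 16 * 0.430)  # G70 GTX: ~34.4 VS vs ~165.1 PS

So the XT leads on raw vertex flops and the GTX on raw pixel flops.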

Azrael said:
It's the same case for Xenos and RSX: RSX has ~18% more peak FLOPS (255 vs 216)... and yet Xenos looks more than likely to be the faster solution.

Does this point to more efficient engineering in ATI parts in general, or should it be chalked up in the 'FLOPS are meaningless' column?

The same applies here too, but this one is trickier because Xenos has its 48 ALUs arranged as 3 unified-shader SIMD engines (3 x 16), while G70/RSX has 8 MIMD VS units and 6 SIMD PS quads (6 x 4 = 24 pixel pipes), each with their own pros and cons...

version said:
And if RSX turns out to have only 16-bit SIMDs, do its flops count as zero? :heh :D

Well, we know it will do FP32!
 
Dave Baumann said:

Each of the 48 ALUs ~ 9 flops per cycle (vec4 + scalar, with the scalar not MADD-capable, i.e. 1 flop/cycle).

This was reported by a Japanese site during E3, too...
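
Worked through, that's exactly where the 216 figure comes from, and presumably why it isn't 240:

Code:
# 48 ALUs * 9 flops/cycle (vec4 MADD = 8, plus a non-MADD scalar = 1)
print(48 * 9 * 0.500)    # 216.0 GFlops at 500 MHz
# The older 240 figure presumably assumed a MADD-capable scalar (10/ALU):
print(48 * 10 * 0.500)   # 240.0 GFlops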
 
And here's why this news is so important:

http://www.beyond3d.com/forum/showthread.php?t=24254&page=4

Also, the parallax occlusion mapping technique really takes advantage of the excellent dynamic branching that the X1K cards have.
While developing this technique, I ran performance tests on GeForce 6 / G70 cards versus the R520 generation, and on a typical parallax occlusion mapped scene, performance on the GF6/G70 generation is around 30-50% of the R5xx cards, depending on the situation.

So if you wanted to make this demo look exactly as it stands right now and run as smoothly as it does (the average frame rate is around 25-28 fps), as R300King! wanted, it simply would not happen on the current latest generation hardware from NVIDIA. Even fitting this demo into memory wouldn't work without 1010102 and 3Dc and the additional vertex data formats that we use (dec3n, for example). The demo uses a huge amount of texture and vertex data.



This demo would not run at a playable framerate on the G70 series of cards. The R520 runs it 2x to 3x faster than a G70, depending on the situation! And Xenos is even more powerful than that!!
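
For anyone wondering why dynamic branching matters so much here: parallax occlusion mapping marches a ray through a height field per pixel and wants to stop the moment it finds the intersection. A toy sketch of the inner loop, in plain Python standing in for the shader (names and step count are made up for illustration, not ATI's actual code):

Code:
def pom_ray_march(height_at, uv, view_ts, max_steps=32):
    # Toy parallax-occlusion-mapping linear search. height_at(u, v)
    # samples the height map in [0, 1]; view_ts is the tangent-space
    # view direction. Step count and names are illustrative only.
    step = 1.0 / max_steps
    du = view_ts[0] / view_ts[2] * step
    dv = view_ts[1] / view_ts[2] * step
    u, v = uv
    ray_h = 1.0                      # start at the top of the height field
    for _ in range(max_steps):
        if height_at(u, v) >= ray_h:
            # Hit: this data-dependent early-out is the dynamic branch.
            return u, v
        u, v = u + du, v + dv        # keep marching along the view ray
        ray_h -= step
    return u, v                      # no hit: fall back to the far sample

The early return is the whole game: R5xx takes that branch at a fine pixel granularity, so most pixels quit after a few steps, while G7x-class parts branch in much coarser batches and effectively pay for the slowest ray in the batch, which would fit the 30-50% figure quoted above.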
 
Hardknock said:
This demo would not run at a playable framerate on the G70 series of cards. The R520 runs it 2x to 3x faster than a G70, depending on the situation! And Xenos is even more powerful than that!!

Heh, I think we'd need to know a lot more detail about the performance tests and these 'situations' before jumping to that kind of conclusion. And those figures specifically relate to parallax occlusion mapping, not overall performance. It's also hardly surprising that a heavily optimised ATI tech demo performs comparatively poorly on Nvidia hardware.
 
Hardknock said:
And here's why this news is so important:

http://www.beyond3d.com/forum/showthread.php?t=24254&page=4





This demo would not run at a playable framerate on the G70 series of cards. The R520 runs it 2x to 3x faster than a G70, depending on the situation! And Xenos is even more powerful than that!!

Before jumping up and down, note that you omitted the subsequent paragraphs, namely:

Of course, one could think up alternative ways to implement some of the algorithms that we have used in this demo. For example, you could use relief mapping instead of parallax occlusion mapping. The relief mapping technique performs well on both ATI and NVIDIA hardware because it doesn't utilize dynamic branching and instead makes heavy use of dependent texture reads. However, in my quality results tests comparing these two techniques, the relief mapping technique displayed visual artifacts on our current dataset.

When developing algorithms, there's always more than one way to skin a cat!
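
For contrast, here's roughly what relief mapping's search looks like under the same disclaimers (toy Python, illustrative names and counts): a fixed number of linear steps plus a fixed binary-search refinement, so there's no data-dependent loop exit to branch on, just a chain of dependent texture reads.

Code:
def relief_ray_march(height_at, uv, view_ts, linear_steps=16, bin_steps=6):
    # Toy relief-mapping search: every pixel runs the same fixed
    # iteration counts, hit or miss, which suits hardware with coarse
    # dynamic branching. Names and counts are illustrative only.
    step = 1.0 / linear_steps
    du = view_ts[0] / view_ts[2] * step
    dv = view_ts[1] / view_ts[2] * step
    u, v, ray_h = uv[0], uv[1], 1.0
    for _ in range(linear_steps):            # fixed-cost linear search
        if height_at(u, v) < ray_h:          # still above the surface
            u, v, ray_h = u + du, v + dv, ray_h - step
    for _ in range(bin_steps):               # fixed-cost binary refine
        du, dv, step = du / 2, dv / 2, step / 2
        if height_at(u, v) < ray_h:
            u, v, ray_h = u + du, v + dv, ray_h - step
        else:
            u, v, ray_h = u - du, v - dv, ray_h + step
    return u, v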

Also, he was asked,

Anyone know if Luna and other 7800 demos function on X1000 hardware?

...and his reply,

We have tried and they don't.

Surprise, the NV demos don't run on ATI hardware! Therefore ATI hardware is crap!

And when asked how it would run on Xenos,

The demo could run on Xenos and will run quite well. But we haven't tested this theory.

"quite well."

All this highlights is that there are 'several' means to the same 'end', and you develop for each architecture's strengths whilst avoiding its weaknesses...
 
Before jumping up and down, note that you omitted the subsequent paragraphs, namely:

What does that matter? Parallax occlusion mapping is superior, and the G70 can't run it efficiently. That's the point.

"However, in my quality results tests for comparison of these two techniques, the relief mapping technique displayed visual artifacts on our current dataset."



Surprise, the NV demos don't run on ATI hardware! Therefore ATI hardware is crap!

No, what this shows you is that Nvidia locks their demos to their hardware, while ATi's demos can be run on either vendor's hardware. Who do you think is more confident?

And when asked how it would run on Xenos,



"quite well."

All this highlights is that there are 'several' means to the same 'end', and you develop for each architecture's strengths whilst avoiding its weaknesses

I don't know why you want to downplay this. 2x to 3x better performance than the Nvidia counterpart is a very considerable performance advantage, no matter how you slice it. And TWO people from ATi have stated that Xenos would run it even better!
 
Hardknock said:
What does that matter? Parallax occlusion mapping is superior, and the G70 can't run it efficiently. That's the point.

"However, in my quality results tests for comparison of these two techniques, the relief mapping technique displayed visual artifacts on our current dataset."





No, what this shows you is that Nvidia locks their demos to their hardware, while ATi's demos can be run on either vendor's hardware. Who do you think is more confident?



I don't know why you want to downplay this. 2x to 3x better performance than the Nvidia counterpart is a very considerable performance advantage, no matter how you slice it. And TWO people from ATi have stated that Xenos would run it even better!

He often downplays anything pro-Xenos.
 
Hardknock said:
...
I don't know why you want to downplay this. 2x to 3x better performance than the Nvidia counterpart is a very considerable performance advantage, no matter how you slice it. And TWO people from ATi have stated that Xenos would run it even better!

Erm... I'm not downplaying this demo. One person has said Xenos will run it "quite well", NOT better. The other person has stated it would run better under certain conditions. The 2-3x difference with the G70 is for parallax occlusion mapped scenes, "depending on the situation." And finally, you're missing the point of what an algorithm is. You develop algorithms to take advantage of the hardware, and if you design algorithms that show the SAME thing on screen, then it's irrelevant what the algorithm is.

The Luna demos show subsurface scattering; would the same algorithms run as efficiently on ATI hardware, or would they need tuning for ATI's strengths? Or why not port a PlayStation game to GameCube without fine-tuning and see if it runs the same? The point being, there are many ways to achieve the same results. You'd be dumb to use an architecture's weakest method to do this in the 'real world' outside of a tech demo!
 