Beyond3D Forum

Beyond3D Forum (http://forum.beyond3d.com/index.php)
-   Pre-release GPU Speculation (http://forum.beyond3d.com/forumdisplay.php?f=51)
-   -   The NEXT LAST R600 Rumours & Speculation Thread (http://forum.beyond3d.com/showthread.php?t=39173)

leoneazzurro 09-May-2007 22:04

Quote:

Originally Posted by Julidz (Post 983954)
Is this true?



Here is one of the key reasons why the R600 marchitecture falls behind the G80 GTX version. While the GeForce 8800 GTX, or G80 GTX, has 32 Texture Memory Units (TMUs), the Radeon HD 2900 XT has only 16.

TMUs are responsible for getting the textures in your games, and the more you have, the faster you can handle the texture work.

Still, the Radeon HD 2900 XT at a 750 MHz core clock is about 30 percent faster in raw clock speed than the GeForce 8800 GTX at 575 MHz, and finishes the texture cycle some 30 percent faster than Nvidia.

In spite of that, this is not really an ideal solution, as Nvidia has twice as many texture memory units. The only hope is that ATI has more effective TMUs, but this is one big marchitectural disadvantage against the G80 marchitecture.


It is not really true. Texture units in both chips are radically different in comparison to R580 or G71, and at the moment it's really difficult to compare them.
From the details we could say that in theory (based only on the unit number) R600 has an advantage in fetching textures and G80 has an advantage in filtering.

Mintmaster 09-May-2007 22:04

Quote:

Originally Posted by Jawed (Post 983464)
And what's annoying me is that you think ATI has never written a co-issuing compiler ever before. R300 is a 4-issue ALU. R600 is 5-issue. R300's four instructions are all different (either in component count or capability or both). 4 of R600's ALUs are identical. The sky isn't falling in.

Yet AGAIN you're missing the point. Nobody ever said ATI can't do it. DemoCoder simply asserted that it's more important to be good at it now.

Quote:

1-clock instructions are extremely costly...
I still don't agree with this unless you're talking about latency. Changing ops each clock isn't hard as long as you don't need the result immediately. Nearly every pipelined processor in the world, regardless of how simple, does this. You don't save much space by increasing this to more clocks.
Quote:

I'm glad the penny's dropped. This is just one way that your sequencer complexity increases.

If instead of combining a 64-pixel ALU with a single-clock instruction pipeline your sequential scalar GPU has 16-pixel ALUs and four-clock instructions, you've still got increased sequencer complexity compared against R600, because you've just multiplied the number of batches in flight 4x in order to retain the same batch size (this is the "four Xenoses glued together" scenario).

So, now muse on how much of G80 is batch sequencing logic, since it has 16 batches in flight and compare that against the 4 batches in R600.
I think one of the reasons we're having difficulty communicating is differing terminology when you say "batches in flight". Before G80 came out, it was 512 for R5xx and 6 (albeit enormous ones) for G70. Now you're talking about batches in the immediate vicinity of the ALU arrays, ignoring all the other batches in the pipeline.

Simple cycling between batches for predictable ALU instructions isn't hard, and doesn't add measurably to the sequencer complexity. The tough task is managing the many threads in flight that are waiting for texture fetch results. I don't consider what you're talking about to be sequencer complexity. I think the complexity arises from the larger data pool that each stage of the ALUs needs to select from.

G80 may have more batches in flight in this sense, but the only reason for it to have more total batches in flight is the higher texture throughput.
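
To illustrate the "simple cycling between batches" point, here is a minimal sketch of round-robin issue. It is a toy model, not a description of how R600 or G80 actually sequence work: the function name `simulate`, the fixed ALU latency and the strictly dependent instruction chains are all assumptions made for illustration.

```python
def simulate(num_batches, latency, instrs_per_batch):
    """Round-robin issue of dependent ALU instructions across batches.

    Each batch runs a chain of instructions where every instruction needs
    the previous result, which arrives `latency` cycles after issue.
    Returns the fraction of cycles in which an instruction was issued.
    """
    ready_at = [0] * num_batches              # cycle when each batch may issue again
    remaining = [instrs_per_batch] * num_batches
    cycle = 0
    issued = 0
    next_batch = 0
    while any(remaining):
        # Try each batch once, starting from the round-robin pointer.
        for offset in range(num_batches):
            b = (next_batch + offset) % num_batches
            if remaining[b] and ready_at[b] <= cycle:
                remaining[b] -= 1
                ready_at[b] = cycle + latency  # result not usable until then
                issued += 1
                next_batch = (b + 1) % num_batches
                break
        cycle += 1
    return issued / cycle


one_batch = simulate(num_batches=1, latency=4, instrs_per_batch=100)
four_batches = simulate(num_batches=4, latency=4, instrs_per_batch=100)
print(f"1 batch:   {one_batch:.2f} of cycles busy")
print(f"4 batches: {four_batches:.2f} of cycles busy")
```

In this model, a single batch with a 4-cycle latency leaves the ALU idle about three cycles out of four, while four batches keep it busy every cycle using nothing more than a rotating pointer. Waiting on texture fetches, where the latency is long and unpredictable, is the part this sketch deliberately leaves out.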

SugarCoat 09-May-2007 22:10

Quote:

Originally Posted by Razor1 (Post 983744)
remember that rumor monger site..... how real can these be?

http://it-review.net/index.php?optio...1&limitstart=1

If they aren't, then obviously some people have way too much time. That's the best attempt I've seen at discrediting a product if it's all BS. It would be rather strange that they would go through all that for a product that is already badly disadvantaged by a severely delayed launch. Their "findings" aren't going to mean a damn to anyone still left holding onto the "R600 dream" this late in and so close to the actual launch. It would also mean they have an agenda; again, ask yourself, why? In all likelihood they're real.

Quote:

Originally Posted by INKster (Post 983775)
From the looks of it, the HD 2900 XT is losing to a X1950 XTX (yeah, right... :D).

With a new architecture, if it's poorly designed or, especially, if the drivers are immature, this is entirely possible. If I remember right (not to bring up this dead horse), the GeForce FX cards were actually slower than the previous GeForce 4 cards, or not giving nearly as much of a lead as you'd expect, in quite a few instances in DX8 benchmarks at launch.

Mintmaster 09-May-2007 22:13

Quote:

Originally Posted by leoneazzurro (Post 983957)
It is not really true. Texture units in both chips are radically different in comparison to R580 or G71, and at the moment it's really difficult to compare them.
From the details we could say that in theory (based only on the unit number) R600 has an advantage in fetching textures and G80 has an advantage in filtering.

How can you say that? R600 can do 16 fetches per clock, and G80 can do 32. G80 can do 64 bilinear filter ops (8-bit per channel) per clock, and we don't know what R600 can do but it's probably 32 if we're lucky. Take clock differences into account and there's still a big disparity.
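
As a rough sketch of that arithmetic: the per-clock figures below are the ones given in this post, and the clocks are the 750 MHz and 575 MHz core clocks quoted earlier in the thread. The R600 filter rate of 32 per clock is only the guess made above, not a known spec, and the helper name `rate_gops` is made up for illustration.

```python
# Core clocks as quoted earlier in the thread (assumed to apply to the
# texture units as well).
R600_CLOCK_MHZ = 750
G80_CLOCK_MHZ = 575


def rate_gops(per_clock, clock_mhz):
    """Convert a per-clock throughput into billions of operations per second."""
    return per_clock * clock_mhz / 1000.0


fetch = {
    "R600": rate_gops(16, R600_CLOCK_MHZ),
    "G80": rate_gops(32, G80_CLOCK_MHZ),
}
bilinear = {
    "R600 (guess)": rate_gops(32, R600_CLOCK_MHZ),  # 32/clock is speculative
    "G80": rate_gops(64, G80_CLOCK_MHZ),
}

for name, table in (("fetches", fetch), ("bilinear filter ops", bilinear)):
    for chip, value in table.items():
        print(f"{chip:>13} {name}: {value:.1f} G/s")
```

Under those assumptions G80 comes out roughly 1.5x ahead on both counts, even after R600's clock advantage is factored in.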

Fornowagain 09-May-2007 22:17

Just revisiting the review posted earlier, they've added this, which is strangely exactly how I imagined him to look. :D Still think it's fake?

http://img267.imageshack.us/img267/2...0test12cj3.jpg

leoneazzurro 09-May-2007 22:17

Quote:

Originally Posted by Mintmaster (Post 983963)
How can you say that? R600 can do 16 fetches per clock, and G80 can do 32. G80 can do 64 bilinear filter ops (8-bit per channel) per clock, and we don't know what R600 can do but it's probably 32 if we're lucky. Take clock differences into account and there's still a big disparity.

Nope, from the supposedly real AMD slides there are 20 texture fetch units in each of the 4 texture unit blocks of R600. That's 80 units total, and at a higher frequency.

Silent_Buddha 09-May-2007 22:21

Quote:

Originally Posted by Mintmaster (Post 983963)
How can you say that? R600 can do 16 fetches per clock, and G80 can do 32. G80 can do 64 bilinear filter ops (8-bit per channel) per clock, and we don't know what R600 can do but it's probably 32 if we're lucky. Take clock differences into account and there's still a big disparity.

I'd imagine from this slide...

http://forum.beyond3d.com/showpost.p...postcount=4164

It seems to imply that R600 can do 80 Texture Fetches per clock.

Regards,
SB

AlexV 09-May-2007 22:22

Quote:

Originally Posted by SugarCoat (Post 983960)
If they aren't, then obviously some people have way too much time. That's the best attempt I've seen at discrediting a product if it's all BS. It would be rather strange that they would go through all that for a product that is already badly disadvantaged by a severely delayed launch. Their "findings" aren't going to mean a damn to anyone still left holding onto the "R600 dream" this late in and so close to the actual launch. It would also mean they have an agenda; again, ask yourself, why? In all likelihood they're real.



With a new architecture, if it's poorly designed or, especially, if the drivers are immature, this is entirely possible. If I remember right (not to bring up this dead horse), the GeForce FX cards were actually slower than the previous GeForce 4 cards, or not giving nearly as much of a lead as you'd expect, in quite a few instances in DX8 benchmarks at launch.

Something to keep in mind: the difference in terms of work being done between DX8 and DX9 was rather huge (you went from rather primitive programmability to a much more general one, you went from integer to float, etc.). It's not the same with DX9 to DX10 (save for the GS, which is new and, as I've stated before, could be a trump card). Again, IMHO, if you're poor at DX9 math, you're still going to be poor at it in DX10. If you suck at doing DX9 games, you're not going to see huge improvements in DX10 UNLESS it involves some heavy geometry shading and you're adept at that.

Fact is, we don't really know how good/bad R600 is (yet).

Sound_Card 09-May-2007 22:23

Quote:

Originally Posted by Fornowagain (Post 983964)
Just revisiting the review posted earlier, they've added this, which is strangely exactly how I imagined him to look. :D Still think it's fake?

http://img267.imageshack.us/img267/2...0test12cj3.jpg


ROFL. Look very, very, very, very closely at the finer detail. I work with Photoshop almost every day at school, and that is one horrible job. :razz:

I must say, that was a very valid try.

digitalwanderer 09-May-2007 22:28

What makes you think that's a chop? I don't see anything tell-tale jumping out at me.

Skrying 09-May-2007 22:31

Are the stickers on the fans in the same orientation?

SugarCoat 09-May-2007 22:32

Quote:

Originally Posted by Morgoth the Dark Enemy (Post 983968)
Something to keep in mind: the difference in terms of work being done between DX8 and DX9 was rather huge (you went from rather primitive programmability to a much more general one, you went from integer to float, etc.). It's not the same with DX9 to DX10 (save for the GS, which is new and, as I've stated before, could be a trump card). Again, IMHO, if you're poor at DX9 math, you're still going to be poor at it in DX10. If you suck at doing DX9 games, you're not going to see huge improvements in DX10 UNLESS it involves some heavy geometry shading and you're adept at that.

Fact is, we don't really know how good/bad R600 is (yet).

Regardless of your valid point, I think most people expect performance to be increased noticeably across the board in pretty much every single thing that the previous gen did. I didn't mean to make it sound like I was saying DX8->DX9 was the same as DX9->10, but more so that when a new card is introduced, it does not always clobber its predecessor.

Quote:

Originally Posted by digitalwanderer (Post 983971)
What makes you think that's a chop? I don't see anything tell-tale jumping out at me.

It looks like a part number is blurred on the side of the HSF, in the same way, on every single card where it's visible, and the one against his abdomen has a very blurry back left corner where the 6-pin and 8-pin are supposed to be. It could have just been taken with a cheapish camera, though.

leoneazzurro 09-May-2007 22:32

Quote:

Originally Posted by Silent_Buddha (Post 983967)
I'd imagine from this slide...

http://forum.beyond3d.com/showpost.p...postcount=4164

It seems to imply that R600 can do 80 Texture Fetches per clock.

Regards,
SB

Yes, thanks. There are also some hints at the filtering capability, even if we don't know how much work per clock each of the 16 TF units of R600 can really do. Anyway, what I heard is that R600 cannot do trilinear per cycle.

Bob 09-May-2007 22:35

Quote:

It seems to imply that R600 can do 80 Texture Fetches per clock.
If you're going to start counting prefiltered texels, then G80 can do 256 of those per clock.

Frank 09-May-2007 22:35

Quote:

Originally Posted by Sound_Card (Post 983969)
ROFL. Look very, very, very, very closely at the finer detail. I work with Photoshop almost every day at school, and that is one horrible job. :razz:

I must say, that was a very valid try.

It looks real to me.

leoneazzurro 09-May-2007 22:38

Quote:

Originally Posted by Bob (Post 983978)
If you're going to start counting prefiltered texels, then G80 can do 256 of those per clock.

Are you sure? I had heard 64. Do you have a link to this info? I would appreciate it very much :)

nAo 09-May-2007 22:49

Quote:

Originally Posted by leoneazzurro (Post 983980)
Are you sure? I had heard 64. Do you have a link to this info? I would appreciate it very much :)

If it can provide 64 bilinear filtered 'samples' per clock then it should be able to read 256 texels per clock (from L1 cache) as well for obvious reasons.
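
The "obvious reasons" are that one bilinear sample blends a 2x2 block of texels. A minimal software sketch of that, assuming a plain list-of-rows texture and edge clamping (the function name `bilinear_sample` and the read counter are illustrative only, and say nothing about how G80's hardware or caches are organised):

```python
import math


def bilinear_sample(texture, u, v):
    """Return (filtered value, number of texels read) at texel-space (u, v).

    `texture` is a list of rows of floats; coordinates are clamped to the edge.
    """
    height, width = len(texture), len(texture[0])
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0              # fractional position inside the 2x2 block

    reads = 0

    def texel(x, y):
        nonlocal reads
        reads += 1
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return texture[y][x]

    # Blend horizontally along the top and bottom rows, then vertically.
    top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy, reads


tex = [[0.0, 1.0], [2.0, 3.0]]
value, reads = bilinear_sample(tex, 0.5, 0.5)
texels_per_clock = 64 * reads            # 64 bilinear samples per clock
print(value, reads, texels_per_clock)
```

Each sample touches four texels, so 64 bilinear samples per clock implies reading 4 x 64 = 256 texels per clock from somewhere close to the filter units.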

Fox5 09-May-2007 22:50

Quote:

Originally Posted by Geeforcer (Post 983902)
The notion that R600 is "born for DX10" strikes me personally as a reaction to lower-than-expected performance - almost as if people are saying "Well, those specs have got to be good for SOMETHING, right?". I am personally trying to think of the last card that was slower than a similarly-featured card in DX(n) but significantly faster in DX(n+1).

Of course, if anyone can point me to the specific features of R600 architecture and specific provisions in DX10, that, when combined, would deliver significant advantage over G80, I am all ears.

I'm sure some of ATI's r300 derivatives were slower than the 8500 yet could beat it in dx9...assuming 8500 could do dx9.
And the r300 derivatives were definitely slower than the geforce fx cards in dx8/8.1 but beat them badly in dx9.

Mark 09-May-2007 22:50

Quote:

Originally Posted by Fornowagain (Post 983964)
Just revisiting the review posted earlier, they've added this, which is strangely exactly how I imagined him to look. :D Still think it's fake?

http://img267.imageshack.us/img267/2...0test12cj3.jpg

That dude looks like a character from Planet of the Apes...

leoneazzurro 09-May-2007 22:51

Quote:

Originally Posted by nAo (Post 983988)
If it can provide 64 bilinear filtered 'samples' per clock then it should be able to read 256 texels per clock (from L1 cache) as well for obvious reasons.

And to fetch all 256 to the ALUs? (Because we are not speaking about filtering but about texture fetching to the ALUs.)

NocturnDragon 09-May-2007 22:53

Quote:

Originally Posted by Mark (Post 983990)
That dude looks like a character from Planet of the Apes...

He looks like a happy kid on Christmas day!

http://xs315.xs.to/xs315/07193/xmas.jpg

Pete 09-May-2007 22:53

Quote:

Originally Posted by Geeforcer (Post 983902)
The notion that R600 is "born for DX10" strikes me personally as a reaction to lower-than-expected performance - almost as if people are saying "Well, those specs have got to be good for SOMETHING, right?".

That's exactly what I'm thinking. Surely ATI hasn't crammed so much into R600 to not have more to show than a GTS-beater, and even then conditionally? My skepticism waxes and wanes with each blurted NDA-buster.

Quote:

I am personally trying to think of the last card that was slower than a similarly-featured card in DX(n) but significantly faster in DX(n+1).
One could argue that R300 (DX9), I think R520 (DX9 SM3 branching), and now apparently R600 (really long/complex shaders? HDR all the way?) are similar in that regard: trading off some current performance for greater anticipated future-API/IQ performance. If this is the case, it would seem that ATI is aiming a little beyond NV on the can-do/hope-to-do curve, and has fallen behind in two succeeding generations (R300 being on time relative to NV30). Or I'm oversimplifying, not seeing the forest, etc.

Quote:

Of course, if anyone can point me to the specific features of R600 architecture and specific provisions in DX10, that, when combined, would deliver significant advantage over G80, I am all ears.
This is the 64,000 pixel question, and I'm thinking we'll have to wait for NDA lift and possibly 3DM07 or simultaneous 360-PC releases like Shadowrun to do more than guess.

Razor1 09-May-2007 22:57

Quote:

Originally Posted by Mark (Post 983990)
That dude looks like a character from Planet of the Apes...


Good eye lol

digitalwanderer 09-May-2007 22:57

Quote:

Originally Posted by NocturnDragon (Post 983992)
He looks like a happy kid on Christmas day!

http://xs315.xs.to/xs315/07193/xmas.jpg

http://www.elitebastards.com/forum/i...les/biglol.gif

Kaotik 09-May-2007 23:02

Regardless of whether it's a chop or not, the one on his lap / against his stomach does look too small compared to the others. Even the one on his right knee looks bigger to my eyes, and it should be further away from the cam.


All times are GMT +1.