AMD confirms R680 is two chips on one board

PsychoZA · Jan 11, 2008

Morgoth the Dark Enemy said:
The proof of that being where? Ignoring that we're talking trans-IHV comparisons that don't make that much sense (I picked the 8800GT as a (far-fetched) example because it packs a similar functional unit arrangement, the GTS (classic) handles more pixels (20>16), nV can handle 4 multisamples per cycle, ATi only does two, etc.). Show me this great BW limited scenario.

Just to prove that the GTS is bandwidth limited:

http://www.anandtech.com/video/showdoc.aspx?i=3175&p=4

To prove the ATI scenario we'd need a 900MHz RV670 with downclocked and overclocked memory, because that was my original point, not that the 774MHz RV670 is memory limited.

AlexV · Jan 11, 2008

I give up with you too. You're incredibly, monstrously right. I'm so wrong I'm going to cry. Because ET:QW shows it. And ET:QW sure is indicative of future workloads. Not to mention that 1600x1200 with 4xAA is really where $300 cards are aimed, thus that is what I meant with: under normal circumstances. Clearly.

PsychoZA · Jan 11, 2008

If you'd taken the time to scroll down, you'd have seen that that page included graphs for almost all the current games...

Squilliam · Jan 12, 2008

This is a stupid question sorry! Would I be right to presume that two of these in Xfire would be insufficient for 2560 by 1600 gaming? Namely a 30" LCD monitor...

compres · Jan 12, 2008

pjbliverpool said:
Xenos can use near 256GB/s of bandwidth (doesn't matter that its only for limited situations and not main memory, it can still use it for some things)

RV670 cannot use more than 72GB/s

To me, both of those statements can't be correct, either one, or the other is wrong.

I think you need to go back and read again all the replies you have gotten on this. There has been a clear effort to explain to you just how wrong you are.

compres · Jan 12, 2008

_xxx_ said:
Well that is debatable, but as an overall package the GF6800 was capable of more than ATI's competing products and to me that counts more than a few % more speed in like half of the games.

GF6800 is #1 on FUD/marketing (aka SM3.0 checkbox with no performance to back it up) and craptacular optimizations. At similar IQ the nVidia was clearly the loser.

That said I still took the nVidia because they support linux better. (Edit: no filter?)****Windows.

vertex_shader · Jan 12, 2008

compres said:
GF6800 is #1 on FUD/marketing (aka SM3.0 checkbox with no performance to back it up) and craptacular optimizations. At similar IQ the nVidia was clearly the loser.

That said I still took the nVidia because they support linux better. (Edit: no filter?)****Windows.

Yeah, it was marketing, and?
Crytek's SM3.0+HDR patch for Far Cry helped the success of NV40-based cards a lot.
ATi needs to do similar marketing things, it's that simple.

fellix · Jan 12, 2008

NV40 was really mostly a showcase for SM3 -- more a shiny new developer tool than a complete consumer experience, IMHO.
Very clumsy DB, just an option to try and play with, not to employ.
Slow FP blending (fixed in G70), almost useless vertex texture fetching and some other minor woes.
But yes, it made a very good marketing (i.e. propaganda) impact, paving the road for the next generations of hardware.

pjbliverpool · Jan 12, 2008

compres said:
I think you need to go back and read again all the replies you have gotten on this. There has been a clear effort to explain to you just how wrong you are.

Er, no. In fact if you had read all the replies you would have seen that the only one to fully understand and address what I was saying completely validated it.

My statement was correct, Xenos can use near 256GB/s of bandwidth under some situations and under those same situations RV670 would be bandwidth restricted.

The situations may be rare in current generation games and thus RV670 is largely unrestricted by memory bandwidth but then that raises the question about the benefit of eDRAM in Xenos (which isn't something we need to discuss in this thread).

My point was always to demonstrate that either Xenos didn't benefit greatly from having all that extra bandwidth or that RV670 was bandwidth limited on at least a partially regular basis.

If you care to suggest a third alternative then go right ahead.

ninelven · Jan 13, 2008

pjbliverpool said:
If you care to suggest a third alternative then go right ahead.

How about both chips are designed to function well given their respective available bandwidth.

Or we could just say that Xenos is so shader bound that the bandwidth available to it is irrelevant at resolutions RV670 users regularly play at...

pjbliverpool said:
We can say that most of the time, Xenos doesn't use anywhere near that much bandwidth, but the fact of the matter is that situations where it does use a lot of it (if they even exist) would also be able to exist in a PC game and thus in those situations RV670 would be bandwidth limited.

In modern titles on the RV670 you are most likely going to be shader limited before you are bandwidth limited. So while you could certainly spend $$$ increasing the available bandwidth, your best bang/buck bet would be going the other direction, hence R600->RV670.

Squilliam · Jan 13, 2008

Since when do console GPUs have anything to do with an upcoming PC card?

swaaye · Jan 13, 2008

Squilliam said:
Since when do console GPUs have anything to do with an upcoming PC card?

When they are made by the same engineers.

pjbliverpool · Jan 13, 2008

ninelven said:
How about both chips are designed to function well given their respective available bandwidth.

That would be fair enough if there were significant differences between the Xenos core and the RV670 core that would result in Xenos requiring FAR more bandwidth. But there aren't. At least no one has been able to demonstrate that there are thus far.

RV670 is just an evolved and more powerful version of Xenos (minus the eDRAM). No one has provided a reason why Xenos would be able to use three times more bandwidth (aside from the lack of compression).

Or we could just say that Xenos is so shader bound that the bandwidth available to it is irrelevant at resolutions RV670 users regularly play at...

Yes, possibly, that's one of the scenarios my argument allows for. Xenos simply doesn't need all that bandwidth. For the record though, I believe my point has already been well answered. Xenos can use the bandwidth, but only under rare and extreme circumstances.

In modern titles on the RV670 you are most likely going to be shader limited before you are bandwidth limited. So while you could certainly spend $$$ increasing the available bandwidth, your best bang/buck bet would be going the other direction, hence R600->RV670.

Yup, that's pretty much my conclusion too.

pjbliverpool · Jan 13, 2008

Squilliam said:
Since when do console GPUs have anything to do with an upcoming PC card?

Well, let's see...

They are both made by the same company.
They are both based on the same core architecture.
One is a clear evolution of the other (by ATI's own admission).
Both run practically identical games.

Are you honestly trying to say that no comparisons can be drawn? Rather than giving the old generic response of "they can't be compared because one's in a console and one's in a PC", how about you give a tangible reason as to why the two aren't comparable in this way?

My basic point really is extremely simple. Either Xenos doesn't need all that bandwidth or RV670 needs more. I believe it's already been well answered in that Xenos can use all that bandwidth, but in most situations doesn't need to, and the same would translate to RV670. Most of the time it's fine, but there are rare occasions when it would be bandwidth limited.

AlNom · Jan 13, 2008

pjbliverpool said:
My basic point really is extremely simple. Either Xenos doesn't need all that bandwidth or RV670 needs more. I believe it's already been well answered in that Xenos can use all that bandwidth, but in most situations doesn't need to, and the same would translate to RV670. Most of the time it's fine, but there are rare occasions when it would be bandwidth limited.

(Perhaps this Xenos discussion should be split off...)

It's a bit backwards to think of Xenos needing the bandwidth versus game developers making use of it. As a piece of hardware enclosed within a fixed system, the enormous bandwidth made available for the framebuffer is really a means of removing a potential bottleneck. There are other bottlenecks within the closed system, i.e. # of ALUs, # of ROPs etc.

I would like to add that the use of multiple render targets will increase bandwidth usage as well as FP16 precision + blending. Games making use of deferred shading should see a benefit with higher memory bandwidth, but this rendering method is not prevalent in games at the moment - STALKER, Gears of War, UT3. Gladiator used deferred shading IIRC, at least on Xbox (There was a GDC2004 presentation on it). And of course, the use of deferred shading is a boon for shader limited hardware. (Xenos has its own issues with tiling and its MRT implementation, but that's another discussion)

Even with say... support for 8 render targets, it's probable (from what I understand, but please correct me if I'm wrong) that most hardware in the market would run out of memory bandwidth should a developer actually use that many.
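To put rough numbers on the render target point, here is a minimal Python sketch of colour-buffer traffic. Everything in it is an assumption for illustration: the function name, the 1600x1200 resolution (borrowed from earlier in the thread), the 60 fps target, and the simple model itself, which ignores depth traffic, texture reads, MSAA and any compression.

```python
FP16_RGBA_BYTES = 8  # four 16-bit channels per pixel


def render_target_traffic_gbps(width, height, num_targets, bytes_per_pixel,
                               fps, overdraw=1.0, blending=False):
    """Very rough colour-buffer traffic in GB/s (1 GB = 1e9 bytes).

    Hypothetical model: every covered pixel is written once per render
    target, multiplied by an overdraw factor. Blending is counted as a
    read plus a write of the destination.
    """
    passes = 2.0 if blending else 1.0
    bytes_per_frame = (width * height * num_targets * bytes_per_pixel
                       * overdraw * passes)
    return bytes_per_frame * fps / 1e9


# Illustrative inputs only, not measurements.
one_target = render_target_traffic_gbps(1600, 1200, 1, FP16_RGBA_BYTES, fps=60)
eight_targets = render_target_traffic_gbps(1600, 1200, 8, FP16_RGBA_BYTES, fps=60)
print(one_target, eight_targets)
```

In this model the traffic scales linearly with the number of render targets, with overdraw, and doubles again with blending, so how quickly a given card runs out depends heavily on the overdraw figure you plug in.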

pjbliverpool · Jan 13, 2008

AlStrong said:
(Perhaps this Xenos discussion should be split off...)

Yeah, apologies for getting OT, feel free to split into a new thread.

It's a bit backwards to think of Xenos needing the bandwidth versus game developers making use of it. As a piece of hardware enclosed within a fixed system, the enormous bandwidth made available for the framebuffer is really a means of removing a potential bottleneck. There are other bottlenecks within the closed system, i.e. # of ALUs, # of ROPs etc.

Yep, I can see that with the bandwidth available, devs will make use of it when they wouldn't even try in a system that doesn't have that kind of bandwidth. But the very fact that they would want to try proves that there is a limitation in other systems with less bandwidth. Otherwise there would be no motivation to make use of such huge bandwidth.

The thing is that most games on the 360, at least most of the graphically high profile ones, end up on the PC. So if 360 devs are using the eDRAM to its full potential in the console, that hardware requirement is going to transfer over to the PC. The fact that it's a fixed system really doesn't mean a thing since the same games with the same bandwidth requirements are still being played.

I would like to add that the use of multiple render targets will increase bandwidth usage as well as FP16 precision + blending. Games making use of deferred shading should see a benefit with higher memory bandwidth, but this rendering method is not prevalent in games at the moment - STALKER, Gears of War, UT3. Gladiator used deferred shading IIRC, at least on Xbox (There was a GDC2004 presentation on it). And of course, the use of deferred shading is a boon for shader limited hardware. (Xenos has its own issues with tiling and its MRT implementation, but that's another discussion)

Indeed. In fact the requirement to use FP16 precision + blending as opposed to FP10 would mean that RV670 needs even more bandwidth than Xenos, not less.

ninelven · Jan 13, 2008

pjbliverpool said:
But the very fact that they would want to try proves that there is a limitation in other systems with less bandwidth.

So you have come to the conclusion that, all else being equal, more bandwidth = better... astonishing. In related news, if devs have more shader power (or CPU, fillrate, tools, etc) available to them, they will probably make use of it... everybody panic.

silent_guy · Jan 13, 2008

pjbliverpool said:
My basic point really is extremely simple. Either Xenos doesn't need all that bandwidth or RV670 needs more.

Or you're comparing apples to oranges.

I believe it's already been well answered in that Xenos can use all that bandwidth, but in most situations doesn't need to, and the same would translate to RV670. Most of the time it's fine, but there are rare occasions when it would be bandwidth limited.

When you stay entirely on-chip, the bandwidth to internal RAMs is not a scarce good. Above a certain size, the area overhead of splitting 1 RAM into 2 is minimal.
So once you've decided to go this route, you can design the rest of the architecture while recklessly spending on bandwidth without worrying about efficiency. A direct consequence is that standard methods of comparison become pretty much meaningless, yet you're still pretending that apples are oranges. They are not.

You only need to read the Xenos article to see that the real bottleneck in the architecture is not in the on-chip bandwidth, but in the interface between the 2 dies.

Let's compare the two GPU architectures for typical cases.

In one case, you're rendering completely covered 2x2 pixel tiles. In a traditional GPU, this results in compression. In Xenos, the data travels compressed to the ROP die and undergoes an 8-fold(!) bandwidth expansion. Sure, you see impressive eDRAM bandwidth usage, but with no additional benefit over a traditional GPU.

In the other case, when no compression is possible, the eDRAM uses only 1/8 of the theoretical maximum, because now you're limited by the link bandwidth.
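As a rough sanity check on the two cases above, here is a small Python sketch. It only uses the 256GB/s eDRAM figure quoted earlier in the thread and the 8-fold expansion mentioned in this post; the function names are made up for illustration and the model is deliberately simplistic.

```python
EDRAM_PEAK_GBPS = 256.0  # peak eDRAM figure quoted earlier in the thread
EXPANSION_FACTOR = 8.0   # 8-fold expansion on the ROP die, per the post above


def implied_link_bandwidth(edram_peak_gbps, expansion):
    """Link bandwidth needed to reach the eDRAM peak after expansion."""
    return edram_peak_gbps / expansion


def edram_usage(link_gbps, compressible, expansion):
    """eDRAM bandwidth used for a given link rate.

    Fully covered tiles travel compressed and expand on the ROP die;
    incompressible data crosses the link at its full size.
    """
    return link_gbps * (expansion if compressible else 1.0)


link = implied_link_bandwidth(EDRAM_PEAK_GBPS, EXPANSION_FACTOR)
print(link)                                        # 32.0
print(edram_usage(link, True, EXPANSION_FACTOR))   # 256.0
print(edram_usage(link, False, EXPANSION_FACTOR))  # 32.0
```

Under these assumptions the headline eDRAM number is only reached in the compressible case, and the link sets the ceiling otherwise.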

So if you want to compare bandwidth numbers, the meaningful numbers are the bandwidth of a regular GPU's memory interface against (the memory bandwidth of the system memory + the bandwidth of the inter-die link + the read bandwidth for RMW operations - the bandwidth to copy from eDRAM to system RAM). In this equation, there's no contest: an RV670 will probably beat a Xenos by a factor of two, if not more.
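The equation above can be written down as a tiny Python helper. This is only a sketch: the function and parameter names are hypothetical, and the inputs in the usage line are arbitrary placeholders rather than real Xenos figures, so plug in your own numbers before drawing any conclusion.

```python
def xenos_comparable_bandwidth(system_mem_gbps, inter_die_link_gbps,
                               rmw_read_gbps, edram_copy_gbps):
    """Bandwidth figure to hold against a regular GPU's memory interface.

    Follows the terms of the equation in the post above: system memory
    plus inter-die link plus RMW reads, minus the cost of copying the
    eDRAM contents back to system RAM.
    """
    return (system_mem_gbps + inter_die_link_gbps
            + rmw_read_gbps - edram_copy_gbps)


# Placeholder inputs chosen only to show the arithmetic.
example = xenos_comparable_bandwidth(10.0, 20.0, 5.0, 3.0)
print(example)  # 32.0
```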

The real advantage of the eDRAM arrangement lies not in the bandwidth but in the complexity and area reduction that it allows: no compression logic, much more coherent data streams to the system memory (-> significantly smaller and simpler MC), and drastically lower latency for the eDRAM (-> large area savings due to smaller latency FIFOs).

Xenos has a really neat architecture, but it's just too naive to take one number out of context and build a whole argument around it without looking at the whole picture.

no-X · Jan 13, 2008

pjbliverpool:

- R6xx uses 64:1 Z-compression for MSAA 4x. Xenos doesn't.

- R6xx ROPs are capable of 2 MSAA samples per clock and don't perform hardware resolve, so much lower bandwidth per clock is required for this architecture: the ROPs wait for the resolve to be performed by the shader core, fewer MSAA samples are generated per clock and higher compression is applied, so fewer data transfers are spread over longer periods of time. That's why R6xx is very bandwidth independent compared to Xenos.

Just compare performance of these cards:
HD2600XT GDDR3 vs. HD2600XT GDDR4: 57% more bandwidth, 4-7% higher performance
HD2900XT GDDR3 vs. HD2900XT GDDR4: 21% more bandwidth, 0-5% higher performance
HD2900XT vs. HD3870: 32% less bandwidth, 0-12% higher performance
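One way to read those figures is as a sensitivity ratio: percent of performance gained per percent of bandwidth added. The Python sketch below uses only the numbers listed above; the function name is made up, and the HD2900XT vs. HD3870 row is left out because, as far as I know, those two cards differ in more than memory bandwidth.

```python
def bandwidth_sensitivity(bandwidth_gain_pct, perf_gain_pct):
    """Performance gained per unit of bandwidth gained (1.0 = fully bound)."""
    return perf_gain_pct / bandwidth_gain_pct


# (bandwidth gain %, (min perf gain %, max perf gain %)) from the list above
comparisons = {
    "HD2600XT GDDR3 -> GDDR4": (57.0, (4.0, 7.0)),
    "HD2900XT GDDR3 -> GDDR4": (21.0, (0.0, 5.0)),
}

for name, (bw_gain, (perf_lo, perf_hi)) in comparisons.items():
    low = bandwidth_sensitivity(bw_gain, perf_lo)
    high = bandwidth_sensitivity(bw_gain, perf_hi)
    print(f"{name}: {low:.2f} to {high:.2f}")
# HD2600XT GDDR3 -> GDDR4: 0.07 to 0.12
# HD2900XT GDDR3 -> GDDR4: 0.00 to 0.24
```

A ratio near 1.0 would suggest a card that is mostly bandwidth bound; values this far below 1.0 are consistent with the point being made here.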

pjbliverpool · Jan 13, 2008

ninelven said:
So you have come to the conclusion that, all else being equal, more bandwidth = better... astonishing. In related news, if devs have more shader power (or CPU, fillrate, tools, etc) available to them, they will probably make use of it... everybody panic.

Wow, way to miss the point completely there, ninelven.

The point is that RV670 MUST be bandwidth limited in some situations if even weaker GPUs can make use of more.
