R700 Inter-GPU Connection Discussion

Kinda off topic on this, but interesting to see what AMD might have next in line. Kyle at [H] posted this

R800 will be a single GPU design and Bulldozer will do reverse hyperthreading.

What in the world is reverse hyperthreading?
 
Multiple cores appearing as one logical core. Personally I think that's BS, and if it isn't, why would they go that way for the CPU but back to monolithic for the GPU, even though that's what they have seemed to focus on?
 
The alleged benefit of reverse hyperthreading is auto-parallelization: taking a single monolithic software task and scaling its execution across multiple cores without any effort on the part of the software developers. This saves the developers from the tedious and error-prone manual parallelization process and from rewriting software around more complicated algorithms.

The reason why they're moving in seemingly different directions is the nature of each realm. Typical GPU tasks are inherently parallel behind the scenes (read: at the sub-pixel level). Typical CPU tasks are inherently sequential.
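
To make that contrast concrete, here's a minimal sketch of my own (not from anyone in the thread): the first loop's iterations are completely independent, so they can be spread across as many cores or shader units as you like, while the second carries a value from one iteration to the next and has to run in order.

```c
#include <stddef.h>

/* Inherently parallel: every pixel is computed independently, so the
 * iterations can be handed out to any number of execution units.
 * (delta is assumed to be >= 0 here, just to keep the sketch short.) */
void brighten(unsigned char *pixels, size_t n, int delta)
{
    for (size_t i = 0; i < n; i++) {
        int v = pixels[i] + delta;
        pixels[i] = (unsigned char)(v > 255 ? 255 : v);
    }
}

/* Inherently sequential: each iteration depends on the previous one
 * (a loop-carried dependence), so extra cores don't help without
 * restructuring the algorithm itself. */
double running_average(const double *samples, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc = acc * 0.9 + samples[i] * 0.1;
    return acc;
}
```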
 
I'm waiting to buy my 4870 until more info comes out about this card. If the multi-GPU issues are solved, I'm there. I just wish ATI would be more forthcoming with information, because I would imagine many people like me are waiting to find out if the R700 is really bringing home the bacon. I understand why they would want to wait, but it's not like Nvidia can pull a similar card out of a hat if ATI has truly solved the multi-GPU problem.
 
While shared memory seems increasingly unlikely, someone did mention that in this pic: http://www.ocxtreme.org/opb/hd4850/r700slide.JPG

The GPU-Z screenshot shows 1GB of memory being detected. Now obviously that's an ATI PR slide and who knows if they photoshopped anything, but a few tidbits are that the core clock is still 750 MHz (might be changed soon) and that the memory is showing 1GB.

No idea if GPU-Z actually detects the memory itself or if the memory values are in a database, but I figure that detecting memory is one of the easier things to do.
 
What does reverse HT really help with in the end, anyway?

Mitosis seems on the pessimistic extreme end of scaling. Ugh.

I do, however, believe in the R800/single chip rumour. In fact, I had one of them made when RV670 debuted. ;)
 
Supposedly "Reverse Hyperthreading" is a form of speculative execution, similar to what you see in (for example) the Itanium processor. If you encounter a branch in the code, which is dependent on a calculated value, then, in a conventional processor, you risk a pipeline stall: it has to try and predict which branch to take before the calculated value comes out of the pipeline, and, if it gets it wrong, you have to flush out the entire pipeline to start on the other branch.

With speculative execution the processor immediately starts to execute both branches at once. Once the calculated value comes out of the pipeline, it discards one branch and continues with the other without interruption.

This obviously uses up considerably more total CPU time, but, if you've got an otherwise single-threaded application it allows you to make use of what would otherwise be an idle second processor core - and the result is that you never experience any pipeline stalls due to branch prediction going wrong (and, effectively, you don't even need any branch prediction hardware any more).
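
A toy sketch of how that would look, using ordinary software threads to stand in for what would really be hardware contexts (every name here is made up purely for illustration; real hardware also couldn't let either path have visible side effects):

```c
#include <pthread.h>
#include <stdio.h>

struct path { long input; long result; long (*fn)(long); };

static long if_path(long x)   { return x * 3 + 1; }  /* work on the "taken" side     */
static long else_path(long x) { return x / 2; }      /* work on the "not taken" side */

static void *run_path(void *arg)
{
    struct path *p = arg;
    p->result = p->fn(p->input);
    return NULL;
}

/* Stand-in for the long-latency calculation the branch depends on. */
static long slow_condition(long x)
{
    for (volatile long i = 0; i < 10000000; i++) {}
    return x & 1;
}

int main(void)
{
    long x = 12345;
    struct path a = { x, 0, if_path };
    struct path b = { x, 0, else_path };
    pthread_t ta, tb;

    /* Eagerly start both sides of the branch before the condition is known. */
    pthread_create(&ta, NULL, run_path, &a);
    pthread_create(&tb, NULL, run_path, &b);

    long cond = slow_condition(x);   /* the "pipeline" finally produces the value */

    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    /* Keep the correct path's result and throw the other one away. */
    printf("result = %ld\n", cond ? a.result : b.result);
    return 0;
}
```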

Personally I've always been thoroughly sceptical about "Reverse Hyperthreading". Speculative execution works nicely in Itanium, but, like many other Itanium features, it is very strongly dependent on the compiler churning out code that makes use of the feature. The claim with "Reverse Hyperthreading" was that AMD chips would be able to do the same thing with legacy code without recompiling it. That sounds a lot less likely to me. I'm not sure there's ever been any evidence that "Reverse Hyperthreading" is a real feature, as opposed to one dreamed up in the fevered imaginations of especially rabid AMD fanboys.
 
While shared memory seems increasingly unlikely, someone did mention that in this pic: http://www.ocxtreme.org/opb/hd4850/r700slide.JPG

The GPU-Z screenshot shows 1GB of memory being detected. Now obviously that's an ATI PR slide and who knows if they photoshopped anything, but a few tidbits are that the core clock is still 750 MHz (might be changed soon) and that the memory is showing 1GB.

No idea if GPU-Z actually detects the memory itself or if the memory values are in a database, but I figure that detecting memory is one of the easier things to do.

The slide is legit.
 
With speculative execution the processor immediately starts to execute both branches at once. Once the calculated value comes out of the pipeline, it discards one branch and continues with the other without interruption.
The case you're discussing with Itanium is actually not unique. Compilers can leverage predicated instructions to unconditionally fold branch outcomes into a single code stream.
GPUs have predication as well.
x86 and other CPU ISAs without predication can do something similar with conditional moves. x86 could do more if it actually had more registers.
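
For instance (a hypothetical snippet, not from the thread), a compiler targeting x86 will usually lower a simple select like this to a cmov rather than a conditional branch, so there is nothing left for the predictor to get wrong:

```c
/* Typically compiles to something like cmp + cmovle on x86: no jump at all. */
int select_min(int a, int b)
{
    return (a < b) ? a : b;
}

/* The flip side, discussed below: the cmov can't complete until BOTH inputs
 * and the comparison are ready, an explicit data dependence that a correctly
 * predicted branch would not impose. */
```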

This is used carefully in compiled code because a succession of branches can explode the number of instructions that have to be coalesced into a single stream.

Such branch folding is actually something of a bad thing for OoO CPUs, as predication and conditional moves force an explicit data dependence that dynamic scheduling cannot break.

This obviously uses up considerably more total CPU time, but, if you've got an otherwise single-threaded application it allows you to make use of what would otherwise be an idle second processor core - and the result is that you never experience any pipeline stalls due to branch prediction going wrong (and, effectively, you don't even need any branch prediction hardware any more).
It would also be a waste more than 90% of the time.
That ~90% is the proportion of branches that branch prediction already gets right.
For those branches the CPU would be consuming twice the resources and twice the power for the same amount of work.
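
A rough back-of-envelope version of that argument; the ~90% accuracy figure comes from the post, everything else is an arbitrary unit purely for illustration:

```c
#include <stdio.h>

int main(void)
{
    double accuracy  = 0.90;  /* fraction of branches predicted correctly (figure from the post) */
    double path_work = 1.0;   /* work to execute one side of a branch, in arbitrary units        */

    /* Eager execution: both sides always run, so one side is always thrown away. */
    double eager_waste   = 1.0 * path_work;

    /* Prediction: work is only squashed on the ~10% of mispredictions, assuming
     * roughly one path's worth of work is in flight when that happens. */
    double predict_waste = (1.0 - accuracy) * path_work;

    printf("wasted work per branch, eager execution: %.2f\n", eager_waste);   /* 1.00 */
    printf("wasted work per branch, prediction:      %.2f\n", predict_waste); /* 0.10 */
    return 0;
}
```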

If this were implemented, I'd bet there would be a predictor structure that basically tracks branches the CPU keeps on mispredicting, and only then would it use such capability.
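
Something like a per-branch confidence counter, in other words. A sketch of what the bookkeeping might look like, with names and thresholds entirely made up (nothing here is based on anything AMD has actually described):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-branch entry: a 2-bit saturating counter that rises on
 * mispredictions and falls on correct predictions. */
struct branch_confidence {
    uint8_t mispredict_level;   /* 0..3 */
};

#define DUAL_PATH_THRESHOLD 3   /* arbitrary: only fork for chronically mispredicted branches */

/* Update once the real branch outcome is known. */
static void update(struct branch_confidence *bc, bool mispredicted)
{
    if (mispredicted) {
        if (bc->mispredict_level < 3) bc->mispredict_level++;
    } else {
        if (bc->mispredict_level > 0) bc->mispredict_level--;
    }
}

/* Consult at fetch time: execute both paths only for branches the
 * predictor keeps getting wrong. */
static bool should_dual_path(const struct branch_confidence *bc)
{
    return bc->mispredict_level >= DUAL_PATH_THRESHOLD;
}
```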

The claim with "Reverse Hyperthreading" was that AMD chips would be able to do the same thing with legacy code without recompiling it. That sounds a lot less likely to me. I'm not sure there's ever been any evidence that "Reverse Hyperthreading" is a real feature, as opposed to one dreamed up in the fevered imaginations of especially rabid AMD fanboys.

The possibly more realistic claims I've seen indicate a more modest sharing of units. I haven't seen claims about executing down both branch paths.

edit:
Back on topic:

It appears from the way the chips are aligned in the board pics that each RV770 has the sideband port along one side of the die.
Each chip is rotated 180 degrees from the other.
This fits with a two-lane bus of some kind, with each chip's in lane lining up with an out lane from the other.
It does seem to put a limit of 2 chips per board, at least for this implementation.
 
The slide is legit.

Yeah, the slide's legit. I'm just wondering whether GPU-Z detecting RAM is actually just GPU-Z looking it up in a database like other things, or whether it really is detecting 1GB for that GPU. So either the R700 is 2 x 1GB or 1GB shared... hm!
 
Kinda off topic on this, but interesting to see what AMD might have next in line. Kyle at [H] posted this

LOL, wrong on both counts. Kyle must have the same AMD sources as Fuad.

What in the world is reverse hyperthreading?

Something that doesn't exist, sadly. It's the "holy grail" method of extracting instruction level parallelism for multi-core CPUs. I started a discussion on RHT @ RWT at the time that Inq article came out and it was soundly trounced. Great idea, simply unrealistic to implement.

The idea was to create a control scheme by which a single thread could be parallelized across multiple homogeneous cores, presumably in an x86 CPU (hence the name).
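
Put another way, the hardware would have to do transparently, on an unmodified single-threaded binary, roughly what a developer does by hand today. A plain manual example of that kind of split (my own sketch, nothing to do with whatever AMD may or may not have planned):

```c
#include <pthread.h>
#include <stdio.h>

/* Manual parallelization: split one sequential sum across two cores
 * and combine the partial results at the end. */
struct slice { const int *data; size_t len; long sum; };

static void *partial_sum(void *arg)
{
    struct slice *s = arg;
    s->sum = 0;
    for (size_t i = 0; i < s->len; i++)
        s->sum += s->data[i];
    return NULL;
}

int main(void)
{
    int data[1000];
    for (int i = 0; i < 1000; i++) data[i] = i;

    struct slice lo = { data,       500, 0 };
    struct slice hi = { data + 500, 500, 0 };
    pthread_t t;

    pthread_create(&t, NULL, partial_sum, &hi);  /* second core */
    partial_sum(&lo);                            /* first core  */
    pthread_join(t, NULL);

    printf("sum = %ld\n", lo.sum + hi.sum);      /* 499500 */
    return 0;
}
```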
 
3dilettante/nicolasb:

WRT your RHT definition(s) and speculative execution/full branch traversal (multiple independent branch path execution)... jeez, that's a mouthful :p

Don't several in-order execution CPUs lacking predication hardware do this already? Cell, for instance.
 
Yeah, the slide's legit. I'm just wondering whether GPU-Z detecting RAM is actually just GPU-Z looking it up in a database like other things, or whether it really is detecting 1GB for that GPU. So either the R700 is 2 x 1GB or 1GB shared... hm!

I haven't used this pic in months!
burns_excellent.jpg

Moohoohoohaha! :devilish:

Come on, show me a shared memory architecture you mo fackies! Give consumers the biggest revolution in GPU performance - ever.
 
LOL, wrong on both counts. Kyle must have the same AMD sources as Fuad.

I don't think so. Kyle was probably the first to back up, like 6 months ago, the theory that R700 was not just CrossFire on a card, which now seems extremely likely.
I think that with that post he's hinting that R800, even if it is a multi-GPU card, will be seen by the system as a single-GPU card.
 
I don't think so. Kyle was probably the first to back up, like 6 months ago, the theory that R700 was not just CrossFire on a card, which now seems extremely likely.
I think that with that post he's hinting that R800, even if it is a multi-GPU card, will be seen by the system as a single-GPU card.

Or he's simply joking.
 