The LAST R600 Rumours & Speculation Thread

OK, well, if you've got a decent guess why this is asymmetric, I'd be interested. Of course, it might turn out that version 1.0 is asymmetric and version 2.0 (R600) is symmetric?...

I'm afraid I don't follow. My thinking is that it isn't asymmetric at all in terms of performance. It is in terms of logic, because all the 'intelligence' is at the issue side.

[... I have to let you go when you're talking about pure GPU specific issues. Not my thing... ]

Agreed with all that. Additionally the sequencer(s) (which determines instruction issue to ALU, TMU, VFU pipelines) knows how much work it's got outstanding (i.e. it knows how much latency it can cover) so it hopefully intercedes in the MC when those low priority tasks suddenly go up in priority because their input queue is about to fill up.
That's part of pretty much every MC: an emergency signal (or signals) that raise hell when the need is high.
I suppose in this case what you're saying is that the MC is distinct from a memory interface unit whose only task is to multiplex to/from the pads.
Yes, exactly.
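Just to make the "emergency signal" idea concrete, here's a toy arbiter sketch - entirely my own illustration, nothing to do with ATI's actual logic: low-priority clients normally lose arbitration, but once a client's request queue is close to full it asserts an urgent flag and jumps the queue.

```python
# Toy memory-controller arbiter - an illustration only, not ATI's design.
from collections import deque

class Client:
    def __init__(self, name, priority, queue_depth):
        self.name = name
        self.priority = priority              # higher = normally more important
        self.queue = deque(maxlen=queue_depth)

    @property
    def urgent(self):
        # The "emergency signal": this client's input queue is about to fill up.
        return len(self.queue) >= 0.75 * self.queue.maxlen

def arbitrate(clients):
    """Service urgent clients first, then fall back to static priority."""
    waiting = [c for c in clients if c.queue]
    if not waiting:
        return None
    return max(waiting, key=lambda c: (c.urgent, c.priority))

# A low-priority vertex-fetch client overtakes texturing once its queue backs up.
tex = Client("texture", priority=3, queue_depth=16)
vfu = Client("vertex_fetch", priority=1, queue_depth=4)
tex.queue.extend(range(4))    # 4/16 full - not urgent
vfu.queue.extend(range(4))    # 4/4 full  - urgent
assert arbitrate([tex, vfu]) is vfu
```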
I suppose it's worth noting that each ring stop in R5xx has a lot of wires of its own:
  • 1024 to form the ring connections to its neighbours (presuming it has two 512-bit interfaces, one for each ring stop on either side)
  • data path for texturing data
  • data path for ROP reads (shared with texture data?)
  • write data path from the MC
  • control signalling from the MC (each ring stop is under the command of two sub-MCs)
A lot of wires, yes. But very regular, so it's probably not a big deal.
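For fun, a back-of-envelope tally of what that adds up to per ring stop - only the 1024 ring wires come from the list above, the other widths are pure guesses on my part:

```python
# Rough per-ring-stop wire count; everything except the 1024 ring wires is a guess.
wires = {
    "ring links (2 x 512-bit)":              1024,
    "texture read data path (guess)":         256,
    "ROP read data path (guess)":             256,  # possibly shared with texture reads
    "write data path from the MC (guess)":    256,
    "control from the two sub-MCs (guess)":    64,
}
print(sum(wires.values()), "wires per ring stop, give or take")   # ~1856
```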
 
I'm afraid I don't follow. My thinking is that it isn't asymmetric at all in terms of performance. It is in terms of logic, because all the 'intelligence' is at the issue side.
What's puzzling me is the data routing.

Write data, from a ROP, goes via the central MC (or at least a central crossbar) to the requisite ring stop. This data doesn't travel around the ring. Now, that might be because the ring is busy enough with read data, or that to make the ring big enough would be detrimental, overall. But it just seems puzzling.

It means that each pixel shader (12xALU + 4xTMU + 4xROP in R580) has one data bus that connects it to the ring stop and one data bus that connects it to the MC. Instead of a "double-wide" bus that connects it to the ring stop, and a command bus (tiny) that connects it to the MC. Then you have the write data from MC to ring-stop, the implication being that each ring-stop has a full speed dedicated data connection to the MC.

Cell's EIB isn't asymmetric in this sense - the ring does all data transport. The big difference with EIB is that it's not solely for client<->memory data, but also for client<->client data (hence your point about the intelligence all being on the issue side, which is the MC). It may be that EIB, being 2 pairs of rings (i.e. 4 in total, 2 contra-rotating) just has the brute capacity whereas R5xx's ring bus doesn't. Or, it may be that EIB has to cater for more clients, so symmetry is a simpler solution.

I think EIB's meant to easily sustain 200GB/s, whereas I don't know what the MC+ring bus is targeted at - but in R5xx it would appear to be in the range of 60-100GB/s. So perhaps double that for R600.

So, this puts my focus back on the large die area of the MC, what's it all doing? If the MC is as big as R520's ALUs+register file, and R520's register file is in the region of 400KB (split four ways, say 100KB per pixel shader unit), does that imply about 0.5MB of memory in the MC? 1MB?
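Putting my own rough numbers on that - the 400KB register-file figure and the "MC area ≈ ALUs + register file" equivalence are guesses, not anything confirmed:

```python
# Back-of-envelope guess at on-chip memory in the MC; every input is an assumption.
register_file_kb = 400            # guessed total R520 register file size
pixel_shader_units = 4
print(register_file_kb / pixel_shader_units, "KB of registers per pixel shader unit")  # 100.0

# If the MC's area really is comparable to ALUs + register file, and a decent
# fraction of that area is SRAM, then something in the 0.5MB-1MB range of
# buffering in the MC doesn't seem crazy.
for guess_kb in (512, 1024):
    print("MC buffering guess:", guess_kb, "KB")
```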

Jawed
 
Well it looks like he either heard it in action or intercepted the claim from AMD to its AIB partners.
 
Thank you :smile:

At least my idea of thinking of a metal layer as being like a layer of a PCB isn't so bad, then. I just thought it might be a reasonable way to picture it.
It's close, but there's no (or as little as possible) white space between transistors/components on the die. So, think of it as a PCB with nothing but components on it. ;)

So, there would be no difference in cost between a 10-metal-layer die and a 12-metal-layer die, and only the wafer area would be the main factor in its cost (or margin), is that right?
No, each layer of metal adds cost and time in fab. I'm not sure of the exact ratio, but it's in the several-percentage-points-per-layer range, plus an extra day or so for each layer.

The more layers, the more difficult FIB work can be during chip bring-up, too. (Though I don't even know what sort of FIB options there are at 45nm, so maybe that's not an issue.)

Of course, on the other hand, when routing gets tough, it's the easiest thing to do to just unconstrain the autorouter and say "ok, one more layer of metal". It's certainly easier than redoing your floorplan and growing the die size.
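To put made-up numbers on that - the 3% per layer and the baseline wafer cost below are purely illustrative, riffing on the "several percentage points and an extra day or so per layer" figure above:

```python
# Illustrative only: what "a few percent and an extra day or so per metal layer" adds up to.
base_wafer_cost = 5000.0      # made-up baseline cost for a 10-metal-layer wafer
cost_per_extra_layer = 0.03   # ~3% per layer - within "several percentage points"
extra_days_per_layer = 1.5    # guess at the added fab time per layer

for extra_layers in (1, 2):
    cost = base_wafer_cost * (1 + cost_per_extra_layer) ** extra_layers
    days = extra_layers * extra_days_per_layer
    print(f"{10 + extra_layers} metal layers: ~${cost:.0f}/wafer, ~{days:.1f} extra days in fab")
```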
 
http://www.theinquirer.net/default.aspx?article=37039

This time, honours go to the manufacturer SilverStone. The company brings us the ST85F, a modular 850W power supply which brings support for four 6-pin PCIe power connectors, or two 8-pin PCIe ones.
Yes, the R600XTX will not require this 8-pin connector, but the honour is reserved for another three, more compact brothers which cannot live with a single 6-pin one. Having two would result in a less-compact PCB.
So, if that's to be believed, the 8-pin PEG2 socket will "save board space" compared with 2x 6-pin PEG1 sockets. Seems plausible.

Jawed
 
// plugging in after not being around for a while

So we still have no leaked benchies or any kind of serious info? This is kinda disturbing, or have I missed something?
 
// plugging in after not being around for a while

So we still have no leaked benchies or any kind of serious info? This is kinda disturbing, or have I missed something?

Nope, you haven't missed a thing. :cry: That's not looking good for a February release.
 
Jawed, you are right about the 8-pin PEG versus two 6-pin PEGs!!
The INQ said:
Now, here comes the fun part: of products mentioned here, three require a new power connector, the 8-pin one. The 8-pin was also the reason that ATI managed to get its models in 9-inch size, but we already wrote about that one. Top brass, the XTX - does not care for 8-pins, it wants dual 6-pin rails, just like the 8800GTX. Boards will come with adapters, or you need one of these babies.

Here's more information from the INQ - R600 will be the X2x00...
http://theinq.net/default.aspx?article=37040

They say R600 will come in 4 versions... the big one (12") is the top one and the others will be on a 9" board. One of them will have single-slot cooling :cool:

If all this info is right, R600 leaks should come soon...

Edit: Typo...
 
The status of "XTX" is a bit confused I think. For one thing, there's the earlier report that the 12" version is an OEM special for companies like Dell.

According to earlier reports, there'll also be a 9" XTX.

So, I'm a bit wary of inferring too much from today's L'Inq. At least not yet.

Jawed
 
What would be the advantage of a 12"-er to OEMs over a 9"-er? Would it be cheaper to make?
I can't find the previous discussion of this :oops: not sure where I saw it.

The upshot was supposedly that it's secured front and back within the case (I guess just like the original ISA slot :oops: ) which prolly helps when you're sending the entire system by courier. Also, I think it's because hot air is vented out of the case, so it runs cooler and is more reliable.

Jawed
 
What would be the advantage of a 12"-er to OEMs over a 9"-er? Would it be cheaper to make?
There would be no advantage to OEMs at all. But there might be an advantage to ATI - for example, that the board is cheaper to make. You can't sell a 12" long card at retail. If you did then countless people would buy it, realise too late that it didn't actually fit in their case, then demand their money back and go and buy an Nvidia card instead in a fit of pique. So it's not a case of "OEMs will benefit if we do a special 12 inch version" it's "we have a 12-inch version already, but we can't risk selling it except via OEMs".
 
Did you miss this juicy morsel?

I don't really believe that, though it might be some internal test sample from the early dev phase. That would be really ridiculous for the retail market, no one is THAT crazy to try to sell that.
 
  • Make the die bigger than originally planned. ;)
  • Redesign some blocks and make them more efficient. (A lot of work, of course)
  • Reduce some of the on-chip memory. Maybe in the first iteration you were too conservative?
  • Increase routing density. Unlikely, as this becomes harder when going to smaller process geometries.

I don't see how that would work? It's really exactly like a PCB that's full of components: going from a 12 layer board to a 10 layer board wouldn't do you much good.
Another quite viable option, which fits in with your first point, is to build in more redundancy - if you're shrinking to a new and not very mature 65nm process, that may be a very wise decision.



That's NV's claimed power consumption and not measured by xbit labs. Ironically NV also claims 143W for the GX2, yet according to xbit labs' measurement it's (supposedly) only 110W. What is what exactly?
Sorry for excavating this, but xbit stated in the original GX2 article that their 110W was also a guess based on the other GPU variants.

The XBit Labs test relates that NVidia's spec is ~145W, as compared with the 150W that a single PEG connector + mobo can supply, which is why I describe it as marginal.
145W is not Nvidia's spec. It is an arithmetic mean measured across a variety of applications, as shown at the editors' day back in October. They also said they had the card in their labs consuming about 180W - hence the dual six-pin connectors.
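For reference, here's the connector arithmetic behind all this - the 75W slot, 75W 6-pin and 150W 8-pin figures are the PCIe/PEG spec limits, and the 145W/180W card numbers are as quoted above:

```python
# Power budget sums: the slot/connector limits are the PCIe/PEG spec figures,
# the 145W and 180W card draws are as quoted above.
SLOT_W, PEG6_W, PEG8_W = 75, 75, 150

configs = {
    "slot + one 6-pin":  SLOT_W + PEG6_W,       # 150W - marginal for a ~145W card
    "slot + two 6-pin":  SLOT_W + 2 * PEG6_W,   # 225W - comfortable for ~180W
    "slot + one 8-pin":  SLOT_W + PEG8_W,       # 225W - same headroom, one socket
}
for name, watts in configs.items():
    print(f"{name}: {watts}W available")
```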
 