ClawHammer: Everything Goes As Scheduled

So this Clawhammer discussion has turned into VIA bashing? Hmmm.

Anyways, yes, 64-bits of DDR PC-2700 memory should seriously hamper the performance of the clawhammer, particularly as it climbs in clock speed. However, it should have a significant latency advantage over the P4 which should be more important for many tasks. It is also reasonable to assume that the memory controller on the Clawhammer will evolve, or have further features enabled with time. It may also be that AMD will improve on-chip caches to help make the core less dependent on the memory bus.

No matter how I try though, I can't see a single 333 MHz 64-bit path as a wise move vs the 533 MHz 64 bit path of the P4. That's just too much of a difference, and in the wrong direction if the Hammer core is as efficient as it seems.
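For what it's worth, here is the back-of-the-envelope version of that comparison. It is only a sketch: it assumes both paths are 64 bits wide and treats the quoted 333 and 533 MHz figures as effective transfer rates, and the function name is made up for illustration. It gives theoretical peaks only and says nothing about latency.

```python
def peak_bandwidth_gb_s(transfers_per_sec: float, width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = width_bits / 8
    return transfers_per_sec * bytes_per_transfer / 1e9

# Assumption: both buses are 64 bits wide, rates are effective transfers/s.
ddr333 = peak_bandwidth_gb_s(333e6, 64)   # single-channel PC-2700 DDR
p4_fsb = peak_bandwidth_gb_s(533e6, 64)   # P4 533 MHz bus

print(f"DDR333, 64-bit:      {ddr333:.2f} GB/s")
print(f"P4 533 MHz, 64-bit:  {p4_fsb:.2f} GB/s")
print(f"P4 / Clawhammer:     {p4_fsb / ddr333:.2f}x")
```

That works out to roughly 2.7 GB/s against roughly 4.3 GB/s, so the P4 path has about 1.6 times the peak bandwidth.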


I _still_ haven't seen any authoritative statement regarding the Clawhammer pin-outs. The Clawhammer has 754 pins and the Sledgehammer has 940 pins (Athlon has 462). As far as I recall the pin requirements of HT, this difference doesn't cover the needs of 2 HT links and another 64-bit memory controller, implying that the Clawhammer may indeed have hardware support for two channels although this board only implements one. Or, of course, I may be deceived regarding the pin requirements of the Hammer HT links.

I'll try to get this clarified, but if there is someone here who KNOWS, please contribute.

Entropy
 
Some hammer pics (japanese site): http://www.watch.impress.co.jp/pc/docs/2002/0227/idf04.htm

Any translation? :smile:

Dual Clawhammer configuration is really possible.

Some Ace's news:
Details on AMD's Hammer Demo (AMD)
By Brian Neal
Wednesday, February 27, 2002 2:44 AM EST
We reported earlier that AMD had demonstrated a Hammer-based system yesterday, but at the time there was very little in the way of details regarding exactly what was shown. But now, a great deal more information has been revealed, including several pictures of the demoed systems.

Two Clawhammer systems were shown running various applications, with one system running Windows XP (along with Word and Excel) and the other running a 64-bit x86-64 port of Linux. Both systems were shown to be running various applications, and in the case of Linux, two versions of a simple X11 demo app ran simultaneously, one a 32-bit binary and the other a 64-bit binary. The CPUs are reportedly A0 level silicon and are no more than a month old. The demo systems were not run at full clockspeed, but supposedly they ran at least as fast as other 64-bit CPUs currently on the market (most likely a reference to Intel's Itanium).

A full overview of AMD's Hammer demo can be found here at Anandtech. On the second page of the overview, there are several photographs of the undersides of both the Clawhammer and Sledgehammer CPUs. Clawhammer is reported to have a 754-pin package, while Sledgehammer has a 940-pin package. According to the article, much of the difference is due to the additional HyperTransport channels and the memory interface:


By far the most interesting thing about the CPUs from a physical standpoint is their pincount. The ClawHammer has 754 pins (up from 462 on the Athlon and even up from 603 on the Xeon) and the Sledgehammer has a whopping 940 pins which is just over twice as many as the current generation Athlon.

The majority of the pin increase when going from the ClawHammer to the Sledgehammer is apparently due to the two additional Hyper Transport links and the dual channel 64-bit DDR memory controller vs. the single channel controller on the ClawHammer. Needless to say that manufacturing these things should be interesting.
Apparently, the systems ran on A0 chipset silicon (AMD-8000 chipset) and, as such, the AMD-8151 AGP interface was not functioning properly. The I/O hub worked well, however, so the demo systems ran with PCI video. The reference board itself is a 4-layer design with two DDR SDRAM sockets, one AGP slot, and four 32-bit/33 MHz PCI slots.

Additionally, PC Watch Japan has published an overview (in Japanese) featuring a number of pictures of the Clawhammer and Sledgehammer CPUs as well as the AMD "Solo" reference motherboard for Clawhammer, based on the AMD-8151 AGP graphics tunnel and the AMD-8111 I/O hub (click here for details on the AMD-8000 chipset). In particular, there are some excellent photos of the CPU packaging and sockets.

Thanks to Bluga and KH for the links.
 
Damn, pascal, you beat me to it :) I was gonna suggest we stop bashing VIA and focus on Clawhammer/Sledgehammer, per Anandtech's article.
Looks nice, i want a sledgehammer in my box!
We'll see how perf. pans out...
 
I agree.

I would like one too, with 1GByte ram (no more VM), an edram 3dcard, dual 17" plasma display, and dolby 5.1 ;)

But my wife will probably disagree with my little upgrade.

Did you like the small chip packaging?
It is cool :cool:
Could this packaging be used to develop a 256bit 3D chip?
 
Saem, I'm sorry, but your trolling about VIA not being stable is wrong. Dell *DOES* use VIA in one of its servers.

http://www.dell.com/us/en/bsd/products/model_nasto_2_nasto_715n.htm

If Dell can make it work, it ain't that bad. Intel made people lazy with the 440BX. ALi and SiS are not cheap POS chipset manufacturers, either. Inexpensive they may be, but it's entirely possible to make a good, solid inexpensive system using one of their chipsets. High performing? Maybe not, but obviously it suits the needs of enough people that ALi and SiS find the chipset business profitable. Quit being such a techno-snob.
 
OK, I'm replying to my own post, but this is as far as my inquiry on Aces' got me.

The first thing I realized after posting was that Anand's article is wrong. The Clawhammer does NOT have two HT links fewer than the Sledge, only one. It has an I/O plus an interprocessor HT link, enabling dual (but not higher) multiprocessing without further glue. I was thrown by the error, and didn't figure out the weirdness immediately. But there still is something odd going on, as (quoting hattig):
"The pin difference is 186, and that includes one more DDR bus (112 for data and address alone) and one more HT bus (103), which would normally use up 215 pins on their own.

So yes, pins must be shared between either the HT links, or the memory channel links."

And as far as I'm aware an actual implementation of a 64-bit DDR bus could actually take even more.
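Spelling out hattig's arithmetic, as a quick sketch. The per-bus pin counts are his figures from the quote, not official AMD numbers, and the variable names are just for illustration.

```python
# Package pin counts as reported from the IDF demo.
CLAWHAMMER_PINS = 754
SLEDGEHAMMER_PINS = 940

# hattig's estimates (assumptions, not official figures):
DDR_CHANNEL_PINS = 112   # one more DDR bus, data and address alone
HT_LINK_PINS = 103       # one more HT link

available = SLEDGEHAMMER_PINS - CLAWHAMMER_PINS   # extra pins on Sledgehammer
needed = DDR_CHANNEL_PINS + HT_LINK_PINS          # pins the extra buses would want
shortfall = needed - available

print(f"extra pins available: {available}")
print(f"extra pins needed:    {needed}")
print(f"shortfall:            {shortfall}")
```

So on these estimates the Sledgehammer package comes up 29 pins short of what one more DDR channel plus one more HT link would need, before counting any extra power and ground pins.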

This is the best model I've seen, though, to explain the pinout. I'm still not happy with the lack of official info.

If, as it seems, the Clawhammer is limited to a 64-bit DDR interface, that's a serious limitation. Wonder what the odds are for affordable Sledgehammer motherboards. :smile:

Entropy
 
Well, here's a question:
Because Clawhammer has an HT link for dual procs, couldn't a dual proc mobo utilize EACH Clawhammer's DDR controller to link to a separate 64-bit bus? So that it's a dual 64-bit bus? And then access data from the other bus via the interprocessor HT link?

Or is this implausibly complex, like i think it is?
 
BenM, I'm not trolling, so STFU with your personal attacks.

As for your absurd theory about high quality for low price, get a clue. It'd be understandable if the price differences were small, but they're not. The fact of the matter is VIA, SiS and ALi produce cheap lower quality chipsets because they don't do as much QA as Intel. That doesn't explain all of the cost difference, but it explains a good portion.

BTW, back to the troll comment: except for the one Dell bit, all you did was make personal attacks. If you are the yardstick, I'm far from a troll.

Oh, yes, here is a little stats lesson for you: a sample size of 1 is worthless. It's very reasonable to say that they are not getting past QA. As for your example, I suggest you scrutinize it and realize that the chipset is rather OLD; what's important is that it's likely been through many hardware/software revisions. Excessive revisioning is BAD!
 
Saem,


I believe what I said was that SiS, ALi, and VIA make chipsets cheap enough to suit people's needs. I think I did even mention that they are not always top performers.

And by the way, I didn't ask for a stats lesson. I think you have a tough time admitting that you are wrong in your blanket statements.
 
All right, now for the Hammer.

Any sort of latency advantage that the K8 holds over the P4 isn't important. The P4 and K8 are almost diametrically opposed in design philosophy, so comparing sub-systems seems (to me at least) pointless. The K8 needs a memory sub-system that keeps it fed, not necessarily well fed, but quickly fed; its clock cycles are worth more than, say, those of a P4, and thus it needs information NOW, not necessarily in large chunks. The P4, with its deeper pipeline and large instruction buffer, won't be as dependent on such a memory sub-system. As you can see, it's kinda hard to say there is any advantage.

I think this will be a time when a standardized set of benchmarks such as SPEC, but with far broader code, will be needed. This is very much an apples-to-oranges comparison; the only thing that will be relatable is performance. Also, many will have to put to rest the complaints that SPEC is a compiler benchmark or that SPECfp is far too sensitive to memory bandwidth. Compilers and memory bandwidth are all FAIR parts of the performance equation: when was the last time you ran a program that wasn't compiled and didn't access any memory at any point in its execution?

Back to the Hammer's "skinny" bus: I really don't think it will be a problem. I think the Hammer will be able to use many techniques to use bandwidth more efficiently, and perhaps even go so far as to implement some really kewl prefetching algorithms. It'd be interesting to learn whether they've done that or something better.

Althronin,

I'm not sure about your idea, I can think of one example as to why it would work and one reason as to why it wouldn't.

First the example: I believe PPCs have the ability to share their L2 cache among themselves when in SMP configurations.

The reason, if you have 4 processors, how do you figure out which one will have the information you're looking for in RAM, even if you can, could this lead to insane amounts of bus traffic?

Neat thought.
 
If I understand correctly, each Hammer CPU can link to its own memory. There is a crossbar inside the embedded northbridge which will reroute the memory requests from/to other CPUs.

There is some explanation inside this presentation.
 
That is correct. The Hammer system architecture is Non Uniform Memory Access (NUMA). Local memory will be faster than remote memory.
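A toy model of the local vs. remote distinction, for anyone who wants to see it spelled out. This is only a sketch under loud assumptions: each CPU is assumed to own one contiguous range of physical addresses (no interleaving), with 1 GiB per CPU, and the class and function names are invented for illustration. It is not a description of how AMD's northbridge actually maps addresses.

```python
from dataclasses import dataclass

@dataclass
class Node:
    cpu_id: int
    base: int   # first physical address behind this CPU's memory controller
    size: int   # bytes of DRAM attached to this CPU

def home_node(nodes, addr):
    """Return the node whose local DRAM holds addr (assumed contiguous ranges)."""
    for n in nodes:
        if n.base <= addr < n.base + n.size:
            return n
    raise ValueError(f"address {addr:#x} is not mapped")

def route(nodes, requester_id, addr):
    """Classify an access: 'local' hits the requester's own controller,
    'remote' has to cross the interprocessor HT link first."""
    return "local" if home_node(nodes, addr).cpu_id == requester_id else "remote"

GIB = 1 << 30
# Hypothetical 2-way Clawhammer box, 1 GiB behind each CPU.
nodes = [Node(cpu_id=0, base=0, size=GIB), Node(cpu_id=1, base=GIB, size=GIB)]

print(route(nodes, 0, 0x1000))        # CPU 0 touching its own memory
print(route(nodes, 0, GIB + 0x1000))  # CPU 0 touching CPU 1's memory
```

The first access is classified as local and the second as remote. In a real system the OS would try to keep a process's pages on the node it runs on, so that most accesses take the local path.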

Clawhammer will have 2 HT links, enabling 2-way systems. One link on each Clawhammer is used to link up to the other CPU, the other link is used for AGP and PCI-X. Look at pages 40 and 41 in the above presentation.

Sledgehammer will have 3 HT links for increased connectivity, trivially enabling 4 and 8 way systems.

The system architecture of these CPUs will enable cheap 2, 4 and 8 way systems (compared to today).

Cheers
Gubbi
 
On 2002-02-24 01:27, Dmitry wrote:
It will fail for several reasons:

1) Costs to produce such CPU must be high, and people do not appreciate paying premium price for AMD cpus.
The source of expense is the number of pins, hence the packaging.
2) No one is going to compile programs to take advantage of its proprietary 64bit instructions, look, even the intel 64 bit CPU is not doing well.
Who says AMD's will do worse? This chip will actually run 32 bit stuff at a decent speed.
3) DDR memory is pathetic.
I think not.
4) In order to upgrade to that CPU the poor XP owners will have to replace motherboard, cpu and probably RAM, that in case if AMD wakes up and starts using RDRAM. For intel owners it will be just the matter of replacing the chip.

Well Intel have woken up and dropped RDRAM, they'll have to replace all their kit? well of course they are gonna have to change the CPU and mobo, but why the RAM?


What about em?
 
Intel only dropped Rambus in the value platforms (for cost reasons) and in the server platforms (for scalability reasons).

The high end P4 workstations will continue to use Rambus.

On Hammer platform cost:

The Clawhammer demoed at IDF was on a 4 layer PCB, which is standard el cheapo stuff.

So one way and two way clawhammers will be dirt cheap (certainly cheaper to produce than traditional 2-way systems).

Also. The HT-to-AGP-tunnel/IO-hub will be so simple to do that even VIA can't fsck it up.

Cheers
Gubbi
 
More Hammer news from Aces:
SledgeHammer and x86-64 Details (AMD)
By Brian Neal
Sunday, March 10, 2002 1:19 AM EST
Thanks to NoSpammer for posting about this article (German) from c't regarding AMD's Hammer demonstration during the recent IDF. The article reports that aside from the two ClawHammer systems, a SledgeHammer system was also shown privately. As we have heard in the past, Andreas Stiller reports that SledgeHammer and ClawHammer are differentiated by a dual-channel and single-channel DDR SDRAM memory interface, respectively. To this end, Mr. Stiller assumes that at least this initial version of SledgeHammer is a single-core processor.

Additionally, first tests indicate that binaries compiled for x86-64 64-bit targets exhibit a 15% increase in performance for a 5% increase in code size. As you know, the x86-64 specification features a larger number of architectural registers: 16 as opposed to 8.
See this German link: http://www.ix.de/ct/02/06/070/
 
This is interesting. It seems that x86-64 is a considerably smaller boost than I expected; I was hoping for nothing short of 20%. Then again, they don't say what compiler they used (most likely gcc). I suppose over the next few months there will be improvements to it and an extra 5% will be attainable.
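A quick sanity check on what getting from the reported 15% to a hoped-for 20% would take, as a sketch. It assumes both percentages are measured against the same 32-bit baseline, which the report doesn't spell out; the variable names are just for illustration.

```python
reported = 1.15   # x86-64 binary vs. 32-bit binary, per the reported first tests
hoped = 1.20      # the 20% hoped for

# Relative gain still needed on top of the reported figure
# (assumption: both numbers use the same 32-bit baseline).
extra = hoped / reported - 1
print(f"additional gain needed: {extra:.1%}")
```

That comes to a little over 4% on top of current 64-bit code, slightly less than the five percentage points suggest.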

BTW, I learned something interesting: Hammer is built with 9 metal layers as opposed to the 6 metal layers of the K7 and P4; it seems the rather small die size of the Hammer is due to this. But I'm not surprised AMD went this route: they only have one advanced fab (Dresden) which will be able to produce Hammer initially, so die size is one of the most important factors.
 
AMD has 1 year to optimize the compilers before the Hammer mass market launch. I hope they will do 20% too :)
 