The performance numbers seem curious, to say the least.
It's no surprise that a successor to K8 would do well in memory bandwidth benchmarks, though I don't recall whether 3dmark is bandwidth-limited.
Not knowing how exactly they overclocked the chip, we don't know how heavily they overclocked the memory bus, and we don't know how changing the on-chip clocks affects the latency of the L3. It is possible from these unknowns to get a superlinear performance increase, if the L3 or memory timings were off in the lower-clocked results.
It shouldn't matter too much, as even a single Conroe die has about as much cache (all of it faster) as Agena. 3dmark must fit well within the constrained caches of K10, if this is the case.
The score seems too high, regardless.
30k would give a 10% advantage to Agena, and that would come from a chip clocked a third slower than the one behind the top Intel score.
Couple that with the modest graphics overclocks (possibly) weighing the 3dmark score more to the CPU, and it would look like an IPC lead of 30-40%.
That would of course mean that K10 would have to have an IPC improvement over K8 of >50%.
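For anyone who wants to poke at the arithmetic, here is a rough sketch. It assumes the score scales linearly with CPU throughput and that throughput is just IPC times clock, which is a big simplification; the function name and the inputs are mine, taken from the rough figures above rather than from any published result.

```python
def per_clock_ratio(score_ratio: float, clock_ratio: float) -> float:
    """Implied per-clock (IPC) ratio of chip A over chip B.

    score_ratio: A's score divided by B's score (1.10 for a 10% lead).
    clock_ratio: A's clock divided by B's clock.
    Assumes score is proportional to IPC * clock (hypothetical linear model).
    """
    if score_ratio <= 0 or clock_ratio <= 0:
        raise ValueError("ratios must be positive")
    return score_ratio / clock_ratio


# "A third slower" read as Agena at two-thirds of the Intel clock.
two_thirds_reading = per_clock_ratio(1.10, 2 / 3)

# "A third slower" read as the Intel chip clocked a third higher.
third_higher_reading = per_clock_ratio(1.10, 1 / (1 + 1 / 3))

print(f"{two_thirds_reading:.2f} {third_higher_reading:.2f}")
```

The result swings a lot with how the clock gap is read, and it ignores the GPU side of the score entirely, so treat it as a sanity check and nothing more.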
Something seems wrong with this.
None of AMD's literature or highest projections has an IPC improvement of that magnitude, and nothing disclosed thus far about the design should give anything near that improvement.
If it did, AMD would not be pricing Barcelona so close to parity with Core2 server variants.
More likely, Intel's FSB and MCM strategy hits a wall or the OCed example has a bottleneck somewhere past 3 GHz.
It's possible that there is some kind of cache thrashing over the comparatively slower FSB between the two separate dies, because there is no shortfall in execution resources or scheduling ability.
If the succession of scores leading up to the Intel one shows good scaling, then I'm going to believe there is something wrong with the unsubstantiated 3dmark score before I'm going to give credence to an IPC improvement that has not been substantiated anywhere.
edit:
My numbers are likely conservative. Since 3dmark weights the GPU side of the score more heavily, any change in the contribution of the GPU side of the score would be worth twice that of the CPU contribution.
If Agena did make up all that ground with the Intel system, it would have to at least double the IPC over K8.
Why the CPU score was not given confuses me.
edit edit:
Actually, it's worse than that. It's 2 shader tests + .5*CPU to get the final score.
Folks around the web trying to do the math keep hitting impossibly high scores needed to break 30k with the GPU setup the system supposedly had.
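The kind of back-of-the-envelope math involved looks roughly like this. It takes the weighting above literally as total = shader test 1 + shader test 2 + 0.5 * CPU, which is my simplified reading and not the benchmark's published formula. The function name is made up, and the sub-scores in the example are placeholders, not measurements from either system.

```python
def required_cpu_score(target_total: float,
                       shader1: float,
                       shader2: float,
                       cpu_weight: float = 0.5) -> float:
    """CPU sub-score needed to reach target_total.

    Assumed model (simplified, not the official formula):
        total = shader1 + shader2 + cpu_weight * cpu
    """
    if cpu_weight <= 0:
        raise ValueError("cpu_weight must be positive")
    return (target_total - shader1 - shader2) / cpu_weight


# Placeholder sub-scores, purely illustrative.
needed = required_cpu_score(30_000, shader1=12_000, shader2=13_000)
print(needed)
```

Because the CPU term only counts at half weight in this model, every point the shader tests fall short of the target has to be made up by two points of CPU score, which is how the required numbers get out of hand so quickly.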
If the CPU were that competitive, it could launch at 1 GHz and be a kick-ass chip.