3dilettante, you really seem to know your stuff, but it seems you're coming from a point of view of looking for reasons that BD won't be able to compete. How about putting the glass-half-full cap on and thinking from that perspective: given what information hasn't been released yet, what do you think BD would have to do to equal or exceed SB performance?
The half-full is the multithreaded server situation, BD's primary target.
The reasons why BD could have problems competing in the client space are public. For whatever reason, AMD has decided clients need not know any details on why it will win against SB, much less a 22nm shrink to Ivy Bridge that will likely be the actual competitor.
Clocks: the most critical element in a design that targets higher clock speeds at the expense of execution width and latency, both of which are inferior to a competitor that will likely have been replaced by the time BD reaches the client space. The disclosed clocks for the competition are very good.
Die size: while not something users care about directly, it will go a long way in vindicating AMD's strategy. Core die size efficiency will help, though cores are but one component of the overall chip. It will be interesting to see how it compares to SB, and then in the client space to Ivy Bridge, which will be a full node ahead.
That aside, SB is ~225 mm². Zambezi, the only BD we're seeing at all in the non-server market in 2011, is very likely to be larger than Westmere at ~248 mm².
FP throughput: the FPU's issue capability is closer to that of a single core, and it has read capability no better than SB in the best case. The best case is 128-bit SSE in a Zambezi 8-core versus an SB 4-core. Hopefully the FMAC units can be split to offer separate ADD and MUL capability.
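To put rough numbers on that best case, here is a back-of-the-envelope sketch. The unit counts are my assumptions from the public disclosures, not anything stated above: a Zambezi "8-core" as 4 modules each sharing one FPU with two 128-bit FMAC pipes, and an SB 4-core with one 128-bit-capable ADD pipe and one MUL pipe per core. The function name is made up for illustration.

```python
# Peak 128-bit SSE FP operations issued per cycle, chip-wide.
# All unit counts below are assumptions for illustration only.

def peak_ops_per_cycle(units, pipes_per_unit):
    """Number of FP units (modules or cores) times the FP pipes in each."""
    return units * pipes_per_unit

# Assumed: 4 modules, each with one shared FPU holding two 128-bit FMAC pipes.
zambezi = peak_ops_per_cycle(units=4, pipes_per_unit=2)

# Assumed: 4 cores, each with one ADD pipe and one MUL pipe.
sandy_bridge = peak_ops_per_cycle(units=4, pipes_per_unit=2)

print(zambezi, sandy_bridge)
```

Under those assumptions the two come out even, and only if each FMAC pipe can take a plain ADD or MUL every cycle. Anything wider than 128 bits, or any clock deficit, tips it toward SB.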
Cache subsystem: The disclosed latencies put a ceiling on what can be done by the undisclosed parts of the L3 and uncore. The L2's capacity is the good part; its long latency is not. No matter how awesome the L3, its latency is added on top of the L2's. The tiny L1 Dcache is, on a per-cycle basis, measurably worse than what preceded it, and its fallback to the L2 is worse on that basis. It comes back to clocks.
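Why it comes back to clocks can be shown with a small sketch: a cache that costs more cycles only matches its predecessor in wall-clock time if the clock rises in proportion. The cycle counts and frequencies below are placeholders I picked to keep the arithmetic clean, not disclosed figures for BD or any other chip, and the function names are hypothetical.

```python
# Convert cache latency from cycles to nanoseconds, and find the clock
# needed for a higher-cycle-count design to break even.
# The numbers used here are placeholders, not real chip specifications.

def latency_ns(cycles, clock_ghz):
    """Wall-clock latency: cycles divided by cycles-per-nanosecond (GHz)."""
    return cycles / clock_ghz

def break_even_clock_ghz(old_cycles, old_clock_ghz, new_cycles):
    """Clock at which new_cycles takes as long as old_cycles did at old_clock_ghz."""
    return old_clock_ghz * new_cycles / old_cycles

old = latency_ns(cycles=3, clock_ghz=3.0)          # placeholder older design
needed = break_even_clock_ghz(3, 3.0, new_cycles=4)
new = latency_ns(cycles=4, clock_ghz=needed)       # same wall-clock latency

print(old, needed, new)
```

With these placeholder values, one extra cycle of load-to-use latency needs a third more clock speed just to stand still, which is the bar the design has to clear before the longer pipeline pays off.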
BD has been touted as being effective for server loads, with L3 and uncore optimizations for multisocket and high-bandwidth situations. The desktop market does not prioritize these.
John Fruehe has stressed several times that there is some stuff in the BD design that has not been disclosed yet, specifically designed around single-thread performance. Given what you know of the design so far, how about taking a guess at what it might be?
His primary domain is the server market, and there is a limited amount he can say about the client side.
The bulk of AMD's promises are that it will be better than its predecessor, not relative to its competition.
Fruehe has not promised that it will match or beat SB in client applications.
I was more bullish on BD in the desktop market before some of these details came to light: the confirmation of Zambezi being the best AMD can offer on desktop in 2011, the cramped FPU, the ridiculous AVX/FMA/WTF fiasco, continued process delays, the server focus, etc.
Then there was the disclosure of details on SB, which has very good features, clocks, and a 3-4 quarter head start in the client space.
I don't care much about the IGP, but the rest of the architecture is very solid and it has no need for "secret sauce improvements" to potentially impress me in the future.
I am allowed to be more down on BD with the appearance of more information.
SB looks better than I expected, and BD looks worse than I had expected.