Will Microsoft trump CELL by using Proximity Communication in XB720 CPU?

Yeah, that's no point at all. Even the 8-year-old PS2 runs Havok in games! It just depends on the number of objects that are involved.

I myself consider Motorstorm the current benchmark for in-game physics and effects. And I've yet to see any game on 360 or PC that comes even close to the combination of computations shown in that game: 14 highly detailed cars in-game, loads of wrecks and buildings that you (and every one of those 14 cars) can drive through and crash into with correct physics, plus real-time mud with loads of muddy shader effects on it, all at a perfect framerate including AF and very good textures. And this is only a launch game!

I don't really understand. You say "that's no point at all" and then reply with what amounts to an "I really love the PS3, me, I do" post that doesn't even begin to address the point I was making: why HL2 isn't much use as an example of how much processing power you need to do physics in games, and why you'll never be able to pin down in a meaningful way what the "games experience" benefits of Cell (or anything else) are.

It's not about being able to run Havoc on a PS2, it's about the use of physics in games and the way it's used and perceived.

I'm sure you do consider Motorstorm the current benchmark for in-game physics and effects, but that's probably based on your love of the PS3 and not on hard evidence about actual calculations processed - which kind of supports my point about how little "real" benchmarks actually mean and matter in terms of the games that influence the public.
 
Thanks again, Gubbi, for your explanation about SMT ;)

That would have had a lot of problems (which you spotted).

Anyway, these cores are 3-wide, so they can issue three instructions per cycle, while an XCPU core can only issue two (am I right?).
So Xenon peaks at 6 instructions per cycle, while the hypothetical quad core I'm speaking about would be able to issue 12 instructions per cycle.
So a 1.6 GHz quad could issue as many instructions per second as Xenon at 3.2 GHz.
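The arithmetic above can be sketched like this (a back-of-envelope peak only; the core counts, issue widths and clocks are the figures assumed in this thread, and real sustained throughput would be far below these ceilings):

```python
# Back-of-envelope peak instruction throughput, using the thread's
# assumed figures: Xenon = 3 cores x 2-issue @ 3.2 GHz,
# hypothetical quad = 4 cores x 3-issue @ 1.6 GHz.
def peak_instr_per_sec(cores, issue_width, clock_ghz):
    # Theoretical maximum: every core issues its full width every cycle.
    return cores * issue_width * clock_ghz * 1e9

xenon = peak_instr_per_sec(3, 2, 3.2)  # 6 instructions/cycle at 3.2 GHz
quad = peak_instr_per_sec(4, 3, 1.6)   # 12 instructions/cycle at 1.6 GHz
print(xenon == quad)  # the two peak rates match
```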

And if I'm right, OoO is a better way to keep execution units busy than SMT (in most cases).
Gubbi, you said that to implement SMT properly you need a lot more registers and more complex caches; how does an OoO implementation compare in terms of transistor count or die size?


Deusp, these chips are supposed to come in '07, but aside from the fact that the designers are aiming at 65 nm, I can't see why these chips couldn't have been made at 90 nm for fall 2005.

With the money MS invested in Xenon, I feel these chips could have been designed in time.
By 2005 a three-core 3.2 GHz CPU was quite impressive, even given the "in-orderness" of Xenon (my English sucks, but I can't help it, I have to use neologisms... lol).

As far as execution is concerned, I think MS didn't aim at the right goals.
Or was probably aiming too high: high clock speed, multi-core, high peak Gflops, etc.
The more I think about it, the more I feel MS could have had, for the same transistor budget (or die size) as Xenon, a 1.6 GHz quad core with 1 or 2 MB of L2 cache, or maybe three cores at a higher clock speed.

How long did the design of these power-efficient PPC cores take?


I really wonder how PR and marketing choices interfered with the Xenon design.

Could we suppose that these cores could have been done in the time frame MS was aiming for?

What would the web's (and Sony PR's) reaction have been if MS had come to E3 2005 with a 4- or 3-core CPU running at 1.6/2 GHz, while Sony at the same time announced a CPU running at 3.2 GHz with 9 cores?
Sadly, I guess the "Xbox 1.5" argument would have hurt MS even more... (even if this CPU would DESTROY Xenon...).
Clueless fans would have had even more to bite into.

Deusp, you said that one of its cores is able to output 32 Gflop/s (SP); at which frequency? (Just to back up my theory (shared by a lot of people, it seems) about MS chasing PR numbers...)
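For what it's worth, here is how those headline GFLOP/s figures are usually assembled (a sketch; the clock and SIMD-width combinations below are illustrative, not Deusp's actual figures):

```python
# Peak SP GFLOP/s as typically quoted in PR numbers:
# clock (GHz) x SIMD lanes x ops per lane per cycle (2 with fused multiply-add).
def peak_sp_gflops(clock_ghz, simd_lanes, ops_per_lane_per_cycle=2):
    return clock_ghz * simd_lanes * ops_per_lane_per_cycle

# The same 32 GFLOP/s headline can come from very different frequencies:
print(peak_sp_gflops(4.0, 4))  # 4 GHz core with 4-wide FMA SIMD
print(peak_sp_gflops(2.0, 8))  # 2 GHz core with 8-wide FMA SIMD
```

Which is exactly why the "32 Gflop/s per core" number alone doesn't pin down the frequency.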
 
I really wonder how PR and marketing choices interfered with the Xenon design.

Probably none.

I'm not sure where these conspiracy theories of 'design for PR' come from; MS was trying to achieve a high-Flop design - contrary to their modern position Flops were a concern, and MS was in heavy pursuit - XeCPU was their result (and remember not the one they originally hoped for). It obviously is not the ideal, but at the same time it is the final result of a design path they consciously went down for performance/licensing/cost purposes... and not for PR. You work the PR around your engineering, and not the other way around - anything else is designing for failure.
 
http://www.washingtonpost.com/wp-dyn/content/article/2005/05/14/AR2005051400202.html

Seamus Blackley said:
"Once you get to a point where you can draw anything you want -- and we're about at that point -- it just becomes a competition for who can draw the best stuff" in a game, said Blackley, now an agent at Creative Artists Agency of Beverly Hills. "Nobody needs a teraflop in their living room."

I don't agree, but I remember that being an interesting statement at the time, in 2005.
 
I agree with Joker (or he agrees with me as I have said it before :p ) but I don't see why MS couldn't go with a Cell design. The focus being on "a" Cell design. Cell is pretty flexible--nothing is really stopping IBM from changing the PPE, having a mix of SIMD and Scalar SPEs, or going with a variant PPE/SPE balance (e.g. the "Cell 2" slides from IBM indicate a 1:16 PPE-to-SPE balance). You can think of it like Cell versus Xenon: Cell has 1 Waternoose CPU and 7 SIMD SPEs while Xenon has 3 Waternoose CPUs with enhanced VMX units. So there are similarities and differences. I don't see why Sony couldn't go a route with 4 "PPE version2" and 64 SPEs and MS take a route with 8 larger CPUs and 32 SPEs.

MS could benefit from the mature SPE libraries and developer experience. The SPEs would allow some commonality in the console space -- which benefits both MS and Sony -- but with a design that also leans heavily toward traditional CPUs (OoO?), MS keeps their "land bridge" to the PC intact. PC-centric titles could focus on the 8 traditional cores and leverage some of the more superficial libraries and multiplatform middleware.

Cell has some clean solutions for parallelism that can handle some of the more daunting issues facing parallel designs now and in the future, especially communication and memory management. Cell has some excellent performance density which will be difficult to match with other designs -- a traditional CPU design would almost surely need to be larger and a new design will meet with similar difficulties developers are having with Cell (although some of that is parallelism related and how fully utilizing Cell requires exploiting parallelism to a larger degree due to it having more CPUs).

MS will at least consider a Cell design. You can bet on it for this reason alone: they can use it in negotiations with AMD and Intel.

But there are a lot of wildcards right now that must be explored. I personally would get some kicks out of Sony going a route with a strong emphasis on Cell/CPU with a NV GPU and MS going with a strong emphasis on the GPU. If GPUs continue to evolve and tackle more tasks traditionally done on the CPU it may be in MS's best interest for PC<>Xbox3 synergy to leverage more robust GPUs that do a lot of the heavy lifting.

As for BluRay, by the time 2011 rolls around there won't be any significant reason not to use BluRay. It will be affordable with robust media support and will be even faster than current models as well as possibly support even larger future BluRay formats. Of course I think MS will be angling for online connectivity and large physical storage, but unless we see significant enhancements to last mile technologies and market penetration, as well as thoroughput, and the emergence of cheap solid state technology (either at launch or to quickly replace the traditional HDD within a couple years of launch) I don't think MS could go with an online-only distribution model. So BluRay very well could be in the cards.

Another area worth considering is the memory designs of the new consoles. Those will be just as significant as the CPU or GPU designs, if not more so. I think we will see eDRAM on the GPUs (but large enough to NOT require tiling at 1080p). But the system/video memory will be interesting.

Of course we may see something very uhmmm crazy, or we may see MS go the route of more managed code. What direction MS wants to go in the market and the demands of developers will be at the forefront. And absolutely, Live will be a central aspect of the design. Interface and services will be much more significant factors going forward for all the companies as they expand their market goals and audience.
 
It could make sense but it would be hard to differentiate the systems from a marketing point of view.

Moreover, MS is a software company; a Cell design puts a lot of burden on the devs and the software (memory management comes to mind).

After reading this http://www.beyond3d.com/content/articles/31/
it seems that Intel doesn't want to put too much burden on software and devs
(i.e. it provides a coherent and fast L2 cache and a consistent ISA across CPU/CGPU).

Interestingly, Intel speaks of a ~10 mm² die size for the cores of its upcoming GPGPU/CGPU architecture; that's almost the same size as the power-efficient PPC CPUs we were speaking of earlier in this thread.

I could see MS going with a more Fusion-like design, with some (4) strong OoO cores
and some CGPU cores.

But the real question for me is: where will GPGPU/CGPU be by then?
What will be the better match for the workloads next gen will be aiming at?

I'm not very knowledgeable, so I will try to explain my concerns on this matter as clearly as I can, so that more knowledgeable members can give me more clues ;)

So, in terms of computational resources (for a system aiming at gaming/multimedia functionality), we have:

1) Strong OoO superscalar cores: cores with strong single-thread performance.

2) Throughput cores:
* IBM SPEs, Intel Larrabee, or whatever AMD will throw in; I will call these CGPU (in reference to the discussion in the article I linked).

* Shader cores that become more and more flexible; I will call them GPGPU.

3) Specialized units:
texturing units, ROPs, etc. I could add networking acceleration, decompression, etc.

What die size will MS or Sony use for their next systems?
Will next gen be burnt at 45 nm? 32 nm?
Either way, there will be less room for cost reduction via shrinking; shrinking will become more and more expensive, and very few manufacturers will be able to burn these chips.

So for me, both the CPU and the GPU (if the distinction still makes sense by then) will be thinner than today's Xenon/Xenos and Cell/RSX.


For 1, I read that there is no need for more than 4 strong "conventional" cores.
For 3, they will be needed no matter what.

The real problem is 2.

.....How well will a bunch of mini-CPU/vector monsters perform at graphics jobs compared to a pure GPU?

.....How well will a bunch of shader cores perform at general-purpose calculation (physics, AI, etc.) compared to what I'm calling a CGPU?

Depending on the gap in efficiency and the die size allowed for the whole system, does having both make sense?
Some jobs could be done either on the "GPU" or on the "CPU".
This could add some burden on the software side of things.

So my question is, given the die size/power dissipation envelope and the gap between GPU and CGPU:

will it make sense for a next-gen system to have two vector-monster chips?

I.e. for Sony (which is likely to use a Cell-like design), will it make sense to have strong shader cores if they already have a bunch of SPEs?
 
I want to add that I don't think a single chip could offer the proper performance for next-gen.

It's more that, instead of a Cell-like CPU with a proper GPU, we could have two Fusion-like chips (I mean a mix of standard and throughput cores, whatever the implementation), with a third chip taking care of texturing, ROP operations, and AA.

If Intel and the like provide a unified ISA for the "standard" CPU and the bunch of mini-cores/SIMD monsters, devs would have access to a very flexible design!

But I feel the bus architecture would be tricky; both CPUs will need a fair amount of bandwidth, and the graphics "fixed functions" too.

I could see (dreaming, as I don't understand all the implications) this third chip including the memory controller for the unified RAM pool and some huge eDRAM L3 cache to help cooperation between the two CPUs.
 
I want to add that I don't think a single chip could offer the proper performance for next-gen.

Since cost is one of the primary drivers of console design, I'm pretty sure we will see at most 2 big ICs, like the current gen. Especially since in 5 years' time we'll be at 32 nm, beyond which further shrinkage seems very hard. In other words, the current consoles have ~400 mm² of high-performance silicon in them right now because MS and Sony will shrink those chips over the lifespan of the current generation of consoles to lower cost. This will be very hard to achieve next gen.

As for CPU architecture: I'm pretty confident that MS will continue down the road with general-purpose cores, hopefully higher-performance ones than the current gen. It shouldn't be unfeasible to have 16 OoO cores on a die in 2011, plus a large chunk of cache. They might add a few auxiliary long-vector cores à la Intel's recent presentation; if you're doing SoA computing anyway, why limit yourself to 4-way SIMD?
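For readers unfamiliar with the SoA point: structure-of-arrays stores each component in its own contiguous array instead of an array of (x, y, z) records, so a vector unit of any width can stream over it. A minimal sketch in plain Python (real code would of course use SIMD intrinsics; the particle data here is made up for illustration):

```python
# The same particle positions in both layouts.
particles_aos = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]  # AoS: array of structs

# SoA form: one lane-friendly contiguous array per component.
xs = [p[0] for p in particles_aos]
ys = [p[1] for p in particles_aos]
zs = [p[2] for p in particles_aos]

# "Add dx to every x" becomes one long streaming vector operation, and
# nothing about it cares whether the hardware is 4-wide or 64-wide.
dx = 0.5
xs = [x + dx for x in xs]
print(xs)  # [1.5, 4.5]
```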

Cheers
 
I'm sure others saw the news, but for those that may not have, IBM's chip-stacking technology might be available around the time of the next consoles..? Could be worth considering when dreaming up system designs..

At a glance, the main benefits might be memory bandwidth/speed and for system size/power requirements.
 
The initial targets seem to be for non-CPU products, mostly as a cost-saving measure.

Perhaps there could be some kind of underside stacked memory under the die, but probably not anything above the CPU die, and not another CPU or GPU.

We'd need something more exotic for cooling than we have right now.
 
Since cost is one of the primary drivers of console design, I'm pretty sure we will see at most 2 big ICs, like the current gen. Especially since in 5 years' time we'll be at 32 nm, beyond which further shrinkage seems very hard. In other words, the current consoles have ~400 mm² of high-performance silicon in them right now because MS and Sony will shrink those chips over the lifespan of the current generation of consoles to lower cost. This will be very hard to achieve next gen.

As for CPU architecture: I'm pretty confident that MS will continue down the road with general-purpose cores, hopefully higher-performance ones than the current gen. It shouldn't be unfeasible to have 16 OoO cores on a die in 2011, plus a large chunk of cache. They might add a few auxiliary long-vector cores à la Intel's recent presentation; if you're doing SoA computing anyway, why limit yourself to 4-way SIMD?

Cheers

In Intel's presentation, Larrabee is around ~140 mm², which seems to be the same as a two-core "standard" CPU; I guess it's at 65 nm.
One core seems to be ~10 mm² at 65 nm.
The links you posted earlier state that we could also have ~10 mm² PPC cores at 65 nm that are good at running painfully serial and unpredictable code; the AltiVec units could be cut, making them even tinier, or some SMT could be implemented to reach even higher IPC.

So it could be really interesting to have a chip that mixes these two kinds of mini-cores.
Especially if Intel is proved right, i.e. these mini-cores do a good enough job at graphics tasks.


At 65 nm we could have 2 "traditional" mini-cores + 8 mini-core/SIMD monsters with a die size of 140 mm² (or any other combination, 3/7 or higher with a bigger transistor budget), i.e. the die size of a Core 2 Duo.
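The trade-off above can be penciled out like this (all inputs are the thread's assumed figures: ~10 mm² per mini-core at 65 nm, ~140 mm² total; the 3x ratio for a "big" OoO core is purely my guess to make the 2+8 split work out):

```python
# Rough die-area budget arithmetic with the figures quoted above.
DIE_BUDGET_MM2 = 140   # roughly a Core 2 Duo at 65 nm
MINI_CORE_MM2 = 10     # per the links discussed earlier
BIG_CORE_MM2 = 3 * MINI_CORE_MM2  # assumption: a big OoO core ~3x a mini-core

def mini_cores_left(big_cores):
    # Area left for mini-cores after the big OoO cores are placed.
    return (DIE_BUDGET_MM2 - big_cores * BIG_CORE_MM2) // MINI_CORE_MM2

print(mini_cores_left(2))  # 2 big cores -> 8 mini-cores fit
print(mini_cores_left(3))  # 3 big cores -> 5 mini-cores fit
```

So under these guesses, going from 2 big cores to 3 costs you three mini-cores of area.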

I know that you and 3dilettante are always telling less knowledgeable members (me and some others come to mind ;) ) never to underestimate the serial part of things (if I understand properly, you're always limited by the slowest part of the code).
But a lot of people here don't see the need for more than 4-8 traditional cores (usually 4 seems the likely number in discussions on this board).

But in my understanding there are still some strong OoO cores in this design that would run the serial, branchy, etc. parts of the code.

OK, the CPU would be asymmetrical, but since part of it would be meant to run graphics tasks, it wouldn't be more asymmetrical than the current CPU+GPU combo.
It's just that the GPU part would become more flexible.

There is still a big "if": Intel has to prove that this design can match, or more realistically come close enough to, "traditional" GPUs in performance ;)

Even more so if a manufacturer is able to bring a unified ISA.

For coders it would become even less asymmetrical than the current CPU/GPU/PPU (lol) combo.
I.e., just as a unified shader architecture allows better efficiency in ALU usage, in upcoming standard designs some calculations could be done either on the GPU or on the SIMD units of the CPU (with varying degrees of success). Devs would have to think about which kinds of resources are available, and about the trade-off of running a given part of the code on the CPU or the GPU (e.g. the CPU is the bottleneck, so I can use the GPU for some non-graphics jobs, or the other way around).

Here, all these calculations are done on the SIMD mini-cores, and depending on the workload you can allocate resources freely to whatever you want in a pretty easy way.

I agree with you that next gen is likely to use only two big ICs.
But some more cost could be saved by using two identical chips, first during R&D and then during production, thanks to larger economies of scale.

As for the third chip, I don't think it would be huge; look at Xenos's smart eDRAM.
The logic inside takes around 20 million transistors (die size?).
How big would a chip be that includes, on top of that, the memory controller, some L3 cache, and texturing units? I guess that at 90 nm it could almost fit in the Xenos daughter die's size.
So if I'm right, I'm not speaking of a huge third chip.

I don't even think the two CPUs would be huge compared to a 32-SPE/4-PPE Cell or a 16-core Nehalem plus a matching GPU.

Disclaimer: I'm not trying to argue with you, just trying to explain my point more clearly ;)
 
...MS going with a strong emphasis on the GPU. If GPUs continue to evolve and tackle more tasks traditionally done on the CPU it may be in MS's best interest for PC<>Xbox3 synergy to leverage more robust GPUs that do a lot of the heavy lifting.
If GPUs turn fully GPGPU, basically SIMD engines rather than GPUs, MS could go with a simpler CPU and two GPUs: one for graphics and the other for flexible tasks. AMD could provide a custom architecture to support that too, with a CPU and GPUs on a console-targeted bus or whatnot. Could be a very sweet design.
 
The initial targets seem to be for non-CPU products, mostly as a cost-saving measure.

Perhaps there could be some kind of underside stacked memory under the die, but probably not anything above the CPU die, and not another CPU or GPU.

We'd need something more exotic for cooling than we have right now.


These are all true points. They've said they plan to have memory-on-chip in use in servers, supercomputers and "other machines" by 2009, so a couple of years later in a console should be feasible if that's the case. Chip-on-chip..who knows.
 
I've edited my post because I was wrong about the process node.
The GPU described in Intel's presentation is likely to take 140 mm² at 65 nm,
i.e. the same size as a Cell or a Core 2 Duo at 65 nm.

It would be interesting if someone knows how much die area the shader cores take in a G80 (because as a whole the G80 is a bigger chip).
 