santyhammer
Newcomer
You really seem to have issues with understanding some basic concepts of electronics and physics.
For example, don't you realize that multiple CPUs on one board would:
1. Increase the cost of the board because of:
a) Much more expensive signal routing (probably more layers needed)
b) More elements required for decoupling of 4 sockets instead of 1 (solid or aluminium capacitors)
c) More power MOSFETs for powering 4 CPUs instead of 1
d) Bigger surface needed to accommodate all the elements
e) Nearly impractical synchronization of memory access
2. Increase the cost of the whole system because of:
a) A mainboard layout change needs a new case layout
b) You will need to install 4 coolers instead of 1 (also think of the heat and noise that go with the power required for the 1-cycle DOT you are so obsessed with)
c) RAM would have to be much faster (something like 10 times today's bandwidth)
3. Increase the complexity of software design. Not to mention that they would have to develop the tools for programming that monstrosity first. Don't count on free compilers there.
Furthermore, you are ignoring one simple fact in your requests for more powerful instructions -- the GPU is much better at random access to memory than the CPU. That is because it has multiple smaller texture caches, and the data layout in memory is optimized for certain types of access (data order is not linear AFAIK). On the CPU side you have RAM that is 10 times slower (8.5GB/sec or less versus ~85GB/sec on the 8800GTX) and one big cache, both delivering their greatest bandwidth in simple streaming (linear RAM access) operations. That is not something that can be changed overnight.
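To make the "data order is not linear" remark concrete, here is a minimal sketch of one well-known non-linear layout, Morton (Z-order) addressing. This is only an illustration: the actual layouts used by GPUs are vendor-specific and not described in this thread, and the function names below are made up for the example.

```python
def part1by1(n: int) -> int:
    """Spread the low 16 bits of n so a zero bit sits between each original bit."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton_index(x: int, y: int) -> int:
    """Interleave the bits of x and y (x in even bit positions, y in odd ones)."""
    return part1by1(x) | (part1by1(y) << 1)


def linear_index(x: int, y: int, width: int) -> int:
    """Plain row-major index, the usual layout for CPU-side arrays."""
    return y * width + x


if __name__ == "__main__":
    width = 1024  # hypothetical texture width, chosen only for the demo
    # Distance in memory between a texel and the one directly below it.
    print("row-major:", linear_index(0, 1, width) - linear_index(0, 0, width))
    print("morton:   ", morton_index(0, 1) - morton_index(0, 0))
```

In the row-major layout the texel directly below is a whole row away in memory, while in the Morton layout it stays within the same small block, which is the kind of 2D locality a texture cache benefits from.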
Yep, that was what I originally thought... then I saw this:
http://en.wikipedia.org/wiki/Torrenza
http://www.dailytech.com/article.aspx?newsid=2642
http://pc.watch.impress.co.jp/docs/2006/0713/kaigai287.htm
You can also use the upcoming HyperTransport HTX to plug in all these coprocessors or math cards like the ClearSpeed...
http://www.dailytech.com/article.aspx?newsid=1276
http://en.wikipedia.org/wiki/HyperTransport
http://www.hypertransport.org/docs/wp/HTX_wp_final.pdf
Notice this is good, because the motherboard only has 2-5 HT3.0 slots (not 900 ZIF sockets)... But you can plug in an auxiliary card with multiple multi-core CPUs and multiple sockets (again, like the ClearSpeed). You can put a PPU, GPU, raytracing hardware card or whatever there, and it talks directly with the main CPU in the same manner as a coprocessor.
Btw Intel is preparing the same thing under the code name "Geneseo", but it is a bit different because it uses PCI Express to plug in the coprocessor cards.
About the costs... One capacitor costs $0.10, the routing is cheap, and the layers, yes, increase the cost a bit... But all that is nothing compared with the $1200 of a quad-core CPU... I think what really increases the cost is the development team, not the manufacturing and packaging... And with these coprocessors the final Verilog/VHDL is much smaller, so the important part, the development cost, will be greatly reduced, because it is easier to develop a 30M-transistor CPU than an 800M one (and you can save silicon and make more CPUs per wafer).
However, I must admit all this could be a bit vaporware... because Torrenza/Geneseo are planned for 2008-2010 and might not be a success. We will see with time.
And about physics... I don't think integrating 900M transistors in a CPU will be good... In fact, the G80/R600 are going to be the last GPUs with super-high transistor counts... See this:
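As a rough sketch of the "more CPUs per wafer" point, the common dies-per-wafer approximation can be used to compare a small die against a large one. The wafer diameter and both die areas below are hypothetical values picked for illustration only; they are not figures for any real chip, and the estimate ignores yield and scribe lines.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate gross dies per wafer: wafer area over die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


if __name__ == "__main__":
    wafer_mm = 300.0   # assumed wafer diameter
    small_die = 20.0   # hypothetical area of a small coprocessor die, in mm^2
    large_die = 480.0  # hypothetical area of a very large die, in mm^2
    print("small dies per wafer:", dies_per_wafer(wafer_mm, small_die))
    print("large dies per wafer:", dies_per_wafer(wafer_mm, large_die))
```

The small die comes out ahead by more than the plain area ratio, because less of the wafer edge is wasted. Defect yield, which this sketch leaves out, generally favors small dies as well.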
http://www.short-media.com/extendednews.php?n=5417
http://www.pro-networks.org/forum/viewstory.php?t=85588
About coolers... Forget them... A smaller GPU like the VirgeDX does NOT require a cooler... The G80 needs a cooler because it uses 800M transistors... Future CPUs are going to use nanotubes inside the chip package as a cooling system:
http://news.zdnet.co.uk/emergingtech/0,1000000183,39147421,00.htm
http://www.hardwaresecrets.com/news/713
http://www.frostytech.com/permalink.cfm?NewsID=54604
http://www.xbitlabs.com/news/coolers/display/20040326082724.html
http://www.physorg.com/news12109.html
The new Quantum Well transistors based on indium antimonide (InSb) can help too:
http://business.pcauthority.com.au/print.aspx?CIID=59355
Btw, note that the ClearSpeed card I mentioned before can do 25 GFLOPS per CPU and it doesn't use a cooler (because it runs at 250 MHz and 0.80).
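A quick back-of-the-envelope check of what those two numbers imply, using only the 25 GFLOPS and 250 MHz figures quoted above (the function name is just for this sketch):

```python
def flops_per_cycle(flops_per_second: float, clock_hz: float) -> float:
    """Floating-point operations that must complete per clock to hit a given rate."""
    return flops_per_second / clock_hz


if __name__ == "__main__":
    # 25 GFLOPS at 250 MHz, both figures taken from the post above.
    print(flops_per_cycle(25e9, 250e6))
```

That works out to 100 floating-point operations per clock, so the throughput comes from doing a lot in parallel at a low clock rather than from clock speed, which fits the low-heat argument.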
About the design software... I think it is much better to debug a 3M-line Verilog CPU than a 90-core CPU...
Also I'm not sure how many cores we will be able to integrate into a CPU in the near future... but we will reach the limit, and then the coprocessor idea can be good. Debugging PCB routing for multiple sockets will be a bit painful, though I think it is simpler than a 900M-transistor CPU.
About GPU vs CPU, I think Uttar explained it very well... and I agree... GPUs will be faster for now (we need to wait and see, because 4 ClearSpeed/Cell cards using HT3.0 could wipe out everything... and we have to wait for CUDA and CTM performance too). But that's not really the question... The question is why we had to wait 10 years to get a decent basic DOT instruction when all the developers were requesting it (and by finally including it they admitted we were right... late, but better late than never), and the GeForce3/SuperH4 had it years ago (so it was technically possible).
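For anyone wondering what the fuss over a DOT instruction is about, here is a small sketch that models the two approaches in plain Python. It assumes the "before" case is one packed multiply followed by two rounds of horizontal adds, and the "after" case is a single dot-product operation; it only models the arithmetic steps, not real SIMD instructions or their timings.

```python
def dot4_multiply_then_add(a, b):
    """4-component dot product done the long way, as three separate steps."""
    # Step 1: one packed multiply of the four lanes.
    products = [a[i] * b[i] for i in range(4)]
    # Step 2: first horizontal add, pairing up neighbouring lanes.
    pairs = [products[0] + products[1], products[2] + products[3]]
    # Step 3: second horizontal add, collapsing to a single value.
    return pairs[0] + pairs[1]


def dot4(a, b):
    """The same result expressed as one operation, which is what a DOT instruction provides."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    normal = [1.0, 2.0, 3.0, 4.0]  # example vectors, arbitrary values
    light = [5.0, 6.0, 7.0, 8.0]
    print(dot4_multiply_then_add(normal, light), dot4(normal, light))
```

Both functions return the same value; the point is that the first needs three dependent steps where the second needs one, and this runs for every vertex and every lighting calculation.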
Hey btw, I found this the other day
http://www.vr-zone.com/?i=4415
It appears AMD is going to include SSE4A too... good news.