Vince-
What's the difference between a VU and a VS? Explain to me how a VU is 'general computing'. What's the fundamental difference between a VU and the NV3x's new TCL front-end?
Explain to me why a 'general processor' like the EE or SH-4 can outpace the 'hardwired' solutions that you speak of that nVidia produced at the same time. Why does the EE utterly destroy the NV1x's TCL front-end?
ANSWER: Because it's throwing more transistors at the problem. You [programmable] can maintain parity with a hardwired solution if you devote more resources [read logic] to the problem. This is simple. This is my point. Yet you fight it, over and over.
I guess to sum up my end of countering what you are saying, I see the GeForce1 as a better product in terms of graphics than the PS2 (obviously the PS2 has the enormous benefit of being a fixed hardware platform, which gives it a huge edge in real world situations). You keep wanting to look at narrowly defined areas and say that a CPU can compete in those particular areas. Let's throw eight million polys per second with trilinear, anisotropic, Dot3 and CubeMaps at the PS2 and see how it holds up. Carmack has stated that Doom3 was built around what was made possible with the GeForce1. Giants, when it was ported over to the PS2, had to have downgraded graphics versus the DX7 build, which ran very nicely on a GeForce1 at console resolutions. Obviously the game wasn't designed from the ground up around the PS2, else it would have eliminated many of the image enhancing features that the dedicated hardware offered and instead relied on increased poly counts.
The GeForce1 has advantages in certain areas despite the, according to you, 3x increase in complexity. Grant the programmable approach three times the transistor count through fabrication advancements and it still has shortcomings. Now why would I ever think that dedicated hardware would be better?
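Just to put that eight-million-polys-per-second workload into per-frame numbers - the 60fps target and the average triangle coverage below are my own illustrative assumptions, nothing either of us has measured:

```python
# Back-of-envelope for an "8M polys/sec with trilinear + aniso + Dot3 + cubemaps" load.
# Assumed figures (illustration only): 60fps target, ~20 visible pixels per triangle.

polys_per_sec = 8_000_000
fps = 60
pixels_per_tri = 20                                     # assumed average coverage

tris_per_frame = polys_per_sec / fps                    # ~133k triangles every frame
shaded_pixels_per_sec = polys_per_sec * pixels_per_tri  # ~160M textured pixels/sec

print(f"{tris_per_frame:,.0f} triangles per frame at {fps}fps")
print(f"~{shaded_pixels_per_sec / 1e6:.0f}M trilinear/Dot3/cubemap pixels per second")
```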
Thank you Lord!! Look at OGL and DX10+. The merger of the PS and VS is coming; architectures like the P10/9 are going to be the future.
And this means what to you? Let's see how a CPU enjoys trying to compute the logic for early Z to reduce the amount of overdraw, loop back instructions that exceed the processor's ability to handle in a single 'pass', and then deal with a BPU miscalculation. SGI has viz machines with hundreds of MIPS processors already pushing out TFLOPS, and yet they rely on cutting edge dedicated hardware for their rasterization. Of course, they don't know nearly as much about 3D as Sony, right?
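For anyone wondering what that early-Z point looks like when a general CPU has to do it in software, here's a crude illustrative sketch (the function and its inputs are made up for the example) - every covered pixel becomes a data-dependent branch, exactly the kind of thing the BPU can't reliably predict, while a rasterizer does the same compare in dedicated logic before any shading work is issued:

```python
# Illustrative software inner loop: per-pixel early-Z rejection (names made up).
# Dedicated hardware does this compare in fixed-function logic ahead of shading;
# in software it's a data-dependent branch taken or not taken per pixel, which
# the branch predictor can't reliably guess.

def shade_span(y, x0, x1, z_start, dz_dx, depth_buffer, shade_pixel):
    z = z_start
    for x in range(x0, x1):
        if z < depth_buffer[y][x]:        # early-Z test: unpredictable per-pixel branch
            depth_buffer[y][x] = z
            shade_pixel(x, y)             # expensive texturing/lighting only if visible
        z += dz_dx                        # interpolate depth across the span
```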
Yep, what I'm advocating is that through advanced lithography, you can increase the programmability of an architecture by a large amount while still maintaining performance parity with a comparable hardwired design. But, in order to do this, you must 'beat' Moore's Law - and thus to equal a hardwired design while maintaining flexibility - you must increase the usable transistor counts.
In rasterization terms the EE was not competitive. First they must significantly exceed Moore's Law just to catch rasterizers; they have a very long way to go before they can think about beating them.
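To put a rough number on what 'beating Moore's Law' costs: if programmability really carries something like the 3x transistor penalty mentioned earlier, here's the process lead that works out to (the 18-24 month doubling period is the usual rule of thumb, not a measured figure):

```python
import math

# Rough cost of a 3x "programmability" transistor penalty, expressed as Moore's Law
# lead time. Assumptions: the ~3x figure from this thread, and a transistor budget
# that doubles every 18-24 months (rule of thumb, not a measurement).

penalty = 3.0
doublings_needed = math.log2(penalty)                  # ~1.58 doublings

for months_per_doubling in (18, 24):
    lead = doublings_needed * months_per_doubling
    print(f"@ {months_per_doubling} months/doubling: ~{lead:.0f} months of process "
          f"lead just to reach parity")
```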
This can be done through: (a) More Advanced Lithography, (b) Multichip, (c) GRID/Cluster/Pervasive/or otherwise Computing, (d) Architectural Advance.
All of these have been used for a long time in offline rendering. They still can't compete with dedicated hardware. The Alpha chips were packing as much L2 cache per core as CELL is supposed to have, and that was several years ago (although it was off-die, the typical ~300mm² die-size limit for consumer products doesn't apply to the higher end parts, which has the same impact as more advanced build techniques). Placed up against an Athlon with a GeForce, the Alphas got throttled. Render farms have been around for years; their latency (and this is LAN based, not WAN) makes them useless for anything nearing real time.
Hey, and you're always ready to argue back
Well of course
There is no way they will yield a true 6.6TFLOPS in one console... period. I'll be impressed if they can output a true TFLOP and sustain it, but I'm not so sure.
Are you saying that Sony will not hit their initial claim?
I, too, wonder if SCE will use a full software rendering approach with PS3, with the idea that the hardware will be more of a 'VU'-like, scientific computing approach [not like the traditional CPU].
I don't think they will. The next GS has to have a decent amount of feature support or they will be killed by the XBox2 in the early going. Dealing with an entirely new architecture that is significantly different than anything else with, as of now, no compiler support? Their first gen games would be lucky to look much better than late-life-cycle XB or GC titles. Yes, five years down the road they might be able to pull off some very impressive things considering it's software, but it would only compound the problems a lot of developers brought up when the PS2 dev kits first started circulating.
I doubt it would be near an nVidia powered solution, but does it have to be? Interesting questions emerge, such as: with 1TFLOP [which is well over the GSCube, IIRC, which rendered FF:TSW and Antz at 60fps], isn't that sufficient? How much of a visual difference would be seen?
1TFLOP isn't that much for real time rendering. If you focus an extreme amount on your raw FP power you are going to sacrifice your integer performance. Current rasterizers are already pushing a trillion ops per second; if you dedicate your die space to vector based ops, it comes out of somewhere else. As far as FF:TSW being rendered in real time on a TFLOP machine - no, it wasn't, period. The render farm for FF:TSW was pushing over a TFLOP (not sure exactly how much) and IIRC its total render time was in the several months range (for a two hour movie). Antz was rendered on a multi-TFLOP farm and also took months.
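Rough math on how far a months-long offline render is from real time - the exact farm throughput and render time for FF:TSW aren't numbers I have, so these are placeholder figures in the same ballpark as what I said above:

```python
# How far a months-long offline render is from real time.
# Placeholder figures (same ballpark as above): ~1 TFLOP of farm throughput,
# ~3 months of wall-clock rendering, a 2-hour film.

farm_tflops  = 1.0
render_hours = 3 * 30 * 24                 # ~3 months of wall-clock time
movie_hours  = 2.0

speedup_needed = render_hours / movie_hours
print(f"Needs roughly {speedup_needed:,.0f}x more throughput to hit real time,")
print(f"i.e. on the order of {farm_tflops * speedup_needed / 1000:.1f} PFLOPS "
      f"at the same efficiency.")
```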
The biggest question I have is, if a developer had full control and could tailor the entire 3D pipeline for his title - how much is gained in efficiency? I mean, they could literally do anything... hell, banish triangles. All the petty arguments about the nV2A's PS and the TEV's features would disappear.
But current rasterizers are already headed in the fully programmable direction. The big difference is that the hardware is custom built around the pitfalls that general purpose CPUs will fall into.
But, I bet there will be some sort of rasterizer/GSx
I expect so also, which you do realize makes most of our argument pointless (though I'm sure we will continue it for some time to come).
V3-
Hmm, don't know about that. Those P4s are getting pretty fast.
It's not speculation. A TNT2 is roughly fifty times faster than a GHz P3 at rendering real time graphics (trilinear filtering etc., not software-compromised code). Given the P6 core's IPC edge, it's going to take the P4 some time to catch up.
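Here's the kind of back-of-envelope that 'roughly fifty times' comes from. The TNT2 side is its theoretical peak fill rate; the per-pixel cycle cost for software trilinear on a P3 is my own assumption for illustration, not a benchmark:

```python
# Back-of-envelope: hardware vs. software trilinear fill rate.
# TNT2: 2 pixel pipelines at ~150MHz -> ~300 Mpixels/s theoretical peak.
# P3 @ 1GHz in software: the ~150 cycles per trilinear-filtered, lit pixel below
# is an assumed cost for illustration, not a benchmark.

tnt2_mpix = 2 * 150e6 / 1e6                 # ~300 Mpixels/s peak fill
p3_cycles_per_pixel = 150                   # assumed software trilinear + lighting cost
p3_mpix = 1e9 / p3_cycles_per_pixel / 1e6   # ~6.7 Mpixels/s

print(f"TNT2 peak: ~{tnt2_mpix:.0f} Mpix/s, P3 software: ~{p3_mpix:.1f} Mpix/s, "
      f"ratio ~{tnt2_mpix / p3_mpix:.0f}x")
```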
Randycat-
It's not like they are trying to do "software rendering" on some x86 CPU (had they been, you would certainly be indisputably correct).
I'm also comparing against a TNT2, nothing comparable to an R9700.
If it is an array of rapid execution vector units (albeit governed by software), that pretty much blurs the line with "dedicated hardware". It just happens to not be what nVidia is up to, IMO.
Let's say Sony squeezes a billion transistors onto their CELL chips for the PS3. You take 100 million for the 8MB (64Mb) eDRAM, which leaves you with 900 million. Figuring for 32 cores, you are looking at 28,125,000 transistors per core. That means on a per core basis you are dealing with about as many transistors as a P4 (minus the L2 cache), with less memory per core. If Sony does manage to get 1 billion transistors on a .065u build process, I'm sure the clock speed will be well short of that offered by desktop x86 parts of the time frame. What do you expect them to do per core with a budget comparable to the P4?
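Spelling that per-core budget math out (all inputs are the same speculative figures as above, not anything Sony has published):

```python
# Per-core transistor budget for the hypothetical 1B-transistor, 32-core CELL above.
# All inputs are the same speculative figures used in the post, not published specs.

total_transistors = 1_000_000_000
edram_transistors =   100_000_000      # budgeted for the 8MB (64Mb) of eDRAM
cores             = 32

logic_budget = total_transistors - edram_transistors
per_core     = logic_budget / cores

print(f"{per_core:,.0f} transistors per core")   # 28,125,000 - roughly P4-class minus its L2
```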
As far as it not being what nVidia is up to, you could add every other company involved in 3D to the list going along with nV.