How was PS2 thought to work???

Perhaps I am saying something stupid, so correct me if I am wrong... but since VU0 is responsible for the logic area, it performs all the work related to collision/visibility tests while VU1 does all the transform work, isn't it this way?
 
ShinHoshi said:
Perhaps I am saying something stupid, so correct me if I am wrong... but since VU0 is responsible for the logic area, it performs all the work related to collision/visibility tests while VU1 does all the transform work, isn't it this way?


well not necessarily... it does what the devs want it to do really..... it can do physics and collision, but it can also help VU1 with T&L, or even encode DTS in realtime (last time I checked, but I could be wrong)
 
...

To Faf

I believe you DID see Fafracer screens sometime last year (originally as scans from a Korean mag).
What was the title of your racer again??? I wasn't paying too much attention...

Look, this has been told to you 5 times in this thread alone and you still ignore it: VU0 can't output results, VU1 can.
Surely it can. Or has Kutaragi lied about its 66 million polys/s theoretical peak throughput? You can't reach that figure without VU0 joining the party. The purpose of the GIF is to arbitrate between streams from VU0, VU1, and memory (texture). Priority is given to VU1, but the GIF will scan VU0 when VU1 sits idle.

To Marcelli

The patent quote he posted clearly puts your "second VU was added at the last moment" rambling to a well deserved rest.
How is it so????

Besides that, you are arguing here with people with much more knowledge of the matter
I pick my opponents carefully. Vince is not one. He is one of those easy ones to play with.

To Vince

Why yes, it is the modern EE Block Diagram with VU0/VU1 in March of 1997. Do I get a Scooby Snack?
"Search Time Limit Has Expired." I can't see what you are seeing...

Besides, you need to learn to read a document carefully, because that patent filing was for VIF instructions; nothing else was covered or claimed to have been invented by Sony. The EE was merely presented as a device that would run the patented technique.
 
Squeak said:
True, it does have texture compression, but PS2 can do multitexturing better than DC,
Please explain the logic underlying that conclusion. DC has a secondary buffer which allows it to combine multiple textures on chip in "interesting" ways.
which in turn means that by double texturing it can get an even better compression ratio than DC's vector quantization (or S3TC).
Now you have really got me intrigued. Can you explain that comment? To my knowledge the best the GS can offer is 4bpp palettised which (in general) is significantly worse in quality than both the 4bpp S3TC and 2bpp VQ. How does using this multiple times make it better than the compressed systems?
 
Re: ....

DeadmeatGA said:
So what was the idea ??? Was it only the first step ???
The idea was to outperform DC by a substantial margin at all costs, at the great expense of fabrication cost and programmability.
That sounds pretty close to what I heard on the grapevine.
 
Simon F.


Apparently PS3 will use massive amounts flops and bandwidth in order to use REYES rendering. This seems to be in direct contrast to the approach taken by PowerVR. Do you think PS3's micropolygon based approach is wise? Why or why not?
 
talking about ps2 here.... there are LOADS of PS3 related threads, can we PLEASE leave things where they belong? just a thought...
 
Squeak said:
Doesn't it also have some on-die RAM for decompressed textures used in the current tile, or does it have to fetch the whole texture, decompress it to VRAM, and then use it?
The texture decompression is all completely on-the-fly. No decompressed texels are "stored" anywhere.

Something else to bear in mind is that the PVRDC chip would, on average, only have to texture about a third of the pixels that the GS has to in a normal game.

DC (like PS2) will still need to keep whole maps in its VRAM, unless it had virtual texturing. So the 800MB/s bus between VRAM and PVRDC should play almost the same role as the combined 1.2GB/s EE-to-GS bus and the 9.6GB/s internal bus on the GS. A bit difficult to compare then.
Not that difficult really for DC. If you assume a conservative 50% of the bus bandwidth is used for textures, that the internal texture cache is functioning at 75% efficiency, and that the texels are using VQ at ~2bpp, then that allows you ~4.4 Gtexels transferred per second. That's more than enough for the texturing engine!

If you assume 8bpp CLUT textures on the GS then the 9.6GB/s internal bus allows you about the same texel access rate (i.e. ~9.6 GT/s).

PS2 certainly has a fillrate advantage over the DC when multi-texturing, but I wouldn't say that it's clearly better at multi-texturing; DC has its advantages too, being a TBR.
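For what it's worth, here's a small C sketch of that kind of back-of-the-envelope texel-rate estimate. The 50% bus share, ~2bpp VQ and the 9.6GB/s / 8bpp CLUT figures are taken from the posts above; how the "75% cache efficiency" turns into texel reuse isn't spelled out, so the cache_gain multiplier below is purely my assumption, chosen to land near the quoted ~4.4 GT/s.

#include <stdio.h>

int main(void)
{
    /* Dreamcast: 800 MB/s VRAM bus, half of it assumed spent on textures */
    const double dc_tex_bytes  = 800e6 * 0.50;
    const double dc_bpp        = 2.0;                          /* VQ, ~2 bits per texel  */
    const double dc_bus_texels = dc_tex_bytes * 8.0 / dc_bpp;  /* texels/s off the bus   */
    const double cache_gain    = 2.75;                         /* assumed reuse from the
                                                                   on-chip texture cache */
    const double dc_texels     = dc_bus_texels * cache_gain;

    /* GS: 9.6 GB/s of internal texture bandwidth, 8bpp CLUT = 1 byte per texel */
    const double gs_texels     = 9.6e9 / 1.0;

    printf("DC ~%.1f Gtexels/s\n", dc_texels / 1e9);   /* ~4.4 with these assumptions */
    printf("GS ~%.1f Gtexels/s\n", gs_texels / 1e9);   /* ~9.6                        */
    return 0;
}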

Being a deferred renderer means that it (DC) relies heavily on the opaqueness of objects for its performance boost, and the compression technique it uses can't do anything other than punch-through 'alpha'.
Not strictly true - the internal FB means that it is faster than standard Z-buffer systems and it also can do tricks like per-pixel Z alpha sorting which, to my knowledge, no other HW system does.

PS2 on the other hand has free alpha and dual states for fast context switching, among other things, which should make it a much superior multitexturer.
Well DC can do things like apply a base texture, then mix two other textures together and then apply the result to the base. I suspect you'd have to have a second framebuffer to do that on PS2.
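To make that blend tree concrete, here's a plain-C, per-pixel sketch of that kind of operation: two detail textures mixed first, and the intermediate result then applied to the base. The lerp/modulate choice and the names are just illustrative, not the actual DC or GS combiner setup; the point is that the intermediate result has to live somewhere, which on a one-texture-per-pass rasteriser means a second frame or accumulation buffer.

#include <stdint.h>

typedef struct { uint8_t r, g, b; } Pixel;

static uint8_t lerp8(uint8_t a, uint8_t b, uint8_t t)   /* t in 0..255 */
{
    return (uint8_t)((a * (255 - t) + b * t) / 255);
}

static uint8_t modulate8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a * b) / 255);
}

/* base, tex1, tex2: same-sized arrays of n pixels; mix: blend weight */
void blend_passes(Pixel *out, const Pixel *base, const Pixel *tex1,
                  const Pixel *tex2, uint8_t mix, int n)
{
    for (int i = 0; i < n; i++) {
        Pixel detail;                                   /* "pass 1": mix the two textures */
        detail.r = lerp8(tex1[i].r, tex2[i].r, mix);
        detail.g = lerp8(tex1[i].g, tex2[i].g, mix);
        detail.b = lerp8(tex1[i].b, tex2[i].b, mix);

        out[i].r = modulate8(base[i].r, detail.r);      /* "pass 2": apply result to base */
        out[i].g = modulate8(base[i].g, detail.g);
        out[i].b = modulate8(base[i].b, detail.b);
    }
}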
 
Teasy said:
Erhm... I seem to remember trilinear on PVRDC required triangles to be set up twice. I could be wrong, but doesn't this directly contradict the notion of the rasterizer doing more than one texture per pass?

No, PVRDC needed 2 cycles for trilinear AFAIR, not two passes.
No, 2 tris|passes for trilinear. The Aniso 'mode 1' was single pass but needed more cycles.
 
Cool PVR features

The PVR chip is pretty cool for effects; it did seem a pity that few games used all of them.

In comparison the GS is a stripped-down, simple rasteriser, but it is very fast - and for very dense meshes it does outperform the PVR.

( Character meshes often have triangles smaller than 8x8 pixels, and the GS will rasterise them quickly... )

For large textures the PVR is excellent though - the VQ texture compression is very useful.

On the whole, though, the PS2 can transform as many polygons as the GS can draw, and the nice thing is that the transforms can occur on VU1, leaving the core CPU free for game logic. The SH4, although very strong, will be locked up when transforming geometry.

DM is right about using both VUs to process geometry - but the distinction (as shown in the patent) is between simple fixed geometry and 'complex' geometry (animated / blended / skinned).
Complex geometry is then transformed using the core CPU and VU0 together (in the same manner as the original PlayStation and the DC).


A complex blend will require a 2nd frame buffer or accumulation buffer, which doesn't need to be full screen - more like texture sized...
 
And Sony's claim of 30 million polys/s peak with one light presumed the use of both VUs.

Looks at middleware specs from a few months ago... 500k at 60fps, fully textured, lit, in-game performance.....

... looks at PA info: almost no one uses VU0... this is one of the most used, if not the most used, middleware for PS2... strange that it didn't come up, with dozens of titles using this... :rolleyes:

edited
 
Please explain the logic underlying that conclusion. DC has a secondary buffer which allows it to combine multiple textures on chip in "interesting" ways.
Whatever it had, it sure didn't help many games get multitexturing. That particular thing was rare and sparingly used (where it was used at all) in DC games, yet it's very commonplace on PS2. That alone makes one think DC had problems in that area.

Now you have really got me intrigued. Can you explain that comment? To my knowledge the best the GS can offer is 4bpp palettised which
There are GS decompression algorithms which require two passes to decompress the texture. I was reading somewhere that programmers from the Blue Shift team achieved 9:1 compression that way. One thing that was speculated even here is that it's possible to separate the color and B/W components of a texture, then store the color component at much lower resolution, so that when they're combined back during rendering the difference in quality is negligible (as the color component has much less detail to it to begin with). Fafalada mentioned some other algorithms, saying for example that it's entirely possible to decode VQ-compressed textures through multiple passes on the GS.
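As a rough illustration of that split idea (this is just my own YUV-style sketch in plain C, not the Blue Shift method or actual GS pass setup): keep a full-resolution greyscale layer, store the color part at a quarter of the resolution, and recombine when drawing.

#include <stdint.h>

typedef struct { uint8_t r, g, b; } Pixel;

static uint8_t clamp8(float v)
{
    if (v < 0.0f)   return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Split a w*h RGB image into full-resolution luma plus half-resolution
 * (w/2 x h/2) Cb/Cr planes.  w and h are assumed to be even. */
void split(const Pixel *src, int w, int h,
           uint8_t *luma, uint8_t *cb_half, uint8_t *cr_half)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            const Pixel p = src[y * w + x];
            luma[y * w + x] = clamp8(0.299f * p.r + 0.587f * p.g + 0.114f * p.b);
        }

    for (int y = 0; y < h; y += 2)
        for (int x = 0; x < w; x += 2) {
            float cb = 0.0f, cr = 0.0f;            /* average each 2x2 block's chroma */
            for (int dy = 0; dy < 2; dy++)
                for (int dx = 0; dx < 2; dx++) {
                    const Pixel p = src[(y + dy) * w + (x + dx)];
                    cb += 128.0f - 0.1687f * p.r - 0.3313f * p.g + 0.5f    * p.b;
                    cr += 128.0f + 0.5f    * p.r - 0.4187f * p.g - 0.0813f * p.b;
                }
            cb_half[(y / 2) * (w / 2) + x / 2] = clamp8(cb / 4.0f);
            cr_half[(y / 2) * (w / 2) + x / 2] = clamp8(cr / 4.0f);
        }
}

/* Recombine at "draw time": full-res luma plus upsampled chroma. */
void recombine(const uint8_t *luma, const uint8_t *cb_half, const uint8_t *cr_half,
               int w, int h, Pixel *dst)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            const float Y  = luma[y * w + x];
            const float Cb = cb_half[(y / 2) * (w / 2) + x / 2] - 128.0f;
            const float Cr = cr_half[(y / 2) * (w / 2) + x / 2] - 128.0f;
            dst[y * w + x].r = clamp8(Y + 1.402f * Cr);
            dst[y * w + x].g = clamp8(Y - 0.344f * Cb - 0.714f * Cr);
            dst[y * w + x].b = clamp8(Y + 1.772f * Cb);
        }
}

This only shows the separation itself; the compression ratios quoted above would depend on how the two layers are actually stored and drawn on the GS.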
 
marconelly! said:
Please explain the logic underlying that conclusion. DC has a secondary buffer which allows it to combine multiple textures on chip in "interesting" ways.
Whatever it had, it sure didn't help many games get multitexturing. That particular thing was rare and sparingly used (where it was used at all) in DC games, yet it's very commonplace on PS2. That alone makes one think DC had problems in that area.
As has been said before, DC development dried up before it even began to scratch the surface of what was possible...or do you honestly believe that game developers use 100% of the features of a system from day one?
Now you have really got me intrigued. Can you explain that comment? To my knowledge the best the GS can offer is 4bpp palettised which
There are GS decompression algorithms which require two passes to decompress the texture. I was reading somewhere that programmers from the Blue Shift team achieved 9:1 compression that way. One thing that was speculated even here is that it's possible to separate the color and B/W components of a texture, then store the color component at much lower resolution
I was referring to native compression, but ignoring that, yes, using multiple passes can emulate other texture compression schemes. What you describe sounds sort of YUV based; I could see that working to some extent. Can you point me at the thread where that speculation is? (Since I don't frequent the console pages too often and there is a lot of traffic here.)

Fafalada mentioned some other algorithms, saying for example that it's entirely possible to decode VQ compressed textures through multiple passes on GS.
That would be an impressive hack (I'm not sure how you would do it), but surely rather time consuming?
 
Simon F said:
As has been said before, DC development dried up before it even began to scratch the surface of what was possible...or do you honestly believe that game developers use 100% of the features of a system from day one?

One of the most difficult areas of trying to discuss what the DC was capable of (and how it would be competing in this day and age against Xbox, PS2 and GC) is that many people seem to be under the impression that the DC was "maxxed out" by games like Shenmue 1 & 2 (mostly done around 98-99, first generation titles if you will). Matters aren't helped by the entrenched arguments between the DC and PS2 fans about textures, image quality and progressive scan. :eek:

My opinion is that as better and better performance was obtained from the DC, and developers became more familiar working with features beyond bilinear filtering, it would have paid off greatly to put more rendering time into using the PVR2DC's more advanced features.

Even assuming the chip was only 75% efficient (I think I read a press release somewhere stating that it was into the nineties), rendering the 307,200 pixels of a 640 x 480 screen would have taken around 4.096 ms. At 30fps you have 33.3 ms to draw a frame, meaning that depending on T&L, AI, physics, data transfer etc. you potentially have time for more than one pass for each pixel (in fact I remember some 60fps games using multitexturing).

Or as an example, take a polygon count (and all the other calculations) you might normally have expected to see at 60fps, but run at 30fps and use 3, 4 or 5 passes instead of 1. Or some other balance. It would be interesting to see what talented developers with today's experience (both technical and artistic) could have done with the DC.

My figures were done in a hurry BTW, so if anyone spots a mistake please correct it.
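For anyone who wants to redo them, here is the same arithmetic in a few lines of C, using the 100 Mpix/s figure at 75% efficiency as an assumption and ignoring T&L, overdraw savings and everything else:

#include <stdio.h>

int main(void)
{
    const double pixels      = 640.0 * 480.0;   /* 307,200 pixels                 */
    const double fill_rate   = 100e6 * 0.75;    /* 100 Mpix/s at 75% efficiency   */
    const double pass_ms     = pixels / fill_rate * 1000.0;
    const double frame_ms_30 = 1000.0 / 30.0;   /* ~33.3 ms per 30fps frame       */

    printf("one full-screen pass: %.3f ms\n", pass_ms);              /* ~4.096 ms */
    printf("passes per 30fps frame: %.1f\n", frame_ms_30 / pass_ms); /* ~8.1      */
    return 0;
}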

I know "bump mapping" is considered evil by many here (quite unjustifiably so IMO), but this along with other multi-texturing effects, forms of AA such as MS and aniso, and more accurate lighting/shading would surely have lead to a far bigger increase in viusal quality than simply nudging up the polygon count - an area where the DC couldn't hope to compete with the industry leader, the PS2.

A couple of questions on the DC, to anyone with the technical knowledge to answer!

1) Shouldn't the PVR tile-based architecture have made MSAA relatively cheap in terms of bandwidth and memory (only a standard-size frame buffer needed)?

2) From what I've seen, dot3 seems to show the best results when used with point or spot lighting. How much would the DC's CPU have held back the graphics chip from really showing its stuff?


P.S. and unrelated, I'm hardly around a PC at the moment, so missing all the GC sales and Xbox 2 revelations has been killing me! :(
 
function said:
One of the most difficult areas of trying to discuss what the DC was capable of (and how it would be competing in this day and age against Xbox, PS2 and GC) is that many people seem to be under the impression that the DC was "maxxed out" by games like Shenmue 1 & 2 (mostly done around 98-99, first generation titles if you will).

Considering Shenmue was out in Nov '00, and Shenmue II in Sept '01, developed by Sega's own AM2, and arriving amidst much fanfare and BIG budgets, I have no problem with people thinking those games pushed the Dreamcast as far as they could manage. First generation? Hardly. Granted they had long development cycles, but they also had MASSIVE resources available to them and were being done in house. If you can't trust a premiere Sega unit to push as much out of their own console as possible to show off just what they could do, who can you trust?

Do I know if either game DID push all the DC's tricks, or how far they went? Of course not. And I would LOVE to see utilization tests and any other measurement tools available to track just what certain titles did and just what the DC could push, but I have never heard of that kind of data available anywhere. So offhand, with the question still being asked even now...? I don't think I'd vote against Shenmue II. Do I know just how much it pushed? Nope. Can I think of something else that would have done more? Also nope.
 
From an old DC article...

***********************************************
"How Many Polygons Can the Dreamcast Render?

Let's help clear up some of the confusion that centers around the Dreamcast's polygon rate. When SEGA first introduced the Dreamcast back in November 1998, they indicated that the machine could do 3 million polygons per second, which was a sustainable rate achievable with software running on the machine at that time.

I shall direct your attention to this article at IEEE Micro, from which these quotes come:

The CPU was clearly an important part of the Dreamcast specification, and selection of the device was a lengthy and carefully considered process. Factors considered included performance, cost, power requirements, and delivery schedule. There wasn't an off-the-shelf processor that could meet all requirements, but Hitachi's SH-4 processor, which was still in development, could adapt to deliver the 3D geometry calculation performance necessary. The final form has an internal floating-point unit of 1.4 Gflops, which can calculate the geometry and lighting of more than 10 million polygons per second. Among the features of the SH-4 CPU is the store queue mechanism that helps send polygon data to the rendering engine at close to maximum bus bandwidth.1 The final device is implemented using a 0.25-micron, five-layer-metal process.
The system ASIC combines a PowerVR rendering core with a system bus controller, implemented using a 0.25-micron, five-layer-metal process. Imagination Technologies (formerly VideoLogic) provided the core logical design and Sega supplied the system bus. NEC provided the ASIC design technologies and chip layout, including qualification for 100-MHz operation. Fill rates are a maximum of 3.2 Gpixels per second for scenes comprising purely opaque polygons, falling to 100 million pixels per second when transparent polygons are used at the maximum hardware sort depth of 60. Overall rendering engine throughput is 7 million polygons per second, but in Dreamcast, geometry data storage becomes the limiting factor before pixel engine throughput.

You're only as fast as your slowest component, so the DC is rated at a 7 million polygons per second maximum sustainable rate, and in a game situation it would most likely manage around 5 to 6 million polygons per second, depending on how good a top developer would be at squeezing performance out of the system. I consider a rate lower than 7 mpps simply because other game code has an effect on the polygon rate. The more complex the game AI is, the lower the polygon rate the machine can achieve.
Note: the above quote contains some information which could easily be misunderstood, as the article states:

Fill rates are a maximum of 3.2 Gpixels per second for scenes comprising purely opaque polygons, falling to 100 million pixels per second when transparent polygons are used at the maximum hardware sort depth of 60.
No 3D game today even comes close to having an opaque overdraw of 60 times! It's more like 2 to 3 times overdraw, so the comparative pixel rate would be 100 million to 300 million pixels per second maximum. I say comparative, meaning how an "infinite plane" architecture compares to a traditional architecture that renders every polygon in a scene.
Here is a very interesting comment:

Overall rendering engine throughput is 7 million polygons per second, but in Dreamcast, geometry data storage becomes the limiting factor before pixel engine throughput.
Let's see if the Dreamcast can render more polygons than it can store; I will use 6 mpps as an example:
6,000,000 (polygons) / 60 (frames per second) = 100,000 polygons per scene
100,000 x 40 Bytes (size of polygon) = ~4 MB
Since the Dreamcast only has 8 MB of video memory, that is a lot of memory!
8 MB - 1.2 MB (640x480x16-bits double buffered frame buffer) - 4 MB (polygon data) = 2.8 MB
Only 2.8 MB left for textures, and even with VQ compression that is not very much. At 3 mpps there is 4.8 MB available for textures, which is much better. It just shows you that there is not much point in creating a game engine on the DC that does more than 3 million polygons per second. Anyway, 90 percent of the developers out there cannot even get over a million polygons per second on the Dreamcast."
******************************************************
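Here is the article's arithmetic written out in a few lines of C, so the 6 mpps and 3 mpps cases can be compared side by side. MB is treated as 10^6 bytes to match the article's rounding; the 40-byte polygon, the 16-bit double-buffered 640x480 frame buffer and the 8 MB of VRAM are the article's own assumptions.

#include <stdio.h>

static void budget(double polys_per_sec)
{
    const double vram        = 8e6;                           /* 8 MB video memory       */
    const double framebuffer = 640.0 * 480 * 2 * 2;           /* 16-bit, double buffered */
    const double poly_bytes  = polys_per_sec / 60.0 * 40.0;   /* 40 bytes per polygon    */
    const double textures    = vram - framebuffer - poly_bytes;

    printf("%.0f Mpps: %.1f MB of polygons, %.1f MB left for textures\n",
           polys_per_sec / 1e6, poly_bytes / 1e6, textures / 1e6);
}

int main(void)
{
    budget(6e6);   /* ~4.0 MB of polygons, ~2.8 MB for textures */
    budget(3e6);   /* ~2.0 MB of polygons, ~4.8 MB for textures */
    return 0;
}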
 