How was PS2 thought to work???

marconelly!:
In cases when DC can do BM in one pass (with one planar light and a flat surface below it), PS2 can do it in two passes, again much outperforming it.
As Teasy was pointing out, the PVR2DC uses the bump map to vary a texture's shading in just a single pass, which is exactly what I said and brought up to contrast its hardwired ease-of-use against the PS2, which can only emulate it through multipass. Yes, there is a pass for the texture first on DC (only when there is an underlying texture, obviously), while the PS2 must work through steps with the framebuffer in addition, and include the actual texture too.

I didn't mention anything about a scenario with no base texture like you put forth, which of course is handled the same way, minus the first pass for the base texture.
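(For illustration, here's a rough sketch of the pass structure being argued about. All the type and asset names are hypothetical stand-ins, not real DC or PS2 SDK identifiers; it's just the shape of single-pass hardware bump versus framebuffer multipass, assuming an emboss-style approximation on the GS side.)

Code:
// Illustrative only: hypothetical types, not SDK calls.
struct Texture;
extern const Texture baseTexture, heightMap;  // assumed assets

enum BlendMode { BLEND_NONE, BLEND_ADD, BLEND_SUBTRACT };
struct Pass { const Texture* tex; BlendMode blend; };

// PVR2DC: one pass - the rasterizer modulates the base texture's
// shading with the bump map while drawing.
const Pass dcBump[] = {
    { &baseTexture, BLEND_NONE },     // bump applied in-pass by hardware
};

// PS2/GS emulation, emboss-style: build the result up in the
// framebuffer across several passes.
const Pass ps2Bump[] = {
    { &baseTexture, BLEND_NONE },     // 1) lay down the base texture
    { &heightMap,   BLEND_ADD },      // 2) add the height map
    { &heightMap,   BLEND_SUBTRACT }, // 3) subtract a light-offset copy,
                                      //    leaving the per-pixel slope term
};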

Regarding the discussion about Naomi 2's plans:
Naomi 2 carried over the standard first-generation parts to enable full backward compatibility, and it still cleverly achieved high-end performance. If compatibility hadn't been a goal of the project, its high-end designation might have seen the use of more powerful and/or higher-clocked parts - the inclusion of which would have been a simple matter at that point.
 
In summary, PSX2 is a MIPS compatible system with a number of programmable devices stuck in, namely two VUs, one IPU, one rasterizer, a sound chip, and an I/O chip.

So was the original PS.
 
and the N64... ;)


PS2 probably follows a principle of KISS to a certain extent.
The programmable VUs don't match the raw GFLOPs/clock capabilities of the GC or PowerVR Elan, but the programmability does allow maximum utilisation of the hardware resources.
The GS runs the majority of game polygons very efficiently - things like bump mapping are slower per clock, but unless every drawn polygon is bump mapped it isn't a great drain on the system.
For very dense meshes with small polys it becomes a lot more efficient than the PowerVR chip, as the overheads of binning data and then evaluating it start to become excessive.
 
PVR doesn't fill anything - the VRam is the only memory it can address - hence any textures used must be in there before it starts rendering.

If DC does load all textures into VRam despite the fact that before they're loaded the GPU already knows which ones aren't visible, then that seems like a bit of a missed opportunity to me.

And to bitch a little, I'd prefer not to use the term deferred so generally.

Hey, I'm only using the same term as the person I replied to, so take it up with him :LOL:

Anyway, in regards to multitexture, I was under the impression that PVRDC only did one texture per pass?

Not as far as I know, I don't see why that would be the case. After all, PVRDC should be very similar to Kyro in this respect. Kyro could do 8 in a single pass and that was only limited by the drivers.
 
...

So was the original PS.
With original PSX, developers had to worry about only one stream in the graphics pipeline, the display list stream to GPU. With PSX2, developers were forced to deal with three separate streams in the graphics pipeline, two vertex streams to VUs and one texture stream to GS. If any of those three streams were interrupted, the whole system would stall and maximum performance would not be achieved.
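(To make the three-streams claim concrete, a minimal sketch of a frame kick, assuming hypothetical dmaSend()/channel names rather than the real kernel calls:)

Code:
// Illustrative stand-ins for the real DMA kernel interface.
struct DmaChain;  // a pre-built chain of DMA tags
enum DmaChannel { CH_VIF0, CH_VIF1, CH_GIF };
void dmaSend(DmaChannel ch, const DmaChain* chain);

struct Scene {
    const DmaChain* vu0VertexChain;
    const DmaChain* vu1VertexChain;
    const DmaChain* textureChain;
};

void kickFrame(const Scene& s)
{
    dmaSend(CH_VIF0, s.vu0VertexChain); // vertex stream #1 -> VU0
    dmaSend(CH_VIF1, s.vu1VertexChain); // vertex stream #2 -> VU1
    dmaSend(CH_GIF,  s.textureChain);   // texture stream   -> GS
    // The stall argument: if any one chain runs dry while the GS
    // still depends on it, that path idles and peak rate is lost.
}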

PSX2 and PSX were very different beasts. With PSP, SCEI admitted their past design mistakes and remodeled the architecture after more conventional 3D renderers like GC and Xbox, in that there is no VU to begin with (all vector operations are handled by the CPU via vector instructions, similar to SSE on Xbox) and the GPU is not very programmable.

PS2 probably follows a principle of KISS to a certain extent.
It doesn't show on the block diagram.

The programmable VUs don't match the raw GFLOPs/clock capabilities of the GC or PowerVR Elan, but the programmability does allow maximum utilisation of the hardware resources.
This was the reason the Toshiba proposal was selected over the LSI one. The Toshiba design wasn't any faster than LSI's hardwired solution, but it was more flexible. Unfortunately, someone at SCEI saw the performance deficiency and ordered the bolting on of a second VU to outrun DC at all costs...
 
Anyway, in regards to multitexture, I was under impression that PVRDC only did one texture per pass?

No, but it does take quite a while to do 8 texture lookups into external SDRAM modules (read: burning a lot of GPU cycles) along with the combiner ops just to get a pixel out. Of course, being able to doesn't mean doing so will be all that practical to use pervasively.

I don't know how the math works out. I am not a PSX2 programmer.

Obviously...

With original PSX, developers had to worry about only one stream in the graphics pipeline, the display list stream to GPU.

Same on the PS2... Only now you have multiple routes to improve throughput and reduce stalls. And another thing: you need to realize that the traditional thinking in terms of a "graphics pipeline", in the sense of staged, parameter-based rendering, is rapidly becoming somewhat of an antiquated paradigm. You need to start learning to think of your scene as possibly a collection of programs.


With PSX2, developers were forced to deal with three separate streams in the graphics pipeline, two vertex streams to VUs and one texture stream to GS.

Nobody is forcing you to deal with 2 vertex streams... There's no law written in stone that you have to have 2 vertex streams (I'm assuming you're referring to using both VUs in parallel). And unless my memory is that bad, you still had to load textures onto the GPU on the Playstation as well, so I'm not sure what you're crying about here....

If any of those three streams were interrupted, the whole system would stall and maximum performance would not be achieved.

Any? Now I definitely know you're full of crap! Paths are prioritized and don't all run simultaneously. Obviously a high priority path isn't going to stall a low priority path. The GIF can manage and arbitrate data on PATHs 1 and 2 just fine. The only real one you have to worry about is big block image transfers on PATH 3.

Guess what! It's not THAT big of a deal. Every piece of hardware has hotspots that prevent "maximum" performance (whatever the hell that is)...
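(A toy model of the fixed-priority arbitration being described; the enum and function are mine, only the PATH1 > PATH2 > PATH3 ordering is from the discussion above:)

Code:
// Sketch of GIF arbitration as described: PATH1 (VU1's XGKICK)
// beats PATH2 (VIF1 direct), which beats PATH3 (bulk image DMA).
enum GifPath { PATH1, PATH2, PATH3, PATH_NONE };

GifPath arbitrate(bool p1Req, bool p2Req, bool p3Req)
{
    if (p1Req) return PATH1;     // geometry from VU1: highest priority
    if (p2Req) return PATH2;     // VIF1 direct packets
    if (p3Req) return PATH3;     // big block image transfers fill gaps,
                                 // which is why PATH3 is the one to watch
    return PATH_NONE;
}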

PSX2 and PSX were very different beasts. With PSP, SCEI admitted their past design mistakes and remodeled the architecture after more conventional 3D renderers like GC and Xbox, in that there is no VU to begin with (all vector operations are handled by the CPU via vector instructions, similar to SSE on Xbox) and the GPU is not very programmable.

For starters you should start being more careful with your terminology, there is no such thing as a PSX2 yet, and the PSX is a PS2 with a built-in HDD that burns DVDs... :p

Secondly, how can Sony have admitted their past design mistakes with the PSP, since the PS3 is likely to be such a durn complex bugger to work with according to you?

Thirdly, you can't really claim that the vector extensions will be similar to SSE since the architectural details haven't been published. While it's likely to be mapped over existing hardware (like MIPS-3D and MMI), it may be similar to VU0 for all you know. As it is, VU0 defaults to macro mode anyways, so it's no different from any other vector extension you see other than it at least can do arbitrary computes between elements with simple designations in the opcode mask. With something like AltiVec (or worse, SSE, which can't even operate across a full register in a single clock) you'd be spending time swizzling data into SoA form to make effective use of the hardware.
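(For the SoA point, a minimal SSE sketch: with four-wide ops that can't do arbitrary cross-element work, you lay four vertices' components out structure-of-arrays so one instruction advances all four. Standard intrinsics, nothing PS2-specific:)

Code:
#include <xmmintrin.h>  // SSE intrinsics

// Structure-of-arrays: x/y/z of four vertices packed per register.
// With array-of-structures you'd burn cycles swizzling before any
// useful math happens - which is the complaint above.
struct Vec3SoA { __m128 x, y, z; };  // four vertices per struct

Vec3SoA add(const Vec3SoA& a, const Vec3SoA& b)
{
    return { _mm_add_ps(a.x, b.x),   // four x components at once
             _mm_add_ps(a.y, b.y),
             _mm_add_ps(a.z, b.z) };
}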

It doesn't show on block diagram.

Ahh! The root of your problem! Much like the Jade Fox, you've studied the diagrams but could not possibly hope to comprehend them! :p

This was the reason the Toshiba proposal was selected over the LSI one. The Toshiba design wasn't any faster than LSI's hardwired solution, but it was more flexible. Unfortunately, someone at SCEI saw the performance deficiency and ordered the bolting on of a second VU to outrun DC at all costs.

Shall we return to earth now?

There was no Toshiba proposal to compete with LSI because LSI wasn't "in the running". Sony selected Toshiba primarily for their larger scale MIPS design and integration experience (they could've gone to NEC for this, since NEC had more experience with not only that but with supercomputers as well, but not the closer business relationship), and also because Toshiba has traditionally been one of the industry leaders in process technology (a big factor for Kutaragi considering the plans to increase the capabilities of Sony's semiconductor facilities).

Now back to the bolted-on VUs. Toshiba wouldn't have had a proposal with VU0 since the VUs (along with the IPU) are Sony's influence. Toshiba's job was to provide a CPU core and to manage the integration of various subcomponents. *IF* a VU had been "bolted" on, it would've been VU0, since it's basically a MIPS COP-port client with additional connects to SPRAM and the CPU bus. It lacks several features of VU1, has no dedicated connectivity to the GS, has smaller memory allocations, and its use in macro mode prevents dual issue on the CPU when issuing ops to it. So *IF* a VU was a candidate for being bolted on, it would be VU0, not VU1.
 
...

Same on the PS2... Only now you have multiple routes to improve throughput and reduce stalls.
You have to feed all routes or the GS stalls; it is as simple as that.

Nobody is forcing you to deal with 2 vertex streams... There's no law written in stone that you have to have 2 vertex streams
Of course, if you are satisfied with low triangle counts.

And unless my memory is that bad, you still had to load textures onto the GPU on the Playstation as well, so I'm not sure what you're crying about here....
Back in the PSX days, the system RAM/VRAM ratio was 2:1. The amount of vertex data and texture developers dealt with wasn't large to begin with, so developers uploaded whatever little texture they had and were never bothered with it again. In the PSX2 days, this ratio changed to 8:1. The percentage of VRAM allocated for texture storage dropped drastically as well, forcing developers to swap textures during rendering. This is of course unnecessary with GC and Xbox.
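(A minimal sketch of what that swapping looks like in practice, with hypothetical uploadTexture()/drawPrims() helpers standing in for the real upload and draw packets:)

Code:
#include <cstdint>
#include <vector>

struct Texture;
struct DrawItem { const Texture* tex; /* prims, state, ... */ };

// Hypothetical helpers, not real SDK calls.
void uploadTexture(const Texture* t, std::uint32_t vramAddr);
void drawPrims(const DrawItem& item, std::uint32_t vramAddr);

struct TexSlot { std::uint32_t vramAddr; const Texture* resident; };

// With an 8:1 RAM:VRAM ratio, textures rotate through a small pool
// of VRAM slots mid-frame instead of being uploaded once at load.
void drawWithSwap(const std::vector<DrawItem>& list,
                  TexSlot* slots, int nSlots)
{
    int next = 0;
    for (const DrawItem& item : list) {
        TexSlot& slot = slots[next++ % nSlots];
        if (slot.resident != item.tex) {
            uploadTexture(item.tex, slot.vramAddr); // swap texture in
            slot.resident = item.tex;
        }
        drawPrims(item, slot.vramAddr);
    }
}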

Obviously a high priority path isn't going to stall a low priority path. The GIF can manage and arbitrate data on PATHs 1 and 2 just fine.
I am well aware of what the GIF does. Suppose you were to max out VU1 - what do you do? Open a stream to VU0 and dump more vertex data. Now you are using both VUs. What happens when VU0 sits idle but your code keeps feeding vertex data to VU1 only, or the other way around??? It is a lot easier to take care of one crying baby than three.

Guess what! It's not THAT big of a deal. Every piece of hardware has hotspots that prevent "maximum" performance (whatever the hell that is)...
Well, the other consoles don't have hot spots that obvious.

For starters you should start being more careful with your terminology, there is no such thing as a PSX2 yet, and the PSX is a PS2 with a built-in HDD that burns DVDs...
I am using what experts are using.

Secondly, how can Sony have admitted their past design mistakes with the PSP, since the PS3 is likely to be such a durn complex bugger to work with according to you?
1. PSP is coming out 4 years after PSX2.
2. It is the exact opposite of the PSX2 programming model.

Thirdly, you can't really claim that the vector extensions will be similar to SSE since the architectural details haven't been published.
It doesn't take a brainiac to figure that out. The vector unit of PSP is a CPU extension sitting on a COP2 bus, not a separate and independent device like the VU.

it may be similar to VU0 for all you know
Toshiba's copyright is not mentioned. Toshiba owns the PSX2 VU IP.

As it is, VU0 defaults to macro mode anyways, so it's no different from any other vector extension you see other than it at least can do arbitrary computes between elements with simple designations in the opcode mask.
Why pay Toshiba when Sony's MIPS license already covers MIPS-3D???

There was no Toshiba proposal to compete with LSI because LSI wasn't "in the running".
Where were you when the news leaked out on Wall Street that LSI lost its PSX2 CPU bid to Toshiba, driving its stock price down by 30% or something like that, back in the summer of 1997???

Sony selected Toshiba primarily for their larger scale MIPS design and integration experience
Toshiba owns the VU, SCEI doesn't. Why the hell do you think SCEI dragged Toshiba into the CELL business? Because Toshiba owns the VU.

And Sony has been a MIPS licensee since the early '90s and sold its own MIPS workstations; Sony doesn't need either LSI or Toshiba for its MIPS core, they build their own.

Toshiba wouldn't have had a proposal with VU0 since the VUs (along with the IPU) are Sony's influence.
Toshiba owns the VU, SCEI doesn't. How many times do I have to repeat this until you get it through your head???

*IF* a VU had been "bolted" on, it would've been VU0, since it's basically a MIPS COP-port client with additional connects to SPRAM and the CPU bus.
You can take out VU1 now and PSX2 will still function, the same isn't true if VU0 is taken out.
 
You can take out VU1 now and PSX2 will still function, the same isn't true if VU0 is taken out.

Er, VU1 is the one with a direct path to the GIF, VU1 is the one which does 90% of all the geometry/lighting work, VU1 is the one which operates in micro mode... hell, go look at the SCE GDC slides - Sony had a rep there saying that VU0 was barely a blip on the PA graphs most of the time.
 
Toshiba owns the VU, SCEI doesn't. Why the hell do you think SCEI dragged Toshiba into the CELL business? Because Toshiba owns the VU.

And the Playstation is what hooks Toshiba?

Actually, is it precisely TOSHIBA, and not some sort of specially dedicated "spin-off"? Archie?
 
PC-Engine said:
Isn't NAOMI 3 supposed to be based on Series 5?

Isn't Naomi 3 Xbox based, and only Naomi in name? I thought it was Chihiro.....

BTW, what is virtual texturing?
 
Fox5 said:
PC-Engine said:
Isn't NAOMI 3 supposed to be based on Series 5?

Isn't Naomi 3 Xbox based, and only Naomi in name? I thought it was Chihiro.....

BTW, what is virtual texturing?

NAOMI 3 is still in development; that's why I think it will be based on Series 5 ;)
 
Any? Now I definitely know you're full of crap! Paths are prioritized and don't all run simultaneously. Obviously a high priority path isn't going to stall a low priority path. The GIF can manage and arbitrate data on PATHs 1 and 2 just fine. The only real one you have to worry about is big block image transfers on PATH 3.

What does GIF stand for ???
 
Teasy said:
If DC does load all textures into VRam despite the fact that before they're loaded the GPU already knows which ones aren't visible, then that seems like a bit of a missed opportunity to me.
I made this clear in the previous post - VRam is the only memory PVRDC can see. If the texture isn't in there when it starts to render, it doesn't exist as far as the chip is concerned.

Archie said:
No, but it does take quite a while to do 8 texture lookups into external SDRAM modules
Erhm... I seem to remember trilinear on PVRDC required triangles to be set up twice. I could be wrong, but doesn't this directly contradict the notion of the rasterizer doing more than one texture per pass?

Deadmeat said:
You have to feed all routes or the GS stalls; it is as simple as that.
For someone who admitted to not knowing a lot of the more basic PS2 programming details, you sure like stating facts about the less-than-basic programming details.

Of course, if you are satisfied with low triangle counts.
Single VU is capable of sustaining 10-20MPoly/sec. It's not quite the NV2a high, but it does the job.
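(For scale, assuming the VUs run at the EE's ~294.9 MHz clock: sustaining 15 million vertices/sec leaves about 294.9M / 15M, or roughly 20 VU cycles per vertex, which is a believable budget for a transform plus simple lighting.)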

What happens when VU0 sits idle but your code keep feeding vertex data to VU1 only or the other way around??? It is a lot easier to take care of one crying baby than three.
Jesus, DM, with your obsession with polycounts and nothing but polycounts I would have thought you were a SCE representative (or were one in your past life or something).
Anyway, once you get past your obsession with benchmarking VU throughput, comparatively speaking, VU0 is actually quite awkward and slower to use for processing vertex streams than VU1. In other words, VU0's resources are better utilized in other ways - which is dictated by design. If they wanted you to mainly use it for vertices, they'd have bolted a pipe to the GS on it just like they did for VU1.

Well, the other consoles don't have hot spots that obvious.
No, other consoles just have fewer fanboy-propagated myths about alleged 'hotspots' (though they don't really escape having similar myths either). Which is ironic, really, as PS2 is also the one console with the most factual information released on the issues of real-world bottlenecks. Nonetheless, this doesn't stop "certain" people from ignoring those published facts and continuing the fictional mantras they started several years ago.

You can take out VU1 now and PSX2 will still function, the same isn't true if VU0 is taken out.
Actually it's the reverse.
If you take VU1 out, 99% of PS2 games would not work without significant reprogramming, and even extensive rearchitecting of the renderer in some cases.
Remove VU0 and the majority of titles would continue to work by simply switching one or two C++ headers and recompiling the code.
 
...

I made this clear in the previous post - VRam is the only memory PVRDC can see. If the texture isn't in there when it starts to render, it doesn't exist as far as the chip is concerned.
Except for the fact that PVR2DC had a lot more VRAM to begin with, and with texture compression it would have stretched beyond 20 MB.

For someone who admitted to not knowing a lot of the more basic PS2 programming details, you sure like stating facts about the less-than-basic programming details.
The way console hardware works is about the same everywhere. You have your custom devices; you process and prepare the data the way you like it, open the stream to the target device, and start dumping your data. It works pretty much the same across all devices, be it a GPU, a printer, a terminal, an HD, etc.

Single VU is capable of sustaining 10-20MPoly/sec. It's not quite the NV2a high, but it does the job.

I have yet to see a PSX2 title doing 10~20 million polys/s. Of course I have not seen the Faf racer yet.

Anyway, once you get past your obsession with benchmarking VU throughput, comparatively speaking, VU0 is actually quite awkward and slower to use for processing vertex streams than VU1.
I don't see why this would be, since both VUs are supposed to be identical. As long as you keep your VU program size and vertex list size to 4 KB, you can use either of them.

No, other consoles just have fewer propagated myths about alleged 'hotspots' (though they don't really escape having similar myths either).
Hotspots of Xbox and GC aren't as obvious and glaring as those of PSX2. I was able to pick out those hotspots on the day the PSX2 block diagram was released.

If you take VU1 out, 99% of PS2 games would not work without significant reprogramming
Why? Because present-day VU code and vertex lists are bloated and can't fit into VU0?

Remove VU0 and the majority of titles would continue to work by simply switching one or two C++ headers and recompiling the code.
VU0 cannot be removed; it is joined to the R5900 core like a Siamese twin joined at the brain, and was meant to be together with it from the beginning. It is quite obvious from observing the design what Toshiba intended: they wanted VU0 to be used by the CPU as a vector accelerator during the animation and physics calculation stage, then have it reset as a separate T&L engine in the next stage. While this was a logical usage, someone at SCEI was unhappy with the T&L rate and ordered the inclusion of VU1....
 
I think Jak and Daxter 2 may do over 10 million polys, and I wouldn't be surprised if there are a few others that do as well. GT4 probably will, as I think GT3 did like 6 or 8 million. However, I'd be surprised if any actually hit 20 million.
 
...

Claiming 10~20 million polys/s for VU1 is worthless; the SH-4 benchmarked 10 million polys/s itself. This is why SCEI was so determined to outrun DC by a substantial margin at all costs, even if it meant throwing in VU1 and destroying the programmability in the process. Of course, you don't have this problem if you choose to give up the use of VU0, but then PSX2 is only marginally faster than DC.

Having twin VUs meant little to no benefit in the real world; it was just a marketing ploy, for the sole purpose of claiming big marketing numbers.
 
Uhhh, if it was just a marketing ploy, why even bother putting in the extra hardware??? Just say the existing system will do better. Use some fancy term like, hmmm, "Blast Processing" to say it would crush the DC. If anything, you should agree that Sony's marketing department has the means to "fool" anyone, no matter what the claim.

Let's see: by your account, they added VU1 to enhance performance over what was possible with just the original VU0 (which would surely be subpar, given the state they left it in). They put in extra cache and a direct line to the GS for this [puts fingers in quote shape] VU1. Now it blows away the DC performance (in its given purpose) and is used without question in all PS2 game development. Sounds like a design "win" to me. They saw an opportunity for improvement and subsequently made a magnanimous touchdown. Yet you can only make it out to be indicative of a design flaw. Hmmm... :rolleyes:
 
Mmmm... burning master discs is fun... 8) gives me plenty of useless time to write longer posts than I ever would have considered otherwise.

Except for the fact that PVR2DC had a lot more VRAM to begin with, and with texture compression it would have stretched beyond 20 MB.
We weren't comparing it to anything; Teasy was just debating how textures are uploaded to VRam. But since you brought it up, I believe the uncompressed-equivalent texture pools on DC went up there to ~40 MB.
Incidentally, that's about what we use just for texture animations in one of our stages :devilish:

I have yet to see a PSX2 title doing 10~20 million polys/s. Of course I have not seen the Faf racer yet.
R&C is documented by PA at around that number. Don't know if you've actually seen it though.
On an unrelated note, I believe you DID see Fafracer screens sometime last year (originally as scans from a Korean mag).

I don't see why this would be, since both VUs are supposed to be identical. As long as you keep your VU program size and vertex list size to 4 KB, you can use either of them...
...Why? Because present-day VU code and vertex lists are bloated and can't fit into VU0?...
Memory is definitely a problem: vertex streams in use are optimized for VU1's layout, and some of the larger programs may very well take up the entire VU0 memory (like clip routines). Also, VU0 doesn't have its own path to the GS, forcing CPU interaction and making it impossible to operate as the true standalone T&L unit that VU1 is.

VU0 cannot be removed; it is joined to the R5900 core like a Siamese twin joined at the brain, and was meant to be together with it from the beginning.
VU0 is the 3rd of 3 coprocessors on the R59k core, and as far as I know the only one of the 3 required by design is COP0 (the control coprocessor). Both FPUs are rather optional :p
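(For what macro mode looks like in practice, a rough sketch of issuing COP2 ops inline from EE code. The mnemonics follow ee-gcc/gas conventions as best I remember them, so treat the syntax as illustrative; the pointers would also need 16-byte alignment for the quadword loads/stores:)

Code:
// VU0 as MIPS coprocessor 2, used inline like the FPU (COP1).
// Illustrative syntax - not verified against a real SDK toolchain.
static inline void vadd4(float out[4], const float a[4], const float b[4])
{
    asm volatile(
        "lqc2      $vf1, 0(%1)       \n\t" // load 128-bit operand into a VU0 reg
        "lqc2      $vf2, 0(%2)       \n\t"
        "vadd.xyzw $vf3, $vf1, $vf2  \n\t" // 4-wide add, all fields enabled
        "sqc2      $vf3, 0(%0)       \n\t" // store the result
        :
        : "r"(out), "r"(a), "r"(b)
        : "memory");
}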
 
Inventors: Suzuoki; Masakazu (Tokyo, JP)
Assignee: Sony Computer Entertainment, Inc. (Tokyo, JP)
Appl. No.: 048137
Filed: March 25, 1998

Foreign Application Priority Data
Mar 27, 1997 [JP] 9-074930

<snip>

What is claimed is:

1. An information processing apparatus comprising:

  • a plurality of processing units including a main processing unit and first and second vector processing units
 