Questions about PS2

Was the guard band too small for very large quads, or too slow...?

Both were issues. The guard band was only 11-bit (the PS1's was 12-bit, which was still way too small, but a lot better), and you had a choice between spending cycles on every quad line to get an imperfect fillrate saving for pixels outside the clip rectangle, or getting no fillrate saving at all.

But the bigger issue was that when quads did go outside the guard band, you couldn't arbitrarily tessellate them into smaller polygons to prevent it. Quads mapped to axis-aligned sprites, whose widths had to be a multiple of 8 pixels, no less, so while you could tessellate the vertices, you couldn't match them with exactly the right texture coordinates. Your best bet was probably to scale the quad to get the offending vertices inside the guard band and hope that the other vertices were all outside the clipping rectangle. The games with obvious edge distortion probably didn't even do this.
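Not PS2 code, but here's the shape of that trade-off as a minimal C sketch. The 11-bit signed range and the draw_quad_fast / draw_quad_clipped entry points are all made up for illustration:

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative constants only -- not real GS register values. An 11-bit
 * signed coordinate range would span roughly [-1024, +1023]. */
#define GUARD_MIN (-1024)
#define GUARD_MAX (+1023)

typedef struct { int x, y; } Vec2i;

/* Hypothetical renderer entry points, stubbed out for the sketch. */
static void draw_quad_fast(const Vec2i v[4])    { (void)v; puts("fast path"); }
static void draw_quad_clipped(const Vec2i v[4]) { (void)v; puts("slow path"); }

/* True if every vertex of the quad lands inside the guard band after
 * projection, so hardware scissoring alone can handle the clip rectangle. */
static bool quad_fits_guard_band(const Vec2i v[4])
{
    for (int i = 0; i < 4; i++) {
        if (v[i].x < GUARD_MIN || v[i].x > GUARD_MAX ||
            v[i].y < GUARD_MIN || v[i].y > GUARD_MAX)
            return false;
    }
    return true;
}

/* The dilemma in miniature: a quad that escapes the guard band forces you
 * onto a slower clipped path (or a rescale/skip hack, as described above). */
void submit_quad(const Vec2i v[4])
{
    if (quad_fits_guard_band(v))
        draw_quad_fast(v);
    else
        draw_quad_clipped(v);
}
```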
 
I found this.
http://gamingbolt.com/ps4-xbox-one-...bers-only-make-sense-for-building-random-data
“A great example of this was the old PS2: you had effectively 2MB of VRAM left after you stored your frame buffer. The Dreamcast had 8MB total and the Xbox had 64MB shared RAM. I remember at the time everybody going crazy about this trying to demonstrate the inferiority of the PS2 in comparisons. What was hard to explain at the time was the PS2’s DMA system was so fast that if you were clever enough, you could swap this out 16 times a frame and get an effective 32MB of VRAM.”
What does it mean? How is it done?
 
It simply means you could load 2 MB of data into VRAM 16 times per frame, giving access to 32 MB of data, but with only 2 MB accessible at any given moment. It's a quote clarifying that you had more than a static 2 MB of data to work with per frame.
 
But does it actually mean that there is 16 times more VRAM available? Does it have real benefits? And is something similar possible on any other system?
 
If you've got a 2MB framebuffer in VRAM, and 2MB worth of textures, you might tell the GS to draw a bunch of objects that make use of those textures. Then, when you want to draw objects that use other textures that currently don't exist in VRAM, you copy those other textures from main RAM to VRAM (overwriting the textures that had originally been in VRAM), and then begin drawing those new objects.
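As a rough sketch (in C, with placeholder names; dma_upload_to_vram, draw_batch, and Batch are illustrative, not real PS2 SDK calls), the frame loop that quote describes would look something like this:

```c
#include <stddef.h>
#include <stdio.h>

#define VRAM_TEXTURE_BUDGET (2u * 1024u * 1024u) /* ~2 MB free after the framebuffer */
#define BATCHES_PER_FRAME   16                   /* 16 swaps -> "effective 32 MB"   */

typedef struct {
    const void *textures;  /* texture set staged in main RAM */
    size_t      size;      /* must fit within VRAM_TEXTURE_BUDGET */
} Batch;

/* Stand-ins for the real DMA transfer and draw-list submission. */
static void dma_upload_to_vram(const void *src, size_t size)
{
    (void)src; printf("upload %zu bytes\n", size);
}
static void draw_batch(const Batch *b) { (void)b; puts("draw batch"); }

/* One frame: overwrite the texture area of VRAM, draw everything that uses
 * the textures now resident, repeat. Only one 2 MB set is ever accessible
 * at a given moment, exactly as the posts above describe. */
void render_frame(const Batch batches[], int count)
{
    for (int i = 0; i < count; i++) {
        dma_upload_to_vram(batches[i].textures, batches[i].size);
        draw_batch(&batches[i]);
    }
}
```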

People were arguing that Xbox had a big advantage in that you didn't need to bother with the copy operation since the GPU just used the 64MB main RAM, compared with PS2 juggling things in and out of its 4MB VRAM.

The quote is simply saying that the copy operation was pretty fast and thus this perceived PS2 disadvantage was being overblown.
 
So if we do some calculations, there's not 4MB but 34MB of VRAM in the PS2 in that regard? I mean a 2MB framebuffer + 32MB of textures.
 
No. It's 2MB of VRAM, and there's no point trying to fabricate another value. There's just more to the story than a simple metric. As with bandwidth, total system bandwidth, and many other metrics, the devil is in the detail.

Just to clarify, it's analogous to three situations:

1) There's a 2 litre bucket with a 2 litre per minute hose feeding it. You can fill, hold, and empty two litres a minute. If the hose is in fact a four way hose supplying four different coloured waters, you can hold up to 2 litres made out of any combination of those four colours.

2) There's a 32 litre bucket with a 32 litre per minute hose feeding it. You can fill, hold, and empty thirty-two litres a minute. If the hose is in fact a four way hose supplying four different coloured waters, you can hold up to 32 litres made out of any combination of those four colours.

3) There's a 2 litre bucket with a 32 litre per minute hose feeding it. You can fill and empty thirty-two litres a minute, but only hold two litres at a time. If the hose is in fact a four way hose supplying four different coloured waters, you can hold up to 2 litres made out of any combination of those four colours.

Example three has the same total throughput as example two, but much less storage. If there arose a case where you needed 8 litres of pink water in a bucket, example 3 just couldn't do it. You'd have to break the job into 4 lots of 2 litres of pink water, or find some other solution to your pink-water troubles.
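To put numbers on the pink-water case, here's a trivial C version of example three's workaround, with the bucket standing in for the 2MB texture budget:

```c
#include <stdio.h>

#define BUCKET_CAPACITY 2   /* litres the bucket (read: VRAM) holds at once */

/* If a job needs more "pink water" than the bucket holds, the only option
 * is to break it into bucket-sized pours -- i.e., multiple passes. */
int pours_needed(int litres_required)
{
    return (litres_required + BUCKET_CAPACITY - 1) / BUCKET_CAPACITY;
}

int main(void)
{
    printf("8 litres of pink water -> %d pours of %d litres\n",
           pours_needed(8), BUCKET_CAPACITY);   /* prints: 4 pours of 2 litres */
    return 0;
}
```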
 
Who needs that much pink water anyway? Real-world software usually uses green water the most.
 
Hm... Totally irrelevant to the thread, but since we're also rather geeky people around these parts, and this is a geeky little story, I'll share anyway:

I'm reading the Gamasutra article because I had an N64 once and rather liked the box, but I'm not understanding much of what it describes since I'm bad at maths and worse at computer programming. As I've only just recently started watching the Game of Thrones TV series, I find myself reading random sentences in King Robert's deep, raspy voice.

...Such as,
While I had never dealt with the N64's signal processor (the RSP) before, I knew that its vector nature and potential to run in parallel held the keys to improved performance.

Other characters played parts as well. I guess this means I should go to bed... :LOL:
 
Thank you for such a detailed explanation. Now I understand it. :smile:
 
Been wondering this for a while, but did developers find a decent way to fight the mipmap selection problem on the PS2?

In my understanding, the default method selected the mipmap level purely by distance and didn't take the tilt of the polygon into account.
This led to quite a bit of aliasing in many games.

Fafalada posted something about this on NeoGAF. Apparently it was doable, but no one bothered with it, since the noisier, incorrect filtering was considered aesthetically more pleasing.
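For anyone curious what "taking the tilt into account" means, here's a rough C sketch of the two selection schemes. The first loosely mirrors the GS's distance-driven LOD (l and k are simplified stand-ins for the hardware's L/K parameters, not the exact formula); the second is the derivative-based approach later GPUs use:

```c
#include <math.h>

/* Distance-only LOD, roughly what the GS does by default: the mip level is
 * a function of 1/Q (i.e. view depth), shaped by the L/K parameters. A
 * polygon seen edge-on gets the same mip as one seen face-on -- hence the
 * aliasing. (Simplified: l and k are floats here, not the hardware fields.) */
float lod_from_distance(float q, float l, float k)
{
    return log2f(1.0f / fabsf(q)) * l + k;
}

/* Tilt-aware LOD, as later GPUs compute it: measure how many texels one
 * screen pixel covers using the UV derivatives, then take log2 of that. */
float lod_from_derivatives(float dudx, float dvdx, float dudy, float dvdy)
{
    float rho_x = sqrtf(dudx * dudx + dvdx * dvdx);
    float rho_y = sqrtf(dudy * dudy + dvdy * dvdy);
    float rho   = (rho_x > rho_y) ? rho_x : rho_y;  /* texels per pixel */
    return log2f(rho);  /* higher -> smaller (blurrier) mip level */
}
```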
 
As we know, the PS2's CPU, the EE, is 300 MHz and has 6.2 GFLOPS. In theory, if it were 3 GHz, would it have 62 GFLOPS?
And second, the EE has 2 VU blocks. If it had 4 VU blocks, would that double the EE's power? And in general, is it all possible? In theory, of course.
 
As we know, the PS2's CPU, the EE, is 300 MHz and has 6.2 GFLOPS. In theory, if it were 3 GHz, would it have 62 GFLOPS?
Yes. 10x the clockrate results in 10x the peak theoretical FLOPS on any processor; a processor performs a fixed number of FLOPs per clock cycle.
And second, the EE has 2 VU blocks. If it had 4 VU blocks, would that double the EE's power? And in general, is it all possible? In theory, of course.
If you double the execution blocks, you double theoretical peak processing, but if you don't double up the rest of the system to support them, they'll idle.
 
Sony hired Liandry to design PS2 PRO confirmed.
You got it! :D

If you double the execution blocks, you double theoretical peak processing, but if you don't double up the rest of the system to support them, they'll idle.
So, again in theory. As we know, the EE is 10 million transistors, and the Nvidia Titan X is 12 billion transistors. That is 1,200 times more. So what if someone built a chip with 1,200 EEs inside and ran it at 3 GHz? It would be a chip with an amazing 74 TFLOPS of power! Of course, all that power couldn't be used, but it's still great.
Why didn't Sony just make a new EE for the PS3 with, let's say, 8 VUs + other needed stuff, at 3 GHz? It would already have been around 240 GFLOPS, just like Cell, but with only ~50 million transistors instead of 225 million. With 225 million it could be close to 1 TFLOPS. :D
 
Why didn't Sony just make a new EE for the PS3... With 225 million it could be close to 1 TFLOPS. :D
And think how much cheaper it would be than spending (combined with IBM) a billion dollars and a few thousand man-years on R&D!
Hmmm. They probably know something we don't.

As Shifty said, if you want the same design, but multiply performance by X, you need to multiply everything else by X.

Start with bandwidth: 2 VUs had 3.2GB/sec memory bandwidth in the EE. 8 SPEs in Cell had 25.6GB/sec - "only" 8 times as much.
To address this shortcoming, Cell's 8 SPEs had 2048KB on-chip RAM total, while the 2 VUs had 40KB total - that's over 50 times as much RAM!

Notice how the "cache" (RAM) has to grow faster than FLOPs to make up for limits in memory bandwidth?
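Putting the thread's own ballpark numbers side by side makes the point (the 240 GFLOPS figure for Cell is the one quoted above, not a datasheet value):

```c
#include <stdio.h>

int main(void)
{
    /* Figures quoted in this thread: bandwidth in GB/s, local RAM in KB. */
    double ee_bw   = 3.2,  ee_gflops   = 6.2, ee_local_kb   = 40.0;
    double cell_bw = 25.6, cell_gflops = 240, cell_local_kb = 2048.0;

    printf("EE VUs: %.2f bytes of bandwidth per FLOP, %.0f KB local RAM\n",
           ee_bw / ee_gflops, ee_local_kb);
    printf("Cell:   %.2f bytes of bandwidth per FLOP, %.0f KB local RAM\n",
           cell_bw / cell_gflops, cell_local_kb);
    /* Bandwidth per FLOP fell roughly 5x, so on-chip RAM grew ~51x to compensate. */
    return 0;
}
```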

Well, a similar thing happens with transistors and MHz: At 3GHz in 90nm (PS3), a VU needs many more transistors than it did at 300MHz in 250nm (PS2). This is because 90nm is only 3x faster than 250nm, so to get to 10x, you have to redesign the VU to use more pipeline stages. With more pipeline stages, you need more transistors, and different design techniques. In the end, a 3GHz VU would look a lot more like an SPE than a VU.

There are other problems with "multiplying by X". Power and heat were a huge engineering problem in Cell's design, and it required advanced design techniques that weren't used in the VU. The number of FLOPS available in Cell were limited by power consumption, and simply scaling up the VU would be even more limiting.

In other words, just keeping the VUs in the PS3 would result in worse performance than a new design could achieve. So, they made a new design.
 