Beating Emotion Engine

PS2 was designed using performance analyser stats. They looked at what developers tried to do with the original PS and where the architecture failed. That's why the PS2, in many ways, is a super PlayStation. They share a lot of the same design philosophy, just like the PS3 shares many similarities with the PS2.
 
Squeak said:
PS2 was designed using performance analyser stats. They looked at what developers tried to do with the original PS and where the architecture failed. That's why the PS2, in many ways, is a super PlayStation. They share a lot of the same design philosophy, just like the PS3 shares many similarities with the PS2.
And if Sony followed the same naming convention as Nintendo, it would have been called the Super PlayStation.
 
swaaye said:
A console built out of that hardware would've been a formidable opponent to say the least.
A formidably expensive opponent you mean. Not to mention NV10 T&L was basically worthless, and pixel fill was something that could at best keep up with a DC.

Not sure I quite get your complaints about raw power vs. efficiency either - XBox had considerably higher raw numbers in just about every aspect except pixel fill.

I've said this before - and I still stand by it - the PS2's design oversight was the mipmap selection algorithm. If that had been implemented correctly on the GS, 99% of the IQ arguments would be gone, and with them, the core of its 'efficiency' arguments.
So with a small transistor upgrade to the GS you'd get something superior in every way, and I'm sure still considerably cheaper than the proposed AMD/NVidia Pippin in 1999.

Squeak said:
They share a lot of the same design philosophy, just like the PS3 shares many similarities with the PS2.
You're right on the first part - but I'm not so sure about PS3, IMO it kinda turns the whole PS2 design ideology upside down in most ways. I'll leave it to others to discuss if that's a good or bad thing though.
 
I know it's a little out of context of what we are talking about here, but what do people think of the PS2 vs the Dreamcast in FP etc.? Do any veteran developers who know the systems think the Dreamcast could have competed with or even beaten the PS2 if it had lived through a normal console life span? I believe the Dreamcast used tile rendering, correct? Not sure on the FP numbers etc.
 
The Emotion Engine's T&L wasn't fast (probably less than 25% of May 2000's 100-MHz ELAN), but it was very flexible, like a modern vertex shader.

The size of the PS2's chipset and the cost of its external RAM made it extremely expensive to manufacture. The Graphics Synthesizer was gigantic, yet it still couldn't afford the room for proper MIP level selection, let alone a competitive set of blending functions, per-pixel lighting support, or effective texture compression. Its embedded memory sacrificed a lot of the chip's space yet still wasn't enough to prevent the need for field rendering and dithering in several of the system's highest-profile games, to enable high 3D depth, or to exploit the fillrate at high resolutions. The system's lack of easy support for proscan, with no simultaneous interlaced/non-interlaced output, was also a step back in image quality. System bandwidth wastage necessitated the use of the most expensive brand of the most expensive RAM, Rambus DRDRAM.
 

Lazy8s said:
The Emotion Engine's T&L wasn't fast (probably less than 25% of May 2000's 100-MHz ELAN), but it was very flexible, like a modern vertex shader.

The size of the PS2's chipset and the cost of its external RAM made it extremely expensive to manufacture. The Graphics Synthesizer was gigantic, yet it still couldn't afford the room for proper MIP level selection, let alone a competitive set of blending functions, per-pixel lighting support, or effective texture compression. Its embedded memory sacrificed a lot of the chip's space yet still wasn't enough to prevent the need for field rendering and dithering in several of the system's highest-profile games, to enable high 3D depth, or to exploit the fillrate at high resolutions. The system's lack of easy support for proscan, with no simultaneous interlaced/non-interlaced output, was also a step back in image quality. System bandwidth wastage necessitated the use of the most expensive brand of the most expensive RAM, Rambus DRDRAM.

But the PS2 is now super-small and selling for $129 at a good profit, no? The Xbox is still big and expensive to make at $149. This is why the PS2 is a smart design, my friend. Yes, some things are not so easy for a new PS2 developer, but look at God of War 2: it is maybe the most amazing old-gen graphics. GT4 has amazing racing graphics at 60fps. So we can see cheap manufacturing cost and amazing graphics for the "flagship" games. Even the developer cost per game is not so much.
 
ihamoitc2005 said:
The Xbox is still big and expensive to make at $149. This is why the PS2 is a smart design, my friend.
The current size and cost of the Xbox have nothing at all to do with design and everything to do with contract language.
 
Fafalada said:
A formidably expensive opponent you mean. Not to mention NV10 T&L was basically worthless, and pixel fill was something that could at best keep up with a DC.

Not sure I quite get your complaints about raw power vs. efficiency either - XBox had considerably higher raw numbers in just about every aspect except pixel fill.

I've said this before - and I still stand by it - the PS2's design oversight was the mipmap selection algorithm. If that had been implemented correctly on the GS, 99% of the IQ arguments would be gone, and with them, the core of its 'efficiency' arguments.
So with a small transistor upgrade to the GS you'd get something superior in every way, and I'm sure still considerably cheaper than the proposed AMD/NVidia Pippin in 1999.


You're right on the first part - but I'm not so sure about PS3, IMO it kinda turns the whole PS2 design ideology upside down in most ways. I'll leave it to others to discuss if that's a good or bad thing though.

ummm there *was* a supposed Pippin2 spec floating around in 1996, to be the next-gen Pippin, but I think in this case you meant to say: the proposed AMD/Nvidia X-BOX, right ? ;)


also, are you saying that if the Graphics Synthesizer had a perfect/excellent mip-mapping implementation, all of the image quality problems with PS2 games would've been gone?

how would overall PS2 efficiency problems have been solved with just a proper mip-mapping implementation?

what about a faster bus/pathway from the GS to main memory (through the EE, right?) instead of the 1.2 GB/sec that it had?

what about a better AA implementation? what about single pass and/or loopbacks?


I'm not arguing, I'm just asking questions.

The PowerVR2DC had 10 million transistors and no eDRAM, so more transistors dedicated to logic than the GS - and yet only one pixel pipeline.

IMO, Sega and VideoLogic should've gone for at least 2 pixel pipelines, given that the Dreamcast was released in 1998 with the TNT and Voodoo2 SLI around.


It would've been interesting if Sony had dedicated, say, 20 million transistors to the GS's logic and kept the 4 MB of eDRAM, or even increased it to 8 MB. That would still have been under 100M transistors, but probably too difficult/costly to mass-produce. The GS should've been a 16:1 or an 8:2 configuration (pixel pipes : TMUs) instead of the 16:0 / 8:1 that it was.


Around 1999-2000, Rendition had its unreleased Verite V4400 with 12 MB of eDRAM. It was around ~125M transistors. I would've loved to see that in action, even if it didn't have 16 pixel pipes like the GS.


[/ramble]
 
image quality

Megadrive1988 said:
also, are you saying that if the Graphics Synthesizer had a perfect/excellent mip-mapping implementation, all of the image quality problems with PS2 games would've been gone?
[/ramble]

The PS2 has a funny design, so many developers are not so expert at PS2 development, but we can see many PS2 games have excellent image quality, no? So maybe it is not so much what is not possible on the PS2 but what is not easy for the developer. But I always think the GameCube has the best image quality of the old gen. Very clear graphics. The PS2 sometimes looks like sand and the Xbox sometimes looks blurry.
 
The GameCube has the awful, basically 16-bit, color depth limitation. It shows up in RE4 most obviously because of the game's overall dark theme. The fade-out of the death screen is particularly icky.

Xbox should look best considering the hardware. NV2x was liked a lot by PC users, especially NV25's aniso filtering.
 
Megadrive1988 said:
ummm there *was* a supposed Pippin2 spec floating around in 1996, to be the next-gen Pippin, but I think in this case you meant to say: the proposed AMD/Nvidia X-BOX, right ? ;)


also, are you saying that if the Graphics Synthesizer had a perfect/excellent mip-mapping implementation, all of the image quality problems with PS2 games would've been gone?

how would overall PS2 efficiency problems have been solved with just a proper mip-mapping implementation?

what about a faster bus/pathway from the GS to main memory (through the EE, right?) instead of the 1.2 GB/sec that it had?

what about a better AA implementation? what about single pass and/or loopbacks?


I'm not arguing, I'm just asking questions.
[/ramble]

I think Faf was saying the problems with PS2 efficiency were a perception caused by people observing problems in the image quality and attributing it to a general design problem with the machine (or alternatively, a few titles which go to lengths to fix the IQ and sacrifice other areas instead, to the same result - it seems like the machine has artificially low limits).

So I think I'm with Faf on this one - although there are many ways in which the GS could've been improved, if there was one single thing that could be done, purely in the interests of image quality, then mip-mapping is quite glaringly in need of fixing. And having done so, a lot of titles would look a lot better and many would probably run faster too.

More RAM would help a bit too, as it would make it easier to have a large framebuffer (for SSAA) and larger textures - though both of these things can be coded around more easily than the mipmapping.

More bandwidth wouldn't really help, IMO. Although it would be nice, I'm not sure how often it's a bottleneck, except in the case of a badly designed texture-fetching system.

There are other things I can think of that should be simple enough to be doable:

A fence mechanism for stopping and starting path3 transfers in a safe and asynchronous manner. The current path3 mechanism scares the life out of me, and is synchronous anyway. We can already inject values into registers on the GS side, all we'd need would be a comparator in the GIF. I believe another big factor in PS2 graphics looking worse than should be possible is the texturing - people either use wasteful schemes with a lot of stalls, use lower-res textures to avoid swapping, or burn CPU cycles with expensive interrupts (or a combination of all three!). Trivial to fix - the GIF just needs to be able to do "Wait until REGISTER > VALUE".
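(To make the proposal concrete, here's a minimal sketch of such a fence in C. Everything here is hypothetical - gif_t, fence_reg and both functions are invented names for hardware that doesn't exist; the real GIF has no such comparator, which is the whole complaint.)

[code]
#include <stdint.h>

/* Hypothetical fence, sketched in software. The GS side can already
 * have values injected into a register as packets retire; all the GIF
 * would need is a comparator that holds a path3 transfer until that
 * register passes a value. */
typedef struct {
    volatile uint32_t fence_reg;  /* written from the GS side as work retires */
} gif_t;

/* Render side: after the draw packets that finish reading texture
 * block n, append a packet that writes n into the fence register. */
void gs_signal(gif_t *gif, uint32_t n)
{
    gif->fence_reg = n;           /* stand-in for the register injection */
}

/* Transfer side: the "wait until REGISTER > VALUE" comparator. A
 * path3 upload that reuses block n's memory is prefixed with this, so
 * it starts only once the GS has retired the block - no interrupts,
 * no CPU time, fully asynchronous. */
void gif_wait(const gif_t *gif, uint32_t n)
{
    while (gif->fence_reg <= n)
        ;                         /* real hardware would stall the channel */
}
[/code]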

Let the blender do multiply on colour - OK, perhaps with the number of pixel units that would be expensive - but even if it ran at lower throughput with that enabled, it'd be better than the hideous multi-pass mechanisms, or using monochrome hacks.
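(Background for that one: the GS blend stage evaluates Cv = ((A - B) * C >> 7) + D per channel, where A, B and D each select from {source colour, dest colour, zero} and C selects from {source alpha, dest alpha, a fixed constant}. Since C can never be a colour, src*dst modulation can't be expressed in one pass. A toy model of the formula - the selector encoding and the helper itself are illustrative, not SDK API:)

[code]
#include <stdint.h>

/* Toy model of the GS per-channel blend: Cv = ((A - B) * C >> 7) + D.
 * a_sel/b_sel/d_sel: 0 = source colour, 1 = dest colour, 2 = zero.
 * c_sel: 0 = source alpha, 1 = dest alpha, 2 = the FIX constant. */
static uint8_t gs_blend(uint8_t cs, uint8_t cd, uint8_t as, uint8_t ad,
                        uint8_t fix, int a_sel, int b_sel, int c_sel, int d_sel)
{
    const int colour[3] = { cs, cd, 0 };
    const int weight[3] = { as, ad, fix };   /* alphas/constants only */
    int v = ((colour[a_sel] - colour[b_sel]) * weight[c_sel] >> 7) + colour[d_sel];
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);  /* clamp to 8 bits */
}

/* Ordinary alpha blending fits: ((Cs - Cd) * As >> 7) + Cd:          */
/* gs_blend(cs, cd, as, ad, 0, 0, 1, 0, 1);                           */
/* But a modulate, Cs * Cd, would need the C term to be a colour,     */
/* which no selector offers - hence the multi-pass workarounds.       */
[/code]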

DMA-out on VU0. I have no idea what Sony were thinking in having a super-fast processing unit you can DMA stuff into but not get it back out.

No L2 cache... thanks for basically crippling the CPU.
 
my dream PS2

Emotion Engine running at 400 MHz
- twin MIPS RXXXX cores - plenty of L2 cache
- twin Vector Units that are equal in structure & performance - plenty of scratchpad memory, few programming headaches
- MPEG2 decoding as per the real Emotion Engine
- built-in I/O processor (PS1 CPU)
- ~10 GFLOPS peak performance

128 MB of RDRAM - 6.4 GB/sec

Graphics Synthesizer running at 200 MHz
- 16 pixel pipelines, each as robust as an NV4-TNT or NV5-TNT2 pipeline
- 1 TMU per pixel pipeline
- many things done in a single pass (not having to redraw geometry) or via loopbacks
- a 4x FSAA unit that only demands some of the plentiful bandwidth and half the fillrate
- proper mip-mapping
- all the expected blending modes
- 8 MB eDRAM - 96 GB/sec bandwidth
- 32 MB external graphics memory - ~12.8 GB/sec
- a 4.8 GB/sec connection from the GS to the EE and/or to the 128 MB main memory

3,200 Mpixels/sec fillrate - 3,200 Mtexels/sec w/ trilinear filtering ;)

1,600 Mpixels/sec fillrate with 4x FSAA

audio: a custom audio processor capable of 5.1 in realtime - 8 MB audio memory, 128 channels

Lots of flaws in this "dream" PS2 spec, because it's not from the mind of a programmer, a chip engineer, or a game developer.

[/dream mode]
 
Megadrive1988 said:
my dream PS2

Emotion Engine running at 400 MHz
- twin MIPS RXXXX cores - plenty of L2 cache
- twin Vector Units that are equal in structure & performance - plenty of scratchpad memory, few programming headaches
- MPEG2 decoding as per the real Emotion Engine
- built-in I/O processor (PS1 CPU)
- ~10 GFLOPS peak performance

128 MB of RDRAM - 6.4 GB/sec

Graphics Synthesizer running at 200 MHz
- 16 pixel pipelines, each as robust as an NV4-TNT or NV5-TNT2 pipeline
- 1 TMU per pixel pipeline
- many things done in a single pass (not having to redraw geometry) or via loopbacks
- a 4x FSAA unit that only demands some of the plentiful bandwidth and half the fillrate
- proper mip-mapping
- all the expected blending modes
- 8 MB eDRAM - 96 GB/sec bandwidth
- 32 MB external graphics memory - ~12.8 GB/sec
- a 4.8 GB/sec connection from the GS to the EE and/or to the 128 MB main memory

3,200 Mpixels/sec fillrate - 3,200 Mtexels/sec w/ trilinear filtering ;)

1,600 Mpixels/sec fillrate with 4x FSAA

audio: a custom audio processor capable of 5.1 in realtime - 8 MB audio memory, 128 channels

Lots of flaws in this "dream" PS2 spec, because it's not from the mind of a programmer, a chip engineer, or a game developer.

[/dream mode]


That's not really a PS2 anymore once you do that.
You may as well have just listed infinity for all the specs; you're not talking about minor changes to the hardware there (like adding a little more RAM or upping the clock speed a little), you're talking about a completely new system that could likely compete with the Xbox 360 and PS3 to an extent.
 
Being able to account for a texture's incline automatically during MIP level selection would've addressed most of the complaints against the PS2's IQ, but other aspects of image quality still wouldn't have been high because, as impressive as embedding 4 MB of memory onto a chip was, the eDRAM wasn't adequate.

The PS2's chips and parts were already pushing the limits of cost and size, so any additions to them could've only been realized through fundamental changes to their architecture. That even basic functionality like reliable MIP level selection was sacrificed in later GS revisions shows the potential for extra room simply wasn't there. eDRAM was too costly a solution to the bandwidth issue.

Image anti-aliasing at 480p with a conventional renderer probably wasn't a realistic possibility in that era. The larger back buffer of supersampling certainly wouldn't fit in the limited amount of eDRAM that could be afforded, and working across an external bus could use up too much of the system's resources.
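(Rough numbers behind that, assuming a 640x448 frame with 32-bit colour and 32-bit Z - illustrative figures, not a real GS memory map:)

[code]
#include <stdio.h>

/* Back-of-envelope check: a 2x2-supersampled back buffer alone
 * overflows the GS's 4 MB of eDRAM, before the Z-buffer, the display
 * buffer or any textures claim a byte. */
int main(void)
{
    const double MB = 1024.0 * 1024.0;
    long plain = 640L * 448 * 4;   /* ~1.09 MB colour buffer       */
    long ssaa  = 1280L * 896 * 4;  /* ~4.38 MB colour at 2x2 SSAA  */
    long zbuf  = 1280L * 896 * 4;  /* another ~4.38 MB of Z        */

    printf("480 colour buffer: %.2f MB\n", plain / MB);
    printf("2x2 SSAA colour:   %.2f MB\n", ssaa / MB);
    printf("2x2 SSAA Z:        %.2f MB\n", zbuf / MB);
    printf("eDRAM available:   4.00 MB\n");
    return 0;
}
[/code]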
 
MrWibble said:
And having done so, a lot of titles would look a lot better and many would probably run faster too.
Exactly - it's the one characteristic that made by far the most noticeable difference (even to casual observers), and ironically one of the cheapest ones to change.

More RAM would help a bit too, as it would make it easier to have a large framebuffer (for SSAA) and larger textures - though both of these things can be coded around more easily than the mipmapping.
I'd settle for something cheaper to 'extend' memory. If the IPU had been architected as chain-DMA driven like the VUs, with the ability to output decoded macroblocks in slices rather than linearly (like the SPR channels do), we could have run it as another parallel process with no CPU overhead - and had a lot more compressed textures.

More bandwidth wouldn't really help, IMO. Although it would be nice, I'm not sure how often it's a bottleneck, except in the case of a badly designed texture-fetching system.
Agreed, bandwidth was never a real issue in the system.

The current path3 mechanism scares the life out of me, and is synchronous anyway.
Actually that was one thing that didn't really bother me - but I only implemented it with masking rather than interrupts. The overhead was almost nothing compared to Path2 transfers, and I already had sending organized so that replacement was relatively trivial.
I do agree proper stall control would have been "safer" though.

No L2 cache... thanks for basically crippling the CPU.
Well - you know, it could have been worse - just look at that other RISC-based chipset with only one bus...
 
Fafalada said:
Exactly - it's the one characteristic that made by far the most noticeable difference (even to casual observers), and ironically one of the cheapest ones to change.

Actually I'm not sure how cheap it would be - presumably they'd have to calculate the u and v derivatives for quads of pixels (like everyone else does). Considering just how basic and streamlined the GS pixel pipes are, that might represent a significant extra amount of logic.
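(For reference, the derivative scheme looks roughly like this in software form - a minimal sketch with invented names, not how any particular chip wires it up:)

[code]
#include <math.h>

/* Per-quad mip selection from u/v derivatives. uv[i] holds the
 * texel-space coordinates of a 2x2 pixel quad: 0 = top-left,
 * 1 = top-right, 2 = bottom-left. */
static int select_mip(const float uv[3][2], int max_level)
{
    /* Texel-space movement per one-pixel step in screen x and y. */
    float dudx = uv[1][0] - uv[0][0], dvdx = uv[1][1] - uv[0][1];
    float dudy = uv[2][0] - uv[0][0], dvdy = uv[2][1] - uv[0][1];

    /* Pixel footprint in texels: take the longer axis, so a surface
     * inclined away from the viewer still picks a coarse enough
     * level - the case the thread says the GS got wrong. */
    float rho = fmaxf(sqrtf(dudx * dudx + dvdx * dvdx),
                      sqrtf(dudy * dudy + dvdy * dvdy));

    int level = (int)floorf(log2f(fmaxf(rho, 1.0f)));  /* LOD = log2(rho) */
    return level < max_level ? level : max_level;
}
[/code]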

Of course there's probably some other junk in there that most people would happily trade off for it. Video-in was probably a total waste of space. The polygon edge-coverage stuff wasn't far behind (though if it actually *worked* it might convince me to leave it in!)

Actually that was one thing that didn't really bother me - but I only implemented it with masking rather than interrupts. The overhead was almost nothing compared to Path2 transfers, and I already had sending organized so that replacement was relatively trivial.

My two issues with the mask system were the stability (it seemed like every few months a new "fringe case" would pop up on the newsgroups where there was yet another bunch of NOPs we had to stick into the DMA list to appease a random undiscovered FIFO and stop things crashing), and the fact that it was synchronous.

The latter is a big efficiency drain IMHO, as you pretty much have to balance the texture upload timings with the rendering timings at the time you build the list, rather than while drawing. Inevitably you're probably just going to switch textures at fixed points (like between meshes, or double-buffering your available VRAM) and are going to get stalls at both ends during the actual render. Unless you get very lucky, you might well be spending 50% of your time not running in parallel when you could be.

To sort that you need at least triple buffering and also to move to interrupts - which limits the maximum size of texture a bit and chews CPU cycles. A fence system would've solved that for the price of a compare - and not even one that needed to be heavily pipelined nor parallel.... nor even fast...
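(A toy model of that 50% figure - the batch timings are invented, but they show the shape of the loss:)

[code]
#include <stdio.h>

/* With fixed, synchronous switch points, upload and draw serialize,
 * so a frame costs sum(uploads) + sum(draws). With an async fence the
 * two streams overlap, costing roughly the max of the two sums. */
int main(void)
{
    const int upload[4] = { 30, 25, 35, 30 };  /* invented batch costs */
    const int draw[4]   = { 28, 32, 27, 33 };
    int serial = 0, up = 0, dr = 0;

    for (int i = 0; i < 4; i++) {
        serial += upload[i] + draw[i];  /* stall at every switch point */
        up += upload[i];
        dr += draw[i];
    }
    int fenced = up > dr ? up : dr;     /* streams run in parallel */

    printf("synchronous: %d, fenced: %d (%.0f%% of the time lost)\n",
           serial, fenced, 100.0 * (serial - fenced) / serial);
    return 0;
}
[/code]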

Of course having fixed that, there's even less bandwidth for the poor CPU - but hey, look at the pretty graphics!
 
How much does the lack of L2 cache hurt the PS2? And how much did it hurt older machines like the N64, PSX, and Saturn, say?
 
MrWibble said:
Actually I'm not sure how cheap it would be - presumably they'd have to calculate the u and v derivatives for quads of pixels (like everyone else does).
Well I was basing this off PSP GPU, since it's basically a portable GS (with a lot of unnecessary features), and it did implement that (to an extent anyway).

At any rate I'd gladly throw out Video In and edge AA. I never got Sony's hardon with the latter anyway - yes we could make it work, but it was about as 'easy' to use as performing mipmap inclination correction per triangle.

MrWibble said:
My two issues with the mask system were the stability (it seemed like every few months a new "fringe case" would pop up on the newsgroups where there was yet another bunch of NOPs we had to stick into the DMA list to appease a random undiscovered FIFO and stop things crashing), and the fact that it was synchronous.
I read some of those stories, but well, maybe I just got lucky for once, because I didn't run into any big issues with mask control. In fact I never got a freeze-up because of syncing - basically once I moved from Path2 -> Path3 (which ended up being a simple matter of splitting the TEXFLUSH+FLUSHA sync points on the VIF1 channel into two lists), the standard rendering 'just worked'.
The only part where I encountered problems (random texture corruption) was with some SFX that violated standard pipeline rules a bit and also time-shared the texture-cache eDRAM area, so additional sync points were needed there.
But 99% of those problems went away once I discovered I had a sneaky Path2 transfer somewhere that was initiating BitBlt transfers of its own, corrupting the Path3 BitBlts at random intervals.

To sort that you need at least triple buffering and also to move to interrupts - which limits the maximum size of texture a bit and chews CPU cycles.
I dunno - my texture cache was a FIFO split into 256KB blocks (up to 4 of them) - I don't consider that much of a limitation given how many PS2 games limited most textures to 8KB per map.
Batching texture transfers to block size was imperative anyway - we could have ~1000 textures per frame, so syncing for every texture meant lots of unnecessary stalls even on Path2. With a 256KB block size you get ~20 sync points/frame, which is what - 480 bus cycles/frame of overhead compared to the Path2 version.
The interrupt version's overhead would be at least 10x higher (not to mention burning the CPU AND the bus), and anyway, masking was just simpler to implement.
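(Spelling that arithmetic out - the ~24 cycles per sync point is just what the 20-sync/480-cycle figures imply, and the 10x interrupt multiplier is Fafalada's estimate, not a measurement:)

[code]
#include <stdio.h>

/* Batching arithmetic from the post above, made explicit. */
int main(void)
{
    const int cycles_per_sync = 480 / 20;      /* ~24 bus cycles        */
    int per_texture = 1000 * cycles_per_sync;  /* sync on every texture */
    int per_block   = 20 * cycles_per_sync;    /* 256KB batches         */
    int interrupts  = per_block * 10;          /* plus CPU time         */

    printf("sync per texture:   %d bus cycles/frame\n", per_texture);
    printf("sync per block:     %d bus cycles/frame\n", per_block);
    printf("interrupt-driven: >=%d bus cycles/frame\n", interrupts);
    return 0;
}
[/code]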

I agree of course that a proper async stall control would've been better.
 
Though ports of Dreamcast games, downgraded graphically in many cases, sit among the PS2's catalog of titles, the DC's SH-4 CPU cost a sixth as much silicon as the Emotion Engine, and its CLX2 graphics chip plus memory controller likewise cost several times less silicon than the Graphics Synthesizer. Its memory was less expensive in both amount and type, it released over a year earlier, and it was discontinued before its game development had even half of a console cycle to mature.
 