Luminescent posted some questions to me, but I thought I might as well answer them in this thread:
Luminescent said:
Sorry to bother you, Simon, but I bugged Kristof about some of the questions below and he was not sure, given that he came on board PowerVR after the DC. After researching a bit about CLX on B3D, I found your bits of information to be the most informative and believable, given that you worked on the project. That said, I present to you the following set of questions:
I'll try to answer, but please understand it was a while ago and my memory is fading fast ;-) The information may not be 100% accurate.
In a recent thread about PowerVR2DC, I asked the following:
Luminescent said:
Simon F said:
Finally, when John said the tiling was all HW in CLX and part-software in PVR250, I think you should take that to mean that the 250 was more programmable.
Anyone care to expand on that statement? Does it mean CLX had a native hardware implementation for tiling while PVR250 relied partly on CPU software-side assistance, or that CLX had a hardwired (non-configurable) tiling implementation while the 250 had a configurable one?
Care to explain a little further?
Neon250 actually had a programmable module, i.e. a CPU. It wasn't a Vert. or Pix. shader or anything like that, but enough to move data around, make decisions etc etc before the 3D rendering took place.
The tiling calculations were still done in HW but it may be that it was more flexible than the fully hardwired CLX. I didn't work on the drivers so I can't be certain.
In addition, I read, in the Neon 250 specs listed in the thread above, that it supported a 32-bit floating-point Z-buffer. Is it the same for CLX?
I think they were both pretty closish to IEEE float.
Do all internal units of CLX use 32-bit floating point precision (I've read the texture and geometry setup engines do, but I'm not sure about the texture shader, etc.)?
Those bits would be some form of floating point, but things like texture addressing don't need to be anywhere near as precise as IEEE, so they would be smaller.
If there are int units, what range do they typically work with?
The RGBA colour buffers were just 8888.
Doesn't MBX work at FP precision internally (I believe I read it in an ARM or Intel whitepaper)?
Again it will depend on what part of the chip. They will usually be tailored to just have the right amount of precision for the job.
Because CLX is capable of Dot3, I assume that the texture shader sports some sort of combiner unit that can complete a dot product (although I'm not sure about how many components), is this right?
It was a dot product but not DOT3. It used polar coordinates, which require slightly more software set-up. Unfortunately, I changed my mind to go to cartesian coordinates too late for it to be put into CLX <shrug>. Mind you, there were some other nice features in the dot product unit. Go search for the patent if you're really curious.
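To illustrate the polar versus cartesian distinction, here is a minimal sketch of the same dot product computed both ways. The angle convention (elevation and azimuth, in radians) and the function names are assumptions made for this example only; they are not the CLX's actual encoding, which is not described in this thread.

```python
import math

def polar_to_cartesian(elevation, azimuth):
    """Unit vector for a direction given as two angles in radians (assumed convention)."""
    c = math.cos(elevation)
    return (c * math.cos(azimuth), c * math.sin(azimuth), math.sin(elevation))

def dot_cartesian(a, b):
    """Ordinary three-component dot product (the DOT3-style form)."""
    return sum(x * y for x, y in zip(a, b))

def dot_polar(e1, a1, e2, a2):
    """Dot product of two unit vectors given directly as (elevation, azimuth) pairs."""
    return (math.sin(e1) * math.sin(e2)
            + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))

# Hypothetical normal and light directions, expressed as angles.
normal = (0.3, 1.1)
light = (0.7, 2.0)
print(dot_polar(*normal, *light))
print(dot_cartesian(polar_to_cartesian(*normal), polar_to_cartesian(*light)))
```

For unit vectors the two forms give the same value; the difference is only in how the normal and light direction have to be prepared beforehand.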
Is the combiner configurable to allow for other sorts of color blending? How many cycles for a Dot3 instruction and a texture fetch, 2?
a) I guess so, e.g. I always wanted to do anisotropic translucency with the dot unit.
b) 1 cycle for the normal map fetch, dot calculation and blend with the current accumulation buffer. If you wanted the bump to have been applied to another texture, that would have cost another cycle (i.e. an earlier triangle).
The CLX sports two internal 32-bit buffers (FP?) for multipass, right?
They were 8888, and yes they allowed some interesting effects which couldn't be done with other architectures of the day.
This allows for it to maintain color integrity internally, but how many bits does its final framebuffer hold?
Whatever you program it to be. It could be 16 bit, (5:6:5 or 1:5:5:5) or 32. I think it also supported a genuine 24 bit mode (i.e. no wasted bytes).
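As a rough illustration of what those output formats mean for an internal 8888 colour, the sketch below packs 8-bit channels into 5:6:5, 1:5:5:5 and a packed 24-bit pixel. The bit ordering, the byte order of the 24-bit mode, and the use of plain truncation (no dithering) are assumptions for the example, not a statement of how CLX wrote its framebuffer.

```python
def pack_rgb565(r, g, b):
    """Truncate 8-bit channels to 5:6:5 and pack into a 16-bit value."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def pack_argb1555(a, r, g, b):
    """Truncate to 1:5:5:5; alpha keeps only its top bit."""
    return ((a >> 7) << 15) | ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)

def pack_rgb888(r, g, b):
    """Genuine 24-bit pixel: three bytes, no padding byte (byte order assumed)."""
    return bytes((r, g, b))

print(hex(pack_rgb565(255, 128, 0)))
print(hex(pack_argb1555(255, 255, 128, 0)))
print(len(pack_rgb888(255, 128, 0)))
```

The 16-bit modes discard the low bits of each channel on the way out, which is why keeping 8888 internally during multipass matters.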
Finally, after doing some research, I found the following post in which you stated:
Simon F said:
Lazy8s said:
The CLX's maximum sort depth is 60.
What? You can put as many polygons in (with different depth) as you like (memory permitting on CLX). There is no opaque "depth sort" limit because depth comparisons are done with an internal Z-Buffer.
There may or may not (it's too long ago for me to remember) be a limit to how many intersecting layers of translucent polygons it will per-pixel-sort, but you'll hit a practical performance limit, due to fill rate, first.
Are you sure there are no theoretical depth sort limits? If you don't remember, can you at least take a guess at whether there was a limit as to "how many intersecting layers of translucent polygons it will per-pixel-sort"?
I don't actually recall there being a stated limit, but I would think it'd be hundreds if there were one. As I said, if you genuinely had a couple of hundred layers of transparent polygons you'd be bogged down with a lack of fill rate first.
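The fill-rate argument can be put into rough numbers. The sketch below is back-of-envelope arithmetic only: the fill rate, resolution and frame rate are placeholder values chosen for illustration, not quoted CLX specifications, and the function name is made up for this example.

```python
def max_full_screen_layers(fill_rate_pixels_per_sec, width, height, frames_per_sec):
    """How many full-screen layers the fill rate can cover in one frame."""
    return fill_rate_pixels_per_sec / (width * height * frames_per_sec)

# Placeholder figures (assumptions): 100 Mpixels/s, 640x480, 60 frames per second.
layers = max_full_screen_layers(100_000_000, 640, 480, 60)
print(f"about {layers:.1f} full-screen layers per frame")
```

With figures of that order the budget is a handful of full-screen layers per frame, far short of hundreds, so any sort limit in the hundreds would not be the first thing you ran into.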
Out of curiosity, approximately how many transistors was the CLX made from?
No idea but I think the chip may have been somewhere between 1 and 1.5 cm^2 (again it was a while ago). You'd have to then look at the technology and work out how many gates/transistors you could fit in that and round down a bit etc.