View Full Version : Beyond3D 3Dlabs P10 Technology Preview


Dave Baumann
06-Jun-2002, 08:57
Well, we were slightly late to the 3Dlabs P10 party, but our P10 Technology Preview (http://www.beyond3d.com/articles/p10tech/) is now available. Hopefully it covers some details in a little more depth and gives an idea of how programmable this new 3D architecture actually is.

Also included is a short Q&A on Matrox's much-touted feature, Displacement Mapping: how the P10 can use its programmability to make such a feature available, and the status of its inclusion in OpenGL.

Read the full article here (http://www.beyond3d.com/articles/p10tech/).

Reverend
06-Jun-2002, 10:06
Being late is not disadvantageous if the final article is of an entirely different class to the rest. It really is to the disadvantage of other sites when they feel the need to compete with other sites for hits.

A thoroughly enjoyable and informative read, Wavey. Your best article yet :D

Now, answer my email re JC's interview :)

Pete
06-Jun-2002, 10:42
YARWIBB3D--Yet Another Reason Why I've Bookmarked B3D. :D

I must admit I skimmed through some of the technical parts, as my interest is mainly from a gamer's perspective, but it was well-written, thorough, and interesting. Thanks, Dave--another excellent effort.

P10's extreme programmability looks promising, but I, too, wonder if the tradeoff will be speed. I'm very interested in seeing how a consumer version of this board stacks up against Parhelia. I'm not sure if the 107 mil. transistor R300 and 120 mil. transistor NV30 are in the same class, though--speed-wise, and possibly even feature-wise (there's no point in having the ability to create a custom effect if it'll run prohibitively slowly).

Entropy
06-Jun-2002, 13:14
Me too! :D nice job.

Entropy

ushac
06-Jun-2002, 15:04
Great article! Thanks Dave!

Any idea how much of all that programmability will be exposed in the drivers? Will you be allowed different levels of control in the different lines (Wildcat, Oxygen, 3D Blaster)?

Regards / ushac

Randell
06-Jun-2002, 16:18
Dave,

Most of the early reaction to the P10 seems to have been: interesting, but the consumer version will be slower than Parhelia, NV30, R300.

However, your article brought home to me the bandwidth coupled with the parallelism in the arrays, and Parhelia has been announced at 220MHz.

Therefore in terms of 'IQ' performance (quality texture filtering, AA enabled) do you believe it will perform adequately and its programmability perhaps give it a longer shelf-life?

Dave Baumann
06-Jun-2002, 17:03
Hey, glad you’ve all found it interesting so far – I was concerned that it wouldn’t reveal more than the others already out there (although, in truth, I haven’t actually read all of any of the others, just bits).

Any idea how much of all that programmability will be exposed in the drivers? Will you be allowed different levels of control in the different lines (Wildcat, Oxygen, 3D Blaster)?

From the sounds of things, they will really only be making available what is exposed through DX9 and OpenGL 2.0 WRT programmability. OpenGL 2.0 will be the big one for that, as it will be a lot more flexible than current APIs. They may let some workstation programmers have access to a lower level if they really need it, but they will not be making a ‘P10 programming language’ as it were; this is very much left to the native programmability of OpenGL and what will be available in future generations of DX.

I think what 3Dlabs will be doing is writing a lot of shaders themselves. For instance, the card doesn’t have native support for anisotropic filtering, so they may write a shader routine that will appear as a driver applet option in the DirectX drivers (with the degree selectable); alternatively, where there is an OpenGL extension for something, say a multisample buffer, they will internally write a shader routine that maps onto the OpenGL extension.

So, I suspect that game developers will be very much limited to what 3Dlabs expose in the APIs and what the APIs themselves can do – it would seem that in the short term OpenGL developers could have more ‘fun’ playing with this chipset and seeing what can be done.

Therefore in terms of 'IQ' performance (quality texture filtering, AA enabled) do you believe it will perform adequately and its programmability perhaps give it a longer shelf-life?

Featurewise I don’t think the consumer version will have much trouble; it’s just a question of how far 3Dlabs/Creative will be able to develop the drivers to come up with shaders/algorithms to expose these features, and how fast they will perform.

From the size of the vertex array, assuming that the scalar processors don’t have any issues and there isn’t too much of a ‘flexibility trade-off’, it would appear that P10 will have roughly the same vertex throughput as Parhelia, and if ATi/NVIDIA opt to double the throughput of their current offerings then its vertex rate could be on par with that. The concern could be the pixel rate: with the equivalent of only 4 pixels per clock, that may be a little low. However, perhaps when you start combining that with multisampling and high-level filtering it could even out a little – for instance, if they code it right, the texture array could be working on one massively textured / filtered sample while the pixel pipes are being used to create all the multisample samples – something like that could level things a little.
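
To put rough numbers on the pixel-rate concern, here is a minimal Python sketch of the usual peak fill rate arithmetic (pixels per clock times core clock). The 4 pixels per clock comes from the paragraph above; the 220 MHz clock is only the Parhelia figure quoted earlier in this thread, borrowed as a reference point rather than a confirmed P10 clock, and the 8-pixels-per-clock part is purely hypothetical.

```python
def peak_fill_rate_mpixels(pixels_per_clock: int, clock_mhz: int) -> int:
    """Theoretical peak fill rate in megapixels per second."""
    return pixels_per_clock * clock_mhz


# Assumption: reference clock taken from the Parhelia announcement, not a P10 spec.
REFERENCE_CLOCK_MHZ = 220

# 4 pixels per clock, as discussed in the post above.
p10_like = peak_fill_rate_mpixels(4, REFERENCE_CLOCK_MHZ)

# Hypothetical wider design with 8 pixels per clock at the same clock.
wider_part = peak_fill_rate_mpixels(8, REFERENCE_CLOCK_MHZ)

print(p10_like, wider_part)
```

This only captures the theoretical peak; it says nothing about how multisampling or filtering work is shared between the texture array and the pixel pipes.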

I would expect NV30/R300 to outperform this in many cases in a game environment, personally, but I think really understanding the performance is a little difficult right now because there are so many variables when looking at performance and potentially so many ways to code this chip. So, to sum up: the 256-bit bus is a big help, I think vertex throughput will be good, and fillrate will be a concern, but it will depend on what you want to run, how you want to run it, and how 3Dlabs code the chip to do it!

Saem
06-Jun-2002, 19:49
Would the P10 be able to work on multiple triangles at the same time? If it's as programmable as it sounds, then it could pick up speed on triangles smaller than 4 pixels. Even if this isn't the case, chances are that its "texturing pipes" aren't in a fixed configuration and could also pick up a fair bit of speed.

Dave Baumann
07-Jun-2002, 02:23
Would the P10 be able to work on multiple triangles at the same time?

Because it works on an 8x8 'patch', I think this would almost certainly be the case.

Saem
07-Jun-2002, 02:35
I'm guessing there is a large amount of efficiency to be gained due to this. I no longer have any lingering doubt about P10 not performing adequately.

Anonymous
07-Jun-2002, 13:57
Excellent article !!!
Good to see 3DLabs going to the consumer market :)

pascal
07-Jun-2002, 13:59
That was me above. Too fast, forgot to log in.

Simon F
07-Jun-2002, 15:04
Interesting that they chose to make the geometry processor scalar rather than vector based as in DX.

Dave Baumann
07-Jun-2002, 15:08
Simon - Why do you say it's interesting? What ramifications would there be in choosing Scalar processors over Vec4 in your opinion?

Simon F
07-Jun-2002, 17:06
Simon - Why do you say it's interesting? What ramifications would there be in choosing Scalar processors over Vec4 in your opinion?
Interesting because of the tradeoffs. As you reported in your article, it sounds like they felt that having a smaller atomic unit (i.e. floats vs vector of four floats a la DX) gave a better utilisation of the floating point hardware. That is, not all steps in the geometry process require operations on length 4 vectors. (Some are length 3 and others merely scalar ops).
One other aspect would be that the "Swizzling" and masking fields in the DX8/9 instruction set would be eliminated, but you would have to have a lot more register address bits.

The downside to this approach, AFAICS, is that when you do want to do a vectorish op, say like a dot product calc, you'd have to use multiple instructions** as well as extra temporary storage. (** I'm assuming that they'd have a "1 operation" style instruction set). You'd have to compile (well, assemble really) a DX8/9 program into this instruction set (and thus probably have a lot more instructions) but then I suspect that chips probably have their own variations on the DX instruction set anyway (judging from the DX8/9 specifications). Of course, I believe they are the driving force behind the programmability of OpenGL 2.0, so I guess this wouldn't faze them.
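
A rough way to see the instruction-count tradeoff is the Python sketch below. It is illustrative only: neither function models the real P10 or DX8/9 instruction sets, and the scalar version assumes a plain "1 operation" style set with separate multiply and add and no fused multiply-add.

```python
def dp4_vector(a, b):
    """DX-style vec4 unit: the whole 4-component dot product is one instruction."""
    instructions = 1
    result = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
    return result, instructions


def dp4_scalar(a, b):
    """Assumed scalar unit: one arithmetic operation per instruction."""
    instructions = 0
    temps = []  # extra temporary storage for the partial products
    for x, y in zip(a, b):
        temps.append(x * y)  # one MUL per component
        instructions += 1
    acc = temps[0]
    for t in temps[1:]:
        acc = acc + t  # one ADD per remaining partial product
        instructions += 1
    return acc, instructions


print(dp4_vector((1, 2, 3, 4), (5, 6, 7, 8)))
print(dp4_scalar((1, 2, 3, 4), (5, 6, 7, 8)))
```

Under these assumptions the scalar path needs seven instructions where the vec4 path needs one; with a multiply-add instruction the gap would shrink, and a purely scalar operation would not leave three lanes idle as it does on a vec4 unit.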

As I said, it's interesting.

Anonymous
07-Jun-2002, 18:09
Scalar means it needs more bandwidth too but P10 has plenty of it.

IIRC one of the main reasons Cray 1 sold so well was because of the excellent scalar performance.
Pascal

PSarge
10-Jun-2002, 16:11
Scalar means it needs more bandwidth too but P10 has plenty of it.

How do you work that one out?

If something processes 1 vertex per clock, or you have 4 things each working on 1 vertex (4 total) but taking 4 cycles over it, then the throughput is the same. Both will have a net throughput of 1 vertex per clock, and therefore the same bandwidth requirements.

If anything the scalar version will need less bandwidth because it doesn't need to read/write any unused components of vertices, but I'm not sure if that's really significant.
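
The arithmetic can be sketched in a few lines of Python. The unit counts and cycle counts are the ones from the example above; the 32-bit float size and the idea that a vec4 unit always moves all four components of a 3-component attribute are assumptions made just for illustration.

```python
from fractions import Fraction

FLOAT_BYTES = 4  # assumption: 32-bit floats


def vertices_per_clock(units: int, cycles_per_vertex: int) -> Fraction:
    """Net throughput of `units` processors that each take `cycles_per_vertex`."""
    return Fraction(units, cycles_per_vertex)


def bytes_per_clock(throughput: Fraction, components_moved: int) -> Fraction:
    """Data moved per clock for one attribute at the given vertex throughput."""
    return throughput * components_moved * FLOAT_BYTES


vector_rate = vertices_per_clock(units=1, cycles_per_vertex=1)
scalar_rate = vertices_per_clock(units=4, cycles_per_vertex=4)

# A 3-component attribute: the vec4 unit is assumed to move a padded fourth
# component, while the scalar units touch only the three that are used.
vector_bw = bytes_per_clock(vector_rate, components_moved=4)
scalar_bw = bytes_per_clock(scalar_rate, components_moved=3)

print(vector_rate, scalar_rate, vector_bw, scalar_bw)
```

Both designs come out at 1 vertex per clock, and the scalar one moves slightly less data only if the padded component really is read or written on the vec4 side.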

XX
14-Jun-2002, 21:52
Dave,
I've just read your very interesting article and although I don't really have a lot of knowledge about these things :oops: I was surprised to hear that P10 won't be able to support all the features of OpenGL 2.0 :( Is this true? I thought it was supposed to be the other way around! Anyway, please clear that up for me!

Anonymous
15-Jun-2002, 00:58
Great article!

Yet another one that makes no sense whatsoever to the readership.

Ever considered taking in some feedback? You guys are certainly at least adequate in technology, but always at least miserable in authoring. Maybe listen to your audience to make communication right. Right?

Dave Baumann
15-Jun-2002, 12:06
I was surprised to hear that P10 won't be able to support all the features of OpenGL2.0 :( Is this true?

I asked about that recently and I’m still not sure exactly what P10 doesn’t support in OpenGL 2.0, or why. Quite a lot of it is down to resource management at the moment – each part they want to support means they have to write shader routines for the hardware, so at the moment they are picking things off on a priority basis. It may well be that eventually P10 could do everything in OpenGL 2.0, but it’s a question of how much performance it has.

Ever considered taking in some feedback? You guys are certainly at least adequate in technology, but always at least miserable in authoring. Maybe listen to your audience to make communication right. Right?

We do, which is more or less why the articles are like they are – i.e. look at the comments we received from the registered members of this forum who are regular readers of the site.

If you have anything relevant to say, there's nothing to stop you from registering and commenting in our Feedback forum.

XX
15-Jun-2002, 14:27
Let's just hope that all they have to do is write more shader routines. As you've said in your article, they should add support for 64-bit precision later. So maybe they'll do the same with other features of OpenGL 2.0 too!