are "untraditional" architechtures a bad idea?

Let's not forget that, for a cost in silicon, TBDRs can do MSAA without spending any extra off-chip bandwidth or memory footprint.
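A rough back-of-envelope sketch of that claim (the resolution, overdraw and byte counts below are assumptions picked purely for illustration): on an IMR every covered sample has to travel over the external bus, while a TBDR keeps the samples in the on-chip tile buffer and only writes out the resolved pixel.

```cpp
#include <cstdio>

int main() {
    // Assumed figures for illustration only: 1024x768 @ 60 fps,
    // 4 bytes of colour + 4 bytes of depth/stencil per sample, overdraw of 2.
    const double pixels   = 1024.0 * 768.0;
    const double fps      = 60.0;
    const double bytes_px = 4.0 + 4.0;   // colour + Z traffic per sample
    const double overdraw = 2.0;
    const int    samples  = 4;           // 4x MSAA

    // IMR: every covered sample is read/written in external memory,
    // so traffic and footprint scale roughly with the sample count.
    double imr_gbs = pixels * fps * bytes_px * overdraw * samples / 1e9;

    // TBDR: samples live in the on-chip tile buffer; only the resolved
    // single-sample colour is written out once per pixel per frame,
    // so this figure is the same with or without MSAA.
    double tbdr_gbs = pixels * fps * 4.0 / 1e9;

    std::printf("IMR  external traffic ~ %.2f GB/s\n", imr_gbs);
    std::printf("TBDR external traffic ~ %.2f GB/s\n", tbdr_gbs);
}
```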
 
Up to now, at least in the performance sector, it's been the case that TBR computational overhead exceeds its bandwidth benefits, and BFRs have therefore remained dominant.

I'm not sure exactly what you mean by computational overhead, but if you mean, for instance, vertex bandwidth, that was one point that would cut into a TBDR's overall bandwidth benefit.

But from that point on you can only work with assumptions, and past implementations can no longer be used as paradigms, since no one knows what steps development has taken in the meantime.

I was hoping that someone would make this conversation a tad more interesting and ask a couple of pointed questions, so I'll shoot first:

Does anyone consider shader task switching in the pixel shader to be a problem for a TBDR?
 
And because of the way TBRs work, they can render to an internal buffer which has higher precision than the framebuffer itself. This results in better quality with many transparency effects. However, that way the end result is not quite the same as that of the DirectX reference rasterizer. ;)
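A toy illustration of why that extra precision matters once many transparent layers pile up (the layer count, alpha and blend mode below are arbitrary assumptions, and real blending hardware is rather more involved than this):

```cpp
#include <cmath>
#include <cstdio>

// Additively blend many very dim layers (think stacked glow/smoke) over black:
// once keeping the working value in high precision, as an internal tile buffer
// could, and once re-quantising to 8 bits after every pass, as happens when
// each pass has to go back through a conventional 8-bit framebuffer.
int main() {
    const int   layers    = 100;
    const float per_layer = 0.4f / 255.0f;   // each layer adds well under 1 LSB

    float hi = 0.0f;   // high-precision accumulation
    int   lo = 0;      // 8-bit accumulation, re-quantised every pass

    for (int i = 0; i < layers; ++i) {
        hi += per_layer;

        float dst = lo / 255.0f + per_layer;
        lo = static_cast<int>(std::round(dst * 255.0f));  // rounds straight back down
    }

    std::printf("high-precision buffer: %d/255\n",
                static_cast<int>(std::round(hi * 255.0f)));   // ~40
    std::printf("8-bit per-pass buffer: %d/255\n", lo);       // stays at 0
}
```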
 
xGL said:
http://www.anandtech.com/video/showdoc.html?i=1435&p=3

Despite the advantages that a tile based system offers, the method has come under fire recently. Most notably, the lead programmer at Epic Games, Tim Sweeney, recently mentioned that implementing a T&L subsystem on a tile based renderer was next to impossible.

The Naomi 2 arcade board offered hardware T&L alongside a PowerVR TBDR, and was shown in Fall 2000.

I won't comment on what Mr Sweeney is reported to have said in 2001; I'm sure you can make up your own mind.


Although I don't know how PowerVR does it, my guess is that hardware T&L works roughly as follows:

-Transform vertices to screen space (run vertex program... + screen clipping).
-Sort triangles by tile.
-Batch triangles by texture/fragment program.
-For each tile, send its triangle batches to the fragment program/rest of the pipeline.
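To make that concrete, here is a minimal sketch of the binning step (step 2 above), with comments showing where the other steps would slot in. This is my own construction with an assumed 32-pixel tile; real PowerVR hardware certainly differs in the details.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Vertex   { float x, y, z, w; /* ...varyings... */ };
struct Triangle { Vertex v[3]; uint32_t shader_id; };

constexpr int TILE = 32;                              // assumed tile size in pixels

struct Tile { std::vector<const Triangle*> bins; };   // per-tile "parameter" references

std::vector<Tile> bin_scene(std::vector<Triangle>& tris, int screen_w, int screen_h)
{
    const int tiles_x = (screen_w + TILE - 1) / TILE;
    const int tiles_y = (screen_h + TILE - 1) / TILE;
    std::vector<Tile> tiles(tiles_x * tiles_y);

    for (Triangle& t : tris) {
        // Step 1: vertices are assumed already transformed/clipped to screen
        // space (by the CPU or a hardware vertex unit -- the binner doesn't care).

        // Compute the triangle's screen-space bounding box.
        float minx = t.v[0].x, maxx = t.v[0].x, miny = t.v[0].y, maxy = t.v[0].y;
        for (int i = 1; i < 3; ++i) {
            minx = std::min(minx, t.v[i].x); maxx = std::max(maxx, t.v[i].x);
            miny = std::min(miny, t.v[i].y); maxy = std::max(maxy, t.v[i].y);
        }

        // Step 2: add a reference to the triangle to every tile its box touches.
        for (int ty = int(miny) / TILE; ty <= int(maxy) / TILE; ++ty)
            for (int tx = int(minx) / TILE; tx <= int(maxx) / TILE; ++tx)
                if (tx >= 0 && tx < tiles_x && ty >= 0 && ty < tiles_y)
                    tiles[ty * tiles_x + tx].bins.push_back(&t);
    }
    // Steps 3-4: each tile's list would then be walked from on-chip memory
    // (HSR, batching by shader, texturing/fragment work).
    return tiles;
}
```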
 
For KYRO the CPU does TnL, does it really matter if the CPU or a Hardware GPU TnL unit does the work ?

Instead of the CPU sending transformed (VS processed) vertices to KYRO a hardware vertex shader could do it... there is really no difference and there is no problem with the tile based design and vertex shaders or hardware TnL.

I think this all again boils down to the same old claimed issue of parameter storage where they reason that TnL of some form will blow the parameter storage size up due to increased numbers of vertices/polygons...
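For what it's worth, a crude back-of-envelope of how that storage scales with triangle count (every byte figure below is an assumption for illustration, not a number from any real chip):

```cpp
#include <cstdio>

int main() {
    const double bytes_per_vertex = 32.0;  // position + a few interpolants (assumed)
    const double verts_per_tri    = 1.0;   // assuming good strip/index reuse
    const double tri_list_entry   = 8.0;   // per-triangle tile-list overhead (assumed)

    const double counts[] = {25e3, 250e3, 2.5e6};
    for (double tris : counts) {
        double mb = tris * (verts_per_tri * bytes_per_vertex + tri_list_entry)
                    / (1024.0 * 1024.0);
        std::printf("%9.0f triangles/frame -> ~%6.1f MB of parameter storage\n",
                    tris, mb);
    }
    // Roughly: each 10x jump in geometry throughput is a 10x jump in the
    // scene buffer the chip has to hold before it can start rendering.
}
```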

K-
 
Kristof said:
For KYRO the CPU does TnL, does it really matter if the CPU or a Hardware GPU TnL unit does the work ?

Instead of the CPU sending transformed (VS processed) vertices to KYRO a hardware vertex shader could do it... there is really no difference and there is no problem with the tile based design and vertex shaders or hardware TnL.

I think this all again boils down to the same old claimed issue of parameter storage where they reason that TnL of some form will blow the parameter storage size up due to increased numbers of vertices/polygons...

K-

Perhaps it has to do with some difficulty in imagining that the fragment side of the GPU could be working on frame n while the vertex side works on frame n+1?? Otherwise my best guess is that the paraphrase is not exact or was taken out of context.
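If that is the worry, it seems answerable with nothing more exotic than double-buffering the binned scene data. A sketch of the idea (entirely my own illustration, not a claim about how any shipping part does it):

```cpp
#include <utility>
#include <vector>

struct BinnedScene { std::vector<int> tile_lists; /* parameter data, etc. */ };

struct DeferredPipeline {
    BinnedScene buffers[2];
    int front = 0;   // being rasterised (frame n)
    int back  = 1;   // being built (frame n+1)

    void submit_geometry() {
        // T&L (CPU or hardware vertex unit) feeds the binner,
        // which writes into buffers[back] for frame n+1.
    }
    void rasterise_tiles() {
        // Shading/HSR consumes buffers[front], tile by tile, for frame n.
    }
    void end_frame() { std::swap(front, back); }   // the two halves overlap, one frame deep
};
```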
 
xGL said:
http://www.anandtech.com/video/showdoc.html?i=1435&p=3

Despite the advantages that a tile based system offers, the method has come under fire recently. Most notably, the lead programmer at Epic Games, Tim Sweeney, recently mentioned that implementing a T&L subsystem on a tile based renderer was next to impossible.

Back then Sweeney didn't even expect the KYRO to be able to run UT2k3. I'd say that he has changed his opinion since then, albeit I wouldn't consider the game actually playable on it; then again, I can think of just as bad cases from hardware that claims far more advanced capabilities just on paper.

DaveH,

Perhaps it has to do with some difficulty in imagining that the fragment side of the GPU could be working on frame n while the vertex side works on frame n+1?? Otherwise my best guess is that the paraphrase is not exact or was taken out of context.

www.metagence.com

--------------------------------------

Anyone have a guess on my question, or did I touch on too hot a topic there? TBDRs and T&L is quite an ancient subject.
 
Ailuros said:
DaveH,

Perhaps it has to do with some difficulty in imagining that the fragment side of the GPU could be working on frame n while the vertex side works on frame n+1?? Otherwise my best guess is that the paraphrase is not exact or was taken out of context.

www.metagence.com

Just to be clear--I wasn't suggesting it would be a problem; just that that's the only conceivable reason I could think of why someone might think it would be. And I'm still surprised Sweeney would have said something like that.

Anyone have a guess on my question, or did I touch on too hot a topic there? TBDRs and T&L is quite an ancient subject.

I don't have a guess on your question...but I'd sure like to see one. Damn good question...
 
Ailuros said:
...

Does anyone consider shader task switching in the pixel shader to be a problem for a TBDR?

Well, what kind of shader management is implemented for IMRs now? Is the challenge in managing the large and complete set of shaders for a scene, where an IMR would just manage the shaders for the part of the scene currently being drawn? Would this necessarily introduce much in the way of a challenge? I think something like the P10's virtual memory would work towards this, and something like PCI Express would alleviate some of this issue along with such a system.

Come to think of it, wasn't the P9 (if that is the right model number) an illustration of this idea in practice? I forget the details at the moment.

Pretty late, so maybe I just missed something completely.
 
Well, I had a similar discussion with someone about it on another board and his considerations made sense, so I was hoping that someone here might have an idea. One of the suggestions was to do some extra binning for the PS shader switches (oversimplified), but I wasn't very fond of the idea unless I missed something down the line.
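To spell out what I understood the "extra binning" suggestion to mean (purely my own oversimplified sketch, with hypothetical helpers): once a tile's visible fragments are known, sort them by the pixel shader that owns them, so each shader is switched in at most once per tile rather than potentially once per triangle. The obvious cost is the extra sort and per-fragment tag storage, which is partly why I wasn't fond of it.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// After hidden-surface removal a tile holds its visible quads, each tagged
// with the pixel shader that produced it.
struct ShadedQuad { uint32_t shader_id; uint16_t x, y; /* interpolants... */ };

void shade_tile(std::vector<ShadedQuad>& work) {
    // Group the tile's work by shader so switches happen once per shader per tile.
    std::stable_sort(work.begin(), work.end(),
                     [](const ShadedQuad& a, const ShadedQuad& b) {
                         return a.shader_id < b.shader_id;
                     });

    uint32_t current = UINT32_MAX;
    for (const ShadedQuad& q : work) {
        if (q.shader_id != current) {
            current = q.shader_id;
            // load_pixel_shader(current);   // the expensive switch, now rare
        }
        // run_pixel_shader(q);              // hypothetical per-quad shading
    }
}
```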

And I'm still surprised Sweeney would have said something like that.

I can't believe you missed Sweeney's comments back then when the K2 was announced. It created quite a few heated discussions.
 
Ailuros said:
I can't believe you missed Sweeney's comments back then when the K2 was announced. It created quite a few heated discussions.

I didn't use to follow 3d very closely back then--only for the past 8 months or so...

Maybe I'll do a search if I get curious.
 
Some Tim Sweeney Kyro 2 quotes from the past:

Tim - It's a competent TNT2 class chip, and the sorting and alpha-testing artefacts of past generations seem to have been sorted out successfully. But, like every generation of PowerVR hardware before it, it's a day late and a dollar short. It lacks support for basic DirectX7 (yes, 7!) features like cube maps. The kyro developers are cool guys, so it pains me to say that this is just not a viable piece of hardware in the market it's trying to compete in.

1) The hardware T&L games are just beginning to come out now, which makes it a particularly bad time to spend $150 on a non-T&L graphics card. That's the flaw in using 1999 games to benchmark a 2001 graphics card: it ignores the larger issue of whether the card will be appropriate in the 18 to 24 months between a typical gamer buying the card, and when he buys his next 3D card.

That's the thing with these tile renderers, they've always run great with the older games, then had the compatibility and performance problems with newer games as they started coming out. I'm really sure that any game you buy 18 months from now will run acceptably well on a GeForce2 MX, but I have big doubts about that with Kyro II.

So, if your question is whether Kyro II runs UT and Q3 well, the answer is unquestionably YES, as proven by the benchmarks, it's really good at UT and Q3. If you're asking whether I think it's a good card for gamers to buy now, planning on being able to use it for games coming out over the next 18 months, then no, I just don't think that's a good idea.
 
xGL said:
http://www.anandtech.com/video/showdoc.html?i=1435&p=3

Despite the advantages that a tile based system offers, the method has come under fire recently. Most notably, the lead programmer at Epic Games, Tim Sweeney, recently mentioned that implementing a T&L subsystem on a tile based renderer was next to impossible.

Struth! Is it? Someone go and bring back all those Naomi2 systems!

Ailuros said:
Perhaps it has to do with some difficulty in imagining that the fragment side of the GPU could be working on frame n while the vertex side works on frame n+1?? Otherwise my best guess is that the paraphrase is not exact or was taken out of context.

www.metagence.com
:?: :?: Well, you've succeeded in confusing me :)
 
Sweeney was right when he said they weren't viable solutions: just look at the hard time the Kyros are having with any modern game which requires T&L.
 
xGL said:
Sweeney was right when he said they weren't viable solutions: just look at the hard time the Kyros are having with any modern game which requires T&L.

What are you talking about?
KYRO II performs well for a card which does T&L on the CPU.
 
xGL said:
Sweeney was right when he said they weren't viable solutions: just look at the hard time the Kyros are having with any modern game which requires T&L.
Are you being sarcastic?

Just because some highly intelligent coders have put a test in the application for "must have HW T&L" doesn't mean they should have. Many games are still fillrate limited, so letting a modern CPU do the T&L calculation often makes minimal impact. In fact, driver hacks are out there which tell the application that it does have HW T&L and then just let the SSE/3DNow hardware do the work.
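For anyone wondering what "the CPU does the T&L" amounts to in its simplest form, it is essentially the loop below (column-major matrix assumed, lighting omitted); an SSE/3DNow path does the same arithmetic, just several floats per instruction.

```cpp
#include <cstddef>

struct Vec4 { float x, y, z, w; };

// Multiply every vertex by the combined world-view-projection matrix.
void transform_vertices(const float m[16], const Vec4* in, Vec4* out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        const Vec4& v = in[i];
        out[i].x = m[0]*v.x + m[4]*v.y + m[8]*v.z  + m[12]*v.w;
        out[i].y = m[1]*v.x + m[5]*v.y + m[9]*v.z  + m[13]*v.w;
        out[i].z = m[2]*v.x + m[6]*v.y + m[10]*v.z + m[14]*v.w;
        out[i].w = m[3]*v.x + m[7]*v.y + m[11]*v.z + m[15]*v.w;
    }
}
```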
 
Well, you've succeeded in confusing me.

Let's see: I quoted the part of Dave's post where he says:

Perhaps it has to do with some difficulty in imagining that the fragment side of the GPU could be working on frame n while the vertex side works on frame n+1??

N in relation to N+1 just brought multithreading into my mind. Was I that confused?
 
Ailuros said:
N in relation to N+1 just brought multithreading into my mind. Was I that confused?
And the CPU may have moved on to N+2 :) I see what you mean now but linking to Meta/Metagence just threw me.
 
Tim S.'s comment had particular comic value because at the time the Kyro 2 ran UT better than any other chipset out there.

Cheers
Gubbi
 