What would happen if PowerVR's next PC chips have free FSAA?

notAFanB said:
One thing for sure is that PowerVR are not just sitting on the technology; they are using it right now, and now that the mobile chip is all but done I guess they will move some of the team over to the PC chip team.

But you're right, it does look like they are redirecting towards their more profitable market (and frankly the one where they have the advantage right now).

Right. They have two advantages: lower bandwidth requirements and better performance per watt. Both of these are key issues in mobile applications.

However, power is also going to be an important parameter in this and the upcoming generation. NV40 burns 75W! Because of the two-slot cooling I can't put one in my XPC :(

Why Intel hasn't bought them yet is beyond me. Instant numero uno player in mobile graphics and the ultimate chipset core for integrated (as well as discrete) PC solutions.

Cheers
Gubbi
 
Also realize that one of the complaints people once had about deferred rendering was the space needed to store the binned geometry.

I recall Mr. Sweeney calculating that a complicated Unreal scene wouldn't fit on a 32 MB card....

Well, we have much more memory space now, and one other big factor related to AA.

All forms of AA so far on "traditionals" require a large memory footprint. For a 32-bit 1600x1200 framebuffer with 6x AA you need an extra 5x4x1600x1200 ≈ 36 MB of RAM just for the back buffer. Double that if you add the z-buffer. Current compression schemes mean that this memory is sparsely used, but it all has to be allocated and reserved.
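
To make that arithmetic explicit, here's a quick back-of-the-envelope calculation (assuming 4 bytes per colour sample and per depth sample; exact sizes will vary with allocation granularity):

# Back-of-the-envelope check on the 36 MB figure.
width, height, bytes_per_sample = 1600, 1200, 4
aa = 6
MB = 1024 * 1024

base = width * height * bytes_per_sample               # 1x back buffer, ~7.3 MB
extra = (aa - 1) * width * height * bytes_per_sample   # the 5x4x1600x1200 term
print(extra / MB)       # ~36.6 MB extra for the back buffer alone
print(2 * extra / MB)   # ~73 MB once a multisampled z-buffer is added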

On a PowerVR or Gigapixel AA scheme, there is NO z-buffer at all, and no extra space requirements for FSAA.

This leaves tons and tons of space for geometry.

You are left with the same basic advantage that deferred rendering has always had: The need to only process visible pixels and access texture data for these visible pixels.

Only with a z-only pass and perfect front-to-back sorting can this even be approximated, and even then there is a lot of waste. Such sorting is expensive (much more so than binning), and traditionals are not so good at the state changes required for sending triangles in perfect front-to-back order.

Well, that's my two cents. FSAA for free, using much less memory, significantly offsets the only disadvantage a tiler has: the memory space and bandwidth needed for geometry.
 
As far as PowerVR not being considered for consoles because of a lack of PC presence...

I am VERY confident that if SEGA were to decide to release another console tomorrow they would use an Imagination Tech design. SEGA loves those guys and their hardware. ;)
 
They have an existing relationship; originally IMG got their foot in the door because of the pre-existing relationship with NEC (making PC products).
 
IMG has awesome technology, and the SEGA devs have become very good with it because of the Dreamcast. PVR chips have the best price/performance ratio, so it is not a surprise that SEGA is continuing the relationship with IMG.

What I find surprising is that neither of the next-gen consoles is using PVR chips, which, given the same raw specs as the other GPUs, would have an obvious advantage over the competition. :?

I am eagerly awaiting the N@omi 3 release now so we can see what the specs for this new PVR5 or custom chip are. I think it will be something very special if it is on the same level as the PVR2DC was when it was released. Usually if SEGA says it will be the most powerful chip available, they are right. They are somewhat conservative.
 
Scott C said:
All forms of AA so far on "traditionals" require a large memory footprint. For a 32-bit 1600x1200 framebuffer with 6x AA you need an extra 5x4x1600x1200 ≈ 36 MB of RAM just for the back buffer. Double that if you add the z-buffer. Current compression schemes mean that this memory is sparsely used, but it all has to be allocated and reserved.

At 1600x1200 with 6xAA, our buffer consumption is over 100 MB of local memory space. Going to 8xAA would have blown past 128 MB.

http://www.3dcenter.de/artikel/2003/11-06_english.php
 
Blah blah blah. Why do I have a feeling PowerVR Series 5 is going to turn into another Glaze3D or Rampage? Hasn't PVRS5 been on track for release since 2002 or something? :LOL:
 
Well, for a start, if Series 5 had been released already it would have been far and away the first VS and PS 3.0 compliant GPU, which is a bit of a tall order, is it not?

Also, AFAIK PowerVR MBX uses 32x32 tiles, whereas all the other chips definitely utilise 32x16 tiles.

As for the geometry footprint...

There are two buffers: one stores vertex pointers, the other stores, ermm... can't remember, had too much beer ;p

But once the first tile is fully binned you can start rendering it, so if objects are sent in the correct order (leftmost to rightmost, uppermost to bottom-most) then tile 1 can be rendered whilst the rest are still being binned. Once tile 1 is rendered its vertex buffer space can be re-used; after all, it would be odd for IMGTEC to patent a technique for doing this and then not use it in their latest technology revision.
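
Something like this, roughly (a toy sketch; the names are made up and it only illustrates the idea of recycling a tile's buffer space once it has rendered, not IMGTEC's actual implementation):

class Tile:
    def __init__(self, idx):
        self.idx = idx
        self.params = []          # binned geometry parameters for this tile

def run(objects, num_tiles):
    # objects: (first_tile, last_tile) spans, sorted by first_tile,
    # i.e. submitted in the "correct order" described above
    tiles = [Tile(i) for i in range(num_tiles)]
    next_to_render = 0
    for i, (first, last) in enumerate(objects):
        for t in range(first, last + 1):
            tiles[t].params.append(i)          # bin this object's parameters
        # any tile left of the earliest tile a later object can still touch
        # is fully binned, so it can render while binning continues
        horizon = min((f for f, _ in objects[i + 1:]), default=num_tiles)
        while next_to_render < horizon:
            tile = tiles[next_to_render]
            print(f"render tile {tile.idx}: {len(tile.params)} objects binned")
            tile.params = []                   # parameter space reusable here
            next_to_render += 1

run([(0, 1), (1, 2), (3, 3)], num_tiles=4)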

On top of that, the vertex buffer space is fixed in the driver (at least it is with the KYRO and everything before it). The KYRO uses 6 MB in total for both its buffers, which allows it to render some seriously high-polygon scenes. In fact the KYRO II was able to render the high-polygon scenes in 3DMark2001 using this 6 MB buffer and came just short of fully rendering them, missing a few triangles (though this is easily fixed by enabling the scene manager). There were something like 500,000 triangles in that scene. When my site is back up I will be able to find the exact number MadOnion gave me.

In short, the geometry storage space on PowerVR tech is not going to be a problem; it never has been and shall not be in the future, especially if developers tend towards TruForm-style curved surfaces and displacement maps.

Dave
 
Square tiles might be true for MBX; 32x32 I don't think so.


But once the first tile is fully binned you can start rendering it, so if objects are sent in the correct order (leftmost to rightmost, uppermost to bottom-most) then tile 1 can be rendered whilst the rest are still being binned. Once tile 1 is rendered its vertex buffer space can be re-used; after all, it would be odd for IMGTEC to patent a technique for doing this and then not use it in their latest technology revision.

I'm not sure which patent you mean; got a link for it?


Hasn't PVRS5 been on track for release since 2002 or something?

No, that was Series 4, which has been cancelled.
 
Ailuros said:
Scott C said:
All forms of AA so far on "traditionals" require a large memory footprint. For a 32-bit 1600x1200 framebuffer with 6x AA you need an extra 5x4x1600x1200 ≈ 36 MB of RAM just for the back buffer. Double that if you add the z-buffer. Current compression schemes mean that this memory is sparsely used, but it all has to be allocated and reserved.

At 1600x1200 with 6xAA, our buffer consumption is over 100 MB of local memory space. Going to 8xAA would have blown past 128 MB.

http://www.3dcenter.de/artikel/2003/11-06_english.php

I should have clarified a little. If you check my math, the 36 MB is the difference between no FSAA and 6x FSAA for one buffer. There are multiple buffers, and if you are triple buffering you need two big back buffers. That means more than 100 MB extra, and nearly 128 MB if you include all of it and not just the delta between no AA and 6x.

A TBDR requires no extra memory for AA, and no z-buffer at all, for a total saving of over 100 MB in the above case. Go up to 8x or 16x AA.... The extra space needed for binning compares favorably to the extra space a traditional renderer needs for SSAA or MSAA as samples per pixel increase.
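
Putting numbers on that (same assumptions as before: 4 bytes per colour and depth sample, two multisampled back buffers for triple buffering, plus a multisampled z-buffer):

def buf_mb(samples):
    # one 1600x1200 buffer at the given sample count, in MB
    return 1600 * 1200 * 4 * samples / (1024 * 1024)

for aa in (1, 6, 8):
    allocated = 2 * buf_mb(aa) + buf_mb(aa)   # 2 back buffers + z-buffer
    extra = allocated - 3 * buf_mb(1)         # delta versus the no-AA case
    print(f"{aa}x AA: {allocated:.0f} MB allocated, {extra:.0f} MB extra")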

Dave B(TotalVR) said:
There are two buffers: one stores vertex pointers, the other stores, ermm... can't remember, had too much beer

Heh, I don't recall exactly either, but my guess is an array of vertices and a list of indices into this array that represent all the triangles, strips, and fans (or curved surfaces, whatever) that are at least partially in the bin for the tile.
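
In other words, something like this (pure guesswork on the layout; all the names here are hypothetical):

from dataclasses import dataclass, field

@dataclass
class FrameGeometry:
    vertices: list = field(default_factory=list)   # one shared vertex array
    tile_bins: dict = field(default_factory=dict)  # tile -> triangle index lists

    def bin_triangle(self, v0, v1, v2, tiles):
        base = len(self.vertices)
        self.vertices += [v0, v1, v2]
        for t in tiles:                            # every tile the triangle overlaps
            self.tile_bins.setdefault(t, []).append((base, base + 1, base + 2))

geo = FrameGeometry()
geo.bin_triangle((0, 0, 1), (1, 0, 1), (0, 1, 1), tiles=[(0, 0), (0, 1)])
print(geo.tile_bins)   # both tiles reference the same three vertices by index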

Regardless of whether you can tell when all the triangles for a bin have been sent, you can always send and bin the next frame while the previous one is rendering. And the previous one can always deallocate as it goes.

I agree space won't be a problem, and furthermore, even in an extreme case without good optimizations, bandwidth won't be either. We might get, say, 20 MB of geometry that requires one write and two reads, for a total of 60 MB/frame. At 100 frames/sec, this is 6 GB/sec. Just the z-buffer traffic on a traditional renderer at high resolution with FSAA is equivalent. Obviously this is a horribly bad worst case with no geometry compression or other optimizations. If you consider the incremental geometry bandwidth needed over a traditional renderer when both are using vertex shaders and vertex buffers, the relative difference is even less important.
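
The same estimate in code form (just the arithmetic from the paragraph above):

# Worst-case parameter traffic: 20 MB of binned geometry,
# written once and read twice per frame.
geometry_mb = 20
per_frame = geometry_mb * (1 + 2)        # 1 write + 2 reads = 60 MB/frame
fps = 100
print(per_frame * fps / 1000, "GB/s")    # 6.0 GB/s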
 
Natoma said:
Blah blah blah. Why do I have a feeling PowerVR Series 5 is going to turn into another Glaze3D or Rampage? Hasn't PVRS5 been on track for release since 2002 or something? :LOL:

i hate u
 
Just found this document on the PVR dev site. I especially liked the last line of this:

Tile-based render parameters

"As complexity increases, tile-based renderers will run out of space to store scene parameters"

This assumption is usually raised when dealing with tile-based rendering. Because scene-capture renderers effectively "capture" the whole scene's parameters, some people fear that they will run out of memory to contain all the data.

A key aspect of PowerVR technology is parameter management. Minimising parameter data is an important part of the technology, since fewer memory transfers equate to more performance! Possible techniques to permit efficient storage of parameter data are culling, stripping and indexing. If need be, one could even store the parameters in AGP memory. Even if memory were to run out, there are different techniques to trade overdraw for parameter space while still retaining some degree of architectural advantage.

As an example, 3 MB of parameter data on KYRO II can contain more than 30,000 polygons. This figure depends on the type, size and distribution of polygons on the screen and doesn't take into account any internal conversion, so the end result is likely to be more.

Also, memory is getting cheaper and cheaper, so adding more memory onto graphics boards will not be a problem. There are 64 MB boards out there, and KYRO II already works extremely well with 32 MB of memory.

Disclaimer: This is theoretical argumentation and we are not necessarily discussing future hardware
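
For scale, the KYRO II figure quoted above works out to roughly 100 bytes of parameter data per binned polygon (my arithmetic, and an upper bound given the "more than" in the quote):

param_bytes = 3 * 1024 * 1024   # "3 Megs of parameter data"
polygons = 30_000               # "more than 30,000 polygons"
print(param_bytes / polygons)   # ~105 bytes per polygon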


Sounds a lot like my 'What Is PowerVR?' article in parts :D but it was written after it :LOL:

As for that patent, Ailuros, I think this is it; it was discussed on these forums too.

Dave
 
Thought that you were referring to that one, 'cause your description sounded a tad weird.

The fundamental difference in that patent is that the scene gets divided into macro tiles first and then into the usual micro tiles.

I'd say that this one is already in use on MBX; someone suggested that with smaller tiles (micro tiles in that case) binning space would increase. Not with that idea, and certainly not per macro tile. I also have severe doubts that the micro tile size remained rectangular past S3.

However, this patent isn't only quite old; it also deals with only one of the problems. IMHO parameter bandwidth is the smallest consideration, while other topics like vertex bandwidth are far more important.

Questions 39/41/42 as examples:
http://www.pvrgenerations.co.uk/cgi...iew/johndavid1011&printer=0&pagenum=4

Of course, don't expect to get an answer to questions like that, LOL. Oh, and I'd have a second look at question 34 if I were you ;)
 

With regard to question 39 in that interview, John Metcalfe responded:

39. TBDRs can have differing render states per pixel inside a tile and must be able to switch them quickly, because they process primitives somewhat out of order. This problem is amplified for pixel shaders, where a renderstate means there is suddenly a lot of data. On top of that comes programs and constants, fixed functions like texture bindings, etc. Is this a theoretical problem only, or is it something that actually requires special attention?

John: Yes, it requires special attention.

Ailuros, I was wondering if this patent solves that problem? Here's a snippet from the patent:

[0012] A specific embodiment of the present invention provides a pixel blending buffer on a graphics chip. It enables portions of a frame buffer or tile from a frame buffer to be accessed on a polygon by polygon basis. Large polygons are broken up so that they never exceed a predetermined size. Smaller polygons can be combined together to fill up the pixel blending buffer thereby improving the performance of the system.
 
Excellent find, pmac. I haven't had time to read through it, but from a quick glance it seems to be right on track.

I think I should check espacenet more often :rolleyes:
 
Scott C said:
Also realize that one of the complaints people once had about deferred rendering was the space needed to store the binned geometry.

I recall Mr. Sweeney calculating that a complicated Unreal scene wouldn't fit on a 32 MB card....

Well, we have much more memory space now, and one other big factor related to AA.

All forms of AA so far on "traditionals" require a large memory footprint. For a 32-bit 1600x1200 framebuffer with 6x AA you need an extra 5x4x1600x1200 ≈ 36 MB of RAM just for the back buffer. Double that if you add the z-buffer. Current compression schemes mean that this memory is sparsely used, but it all has to be allocated and reserved.

On a PowerVR or Gigapixel AA scheme, there is NO z-buffer at all, and no extra space requirements for FSAA.

This leaves tons and tons of space for geometry.

You are left with the same basic advantage that deferred rendering has always had: The need to only process visible pixels and access texture data for these visible pixels.

Only with a z-only pass and perfect front-to-back sorting can this even be approximated, and even then there is a lot of waste. Such sorting is expensive (much more so than binning), and traditionals are not so good at the state changes required for sending triangles in perfect front-to-back order.

Well, that's my two cents. FSAA for free, using much less memory, significantly offsets the only disadvantage a tiler has: the memory space and bandwidth needed for geometry.

Nice.

I understand why some developers may want a device that chews through triangles one at a time; it's hard not to dream about the high-end possibilities. But in the real world, you have immediate limits (no pun intended, lol) that destroy that illusion. Pesky things like bus speed, memory bandwidth, etc...

I'd wager that by the time geometry storage is that kind of an issue, it won't matter anymore for other reasons.
 