Intel's new graphics core

pcchen

The newly announced i845G and i845GL have a new graphics core called "Extreme Graphics." According to Intel's whitepaper, it looks like a tile-based renderer, but I am not sure whether it is a deferred renderer (only visible pixels are rendered) or whether it just keeps a tile of the depth buffer and frame buffer inside the chip.

Some benchmarks show that it is not fast, though.
 
After reading the white paper more carefully, I found the following paragraph (p. 11):

One final benefit of the zone rendering architecture is that the fill rate requirements placed on the graphics device are reduced. Because pixels are not overdrawn in the frame buffer, the fill rate that is required to draw any scene is equal to the total number of pixels in the scene (depth complexity = 1). By contrast, a traditional 3D graphics architecture may have to redraw each pixel in a scene anywhere that triangles overlap. This means that if the average depth complexity in a scene is 3, meaning each pixel must be drawn 3 times, the fill rate required by a conventional 3D graphics architecture is 3 times the number of pixels in the scene. Thus the fill rate requirements for zone rendering will always equal the number of pixels at the current resolution, whereas the fill rate requirements for a conventional architecture will increase as a function of the depth complexity.

This suggests that it is a deferred renderer. :)
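
To make the whitepaper's point concrete, here is a tiny back-of-the-envelope sketch (the 1024x768 resolution is illustrative; the depth complexity of 3 is the whitepaper's own example):

```python
# Fill-rate requirement: conventional immediate-mode renderer vs. a
# deferred/zone renderer, per the whitepaper's depth-complexity example.

width, height = 1024, 768     # illustrative resolution
depth_complexity = 3          # average times each screen pixel is covered

pixels = width * height
conventional = pixels * depth_complexity  # every covered fragment gets drawn
deferred = pixels                         # only visible pixels get drawn

print(f"conventional: {conventional / 1e6:.2f} Mpixels/frame")  # 2.36
print(f"deferred:     {deferred / 1e6:.2f} Mpixels/frame")      # 0.79
```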
 
Deferred rendering or not, it's really slow: extremetech-article

Maybe they should have used S3TC/DXTC for the tests too, to lower the bandwidth demand for the textures. With 32-bit textures, the chip is clearly bandwidth-limited.

With a 16 KB Z-cull cache, as mentioned in the ExtremeTech article, the tile would be 64x64 pixels.
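
For what it's worth, the 64x64 figure follows if one assumes a 32-bit depth value per pixel (the per-pixel size is my assumption; the 16 KB cache size is from the article):

```python
# Tile-size estimate from the 16 KB Z-cull cache, assuming one 32-bit
# depth value per pixel.

cache_bytes = 16 * 1024
bytes_per_depth = 4            # 32-bit Z per pixel (assumed)

pixels_per_tile = cache_bytes // bytes_per_depth   # 4096
side = int(pixels_per_tile ** 0.5)
print(f"{side}x{side} tile")                       # 64x64 tile
```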
 
Mmmm, I may have missed a trick there. Intel have just sent me the new P4 for review and were going to send me an i845G board as well, but I asked for an 850E. Perhaps I'll review the 845G after this review is done.
 
Will it help IMG that Intel uses its own form of deferred rendering, or is Intel not a "force" in the 3D chipset arena, so that this "big guy" cannot help here?


By the way, I find it really annoying that the new 850E has no deferred rendering architecture, while the 845G chipset has one. The 533MHz FSB with 1066MHz RDRAM and 4.2 GB/sec of the 850E chipset would have allowed far higher 3D speeds than the 2.1 GB/sec bandwidth of the 845G chipset.
Maybe the bandwidth would have been high enough for two pipelines instead of only one as in the 845G chipset.


Link : http://developer.intel.com/support/graphics/intel845g/
 
The 850E is the high-end chipset, and the whole idea with the 850E (+RDRAM) is the CPU having all the bandwidth to itself. I see no point in having a low-low-low-end graphics core in a high-end chipset. Not even the bandwidth of the 850E/RDRAM, plus two pixel pipes, would have turned the integrated graphics into a high-performing core.

I must say I'm really disappointed with the 845G's 3D performance. I was expecting something in the league of the GF2 MX400, but it seems to be closer to the GF2 MX200 (which is limited by its 64-bit SDR SDRAM). I guess it may improve slightly with optimized drivers.

What the 845G needs is a large and fast texture cache, similar to the 1 MB one in the GameCube's Flipper chip.
 
The 533MHz FSB with 1066MHz RDRAM and 4.2 GB/sec of the 850E chipset would have allowed far higher 3D speeds than the 2.1 GB/sec bandwidth of the 845G chipset.

Heh. FYI, *officially* the i850E doesn't support PC1066, only PC800. Intel's own i850E boards have no facility for changing the RAM speed to use PC1066; they are stuck at PC800. Intel is leaving it to other vendors to unofficially support PC1066.
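
For reference, a quick sketch of where the PC800 vs. PC1066 bandwidth figures come from, assuming the usual dual-channel, 16-bit RDRAM configuration of the i850/i850E:

```python
# Peak-bandwidth arithmetic for dual-channel RDRAM on the i850/i850E.
# Assumes two 16-bit channels; PC800 transfers at 800 MT/s, PC1066 at 1066 MT/s.

def rdram_bandwidth(mtransfers_per_s, channels=2, bus_bits=16):
    """Peak bandwidth in GB/s (decimal GB, as the marketing figures use)."""
    return channels * (bus_bits / 8) * mtransfers_per_s / 1000

print(f"PC800 : {rdram_bandwidth(800):.1f} GB/s")    # 3.2 GB/s
print(f"PC1066: {rdram_bandwidth(1066):.1f} GB/s")   # ~4.3 GB/s, the '4.2' quoted above
```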
 
I think there are several possibilities for the slow performance of this graphics core:

1. lack of fill rate
2. lack of texture bandwidth
3. poor driver optimization

The first one is quite unlikely, since it has more theoretical fill rate than a Kyro when running dual-textured games. The second one is, IMHO, the most likely culprit. Note that it has to compete with the CPU for bandwidth. When running games in which the CPU itself requires a lot of bandwidth, this core may suffer a lot in performance. An extra texture cache (with a dedicated memory bus) could help a lot.

The third one is also possible. The JK2 benchmark in the ExtremeTech article shows superlinear scaling as the resolution is reduced, perhaps due to a badly optimized video driver.
 
pcchen said:
I think there are several possibilities for the slow performance of this graphics core:

1. lack of fill rate
2. lack of texture bandwidth
3. poor driver optimization

IMO it's No. 2 :

32 bit x 400 Mtexel x 4 x 0.33 / 8 = 2.1 GB/sec; so the texturing alone needs the complete available bandwidth. Therefore I assumed that a 4.2 GB/sec bandwidth would allow near full-speed rendering with this setup:

~100 Mpixel (400 Mtexel, but multitextured and with transparency) / (1024x768) x 75% efficiency (like the Kyro) = ~90 fps, instead of the 20-30 we see in the ExtremeTech benchmarks.
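
Spelling out that arithmetic (my reading of the factors: 4 bilinear taps per texel, and 0.33 as the fraction of taps that actually go to memory; neither is stated explicitly above):

```python
# Reproducing mboeller's back-of-the-envelope numbers. The meanings of
# the factor 4 (bilinear samples per texel) and 0.33 (fraction fetched
# from memory) are my interpretation, not stated in his post.

bits_per_texel = 32
texel_rate = 400e6           # 400 Mtexels/s peak
samples = 4                  # bilinear taps per texel (assumed meaning)
miss_ratio = 0.33            # fraction actually fetched from memory (assumed meaning)

tex_bw = bits_per_texel * texel_rate * samples * miss_ratio / 8
print(f"texture bandwidth: {tex_bw / 1e9:.1f} GB/s")   # ~2.1 GB/s

pixel_rate = 100e6           # ~100 Mpixels/s after multitexturing/transparency
efficiency = 0.75            # Kyro-like efficiency
fps = pixel_rate / (1024 * 768) * efficiency
print(f"estimated frame rate: {fps:.0f} fps")          # ~95 fps, in line with the '~90'
```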
 
1 pixel pipe / 2 TMUs? Would it really have added that much die size/cost if the numbers had been the other way around?

Will it help IMG that Intel uses its own form of deferred rendering, or is Intel not a "force" in the 3D chipset arena, so that this "big guy" cannot help here?

Can you define what you mean by "help" here? Do you mean "developer support/attention" increasing for deferred renderers?
 
Ailuros said:
Can you define what you mean by "help" here? Do you mean "developer support/attention" increasing for deferred renderers?

Yes. And also a better standing when it comes to new "TBR-unfriendly" DirectX/OpenGL features/implementations of features.
 
But wouldn't the presupposition for that be that PVR and Intel are completely identical in their approaches, beyond the mere fact that both use a TBR architecture?

The only game I can think of that had real and completely insurmountable problems with TBRs is Grand Prix 3*** (if there are more, feel free to add them), yet the developer has claimed in the past that GP4 will not have the same problems with KYROs.

Wouldn't it be safer to suggest that PowerVR has rather "smoothed the path" for Intel?

*** outside of the understandable limitations of the Kyro hardware itself.
 
Why should the implementation be the same?
All forms of deferred renderer have to store the polygons into a list first and then render the pixels tile by tile (at least that's what the Kyro and the 845G do). So both have the same sort of limitation.
In the past the biggest problem was that developers mixed 2D and 3D (I think that was the problem with GP3 too?); in the future, displacement mapping could be a big problem.
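
A minimal sketch of that store-then-render flow, just to make the shared limitation concrete (tile size, names, and data layout are all illustrative, not taken from either chip):

```python
# Minimal sketch of a tile-based deferred renderer's two passes:
# pass 1 bins every triangle into the tiles its bounding box touches,
# pass 2 shades each tile using only its own bin.

TILE = 32  # tile side in pixels (illustrative)

def bin_triangles(triangles, width, height):
    """Pass 1: build per-tile triangle lists; the whole scene must be
    stored before any pixel is drawn -- this is the shared limitation."""
    tiles_x, tiles_y = width // TILE, height // TILE
    bins = [[[] for _ in range(tiles_x)] for _ in range(tiles_y)]
    for tri in triangles:                 # tri = three (x, y) vertices
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        for ty in range(int(min(ys)) // TILE, int(max(ys)) // TILE + 1):
            for tx in range(int(min(xs)) // TILE, int(max(xs)) // TILE + 1):
                if 0 <= tx < tiles_x and 0 <= ty < tiles_y:
                    bins[ty][tx].append(tri)
    return bins

def render_tiles(bins):
    """Pass 2: visibility is resolved per tile in on-chip buffers, so
    external memory only ever sees each pixel once."""
    for row in bins:
        for tile_tris in row:
            # rasterize + depth-test tile_tris into on-chip tile buffers,
            # then write the finished tile out once
            pass
```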

Wouldn't it be safer to suggest that PowerVR has rather "smoothed the path" for Intel?

No. IMG/Kyro is only a rather small chunk of all 3D chips. Intel, on the other hand, has a large part of the chipset market. So developers had no incentive to please IMG, due to its small market penetration, but now with "big" Intel on the same side, the situation has changed.

Maybe (IMHO) this new chipset from Intel could even "help" VIA to support IMG and speed up the deal with ST. :)
On the other hand, the 845G is so slow that the inclination to go with a deferred renderer could even decrease (see, Intel was not able to make a fast embedded deferred renderer using UMA, so why should IMG be able to do it).
 
mboeller said:
So both have the same sort of limitation.
In the past the biggest problem was that developers mixed 2D and 3D (I think that was the problem with GP3 too?); in the future, displacement mapping could be a big problem.

In what way would displacement mapping be a problem? This is done in the vertex shader as part of the TnL work, which all sits before anything that is tile-based. Or are we again worried about storage space with increasing numbers of polygons?

K~
 
Kristof said:
mboeller said:
So both have the same sort of limitation.
In the past the biggest problem was that developers mixed 2D and 3D (I think that was the problem with GP3 too?); in the future, displacement mapping could be a big problem.

In what way would displacement mapping be a problem? This is done in the vertex shader as part of the TnL work, which all sits before anything that is tile-based. Or are we again worried about storage space with increasing numbers of polygons?

K~

Like Kristof, I don't see a problem with display-list storage space on tilers with respect to high polygon counts. I saw a demo of the Alien Terrain VDO benchmark on a K2SE and it was handling 60,000 polygons at very decent framerates. If it has the same amount of memory dedicated to binning space as the standard K2 (i.e. approx. 6 MB), then increasing this to 10 MB or above on a typical 64 MB card shouldn't be a problem at all. If VDO's next card carries 128 MB of RAM, then even larger chunks of space can be dedicated to it.
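
As a rough sanity check on those numbers (6 MB of binning space, 60,000 polygons per frame; this ignores multi-tile duplication and bin bookkeeping, so it's only indicative):

```python
# Rough per-triangle binning cost implied by the figures above.

bin_space = 6 * 1024 * 1024   # ~6 MB of binning space on a standard K2
polygons = 60_000             # polygons per frame in the demo

print(f"~{bin_space / polygons:.0f} bytes per binned polygon")  # ~105 bytes
```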
 
Kristof said:
mboeller said:
So both have the same sort of limitation.
In the past the biggest problem was that developers mixed 2D and 3D (I think that was the problem with GP3 too?); in the future, displacement mapping could be a big problem.

In what way would displacement mapping be a problem? This is done in the vertex shader as part of the TnL work, which all sits before anything that is tile-based. Or are we again worried about storage space with increasing numbers of polygons?

K~

Yes, of course. Even the small example from Matrox, using a 64x64x8bit displacement map, needs between ~17,000 polygons/frame (using mip-mapping) and more than 100,000 polygons/frame (the latter number will be needed for creatures like the alien in the Matrox presentation, because no mip-mapping is possible there).
Even with a small vertex size like the 18 bytes the PS2 uses between the EE and GS (1200 MB/sec / 66.6M verts/sec = 18 bytes/vertex), you have to store between ~300 KB and ~1.8 MB per frame, per displacement map. If you use more than one displacement map, the storage requirements are even higher. And so all the staggering compression ratios a displacement map gives you are gone (64x64x8bit = 4 KB <-> 100,000 x 64 bytes = ~6 MB; compression ratio = ~1500).
So that could very well mean that a deferred renderer will run out of memory and bandwidth when displacement maps are used heavily.
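
Reproducing that arithmetic (the polygon counts and the 18- and 64-byte vertex sizes are mboeller's figures; the one-vertex-per-polygon simplification is mine):

```python
# Displacement-map storage arithmetic from the post above.

map_bytes = 64 * 64 * 1          # 64x64x8bit displacement map = 4 KB
polys_min, polys_max = 17_000, 100_000

# Per-frame binning storage at the PS2-style 18 bytes/vertex
# (1200 MB/s / 66.6M verts/s), assuming one vertex per polygon:
print(f"{polys_min * 18 / 1024:.0f} KB .. {polys_max * 18 / 1e6:.1f} MB")
# ~299 KB .. 1.8 MB per frame, per displacement map

# Compression ratio lost if each expanded vertex costs 64 bytes:
print(f"ratio ~{polys_max * 64 / map_bytes:.0f}")   # ~1562, i.e. roughly 1500
```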
 
As you said, developers mixed 2D and 3D in an application in the past, and I recall only GP3 being the bad example of that; I'm not aware of any other game currently in development that uses a similar approach.

To that sentence:

And also a better standing when it comes to new "TBR-unfriendly" DirectX/OpenGL features/implementations of features.

Only GP3 came to mind. I've seen it pointed out more than once on these boards how "many" game-specific presets PowerVR has in its drivers (as for "incompatibilities" in general):

a) I'm not aware of any alternative vendor NOT using game-specific presets at all.

b) At this point the context of the variables is far more important than the number of presets. Count those that force a game/application to use an external Z-buffer, for instance, and then incompatibilities can be counted.

So that could very well mean that a deferred renderer will run out of memory and bandwidth when displacement maps are used heavily.

Since no one can predict what the specifications of tilers will be by the time displacement maps actually become mainstream, I wouldn't take the KYRO II too much into account.

Intel, on the other hand, has a large part of the chipset market. So developers had no incentive to please IMG, due to its small market penetration, but now with "big" Intel on the same side, the situation has changed.

My bad for not taking that perspective into account, then. It's just that I'm having a hard time considering someone buying an integrated chipset a "gamer". ;)
 