PowerVR Series 5 to debut in 2003 using 0.13u

Apart from the usual pissing contests you two have a habit of having...

Geezus...I'm just having a difference of opinion here!

Give me a reasonable explanation (yes, considering their past track record) of what on earth they need 0.13um for?

Yes. Cheaper parts. That's what Img Tech always strives for, isn't it? They may simply "need" 0.13 not for physical limitation reasons, but for cost reasons.

As a Tiler it not only needs fewer transistors than an IMR by default, but they could even go for a 4-pipe/0.15um/DX9-compliant budget solution.

Wait a minute here. I have only argued against Series5 being "most likely BEYOND DX9." To be clear, I wouldn't be surprised if it is DX9...but I WOULD be surprised if it's beyond DX9.


Don't take that single bet this time around, you're going to lose it

Do you know what this bet would be?
 
Then either one of you is going to be pleasantly or unpleasantly surprised. Which of you would fall in which category would the bet then be about.

I put my money on Teasy, but I'm not betting any critical body parts to make that one clear :eek:

edit:

They may simply "need" 0.13 not for physical limitation reasons, but for cost reasons.

From the latest indications I can see a Radeon 9500 on 0.15um at just $179 at launch. It is DX9.0 compliant even if it turns out to have just 4 pipes.
 
I wonder if the move to .13u is to get the clock speed up.

Previous PVR chips have been clocked slower than IMRs at the same feature size, so if IMG are aiming for a high performance part they may choose .13u, even though .15u would suffice, simply to get a decent clock speed into the chip.

This probably makes sense given the drop in the cost of high speed memory - by the time series 5 is available DDR may cost little more than SDR, so they may as well release a chip that can make use of the available bandwidth.
 
I see John M's quote more as an admission that PVR were caught out before.

Missing just one "checkbox" feature is simply making a rod for your product's back, no matter how sensible it is to omit it. DX7 T&L wasn't much cop so they didn't bother... a decision that makes sense to me, but it made PR and marketing more difficult. Likewise high end cards don't make much money, but they do define the checkbox features and set the desirable brand name for the ultra low end volume segment. They really need full DX9.x compliance and a competitive part, or else I agree they may not get another shot.

Which would be a shame, since I don't think the potential of TBR is declining. TBR allows options for cheap FSAA and eliminates overdrawn pixel calculations. Even if newer games are computationally limited, the die space saved on a complicated memory controller could hold more vertex processors, etc.

I definitely won't be holding my breath this time, but it will be interesting to see if this switching in and out gubbins ties in with metagence.
 
Ailuros said:
I recall at least the KYRO having almost identical scores between front to back, back to front and random order sorting.

I don't think KYRO had any "serious" problems either, but since Series5 seems to be far more advanced and thus more complex, I'd rather wait and see.

I've personally waited for a high end Tiler for a long time. Here's to hope that they'll execute and release on time.

The only serious problem was it was too slow. ;)
 
Nagorak said:
Ailuros said:
I recall at least the KYRO having almost identical scores between front to back, back to front and random order sorting.

I don't think KYRO had any "serious" problems either, but since Series5 seems to be far more advanced and thus more complex, I'd rather wait and see.

I've personally waited for a high end Tiler for a long time. Here's to hope that they'll execute and release on time.

The only serious problem was it was too slow. ;)

Taking clock speed and number of pixel pipes into consideration, it was relatively fast.
 
The only serious problem was it was too slow.

Compared to what? As PC Engine already pointed out, find me a 175MHz, 2-pipe, SDRAM-equipped IMR that did better. Don't even mention anything like FSAA or you'd be lost.
 
Does it matter?
Maybe it didn't reach higher clocks because of the complexity of TBR?
Food for thought ...
 
Mariner said:
In the past, the transistor count of PVR chips has been much less than for IMR chips. If this continues to be the case then the chips should be much cheaper.

The reason Kyro2 has a small die size is its 2x1 architecture, isn't it? IIRC, it has 15 million transistors, and didn't even have T&L. The Radeon 9000 is 4x1, has DX8.1 pixel and vertex shaders, and is only 36 million transistors.

I don't think they really have much of a size advantage for comparable architectures, taking the above into consideration. They probably have a disadvantage, actually.

The question is how much lesser of an architecture can you use with a tiler instead of IMR. I don't think you really can use a much lesser architecture. Would a 4x1 tiler keep up with the 8x1 9700? Doubtful, but I suppose anything is possible...

EDIT: for clarity.
 
If they really mean full programmability then I doubt that a TBR would turn out smaller than an IMR, nor do I expect it to be much cheaper.

Does it matter?
Maybe it didn't reach higher clocks because of the complexity of TBR?
Food for thought ...

It matters when you want to compare it to alternative solutions of its time. Had it been clocked at 200MHz it would have been about 12-14% faster. A higher clockspeed would have yielded far less than doubling the pipelines and adding DDR while keeping it at the same 175MHz level.
 
Ailuros said:
The only serious problem was it was too slow.

Compared to what? As PC Engine already pointed out, find me a 175MHz, 2-pipe, SDRAM-equipped IMR that did better. Don't even mention anything like FSAA or you'd be lost.

So, when I'm getting 30 FPS in a game and getting slaughtered because I can't see what is going on, I could take consolation in the fact that my graphics card was more efficient? Bottom line was the Kyro was mid to low end and just couldn't compete on par with the Radeon or GF2. If I bought low end cards, it would be great...but I have somewhat higher standards. ;)

I'm not saying that everyone should buy high end cards...but since that's what I buy, the fact the card wasn't in that performance bracket obviously is a major factor against my buying it.
 
An interesting question: just how much of an advantage does a TBR design have over a modern IMR with bandwidth optimizations (HyperZ, LMA, etc)? Comparing the Kyro2 to an oldish TNT2 or Geforce2 will obviously not give a good answer, given that the TNT2/Geforce2 have no such optimizations. So instead I tried to find some benchmark results that could be used to compare the Kyro2 to a more modern card, like the Radeon9000 Pro. Comparing the specs on these two cards, we find that the Radeon9000 Pro has slightly above 3x the texel fillrate, pixel fillrate and memory bandwidth of the Kyro2. So to approximate the results of a hypothetical TBR specced like the Radeon9000 Pro, we can multiply the Kyro2 results by about 3.1. While seriously speculative, this is the closest comparison between a modern IMR and a TBR that seems to be possible at this time.

Nosing around anandtech for a while, I find the following results (all 1600x1200x32bpp, in order to attenuate the effect of T&L and CPU speed):
  • Quake3 high quality 'demo four': Kyro2: 30.8, Radeon9000 Pro: 82.4, Kyro2*3.1 = 95.5, TBR advantage: 16%
  • UT2003 high detail DM-Antalus: Kyro2: 13.6, Radeon9000 Pro: 20.5, Kyro2*3.1 = 42.2, TBR advantage: 106%
  • UT2003 high detail DM-Asbestos: Kyro2: 18.5, Radeon9000 Pro: 43.4, Kyro2*3.1 = 57.4, TBR advantage: 32%

The UT2003 benchmarks may be affected by the lack of cubemap support in the Kyro2.

Unless there are major optimizations possible in TBRs that are not already present in the (admittedly old) Kyro2 design, I'd say that a TBR that is supposed to compete with a modern IMR should have at least roughly the same specs as the IMR (clock speed, pipelines, texture units, etc) or it may well lose. Except possibly when doing Multisampling AA, where the tiler should have an additional advantage over a modern IMR (by a factor that I would estimate at about 1.5 for 4xMSAA).
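The scaling estimate above can be written out as a small sketch. The benchmark numbers are the ones quoted in the post; the 3.1x spec ratio is the approximation described there, not a measured figure:

```python
# Sketch of the estimate above: scale the Kyro2 scores by ~3.1x (the
# Radeon 9000 Pro's approximate fillrate/bandwidth advantage) and compare
# against the measured 9000 Pro scores. All fps figures from the post.
SPEC_RATIO = 3.1  # 9000 Pro vs Kyro2 in fillrate and bandwidth (approx.)

benchmarks = {
    # name: (Kyro2 fps, Radeon 9000 Pro fps), 1600x1200x32bpp
    "Quake3 HQ demo four": (30.8, 82.4),
    "UT2003 DM-Antalus":   (13.6, 20.5),
    "UT2003 DM-Asbestos":  (18.5, 43.4),
}

for name, (kyro2_fps, r9000_fps) in benchmarks.items():
    scaled = kyro2_fps * SPEC_RATIO             # hypothetical TBR at 9000 Pro specs
    advantage = (scaled / r9000_fps - 1) * 100  # TBR advantage in percent
    print(f"{name}: scaled Kyro2 = {scaled:.1f} fps, "
          f"TBR advantage = {advantage:.0f}%")
```

Running it reproduces the three percentages listed above (16%, 106%, 32%).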
 
Nagorak,

True. Yet backtrack to your initial comment. You get what you pay for. You don't expect to get high end performance at half the price, do you? Within measures of relativity it wasn't slow.

Bottomline was the Kyro was mid to low end and just couldn't compete on par with the Radeon or GF2.

Debatable. And that 30fps statement is wildly underestimated, especially against the initial Radeon. Unless we're debating 3Dmark scores here.
 
arjan,

That equation doesn't make sense even in theory. Apart from anything else you'd have to use the exact same settings a K2 uses for the 9000 in UT2003, since the first can't possibly reach high detail and I'm still unsure if that would make a supposed test accurate.

edit: most important how can comparing a software T&L with a hardware T&L case even make sense in the first place?

And the estimate on the Multisampling advantage comes from where?
 
arjan de lumens said:
So instead I tried to find some benchmark results that could be used to compare Kyro2 to a more modern card, like the Radeon9000 Pro. ...this is the closest comparison between a modern IMR and a TBR that seems to be possible at this time.

How about a KyroII vs. a Radeon256 SDR? Both have SDRAM (not sure about the width of the memory bus though), two pixel pipelines (but 1 vs. 3 TMU's), and similar clockspeeds (175 vs. 183... or did the Radeon SDR have a 166 clockspeed?). Also, the Radeon had HyperZ bandwidth saving techniques, unlike the 3dfx and NVIDIA cards of the time (but not nearly as advanced as the 8500 or 9700, of course).
 
Well, I said it was speculative... I picked the highest resolutions for which benchmark data were available in order to minimize the effect of hardware vs software T&L and CPU speed issues.

Where would I get the multisampling advantage estimate from? The Radeon 9700 (state-of-the-art IMR) seems to take roughly a 1.5x performance hit in most benchmarks when doing 4x multisampling - I presume that a well-designed TBR would take close to 0 performance hit with multisampling.
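That reasoning can be sketched numerically. The base frame rate below is hypothetical; only the ~1.5x IMR slowdown and the assumed near-zero TBR slowdown come from the argument above:

```python
# Sketch of the 4x MSAA estimate: an IMR takes a ~1.5x hit, a
# well-designed TBR roughly none, so the TBR gains ~1.5x relative
# advantage when MSAA is enabled. base_fps is purely illustrative.
base_fps = 90.0       # hypothetical no-AA frame rate for both designs
imr_msaa_hit = 1.5    # observed ~1.5x slowdown on a 9700-class IMR
tbr_msaa_hit = 1.0    # assumed ~no slowdown on a well-designed TBR

imr_fps = base_fps / imr_msaa_hit
tbr_fps = base_fps / tbr_msaa_hit
print(f"4xMSAA: IMR {imr_fps:.0f} fps vs TBR {tbr_fps:.0f} fps "
      f"-> relative TBR advantage ~{tbr_fps / imr_fps:.1f}x")
```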

Kyro2 vs Radeon256 SDR might make more sense, except that 3 TMUs per pipe would give the Radeon256 a rather major advantage in any multitexturing application.

edit:

The Radeon SDR seems to be rather severely bandwidth limited, so the 3 TMUs may be a rather small advantage; a quick look at anandtech benchmarks suggests that the Kyro2 mostly holds a 60-70% advantage over the Radeon SDR at 1600x1200.
 
Agreed on the T&L/high resolution part then. One part I must have overlooked in your earlier post:

I'd say that a TBR that is supposed to compete with a modern IMR should have at least roughly the same specs as the IMR (clock speed, pipelines, texture units, etc) or it may well lose

To that I can only agree.
 
Back on the subject of the memory interface, let's not forget the advantage of the TBR chip being able to get away with using a synchronised memory controller with no frills like Z compression and all sorts.

Further savings in cost and efficiency. ;)
 
Nagorak said:
I think the major selling point of PowerVR (breaking the bandwidth barrier) has basically been destroyed thanks to DDR memory and now faster and faster DDR memory. They developed a system for defeating a bottleneck that never developed, thanks to faster memory along with some HSR aspects incorporated in IMRs. The deferred rendering approach might actually be "better" and maybe the one to choose if one were able to choose which path 3D graphics initially followed. It is a smarter approach...

You are not the only one making the statement that memory bandwidth is less of an issue now and in the immediate future than it has been. I'd still question that assertion. From the Voodoo days to the R9700 of today, effective bandwidth and real-life performance have been very closely connected. The only example of a next-generation 3D-engine that is on the horizon (depending on where you stand, of course ;)) is DOOM3, and it makes much higher demands on fillrate than earlier games. So there is no real-life support for asserting that memory bandwidth isn't critically important. Add to that the general trends towards better filtering and anti-aliasing, higher complexity environments and greater amounts of mobile/interactive geometry.

I can't really see how one can make a general statement about memory bandwidth not continuing to be a critical enabling/gating parameter for the foreseeable future.

But now that IMRs are the standard and faster memory has made memory bandwidth less of a concern, there's just not enough going for Power VR anymore. I'll give them this: they have one more shot at breaking into the market. If this next chip does not compete favorably with ATi and Nvidia's chips, then that's it: game over.

In that case Nvidia and ATi will just keep pumping out IMRs until memory speed seriously becomes a problem and then promptly switch over to deferred rendering long after Power VR is out of the picture.

Again, is memory bandwidth less of a concern today than it ever has been? Gfx design has always been a case of balancing cost vs. performance.

The case for deferred renderers has never been about avoiding any absolute limits. The point has been that they might offer a cheaper way to achieve comparable performance (or higher performance at the same price point).
Deferred renderers have the technological/political disadvantage of not being entrenched, which means that applications will be designed to run as well as possible on IMRs. Deferred renderers will have to beat IMRs at their own game so to speak. Furthermore, you need both high production volumes and preferably multiple board manufacturers/suppliers to achieve low prices, which will be difficult for any newcomer trying to break into a market. Add to that how critical time to market is in the gfx business and things look rough. That hasn't stopped either SiS or Trident from trying though.

Since IMG doesn't operate in the same fashion as ATI or nVidia, I wouldn't be so quick to dismiss them. I agree that the total effort required to design and support a new 3D-gfx architecture is probably growing, and thus the financial stakes involved may become prohibitive. However, 3D-graphics is one of very few areas of current day PCs that aren't completely commoditized yet, and where manufacturers can hope for decent margins for some time yet. Not saying anyone will make money there, but at least it looks better than, say, going into harddrive or mainboard chipset manufacturing.

Entropy
 