Imagination Announces B-Series GPU IP (PowerVR)

Rootax

Veteran
So after the A series, the B series.

A nice presentation by AnandTech here:

https://www.anandtech.com/show/16155/imagination-announces-bseries-gpu-ip-scaling-up-with-multigpu

[Image: PowerVR B-Series overview slide]
 
What are the chances that this is manufactured on SMIC's N+1 process? I think it's meant to be a replacement for AMD/NV GPUs for the Chinese market ...

I guess Chinese ISVs will have no choice but to develop for the only GPU they'll have access to in the future. There should be a new proprietary low-level gfx API to complement the hardware as well ...
 
Wow, does this mean a return to the PC space for PowerVR?

Could be, eventually, yes, starting from China. The question is whether they can pull it off all by themselves. I would trust them to reach RTL with ease given their experience, but beyond that they might need third-party assistance if they don't add experienced engineering talent and/or invest quite heavily in R&D resources. I'm old enough to remember the Series5 aka PMX590 horror story and wouldn't want to see another potential high-end design end up as some sort of "do it yourself IKEA project for beginners"... something like an IKEA Xmas tree where you get a white box containing an axe and a map to the forest :p
 


I hope it's a first step at least.

1-I've always liked PowerVR, because when I started in the PC world the TBDR approach seemed revolutionary to me. My PCX2 was the smartest thing around, not rendering what you don't see :eek: (then I learned it was more complex than that, but, you know...).

2-I like competition. That's why I hope RDNA2 will be great, I hope Intel gfx will be great, and I hope PowerVR/IMG Tech can come back with quality products and drivers in the PC space.
 
While I understand what you're saying, IMG needs to make a profit again as a company. If it's about their survival, I couldn't care less if their IP just gets integrated into SoCs for electrical appliances, as one example.
 

It's always interesting to think back on those times. At the time MS was pursuing tile-based 3D rendering (Talisman) as well, due to the high price of RAM. Then the RAM market crashed and suddenly saving memory with TBDR wasn't the big win it could have been, and here we are today.

What is interesting is that in Microsoft's latest Hot Chips presentation for the XBSX, they highlighted how RAM prices aren't coming down the way they have in the past. This could mean there's a large opportunity coming up for TBDR-based solutions if they can drastically reduce the memory requirements for hardware-based 3D rendering.

Regards,
SB
 

Are you talking size or bandwidth? For me TBDR is mostly about bandwidth (and saving processing power), less about size.
 

The original impetus for PowerVR's architecture back in the '90s was the fact that memory costs were so high that no one could make an affordable consumer graphics card utilizing a lot of memory. That's why both MS and PowerVR and a few other companies were looking into TBDR, due to its non-reliance on having lots of memory.

Traditional rendering methods required LOTS of VRAM, and VRAM was expensive. So, getting good, performant hardware-accelerated 3D into the consumer space meant either...
  • Large amounts of RAM, limiting it to the professional space.
  • A different way of rendering 3D scenes. (TBDR, AGP texturing, etc.)
If not for the crash in memory prices, tile-based renderers would have proliferated in the consumer space and 3dfx would never have gotten out of the realm of professional workstation rendering. 3dfx caught a lucky break when memory prices started to crash in 1996, which allowed them to enter the consumer market. This allowed them to have an, at the time, unheard-of 4 MB of fast EDO DRAM on a consumer card. That large a quantity of RAM had been limited to the professional space due to the exorbitant prices associated with fast video memory prior to the memory crash in 1996.

This also meant that AGP texturing went the way of the dodo. There was no need to texture from main memory if, instead, you could just include tons of now relatively cheap VRAM on your video card.

And the rest is history.

...except, now we're coming again into a situation where RAM prices aren't coming down much over time. Unless something changes, this means TBDR approaches that minimize the need for onboard RAM may have an opportunity to upset the established order for 3D hardware renderers.

It's also why both Sony and MS have looked into ways to reduce the need for large quantities of fast VRAM. Loading textures needed in the next 1-3 frames from an SSD is one way to accomplish this... kind of like AGP texturing. Well, not really, but it's similar in that it's attempting to find ways around having to have lots of RAM.

Regards,
SB
 
Well, I rarely heard that the main benefit of TBDR was memory usage (by that I mean the quantity used). VideoLogic always pushed bandwidth and effective fillrate...
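
To put rough numbers on the size-vs-bandwidth question, here's a back-of-envelope sketch in Python. The resolution, byte counts and overdraw factor are assumptions, and it ignores compression, caches and texture traffic entirely; it only tries to show why at today's resolutions the bandwidth saving dwarfs the footprint saving, while in the '90s even a few MB of framebuffer was the real cost.

Code:
# Illustrative back-of-envelope numbers only; nothing here is measured
# from a real GPU.
WIDTH, HEIGHT = 1920, 1080          # render target resolution (assumed)
BYTES_COLOR = 4                      # RGBA8
BYTES_DEPTH = 4                      # 32-bit depth
OVERDRAW = 3.0                       # assumed average shaded fragments per pixel
FPS = 60

pixels = WIDTH * HEIGHT

# --- Footprint (size) ---
# An IMR keeps colour + depth for the whole frame in VRAM.
imr_footprint = pixels * (BYTES_COLOR + BYTES_DEPTH)
# A TBDR still needs the final colour buffer in memory, but depth can live
# in on-chip tile memory and never be stored out.
tbdr_footprint = pixels * BYTES_COLOR

# --- External traffic per frame ---
# IMR: every shaded fragment reads/writes depth and writes colour in VRAM
# (ignoring compression and caches, which is why this is only a sketch).
imr_traffic = pixels * OVERDRAW * (BYTES_DEPTH * 2 + BYTES_COLOR)
# TBDR: depth tests and blending happen in tile memory; each pixel of the
# tile is written out once at resolve time.
tbdr_traffic = pixels * BYTES_COLOR

MB = 1024 * 1024
print(f"IMR  footprint: {imr_footprint / MB:6.1f} MB, "
      f"traffic: {imr_traffic * FPS / (MB * 1024):.2f} GB/s @ {FPS} fps")
print(f"TBDR footprint: {tbdr_footprint / MB:6.1f} MB, "
      f"traffic: {tbdr_traffic * FPS / (MB * 1024):.2f} GB/s @ {FPS} fps")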
 
...except, now we're coming again into a situation where RAM prices aren't coming down much over time. Unless something changes, this means TBDR approaches that minimize the need for onboard RAM may have an opportunity to upset the established order for 3D hardware renderers.

It's not happening ...

Microsoft and Sony are going to politically pressure AMD into making their GPUs compatible with the IMR model because of their demands for "backwards compatibility" on console platforms. They are going to be fuming once they find out that AMD can't promise backwards compatibility to them anymore in the future ...

Nvidia's biggest advantage compared to other desktop vendors is their driver stack. Changing to a TBR/TBDR architecture would render large sections of their driver, consisting of over a hundred million lines of code, redundant overnight. On top of obsoleting their unique advantage, people would start proclaiming that AMD/Intel GPUs/drivers are better and more stable, since they would work better with thousands of existing applications. Other desktop vendors might just use this opportunity to seize 80%+ of the dGPU market share for themselves instead of cooperating with each other, since there's a better reward in undermining a leading competitor's strategy ...
 
If rendering always worked in absolutes, a genuine TBDR would never be forced in real time to act like an IMR, and the so-called IMRs would never defer a single fraction of a frame, let alone use any form of tiling.

Ironically: https://www.anandtech.com/show/10536/nvidia-maxwell-tile-rasterization-analysis and it's even funnier considering that the mentioned trend is growing rather than shrinking. Granted, it's only for a relatively small portion of geometry, but it remains a clever workaround for whenever they need it. This should not mean that I personally expect NVIDIA to change to a DR architecture, since there's no necessity for it either. It shouldn't be too hard to figure out that trying to mix the best of both worlds wherever necessary could solve quite a few problems. And no, I don't see where and why there are or should be any compatibility problems, even more so given that Radeons have used a form of hierarchical tiling for eons now.

IF (mark the big if here) China should truly use IMG as a vehicle for GPU IP wherever they need it, their market is going to be vast. China will have to pour more resources into IMG though, and IMG will internally have to learn how to execute, especially on the support/sw level. If, on top of that, Apple should continue to use architectural licenses from IMG and they're going to use TBDRs both for their Macs as well as i-gear, the volume here is equally massive, and there haven't been any compatibility problems for games and applications either, all the way from Apple's first MBX to the A13 GPU today. There are simply a few pitfalls for developers to avoid with TBDRs, which don't take any advanced sorcery to battle, and at worst the PowerVR driver circumvents them with a workaround.

Other than that past history: https://www.vogonswiki.com/images/f/f5/Nvidia_on_kyro.pdf
Same story, in a different tune.

Apple took the GPU design from IMG (was it starting with the A6/7 or later?) and has rewritten a large portion of it by now. One can call it an Apple GPU eventually; however, to the outside it's very much a TBDR which continues to use IMG's own texture compression algorithms, and that's why Apple renewed its architectural license with IMG months ago. To the best of my knowledge Apple has not yet licensed any form of Albiorix, and it will be interesting to see if they fetch a new architectural license for a future Albiorix C due to its built-in ray tracing capabilities.
 
Ironically: https://www.anandtech.com/show/10536/nvidia-maxwell-tile-rasterization-analysis and it's even funnier considering that the mentioned trend is growing rather than shrinking. Granted, it's only for a relatively small portion of geometry, but it remains a clever workaround for whenever they need it. This should not mean that I personally expect NVIDIA to change to a DR architecture, since there's no necessity for it either. It shouldn't be too hard to figure out that trying to mix the best of both worlds wherever necessary could solve quite a few problems. And no, I don't see where and why there are or should be any compatibility problems, even more so given that Radeons have used a form of hierarchical tiling for eons now.

Nvidia's "Tiled Caching" and ATI's (AMD) "Hierarchical Z" are not even comparable to the tiling found in TBR/TBDR GPUs from a functionality standpoint ...

Tiled Caching is basically just another way to do work scheduling on recent Nvidia GPUs. Hierarchical Z is a hardware optimization to reject fragments before the early-Z stage. On TBR/TBDR GPUs, tile memory is used to store a small portion of the framebuffer state on-chip, which is extremely invasive to the overall driver design. Have you seen how bad mobile GPU drivers can get on Android, where TBR/TBDR GPUs are the norm?

Compatibility implications might include new hardware emulating the blend state, which would render older-generation hardware binaries incompatible with the new hardware, because older hardware generations didn't use shader code to handle the blending stage. Emulating the blend state would be a disaster on platforms like consoles that don't have the concept of shader recompilation at all. Performance will be unbearably bad if games are using geometry shaders or transform feedback/stream-out. I don't think either Microsoft or Sony will embrace the idea of emulating IMR GPUs on TBR/TBDR GPUs ...
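
For what it's worth, here is a toy model of the recompilation concern only; nothing in it corresponds to a real driver or API, and the pipeline and state names are made up. The point is simply that once the blend equation is folded into the fragment shader, the compiled binary depends on the blend state, whereas with fixed-function blending it does not.

Code:
from functools import lru_cache

@lru_cache(maxsize=None)
def compile_shader(src):
    # Stand-in for an expensive shader compile; the cache models a driver
    # reusing an already-compiled binary.
    print(f"  compiling: {src}")
    return f"binary({src})"

def fixed_function_pipeline(shader_src, blend_state):
    # Fixed-function blending: blend state is just a register setting,
    # so the compiled shader is independent of it.
    return compile_shader(shader_src), blend_state

def emulated_blend_pipeline(shader_src, blend_state):
    # Blending emulated in shader code: the blend equation becomes part of
    # the fragment shader, so the binary depends on the blend state.
    return compile_shader(f"{shader_src} + blend<{blend_state}>"), None

print("fixed-function blending, three blend states:")
for state in ("opaque", "alpha", "additive"):
    fixed_function_pipeline("frag_main", state)     # compiles once, then cache hits

print("emulated blending, three blend states:")
for state in ("opaque", "alpha", "additive"):
    emulated_blend_pipeline("frag_main", state)     # one compile per blend state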
 
Nvidia's "Tiled Caching" and ATI's (AMD) "Hierarchical Z" are not even comparable to the tiling found in TBR/TBDR GPUs from a functionality standpoint ...

Indeed; however that's not all there is to it.

Have you seen how bad mobile GPU drivers can get on Android, where TBR/TBDR GPUs are the norm?

I've seen and experienced bad mobile GPU drivers on many mobile GPUs.

Compatibility implications might include new hardware emulating the blend state, which would render older-generation hardware binaries incompatible with the new hardware, because older hardware generations didn't use shader code to handle the blending stage. Emulating the blend state would be a disaster on platforms like consoles that don't have the concept of shader recompilation at all. Performance will be unbearably bad if games are using geometry shaders or transform feedback/stream-out. I don't think either Microsoft or Sony will embrace the idea of emulating IMR GPUs on TBR/TBDR GPUs ...

For Geometry shaders see: https://forum.beyond3d.com/posts/2158599/

Other than that, Sony had the PS Vita (a handheld console built around a PowerVR TBDR) fairly recently, so what? Consoles are in good hands at the moment, considering the majority are custom AMD SoC designs, and I'd expect Nintendo to remain with NV for the coming generation too. The primary reason they wouldn't switch to a pure IP provider like IMG is that it couldn't deliver what AMD and/or NVIDIA can when vendors ask for a semi-custom high-end console SoC, let alone that IMG might not even have enough resources for something as complex as even just the RTL for a high-end GPU IP block for a console SoC.
 
Nvidia's "Tiled Caching" and ATI's (AMD) "Hierarchical Z" are not even comparable to the tiling found in TBR/TBDR GPUs from a functionality standpoint ...

Tiled Caching is basically just another way to do work scheduling on recent Nvidia GPUs. Hierarchical Z is a hardware optimization to reject fragments before the early-Z stage. On TBR/TBDR GPUs, tile memory is used to store a small portion of the framebuffer state on-chip, which is extremely invasive to the overall driver design. Have you seen how bad mobile GPU drivers can get on Android, where TBR/TBDR GPUs are the norm?
With modern IMRs it's not just tile caching: they cache triangle parameters and submit them with tile locality, so for small geometry sets this looks exactly the same as TBR. The only practical difference is the amount of buffering that occurs.
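
A toy sketch of that point, under made-up numbers (tile size, batch sizes and triangle counts are all arbitrary): both paths run the same binning machinery, and the only knob that changes is how many triangles are buffered before a flush.

Code:
import random
from collections import defaultdict

TILE = 32                            # assumed tile size in pixels
random.seed(0)

def bin_batch(triangles):
    # Assign each triangle (given by its bounding box) to every tile it touches.
    bins = defaultdict(list)
    for tri_id, (x0, y0, x1, y1) in enumerate(triangles):
        for ty in range(y0 // TILE, y1 // TILE + 1):
            for tx in range(x0 // TILE, x1 // TILE + 1):
                bins[(tx, ty)].append(tri_id)
    return bins

def render(triangles, batch_size):
    # Buffer up to batch_size triangles, bin them, then flush tile by tile.
    flushes = 0
    for start in range(0, len(triangles), batch_size):
        bins = bin_batch(triangles[start:start + batch_size])
        flushes += 1                 # ...rasterise each tile's list in 'bins' here...
    return flushes

frame = []
for _ in range(10_000):              # synthetic frame of small triangles
    x, y = random.randrange(1900), random.randrange(1060)
    frame.append((x, y, x + 16, y + 16))

print("tiled-caching-like (small batches):", render(frame, batch_size=256), "flushes per frame")
print("TBR-like (whole frame buffered):   ", render(frame, batch_size=len(frame)), "flush per frame")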

Compatibility implications might include new hardware emulating the blend state, which would render older-generation hardware binaries incompatible with the new hardware, because older hardware generations didn't use shader code to handle the blending stage. Emulating the blend state would be a disaster on platforms like consoles that don't have the concept of shader recompilation at all. Performance will be unbearably bad if games are using geometry shaders or transform feedback/stream-out. I don't think either Microsoft or Sony will embrace the idea of emulating IMR GPUs on TBR/TBDR GPUs ...

There are no compatibility implications around blend states, or around emulation of blending in shader code. The latter is a direct drop-in replacement for those TBR architectures that choose to do it and is transparent (no pun intended) as far as the application is concerned. That aside, there is no reason why a TBR solution cannot just implement fixed-function blending; in fact I think only one TBR architecture continues to use emulation for this.

Front-end geometry features such as transform feedback and GS are no more problematic for TBRs than they are for IMRs; they occur either directly prior to tiling or rasterisation, and they do not have adverse effects on correctly implemented TBR architectures.

Modern TBRs are largely indistinguishable from IMRs from the application perspective; they do not generally need special handling or extra emulation to make them work.
 
With modern IMRs it's not just tile caching: they cache triangle parameters and submit them with tile locality, so for small geometry sets this looks exactly the same as TBR. The only practical difference is the amount of buffering that occurs.

The only similarity between their tiling schemes is that both can be used to exploit spatial locality, but it ends there. IMRs don't use tiling to break rendering up into screen-space tiles and store their framebuffer state in on-chip memory the way a TBR architecture would ...

There are no compatibility implications around blend states, or around emulation of blending in shader code. The latter is a direct drop-in replacement for those TBR architectures that choose to do it and is transparent (no pun intended) as far as the application is concerned. That aside, there is no reason why a TBR solution cannot just implement fixed-function blending; in fact I think only one TBR architecture continues to use emulation for this.

Front-end geometry features such as transform feedback and GS are no more problematic for TBRs than they are for IMRs; they occur either directly prior to tiling or rasterisation, and they do not have adverse effects on correctly implemented TBR architectures.

I'm pretty sure some TBR GPUs emulate geometry shaders on the CPU, and I can't be certain, but some of them out there might even use compute shaders to emulate geometry shaders. Are there any pure TBR architectures out there that natively support geometry shaders?

If a hypothetical TBR architecture did implement transform feedback then it has pretty much already lost. Transform feedback requires output ordered with respect to the input data, and it interacts with the varyings: if transform feedback is enabled during varying shading, then the varying data must be output in the corresponding input order. On tilers, varying shading is done on a per-tile basis, so the output order for the varyings won't exactly match the input order, since tilers can skip processing primitives that lie outside the tile boundary. It could be possible to preserve the ordering by doing round trips to memory, but at that point it might be more sane to design IMRs instead ...
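
A deliberately oversimplified illustration of that ordering point (the tile assignments are made up, and capture is assumed to happen during per-tile processing as described above): walking tile by tile yields a different capture order than the submission order.

Code:
primitives = [
    # (primitive id in submission order, tile it lands in) -- invented example data
    (0, "tile_B"), (1, "tile_A"), (2, "tile_B"), (3, "tile_C"), (4, "tile_A"),
]

# IMR-style capture: process in submission order, append output as you go.
imr_capture = [pid for pid, _tile in primitives]

# Tiler-style capture during per-tile processing: primitives outside the
# current tile are skipped and only visited when their tile comes up.
tiler_capture = []
for tile in ("tile_A", "tile_B", "tile_C"):
    for pid, t in primitives:
        if t == tile:
            tiler_capture.append(pid)

print("submission order:", imr_capture)      # [0, 1, 2, 3, 4]
print("tile-walk order: ", tiler_capture)    # [1, 4, 0, 2, 3]
assert imr_capture != tiler_capture          # transform feedback requires the former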

Modern TBRs are largely indistinguishable from IMRs from the application perspective; they do not generally need special handling or extra emulation to make them work.

I don't think this is necessarily true, because on IMRs it's trivial to just change render passes with nearly no discernible performance impact, while mobile IHVs recommend the exact opposite: minimize render pass changes. Many TBR GPU drivers also don't support dual-source blending or logical operations, whereas they've been standard functionality on IMRs since D3D11. On a few IMRs it's not necessary to bake the blend state into PSOs, so they have the potential advantage of not needing to do recompilations when the blend state changes, compared to TBRs. On TBR GPUs it's possible to expose pixel local storage, while the same can't be said for IMRs. In the near future IMRs will have a standardized mesh shading pipeline, in stark contrast to TBR architectures, which currently show no signs of going in the same direction ...

I certainly wouldn't describe these differences as being 'indistinguishable' ...
 
And TBDR is not exactly the same thing as TBR?
With a Tile-Based Deferred Renderer, the deferred part relates to how rendering is deferred until all geometry has been submitted and evaluated for tile coverage and HSR.
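
A minimal sketch of what that deferral buys for opaque geometry (the depths, colours and single covered pixel are made up, and real early-Z on an IMR would reject some of this too, which the sketch ignores): all fragments for the tile are collected first, HSR keeps only the nearest one per pixel, and only that one is shaded.

Code:
# One pixel's worth of fragments in one tile: (depth, colour), smaller = nearer.
fragments = [(0.9, "red"), (0.4, "green"), (0.7, "blue")]

# Immediate-mode: fragments are shaded in submission order (no early-Z assumed
# here), so the overdraw is paid in full.
shaded_imr = len(fragments)

# TBDR: shading is deferred until every fragment for the tile is known, HSR keeps
# only the nearest opaque fragment, and that one is shaded exactly once.
nearest = min(fragments, key=lambda f: f[0])
shaded_tbdr = 1

print("fragments shaded, IMR :", shaded_imr)   # 3
print("fragments shaded, TBDR:", shaded_tbdr)  # 1
print("visible colour:", nearest[1])
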
To a limited extent, there is a TBR vs TBDR distinction between Nvidia's and AMD's rasterization methods.
Nvidia's tiling scheme doesn't wait for geometry to be submitted; pixel warps are launched for screen-space tiles sized to keep substantial portions of their context, like render-target tiles, on-die. While it doesn't wait, it may at times opportunistically execute things somewhat out of order, or perform some optimizations while rendering a given screen tile, to take advantage of locality.

AMD's modern rasterizers have a similar tiling option since the introduction of the DSBR, but this one can wait for a finite number of primitives to allow for culling before submitting its batch to the rest of the pipeline. There are limits to the number of primitives it can do this for, and the tile size is controlled by, among other things, the size of the render-target and depth-cache tiles the DSBR and associated ROPs would be using. The overall impact seems to have been modest. AMD's method introduces more complexity versus Nvidia's, as it needs to fall back for modes where its culling cannot work, and there are stronger correctness requirements for culling at the front of the pipeline than there are for redundant pixels that are discarded by the ROPs.
 