Huddy: "Make the API go away" [Edit: He wants a lower level API available too.]

And if everyone had to code to the metal we would never get any new hardware: no one would code for a chip that had no market share, and a card would never gain market share if no one coded for it.

I could be wrong, but I am very strongly under the impression that the barrier created today by the work that goes into driver optimisations and DirectX optimisation know-how is probably worse than any barrier that the opportunity to code to the metal would pose. I don't know that there would be a big difference here either way, but I'd sooner believe that a radical new design would suffer from having to fit into DirectX. And there is only one real way to find out.

Personally I doubt Microsoft will allow it though.
 
With a quote like "make the API go away" I think Huddy is :D

PS: does he give any idea of what speedup we could expect from going to the metal?

If it's more than 2x that will be fun. Imagine the scenario: Crysis 3 comes out with a GF580 path and scores 60 fps;
NV releases the GF580's successor (we never get a 2x improvement), which has to fall back to the DX path and scores lower in Crysis 3.

Note that Huddy (i.e. me) didn't say that he wanted to "make the API go away". I said that "make the API go away" is an increasingly common request from developers.

Just to support that observation I'll quote Johan Andersson of DICE, who recently publicly stated at http://forum.beyond3d.com/showpost.php?p=1535975&postcount=8 : "I've been pushing for this for years in discussions with all the IHVs".

I'm not trying to argue that DirectX is a bad thing - just that (like most things) it comes with a cost.

For the vast majority of developers the 'cost' of using DirectX is well worth it. For some very high end ISVs like DICE it has become a bottleneck that they'd like to see go away.

I guess if I were to refine my wording I'd say some ISVs want to be able to render PC graphics without using an API like DirectX or OpenGL...
 
Couldn't quite follow you here... allow what?

Accessing the graphics card from outside DirectX ... but that is probably wrong (I was just wondering whether Windows components, which have been doing more GPU work lately, would conflict with allowing more direct access to the hardware, but I guess it should be possible).

@RichardHuddy: thanks for your clarifications here, it seems we agree (cf what I said a few posts down).
 
Instead of coding "to the metal", would it be possible to improve D3D to the point where it is no longer such an impediment? As I understand it, the point of a standardized API is to ensure functionality across a range of hardware. This is not something worth sacrificing IMO.
 
Instead of coding "to the metal", would it be possible to improve D3D to the point where it is no longer such an impediment? As I understand it, the point of a standardized API is to ensure functionality across a range of hardware. This is not something worth sacrificing IMO.

D.I.C.E. suggests in their GDC presentation on DirectX11 that they've been working with Microsoft to do that also (perhaps repi can comment on that himself). But those two don't have to be mutually exclusive.

Ensuring functionality across a range of hardware remains possible. It just means that developers of both software and hardware who want to can go further: develop a completely new graphics pipeline much more easily and/or use more optimisations for specific hardware, should they so desire.
 
I don't get what's wrong with DirectX ceasing to exist.
Aren't GPUs becoming more general purpose, with fewer and fewer fixed functions and increasing independence from the CPU?
If a software house comes up with an engine capable of running solely on the GPU (with AI, sound, physics, etc. as well), compiled with whatever code they like, where's the annoyance?
 
I don't get what's wrong with DirectX ceasing to exist.
Aren't GPUs becoming more general purpose, with fewer and fewer fixed functions and increasing independence from the CPU?
If a software house comes up with an engine capable of running solely on the GPU (with AI, sound, physics, etc. as well), compiled with whatever code they like, where's the annoyance?

The no-API approach could come once we get a GPU ISA; until then we need a layer of abstraction to support a broad range of hardware.

Sounds like this isn't a call to go back to the "metal" (and Glide, PowerSGL, S3 Metal...) but to get an improved D3D12 that is even more streamlined and gives more freedom to those developers who want it.
If so, I welcome the idea. D3D (even 10/11) often gets in my way, but not all game studios can afford an increased cost of 3D rendering (i.e. can afford to write lower-level code than what D3D currently requires; in fact many studios buy engines so they can work at an even higher level).
 
The no-API approach could come once we get a GPU ISA; until then we need a layer of abstraction to support a broad range of hardware.

Sounds like this isn't a call to go back to the "metal" (and Glide, PowerSGL, S3 Metal...) but to get an improved D3D12 that is even more streamlined and gives more freedom to those developers who want it.
If so, I welcome the idea. D3D (even 10/11) often gets in my way, but not all game studios can afford an increased cost of 3D rendering (i.e. can afford to write lower-level code than what D3D currently requires; in fact many studios buy engines so they can work at an even higher level).
If that's what Huddy meant, then he certainly phrased it poorly (or was being deliberately hyperbolic to draw attention to the "issue" and promote this kind of discussion).

Edit: Duh, he's here. No reason to interpret.

No API is not a path that's feasible anywhere in the near future unless the PC market turns into consoles. Cripes, even Apple would have a hard time with such a route.
 
From chatting with a couple of people who have some insight into the dev side of the fence, I'm guessing that devs do want an API, just not DirectX or OpenGL as they stand today.

It's become burdensome and now a challenge to work around, because of the restrictions it creates. As stated in the article, creativity and innovation are being curtailed by the API's limitations, not by the designers, developers, or coders.

Anytime an artist or project feels constrained by an arbitrary set of limitations, they will seek to remove those shackles. The end result might not be much different, but it will be the result they intended to achieve within their own parameters, rather than one they settled for.

I remember the creativity of the Amiga/Atari etc. demoscene, where the demo coders wrote their own everything: they didn't rely on OS functions, just wrote their own assembly, close to the metal. Easy to do on fixed hardware platforms.

To deliver that now, an abstraction layer that lived close to the metal would be needed: something that presented the underlying hardware as a representation with physical attributes, while letting the layer itself handle things like varying shader counts, SIMD organization, SIMD/ROP/TMU counts, even multiple hardware instances. But is it better to start developing this, building on existing virtualization hardware and software capabilities, or to work towards reforming DirectX or OpenGL, or even expanding OpenCL?
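Purely as a thought experiment, such a layer might expose something like the following. This is entirely hypothetical; no real D3D/OpenGL/OpenCL interface looks like this, and every name here is made up for illustration.

```cpp
// Hypothetical sketch only: a queryable description of the underlying GPU
// that a close-to-the-metal abstraction layer could report to the engine.
struct GpuTopology
{
    unsigned simdCount;        // number of SIMD/compute units
    unsigned simdWidth;        // lanes per SIMD unit
    unsigned ropCount;         // raster back-ends
    unsigned tmuCount;         // texture units
    unsigned deviceInstances;  // multi-GPU configurations (CrossFire/SLI style)
};

// The layer would report the topology and let the engine pick its own code
// paths, while still virtualizing memory residency, scheduling and the like.
GpuTopology QueryGpuTopology();   // hypothetical entry point
```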
 
Isn't the driver already "to the metal"?
The game issues a Direct3D API call to the driver;
the driver converts that into a set of GPU-specific instructions (puts the correct values in the correct registers and issues the desired interrupt).
 
Just a little note: when moderators delete posts, do not write them again; it tends to annoy us.
If you don't understand why a post was deleted, think harder; as a last resort, go ask in the feedback forum.
 
Somehow I feel that Huddy has forgotten the past. I would not like to go back to the time when I had to keep my old 3dfx Voodoo 2 inside my box in addition to my brand new NVidia TNT2, because some games only supported Glide. This situation repeated itself a couple of years later: I had an ATI Radeon at that time, but the newest Bridge Builder game was programmed to work only with NVidia OpenGL extensions, so I could not play it at all.

I think both camps (AMD and NVidia) are starting to get tired of reprogramming (optimizing) shaders ported lazily from consoles to PC, and of adding antialiasing support to DX9-engine games that use deferred rendering.

StarCraft 2 is a good example that even the biggest developers are not finding it cost effective to port their games to multiple APIs. If Blizzard doesn't find the resources to support one extra API (DX10) to enable antialiasing in their game, I doubt they would want to program support for at least 6 different APIs (AMD discrete cards, AMD Fusion with UMA, Intel UMA, NVidia, PowerVR (used by some Intel chipsets), S3/VIA). And big game developers are surely not going to drop support for some chips just to get a few percent frame rate improvement on others.

I agree with Huddy that draw calls have traditionally been expensive on PC, but that was during the DX9 era. Each GPU state bit had to be set separately and constants had to be sent over the bus for each object, each frame. Those things really kill performance.

But since DX9 (SM2.0) the API has evolved dramatically (->DX10->DX11):
- State blocks. These allow fast state changes (matching hardware units), and pretty much mirror the console low level APIs.
- Constant buffers. We can store object constants in GPU memory and the GPU can access them without CPU bus transfer overhead. Again this is how console developers are used to using the hardware.
- Lower level access to textures. We can access MSAA subsamples directly and we can have multiple views to the same texture data for quickly casting one format to another. This again pretty much mirrors the console world. PC developers no longer need to do extra resource copies or instances for these reasons.
- Command buffers and multithreaded rendering. Now also available on PC (DX11). Reusing draw calls by recording them to command buffers reduces draw call overhead a lot.
- Instancing (see the sketch after this list). This was not possible in SM2.0 either. Some newer SM3.0 DX9 cards supported it, but with DX10/11 it became part of the required specs. PC instancing also supports the post-transform vertex cache properly. I doubt many console DX9 ports supported instancing, since the API support was pretty bad (for example ATI SM2.0b cards had it disabled in their drivers by default).
- Stream out, geometry shaders and compute shaders. These allow many new algorithms to run completely on the graphics chip (algorithms that on PC previously ran on the CPU). Again these features reduce PC bus traffic and draw call count.
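As a rough illustration of the instancing point above, here is a minimal D3D11 sketch. The buffer layouts, strides and slot numbers are just assumptions for the example (and the input layout, created elsewhere, would mark the second stream as per-instance data); it is not code from any particular engine.

```cpp
#include <d3d11.h>

// One draw call submits 'instanceCount' copies of the mesh instead of
// issuing that many separate DrawIndexed calls.
void DrawInstanced(ID3D11DeviceContext* ctx,
                   ID3D11Buffer* meshVB,      // per-vertex data
                   ID3D11Buffer* instanceVB,  // per-instance data (e.g. world matrix)
                   ID3D11Buffer* meshIB,
                   UINT indexCount,
                   UINT instanceCount)
{
    ID3D11Buffer* vbs[2]     = { meshVB, instanceVB };
    UINT          strides[2] = { sizeof(float) * 8,    // pos + normal + uv (assumed)
                                 sizeof(float) * 16 }; // 4x4 world matrix per instance
    UINT          offsets[2] = { 0, 0 };

    ctx->IASetVertexBuffers(0, 2, vbs, strides, offsets);
    ctx->IASetIndexBuffer(meshIB, DXGI_FORMAT_R32_UINT, 0);
    ctx->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

    ctx->DrawIndexedInstanced(indexCount, instanceCount, 0, 0, 0);
}
```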

10k-20k draw calls per frame sounds like a lot (especially for PS3). Huddy is likely talking about 30 fps console games, since the huge majority of console games run at that frame rate. I did a quick DX11 render loop and allocated a separate constant buffer per object (the CB had the object matrix and some object properties). My test renderer could do 50k draw calls + CB changes per frame (at 30 fps) on a Radeon 5850 using just a single CPU thread. So the situation on PC is not as bad as Huddy said.
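Something like this minimal sketch of that kind of loop (the structure layout and names are placeholders rather than my actual test code, and the buffers are assumed to be created with D3D11_USAGE_DYNAMIC and CPU write access):

```cpp
#include <d3d11.h>
#include <DirectXMath.h>

struct ObjectConstants            // assumed CB layout: matrix + misc properties
{
    DirectX::XMFLOAT4X4 world;
    DirectX::XMFLOAT4   params;
};

void DrawObjects(ID3D11DeviceContext* ctx,
                 ID3D11Buffer* const* objectCBs,       // one dynamic CB per object
                 const ObjectConstants* objectData,
                 UINT objectCount,
                 UINT indexCountPerObject)
{
    for (UINT i = 0; i < objectCount; ++i)
    {
        // Update this object's constants without stalling the GPU.
        D3D11_MAPPED_SUBRESOURCE mapped;
        if (SUCCEEDED(ctx->Map(objectCBs[i], 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped)))
        {
            *static_cast<ObjectConstants*>(mapped.pData) = objectData[i];
            ctx->Unmap(objectCBs[i], 0);
        }
        ctx->VSSetConstantBuffers(0, 1, &objectCBs[i]);   // slot b0 (assumed)
        ctx->DrawIndexed(indexCountPerObject, 0, 0);
    }
}
```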

Where is the performance lost, then? Some quite simple math explains it pretty well. The highest selling AAA console games usually run at slightly less than 1280x720 resolution (for example Modern Warfare runs at 1024x600, Alan Wake at 960x540, Halo Reach at 1152x720, as does Crysis 2). 1152x720x30fps = 24.9 million pixels per second. PC gamers tend to want to run their games at 60 fps and usually have 1920x1200 monitors. That results in 1920x1200x60fps = 138.2 million pixels per second. With these increased requirements (set by PC gamers), the PC graphics card needs 5.6x the pixel shader power to run the scene, just because of the increased resolution and frame rate. And we are talking about exactly the same content here (same objects, same textures, same amount of polygons and draw calls).

Add in 4x antialiasing, 8x anisotropic filtering, 2x2 higher resolution textures and shadow maps, etc., and soon the 10x more powerful PC hardware is fully utilized, without any extra objects in the scene. Unfortunately for most players, the improved resolution (720p -> 1200p) + double frame rate + improved texture resolution (2x2) + better filtering (AA+AF) doesn't exactly feel like 10x improved graphics quality. But saying that this is the fault of the PC DirectX API is an overstatement.

If developers had to code separate support for all the different PC hardware, I think we would have even fewer PC games than we currently do. Most developers would feel that porting the game to PC would require too much extra work to be profitable. Releasing for PC is already really demanding, as it requires a huge amount of extra testing: there are so many different OS versions, different graphics chips (manufacturers and generations), different CPUs and memory configurations. I personally feel that DX10 and DX11 were a huge step in the right direction, and I hope that in the future we will continue on that path, making DirectX a closer-to-the-hardware API, instead of making a separate API for each piece of hardware.
 
Somehow I feel that Huddy has forgotten the past. I would not like to go back to the time when I had to keep my old 3dfx Voodoo 2 inside my box in addition to my brand new NVidia TNT2, because some games only supported Glide. This situation repeated itself a couple of years later: I had an ATI Radeon at that time, but the newest Bridge Builder game was programmed to work only with NVidia OpenGL extensions, so I could not play it at all.

<snip--snip>

If developers had to code separate support for all the different PC hardware, I think we would have even fewer PC games than we currently do. Most developers would feel that porting the game to PC would require too much extra work to be profitable. Releasing for PC is already really demanding, as it requires a huge amount of extra testing: there are so many different OS versions, different graphics chips (manufacturers and generations), different CPUs and memory configurations. I personally feel that DX10 and DX11 were a huge step in the right direction, and I hope that in the future we will continue on that path, making DirectX a closer-to-the-hardware API, instead of making a separate API for each piece of hardware.

Everything was interesting, but this reply only applies to those two comments.

In the first case it's interesting to consider what the PC gaming landscape would be like if not for OpenGL and DirectX (both of which provide higher level, managed API access to the hardware). For example, would Nvidia or ATI even exist right now if not for DirectX and, to some extent, OpenGL? The Riva 128, for example, was created specifically to take advantage of and accelerate Direct3D. Without that, Nvidia may not have had a chance to even make the TNT, as they had already lost lots of money on NV1 and its proprietary rendering method and API. If not for D3D and OGL, 3dfx would quite likely still be king and dominate the gaming world. Even with OGL support on later Nvidia cards, most games were still using Glide for 3D acceleration. Once Direct3D matured, that combined with OGL opened the door wide for multi-vendor competition in the 3D accelerated game arena.

And for the other comment, I can't help agreeing with that. While there are certainly developers that would and could go to the lengths of not only creating multiple rendering paths but also spending the time to make sure they all worked correctly for their game, I can imagine there's an order of magnitude more devs that couldn't justify the cost involved and would only code to the hardware that is most prevalent.

While I'm sure we'd all like to see what DICE, for example, could accomplish with to-the-metal access, I can't help thinking that it would be yet another barrier encouraging devs to code for consoles, where they only have to worry about one set of hardware if they are exclusive, or two (maybe three) sets of hardware if they are multiplatform.

I'd imagine that Microsoft will continue to provide greater access to hardware as DirectX evolves. As sebbbi pointed out, there have already been huge strides with Dx9->10->11. Microsoft always has to balance the desires of software developers against the reality of what the hardware designers can create, and against the need to keep Windows a stable platform. They may not always get the balance exactly right, but they do seem to be getting better and better at it. And the computer gaming industry as a whole benefits greatly from them providing a (for the most part) single way to exploit various IHVs' hardware.

Regards,
SB
 
I can imagine there's an order of magnitude more devs that couldn't justify the cost involved and would only code to the hardware that is most prevalent.

But wouldn't these just keep using DirectX, OpenGL, Unreal Engine 3, CryEngine 3, Flash, Silverlight or whatever?
 
Personally I don't think multiple code paths are a serious problem. For those who don't want (or can't afford) to do multiple code paths, there's always Direct3D. Those who want better performance out of each vendor's card already have to do multiple code paths even with Direct3D. So additional "low level" vendor-specific APIs are probably not a very serious problem for these guys.

Furthermore, these low level APIs also make it possible for middleware (game engine) developers to build better code paths for specific GPUs. Those who can't afford to do so can buy game engines (and generally already do) to reduce development cost.

The only problem with low level APIs I can think of is compatibility with the current WDDM. Of course you don't want a game using a low level API to crash your computer. How a program using a low level API should share resources with other Direct3D programs is also a difficult problem. It could also be a security problem if not designed well.
 
But wouldn't these just keep using DirectX, OpenGL, Unreal Engine 3, CryEngine 3, Flash, Silverlight or whatever?

Even when using engine platforms (CE3, UE3, etc.) you still have to make sure each rendering path works correctly with your game. And if you modify the engine in any way (as I assume most devs do), then you may end up having to code your own rendering paths anyway.

With regards to DirectX and OGL, would IHVs really want to take on the task of providing yet another API that's closer to the metal, bugfixing an additional API, optimizing an additional API, convincing Microsoft to allow an API closer to the metal which may or may not jeopardize system stability, etc.? Intel already can't keep up with just one API to worry about. AMD, while generally just as robust in DirectX, has traditionally trailed in OGL, and that's just 2 APIs.

All of which means you would quite likely have development gradually favoring one IHV over all others.

For example, let's say IHV X can only keep up with IHV Y in one API, gets close in another API, and just gives up on the third. If we use DX and CTM (Close to the Metal), that means they give up competing in the professional market where OGL is far more relevant, and they'll trail in performance/features in either DX or CTM. Which means end users will gradually tend to favor the IHV with more resources. Which means fewer resources for the other. Which then means less ability to keep up with 2 APIs, and so on.

And I only mention CTM as an API because I'd imagine at least Nvidia and AMD would be reluctant to divulge all information regarding their hardware for fear it could give their competitors an advantage.

IMO, it's best to just keep going as we have been: software developers pushing MS to allow more access to hardware; MS evolving DX to better meet the demands of software devs and pushing IHVs to include the features software developers are demanding, while balancing, to an extent, what each IHV is currently capable of delivering for the next generation. If MS can't keep IHV hardware similar, if not the same, in terms of capabilities, then the PC market will most definitely shrink as we go back to a time when hardware could have greatly disparate feature lists (the 80's to the mid/late 90's).

Remember how things were with fledgling 3D acceleration? A 3D accelerated game for Matrox couldn't run on S3 or ATI, and the same applied the other way around. So, for example, you had to create a specific version of Mechwarrior 3D for Matrox, ATI, S3, and whoever else had a card. It was a bloody mess. Then you had Rendition and 3dfx entering with their own ways of doing things, and again you had to code specifically for each one. Eventually the only 3D accelerated games being made were for 3dfx, with everyone else being ignored until OGL and D3D matured. At which point Nvidia entered with the Riva 128 (D3D), Intel with the i740 (D3D), ATI switched to D3D and started abandoning any proprietary efforts, etc. And it's only at that point that 3D games started being made that could run on all vendors' cards.

Regards,
SB
 
Remember how things were with fledgling 3D acceleration? A 3D accelerated game for Matrox couldn't run on S3 or ATI, and the same applied the other way around. So, for example, you had to create a specific version of Mechwarrior 3D for Matrox, ATI, S3, and whoever else had a card. It was a bloody mess. Then you had Rendition and 3dfx entering with their own ways of doing things, and again you had to code specifically for each one. Eventually the only 3D accelerated games being made were for 3dfx, with everyone else being ignored until OGL and D3D matured. At which point Nvidia entered with the Riva 128 (D3D), Intel with the i740 (D3D), ATI switched to D3D and started abandoning any proprietary efforts, etc. And it's only at that point that 3D games started being made that could run on all vendors' cards.
I started my first 3D engine project when Glide was still very much alive, and in DirectX 5 you had to load different textures into different memory banks just to get things to work properly on Voodoo 2 cards as well (both Voodoo 2 texture units had their own separate memories, and the frame buffer was in a third memory pool). PowerVR had its own API as well. I remember implementing lens flare code by directly accessing the depth buffer. The bit orders were completely different on the Matrox G400 compared to the Voodoo 2 and TNT, and PowerVR didn't have a z-buffer at all. Everything was obviously completely undocumented. Personally I would not like to go back there.

Don't get me wrong, I love getting my hands dirty, doing very low level microcode and EDRAM optimizations on Xbox 360 and bypassing the DirectX API completely whenever possible. The same applies to all other closed console hardware I have worked with (I always prefer the lowest level hardware access). However, these platforms all have a single hardware configuration, and you can performance analyze and fine tune every little bit of your code to match the hardware perfectly. You can even make trade-offs to reduce a bottleneck that would seem really counterintuitive if you didn't know exactly where the bottlenecks are.

But I don't want to do these things on PC, since all the bottlenecks change depending on the hardware. How am I supposed to know how to allocate my unified shader GPRs between VS and PS on each part of my scene if I don't know how many registers and shader units the graphics chip even has? What is the correct TEX/ALU ratio I should use for each card? Should I write 4d vectorized shaders for low/mid/high TEX/ALU AMD cards, and different 5d shaders for the last generation cards? What happens if my game launches with 5d shaders, and next year AMD releases new generation cards with 4d shaders? Do all developers need to patch their games to run on the new hardware? Would the low level API have a "simulation" feature that I could query, given my shader, render target and state info, to get the bottlenecks and make educated choices to pick the correct shaders / render paths in real time? Could I query the vertex cache size? (That would actually be a great feature to have in DirectX.)

This thread actually increased my curiosity to test how fast DX11 renders stuff from prerecorded command buffers (we use them extensively on consoles). Basically, with prerecorded command buffers you could render an animated scene with no draw calls at all. The prerecorded command buffer just includes the constant buffer GPU memory addresses for each object, so you can edit the constants and still reuse the recorded draw calls from one frame to the next. Objects can be moved/rotated by editing their matrices in the constant buffers. Of course you would need to update the main scene command buffer periodically, but much less often than every frame.
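On DX11 the closest equivalent I know of is a deferred context plus FinishCommandList/ExecuteCommandList. A minimal sketch under those assumptions (everything here is illustrative; a real test would also bind shaders, buffers and state during recording):

```cpp
#include <d3d11.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Record the scene once on a deferred context; replay it every frame on the
// immediate context. Only the constant buffer *contents* change per frame,
// so the recorded bindings and draw calls stay valid.
ComPtr<ID3D11CommandList> RecordScene(ID3D11Device* device,
                                      ID3D11Buffer* const* objectCBs,
                                      UINT objectCount,
                                      UINT indexCountPerObject)
{
    ComPtr<ID3D11DeviceContext> deferred;
    device->CreateDeferredContext(0, &deferred);

    // ... bind shaders, input layout, vertex/index buffers on 'deferred' here ...

    for (UINT i = 0; i < objectCount; ++i)
    {
        deferred->VSSetConstantBuffers(0, 1, &objectCBs[i]);  // per-object CB, slot b0
        deferred->DrawIndexed(indexCountPerObject, 0, 0);
    }

    ComPtr<ID3D11CommandList> commandList;
    deferred->FinishCommandList(FALSE, &commandList);
    return commandList;
}

// Per frame: update the per-object constant buffers (e.g. UpdateSubresource),
// then replay the whole recorded scene with a single call:
//   immediateContext->ExecuteCommandList(commandList.Get(), FALSE);
```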
 
Bit-tech.net last Wednesday reported that AMD's worldwide developer relations manager for its GPU division, Richard Huddy, said software developers told him Microsoft's DirectX was getting in the way of optimal graphics performance for PC gaming. In an interview with CRN on Tuesday, however, Huddy said his comments had been taken out of context and exaggerated. Huddy and Neal Robison, senior director of ISV relations at AMD, said only high-end gaming developers may benefit from going around Microsoft's API.

http://www.crn.com/news/components-...-micrsofts-directx-api.htm?pgno=1&itc=refresh

Some more interesting stuff there.
 
Would the lack of an API slow the release cycle for new hardware?

I was thinking about something along those lines.

Each architecture from AMD and NVIDIA has a major refresh every 18 to 24 months with incremental and evolutionary changes in between.

Wouldn't "programming to the metal" break compatibility? It would mean a lot of new patches every time new hardware was released and 18 months is not a significant amount of time. Consoles have a life shelf of 5 years (maybe moving to 7-10 years this generation) so consumers have a system that is well supported for a reasonable amount of time.

The alternative is that new hardware gets built around these new to-the-metal APIs and therefore sometimes can't change as radically as it once could.

In either case I don't see it working. Not just yet anyway.
 