SwiftShader 2.0: A DX9 Software Rasterizer that runs Crysis

Alright, here you go:

cubemap.exe @ 1280x960

default: 13
1 thread: 11.5
2 threads: 16
3 threads: 13
4 threads: 15
.....
Phenom X4 9600 (2.3 GHz)
Wow, those are really low frame rates, even considering TLB erratum involvement.
A 4 GHz E8400 running a single thread under the same conditions is more than twice as fast as the highest score here!
Don't ask what happens when the second core engages. :cool:
 
Well, I can confirm the poor speeds on the Phenom X4 9500 (also on 780G) with the TLB fix turned off. In fact, it's slower than my A64 X2 3800+ except in the single thread case.
 
I have a question: wouldn't devs of casual games realise that their audience is unlikely to have decent GPUs and scale back the graphics accordingly?

PS: on the casual games you are targeting, what framerate is SwiftShader getting?
 
I think I know more about that than you. Back in the day I was actually spinning them donuts and cubes on Amiga, 486 and Pentium, remember?
And yet you said that systems with something like a Pentium M can't run any games and 3D stuff without hardware acceleration.
I lived through that whole revolution of accelerated graphics, and I'm well aware that 3D games changed forever and there's no way back.
Then how do you explain the continuing trend towards integrated graphics? What about projects like Fusion? Where exactly do you put the line between hardware and software rendering? And what makes you so sure that CPUs and GPUs will stop converging?
Michael Abrash also provided ample evidence with PixoMatic.
Evidence of what? I have a lot of respect for Pixomatic but it doesn't really surprise me that a DirectX 7-class software renderer hasn't survived much longer than 2004. SwiftShader 2.0 implements everything expected from a Shader Model 2.0 device and passes all the most relevant WHQL tests. You can't really conclude anything about SwiftShader's future based on Pixomatic's past.
You missed my point that these cards were standard in many Dells, Compaqs, HPs, etc. So these people don't have to pay extra.
Indeed, those people don't. But that doesn't negate in any way that 15% of people today still don't have DirectX 9 hardware. And they're not going to get it for free. They can get SwiftShader practically for free though if it's included with the software.
Even so, an investment of a few euros would greatly improve their gaming capabilities and overall experience.
These people don't want to invest anything extra in their gaming experience at all. The majority of them don't even know what GPU stands for. They do however expect software to just work, even if it includes some 3D.
Because technically I can't imagine why you'd need SM2.0 for a chess game... Why wouldn't DX7 graphics be just as good? In Chess Titans I see little more than some Gouraud shading and a planar reflection. That doesn't need shaders.
Developers prefer to implement just one rendering path, using the most recent API. Doing things with texture blending and multipass, or writing assembly shaders, is a whole lot less appealing than writing an HLSL shader. The exact reasons don't matter that much though. It's a fact that casual games using DirectX 9 effects are getting more common, and software rendering offers a way to broaden the target market.
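To give an idea of the difference in developer effort, here's a minimal sketch (assuming nothing more than a valid IDirect3DDevice9* device; the effect is deliberately trivial): a single modulate is easy either way, but every extra term costs another texture stage or another pass on the fixed-function path, while the HLSL version simply becomes a longer expression.

#include <d3d9.h>
#include <d3dx9.h>
#include <cstring>

// Fixed-function path: modulate the texture with the interpolated vertex
// (Gouraud) colour. Fine for one term, but every extra effect means juggling
// more texture stages or adding passes.
void SetupFixedFunction(IDirect3DDevice9* device)
{
    device->SetTextureStageState(0, D3DTSS_COLOROP,   D3DTOP_MODULATE);
    device->SetTextureStageState(0, D3DTSS_COLORARG1, D3DTA_TEXTURE);
    device->SetTextureStageState(0, D3DTSS_COLORARG2, D3DTA_DIFFUSE);
}

// Shader path: the same effect as HLSL. More complex math just makes the
// expression longer instead of adding passes.
static const char* kPixelShader =
    "sampler2D tex : register(s0);                   \n"
    "float4 main(float4 color : COLOR0,              \n"
    "            float2 uv    : TEXCOORD0) : COLOR0  \n"
    "{                                               \n"
    "    return tex2D(tex, uv) * color;              \n"
    "}                                               \n";

IDirect3DPixelShader9* CreateSimpleShader(IDirect3DDevice9* device)
{
    ID3DXBuffer* code = NULL;
    ID3DXBuffer* errors = NULL;
    IDirect3DPixelShader9* shader = NULL;

    if (SUCCEEDED(D3DXCompileShader(kPixelShader, (UINT)strlen(kPixelShader),
                                    NULL, NULL, "main", "ps_2_0", 0,
                                    &code, &errors, NULL)))
    {
        device->CreatePixelShader((const DWORD*)code->GetBufferPointer(), &shader);
        code->Release();
    }
    if (errors) errors->Release();
    return shader;  // NULL on failure; set it with device->SetPixelShader(shader)
}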

Casual games have been running on the CPU since Pong. So why bash a software renderer that allows everyone to enjoy DirectX 9 graphics?
 
What kind of CPU would be needed to match the Wii's capabilities?
Interesting question!

It renders at 640x480 so it doesn't require a high fillrate. Screenshots from Mario Kart look somewhat comparable to TrackMania Nations, which renders smoothly on my Intel Core 2 Quad 2.4 GHz. I do believe the Wii's GPU has some extra headroom, but on the other hand SwiftShader has some extra potential too. Anyway, this should give you some ballpark idea.
 
And yet you said that systems with something like a Pentium M can't run any games and 3D stuff without hardware acceleration.

I was talking about games made in the past 5 years or so. Sure, in theory you could still run your smelly old 486/Pentium software renderers on them, and actually increase the resolution a notch or two, wow!
But in the past 5+ years there have been no games that included a software renderer, let alone ones actually designed to work efficiently with software rendering.
You should try writing more than just a renderer sometime; perhaps then you'll get an idea of how differently you would set up your workload and rendering algorithms for software vs. hardware. As I said, it was a big transition from software to hardware, and there's no going back.

So no, the actual games on the market don't run that well on these systems even WITH acceleration, and it's futile to even try running them with a software renderer.

Then how do you explain the continuing trend towards integrated graphics? What about projects like Fusion? Where exactly do you put the line between hardware and software rendering? And what makes you so sure that CPUs and GPUs will stop converging?

Integrated graphics is still accelerated 3D to the best of my knowledge, so what's your point?
Fusion is just a cost-effective solution for low-budget/small form factor OEM machines (and still 3D accelerated, obviously).
I never claimed that CPUs and GPUs will stop converging, and I don't think they will. But your point with SwiftShader is that the CPU will *replace* the GPU, and that is something quite different from the CPU *integrating* the GPU.

Evidence of what?

Evidence that a software renderer cannot run present-day games at acceptable performance and quality levels.

You can't really conclude anything about SwiftShader's future based on Pixomatic's past.

I do know, however, that Pixomatic is a simpler and faster renderer than SwiftShader running SM2.0, and that the games were actually tuned a bit towards the use of Pixomatic, which should give it some advantage over SwiftShader.

Indeed, those people don't. But that doesn't negate in any way that 15% of people today still don't have DirectX 9 hardware. And they're not going to get it for free. They can get SwiftShader practically for free though if it's included with the software.

Yes, but since SwiftShader doesn't actually allow them to play games properly (okay, Chess Titans perhaps), that really doesn't help them. They'll still need a faster CPU for SwiftShader... and if they buy a new PC to get that faster CPU, they'll also get an IGP with SM3.0+ features that is significantly faster than SwiftShader, for free! So it's a catch-22.

These people don't want to invest anything extra in their gaming experience at all. The majority of them don't even know what GPU stands for. They do however expect software to just work, even if it includes some 3D.

So they live in an alternate reality? That doesn't mean that what they *expect* to happen is actually a realistic possibility.

Developers prefer to implement just one rendering path, using the most recent API. Doing things with texture blending and multipass, or writing assembly shaders, is a whole lot less appealing than writing an HLSL shader. The exact reasons don't matter that much though. It's a fact that casual games using DirectX 9 effects are getting more common, and software rendering offers a way to broaden the target market.

Which brings us back to Davros' question: what makes you think these casual game developers will choose the most recent API when their target audience is clearly incapable of running it anyway? In fact, no offense, but casual games are usually written by small and inexperienced game studios that don't even have the knowledge/time/resources to develop their own D3D renderer and shader system. Fixed-function D3D or even OpenGL is much easier, and is also better supported by their target audience's hardware.
You have yet to name any such casual games by the way, aside from a game that comes with Vista... an OS that by its requirements alone doesn't match the target audience you are implying.

Casual games have been running on the CPU since Pong. So why bash a software renderer that allows everyone to enjoy DirectX 9 graphics?

'Enjoy' isn't the word I would use.
Besides, you have been ignoring the inevitable all the time, even though I brought it up a few times already: For quite a while now it has been virtually impossible to buy any kind of IGP/GPU with anything less than SM2.0 (3 years I think?... even Intel had SM2.0 back in 2004 on the GMA900 series).
So it's only a matter of time before everyone has upgraded their 5+ year old systems for something with an IGP that makes SwiftShader obsolete.
And while you may be able to outperform those 5+ year old IGPs with *today's* (extreme high-end) CPUs, you won't get anywhere near that with the 5+ year old CPUs that are actually inside the systems with these IGPs.
 
I have a question: wouldn't devs of casual games realise that their audience is unlikely to have decent GPUs and scale back the graphics accordingly?
Scaling back graphics is going to make things visually less appealing. If the competition makes a more attractive game for a slightly smaller audience it's going to overshadow your mediocre looking game. Adding multiple rendering paths makes development a lot more complicated, and also increases QA and support costs.
on the casual games you are targeting, what framerate is SwiftShader getting?
It depends largely on the actual application and the CPU. We're also mainly targeting games that are still in development, as that maximizes the opportunity for optimization (sometimes very simple guidelines have a massive effect on performance). We're already talking with some very serious companies, but obviously I can't say much about that.

Anyway, to give you some ballpark idea about performance: CellZenith requires DirectX 9 and on my Q6600 renders at 90 FPS at 640x480, 20 FPS at 1600x1200. On a 1.7 GHz Pentium M it still averages 20 FPS at 640x480. That's all at SwiftShader 2.0's high default quality and with no tuning between the application and SwiftShader.
 
Anyway, to give you some ballpark idea about performance: CellZenith requires DirectX 9 and on my Q6600 renders at 90 FPS at 640x480, 20 FPS at 1600x1200. On a 1.7 GHz Pentium M it still averages 20 FPS at 640x480. That's all at SwiftShader 2.0's high default quality and with no tuning between the application and SwiftShader.

This uses the DirectX 9 API, yes, but does it actually use shaders?
By the looks of it, it's a 2D sprite engine (I cannot run it from here, but the screenshot looks like bilinear-filtered sprite textures), which requires no more than basic fixed-function operations.
You do realize that any DX7+ hardware can make use of the DirectX 9 API, right? I'll have to try it on my old fixed-function Celeron laptop at home and see if it works and what framerates it gets.
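For what it's worth, the check is easy to do from code too. A minimal sketch (generic D3D9, nothing SwiftShader-specific, error handling mostly omitted): query the HAL device caps and look at PixelShaderVersion.

#include <stdio.h>
#include <d3d9.h>

int main()
{
    IDirect3D9* d3d = Direct3DCreate9(D3D_SDK_VERSION);
    if (!d3d) return 1;

    D3DCAPS9 caps;
    if (SUCCEEDED(d3d->GetDeviceCaps(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, &caps)))
    {
        // PixelShaderVersion encodes the supported shader model.
        printf("Pixel shader support: %u.%u\n",
               (unsigned)D3DSHADER_VERSION_MAJOR(caps.PixelShaderVersion),
               (unsigned)D3DSHADER_VERSION_MINOR(caps.PixelShaderVersion));
        printf("SM2.0 capable: %s\n",
               caps.PixelShaderVersion >= D3DPS_VERSION(2, 0) ? "yes" : "no");
    }

    d3d->Release();
    return 0;
}

A DX7-class chip will happily create a DX9 device but report 0.0 here, which is exactly the point.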

Aside from that, this is not a published commercial game, but rather the hobby project of one individual. Is he going to buy a SwiftShader license to bundle it with his game? Doesn't look like it.
So are there any published, commercial games (with actual 3D and actual SM2.0 requirements) that work on SwiftShader?
 
cubemap.exe @ 1280x960

default: 13
1 thread: 11.5
2 threads: 16
3 threads: 13
4 threads: 15
.....
Phenom X4 9600 (2.3 GHz)

Here is an updated comparison with a Q6600 @ 3825 MHz:

1 thread - 35 fps
2 threads - 56 fps
3 threads - 52 fps
4 threads - 65 fps
 
Well, just as I suspected, CellZenith ran fine on the old 1.6 GHz Celeron laptop with a Radeon 340M (which is a DX7-class rasterizer with software vertex processing, one of the cheapest solutions available at the time... figures, since it's a Celeron machine anyway).
It was playable, even in 1024x768.
With SwiftShader it was far too slow.
 
Here is an updated comparison with a Q6600 @ 3825 MHz:

1 thread - 35 fps
2 threads - 56 fps
3 threads - 52 fps
4 threads - 65 fps

Yikes! That's at 1280x960 resolution, right? Can you downclock that bad boy to 2300 MHz just to see what the numbers will be?

My CPU does have the TLB patch enabled (through Vista SP1), so it isn't that easy to disable (from what I could tell from Google).
 
So I have a second laptop that I use (more portable than my 17" monster) - a Dell 1501.

Great CPU in it, 4 gigs of RAM (XP x64, naturally) - hamstrung by the crappy, integrated, non-upgradable-without-buying-another-laptop graphics (ATI Xpress 1150).

GPU - in World of Warcraft I get ~20 FPS at 1280x800, medium graphics settings.
SwiftShader, same settings: ~3 FPS (1.8 GHz X2)

Two things:
I have a *shitload* of RAM - is there anything SS could be doing differently with that in mind? (Yeah, I know, on the systems it's designed for they won't have tons of RAM.) [Chances are any memory-size-intensive operations would simply slow things down, that'd be my guess.]

Second, when alt-tabbing out of WoW, upon return, the mouse cursor isn't grabbed - mouse clicks simply end up at the desktop.
 
It was a bug in SwiftShader's UpdateSurface implementation. Indeed a corner case that apparently was never hit by any other application, with a hazy description in the SDK documentation. But no worries, the bug has been addressed entirely and Trials 2 Second Edition works very nicely.
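For reference, UpdateSurface copies a sub-rectangle from a system-memory surface to a default-pool surface of the same format, with no stretching or format conversion. A typical call looks roughly like this (purely illustrative; the rectangle and offset are made up, and this says nothing about which exact corner case Trials 2 hit):

#include <d3d9.h>

// UpdateSurface copies a sub-rectangle from a system-memory surface into a
// default-pool surface. Same format required, no stretching, no conversion.
HRESULT CopyPatch(IDirect3DDevice9* device,
                  IDirect3DSurface9* sysmemSource,  // created in D3DPOOL_SYSTEMMEM
                  IDirect3DSurface9* defaultDest)   // created in D3DPOOL_DEFAULT
{
    RECT  srcRect     = { 0, 0, 64, 64 };  // made-up 64x64 source region
    POINT destTopLeft = { 128, 128 };      // made-up destination offset

    return device->UpdateSurface(sysmemSource, &srcRect, defaultDest, &destTopLeft);
}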

Fun game too! It's a bit hard to get used to the controls though. Is this an evolution of a Flash game using the same concept?

Yeah, it's basically a highly evolved version of the old Java and Flash based browser Trial Bike games. But this time we're using a real 3D physics engine and our new high-end deferred renderer. It's been a really fun project. I've been able to experiment a lot more with different rendering techniques than in our previous console projects with large publishers. Programming for PC for a while has been really refreshing... but getting things to work on all possible hardware configurations requires much more testing than on closed console hardware. Glad to hear that the bug was not in my code, and we don't need to patch the game soon :)
 
Alright, here you go:

cubemap.exe @ 1280x960

default: 13
1 thread: 11.5
2 threads: 16
3 threads: 13
4 threads: 15
IGP: 130

Phenom X4 9600 (2.3 GHz) with HD3200 IGP (780 chipset)
4 GB RAM
Vista 64

I finally found a program to disable the Phenom's TLB patch. I verified it was disabled with a benchmark. Here are the updated results:

cubemap.exe @ 1280x960

1 thread: 13.5
2 threads: 18.5
3 threads: 7.5
4 threads: 2.5

Not much of an improvement. A Core 2 it ain't.

Why do you guys suppose the 3 and 4 thread numbers are lower with the patch disabled?

@Nick
Is this just an issue of coding for different CPU architectures, or are the Phenoms really this bad? Good thing this isn't a gaming rig. LOL!
 
Hm, that's bizarrish. A BIOS-level lock is the preferred way of managing the TLB erratum, AFAIK from other sources.
Anyway, the ultimate verification would come from a B3 rev. chip.
 
Apparently there's some kind of synchronization (?) issue on the Phenoms because the architecture is different. I think the 2.01 version will have a fix.
 
What method does SwiftShader use anyway, for multithreading?
I would suspect that an Alternate Frame Rendering approach would be the most efficient, as it should have the fewest sync/overhead/redundancy/cache issues.
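Just to illustrate what I mean by the alternatives (a rough sketch only, not a claim about how SwiftShader actually works): instead of alternate frames, you could split each frame into horizontal bands and hand every band to a worker thread, so all cores stay on the same frame.

#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-band split: each worker shades its own slice of the frame,
// so no two threads ever touch the same scanline.
void ShadeBand(uint32_t* framebuffer, int width, int y0, int y1)
{
    for (int y = y0; y < y1; ++y)
        for (int x = 0; x < width; ++x)
            framebuffer[y * width + x] = 0xFF000000;  // placeholder "shading"
}

void RenderFrame(uint32_t* framebuffer, int width, int height, int threadCount)
{
    std::vector<std::thread> workers;
    int bandHeight = (height + threadCount - 1) / threadCount;

    for (int i = 0; i < threadCount; ++i)
    {
        int y0 = i * bandHeight;
        int y1 = std::min(height, y0 + bandHeight);
        if (y0 >= y1) break;
        workers.emplace_back(ShadeBand, framebuffer, width, y0, y1);
    }

    for (std::thread& t : workers)
        t.join();  // frame complete once every band has finished
}

With AFR you'd instead dispatch whole frames to different threads, which keeps them independent but adds latency and duplicates per-frame state.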
 