SwiftShader 2.0: A DX9 Software Rasterizer that runs Crysis

Discussion in 'Rendering Technology and APIs' started by B3D News, Apr 4, 2008.

  1. B3D News

    B3D News Beyond3D News
    Regular

    Joined:
    May 18, 2007
    Messages:
    440
    TransGaming has just released SwiftShader 2.0, a highly optimized software rasterizer that supports DX9 and Shader Model 2.0 and scales with multi-core processors. It can run (albeit slowly) many modern games and it makes a dual-core Penryn perform similarly to the GeForce FX5600/5700 in 3DMark05.

     
  2. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,640
    Location:
    London
    Looking forward to seeing the performance in Far Cry, FEAR and HL-2.

    Jawed
     
  3. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    Far Cry Benchmark

    The benchmark started at 04-Apr-08 01:33:43

    System Information
    Operating system: Windows (TM) Vista Ultimate
    System memory: 4.0 GB
    CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    CPU speed: 2400 MHz
    Sound system: Luidsprekers (Creative SB X-Fi)
    --------------------------------------------------------------------------------
    Resolution: 1024×768
    Maximum quality option, Direct3D renderer
    Level: Research, demo: Research.tmd
    Pixel shader: model 2.0b
    Antialising: None
    Anisotropic filtering: 1×
    HDR: disabled
    Geometry Instancing: disabled
    Normal-maps compression: disabled

    Score = 7.22 FPS
     
  4. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
  5. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    I can play whole levels on my machine. FPS varies between about 10 and 40 (slightly less than 20 most of the time) for 640x480 and settings at High.
     
  6. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    Please note that SwiftShader 2.0 hasn't been particularly optimized for any of these games. We are committed to optimizing for our clients' needs of course, but for existing games only the most obvious bottlenecks were analyzed.

    And, just as importantly, none of these games have been optimized for software rendering. For example, using a cube map for vector normalization is tens of times slower than a 'nrm' shader operation. Also, many operations a graphics card does 'for free' actually cost cycles when software rendering, unless properly disabled.
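
    To illustrate why a normalization cube map is costly in software, here is a rough Python sketch (not SwiftShader's code). `normalize_arith` does roughly what a 'nrm' operation amounts to: a dot product, a reciprocal square root and three multiplies. `cube_lookup_address` computes only the face and texel address that a cube map fetch would need, using the face selection table from the OpenGL specification. The function names and the 256-texel face size are assumptions made for illustration.

```python
import math

def normalize_arith(v):
    """Normalize arithmetically: one dot product, one rsqrt, three multiplies."""
    x, y, z = v
    inv_len = 1.0 / math.sqrt(x * x + y * y + z * z)
    return (x * inv_len, y * inv_len, z * inv_len)

def cube_lookup_address(v, face_size=256):
    """Addressing work alone for a cube map fetch: pick the major axis,
    divide by it, remap to [0, 1], and convert to integer texel coordinates."""
    x, y, z = v
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        face, major, sc, tc = ("+x" if x > 0 else "-x"), ax, (-z if x > 0 else z), -y
    elif ay >= az:
        face, major, sc, tc = ("+y" if y > 0 else "-y"), ay, x, (z if y > 0 else -z)
    else:
        face, major, sc, tc = ("+z" if z > 0 else "-z"), az, (x if z > 0 else -x), -y
    u = 0.5 * (sc / major + 1.0)
    t = 0.5 * (tc / major + 1.0)
    col = min(int(u * face_size), face_size - 1)
    row = min(int(t * face_size), face_size - 1)
    return face, col, row
```

    The lookup path still needs the actual memory read and any filtering on top of this addressing, whereas the arithmetic path is already finished.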

    So the above scores are not an upper limit for what is possible with software rendering. For this release we focussed mainly on features and quality, offering a 'complete' Direct3D 9 device for the casual games market.
     
  7. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,640
    Location:
    London
    So it would be sort of playable at 640x480 with maximum quality, ~15fps?

    :oops: If I ever knew that I've totally forgotten :shock: :oops:

    So that's quite decent too. Both of these games would scale to 800x600 or 1280x1024 with lower quality settings.

    Since we talk a lot about ALU:TEX ratio for hardware and where software is in relation to that, would you like to hazard a guess at this ratio for your C2Q PC?

    I dare say you're faced with the same kind of questions of "driver optimisation", profiling and shader replacement that the IHVs face. And presumably these games (or this type of game) aren't "casual enough" to be your main focus.

    So, it's early days yet, too early to examine the detailed performance of state of the art software rendering on C2Q, say, against dual-core-with-IGP systems.

    Presumably you're looking forward to Nehalem - I imagine the shiny new memory system will make things run significantly better. Does the performance of SS on Phenom show benefits attributable to its memory system (as opposed to C2 or X2)?

    Jawed
     
  8. Cypher

    Newcomer

    Joined:
    Jun 28, 2005
    Messages:
    85
    Wow, this is awesome!! I'm glad there's a free-to-use software rasterizer for D3D out there. I can't wait to try it out for myself.

    Quick question: right now it supports only SM2? Not SM2.a/b, or even SM3?
     
  9. wingless

    Newcomer

    Joined:
    Aug 5, 2007
    Messages:
    79
    Location:
    Houston, Texas
    SwiftShader 2.0 on AMD Phenom

    I'm really anxious to know how well it performs on the Phenom platform as well. SS seems like an extremely memory intensive program and should benefit from Phenom/Nehalem cache structures and system mem bandwidth. I hope they add SSE4/a enhancements in future revisions to make better use of the power of these new chips. Also I wonder if the virtualization characteristics on AMD and Intel processors can be used to their advantage.

    EDIT: I forgot to ask, does SS 2.0 have x64 code optimizations? It seems to me it would benefit from the memory optimizations x64 allows.
     
    #9 wingless, Apr 4, 2008
    Last edited by a moderator: Apr 4, 2008
  10. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    That might be slightly optimistic. Pixel processing takes the majority of execution time but vertex processing and primitive setup are not negligible, especially at such low resolutions. Furthermore, cache coherency and prefetch efficiency improve with more pixels per triangle.
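
    A back-of-the-envelope way to see this, as a sketch under made-up assumptions rather than measured data: split the frame time into a fixed part (vertex processing, primitive setup) and a part proportional to pixel count. The 7.22 FPS at 1024×768 comes from the Far Cry benchmark posted earlier in the thread; the 10% fixed share is purely hypothetical, and the model ignores the cache and prefetch effects mentioned above.

```python
def scaled_fps(fps_ref, pixels_ref, pixels_new, fixed_share):
    """Estimate FPS at a new resolution, assuming frame time is a
    resolution-independent part plus a part proportional to pixel count."""
    frame_time = 1.0 / fps_ref
    fixed = frame_time * fixed_share
    per_pixel = (frame_time - fixed) / pixels_ref
    return 1.0 / (fixed + per_pixel * pixels_new)

ref = 1024 * 768
low = 640 * 480
pixel_bound = scaled_fps(7.22, ref, low, fixed_share=0.0)   # purely pixel-bound
with_setup = scaled_fps(7.22, ref, low, fixed_share=0.10)   # hypothetical 10% fixed cost
print(f"{pixel_bound:.1f} FPS if purely pixel-bound, {with_setup:.1f} FPS with 10% fixed cost")
```

    Any fixed per-frame cost pulls the estimate below the simple pixel-count ratio, which is the direction of the correction described above.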
    I'm probably wrong, sorry. I've never actually run the game. I'll download the demo and give it a try...
    Actually, no, unfortunately. Lowering quality in Half-Life 2 makes it use Shader Model 1.x but it does nearly the same operations. It does things similar to using a cube texture lookup for vector normalization.
    The theoretical floating-point performance of modern CPUs is actually close to that of mid-range graphics cards (for multiply-add). So in my experience software rendering can handle pure arithmetic work really well. Texture sampling however requires a lot of instructions to implement. You could use RenderMonkey to compare the costs. I'd actually be quite interested in the results myself. :D Transcendental functions like log and exp also don't map directly to x86 instructions.

    Note though that it's never a bottleneck in the ALU:TEX sense of graphics hardware. There are no dedicated texture samplers that could be a bottleneck on their own. It's just a shift in which code most of the cycles are spent on. This is also why I'm a big proponent of adding a gather instruction to CPUs. It's useful for texture sampling, transcendental functions (for lookup tables), and tons of other things besides graphics. Adding actual texture sampling units would be of much less use for anything else and hard to standardize.
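
    To make the cost concrete, here is a minimal Python sketch of a single bilinear texture sample (an illustration, not SwiftShader's implementation). The flat row-major texture layout, the clamp addressing and the function name are assumptions. The four texel reads at computed addresses are the kind of access a gather instruction could perform in one step across SIMD lanes; everything around them is plain arithmetic.

```python
import math

def sample_bilinear(texture, width, height, u, v):
    """Bilinearly sample a single-channel texture stored row-major in a flat list.
    u and v are normalized coordinates; addressing clamps to the edge."""
    # Map to texel space with texel centers at half-integers.
    x = u * width - 0.5
    y = v * height - 0.5
    x0 = math.floor(x)
    y0 = math.floor(y)
    fx = x - x0
    fy = y - y0
    # Clamp the four neighbor coordinates.
    xa = min(max(x0, 0), width - 1)
    xb = min(max(x0 + 1, 0), width - 1)
    ya = min(max(y0, 0), height - 1)
    yb = min(max(y0 + 1, 0), height - 1)
    # Four data-dependent loads: the "gather" step.
    t00 = texture[ya * width + xa]
    t10 = texture[ya * width + xb]
    t01 = texture[yb * width + xa]
    t11 = texture[yb * width + xb]
    # Three linear interpolations.
    top = t00 + (t10 - t00) * fx
    bottom = t01 + (t11 - t01) * fx
    return top + (bottom - top) * fy

checker = [0.0, 1.0,
           1.0, 0.0]
print(sample_bilinear(checker, 2, 2, 0.5, 0.5))
```

    Even this single-channel, single-mip version needs coordinate scaling, two floors, four clamps, four address computations, four loads and three interpolations per sample, before wrap modes, mipmapping or multiple color channels are considered.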
    I'm afraid so, yes. But I'm up for the challenge, so stay tuned. ;)
    Yes, Nehalem and especially Sandy Bridge (with AVX) look very exciting.
    Unfortunately I don't have a Phenom system to test with. But I have noticed some 'interesting' behavior on Athlon X2. It's too early to draw conclusions though.
     
  11. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    12,771
    wouldn't this be of benefit for those games which don't run properly (rendering errors) on modern cards + drivers?
    examples:
    System Shock 2
    Thief 2
    Crimson Skies

    edit:
    Just tried Crimson Skies and it just doesn't run with SwiftShader installed
     
    #11 Davros, Apr 4, 2008
    Last edited by a moderator: Apr 4, 2008
  12. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    What makes you think that? It renders at relatively low framerates and low resolution, so total bandwidth needs are modest, and there are a lot of cycles between texture accesses simply because of the filtering.

    Inter-core bandwidth and latency is something to stay aware of though, especially with increasing core counts.
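
    As a rough sanity check on "modest" (simple arithmetic, not a measurement): the resolution and frame rate below are in the range mentioned earlier in the thread, while the 8 bytes per pixel (32-bit color plus 32-bit depth) and the overdraw factor of 3 are assumptions chosen only for illustration.

```python
def framebuffer_traffic_mb_per_s(width, height, fps, bytes_per_pixel, overdraw):
    """Approximate color + depth write traffic in MB/s (1 MB = 10**6 bytes)."""
    return width * height * fps * bytes_per_pixel * overdraw / 1e6

# 640x480 at 20 FPS; 8 bytes per pixel (color + depth) and 3x overdraw are assumed.
traffic = framebuffer_traffic_mb_per_s(640, 480, 20, 8, 3)
print(f"{traffic:.0f} MB/s")
```

    Under these assumptions the frame buffer traffic is on the order of 150 MB/s, which leaves plenty of headroom for texture reads on a desktop memory system of that period.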
    No, it's still 32-bit. Since we're aiming mainly at the casual games market, 64-bit makes no sense yet. On the other hand, the extra registers would definitely help performance. What specific memory optimizations are you referring to?
     
  13. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    Thanks for the info! Another game demo I'll download and try. I'm starting to run out of disk space here... ;)
     
  14. Freak'n Big Panda

    Regular

    Joined:
    Sep 28, 2002
    Messages:
    898
    Location:
    Waterloo Ontario
    Screen shots? How is the quality compared to hardware rendering?
     
  15. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,337
    Location:
    Varna, Bulgaria
    Really funny piece of code! ;)

    Here are some fillrate numbers on my E8400 @ 4GHz:

    Code:
               FrameBuffer Clear : 1254,4 FPS
                      Color Fill : 395,1034 M-Pixel/s
                          Z Fill : 807,8229 M-Pixel/s
                  Color + Z Fill : 309,5396 M-Pixel/s
                  Single Texture : 186,2271 M-Pixel/s
      Single Texture Alpha Blend : 163,5779 M-Pixel/s
                   Dual Textures : 115,7628 M-Pixel/s
                 Triple Textures : 83,04722 M-Pixel/s
                   Quad Textures : 65,43114 M-Pixel/s
        1 Floating Poing Texture : 143,4452 M-Pixel/s
                  Render to Self : 171,9665 M-Pixel/s
                   PS 1.1 Simple : 161,0613 M-Pixel/s
                   PS 1.4 Simple : 166,0944 M-Pixel/s
                   PS 2.0 Simple : 135,8955 M-Pixel/s
                PS 2.0 PP Simple : 138,412 M-Pixel/s
    Yay! My Penryn got double Z/Stencil rate... :lol:
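
    One way to read these numbers is as clock cycles per pixel. The sketch below parses a few of the comma-decimal values from the table and divides the 4 GHz clock by each rate. Multiplying by 2 for the E8400's two cores assumes both cores were fully busy during the test, which the post does not state.

```python
CLOCK_HZ = 4.0e9   # E8400 at 4 GHz, from the post
CORES = 2          # the E8400 is a dual-core part; full use of both is an assumption

def parse_rate(text):
    """Convert a comma-decimal string such as '395,1034' (M-Pixel/s) to pixels/s."""
    return float(text.replace(",", ".")) * 1e6

def core_cycles_per_pixel(rate_text):
    """Clock cycles spent per pixel, summed over both cores."""
    return CLOCK_HZ * CORES / parse_rate(rate_text)

for name, rate in [("Color Fill", "395,1034"),
                   ("Single Texture", "186,2271"),
                   ("PS 2.0 Simple", "135,8955")]:
    print(f"{name}: {core_cycles_per_pixel(rate):.1f} core cycles per pixel")
```

    Read this way, a plain color fill costs roughly 20 core cycles per pixel, and adding a single texture roughly doubles that.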

    I noticed that there is some colour banding in the RTHDRIBL demo, mostly in flares and bloom edges.
     
    #15 fellix, Apr 4, 2008
    Last edited by a moderator: Apr 4, 2008
  16. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    12,771
    Nick:
    One thing: the retail version of Crimson Skies wouldn't work at all. It complained about needing DirectX 7.0 or higher, and I had to patch it to version 1.02 for it to work. As demos are usually based on version 1.0 and never updated, it may not work for you.

    ps: if you're looking for ideas of games to try, read this thread:
    http://forum.beyond3d.com/showthread.php?t=47534

    edit 2: sorry, I thought you just wanted ideas for games to play. I didn't realise you were connected to SwiftShader and actually wanted non-working games to test ;)
     
  17. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,337
    Location:
    Varna, Bulgaria
    Texture filtering is completely missing here, as well as the AA:

    [IMG]
     
  18. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    It's Shader Model 2.x actually. It supports dynamic branching and predication for vertex shaders, gradient instructions for pixel shaders, and arbitrary swizzles, and it places no limits on dependent texture reads, register usage, or shader length.
     
  19. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Location:
    Montreal, Quebec
    You can enable trilinear filtering with SwiftConfig (either the .ini file or the web server).

    Anisotropic filtering and anti-aliasing are currently not implemented.
     
  20. mmaenpaa

    Newcomer

    Joined:
    Apr 4, 2008
    Messages:
    1
    What about Larrabee?

    If SwiftShader runs on x86, what about Larrabee?

    After all, it (Larrabee) is supposed to be all about easy x86 programming...

    Markku
     
