nFactor2 - an engine on X360

Discussion in 'Console Technology' started by Titanio, Sep 6, 2005.

  1. deathkiller

    Newcomer

    Joined:
    Jul 24, 2005
    Messages:
    186
    Likes Received:
    4
    An 8KB buffer should be enough to hide up to 500 cycles of DMA latency in the worst-case scenario (one 128-bit load from LS to register per cycle), not counting the instruction buffer.

    In any case, when you can predict all the data that you will use (so you aren't loading data from main memory that you won't use), as in the typical streaming case, you have a lot more memory in the LS than you need.

    In fact, the SPUs originally only had 64KB of LS.
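
    For anyone who wants to check the 8KB figure above, here is a rough sketch of the arithmetic. It assumes the worst case given in the post (one 128-bit load from LS per cycle for the whole 500 cycles); the function name is just made up for illustration.

```python
# Assumption (from the post): worst case is one 128-bit load from LS per cycle.
BYTES_PER_LOAD = 128 // 8  # 16 bytes per quadword


def bytes_consumed_during(latency_cycles: int, bytes_per_cycle: int = BYTES_PER_LOAD) -> int:
    """Bytes the SPU can consume while one DMA transfer is still in flight."""
    return latency_cycles * bytes_per_cycle


needed = bytes_consumed_during(500)
print(needed)               # 8000 bytes
print(needed <= 8 * 1024)   # True: fits in an 8KB buffer
```

    So 500 cycles at 16 bytes per cycle is 8000 bytes, which is just under 8KB.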
     
  2. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,499
    Likes Received:
    1,856
    Location:
    London
    In that case the output dataset size might form the constraint. Say 8KB of input data leads to 32KB of output data (e.g. tessellation of input triangles). How long would it take, say, to output that data to RSX? Would a larger batch be more desirable in the case of this algorithm? Does it matter what size of batch is used, as long as 8KB is the minimum input dataset?...

    Anyway, it's interesting what you say about LS being "larger than strictly necessary for a purely 128-bit Vec4 datastream".

    Jawed
     
  3. deathkiller

    Newcomer

    Joined:
    Jul 24, 2005
    Messages:
    186
    Likes Received:
    4
    If you have an algorithm that gives you 4x more output, then you would reduce the size of the input buffer. Loading a 2KB buffer and storing an 8KB buffer would take you at least 600 cycles (probably a lot more) to use the buffers.
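
    A quick sketch of where a number like 600 cycles comes from, assuming one 128-bit LS access per cycle and no overlap between loads and stores (both are simplifying assumptions, and the function name is hypothetical):

```python
QUADWORD_BYTES = 16  # one 128-bit LS access


def min_ls_cycles(input_bytes: int, output_bytes: int) -> int:
    """Lower bound on cycles spent just touching the buffers.

    Assumes one LS access per cycle and that loads and stores do not overlap.
    """
    loads = -(-input_bytes // QUADWORD_BYTES)    # ceiling division
    stores = -(-output_bytes // QUADWORD_BYTES)
    return loads + stores


print(min_ls_cycles(2 * 1024, 8 * 1024))  # 640
```

    That is 128 loads plus 512 stores, before any actual computation is counted.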

    If you have an SPU that will only work in streaming mode, you can have another SPU use some of its LS.

    If you can't use buffers (because you don't know what data you will need), I think that the best option would be doing two or more things at the same time.
     
    Jawed likes this.
  4. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,837
    Likes Received:
    166
    Location:
    Minato-ku, Tokyo
    http://www.watch.impress.co.jp/game/docs/20050929/3dinis.htm

    Zenji Nishikawa uploaded his detailed report on the presentation after clearing the NDA, with 1024x576 direct-feed screenshots...

    + The engine, nFactor2, is an in-house engine by Inis that utilizes a multicore processor and an SM3.0 GPU.

    + FP10 (7e3) HDR + tone mapping
    (pupil simulation by dynamic tone mapping)
    http://www.watch.impress.co.jp/game/docs/20050929/ini04.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini05.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini06.htm
    (HDR bloom/glare)
    http://www.watch.impress.co.jp/game/docs/20050929/ini07.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini08.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini09.htm

    + Normal maps, parallax maps (details created in ZBrush)
    http://www.watch.impress.co.jp/game/docs/20050929/ini13.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini14.htm

    + Light-Space Perspective Shadow Maps (LSPSM)
    http://www.watch.impress.co.jp/game/docs/20050929/ini32.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini33.htm

    + The shader was written in assembly at first but was converted to HLSL, which resulted in some performance gain. The performance of creating the soft penumbra for LSPSM could be doubled thanks to the dynamic branching supported by Pixel Shader 3.0.
    http://www.watch.impress.co.jp/game/docs/20050929/ini34.htm

    + 2x AA is applied to scene rendering without penalty thanks to eDRAM

    + The physics engine is NovodeX, which is suited to multicore

    + Hair physics is an original implementation by Inis that runs on both the CPU and the GPU.
    (hair and accessories driven by physics)
    http://www.watch.impress.co.jp/game/docs/20050929/ini41.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini42.htm
    (more physics demo)
    http://www.watch.impress.co.jp/game/docs/20050929/ini43.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini44.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini45.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini46.htm
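
    To illustrate why dynamic branching can help with the soft penumbra mentioned in the list above, here is a CPU-side sketch in Python. The report does not describe the actual branch condition Inis uses, so the probe pattern, the binomial 5x5 weights and all names here are assumptions: a handful of cheap probe samples decide whether the pixel is fully lit or fully shadowed, and only pixels near a shadow edge pay for the full 5x5 filter.

```python
BINOMIAL = [1, 4, 6, 4, 1]  # 1D weights; the outer product approximates a 5x5 Gaussian
KERNEL = [[a * b / 256.0 for b in BINOMIAL] for a in BINOMIAL]
PROBES = [(-2, -2), (2, -2), (0, 0), (-2, 2), (2, 2)]  # corners and centre of the footprint


def lit(shadow_map, x, y, depth):
    """Return 1.0 if a receiver at `depth` is lit at texel (x, y), else 0.0."""
    h, w = len(shadow_map), len(shadow_map[0])
    x = min(max(x, 0), w - 1)  # clamp addressing at the map edges
    y = min(max(y, 0), h - 1)
    return 1.0 if depth <= shadow_map[y][x] else 0.0


def soft_shadow(shadow_map, x, y, depth):
    """Return (light factor, whether the full 5x5 filter had to run)."""
    probes = [lit(shadow_map, x + dx, y + dy, depth) for dx, dy in PROBES]
    if min(probes) == max(probes):
        # All probes agree: treat the pixel as fully lit or fully shadowed.
        # This is an approximation, since texels between the probes are not checked.
        return probes[0], False
    total = 0.0
    for j in range(5):
        for i in range(5):
            total += KERNEL[j][i] * lit(shadow_map, x + i - 2, y + j - 2, depth)
    return total, True


# Occluder (depth 0.2) over the left half of the map, nothing over the right half.
shadow_map = [[0.2] * 4 + [1.0] * 4 for _ in range(8)]
for x in (1, 4, 6):
    print(x, soft_shadow(shadow_map, x, 4, depth=0.5))
```

    Only the pixel at the shadow edge takes the expensive path; on a GPU the saving depends on how coherent the branch is across neighbouring pixels.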

    Besides, Inis revealed the rendering pipeline of nFactor2.
    http://www.watch.impress.co.jp/game/docs/20050929/ini47.htm
    Its average polygon count per scene is 150,000. As the vertex shader works in the first 5 passes, the GPU apparently processes about 750,000 polygons per frame (150,000 x 5), so it's still pixel-shader intensive.

    + Pass 1/2 - Shadowmap
    (rendered by LSPSM into FP24, D3DFMT_D24FS8 depth buffer)
    http://www.watch.impress.co.jp/game/docs/20050929/ini48.htm

    + Pass 3 - Z buffer prepass
    (Deferred Rendering, which is fast on Xenos thanks to the depth buffer being resident in eDRAM)
    http://www.watch.impress.co.jp/game/docs/20050929/ini49.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini50.htm

    + Pass 4 - Shadow color
    According to the developer, this pass was originally part of pass 5 but was separated out because of sub-par performance. He suggested that the texture cache was being hurt by the shadow rendering: the 5x5 Gaussian filter for soft shadows might have destroyed texture cache locality during color rendering. Nishikawa speculates that Xenos has a relatively small number of transistors because some cache memory was removed while more registers were added for multithreading, so the Xbox 360 may require Xenos-specific optimization, unlike cache-rich NVIDIA GPUs.
    http://www.watch.impress.co.jp/game/docs/20050929/ini51.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini50.htm

    + Pass 5 - Color, lighting
    Diffuse, normal map, environmental map, gloss mapping etc. Lighting by all light sources in a scene is done in this pass.
    http://www.watch.impress.co.jp/game/docs/20050929/ini52.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini50.htm

    + Pass 6 - 11 - Luminance measurement
    Scanning the HDR-rendered frame to get its average luminance. The GPU load is relatively small, as it is only framebuffer processing in the pixel shader.

    + Pass 12 - 19 - Bloom/Glare
    Adding blur to places with higher-than-average luminance in a low-res buffer, then blending the result with the render target
    http://www.watch.impress.co.jp/game/docs/20050929/ini54.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini50.htm

    + Pass 20 - Tonemapping
    pupil simulation (dynamic exposure adjustment)
    http://www.watch.impress.co.jp/game/docs/20050929/ini55.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini50.htm

    + Pass 21 - Depth of Field
    http://www.watch.impress.co.jp/game/docs/20050929/ini56.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini57.htm
    http://www.watch.impress.co.jp/game/docs/20050929/ini58.htm
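
    The HDR part of the pipeline above (passes 6 to 20) can be sketched roughly as follows. This is a toy CPU version under my own assumptions, not the nFactor2 implementation: it works on a square power-of-two grid of scalar luminance values, uses a 3x3 box blur for the glare, and uses the Reinhard operator with exponential adaptation for the tone mapping. The report does not say which operators Inis actually uses, and all function names are hypothetical.

```python
import math


def downsample_2x2(img):
    """Average each 2x2 block, like one full-screen reduction pass."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def average_luminance(img):
    """Reduce a square power-of-two image to 1x1; returns (average, pass count)."""
    passes = 0
    while len(img) > 1:
        img = downsample_2x2(img)
        passes += 1
    return img[0][0], passes


def box_blur(img):
    """3x3 box blur with edge clipping (stand-in for the real glare filter)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            taps = [img[yy][xx]
                    for yy in range(max(y - 1, 0), min(y + 2, h))
                    for xx in range(max(x - 1, 0), min(x + 2, w))]
            row.append(sum(taps) / len(taps))
        out.append(row)
    return out


def add_bloom(img, avg, strength=0.5):
    """Blur what exceeds the average in a half-res buffer, then add it back."""
    low = downsample_2x2(img)
    bright = [[max(v - avg, 0.0) for v in row] for row in low]
    glow = box_blur(bright)
    return [[img[y][x] + strength * glow[y // 2][x // 2]
             for x in range(len(img[0]))]
            for y in range(len(img))]


def adapt(current, target, dt, rate=1.0):
    """Move the adapted luminance toward the measured one (the 'pupil')."""
    return current + (target - current) * (1.0 - math.exp(-dt * rate))


def tonemap(value, adapted_lum, key=0.18):
    """Reinhard operator: maps any non-negative HDR value into [0, 1)."""
    scaled = value * key / max(adapted_lum, 1e-6)
    return scaled / (1.0 + scaled)


frame = [[0.1] * 4 for _ in range(4)]
frame[1][2] = 8.0                      # one very bright pixel
avg, passes = average_luminance(frame)
bloomed = add_bloom(frame, avg)
adapted = adapt(0.1, avg, dt=1.0 / 60.0)
print(passes, avg, tonemap(bloomed[1][2], adapted))
```

    Each reduction step halves the buffer in both directions, which is why the luminance measurement needs several passes but stays cheap.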

    The rest is about the breakdown of threads which was covered before in this thread.
    One new revelation: an experimental version of the demo used 5 HW threads, but the demo shown at the presentation uses 4 HW threads, with Thread 0 running both the main game loop and the rendering engine.
     
    Jawed likes this.
  5. Titanio

    Legend

    Joined:
    Dec 1, 2004
    Messages:
    5,670
    Likes Received:
    51
    Wow, great post one, thanks! Very detailed breakdown.

    My god, that thing is ugly! :D Engine looks nice though :)
     
  6. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,579
    Likes Received:
    16,056
    Location:
    Under my bridge
    What's with the weird fuzz? Is that their DOF? And are there any non-AA'd pics for comparison?
     
  7. pipo

    Veteran

    Joined:
    Jun 8, 2005
    Messages:
    2,627
    Likes Received:
    30
    Cheers one!
     
  8. czekon

    Regular

    Joined:
    Aug 9, 2005
    Messages:
    741
    Likes Received:
    6
    great info one :D ....vid would be nice :D
     
  9. Inane_Dork

    Inane_Dork Rebmem Roines
    Veteran

    Joined:
    Sep 14, 2004
    Messages:
    1,987
    Likes Received:
    46
    Thanks, one.

    It's too bad the leaves and hair look so... wrong.
     
  10. Acert93

    Acert93 Artist formerly known as Acert93
    Legend

    Joined:
    Dec 9, 2004
    Messages:
    7,782
    Likes Received:
    162
    Location:
    Seattle
    And all the technology in the world cannot make up for horrible art :twisted:
     
  11. _phil_

    Veteran

    Joined:
    Jan 3, 2003
    Messages:
    1,659
    Likes Received:
    13
    Keep that quote. It'll be useful this coming gen.
     
    #71 _phil_, Sep 29, 2005
    Last edited by a moderator: Sep 29, 2005
  12. mckmas8808

    Legend

    Joined:
    Mar 8, 2005
    Messages:
    6,744
    Likes Received:
    28
    Looks nice. I never really noticed how great DOF can be when applied to video games. Great find, one.
     
  13. London Geezer

    Legend Subscriber

    Joined:
    Apr 13, 2002
    Messages:
    23,853
    Likes Received:
    9,824
    Am I the only one to get some very strange fuzzy pixelisation in the shots with DOF? Looks terrible.

    The rest looks OK I guess. The art is really awful, but I'm sure that technically it's a good engine.
     
  14. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    What happened to polygons?

    Too many polygon edges for a tech demo of an "optimized" engine. Lighting and art style are the strongest points.
     
  15. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,579
    Likes Received:
    16,056
    Location:
    Under my bridge
    No, as I said, what's with the fuzz? That's the worst DOF fake I've ever seen, such that I thought maybe my graphics were getting screwed. Every other game's DOF is a Gaussian-like blur, but this thing looks a mess.
     
  16. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    Silly shadow

    The shadow makes no sense. The partial shadow on the object behind is not continuous with the main shadow.

    http://www.watch.impress.co.jp/game/docs/20050929/ini51.htm
     
  17. mckmas8808

    Legend

    Joined:
    Mar 8, 2005
    Messages:
    6,744
    Likes Received:
    28
    Funny thing is most people think that the art-style is the worst thing.
     
  18. ihamoitc2005

    Veteran

    Joined:
    Sep 21, 2005
    Messages:
    1,181
    Likes Received:
    15
    art style


    Yes, that is very funny. I agree the creatures are very ugly, but I admire the composition of the scene, and in a troll-like way the creatures can be seen as charming as well, no? It depends on how animation, voices, etc. are combined. Also, physics could be interesting with multiple spheres. I am very disappointed by other design choices, such as spending excessive GPU cycles for the resulting poor shadowing. I might prefer it if fake shadows were used so that the polygon count could be improved and trees and characters would be smooth. But I understand this is a tech demo, so the goal is to show the range of capabilities of the engine rather than the design choices of the developer.
     
  19. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    I wouldn't draw too many far-reaching conclusions on the quality of the DOF in this tech-demo, as all the pics look like they use very heavy jpeg compression (lots of artefacting all over the place). Things almost always get screwy when people try to pick apart pictures downloaded off the web and analyze them down to the smallest details.
     
  20. chroniceyestrain

    Newcomer

    Joined:
    Sep 10, 2005
    Messages:
    167
    Likes Received:
    1