GPU Ray-tracing for OpenCL

Discussion in 'Rendering Technology and APIs' started by fellix, Dec 27, 2009.

  1. MrGaribaldi

    MrGaribaldi Regular

    Are there any released in the wild yet? I thought they wouldn't be available until the 12th of April.
    But as soon as I can get access to one, I'll try to get some results. I have great hopes for them!
     
  2. Dade

    Dade Newcomer

    Not yet. However, if you want to see some big numbers, have a look at the screenshots posted by KyungSoo in the LuxRender forum dedicated to GPU acceleration.

    8 GPUs (!) at work:

    [​IMG]

    4 Tesla at work:

    [​IMG]

    I'm looking forward to the first test with Fermi too :wink:
     
  3. cho

    cho Regular

    GTX 480:
    [​IMG]

    GTX 285:
    [​IMG]

    HD 5870:
    [​IMG]
     
  4. CNCAddict

    CNCAddict Regular

    CAN YOU SAY WOOOHOOOO. Looks like I may trade in my 5850 after all :shock:
     
  5. rpg.314

    rpg.314 Veteran

    Holy cow... :shock:

    Is that a ~20x jump I am seeing there? With just a tiny L1? I am assuming you used 48K for the L1 cache.

    Come on Dave, give us some cachey goodness on Radeon 6xx0.
     
  6. fellix

    fellix Veteran

    DAMN!
    The default LDS/L1 partitioning for GF100 (as of now) is 48/16 KB.
     
  7. Dade

    Dade Newcomer

    Omg :!:

    "Old" NVIDIA cards have always shown some problems with SmallptGPU (I wouldn't focus too much on the speedup compared with the GTX 285), but running more than 2 times faster than a 5870 is eye-popping :shock:

    Cho, any chance to run one of the latest SmallLuxGPU (http://davibu.interfree.it/opencl/smallluxgpu/slg-v1.4beta3.tgz) ?
     
  8. Psycho

    Psycho Regular

    Ehm... how can the 5870 do more passes in the same time (and show a *very* similar image that, if anything, is slightly better, as the number of passes indicates) but get a much lower samples/sec count? It looks like it's doing the same or more work in the same time.
     
  9. Jawed

    Jawed Legend

  10. cho

    cho Regular

    GTX 480:
    [​IMG]

    GTX 285:
    [​IMG]

    HD 5870:
    [​IMG]
     
  11. Jawed

    Jawed Legend

    The time shown is the time between screen updates. The application varies workload per invocation of the OpenCL kernel in order to produce a consistent 0.5s update interval.
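
The adaptive workload described above can be sketched as follows. This is only an illustration of the idea, not SmallptGPU's actual code: the function name `next_workload` and its parameters are hypothetical. It also shows why the displayed time is roughly constant on every card and cannot be used to compare GPUs directly.

```python
def next_workload(current_samples, elapsed_s, target_s=0.5, min_samples=1):
    """Return the workload for the next kernel invocation.

    Hypothetical helper: scales the amount of work so that the next
    invocation takes roughly target_s seconds, based on how long the
    previous one took.
    """
    if elapsed_s <= 0.0:
        # No usable timing yet: simply try a larger batch.
        return current_samples * 2
    scaled = int(current_samples * target_s / elapsed_s)
    return max(min_samples, scaled)


# A slow update (1.0 s) halves the workload, a fast one (0.25 s) doubles it.
slower_card = next_workload(1000, 1.0)
faster_card = next_workload(1000, 0.25)
```

With this kind of scheme a faster GPU ends up with a larger workload per update rather than a shorter update time, so samples/sec is the number to compare.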

    Jawed
     
  12. jj99

    jj99 Newcomer

    Very good result from the GTX 480 in SmallptGPU, but the performance in SmallLuxGPU is rather disappointing...
     
  13. Dade

    Dade Newcomer

    Thanks, Cho. However, you need to tune the configuration a bit for your hardware and for still rendering (instead of preview). You had only a 50% load on the 480.

    You should edit the scenes/luxball/render-fast.cfg file and replace the content with:

    image.width = 640
    image.height = 480
    batch.halttime = 0
    scene.file = scenes/luxball/luxball.scn
    scene.fieldofview = 45
    opencl.latency.mode = 0
    opencl.nativethread.count = 0
    opencl.cpu.use = 0
    opencl.gpu.use = 1
    opencl.platform.index = 0
    opencl.renderthread.count = 4
    opencl.gpu.workgroup.size = 64
    screen.refresh.interval = 2000
    screen.type = 3
    screen.gamma = 2.2
    path.maxdepth = 6
    path.russianroulette.depth = 5
    path.russianroulette.prob = 0.75
    path.shadowrays = 1

    With this configuration, rendering uses only the GPU, 4 threads feed the GPU (I assume you have a quad core), and preview mode is disabled.
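
To check a tuned file before a run, the `key = value` layout above can be read with a few lines of Python. This is a minimal sketch, not SmallLuxGPU's own loader: `parse_cfg` is a hypothetical helper, and treating `#` lines as comments is an assumption about the file format.

```python
def parse_cfg(text):
    """Parse 'key = value' lines into a dict of strings (hypothetical helper)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' lines carry no settings.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip()
    return settings


# A few of the settings from the configuration posted above.
sample = """
opencl.cpu.use = 0
opencl.gpu.use = 1
opencl.renderthread.count = 4
opencl.gpu.workgroup.size = 64
screen.refresh.interval = 2000
"""

cfg = parse_cfg(sample)
gpu_only = cfg["opencl.cpu.use"] == "0" and cfg["opencl.gpu.use"] == "1"
render_threads = int(cfg["opencl.renderthread.count"])
```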

    For reference, this is the result of my i7 860+5870+5850:

    [​IMG]

    Indeed, tuning the configuration is very important.
     
  14. Dade

    Dade Newcomer

    I think Cho just needs a bit of tuning for SmallLuxGPU. However, keep in mind that SmallptGPU uses a very small dataset (i.e. a few bytes), while SmallLuxGPU uses a dataset of several MB.

    Maybe the Fermi cache shines in the first case while it is nearly useless in the second.
     
  15. jj99

    jj99 Newcomer

    Thanks, Dade, I understand that. I was wondering how Fermi's cache would help in a real-world scenario such as SmallLuxGPU. I will wait to see Cho's updated results.
     
  16. Lightman

    Lightman Veteran Subscriber

    Indeed, a very good showing from the GTX 480 in SmallptGPU :shock:

    It is all finally starting to go in the right direction with GPGPU. I can only hope AMD and NVIDIA keep up this rate of development for another 3-5 years and real-time RT will be conquered!
     
  17. cho

    cho Regular

    [​IMG]

    I am using an i7-920 with HT enabled.

    The thread count is set to 16. The GPU load is about 67~78%.
     
  18. fellix

    fellix Veteran

    Wow -- 93°C for just 78% load?

    Anyway, here is my HD5870 @ 900MHz GPU:

    [​IMG]

    This is with 8 threads on Q9450. Four wouldn't saturate it enough, giving me lower sample rates.
     
  19. cho

    cho Regular

    Yes, but the fan noise is OK at this speed.
     
  20. Dade

    Dade Newcomer

    Thanks Cho, the correct value for the thread count should be 8 (4 real cores + 4 virtual for HT).
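
As a rough starting point, that rule of thumb (one render thread per logical core) can be computed like this. It is only a sketch: `suggested_render_threads` is a hypothetical helper, and as the Q9450 result earlier in the thread shows, more threads than cores can sometimes feed the GPU better, so the value is worth tuning by hand.

```python
import os


def suggested_render_threads(logical_cpus=None):
    """One render thread per logical core (physical cores plus HT siblings).

    Hypothetical helper; pass logical_cpus explicitly to override detection.
    """
    if logical_cpus is None:
        # os.cpu_count() reports logical CPUs and may return None.
        logical_cpus = os.cpu_count() or 1
    return max(1, logical_cpus)


# i7-920: 4 physical cores with Hyper-Threading, so 8 logical cores.
threads_for_i7_920 = suggested_render_threads(4 * 2)
```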

    Anyway, the result seems to confirm that the 480 is about 2 times faster than the 5870 on GPGPU tasks (about 8M rays/sec vs. about 4M rays/sec).
     