GPU Ray-tracing for OpenCL

Are there any released in the wild yet? I thought they wouldn't be available until 12th of April.
But as soon as I can get access to one, I'll try to get some results. Have great hopes for the results!
 
Anyone with GTX480 willing to join the party?

Not yet. However, if you want to see some big numbers, have a look at the screenshots posted by KyungSoo in the LuxRender forum dedicated to GPU acceleration.

8 GPUs (!) at work:

SLG_8gpu.png


4 Tesla at work:

4tesla.png


I'm looking forward to the first test with Fermi too ;)
 
Ehm.. how can the 5870 do more passes in the same time (and show a *very* similar image that if anything is slightly better - like the number of passes indicate), but get a much lower samples/sec count? Looks like it's doing same/more work in the same time
 
The time shown is the time between screen updates. The application varies workload per invocation of the OpenCL kernel in order to produce a consistent 0.5s update interval.

Jawed
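To illustrate the point about the constant update interval, here is a minimal Python sketch of that kind of workload adaptation. It is not the application's actual code: the names `next_workload` and `simulate`, the starting workload, and the device speeds are all made up for illustration. It only shows why two devices with very different throughput can both report roughly the same time per update.

```python
def next_workload(current_samples, elapsed_s, target_s=0.5,
                  min_samples=1, max_samples=1 << 24):
    """Scale the per-launch sample count so the next launch takes ~target_s.

    Hypothetical helper: assumes launch time grows linearly with sample count.
    """
    if elapsed_s <= 0:
        # Too fast to measure: just double the work and try again.
        return min(max_samples, current_samples * 2)
    scaled = round(current_samples * target_s / elapsed_s)
    return max(min_samples, min(max_samples, scaled))


def simulate(device_samples_per_sec, launches=4, start=1000):
    """Simulate a device with a fixed (made-up) throughput.

    Returns a list of (samples_in_launch, seconds_for_launch) pairs.
    """
    workload = start
    history = []
    for _ in range(launches):
        elapsed = workload / device_samples_per_sec
        history.append((workload, elapsed))
        workload = next_workload(workload, elapsed)
    return history


if __name__ == "__main__":
    for speed in (4_000_000, 8_000_000):
        samples, seconds = simulate(speed)[-1]
        print(f"{speed} samples/s -> {samples} samples per update, {seconds:.2f}s")
```

In this toy model both devices settle at about 0.5s per update, and the faster one simply does more samples in each update, so the displayed time alone does not tell you which device did more work.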
 
Very good result of GTX 480 for smallptGPU, but performance in smallluxGPU is rather disappointing...
 
Thanks, Cho. However, you need to tune the configuration a bit for your hardware and for still rendering (instead of preview). You had only a 50% load on the 480.

You should edit the scenes/luxball/render-fast.cfg file and replace the content with:

image.width = 640
image.height = 480
batch.halttime = 0
scene.file = scenes/luxball/luxball.scn
scene.fieldofview = 45
opencl.latency.mode = 0
opencl.nativethread.count = 0
opencl.cpu.use = 0
opencl.gpu.use = 1
opencl.platform.index = 0
opencl.renderthread.count = 4
opencl.gpu.workgroup.size = 64
screen.refresh.interval = 2000
screen.type = 3
screen.gamma = 2.2
path.maxdepth = 6
path.russianroulette.depth = 5
path.russianroulette.prob = 0.75
path.shadowrays = 1

With this configuration, rendering will use only the GPU, 4 threads will feed the GPU (I assume you have a quad core), and preview mode will be disabled.
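If you want to double-check a .cfg file before a run, a small script along these lines could help. This is only a sketch in Python, not part of SmallLuxGPU: `parse_cfg` and `summarize` are hypothetical helpers, and the meaning of each key is inferred from its name and from the description above.

```python
def parse_cfg(text):
    """Parse 'key = value' lines into a dict of strings (hypothetical helper)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def summarize(settings):
    """Report the settings discussed above.

    Assumption: opencl.cpu.use / opencl.gpu.use are 0/1 switches and
    opencl.renderthread.count is the number of threads feeding the GPU.
    """
    return {
        "gpu_only": settings.get("opencl.gpu.use") == "1"
        and settings.get("opencl.cpu.use") == "0",
        "render_threads": int(settings.get("opencl.renderthread.count", "0")),
    }


SAMPLE = """
opencl.cpu.use = 0
opencl.gpu.use = 1
opencl.renderthread.count = 4
"""

if __name__ == "__main__":
    print(summarize(parse_cfg(SAMPLE)))
```

For a real check, read the contents of scenes/luxball/render-fast.cfg and pass them to `parse_cfg` instead of `SAMPLE`.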

For reference, this is the result of my i7 860+5870+5850:

i860+hd5870+hd5850.jpg


Indeed, tuning the configuration is very important.
 
Very good result of GTX 480 for smallptGPU, but performance in smallluxGPU is rather disappointing...

I think Cho just needs a bit of tuning for SmallLuxGPU. However, keep in mind that SmallptGPU uses a very small dataset (i.e. a few bytes), while SmallLuxGPU uses a dataset of several MB.

Maybe the size of the Fermi cache lets it shine in the first case while it is nearly useless in the second.
 
Thanks, Dade, I understand that. I was wondering how Fermi's cache will help in a real-world scenario like SmallLuxGPU. I will wait to see Cho's updated results.
 
Indeed very good showing from GTX480 in SmallPT :oops:

It is all finally starting to go in the right direction with GPGPU. I can only hope AMD and nVidia keep up this rate of development for another 3-5 years and real-time RT will be conquered!
 
1003292141a5c1b8da98d996b1.png


I am using an i7-920 with HT enabled.

The thread count is set to 16. The GPU load is about 67~78%.
 
Wow -- 93°C for just 78% load?

Anyway, here is my HD5870 @ 900MHz GPU:

luxball.jpg


This is with 8 threads on Q9450. Four wouldn't saturate it enough, giving me lower sample rates.
 
I am using an i7-920 with HT enabled.

The thread count is set to 16. The GPU load is about 67~78%.

Thanks Cho, the correct value for the thread count should be 8 (4 real cores + 4 virtual for HT).

Anyway, the result seems to confirm that the 480 is about 2 times faster than the 5870 on GPGPU tasks (about 8M rays/sec vs. about 4M rays/sec).
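As a rough way to pick a starting thread count, the rule of thumb above (one thread per logical core, so 4 real + 4 HT = 8) can be sketched in Python. The helper names are hypothetical. It is only a starting point: as reported earlier in the thread, a Q9450 needed 8 threads rather than 4 to saturate a 5870.

```python
import os


def suggested_render_threads(logical_cpus=None):
    """Return one render thread per logical CPU (real cores + HT siblings).

    Hypothetical helper; pass logical_cpus to override auto-detection.
    """
    count = logical_cpus if logical_cpus is not None else os.cpu_count()
    return max(1, count or 1)  # os.cpu_count() may return None


def render_thread_line(logical_cpus=None):
    """Format the matching line for the .cfg file."""
    return f"opencl.renderthread.count = {suggested_render_threads(logical_cpus)}"


if __name__ == "__main__":
    # An i7-920 with HT enabled exposes 8 logical CPUs.
    print(render_thread_line(8))
    # Auto-detect on the current machine.
    print(render_thread_line())
```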
 