Tech-Demo: SVO Voxel Raytracing with Image Warping (4x Speedup)

Summary The tech-demo consists of a sparse voxel octree raytracer that uses frame-to-frame coherence to speed up raytracing. Pixels from one frame are projected into the next and only the holes are raycast, which gives a speedup of up to 4x.

Download Demo/Zip (Note: you might need the VS2012 redistributables)

Controls Mouse: look around; WASDQE: move around; Space: show only the reprojected pixels

Details The octree is a mixed normal / compact / linear node octree. The warping method uses 4 buffers to cache the most recent pixels on the screen. I developed this algorithm a year ago and have finally found time to make a small demo for everyone to try. It uses OpenGL and OpenCL. So far it works well on NVIDIA, but I haven't tried it on ATI or Intel HD yet.
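To make the reprojection idea concrete, here is a minimal CPU-side sketch in C of the warp-then-find-holes step. It is not the demo's code: the pinhole `Camera` struct, the `reproject` function, `HOLE_DEPTH` and the translation-only camera are all assumptions made for illustration, and the demo's four-buffer cache and OpenCL kernels are not modelled here.

```c
#include <assert.h>
#include <math.h>    /* INFINITY */
#include <stdint.h>

/* Depth value marking a pixel that no reprojected sample landed on. */
#define HOLE_DEPTH INFINITY

/* Hypothetical pinhole camera: looks down +z and never rotates.
   focal is the focal length measured in pixels. */
typedef struct { float x, y, z, focal; } Camera;

/* Forward-warp the previous frame (colour + z-depth) into the next camera.
   Returns the number of holes, i.e. pixels that would still need a ray. */
int reproject(const uint32_t *prev_color, const float *prev_depth,
              uint32_t *next_color, float *next_depth,
              int w, int h, Camera prev, Camera next)
{
    assert(w > 0 && h > 0);

    /* Start with every pixel marked as a hole. */
    for (int i = 0; i < w * h; ++i) {
        next_color[i] = 0;
        next_depth[i] = HOLE_DEPTH;
    }

    for (int py = 0; py < h; ++py) {
        for (int px = 0; px < w; ++px) {
            int i = py * w + px;
            float d = prev_depth[i];
            if (!(d > 0.0f) || d == HOLE_DEPTH)
                continue;               /* nothing cached for this pixel */

            /* Unproject the pixel centre to a world-space point. */
            float wx = prev.x + (px + 0.5f - w * 0.5f) * d / prev.focal;
            float wy = prev.y + (py + 0.5f - h * 0.5f) * d / prev.focal;
            float wz = prev.z + d;

            /* Project that point into the next camera. */
            float nz = wz - next.z;
            if (!(nz > 0.0f))
                continue;               /* behind the new camera */
            float fx = (wx - next.x) * next.focal / nz + w * 0.5f;
            float fy = (wy - next.y) * next.focal / nz + h * 0.5f;
            if (!(fx >= 0.0f && fx < (float)w && fy >= 0.0f && fy < (float)h))
                continue;               /* moved off screen */

            /* Depth test so the nearest sample wins when several collide. */
            int j = (int)fy * w + (int)fx;
            if (nz < next_depth[j]) {
                next_depth[j] = nz;
                next_color[j] = prev_color[i];
            }
        }
    }

    int holes = 0;
    for (int i = 0; i < w * h; ++i)
        if (next_depth[i] == HOLE_DEPTH)
            ++holes;
    return holes;
}
```

Every pixel still at `HOLE_DEPTH` afterwards is one a raycaster would have to trace, so the returned count is the number of rays needed for that frame. With an unchanged camera the sketch reports zero holes; a sideways step exposes a strip of holes along one edge.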
Original Article : Blog Link / Further reading

Clipboard01.png


Clipboard02.png
 
Code:
read successfully ../data/palette.bmp ; 16x16x32 Bit
EXE_DIR
>> 1 OpenCL platform(s) found:
  > 0
  PROFILE = FULL_PROFILE
  VERSION = OpenCL 2.0 AMD-APP (1642.5)
  NAME = AMD Accelerated Parallel Processing
  VENDOR = Advanced Micro Devices, Inc.
  EXTENSIONS = cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
> 0
>> Try platform 0 ( 2 device(s) )
  > 0
  DEVICE_NAME = Capeverde
  DEVICE_VENDOR = Advanced Micro Devices, Inc.
  DEVICE_VERSION = OpenCL 1.2 AMD-APP (1642.5)
  DRIVER_VERSION = 1642.5 (VM)
  DEVICE_MAX_COMPUTE_UNITS = 10
  DEVICE_MAX_CLOCK_FREQUENCY = 1000
  DEVICE_GLOBAL_MEM_SIZE = 1073741824
  CL_DEVICE_LOCAL_MEM_SIZE = 32768
  > 1
  DEVICE_NAME = Tahiti
  DEVICE_VENDOR = Advanced Micro Devices, Inc.
  DEVICE_VERSION = OpenCL 1.2 AMD-APP (1642.5)
  DRIVER_VERSION = 1642.5 (VM)
  DEVICE_MAX_COMPUTE_UNITS = 32
  DEVICE_MAX_CLOCK_FREQUENCY = 1000
  DEVICE_GLOBAL_MEM_SIZE = 3221225472
  CL_DEVICE_LOCAL_MEM_SIZE = 32768

Selected : platform 0 device -1

No compatible OpenCL Device found.

Callstack:
ocl_init [c:\code\old\game_voxelmaster_repro\src\ocl.h:143]
 
:nope:

Code:
read successfully ../data/palette.bmp ; 16x16x32 Bit
EXE_DIR D:\User Folders\Desktop\SVO-Reprojection-TechDemo\bin32
>> 1 OpenCL platform(s) found:
  > 0
  PROFILE = FULL_PROFILE
  VERSION = OpenCL 1.2 CUDA 7.5.8
  NAME = NVIDIA CUDA
  VENDOR = NVIDIA Corporation
  EXTENSIONS = cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
> 0
>> Try platform 0 ( 1 device(s) )
  > 0
  DEVICE_NAME = GeForce GTX 780 Ti
  DEVICE_VENDOR = NVIDIA Corporation
  DEVICE_VERSION = OpenCL 1.2 CUDA
  DRIVER_VERSION = 352.86
  DEVICE_MAX_COMPUTE_UNITS = 15
  DEVICE_MAX_CLOCK_FREQUENCY = 1084
  DEVICE_GLOBAL_MEM_SIZE = 3221225472
  CL_DEVICE_LOCAL_MEM_SIZE = 49152

Selected : platform 0 device 0

CL Compilation failed:
<kernel>:172:2: warning: backslash and newline separated by space
\
^
<kernel>:498:84: warning: use of logical '&&' with constant operand
                a_screenBuffer[ofs]=( (col1&0xfcfcfcfc)+(col2&0xfcfcfcfc)+(col3&0xfcfcfcfc)+(col4&&0xfcfcfcfc))>>2;
                                                                                                 ^ ~~~~~~~~~~
<kernel>:498:84: note: use '&' for a bitwise operation
                a_screenBuffer[ofs]=( (col1&0xfcfcfcfc)+(col2&0xfcfcfcfc)+(col3&0xfcfcfcfc)+(col4&&0xfcfcfcfc))>>2;
                                                                                                 ^~~~~~~~~~~~
                                                                                                 &
<kernel>:498:84: note: remove constant to silence this warning
                a_screenBuffer[ofs]=( (col1&0xfcfcfcfc)+(col2&0xfcfcfcfc)+(col3&0xfcfcfcfc)+(col4&&0xfcfcfcfc))>>2;
                                                                                                 ^~~~~~~~~~~~
<kernel>:729:26: error: can't convert between vector values of different size ('float3' and 'double')
        float3 raypos = a_m0.xyz*16.0;//+normalize(delta)*distance)*16.0;
                        ~~~~~~~~^~~~~
<kernel>:749:14: error: can't convert between vector values of different size ('float3' and 'double')
                phit=raypos/16.0;
                     ~~~~~~^~~~~
<kernel>:845:26: error: can't convert between vector values of different size ('float3' and 'double')
        float3 raypos = a_m0.xyz*16.0;
                        ~~~~~~~~^~~~~
<kernel>:866:14: error: can't convert between vector values of different size ('float3' and 'double')
                phit=raypos/16.0;
                     ~~~~~~^~~~~
<kernel>:975:26: error: can't convert between vector values of different size ('float3' and 'double')
        float3 raypos = a_m0.xyz*16.0;
                        ~~~~~~~~^~~~~
<kernel>:995:14: error: can't convert between vector values of different size ('float3' and 'double')
                phit=raypos/16.0;
                     ~~~~~~^~~~~


Callstack:
ocl_compile [c:\code\old\game_voxelmaster_repro\src\ocl.h:50]
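
For anyone hitting the same log: the `&&` warning at kernel line 498 looks like a typo in a packed four-pixel average, since `col4&&0xfcfcfcfc` evaluates to 0 or 1 rather than a masked colour. Below is a small standalone C sketch of that average with `&` in all four terms. The function name `average4` and the 64-bit accumulator are my own additions, as the kernel source itself is not shown in this thread. The `float3`/`double` errors are a separate issue; in OpenCL C the usual remedy is a single-precision literal such as `16.0f`, assuming single precision is what the kernel intends.

```c
#include <assert.h>
#include <stdint.h>

/* Average four packed pixels (8 bits per channel) without unpacking them.
   Masking with 0xfcfcfcfc clears the low two bits of every channel, so the
   four-way sum of one channel cannot spill into the channel above it once
   the result is shifted right by two. */
uint32_t average4(uint32_t c1, uint32_t c2, uint32_t c3, uint32_t c4)
{
    const uint32_t mask = 0xfcfcfcfcu;

    /* Bitwise '&' on all four terms. A logical '&&' on the last term would
       contribute 0 or 1 instead of the masked pixel.
       Assumption: a 64-bit accumulator is used here so the sum of the top
       channel cannot overflow in this standalone version. */
    uint64_t sum = (uint64_t)(c1 & mask) + (c2 & mask) + (c3 & mask) + (c4 & mask);

    return (uint32_t)(sum >> 2);
}
```

With the `&&` typo, a call such as `average4(0, 0, 0, 0xfc)` would give 0 instead of 0x3f, so the fourth sample effectively drops out of the blend.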
 
I'll try it on my system, which is set up for LuxRender raytracing, OpenCL 2.1.

Edit: it seems to compile normally.

2x HD7970, driver 15.4 beta, but I have the AMD APP SDK installed separately from the driver.
 
Runs fine on my GTX 970: I can't get it below 160 fps on the Imrod+Lucy one, and wild movements can bring it down to about 100 fps on the Trees demo.
 
I'm away from the PC so not gonna try it on my laptop with Intel 3000 grafix :).

Obviously this method relies on slow camera/object movements, so it wouldn't suit a fast-paced FPS or anything like that. Still, it could be useful for some things. Thanks for sharing :)

Question: In a static scene with no camera movement does the framerate→∞ ?
 