OpenCL Solar System

Discussion in 'GPGPU Technology & Programming' started by moozoo, May 18, 2013.

  1. moozoo

    Newcomer

    Joined:
    Jul 23, 2010
    Messages:
    109
    Likes Received:
    1
    I have released an initial version of my OpenCL Solar System Simulation:
    http://sourceforge.net/projects/openclsolarsyst

    I'd appreciate it if people with decent video cards that support OpenCL with double precision could test it.
    I only have access to an old GTX 260 and Intel CPU OpenCL.
    I have not done a lot of testing on other systems.
    It still has a number of bugs.

    I don't have an installer yet. Just unzip it, read the readme.txt, then edit run.cmd to match your video card (-amd, -nvidia or -intel).

    It is double precision only and will check for it at startup.
    My intention is to make it very accurate.
    I'm not planning to add any fancy rendered planets etc. Lots of other products do that already.

    For AMD cards you may need to comment out the #pragma lines near the start of the adams.cl file.

    All constructive feedback is welcome.
    moozoo.wizard@gmail.com
     
  2. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    14,963
    Likes Received:
    2,343
    On an AMD 6950 it runs, but nothing seems to move.
    I centered on mass 3, zoomed in so I could see the Earth and the Moon, and set the time step to 4 hours, but the Moon didn't move.

    ps: comment your bloody code :D
     
  3. moozoo

    Video at http://www.youtube.com/watch?v=W4QpUU0zz7M
    I cannot seem to get an HD upload to work. Unfortunately, since the asteroids are all single pixels, they don't compress well for video, even an HD one.
    I will try to get access to an AMD graphics card.
    Make sure you start it with -amd -nsmall
    I can zip up the debug version that displays a log window. It spits out a large amount of debug information.

    Note that at one point in the video I point out the comet C/2013 A1, which has a very close encounter with Mars on the 19th Oct 2014.
     
  4. Davros

    I've seen the vid and done exactly the same as you, but there's no movement.
    The steps are iterating.
    Also, when I start it all that's visible is the Sun; I have to select number of bodies 309760 and then select 8192 to see what's in the YouTube clip.
     
  5. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Rather than use an AMD card, just install the AMD SDK as an alternative to the SDKs you have used. It'll work with your CPU and will at least allow you to debug that variant of OpenCL.
     
  6. moozoo

    If you have trouble on an AMD card, try these:

    1) Comment out the two pragma lines in adams.cl
    //#pragma OPENCL EXTENSION cl_khr_gl_sharing : enable
    //#pragma OPENCL EXTENSION cl_khr_fp64 : enable

    Note this file is in the same directory as the executable.

    2) Try -amd -cpu -nsmall

    3) Check the about box for the OpenCL device and platform

    4) Untick Option Blend

    5) Hold down Shift Z to zoom out and turn up the brightness.

    6) Try importing the astrorbsolexsmall.slf from the source code

    I know doing 1 and 2 works for AMD CPU OpenCL on the two systems I have access to.
    On my development machine I have the Intel OpenCL SDK, the AMD CPU OpenCL and the Nvidia GTX 260 OpenCL driver installed.
    The Intel-supplied opencl.dll is used as the system OpenCL ICD (I copy it back if it gets overwritten). It appears to be an official OpenCL 1.2 Khronos Group one.

    >ps: comment your bloody code :grin:
    Yes, I know, sorry about that. Time permitting, I will do this.

    Note that with this program you can simulate a number of interesting what-ifs.
    To do this, export to an .slf file, edit it, then import it.
    (Yes, importing is really slow...)
    For a rogue star passing through the solar system,
    add these lines after the Sun and before Mercury.
    This will add a 0.1 solar mass star that passes through the inner solar system, ejects Mars and causes chaos for the asteroid belt, Earth and Venus.

    286732.8 0.69595 0 0# RogueStar
    5.0335044325843779E+02 5.1028008243323814E+02 4.0646560291058268E+02
    -4.000000000000000E+01 -2.000000000000000E+01 -2.000000000000000E+01
     
  7. moozoo

    Yes, I actually started with the AMD SDK CPU-only OpenCL under XP.
    It is the only one that supported a Pentium 4...
    The AMD CPU-only OpenCL can be extracted from the AMD SDK installer.
    Open the AMD SDK installer in 7-Zip and navigate down into a folder called Packages (from memory). It's an .msi file with OpenCL and CPU in its name.

    I forgot about having to comment out the pragma lines.
    I read somewhere it was only an issue with the AMD CPU OpenCL driver.
    That said, I don't think that is the issue Davros is having. If the kernel fails to compile, it spits out error dialog boxes.
     
  8. pcchen

    pcchen Moderator
    Moderator Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    2,768
    Likes Received:
    150
    Location:
    Taiwan
    I ran it on a GTX 460 and it seems to run fine, though not very fast :)

    Do you plan to make a version using FP32 (but probably in a more accurate way, such as using two FP32 to represent one fractional number)? Also is there a FPS or step/second counter (for comparison)?
     
  9. Davros

    Tried your suggestions.
    Already commented out the pragma lines - I do read the readme ;)
    Holding down (or pressing many times) Shift Z does nothing (I don't think it's a zoom problem).

    What I see (8192 is selected):
    [IMG]

    I have to select number of bodies 309760 and I see this:
    [IMG]

    Then I select number of bodies 8192 and I see this:
    [IMG]

    It's just that there is zero motion (and the steps are iterating).

    ps:
    I have no astrorbsolexsmall.slf
     
  10. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,496
    Likes Received:
    910
    For what it's worth, I have an HD 6950 as well, and I am experiencing the exact same behaviour.
     
  11. moozoo

    Thank you very much everyone for your time trying this for me.
    On my lunch break, so this reply will be short.

    I'm guessing the AMD issue is related to the integration between OpenCL and OpenGL, i.e. cl_khr_gl_sharing on AMD.

    The OpenCL code updates its own data. Every 2nd update (from memory) I acquire a shared GL vertex array of points, update it from the OpenCL array of positions and then update the display.
    Originally I used a buffer to buffer copy, but I later turned it into a kernel so that I could center the display on a particular body.
    Somehow that is not working properly.
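
A minimal sketch of the centering copy described above, written in plain Python rather than as an OpenCL kernel. It assumes positions are stored as double-precision triples and that the display buffer is single precision (the thread does not say which format the GL vertex array uses). The names `copy_to_display` and `to_float32` are made up for illustration and are not taken from the program.

```python
import struct

def to_float32(value):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]

def copy_to_display(positions, center_index):
    """Return display coordinates relative to the body at center_index.

    The subtraction is done in double precision first; only the small
    relative offset is narrowed to float32 (assumed display format).
    """
    cx, cy, cz = positions[center_index]
    return [
        (to_float32(x - cx), to_float32(y - cy), to_float32(z - cz))
        for (x, y, z) in positions
    ]

# Hypothetical data: two bodies far from the origin but close to each other.
positions = [(0.0, 0.0, 0.0), (1.0e11, 0.0, 0.0), (1.0e11 + 0.25, 0.0, 0.0)]
display = copy_to_display(positions, center_index=1)
```

Narrowing each absolute position to float32 before subtracting would lose the 0.25 offset entirely, because float32 values near 1e11 are thousands of units apart. That is one reason to do the centering in the copy step instead of in the renderer.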
    Try uncommenting
    #pragma OPENCL EXTENSION cl_khr_gl_sharing : enable
    Actually I don't understand why the pragmas need to be commented out. Nvidia needs the cl_khr_fp64 one or it spits the dummy the first time it encounters a double.
    To me the AMD OpenCL (at least for the CPU) is buggy.

    I will need to use an AMD graphics card to debug this.
    I plan to buy a new system once Haswell is released early next month, and I will include an AMD HD 7970 GHz Edition graphics card.

    Since Nvidia dropped almost all of its developer support for OpenCL when it released CUDA 5.0, and because its double precision performance is so low on consumer-level boards, I am very keen to get it working on AMD graphics hardware.

    >Do you plan to make a version using FP32 (but probably in a
    >more accurate way, such as using two FP32 to represent
    >one fractional number)?

    No.

    My desire is to create something that is comparable in accuracy to JPL's Horizons ephemeris. Solex and DE118i prove that this can be done on consumer-level hardware. Those programs are single-threaded and CPU only.
    If anything, I might look at double-double or 128-bit fixed point fp128.
    On amd and intel hardware I doubt using two single precision numbers would be as fast as native doubles.
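
For readers unfamiliar with the "two numbers for one value" idea, here is a rough sketch of double-double addition built on Knuth's error-free two-sum. Python floats are IEEE doubles, so this illustrates the double-double case; the two-FP32 variant pcchen asked about uses the same steps on float32 pairs. The function names are illustrative only, and this is not code from the program.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b == s + err exactly."""
    s = a + b
    b_virtual = s - a
    err = (a - (s - b_virtual)) + (b - b_virtual)
    return s, err

def dd_add(x, y):
    """Add two double-double values, each given as a (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    # Renormalize so that hi carries the leading bits and lo the remainder.
    return two_sum(s, e)

# Accumulate many increments that are too small for a plain double to keep.
plain = 1.0
total = (1.0, 0.0)
for _ in range(1000):
    plain += 1e-20
    total = dd_add(total, (1e-20, 0.0))
```

After the loop `plain` is still exactly 1.0, while the low word of `total` holds the accumulated increments. Each addition costs several native operations, which is the performance trade-off being weighed in the post above.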

    When you are watching it run, remember every dot represents a real planet or asteroid and its position and movement are (hopefully) very accurate.
    If you center on Mars and run it to the 19th Oct 2014 you should see the comet C/2013 A1 almost hit Mars.

    >Also is there a FPS or step/second counter (for comparison)?

    I will look at adding this.
    The code has a goto date and stop function in it which I use to get a rough idea of how fast it is.
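
Until a counter exists in the program itself, a steps/second figure can be derived from a step count and wall-clock time. The sketch below is a generic helper, not the program's code; `StepRateCounter` and its methods are hypothetical names, and the clock is injectable so that it can be tested.

```python
import time

class StepRateCounter:
    """Tracks integration steps per wall-clock second (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self._steps = 0

    def add_steps(self, count=1):
        """Record that `count` more integration steps have completed."""
        self._steps += count

    def steps_per_second(self):
        """Average rate since construction; 0.0 if no time has elapsed."""
        elapsed = self._clock() - self._start
        return self._steps / elapsed if elapsed > 0 else 0.0
```

With an OpenCL queue the kernels run asynchronously, so the clock should only be read after the queue has been drained (for example with clFinish); otherwise the figure measures enqueue time, not compute time.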

    My development so far has been on just getting it to work.
    I don't have access to anything that can do kernel profiling until I buy new hardware. So the kernels aren't optimized.
     
  12. Davros

    [​IMG]

    ps: little idea, make a build that uses single precision and I'll see if that works on amd hardware
     
  13. Alexko

    Same result for me.
     
  14. moozoo

    I uploaded a debug version.
    It logs to a separate debug window.
     
  15. Davros

    Tried the debug version.
    Same problem.
    ps: it didn't log to a separate debug window.
    There is a log.txt in the folder but it's empty.

    Edit:
    It did create a log window; it was just hidden behind the main window and closes when you exit the program, so I never saw it.
     
  16. Davros

    Can't attach the log file because the forum limit is 19.5k

    Created Menu
    GLCanvas Create
    GLCanvas::InitGL
    Initialising 8192 particles
    GLCanvas::InitGL Done
    CLModel::InitCL
    Found 1 platforms
    Found Desired platform
    Using platform Advanced Micro Devices, Inc.
    platform has 2 devices
    Found CPU Device as requested
    max grav particles 2048

    Warning: adamsBashforth16 kernel has register spilling. Lower performance is expected.

    Using adamsBashforthKernel adamsBashforth4
    Using adamsKernel adamsMoulton3
    CLModel:InitCL Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    Init Frame Succeeded
    Application::OnInit Done
    Frame Stop
    CLModel:CleanUpCL
    CLModel:CleanUpCL Done
    GLCanvas::CleanUpGL
    GLCanvas::CleanUpGL Done
    GLCanvas::InitGL
    Initialising 8192 particles
    GLCanvas::InitGL Done
    CLModel::InitCL
    Found 1 platforms
    Found Desired platform
    Using platform Advanced Micro Devices, Inc.
    platform has 2 devices
    Found CPU Device as requested
    max grav particles 2048

    Warning: adamsBashforth16 kernel has register spilling. Lower performance is expected.

    Using adamsBashforthKernel adamsBashforth4
    Using adamsKernel adamsMoulton3
    CLModel:InitCL Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    Frame Stop
    CLModel:CleanUpCL
    CLModel:CleanUpCL Done
    GLCanvas::CleanUpGL
    GLCanvas::CleanUpGL Done
    GLCanvas::InitGL
    Initialising 309760 particles
    GLCanvas::InitGL Done
    CLModel::InitCL
    Found 1 platforms
    Found Desired platform
    Using platform Advanced Micro Devices, Inc.
    platform has 2 devices
    Found CPU Device as requested
    max grav particles 2048

    Warning: adamsBashforth16 kernel has register spilling. Lower performance is expected.

    Using adamsBashforthKernel adamsBashforth4
    Using adamsKernel adamsMoulton3
    CLModel:InitCL Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    Frame Stop
    CLModel:CleanUpCL
    CLModel:CleanUpCL Done
    GLCanvas::CleanUpGL
    GLCanvas::CleanUpGL Done
    GLCanvas::InitGL
    Initialising 8192 particles
    GLCanvas::InitGL Done
    CLModel::InitCL
    Found 1 platforms
    Found Desired platform
    Using platform Advanced Micro Devices, Inc.
    platform has 2 devices
    Found CPU Device as requested
    max grav particles 2048

    Warning: adamsBashforth16 kernel has register spilling. Lower performance is expected.

    Using adamsBashforthKernel adamsBashforth4
    Using adamsKernel adamsMoulton3
    CLModel:InitCL Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    Frame Stop
    CLModel:CleanUpCL
    CLModel:CleanUpCL Done
    GLCanvas::CleanUpGL
    GLCanvas::CleanUpGL Done
    GLCanvas::InitGL
    Initialising 8192 particles
    GLCanvas::InitGL Done
    CLModel::InitCL
    Found 1 platforms
    Found Desired platform
    Using platform Advanced Micro Devices, Inc.
    platform has 2 devices
    Found CPU Device as requested
    max grav particles 2048

    Warning: adamsBashforth16 kernel has register spilling. Lower performance is expected.

    Using adamsBashforthKernel adamsBashforth4
    Using adamsKernel adamsMoulton3
    CLModel:InitCL Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    Frame Stop
    CLModel:CleanUpCL
    CLModel:CleanUpCL Done
    GLCanvas::CleanUpGL
    GLCanvas::CleanUpGL Done
    GLCanvas::InitGL
    Initialising 8192 particles
    GLCanvas::InitGL Done
    CLModel::InitCL
    Found 1 platforms
    Found Desired platform
    Using platform Advanced Micro Devices, Inc.
    platform has 2 devices
    Found CPU Device as requested
    max grav particles 2048

    Warning: adamsBashforth16 kernel has register spilling. Lower performance is expected.

    Using adamsBashforthKernel adamsBashforth11
    Using adamsKernel adamsMoulton10
    CLModel:InitCL Done
    CLModel:InitKernels Start
    accKernel Work Group Size 256
    startupKernel Work Group Size 256
    adamsBashford Work Group Size 256
    adamsKernel Work Group Size 256
    copyToDisplayKernel Work Group Size 256
    CLModel:InitKernels Done
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    Frame Start
    Timer Fired
    CLModel:ExecuteKernel Start
    CLModel:Using startupKernel
    CLModel:ExecuteKernel Done
    CLModel:ExecuteKernel Start
    CLModel:Using startupKernel
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    CLModel:ExecuteKernel Done




    it repeats the following a lot of times

    CLModel:ExecuteKernel Start
    CLModel:Using startupKernel
    CLModel:ExecuteKernel Done
    CLModel:ExecuteKernel Start
    CLModel:Using startupKernel
    CLModel:UpdateDisplay Started
    CLModel:UpdateDisplay Done
    CLModel:ExecuteKernel Done
    Timer Fired

    and ends with
    Frame Stop
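
The kernel names in the log (adamsBashforth4, adamsMoulton3, startupKernel) suggest a multistep predictor-corrector integrator. The sketch below shows the textbook form of that scheme on a scalar ODE: a 4-step Adams-Bashforth predictor followed by a 3-step Adams-Moulton corrector. Mapping the log names to this scheme is an assumption, and none of this is the program's actual kernel code. The history is bootstrapped from a known exact solution here, where a real integrator would use a startup method.

```python
import math

def abm_step(f, t, y, h, history):
    """One predict-evaluate-correct-evaluate step.

    history holds derivative values [f_n, f_{n-1}, f_{n-2}, f_{n-3}].
    """
    fn, fn1, fn2, fn3 = history
    # Predictor: 4-step Adams-Bashforth.
    y_pred = y + h / 24.0 * (55.0 * fn - 59.0 * fn1 + 37.0 * fn2 - 9.0 * fn3)
    f_pred = f(t + h, y_pred)
    # Corrector: 3-step Adams-Moulton, using the predicted derivative.
    y_new = y + h / 24.0 * (9.0 * f_pred + 19.0 * fn - 5.0 * fn1 + fn2)
    f_new = f(t + h, y_new)
    return y_new, [f_new, fn, fn1, fn2]

def integrate(f, exact, t0, t_end, h):
    """Integrate y' = f(t, y) from t0 to t_end with fixed step h."""
    n_steps = round((t_end - t0) / h)
    times = [t0 + i * h for i in range(4)]
    # Startup: fill the history from the exact solution (illustration only).
    history = [f(times[3 - i], exact(times[3 - i])) for i in range(4)]
    t, y = times[3], exact(times[3])
    for _ in range(n_steps - 3):
        y, history = abm_step(f, t, y, h, history)
        t += h
    return y

# Test problem: y' = -y with y(0) = 1, whose exact solution is exp(-t).
result = integrate(lambda t, y: -y, lambda t: math.exp(-t), 0.0, 1.0, 0.01)
```

A multistep method needs several past derivative values before it can take its first step, which would explain why the log shows a separate startup kernel running first.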
     
  17. Davros

    ps:
    Found CPU Device as requested

    Shouldn't that be "Found GPU Device"?
     
  18. Davros

    Edit edit edit
    had my command line as
    .\OpenCLSolarSystem.exe -amd -nsmall

    totally missed your other post until just now and changed it to
    .\OpenCLSolarSystem.exe -amd -cpu -nsmall

    and its working yay...........

    Tried the non-debug build with the same command line and it crashes to desktop with a "this program has stopped working" message, but only after I select GO -- START
     
  19. moozoo

    Yeah, I know about that. The debug message is wrong. Ignore it.
     
  20. moozoo

    >.\OpenCLSolarSystem.exe -amd -cpu -nsmall
    >and its working yay...........

    Check the about box. It will be using the CPU for OpenCL.
    So there is still an issue getting it working on the AMD graphics.
    But... it suggests the OpenGL/OpenCL sharing is working.

    These combinations work as far as I know
    Nvidia GPU opencl + Nvidia opengl
    Intel CPU opencl + Nvidia opengl
    AMD CPU opencl + Nvidia opengl
    Intel CPU opencl + Intel HD3000 opengl
    AMD CPU opencl + Intel HD3000 opengl

    You just added
    AMD CPU opencl + AMD opengl

    But
    AMD GPU opencl + AMD opengl
    is not working (the one I want to work the most because of AMD's fp64 performance)
     