OpenCL LDS Bandwidth

Discussion in 'GPGPU Technology & Programming' started by Man from Atlantis, Dec 7, 2011.

  1. Man from Atlantis

    Regular

    Joined:
    Jul 31, 2010
    Messages:
    732
    Likes Received:
    6
    Download:
    Windows x86/x64: http://developer.amd.com/downloads/LDSBandwidth.zip
    Linux x86/x64: http://developer.amd.com/downloads/LDSBandwidth_lin.zip

    32-bit:
    LDSBandwidth\samples\opencl\bin\x86
    64-bit:
    LDSBandwidth\samples\opencl\bin\x86_64

    Copy the batch file into the folder that contains the .exe files to keep the command prompt window from closing automatically after the benchmark runs.

    Windows 7 x64, 64-bit binary:

    GTX 460 @675/1350/3600MHz
    Code:
    Selected Platform Vendor : NVIDIA Corporation
    Device 0 : GeForce GTX 460
    Device 1 : GeForce 9800 GT
    Build Options are : -D DATATYPE=float2
    
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 577.24 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 577.678 GB/s

    GTX 460 @975/1950/4600MHz
    Code:
    Selected Platform Vendor : NVIDIA Corporation
    Device 0 : GeForce GTX 460
    Device 1 : GeForce 9800 GT
    Build Options are : -D DATATYPE=float2
    
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 833.409 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 834.063 GB/s

    9800GT @550/1375/1800MHz
    Code:
    Selected Platform Vendor : NVIDIA Corporation
    Device 0 : GeForce GTX 460
    Device 1 : GeForce 9800 GT
    Build Options are : -D DATATYPE=float2
    
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 566.438 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 291.522 GB/s

    9800GT @650/1625/2200MHz
    Code:
    Selected Platform Vendor : NVIDIA Corporation
    Device 0 : GeForce GTX 460
    Device 1 : GeForce 9800 GT
    Build Options are : -D DATATYPE=float2
    
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 680.468 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 349.798 GB/s
     


  2. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,486
    Likes Received:
    397
    Location:
    Varna, Bulgaria
    Code:
    Selected Platform Vendor : NVIDIA Corporation
    Device 0 : GeForce GTX 570
    Build Options are : -D DATATYPE=float2
    
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 1551.08 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 1551.25 GB/s
    
    GPU clock: 825MHz
    OS: Win7 64

    The GPCBenchmark OCL suite can also measure LDS bandwidth: http://forum.beyond3d.com/showthread.php?t=57322
     
  3. Man from Atlantis

    ^^ GF110 is about 2.7x faster than GF104 here (1551 vs 577 GB/s with float2).

    GTX 460 @675/1350/3600MHz
    Vector Elements 1
    Code:
    AccessType      : single
    VectorElements  : 1
    Bandwidth       : 598.49 GB/s
    
    AccessType      : linear
    VectorElements  : 1
    Bandwidth       : 598.404 GB/s
    Vector Elements 2
    Code:
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 577.248 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 577.692 GB/s
    Vector Elements 3
    Code:
    AccessType      : single
    VectorElements  : 3
    Bandwidth       : 427.219 GB/s
    
    AccessType      : linear
    VectorElements  : 3
    Bandwidth       : 214.69 GB/s
    Vector Elements 4
    Code:
    AccessType      : single
    VectorElements  : 4
    Bandwidth       : 571.814 GB/s
    
    AccessType      : linear
    VectorElements  : 4
    Bandwidth       : 286.646 GB/s
    Vector Elements 5
    Error

    9800GT @550/1375/1800MHz
    Vector Elements 1
    Code:
    AccessType      : single
    VectorElements  : 1
    Bandwidth       : 553.436 GB/s
    
    AccessType      : linear
    VectorElements  : 1
    Bandwidth       : 549.524 GB/s
    Vector Elements 2
    Code:
    AccessType      : single
    VectorElements  : 2
    Bandwidth       : 566.499 GB/s
    
    AccessType      : linear
    VectorElements  : 2
    Bandwidth       : 291.511 GB/s
    Vector Elements 3
    Error

    Vector Elements 4
    Code:
    AccessType      : single
    VectorElements  : 4
    Bandwidth       : 514.474 GB/s
    
    AccessType      : linear
    VectorElements  : 4
    Bandwidth       : 141.145 GB/s
    Vector Elements 5
    Error
     


    #3 Man from Atlantis, Dec 7, 2011
    Last edited by a moderator: Dec 7, 2011
  4. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,803
    Likes Received:
    474
    Location:
    Torquay, UK
    Windows 7 x64, Catalyst 11.11c

    HD 6970 @910/1425

    HD 6970 @950/1425

    Not affected by memory clock :)
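
The GTX 460 runs in the first post allow a quick check of the same claim on GeForce hardware. The sketch below only compares ratios of figures already posted in this thread (float2, "single" access); the dictionary keys and the `ratio` helper are illustrative names, not part of the benchmark.

```python
# Figures copied from the two GTX 460 runs in the first post
# (clocks listed as core/shader/memory in MHz, float2, "single" access).
stock = {"core_mhz": 675, "mem_mhz": 3600, "bandwidth_gbs": 577.24}
overclocked = {"core_mhz": 975, "mem_mhz": 4600, "bandwidth_gbs": 833.409}


def ratio(key):
    """Overclocked value divided by stock value for one field."""
    return overclocked[key] / stock[key]


bandwidth_ratio = ratio("bandwidth_gbs")
core_ratio = ratio("core_mhz")
mem_ratio = ratio("mem_mhz")

print(f"bandwidth x{bandwidth_ratio:.3f}, core x{core_ratio:.3f}, memory x{mem_ratio:.3f}")
```

The bandwidth gain tracks the core clock gain almost exactly, while the memory clock gain is noticeably smaller, which is consistent with LDS bandwidth depending on the core rather than the memory clock.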
     
  5. fellix

  6. Man from Atlantis

    So more SMs means more bandwidth, if you think of them like memory controllers. The GTX 470 gets exactly double the bandwidth of the GTX 460 at the same clock (14 SMs vs 7 SMs). Cool :)
     
  7. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,528
    Likes Received:
    107
    Shared memory is per SM, so it makes sense for it to scale with SM count and not with the memory clock.
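
One way to see the per-SM scaling is to normalise a measured figure by SM count and shader clock. This is a rough sketch, assuming the benchmark's "GB/s" means 10^9 bytes per second and that the middle number in "675/1350/3600MHz" is the shader clock; the function name is made up for illustration. The 7-SM count for the GTX 460 comes from the thread.

```python
def bytes_per_sm_per_clock(bandwidth_gbs, sm_count, shader_mhz):
    """Average LDS bytes moved per SM per shader clock."""
    bytes_per_second = bandwidth_gbs * 1e9   # assumes GB = 10^9 bytes
    clocks_per_second = shader_mhz * 1e6
    return bytes_per_second / (sm_count * clocks_per_second)


# GTX 460 at 675/1350 MHz, VectorElements 1, "single" access, 7 SMs.
gtx460 = bytes_per_sm_per_clock(598.49, sm_count=7, shader_mhz=1350)
print(f"GTX 460: {gtx460:.1f} bytes per SM per shader clock")
```

For the GTX 460 this works out to roughly 63 bytes per SM per shader clock. If that per-SM figure stays about the same on a chip with more SMs, total bandwidth grows with the SM count, as described above.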
     
  8. Man from Atlantis

    Thank you for the clarification. I have a question, if you don't mind: I understand that vec3 and vec4 are slower than vec1 and vec2, but why is vec3 slower than vec4 on GeForce cards? Also, vec1 and vec2 seem to run at full speed for both single and linear (sequential) reads, while vec3 and vec4 run at half speed for linear reads. Why is that?
     
    #8 Man from Atlantis, Dec 8, 2011
    Last edited by a moderator: Dec 8, 2011
  9. pcchen

    pcchen Moderator
    Moderator Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    2,743
    Likes Received:
    106
    Location:
    Taiwan
    It's likely that vec3 on a GeForce causes some bank conflicts in shared memory, so it's slower.
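
For readers unfamiliar with the term, here is a minimal textbook model of a bank conflict. It assumes 32 threads each reading one 32-bit word at a fixed stride, with consecutive words spread across 32 banks; these parameters and the `conflict_degree` helper are assumptions for illustration. It does not model how GeForce hardware splits vector loads or how the benchmark lays out its vec3 data, so it is not meant to reproduce the numbers in this thread.

```python
from collections import Counter


def conflict_degree(stride_words, num_banks=32, threads=32):
    """Worst-case number of threads hitting the same bank.

    Thread t reads the 32-bit word at index t * stride_words, and word i
    lives in bank i % num_banks. A result of 1 means no conflict.
    """
    banks = Counter((t * stride_words) % num_banks for t in range(threads))
    return max(banks.values())


for stride in (1, 2, 3, 4):
    print(f"stride {stride}: {conflict_degree(stride)}-way")
```

In this model, accesses that land in the same bank are serialised, so an n-way conflict costs roughly n times as long as a conflict-free access. Passing `num_banks=16, threads=16` gives the same kind of estimate for a half-warp on older 16-bank parts.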
     