Vitaly's spe-jpeg

Discussion in 'CellPerformance@B3D' started by Mike Acton, Jul 10, 2007.

  1. Mike Acton

    Mike Acton CellPerformance
    Newcomer

    Joined:
    Jun 6, 2006
    Messages:
    47
    Likes Received:
    2
    Location:
    Burbank, CA
    Vitaly Vidmirov, one of our CellPerformance members, has written a jpeg decoder for the SPU. He's released it under the MIT license.

    Download the source from his blog, here:
    http://cellrb.blogspot.com/2007/06/first-public-version.html

    Let's have a peek at what he's done and offer up any suggestions that could help Vitaly make this a really cool, useful example.

    Mike.
     
  2. inefficient

    Veteran

    Joined:
    May 5, 2004
    Messages:
    2,121
    Likes Received:
    53
    Location:
    Tokyo
    Pretty cool.

    I notice he is using si_* intrinsics. What is the difference between si_* intrinsics and spu_* intrinsics? And where are they documented?
     
  3. OzzyBC42

    Newcomer

    Joined:
    Jul 8, 2007
    Messages:
    25
    Likes Received:
    0
    si_* intrinsics (the "specific" intrinsics) map one-to-one to the corresponding assembly instruction, whereas spu_* intrinsics (the "generic" intrinsics) are overloaded on operand type and can map to one or more assembly instructions.

    Check C/C++ Language Extensions for Cell Broadband Engine Architecture V2.4 for more information; it's included in the Cell SDK from IBM.
     
  4. Vitaly Vidmirov

    Newcomer

    Joined:
    Jul 9, 2007
    Messages:
    108
    Likes Received:
    10
    Location:
    Russia
    Yes, I prefer si_* intrinsics because I love assembler :smile:
    Also, these intrinsics are untyped, so it saves some casting :cool:
     
  5. OzzyBC42

    Newcomer

    Joined:
    Jul 8, 2007
    Messages:
    25
    Likes Received:
    0
    Did you try -ffast-math?
     
  6. Vitaly Vidmirov

    Newcomer

    Joined:
    Jul 9, 2007
    Messages:
    108
    Likes Received:
    10
    Location:
    Russia
    This option controls PPU floating-point optimizations (i.e. mul-add and rounding).
    I don't think it will affect my integer-only SPU computations.
     
  7. OzzyBC42

    Newcomer

    Joined:
    Jul 8, 2007
    Messages:
    25
    Likes Received:
    0

    Source
     
  8. Vitaly Vidmirov

    Newcomer

    Joined:
    Jul 9, 2007
    Messages:
    108
    Likes Received:
    10
    Location:
    Russia
    OzzyBC42
    This flag can only affect SPU programs that are not SIMDized.
    That is not the type of code that should be executed on an SPU.
    If you use intrinsics, the compiler can't perform these optimizations,
    because _you_ dictate the exact instruction sequence.
    And as I said, I don't use floating point at all.
     
  9. HDe

    HDe
    Newcomer

    Joined:
    Mar 12, 2008
    Messages:
    3
    Likes Received:
    0
    I had some problems getting this JPEG decoder to work. It failed to decode every JPEG image I tried
    except the small sample included in the distribution, although all my images adhere to the requirements
    stated in the readme. Even the sample image failed after conversion by the lossless jpegtran utility.
    The problem occurs with unoptimized Huffman tables, which are the default for many (most?) JPEG encoders.
    The tree size then exceeds the hard-wired value (256) in the source file jpg_huffbits.h.
    Changing this value to 512 solved the problem for all my images.

    The decoder is indeed quite fast. Thanks for sharing!
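
    The tree-size overflow described above can be sanity-checked with a rough sketch. It assumes the decoder keeps one table entry per node of the binary Huffman code tree; the thread does not confirm that layout, and huffman_tree_nodes is a hypothetical helper, not a function from the decoder's source. The two BITS lists are the typical ("unoptimized") luminance tables from Annex K of the JPEG standard. Under that assumption the standard AC table needs 324 nodes, which is above 256 and below 512.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the decoder discussed in the thread).
 * counts[i] is the number of Huffman codes of length i + 1, i.e. the
 * 16-entry BITS list of a JPEG DHT segment.
 * Returns the number of nodes (root + internal + leaves) of the binary
 * code tree those canonical codes imply. */
static unsigned huffman_tree_nodes(const unsigned char counts[16])
{
    unsigned total = 0;
    unsigned below = 0;  /* node count of the level just below this one */
    int len;

    assert(counts != NULL);

    /* Walk from the longest codes (depth 16) up to the root (depth 0).
     * Canonical codes are contiguous on each level and the first node
     * on a level has an even code, so n nodes need (n + 1) / 2 parents. */
    for (len = 16; len >= 0; --len) {
        unsigned leaves = (len > 0) ? counts[len - 1] : 0;
        unsigned internal = (below + 1) / 2;
        unsigned level = leaves + internal;

        total += level;
        below = level;
    }
    return total;
}

/* Typical luminance tables from Annex K of the JPEG standard. */
static const unsigned char std_dc_luminance_bits[16] =
    { 0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0 };
static const unsigned char std_ac_luminance_bits[16] =
    { 0, 2, 1, 3, 3, 2, 4, 3, 5, 5, 4, 4, 0, 0, 1, 0x7d };
```

    Calling huffman_tree_nodes(std_ac_luminance_bits) gives 324 (162 leaves plus 162 root and internal nodes), while the DC table needs only 24. An encoder that emits optimized tables for a small image can easily stay under 256, which would fit the observation that only the bundled sample decoded.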
     
  10. HDe

    HDe
    Newcomer

    Joined:
    Mar 12, 2008
    Messages:
    3
    Likes Received:
    0
    I forgot one more (minor) point: There seems to be an issue with rounding. The decoded image is brighter by 1-2 units.
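
    A brightness offset like this is easier to pin down with a number. The sketch below is one way to measure it, assuming you have the decoder's output and a reference decode of the same image (for example from another JPEG library) as 8-bit sample buffers of equal size. mean_signed_diff is a hypothetical helper; it only measures the offset and says nothing about its cause.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical diagnostic helper: average of (decoded - reference) over
 * all samples. A positive result means the decoded image is brighter
 * than the reference on average. */
static double mean_signed_diff(const uint8_t *decoded,
                               const uint8_t *reference,
                               size_t count)
{
    long long sum = 0;
    size_t i;

    assert(decoded != NULL && reference != NULL && count > 0);

    for (i = 0; i < count; ++i) {
        /* Widen before subtracting so the difference can be negative. */
        sum += (int)decoded[i] - (int)reference[i];
    }
    return (double)sum / (double)count;
}
```

    A result that is close to the same positive value for every image would point to a systematic bias rather than ordinary per-pixel rounding noise, which should average out near zero.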
     