Will the PS3 be able to decode H.264/AVC at 40 Mbps?

Discussion in 'Console Industry' started by Brimstone, Aug 15, 2006.


Will the PS3 decode H.264/AVC at 40 Mbps?

  1. Yes, the PS3 will decode H.264/AVC at 40 Mbps

    86 vote(s)
    86.9%
  2. No, the PS3 won't decode H.264/AVC at 40 Mbps

    13 vote(s)
    13.1%
  1. ADEX

    Newcomer

    Joined:
    Sep 11, 2005
    Messages:
    231
    Likes Received:
    10
    Location:
    Here
    I had a look at a couple of the links and also had a quick look over the code. I can see why you think it'll be Cell-unfriendly, as the CABAC section is jammed with branches everywhere.

    A lot of SPE optimisation is based on branch removal using techniques such as predication, where you compute both sides of a branch and then "select" the answer; this might be possible here.

    Also, the data in question appears to be very small, so a single SPE will have no problem holding it in its local store; it might even be possible to hold it in the register file!

    It does look like a pretty serious bottleneck but it doesn't look like a worst case scenario for Cell.
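
The "compute both sides then select" idiom described above can be sketched in a few lines. This is purely illustrative Python; on the SPE the same pattern maps to vector compares feeding the `spu_sel` intrinsic.

```python
def select(cond, a, b):
    # Branch-free select over integers: build an all-ones or all-zeros
    # mask from the condition, then pick bits from a or b. Both sides
    # are computed; the branch disappears, so there is nothing for a
    # branch predictor (or the SPE's branch hint machinery) to miss.
    mask = -int(bool(cond))          # True -> ...111 (all ones), False -> 0
    return (a & mask) | (b & ~mask)
```

For example, `select(x > 0, x, -x)` computes an absolute value without a conditional jump.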
     
  2. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Shifty, I made a (stupid) mistake in interpreting the first CABAC article. In any case, the PPE approach is a fallback; there may be a better way to do it on the SPEs.

    The "High Speed Decoding of Context-based Adaptive Binary Arithmetic Codes Using Most Probable Symbol Prediction" paper mentions that:

    (1) More than 90 percent of the H.264 Main Profile stream is encoded using CABAC (I'll assume the High Profile has similar characteristics)

    (2) The decoding algorithm is basically sequential and needs heavy computation to calculate the range, offset and context variables

    (3) Other stages of AVC (HP) decoding can be parallelized and pipelined relatively easily (not sure how large a percentage of the work these constitute).

    To address (2), the proposed system processes two symbols at a time and speculatively calculates the second symbol assuming:
    (i) the next context is used, and
    (ii) the decoded bin is the MPS,
    achieving a 24% speed-up (empirically).

    For Cell, a dev may...

    * Use the SPE's raw computational power, predication and local store to speed up the "basic" calculations.

    * Use 2 SPEs to parallelize/stagger the calculation of consecutive symbols (as in the paper).

    * Instead of assuming "the next context is used", reorganize and fetch the contexts in SOA form into the local stores (if possible), e.g. fetch the same binary keys from all 399 contexts and perform SIMD calculations on them all at once locally. If this is possible, then a correct MPS guess would guarantee the right result.

    * In addition, a 3rd SPE may be used to perform speculative calculation assuming the decoded bin is the LPS. Again, perform SIMD calculations on the right entry in all 399 contexts (if possible).

    Need more info to know whether this is doable. We should be able to hit a 24% to 40+% speed-up, and also (hopefully) free up the PPE for other stuff?
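
The SOA reorganization in the third bullet can be illustrated with a toy context table. The field names here are hypothetical stand-ins; a real CABAC context holds a 6-bit probability state and an MPS bit, and there are 399 of them in the profile under discussion.

```python
# Toy array-of-structures table: one (state, mps) record per context.
# The values are dummies; only the layout matters for this sketch.
contexts_aos = [(ctx % 64, ctx & 1) for ctx in range(399)]

# Structure-of-arrays layout: one contiguous array per field. A SIMD
# engine can then load the same field for many contexts with a few
# wide loads and operate on them all at once, instead of gathering
# scattered records one at a time.
states = [state for state, _ in contexts_aos]
mps_bits = [mps for _, mps in contexts_aos]
```

The data stays the same; only the memory layout changes so that "the same binary keys from all 399 contexts" sit contiguously.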
     
    #122 patsu, Aug 17, 2006
    Last edited by a moderator: Aug 17, 2006
  3. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    Which leads back to two points already raised: 1) will any studios be encoding high-bitrate H.264? and 2) doesn't the BRD specification require that the PS3 decode H.264 at 40 Mbps to comply as a BRD player? At which point, either it's not a complete BRD player and Sony are evading their own standard, or we can trust that it can deal with the codec because it is a BRD player and that's a minimum requirement!
     
  4. RobertR1

    RobertR1 Pro
    Legend

    Joined:
    Nov 2, 2005
    Messages:
    5,852
    Likes Received:
    1,297
    It normally would be but it was "CJplay" who is the Warner compressionist working on the title who confirmed the ABR on Batman Begins. From all that's been spoken about the film, it'll be a showcase title from a PQ perspective. We'll see soon enough! I do agree that more bitrates are good but you can get to a point of diminishing returns, depending on how good the encoder is.
     
    #124 RobertR1, Aug 17, 2006
    Last edited by a moderator: Aug 17, 2006
  5. London Geezer

    Legend Subscriber

    Joined:
    Apr 13, 2002
    Messages:
    24,151
    Likes Received:
    10,297
    That's my point, and also why I think there is no problem. In the end, if Cell has issues with CABAC that can't be fixed via software upgrades (if there is a problem at all!), there will just be NO movies using H.264 at 40 Mbps. Simple as that. :grin:
     
  6. -tkf-

    Legend

    Joined:
    Sep 4, 2002
    Messages:
    5,634
    Likes Received:
    37
    If you look at the reviews and the posted bitrates, bitrate does seem to track quality on the current HD DVDs. At a 13 Mbit average I will be impressed :)
     
  7. RobertR1

    RobertR1 Pro
    Legend

    Joined:
    Nov 2, 2005
    Messages:
    5,852
    Likes Received:
    1,297
    Software advances :) We'll see soon enough!
     
  8. Npl

    Npl
    Veteran

    Joined:
    Dec 19, 2004
    Messages:
    1,905
    Likes Received:
    7
    There surely have to be sync-points, so you can jump to frame X without running through ALL preceding frames. As soon as you have such sync-points, you can trivially start processing from multiple sync-points; the cost is more memory used for buffering, and more bandwidth. Say you have a sync-point at least every 3 frames: then you could read 6 frames and kick off 2 SPEs to start decoding CABAC from frames n and n+3, but you need to provide enough buffers to hold the compressed/uncompressed data.
    Another SPE would then take the un-CABACed data and decode it further.
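
The scheme above — independent runs starting at each sync-point, each handed to its own worker — might look roughly like this. A thread pool and a trivial byte-sum stand in for the SPEs and the real CABAC pass; the structure, not the arithmetic, is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_run(stream, start, end):
    # Stand-in for CABAC-decoding one independently decodable run;
    # a real decoder would emit un-CABACed syntax elements here.
    return sum(stream[start:end])

def parallel_decode(stream, sync_points, workers=2):
    # Each sync-point begins a run that needs no state from earlier
    # frames, so runs can be decoded concurrently. The price, as noted
    # above, is buffer space for every in-flight run.
    bounds = list(zip(sync_points, sync_points[1:] + [len(stream)]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda se: decode_run(stream, *se), bounds))
```

With sync-points at offsets 0 and 3 of a 6-byte "stream", two runs are decoded in parallel and their results come back in order.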
     
  9. -tkf-

    Legend

    Joined:
    Sep 4, 2002
    Messages:
    5,634
    Likes Received:
    37
    Ohh absolutely, but given that VC-1 is very mature I don't expect more miracles :)
     
  10. archie4oz

    archie4oz ea_spouse is H4WT!
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,608
    Likes Received:
    30
    Location:
    53:4F:4E:59
    Wow, 6 pages... I can't believe this thread made it this far... :(

    The higher allowable bit-rate is mainly to accommodate high-bitrate MPEG-2. Even though BD-ROM supports AVC High Profile 4.1, I doubt many discs will be authored with a video bitrate higher than what's permitted in 4.0 (25 Mbps). 4.1 allows bit-rates which exceed the video specification for BD-ROM, so it's less likely that they'll be used.

    Even early devkits can handle multiple streams; there's no reason why a PS3 wouldn't (outside of some catastrophic bugs).

    Doesn't really matter. The current BD-ROM spec requires a compliant device to support discs with DD+ (along with DD, Dolby Lossless, DTS, DTS-HD, and multi-channel LPCM.)

    Yes, and those same rumors said that 16 pixel pipelines were too much to be efficiently used (compared to the more common and "normal" 4 pipe setups of the time), unless you drew *large* triangles... ;)

    Yes, you need to be corrected... :p It wasn't 1080i H.264, it was 1080i MPEG2 (technically 1088i, but who's counting)... Actually I was able to decode to just a hair under 48fps (frames not fields) before the IPU ran out of steam. Of course there are CRTC clock/sync issues that negate that as well. Also there was the multi-path XviD decoder, but that was only to 704x512...

    No, AVC is the MPEG group's designation for MPEG-4 Part 10 (aka H.264). H.264 is the ITU designation. VC-1 is the informal name for SMPTE 421M...

    1.) That only deals with movies released by Sony Pictures, and studios that use the MPEG-2 only versions of authoring tools. Ergo, it doesn't prevent other studios from authoring AVC encoded material (e.g. Panasonic's tools).

    2.) MPEG-2 is simply being used as it's far more mature and the authoring tools around it are more optimized for rendering out MPEG-2 content. Ergo, it doesn't prevent Sony from continuing to develop their authoring tools around AVC (in fact it *is* being done).

    3.) MPEG-2 based content isn't constrained by the immaturity of VC-1 and AVC hardware decoders (e.g. players that sacrifice features like single-frame advance, slow-mo, single-frame reverse, slow-mo reverse, or just simply clean scanning forwards and backwards).

    MS didn't ditch their previous work. And I don't know why people are making such a big stink over CABAC... It only represents something like 3% of an encode cycle (and is even less significant on a decode cycle). If you're gonna bitch about processing requirements, then bitch about how many cycles are eaten up by motion vector search...

    By codec standards it's still quite immature...
     
    #130 archie4oz, Aug 18, 2006
    Last edited by a moderator: Aug 18, 2006
    Guden Oden, Farid and Shifty Geezer like this.
  11. Ventresca

    Banned

    Joined:
    Jun 15, 2006
    Messages:
    62
    Likes Received:
    0
    Mmm... normally it doesn't, at least for me :wink:

    Do you really expect him to say something different?

    Clearly he is going to say that his movie will be the best; we always do that :razz:
     
  12. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    A Java interactivity layer isn't going to eat up much CPU at all unless it's running a video game. Interactivity layers are handled by event-driven programming, wherein the Java VM is asleep most of the time, waiting for a signal to wake up and handle an event (e.g. the user clicked on something). All of the media-oriented stuff (PiP, floating videos, animations) is handled by native code, not Java. The Java code simply sets up a request, hands it off to the underlying BD-J runtime, then goes back to sleep waiting for I/O completion.
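
That event-driven pattern can be sketched as follows. Everything here is hypothetical (a real BD-J title would use the actual Java APIs, not this toy loop); it only shows why the layer burns almost no CPU: the loop blocks until an event arrives, hands off a request, and goes back to sleep.

```python
import queue

def run_interactivity_layer(events, handoff):
    # The "Java" layer blocks (sleeps) on the queue until an event
    # arrives, turns it into a request for the native runtime via
    # handoff(), then goes straight back to waiting.
    q = queue.Queue()
    for e in events:
        q.put(e)
    q.put(None)                      # sentinel: no more events
    while True:
        event = q.get()              # blocking wait -- CPU idle here
        if event is None:
            break
        handoff(("play_request", event))   # native code does the heavy work

requests = []
run_interactivity_layer(["user_clicked_menu"], requests.append)
```

All the CPU-visible work is the handoff itself; the media processing happens on the other side of it.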

    With regards to the context model needed for CABAC: it is only a few hundred entries and easily fits within SPE local memory, from what I've read. The problem with CABAC isn't updating the model, and it isn't so much branching as lots of shifts and table lookups. CABAC is most expensive in the encode, not the decode. CABAC isn't as parallelizable as Huffman, though. But I highly doubt that decode scales worse than n^2; my intuition says most likely O(n) or O(n log n). After all, why should 2N bits take 4x as long, when the decoder at its heart doesn't perform doubly nested loops over the entire bitstream? The decoding context at each bit is a tiny fraction of the future bitstream that hasn't been read yet.

    This means that decoding 40 Mbps shouldn't be much more than a factor of 4 worse than 10 Mbps, at least as far as CABAC is concerned. Now, the rest of the AVC pipeline might be a different story.
     
    Guden Oden likes this.
  13. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
    Finally. Thanks for the info. If sync-points (slices?) are frequent enough, then parallelization should be a non-issue.

    In a nutshell, H.264 High Profile decoding performs well on Cell (without even using the PPE). I'm happy again. ;-)
     
  14. -tkf-

    Legend

    Joined:
    Sep 4, 2002
    Messages:
    5,634
    Likes Received:
    37
    Compared to what codec? :)

    Microsoft was behind the HD version of Terminator 2 in 2003, which used the WMV codec; I think it's fair to say that Microsoft has worked long and hard on VC-1.
     
  15. aaaaa00

    Regular

    Joined:
    Jul 24, 2002
    Messages:
    790
    Likes Received:
    23
    That's odd. Your comment piqued my interest, so I grabbed the ffdshow sources from SourceForge and ran some profiling passes.

    I used Media Player Classic 6.4.9.0, and built my own libavcodec from an Aug 3rd snapshot of the ffdshow_tryout branch. This codec incorporates the x264 open source implementation of H.264 and supports CABAC and HP (and is widely considered a very good software implementation on the PC).

    For a ~8.2 mbit/s 1920x1080p encoding of a Pirates of the Caribbean 2 trailer, on this particular machine (a dual-core AMD64 2.4 ghz with 2GB of RAM):

    ~85% was in the decode_slice function.
    ~14% was consumed by mplayerc's post-processing and DirectShow.
    ~1% was consumed by the OS (Windows Vista).

    Of that 85% spent in decode_slice:

    ~40% was spent decoding the CABAC (decode_mb_cabac)

    ~60% was spent on macroblocks (hl_decode_mb)
    intraslice prediction
    motion prediction (hl_motion)
    filtering (filter_mb)
    and everything else.

    So it seems to me that CABAC is not anywhere close to just 3% of CPU load, unless you can tell me where my profiling run went wrong. :)

    Thanks.
     
    #135 aaaaa00, Aug 18, 2006
    Last edited by a moderator: Aug 18, 2006
  16. patsu

    Legend

    Joined:
    Jun 25, 2005
    Messages:
    27,709
    Likes Received:
    145
  17. Apoc

    Regular

    Joined:
    Sep 17, 2003
    Messages:
    308
    Likes Received:
    2
    Try CoreAVC. I can play a 720p H.264 stream perfectly with a Pentium M clocked down to 800 MHz.
     
  18. aaaaa00

    Regular

    Joined:
    Jul 24, 2002
    Messages:
    790
    Likes Received:
    23
    What bitrate? CABAC cost is proportional to bitrate, so a 40 Mbps stream will consume 8 times the CPU of a 5 Mbps stream just for CABAC, assuming your memory and cache can keep up.
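
Under that linear-cost assumption, the scaling claim above is just a ratio. Illustrative only — a real decoder also pays non-CABAC costs (motion compensation, deblocking) that scale differently with bitrate.

```python
def relative_cabac_cost(bitrate_mbps, baseline_mbps):
    # Assumption from the post above: CABAC work scales with the
    # number of coded bits, i.e. linearly with bitrate.
    return bitrate_mbps / baseline_mbps
```

So a 40 Mbps stream against the 5 Mbps baseline gives a factor of 8 for the CABAC stage alone.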

    Anyway, it appears that CoreAVC is closed source, so I can't exactly profile it and get any useful information.
     
    #138 aaaaa00, Aug 18, 2006
    Last edited by a moderator: Aug 18, 2006
    Simon F likes this.
  19. Apoc

    Regular

    Joined:
    Sep 17, 2003
    Messages:
    308
    Likes Received:
    2
    I don't know what bitrate it is. I tested it by playing the QuickTime HD trailers from http://www.apple.com/trailers/ . They have a high bitrate, but I don't know exactly what, as I don't know how to look it up.
     
  20. aaaaa00

    Regular

    Joined:
    Jul 24, 2002
    Messages:
    790
    Likes Received:
    23
    QuickTime is slow and doesn't fully support Main Profile, so I think Apple doesn't turn on all the H.264 features for their encodes; otherwise QuickTime users would have problems playing them.
     