ATI GPU transcoding app?

Why wouldn't the 8800 GT/GTS be supported?

Maybe they are referring to the G80 ones? You meant gtx, I'm assuming.

I think they lack the PureVideo 2 decoder.

The 8800 GT, codenamed G92, was released on October 29, 2007. The card is the first to transition to 65 nm process, and supports PCI-Express 2.0.[14] It has a single-slot cooler as opposed to the double slot cooler on the 8800 GTS and GTX, and uses less power than GTS and GTX due to its 65 nm process. While its core processing power is comparable to that of the GTX, the 256-bit memory interface and the 512 MB of GDDR3 memory often hinders its performance at very high resolutions and graphics settings. The 8800 GT, unlike other 8800 cards, is equipped with the PureVideo 2 engine for GPU assisted decoding

http://en.wikipedia.org/wiki/GeForce_8_Series
 
It's really encoding on GPU.

Not trying to be obtuse but I'm wondering if the hardware encoding is just doing post processing. Perhaps it's doing some other parts of the work load.

A benchmark with no post processing and some cpu measuring would be great.

Jawed, thanks for the link. Fwiw, my version of Firefox doesn't like the security certificate for the follow up page link.

Edit: Am now using google's cache of the page just to be safe.
 
Not trying to be obtuse but I'm wondering if the hardware encoding is just doing post processing.
Well, it's not safe to assume that the same acceleration is going on in both. Last time I was aware of anything going on with PowerDirector, ATI Stream was being used for transcode acceleration and CUDA was being used for post-processing. That may have changed, though.
 
The update for CUDA post process was toward the end of last year. This update seems to be for the NV GPU encoding/transcoding.
 
The update for CUDA post process was toward the end of last year. This update seems to be for the NV GPU encoding/transcoding.
This is not an update, it's the same software. The text is reasonably explicit (now that I've bothered to look at it):

Support for NVIDIA® CUDA™ technology ensures accelerated rendering speeds of popular video effects by allowing PowerDirector 7 to tap into the multi-core parallel processing power of GPUs.
[...]
PowerDirector delivers improved video rendering performance and user experience in processing HD videos with video effects such as Gaussian Blur, Radial Blur, Light Ray, Pen Ink, Abstractionism, Kaleidoscope, Color Edge, Replace Color, Color Painting, and Glow.
[...]
NVIDIA CUDA technology enables PowerDirector 7 to achieve massive gains in rendering speed. We've taken a test sample of the video effects and compared the results across three formats [...]
Jawed
 
So is this bit on their PR page in this update or not?

PowerDirector 7's support for NVIDIA® CUDA™ technology delivers huge speed gains when encoding HD video into the H.264 format. Offering performance gains of 270% for encoding high-definition video using NVIDIA CUDA technology, PowerDirector 7 leverages the power of the GPU to deliver its faster results.

I mistakenly downloaded their last update; the post-processing was in that update last year. This one is dated 2009, several days ago. Did they post the wrong update or something?
 
From the same PR:

http://www.cyberlink.com/prog/company/press-news-content.do?pid=1994

[...]designed with NVIDIA CUDA Encoder technology achieves up to 274% performance gain when transcoding H.264 videos. By leveraging the multi-core parallel power of the GPU, PowerDirector 7 provides consumers with an accelerated video editing experience resulting in faster rendering of H.264 HD videos to AVCHD, M2T format and for viewing on iPod® and PSP®. PowerDirector 7 supports CUDA-enabled NVIDIA GeForce® processors with NVIDIA graphics drivers version 181.20 or higher.
The speed gain arises from relieving the CPU of the post-processing/rendering work, and from performing these effects more efficiently. It's not from the actual H.264 encode.

Jawed
 
Does that mean NV CUDA Encoder technology doesn't actually encode to H.264 on the GPU, but on the CPU instead?

So the CUDA encoder basically frees up the CPU to encode by offloading all the non-encoding stuff to the GPU. In comparison, ATI Transcoder accelerates the motion compensation part at least.

Did I get that right?

What about Badaboom, doesn't that use the CUDA encoder too? Does Badaboom use the same strategy as this one?
 
Does that mean NV CUDA Encoder technology doesn't actually encode to H.264 on the GPU, but on the CPU instead?
In this product it seems like it.

So the CUDA encoder basically frees up the CPU to encode by offloading all the non-encoding stuff to the GPU. In comparison, ATI Transcoder accelerates the motion compensation part at least.
That's my interpretation of the way CUDA is being used.

The ATI Stream technology seems like it'll only be used for motion estimation, with no effects processing being accelerated. We'll just have to wait and see.

I've got no idea how important effects processing is to most PowerDirector users. I stick to x264 for encoding with MeGUI to handle the batching of encode jobs and Avisynth scripts for "effects".

What about Badaboom, doesn't that use the CUDA encoder too? Does Badaboom use the same strategy as this one?
Badaboom is just for encoding as far as I can tell. If it has any image processing features, I've forgotten - its reputation for low quality per bit makes it easy to entirely dismiss.

Jawed
 
Checking the doom9 thread, it appears that CruNcher who did the testing is less than impressed by the image quality of this initial release (regarding both the Cyberlink and NVidia implementations):

CruNcher said:
PS: It's far from good it fails miserably on most test cases Badaboom easily beats it even being Main Profile only, not worth to test this any further just a waste of time, compared to x264 it looks like a First Day Encoder not sure what to say about Cyberlinks Encoder it's even more in the Direction of 0 Day :p. But maybe the Cyberlink engineers just did wrong implementing it inside their application it looks really bad it has B-frame stability issues in motion scenes, im to scared to encode Touhou with those jesus that must look awful.
 
Hmm, that doesn't look promising. All the speed in the world is useless if the PQ is bad. A CPU can encode really fast too, if the PQ is bad. I think it'll be another year before GPU encoding gets anywhere. It will probably require new GPUs to get good results. Thus far it's only marketing smoke.

I did try to encode with PowerDirector (without CUDA). It doesn't come close to x264, but it was better compared to ATI Transcoder (which is really bad). I am hoping for the best with Cyberlink's ATI implementation, but somehow I don't think it'll be any good either. After all, these are not pro apps.

Are there any other apps that will support ATI GPU transcoding?

From what I gather, the most promising GPU transcoding app is Badaboom. Will Badaboom support ATI Stream in its near-future updates?
 
I think Badaboom can produce nice results as long as one is not expecting the best bang for the buck compression wise. If one is willing to concede a moderate percentage of compression or some fine detail that might (or might not) be noticeable then it's good for some jobs.

If one is using a handheld with decent storage size it could be very useful. Same thing for quick conversions to a PS3 format.

http://www.apple.com/quicktime/guide/hd/bbc-cfb.html

BBC hosts that file too. I grabbed the 1080p sample.

I'll use that. I'll do it in the highly thought of ripbot and Badaboom. Resize to 720p and at one quarter the file size to force quality compromises.

Fwiw, I had to remux the QT video to mkv so that ripbot could read it. A common issue. Badaboom didn't need that. Though setting filesize with it is trickier, it uses a bitrate slider that isn't very precise.

Not counting the few seconds it took ripbot to scan and demux the mkv to its cache, ripbot took 4 min. 40 sec. (two passes). Badaboom took 58 sec.

CPU usage for ripbot was pretty much at 100%, though it did fluctuate down a bit a few very brief times. Badaboom hovered around 10%.

I'll post links to the encodes if I can get them uploaded and there's no objection. I'll remove them if there's a problem. Imo both did a nice job.

ripbot came close with the encode size coming in at 24.9 MB. Badaboom was 25.4 MB. Very close to its slider setting, bitrate wise. Audio sizes look to be very close.
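To put numbers on the comparison above, here is a quick Python sketch of the arithmetic. It only uses the timings and file sizes quoted in this post; the helper name `to_seconds` is made up for illustration and is not part of either tool.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into total seconds."""
    return minutes * 60 + seconds

# Encode times quoted above (ripbot's figure covers both passes).
ripbot_s = to_seconds(4, 40)
badaboom_s = to_seconds(0, 58)

# How many times faster Badaboom finished.
speedup = ripbot_s / badaboom_s

# Output sizes in MB as quoted above.
ripbot_mb = 24.9
badaboom_mb = 25.4
size_diff_pct = 100 * (badaboom_mb / ripbot_mb - 1)

print(f"Badaboom finished about {speedup:.1f}x faster")
print(f"Badaboom's file is about {size_diff_pct:.1f}% larger")
```

Note this says nothing about quality per bit; it only shows that the two outputs are close enough in size for a side-by-side look to be fair.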

I'll be willing to take another comparative project if it's small in size.

Edit: really bad grammar
 
Here's my Avisynth script (note I renamed the extension of the original 1920x1080 file I downloaded from Apple's site):

Code:
directshowsource("bbc-cfb_m1080p.mp4")
 
spline36resize(1280,720)

and here's my x264 (release 1071) settings:

Code:
--crf 25 --ref 3 --mixed-refs --bframes 3 --b-adapt 2 --b-pyramid --weightb --subme 7 --partitions i8x8 --8x8dct --qpmin 16 --vbv-maxrate 29400 --me umh --merange 8 --threads auto

note I couldn't be arsed setting-up a 2-pass encode as I never do that, I always use crf ("constant quality") to do a 1-pass encode.

I'm using an A64 3800X2 CPU (yay, I got an upgrade from my 3500+ single core CPU :D ). The video encode ran at 6.18fps, 6m34s and the mux at the end took 5s. I muxed-in the original audio rather than encoding it again.
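As a sanity check on those figures, here is a rough Python sketch relating the reported frame rate to the encode time. It assumes the clip is 2432 frames long, which is the frame count reported for the original video later in this thread; the variable names are just for illustration.

```python
# Assumption: 2432 frames, the count reported for the source clip later in the thread.
frames = 2432
fps = 6.18  # encode speed reported by x264

# Total encode time implied by the frame count and speed.
seconds = frames / fps
minutes, secs = divmod(round(seconds), 60)

print(f"Implied encode time: {minutes}m{secs:02d}s")
```

Under that assumption the implied time lines up with the 6m34s quoted above.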

The final muxed file is 25.1MB (25799KB). I can't upload it to MediaFire right now as the server's undergoing maintenance.

Obviously my workflow is more arduous, as I faffed around to create the script, queue the job and did the mux manually (didn't even bother to queue that as a job).

So here's a "worst case frame", frame 518:

BBCMotion-518-x264.jpg


In the top left corner I've overlaid some data from playback in Media Player Classic, via FFDShow using libavcodec to decode.

Here's [strike]the same frame from[/strike] Badaboom:

BBCMotion-518-BB.jpg


:oops:

I say "same frame" but I had to double-check that. And then triple-check it. Then I checked again...

Jawed
 
Nice, can really notice the image quality difference. Something is up with the frame count though. Maybe Badaboom dropped some.

You used b-pyramid? That's cool, I think the PS3 can handle that, so can a PC, but any other standalone player might choke on an encode that uses it.

Edit: That time lapse stuff was maybe a bad choice. :) I doubt Badaboom's default range of precision is very close to the settings of a good encoder.

I'm up for another shot if anyone has a link to some legal footage.

And thanks for taking the time and making such a great effort.
 
Jawed: Interesting, thanks - as Babel said, Badaboom most likely skipped a few frames. The quality indeed isn't very impressive to say the least (I assume this is H.264 Baseline for both x264 and Badaboom, right?) - hopefully the cost:benefit calculation will change as the ALU:TEX ratio increases in 2009 and beyond and software matures further...

---

I am sorry to partially hijack the thread, but one thing that has struck me is that everyone is focusing on H.264 - however, in the near future, several handheld SoCs used in mobile phones will support 720p/1080p VC-1 and H.264 *Baseline*; when given the choice between the two, the former is theoretically superior in every way.

So I'm wondering if anyone knows what the state of the art is in terms of personal VC-1 transcoders, and whether any company has indicated anything about working on it with GPGPU acceleration... Cheers! Am I right in assuming neither Elemental Technologies nor AMD have indicated they were interested in VC-1?
 
It seems both ATI and Nvidia have forgotten image quality. Speed is nothing without quality.

If I transcode something on a GPU and it takes 15 minutes instead of 45 minutes on the CPU, but the results are terrible, then I've just wasted 15 minutes, because I'm then going to spend 45 minutes re-doing the job on the CPU. In short, I might as well not bother with the GPU encoding if the results are totally unacceptable due to such poor quality.
 
Nice, can really notice the image quality difference. Something is up with the frame count though. Maybe Badaboom dropped some.
ARGH :rolleyes:

The Badaboom video is 2536 frames. The original video (and other encodes from it) are 2432 frames. On replay it seems that frames in the Badaboom video are actually skipped, as the frame number jumps, e.g. 542, 543, 544, 545, 546, 547, 548, 550 :oops:
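For anyone wanting to check this sort of thing without stepping through frames by eye, here is a minimal Python sketch. It assumes you have somehow noted down the frame numbers the player displays (how you collect them is up to you); `find_gaps` is a hypothetical helper, not a feature of any player or encoder.

```python
def find_gaps(frame_numbers):
    """Return (previous, current) pairs where the displayed frame number jumps."""
    gaps = []
    for prev, cur in zip(frame_numbers, frame_numbers[1:]):
        if cur - prev > 1:
            gaps.append((prev, cur))
    return gaps

# The sequence observed during playback, as quoted above.
shown = [542, 543, 544, 545, 546, 547, 548, 550]
gaps = find_gaps(shown)

# Frame counts quoted above: Badaboom output vs. the original clip.
surplus = 2536 - 2432

print("Jumps in displayed frame numbers:", gaps)
print("Extra frames in the Badaboom file:", surplus)
```

On the sample sequence this flags the jump from 548 to 550, and the frame counts differ by 104.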

So, ahem, the matching frame should be:

BBCMotion-541-BB.jpg


That's a relief, I thought I was going mad, trying to work out how the frames could be so different yet the sequence "looked fine". I guess the 104 superfluous frames are actually fully encoded, just never shown. I suppose they're wasting bitrate, but who knows?

You used b-pyramid? That's cool, I think the PS3 can handle that, so can a PC, but any other standalone player might choke on an encode that uses it.
My encode profiles normally target my own personal use - I adapted a crf 22 profile to make that crf 25 profile. So I've made the profile fast but efficient, which means I've turned down a lot of settings from their maximum (i.e. motion estimation range, partitions, reference frame count and rate-distortion optimisation level plus no use of trellis) as they're very costly in terms of encode speed with very little benefit in quality per bit.

As a comparison, for the same clip I've just run with these settings:

Code:
--crf 25 --ref 5 --mixed-refs --no-fast-pskip --bframes 3 --b-adapt 2 --b-pyramid --weightb --subme 9 --trellis 2 --partitions all --8x8dct --qpmin 16 --vbv-maxrate 29400 --me umh


Comparing this with the encode I made earlier:
  • 1.98fps - 20m29s - 32% of encoding speed
  • 26134KB - 110% of the file size (i.e. larger!)
I am comparing the x264 encodes only, i.e. video encode speed and size of the video alone. The earlier encode came in at 23831KB.
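The percentages above come from dividing the new figures by the earlier encode's figures. A small Python sketch of that, using only numbers quoted in this thread (the function name `percent_of` is just an illustrative choice):

```python
def percent_of(value, baseline):
    """Express value as a whole-number percentage of baseline."""
    return round(100 * value / baseline)

# Earlier crf 25 encode: 6.18 fps, 23831 KB of video.
# Slower-settings encode: 1.98 fps, 26134 KB of video.
speed_pct = percent_of(1.98, 6.18)
size_pct = percent_of(26134, 23831)

print(f"Encoding speed: {speed_pct}% of the earlier encode")
print(f"File size: {size_pct}% of the earlier encode")
```

The same helper can be pointed at any other pair of encodes, as long as both numbers cover the video stream alone.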

The frame looks a little better, too:

BBCMotion-518-x264-max.jpg


That encode speed though is atrocious.

Finally, here's the same frame at crf 29:

BBCMotion-518-x264-29.jpg



Compared with the crf 25 encode I first posted (comparing video, only):
  • 6.24fps - 6m30s - 101% of encoding speed
  • 12894KB - 54% of the file size
I'd say this video, overall, has about the same quality as the Badaboom video. It's noisier and has less blocking.

I've just uploaded two videos:

x264 crf 25

x264 crf 29

Hope they work.

Jawed
 
Yeah compared to x264, Badaboom is really poor. The inability to use it with avisynth is also a show stopper, IMO. It's a very limited app that I feel is mainly for folk who aren't all that interested in anything but speed.

My 3.0 GHz Q6600 cranks through an SD-resolution x264 encode with Unrestricted, CRF HQ settings (8 ref, 3 b, etc) at about 35fps.

As I said earlier, the most exciting encoding development of early '08 for me was hearing that the x264 devs were looking at CUDA. They gave up after months of effort. It was just too awful apparently. Some of the limitations of Badaboom may be related to CUDA difficulties. There is still talk of GPU x264 on the Doom9 forum, but they are just not impressed with the ease-of-use, performance and flexibility of CUDA and ATI's options.

I'd rather they just skip CUDA and work with OpenCL anyway. Limiting all this development work to one GPU company is not good for anyone but that company. It feels like we are reliving the olden days of proprietary 3D APIs.

The only non-game app I use on my GPUs is an Avisynth filter called FFT3DGPU, based on FFT3DFILTER. It is an amazing noise filter that is incredibly demanding on a CPU. The GPU version offers quality as good as the CPU version (as far as I can see) and is missing only a few features. It just runs through Direct3D 9 somehow. I can run it on a GeForce FX if I so desire, but I really need an HD3850 or better to keep up with the quad core's x264 frame rate. I can clean up a source and lose no speed with this filter. It's been fun to try it on a range of GPUs.
 