Stupid question, but what exactly is the hardware rasterizer?
I realize that this may be a stupid question, but after extensive googling I must ask it. What exactly is the hardware rasterizer at the heart of 3D chips? What makes it faster than software when just drawing a wireframe (no h/w TCL)? I am sorry, but no sites, tutorials, or explanations really seem to cover this. TIA for any replies.
I don't have time to post a lengthy answer, but you might want to check out this three part article for a good overview of the graphics pipeline.
I realize that this may be a stupid question, but after extensive googling I must ask it. What exactly is the hardware rasterizer at the heart of 3D chips?
A z-buffer based rasteriser can be summarised as:
Accept a list of triangles, i.e. 3 vertices each with screen positions (X,Y and "depth" Z), colour, and texture coordinates.
Step through each triangle and determine which pixels lie inside the triangle (usually called scan conversion).
Compare each screen pixel's stored "depth" with the triangle's computed depth at the same pixel. If the triangle is "in front" at that pixel, "draw" it.
Draw means to compute the colour(s) of the triangle's texture(s) at that pixel location and to either replace the stored pixel's colour completely or blend it with the new colour in some fashion. You'd normally replace the pixel's stored depth as well.
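For what it's worth, the steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular chip does it: the names `edge` and `rasterize` are made up for the example, scan conversion is done with a bounding box plus edge functions (one of several possible methods), each triangle has a single flat colour instead of textures, and a smaller Z is assumed to mean "closer".

```python
import math

def edge(ax, ay, bx, by, px, py):
    """Twice the signed area of triangle (A, B, P); the sign says which side of A->B holds P."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize(triangles, width, height):
    """triangles: list of (v0, v1, v2, colour), each vertex an (x, y, z) screen position."""
    far = float("inf")
    depth = [[far] * width for _ in range(height)]    # stored depth per pixel
    colour = [[None] * width for _ in range(height)]  # stored colour per pixel
    for (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), tri_colour in triangles:
        area = edge(x0, y0, x1, y1, x2, y2)
        if area == 0:
            continue  # degenerate triangle covers no pixels
        # Scan conversion: visit only pixels in the triangle's bounding box.
        min_x = max(math.floor(min(x0, x1, x2)), 0)
        max_x = min(math.floor(max(x0, x1, x2)), width - 1)
        min_y = max(math.floor(min(y0, y1, y2)), 0)
        max_y = min(math.floor(max(y0, y1, y2)), height - 1)
        for y in range(min_y, max_y + 1):
            for x in range(min_x, max_x + 1):
                px, py = x + 0.5, y + 0.5  # sample at the pixel centre
                # Barycentric weights; dividing by area makes either winding work.
                w0 = edge(x1, y1, x2, y2, px, py) / area
                w1 = edge(x2, y2, x0, y0, px, py) / area
                w2 = edge(x0, y0, x1, y1, px, py) / area
                if w0 < 0 or w1 < 0 or w2 < 0:
                    continue  # pixel centre lies outside the triangle
                z = w0 * z0 + w1 * z1 + w2 * z2  # triangle depth at this pixel
                if z < depth[y][x]:              # depth compare
                    depth[y][x] = z              # replace stored depth
                    colour[y][x] = tri_colour    # "draw": replace stored colour
    return colour, depth

far_tri = ((0, 0, 0.8), (8, 0, 0.8), (0, 8, 0.8), "blue")
near_tri = ((0, 0, 0.2), (4, 0, 0.2), (0, 4, 0.2), "red")
colour, depth = rasterize([near_tri, far_tri], 8, 8)
```

Because of the depth compare, the near triangle wins where the two overlap no matter which one is submitted first. The hardware runs this inner loop for several pixels at once, which is where the speed comes from.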
What makes it faster than software when just drawing a wireframe (no h/w TCL)? I am sorry, but no sites, tutorials, or explanations really seem to cover this. TIA for any replies.
I'm not sure exactly what you are asking here, but the reason it's fast is that the hardware is designed to do many steps in parallel.
Yup, the three main reasons a 3D accelerator is faster than software are...
1. Dedicated logic. Takes fewer cycles than a general-purpose processor, regardless.
2. Parallel operation. Take Radeon 8500 for example. Each of its 4 pixel pipelines performs a Z compare, plus each one has two TCUs which can each take up to 64 sub samples (8x Aniso + bilinear). With 8 TCUs in total, that's 64 × 8 = 512 simultaneous texel samples plus some number of Z operations all done in one clock cycle. A general CPU would need a HUGE number of cycles to perform even bilinear at a theoretical 1PPC. The power needed to perform AF would be staggering. And that's without FSAA.
3. Memory bandwidth. Latest 3D core from nVidia features some 10.4GB/sec. Highest (official) CPU bandwidth to date is 3.2GB/sec (dual-channel PC800).
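To spell out the arithmetic in points 2 and 3, here is a small worked example. It only restates the figures quoted in this post (4 pipelines, 2 TCUs per pipeline, 64 samples per TCU, 10.4 and 3.2 GB/sec); the variable names are made up for illustration and nothing here is measured.

```python
# Figures as quoted in the post above, not independently verified.
pipelines = 4
tcus_per_pipeline = 2
samples_per_tcu = 64

total_tcus = pipelines * tcus_per_pipeline        # 8 texture units in total
samples_per_clock = total_tcus * samples_per_tcu  # texel samples per clock

gpu_bandwidth_gb_s = 10.4  # quoted figure for the nVidia core
cpu_bandwidth_gb_s = 3.2   # quoted figure for dual-channel PC800
bandwidth_ratio = gpu_bandwidth_gb_s / cpu_bandwidth_gb_s

print(total_tcus, samples_per_clock, round(bandwidth_ratio, 2))
```

So the quoted 512 samples per clock comes from 8 texture units, and the quoted memory figures differ by a factor of a little over 3.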
I think that the parallel nature of a hardware rasterizer is the main reason why it's faster. Hardware rasterizers are highly pipelined.
Bandwidth is not really that big a problem, because you can write a deferred renderer for a CPU, and every modern CPU has a fast and big L2 cache. The bandwidth of a P4 (3.2GB/s) is larger than a Kyro's, but a software rasterizer on a P4 won't be able to compete with a Kyro.
Well, yeah, but then the CPU would have to compute the culling as well ^_^
PowerVR cores do it in hardware along with everything else so the bandwidth CAN be saved. Also remember that K2 does all Z work on-chip, with only the final front buffer stored in onboard RAM, so that's still more savings on external bandwidth which a general purpose processor couldn't achieve without some kind of dedicated logic. ^_^;
Why not? One can write software to do all of these things, including removing invisible pixels or even triangles. It can save bandwidth even better than a Kyro. However, its main problem is that CPUs are too slow compared to a highly pipelined hardware rasterizer.
However, in the early days some hardware rasterizers were actually slower than the fastest CPUs of their time. They were generally called "3D decelerators." :)
For simply drawing a wireframe, would a 3D accelerator still be faster than a CPU? What work would the card be doing? (Assuming it did all the transform on the CPU.) I apologize if I am being ignorant. Thanks a lot for the information so far, learned a lot.
pcchen: Didn't say it wasn't possible, but because the CPU would have to compute the culling first, it wouldn't be as effective as dedicated raster logic (i.e. PowerVR).
EvilTwin: Drawing a wireframe, a CPU could probably do passably at low resolutions. But scaling the resolution up, the Z and frame buffer size would probably end up eating all the effective bandwidth. The main thing a hardware rasteriser could do is anti-alias the lines faster than a CPU.
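To put a rough number on the buffer cost, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name `buffer_write_bandwidth` is made up, colour and Z are taken as 4 bytes each, the frame rate is 60 fps, and only one full-screen write of both buffers per frame (e.g. a clear) is counted. The lines themselves, any Z reads, and the display scan-out would come on top of this.

```python
def buffer_write_bandwidth(width, height, fps, colour_bytes=4, depth_bytes=4):
    """Bytes per second needed to write every colour and Z value once per frame."""
    return width * height * (colour_bytes + depth_bytes) * fps

low_res = buffer_write_bandwidth(640, 480, 60)     # bytes/sec at 640x480
high_res = buffer_write_bandwidth(1600, 1200, 60)  # bytes/sec at 1600x1200

print(low_res / 1e9, high_res / 1e9)  # in GB/sec (decimal)
```

Under these assumptions the cost goes from about 0.15 GB/sec at 640x480 to about 0.92 GB/sec at 1600x1200, which is a sizeable share of the 3.2GB/sec CPU figure quoted earlier in the thread before a single line has been drawn.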