Procedural anti-aliasing is better called analytical anti-aliasing.
To understand it, you first need to realize that generating an image is really just evaluating a multi-dimensional integral, and anti-aliasing is accomplished by integrating over the area each pixel covers.
This is normally accomplished with a numerical approximation such as the rectangle rule: take a number of samples, average them, and rely on the property that the error in your approximate integral approaches 0 as the number of samples approaches infinity. Since a large number of primitives in computer graphics are sampled data, or have no closed-form mathematical description (e.g., texture maps), approximations such as the rectangle (or trapezoidal) rule are often the best available technique.
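As a rough illustration of the sample-and-average approach, here is a minimal Python sketch. The `stripe` pattern and the `supersample` helper are hypothetical names made up for this example, and the pattern is reduced to one dimension to keep it short.

```python
def stripe(u):
    """Hypothetical 1D procedural pattern: 1 on [0, 0.5) of each unit period, else 0."""
    return 1.0 if (u % 1.0) < 0.5 else 0.0


def supersample(f, u_lo, u_hi, n):
    """Rectangle-rule (midpoint) average of f over [u_lo, u_hi] using n samples."""
    step = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n):
        # Sample at the center of each of the n equal sub-intervals.
        total += f(u_lo + (i + 0.5) * step)
    return total / n


if __name__ == "__main__":
    # The exact coverage of the stripe over [0.2, 0.9] is 0.3 / 0.7.
    for n in (4, 64, 10000):
        print(n, supersample(stripe, 0.2, 0.9, n))
```

With only 4 samples the estimate over [0.2, 0.9] is 0.5, while the true coverage is 3/7 (about 0.4286); the estimate only converges as the sample count grows, which is the cost the analytical approach avoids.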
However, many procedural shaders (the simplest being things like checkerboards) are based entirely in the world of mathematics, and you *can* write an equation f(u,v,w) that maps texture coordinates to RGB values. If, instead of evaluating f(u,v,w) at just the texture coordinates of the current fragment, you compute the integral of f(u,v,w) over the range of texture coordinates the fragment covers (from its lowest point to its highest point) and divide by the size of that range, you can compute the exact filtered value for the shader directly, and get perfect texture anti-aliasing at any resolution.
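To make the idea concrete, here is a minimal sketch for a checkerboard, assuming a box filter over an axis-aligned range of texture coordinates. All function names are hypothetical, and the checkerboard is built from a unit-period stripe that is 1 on the first half of each period.

```python
import math


def stripe_integral(u):
    """Antiderivative of a unit-period stripe that is 1 on [0, 0.5) and 0 on [0.5, 1)."""
    whole = math.floor(u)
    # Each full period contributes 0.5; the partial period contributes at most 0.5.
    return 0.5 * whole + min(u - whole, 0.5)


def stripe_filtered(u_lo, u_hi):
    """Exact average of the stripe over [u_lo, u_hi] (box filter)."""
    width = u_hi - u_lo
    if width <= 0.0:
        # Degenerate footprint: fall back to a point sample.
        return 1.0 if (u_lo % 1.0) < 0.5 else 0.0
    return (stripe_integral(u_hi) - stripe_integral(u_lo)) / width


def checker_filtered(u_lo, u_hi, v_lo, v_hi):
    """Exact box-filtered checkerboard: 1 where exactly one of the two stripes is 1."""
    a = stripe_filtered(u_lo, u_hi)
    b = stripe_filtered(v_lo, v_hi)
    # The pattern is separable over an axis-aligned box, so the average of
    # the XOR is a*(1-b) + b*(1-a).
    return a + b - 2.0 * a * b


if __name__ == "__main__":
    print(stripe_filtered(0.2, 0.9))             # 0.3 / 0.7, with no sampling error
    print(checker_filtered(0.0, 8.0, 0.0, 8.0))  # footprint covers many squares
```

The cost is constant no matter how many checker squares the footprint covers, which is why the result stays clean at any resolution. A real fragment footprint is generally not an axis-aligned box in texture space, so this is an approximation of the footprint rather than of the integral.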
This requires the ability to compute the derivative of the texture coordinates relative to screen coordinates at every fragment, in order to know what the correct filter width (range of texture coordinates) is.
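In GLSL fragment shaders the built-ins `dFdx`, `dFdy`, and `fwidth` provide these derivatives. The sketch below imitates `fwidth` in Python with forward differences between neighboring pixels; the `coord` mapping and the `filter_width` helper are hypothetical and only meant to show the arithmetic.

```python
def filter_width(coord, x, y):
    """Estimate how far the texture coordinates move across one pixel.

    coord(x, y) -> (u, v) is a hypothetical screen-to-texture mapping.
    Returns (du, dv), using |d/dx| + |d/dy| per coordinate, the same
    combination GLSL's fwidth uses.
    """
    u0, v0 = coord(x, y)
    ux, vx = coord(x + 1, y)  # one pixel to the right
    uy, vy = coord(x, y + 1)  # one pixel along the other screen axis
    du = abs(ux - u0) + abs(uy - u0)
    dv = abs(vx - v0) + abs(vy - v0)
    return du, dv


if __name__ == "__main__":
    # Assumed mapping for illustration: a scaled and sheared plane.
    def plane(x, y):
        return 0.25 * x, 0.5 * y + 0.125 * x

    u, v = plane(3, 4)
    du, dv = filter_width(plane, 3, 4)
    # Range of texture coordinates to integrate over for this pixel.
    print((u - 0.5 * du, u + 0.5 * du), (v - 0.5 * dv, v + 0.5 * dv))
```

The resulting `(du, dv)` gives the integration range around the fragment's texture coordinates, which is the input the analytical filter needs.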
Procedurally anti-aliasing noise (often used in marble and wood textures) is much more difficult, since most noise implementations have no closed-form integral (and many are not differentiable everywhere), so the best you can usually do is an anti-aliasing implementation that takes a variable number of samples.
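One possible shape for a variable-sample-count filter is sketched below. It is an assumption-heavy illustration, not a standard algorithm: `noise` is any 1D function you supply, and `feature_size`, `max_samples`, and the two-samples-per-feature rule are made-up parameters for this example.

```python
import math


def sample_count(width, feature_size=1.0, max_samples=64):
    """Pick more samples when the footprint spans more noise features."""
    n = math.ceil(2.0 * width / feature_size)  # roughly two samples per feature
    return max(1, min(max_samples, n))


def filtered_noise(noise, u, width, feature_size=1.0, max_samples=64):
    """Average noise over [u - width/2, u + width/2] with an adaptive sample count."""
    n = sample_count(width, feature_size, max_samples)
    step = width / n
    start = u - 0.5 * width
    total = 0.0
    for i in range(n):
        # Midpoint of each sub-interval of the footprint.
        total += noise(start + (i + 0.5) * step)
    return total / n


if __name__ == "__main__":
    # math.sin stands in for a real noise function here.
    print(filtered_noise(math.sin, 0.3, 0.01))  # small footprint: 1 sample
    print(filtered_noise(math.sin, 0.3, 20.0))  # large footprint: 40 samples
```

Fragments with a small footprint pay for a single evaluation, while distant or grazing-angle fragments take more samples, up to the cap.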