You can find an ad-free static site version of this post here: https://therealmjp.github.io/posts/signal-processing-primer/
For a theoretical understanding of aliasing and anti-aliasing, we can turn to the fields of signal processing and sampling theory. This article will explain some of the basics of these two related field in my own words, taking a more theoretical point of view. In the following article the concepts covered here will be used to analyze common aspects of real-time graphics, so that we can describe them in terms of signal processing. If you’d like some further reading, I’d recommend consulting chapter 7 of Physically Based Rendering (available as a free download), chapter 5 of Real-Time Rendering, 3rd Edition or Principles of Digital Image Synthesis (which is also freely available for download)
As always, I’m more interested in the material being correct than I am in sounding like I’m smart. So if you see anything that you feel is incorrect or have any additional insights to share, please let me know in the comments!
Sampling theory deals with the process of taking some continuous signal that varies with one or parameters, and sampling the signal at discrete values of those parameters. If you’re not familiar with signals and signal processing, you can think of a signal as some continuous function of any dimension that varies along its domain. To sample it, we then calculate that function’s value at certain points along the curve. Usually the points at which we sample are evenly-spaced apart, which we call uniform sampling. So for instance if we had a 1D signal defined as f(x) = x^2, and we might sample it at x =0, x = 1, x = 2 x = 3, and so on. This would give us our set of discrete samples, which in this case would be 0, 1, 4, 9, and continuing on in that fashion. Here’s an image showing our continuous function of f(x) = x^2, and the result of discretely sampling that function at integer values of x:
Discretely sampling a continuous function
Working with discrete samples has a lot of advantages. For instance, it allows us to store a representation of an arbitrary signal in memory by simply storing the sampled values in an array. It also allows us to perform operations on a signal by repeatedly applying the same operation in a loop for all sample points. But what happens when we need the value of s signal at a location that we didn’t sample at? In such cases, we can use a process known as reconstruction to derive an approximation of the original continuous function. With this approximation we can then discretely sample its values at a new set of sample points, which is referred to as resampling. You may also see this referred to as interpolation in cases where local discrete values are used to compute an “in between” value.
The actual process of reconstruction involves applying some sort of reconstruction filter to a set of discrete samples. Typically this filter is some function that is symmetrical about x=0, often with non-zero values only in a small region surrounding x=0. The following image contains a few functions commonly used as reconstruction filters:
Various functions used as reconstruction filters. Starting from the top left and moving clockwise: the box function, the triangle function, and the sinc function. Image from Real-Time Rendering, 3rd Edition, A K Peters 2008
The filter is applied using a process known as convolution. In the case of discrete sample points, a convolution implies multiplying the sample values with an instance of the filter function translated such that it is centered about the same point, and then summing the result from all sample points. If you’re having trouble understand what this means, take a look at the following three images which show the result of convolution with three common reconstruction filters:
Discretely-sampled signals being reconstructed with a box function, a triangle function, and a sinc function. Images from Real-Time Rendering, 3rd Edition, A K Peters 2008
If you’ve ever written a full-screen Gaussian blur shader, then you’ve used convolution. Think about how you would write such a shader: you loop over nearby pixel values (sample points) in a texture, multiply each value by the Gaussian function evaluated using the distance from your output pixel, and sum the results. Evaluating a function using the distance to the sample point is equivalent to translating the filter function to the location of the sample point, although you may not have thought of it this way.
Of the common reconstruction filters, the sinc filter is particularly interesting. This is because it is theoretically possible to use it to exactly reconstruct the original continuous signal, provided that the signal was adequately sampled. This is known as ideal reconstruction. To define what “adequately sampled” means for a continuous signal, we must now discuss aliasing.
Key Takeaway Points:
- Continuous signals(functions) can be sampled at discrete sample points
- An approximation of the original continuous signal can be reconstructed by applying a filter to the discrete sample values
Frequency and Aliasing
Signals are often described in terms of their frequency, which in rough terms describes how quickly a signal changes over their domain. In reality a signal is not composed of just one frequency, but can have an entire spectrum of frequencies. Mathematically a signal can be converted from its original representation (often referred to as the time domain or spatial domain, depending on the context) to its spectrum of frequencies (known as the frequency domain) using the Fourier transform. Once in the frequency domain, it can be determined if there is some maximum frequency where all frequencies above that have an intensity of zero. If such a maximum frequency exists, the signal is said to be bandlimited, which means we can determine the bandwidth of that signal. Depending on the context, the term “bandwidth” can be either the passband bandwidth or the baseband bandwidth. The passband bandwidth is equal to the maximum frequency minus the minimum frequency, while the baseband bandwidth simply refers to the refers to the maximum frequency. With sampling theory we are primarily concerned with the baseband bandwidth, because it is used to determine the Nyquist rate of the signal. The Nyquist rate is minimum rate at which the signal should be sampled in order to prevent aliasing from occurring, and it is equal to 2 times the baseband bandwidth. The term “aliasing” refers the fact that a signal can become indistinguishable from a lower-frequency signal when undersampled. The following image demonstrates this phenomenon with two sine waves of different frequencies, where the samples would be the same for either signal:
Aliasing of a sampled sine wave. Image from Wikipedia
In practice, aliasing that occurs due to undersampling will result in errors in the reconstructed signal. So in other words, the signal you end up with will be different than the one you were originally sampling. For signals that are not bandlimited, there is no maximum frequency and thus there is no sampling rate that won’t result in aliasing after reconstruction.
To better understand how and why aliasing occurs, it can be helpful to look at things in the frequency domain. Let’s start with the plot of the frequency spectrum for an arbitrary signal:
Frequency spectrum of an arbitrary signal. Image from Wikipedia.
As we can see from the plot, there is a maximum frequency located at point “B”, meaning that the signal is bandlimited and has a bandwidth equal to B. When this signal is discretely sampled, an infinite number of “copies” of the signal will appear alongside the original at various points. Here’s an image illustrating how it looks:
Replicas of a signals frequency spectrum. Image from Wikipedia
The location of the signal duplicates is determined by the sampling rate, which is marked as “fs” in the plot. Since these duplicates are present, we must use a filter (the reconstruction filter) to remove these duplicates and leave us with only the frequency spectrum that was within the original signal’s bandwidth (referred to as the baseband frequencies). The obvious solution is to use a box function in the frequency domain, since a box function implies multiplying a certain range of values by 1 and all other values by 0. So if we were to use a box function with a width of B, we would remove the duplicates while leaving the original signal intact. The following diagram shows how this might work:
A reconstruction filter is used to isolate the original copy of a signal’s spectrum. Image from Wikipedia
What’s important to keep in mind is that we typically need to apply our reconstruction filter in the spatial domain, and not in the frequency domain. This means that we need to use the spatial domain equivalent of a box function in frequency space, and it turns out that this is the previously-mentioned sinc function. By now it should make sense why the sinc function is called the ideal reconstruction filter, since it has the ability to leave certain frequency ranges untouched while completely filtering out other frequencies. For this same reason it is also common to refer to the frequency domain box function as the ideal low-pass filter.
Now let’s look at what happens when we don’t sample the signal at an adequate rate. As we saw earlier, the duplicates of a signal will appear at multiples of the sampling frequency. So the higher our sampling rate the further apart they will be, while the lower our sampling rate the closer they will be. Earlier we learned that the critical sampling rate for a signal is 2B, so let’s look at the plot of a signal that’s been sampled at a rate lower than 2B:
Inadequate sampling rate results in overlap of a signal’s replicas. Image from Wikipedia
Once we dip below the Nyquist rate of the signal, the duplicates begin to overlap in the frequency domain. After this happens it is no longer possible to isolate the original copy of the signal with a sinc filter, and thus we end up with aliasing. The bottom plot in the above image demonstrates what an alias of the original signal would look like. Since its frequency response is identical to that of the original signal, it is completely indistinguishable.
Key Takeaway Points:
- Signals can be decomposed into a spectrum of frequencies, with the spectrum being tied to the rate of change of the signal
- Signals with a maximum frequency are bandlimited
- A signal’s Nyqust rate is equal to two times its maximum frequency, and this is the minimum sampling rate required to perfectly reconstruct a signal without aliasing
- Signal reconstruction can be viewed as the process of removing “replicas” of a signal’s spectrum in the frequency domain
Reconstruction Filter Design
Aliasing that results from undersampling is referred to as prealiasing, since it occurs before reconstruction takes place. However it is also possible for artifacts to occur due to the reconstruction process itself. For instance, imagine if we used a box function that was too wide when applying a reconstruction filter. The result would look like this:
A wide reconstruction filter fails to isolate the original copy of a signal’s spectrum. Image from Wikipedia
With such a reconstruction we would still end up with artifacts in the reconstructed signal, even though it was adequately sampled.
As we’ve demonstrated, using the wrong size box function in the frequency domain is one way to adversely affect the quality of our reconstructed signal. However we’ve already mentioned that a variety of functions can be applied as a filter in the spatial domain, and these functions all have a frequency domain counterpart that differs from the box function that we previously discussed. With this in mind, we can reason that the choice in filter will affect how well we isolate the signal in the frequency domain, and that this will affect how much postaliasing is introduced into the reconstructed signal. Let’s look at some common filtering functions, and compare their plots with the plots of their frequency domain counterparts:
Box function -> Sinc Function
Triangle Function -> (Sinc Function)2
Gaussian Function -> Gaussian Function
(Sinc Function)2 -> Triangle Function
Sinc Function -> Box Function
By looking at the frequency domain counterpart of a spatial domain filter function, we can get a rough idea of how well it’s going to preserve the frequency range we’re interested in and filter out the extraneous copies of our signal’s spectrum. The field of filter design is primarily concerned with this process of analyzing a filter’s frequency domain spectrum, and using that to evaluate or approximate that filter’s overall performance. Looking that the spectrums plotted in the above images, we can see that the non-sinc functions will all attenuate the baseband frequencies in some way. For some functions we can also observe that the frequency domain equivalent has no maximum value above which all frequencies have a value of zero, which means that the frequency domain filter extends infinitely in both directions. This ultimately means that all of the infinite replicas of the signal’s spectrum will bleed into the reconstructed signal to some extent, which will cause aliasing.
One general pattern that we can observe from looking at the plots of the spatial domain filter functions and their frequency domain equivalents is that there is an inverse relationship between rate of change in one representation and its counterpart. For instance, have a look at the spatial domain box function. This function has a discontinuity at some value, resulting in infinite rate of change. Consequently its frequency domain counterpart is the sinc, which extends to infinity representing the infinite rate of change inherent in the box function’s discontinuity. By the same token a sinc function in the spatial domain equates to a box function in the frequency domain since the relationship is reciprocal. The Gaussian function is a special case, where the spatial domain and frequency domain counterparts are the same function. For this reason the Gaussian function represents the exact “midpoint” between smoothly-changing functions and sharply-changing functions in the spatial and frequency domains. Another important aspect of this relationship is that by making a filtering function “wider” (which can be achieved by dividing the input distance by some value greater than 1), the resulting frequency spectrum for that function will become more “narrow”. As an example, have a look at the spectrum of a “unit” box function with width of 1.0 compared with the spectrum of a box function with width of 4.0
Frequency spectrum of a box function with width of 1.0
Frequency spectrum of a box function with width of 4.0
The graphs clearly show that as the filter kernel becomes wider, the magnitude of the lowest frequencies becomes higher. This is really just another manifestation of the behavior we noted earlier regarding rate of change in the spatial domain and the frequency domain, since “wider” functions will change more slowly over time and thus will have more low frequency components in its frequency spectrum.
One difficult aspect of filter design is that we often must not just consider the filter’s frequency domain representation, but we also must consider the effect that its spatial domain representation will have on the reconstructed signal. In particular, we must be careful with filters that have negative lobes, such as the sinc filter. Such filters can produce an effect known as ringing when applied to sharp discontinuities, where the reconstructed signal oscillates about the signal being sampled. Take a look at the plot of a square wave being reconstructed with a sinc filter:
Gibbs phenomenon resulting from a square wave being reconstructed with a sinc filter
Key Takeaway Points:
- Error resulting from inadequate sampling rate is known as prealiasing. Error introduced through poor filtering is known as postaliasing.
- A filter’s ability to limit aliasing can be estimated by observing its frequency domain representation
- A filter’s rate of change in the spatial domain is inversely related to its rate of change in the frequency domain
- A filter’s spatial domain representation can also have an effect on the quality of the reconstructed signal, with the most notable effect being ringing occurring at discontinuities.
Pharr, Matt and Humphreys, Greg. Physically Based Rendering – From Theory to Implementation, 2nd Edition.
Akenine-Möller, Tomas, Haines, Eric, and Hoffman, Naty. Real-Time Rendering, 3rd Edition
Glassner, Andrew. Principles of Digital Image Synthesis
Next article in the series: