Compressed sensing is a new compression technique that enables you to compress data by calculating several linear combinations of the original measurements. This article explores the practical side of the technique and details how to use it to identify the bandwidth of third-order digital low-pass Butterworth filters.
When you know more about a signal than just the measurements you have recorded, it is often possible to compress the signal. For example, when you know that a sequence of bytes contains the ASCII values of the characters of an essay written in English, you can exploit the characteristics of the English language to compress the sequence. When measurements are known to come from a video clip, you can exploit the characteristics of video signals to compress the measurements. In each case, you make essential use of the characteristics of the signal at the compression stage. In this article, we describe a method of compressing measurements, called compressed (or compressive) sensing, that makes little use of the characteristics of the signal at the compression stage; the compression stage is quite simple and rather generic. It is at the decompression stage that the method makes essential use of the known characteristics of the signal, and that stage is correspondingly more complicated.
Compressed sensing is about 10 years old (though it had precursors). Such luminaries as Terence Tao, Emmanuel Candès, and David Donoho started working on the techniques about a decade ago, and many researchers are currently working on generalizing, improving, and applying compressed-sensing-based techniques. As we will see, compressed sensing can be applied to signals that are (linear) combinations of a small number of signals taken from a (generally very) large database of signals—and this is the critical "characteristic" that the signal must have. In compressed sensing, the compression step is simple, but decompressing the data takes more work. Compressed sensing is particularly useful when the units performing the compression are relatively simple—say, sensors with small embedded microcontrollers—but the unit performing the decompression has a relatively strong processor and is capable of running sophisticated software—say, a desktop computer running MathWorks' MATLAB.
In the next section, we present a bit of the mathematical theory of compressed sensing. Hopefully, it will bring back (pleasant) memories of linear algebra courses and will give a small taste of some of the mathematics used in this field even to those who have not had the pleasure of a well-taught linear algebra course.
COMPRESSED SENSING: THE MATHEMATICS
Let B be a matrix with N columns and M rows. B is the database of signals. Each of the N columns of B represents one signal from the database, and each of the M elements of the column is a sample of that signal. Let x be a vector with N elements. Then Bx is a linear combination of the columns of B. The hypothesis that allows us to compress our measurements is that the signal we measure, y, satisfies y = Bx, where x is “sparse”—x has few nonzero entries. That means y has M elements and is a linear combination of a few of the columns of B (of a few elements in the database).
The compression step in compressed sensing is very simple. You take a matrix A with M columns and L << M rows and you calculate z = Ay. (The value of L is dependent on just how many nonzero elements x can have and on the size of the database. For more information, refer to Shai Shalev-Shwartz’s paper, “Compressed Sensing: Basic results and Self-Contained Proofs.”) The vector z is the compressed measurement and has L elements rather than M elements—and that can be a big savings.
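The whole pipeline so far is easy to sketch numerically. The following NumPy snippet (the dimensions and the random database are invented for illustration; the article's own PC-side code is MATLAB) builds a database B, a sparse coefficient vector x, the measured signal y = Bx, and the compressed measurement z = Ay:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, L = 50, 30, 10                 # database size, samples per signal, compressed length
B = rng.standard_normal((M, N))      # stand-in database: N signals, M samples each

x = np.zeros(N)                      # sparse coefficient vector:
x[[3, 17]] = [1.0, -0.5]             # the signal mixes only two database entries

y = B @ x                            # the measured signal (M numbers)

A = rng.standard_normal((L, M))      # random compression matrix with L << M
z = A @ y                            # the compressed measurement (only L numbers)
```

Note the savings: z carries 10 numbers where y carries 30, and nothing about the compression step depended on which two database entries were mixed.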
As a rule, Ay = ABx = z is an equation in which there are more unknowns, elements of x, than equations. In linear algebra, we learn that such equations generally have an infinite number of solutions. We will see that sometimes it is possible to make use of the sparseness of x to find the solution of the equation for which we’re looking—the solution for which x is sparse. The big question is for which matrices AB is it possible to find x (and then y = Bx) from the compressed measurements, z. Not every matrix will do.
SIMPLE BUT IMPRACTICAL CONDITION ON AB
We know that the signal, y, satisfies y = Bx, where the vector x has very few nonzero elements. Therefore, what we really need to know is this: Given z, B, and A, is it possible to find the solution x of z = ABx for which x has very few nonzero terms? It is simple to show that if any s columns of AB are linearly independent, then any solution of z = ABx with s/2 or fewer nonzero elements is unique. (Refer to the sidebar, "Proof for the Simple but Impractical Condition.") If we know that the condition on AB holds, then we know that any solution with s/2 or fewer nonzero elements is unique—and by assumption, it is just such sparse solutions that we are searching for.
How do you pick a matrix A for which the condition will hold? In principle, as long as the dimensions have been picked in a reasonable way, taking a random matrix A ought to do the trick. Why? Because being linearly dependent is not something you expect of a random set of vectors that has the same or fewer elements than the dimension of the space in which they live. Consider two vectors in a two-dimensional space. For the two vectors to be linearly dependent, one vector has to be a constant multiple of the other. If you draw two vectors at random, the chance that they both point in the exact same direction is minuscule.
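This expectation is easy to check numerically. In the sketch below (pure NumPy, with sizes invented for illustration), ten random vectors drawn in ten-dimensional space come out linearly independent, which shows up as the matrix they form having full rank:

```python
import numpy as np

rng = np.random.default_rng(5)
vecs = rng.standard_normal((10, 10))    # 10 random vectors in 10-dimensional space

# A set of vectors is linearly independent exactly when the matrix
# they form has rank equal to the number of vectors.
print(np.linalg.matrix_rank(vecs))      # 10
```

Drawing a linearly dependent set this way is a probability-zero event; numerically, you would have to be spectacularly unlucky to see a rank below 10.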
There is, however, a fly in the ointment. It turns out that in general the only way to actually search for such a sparse solution is to perform an exhaustive search among sparse solutions—and that will generally take a very, very long time. It would seem that what we have now is a mathematical curiosity and not a practical compression technique. We can prove that this simple compression scheme does compress the information we have. The problem is that there does not seem to be any practical way to retrieve the compressed information.
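To see why the search is so painful, here is what it looks like in a NumPy sketch (sizes invented for illustration): try every possible support of at most s columns, solve a least-squares problem on each, and stop when the fit is exact.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
C = rng.standard_normal((6, 12))          # plays the role of AB
x_true = np.zeros(12)
x_true[[2, 9]] = [1.5, -2.0]              # the sparse solution we hope to recover
z = C @ x_true

def sparse_solve_exhaustive(C, z, s, tol=1e-8):
    """Try every support of at most s columns; return an x with Cx = z."""
    n = C.shape[1]
    for k in range(1, s + 1):
        for support in combinations(range(n), k):   # C(n, k) subsets: explodes fast
            cols = list(support)
            coef, *_ = np.linalg.lstsq(C[:, cols], z, rcond=None)
            if np.linalg.norm(C[:, cols] @ coef - z) < tol:
                x = np.zeros(n)
                x[cols] = coef
                return x
    return None

x_hat = sparse_solve_exhaustive(C, z, s=2)
```

Even this toy problem with 12 columns and s = 2 tries up to 78 supports; for a realistic database the number of subsets grows combinatorially, which is exactly why this brute-force decoder is impractical.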
RESTRICTED ISOMETRY PROPERTY (RIP)
There is a somewhat more restrictive property a matrix can have called the restricted isometry property and known by its acronym, RIP, which enables us to say more. (The use of this name and its acronym proves that at least one mathematician had a sense of humor.) For a given s, a matrix C satisfies the RIP if it takes all sparse vectors x (all vectors with s or fewer nonzero elements) into vectors Cx, which are (on a percentage basis) about the same size as x. In particular, if x has s or fewer nonzero elements and is not the zero vector, the matrix C cannot take x into Cx = 0 because that would produce a huge change in the size of Cx relative to the size of x. This shows that any s columns of a matrix C that satisfies the RIP are linearly independent. Thus, if AB satisfies the RIP, it satisfies our previous condition too. Matrices AB that satisfy the RIP are a subset of the matrices of the previous section.
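Verifying the RIP exactly is hard, but it is easy to watch the property at work empirically. The sketch below (NumPy, with sizes invented for illustration) draws many random s-sparse vectors and checks that a suitably scaled Gaussian matrix changes their lengths by only a modest percentage:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, s = 40, 200, 3
C = rng.standard_normal((M, N)) / np.sqrt(M)   # scaling keeps ||Cx|| close to ||x||

ratios = []
for _ in range(1000):
    x = np.zeros(N)
    support = rng.choice(N, size=s, replace=False)
    x[support] = rng.standard_normal(s)
    ratios.append(np.linalg.norm(C @ x) / np.linalg.norm(x))

print(min(ratios), max(ratios))   # both stay in a band around 1
```

Note that this sampling only gives evidence, not a proof: the RIP demands the length-preservation bound for all s-sparse vectors, not just a thousand random ones.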
What makes matrices that satisfy the RIP special? It turns out that you can find sparse solutions of z = ABx by looking for solutions of this equation for which the sum of the absolute values of the elements of x is as small as possible—that is, you do not actually have to search for sparse solutions at all. When a sparse solution of z = ABx exists, this solution is also the solution for which the sum of the absolute values of the elements of x is as small as possible—and there are many efficient computer programs on the market to solve this problem.
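One reason efficient programs exist is that the minimum-absolute-sum (l1) problem can be recast as a linear program and handed to a stock solver. The sketch below uses SciPy's linprog rather than the CVX package the article uses, and its dimensions and matrices are invented for illustration; the trick is to split x = u − v with u, v ≥ 0 so that the objective becomes linear.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
M, N = 12, 30
C = rng.standard_normal((M, N))           # plays the role of AB
x_true = np.zeros(N)
x_true[[4, 21]] = [2.0, -1.0]             # sparse ground truth
z = C @ x_true

# Split x = u - v with u, v >= 0.  At the optimum, sum(u + v) equals
# sum(|x_i|), so this linear program solves the l1-minimization problem
#     minimize sum(|x_i|)  subject to  C x = z.
c = np.ones(2 * N)
res = linprog(c, A_eq=np.hstack([C, -C]), b_eq=z, bounds=(0, None))
x_hat = res.x[:N] - res.x[N:]
```

Because x_true is itself feasible, the solver's answer can never have a larger absolute sum than x_true's, and when the matrix is well behaved the answer is the sparse solution itself.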
So how do you find a matrix A for which AB satisfies the RIP? It turns out that, once again, the answer is as follows: if B is of the correct form and you choose the dimensions right and draw the elements of A from an appropriate probability distribution, you can rest assured that the chance that AB satisfies the RIP is high.
COMPRESSED SENSING APPLICATION
If you choose the columns of B to be the responses of different systems to some input, then providing this input to one of the systems or to some linear combination of a small number of the systems will lead to an output, y, that is a linear combination of the columns of B. With an appropriately chosen A, it should be possible to compress the measurements of the output and then decompress the measurements and find the sparse vector x. Looking at the nonzero elements of x allows you to determine which columns of B are used to produce y, and this allows you to identify the components of the system being examined.
We wanted to use compressed sensing to allow us to identify a system. We chose to make the system to be identified a third-order digital low-pass Butterworth filter or a combination of such filters, and we wanted to use the techniques of compressed sensing to allow us to identify the cutoff frequency (or frequencies) of the system. We implemented the system to be identified on an Analog Devices ADuC841, which is a microcontroller with on-board A/D and D/A converters. We worked with simple low-pass filters and used compressed sensing to determine the filter’s bandwidth.
We programmed a second, totally independent ADuC841 to measure the step response of the filter and compress it. The second microcontroller is connected to the system to be identified; it outputs a step function to that system (the filter) and measures the filter's response to the step function. Refer to Photo 1 to see both the signal output by the second ADuC841 and the signal it measured.
Because we knew that the step response had to be a linear combination of the step responses of third-order, low-pass Butterworth filters, we made use of compressed sensing to compress the measurements made on the second ADuC841. Our database, B, was just one hundred step responses of third-order digital low-pass Butterworth filters with various cutoff frequencies. Our compression matrix, A, was a 15 × 100 matrix that we chose randomly. But since we had no convenient way to make sure that A satisfied the RIP (and checking this property is generally hard to do), we ran the system several times to check that the decompression step actually worked.
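The measurement side of our setup can be sketched in Python/SciPy (the article does not spell out how the 100 cutoff frequencies were spaced, so the evenly spaced 10-to-100 Hz grid below is an assumption, as is the specific random A):

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 1350.0                               # sample rate used in the article (1/1350 s)
n_samples = 100
cutoffs = np.linspace(10, 100, 100)       # assumed grid of candidate bandwidths, in Hz
step_in = np.ones(n_samples)

# Each column of B is the step response of a third-order low-pass
# Butterworth filter with one of the candidate cutoff frequencies.
B = np.column_stack(
    [lfilter(*butter(3, fc / (fs / 2)), step_in) for fc in cutoffs]
)

rng = np.random.default_rng(4)
A = rng.standard_normal((15, 100))        # randomly chosen 15 x 100 compression matrix

y = B[:, 42]                              # pretend we measured the filter behind column 42
z = A @ y                                 # the 15 numbers sent to the PC
```

On the real system, y comes from the A/D converter rather than from B itself, but the compression is the same single matrix-vector product, well within reach of a small microcontroller.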
We worked with low-pass filters whose bandwidths ranged from 10 to 100 Hz. When measuring the step response, we took 100 equally spaced measurements of the step response (the sample time was 1/1350 s) and using A compressed them to 15 numbers. We then transmitted these 15 numbers to a PC using our microcontroller’s UART.
On the PC side, we took the measurements and passed them along to a MATLAB program that performed the decompression. The MATLAB program made use of the freely downloadable CVX code (http://cvxr.com/cvx/). CVX provides simple commands to find the value of x that satisfies z = ABx and for which the sum of the absolute values of the elements of x is as small as possible (see Listing 1). Given this value and B, you can recover the original measured values by calculating Bx (see Figure 1).
Listing 1—The code that makes use of the commands that CVX provides. Note that A_2_CVX = AB.

cvx_begin
    M = size(A_2_CVX,2);
    variable x_from_cvx(M)
    minimize( norm(x_from_cvx,1) )
    subject to
        A_2_CVX*x_from_cvx == b;
cvx_end
In our case, we did not really want to see the step response of the filter; we wanted to identify the filter being used, so calculating Bx was not the main point. Recall that the columns of B are just the step responses of different filters. Assuming that a low-pass filter was implemented and that one of the columns of B is the step response of a filter with the same bandwidth, x should have only a single nonzero element, and the index of the single nonzero element of x provides you with the information you are really looking for—the location of the column within B and, hence, the bandwidth of the filter. If the filter was a band-pass filter, then the indices of the two nonzero elements should give the upper and lower frequencies of the pass-band.
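Reading off the answer is then only a few lines of work. In the Python snippet below, x_hat is a stand-in for the decompressed vector returned by CVX, and the frequency grid is an assumption; in practice you also threshold away small numerical noise before declaring an element "nonzero":

```python
import numpy as np

cutoffs = np.linspace(10, 100, 100)       # assumed cutoff frequency of B's column i
x_hat = np.zeros(100)
x_hat[42] = 0.998                         # stand-in for the vector CVX returns

support = np.flatnonzero(np.abs(x_hat) > 1e-3)   # indices of the nonzero elements
print([float(cutoffs[i]) for i in support])      # the identified bandwidth(s), in Hz
```

A single surviving index identifies a low-pass filter's bandwidth; two surviving indices would bracket a pass-band.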
The system works nicely, but there are several practical issues that need to be considered. Let’s review.
The first issue is sensitivity to noise. Once you implement something in hardware and make real measurements, you know that there will be some noise in all your measurements. How do the techniques of compressive sensing deal with noise? The short answer is that it can be proved that small amounts of noise lead to small changes in the solution. (Refer to the aforementioned Shalev-Shwartz paper.) This, of course, is very good news.
The second practical issue is choosing the members of the database—that is, choosing the columns of B. If the columns of B are very similar to one another, then small amounts of noise could take a signal that should have looked like one column and make it look like a different column. It is important to choose columns that are not too similar to one another. In order to make sure that the columns of our matrix were not too similar to one another, we considered filters whose cutoff frequencies were sufficiently different from one another that their step responses looked quite different.
The third issue to consider is avoiding negative voltages. The impulse responses of the filters we consider—third-order, low-pass Butterworth filters—are negative at times, and the A/D and the D/A on the ADuC841 do not allow negative voltages. We could have dealt with this problem by adding an offset voltage to our signals, but there was a simpler solution (as long as we only considered low-pass filters). The filters’ impulse responses are negative at times, but their step responses are always positive. In order to keep things as simple as possible, we measured the filters’ step responses and not their impulse responses.
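This claim is easy to confirm numerically. The SciPy sketch below (200 samples and a 100 Hz cutoff chosen arbitrarily from the article's range) computes both responses of the same filter and compares their minima:

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 1350.0
b, a = butter(3, 100 / (fs / 2))          # third-order low-pass, 100 Hz cutoff

impulse_in = np.zeros(200)
impulse_in[0] = 1.0
h = lfilter(b, a, impulse_in)             # impulse response: rings below zero
step = lfilter(b, a, np.ones(200))        # step response: never dips negative

print(h.min(), step.min())
```

The impulse response's ringing takes it below zero, while the step response rises from zero toward one and stays nonnegative, which is why measuring step responses let us avoid adding an offset voltage.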
Compressed sensing is a new compression technique that enables you to compress data by calculating several linear combinations of the original measurements. The simple compression step is appropriate for use in systems with limited computing resources—such as sensors with small embedded microcontrollers. The decompression step is much more computationally intensive, and it is carried out by a much more serious processor.
We implemented compressed sensing and used it to identify the bandwidth of third-order digital low-pass Butterworth filters. The system worked nicely and allowed us to explore the practical side of this new compression technique.
PROOF FOR THE SIMPLE BUT IMPRACTICAL CONDITION
Suppose that there were two solutions of z = ABx, each with s/2 or fewer nonzero elements. Their difference has no more than s nonzero elements and satisfies AB(difference) = 0. If the difference were not zero, that would give a linear combination of not more than s columns of AB equal to zero—but by assumption, no such combination exists. Thus, the difference must be the zero vector, and the two solutions were really the same.
 CVX Research, Inc., “CVX: Matlab Software for Disciplined Convex Programming,” Ver. 2.1, 2015, http://cvxr.com/cvx/.
 S. Shalev-Shwartz, “Compressed Sensing: Basic results and Self-Contained Proofs,” 2009, www.cs.huji.ac.il/~shais/compressedSensing.pdf.
Analog Devices, “ADuC841,” available at www.analog.com/en/products/processors-dsp/analog-microcontrollers/8052-core-products/aduc841.html.
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • JANUARY 2016 #306