Compressive sensing (or compressed sensing) is a technique for sampling certain classes of signals at a rate below the Nyquist rate. Given a discrete signal x with N components, if this signal is S-sparse, meaning only S of its components are non-zero, you can use a matrix A of dimensions M x N, with M < N, and obtain the samples y = Ax. To recover the original signal, you have to solve the optimization problem:
min_b ||b||_1 subject to y = Ab,
where ||b||_1 is the l1 norm of the vector b. It has been shown that with roughly M > S log(N/S) measurements, the solution of the above optimization is your signal x.
However, be aware that the matrix A must obey certain conditions, such as the Restricted Isometry Property (RIP), and the signal x must be sparse, or it must be possible to make it sparse in some basis.
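Here is a minimal sketch of that sample-and-recover loop in Python (NumPy/SciPy), assuming a random Gaussian A (which satisfies RIP with high probability) and solving the l1 problem as a linear program; the sizes N, M, and S are arbitrary choices for illustration:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

N, M, S = 256, 64, 8              # signal length, measurements, sparsity

# An S-sparse signal: S randomly placed non-zero entries
x = np.zeros(N)
idx = rng.choice(N, S, replace=False)
x[idx] = rng.standard_normal(S)

# Random Gaussian measurement matrix and the M compressed samples y = Ax
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ x

# l1 minimization as a linear program: write b = u - v with u, v >= 0,
# minimize sum(u) + sum(v) subject to A(u - v) = y
c = np.ones(2 * N)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:N] - res.x[N:]

print("max reconstruction error:", np.max(np.abs(x_hat - x)))
```

With M well above S log(N/S), the recovered x_hat matches x up to solver precision.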
A good collection of papers on the subject can be found at dsp.rice.edu/cs, and a good source of up-to-date information is http://nuit-blanche.blogspot.com/ .
If you lean toward the mathematics, you will prefer the papers by Donoho, Candès, and Tao; if you lean toward engineering, you will prefer the papers by Richard Baraniuk.
If your signals are not sparse, you first have to write them in a sparse, or almost sparse, representation (few large coefficients and many non-zero but small coefficients), for example by applying a transform, and then work with that sparse representation. At the receiver you recover the sparse representation and apply the inverse transform to reconstruct the signal.
A typical real-world example is images, where people normally apply a wavelet transform to obtain a sparse representation.
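To make this concrete, here is a sketch of the same recovery applied to a signal that is dense in time but sparse in a transform domain; I use the DCT as a stand-in for the wavelet transform just to keep the example one-dimensional:

```python
import numpy as np
from scipy.fft import idct
from scipy.optimize import linprog

rng = np.random.default_rng(1)

N, M = 256, 80
# A signal that is NOT sparse in time but IS sparse in the DCT domain:
# a sum of a few cosines.
c_true = np.zeros(N)
c_true[[3, 17, 40]] = [1.0, -0.7, 0.4]
x = idct(c_true, norm="ortho")          # dense time-domain signal

# Psi maps DCT coefficients back to the time domain, so x = Psi @ c
Psi = idct(np.eye(N), norm="ortho", axis=0)

# Compressed samples of the dense signal
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ x

# Recover the sparse coefficients c from y = (A @ Psi) c with the same l1 LP
B = A @ Psi
cost = np.ones(2 * N)
res = linprog(cost, A_eq=np.hstack([B, -B]), b_eq=y,
              bounds=(0, None), method="highs")
c_hat = res.x[:N] - res.x[N:]

# Inverse transform reconstructs the original signal
x_hat = idct(c_hat, norm="ortho")
print("max error:", np.max(np.abs(x_hat - x)))
```

The only change from the sparse-signal case is that the measurement matrix is composed with the sparsifying basis, A @ Psi, and the inverse transform is applied at the end.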
I think the Wired article does a good job of explaining it. Not a lot of technical details but a good overview: http://www.wired.com/magazine/2010/02/ff_algorithm/