Ivan Selesnick

Research

Non-convex Penalties for Sparse Least Squares (publications)

Sparse representations are used in applications such as noise reduction, deblurring, filling in missing data, and tomography. Many algorithms for these applications require the calculation of a sparse approximate solution to a system of linear equations y = Ax. A common approach for this problem is to minimize a suitably prescribed cost function comprising a quadratic data fidelity term and a non-differentiable penalty term. Often an L1 norm penalty term is used because among convex function the L1 norm promotes sparsity most effectively. However, the L1 norm tends to underestimate the true values.

In this research, we aim to develop new penalty functions for sparse least squares that (1) improve the accuracy of solutions in comparison with the L1 norm and (2) maintain the convexity of the sparse least squares cost function. For this purpose, we have developed parametric sparsity-inducing non-convex penalties, including non-separable multivariate penalties. The proposed non-convex penalties are designed so that the total cost function to be minimized (comprising data fidelity and penalty terms) is convex. Hence, the cost function has no non-optimal local minima and a global minimizer can be reliably otained using convex optimization.

Applications of Sparse Signal Processing (publications)

We have developed signal processing algorithms for various types of data and applications: machine fault monitoring; biomedical time-series and imaging including eye-movement analysis, near infrared spectroscopy (NIRS), optical coherence tomography (OCT), magnetic resonance imaging (MRI), digital x-ray imaging, electrocardiography (ECG), and electroencepholgray (EEG); cochlear implants; chromatography; and radar.

Demonstration of the SEDA algorithm for the suppression of background speech. The algorithm is implemented for real-time operation on an iPad. In the video, the noise reduction algorithm is alternately turned on and off. The algorithm is developed and implemented by Roozbeh Soleymani.

Resonance-based Signal Analysis (publications)

Many signals arising from physiological and physical processes are not only non-stationary but also posses a mixture of sustained oscillations and non-oscillatory transients that are difficult to disentangle by linear methods. Examples of such signals include speech, biomedical, and geophysical signals. For example, EEG signals contain rhythmic oscillations (alpha waves, etc) but they also contain transients due to measurement artifacts and non-rhythmic brain activity. This research program involves the development and application of new algorithms designed to decompose such signals into 'resonance' components - a high-resonance component being a signal consisting of multiple simultaneous sustained oscillations; a low-resonance component being a signal consisting of non-oscillatory transients of unspecified shape and duration. While frequency components are straightforwardly defined and can be obtained by linear filtering, resonance components are more difficult to define and procedures to obtain resonance components are necessarily nonlinear. Matlab software for performing the decomposition is available in the TQWT Matlab toolbox on the TQWT page.

Denoising (publications)

Algorithms for effective noise reduction can often be derived starting with a suitable signal model and formulating an appropriate cost function to be minimized. Different applications and types of signals are best described by different models and cost functions. Formulating a model that well matches the data is important in the development of noise reduction algorithms. At the same time, the cost function should be one that can be reliably minimized, which entails the study of optimization algorithms.

The sparsity-assisted signal smoothing (SASS) method models a signal as the sum of a low-frequency signal and a signal with a sparse order K derivative (i.e., with jump discontinuities in its order K-1 derivative). The method combines conventional low-pass filtering and higher-order total variation denoising. This method generalizes the LPF/TVD method which assumes the signal itself has jump discontinuities, and the polynomial approximation/total variation (PATV) method which assumes the signal is well approximated by a polynomial. The approach is generalized in the transient artifact reduction algorithm (TARA).

Sparsity can be an effective approach to model some types of signals. However, some signals are better described by group sparsity rather than pure sparsity. (The significant values tend to form clusters or 'groups'.) For example, the spectrogram of speech signal exhibits group sparsity (although the exact group boundaries are unknown a priori). The overlapping group sparsity (OGS) approach is suitable for this type of signal. The OGS method was originally formulated in terms of convex regularization. The non-convex OGS is an improved version which uses non-convex regularization (but maintains the convexity of the cost function). Group sparsity can be combined with other techniques, as in group sparse total variation denoising.

Usually, sparsity is useful only when used in conjunction with a suitable transform, because most signals are not themselves sparse. For many signals, wavelet transforms are effective at providing a sparse signal representation. Scalar wavelet-domain thresholding tends to introduce artifacts, bivariate thresholding aims to improve upon scalar thresholding. A more advanced method is to combine wavelet and total variation (WATV).

Wavelet Transforms (publications)

The simplest wavelet transform for multidimensional data is the critically-sampled separable wavelet transform. This conventional wavelet transform uses 1D wavelet transforms in each dimension. However, the performance of wavelet-based image processing algorithms can be consistently improved by using more advanced wavelet transforms. There are several advances in the design of wavelet transforms that lead to substantially improved performance. For example, the undecimated wavelet transform, the steerable pyramid, and curvelet transform all give improved results in applications involving multidimensional data. We have worked on the development of several types of wavelet transforms. We have built upon the dual-tree complex transform, an geometrically oriented wavelet transform shown to be highly beneficial for multidimensional signal processing. This transform has several advantages over the conventional multidimensional wavelet transform: (1) near shift invariance, (2) directional selectivity, and (3) improved energy compaction. We have also developed other expansive wavelet transforms (ones that transforms an N-point signal into M expansion coefficients with M > N) that possess properties not possible with a critically-sampled transform.

Design of Wavelet Frames

Tunable Q-factor wavelet transform (TQWT)
Rational-dilation wavelet transforms (RADWT)
Complex Wavelet Packets. Paper (pdf). Software (zip).
A higher-density discrete wavelet transform
Symmetric wavelet frames
The double-density dual-tree DWT
The double-density DWT
Smooth wavelet tight frames with zero moments

Design of Complex Wavelets

Hilbert transform pairs of wavelet bases

Design of Multiwavelets

Other Wavelet Design

Filter Design (publications)

Probability Models

Non-Gaussian probability models are important for achieving high quality results in wavelet-based image and video processing. Such probability models are the starting point for the derivation of effective non-linear processing rules. The simplest method for wavelet-based denoising is a thresholding rule. More advanced approaches begin with a probability model for the wavelet coefficients and then obtain an estimator via Bayesian estimation techniques, such as the MAP or MMSE estimator. For denoising, several issues then arise: (1) What kind of probability distributions accurately represent the wavelet coefficients? (2) How do we estimate the parameters of that distribution from noisy data? (3) How do we estimate the noise-free wavelet coefficients from the noisy wavelet coefficients using this probability model?

We are developing non-Gaussian probability models and algorithms for wavelet-based denoising. While the multivariate Gaussian model is relatively easy to work with, it is usually not accurate for wavelet-domain modeling natural signals. This is because wavelet coefficients of natural imagery are generally (1) heavy-tailed, (2) approximately uncorrelated, and (3) strongly dependent on adjacent coefficients (in scale and spatially). These are characteristics the multivariate Gaussian can not achieve. We have developed fast and effective non-linear denoising algorithms based on multivariate Laplacian distributions and we are now working with mixtures of such distributions.

Based on this research, we have developed software for denoising digital X-ray images that is used in a dental X-ray sensor manufactured by AFP Inc, a company that sells X-ray machines and related imaging equipment. This software implements a fast real-time algorithm for the reduction of signal-dependent noise in digital dental X-ray images using a wavelet-based nonlinear signal processing algorithm.

Video Coding using a 3-D Motion-Selective Wavelet Transform

The transmission, storage, and related processing of video are important technical problems in today's society. Progress related to the efficient representation of video is of increasing importance for personal mobile electronics and for Internet applications. Video coding algorithms are often based on motion-compensated predictive coding. This project involves an original approach for video representation that is based on recent developments in wavelet theory: recently developed transforms designed to overcome basic problems that degrade the performance of the wavelet transform when it is applied to multidimensional data using the standard separable implementation.

The conventional separable 3-D wavelet transform is rarely used for video compression because it mixes 3-D orientations in its subbands; this artifact reduces the effectiveness of the separable transform for providing an efficient representation of video. However, the new 3-D wavelet transform is free of the mixing artifact and gives a meaningful multi-scale decomposition for video. With the new transform, it is more likely that the multiresolution framework, which has proved very effective for image compression and efficient feature extraction, can also be effectively applied to video representation. The new transform isolates motion in different directions in separate subbands, so the direction of motion can be inferred to some degree from the wavelet coefficients.

This research is in collaboration with Professor Yao Wang.