CNRS Défi Imag’In Cavalieri
CAlcul des Variations pour L'Imagerie, l'Édition et la Recherche d'Images
Workshop on Optimal Transport and Optimization in Imaging
Tuesday 11th
9h15-9h30: Welcome
9h45-10h30: C.-B. Schönlieb and Y. Boink. Combined modelling of optimal transport and segmentation.
Abstract
For studying vascular structures in 4D biomedical imaging, it is of
great importance to automatically determine the velocity of flow in
video sequences, for example blood flow in vessel networks. In this
thesis new optimal transport models focusing on direction and
segmentation are investigated to find an accurate displacement between
two density distributions. By incorporating fluid dynamics
constraints, one can obtain a realistic description of the
displacement. With an a priori given segmentation of the network
structure, transport models can be improved. However, a segmentation
is not always known beforehand. Therefore, in this work a joint
segmentation-optimal transport model has been described. Other
contributions are the ability of the model to allow for inflow or
outflow and the incorporation of anisotropy in the displacement cost.
For the problem, a convex variational method has been used and
primal-dual proximal splitting algorithms have been implemented.
Existence of a solution of the model has been proved. The framework
has been applied to synthetic vascular structures and to real data
obtained through a collaboration with the applied mathematics
department and the hospital in Cambridge.
This is joint work with Yoeri Boink and Christoph Brune.
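The abstract mentions primal-dual proximal splitting; below is a
minimal sketch of a generic Chambolle-Pock-type iteration for
problems of the form min_x F(Kx) + G(x). It illustrates the class of
methods named in the abstract rather than the authors' specific
model; the proximal operators, step sizes and iteration count are
placeholders.

import numpy as np

def primal_dual(K, prox_tau_G, prox_sigma_Fstar, x0, y0,
                tau, sigma, theta=1.0, n_iter=200):
    """Generic primal-dual proximal splitting for min_x F(K x) + G(x).
    prox_tau_G and prox_sigma_Fstar are the user-supplied proximal
    operators of tau*G and sigma*F* (the convex conjugate of F).
    Convergence needs tau * sigma * ||K||^2 <= 1."""
    x, y = x0.copy(), y0.copy()
    x_bar = x0.copy()
    for _ in range(n_iter):
        y = prox_sigma_Fstar(y + sigma * (K @ x_bar))   # dual ascent step
        x_new = prox_tau_G(x - tau * (K.T @ y))         # primal descent step
        x_bar = x_new + theta * (x_new - x)             # over-relaxation
        x = x_new
    return x, y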
Coffee break
11h00-11h45: N. Bonneel. Wasserstein Barycentric Coordinates.
Abstract
This talk will present a notion of barycentric coordinates for
histograms via optimal transport. This is formulated as the problem
of finding the Wasserstein barycenter of several basis histograms
that best fits an input histogram according to some arbitrary loss
function. A Wasserstein barycenter is a histogram in-between the
basis histograms according to the optimal transport metric; we use
this geometry to project an input histogram onto the set of possible
Wasserstein barycenters. We derive an efficient and robust
algorithm by automatic differentiation of the Sinkhorn
iterative procedure.
We illustrate our algorithm with applications in computer graphics,
such as color grading an input image using a database of
photographs, or filling in missing values in captured reflectance
functions or geometries based on similar exemplars.
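As background for the automatic differentiation mentioned above, here
is a minimal sketch of the plain Sinkhorn loop for entropically
regularized optimal transport between two histograms. The names and
the fixed iteration count are illustrative, and the differentiation
machinery of the talk is not shown.

import numpy as np

def sinkhorn(a, b, C, eps, n_iter=500):
    """Entropic OT between histograms a and b (nonnegative, summing
    to one) with cost matrix C and regularization strength eps.
    Returns a transport plan whose marginals are approximately a, b."""
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):       # alternate diagonal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]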
11h45-12h15: E. Cazelles. Regularized Wasserstein barycenters.
Abstract
The concept of barycenter in the Wasserstein space allows the
definition of a notion of Fréchet mean of a set of probability
measures. However, depending on the considered data, such
barycenters may be irregular. In this work, we thus introduce a
convex regularization of Wasserstein barycenters for random measures
supported on ℝ^d. We prove
the existence and uniqueness of such barycenters for a large class
of regularizing functions. A stability result of regularized
barycenters in terms of the Bregman divergence associated to the
convex regularization term is also given. Next, we study the case of
data made of i.i.d. random probability measures. In particular, we
prove the convergence in Bregman divergence of the regularized
empirical barycenter of a set of n random probability measures
towards its population counterpart, and we discuss its rate of
convergence. This approach is shown to be appropriate for the study
of discrete or absolutely continuous random measures. We then focus
on the analysis of probability measures supported on the real line.
In this setting, we propose an efficient minimization algorithm
based on accelerated gradient descent for the computation of
regularized Wasserstein barycenters. This approach is finally
applied to the statistical analysis of simulated and real data sets.
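In symbols, and assuming the standard formulation of this line of
work, the regularized barycenter of measures \nu_1, ..., \nu_n is

\[
\hat{\mu} \in \operatorname*{arg\,min}_{\mu \in \mathcal{P}_2(\mathbb{R}^d)} \ \frac{1}{n} \sum_{i=1}^{n} W_2^2(\mu, \nu_i) + \gamma E(\mu),
\]

where W_2 is the quadratic Wasserstein distance, E the convex
regularizing functional and \gamma > 0 its weight.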
Lunch break
14h00-14h45: F. Iutzeler. Modified fixed-point iterations and applications to randomized and accelerated optimization algorithms.
Abstract
Monotone operators have proven to be very interesting mathematical
objects and have enabled the derivation of efficient algorithms. In
this talk, we will rely on simple fixed-point iterations of an
operator with a contraction property that ensures monotone
convergence to some fixed point. Then, we will see how deterministic
and stochastic variations of these iterations make it possible to
derive randomized or accelerated versions of popular algorithms such
as the proximal gradient method or ADMM. More precisely, we will
focus on i) randomization to develop stochastic optimization
algorithms; and ii) inertia to accelerate the convergence.
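A minimal sketch of the kind of iteration in question: a relaxed
(Krasnosel'skii-Mann) fixed-point step with an optional inertial
term. The operator T and the parameter values are placeholders, and
the inertial variant is only illustrative; the talk's precise
conditions on the parameters are not reproduced here.

import numpy as np

def inertial_km(T, x0, alpha=0.5, beta=0.3, n_iter=1000):
    """Relaxed fixed-point iteration for an operator T with a
    contraction-type property. beta = 0 gives the plain
    Krasnosel'skii-Mann step x+ = (1 - alpha) x + alpha T(x);
    beta > 0 adds inertia by extrapolating along the last step."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iter):
        z = x + beta * (x - x_prev)               # inertial extrapolation
        x_prev, x = x, (1 - alpha) * z + alpha * T(z)
    return x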
14h45-15h30: C. Boyer. Adapting to unknown noise level in super-resolution.
Abstract
We study sparse spikes deconvolution over the space of
complex-valued measures when the input measure is a finite sum of
Dirac masses. We introduce a new procedure to handle the spike
deconvolution when the noise level is unknown. Prediction and
localization results will be presented for this approach. Some
insight into the probabilistic tools used in the proofs will be
given as well.
15h30-16h00: C. Poon. Geometric properties of solutions to the total variation denoising problem.
Abstract
Total variation (TV) denoising has been extensively studied in
imaging sciences since its introduction by Rudin, Osher and Fatemi
in 1992. However, the majority of theoretical works on TV
regularization are actually concerned with solutions to total
variation denoising in the absence of noise. For instance, it is
known that the jump set of the regularized solution is contained
in the jump set of the original data, but this knowledge is fairly
meaningless in the presence of noise since the noisy function can
introduce arbitrarily many new discontinuities. Furthermore, works
that consider the impact of noise on TV regularized solutions
typically derive integral estimates such as L^2 error bounds or
bounds on the total variation of the solution. However, such
estimates do not inform on the proximity of the gradient support of
the regularized solution to that of the clean function.
This talk is concerned with the impact of noise on the regularized
solutions. In particular, we consider stability in the gradient
support of solutions and address the following question: Given a
small neighbourhood around the support of Df, when will the gradient
support of the regularized solution be contained inside this
neighbourhood if the noise function has sufficiently small L^2 norm?
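For reference, the denoising problem in question is the
Rudin-Osher-Fatemi model

\[
\min_u \ \frac{1}{2} \int_\Omega (u - f)^2 \, dx + \lambda \, |Du|(\Omega),
\]

where f is the noisy datum, \lambda > 0 the regularization weight
and |Du|(\Omega) the total variation of u.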
Coffee break
16h30-17h15: B. Wirth. Optimal design of transport networks.
Abstract
Several applications in biology and engineering are concerned with
the optimal transportation of substances from source locations to
sink locations. As opposed to classical optimal transport, models
for transport networks take into account that it is more efficient
to transport material in bulk. The resulting optimal transport
networks typically have a branching structure. We discuss
different model formulations and their equivalence as well as the
geometry of (almost) optimal networks, which can be analyzed by
proving energy scaling laws in the regime of small preference for
bulk transport.
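A standard way to encode this preference for bulk transport, as in
the Gilbert-Steiner branched transport model (which may differ in
details from the formulations discussed in the talk), is to charge
each edge e of the network subadditively in the mass m_e it carries:

\[
E = \sum_e m_e^{\alpha} \, \ell_e, \qquad \alpha \in [0,1),
\]

where \ell_e is the length of e. Since
(m_1 + m_2)^\alpha \le m_1^\alpha + m_2^\alpha, merging flows is
cheaper than keeping them separate, which is what produces the
branching structure.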
17h15-17h45: L. Chizat. Scaling algorithm for unbalanced optimal transport.
Abstract
This talk has three aims: (i) to describe the so-called unbalanced
optimal transport problem, (ii) to introduce a scaling algorithm
which generalizes Sinkhorn's algorithm and solves this problem with
a linear convergence rate, and (iii) to showcase applications
which range from shape interpolation and colour transfer to the
simulation of very degenerate evolution PDEs.
This is joint work with Gabriel Peyré, Bernhard Schmitzer and
François-Xavier Vialard.
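A minimal sketch of such a scaling iteration, for the case where
both marginal constraints are relaxed by Kullback-Leibler penalties
of strength rho; the exponent rho/(rho + eps) below softens the
Sinkhorn scalings, and rho -> infinity (exponent 1) recovers the
balanced loop sketched earlier. Names and the fixed iteration count
are illustrative.

import numpy as np

def unbalanced_sinkhorn(p, q, C, eps, rho, n_iter=500):
    """Scaling iterations for entropic unbalanced OT where the hard
    marginal constraints are replaced by KL penalties of strength rho
    (a sketch under these assumptions, not the paper's full scheme,
    which covers more general divergences)."""
    K = np.exp(-C / eps)
    a = np.ones_like(p)
    b = np.ones_like(q)
    w = rho / (rho + eps)         # softened-scaling exponent
    for _ in range(n_iter):
        a = (p / (K @ b)) ** w
        b = (q / (K.T @ a)) ** w
    return a[:, None] * K * b[None, :]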
Wednesday 12th
9h45-10h30: B. Lévy. Semi-discrete Optimal Transport in 3D: A Numerical Algorithm and Some Applications.
Abstract
Semi-discrete algorithms seem to be a promising avenue for
developing efficient numerical methods for solving the
Monge-Ampère equation: Optimal Transport between a piecewise
linear density and a sum of Dirac masses can be directly
translated into a numerical algorithm that maximizes a concave
function [Alexandrov, Aurenhammer, McCann & Gangbo, Mérigot].
The gradient and Hessian of this objective function involve
integrals over the intersection between a Laguerre diagram and the
tetrahedral mesh that supports the input source piecewise linear
density. I will present an efficient algorithm to compute these
integrals, as well as its use by a Newton-type solver
[Kitagawa, Mérigot, Thibert], and some applications in fluid
dynamics [Gallouët, Mérigot].
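Concretely, with a source density \mu and a target
\nu = \sum_i \nu_i \delta_{y_i}, the concave functional maximized
over the weight vector \psi can be written (a standard formulation
whose sign conventions may differ from the talk's)

\[
g(\psi) = \sum_i \psi_i \nu_i + \int \min_i \big( \|x - y_i\|^2 - \psi_i \big) \, d\mu(x),
\]

with partial derivatives
\partial g / \partial \psi_i = \nu_i - \mu(\mathrm{Lag}_i(\psi)):
these are exactly the Laguerre-cell integrals mentioned above, and a
Newton-type solver also needs the Hessian, made of integrals over
the boundaries between adjacent cells.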
Coffee break
11h00-11h45: A. Gramfort. Optimal transport for statistical problems in neuroimaging.
Abstract
Knowing how the human brain is anatomically and functionally
organized at the level of a group of healthy individuals or
patients is the primary goal of neuroimaging research. Yet a
number of statistical challenges emerge when processing
functional brain imaging data such as functional MRI (fMRI),
electroencephalography (EEG) or magnetoencephalography (MEG).
One of them is computing an average of brain imaging data
defined over a voxel grid or a triangulation in order to
estimate a prototypical brain activation map for the
population. The problem is that data are large, the geometry
of the brain is complex and the between subjects variability
leads to spatially or temporally non-overlapping activation
foci. To address the problem of variability, data are commonly
smoothed before averaging, but this goes against the objective
of brain research to achieve better spatiotemporal resolution.
In this talk, I will show how using optimal transport and
Wasserstein barycenters one can average fMRI and MEG brain
activations without blurring the results. A specificity of the
approach is its ability to handle non-normalized data.
In a second part of the talk, I will present some work in
progress where we use optimal transport with entropic
smoothing and some capacity constraint to address the problem
of domain adaptation (DA) in machine learning. In biomedical
applications, shifts in distribution between training data and
test data are quite common. For example, test data could
come from a different machine or from a very different
cohort of patients. This is the problem that DA aims to solve,
and we propose to use optimal transport to do so.
11h45-12h15: K. Modin. The polar decomposition as the limit of a lifted entropy gradient flow.
Abstract
The work of Yann Brenier shows that optimal mass transport
(OMT) in the L^2 sense is intrinsically tied to
polar decompositions. As a toy problem, one may consider OMT
in the category of multivariate Gaussian densities and the
associated polar decomposition A=PQ of real matrices. In this
talk I show that the P component can be obtained as the limit
of a gradient flow with respect to a lifted relative entropy
functional, thereby providing a new, geometric proof of the
polar decomposition of matrices. I shall also present a
corresponding infinite-dimensional lifted entropy gradient
flow on the space of convex functions. Here, the first
component in the polar decomposition of maps is an
equilibrium. Furthermore, I give a result hinting that the
infinite-dimensional flow has a unique limit in the smooth
category if the destination marginal is log-convex.
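For context, the decomposition targeted by the gradient flow can
also be computed non-iteratively from the singular value
decomposition; the sketch below follows this classical route (it is
not the entropic flow of the talk).

import numpy as np

def polar(A):
    """Polar decomposition A = P Q of a square real matrix, with P
    symmetric positive semi-definite and Q orthogonal, via the SVD
    A = U diag(s) V^T: take P = U diag(s) U^T and Q = U V^T."""
    U, s, Vt = np.linalg.svd(A)
    P = U @ np.diag(s) @ U.T      # symmetric positive factor
    Q = U @ Vt                    # orthogonal factor
    return P, Q                   # A is recovered as P @ Q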
Lunch break
14h00-14h45: E. Chouzenoux. Convergence analysis of a stochastic majorize-minimize memory gradient algorithm.
Abstract
Stochastic approximation techniques play a prominent role in
solving many large scale problems encountered in machine
learning or image/signal processing. In these contexts, the
statistics of the data are often unknown a priori or their
direct computation is too intensive, and they have thus to be
estimated online from the observations. For batch optimization
of an objective function being the sum of a data fidelity term
and a penalization (e.g. a sparsity promoting function),
Majorize-Minimize (MM) methods have recently attracted much
interest since they are fast, highly flexible, and effective
in ensuring convergence. The goal of this work is to show how
these methods can be successfully extended to the case when
the data fidelity term corresponds to a least squares
criterion and the cost function is replaced by a sequence of
stochastic approximations of it. In this context, we propose
an online version of an MM subspace algorithm and we establish
its convergence by using suitable probabilistic tools. We also
provide new results on the convergence rate of this kind of
algorithm. Numerical results illustrate the good practical
performance of the proposed algorithm associated with a memory
gradient subspace, when applied to both non-adaptive and
adaptive linear system identification scenarios.
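To illustrate the MM principle in its simplest batch, deterministic
form (the stochastic approximation and the memory-gradient subspace
of the talk are omitted), here is a reweighted-least-squares sketch
for a least-squares criterion with a smooth sparsity-promoting
penalty; names and parameter values are illustrative.

import numpy as np

def mm_least_squares(A, y, lam=0.1, delta=1e-8, n_iter=50):
    """Majorize-Minimize iterations for
        0.5 * ||A x - y||^2 + lam * sum_i sqrt(x_i^2 + delta):
    at each step the penalty is majorized by a quadratic tangent at
    the current iterate, and the resulting weighted least-squares
    problem is solved exactly."""
    x = np.zeros(A.shape[1])
    AtA, Aty = A.T @ A, A.T @ y
    for _ in range(n_iter):
        w = np.sqrt(x**2 + delta)                 # majorant curvatures
        x = np.linalg.solve(AtA + lam * np.diag(1.0 / w), Aty)
    return x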
14h45-15h15: P. Tan. Accelerated Alternating Descent Methods for Dykstra-like Problems.
Abstract
In this talk, we present an alternating descent scheme for
problems involving nonsmooth objectives with a quadratic
coupling of the variables. Our algorithm performs several
descent steps in each variable, which improves
performance compared to a single step. Thanks to a FISTA-like
trick, this scheme can also be accelerated. An application of
this work is the implementation of a fast parallel solver for
the proximity operator of the Total Variation for color
images. In this case, the method can be interpreted as an
inexact scheme.
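The FISTA-like trick referred to above is the extrapolation step of
the fast proximal gradient method. Below is a generic sketch for
min_x f(x) + g(x) with L-Lipschitz gradient of f (not the talk's
parallel TV solver); grad_f, prox_g and L are supplied by the user.

import numpy as np

def fista(grad_f, prox_g, L, x0, n_iter=200):
    """Fast proximal gradient (FISTA): a proximal gradient step at an
    extrapolated point z, with the classical momentum schedule t_k."""
    x = x0.copy()
    z = x0.copy()
    t = 1.0
    for _ in range(n_iter):
        x_new = prox_g(z - grad_f(z) / L)                  # prox-gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)      # momentum step
        x, t = x_new, t_new
    return x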
15h15-15h45: Q. Denoyelle. Support Recovery for Sparse Super-resolution of Positive Measures.
Abstract
We study sparse spikes super-resolution over the space of Radon measures on the real line or the torus when the input measure is a finite sum of positive Dirac masses using the BLASSO convex program. We focus on the recovery properties of the support and the amplitudes of the initial measure in the presence of noise as a function of the minimum separation of the input measure (the minimum distance between two spikes).
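The BLASSO referred to here is the off-the-grid analogue of the
LASSO,

\[
\min_{m \in \mathcal{M}(X)} \ \frac{1}{2} \|\Phi m - y\|^2 + \lambda \, |m|(X),
\]

where \mathcal{M}(X) denotes the Radon measures on X, \Phi the
measurement operator, y the noisy observations, and the
total-variation norm |m|(X) plays the role of the \ell^1 norm of the
finite-dimensional LASSO.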
15h45-16h15: M. Schmitz. Feature learning using Optimal Transport.
Abstract
Accounting for and correcting an instrument’s Point
Spread Function (PSF) is a key element in many astronomical
missions, including the upcoming Euclid survey where one of
the key objectives is to perform galaxy shape measurements for
weak lensing. Because of the non-linear way in which the PSF
varies across the field of view, and the way in which Optimal
Transport distances account for geometric warping, using them
for PSF-related problems is a very promising prospect. Using
recent numerical schemes to compute approximate Wasserstein
distances, and thus Wasserstein Barycentres (Sliced
Wasserstein Barycentres and entropic penalty), we perform
feature learning on images, and apply it in particular to PSFs
and galaxy images.
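As background for the Sliced Wasserstein Barycentres mentioned
above, here is a minimal sketch of the sliced 2-Wasserstein distance
between two point clouds of equal size: project onto random
directions and exploit the fact that one-dimensional optimal
transport reduces to sorting. Names and the number of projections
are illustrative.

import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, seed=None):
    """Monte-Carlo sliced 2-Wasserstein distance between point clouds
    X and Y of shape (n, d): average of 1-D squared W2 distances over
    random projection directions, where 1-D OT is just sorting."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)        # uniform direction on the sphere
        xp, yp = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((xp - yp) ** 2)      # 1-D squared W2 via sorting
    return np.sqrt(total / n_proj)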
Coffee break and conclusion of the Workshop