Statistical Optimal Transport | MIT Skoltech Program

PI: Philippe Rigollet, Department of Mathematics, MIT

Abstract

As data-driven discovery drives more and more fields in science and engineering, extracting knowledge from large data sets has become one of the most fundamental challenges of the 21st century. Along with these new troves of data comes the promise of achievements that were out of reach just a decade ago. New datasets are not only larger but also more complex. As such, they call not only for faster processing methods but also for a complete rethinking of how data should be modeled, based on a better understanding of their high-dimensional structure. This proposal aims to develop methods that better extract the underlying geometric structure of the data in order to achieve better predictive performance and more efficient and confident information extraction.

Report

Project Title: Statistical Optimal Transport
Principal Investigator: Philippe Rigollet, Department of Mathematics, Massachusetts Institute of Technology
Grant Period: March 2018 – February 2019

The project supported by the MIT-Skoltech program seed fund concerned research, training, collaborations, and dissemination around the theme of statistical optimal transport. The main scientific goal was to develop new techniques, as well as a better understanding of existing data-analytic techniques, using tools from optimal transport. The proposal identified two main goals:

1. Understand the effect of data sampling on these tools and overcome limitations via new regularization techniques.
2. Develop faster algorithms that scale with the size of modern data.

In this report, we detail scientific advances, including publications as well as the impact of the project on MIT-Skoltech collaborations.

1. Scientific output

The project has led to progress on both fronts.

1.1 Data-driven optimal transport

At the start of this project, the overwhelming majority of optimal transport (OT) based techniques arising in machine learning relied on naive plug-in estimators, in which sampled data simply replace the corresponding population quantities in theoretical OT objects. While this approach is certainly simple, it has many limitations. Most prominently, it suffers from the curse of dimensionality [7], which prevents scaling it to even moderately sized data. To overcome this limitation, we imposed regularity on several aspects of the problem: the coupling [2, 3], the marginal distributions [8], and the intrinsic dimension of the data [6]. All of these results lead to new methods that behave better than vanilla OT and can be applied in a variety of settings. We have also developed a better understanding of the statistical role of entropic regularization: it brings robustness to measurement error [4]. Finally, under minimal assumptions, we have shown how optimal transport can be used to solve an apparently impossible problem, uncoupled regression, a framework for data integration that has the potential to give statisticians and data scientists access to very large but seemingly unrelated databases [5].
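To make the plug-in idea concrete, here is a minimal sketch in the simplest possible setting: two one-dimensional samples of equal size, where the Wasserstein distance between the empirical measures reduces to matching sorted values. The function name `plugin_wasserstein_1d` and the Gaussian example are illustrative assumptions, not taken from the cited papers, and the one-dimensional case does not exhibit the curse of dimensionality discussed above.

```python
import random


def plugin_wasserstein_1d(xs, ys, p=1):
    """Plug-in estimate of the p-Wasserstein distance between two 1D samples.

    The unknown population distributions are replaced by the empirical
    measures of the samples. In one dimension with equal sample sizes, the
    optimal coupling matches the i-th smallest x to the i-th smallest y.
    (Hypothetical helper written for illustration only.)
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    if not xs:
        raise ValueError("samples must be non-empty")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    # Average transport cost under the sorted (monotone) matching.
    cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)


# Example: two Gaussian samples whose means differ by 1.
rng = random.Random(0)
sample_x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
sample_y = [rng.gauss(1.0, 1.0) for _ in range(1000)]
estimate = plugin_wasserstein_1d(sample_x, sample_y)
print(f"plug-in W1 estimate: {estimate:.3f}")
```

In higher dimensions no such sorting shortcut exists and the plug-in estimate requires solving a full assignment problem, which is where both the statistical and the computational limitations described above arise.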

1.2 Faster algorithms to compute Wasserstein distances

Our initial results on the algorithmic performance of entropically regularized OT were recently extended to yield even better results when the data are known to live in low dimension (such as images), using modern subsampling techniques. The resulting algorithm offers both the strongest available theoretical guarantees and better empirical performance than the state of the art [1].
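For readers unfamiliar with entropically regularized OT, the following is a minimal sketch of the standard Sinkhorn matrix-scaling iteration on which such algorithms build. It is not the Nyström-accelerated method of [1], which, as the title of that reference indicates, relies on a Nyström approximation of the kernel; here the kernel matrix is formed explicitly. The function name `sinkhorn_plan`, the default regularization `eps`, and the iteration count are assumptions chosen for illustration.

```python
import math


def sinkhorn_plan(a, b, cost, eps=0.1, n_iter=500):
    """Approximate entropic OT plan between histograms a and b.

    a, b  : lists of positive weights, each summing to 1
    cost  : cost[i][j] is the cost of moving mass from bin i to bin j
    eps   : entropic regularization strength (illustrative default)
    Note: a very small eps can make exp(-cost/eps) underflow to zero;
    this plain sketch does not use log-domain stabilization.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / eps), formed explicitly.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan: diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Example: two bins, moving mass between bins costs 1.
a = [0.5, 0.5]
b = [0.5, 0.5]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(a, b, cost)
transport_cost = sum(plan[i][j] * cost[i][j] for i in range(2) for j in range(2))
print("plan:", plan)
print("transport cost:", transport_cost)
```

Each iteration costs time proportional to the number of kernel entries, so the explicit kernel is the bottleneck for large data sets; this is the part that low-rank kernel approximations aim to avoid.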

2. Dissemination

The findings of this project were presented at various seminars, conferences, and workshops, where the international community around statistical optimal transport is rapidly growing.
Rigollet was invited to lecture at the prestigious St Flour Summer School in Probability in the summer of 2019. He will present the findings of this project in an eponymous course titled “Statistical Optimal Transport”.

3. MIT-Skoltech collaboration

The highlight of the MIT-Skoltech collaboration was a workshop held at Skoltech in July 2018, organized by the PI together with host Vladimir Spokoiny (Skoltech), on the theme of this project: Statistical Optimal Transport.

The PI travelled there with one PhD student (Jonathan Weed) and one postdoc (Aden Forrow) and made connections with several local researchers. For example, Thibaut Le Gouic, who gave a mini-course at this workshop, will be visiting Rigollet at MIT for the academic year 2019-20 to work on questions that were initiated in the scope of this project. He is also actively collaborating with Quentin Paris (HSE), who also attended this workshop.
Rigollet has also organized two workshops to which several members of the Spokoiny group were invited: Vladimir Spokoiny, as well as Franz Besold, Darina Dvinskikh, Pavel Dvurechensky, Alexander Gasnikov, Alexey Kroshnin, and Alexandra Suvorikova.

References

[1] Jason Altschuler, Francis Bach, Alessandro Rudi, and Jonathan Weed, Massively scalable Sinkhorn distances via the Nyström method, arXiv:1812.05189 (2018).
[2] Aden Forrow, Jan-Christian Hütter, Mor Nitzan, Philippe Rigollet, Geoffrey Schiebinger, and Jonathan Weed, Statistical optimal transport via factored couplings, Proceedings of Machine Learning Research (Kamalika Chaudhuri and Masashi Sugiyama, eds.), vol. 89, PMLR, 16–18 Apr 2019, pp. 2454–2465.
[3] Jan-Christian Hütter and Philippe Rigollet, Minimax rates of estimation for smooth optimal transport maps, arXiv (to appear) (2019).
[4] Philippe Rigollet and Jonathan Weed, Entropic optimal transport is maximum-likelihood deconvolution, Comptes Rendus Mathematique 356 (2018), no. 11, 1228–1235.
[5] Philippe Rigollet and Jonathan Weed, Uncoupled isotonic regression via minimum Wasserstein deconvolution, Information and Inference (to appear) (2019).
[6] Philippe Rigollet and Jonathan Weed, Wasserstein projection pursuit, arXiv (to appear) (2019).
[7] Jonathan Weed and Francis Bach, Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance, arXiv:1707.00087 [math, stat] (2017).
[8] Jonathan Weed and Quentin Berthet, Estimation of smooth densities in Wasserstein distance, COLT (2019).
