This tutorial provides an introduction to a rapidly evolving topic: the theory of negative dependence and its numerous ramifications in machine learning. Indeed, negatively dependent probability measures provide a powerful tool for modeling non-i.i.d. data, and thus can impact all aspects of learning, including supervised, unsupervised, interpretable, interactive, and large-scale setups. The most well-known examples of negatively dependent distributions are perhaps the Determinantal Point Processes (DPPs), which have already found numerous ML applications. But DPPs are just the tip of the iceberg; the class of negatively dependent measures is much broader, and given the vast web of mathematical connections it enjoys, its holds great promise as a tool for machine learning. This tutorial exposes the ML audience to this rich mathematical toolbox, while outlining key theoretical ideas and motivating fundamental applications. Tasks that profit from negative dependence include anomaly detection, information maximization, experimental design, validation of black-box systems, architecture learning, fast MCMC sampling, dataset summarization, interpretable learning. Time permitting, we will also touch up recent related breakthrough on the broader class of so-called "completely log-concave" polynomials.
This tutorial will provide an introduction to the rich area of negatively dependent probability measures. In particular, the tutorial will outline key conceptual points in the theory of negative dependence in discrete probability, while motivating the topic by fundamental machine learning tasks (including supervised, unsupervised, interpretable, interactive, and large-scale setups). The foundations established by the tutorial will help the audience gain a new perspective on many ML problems, and accouter them with a new set of tools for their modeling and algorithmic repertoire.
Specifically, we focus on Strongly Rayleigh (SR) probability measures. These were introduced in the landmark paper of Borcea et al., (2009), who developed a rich theory of negatively dependent (ND) measures. The ND property reflects a ``repelling'' nature of items, a property that occurs more broadly across probability theory, combinatorial optimization, physics, and other fields. However, negative association is a subtle property, and the class of SR measures provides the most useful characterization of this concept by grounding it in the fascinating theory of multivariate stable polynomials. The most notable examples of SR measures are Determinantal Point Processes (DPPs), which have received significant attention in machine learning too. However, the full potential of SR measures remains to be developed and discovered, and by grounding our tutorial in the mature use of DPPs, while outlining SR measures, we hope to raise awareness in the community and spur future developments. The key driver behind this thinking is a theoretical observation of Borcea et al., that the class of SR measures is exponentially larger than DPPs
Our key emphasis, however, will be on core concepts. Hence, we will not cover detailed proofs, but will strive to highlight the rich variety of mathematical ideas underlying the theory of negative dependence, DPPs, SR measures, and related concepts, thereby helping the audience understand, appreciate, apply, and enhance the material being presented. We outline below a rough table-of-contents of the material this tutorial plans to cover.