Home /  PSDS Seminar: Efficient Sampling and Parameter Estimation for Mallows Models via Metric Learning

Seminar

PSDS Seminar: Efficient Sampling and Parameter Estimation for Mallows Models via Metric Learning May 01, 2025 (11:00 AM PDT - 12:00 PM PDT)
Parent Program:
Location: SLMath: Eisenbud Auditorium, Online/Virtual
Speaker(s) Yeganeh Alimohammadi (University of California, Berkeley)
Description No Description
Keywords and Mathematics Subject Classification (MSC)
Primary Mathematics Subject Classification No Primary AMS MSC
Secondary Mathematics Subject Classification No Secondary AMS MSC
Video
No Video Uploaded
Abstract/Media

Zoom Link

Models of preference and ranking aim to capture patterns of ordered structure arising across diverse contexts—from voting and recommendation systems to genetics and sports competitions, yet they often rely on fixed, pre-specified notions of distance between rankings. In practice, however, rankings differ in context-specific ways, suggesting that it is natural to let the data itself guide the choice of distance.  Motivated by this, we study the Mallows model, in which permutations $\pi$ are sampled proportional to $\exp(-\beta d(\pi, \sigma))$, parameterized by a central ranking $\sigma$ and a dispersion parameter $\beta$, where distance function $d$ can be learned from data. 

We analyze a general class of distance metrics based on $L_\alpha$ norms, defined as $d_\alpha(\pi, \sigma) = \sum_{i=1}^n |\pi(i) - \sigma(i)|^\alpha$. For any $\alpha \geq 1$ and $\beta > 0$, we develop a Polynomial-Time Approximation Scheme (PTAS) that achieves two main objectives: (i) approximating the partition function $Z_n(\beta, \alpha)$ within a multiplicative factor of $1 \pm \epsilon$, and (ii) efficiently generating samples that are $\epsilon$-close (in total variation distance) to the true distribution. Leveraging these approximations, we propose a computationally efficient Maximum Likelihood Estimation (MLE) algorithm capable of jointly inferring the central ranking, the dispersion parameter, and the optimal distance metric. We validate our methods empirically on datasets from American College Football and Basketball, demonstrating both theoretical rigor and practical effectiveness.

No Notes/Supplements Uploaded No Video Files Uploaded