Near-Optimal Compression in Near-Linear Time
[Virtual] Hot Topics: Foundations of Stable, Generalizable and Transferable Statistical Learning March 07, 2022 - March 10, 2022
Location: SLMath: Online/Virtual
compression
MCMC
kernel
coresets
maximum mean discrepancy
Near-Optimal Compression In Near-Linear Time
We introduce Kernel Thinning-Compress++ (KT-Compress++), an algorithm based on two new procedures for compressing a distribution P nearly optimally and more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel k and O(n log^3 n) time, KT-Compress++ compresses an n-point approximation to P into a sqrt(n)-point approximation with better than Monte Carlo integration error rates for functions in the associated reproducing kernel Hilbert space (RKHS). First we show that for any fixed function KT-Compress++ provides dimension-free guarantees for any kernel, any distribution, and any fixed function in the RKHS. Second, we show that with high probability, the maximum discrepancy in integration error is O_d(n^{-1/2} sqrt(log n)) for compactly supported P and O_d(n^{-1/2} (log n)^{(d+1)/2} sqrt(loglog n)) for sub-exponential P on R^d. In contrast, an equal-sized i.i.d. sample from P suffers at least n^{-1/4} integration error. Our sub-exponential guarantees nearly match the known lower bounds for several settings, and while resembling the classical quasi-Monte Carlo error rates for uniform P on [0,1]^d they apply to general distributions on R^d and a wide range of commonly used universal kernels. En route, we introduce a new simple meta-procedure Compress++, that can speed up any thinning algorithm while suffering at most a factor of 4 n error. In particular, Compress++ enjoys a near-linear runtime given any quadratic-time input and reduces the runtime of super-quadratic algorithms by a square-root factor. We also use our results to derive explicit non-asymptotic maximum mean discrepancy bounds for a range of kernels including Gaussian, Matern, Laplace, and B-spline kernels. Finally, we present several vignettes illustrating the practical benefits of KT-Compress++ over i.i.d. sampling and standard Markov chain Monte Carlo thinning with challenging differential equation posteriors in dimensions d = 2 to 100.
Near-Optimal Compression in Near-Linear Time
|
Download |
Near-Optimal Compression In Near-Linear Time
Please report video problems to itsupport@slmath.org.
See more of our Streaming videos on our main VMath Videos page.