Abstract
Machine Learning (ML) and Algorithmic Information Theory (AIT) offer distinct yet complementary approaches to understanding and addressing complexity. This paper investigates the synergy between these disciplines in two directions: AIT for Kernel Methods and Kernel Methods for AIT. In the former, we explore how AIT concepts inspire the design of kernels that incorporate concepts such as relative Kolmogorov complexity and the normalized compression distance (NCD). We also leverage the Loss Rank Principle (LoRP) to learn kernels in the context of Kernel Density Estimation (KDE), extending the applicability of AIT-inspired techniques to flexible, nonparametric models. In the latter, we show how kernel methods can be used to approximate measures such as NCD and Algorithmic Mutual Information (AMI), providing new tools for compression-based analysis. Furthermore, we demonstrate that the Hilbert-Schmidt Independence Criterion (HSIC) approximates AMI, offering a robust theoretical foundation for clustering and other dependence-measurement tasks. Building on our previous work introducing Sparse Kernel Flows from an AIT perspective, we extend these ideas to unsupervised learning, enhancing the theoretical robustness and interpretability of ML algorithms. Our results demonstrate that kernel methods are not only versatile tools for ML but also crucial for bridging AIT and ML, enabling more principled approaches to unsupervised learning tasks.
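As a concrete illustration of the compression-based similarity the abstract refers to, the sketch below computes the standard NCD, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), using zlib as a crude stand-in for an ideal compressor, and turns it into a Gaussian-style kernel. This is a minimal sketch under those assumptions; the function names, the zlib choice, and the kernel construction are illustrative and are not taken from the paper itself.

```python
import math
import zlib


def c(data: bytes) -> int:
    """Compressed length of `data` in bytes (zlib as a rough proxy for Kolmogorov complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_kernel(x: bytes, y: bytes, sigma: float = 1.0) -> float:
    """One plausible AIT-inspired kernel built on NCD (an assumption for illustration,
    not necessarily the construction used in the paper)."""
    return math.exp(-ncd(x, y) ** 2 / (2.0 * sigma ** 2))


if __name__ == "__main__":
    a = b"the quick brown fox jumps over the lazy dog" * 10
    b = b"the quick brown fox jumps over the lazy cat" * 10
    z = b"completely unrelated sequence of characters 0123456789" * 10
    print(f"NCD(a, b) = {ncd(a, b):.3f}")  # similar strings -> smaller distance
    print(f"NCD(a, z) = {ncd(a, z):.3f}")  # dissimilar strings -> larger distance
    print(f"k(a, b)   = {ncd_kernel(a, b):.3f}")
```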
Keywords. Machine Learning, Algorithmic Information Theory, Clustering, Density Estimation, Kernel Methods, Learning Kernels from Data, Minimum Description Length Principle (MDL), Compression, Similarity, Loss Rank Principle (LoRP), Algorithmic Mutual Information (AMI), Normalized Information Distance (NID), Normalized Compression Distance (NCD), Hilbert-Schmidt Independence Criterion (HSIC).
Authors
Marcus Hutter
Venue
Physica D: Nonlinear Phenomena