MDL Intelligence — Bibliography

§1 — Foundational Theory

9 sources

#01 FOUNDATION

A Mathematical Theory of Communication

Shannon, C.E.

Bell System Technical Journal, Vol. 27, pp. 379–423, 1948

The seminal paper establishing information theory, entropy, and the fundamental limits of data compression — the bedrock on which MDL is built.

Read PDF (Harvard)

#02 FOUNDATION

A Mathematical Theory of Communication (Wiley DOI)

Shannon, C.E.

Bell System Technical Journal, 27: 379–423 · doi:10.1002/j.1538-7305.1948.tb01338.x

Official DOI link to the original 1948 Bell System Technical Journal publication of Shannon's foundational paper.

View at Wiley

#03 PAPER

Modeling by Shortest Data Description

Rissanen, J.

Automatica, 14(5):465–471, 1978 · ACM DL

The original 1978 paper introducing MDL as a formal principle — the founding text of the entire field.

View at ACM DL

#04 PAPER

Modeling by Shortest Data Description (ScienceDirect)

Rissanen, J.

Automatica, Vol. 14, Issue 5, 1978 — ScienceDirect

Full-text access to Rissanen's original Automatica 1978 paper via ScienceDirect.

View at ScienceDirect

#05 PAPER

Stochastic Complexity and Modeling

Rissanen, J.

Annals of Statistics, Vol. 14, No. 3, pp. 1080–1100, 1986

Introduces stochastic complexity, which formalizes MDL on the basis of optimal universal coding.

Read PDF

#06 PAPER

Stochastic Complexity and Modeling (JSTOR)

Rissanen, J.

Annals of Statistics, 1986 — JSTOR stable link

JSTOR stable entry for Rissanen's stochastic complexity paper — useful for academic citation tracing.

View at JSTOR

#07 PAPER

Some Notes on Rissanen's Stochastic Complexity

Various

IEEE Transactions on Information Theory, 1998 · doi:10.1109/18.661521

Derives a two-part code version of stochastic complexity showing Fisher-information-optimal quantization; connects MDL to robust regression.

View on Academia.edu

#08 FOUNDATION

Information, Physics, Quantum: The Search for Links ('It from Bit')

Wheeler, J.A.

Proc. 3rd Int. Symposium on Foundations of Quantum Mechanics, Tokyo, 1989

Wheeler's 'it from bit' conjecture: every physical quantity derives from binary yes-or-no observations — a key philosophical foundation of the HS(p)/Mendozian Program.

View at History of Information

#09 FOUNDATION

It from Bit (Plus Maths explainer)

Wheeler, J.A. (commentary)

Plus Maths — Millennium Mathematics Project, Cambridge

Clear overview of Wheeler's 'it from bit' concept and its significance for quantum mechanics and information theory.

Read explainer

§2 — Books & Monographs

8 sources

#10 BOOK

The Minimum Description Length Principle

Grünwald, P.D.

MIT Press, 2007 · ISBN 978-0-262-07298-8

The definitive textbook on MDL — comprehensive coverage of two-part codes, NML, stochastic complexity, and applications across statistics, ML, and data mining.

View at MIT Press

#11 BOOK

The Minimum Description Length Principle (ACM DL)

Grünwald, P.D.

MIT Press, 2007 — ACM Digital Library entry

ACM Digital Library record for Grünwald's 2007 MDL monograph.

View at ACM DL

#12 BOOK

Advances in Minimum Description Length: Theory and Applications

Grünwald, P.D., Myung, I.J., Pitt, M.A. (eds.)

MIT Press, Neural Information Processing series, 2005

Edited volume covering MDL applications in data mining, machine learning, cognitive science, and econometrics.

Read PDF

#13 BOOK

A Tutorial Introduction to the Minimum Description Length Principle

Grünwald, P.D.

arXiv:math/0406077 — MIT Press chapter preprint

Freely available comprehensive tutorial on MDL — widely cited introduction covering crude MDL, NML, universal codes, and Bayesian connections.

Read on arXiv

#14 BOOK

Kolmogorov Complexity and Algorithmic Information Theory

Hutter, M.

IDSIA Technical Report — PDF

Covers Kolmogorov complexity, Solomonoff induction, and their relationship to MDL and the universal prior.

Read PDF

#15 BOOK

Algebraic Geometry and Statistical Learning Theory

Watanabe, S.

Cambridge University Press, 2009 · ISBN 9780521864671

Introduces singular learning theory, showing that MDL's stochastic complexity behaves differently for singular models (neural nets, HMMs, mixture models).

View at Cambridge UP

#16 BOOK

Algebraic Geometry and Statistical Learning Theory (Google Books)

Watanabe, S.

Cambridge Monographs on Applied and Computational Mathematics, 2009

Google Books entry with preview of Watanabe's singular learning theory monograph.

Preview on Google Books

#17 BOOK

An Introduction to Kolmogorov Complexity and Its Applications

Li, M., Vitányi, P.M.B.

Springer, excerpted in Routledge chapter PDF

Covers algorithmic probability, universal codes, and MDL — the foundational mathematical text for complexity-based inference.

Read PDF excerpt

§3 — Peer-Reviewed Papers (2020–2025)

13 sources

#18 PREPRINT

MDL-Based Neural Architecture Search (arXiv 2025)

Various authors

arXiv:2505.13398 · 2025

Uses MDL as a criterion for neural architecture selection; reports state-of-the-art compression relative to baseline methods across benchmarks.

Read on arXiv

#19 PREPRINT

MDL for Anomaly Detection in Time Series (arXiv 2020)

Various authors

arXiv:2007.14009 · 2020

Applies MDL-based two-part codes to detect anomalies and structural breaks in multivariate time series with better precision than statistical baselines.

Read on arXiv

#20 PAPER

Reservoir Computing with the MDL Principle

Various authors

Chaos: An Interdisciplinary Journal of Nonlinear Science, Vol. 35, 043132 · 2025

Uses MDL to select sparse readout subsets in echo-state networks — improves prediction of chaotic systems (Lorenz, Rössler, Thomas) vs. ridge regression.

Read at AIP

#21 PAPER

MDL for Representation Learning (NeurIPS 2023)

Various authors

Advances in Neural Information Processing Systems 36 (NeurIPS 2023)

Demonstrates MDL-based representation learning achieves superior generalization and interpretability vs. cross-entropy baselines on benchmark datasets.

Read at NeurIPS

#22 PAPER

MDL for LLM Reasoning Evaluation (ACL 2024)

Various authors

Proceedings of ACL 2024, Long Papers

Proposes an MDL-inspired IBE-Eval framework to assess the quality of LLM-generated explanations — measures parsimony and coherence of reasoning chains.

Read at ACL Anthology

#23 PAPER

Bayesian Compression for Deep Learning (NeurIPS 2017)

Louizos, C., Ullrich, K., Welling, M.

Advances in Neural Information Processing Systems 30 (NIPS 2017)

Shows that Bayesian sparsity-inducing priors implement MDL-style compression in neural networks; reports state-of-the-art compression on LeNet and VGG architectures.

Read PDF (NIPS)

#24 PAPER

Bayesian Compression for Deep Learning (ACM DL)

Louizos, C., Ullrich, K., Welling, M.

NeurIPS 2017 — ACM Digital Library

ACM Digital Library entry for the Louizos et al. Bayesian compression paper.

View at ACM DL

#25 PAPER

MDL and Data Mining: KDD 2019 Tutorial Slides

Vreeken, J.

MDL+DM Workshop, KDD 2019 — CISPA

Tutorial slides covering modern MDL for data mining, including the Krimp algorithm and pattern selection by minimum description length.

Read PDF (CISPA)

#26 PAPER

Modern MDL Meets Data Mining — Insights, Theory, and Practice

Vreeken, J. et al.

CISPA Publications, 2021

Comprehensive tutorial on modern MDL for data mining with theory and practical guidance on model selection and pattern mining.

View at CISPA

#27 PAPER

A Tutorial Introduction to MDL — Grünwald 2004

Grünwald, P.D.

CWI Technical Report, 2004 — PDF mirror

Earlier 2004 tutorial version (precursor to the 2007 book) — more accessible entry point into MDL theory.

Read PDF

#28 PAPER

Making Long-Context Language Models Better Multi-Hop Reasoners

Li, Y., Liang, S., Lyu, M., Wang, L.

ACL 2024 Long Papers, pp. 2462–2475, Bangkok

Introduces Reasoning with Attributions to improve multi-hop reasoning in LMs — related to MDL parsimony in chain-of-thought evaluation.

Read at ACL Anthology

#29 PAPER

Chain-of-Knowledge (CoK) Prompting for LLM Reasoning

Various authors

ACL 2024 Long Papers

Proposes reasoning chains built from structured knowledge triples to improve LLM faithfulness; related to MDL-optimal explanation compression.

Read at ACL Anthology

#30 PAPER

C4.5: Programs for Machine Learning (Quinlan 1993 excerpt)

Quinlan, J.R.

Morgan Kaufmann — PDF excerpt

Foundational text on decision-tree learning; uses information-theoretic (MDL-adjacent) criteria for tree pruning and model selection.

Read PDF

§4 — Video Lectures & Talks

7 sources

#31 VIDEO

Minimum Description Length — Georgia Tech / Udacity

Georgia Tech Machine Learning Course

YouTube · Udacity

Short lecture on MDL fundamentals: two-part codes, model complexity, and application to supervised learning. Widely used in ML courses.

Watch on YouTube

#32 VIDEO

MDL Lecture — IIT Madras (Part 1)

IIT Madras NPTEL

YouTube · IIT Madras

University lecture covering MDL from first principles: Kolmogorov complexity, Shannon entropy, and the formal derivation of the two-part code.

Watch on YouTube

#33 VIDEO

MDL Lecture — IIT Madras (Part 2)

IIT Madras NPTEL

YouTube · IIT Madras

Continuation covering NML, stochastic complexity, and practical MDL model selection procedures.

Watch on YouTube

#34 VIDEO

MDL and the Free Energy Principle — PIBBSS Symposium 2024

Presenter at PIBBSS Symposium

PIBBSS (Principles of Intelligent Behavior in Biological and Social Systems) 2024

Explores connections between MDL, Friston's Free Energy Principle, and active inference — directly relevant to the H² Framework's cross-domain unification.

View PIBBSS 2024

#35 VIDEO

MDL and Neural Networks — Expert Talk

Goldsmith, J.

YouTube

Expert discussion of how MDL applies to neural network model selection and its relationship to regularization techniques.

Watch on YouTube

#36 VIDEO

MDL for AI Robustness — Expert Talk 2024

Various

YouTube · 2024

Discussion of MDL applied to robust model selection and out-of-distribution generalization in modern AI systems.

Watch on YouTube

#37 VIDEO

Information Theory and MDL — AI Roots Series 2023

Various

YouTube · 2023

Traces MDL's roots in information theory — covers Shannon, Kolmogorov, and Rissanen in accessible form for AI practitioners.

Watch on YouTube

§5 — Wikipedia & Reference Entries

3 sources

#38 WIKI

Minimum Description Length — Wikipedia

Wikipedia contributors

English Wikipedia

Comprehensive encyclopedic overview of MDL including history, formulations, and applications — good for quick cross-referencing.

Read on Wikipedia

#39 WIKI

Jorma Rissanen — Wikipedia

Wikipedia contributors

English Wikipedia

Biography of Jorma Rissanen — the Finnish mathematician and information theorist who created the MDL principle in 1978.

Read on Wikipedia

#40 WIKI

A Mathematical Theory of Communication — Wikipedia

Wikipedia contributors

English Wikipedia

Overview of Shannon's 1948 paper including historical context, key theorems, and long-term impact on compression and MDL.

Read on Wikipedia

§6 — Websites, Research Portals & Institutional Sources

5 sources

#41 WEBSITE

Sumio Watanabe — Singular Learning Theory Homepage

Watanabe, S.

Tokyo Institute of Technology / Google Sites

Watanabe's official research homepage explaining singular learning theory and its relationship to MDL stochastic complexity in deep learning.

Visit homepage

#42 WEBSITE

Emergent Mind — MDL Topic Hub

Emergent Mind

emergentmind.com

AI-curated topic hub aggregating recent arXiv papers on MDL — useful for tracking current literature trends.

Visit hub

#43 WEBSITE

PIBBSS Symposium 2024 — Conference Page

PIBBSS

pibbss.ai

Official page for the 2024 PIBBSS Symposium on intelligent behavior in biological and social systems — features MDL/FEP cross-domain talks.

Visit PIBBSS 2024

#44 WEBSITE

Jorma Rissanen IEEE Information Theory Obituary

IEEE Information Theory Society

ITSOC Newsletter, Sept 2020

Official IEEE obituary for Rissanen — includes quotes and assessment of his contributions to MDL and information theory.

Read PDF (IEEE)

#45 WEBSITE

CWI MDL Principle PDF — Full Text

Grünwald, P.D.

CWI Amsterdam — ir.cwi.nl

Full PDF of the MIT Press MDL monograph hosted by CWI Amsterdam — the most accessible complete source.

Read PDF (CWI)
