Understanding Deep Learning
by Simon J.D. Prince
https://udlbook.github.io/udlbook/
To be published by MIT Press, December 5th, 2023.

* Download the draft PDF (Chapters 1-21, 2023-11-24) here. CC-BY-NC-ND license.
* Report errata via GitHub or contact me directly at udlbookmail@gmail.com.
* Follow me on Twitter or LinkedIn for updates.

Table of contents

* Chapter 1 - Introduction
* Chapter 2 - Supervised learning
* Chapter 3 - Shallow neural networks
* Chapter 4 - Deep neural networks
* Chapter 5 - Loss functions
* Chapter 6 - Training models
* Chapter 7 - Gradients and initialization
* Chapter 8 - Measuring performance
* Chapter 9 - Regularization
* Chapter 10 - Convolutional networks
* Chapter 11 - Residual networks
* Chapter 12 - Transformers
* Chapter 13 - Graph neural networks
* Chapter 14 - Unsupervised learning
* Chapter 15 - Generative adversarial networks
* Chapter 16 - Normalizing flows
* Chapter 17 - Variational autoencoders
* Chapter 18 - Diffusion models
* Chapter 19 - Deep reinforcement learning
* Chapter 20 - Why does deep learning work?
* Chapter 21 - Deep learning and ethics

Resources for instructors

An instructor answer booklet is available with proof of credentials via MIT Press. Request an exam/desk copy via MIT Press.
Figures in PDF (vector) / SVG (vector) / PowerPoint (images):

* Chapter 1 - Introduction: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 2 - Supervised learning: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 3 - Shallow neural networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 4 - Deep neural networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 5 - Loss functions: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 6 - Training models: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 7 - Gradients and initialization: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 8 - Measuring performance: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 9 - Regularization: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 10 - Convolutional networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 11 - Residual networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 12 - Transformers: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 13 - Graph neural networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 14 - Unsupervised learning: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 15 - Generative adversarial networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 16 - Normalizing flows: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 17 - Variational autoencoders: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 18 - Diffusion models: PDF Figures / PowerPoint Figures
* Chapter 19 - Deep reinforcement learning: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 20 - Why does deep learning work?: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 21 - Deep learning and ethics: PDF Figures / SVG Figures / PowerPoint Figures
* Appendices: PDF Figures / SVG Figures / PowerPoint Figures

Instructions for editing figures/equations can be found here.
Resources for students

Answers to selected questions: PDF

Python notebooks (the early ones are more thoroughly tested than the later ones!):

* Notebook 1.1 - Background mathematics: ipynb/colab
* Notebook 2.1 - Supervised learning: ipynb/colab
* Notebook 3.1 - Shallow networks I: ipynb/colab
* Notebook 3.2 - Shallow networks II: ipynb/colab
* Notebook 3.3 - Shallow network regions: ipynb/colab
* Notebook 3.4 - Activation functions: ipynb/colab
* Notebook 4.1 - Composing networks: ipynb/colab
* Notebook 4.2 - Clipping functions: ipynb/colab
* Notebook 4.3 - Deep networks: ipynb/colab
* Notebook 5.1 - Least squares loss: ipynb/colab
* Notebook 5.2 - Binary cross-entropy loss: ipynb/colab
* Notebook 5.3 - Multiclass cross-entropy loss: ipynb/colab
* Notebook 6.1 - Line search: ipynb/colab
* Notebook 6.2 - Gradient descent: ipynb/colab
* Notebook 6.3 - Stochastic gradient descent: ipynb/colab
* Notebook 6.4 - Momentum: ipynb/colab
* Notebook 6.5 - Adam: ipynb/colab
* Notebook 7.1 - Backpropagation in toy model: ipynb/colab
* Notebook 7.2 - Backpropagation: ipynb/colab
* Notebook 7.3 - Initialization: ipynb/colab
* Notebook 8.1 - MNIST-1D performance: ipynb/colab
* Notebook 8.2 - Bias-variance trade-off: ipynb/colab
* Notebook 8.3 - Double descent: ipynb/colab
* Notebook 8.4 - High-dimensional spaces: ipynb/colab
* Notebook 9.1 - L2 regularization: ipynb/colab
* Notebook 9.2 - Implicit regularization: ipynb/colab
* Notebook 9.3 - Ensembling: ipynb/colab
* Notebook 9.4 - Bayesian approach: ipynb/colab
* Notebook 9.5 - Augmentation: ipynb/colab
* Notebook 10.1 - 1D convolution: ipynb/colab
* Notebook 10.2 - Convolution for MNIST-1D: ipynb/colab
* Notebook 10.3 - 2D convolution: ipynb/colab
* Notebook 10.4 - Downsampling & upsampling: ipynb/colab
* Notebook 10.5 - Convolution for MNIST: ipynb/colab
* Notebook 11.1 - Shattered gradients: ipynb/colab
* Notebook 11.2 - Residual networks: ipynb/colab
* Notebook 11.3 - Batch normalization: ipynb/colab
* Notebook 12.1 - Self-attention: ipynb/colab
* Notebook 12.2 - Multi-head self-attention: ipynb/colab
* Notebook 12.3 - Tokenization: ipynb/colab
* Notebook 12.4 - Decoding strategies: ipynb/colab
* Notebook 13.1 - Encoding graphs: ipynb/colab
* Notebook 13.2 - Graph classification: ipynb/colab
* Notebook 13.3 - Neighborhood sampling: ipynb/colab
* Notebook 13.4 - Graph attention: ipynb/colab
* Notebook 15.1 - GAN toy example: ipynb/colab
* Notebook 15.2 - Wasserstein distance: ipynb/colab
* Notebook 16.1 - 1D normalizing flows: ipynb/colab
* Notebook 16.2 - Autoregressive flows: ipynb/colab
* Notebook 16.3 - Contraction mappings: ipynb/colab
* Notebook 17.1 - Latent variable models: ipynb/colab
* Notebook 17.2 - Reparameterization trick: ipynb/colab
* Notebook 17.3 - Importance sampling: ipynb/colab
* Notebook 18.1 - Diffusion encoder: ipynb/colab
* Notebook 18.2 - 1D diffusion model: ipynb/colab
* Notebook 18.3 - Reparameterized model: ipynb/colab
* Notebook 18.4 - Families of diffusion models: ipynb/colab
* Notebook 19.1 - Markov decision processes: ipynb/colab
* Notebook 19.2 - Dynamic programming: ipynb/colab
* Notebook 19.3 - Monte Carlo methods: ipynb/colab
* Notebook 19.4 - Temporal difference methods: ipynb/colab
* Notebook 19.5 - Control variates: ipynb/colab
* Notebook 20.1 - Random data: ipynb/colab
* Notebook 20.2 - Full-batch gradient descent: ipynb/colab
* Notebook 20.3 - Lottery tickets: ipynb/colab
* Notebook 20.4 - Adversarial attacks: ipynb/colab
* Notebook 21.1 - Bias mitigation: ipynb/colab
* Notebook 21.2 - Explainability: ipynb/colab

Citation

@book{prince2023understanding,
  author    = "Simon J.D. Prince",
  title     = "Understanding Deep Learning",
  publisher = "MIT Press",
  year      = 2023,
  url       = "http://udlbook.com"
}