Understanding Deep Learning
by Simon J.D. Prince
https://udlbook.github.io/udlbook/
To be published by MIT Press, December 5th, 2023.

* Download the draft PDF (Chapters 1-21, 2023-11-24) here. CC-BY-NC-ND license.
* Report errata via GitHub or contact me directly at udlbookmail@gmail.com.
* Follow me on Twitter or LinkedIn for updates.

Table of contents

* Chapter 1 - Introduction
* Chapter 2 - Supervised learning
* Chapter 3 - Shallow neural networks
* Chapter 4 - Deep neural networks
* Chapter 5 - Loss functions
* Chapter 6 - Training models
* Chapter 7 - Gradients and initialization
* Chapter 8 - Measuring performance
* Chapter 9 - Regularization
* Chapter 10 - Convolutional networks
* Chapter 11 - Residual networks
* Chapter 12 - Transformers
* Chapter 13 - Graph neural networks
* Chapter 14 - Unsupervised learning
* Chapter 15 - Generative adversarial networks
* Chapter 16 - Normalizing flows
* Chapter 17 - Variational autoencoders
* Chapter 18 - Diffusion models
* Chapter 19 - Deep reinforcement learning
* Chapter 20 - Why does deep learning work?
* Chapter 21 - Deep learning and ethics

Resources for instructors

An instructor answer booklet is available with proof of credentials via MIT Press. Request an exam/desk copy via MIT Press.
Figures in PDF (vector) / SVG (vector) / PowerPoint (images):

* Chapter 1 - Introduction: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 2 - Supervised learning: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 3 - Shallow neural networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 4 - Deep neural networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 5 - Loss functions: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 6 - Training models: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 7 - Gradients and initialization: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 8 - Measuring performance: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 9 - Regularization: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 10 - Convolutional networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 11 - Residual networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 12 - Transformers: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 13 - Graph neural networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 14 - Unsupervised learning: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 15 - Generative adversarial networks: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 16 - Normalizing flows: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 17 - Variational autoencoders: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 18 - Diffusion models: PDF Figures / PowerPoint Figures
* Chapter 19 - Deep reinforcement learning: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 20 - Why does deep learning work?: PDF Figures / SVG Figures / PowerPoint Figures
* Chapter 21 - Deep learning and ethics: PDF Figures / SVG Figures / PowerPoint Figures
* Appendices: PDF Figures / SVG Figures / PowerPoint Figures

Instructions for editing figures/equations can be found here.
Resources for students

Answers to selected questions: PDF

Python notebooks (the early ones are more thoroughly tested than the later ones!):

* Notebook 1.1 - Background mathematics: ipynb/colab
* Notebook 2.1 - Supervised learning: ipynb/colab
* Notebook 3.1 - Shallow networks I: ipynb/colab
* Notebook 3.2 - Shallow networks II: ipynb/colab
* Notebook 3.3 - Shallow network regions: ipynb/colab
* Notebook 3.4 - Activation functions: ipynb/colab
* Notebook 4.1 - Composing networks: ipynb/colab
* Notebook 4.2 - Clipping functions: ipynb/colab
* Notebook 4.3 - Deep networks: ipynb/colab
* Notebook 5.1 - Least squares loss: ipynb/colab
* Notebook 5.2 - Binary cross-entropy loss: ipynb/colab
* Notebook 5.3 - Multiclass cross-entropy loss: ipynb/colab
* Notebook 6.1 - Line search: ipynb/colab
* Notebook 6.2 - Gradient descent: ipynb/colab
* Notebook 6.3 - Stochastic gradient descent: ipynb/colab
* Notebook 6.4 - Momentum: ipynb/colab
* Notebook 6.5 - Adam: ipynb/colab
* Notebook 7.1 - Backpropagation in toy model: ipynb/colab
* Notebook 7.2 - Backpropagation: ipynb/colab
* Notebook 7.3 - Initialization: ipynb/colab
* Notebook 8.1 - MNIST-1D performance: ipynb/colab
* Notebook 8.2 - Bias-variance trade-off: ipynb/colab
* Notebook 8.3 - Double descent: ipynb/colab
* Notebook 8.4 - High-dimensional spaces: ipynb/colab
* Notebook 9.1 - L2 regularization: ipynb/colab
* Notebook 9.2 - Implicit regularization: ipynb/colab
* Notebook 9.3 - Ensembling: ipynb/colab
* Notebook 9.4 - Bayesian approach: ipynb/colab
* Notebook 9.5 - Augmentation: ipynb/colab
* Notebook 10.1 - 1D convolution: ipynb/colab
* Notebook 10.2 - Convolution for MNIST-1D: ipynb/colab
* Notebook 10.3 - 2D convolution: ipynb/colab
* Notebook 10.4 - Downsampling & upsampling: ipynb/colab
* Notebook 10.5 - Convolution for MNIST: ipynb/colab
* Notebook 11.1 - Shattered gradients: ipynb/colab
* Notebook 11.2 - Residual networks: ipynb/colab
* Notebook 11.3 - Batch normalization: ipynb/colab
* Notebook 12.1 - Self-attention: ipynb/colab
* Notebook 12.2 - Multi-head self-attention: ipynb/colab
* Notebook 12.3 - Tokenization: ipynb/colab
* Notebook 12.4 - Decoding strategies: ipynb/colab
* Notebook 13.1 - Encoding graphs: ipynb/colab
* Notebook 13.2 - Graph classification: ipynb/colab
* Notebook 13.3 - Neighborhood sampling: ipynb/colab
* Notebook 13.4 - Graph attention: ipynb/colab
* Notebook 15.1 - GAN toy example: ipynb/colab
* Notebook 15.2 - Wasserstein distance: ipynb/colab
* Notebook 16.1 - 1D normalizing flows: ipynb/colab
* Notebook 16.2 - Autoregressive flows: ipynb/colab
* Notebook 16.3 - Contraction mappings: ipynb/colab
* Notebook 17.1 - Latent variable models: ipynb/colab
* Notebook 17.2 - Reparameterization trick: ipynb/colab
* Notebook 17.3 - Importance sampling: ipynb/colab
* Notebook 18.1 - Diffusion encoder: ipynb/colab
* Notebook 18.2 - 1D diffusion model: ipynb/colab
* Notebook 18.3 - Reparameterized model: ipynb/colab
* Notebook 18.4 - Families of diffusion models: ipynb/colab
* Notebook 19.1 - Markov decision processes: ipynb/colab
* Notebook 19.2 - Dynamic programming: ipynb/colab
* Notebook 19.3 - Monte Carlo methods: ipynb/colab
* Notebook 19.4 - Temporal difference methods: ipynb/colab
* Notebook 19.5 - Control variates: ipynb/colab
* Notebook 20.1 - Random data: ipynb/colab
* Notebook 20.2 - Full-batch gradient descent: ipynb/colab
* Notebook 20.3 - Lottery tickets: ipynb/colab
* Notebook 20.4 - Adversarial attacks: ipynb/colab
* Notebook 21.1 - Bias mitigation: ipynb/colab
* Notebook 21.2 - Explainability: ipynb/colab

Citation

@book{prince2023understanding,
  author    = "Simon J.D. Prince",
  title     = "Understanding Deep Learning",
  publisher = "MIT Press",
  year      = 2023,
  url       = "http://udlbook.com"
}