Understanding Deep Learning: In Conversation with Simon Prince

Professor Simon Prince was a Research Director for Borealis AI in Montreal from 2019 until 2021, when he left to write a major new textbook on Deep Learning. The results of his labour came to fruition on December 5th, 2023, when “Understanding Deep Learning” was released by The MIT Press. We caught up with him to ask him about the new book.

Why did you choose to spend two years writing a book?

I’ve always enjoyed writing. While I was working at Borealis AI, I wrote a series of blogs on all kinds of subjects, including natural language processing, bias and fairness, differential privacy, SAT solvers, and so on.

Some of these came out really well, and that ignited my enthusiasm for a larger project. In fact, the chapters on Variational AutoEncoders and Transformers in my new book are direct descendants of these blogs, and I’m really grateful to Borealis AI for allowing me to rework them. Anyone who enjoys the book should check out the Borealis AI blogs; you can find a full list here.

What does it take to write a book like this?

Well, the amount of work involved is incredible. It’s 540 pages and has 275 figures. If you are a researcher, then it’s something like writing 50 NeurIPS papers in a row without a break. But it’s harder than that because you have to carefully introduce each idea in turn so that there are no forward references, and you have to make sure that the notation is consistent throughout, even though it covers many disparate topics. In the end, it’s like solving a giant jigsaw puzzle.

In addition, you need a lot of different technical skills to succeed. Obviously, you need the scientific chops to understand the material properly. But you need to be able to write well, have empathy for what the reader will understand at any given point, have programming skills, have design skills, and so on. I have to say, I’m never critical of other textbooks now because I admire anyone who has the stamina to make it to the finish line. It’s really not easy.

This is the second book that you have written. Did your approach change the second time around?

The first book (Computer Vision: Models, Learning and Inference) was a deliberate attempt to steer the field of computer vision in a certain direction by unifying a lot of contemporary models using a probabilistic formulation. It was a neat idea, but it came out in June 2012, and the AlexNet paper came out later that year, which, of course, steered the field in a completely different direction.

This time around, I had less of an agenda. I’m just trying to explain how deep learning works as efficiently and clearly as possible. One thing that is different this time is that the audience is much broader. The last book was squarely targeted at machine learning people, but “Understanding Deep Learning” is designed to be useful to economists, biologists, climate scientists, or any other group who might be interested in this technology. I still think that machine learning people will find it interesting, though, as I frequently describe ideas in a different way from how they are usually expressed, and the latter half of the book covers quite cutting-edge topics like diffusion models. I learned a lot writing it, and so I expect that people will learn a lot reading it, even if they have a good knowledge of deep learning.

Are books still relevant, given how much information is available online?

I actually think that the act of curating material is now more important than ever. The glut of information is becoming a real problem. There are more than 4000 machine learning papers published every month, and obviously, it’s impossible to read them all. Moreover, there are a lot of very low-quality articles online that just confuse matters.

So I really try to make every sentence count, to tell the reader what they need to know as efficiently and compactly as possible, and to avoid discussion of interesting but fundamentally unnecessary topics. Neural networks are introduced after only eight pages of technical discussion. This is in stark contrast to other books, where it can be hundreds of pages before the reader learns what a deep neural network is.

Whether the medium of a printed book is the best way to disseminate this kind of curated information is another question. The PDF is freely available online, and you can also buy a Kindle version if you prefer the electronic route. But there are still a lot of people who like to own a printed book.

Also, I hope it will ultimately form a historical document of where our understanding of deep neural networks was in 2023. This might be interesting in the future because, contrary to the title of the book (which is intended as a joke), we don’t currently ‘understand deep learning’ very well.

How do you choose which topics to include?

There is obviously a lot of core material that must be in there, but there also exists a wide range of extra topics that are interesting but not crucial. Some topics, like recurrent neural networks, were omitted simply because, extrapolating into the future, I’m not sure they will be that important by the time the book hits peak readership. It seems like they are being supplanted by transformers. Other topics, like neural differential equations and energy models, were omitted because they need a lot of extra background mathematics, which would make the book harder to read.

Most of my regrets about the first book are about things that I included but could have omitted. So, I’m less worried about omitting things this time. I plan to write a second edition, so I can always add them later; it’s harder to remove things. People are welcome to contact me if they have strong opinions about topics that should be addressed in the next edition.

I notice that there is a chapter on ethics, which is unusual in this kind of book. How did that come about?

I take the ethical aspects of AI extremely seriously, but I was pretty intimidated by the idea of writing a chapter on ethics. It’s really difficult to get the tone correct, and I didn’t feel I would be as competent at writing this as I was at the more scientific parts.

However, my editor at MIT Press really encouraged me to expand on this topic. In the end, the answer was to write it jointly with a domain expert. Travis LaCroix from Dalhousie University helped me with this chapter. He is a philosopher by trade; essentially, he wrote the chapter, and I worked hard on translating it so that it is accessible to engineers and made sure that it integrates well with the other ideas in the book.

It covers a lot of what you would expect: alignment, misuse, explainability, bias, etc. But it also encourages engineers to be critical of the idea that science can be value-free and to take responsibility for the way that they develop and communicate about AI algorithms. I should add to this that one of the pleasures of working at Borealis AI was that the ethical aspects of machine learning were always taken extremely seriously. I never felt at all compromised in this regard, and this is a great credit to Borealis AI, given that there are many companies out there that behave less scrupulously.

The Understanding Deep Learning textbook.

The PDF of the forthcoming book and the associated Python notebooks can be found at the main book website.
A physical copy of the book can be ordered via the main MIT Press page.
Instructors considering teaching from the book can request an exam or desk copy.

Meet the Author

Simon J. D. Prince is an Honorary Professor of Computer Science at the University of Bath and author of Computer Vision: Models, Learning and Inference. A research scientist specializing in artificial intelligence and deep learning, he has led teams of research scientists in academia and industry at Anthropics Technologies Ltd, Borealis AI, and elsewhere.
