1. Machine Learning: A Probabilistic Perspective by Kevin Murphy. As far as overall comprehensiveness goes, this is one of the absolute best reference texts available. It gives each machine learning algorithm a balanced treatment from both Bayesian and frequentist perspectives, and it covers a lot of topics, like MCMC, that many textbooks gloss over. That said, explanations can sometimes be terse and there are occasional typos/errors. This book is great for getting an overview of many different methods, then picking and choosing which ones you want to look into more closely.
2. Deep Learning by Goodfellow, Bengio, and Courville. The canonical deep learning textbook, this is a great resource for diving into deep learning methods and learning about different types of neural networks. Again, it's best as an overview supplemented with additional reading on topics of interest.
3. Convex Optimization by Boyd and Vandenberghe. This book will introduce you to the huge subfield of convex optimization. There's a ton to learn here, but make sure your linear algebra and calculus are up to snuff before you dive in.
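For a taste of what the book builds toward, here's a minimal sketch of gradient descent on a simple convex function. The function f(x) = (x - 3)^2, the step size, and the iteration count are illustrative choices, not taken from the book:

```python
# Gradient descent on the convex function f(x) = (x - 3)^2.
# For a fixed step size small enough, the iterates converge to the
# unique minimizer x = 3 (convexity guarantees there are no other minima).

def grad_f(x):
    """Gradient of f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)

def gradient_descent(x0, lr=0.1, steps=100):
    """Run fixed-step gradient descent starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

x_star = gradient_descent(x0=0.0)  # converges to the minimizer x = 3
```

The convergence guarantee here is exactly the kind of result the book makes precise: for convex functions, any local minimum is global, so simple first-order methods provably work.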
4. Pattern Recognition and Machine Learning by Christopher M. Bishop. Very similar in spirit to MLAPP above and another classic of the field. Tackles classic machine learning methods from a primarily Bayesian perspective. Very clear explanations of concepts.
5. Gaussian Processes for Machine Learning by Rasmussen and Williams. The ONLY book you'll ever need for Gaussian Processes. Super detailed and exceedingly well written. A great book all around.
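To make the topic concrete, here's a minimal sketch of GP regression in one dimension with a single noisy training point, using a squared-exponential (RBF) kernel. The kernel, lengthscale, noise level, and data are illustrative choices, not taken from the book:

```python
import math

# 1D GP posterior with one training point (x_train, y_train).
# With a single observation, the usual matrix formulas reduce to scalars:
#   mean(x) = k(x, x_train) / (k(x_train, x_train) + noise) * y_train
#   var(x)  = k(x, x) - k(x, x_train)^2 / (k(x_train, x_train) + noise)

def rbf(x, xp, lengthscale=1.0):
    """Squared-exponential (RBF) kernel."""
    return math.exp(-0.5 * ((x - xp) / lengthscale) ** 2)

def gp_posterior(x, x_train, y_train, noise_var=0.01):
    """Posterior mean and variance at x given one noisy observation."""
    k_tt = rbf(x_train, x_train) + noise_var  # (K + sigma^2 I), 1x1 case
    k_star = rbf(x, x_train)
    mean = k_star / k_tt * y_train
    var = rbf(x, x) - k_star ** 2 / k_tt
    return mean, var

mean, var = gp_posterior(0.0, x_train=0.0, y_train=1.0)
```

At the training point itself the posterior mean sits close to the observed value and the variance collapses toward the noise level, which is the defining behavior of GP regression.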
6. All of Statistics by Larry Wasserman. A great book for understanding the statistics that underlies machine learning methods. I can't recommend it highly enough.
7. Monte Carlo Statistical Methods by Robert and Casella. Not many people have heard of this one, but it's absolutely fantastic. Be warned, though: it's heavy on the math and a tough read as a result. That said, this book will teach you everything you ever wanted to know about MCMC and related methods and then some.
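To see the core idea the book develops in depth, here's a minimal random-walk Metropolis sampler targeting a standard normal. The proposal scale, chain length, and seed are illustrative choices, not taken from the book:

```python
import math
import random

# Random-walk Metropolis: propose a Gaussian step, then accept it with
# probability min(1, p(proposal) / p(current)). Only the unnormalized
# log-density of the target is needed.

def log_target(x):
    """Unnormalized log-density of N(0, 1)."""
    return -0.5 * x * x

def metropolis(n_samples, step=1.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Accept/reject via log-uniform comparison for numerical stability.
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(20000)
chain_mean = sum(samples) / len(samples)
```

The chain's empirical mean and variance approach those of the target (0 and 1) as the chain grows, though consecutive samples are correlated, which is one of the issues the book analyzes carefully.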
8. Probabilistic Graphical Models: Principles and Techniques by Koller and Friedman. Daphne Koller is one of the definitive authorities on PGMs, and together with Nir Friedman she has written the canonical textbook on the subject.
9. Numerical Optimization by Nocedal and Wright. Similar to the convex optimization book by Boyd and Vandenberghe, but with more of a focus on numerical methods and less on convexity. Covers algorithms like L-BFGS and Conjugate Gradients. Not an easy read, but well worth it in the end.
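Since the entry mentions the conjugate gradient method, here's a minimal sketch of CG solving a symmetric positive-definite linear system. The 2x2 example matrix is an illustrative choice, not taken from the book:

```python
# Conjugate gradient for Ax = b with A symmetric positive-definite.
# For an n x n system, CG reaches the exact solution in at most n steps
# (in exact arithmetic), using only matrix-vector products.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - Ax, with x = 0 initially
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)          # exact line search along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new < tol:
            break
        # New direction is A-conjugate to all previous ones.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)  # exact solution is (1/11, 7/11)
```

L-BFGS, the other method named above, builds on similar ideas but approximates curvature from gradient history instead of requiring the matrix explicitly.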
10. Statistical Inference by Casella and Berger. Similar in scope to All of Statistics but with a complementary perspective. Another great choice.
Of course, this is not an exhaustive list, but it's certainly more than enough to give any student a solid grounding in the field.