It’s no surprise, then, that we’ve seen a blossoming of books, courses, and entire educational programs aimed specifically at training data scientists.
But there are many people, myself included, who like to do part or all of their learning from books. Being able to re-read important sections, pause to think over a problem, and circle back to earlier chapters makes for a very effective way to climb the learning curve.
Below is advice on learning data science from books, along with some of the best books on data science. This is a mix of my own personal experience and research I’ve done on the most popular titles.
Can You Learn Data Science from Books?
First, to address a question some people might have: is data science the sort of thing that can be learned from a book?
The answer is yes, but you have to make sure you’re applying the lessons you learn. Like coding, data science is an active process. You won’t get very far learning to code by simply reading words on a page. You have to type code, discover you’ve made an error, and start hunting down the bug. You have to realize that the language has shifted just a bit since the book was published and head to Stack Overflow to see if you can figure out how to proceed.
This is the only way to make coding a living skill, and the same applies to data science.
The Best Books for Learning Data Science
Before we get to specific recommendations, it’s important to determine your focus. Are you looking for real-world applications and case studies or for something more academic? Are you looking for an introduction to data science, or are you looking for something more advanced? There are many focus topics within data science, like machine learning, deep learning, neural networks, data visualization, and business intelligence.
There are also books with a programming language focus, such as those that specifically teach machine learning with Python. It’s important to know what you’re looking for when searching for a book.
This list could be multiplied 50 times and still not cover every worthy title, but here are some of my favorite recommendations for books covering data science and machine learning, with a focus on good introductions:
- Think Stats, Allen B. Downey. I went back and forth on whether to include a text devoted exclusively to probability and statistics on this list, but there’s almost nothing more foundational to data science. Downey’s book does a great job of covering just what an aspiring analyst needs to know.
- Data Science From Scratch, Joel Grus. When you’re starting from the beginning, you need a book that does too. Take your time here to build a good foundation.
- An Introduction to Statistical Learning, Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. One of the foundational texts for data science. You’ll be hard-pressed to find a more concise yet thorough overview of the major data science algorithms.
- Python for Data Analysis, Wes McKinney. This book will teach you data science through Python, NumPy, and pandas, three of the most important tools you can learn. If you get a job in data science, you will almost certainly be using them on a regular basis.
- Understanding Machine Learning, Shai Shalev-Shwartz and Shai Ben-David. Machine learning is all the rage these days, but there’s a lot of nonsense mixed in with genuinely exciting advancements. Whether you do ML or not, you need to understand the math behind it.
This list should keep you busy for a good while!