Python for Data Analysis
452 pages

A book about data analysis with Python using the popular Pandas library (the de facto standard for data wrangling), written by the creator of Pandas himself. Or, as I like to call it: The Pandas Book.

First off, don't get me wrong: the 3-star rating doesn't mean this is not a good book. It just wasn't written in the style I personally would have preferred.

Pros:

  • Very extensive coverage of (almost) the complete Pandas API. I feel like I have seen (and tried) all major Pandas features now.
  • Many code examples to see features in action.
  • Excellent last chapter where the author goes through real-world data sets and shows how to explore and analyse data using Pandas features.

Cons:

  • The large majority of examples use dummy data (foo, bar, and random numbers). While this shows the technical interface, in many cases it didn't help me grasp the potential applications.
  • The structure made the book feel like official API documentation extended with a bit of prose. To be fair, the author made that clear in the preface, but the book had promised me a "hands-on guide (...) packed with practical case studies", and I only found that to be true in the last chapter.

What helped me was having a group of friends to discuss the book with. We read one chapter a week and shared the notebooks in which we played around with Pandas and our own data sets. While I personally prefer a slightly different style of coding book, studying this one has helped me tremendously to become more familiar with Pandas and more confident in using it for my data science projects.