Chip Huyen – Designing Machine Learning Systems

388 pages

A book about everything that goes into building ML systems for production, with a strong focus on everything other than building the actual model. This book was great, as it covers the topics typically ignored or glossed over in most machine learning books: data engineering, data collection, deployment, model failures, tooling, and team structures.

Read as part of our weekly Data Science study group.

florian · 15 December 2022 · read

Chip Huyen – Machine Learning Interviews Book

200 pages

Not a printed "book", but free online material packaged up like a book. Part 1 is a summary of the ML interview process. Part 2 is a collection of practice exercises.

Being in a Machine Learning role myself, this was an interesting read to see a comprehensive summary of current topics, interview process, and role descriptions. I also got some inspiration for interviewing new candidates for my own team.

florian · 11 March 2022 · read

Peter Bruce, Andrew Bruce & Peter Gedeck – Practical Statistics for Data Scientists

342 pages

Another book that we worked through in our "Data Science Study Group". The original motivation: after reading it, I would finally be a statistics pro! That didn't quite work out 100%. Still, I learned quite a bit again.

The authors give a sweeping overview of the statistical fundamentals, but from the practical perspective of data scientists.

Good: lots of code examples (in R and in Python) and a broad selection of topics. Less good: some examples were based on completely unsuitable sample data sets, and some explanations remained vague (because the authors deliberately avoided formulas for the most part).

Overall: a good foundational text and also a good reference work.

florian · 10 October 2021 · read

Ted Malaska & Shivnath Babu – Rebuilding Reliable Data Pipelines Through Modern Tools

97 pages

A compact overview of "data engineering": what does it mean to build a "data pipeline", and what do you need to watch out for? Interesting in principle, but unfortunately the author stays very vague and gives few real examples. In my opinion it is too abstract for developers, yet it assumes too much technical understanding for "managers". I don't really understand who the target audience is; I, at least, was not part of it. Oh well, it was a free ebook, and for that it was okay.

florian · 30 September 2021 · read

Jeremy Howard & Sylvain Gugger – Deep Learning for Coders with fastai and PyTorch

624 pages

florian · 7 March 2021 · read

Valliappa Lakshmanan, Sara Robinson & Michael Munn – Machine Learning Design Patterns

408 pages

This was the book I needed to take me one step further: From just knowing "how to train a neural network" to a better understanding of "MLOps", including training workflows, aspects of scalable serving, and reproducibility.

The three authors are employed at Google and it shows in many chapters: The example of choice is always a Google Cloud AI offering or a TensorFlow code snippet. They do make an effort to also mention competitor products and open source alternatives. Because their insight from Google provided them with this wide range of best practices, I won't hold any of this against the book.

The book isn't without its flaws, though. This (recent) first edition has a number of distracting errors (such as misleading numbers in figures and weird code indentation), plus the greyscale print makes many of the figures hard to read. That cost the book its fifth star. A second edition will probably earn it back once these issues are ironed out.

I for one will keep this book on my shelf for future reference. It's a great collection of best practices to move a team and an organization ahead in terms of "AI readiness".

florian · 15 January 2020 · read

François Chollet – Deep Learning with Python

384 pages

A book about deep learning that really caters to my preferred learning style: It covers a lot of real-world applications (text analysis, sentiment analysis, vision, ...) and provides clear and practical code examples that invite you to try for yourself. Ultimately, trying it out and building something yourself is the way to really grasp the concepts, I think, and this book does a really good job at it.

While François Chollet does give some introduction in the beginning, it may be too little for the complete beginner. For anyone starting at a slightly-above-beginner to intermediate level, I'd wholeheartedly recommend this book to learn Deep Learning with Python.

florian · 2 November 2020 · read

Wes McKinney – Python for Data Analysis

452 pages

A book about data analysis with Python using the popular Pandas library (the de facto standard for data wrangling), written by the creator of Pandas himself. Or as I like to call it: The Pandas Book.

First off, don't get me wrong: The 3-star rating doesn't mean this is not a good book. It just wasn't written in a style that I would have personally preferred.

Pros:

Very extensive coverage of (almost) the complete Pandas API. I feel like I have seen (and tried) all major Pandas features now.
Many code examples to see features in action.
Excellent last chapter where the author goes through real-world data sets and shows how to explore and analyse data using Pandas features.

Cons:

The large majority of examples use dummy data (foo and bar and random numbers). While this shows the technical interface, in many cases it didn't help me grasp the application potential.
The structure made the book feel like official API documentation extended with a bit of prose. To be fair, the author made that clear in the preface, but the book had promised me a "hands-on guide (...) packed with practical case studies", and I only found that to be true in the last chapter.

What helped me was having a group of friends to discuss the book. We read one chapter a week and shared our notebooks of playing around with Pandas and our own data sets. While I personally prefer a slightly different style of coding books, studying this one has helped tremendously in becoming more familiar and confident in using Pandas for my data science projects.

florian · 13 July 2020 · read

Steven S. Skiena – The Data Science Design Manual

445 pages

Read this as part of our "Data Science Study Group" that friends and I have been organising for the past three months. This book lends itself quite well to this kind of format: A broad overview of everything that Data Science entails. However, the book also stays at that high level.

While Steven Skiena goes into detail on some of the algorithms, that level of detail really isn't the focus of the book, and that's okay. Having read it, I now feel like I have a good grasp of the field, but to really cater to my personal learning style, I will have to read something else in addition. I personally learn best when there is practical coding work happening. We used our group discussions to work on some examples ourselves (Kaggle competitions and similar), which added a good amount of depth to the pure textbook.

The book itself can be found as a free download on Springer ebooks, and if you want a broad overview of Data Science, I can recommend it. If you want to be a full data scientist after having read the book, you will need to put in some more practical work yourself.

florian · 7 February 2016 · read

Forrest M. Mims – Getting Started in Electronics

128 pages

128 pages of scribbled notes. Very compact and informal introduction to electronics. No unnecessary stories told. Instead, there is room for 100 simple circuits to try out. And that's what I still have to do in order to really "complete" this book.

florian · 4 February 2016 · read

Erik Bartmann – Die elektronische Welt mit Arduino entdecken

1080 pages

A little silly in places, but if you can get into it, the book is a good, practical introduction to the subject. Good: everything is explained through practical projects. Sometimes not so good: some projects exist purely for their own sake, to illustrate a particular point. That's okay, but projects you can actually put to use are cooler (the book has those too). By the way, I read the first edition, which was "only" 600 pages long. It went surprisingly quickly, though, since with programming experience you can simply skim about a third.
