Bookshelf for a Data Scientist

As a data scientist, I read books daily, and most of my skills come from reading. I also read many kinds of books beyond data science tools and technology.

Here are some of the books I recommend across different areas.

 


Methodology

When doing Data Science, many people focus only on tools and technology. Unfortunately, I am afraid that focusing on technology alone will never lead to a successful implementation. So the first book I recommend is about the methodology of data analytics.

 

Behind Every Good Decision

Authors: Piyanka Jain & Puneet Sharma

ISBN-13: 978-0814449219

ISBN-10: 0814449212

 

Piyanka Jain and Puneet Sharma wrote this book several years ago, but it is still a must-have because it covers the whole process of data analytics.

 

NOTE: this is the only book on my bookshelf that I have in both English and a Traditional Chinese translation.

Python Programming

There are many different Python programming books on the market. I would like to suggest two of them:

  1. Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming
  2. Data Science from Scratch: First Principles with Python

Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming

Author: Eric Matthes

ISBN-13: 978-1593279288

ISBN-10: 1593279280

 

This is the best book-based introduction to Python that I know. You can learn Python by following the flow of the book and working through its small projects.

 

NOTE: you can find a free PDF copy on the Internet.

 

Data Science from Scratch: First Principles with Python

Author: Joel Grus

ISBN 13: 978-1492041139

ISBN 10: 1492041130

 

This book can really help readers master tools such as libraries, frameworks, modules, and toolkits. More importantly, the author discusses the ideas and principles underlying them.

R Programming

There are quite a number of books on R programming. I have picked a university textbook because of the guidance on problem solving that it offers.

Modern Data Science with R (Chapman & Hall/CRC Texts in Statistical Science) (1st edition)

Authors: Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton

ISBN 13: 978-1498724487

ISBN 10: 1498724485

 

This one is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world problems with data. For me, the most valuable part of this book is its key concepts for solving problems with data.

 

Further, this book uses RStudio (the most popular IDE for R) to tie statistical concepts together for students.

Statistics

Most of the time, applying statistics in Data Science only requires basic knowledge and concepts. So, I will share an introductory book on statistics.

Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python (2nd ed.)

Authors: Peter Bruce, Andrew Bruce, Peter Gedeck

ISBN 13: 978-1492072942

ISBN 10: 149207294X

 

The authors give an ABC-level introduction to statistics that helps people without formal statistical training jump into the Data Science world.

Computer Science Knowledge

Some data analysts and data scientists have never had any formal computing training. There are books that introduce Computer Science fundamentals, but they are updated frequently and cost comparatively more.

So, I would like to suggest an online Computer Science introduction, published nearly 10 years ago, at the URL below:

https://cnx.org/contents/nyJOW0xZ@1.1:ViZ7rttC@1/Basic-concepts

The above is still a very good introduction. Also, it is important for business managers as well as technical people to learn more about cloud computing concepts.

Cloud Computing: A Comprehensive Guide to Cloud Computing

Author: Austin Young

ISBN-13: 978-1086039504

ISBN-10: 1086039505

 

This book only gives you a basic introduction. It is valuable for management, such as a CIO or business owner, or for a data scientist looking for a cloud service that fits their solution. If you are interested in architecting a cloud solution, this is not the book for you.

Machine Learning / AI

In recent years, many AI experts have suddenly appeared in the job market. Personally, I just use Python and TensorFlow for some machine learning work, such as commodity price prediction and deep learning for face recognition and natural language processing. Here is what I have read.

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd ed.)

Author: Aurélien Géron

ISBN 13: 978-1492032649

ISBN 10: 1492032646

 

This book relies on Scikit-Learn and TensorFlow to teach machine learning, with a number of examples that demonstrate how machine learning is applied.

Business Management (Data Science)

The business world places many different, often unique, demands on data and data science. My bookshelf only holds books directly related to the demands of my own job, such as my clients’ industries and general business needs. So, this selection may not be useful for everyone.

Marketing Data Science: Modeling Techniques in Predictive Analytics with R and Python

Author: Thomas Miller

ISBN 13: 978-0133886559

ISBN 10: 0133886557

 

In my experience applying data science, Marketing is one of the areas where investment in Data Science is comparatively easy to justify, because of its direct impact on sales revenue. This book is very good at telling you what to do; its only drawback is that the coding examples are a bit old.

 

Thus, I also recommend that businesses start their Data Science journey in Marketing. Another good starting point is performance monitoring aimed at cost savings.

 

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking

Authors: Foster Provost, Tom Fawcett

ISBN 13: 978-1449361327

ISBN 10: 1449361323

 

Although it is a bit old, that doesn’t reduce the value of this book much. It aims to show how to fit data science into your organization.

 

 

Storytelling with Data

Author: Cole Nussbaumer Knaflic

ISBN-13: 978-1119002253

ISBN-10: 1119002257

 

Data Visualization matters a great deal because top management constantly demands beautiful charts and dashboards. Unfortunately, many people spend a lot of time producing fancy visuals that deliver little value.

 

This book should provide answers on balancing layout and content, along with the principles behind the visualization. It is important for data scientists to explain the data and the insights drawn from the analysis.

 

Data Driven Business Transformation: How to Disrupt, Innovate and Stay Ahead of the Competition (1st ed.)

Authors: Peter Jackson, Caroline Carruthers

ISBN 13: 978-1119543152

ISBN 10: 1119543150

 

Businesses face huge competition from the digital world and globalization. To achieve higher efficiency, it is vital to use data science to help drive the business through performance monitoring, analytics, predictions, etc.

 

When your competitor is using data as a weapon, you should not wait. Apart from the information provided in the book, I would like to add that data analytics is not as expensive as you might think. I use open source tools to help different businesses solve their problems. Sometimes, I just use Excel to do some marketing analytics.

Conclusion

In a nutshell, I am sharing my book collection because I would like to see the general quality of Data Science professionals improve. Most people focus only on the tools in use, without proper, in-depth knowledge of the principles behind them. Fundamentals are always important in any profession. Learning remains my daily assignment throughout my journey in data analytics.
