Data Science Reading List – Basics

[vc_row][/vc_row][vc_column][title]Data Science Reading List – Basics[/title][vc_empty_space][content_box]These are several of my favourite go-to books from my recommended Data Science Basics reading list.

What are the first things you should learn to become a data scientist? What are the technical skills that will help you get started? Where do you start?

As with any profession, there are some core skills you need before you can really excel. For practical Data Science, it’s less about advanced machine learning and more about the tools and skills that make your work scientific.[/content_box][/vc_column][vc_empty_space]

[vc_custom_heading text=”Guerrilla Analytics: A Practical Approach to Working with Data” use_theme_fonts=”yes” link=”url:http%3A%2F%2Famzn.to%2F2iy5TD2||target:%20_blank|rel:nofollow”][/vc_column][/vc_row][vc_row][vc_column width=”1/3″][vc_single_image source=”external_link” onclick=”custom_link” img_link_target=”_blank” custom_src=”https://images-na.ssl-images-amazon.com/images/I/41FpP06ENSL._SX331_BO1,204,203,200_.jpg” link=”http://amzn.to/2iy5TD2″][/vc_column][vc_column width=”2/3″][vc_column_text]Data Science work gets complex very quickly. Changing data, changing requirements, changing understanding of the problem. That’s when you need Guerrilla Analytics.

Learn how to organise your projects (data, code, deliverables, testing, processes and team) so you are robust to all the disruptions of high-pressure Data Science projects. Knowing Guerrilla Analytics will make sure you can focus on adding value rather than struggling to understand and repeat your own work.[/vc_column_text][/vc_column][/vc_row][vc_row][vc_column]

 

[vc_custom_heading text=”Pro Git (Expert’s Voice in Software Development)” use_theme_fonts=”yes” link=”url:http%3A%2F%2Famzn.to%2F2ilJM2W||target:%20_blank|rel:nofollow”][vc_row_inner][vc_column_inner width=”1/3″][vc_single_image source=”external_link” onclick=”custom_link” img_link_target=”_blank” custom_src=”https://images-na.ssl-images-amazon.com/images/I/41%2BcySa9E4L._SX376_BO1,204,203,200_.jpg” link=”http://amzn.to/2ilJM2W”][/vc_column_inner][vc_column_inner width=”2/3″][vc_column_text]It’s a fact of Guerrilla Analytics life. A significant amount of project chaos will disappear if you have some form of version control.

You don’t need to become an enterprise-class DevOps practitioner. You do need to know about versioning, tagging, reverting and other common version control activities. This is the go-to Git reference. It covers everything you need to know about Git and is written from a Git perspective.[/vc_column_text][/vc_column_inner][/vc_row_inner]

[vc_custom_heading text=”Data Science at the Command Line: Facing the Future with Time-Tested Tools” use_theme_fonts=”yes” link=”url:http%3A%2F%2Famzn.to%2F2ilEkx9||target:%20_blank|rel:nofollow”][vc_row_inner][vc_column_inner width=”1/3″][vc_single_image source=”external_link” onclick=”custom_link” img_link_target=”_blank” custom_src=”https://images-na.ssl-images-amazon.com/images/I/51dg4631p1L._SX376_BO1,204,203,200_.jpg” link=”http://amzn.to/2ilEkx9″][/vc_column_inner][vc_column_inner width=”2/3″][vc_column_text]To be a true Guerrilla Analyst, you need to be comfortable at the command line. It’s the only way to quickly peek at, summarise, clean and join up the wide variety of data files that you are likely to encounter. It’s also the best way to automate your work for efficiency and reproducibility.

This book will teach you all the tools and tricks you need to get around the most awkward and broken data files that come your way. You’ll learn about chunking files, patching them together, sorting, editing and modifying in ways you probably thought possible only in a ‘real’ analytics environment.[/vc_column_text][/vc_column_inner][/vc_row_inner]

[vc_custom_heading text=”Data Smart: Using Data Science to Transform Information into Insight” use_theme_fonts=”yes” link=”url:http%3A%2F%2Famzn.to%2F2hCIgw8||target:%20_blank|rel:nofollow”][vc_row_inner][vc_column_inner width=”1/3″][vc_single_image source=”external_link” onclick=”custom_link” img_link_target=”_blank” custom_src=”https://images-na.ssl-images-amazon.com/images/I/51HwBZNlD7L._SX396_BO1,204,203,200_.jpg” link=”http://amzn.to/2ilQHcJ”][/vc_column_inner][vc_column_inner width=”2/3″][vc_column_text]A great introductory book written in a fun and entertaining style and based around analytics done in spreadsheets. Spreadsheets mean trouble for the Guerrilla Analyst but from a beginner’s perspective they are a familiar way to dip a toe in the water.

Sometimes a spreadsheet is the quickest way to get a feel for your data and this book might open your eyes to how much is possible in ubiquitous desktop software.[/vc_column_text][/vc_column_inner][/vc_row_inner]

 

[vc_custom_heading text=”Bad Data Handbook” use_theme_fonts=”yes” link=”url:http%3A%2F%2Famzn.to%2F2hCBFlL||target:%20_blank|rel:nofollow”][vc_row_inner][vc_column_inner width=”1/3″][vc_single_image source=”external_link” onclick=”custom_link” custom_src=”https://images-na.ssl-images-amazon.com/images/I/41Qvl4YJF8L._SX377_BO1,204,203,200_.jpg” link=”http://amzn.to/2hCBFlL”][/vc_column_inner][vc_column_inner width=”2/3″][vc_column_text]If you are going to work with data then you really need to understand the many ways it can be flawed. This book is a fun and comprehensive treatment of the flaws to expect and how to detect them in a huge variety of data types. I especially liked the chapter ‘Data Quality Demystified’ which was the foundation for the categorisation of data tests in Guerrilla Analytics: A Practical Approach to Working with Data. You may not have time to implement everything in this book but it never hurts to be aware of problems lurking in your data and what may be causing those strange and unexpected numbers in your report.[/vc_column_text][/vc_column_inner][/vc_row_inner][vc_row][vc_column][/vc_column][/vc_row]