This is a talk explaining the 7 Principles for Agile Analytics and showing their application in a case study. I was invited to give this talk at Predictive Analytics World 2015 in London on October 28th 2015. My talk covered how the 7 Guerrilla Analytics Principles are the foundation for doing Agile Data Science. With a Data […]
Are you a data scientist working on a project with constantly changing requirements, flawed and changing data, and other disruptions? Guerrilla Analytics can help.
The key to a high performing Guerrilla Analytics team is its ability to recognise common data preparation patterns and quickly implement them in flexible, defensive data sets.
After this webinar, you’ll be able to get your team off the ground fast and begin demonstrating value to your stakeholders.
You will learn about:
* Guerrilla Analytics: a brief introduction to what it is and why you need it for your agile data science ambitions
* Data Science Patterns: what they are and how they enable agile data science
* Case study: a walkthrough of some common patterns in use in real projects
Data Science is ‘defensive’ if it can withstand the disruptions of changing data and requirements while still producing repeatable, explainable insights. Put another way, Defensive Data Science maintains data provenance. Fortunately, the Guerrilla Analytics Principles make it easy to do defensive Data Science. This blog post describes how.
The danger of bias hasn’t been given enough consideration in Data Science. Bias is anything that would cause us to skew our conclusions and not treat results and evidence objectively. Bias is sometimes unavoidable, sometimes accidental and unfortunately sometimes deliberate. While bias is well recognised as a danger in mainstream science, I think Data Science could benefit from improving in this area. In this post I categorise the types of bias encountered in typical Data Science work. I have gathered these from recent blog posts and a discussion in my PhD thesis. I also show how to reduce bias using some of the principles you can learn about in Guerrilla Analytics: A Practical Approach to Working with Data.
I recently read a Harvard Business Review (HBR) article, “You need an algorithm, not a Data Scientist”. Other articles present similar arguments. I disagree. Data Scientists and automation (data products, algorithms, production code, whatever) are complementary functions. What you actually need is a Data Scientist and then an algorithm.
McKinsey recently published an excellent guide to Machine Learning for Executives. In this post I categorise the key points that stood out from the perspective of establishing machine learning in an organisation. The key takeaway for me was that without leadership from the C Suite, machine learning will be limited to being a small part of existing operational processes.
Several topical questions were recently asked on Data Science Central. This post addresses the question “What best practices do you recommend, when starting and working on enterprise analytics projects?” I have worked as a Data Scientist for 8 years now. This was after completing a PhD on “Design of Experiments for Tuning Optimisation Algorithms”. So I have a formal background in rigorous experiment design for Data Science and have also managed some pretty complex and fast-paced projects in sectors including Financial Services, IT, Insurance, Government and Audit.
A while back I announced an early release of similarity on GitHub in a blog post. Similarity wraps SQL Server functions around the SimMetrics approximate string matching library, making the library’s functions available in SQL Server. Version 1.1.0 has now been released and is available on GitHub. Version 1.1.0 sees several improvements aimed at making the library easier to install and use and making it easier for others to contribute.
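To give a flavour of what approximate string matching means in practice, here is a minimal Python sketch of one common metric, a Levenshtein edit distance normalised to a score between 0 and 1. This is an illustration only: the function names `levenshtein` and `similarity_score` are hypothetical and are not the API of Similarity or of SimMetrics, and the normalisation shown is one common convention, assumed here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity_score(a: str, b: str) -> float:
    """Hypothetical helper: 1.0 for identical strings, approaching 0.0
    as the strings diverge (assumed normalisation by the longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(round(similarity_score("kitten", "sitting"), 3))
```

A score like this lets a query rank or filter candidate matches by closeness rather than relying on exact equality, which is useful when names and addresses arrive with typos or inconsistent formatting.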
In many cases, Data Scientists struggle on projects for reasons that have nothing to do with the technical complexity of problems or any lack of Data Science skills. They have those skills from their study and training, and they are motivated people who are passionate about their field. In fact, what makes Data Science difficult for many is the complexity of operating in a Data Science project environment.
I designed the principles to help avoid the chaos introduced by the dynamics, complexity and constraints of data projects. You will find the principles helpful if you work in Data Science, Data Mining, Statistical Analysis, Machine Learning or any field that uses these techniques.
The Guerrilla Analytics Principles have been applied successfully to many high profile and high pressure projects in domains including Financial Services, Identity and Access Management, Audit, Fraud, Customer Analytics and Forensics.