Teaching Materials

Guerrilla Analytics is already in use in firms and universities. The following outlines a suggested course structure for Guerrilla Analytics, as well as complementary courses that can benefit from assigned Guerrilla Analytics reading. Of course, the book can be read in any order. See here for a list of chapters and topics.

I am happy to provide further Guerrilla Analytics teaching materials to support courses and training. Please contact me for more information, or just to let me know you are using Guerrilla Analytics in your courses.

Guerrilla Analytics Course

Data Science needs methodology

There are now many Data Science courses to choose from, both as part of university degrees and as private sector training. Data Science is a practitioner’s field, so courses need to incorporate some aspects of operations and methodology in addition to the necessary technical skills in machine learning, programming, databases, statistics, experiment design and related fields.

Guerrilla Analytics provides a sound and tested methodology for typical Data Science projects. Projects in industry and research are both dynamic and constrained due to the complex nature of data, changing requirements, limited resources and pressure to quickly deliver value. Guerrilla Analytics provides simple guiding principles and practice tips that help the Data Scientist produce agile, traceable, testable work at every stage of the Data Science workflow from data extraction through to delivery of reports and work products.

Teaching options

As a teacher, you can use Guerrilla Analytics as either a stand-alone module or as reading material for complementary courses.

Stand-alone Module

This is a suggested structure for a Guerrilla Analytics course.

Expected learning outcomes

After successful completion of the course, the student will be expected to:

  • know the operational risks and challenges to delivery of Data Science work
  • know the 7 Guerrilla Analytics Principles
  • apply Guerrilla Analytics Principles at each stage of the Data Science workflow from data extraction through to reporting

Pre-requisites

While the necessary background from other fields is covered in the course, students will get the most benefit from it if they have an understanding of databases, types of data and data flows, a programming language such as SQL or Python, and software engineering principles such as version control, build automation and testing.

Format

The course is best delivered as a combination of lecture material and parallel exercises relating to a case study referenced in the lectures.

Lecture schedule

Lecture | Book chapters | Topics covered

Introduction | 1, 2, 3
  • Challenges a Data Scientist faces
  • Introduction to Guerrilla Analytics
  • Why Data Science is difficult to manage
  • Risks to delivery
  • Introducing the Guerrilla Analytics Principles
  • Introducing the Guerrilla Analytics workflow
  • Introducing the case study

Data Extraction, Receipt and Load | 4, 5, 6
  • Extraction pitfalls and examples
  • Data receipt pitfalls. How to track data received
  • Data loading pitfalls. How to load data, deal with revisions and versions
  • Control totals for data validation

Analytics programming | 7, 8
  • Pitfalls and risks when coding
  • How to structure code files
  • How to structure code
  • Linking code to outputs and source data
  • Version control of code
  • How to manipulate data while preserving data provenance

Creating work products and reporting | 9, 10
  • Pitfalls and risks
  • How to structure and organise work products
  • Version control
  • Reporting: what is a report?
  • Writing reports that link to analytics

Consolidation | 11
  • Why consolidate?
  • Data builds
  • Version control of builds
  • Automation of builds
  • Layering and interfaces

Testing | 12, 13, 14, 15
  • What is testing and types of testing
  • Testing data
  • Testing work products
  • Testing builds
  • Automating tests

Round up
  • Review of Guerrilla Analytics and the 7 Principles
  • Review of the case study through each stage:
    • data extraction
    • data receipt and loading
    • analytics coding
    • consolidation with a build
    • testing
    • reporting

Complementary courses

Guerrilla Analytics draws on many lessons from Agile Software Development and well-established Software Engineering Practices. The following types of courses would benefit from assigned Guerrilla Analytics reading.

Group projects: Students are required to work together to deliver testable, explainable, reproducible data science work including a report.
Machine learning / Statistics / Data Mining: Students are developing and running a variety of algorithm versions on a variety of data sets and want to methodically track versions of that work as it evolves.
Data Science: Any Data Science course should make reference to proper methodology, the importance of testing and the importance of data provenance.
Data Science Project Management: Any course that emphasises how to organise and structure Data Science teams and activities to successfully deliver and to mitigate the risks arising from Data Science complexity.
Agile Analytics: Any course that emphasises agile methods for delivery and management of Data Science.
Research Skills: Any course in a data-related field that covers reproducibility, data provenance, testing and other best practice for scientific research.