A Definition of Data Science for Business

Vague mixes of skill sets. A focus on activities and technology. Bizarre Venn diagrams. There is huge confusion over what Data Science is. Is it Big Data? Isn’t it statistics? Is it something else entirely? This confusion leads to vendor and recruiter hype. It leads to inflated career expectations. It leads to the rebranding of solid, established and much-needed fields like Analytics, Business Intelligence and Statistics. The secret to defining data science is to focus on the science.

Wouldn’t it be better if you could clearly state what you do as a Data Scientist? You probably agree your work life would be easier if your colleagues and customers could understand what you do.

  • A biologist wouldn’t say they are a biologist because they work with petri dishes; they would say it is because they run experiments to understand life. However, some Data Science definitions focus on the use of tools like Hadoop.
  • A physicist wouldn’t say they are a physicist because they run simulations of their models; they would say it is because they seek to understand matter. However, some Data Science definitions focus on activities like modelling, data cleaning and visualization.
  • All these sciences use statistics to design their experiments and test their hypotheses. Yet some Data Science definitions focus on overlaps of statistics with computer science and unicorns.

A Definition of Data Science

The secret to defining data science is to focus on the science. Here is a simple definition of Data Science:

Data Science is the application of the scientific method to find opportunities and efficiencies in business data.

There are a few things to note about this definition:

  • It’s technology agnostic. It’s not about Big Data, Hadoop or whatever the next technology breakthrough might be.
  • It’s applied to finding opportunities and efficiencies in data. It’s not the study of data – that’s statistics.
  • It’s not about activities that may be part of the lifecycle of working with data.
  • It’s applied to the data that describes a business’s processes, just like the data a natural scientist collects to understand a natural process.
  • Most importantly, it uses the scientific method: “systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses” [1].
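To make the “formulation, testing, and modification of hypotheses” part concrete, here is a minimal sketch of testing one business hypothesis in Python. Everything in it is an assumption made for illustration: the daily order counts are invented, the function name permutation_test is hypothetical, and a permutation test is only one of many ways to test a hypothesis. The hypothesis being tested is “the new checkout page increases daily orders”.

```python
import random
from statistics import mean


def permutation_test(control, treatment, n_permutations=10_000, seed=0):
    """Return the observed difference in means and a one-sided p-value.

    Null hypothesis: the group labels do not matter, so shuffling them
    should produce differences as large as the observed one fairly often.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical daily order counts (invented for illustration only).
control = [102, 98, 110, 95, 101, 99, 104]        # old checkout page
treatment = [108, 112, 105, 117, 109, 111, 106]   # new checkout page

observed, p_value = permutation_test(control, treatment)
print(f"Observed lift in mean daily orders: {observed:.2f}")
print(f"One-sided p-value: {p_value:.4f}")
```

A small p-value suggests the observed lift would be unusual if the page made no difference. A large one sends you back to modify the hypothesis, which is the loop the definition describes.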

The application of the scientific method is central to data science and something I want to come back to in a more detailed post.


[1] https://en.oxforddictionaries.com/definition/scientific_method
