Applying consulting skills to data science


I first published this article on LinkedIn Pulse, where you can also read it.

I attended a data science meetup last week. The speaker talked about executing data science projects in an enterprise production environment and shared some good advice. One such nugget was to encourage every data scientist to be a good consultant first.

As an ex-management consultant who recently moved into a more deep-dive analytics role, I took stock of the skills I’ve acquired as a consultant. Although the speaker said this in the context of presentation skills and effective communication, I believe a data scientist can derive a lot more benefit from pure-play consulting skills. As my friend Alec Smith puts it in his brilliantly insightful article, the primary skill of a data scientist lies neither in their mastery of machine learning algorithms nor in their knowledge of five different programming languages (it’s a different matter if you know Perl though, because some people believe that every Perl programmer should be an honorary data scientist!). What’s important is their ability to tackle business problems and apply scientific thinking to derive outcomes using data. Compare this to management consultants, who get paid for their ability to quickly break down complex problems and develop effective solutions.

Practicing some of the powerful structured problem-solving and visualisation techniques that management consultants use day in and day out can help data scientists tremendously in achieving their outcomes. Techniques such as:

  • Commercial thinking - I haven’t come across a single consultant who approaches business problems esoterically. The very business model forces you to be commercial and outcome-oriented. Surely you don’t expect your CEO to be excited about the ROC of your latest lasso regression model!
  • Research - Good consultants do their homework. Having this orientation as a data scientist will force you to go into meetings prepared and ready with critical thoughts. You will not be restricted to internal data sources. Rather, you’ll find creative ways to source the information you need to reach your goals.
  • Reproducibility - Consultants can’t churn out flashy presentation packs so quickly without re-using content, context and design. Similarly, data scientists can benefit from investing in reproducible and modular code, analytical pipelines and well-tested processes to turn out repeatable work quickly.
  • Hypothesis trees - A mainstay of any consultant’s problem-solving approach, these have been tremendously helpful to me during data discovery and exploratory data analysis. A hypothesis tree helps me structure my thinking and makes it easier for me to write code that proves or disproves specific hypotheses. You’ve probably seen or used such trees yourself in your career. But if you haven’t, I highly recommend them.
  • MECE - I use this “mutually exclusive & collectively exhaustive” or MECE (pronounced Mee-see) framework almost subconsciously for data analysis, for organising my thoughts and for classifying large tasks/activities. Ensuring MECE in my analysis helps me stretch my thinking as well as cover all angles.
  • Story-boarding - Even the most insightful analysis goes for naught if not communicated properly. In my opinion, data scientists don’t spend enough time structuring the message in their presentations. The result is a complex mesh of numbers and bar graphs in which the key insight sometimes gets lost. Implementing Barbara Minto’s Pyramid Principle in your PowerPoint pack forces you as an analyst to prioritise: put only the top two or three business insights or predictions on your slides, and push the mounds of other analysis you’ve done straight into the appendix.
  • Prioritisation - A data scientist should avoid “boiling the ocean”. Prioritising key insights and shipping something quickly is a key skill for a data scientist. It helps us steer away from analysis-paralysis situations such as IT-to-business loops or low-impact, high-complexity issues.
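
To make the MECE idea a little more concrete, here is a minimal Python sketch of how one might check whether a set of segment definitions is mutually exclusive and collectively exhaustive over a dataset. Everything in it is an assumption made for illustration: the function name `check_mece`, the customer records and the age bands are all hypothetical, and this is just one possible way to run such a check.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, int]
Rule = Callable[[Record], bool]


def check_mece(records: List[Record],
               segments: Dict[str, Rule]) -> Tuple[List[Record], List[Record]]:
    """Return (overlaps, gaps) for a set of segment rules.

    overlaps: records matched by more than one segment (not mutually exclusive)
    gaps:     records matched by no segment (not collectively exhaustive)
    """
    overlaps, gaps = [], []
    for record in records:
        matches = [name for name, rule in segments.items() if rule(record)]
        if len(matches) > 1:
            overlaps.append(record)
        elif not matches:
            gaps.append(record)
    return overlaps, gaps


# Hypothetical customer records, invented purely for illustration.
customers = [
    {"id": 1, "age": 17},
    {"id": 2, "age": 30},
    {"id": 3, "age": 65},
    {"id": 4, "age": 80},
]

# Sloppy age bands: "young" and "middle" both claim age 30,
# and nobody claims age 65.
sloppy_segments = {
    "young": lambda r: r["age"] <= 30,
    "middle": lambda r: 30 <= r["age"] < 65,
    "senior": lambda r: r["age"] > 65,
}

# Cleaned-up bands: every age falls into exactly one segment.
mece_segments = {
    "young": lambda r: r["age"] < 30,
    "middle": lambda r: 30 <= r["age"] < 65,
    "senior": lambda r: r["age"] >= 65,
}

overlaps, gaps = check_mece(customers, sloppy_segments)
print("sloppy overlaps:", overlaps)
print("sloppy gaps:", gaps)
print("clean result:", check_mece(customers, mece_segments))
```

An empty pair of lists means the segmentation is MECE for the records checked. It says nothing about records you have not seen, so the rules themselves still deserve a careful read.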

Do you think I’ve missed anything, or do you disagree with something I’ve written? Don’t hesitate to let me know via comments or messaging.
