[Your data science project is either: (1) Bespoke & Once-off, (2) Big & Risky, (3) Quick & Dirty, (4) Simple & Effective]
Imagine a senior executive at your firm walks into the room right now, looks at you and says ‘I need you to design a solution that will help us understand what the data is telling us. Specifically, we are interested in…’ (and let us suppose he continues and outlines some relevant and interesting question in your domain).
How do you begin the process of designing this solution? Assuming that the design will have to meet some non-trivial set of requirements, where do you start?
A good place to start is by thinking about the strategic trade-offs the design will make across the solution space. The most important trade-off is between the level of sophistication and the ease of execution (or repeatability). The level of sophistication includes the novelty and accuracy of the techniques used, expert input during execution, and the quantity of variables and observations included (mostly things correlated with the power & quality of the output). Ease of execution refers to the cost (in time and money) for the organisation to produce this output (now & in the future).
Clearly, the ideal solution is both sophisticated and easy to execute. But in all cases (given fixed or limited resources) there is a choice between sophistication and ease of execution. Hammering out which of the two stakeholders value more is the first (and most important) question that you should answer when designing the solution.
A good way to do this is to break the solution space into four different areas (or quadrants). The relative strengths and weaknesses of each quadrant can often facilitate a discussion to identify what is important to the stakeholders.
The 4-Quadrants Heuristic for Data Science Solutions
- Bespoke & Once-off: Projects that require sophisticated tools and skill-sets (best suited to analyses that need only be completed once)
- Quick & Dirty: Simple quick-result analysis usually completed in a spreadsheet or short script (best suited for decisions that need to be made quickly)
- Simple & Effective: Stable, well-implemented solution that produces regular (or on-demand) outputs automatically (best suited for environments where data signal is strong and data changes periodically/frequently)
- Big & Risky: Transformational high-risk projects with big potential if successful (best suited when data science is going to be a core competency of your organisation)
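One way to make the heuristic concrete is to treat the two axes discussed earlier (sophistication and ease of execution, or repeatability) as yes/no questions and look up the quadrant they point to. The sketch below is only an illustration. The names `Quadrant` and `which_quadrant` are hypothetical, and the placement of each quadrant on the two axes is an assumption inferred from the descriptions above, since the list itself does not spell it out.

```python
from enum import Enum


class Quadrant(Enum):
    BESPOKE_ONCE_OFF = "Bespoke & Once-off"
    BIG_RISKY = "Big & Risky"
    QUICK_DIRTY = "Quick & Dirty"
    SIMPLE_EFFECTIVE = "Simple & Effective"


def which_quadrant(sophisticated: bool, repeatable: bool) -> Quadrant:
    """Map the two trade-off axes to a quadrant.

    ASSUMPTION: the axis placement below is inferred from the quadrant
    descriptions; it is not stated explicitly in the post.
    """
    if sophisticated and repeatable:
        return Quadrant.BIG_RISKY          # sophisticated and meant to run again and again
    if sophisticated:
        return Quadrant.BESPOKE_ONCE_OFF   # sophisticated, but only needs doing once
    if repeatable:
        return Quadrant.SIMPLE_EFFECTIVE   # simple, automated, regular outputs
    return Quadrant.QUICK_DIRTY            # simple one-off answer needed quickly


# Example: stakeholders want a simple report refreshed automatically.
print(which_quadrant(sophisticated=False, repeatable=True).value)
```

In practice the two answers are rarely a clean yes or no, so the value of the exercise is less in the lookup itself and more in forcing stakeholders to state where they sit on each axis.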
So the next time you see scope for a data science solution, your first thought should be “Which Quadrant?”