What I learnt from failures as a data science consultant
Originally published at Data Science Club at Imperial (ICDSS) on 9 November 2021.
Working as a data science consultant in a Big Four firm is both challenging and rewarding. I work with multiple data science and business teams across organisations, where the same tools meet very different problems. That mix gives me a clear view of where data science is genuinely useful in the real world, and where it often struggles.
I've spent four years as a data scientist, mainly in retail, supply chain and manufacturing. My first few projects failed, and they taught me a lot; since then, I've delivered more consistently. Before getting to the lessons, it helps to define what success means.
For me, a project counts as a success if it ends in a deployed solution, or a proof of concept that the client’s team can integrate with little friction. Business users should also trust it enough to use it.
This challenge is common. A Gartner study suggests about 85% of data science projects never reach production.
Questions I ask before starting a project
When I get a new proposal, I ask a few simple questions.
- Is this a data science problem? Many problems don’t need a model. Some are too complex for today’s data or systems. Others are better served by reporting or process fixes.
- Could a simpler approach work? Often, a dashboard, an SQL query, or a set of clear rules answers the question (a rules-based example is sketched after this list). Using a model where it's not needed adds cost and risk.
- How will people use it? Consumption and deployment matter. API, batch job, app, or Excel output—each choice changes design and effort. Early clarity avoids rework.
- Do we have data? Quality, completeness, access and lineage all matter. Even if good data exists, sharing it with external teams may be hard due to compliance or geography.
- Do we have the right data? I follow CRISP-DM: start from the business problem, list the features that matter, then check whether data exists for them. This exposes gaps early and shapes feature engineering.
- Do we have the right tools? Data science spans optimisation, regression, classification, clustering, forecasting and modern ML. Picking the wrong method makes delivery harder.
- Is there stakeholder buy‑in? Good models fail when users don’t trust them. Engage early, explain choices and show quick wins.
- What does success look like? Align on outcomes. Sometimes a feasible proof of concept is enough; sometimes only a live system counts.
- What changes will this create? New models change workflows. Plan training, handovers and support. Without change management, adoption stalls.
- Is the problem defined? Vague goals lead to scope creep. Set a clear, measurable objective before you begin.
- Is the timeline realistic? Data cleaning, reviews and deployment take time. Rushed timelines hurt quality.
- Is there a feedback loop? Models drift. Plan monitoring, alerts and retraining from day one (a minimal drift check is sketched after this list).
- What are the compliance constraints? Many sectors have strict rules on data usage and retention. Identify them early.
- Is the infrastructure ready? Even the best model won’t run without the right compute, storage and access. Plan the environment.
- What’s the fallback? If the model under‑performs, what’s Plan B? A simple heuristic can keep the lights on.
- What are the success metrics? Accuracy is not the only metric. Tie KPIs to business value—cost, time or revenue.
- How much explainability do we need? In some contexts, black‑box models won’t fly. Use explainable methods or post‑hoc tools where needed.
- Are the teams aligned? IT, business and operations must pull in the same direction. Misalignment is a common reason for failure.
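To make the "simpler approach" point concrete, here is a minimal sketch of the kind of rules-based baseline I mean: a stock reorder check driven by a threshold agreed with the business, with no model involved. The data and column names are hypothetical, not from any client project.

```python
# A rules-based reorder check: no model, just a threshold agreed with the business.
# The DataFrame and column names (sku, stock_on_hand, ...) are hypothetical.
import pandas as pd

inventory = pd.DataFrame({
    "sku": ["A100", "B200", "C300"],
    "stock_on_hand": [40, 150, 5],
    "avg_daily_demand": [10, 12, 2],
    "lead_time_days": [7, 5, 14],
})

# Reorder point = expected demand over the lead time plus a fixed safety buffer.
safety_days = 3
inventory["reorder_point"] = inventory["avg_daily_demand"] * (
    inventory["lead_time_days"] + safety_days
)
inventory["needs_reorder"] = inventory["stock_on_hand"] < inventory["reorder_point"]

print(inventory[["sku", "stock_on_hand", "reorder_point", "needs_reorder"]])
```

If a rule like this answers the business question, a forecasting model may add little beyond cost, complexity and maintenance.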
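For the feedback-loop question, a first drift check does not need heavy tooling. Below is a minimal sketch of a population stability index (PSI) comparison between a feature's training-time distribution and its production counterpart; the synthetic data and the 0.25 alert threshold are illustrative assumptions, not a production recipe.

```python
# A minimal drift check: population stability index (PSI) for a single feature.
# The synthetic data and the 0.25 alert threshold are illustrative assumptions.
import numpy as np

def psi(baseline, current, bins=10):
    """Compare a production sample of one feature against its training baseline."""
    # Bin edges come from the baseline so both samples are binned consistently;
    # production values outside the baseline range are clipped into the end bins.
    edges = np.histogram_bin_edges(baseline, bins=bins)
    current = np.clip(current, edges[0], edges[-1])

    base_counts, _ = np.histogram(baseline, bins=edges)
    curr_counts, _ = np.histogram(current, bins=edges)

    # Convert counts to proportions; the small epsilon avoids log(0) for empty bins.
    eps = 1e-6
    base_pct = base_counts / base_counts.sum() + eps
    curr_pct = curr_counts / curr_counts.sum() + eps

    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))

rng = np.random.default_rng(42)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
production_feature = rng.normal(loc=0.4, scale=1.0, size=5_000)  # deliberately shifted

score = psi(training_feature, production_feature)
print(f"PSI = {score:.3f}")  # above ~0.25 would usually trigger an alert and a retraining review
```

A scheduled job running a check like this per feature, plus an alert channel the business actually watches, is often enough to start the feedback loop before investing in a full monitoring platform.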
These questions help me choose projects well and set them up for success. They're simple, but asking them early saves time, cost and goodwill.