Table of Contents

Visualization

Airbnb: Visualising tabular data (Python)
Analyse London's Airbnb market by visualising trends in listing popularity, seasonality, and review activity.

Visualisation Strategies for Predictive Models (Python)
Discover how to create effective visualisations for Exploratory Data Analysis and predictive model interpretation.

Univariate Analysis (R)
Master the art of single-variable analysis by learning how to create histograms, box plots, and density curves.

Multivariate Analysis (R)
Explore the power of multivariate exploratory techniques for uncovering relationships between multiple variables.

Detecting and Addressing Multicollinearity (R)
Learn how to identify and handle multicollinearity in regression models using variance inflation factors (VIF).

Statistics Basics

Statistical Foundations for Data Science (Python)
Build a solid foundation in descriptive and inferential statistics for real-world analytics.

Mastering Probability Concepts (R)
Understand probability theory through hands-on examples covering discrete and continuous distributions.

Vector Operations and Data Manipulation (R)
Explore the fundamental building blocks of R programming through comprehensive vector operations.

Matrix Algebra and Applications (R)
Master matrix operations in R for linear algebra applications in statistics and machine learning.

Probability Distributions for Call Centre Analytics (Python)
Model call centre operations by applying Poisson, Exponential, and Geometric distributions.

Hypothesis Testing

Z-test vs t-test (R)
Demonstrate z-tests and t-tests on real-world attendance data with clear hypotheses and code.

Analysis of Variance—ANOVA (R)
Discover how Analysis of Variance helps compare means across multiple groups simultaneously.

Chi-Square Goodness of Fit (R)
Master the chi-square goodness of fit test to determine whether observed data matches an expected theoretical distribution.

Chi-Square Test of Independence (R)
Learn how to test for relationships between categorical variables using the chi-square test of independence.

Comprehensive Hypothesis Testing (Python)
Build a complete hypothesis testing toolkit in Python using four real-world examples.

Factor Analysis

Navigating the Curse of Dimensionality (R)
Understand why high-dimensional data poses unique challenges for machine learning and statistical analysis.

Exploratory Factor Analysis (R)
Discover hidden latent variables in your data using exploratory factor analysis (EFA).

Data Preprocessing

Handling Missing Data with KNN Imputation (R)
Learn how to handle missing values intelligently using K-Nearest Neighbours imputation.

Feature Engineering Fundamentals (Python)
Master the art of creating predictive features from raw data through comprehensive feature engineering techniques.

Strategies for Imbalanced Classification Problems (R)
Tackle the common challenge of imbalanced datasets where one class significantly outnumbers others.

Prediction Algorithms

Classification Algorithms

Titanic Survival: Logistic Regression (R)
Master logistic regression for predicting binary outcomes using one of data science's most iconic datasets.

CHAID Decision Trees for Categorical Analysis (R)
Explore Chi-square Automatic Interaction Detection (CHAID) for building interpretable decision trees.

CART Classification Trees (R)
Learn Classification and Regression Trees (CART) methodology for building robust decision trees.

Regression Algorithms

Understanding Part and Partial Correlations (R)
Understand the difference between part and partial correlations.

Predicting House Prices: Linear Regression (R)
Master the foundational technique of linear regression using R's comprehensive statistical toolkit.

Predicting Vehicle Fuel Efficiency: Linear Regression (Python)
Master linear regression using the classic Auto MPG data to predict city-cycle fuel consumption.

Machine Learning

Interactive Machine Learning Explorer (RShiny)
Experience machine learning interactively through this RShiny web application.

Comparing Multiple ML Models with Scikit-Learn (Python)
Predict employee absenteeism by building and comparing multiple machine learning models using Scikit-learn.

Building Clinically Interpretable Models for Heart Disease (Python)
Build an interpretable heart disease prediction model for rapid clinical screening using explainable ML techniques.

Online Learning and Streaming Data (Python)
Discover how to build machine learning models that learn incrementally from streaming data.

Time Series Forecasting

Understanding Stationarity in Time Series (R)
Master the foundational concept of stationarity in time series analysis.

Testing for Stationarity (R)
Learn how to formally test whether your time series data is stationary using the Augmented Dickey-Fuller (ADF) test.

Build and Validate ARIMA Models (R)
Build powerful forecasting models using Auto-Regressive Integrated Moving Average (ARIMA) methodology in R.

Forecasting Air Quality with ARIMA (Python)
Forecast PM2.5 air quality in Hyderabad by building an ARIMA model using real-world data.

Analysing Seasonal Time Series Patterns (R)
Explore techniques for handling seasonal patterns in time series data using R.

Demand Forecasting: Vector Autoregression Models (R)
Learn how to model multiple interdependent time series simultaneously using Vector Autoregression (VAR).

Forecasting Time Series with Deep Neural Networks (Python)
Discover how recurrent neural networks (RNNs) and LSTM networks revolutionise time series forecasting.

Deep Learning

Building Neural Networks from Scratch: The Perceptron (Python)
Understand the fundamental building block of neural networks by implementing a perceptron from scratch.

Backpropagation: The Heart of Neural Network Training (Python)
Master the backpropagation algorithm that powers neural network learning.

Deep Learning with TensorFlow and Keras (Python)
Build production-ready deep neural networks using TensorFlow and Keras.

Generative AI

Comparing Tokenizers Across Popular LLMs (Python)
Compare how different LLMs like BERT, GPT-2, and Flan-T5 tokenize text, code, and names.

Mastering Prompt Engineering Techniques (Python)
Learn advanced prompt engineering strategies to get better results from large language models.

Retrieval Augmented Generation—RAG (Python)
Build a chatbot that answers questions from a PDF document using Retrieval-Augmented Generation (RAG).

Building Agentic AI Systems with LangChain (Python)
Build agentic systems with tool use, memory, and decision-making capabilities using LangChain.

Human-in-the-Loop Blog Validation with LangGraph (Python)
Create a human-in-the-loop blog editing pipeline using LangGraph to automate text and code validation.

Build a Crew to Tailor your Resume (Python)
Orchestrate multiple AI agents using CrewAI to tailor resumes for job applications and prepare interview materials.

Prescriptive Analytics

Optimisation with Linear Programming (R)
Learn how to formulate and solve resource allocation problems using linear programming.

Integer Programming: Advanced Modelling Tricks
Master advanced integer programming techniques for solving complex discrete optimisation problems.

Solving Inventory Problems with OR-Tools and EOQ (Python)
Solve retail inventory problems by building a Mixed Integer Programming model with Google OR-Tools.

Modelling Product Adoption with Bass Diffusion (R)
Forecast new product adoption rates using the Bass Diffusion Model.

Network Diffusion and Bass Models (Python)
Simulate how ideas, products, and behaviours spread through networks using diffusion models.

Decision Making with Analytic Hierarchy Process
Make complex multi-criteria decisions systematically using the Analytic Hierarchy Process (AHP).

Clustering

Hierarchical Clustering for Data Segmentation (R)
Discover how agglomerative hierarchical clustering builds nested groupings of similar data points.

K-Means Clustering for Credit Card Segmentation (R)
Master the K-Means algorithm for partitioning data into distinct clusters.

Analysing Indian Railway Delays with DBSCAN Clustering (Python)
Analyse Indian railway delays by scraping real-time data and applying DBSCAN clustering.

Reinforcement Learning

Product Recommendation: Associative Mining (R)
Create personalised recommendation engines using association mining.

Customer Lifetime Value Prediction (R)
Calculate and predict Customer Lifetime Value (CLV) to optimise marketing spend and retention strategies.

Neural Collaborative Filtering (Python)
Build a music recommendation system using Neural Collaborative Filtering on the NetEase dataset.

Network Science

Getting Started with NetworkX (Python)
Learn network analysis fundamentals using Python's NetworkX library.

Fundamentals of Network Science (Python)
Explore the mathematical principles underlying network theory and complex systems.

Measuring Influence with Network Centrality (Python)
Identify important nodes in networks using centrality measures like degree, betweenness, closeness, and PageRank.

Finding Hidden Groups in Graphs Using NetworkX (Python)
Detect communities in complex networks using Girvan-Newman and other algorithms.

Solving Shortest Path Problems with Integer Programming (Python)
Formulate and solve shortest path problems using integer programming optimisation.

Modelling Network Flow Problems (Python)
Master network flow optimisation for applications like supply chain logistics and resource allocation.

Bipartite Matching and Assignment Problems (Python)
Solve assignment problems on bipartite graphs using Hall's theorem and augmenting paths in NetworkX.

Deployment & Production

Deploying Machine Learning Models with Flask (Python)
Transform your machine learning models into production-ready HTTP APIs using Flask.

Persisting ML Predictions in Databases (Python)
Learn how to store and retrieve machine learning predictions using databases.

Object-Relational Mapping—ORM (Python)
Master database interactions using SQLAlchemy ORM for clean, Pythonic data access.

Career Lessons

What I learnt from failures as a Data Science Consultant
Reflections and lessons learned from failures and challenges faced as a data science consultant.

Podcast on Making Data Science Outputs Consumable
Podcast on how to bridge the gap between complex data science work and practical, understandable outputs for business stakeholders.

Higher Education Reviews

Inside IIM Bangalore's Business Analytics Programme
A comprehensive first-hand review of IIMB's part-time Business Analytics and Intelligence executive programme.

Choosing the Right Part-Time Data Science Masters
Navigate the landscape of part-time data science masters programmes with this comprehensive comparison guide.

Blogs on Imperial College London
Personal insights and experiences from studying at Imperial College London.

External Blogs

Publications and Conference Presentations
A curated list of academic publications, conference presentations, and research contributions.

Deployed Data Science Applications
Explore live demonstrations of data science applications and interactive tools.

Projects

Predictive Maintenance for Industrial Equipment
Building predictive maintenance systems that forecast equipment failures before they occur.

Contract Intelligence and Analysis Tool
Developing an AI-powered contract intelligence system that extracts key terms and identifies risks.

Competitor Intelligence Pipeline
Building an automated competitive intelligence system that monitors competitor activities.

Intelligent Document Annotation System
Creating an intelligent annotation tool that accelerates machine learning dataset creation.

Demand Forecasting for Inventory Planning
Implementing demand forecasting models that predict product sales and optimise inventory levels.

Optimal Bid Allocation System
Designing a bid allocation optimisation system for auctions and procurement.

Reward and Recognition Programme Analytics
Analysing employee reward and recognition programmes to maximise engagement and retention.

Supply Chain Analytics and Optimisation
Developing comprehensive supply chain analytics that optimise logistics, inventory, and distribution networks.
