Projects

A list of projects I have built or am currently working on.

Water Main Break Prediction

Water Main Break Prediction is a web application that predicts the probability of water main breaks for the municipality of Kitchener-Waterloo. It is built with Streamlit and deployed on Heroku.

Pandas, Scikit-learn, Streamlit

Instacart Market Basket Analysis

This project applies Market Basket Analysis to the Instacart Online Grocery Shopping dataset using Databricks, Spark, and Spark SQL. It ingests the data through an ETL pipeline, performs exploratory analysis, and uses machine learning with PySpark and Scala to discover frequent itemsets that inform purchase recommendations.

Apache Spark, Scala, Databricks

DoorDash Delivery Duration Prediction

This project predicts the delivery duration of DoorDash orders. The model is trained on deliveries made in 2015 and uses classical regression techniques as well as boosted tree regressors.

Statsmodels, Principal Component Analysis, LightGBM

Lyft Driver LTV Prediction

The goal of this project is to predict the lifetime value (LTV) of a driver for Lyft and identify the main factors affecting a driver's LTV. I also aim to explore whether there are specific segments of drivers that generate more value for Lyft than the average driver, and provide actionable recommendations for the business based on the findings.

Seaborn, KMeans Clustering, Pandas

Retail Sales Analysis

This project analyzes transaction data for chip sales to understand customer behavior and optimize store strategy. The data consists of customer transaction records from a national grocery store chain over a 12-month period.

Data Analysis, Pandas, Uplift Modelling