📊 SQL & Business Intelligence: A Data-Driven Case Study | Part 1
- Matias Rossi
- Mar 24
- 4 min read
Bike Retail Company | SQL Analysis & Data Exploration

🚀 Introduction
In today’s data-driven world, businesses rely on SQL and Business Intelligence (BI) techniques to uncover trends, optimize performance, and enhance customer strategies. This project simulates a real-world data analysis scenario for a bike retailer with three U.S. stores operating between 2016 and 2018—a period marked by a steady decline in sales, with 2018 being the worst year.
The objective of this project is to apply advanced SQL techniques to analyze customer behavior, revenue trends, store performance, and product sales, while simulating the kind of insights that stakeholders would expect from a data analyst.
This is Part 1 of the project, focusing on data exploration, cleaning, and business intelligence analysis using SQL. In Part 2, I will present findings, key insights, and business recommendations based on this analysis.
📂 Dataset Overview
The dataset consists of three years of sales records (2016-2018) from a bike retailer with stores in New York (NY), California (CA), and Texas (TX). The database includes:
✔ customers – Client details (name, location, contact info).
✔ orders & order_items – Sales transactions, including pricing, discounts, and product quantities.
✔ products & categories – Bike models, brands, and pricing details.
✔ stores & staffs – Store locations, employee performance, and sales contributions.
✔ stocks – Inventory levels per store.
📉 Business Context & Key Questions
The company has experienced declining sales year over year, with 2018 showing particularly weak performance. This project seeks to uncover:
✅ Who are the best and worst customers?
✅ Which products generate the most revenue?
✅ How do different stores perform?
✅ Which customers are likely to churn?
✅ What strategies could improve business performance?
🛠️ Technical Skills & Tools Used
This analysis leverages SQL techniques for data exploration, transformation, and business intelligence analysis, including:
✔ Data Aggregation – SUM(), COUNT(), AVG(), MIN(), MAX()
✔ Advanced Joins & CTEs – INNER JOIN, LEFT JOIN, RIGHT JOIN, WITH (CTE), SELF JOIN
✔ Window Functions – RANK(), DENSE_RANK(), PERCENT_RANK(), ROW_NUMBER(), NTILE()
✔ Customer Segmentation & Classification – CASE, DATEDIFF(), PERCENT_RANK(), NTILE()
✔ Revenue & Sales Trends – GROUP BY, HAVING, ORDER BY, DATEPART()
✔ Business Metrics Optimization – Discount impact, customer loyalty scoring, profit margin calculations
✔ Query Optimization & Validation – Microsoft SQL Server, ChatGPT for query efficiency
🔍 SQL Analysis Breakdown: 8-Phase Approach
This project follows a structured 8-phase approach, mirroring how data analysts tackle real-world business problems using SQL.
📌 Phase 1: Data Exploration & Cleaning
Objective: Understand database structure, detect inconsistencies, and assess data quality.
✔ Checked table relationships (INFORMATION_SCHEMA.COLUMNS).
✔ Identified missing values (e.g., customer contact info, shipped dates) using ISNULL().
✔ Verified time frame of sales data (2016-2018) using MIN(order_date), MAX(order_date).
✔ Detected duplicated customers or incorrect records using GROUP BY customer_id HAVING COUNT(*) > 1.
📌 Phase 2: General Business Performance
Objective: Assess the company’s financial health over time.
✔ Year-over-year revenue trends to evaluate business decline using SUM(list_price quantity).
✔ Total sales by store, identifying the highest-performing location with GROUP BY store_id.
✔ Comparison of monthly sales trends across different years using DATEPART(YEAR, order_date).
✔ Impact of discounts on revenue using SUM(list_price quantity * discount).
📌 Phase 3: Customer Behavior & Segmentation
Objective: Identify purchasing habits and classify customer groups.
✔ Total unique customers who made a purchase using COUNT(DISTINCT customer_id).
✔ Repeat vs. one-time buyers and their revenue contribution using CASE statements.
✔ Top 5 customers by total spending using ORDER BY TotalSpent DESC.
✔ Customer tier classification using PERCENT_RANK().
📌 Phase 4: Product Performance Analysis
Objective: Discover which products drive revenue.
✔ Top-selling products by total revenue using SUM(quantity * list_price).
✔ Best-performing brands based on sales volume using GROUP BY brand_name.
✔ Revenue generated per product category using JOIN categories ON products.category_id = categories.category_id.
📌 Phase 5: Store & Regional Performance
Objective: Understand store profitability & revenue distribution across locations.
✔ Revenue contribution per store and state using SUM(list_price quantity) GROUP BY store_id, state.
✔ Highest and lowest-performing stores based on average order value using AVG(list_price quantity).
✔ Comparing store performance over different years using PARTITION BY store_id, YEAR(order_date).
📌 Phase 6: Shipping & Logistics Efficiency
Objective: Evaluate order fulfillment and shipping delays.
✔ Average shipping time per store using DATEDIFF(DAY, order_date, shipped_date).
✔ Percentage of late shipments using CASE WHEN shipped_date > required_date THEN 'Late' ELSE 'On Time' END.
✔ Impact of delayed shipping on customer retention using JOIN customers ON orders.customer_id = customers.customer_id.
📌 Phase 7: Customer Retention & RFM Analysis
Objective: Classify customers based on spending habits and frequency using RFM segmentation:
✔ Recency – Time since last purchase (DATEDIFF(DAY, MAX(order_date), GETDATE())).
✔ Frequency – How often each customer buys (COUNT(order_id)).
✔ Monetary – Total amount spent (SUM(quantity * list_price)).
📌 Phase 8: Advanced Customer Segmentation
Objective: Identify trends based on geography and spending behavior.
✔ Which states generate the highest revenue? (RANK() OVER (ORDER BY revenue DESC)).
✔ Classifying customers into spending tiers using PERCENT_RANK().
✔ Segmenting customers into ‘Loyal Buyers’ vs. ‘One-Time Purchasers’ using CASE WHEN COUNT(order_id) > 1 THEN 'Repeat Customer' ELSE 'One-Time Buyer' END.
🚀 Next Steps: Business Insights & Recommendations
In Part 2 of this project, I will present:
✅ Key findings and trends discovered from the SQL analysis
✅ Business recommendations to improve customer retention & profitability
✅ Strategic insights for optimizing store performance & inventory management
Comments