top of page

📊 SQL & Business Intelligence: A Data-Driven Case Study | Part 1

  • Writer: Matias Rossi
    Matias Rossi
  • Mar 24
  • 4 min read

Bike Retail Company | SQL Analysis & Data Exploration


ree

🚀 Introduction

In today’s data-driven world, businesses rely on SQL and Business Intelligence (BI) techniques to uncover trends, optimize performance, and enhance customer strategies. This project simulates a real-world data analysis scenario for a bike retailer with three U.S. stores operating between 2016 and 2018—a period marked by a steady decline in sales, with 2018 being the worst year.


The objective of this project is to apply advanced SQL techniques to analyze customer behavior, revenue trends, store performance, and product sales, while simulating the kind of insights that stakeholders would expect from a data analyst.


This is Part 1 of the project, focusing on data exploration, cleaning, and business intelligence analysis using SQL. In Part 2, I will present findings, key insights, and business recommendations based on this analysis.



📂 Dataset Overview

The dataset consists of three years of sales records (2016-2018) from a bike retailer with stores in New York (NY), California (CA), and Texas (TX). The database includes:


customers – Client details (name, location, contact info).

orders & order_items – Sales transactions, including pricing, discounts, and product quantities.

products & categories – Bike models, brands, and pricing details.

stores & staffs – Store locations, employee performance, and sales contributions.

stocks – Inventory levels per store.


📉 Business Context & Key Questions

The company has experienced declining sales year over year, with 2018 showing particularly weak performance. This project seeks to uncover:


Who are the best and worst customers?

Which products generate the most revenue?

How do different stores perform?

Which customers are likely to churn?

What strategies could improve business performance?


🛠️ Technical Skills & Tools Used

This analysis leverages SQL techniques for data exploration, transformation, and business intelligence analysis, including:


Data Aggregation – SUM(), COUNT(), AVG(), MIN(), MAX()

Advanced Joins & CTEs – INNER JOIN, LEFT JOIN, RIGHT JOIN, WITH (CTE), SELF JOIN

Window Functions – RANK(), DENSE_RANK(), PERCENT_RANK(), ROW_NUMBER(), NTILE()

Customer Segmentation & Classification – CASE, DATEDIFF(), PERCENT_RANK(), NTILE()

Revenue & Sales Trends – GROUP BY, HAVING, ORDER BY, DATEPART()

Business Metrics Optimization – Discount impact, customer loyalty scoring, profit margin calculations

Query Optimization & Validation – Microsoft SQL Server, ChatGPT for query efficiency


🔍 SQL Analysis Breakdown: 8-Phase Approach

This project follows a structured 8-phase approach, mirroring how data analysts tackle real-world business problems using SQL.


📌 Phase 1: Data Exploration & Cleaning

Objective: Understand database structure, detect inconsistencies, and assess data quality.


Checked table relationships (INFORMATION_SCHEMA.COLUMNS).

Identified missing values (e.g., customer contact info, shipped dates) using ISNULL().

Verified time frame of sales data (2016-2018) using MIN(order_date), MAX(order_date).

Detected duplicated customers or incorrect records using GROUP BY customer_id HAVING COUNT(*) > 1.


📌 Phase 2: General Business Performance

Objective: Assess the company’s financial health over time.


Year-over-year revenue trends to evaluate business decline using SUM(list_price quantity).

Total sales by store, identifying the highest-performing location with GROUP BY store_id.

Comparison of monthly sales trends across different years using DATEPART(YEAR, order_date).

Impact of discounts on revenue using SUM(list_price quantity * discount).


📌 Phase 3: Customer Behavior & Segmentation

Objective: Identify purchasing habits and classify customer groups.


Total unique customers who made a purchase using COUNT(DISTINCT customer_id).

Repeat vs. one-time buyers and their revenue contribution using CASE statements.

Top 5 customers by total spending using ORDER BY TotalSpent DESC.

Customer tier classification using PERCENT_RANK().


📌 Phase 4: Product Performance Analysis

Objective: Discover which products drive revenue.


Top-selling products by total revenue using SUM(quantity * list_price).

Best-performing brands based on sales volume using GROUP BY brand_name.

Revenue generated per product category using JOIN categories ON products.category_id = categories.category_id.


📌 Phase 5: Store & Regional Performance

Objective: Understand store profitability & revenue distribution across locations.


Revenue contribution per store and state using SUM(list_price quantity) GROUP BY store_id, state.

Highest and lowest-performing stores based on average order value using AVG(list_price quantity).

Comparing store performance over different years using PARTITION BY store_id, YEAR(order_date).


📌 Phase 6: Shipping & Logistics Efficiency

Objective: Evaluate order fulfillment and shipping delays.


Average shipping time per store using DATEDIFF(DAY, order_date, shipped_date).

Percentage of late shipments using CASE WHEN shipped_date > required_date THEN 'Late' ELSE 'On Time' END.

Impact of delayed shipping on customer retention using JOIN customers ON orders.customer_id = customers.customer_id.


📌 Phase 7: Customer Retention & RFM Analysis

Objective: Classify customers based on spending habits and frequency using RFM segmentation:


Recency – Time since last purchase (DATEDIFF(DAY, MAX(order_date), GETDATE())).

Frequency – How often each customer buys (COUNT(order_id)).

Monetary – Total amount spent (SUM(quantity * list_price)).


📌 Phase 8: Advanced Customer Segmentation

Objective: Identify trends based on geography and spending behavior.


Which states generate the highest revenue? (RANK() OVER (ORDER BY revenue DESC)).

Classifying customers into spending tiers using PERCENT_RANK().

Segmenting customers into ‘Loyal Buyers’ vs. ‘One-Time Purchasers’ using CASE WHEN COUNT(order_id) > 1 THEN 'Repeat Customer' ELSE 'One-Time Buyer' END.


🚀 Next Steps: Business Insights & Recommendations

In Part 2 of this project, I will present:

Key findings and trends discovered from the SQL analysis

Business recommendations to improve customer retention & profitability

Strategic insights for optimizing store performance & inventory management



Comments


bottom of page