Introduction
In today's data-driven business environment, reporting from raw transactional systems is no longer scalable. Business leaders demand timely insights, flexible slicing and dicing, and consistency across reports. That’s where Data Marts come in.
A Sales Data Mart is often the first and most impactful data structure in BI allowing you to track revenue, customers, products, and time-based performance with ease.
This article will show you how to build one from scratch using SQL, following the Dimensional Modeling approach by Ralph Kimball.
What We'll Cover
- Dimensional Modeling Basics
- Designing the Schema
- Creating the Tables in SQL
- Populating with Sample Data
- Querying for Insights
1. Dimensional Modeling Basics
Dimensional Modeling simplifies complex business data by organizing it into:
- Fact Tables: Central numeric measures (e.g., sales amount, quantity)
- Dimension Tables: Descriptive context (e.g., customer, product, time, region)
Fact tables are typically long and skinny; dimension tables are short and wide.
2. Designing the Sales Star Schema
Here’s the simple Star Schema we'll build:
![Sales star schema]()
Fact Table: Fact_Sales
Column |
Type |
Description |
sale_id |
INT |
Surrogate PK |
product_id |
INT |
FK to Dim_Product |
customer_id |
INT |
FK to Dim_Customer |
date_id |
INT |
FK to Dim_Date |
region_id |
INT |
FK to Dim_Region |
quantity_sold |
INT |
Number of units |
sales_amount |
DECIMAL |
Total sales in LKR |
3. Creating the Tables in SQL
Here’s the basic structure using PostgreSQL syntax:
-- Dimension: Product
CREATE TABLE Dim_Product (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
brand VARCHAR(50)
);
-- Dimension: Customer
CREATE TABLE Dim_Customer (
customer_id SERIAL PRIMARY KEY,
customer_name VARCHAR(100),
gender VARCHAR(10),
age INT
);
-- Dimension: Date
CREATE TABLE Dim_Date (
date_id INT PRIMARY KEY,
full_date DATE,
day_of_week VARCHAR(10),
month_name VARCHAR(15),
quarter INT,
year INT
);
-- Dimension: Region
CREATE TABLE Dim_Region (
region_id SERIAL PRIMARY KEY,
region_name VARCHAR(50),
zone VARCHAR(50)
);
-- Fact Table
CREATE TABLE Fact_Sales (
sale_id SERIAL PRIMARY KEY,
product_id INT REFERENCES Dim_Product(product_id),
customer_id INT REFERENCES Dim_Customer(customer_id),
date_id INT REFERENCES Dim_Date(date_id),
region_id INT REFERENCES Dim_Region(region_id),
quantity_sold INT,
sales_amount DECIMAL(12,2)
);
4. Populating with Sample Data
Example Insert - Dim_Product
INSERT INTO Dim_Product (product_name, category, brand)
VALUES
('Smartphone A1', 'Mobile', 'BrandX'),
('Laptop Pro', 'Computing', 'BrandY'),
('Tablet Z3', 'Tablet', 'BrandX');
Example Insert - Dim_Date
INSERT INTO Dim_Date (date_id, full_date, day_of_week, month_name, quarter, year)
VALUES
(20240801, '2024-08-01', 'Thursday', 'August', 3, 2024),
(20240802, '2024-08-02', 'Friday', 'August', 3, 2024);
Example Insert - Fact_Sales
INSERT INTO Fact_Sales (product_id, customer_id, date_id, region_id, quantity_sold, sales_amount)
VALUES
(1, 101, 20240801, 1, 3, 450000.00),
(2, 102, 20240802, 2, 1, 240000.00);
5. Querying for Insights
Total Sales by Product Category
SELECT
dp.category,
SUM(fs.sales_amount) AS total_sales
FROM Fact_Sales fs
JOIN Dim_Product dp ON fs.product_id = dp.product_id
GROUP BY dp.category;
Monthly Sales Trend
SELECT
dd.month_name,
dd.year,
SUM(fs.sales_amount) AS total_sales
FROM Fact_Sales fs
JOIN Dim_Date dd ON fs.date_id = dd.date_id
GROUP BY dd.year, dd.month_name
ORDER BY dd.year, dd.month_name;
Bonus: Integrating with BI Tools
Once your data mart is populated:
- Connect it to Power BI, Tableau, or Looker
- Use filters from dimensions (product, region, date) for powerful slicing
- Create KPIs like:
- Total Sales
- Average Order Value
- Repeat Customer Ratio
Summary
You’ve now built a working Sales Data Mart from scratch using:
- Dimensional Modeling
- Star Schema design
- SQL Table creation and population
- Analytical queries
This is the foundation for any scalable, performant, and business-friendly reporting system.