Introduction
Every modern application needs a way to move data between systems, clean it, transform it, and store it in a structured format. This is where an ETL pipeline (Extract, Transform, Load) becomes useful.
Instead of using heavy tools like SSIS, Informatica, or Azure Data Factory, small and mid-sized teams can build their own lightweight ETL pipeline using:
SQL Server for storage, staging, and transformation
.NET Background Worker / Hosted Service for scheduled processing
Stored procedures for optimized data operations
This article explains how to build a complete ETL pipeline from scratch.
ETL Workflow Diagram
+-------------------+
|    Source Data    |
|   (API/File/DB)   |
+---------+---------+
          |
     Extract (C#)
          |
+---------v---------+
|   Staging Table   |
|       (SQL)       |
+---------+---------+
          |
    Transform (SQL)
          |
      Load (C#)
          |
+---------v---------+
|   Final Target    |
|  Reporting Table  |
+-------------------+
1. Designing the ETL Structure
Tables Required
a) Staging Table
Used to temporarily store raw data.
CREATE TABLE Staging_Orders
(
    StageId BIGINT IDENTITY(1,1) PRIMARY KEY,
    OrderId VARCHAR(50),
    CustomerName VARCHAR(200),
    Amount DECIMAL(18,2),
    RawJson NVARCHAR(MAX),
    Processed TINYINT DEFAULT 0,  -- 0 = new, 1 = transformed, 2 = loaded
    CreatedDate DATETIME DEFAULT GETDATE()
);
b) Final Table
CREATE TABLE Fact_Orders
(
    OrderId VARCHAR(50) PRIMARY KEY,
    CustomerName VARCHAR(200),
    CleanedAmount DECIMAL(18,2),
    LoadedDate DATETIME DEFAULT GETDATE()
);
2. Architecture Diagram
+-----------------------+
|     Source System     |
|  (API/File/Database)  |
+-----------+-----------+
            |
            v
+-----------------------+
|  .NET Worker Service  |
|    Extract Module     |
+-----------+-----------+
            |
            v
+-----------------------+
|   SQL Staging Table   |
+-----------+-----------+
            |
            v
+-----------------------+
|   SQL Transform SP    |
+-----------+-----------+
            |
            v
+-----------------------+
|  Target Reporting DB  |
+-----------------------+
3. Sequence Diagram
C# Worker → Source API: Fetch orders
Source API → Worker: Return JSON
Worker → SQL: Insert into Staging_Orders
Worker → SQL SP: Execute Transform logic
SQL → Worker: Success response
Worker → SQL: Insert into Fact_Orders
4. Step-by-Step ETL Implementation
Step 1: Extract
A .NET worker fetches data from an API or file.
public async Task ExtractAsync()
{
    var data = await _http.GetStringAsync("https://example.com/orders");
    await _repo.InsertStagingAsync(data);
}
Step 2: Load Raw Data into Staging
public async Task InsertStagingAsync(string json)
{
    // Guard against an empty or null payload from the source
    var orders = JsonConvert.DeserializeObject<List<OrderDto>>(json)
                 ?? new List<OrderDto>();

    foreach (var order in orders)
    {
        await _context.Staging_Orders.AddAsync(new StagingOrders
        {
            OrderId = order.Id,
            CustomerName = order.Name,
            Amount = order.Amount,
            RawJson = JsonConvert.SerializeObject(order)
        });
    }
    await _context.SaveChangesAsync();
}
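The repository code above references an OrderDto and EF Core entity classes that the article has not defined. One plausible shape, assuming the source API returns Id, Name, and Amount fields (these names are assumptions, not part of the original):

// DTO matching the assumed JSON payload from the source API
public class OrderDto
{
    public string Id { get; set; }
    public string Name { get; set; }
    public decimal Amount { get; set; }
}

// EF Core entity mapped to the Staging_Orders table
public class StagingOrders
{
    public long StageId { get; set; }
    public string OrderId { get; set; }
    public string CustomerName { get; set; }
    public decimal Amount { get; set; }
    public string RawJson { get; set; }
    public byte Processed { get; set; }   // 0 = new, 1 = transformed, 2 = loaded
    public DateTime CreatedDate { get; set; }
}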
Step 3: SQL Transformation Logic (Stored Procedure)
CREATE PROCEDURE sp_Transform_Orders
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE S
    SET CustomerName = UPPER(CustomerName),
        Amount       = ISNULL(Amount, 0),
        Processed    = 1
    FROM Staging_Orders S
    WHERE Processed = 0;
END
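The worker in section 5 also calls a TransformAsync method that is never shown. Assuming EF Core, it can simply invoke the stored procedure; a minimal sketch:

public async Task TransformAsync()
{
    // Runs the set-based transformation inside SQL Server;
    // no rows are materialized on the .NET side.
    await _context.Database.ExecuteSqlRawAsync("EXEC sp_Transform_Orders");
}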
Step 4: Load Final Data
public async Task LoadAsync()
{
    var list = await _context.Staging_Orders
        .Where(x => x.Processed == 1)
        .ToListAsync();

    foreach (var s in list)
    {
        await _context.Fact_Orders.AddAsync(new FactOrders
        {
            OrderId = s.OrderId,
            CustomerName = s.CustomerName,
            CleanedAmount = s.Amount
        });
        s.Processed = 2; // 2 = loaded (requires Processed to be TINYINT, not BIT)
    }
    await _context.SaveChangesAsync();
}
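Because OrderId is the primary key of Fact_Orders, re-running the load after a partial failure would throw on already-inserted rows. One hedged guard, filtering out existing keys before the insert loop (a sketch, not the only option; a SQL MERGE would also work):

// Collect the OrderIds already loaded, then skip them.
var existingIds = (await _context.Fact_Orders
        .Select(f => f.OrderId)
        .ToListAsync())
    .ToHashSet();

var pending = list.Where(s => !existingIds.Contains(s.OrderId)).ToList();
// Insert only 'pending' rows into Fact_Orders as in LoadAsync above.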
5. Creating the .NET Background Worker
public class ETLWorker : BackgroundService
{
    private readonly IEtlService _service; // service exposing the ETL steps

    public ETLWorker(IEtlService service) => _service = service;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                await _service.ExtractAsync();
                await _service.TransformAsync();
                await _service.LoadAsync();
            }
            catch (Exception ex)
            {
                // Log to ETL_ErrorLog rather than crashing the worker
                Console.Error.WriteLine(ex.Message);
            }
            // Run every 5 minutes; the token cancels the delay on shutdown
            await Task.Delay(TimeSpan.FromMinutes(5), stoppingToken);
        }
    }
}
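For the worker to run at all, it must be registered with the .NET generic host. A minimal Program.cs sketch (IEtlService / EtlService are assumed names, and the lifetime comment matters because hosted services are singletons):

var builder = Host.CreateApplicationBuilder(args);

// Register the background worker with the host.
builder.Services.AddHostedService<ETLWorker>();

// IEtlService is an assumed abstraction over ExtractAsync /
// TransformAsync / LoadAsync. If it depends on scoped services
// such as a DbContext, resolve it through an IServiceScopeFactory
// inside the worker instead of injecting it directly.
builder.Services.AddSingleton<IEtlService, EtlService>();

builder.Build().Run();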
6. Exception Logging Table
CREATE TABLE ETL_ErrorLog
(
    Id BIGINT IDENTITY(1,1) PRIMARY KEY,
    Step VARCHAR(50),
    ErrorMessage NVARCHAR(MAX),
    CreatedDate DATETIME DEFAULT GETDATE()
);
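A small helper can write failures from the worker's catch blocks into this table. A sketch assuming EF Core, using a parameterized raw SQL insert:

public async Task LogErrorAsync(string step, Exception ex)
{
    // {0} and {1} are turned into SQL parameters by EF Core,
    // so the message text cannot inject SQL.
    await _context.Database.ExecuteSqlRawAsync(
        "INSERT INTO ETL_ErrorLog (Step, ErrorMessage) VALUES ({0}, {1})",
        step, ex.Message);
}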
7. Handling Large Datasets
Use table partitioning
Use BULK INSERT for large JSON files
Use batch processing (500–5000 rows)
Use SQL Server Agent for scheduling if not using .NET Worker
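The batch-size advice above can be applied directly to the load step with LINQ's Chunk (available since .NET 6); a sketch:

const int batchSize = 1000; // within the suggested 500-5000 range

foreach (var batch in list.Chunk(batchSize))
{
    _context.Fact_Orders.AddRange(batch.Select(s => new FactOrders
    {
        OrderId = s.OrderId,
        CustomerName = s.CustomerName,
        CleanedAmount = s.Amount
    }));

    // One round-trip per batch keeps the EF change tracker
    // and the transaction log small.
    await _context.SaveChangesAsync();
}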
8. Dashboard Mockup (Text Version)
------------------------------------------------------------
|                    ETL JOB DASHBOARD                     |
------------------------------------------------------------
| Last Run Time:       12:45 PM                            |
| Status:              Success                             |
| Extracted Records:   450                                 |
| Transformed Records: 448                                 |
| Loaded Records:      448                                 |
| Errors:              2  (View Details)                   |
------------------------------------------------------------
| [Run ETL Now]  [View Logs]  [Staging Data]  [Final Data] |
------------------------------------------------------------
9. Adding Email Notification (Optional)
if (errors > 0)
{
    await _email.SendAsync("ETL Errors Found", "Please check the error log.");
}
10. Enhancements for Production