
Columnstore Index Tuning For Mixed Workloads In SQL Server

Columnstore indexes have transformed high-volume analytics on SQL Server. They provide massive compression, aggressive segment elimination, vectorized batch-mode execution, and overall performance improvements that traditional rowstore indexes cannot match.

However, real production systems rarely run pure OLAP or pure OLTP. Most enterprise databases today run mixed workloads:

  • Heavy transactional inserts and updates

  • Real-time dashboards

  • Analytical queries

  • Reporting queries

  • Near-real-time aggregations

This creates a challenge:
How do we tune columnstore indexes without destroying OLTP speed or starving analytical queries?

This article provides a detailed, step-by-step methodology for tuning columnstore indexes specifically for mixed workloads. You get practical design patterns, anti-patterns, proven indexing strategies, and troubleshooting methods used in real production systems.

Understanding Rowstore Vs Columnstore Behavior

Before tuning, you must understand how SQL Server physically organizes data.

Rowstore (B-Tree)

Optimized for:

  • Single-row lookup

  • Point queries

  • Small range scans

  • High-frequency OLTP inserts and updates

Columnstore

Optimized for:

  • Large scans

  • Aggregations

  • Filtering on few columns

  • Analytics and reporting

  • Batch-mode processing

Columnstore shines when you query millions of rows but touch only a few columns.

Why Mixed Workloads Are Hard For Columnstore

Columnstore indexes introduce:

  • Rowgroup compression

  • Delta stores

  • Deleted bitmap management

  • Batch mode execution

  • Segment elimination

OLTP workloads introduce:

  • Many small writes

  • Many single-row lookups

  • Minimal tolerance for locking

These two worlds collide when we store both workloads in the same table.

How Columnstore Works Internally (Simplified)

Data is divided into rowgroups of up to 1,048,576 rows (about 1 million); within each rowgroup, every column is compressed independently into its own segment.

+-----------------------------+
|           Table             |
+----------------+------------+
| Column A       | Segment A  |
| Column B       | Segment B  |
| Column C       | Segment C  |
+----------------+------------+

Rowgroups:

Rowgroup 1: 1,048,576 rows
Rowgroup 2: 1,048,576 rows
Rowgroup 3: Delta Store (uncompressed)

Write Path (Insert)

Trickle insert -> Delta Store (open rowgroup, up to 1,048,576 rows)
               -> compressed into a columnstore rowgroup by the tuple mover
Bulk insert of 102,400+ rows -> compressed rowgroup directly (bypasses the Delta Store)

Delete/Update Path

Delete -> row marked in the Delete Bitmap (not physically removed)
Update -> Delete Bitmap mark + new row version inserted into the Delta Store

How Mixed Workloads Break Columnstore Performance

Common issues:

1. Too Many Rows In Delta Store

Trickle OLTP inserts pile up in uncompressed delta stores, so scans must read rowstore-format delta rows on top of the compressed data.

2. Excessive Deleted Rows

Deleted rows are only marked in the delete bitmap, so scans still read and then discard them; as the deleted percentage grows, more I/O and CPU are wasted.

3. Too Many Small Rowgroups

Frequent small loads, memory pressure during compression, and aggressive maintenance produce compressed rowgroups of only 20K–50K rows instead of the ~1M target, weakening both compression and segment elimination.

4. Too Many Rebuilds

Over-aggressive maintenance wastes CPU and I/O, and offline rebuilds block the OLTP workload.

5. OLTP Reads Slowed Down

Without proper secondary rowstore indexes, point-lookup performance collapses.

6. Locking And Latching Problems

Columnstore indexes can block writes if not sized or partitioned correctly.
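
Most of these issues show up in the rowgroup metadata. A quick health snapshot, sketched here against the Sales table used as the example later in this article:

SELECT
    state_desc,
    COUNT(*)          AS rowgroups,
    SUM(total_rows)   AS total_rows,
    SUM(deleted_rows) AS deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('Sales')
GROUP BY state_desc;

Large OPEN/CLOSED row counts point to delta store buildup; a high deleted_rows total points to delete-bitmap bloat.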

When To Use Clustered Vs Nonclustered Columnstore Indexes

Choosing the correct type is key.

Clustered Columnstore Index (CCI)

Best when:

  • Table is mostly analytical

  • Inserts are large / batched

  • Updates are rare

  • Queries scan large ranges

  • Table is append-only

Generally a poor fit for OLTP-heavy tables unless supplemented with nonclustered B-tree indexes for point lookups.

Nonclustered Columnstore Index (NCCI)

Best when:

  • You must keep OLTP speed

  • Table already uses clustered B-tree

  • You need reporting on live OLTP data

  • Point lookups must remain fast

This is the preferred option for mixed workloads.

Architecture Patterns For Mixed Workloads

Pattern 1: OLTP Base Table + Nonclustered Columnstore Index

Most recommended strategy.

Clustered Index (Primary Key)
    |
    +-- Nonclustered Columnstore Index (on analytical columns)

Pattern 2: Partition By Date, Columnstore On Archive Data

Ideal when OLTP activity touches only recent data and analytics scan the history. Note that an index type applies to the whole table, not to individual partitions, so in practice this pattern is built either as a partitioned view over a rowstore "hot" table and a columnstore "archive" table, or as a filtered nonclustered columnstore index that covers only the colder rows (sketched below).

Current Week (hot)   : Rowstore
Last Week (cold)     : Columnstore
Older Data (cold)    : Columnstore
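
A minimal sketch of the filtered-index variant, assuming reporting only needs rows older than a fixed cutoff (the cutoff literal is an illustrative assumption and has to be advanced periodically):

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales_Cold
ON Sales (Amount, Quantity, ProductId, CustomerRegion, SalesDate)
WHERE SalesDate < '2023-01-01';   -- hot rows stay pure rowstore

Queries must repeat a matching SalesDate predicate for the optimizer to pick the filtered index.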

Pattern 3: Two Tables (Hot + Cold)

Sometimes described as a Lambda-style (hot path / cold path) architecture.

Hot Table (Rowstore)       - For OLTP
Cold Table (Columnstore)   - For reporting

Micro-batch ETL jobs move data from the hot table to the cold table (a sketch follows below).
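
A minimal sketch of one micro-batch move, assuming hypothetical SalesHot (rowstore) and SalesCold (columnstore) tables that share the column list of the Sales example used later in this article:

DECLARE @Cutoff datetime2 = DATEADD(DAY, -7, SYSUTCDATETIME());

BEGIN TRANSACTION;

-- Copy a closed window of rows into the columnstore table in one large batch.
INSERT INTO SalesCold (ProductId, CustomerId, Region, Amount, Quantity, SalesDate)
SELECT ProductId, CustomerId, Region, Amount, Quantity, SalesDate
FROM SalesHot
WHERE SalesDate < @Cutoff;

-- Remove the copied rows from the hot rowstore table.
DELETE FROM SalesHot
WHERE SalesDate < @Cutoff;

COMMIT TRANSACTION;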

How To Design Columnstore Indexes For Mixed Workloads

This section gives precise, battle-tested tuning rules.

Step 1: Choose the Right Columns

Do not blindly include all columns.

Include Only

  • Columns frequently used in SELECT

  • Columns used in GROUP BY

  • Columns used in WHERE clauses of reports

Avoid

  • Unique identifiers

  • High-cardinality GUIDs

  • Frequently updated columns

Example

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales
ON Sales (Amount, Quantity, ProductId, CustomerRegion, SalesDate);

Step 2: Control Rowgroup Quality

High-quality rowgroups = better segment elimination and better performance.

Rowgroup quality checker query:

SELECT 
    row_group_id, 
    total_rows, 
    deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('Sales');

Target Numbers

  • total_rows close to 1,048,576

  • deleted_rows below 15 percent of total_rows

If you see:

  • Many rowgroups with < 100K rows

  • Delta stores not compressing

  • Excessive deleted rows

then your workload is not columnstore-friendly.
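
A follow-up query that flags the problem cases listed above (the thresholds are this article's rules of thumb, not hard limits):

SELECT
    row_group_id,
    state_desc,
    total_rows,
    deleted_rows,
    CAST(deleted_rows * 100.0 / NULLIF(total_rows, 0) AS decimal(5,2)) AS deleted_pct
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('Sales')
  AND (   (state_desc = 'COMPRESSED' AND total_rows < 100000)   -- undersized rowgroups
       OR (deleted_rows * 100.0 / NULLIF(total_rows, 0) > 15)   -- heavy delete bitmap
       OR  state_desc IN ('OPEN', 'CLOSED'));                   -- uncompressed delta rowgroups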

Step 3: Use Minimal Secondary Rowstore Indexes

OLTP requires point lookups.

Add narrow rowstore indexes to support:

  • FK lookups

  • point queries

  • small range scans

Example:

CREATE INDEX IX_Sales_OrderId ON Sales(OrderId);

Do not create wide indexes.

Step 4: Partition For Better Compression And Maintenance

Partition by:

  • Date (most common)

  • Region

  • Customer Segment

Example tiering layout (an index type applies per table, so the CCI tier is typically a separate archive table, as in Patterns 2 and 3):

Current Day      : Rowstore only
Last 7 Days      : Rowstore + NCCI (filtered)
Older Data       : CCI (archive table)

Step 5: Reduce Deletes And Updates

Columnstore hates row-by-row updates.

Rewrite OLTP patterns:

1. Avoid Row-By-Row Updates

On columnstore, every UPDATE is executed internally as a delete-bitmap mark plus a new row inserted into the delta store, so switching between UPDATE and delete-plus-insert saves nothing. Instead, batch modifications, or convert large changes into a batched delete followed by a bulk re-insert so the new rows can compress directly.

2. Use Soft Deletes

Store an IsDeleted flag and filter on it in reports instead of physically deleting rows (see the sketch below).
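
A minimal sketch of the soft-delete pattern (the IsDeleted column, constraint name, and key value are illustrative assumptions):

-- Add the flag once; existing rows default to "not deleted".
ALTER TABLE Sales ADD IsDeleted bit NOT NULL
    CONSTRAINT DF_Sales_IsDeleted DEFAULT (0);

-- An OLTP "delete" becomes a narrow update instead of a physical DELETE.
UPDATE Sales SET IsDeleted = 1 WHERE SalesId = 42;

-- Reporting queries add WHERE IsDeleted = 0, and flagged rows are purged later
-- in large batches. Note: if IsDeleted is one of the NCCI columns, the update
-- still touches the columnstore; if it is not, the NCCI alone cannot filter on it.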

Step 6: Improve Batch Processing And Ingestion

Best ingestion method:

INSERT INTO Sales
SELECT * FROM Staging
ORDER BY SalesDate;   -- large, pre-sorted batches: better compression and segment elimination

Worst ingestion method:

Single-row, OLTP-style trickle inserts that pile up in the delta store.

Step 7: Optimize Segment Elimination

Columnstore works best when filtering by ordered columns.

If your analytics use:

  • SalesDate

  • Region

  • ProductId

Then pre-sort loads by those columns so each rowgroup covers a narrow value range and can be skipped via its min/max metadata (see the example query below).
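
For illustration, a reporting query of the shape that benefits, using the NCCI columns from Step 1 (dates and names are placeholders):

-- The SalesDate range lets the engine skip rowgroups whose min/max SalesDate
-- metadata falls outside the range, provided loads were pre-sorted by SalesDate.
SELECT CustomerRegion, SUM(Amount) AS TotalAmount, SUM(Quantity) AS TotalQuantity
FROM Sales
WHERE SalesDate >= '2023-01-01' AND SalesDate < '2023-02-01'
GROUP BY CustomerRegion;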

Detailed Production Example

Assume a Sales table supporting:

  • 10,000 transactions per second

  • Analytical reporting every 5 minutes

  • Dashboards hitting the live table

Table Structure

SalesId        INT IDENTITY
ProductId      INT
CustomerId     INT
Region         VARCHAR(10)
Amount         DECIMAL(10,2)
Quantity       INT
SalesDate      DATETIME2
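
For concreteness, a minimal DDL sketch of this table (nullability and exact sizes are assumptions):

CREATE TABLE Sales
(
    SalesId     INT IDENTITY(1,1) NOT NULL,
    ProductId   INT               NOT NULL,
    CustomerId  INT               NOT NULL,
    Region      VARCHAR(10)       NOT NULL,
    Amount      DECIMAL(10,2)     NOT NULL,
    Quantity    INT               NOT NULL,
    SalesDate   DATETIME2         NOT NULL
);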

Step-by-Step Indexing Strategy

1. Create Clustered Index For OLTP Lookups

CREATE CLUSTERED INDEX CIX_Sales
ON Sales(SalesId);

2. Create Required OLTP Rowstore Indexes

CREATE INDEX IX_Sales_ProductId ON Sales(ProductId);
CREATE INDEX IX_Sales_CustomerId ON Sales(CustomerId);

3. Add Nonclustered Columnstore Index For Reporting

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales
ON Sales (Amount, Quantity, Region, SalesDate);

4. Partition Table By Month

CREATE PARTITION FUNCTION PF_SalesDate (datetime2)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2023-02-01', ...);
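
The function only defines boundary values; a partition scheme and an index rebuild on that scheme are what actually partition the table. A minimal sketch, assuming a single-filegroup mapping and the index names used above:

CREATE PARTITION SCHEME PS_SalesDate
AS PARTITION PF_SalesDate ALL TO ([PRIMARY]);

-- Rebuild the clustered index on the scheme to partition the table. For a
-- non-unique clustered index, SQL Server adds SalesDate to the key automatically.
CREATE CLUSTERED INDEX CIX_Sales
ON Sales (SalesId)
WITH (DROP_EXISTING = ON)
ON PS_SalesDate (SalesDate);

-- Rebuild nonclustered indexes (including NCCI_Sales) on the same scheme so
-- they stay aligned, which partition switching requires.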

5. Load Data in Batches

INSERT INTO Sales (...)
SELECT ...
FROM Staging
ORDER BY SalesDate;

Execution Plan Improvements

Before NCCI

  • Rowstore scans

  • CPU high

  • Memory grants heavy

  • No batch mode

  • Full table scan for dashboard queries

After NCCI

  • Batch mode enabled

  • 80–90% fewer CPU cycles

  • Segment elimination

  • Rowstore indexes still support OLTP lookups

Query Flow Diagram With NCCI

        OLTP Queries                  Reports / Dashboards
              |                                |
              v                                v
+---------------------------+     +--------------------------+
| Clustered Rowstore Index  |     | Nonclustered Columnstore |
| + Point-Lookup Indexes    |     | Index                    |
+---------------------------+     +--------------------------+
                                               |
                                               v
                                      +------------------+
                                      |   Batch Mode     |
                                      |   Execution      |
                                      +------------------+

Troubleshooting Columnstore Problems In Production

Symptom 1: Columnstore Not Being Used

Cause: Missing predicates, or the query references columns that are not included in the NCCI.

Fix: Add the needed columns to the NCCI, or check the actual execution plan to confirm the columnstore index and batch mode are used.

Symptom 2: Small Rowgroups

Cause: Row-by-row OLTP inserts.

Fix: Batch load or use partition switching.

Symptom 3: High Fragmentation

Cause: Excessive deletes.

Fix: Periodic index REORGANIZE, not REBUILD.

Symptom 4: High Delta Store Rows

Cause: Too many trickle inserts.

Fix: Increase batch size or enable large ingestion pipelines.

Symptom 5: Slow OLTP Queries

Cause: Missing rowstore indexes after adding NCCI.

Fix: Add narrow indexes.

Maintenance Strategy For Mixed Workloads

1. Reorganize Columnstore Index

Do not rebuild daily; REORGANIZE compresses closed delta rowgroups online, and COMPRESS_ALL_ROW_GROUPS = ON forces open delta rowgroups to compress as well.

ALTER INDEX NCCI_Sales  
ON Sales  
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

2. Periodically Purge Delete Bitmaps

ALTER INDEX NCCI_Sales ON Sales REBUILD PARTITION = 5;

3. Use Partition Switching For Archival

Moving an entire partition in or out is a metadata-only operation that needs just a brief schema-modification lock, with no row-by-row data movement (see the sketch below).
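
A minimal sketch, assuming a SalesArchive table with identical structure, identical indexes, and the same partition scheme (all names are illustrative):

-- Move the oldest populated partition from Sales to SalesArchive.
-- The target partition must be empty; partition number 2 is just an example.
ALTER TABLE Sales
SWITCH PARTITION 2 TO SalesArchive PARTITION 2;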

Best Practices Summary

For Mixed Workloads:

  1. Prefer NCCI, not CCI

  2. Partition by date

  3. Maintain high-quality rowgroups

  4. Avoid small inserts

  5. Reorganize, do not rebuild

  6. Preserve OLTP rowstore indexes

  7. Steer analytical queries onto columnstore paths (filter and select on columns the NCCI covers)

  8. Keep delete row percentages under 15 percent

  9. Align the ingestion pipeline with the reporting architecture (batched loads feeding the columnstore)

  10. Monitor rowgroup metadata and delta stores

Final Summary

Columnstore indexes are extremely powerful, but only when implemented correctly for mixed workloads. You must balance OLTP responsiveness with analytical performance. This requires:

  • Choosing NCCI wisely

  • Partitioning data properly

  • Maintaining rowstore indexes

  • Ensuring high-quality rowgroups

  • Tuning ingestion patterns

  • Monitoring delta stores and delete bitmaps

When done right, columnstore indexes can reduce CPU usage by 80 percent, speed up analytical queries by 20–50x, and maintain OLTP stability at the same time.