SQL Server  

Auto-Repair Engine (Detect Broken Foreign Keys, Orphans, and Incorrect Totals)

Introduction

Large enterprise systems evolve over time. New modules are added, integrations change, and migrations happen. With such changes, data inconsistencies gradually appear. Examples include:

  • A child record exists without a parent (orphan data).

  • Totals stored for performance become incorrect compared to computed values.

  • Foreign key references point to deleted or invalid entries.

  • Duplicate reference relationships appear due to race conditions.

  • Data sync between microservices or integration pipelines leaves gaps.

Traditional SQL constraints can prevent some mistakes, but many enterprise systems allow soft deletes, asynchronous processing, queued workflows, and multi-system integration pipelines. In such cases, perfect consistency cannot always be enforced in real time.

A Data Auto-Repair Engine solves this problem by continuously scanning the database, detecting inconsistencies, and fixing them safely—either automatically or with human approval.

1. Goals of the System

The objective is to create a controlled, incremental, auditable repair process, not random updates.

Key outcomes:

  • Identify data mismatches with classification (severity-based).

  • Provide automated and semi-automated repair actions.

  • Allow approval workflows before executing fixes.

  • Track trends to find broken processes, not just broken records.

2. High-Level Architecture

           ┌──────────────────────────────┐
           │      Metadata Catalog        │
           └─────────────┬────────────────┘
                          │
                 ┌────────▼────────┐
                 │   Scanner Job   │
                 └────────┬────────┘
                          │
                    ┌─────▼─────┐
                    │   Detector │
                    └─────┬─────┘
                          │
                  ┌───────▼────────┐
                  │  Repair Engine  │
                  └───────┬────────┘
                          │
         ┌────────────────▼─────────────────┐
         │  Execution Modes: Auto | Manual  │
         └───────────────┬──────────────────┘
                         │
             ┌───────────▼───────────┐
             │ Audit + Alert System  │
             └────────────────────────┘

3. Types of Data Problems Handled

CategoryExample
Orphan dataInvoiceLines exists but Invoice deleted
Broken FK referenceProductID refers to non-existing Product
Mismatched totalsHeader totals differ from line sum
Range violationsDateReceived < DateCreated
Duplicate business keyTwo active customers with same PAN
Referential cascades missingDeleting a contract did not delete children

Each issue should be assigned a severity level:

LevelAction
InfoReport only
WarningFix after approval
CriticalAuto-fix immediately

4. Metadata-Driven Rules Engine

Instead of writing rigid SQL scripts, define consistency rules in a rules table.

Example Rule Table

RuleIdRuleTypeTargetTableExpressionSeverityRepairAction
1MissingParentInvoiceLineInvoiceId NOT IN (SELECT Id FROM Invoice)CriticalDelete Line
2IncorrectTotalInvoiceTotal != SELECT SUM(LineAmount)WarningRecalculate Total
3SoftDeleteMismatchCustomerIsDeleted = 1 AND EXISTS(Orders)ManualCancel Orders or Undelete

This approach keeps logic flexible.

5. Detection Patterns

5.1 Orphan Detection Query

SELECT l.Id, l.InvoiceId
FROM InvoiceLine l
LEFT JOIN Invoice i ON l.InvoiceId = i.Id
WHERE i.Id IS NULL;

5.2 Incorrect Total Check

SELECT i.Id, i.Total AS StoredTotal, SUM(l.Amount) AS ActualTotal
FROM Invoice i
JOIN InvoiceLine l ON l.InvoiceId = i.Id
GROUP BY i.Id, i.Total
HAVING SUM(l.Amount) <> i.Total;

5.3 FK Violation Trend Tracking

INSERT INTO AnomalyLog (RuleId, RecordId, CreatedOn)
SELECT 10, CustomerId, GETDATE()
FROM Orders o
WHERE CustomerId NOT IN (SELECT Id FROM Customer);

6. Repair Strategies

StrategyWhen Used
Auto-correctDeterministic fix available (recomputing totals)
Cascade deleteChild has no meaning without parent
Set defaultNull safe fallback when allowed
MergeWhen duplicates must be consolidated
Escalate to workflowWhen human decision required

7. .NET Repair Pipeline Example

public async Task RunRepairAsync()
{
    var rules = await ruleRepo.GetActiveRules();

    foreach (var rule in rules)
    {
        var records = await scanner.Execute(rule.Expression);

        foreach (var record in records)
        {
            switch (rule.RepairAction)
            {
                case RepairAction.Recalculate:
                    await repairExecutor.RecalculateTotals(record);
                    break;
                case RepairAction.Delete:
                    await repairExecutor.DeleteRecord(record);
                    break;
                case RepairAction.FlagForManualReview:
                    await reviewQueue.Add(rule.RuleId, record.Id);
                    break;
            }
        }
    }
}

8. Angular Admin UI

The UI should:

  • Show detected anomalies

  • Show confidence in repair logic

  • Allow bulk approval or reject

  • Show before/after preview

UI Flow

Dashboard → Anomaly List → Preview Impact → Approve/Reject → Execute → Audit Log

9. Scheduling and Execution Model

Use incremental scheduling to avoid large locks.

ModeRuns
OnlineLightweight constraints check
Nightly batchDeep scans, large repair operations
Event-drivenTriggered after migration or integration

10. Governance, Safety, and Compliance

  • Store before/after values for every repair

  • Use versioned snapshot table patterns

  • Integrate role-based access

  • Provide read-only dry-run preview mode

Conclusion

A Data Auto-Repair Engine is essential in large distributed systems where data eventually becomes inconsistent. Instead of running manual SQL patches or firefighting data defects, a controlled automatic system ensures the data stays clean.

By using metadata-driven rules, audit logs, and safe execution workflows, enterprises can maintain trustable data without disrupting live operations.