Files, Directory, IO  

High-Scale File Sync Service (Detecting Changes, Pushing Deltas, and Tracking Versions)

Synchronizing files across devices, users, and regions is a core requirement in enterprise platforms like cloud storage, document management systems, ERPs, CAD systems, and collaboration tools. A high-scale file sync service must balance consistency, performance, bandwidth efficiency, and security while supporting simultaneous edits and version history.

This article explains how to design and implement a scalable file sync system using Angular front-end and .NET backend, with delta detection, version tracking, and intelligent sync scheduling.

Introduction

A file sync service needs to answer four core questions:

  1. Has a file changed since the last sync?

  2. What exactly changed: the full file or only part of it?

  3. Which version is authoritative if multiple users modified the file?

  4. How do we synchronize these updates efficiently across devices?

Traditional solutions upload full files on every change. This is inefficient and expensive. A modern sync service must:

  • Detect file changes efficiently

  • Transfer only deltas (partial differences)

  • Maintain version history

  • Support conflict resolution

  • Work offline and sync later

This approach reduces:

  • Bandwidth consumption

  • Storage usage

  • Sync duration

  • CPU load

High-Level Architecture

Below is the recommended architecture for a scalable sync system.

 ┌─────────────────────────────┐
 │     Angular Client App      │
 │ (Detect Local Changes, Sync)│
 └─────────────┬───────────────┘
               │ REST / WebSocket
               │
      ┌────────▼───────────┐
      │   Sync Controller  │
      └───────┬────────────┘
              │
     ┌────────▼───────────────────────────┐
     │ Change Detector + Version Manager  │
     └───────┬───────────────┬───────────┘
             │               │
   ┌─────────▼───────┐      │
   │ Delta Engine     │      │
   └───────┬─────────┘      │
           │                │
   ┌───────▼──────────┐   ┌▼───────────────────┐
   │ File Blob Storage │   │ Metadata Database │
   └───────────────────┘   └───────────────────┘

This architecture separates:

  • File storage from metadata

  • Version tracking from actual content

  • Communication from compute processing

Core Components and Responsibilities

Change Detector

Detects if a file has changed. Instead of relying on timestamps, use content hashing:

  • MD5 (fast, risky for collisions)

  • SHA-256 (ideal and secure)

  • Rolling hash (for chunk-based comparison)

Workflow:

  1. File chunked (4MB recommended)

  2. Hash computed per chunk

  3. Hash list compared with server hash list

Only changed chunks are uploaded.

Delta Engine

Responsible for:

  • Identifying changed chunks

  • Applying binary patching

  • Creating new version output

Algorithms often used:

  • Rsync rolling checksum

  • Binary diff algorithms (bsdiff, xdelta)

  • Custom chunk hashing with Merkle trees

A Merkle-tree representation looks like:

        ┌───────────────Root Hash───────────────┐
        │                │                       │
   ChunkGroup 1      ChunkGroup 2          ChunkGroup 3
   │   │   │         │   │   │   │         │   │   │
 chunk chunk chunk  chunk chunk chunk chunk chunk chunk chunk

If one chunk changes, only the affected parents are recalculated.

Version Manager

Each file version must store:

  • Version number

  • Chunk hash list

  • Parent version reference

  • Timestamp

  • Editor identity

  • Sync strategy mode

Example metadata entry:

FileIdVersionParentVersionChangeTypeCreatedByHashRootCreatedAt
F10231918DeltauserAA97FF...2025-11-20

Conflict Resolver

Conflicts occur when two devices modify the same version independently. Strategy options:

  • Last writer wins (not recommended for enterprise)

  • Force manual merge

  • Preserve parallel versions (fork model)

  • Auto-merge text documents using diff+merge logic

Binary merges are harder; treat them as:

  • Manual approval required

  • Producing two branches

Sync Modes

A mature sync service supports three models:

ModeDescriptionUse Case
Push SyncClient detects a change and uploads immediatelySingle active user
Pull SyncServer notifies client when remote change occursShared workspace
Hybrid SyncBoth push and pull based on stateLarge collaboration networks

WebSockets or SignalR are recommended for live sync events.

Angular Implementation

Angular handles:

  • File change detection (local hashing)

  • Upload of changed chunks

  • Download missing chunks

  • UI for version history and conflict actions

Example service (simplified):

computeChunkHashes(file: File): Promise<string[]> {
  return new Promise(resolve => {
    const chunks: string[] = [];
    const reader = new FileReader();
    const chunkSize = 4 * 1024 * 1024;
    let offset = 0;

    reader.onload = async () => {
      const hash = await crypto.subtle.digest("SHA-256", reader.result as ArrayBuffer);
      chunks.push(this.arrayBufferToHex(hash));
      offset += chunkSize;
      if (offset < file.size) {
        reader.readAsArrayBuffer(file.slice(offset, offset + chunkSize));
      } else {
        resolve(chunks);
      }
    };

    reader.readAsArrayBuffer(file.slice(offset, offset + chunkSize));
  });
}

This example demonstrates client-side hashing.

UI must show:

  • File status

  • Progress bar per chunk

  • Version history

  • Conflicts

.NET Backend Implementation

Metadata Database Table Examples

Files
FileVersions
FileChunks
SyncSessions
ConflictLog

Delta Save Logic (simplified)

public async Task StoreFileDelta(Guid fileId, FileDeltaRequest request)
{
    var version = await _versionRepo.CreateVersion(fileId);
    
    foreach (var chunk in request.ChangedChunks)
    {
        await _chunkRepo.StoreChunk(fileId, version.Id, chunk.Index, chunk.Data);
    }

    await _versionRepo.CloseVersion(version.Id, request.HashRoot);
}

Sync Scheduler and Retry Model

A file sync system must handle:

  • Network failures

  • Partial uploads

  • Device offline mode

  • Chunk-level retry

Scheduler design:

 ┌─────────────────┐
 │ Pending Queue   │
 └───────┬──────────┘
         │ retry/backoff
 ┌───────▼──────────┐
 │ Chunk Processor  │
 └───────┬──────────┘
         │
 ┌───────▼────────────┐
 │ Server Commit       │
 └─────────────────────┘

Retry backoff formula example:

Retry delay = min(30 seconds, 2^attempt seconds)

Security Considerations

  • Encrypt file chunks at rest

  • Sign metadata to prevent tampering

  • Use TLS for transfer

  • Secure chunk-level access with signed URLs

  • Validate versions to avoid replay attacks

Testing Strategy

  1. Large file sync performance test

  2. Partial chunk modification detection

  3. Conflict creation and resolution

  4. Multi-device concurrency

  5. Network drops and recovery

  6. Cold start full-resync

  7. Scalability test with 1M+ versions

Summary

A high-scale file sync architecture requires careful planning around:

  • Chunking

  • Hash-based change detection

  • Delta-only transfer

  • Versioning

  • Conflict handling

  • Secure metadata and file storage

  • Reliable retries

With Angular handling intelligent file diffing and a .NET backend managing version logic, the complete system becomes efficient, scalable, and enterprise-ready.