
High-Scale File Sync Service (Detect Changes, Push Deltas, Track Versions)

Introduction

Modern enterprise systems — collaboration platforms, document management, CAD repositories and content delivery networks — need robust file synchronization that works at scale. Users expect near-real-time sync across many devices, efficient use of bandwidth, reliable version history, and safe conflict handling. A naive upload-everything approach soon breaks: it consumes bandwidth, increases storage cost and forces long sync delays.

This article describes a production-ready architecture and implementation patterns for a High-Scale File Sync Service that:

  • Detects local and server-side changes efficiently

  • Transfers only deltas (changed chunks) instead of whole files

  • Tracks versions, parents, and history reliably

  • Resolves conflicts safely and provides clear UX for users

  • Works offline and supports resumable syncs

  • Scales horizontally and secures data at rest and in transit

Examples and code snippets use Angular for the client and .NET (ASP.NET Core) for the backend. Diagrams use a simple block-style format.

Goals and Non-Goals

Goals

  • Minimise uploaded bytes using chunking and delta/diff transfer

  • Provide exact version history with straightforward replay and rollback

  • Support concurrent edits with deterministic conflict strategies

  • Keep sync latency low and resource usage predictable

  • Offer robust retry, resume and partial-sync behaviour

Non-Goals

  • This article does not implement an OT/CRDT merge engine for arbitrary binary formats (that is complex and format-specific). For text documents, CRDTs or operational transform approaches are recommended separately.

High-Level Architecture

 ┌──────────────────────────────┐
 │       Angular Client(s)      │
 │ (Detect change, chunking,    │
 │  delta upload, conflict UI)  │
 └───────────────┬──────────────┘
                 │ HTTPS / WebSocket / SignalR
                 ▼
 ┌───────────────┴──────────────┐
 │      Sync Gateway API        │
 │  (auth, request validation,  │
 │   routing, tenant resolution)│
 └───────┬────────────┬─────────┘
         │            │
         │            ▼
         │     ┌───────────────┐
         │     │  Sync Engine  │
         │     │ (delta, queue,│
         │     │  versioning)  │
         │     └────┬──────────┘
         │          │
 ┌───────▼──────┐ ┌─▼────────────┐
 │ Chunk Storage│ │ Metadata DB  │
 │ (object store│ │ (versions,   │
 │  S3/Blob)    │ │  locks, refs)│
 └──────────────┘ └──────────────┘
         │                │
         ▼                ▼
    CDN / Edge /         Monitoring,
    Backup               Metrics, Audit

Core Components

1. Change Detector (Client Side)

Detect file changes using file system watchers or application-level events. For robust delta transfer, compute chunk-level hashes:

  • Split file into fixed-size chunks (e.g., 4 MiB) or use variable-size rolling chunks (Rsync-style; better for insertions).

  • Compute SHA-256 (recommended) per chunk and keep the resulting list chunkHashes[].

Client sends minimal metadata: file id (or path), file size, last-modified, chunkHashes, and root hash.
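As a sketch of the client-side chunking step, the following Node-flavoured TypeScript splits a buffer into fixed 4 MiB chunks and hashes each with SHA-256. An Angular client would use the Web Crypto API instead of node:crypto; chunkAndHash and CHUNK_SIZE are illustrative names, not a fixed API.

```typescript
import { createHash } from "node:crypto";

const CHUNK_SIZE = 4 * 1024 * 1024; // 4 MiB fixed-size chunks

// Split a buffer into fixed-size chunks and return one SHA-256 hex digest
// per chunk. The last chunk may be shorter than CHUNK_SIZE.
function chunkAndHash(data: Buffer, chunkSize = CHUNK_SIZE): string[] {
  const hashes: string[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    const chunk = data.subarray(offset, offset + chunkSize);
    hashes.push(createHash("sha256").update(chunk).digest("hex"));
  }
  return hashes;
}
```

The client would then send { fileId, size, lastModified, chunkHashes } as the minimal metadata described above.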

Why chunk hashing on client?

  • Server compares hashes to its latest version and requests only missing chunks.

  • It reduces unnecessary transfers on large files with small edits.

2. Delta Engine (Server Side)

The Delta Engine determines which chunks are new and responds with an upload plan:

  • missingChunks = clientChunkHashes - serverKnownHashes

  • Server returns upload URLs (pre-signed) for only missing chunks or an instruction to send a binary patch (bsdiff) depending on algorithm chosen.
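The plan computation above is essentially a set difference. A minimal sketch, assuming serverKnownHashes is an in-memory index of hashes the chunk store already holds:

```typescript
// Return the chunk hashes the client must upload. Hashes already known to
// the server, or repeated within the file, are uploaded at most once.
function planUpload(
  clientChunkHashes: string[],
  serverKnownHashes: Set<string>,
): string[] {
  const missing = new Set<string>();
  for (const h of clientChunkHashes) {
    if (!serverKnownHashes.has(h)) missing.add(h);
  }
  return [...missing];
}
```

In a real service the response would pair each missing hash with a pre-signed upload URL.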

3. Chunk Storage / Blob Store

Store chunks in an object store (S3, Azure Blob, GCS) using content-addressable keys: sha256(chunk) or fileId/version/chunkIndex. Content-addressable storage avoids duplication across tenants and versions.

Keep two stores conceptually:

  • Chunk store: immutable chunk blobs

  • File manifests: metadata that lists chunk sequence for a particular file version (Merkle root, chunk list)

4. Version Manager / Manifest

A version manifest records:

  • fileId, versionId, parentVersionId, chunkHashes[], createdBy, createdAt, changeType

  • rootHash (Merkle root computed over chunk hashes)

The manifest is a small JSON document; write it to the metadata DB atomically once all missing chunks have been uploaded.

5. Conflict Resolver

If two clients modify the same base version concurrently:

  • Detect the conflict by comparing each edit's base parentVersionId against the server's current version.

  • Strategies:

    • Forking: create parallel versions and present both to user for manual merge.

    • Auto-merge for text: use three-way merge if file is text and merges cleanly.

    • Manual resolve: preserve both versions, mark conflict and show UI to merge or choose.

Prefer preserving both versions rather than deleting or silently overwriting.
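The parent-comparison check can be sketched as follows; VersionRef and isConflict are illustrative names, not a fixed API:

```typescript
interface VersionRef {
  versionId: string;
  parentVersionId: string | null;
}

// An incoming version conflicts when it was not based on the server's
// current head, i.e. another client committed in between.
function isConflict(incoming: VersionRef, serverHead: VersionRef | null): boolean {
  // No head yet: only a version claiming no parent is a clean first write.
  if (serverHead === null) return incoming.parentVersionId !== null;
  // Fast-forward: the client based its edit on the current head.
  return incoming.parentVersionId !== serverHead.versionId;
}
```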

Chunking Strategies and Delta Algorithms

Fixed-size chunking

  • Simple, fast, predictable offsets.

  • Poor for insertions (shifts all subsequent chunks).

Rolling hash / Content-defined chunking

  • Uses Rabin fingerprinting to find boundaries that remain stable with insert/delete.

  • Used by rsync, Dropbox. Better delta detection for many real-world edits.
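A minimal content-defined chunker can be sketched with a Gear-style rolling hash. Real implementations use Rabin fingerprints or a randomized gear table; the mask and size constants here are illustrative:

```typescript
const MASK = (1 << 12) - 1;      // boundary when low 12 bits are zero (~4 KiB avg)
const MIN_CHUNK = 1024;          // avoid degenerate tiny chunks
const MAX_CHUNK = 64 * 1024;     // force a cut so chunks stay bounded

// Return the exclusive end offset of each chunk. Because boundaries depend
// only on recent content, an insertion shifts at most nearby boundaries,
// leaving later chunks (and their hashes) unchanged.
function cdcBoundaries(data: Uint8Array): number[] {
  const ends: number[] = [];
  let hash = 0;
  let start = 0;
  for (let i = 0; i < data.length; i++) {
    hash = ((hash << 1) + data[i]) >>> 0; // Gear-style: old bytes shift out
    const len = i - start + 1;
    if ((len >= MIN_CHUNK && (hash & MASK) === 0) || len >= MAX_CHUNK) {
      ends.push(i + 1);
      start = i + 1;
      hash = 0;
    }
  }
  if (start < data.length) ends.push(data.length); // final partial chunk
  return ends;
}
```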

Binary diffs

  • If both sides have similar versions, bsdiff/xdelta produce compact binary patches.

  • Require both versions locally or server-assisted chunk comparison. Good for very specific binary formats.

Merkle Tree for Integrity

  • Arrange chunk hashes into a Merkle tree; store root in manifest.

  • Helps detect tampering and quickly verify integrity on client or server.
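The Merkle root stored in the manifest can be computed by pairwise hashing of the chunk hashes. This sketch promotes an odd node unchanged to the next level, a detail a production scheme would pin down explicitly:

```typescript
import { createHash } from "node:crypto";

function sha256Hex(s: string): string {
  return createHash("sha256").update(s).digest("hex");
}

// Fold a list of chunk hashes into a single Merkle root by repeatedly
// hashing adjacent pairs until one node remains.
function merkleRoot(chunkHashes: string[]): string {
  if (chunkHashes.length === 0) return sha256Hex("");
  let level = chunkHashes;
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(
        i + 1 < level.length ? sha256Hex(level[i] + level[i + 1]) : level[i],
      );
    }
    level = next;
  }
  return level[0];
}
```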

Protocol: Sync Session and APIs

Typical Sync Flow

  1. Client computes chunkHashes[] and requests POST /sync/plan with { fileId, size, chunkHashes, versionHint }.

  2. Server responds with { missingChunks[], uploadUrls[], targetManifestId }.

  3. Client uploads missing chunks directly to blob store using PUT to pre-signed URL.

  4. Client calls POST /sync/complete to signal all chunks uploaded; server validates chunks, writes manifest, and increments version.

  5. Server notifies other devices via WebSocket/SignalR push or by queued notification.

Example API Signatures (REST)

  • POST /api/sync/plan → returns upload plan

  • PUT /api/sync/chunk/{fileId}/{chunkHash} → upload chunk (or via pre-signed URL)

  • POST /api/sync/complete → finalize version (manifest commit)

  • GET /api/files/{fileId}/versions → list versions

  • GET /api/files/{fileId}/manifest/{versionId} → get manifest

Design the APIs to be idempotent. PUT chunk can safely be retried.

.NET Backend Patterns

Metadata Model (SQL)

CREATE TABLE files (
  file_id UUID PRIMARY KEY,
  current_version UUID NULL,
  created_at timestamptz,
  tenant_id UUID
);

CREATE TABLE file_versions (
  version_id UUID PRIMARY KEY,
  file_id UUID REFERENCES files(file_id),
  parent_version UUID NULL,
  root_hash text NOT NULL,
  created_by UUID,
  created_at timestamptz DEFAULT now()
);

CREATE TABLE version_chunks (
  version_id UUID REFERENCES file_versions(version_id),
  chunk_index int,
  chunk_hash text,
  PRIMARY KEY(version_id, chunk_index)
);

Finalize Version (pseudo .NET)

public async Task FinalizeVersion(Guid fileId, Guid versionId, List<string> chunkHashes, Guid parentVersion)
{
    // Commit the manifest in one transaction so a crash cannot leave a
    // partially written version behind.
    using var tx = await _db.BeginTransactionAsync();
    await _db.InsertFileVersion(versionId, fileId, parentVersion, ComputeRootHash(chunkHashes));
    for (int i = 0; i < chunkHashes.Count; i++)
        await _db.InsertVersionChunk(versionId, i, chunkHashes[i]);
    await _db.UpdateFileCurrentVersion(fileId, versionId);
    await tx.CommitAsync();
}

Commit the manifest atomically to prevent partial versions.

Chunk Uploading

  • Provide pre-signed URLs so large chunk upload bypasses the API server.

  • Validate uploaded chunk by server-side checksum before manifest commit.

  • Keep a garbage collection policy to delete orphaned chunks after a retention window.

Client (Angular) Implementation

Client Responsibilities

  • Compute chunk hashes (Web Crypto API)

  • Request upload plan

  • Upload missing chunks with retry and concurrency

  • Notify finalize, show progress

  • Subscribe to push notifications for remote changes

Example chunk hashing in Angular:

async function hashChunk(buffer: ArrayBuffer): Promise<string> {
  // SHA-256 via the Web Crypto API, hex-encoded.
  const hash = await crypto.subtle.digest('SHA-256', buffer);
  return Array.from(new Uint8Array(hash))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

Upload Manager

  • Use a concurrent upload queue (e.g., 4-8 parallel uploads).

  • Persist sync session locally to resume after crash.

  • Expose progress per-chunk and aggregated.
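A bounded-concurrency pool with retry can be sketched as below. The uploadChunk callback (e.g., a PUT to a pre-signed URL) is injected so the queue stays transport-agnostic; all names and the backoff constants are illustrative:

```typescript
// Upload all chunk hashes with at most `concurrency` in-flight uploads,
// retrying each chunk up to maxAttempts times with exponential backoff.
async function uploadAll(
  hashes: string[],
  uploadChunk: (hash: string) => Promise<void>,
  concurrency = 4,
  maxAttempts = 3,
): Promise<void> {
  let next = 0; // shared cursor; safe because JS runs single-threaded
  async function worker(): Promise<void> {
    while (next < hashes.length) {
      const hash = hashes[next++];
      for (let attempt = 1; ; attempt++) {
        try {
          await uploadChunk(hash);
          break;
        } catch (err) {
          if (attempt >= maxAttempts) throw err;
          // Back off before retrying this chunk.
          await new Promise((r) => setTimeout(r, 100 * 2 ** attempt));
        }
      }
    }
  }
  await Promise.all(Array.from({ length: concurrency }, worker));
}
```

Persisting the remaining hashes after each success gives the resumable session described above.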

Offline Support and Resumability

  • Keep local journal: pending uploads, manifest draft, last synced version.

  • When client reconnects:

    • Compare server manifest for file

    • Recompute missing chunks using chunk hashes saved locally

    • Resume pending chunk uploads

Avoid recomputing full-file hashes frequently — cache chunk hashes for unchanged files.

Conflict Detection and UI

  • Show a clear “conflict” state with both versions listed and timestamps.

  • Provide merge helpers:

    • For text, show 3-way diff and allow merge in UI.

    • For binary, allow preview of both and choose one or upload merged version.

  • Provide automatic conflict policies per tenant: last-writer-wins, manual-merge, version-branching.

Security and Compliance

  • Encrypt in transit (TLS 1.2+).

  • Encrypt chunks at rest (server-side encryption or client-side encryption with tenant key).

  • Use signed URLs with short expiry for direct uploads.

  • Authenticate and authorize per-tenant operations.

  • Audit writes, uploads, deletes and version changes for compliance.

For BYOK-sensitive tenants, support client-side envelope encryption: client encrypts chunk payload with tenant key before upload; server stores ciphertext (server never holds raw key).
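A sketch of that envelope scheme using AES-256-GCM from node:crypto (a browser client would use the Web Crypto equivalents; key management and KMS integration are out of scope, and the field names are illustrative):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

interface EncryptedChunk {
  iv: Buffer; ciphertext: Buffer; tag: Buffer;        // chunk encrypted with data key
  keyIv: Buffer; wrappedKey: Buffer; keyTag: Buffer;  // data key wrapped with tenant key
}

// Envelope encryption: a fresh data key encrypts the chunk, the tenant key
// wraps the data key. The server stores only this ciphertext bundle.
function encryptChunk(chunk: Buffer, tenantKey: Buffer): EncryptedChunk {
  const dataKey = randomBytes(32);
  const iv = randomBytes(12);
  const c = createCipheriv("aes-256-gcm", dataKey, iv);
  const ciphertext = Buffer.concat([c.update(chunk), c.final()]);
  const keyIv = randomBytes(12);
  const w = createCipheriv("aes-256-gcm", tenantKey, keyIv);
  const wrappedKey = Buffer.concat([w.update(dataKey), w.final()]);
  return { iv, ciphertext, tag: c.getAuthTag(), keyIv, wrappedKey, keyTag: w.getAuthTag() };
}

function decryptChunk(e: EncryptedChunk, tenantKey: Buffer): Buffer {
  const uw = createDecipheriv("aes-256-gcm", tenantKey, e.keyIv);
  uw.setAuthTag(e.keyTag);
  const dataKey = Buffer.concat([uw.update(e.wrappedKey), uw.final()]);
  const d = createDecipheriv("aes-256-gcm", dataKey, e.iv);
  d.setAuthTag(e.tag);
  return Buffer.concat([d.update(e.ciphertext), d.final()]);
}
```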

Scalability, Performance and Cost Optimizations

  • Use a distributed object store (S3 / Blob) that scales horizontally.

  • Use CDN for serving frequently read versions.

  • Deduplicate chunks globally by content-addressable storage.

  • Compress chunks when beneficial; choose compression tradeoffs (CPU vs bandwidth).

  • Store small files inline in the metadata DB for speed (e.g., <16 KB).

  • Use batched manifest commits and apply backpressure to the notification publisher.

Garbage Collection & Retention

  • Implement GC to remove orphaned chunks:

    • Mark chunks referenced by any manifest as live.

    • Periodically scan chunk store and delete objects not referenced and older than retention threshold.

  • Provide retention policy per-tenant to retain versions for compliance.
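The mark-and-sweep above, sketched over an in-memory model: manifests maps versionId to chunk hashes, chunkAges maps chunk hash to age in days. The names and the retention rule are illustrative.

```typescript
// Mark every chunk referenced by any manifest as live, then sweep
// unreferenced chunks older than the retention window. Returns the hashes
// that would be deleted from the chunk store.
function collectGarbage(
  manifests: Map<string, string[]>,
  chunkAges: Map<string, number>,
  retentionDays: number,
): string[] {
  const live = new Set<string>();
  for (const hashes of manifests.values()) {
    for (const h of hashes) live.add(h);
  }
  const deleted: string[] = [];
  for (const [hash, age] of chunkAges) {
    if (!live.has(hash) && age > retentionDays) deleted.push(hash);
  }
  return deleted;
}
```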

Monitoring, Observability and SLOs

Track:

  • Sync success/failure rates

  • Average upload latency per chunk

  • Percentage of bytes saved by delta vs full upload

  • Number of conflicts per tenant

  • Storage growth and orphaned chunk rate

  • Throughput: chunks/sec, manifests/sec

Expose metrics to Prometheus and tracing via OpenTelemetry. Use traceId correlation for debugging complex sync flows.

Testing Strategy

  • Unit tests for chunk hashing, manifest computation and conflict detection.

  • Integration tests with real object store and database.

  • Performance tests with synthetic large files and concurrent clients.

  • Network-chaos tests: packet loss, slow links, reconnects.

  • End-to-end tests for resume and offline scenarios.

Operational Playbook

  • Onboard: provision storage and default retention per tenant.

  • Throttling: protect system by per-tenant rate limits on chunk uploads.

  • Backfill: support background re-sync jobs for clients that missed pushes.

  • Recovery: if manifest commit fails due to missing chunks, provide clear remediation steps (re-upload missing chunks or roll back).

  • Backups: snapshot metadata DB and ensure object store versioning is enabled or use cross-region replication.

Example End-to-End Scenario

A user edits a 1 GB CAD file, changing a 5 MB subsection. The client detects the changed chunks (say 2 chunks of 4 MiB each). Instead of uploading 1 GB, it uploads only those 2 chunks (~8 MiB) plus a small manifest commit. The server composes the new version by referencing the existing chunks plus the new ones; other devices are notified and download only the two changed chunks.

Summary

A high-scale file sync service must balance correctness, efficiency and usability. The architecture described provides:

  • Efficient delta detection via chunking and content hashing

  • Safe, atomic version commits with manifests and Merkle roots

  • Robust conflict handling with clear UX patterns

  • Offline support and resumable uploads

  • Scalable storage and deduplication patterns

Implement this with Angular for client hashing/upload orchestration and .NET for the Sync Gateway, metadata DB and background processes. Focus on idempotency, observability and security. Start with fixed-size chunking and add rolling chunk logic as you observe real workloads — that is often the fastest path to production.