In this article, we will look at an architectural pattern designed for sharing large amounts of data between two applications: the Claim Check pattern. Before we dig deeper into it, let us first review the basic ways of reliably sharing data between two applications.
Ways to share data between two systems:
Approach 1: HTTP Request
![Http data sharing]()
Pros
· Simple to implement
· Easier to understand and maintain
· Cost Effective
· Can pass large payloads over the network
Cons
· Error-prone
· Can affect destination applications during peak loads
· Cannot scale well due to the above limitations
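For context, Approach 1 can be as simple as the source application pushing each payload straight to the destination over HTTP. The sketch below is purely illustrative and uses the requests library; the endpoint URL and payload shape are assumptions, not part of any system described later in this article.

# approach1_http_sketch.py - minimal sketch of direct HTTP data sharing.
# The endpoint URL and payload shape below are illustrative assumptions.
import requests

def send_via_http(payload: str) -> None:
    # One synchronous call per payload: the destination must be reachable
    # and able to absorb the data at this exact moment.
    response = requests.post(
        "http://destination-app.example.com/ingest",  # hypothetical endpoint
        json={"data": payload},
        timeout=30,
    )
    # Any failure (timeout, 5xx under peak load, etc.) falls back on the caller.
    response.raise_for_status()

send_via_http("Test")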
Approach 2: Message Broker
![Message Broker Data Sharing]()
Pros
· Fault tolerant
· Works asynchronously, so the destination application can consume messages whenever it is available
· Highly scalable
Cons
· Additional cost of maintaining message broker
· Cannot send large payloads via message broker
From the two approaches above, it is fair to conclude that Approach 2 (message broker) works well for mission-critical applications and is the safer choice for production, although it comes with the additional cost of maintaining the broker. We can also conclude that Approach 2 has a data-sharing limitation: message brokers are not designed to carry large payloads.
To overcome this data-sharing limitation, we implement the Claim Check architecture pattern.
![Message Broker Large Data Sharing]()
This diagram illustrates the Claim Check pattern for sharing large payloads between systems using a message broker (RabbitMQ) without sending the actual data through the queue.
Step 1: Store data and return ID
The Web App Source generates a large payload and stores it in an external Data Store (for example, a database or object storage). Instead of sending the full data onward, the source receives a unique identifier (ID) from the data store, also called a claim check token.
Step 2: Send data ID via RabbitMQ
The source application sends only this ID to RabbitMQ. The message is lightweight, fast to transmit, and avoids broker size limits or performance issues caused by large messages.
Step 3: Receive data ID at destination
The Web App Destination consumes the message from RabbitMQ and receives the data ID. At this stage, it knows that new data is available but does not yet have the actual payload.
Step 4: Request data by ID
Using the received ID, the destination application directly requests the full data from the Data Store. The data is fetched only when needed.
Overall, this approach improves performance, reliability, and scalability. RabbitMQ is used purely for coordination and event notification, while large payloads are handled by a storage system better suited for them. This is especially useful in microservices and high-throughput systems.
Note: We also have to maintain cleanup logic for old data, typically as a background job, which adds management overhead. The pattern scales well, but that benefit comes at the cost of extra operational complexity that we need to understand. Whether to use this pattern depends entirely on the use case; it is not meant for every scenario. A sketch of such a cleanup job is shown below.
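This is only a minimal sketch: it assumes a seven-day retention window and reuses the PGSERVER connection string and public.datastore table introduced later in the walkthrough. The retention period and how the job is scheduled (cron, a container job, etc.) are assumptions, not part of the original design.

# cleanup_job.py - sketch of the background cleanup job mentioned above.
# Assumptions: 7-day retention, same PGSERVER / public.datastore as below.
# Scheduling (e.g. cron) is handled outside this script.
import os

import psycopg
from dotenv import load_dotenv

load_dotenv()

RETENTION_DAYS = 7  # assumed retention window

def cleanup_old_claims() -> int:
    """Delete claim-check rows older than the retention window."""
    with psycopg.connect(os.environ["PGSERVER"]) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "DELETE FROM public.datastore "
                "WHERE created_at < now() - make_interval(days => %s)",
                [RETENTION_DAYS],
            )
            deleted = cur.rowcount
        conn.commit()
    return deleted

if __name__ == "__main__":
    print(f"Deleted {cleanup_old_claims()} expired claim-check rows")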
Code Walkthrough
1. Python environment setup
uv venv creates an isolated virtual environment, and uv pip install installs the dependencies: pika (the RabbitMQ client for Python), psycopg (the PostgreSQL driver), and python-dotenv (used to load environment variables from a .env file).
uv venv
uv pip install "pika" "psycopg[binary]" "python-dotenv" --upgrade
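Both scripts in this walkthrough read their connection settings from a .env file through python-dotenv. The file needs only two entries; the values below are placeholder assumptions for a local setup and should be adjusted for your environment.

# .env - placeholder values (assumptions); adjust for your environment
PGSERVER=postgresql://postgres:yourpassword@localhost:5432/postgres
RABBITMQ=localhost

PGSERVER is passed directly to psycopg.connect() as a connection string, and RABBITMQ is passed to pika.ConnectionParameters() as the broker host.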
2. PostgreSQL table
The datastore table stores large message payloads.
id - a UUID (used as the claim check ID)
data - holds the actual message payload
created_at - tracks the insertion date, which is also useful for cleanup by a background job
Only the lightweight UUID is sent via RabbitMQ; the full data stays in Postgres.
-- Table: public.datastore
-- Note: uuidv7() requires PostgreSQL 18+; on older versions use gen_random_uuid() instead.
CREATE TABLE IF NOT EXISTS public.datastore
(
    id uuid NOT NULL DEFAULT uuidv7(),
    data text COLLATE pg_catalog."default" NOT NULL,
    created_at date NOT NULL DEFAULT now(),
    CONSTRAINT datastore_pkey PRIMARY KEY (id)
)
TABLESPACE pg_default;

ALTER TABLE IF EXISTS public.datastore
    OWNER to postgres;
3. Sender (claim_check_sender.py)
Environment variables are loaded (PGSERVER, RABBITMQ).
send_claimcheck_message() receives the message payload.
The payload is inserted into Postgres, and the generated id is returned.
A RabbitMQ connection is created.
The queue claim-check-share is declared.
Only the claim check ID (UUID) is published to the queue.
Connections are closed.
Running the file inserts data into Postgres and sends its ID to RabbitMQ.
import pika
from dotenv import load_dotenv
import os
import psycopg

load_dotenv()

def send_claimcheck_message(message_payload: str):
    claim_check_id = None
    # Connect to an existing database
    with psycopg.connect(os.environ["PGSERVER"]) as conn:
        # Open a cursor to perform database operations
        with conn.cursor() as cur:
            # Store the full payload and get back its claim check ID
            cur.execute("INSERT INTO public.datastore (data) VALUES (%s) RETURNING id", [message_payload])
            (claim_check_id,) = cur.fetchone()
        conn.commit()
    # Publish only the lightweight claim check ID to RabbitMQ
    connection = pika.BlockingConnection(pika.ConnectionParameters(os.environ["RABBITMQ"]))
    channel = connection.channel()
    QUEUE_NAME = 'claim-check-share'
    channel.queue_declare(queue=QUEUE_NAME)
    channel.basic_publish(exchange='',
                          routing_key=QUEUE_NAME,
                          body=str(claim_check_id))
    connection.close()

send_claimcheck_message('Test')
Run it using the command below:
uv run claim_check_sender.py
4. Receiver (claim_check_receive.py)
Environment variables are loaded.
A RabbitMQ connection and channel are created.
The same queue (claim-check-share) is declared.
A callback function is registered to process messages.
When a message arrives, the UUID is decoded from the queue message.
Using this ID, the receiver fetches the actual data from Postgres.
The retrieved data is printed to the console.
#!/usr/bin/env python
import pika, sys, os
from dotenv import load_dotenv
import psycopg

load_dotenv()

QUEUE_NAME = 'claim-check-share'

def main():
    connection = pika.BlockingConnection(pika.ConnectionParameters(host=os.environ["RABBITMQ"]))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_NAME)

    def callback(ch, method, properties, body):
        # The message body carries only the claim check ID
        with psycopg.connect(os.environ["PGSERVER"]) as conn:
            # Open a cursor to perform database operations
            with conn.cursor() as cur:
                # Exchange the claim check ID for the actual payload
                cur.execute("SELECT data FROM public.datastore WHERE id = %s", [body.decode("utf-8")])
                (body_data,) = cur.fetchone()
        print(f" [x] Received {body_data}")

    channel.basic_consume(queue=QUEUE_NAME, on_message_callback=callback, auto_ack=True)
    print(' [*] Waiting for messages. To exit press CTRL+C')
    channel.start_consuming()

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        print('Interrupted')
        try:
            sys.exit(0)
        except SystemExit:
            os._exit(0)
Run it using the command below:
uv run claim_check_receive.py
Now, we can run both of these in parallel.
Every time we execute the sender, one message arrives at the receiver.
Sender
(.venv) PS E:\KnowledgeSharing\claim-check> uv run claim_check_sender.py
(.venv) PS E:\KnowledgeSharing\claim-check> uv run claim_check_sender.py
(.venv) PS E:\KnowledgeSharing\claim-check> uv run claim_check_sender.py
(.venv) PS E:\KnowledgeSharing\claim-check>
Receiver
(.venv) PS E:\KnowledgeSharing\claim-check> uv run claim_check_receive.py
[*] Waiting for messages. To exit press CTRL+C
[x] Received Test
[x] Received Test
[x] Received Test
Conclusion
The Claim Check architecture pattern provides an elegant and practical solution for sharing large payloads between distributed systems without compromising performance or reliability. By decoupling payload storage from message transmission, this pattern allows message brokers like RabbitMQ to focus on what they do best, namely reliable and scalable message delivery, while delegating large data handling to a dedicated storage system such as PostgreSQL or object storage.
Through this approach, systems avoid common pitfalls like broker memory pressure, message size limits, and performance degradation during peak loads. The sender remains lightweight and fast by publishing only a claim check ID, while the receiver gains flexibility by retrieving the actual data only when required. This also improves fault tolerance, as messages can be retried or replayed without duplicating large payload transfers.
The hands-on implementation using Python, RabbitMQ, and PostgreSQL demonstrates how simple and effective this pattern is in real-world scenarios. It fits naturally into microservices, event-driven systems, and high-throughput architectures where scalability and resilience are critical.
In summary, when large data exchange is unavoidable, the Claim Check pattern strikes the right balance between efficiency, scalability, and maintainability, making it a production-ready choice for modern distributed systems.