n8n Advanced Data Handling: How to Use Set, Split In Batches & Merge Like a Pro

Jayraj Chhaya
Sep 24

Automation is all about moving and transforming data efficiently. While n8n offers hundreds of integrations, its true power lies in how you handle the flow of data between nodes.

Three of the most powerful (but often overlooked) nodes are:
👉 Set – for restructuring data or assigning values that later nodes in the same workflow can reuse
👉 Split In Batches – for breaking data into smaller chunks
👉 Merge – for combining different data streams

This trio forms the backbone of advanced n8n workflows. In this guide, we'll explore what they are, when to use them, and how they work together, complete with real-world examples.

Prerequisites

Before diving in, make sure you have:

Basic knowledge of n8n workflows.
Familiarity with JSON data structures.
Hands-on experience with nodes like HTTP Request, Google Sheets, or the Function node (called the Code node in newer n8n versions).

The Set Node

What is it?

The Set node allows you to define, restructure, or clean your data before passing it to the next step.

Where/Why/When to Use?

To map API fields (rename keys to match another system).
To add default values (e.g., set status = "pending").
To remove unnecessary fields and keep workflows clean.

How to Use?

Open the Set node → Add new fields manually or use expressions.
Choose which fields to keep, remove, or transform.
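
In plain JavaScript terms, keeping only selected fields and adding a default value could look roughly like the sketch below. It illustrates the idea only and is not how the Set node is implemented; the helper name keepFields and the sample field names are made up for this example.

```javascript
// Hypothetical helper: keep only the listed fields, then fill in defaults
// for any field that is still missing.
function keepFields(item, fields, defaults = {}) {
  const result = {};
  for (const field of fields) {
    if (field in item) {
      result[field] = item[field];
    }
  }
  for (const [key, value] of Object.entries(defaults)) {
    if (!(key in result)) {
      result[key] = value;
    }
  }
  return result;
}

// Sample item with two fields we do not want to pass downstream.
const raw = { id: 7, email: "ada@example.com", debugInfo: "x", internalNote: "y" };
const cleaned = keepFields(raw, ["id", "email"], { status: "pending" });
console.log(cleaned); // only id, email, and the default status remain
```

Listing the fields to keep, rather than the fields to drop, means new upstream fields never leak into the payload by accident.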

Real-World Example

Imagine you scraped leads from LinkedIn, each with firstName, lastName, and mail fields. Before sending them to a CRM API, you can use Set to restructure each lead as:

{
	"fullName": "{{ $json.firstName }} {{ $json.lastName }}",
	"primaryEmail": "{{ $json.mail }}",
	"status": "new"
}
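
For readers who think in code, the same mapping can be written as plain JavaScript, for instance as logic you might place in a Code node. This is a sketch under assumptions: the function name toCrmLead and the sample lead are invented here, and the field names are the ones from the example above.

```javascript
// Hypothetical mapping function mirroring the Set fields shown above.
function toCrmLead(lead) {
  return {
    fullName: `${lead.firstName} ${lead.lastName}`, // join the two name parts
    primaryEmail: lead.mail,                        // rename mail -> primaryEmail
    status: "new",                                  // fixed default value
  };
}

// Sample scraped lead (made-up data).
const scraped = { firstName: "Grace", lastName: "Hopper", mail: "grace@example.com" };
const crmLead = toCrmLead(scraped);
console.log(crmLead);
```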

Pros & Cons

✅ Easy to restructure data
✅ Perfect for cleaning payloads
❌ Can get messy with large objects

The Split In Batches Node

What is it?

The Split In Batches node processes large datasets in smaller chunks, instead of sending everything at once.

Where/Why/When to Use?

Handling APIs with rate limits.
Breaking large Google Sheets exports into chunks.
Processing data step by step to avoid overload.

How to Use?

Set the batch size (e.g., 10 items).
Connect the end of the processing branch back to the Split In Batches node so the workflow loops until all batches are processed.

Real-World Example

You fetch 100 contacts from Google Sheets. Instead of sending all 100 emails at once (risking API errors), you:

Use Split In Batches (10 per batch).
Send each batch via the Gmail node.
Continue until all contacts are processed.
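
The batching idea behind these steps can be sketched in a few lines of JavaScript. This is only an illustration of chunking, not the node's actual code: splitIntoBatches is a hypothetical helper, the contacts are generated sample data, and the "send" step just records addresses instead of calling Gmail.

```javascript
// Hypothetical helper: cut a list into consecutive chunks of batchSize items.
function splitIntoBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// 100 made-up contacts, standing in for the Google Sheets rows.
const contacts = Array.from({ length: 100 }, (_, i) => ({
  email: `user${i + 1}@example.com`,
}));
const batches = splitIntoBatches(contacts, 10);

// Stand-in for the Gmail step: record what would be sent, send nothing.
const sentLog = [];
for (const batch of batches) {
  sentLog.push(batch.map((contact) => contact.email));
}
console.log(`${batches.length} batches, last address: ${sentLog[9][9]}`);
```

In a real workflow you would typically also pause between batches (for example with a Wait node) if the API enforces a per-minute limit.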

Pros & Cons

✅ Prevents API quota issues
✅ Makes workflows more reliable
❌ Adds extra looping logic

The Merge Node

What is it?

The Merge node lets you combine two data streams in different ways.

Where/Why/When to Use?

Joining results from two APIs.
Adding extra fields from a lookup database.
Comparing or enriching data.

Merge Modes

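Append → Stack the items from both inputs into one list.
Keep Key Matches → Match records from both inputs on a shared key (like an SQL join).
Pass Through → Forward the items of one input and ignore the other.

Mode names vary between n8n versions, so check which options your Merge node actually offers.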

Real-World Example for Merge Node

Scenario: You have customer orders from one source and customer info from another. You want a single dataset with all details.

Data Streams

Orders: OrderId, CustomerId, Amount
Customers: CustomerId, Name, Email

Merge Node

Mode: Keep Key Matches
Key: CustomerId

Output:

OrderId, CustomerId, Amount, Name, Email

Explanation: Now each order is enriched with customer details, combining data from two sources into one clean dataset.
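
To make the join concrete, here is a minimal JavaScript sketch of merging two lists on CustomerId. It is written under assumptions: mergeByKey is a hypothetical helper, the sample records are invented, and dropping orders without a matching customer (an inner join) is a choice made for this example. How the real Merge node treats unmatched items depends on the mode and n8n version you use.

```javascript
// Hypothetical helper: enrich each order with the customer that shares its key.
// Orders without a matching customer are dropped (inner-join behaviour).
function mergeByKey(orders, customers, key) {
  // Index customers by key so each lookup is O(1).
  const customersByKey = new Map(customers.map((c) => [c[key], c]));
  const merged = [];
  for (const order of orders) {
    const customer = customersByKey.get(order[key]);
    if (customer !== undefined) {
      merged.push({ ...order, ...customer });
    }
  }
  return merged;
}

// Made-up sample data using the field names from the scenario.
const orders = [
  { OrderId: 1, CustomerId: "C1", Amount: 120 },
  { OrderId: 2, CustomerId: "C2", Amount: 75 },
  { OrderId: 3, CustomerId: "C9", Amount: 40 }, // no matching customer
];
const customers = [
  { CustomerId: "C1", Name: "Ada", Email: "ada@example.com" },
  { CustomerId: "C2", Name: "Linus", Email: "linus@example.com" },
];

const enriched = mergeByKey(orders, customers, "CustomerId");
console.log(enriched);
```

The third order shows why key mapping needs care: a mistyped or missing key silently removes the record from the result.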

Pros & Cons

✅ Works like SQL joins
✅ Great for enrichment workflows
❌ Needs careful key mapping

Key Differences

Node	Purpose	Best Use Case	Analogy
Set	Restructure/clean data	Rename, add, or remove fields	Editing a form before submission
Split In Batches	Process in small parts	Handle large datasets & rate limits	Cutting a cake into slices
Merge	Combine two streams	Join API results or datasets	Merging two Excel sheets

When to Use Them Together

Often, these nodes work best in combination. Example:

Fetch 1000 leads from Google Sheets.
Use Split In Batches (50 at a time).
Use Set to restructure fields (firstName + lastName → fullName).
Call CRM API.
Use Merge to combine each lead with the CRM's response and add company IDs.
Save the final data in MongoDB.

This ensures the process is clean, efficient, and scalable.
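
The combined flow can be sketched end to end in JavaScript. Everything here is an assumption made for illustration: fakeCrmCreate stands in for the CRM API, the companyId format is invented, the leads are generated sample data, and the returned array stands in for the MongoDB save.

```javascript
// Hypothetical stand-in for the CRM API: returns one response per lead.
function fakeCrmCreate(leads) {
  return leads.map((lead) => ({
    primaryEmail: lead.primaryEmail,
    companyId: "CMP-" + lead.primaryEmail.split("@")[1], // invented ID format
  }));
}

function runPipeline(rows, batchSize, callCrm) {
  const saved = [];
  for (let start = 0; start < rows.length; start += batchSize) {
    // Split In Batches step: take the next chunk of rows.
    const batch = rows.slice(start, start + batchSize);
    // Set step: restructure each row.
    const leads = batch.map((row) => ({
      fullName: `${row.firstName} ${row.lastName}`,
      primaryEmail: row.mail,
    }));
    // CRM call for this batch only.
    const responses = callCrm(leads);
    // Merge step: match responses to leads on primaryEmail.
    const byEmail = new Map(responses.map((r) => [r.primaryEmail, r]));
    for (const lead of leads) {
      const match = byEmail.get(lead.primaryEmail);
      saved.push({ ...lead, companyId: match ? match.companyId : null });
    }
  }
  return saved; // stands in for the MongoDB insert
}

// 1000 made-up leads, standing in for the Google Sheets rows.
const rows = Array.from({ length: 1000 }, (_, i) => ({
  firstName: `First${i}`,
  lastName: `Last${i}`,
  mail: `lead${i}@example.com`,
}));

let crmCalls = 0;
const result = runPipeline(rows, 50, (leads) => {
  crmCalls += 1;
  return fakeCrmCreate(leads);
});
console.log(`${crmCalls} CRM calls, ${result.length} records saved`);
```

Passing the CRM call in as a function keeps the sketch testable and mirrors how each n8n node can be swapped without touching the rest of the workflow.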

Pros & Cons of the Trio

✅ Essential for advanced workflow design
✅ Help manage data quality, scalability, and enrichment
✅ Make workflows modular & professional
❌ Overusing them may complicate workflows
❌ Debugging batch/merge flows can be time-consuming

Conclusion

The Set, Split In Batches, and Merge nodes are not just "extra" tools; they are core building blocks for advanced n8n workflows.

Use Set to keep data clean.
Use Split In Batches to scale safely.
Use Merge to enrich and unify data.

By mastering this trio, you'll unlock the ability to build robust, scalable, and production-grade automations in n8n. 🚀