Table of Contents
Introduction
What Is Connected Component Labeling?
Real-World Scenario: Real-Time Wildlife Monitoring in National Parks
Step-by-Step Implementation from Scratch
Complete Code with Test Cases
Performance Tips and Best Practices
Conclusion
Introduction
Counting objects in an image sounds simple—until you realize those objects might be touching, overlapping, or partially hidden. Connected Component Labeling (CCL) is a foundational computer vision technique that identifies and labels distinct regions of connected foreground pixels, enabling accurate object counting without deep learning.
While libraries like OpenCV offer connectedComponents(), implementing CCL from scratch in Python gives you full control, transparency, and the ability to deploy in restricted environments, such as remote wildlife cameras with no internet access and a minimal software stack.
In this guide, we’ll build a robust two-pass CCL algorithm using only NumPy and apply it to a critical conservation use case.
What Is Connected Component Labeling?
Connected Component Labeling assigns a unique label to each group of connected foreground pixels (typically the white pixels in a binary image). Two connectivity rules are common:
4-connectivity: Pixels connected horizontally or vertically
8-connectivity: Pixels connected horizontally, vertically, or diagonally
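The difference between the two rules comes down to which neighbor offsets count as connected. The following is a minimal sketch, assuming pixels are addressed as (row, column) tuples; the names OFFSETS_4, OFFSETS_8, and are_connected are illustrative and do not appear in the implementation later in this guide.

```python
# Neighbor offsets as (row_delta, col_delta); the names are illustrative only.
OFFSETS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
OFFSETS_8 = OFFSETS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def are_connected(p, q, connectivity=4):
    """Return True if pixels p and q are direct neighbors under the given rule."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    offsets = OFFSETS_4 if connectivity == 4 else OFFSETS_8
    delta = (q[0] - p[0], q[1] - p[1])
    return delta in offsets


# Two diagonally touching pixels: separate objects under 4-connectivity,
# one object under 8-connectivity.
print(are_connected((0, 0), (1, 1), connectivity=4))  # False
print(are_connected((0, 0), (1, 1), connectivity=8))  # True
```

The choice matters for counting: a thin diagonal line is one object under 8-connectivity but a string of single-pixel objects under 4-connectivity.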
The classic two-pass algorithm works as follows:
First pass: Scan the image, assign temporary labels, and record equivalences
Second pass: Replace temporary labels with final root labels using union-find
The result? A labeled image where each object has a unique integer ID—perfect for counting, tracking, or measuring.
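To make the role of union-find concrete, here is a small sketch of how recorded equivalences collapse provisional labels into final components. The equivalence pairs are hypothetical, chosen only for illustration, and the find and union helpers are a simplified stand-in rather than the exact code used later in this guide.

```python
def find(parent, label):
    """Follow parent links until reaching a label that is its own parent."""
    while parent[label] != label:
        label = parent[label]
    return label


def union(parent, a, b):
    """Merge the sets containing a and b, keeping the smaller root."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[max(root_a, root_b)] = min(root_a, root_b)


# Five provisional labels; index 0 is reserved for the background.
parent = list(range(6))

# Hypothetical equivalences recorded during a first pass.
for a, b in [(2, 1), (4, 3), (5, 4)]:
    union(parent, a, b)

roots = {find(parent, label) for label in range(1, 6)}
print(sorted(roots))  # [1, 3]
print(len(roots))     # 2
```

Note that union always links one root to another root. Linking a root to a non-root label can create a cycle in the parent list, which makes find loop forever.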
Real-World Scenario: Real-Time Wildlife Monitoring in National Parks
Imagine a solar-powered camera trap in Kenya’s Maasai Mara, silently watching a watering hole at night. Its goal: count how many elephants visit each hour to monitor herd movements and prevent human-wildlife conflict.
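Using thermal imaging, the system generates a binary silhouette of warm bodies against a cool background. But elephants often stand close together, and when their silhouettes touch they merge into a single blob. Connected Component Labeling assigns one label to each disconnected blob, so it counts animals correctly only while their silhouettes stay separate; merged silhouettes must first be split, for example with morphological erosion.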
This isn’t theoretical. Projects such as Wildlife Insights and TrailGuard AI already apply automated image analysis to camera-trap imagery, and on edge devices in rainforests and savannas every byte of memory and milliwatt of power counts. A small CCL implementation whose only dependency is NumPy is easy to audit and to deploy in the wild.
Step-by-Step Implementation from Scratch
We’ll implement the two-pass CCL with union-find using only numpy:
Validate input is a binary 2D image
Initialize label matrix and union-find structure
First pass: assign provisional labels and record merges
Flatten equivalence classes using path compression
Second pass: assign final labels
No recursion. No libraries beyond NumPy. Just clean, auditable code.
Complete Code with Test Cases
import unittest

import numpy as np


def connected_component_labeling(binary_image: np.ndarray) -> np.ndarray:
    """
    Perform connected component labeling (4-connectivity) on a binary image.

    Args:
        binary_image: 2D NumPy array with 0 (background) and 1/255 (foreground)

    Returns:
        Labeled image where each connected component has a unique integer ≥1
    """
    if binary_image.ndim != 2:
        raise ValueError("Input must be a 2D binary image.")

    # Normalize to 0/1
    img = (binary_image > 0).astype(np.uint8)
    h, w = img.shape
    labels = np.zeros((h, w), dtype=np.int32)

    # Union-find structure: parent[k] == k marks label k as a root
    parent = [0]  # index 0 unused; labels start at 1
    label_counter = 1

    def find(label: int) -> int:
        """Return the root of `label`, compressing the path iteratively."""
        root = label
        while parent[root] != root:
            root = parent[root]
        # Point every label on the path directly at the root
        while parent[label] != root:
            next_label = parent[label]
            parent[label] = root
            label = next_label
        return root

    # First pass: assign provisional labels and record merges
    for i in range(h):
        for j in range(w):
            if img[i, j] == 0:
                continue

            # Roots of the already-labeled top and left neighbors (4-connectivity)
            roots = []
            if i > 0 and labels[i - 1, j] > 0:
                roots.append(find(int(labels[i - 1, j])))
            if j > 0 and labels[i, j - 1] > 0:
                roots.append(find(int(labels[i, j - 1])))

            if not roots:
                # New component: the new label is its own root
                labels[i, j] = label_counter
                parent.append(label_counter)
                label_counter += 1
            else:
                # Adopt the smallest root and merge the other root into it.
                # Linking roots only (never a root to a non-root label)
                # keeps the parent list free of cycles.
                min_root = min(roots)
                labels[i, j] = min_root
                for root in roots:
                    parent[root] = min_root

    # Flatten equivalence classes so every label points directly at its root
    for label in range(1, len(parent)):
        find(label)

    # Second pass: assign final, consecutive labels
    final_labels = np.zeros_like(labels)
    unique_label = 1
    label_map = {}
    for i in range(h):
        for j in range(w):
            if labels[i, j] == 0:
                continue
            root = find(int(labels[i, j]))
            if root not in label_map:
                label_map[root] = unique_label
                unique_label += 1
            final_labels[i, j] = label_map[root]

    return final_labels


class TestConnectedComponentLabeling(unittest.TestCase):
    def test_single_object(self):
        img = np.array([[0, 0, 0],
                        [0, 1, 0],
                        [0, 0, 0]])
        labeled = connected_component_labeling(img)
        self.assertEqual(labeled[1, 1], 1)
        self.assertEqual(np.max(labeled), 1)

    def test_four_isolated_pixels(self):
        img = np.array([[1, 0, 1],
                        [0, 0, 0],
                        [1, 0, 1]])
        labeled = connected_component_labeling(img)
        self.assertEqual(np.max(labeled), 4)  # 4 isolated pixels = 4 components (4-connectivity)

    def test_connected_block(self):
        img = np.ones((3, 3), dtype=np.uint8)
        labeled = connected_component_labeling(img)
        self.assertEqual(np.max(labeled), 1)  # All connected

    def test_late_merge(self):
        # Label 2 is merged into label 1, then reached again from the row below
        img = np.array([[0, 1, 1],
                        [1, 1, 0],
                        [1, 0, 0]])
        labeled = connected_component_labeling(img)
        self.assertEqual(np.max(labeled), 1)
        self.assertEqual(np.count_nonzero(labeled), 5)

    def test_empty_image(self):
        img = np.zeros((5, 5), dtype=np.uint8)
        labeled = connected_component_labeling(img)
        self.assertTrue(np.all(labeled == 0))

    def test_reject_3d_input(self):
        rgb = np.random.randint(0, 2, (10, 10, 3), dtype=np.uint8)
        with self.assertRaises(ValueError):
            connected_component_labeling(rgb)


if __name__ == "__main__":
    # Run tests
    unittest.main(argv=[''], exit=False, verbosity=2)

    # Demo
    print("\nConnected Component Labeling ready for field deployment!")
    print("Pass a binary 2D NumPy array (0=background, >0=foreground).")
    demo = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 1],
                     [0, 0, 0, 1],
                     [1, 0, 0, 0]])
    result = connected_component_labeling(demo)
    print("Input:\n", demo)
    print("Labeled Output:\n", result)
    print("Number of objects:", np.max(result))
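
Once an image is labeled, the labels can be used for measuring as well as counting. The sketch below assumes a labeled array shaped like the output for the 4x4 demo image above; component_sizes, count_components, and the min_area threshold are illustrative names, not part of the implementation.

```python
import numpy as np


def component_sizes(labeled: np.ndarray) -> np.ndarray:
    """Return the pixel count of each component; index k holds label k + 1."""
    counts = np.bincount(labeled.ravel())
    return counts[1:]  # drop index 0, the background


def count_components(labeled: np.ndarray, min_area: int = 1) -> int:
    """Count components that contain at least min_area pixels."""
    return int(np.sum(component_sizes(labeled) >= min_area))


# Assumed labeled output for the 4x4 demo image.
labeled = np.array([[1, 1, 0, 0],
                    [1, 1, 0, 2],
                    [0, 0, 0, 2],
                    [3, 0, 0, 0]])

print(component_sizes(labeled))               # [4 2 1]
print(count_components(labeled, min_area=2))  # 2
```

Filtering by area is a simple way to ignore single-pixel noise before reporting a count, at the cost of also discarding genuinely small objects.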
Performance Tips and Best Practices
Use 8-connectivity if objects are diagonally touching (modify neighbor checks)
Preprocess with morphological operations to separate nearly-touching objects
Avoid deep recursion—our union-find uses iterative path compression
For large images, consider scanline-based or one-pass algorithms
Validate input: Ensure binary format (0/1 or 0/255) to prevent mislabeling
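As an illustration of the morphological preprocessing tip, here is a minimal NumPy-only sketch of binary erosion with a cross-shaped (4-neighbor) structuring element. The function name erode_cross and the test image are assumptions made for this example. Erosion also shrinks every object, so it suits counting better than measuring.

```python
import numpy as np


def erode_cross(binary: np.ndarray) -> np.ndarray:
    """Keep a pixel only if it and its four direct neighbors are foreground."""
    img = binary > 0
    # Pad with background so border pixels are always eroded away
    padded = np.pad(img, 1, mode="constant", constant_values=False)
    center = padded[1:-1, 1:-1]
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    return (center & up & down & left & right).astype(np.uint8)


# Two 3x3 blobs joined by a one-pixel bridge: a single connected component.
blobs = np.zeros((5, 9), dtype=np.uint8)
blobs[1:4, 1:4] = 1
blobs[1:4, 5:8] = 1
blobs[2, 4] = 1  # the bridge

eroded = erode_cross(blobs)
print(eroded[2])          # [0 0 1 1 0 1 1 0 0]
print(int(eroded.sum()))  # 4
```

After erosion the bridge pixel is gone, so a labeling pass over the eroded image would see two separate components instead of one.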
Conclusion
Connected Component Labeling is more than an academic exercise: it underpins real-world systems that must count, track, or analyze objects with minimal resources. From counting endangered species in remote habitats to monitoring inventory on factory floors, a lightweight, transparent CCL implementation keeps object counts auditable where it matters most. With a compact implementation in Python and NumPy, you now have an object counter that runs without internet access or OpenCV.