How to Use Python Multiprocessing and Multithreading Effectively

Riya Patel
Dec 05

Introduction

Python is easy to write, but it is often slow on heavy workloads. One reason is that CPython, the standard interpreter, uses a Global Interpreter Lock (GIL) that allows only one thread to execute Python bytecode at a time. Python still gives us two tools, multithreading and multiprocessing, that can make programs faster if they are used correctly. Many developers misuse them and see little or no performance improvement. This article explains when to use threads and when to use processes, with simple explanations and clean examples.

Understanding the Difference Between Multithreading and Multiprocessing

Before optimizing performance, you must understand when to use each one.

Multithreading

Runs multiple threads inside the same Python process.

Best for:

I/O-bound tasks (waiting on files, API calls, network operations)
Programs that spend most of their time waiting

Multiprocessing

Runs multiple processes, each with its own Python interpreter and memory.

Best for:

CPU-bound tasks (math operations, loops, image processing)
Heavy computations

Simple Rule

Use multithreading for I/O-bound work
Use multiprocessing for CPU-bound work

Understanding the GIL (Global Interpreter Lock)

The GIL allows only one thread to execute Python bytecode at a time.

Because of the GIL:

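Threads do NOT run Python code in parallel
Threads CAN overlap I/O operations, because a thread releases the GIL while it waits
Processes CAN run CPU code in parallel, because each process has its own interpreter and its own GIL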

This is why choosing between threads and processes is important.

Multithreading Example (I/O-Bound Tasks)

Example: Downloading multiple URLs

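import threading
import requests

def download(url):
    print(f"Downloading {url}")
    # A timeout keeps one stalled server from blocking its thread forever
    response = requests.get(url, timeout=10)
    print(f"Completed {url} (status {response.status_code})")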

urls = [
    "https://example.com",
    "https://www.google.com",
    "https://www.github.com"
]

threads = []
for url in urls:
    t = threading.Thread(target=download, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

Why This Works

While one thread waits for the network, others continue running
Total time is close to the slowest single request instead of the sum of all requests
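To see the effect without depending on a real network, the sketch below replaces the HTTP request with time.sleep, which blocks without holding the GIL in much the same way as waiting on a socket. The name fake_download and the 0.2-second delay are illustrative assumptions, not part of the example above.

```python
import threading
import time

DELAY = 0.2  # assumed stand-in for network latency, in seconds

def fake_download(item):
    # time.sleep blocks without holding the GIL, so other threads can run
    time.sleep(DELAY)

def run_sequential(items):
    start = time.perf_counter()
    for item in items:
        fake_download(item)
    return time.perf_counter() - start

def run_threaded(items):
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_download, args=(item,)) for item in items]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

items = ["a", "b", "c", "d"]
sequential_time = run_sequential(items)
threaded_time = run_threaded(items)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

With four items, the sequential run has to wait through four delays in a row, while the threaded run waits through them at the same time.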

Multiprocessing Example (CPU-Bound Tasks)

Example: Computing squares of large numbers

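from multiprocessing import Process

def compute(n):
    # CPU-bound loop: square n repeatedly and accumulate the result
    total = 0
    for _ in range(10_000_000):
        total += n * n
    return total

if __name__ == "__main__":
    # Start four worker processes
    processes = []
    for i in range(4):
        p = Process(target=compute, args=(i,))
        processes.append(p)
        p.start()

    # Wait for all of them to finish
    for p in processes:
        p.join()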

Why This Works

Each process runs in parallel using separate CPU cores
Each process has its own GIL, so the workers do not block each other

Using multiprocessing.Pool for Easy Parallel Mapping

Example: Parallel execution with a pool

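from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    # The pool starts 4 worker processes and splits the list among them
    with Pool(processes=4) as pool:
        results = pool.map(square, [1, 2, 3, 4, 5])

    print(results)  # [1, 4, 9, 16, 25]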

Benefits

Simple API
Automatically manages worker processes

Using concurrent.futures for Cleaner Code

concurrent.futures provides modern wrappers for both threads and processes.

ThreadPoolExecutor

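from concurrent.futures import ThreadPoolExecutor
import requests

urls = ["https://example.com", "https://www.github.com"]

def fetch(url):
    return requests.get(url, timeout=10).status_code

with ThreadPoolExecutor(max_workers=5) as executor:
    # executor.map returns a lazy iterator; list() collects the results
    results = list(executor.map(fetch, urls))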

ProcessPoolExecutor

from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # Results come back in input order
        results = list(executor.map(square, range(10)))

Why It’s Useful

Cleaner syntax
Automatic worker management
Easy switching between threads and processes
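Because both executors share the same interface, the choice between threads and processes can be reduced to a single argument. The sketch below illustrates that idea with submit() and as_completed(); the helper run_all and the task word_count are hypothetical names chosen for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed

def word_count(text):
    return len(text.split())

def run_all(executor_cls, func, items, max_workers=4):
    """Run func over items with the given executor class; return {item: result}."""
    results = {}
    with executor_cls(max_workers=max_workers) as executor:
        # submit() returns one Future per item; remember which item it belongs to
        futures = {executor.submit(func, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:
                # Keep the error next to the item that caused it
                results[item] = exc
    return results

if __name__ == "__main__":
    texts = ["one two", "three four five", "six"]
    print(run_all(ThreadPoolExecutor, word_count, texts))
    # Switching to processes only requires passing a different class
    print(run_all(ProcessPoolExecutor, word_count, texts))
```

as_completed() yields futures in the order they finish, which is why the results are stored in a dictionary keyed by input instead of a list.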

Avoiding Common Mistakes

Mistake 1: Using Threads for CPU Tasks

Threads will not speed up CPU tasks due to the GIL.
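A quick way to check this on your own machine is to time the same pure-Python loop run twice in sequence and then in two threads. This is a rough sketch, not a benchmark: the workload size N and the function names are assumptions. On a standard CPython build with the GIL, the two timings are usually close to each other.

```python
import threading
import time

N = 2_000_000  # assumed workload size; adjust for your machine

def cpu_task(n):
    # Pure-Python loop: under the GIL only one thread runs this bytecode at a time
    total = 0
    for i in range(n):
        total += i
    return total

def timed(func):
    start = time.perf_counter()
    func()
    return time.perf_counter() - start

def run_sequential():
    cpu_task(N)
    cpu_task(N)

def run_threaded():
    threads = [threading.Thread(target=cpu_task, args=(N,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

print(f"sequential: {timed(run_sequential):.2f}s")
print(f"threaded:   {timed(run_threaded):.2f}s")
```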

Mistake 2: Using Processes for Small Tasks

Starting a process and pickling its arguments has overhead that can exceed the work itself. For quick tasks, use threads or simply run the code sequentially.

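Mistake 3: Forgetting `if __name__ == "__main__":`

With the "spawn" start method (the default on Windows and macOS), each child process re-imports your main module. Without this guard, the code that starts processes runs again in every child, and the program typically fails with a RuntimeError instead of doing any work.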
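A minimal sketch of the guard in place, with a hypothetical worker function: everything that starts processes lives in main(), and main() only runs when the file is executed directly.

```python
from multiprocessing import Process

def worker(name):
    print(f"Hello from {name}")

def main():
    processes = [Process(target=worker, args=(f"worker-{i}",)) for i in range(2)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

# When a child process imports this module, __name__ is not "__main__",
# so the child defines worker() and main() but does not call main() again.
if __name__ == "__main__":
    main()
```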

Mistake 4: Overusing Too Many Workers

Too many threads → context switching overhead
Too many processes → memory overload

Best Practice

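For CPU-bound work, a good starting point is one worker process per CPU core (I/O-bound thread pools can often use more):

import os
print(os.cpu_count())  # number of logical CPUs; None if it cannot be determined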
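One possible way to turn this into worker counts is sketched below. The variable names are illustrative, and the thread formula simply mirrors the default that ThreadPoolExecutor has used since Python 3.8; treat both numbers as starting points to measure against, not fixed rules.

```python
import os

# os.cpu_count() can return None, so fall back to 1
cpu_cores = os.cpu_count() or 1

# CPU-bound work: about one process per core
process_workers = cpu_cores

# I/O-bound work: threads mostly wait, so more than one per core is reasonable
thread_workers = min(32, cpu_cores + 4)

print(f"cores={cpu_cores}, processes={process_workers}, threads={thread_workers}")
```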

Sharing Data Between Processes

Processes do not share memory by default.

Use multiprocessing.Manager

from multiprocessing import Manager

if __name__ == "__main__":
    with Manager() as manager:
        # A list proxy that can be passed to other processes
        shared_list = manager.list()

When Needed

IPC (Inter-Process Communication)
Shared counters or collections
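As a sketch of how a shared counter might look, the example below has four processes add to one total held in a manager dictionary. The function add_square and the "sum" key are hypothetical; the lock is there because the update reads the old value and then writes a new one.

```python
from multiprocessing import Manager, Process

def add_square(n, totals, lock):
    # Read-modify-write on shared state, so hold the lock for the whole update
    with lock:
        totals["sum"] = totals["sum"] + n * n

def main():
    with Manager() as manager:
        totals = manager.dict({"sum": 0})
        lock = manager.Lock()
        workers = [Process(target=add_square, args=(n, totals, lock)) for n in range(4)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Read the value before the manager shuts down
        return totals["sum"]

if __name__ == "__main__":
    print(main())  # 0 + 1 + 4 + 9 = 14
```

Every access to a manager object is a round trip to the manager process, so this approach suits small amounts of shared state better than large data.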

Avoiding Deadlocks and Race Conditions

Use Locks in Threads

import threading
lock = threading.Lock()  # wrap updates to shared data in "with lock:"
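A short sketch of a lock protecting a shared counter, using hypothetical names. The statement counter += 1 reads the value, adds one, and writes it back, so two threads can interleave those steps and lose updates unless the whole statement runs under the lock.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # Only one thread at a time can be inside this block
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```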

Use Queues Instead of Shared Variables

from queue import Queue
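The sketch below shows one common pattern, assuming three worker threads and a None sentinel to signal that no work is left. Workers take items from one queue and put results on another, so no thread ever modifies a shared variable directly.

```python
import threading
from queue import Queue

def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        results.put(item * item)

tasks, results = Queue(), Queue()
threads = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for t in threads:
    t.start()

for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one sentinel per worker
for t in threads:
    t.join()

# Results arrive in completion order, so sort them for display
squares = sorted(results.get() for _ in range(10))
print(squares)
```

queue.Queue handles its own locking internally, which is why the workers need no explicit Lock.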

Best Practice

Prefer message passing (queues) over shared memory
Avoid modifying shared objects from multiple threads without a lock

Choosing Between Threads and Processes — Quick Guide

Task Type	Best Tool	Reason
API Calls	Threads	Tasks are waiting on network
File I/O	Threads	Not CPU heavy
Mathematical Computation	Processes	Avoids GIL
Image Processing	Processes	CPU intensive
Web Scraping	Threads	Mostly I/O
Data Parsing	Processes	CPU heavy

Conclusion

Python multithreading and multiprocessing can speed up your programs significantly, but only when used correctly. Threads are well suited to I/O tasks like API calls and file operations, while processes are the better choice for CPU-heavy work such as mathematical calculations or data processing. By understanding the GIL, choosing the right tool, avoiding common mistakes, and using modern libraries like concurrent.futures, you can build efficient, fast, and scalable Python applications. With the right strategy, a Python program can use all available CPU cores and spend far less time idle while waiting on I/O.
