This chapter tackles a complex but essential advanced topic: running multiple tasks at once, either by interleaving them (concurrency) or by executing them truly simultaneously (parallelism). We define both approaches and explain the critical role of Python’s Global Interpreter Lock (GIL).
1. Concurrency vs. Parallelism
While often used interchangeably, these two concepts are distinct:
| Concept | Description | Execution | Best Use Case |
| --- | --- | --- | --- |
| Concurrency | The program manages many tasks over the same period by rapidly switching between them. | Tasks take turns making progress; a single CPU core is enough. | I/O-bound tasks (waiting for network, file access). |
| Parallelism | The program executes many tasks literally simultaneously. | Tasks run at the exact same moment on multiple CPU cores. | CPU-bound tasks (heavy calculations). |
2. The Global Interpreter Lock (GIL)
The Global Interpreter Lock (GIL) is a mutex (a lock) in the default CPython interpreter that ensures only one thread executes Python bytecode at any given time.
- Impact: Even on a multi-core machine, the GIL prevents multiple CPU-intensive Python threads from running truly in parallel. It limits multithreading to concurrency, not parallelism.
- Why it Exists: CPython manages memory with reference counting, and the GIL protects those counts from race conditions without fine-grained locking. This keeps the interpreter simpler and safer, which is why the GIL remains the default. (CPython 3.13 added an optional, experimental free-threaded build that runs without it.)
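To make the impact concrete, the following sketch times a pure-Python countdown run twice in sequence and then in two threads. The function names (`count_down`, `run_sequential`, `run_threaded`) and the loop size are illustrative assumptions, and exact timings depend on the machine and the interpreter build.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; the running thread holds the GIL."""
    while n > 0:
        n -= 1


def run_sequential(n):
    """Run the countdown twice, one after the other, and return elapsed seconds."""
    start = time.perf_counter()
    count_down(n)
    count_down(n)
    return time.perf_counter() - start


def run_threaded(n):
    """Run the countdown in two threads and return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000  # arbitrary workload size chosen for illustration
    print(f"sequential:  {run_sequential(N):.2f}s")
    print(f"two threads: {run_threaded(N):.2f}s")
```

On a standard CPython build with the GIL, the two timings are usually close to each other, because the threads take turns rather than running side by side.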
3. Multithreading (threading module)
Threads are sequences of execution within the same program process. They share the same memory space, making data sharing fast but also dangerous (requiring explicit thread-safe mechanisms).
- Focus: Because of the GIL, threading is best suited for I/O-bound tasks. When one thread is blocked (waiting for a file to load or a network response), it releases the GIL, allowing another thread to run Python code in the meantime. Overlapping the waits this way can significantly reduce the total running time of I/O-heavy programs.
import threading
import time
def process_data(delay, name):
    time.sleep(delay)  # Simulates an I/O wait; sleep() releases the GIL
    print(f"{name} finished after {delay} seconds.")
# Create the threads
t1 = threading.Thread(target=process_data, args=(4, 'Task 1'))
t2 = threading.Thread(target=process_data, args=(2, 'Task 2'))
# Start both threads; they run concurrently
t1.start()
t2.start()
# Wait for both threads to finish (about 4 seconds in total, not 6)
t1.join()
t2.join()
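Because threads share memory, updates to shared data need protection. The sketch below is one minimal way to do that with `threading.Lock`; the names `counter`, `counter_lock`, `increment`, and `run_workers` are hypothetical and chosen only for illustration.

```python
import threading

counter = 0                      # shared state, visible to every thread
counter_lock = threading.Lock()  # guards every update to `counter`


def increment(times):
    """Add 1 to the shared counter `times` times, holding the lock per update."""
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread at a time may enter this block
            counter += 1


def run_workers(num_threads, times):
    """Reset the counter, run `num_threads` incrementing threads, return the total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_workers(4, 10_000))
```

With the lock in place, the total always equals `num_threads * times`. Without it, `counter += 1` is a read-modify-write sequence that threads can interleave, so updates may be lost.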
4. Multiprocessing (multiprocessing module)
Processes are independent execution units that do not share memory. Each process has its own Python interpreter and its own instance of the GIL.
- Focus: Multiprocessing is the solution for true parallelism in Python. It is ideal for CPU-bound tasks (e.g., intensive mathematical calculations) because the tasks can be distributed across multiple CPU cores, each running independently.
- Drawback: Communication between processes is slower than between threads because data must be serialized (pickled) and copied from one process to another (e.g., through queues or pipes).
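import multiprocessing
import os
def calculate_square(number):
    # Runs in a separate process, which the OS can schedule on another CPU core
    print(f"Process ID: {os.getpid()} - Square of {number} is {number * number}")
# The guard is needed on platforms that start child processes by re-importing this module (e.g., Windows)
if __name__ == "__main__":
    # Create processes
    p1 = multiprocessing.Process(target=calculate_square, args=(10,))
    p2 = multiprocessing.Process(target=calculate_square, args=(20,))
    # Start both processes; they can run in parallel
    p1.start()
    p2.start()
    # Wait for both processes to finish
    p1.join()
    p2.join()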
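Since processes do not share memory, results have to be sent back explicitly. The following is a minimal sketch of that pattern using `multiprocessing.Queue`; the helper names `square_worker` and `squares_via_queue` are assumptions made for this example, not part of the module.

```python
import multiprocessing


def square_worker(numbers, result_queue):
    """Runs in the child process: put each (number, square) pair on the queue."""
    for n in numbers:
        result_queue.put((n, n * n))  # the pair is pickled and sent to the parent


def squares_via_queue(numbers):
    """Compute squares in a child process and collect them in the parent."""
    result_queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=square_worker,
                                   args=(numbers, result_queue))
    proc.start()
    # Read one result per input before joining, so the child is never
    # left blocked with unsent data in the queue.
    results = [result_queue.get() for _ in numbers]
    proc.join()
    return dict(results)


if __name__ == "__main__":
    print(squares_via_queue([10, 20]))
```

Every item that crosses the queue is serialized in one process and deserialized in the other, which is the communication cost described above. For many small tasks, `multiprocessing.Pool` wraps the same idea behind a simpler interface.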
