Why was multithreading faster than multiprocessing?

I recently wrote a small snippet to read a file using multithreading as well as multiprocessing. I noticed that time taken to read the file using multithreading was less compared to multiprocessing. file was around 2 gb

Multithreading code

import time import threading def process_chunk(chunk): # Simulate processing the chunk (replace with your actual logic) # time.sleep(0.01) # Add a small delay to simulate work print(chunk) # Or your actual chunk processing def read_large_file_threaded(file_path, chunk_size=2000): try: with open(file_path, ‘rb’) as file: threads = [] while True: chunk = file.read(chunk_size) if not chunk: break thread = threading.Thread(target=process_chunk, args=(chunk,)) threads.append(thread) thread.start() for thread in threads: thread.join() #wait for all threads to complete. except FileNotFoundError: print(« error ») except IOError as e: print(e) file_path = r »C:UsersrohitVideosCaptureseee.mp4″ start_time = time.time() read_large_file_threaded(file_path) print(« time taken « , time.time() – start_time)

Multiprocessing code import time import multiprocessing

import time import multiprocessing def process_chunk_mp(chunk): « » »Simulates processing a chunk (replace with your actual logic). » » » # Replace the print statement with your actual chunk processing. print(chunk) # Or your actual chunk processing def read_large_file_multiprocessing(file_path, chunk_size=200): « » »Reads a large file in chunks using multiprocessing. » » » try: with open(file_path, ‘rb’) as file: processes = [] while True: chunk = file.read(chunk_size) if not chunk: break process = multiprocessing.Process(target=process_chunk_mp, args=(chunk,)) processes.append(process) process.start() for process in processes: process.join() # Wait for all processes to complete. except FileNotFoundError: print(« error: File not found ») except IOError as e: print(f »error: {e} ») if __name__ == « __main__ »: # Important for multiprocessing on Windows file_path = r »C:UsersrohitVideosCaptureseee.mp4″ start_time = time.time() read_large_file_multiprocessing(file_path) print(« time taken « , time.time() – start_time)

submitted by /u/rohitwtbs to r/Python
[link] [comments]

Why was multithreading faster than multiprocessing?

Commentaires

Laisser un commentaire Annuler la réponse