How to unzip files in Python

Posted in Python by Dirk - last update: Feb 09, 2024

To unzip a file in Python, you can use the extractall() function from the built-in zipfile module or use the unpack_archive() from the shutil module. Both methods help extract the contents of a zip file. While shutil also works for tar files, you may need specific libraries for other compression formats (gzip, bz2,…).

Using the extractall from zipfile module (Zip files only)

Example

import zipfile

with zipfile.ZipFile('example.zip', 'r') as zip_ref:
    zip_ref.extractall('extracted_folder')

This example opens the 'example.zip' file and extracts its contents into the 'extracted_folder'. The extractall() method extracts all files and directories.

Using unpack_archive from shutil module for Zip files / Tar files

Syntax:

shutil.unpack_archive(filename, extract_dir=None, format=None)
  • filename: The name of the archive file to be extracted.
  • extract_dir: The optional destination directory where the contents of the archive will be extracted. If not provided, it will be extracted in the current working directory.
  • format: The optional format of the archive (‘zip’, ’tar’, ‘gztar’, ‘bztar’, or ‘xztar’). If not specified, the format is determined based on the filename extension.

Example:


import shutil

shutil.unpack_archive('your_file.zip', 'destination_folder')

The contents of your_file.zip will be unpacked in the ‘destination_folder'

Here an example where the format is specified:

import shutil

shutil.unpack_archive('your_file.tar', 'destination_folder', 'tar')

Special case: password-protected zip files

Handling password-protected ZIP files in Python involves using the zipfile module along with the ZipFile.setpassword() method

import zipfile

zip_file_path = 'encrypted.zip'
extract_folder = 'extracted_folder'
password = 'your_password'

with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    # Set the password
    zip_ref.setpassword(password)

    try:
        # Extract the contents
        zip_ref.extractall(extract_folder)
        print("Extraction successful.")
    except zipfile.BadZipFile:
        print("Incorrect password or the file is not a ZIP file.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

with:

  • zip_file_path: Path to the password-protected ZIP file.
  • extract_folder: Destination folder where the contents will be extracted.
  • password: The password used to decrypt the ZIP file.

The with statement is used to ensure that the ZipFile object is properly closed after the block. The setpassword() method is used to set the password for the ZIP file.

Inside the try block, the extractall() method is called to extract the contents of the ZIP file. If the password is incorrect or the file is not a valid ZIP file, a BadZipFile exception is caught. Other exceptions might be caught to handle unforeseen errors during extraction.

Note: Remember to handle passwords securely and avoid hardcoding them directly in your script for security reasons (like in the example above).

For tarfiles only: the tarfile module

Example

import tarfile

with tarfile.open('example.tar', 'r') as tar_ref:
    tar_ref.extractall('extracted_folder')

This example uses the tarfile module to open 'example.tar' and extracts its contents into the 'extracted_folder'.

Unpacking rar files

Python’s standard library does not include built-in support for RAR files. However, you can use the rarfile library, a third-party library that provides functionality for working with RAR files in Python.

To use rarfile, you’ll need to install it first. You can install it using a package manager like pip. Open your terminal or command prompt and run:

pip install rarfile

After installing rarfile, you can use it to work with RAR files in your Python script. Here’s a simple example:

import rarfile

rar_file_path = 'example.rar'
extract_folder = 'extracted_folder'
password = 'your_password'  # If the RAR file is password-protected

with rarfile.RarFile(rar_file_path, 'r', password=password) as rar_ref:
    rar_ref.extractall(extract_folder)
    print("Extraction successful.")

With:

  • rar_file_path: Path to the RAR file.
  • extract_folder: Destination folder where the contents will be extracted.
  • password (optional): If the RAR file is password-protected, you can provide the password during the RarFile instantiation.

Other 3rd party tools to unpack archives

There are several other third-party libraries in Python that can be used to unpack or extract files in various formats.

  • patoolib: This library supports a wide range of archive formats, including ZIP, TAR, GZ, BZ2, and more.
  • pyunpack: This library is built on top of patoolib and provides a higher-level interface for unpacking various archive formats.
  • py7zr: This library focuses on 7z format, which is known for its high compression ratio.

Other articles