How to unzip files in Python
Posted in Python by Dirk - last update: Feb 09, 2024
To unzip a file in Python, you can use the extractall()
function from the built-in zipfile
module or use the unpack_archive()
from the shutil
module. Both methods help extract the contents of a zip
file. While shutil
also works for tar
files, you may need specific libraries for other compression formats (gzip
, bz2
,…).
Example
import zipfile
with zipfile.ZipFile('example.zip', 'r') as zip_ref:
zip_ref.extractall('extracted_folder')
This example opens the 'example.zip'
file and extracts its contents into the 'extracted_folder'
. The extractall()
method extracts all files and directories.
Using unpack_archive from shutil module for Zip files / Tar files
Syntax:
shutil.unpack_archive(filename, extract_dir=None, format=None)
filename
: The name of the archive file to be extracted.
extract_dir
: The optional destination directory where the contents of the archive will be extracted. If not provided, it will be extracted in the current working directory.
format
: The optional format of the archive (‘zip’, ’tar’, ‘gztar’, ‘bztar’, or ‘xztar’). If not specified, the format is determined based on the filename extension.
Example:
import shutil
shutil.unpack_archive('your_file.zip', 'destination_folder')
The contents of your_file.zip
will be unpacked in the ‘destination_folder'
Here an example where the format is specified:
import shutil
shutil.unpack_archive('your_file.tar', 'destination_folder', 'tar')
Special case: password-protected zip files
Handling password-protected ZIP files in Python involves using the zipfile
module along with the ZipFile.setpassword()
method
import zipfile
zip_file_path = 'encrypted.zip'
extract_folder = 'extracted_folder'
password = 'your_password'
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
# Set the password
zip_ref.setpassword(password)
try:
# Extract the contents
zip_ref.extractall(extract_folder)
print("Extraction successful.")
except zipfile.BadZipFile:
print("Incorrect password or the file is not a ZIP file.")
except Exception as e:
print(f"An error occurred: {str(e)}")
with:
zip_file_path
: Path to the password-protected ZIP file.
extract_folder
: Destination folder where the contents will be extracted.
password
: The password used to decrypt the ZIP file.
The with statement is used to ensure that the ZipFile object is properly closed after the block. The setpassword()
method is used to set the password for the ZIP file.
Inside the try block, the extractall()
method is called to extract the contents of the ZIP file. If the password is incorrect or the file is not a valid ZIP file, a BadZipFile
exception is caught. Other exceptions might be caught to handle unforeseen errors during extraction.
Note: Remember to handle passwords securely and avoid hardcoding them directly in your script for security reasons (like in the example above).
For tarfiles only: the tarfile module
Example
import tarfile
with tarfile.open('example.tar', 'r') as tar_ref:
tar_ref.extractall('extracted_folder')
This example uses the tarfile
module to open 'example.tar'
and extracts its contents into the 'extracted_folder'
.
Unpacking rar files
Python’s standard library does not include built-in support for RAR files. However, you can use the rarfile
library, a third-party library that provides functionality for working with RAR files in Python.
To use rarfile
, you’ll need to install it first. You can install it using a package manager like pip. Open your terminal or command prompt and run:
After installing rarfile
, you can use it to work with RAR files in your Python script. Here’s a simple example:
import rarfile
rar_file_path = 'example.rar'
extract_folder = 'extracted_folder'
password = 'your_password' # If the RAR file is password-protected
with rarfile.RarFile(rar_file_path, 'r', password=password) as rar_ref:
rar_ref.extractall(extract_folder)
print("Extraction successful.")
With:
rar_file_path
: Path to the RAR file.
extract_folder
: Destination folder where the contents will be extracted.
password
(optional): If the RAR file is password-protected, you can provide the password during the RarFile instantiation.
There are several other third-party libraries in Python that can be used to unpack or extract files in various formats.
patoolib
: This library supports a wide range of archive formats, including ZIP, TAR, GZ, BZ2, and more.
pyunpack
: This library is built on top of patoolib
and provides a higher-level interface for unpacking various archive formats.
py7zr
: This library focuses on 7z
format, which is known for its high compression ratio.
Other articles