Open Source Python PDF Merger Library
Free & open source Python library to split, merge, add, rotate and crop pages of PDF documents.
What is pypdf?
pypdf is a free and open-source Python library for working with PDF documents. It supports adding, rotating, cropping, splitting, and merging the pages of PDF files.
Some of the features are listed below:
- Merging PDFs: You can merge multiple PDF documents into a single PDF file using pypdf. This is useful for combining PDF reports, presentations, or other documents.
- Splitting PDFs: pypdf also supports splitting a PDF into multiple smaller PDFs. This can be handy when you want to break down a large PDF into individual sections.
- Rotating Pages: You can rotate individual pages in a PDF document using pypdf. This is useful for correcting the orientation of scanned documents or images.
Getting Started with pypdf
You need Python 3.6.0 or higher to install and use pypdf (recent pypdf releases may require a newer Python version, so check the project's documentation). First install Python, then use the commands below to create a virtual environment and install pypdf with pip.
Linux
python3 -m venv venv
source venv/bin/activate
pip install pypdf
macOS
python3 -m venv venv
source venv/bin/activate
pip install pypdf
Windows
python -m venv venv
venv\Scripts\activate.bat
pip install pypdf
Add, Rotate & Crop PDF Pages
With pypdf's PdfReader and PdfWriter classes you can add pages from an existing PDF to a new document, rotate or crop them along the way, and even attach JavaScript actions to the output file.
Note: Just because content is no longer visible, it is not gone. Cropping works by adjusting the viewbox. That means content that was cropped away can still be restored.
Merge PDF Files
pypdf can also merge multiple PDFs into a single document. The PdfWriter class provides the methods for combining the source files and writing the result.
Split PDF Document
You can split a PDF document into several documents using the PdfReader and PdfWriter classes of the pypdf library. A common case is dividing a document into two halves and saving them as two separate PDF files.
Conclusion
In conclusion, pypdf makes merging PDF documents straightforward, and the same PdfReader and PdfWriter classes also cover splitting, rotating, and cropping pages. This makes it a practical choice for tasks such as document assembly, report generation, and consolidating PDF resources.