
Open Source Python HTML to PDF Conversion Library

Try this Free & Open Source Python library to convert HTML to PDF documents.

What is xhtml2pdf?

xhtml2pdf is an open source Python library that converts HTML content to PDF documents, with a primary focus on preserving the original structure and styling. It can turn web pages into print-ready PDFs.

Getting Started with xhtml2pdf

xhtml2pdf requires Python 3.8 or higher. After installing Python, use the commands below to create a virtual environment and install xhtml2pdf with pip.

Linux


python -m venv env
. env/bin/activate
pip install xhtml2pdf

macOS


python -m venv env
. env/bin/activate
pip install xhtml2pdf

Windows


python -m venv env
.\env\Scripts\activate
pip install xhtml2pdf

Converting HTML String to PDF Document

xhtml2pdf can convert an HTML string to a PDF document. Store the complete HTML content in a variable and pass it to pisa.CreatePDF(html_content, dest=..., encoding=...), where dest is a writable file-like object such as BytesIO. xhtml2pdf writes the rendered PDF, including hyperlinks, images, and other elements, into that object. Finally, write the bytes held in the BytesIO object to a file to create the PDF.

Output

[Screenshot: the PDF document generated from the HTML string]

Converting HTML File to PDF Document

xhtml2pdf can also convert an HTML file to a PDF. Read the file contents into a variable and pass that variable to pisa.CreatePDF(html_content, dest=..., encoding=...). The remaining steps are the same as in the previous example: the PDF data is collected in a BytesIO object and then written to a PDF file.

Output

[Screenshot: the PDF document generated from the HTML file]

Conclusion

In summary, xhtml2pdf is an open source Python library that converts HTML to PDF documents while handling hyperlinks, images, and external stylesheets. It does not run a template engine or front-end framework during conversion, so placeholders such as {{name}} appear in the PDF exactly as written in the HTML instead of being replaced with their values.

Additionally, it does not support dynamic pages that rely on JavaScript to fetch content, and it does not fully follow complex CSS layouts: it applies properties such as colors and font sizes, but may ignore layout properties such as padding, margin, and display. Despite these constraints, xhtml2pdf remains a valuable tool for straightforward, static HTML-to-PDF conversion.
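One way to work around the placeholder limitation is to fill in the values before the HTML reaches xhtml2pdf. The sketch below uses only the Python standard library; the helper fill_placeholders and the sample values are hypothetical and are not part of xhtml2pdf. A full template engine could be used for the same purpose.

```python
import re
from html import escape
from typing import Mapping

# Matches markers such as {{name}} or {{ name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(html: str, values: Mapping[str, str]) -> str:
    """Replace {{key}} markers with escaped values; unknown keys stay as-is."""

    def replace(match):
        key = match.group(1)
        if key in values:
            return escape(str(values[key]))
        return match.group(0)

    return PLACEHOLDER.sub(replace, html)


template = "<p>Hello, {{name}}! Your order {{ order_id }} has shipped.</p>"
rendered = fill_placeholders(template, {"name": "Ada", "order_id": "1042"})
print(rendered)
```

The rendered string can then be passed to pisa.CreatePDF like any other HTML string.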
