Open Source Python Metadata Library

Free and open source Python library to read and extract metadata from a wide variety of file formats.

What is the hachoir-metadata API for Python?

hachoir-metadata is a Python library that is part of the broader Hachoir project, designed for parsing and extracting metadata from a wide variety of file types. It provides tools to read metadata without needing to decompress or fully decode the files, making it lightweight and efficient for basic metadata inspection tasks.

Features of hachoir-metadata API

hachoir-metadata offers the following features:

  • File Type Support: Works with many file formats, including images, videos, audio files, archives, and documents.
  • Metadata Extraction: Extracts basic metadata such as file size, creation date, modification date, and more format-specific properties (e.g., EXIF for images, codecs for videos, etc.).
  • Read-Only Operations: Focuses on reading and inspecting metadata without modifying the original file.
  • File Type Agnostic: Automatically detects file types and extracts metadata accordingly.
  • Integration: Can be integrated into Python applications for use in workflows like content organization, digital forensics, and archival systems.

Modes of hachoir-metadata API

hachoir-metadata has three modes:

  • classic mode: extracts metadata; use --level=LEVEL to limit the amount of information displayed (not the amount extracted)
  • --type: shows the file format and the most important information on one line
  • --mime: displays only the file's MIME type
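
The three modes above can also be driven from a Python script. The following is a minimal sketch, assuming the hachoir-metadata command is available on your PATH; the helper names build_command and run_mode are illustrative and not part of hachoir itself.

```python
import subprocess


def build_command(path, mode="classic", level=None):
    """Build the argument list for one of the three hachoir-metadata modes."""
    cmd = ["hachoir-metadata"]
    if mode == "type":
        cmd.append("--type")
    elif mode == "mime":
        cmd.append("--mime")
    elif mode == "classic":
        # --level only limits what is displayed in classic mode.
        if level is not None:
            cmd.append(f"--level={level}")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd.append(path)
    return cmd


def run_mode(path, mode="classic", level=None):
    """Run the command-line tool and return its standard output as text.

    Assumes the hachoir-metadata executable is installed and on PATH.
    """
    result = subprocess.run(
        build_command(path, mode, level),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

For example, run_mode("sample.mp4", mode="mime") would run the tool in MIME mode on a placeholder file named sample.mp4.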

Getting Started with Hachoir API for Python

To use hachoir-metadata, you need Python 3.6 or later and the hachoir package installed on your system. Install Python first, then install hachoir with pip, preferably inside a virtual environment:


pip install hachoir

Alternatively, you can install hachoir from the GitHub repository using the following steps:


1. Check out the source code from the GitHub repository: git clone https://github.com/vstinner/hachoir.git
2. Run setup.py to install the module from source: python setup.py install [--user|--prefix=]
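
After either installation route, a quick check from Python can confirm that the package is importable. This is a small sketch using only the standard library; the function name hachoir_available is illustrative.

```python
import importlib.util


def hachoir_available():
    """Return True if the hachoir package can be found in this environment."""
    return importlib.util.find_spec("hachoir") is not None


print("hachoir installed:", hachoir_available())
```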

Working with hachoir-metadata API for Python - Examples

The hachoir-metadata API for Python lets you read metadata from media files and other supported file types. With just a few lines of code, you can build applications that read metadata from different file formats. The following code samples show how the hachoir-metadata API can be used in Python applications.

hachoir-metadata supports reading metadata from a variety of file formats, including images, videos, audio files, and archives. You read a file's metadata by creating a parser with createParser and passing it to the extractMetadata function. Check the code snippet below, where we read the metadata from a video file.
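
A minimal sketch of that flow is shown here, following the usage pattern from the hachoir documentation (createParser, extractMetadata, and exportPlaintext). The wrapper function read_metadata and the file name sample.mp4 are illustrative placeholders, not part of the library.

```python
def read_metadata(path):
    """Return the metadata of `path` as a list of text lines, or None.

    None is returned when no parser matches the file or when no
    metadata could be extracted from it.
    """
    # Imported here so this module still loads where hachoir is absent.
    from hachoir.parser import createParser
    from hachoir.metadata import extractMetadata

    parser = createParser(path)
    if parser is None:
        # hachoir could not find a parser for this file.
        return None

    # The parser is used as a context manager so the file is closed afterwards.
    with parser:
        metadata = extractMetadata(parser)

    if not metadata:
        return None

    # First line is the title "Metadata:", then one "- Key: value" per item.
    return metadata.exportPlaintext()


# Example usage (sample.mp4 is a placeholder file name):
# for line in read_metadata("sample.mp4") or []:
#     print(line)
```

Returning None for unsupported files keeps the caller's error handling simple; an application that needs to distinguish "unknown format" from "no metadata" could raise separate exceptions instead.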

Output

When you execute this code, the output will be similar to the following (depending on the information available in your sample file):


Metadata:
- Duration: 1 min 56 sec 261 ms
- Image width: 1280 pixels
- Image height: 720 pixels
- Creation date: 1904-01-01 00:00:00
- Last modification: 1904-01-01 00:00:00
- Comment: Play speed: 100.0%
- Comment: User volume: 100.0%
- MIME type: video/mp4
- Endianness: Big endian
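
If you need these values in a program rather than on screen, the text lines can be grouped by key. The following is a sketch that relies only on the "- Key: value" layout shown in the output above; parse_plaintext is an illustrative helper, not a hachoir function. Values are stored in lists because a key such as Comment can appear more than once.

```python
def parse_plaintext(lines):
    """Group '- Key: value' lines into a dict mapping each key to a list of values."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("- "):
            continue  # skips the "Metadata:" title line and blank lines
        # Split on the first ": " only, so values may contain colons.
        key, sep, value = line[2:].partition(": ")
        if not sep:
            continue
        result.setdefault(key, []).append(value)
    return result


sample = [
    "Metadata:",
    "- Duration: 1 min 56 sec 261 ms",
    "- Creation date: 1904-01-01 00:00:00",
    "- Comment: Play speed: 100.0%",
    "- Comment: User volume: 100.0%",
    "- MIME type: video/mp4",
]
parsed = parse_plaintext(sample)
print(parsed["MIME type"])
```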

Conclusion

The hachoir-metadata API offers a lightweight solution for extracting metadata from a wide variety of file formats, making it a useful tool for Python developers working in fields like digital forensics, content management, and data analysis. Because it only reads files and never modifies them, the original data stays intact, and its Python interface makes it straightforward to integrate into applications and workflows. With support for diverse file types and metadata properties, hachoir-metadata is a versatile choice for quick metadata inspection in both personal and professional projects.
