GroupDocs.Metadata for Python via .NET - Read, write and remove metadata from documents and images
Product Page | Docs | Demos | API Reference | Blog | Free Support | Temporary License
GroupDocs.Metadata for Python via .NET is a metadata management API that reads, edits, and removes metadata from documents, spreadsheets, presentations, PDFs, images, audio, and video — 70+ file formats. It works with the major metadata standards (XMP, EXIF, IPTC, Image Resource Blocks, ID3, document properties) through one unified, format-independent API.
Get Started
pip install groupdocs-metadata-net
from groupdocs.metadata import Metadata
with Metadata("document.docx") as metadata:
    root = metadata.get_root_package()
    print("Format:", root.file_type.file_format)
How It Works
The package is a self-contained Python wheel (roughly 170 MB per platform) that bundles an embedded .NET runtime and everything needed to process metadata. No external software installation is required: run pip install and start reading metadata. The wheel supports Python 3.5 through 3.14 on Windows, Linux, and macOS (Intel and Apple Silicon).
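The supported range described above can be checked before installing. The sketch below is a hedged, standard-library-only illustration: the helper name `check_environment` and the machine strings it accepts are assumptions derived from the platforms listed in this description, not part of the package.

```python
import platform
import sys

# Supported range as stated in this description (3.5 through 3.14).
MIN_PYTHON = (3, 5)
MAX_PYTHON = (3, 14)

# Assumed mapping of platform.system() values to the machine names that
# correspond to the platforms listed in this description.
SUPPORTED = {
    "Windows": {"amd64", "x86_64", "x86", "i386", "i686"},
    "Linux": {"x86_64", "amd64"},
    "Darwin": {"x86_64", "arm64"},
}


def check_environment(version=None, system=None, machine=None):
    """Return a list of human-readable problems; empty means it looks fine."""
    version = tuple(version or sys.version_info)[:2]
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()

    problems = []
    if not (MIN_PYTHON <= version <= MAX_PYTHON):
        problems.append("Python %d.%d is outside 3.5-3.14" % version)
    if machine not in SUPPORTED.get(system, set()):
        problems.append("%s/%s is not a listed platform" % (system, machine))
    return problems


for problem in check_environment():
    print("warning:", problem)
```

An empty result only means the interpreter and platform fall inside the listed matrix; it does not confirm that a matching wheel will be selected by pip.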
Features
- Read, edit, and remove metadata from 70+ formats with one unified API.
- Metadata standards: XMP, EXIF, IPTC IIM, Image Resource Blocks, ID3 (ID3v1/ID3v2), Lyrics3, APE.
- Search engine: find, update, add, and remove properties with simple Python predicates and predefined tags.
- One-call sanitize: strip every detected property before sharing a file.
- Document inspection: detect format/MIME type, page count, encryption, digital signatures, comments, and hidden pages.
- Export: dump the metadata tree to CSV, XLSX, JSON, or XML.
- Cross-Platform: Windows x64/x86, Linux x64, macOS x64/ARM64.
Common Tasks
- Detect a file's real format and MIME type by its internal structure
- Read EXIF/XMP/IPTC properties from photos
- Read and edit ID3/APE/Lyrics3 tags in audio files
- Strip author, comments, and revision history from Office documents before publishing
- Find and remove properties that match a condition (by tag, name, type, or value)
- Export a document's full metadata tree to a spreadsheet for auditing
- Feed extracted metadata into a search, compliance, or DAM pipeline
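For the pipeline task in the list above, one possible approach is to flatten properties into plain records before handing them to an indexer. This is a minimal sketch: `to_records` is a hypothetical helper, and it assumes only that each property exposes `name` and `value` attributes. In the package, `value` may be a wrapper object, so `str()` is used as a fallback. The `SimpleNamespace` objects stand in for real properties so the snippet runs without the package installed.

```python
import json
from types import SimpleNamespace


def to_records(properties, source):
    """Flatten property-like objects into JSON-serializable dicts.

    Values that are not plain JSON scalars are converted with str(),
    which is lossy but keeps every record serializable.
    """
    records = []
    for prop in properties:
        value = prop.value
        if not isinstance(value, (str, int, float, bool, type(None))):
            value = str(value)
        records.append({"source": source, "name": prop.name, "value": value})
    return records


# Stand-in objects; with the package installed these would come from
# metadata.find_properties(lambda p: True).
sample = [
    SimpleNamespace(name="Author", value="Jane"),
    SimpleNamespace(name="Pages", value=3),
    SimpleNamespace(name="Keywords", value=["a", "b"]),
]
print(json.dumps(to_records(sample, "input.docx"), indent=2))
```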
Supported File Formats
For a complete list, see supported formats.
- Microsoft Office (Word, Excel, PowerPoint, Visio, Project, OneNote)
- OpenDocument (ODT, ODS, ODP)
- Images (JPEG, PNG, GIF, BMP, TIFF, WebP, DICOM, JPEG 2000, PSD, HEIF/HEIC, CR2/DNG and other RAW)
- Audio (MP3, WAV, OGG)
- Video (AVI, MOV, FLV, ASF, Matroska/MKV)
- Email (EML, MSG)
- eBook (EPUB)
- Archives (ZIP, RAR, 7Z, TAR)
- CAD (DWG, DXF)
- 3D (FBX, STL, 3DS, DAE)
- Fonts (TTF, OTF) and other formats (vCard, torrent)
Examples
Read metadata
from groupdocs.metadata import Metadata
with Metadata("input.docx") as metadata:
    for prop in metadata.find_properties(lambda p: True):
        print(f"{prop.name} = {prop.value}")
Get document info
from groupdocs.metadata import Metadata
with Metadata("input.xlsx") as metadata:
    info = metadata.get_document_info()
    print("Format:", info.file_type.file_format)
    print("MIME type:", info.file_type.mime_type)
    print("Pages:", info.page_count)
    print("Size:", info.size, "bytes")
    print("Encrypted:", info.is_encrypted)
Find properties by tag
from groupdocs.metadata import Metadata
from groupdocs.metadata.tagging import Tags
with Metadata("input.docx") as metadata:
    authors = metadata.find_properties(lambda p: Tags.person.creator in list(p.tags))
    for prop in authors:
        print(prop.name, "=", prop.value)
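Because `find_properties` takes an ordinary Python callable, predicates can be composed from small helpers. The following is a sketch under that assumption: `by_name`, `by_tag`, and `all_of` are hypothetical helper names, and the stand-in objects only mimic the `name` and `tags` attributes used in the example above.

```python
from types import SimpleNamespace


def by_name(name):
    """Match properties whose name equals `name`, ignoring case."""
    wanted = name.lower()
    return lambda p: str(p.name).lower() == wanted


def by_tag(tag):
    """Match properties that carry `tag`, mirroring the list(p.tags) idiom."""
    return lambda p: tag in list(p.tags)


def all_of(*predicates):
    """Combine predicates with logical AND."""
    return lambda p: all(pred(p) for pred in predicates)


# Stand-ins for real properties; plain strings replace Tags.* values here.
props = [
    SimpleNamespace(name="Author", tags=["creator"]),
    SimpleNamespace(name="LastSavedBy", tags=["editor"]),
    SimpleNamespace(name="author", tags=[]),
]
is_tagged_author = all_of(by_name("author"), by_tag("creator"))
print([p.name for p in props if is_tagged_author(p)])
```

With the package installed, the combined predicate could be passed as `metadata.find_properties(is_tagged_author)`, with `Tags.person.creator` in place of the placeholder string.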
Set / update properties and save
from datetime import datetime
from groupdocs.metadata import Metadata
from groupdocs.metadata.common import PropertyValue
from groupdocs.metadata.tagging import Tags
with Metadata("input.docx") as metadata:
    affected = metadata.set_properties(
        lambda p: Tags.time.created in list(p.tags),
        PropertyValue(datetime.now()),
    )
    print("Updated:", affected)
    metadata.save("output.docx")
Remove all metadata (sanitize)
from groupdocs.metadata import Metadata
with Metadata("input.pdf") as metadata:
    removed = metadata.sanitize()
    print("Removed:", removed)
    metadata.save("clean.pdf")
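The same sanitize-then-save sequence can be applied to a folder of files. The sketch below is an illustration under stated assumptions: `sanitize_folder` is a hypothetical helper, it receives the `Metadata` class as a parameter so the package is imported only by the caller, and it relies solely on the `sanitize()` and `save()` calls shown above.

```python
from pathlib import Path


def sanitize_folder(metadata_cls, src_dir, dst_dir, pattern="*.pdf"):
    """Sanitize every file matching `pattern` and save a copy to dst_dir.

    `metadata_cls` is expected to behave like groupdocs.metadata.Metadata:
    a context manager with sanitize() and save(path). Returns a dict that
    maps each file name to the value returned by sanitize().
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    results = {}
    for path in sorted(src.glob(pattern)):
        with metadata_cls(str(path)) as metadata:
            results[path.name] = metadata.sanitize()
            metadata.save(str(dst / path.name))
    return results
```

A call might look like `sanitize_folder(Metadata, "incoming", "clean")` after `from groupdocs.metadata import Metadata`. Since saving is disabled in evaluation mode, this would need a license applied first.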
Export the metadata tree
from groupdocs.metadata import Metadata
from groupdocs.metadata.export import ExportManager, ExportFormat
with Metadata("input.pdf") as metadata:
    properties = list(metadata.find_properties(lambda p: True))
    ExportManager(properties).export("metadata.xlsx", ExportFormat.XLSX)
Load from a stream
import io
from groupdocs.metadata import Metadata
with open("input.docx", "rb") as stream:
    with Metadata(stream) as metadata:
        print("Format:", metadata.file_format)
# downloaded_bytes is assumed to hold the file content as bytes (for example, an HTTP response body)
buf = io.BytesIO(downloaded_bytes)
with Metadata(buf) as metadata:
    print(metadata.get_document_info().file_type.file_format)
AI Agent & LLM Friendly
This package is designed for seamless integration with AI agents, LLMs, and automated code generation tools.
- AGENTS.md in the package: AI coding assistants (Claude Code, Cursor, GitHub Copilot) auto-discover the API surface, usage patterns, and troubleshooting tips from the installed package.
- MCP server: connect your AI tool to GroupDocs documentation for on-demand API lookups:
{ "mcpServers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } }
- Machine-readable docs: full documentation available as plain text for RAG and LLM context:
  - Single file: https://docs.groupdocs.com/metadata/python-net/llms-full.txt
  - Per page: append .md to any docs URL
Evaluation Mode
The API works without a license in evaluation mode, with these limitations:
- Only the first few properties of each metadata package are read.
- Saving files is disabled: save() raises an "Evaluation only" exception.
To remove these limitations, apply a license or request a temporary license:
from groupdocs.metadata import License
License().set_license("path/to/license.lic")
Or set the environment variable (auto-applied at import):
export GROUPDOCS_LIC_PATH="path/to/license.lic"
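When both options are in play, it can help to resolve the license path explicitly before calling `set_license`. This is a hedged sketch: `resolve_license_path` is a hypothetical helper, and the precedence it applies (explicit argument first, then `GROUPDOCS_LIC_PATH`) is an assumption for illustration, not documented package behaviour.

```python
import os


def resolve_license_path(explicit=None, env=None):
    """Return the first existing license file path, or None.

    Checks the explicit argument first, then the GROUPDOCS_LIC_PATH
    environment variable named in this section.
    """
    env = os.environ if env is None else env
    for candidate in (explicit, env.get("GROUPDOCS_LIC_PATH")):
        if candidate and os.path.isfile(candidate):
            return candidate
    return None


path = resolve_license_path()
if path is None:
    print("No license file found; the API would run in evaluation mode.")
else:
    # With the package installed, the call from this section would follow:
    #   from groupdocs.metadata import License
    #   License().set_license(path)
    print("License file:", path)
```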
Troubleshooting
| Issue | Platform | Fix |
|---|---|---|
| System.Drawing.Common is not supported | Linux/macOS | apt-get install libgdiplus (Linux) or brew install mono-libgdiplus (macOS) |
| The type initializer for 'Gdip' threw an exception | macOS | brew install mono-libgdiplus |
| Errors processing images that need fonts | Linux | apt-get install ttf-mscorefonts-installer fontconfig && fc-cache -f |
| DOTNET_SYSTEM_GLOBALIZATION_INVARIANT errors | Linux | Do NOT set this variable. ICU must be available. |
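Some of the rows above can be turned into a quick pre-flight check. The sketch below uses only the standard library; `preflight` is a hypothetical helper, and using `ctypes.util.find_library("gdiplus")` to detect libgdiplus is an assumption that may not match how the embedded runtime locates the library.

```python
import ctypes.util
import os
import platform


def preflight(env=None, system=None, find_library=ctypes.util.find_library):
    """Return warnings for the troubleshooting cases listed in the table."""
    env = os.environ if env is None else env
    system = system or platform.system()
    warnings = []

    # The table says this variable must not be set on Linux.
    if system == "Linux" and "DOTNET_SYSTEM_GLOBALIZATION_INVARIANT" in env:
        warnings.append(
            "Unset DOTNET_SYSTEM_GLOBALIZATION_INVARIANT; ICU must be available."
        )

    # libgdiplus is only relevant on Linux and macOS.
    if system in ("Linux", "Darwin") and find_library("gdiplus") is None:
        if system == "Linux":
            hint = "apt-get install libgdiplus"
        else:
            hint = "brew install mono-libgdiplus"
        warnings.append("libgdiplus not found; try: " + hint)

    return warnings


for message in preflight():
    print("warning:", message)
```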
System Requirements
- Python 3.5 - 3.14
- Windows x64/x86, Linux x64, macOS x64/ARM64
More Resources
Also available for other platforms: .NET | Java | Node.js
File details
Details for the file groupdocs_metadata_net-26.5.0-py3-none-win_amd64.whl.
File metadata
- Download URL: groupdocs_metadata_net-26.5.0-py3-none-win_amd64.whl
- Upload date:
- Size: 170.3 MB
- Tags: Python 3, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c5d9b69ac87d639f8cbe8a3ba9ae156db5b66071daefe5f971fbfaacd1066ae |
| MD5 | fe98b039372a7725ab478b9739e1dada |
| BLAKE2b-256 | 60128c00c137dc05b7e1f80b32102c0dd46c702bdb41b30ea80039ea773980e0 |
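A downloaded wheel can be checked against its published SHA256 digest before installation. The sketch below uses `hashlib` from the standard library; `sha256_of` and `verify` are hypothetical helper names, and the local file location in the usage comment is an assumption.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 matches the expected digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the wheel was downloaded to the current directory):
# verify(
#     "groupdocs_metadata_net-26.5.0-py3-none-win_amd64.whl",
#     "2c5d9b69ac87d639f8cbe8a3ba9ae156db5b66071daefe5f971fbfaacd1066ae",
# )
```

Reading in chunks keeps memory use flat, which matters for files of this size.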
File details
Details for the file groupdocs_metadata_net-26.5.0-py3-none-manylinux1_x86_64.whl.
File metadata
- Download URL: groupdocs_metadata_net-26.5.0-py3-none-manylinux1_x86_64.whl
- Upload date:
- Size: 169.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce16af30bef73cdd3b5b60a11c9fadea371d4fd226742792bb4b7a072fe4ee32 |
| MD5 | 144865a72a18ac021c252d88400f54e9 |
| BLAKE2b-256 | ca900196e6236b35881723c45634e64f63d57fcb509f25d651431055e3a63758 |
File details
Details for the file groupdocs_metadata_net-26.5.0-py3-none-macosx_11_0_arm64.whl.
File metadata
- Download URL: groupdocs_metadata_net-26.5.0-py3-none-macosx_11_0_arm64.whl
- Upload date:
- Size: 169.4 MB
- Tags: Python 3, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a875a8883e1d0e642128f6d9a4784572e261f0e855d3ccff0020cb3826474027 |
| MD5 | 1619817f71387c684868c1e89eff3c77 |
| BLAKE2b-256 | e888f1066db8784a87f3351e7fefe48f3eb36808fc7b7e2a6b08891390bf5d4c |
File details
Details for the file groupdocs_metadata_net-26.5.0-py3-none-macosx_10_14_x86_64.whl.
File metadata
- Download URL: groupdocs_metadata_net-26.5.0-py3-none-macosx_10_14_x86_64.whl
- Upload date:
- Size: 171.7 MB
- Tags: Python 3, macOS 10.14+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4edd97c99ddceeec81f530f4dfb68d375ed291fcf4b6ba38c3dd70fe6ce2b31a |
| MD5 | 71ae96edce4f28ec17950f4167cd27ed |
| BLAKE2b-256 | fbe7c48a08200f27efe2121be2db8681d55f07edc9919081882ebd23fa90638e |