officeextractor extracts media files (images, videos, music) from Microsoft Office and LibreOffice files.
officeextractor
About
officeextractor is a Python library to extract media files like images, audio and video from office documents (Microsoft Office & LibreOffice).
Supported File Types
| Supported | File Types | Supported Media Formats |
|---|---|---|
| Microsoft Word | docx, docm, dotm, dotx | images |
| Microsoft Excel | xlsx, xlsb, xlsm, xltm, xltx | images |
| Microsoft PowerPoint | potx, ppsm, ppsx, pptm, pptx, potm | images, video & audio |
| LibreOffice Writer | odt, ott | images |
| LibreOffice Calc | ods, ots | images |
| LibreOffice Impress | odp, otp, odg | images |
⚠ NOTE: Microsoft Office 2003 files (doc, dot, xls, xlt, ppt, pot) are not supported.
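The supported formats above are all ZIP-based containers, while the Office 2003 formats are binary files. That difference is a likely reason for the split. The following is a minimal sketch of the general idea using only the standard library. It is not officeextractor's implementation: the folder names in `MEDIA_PREFIXES` reflect the usual OOXML and ODF package layouts and are an assumption here, and `list_media_members` and `extract_media` are hypothetical helper names.

```python
import zipfile
from pathlib import Path

# Assumed media folders inside the archive (typical OOXML and ODF layouts);
# these are not taken from officeextractor's source code.
MEDIA_PREFIXES = ("word/media/", "xl/media/", "ppt/media/", "Pictures/")


def list_media_members(path):
    """Return the archive member names that sit in a known media folder."""
    if not zipfile.is_zipfile(path):
        # Legacy binary formats (doc, xls, ppt) are not ZIP archives.
        raise ValueError(f"{path} is not a ZIP-based office file")
    with zipfile.ZipFile(path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith(MEDIA_PREFIXES) and not name.endswith("/")
        ]


def extract_media(path, dest):
    """Copy the media members of one office file into dest, flattening paths."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the output folder if missing
    written = []
    with zipfile.ZipFile(path) as archive:
        for name in list_media_members(path):
            target = dest_dir / Path(name).name
            target.write_bytes(archive.read(name))
            written.append(target)
    return written
```

A call such as `extract_media("File1.docx", "out")` would copy every file found under `word/media/` into `out`. Flattening paths keeps the sketch short, at the cost of overwriting media files that share a name.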
Installation
pip install officeextractor
Usage
>>> import officeextractor
>>> officeextractor.extract(src=("File1.docx", "Folder/File2.xlsx"), dest="Path/To/Output/Folder")
4 media files extracted from File1.docx:
- 2 jpeg
- 1 gif
- 1 png
1 media file extracted from Folder/File2.xlsx:
- 1 png
Parameters
officeextractor.extract(src, dest, log=True)
src : str, list of str or tuple of str
Either a single file (string) or several files (list/tuple of strings) as relative or full path.
dest : str
Output directory as relative or full path. If the directory doesn't exist, it will be created.
log : bool, optional
Whether logging is activated. If True, a summary of the extraction is printed. Default is True.
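Because src accepts a list of strings, a whole folder can be processed in one call. The sketch below shows one way to do that. `find_supported_files` and `extract_folder` are hypothetical helpers, not part of officeextractor; the extension set is copied from the supported file types table, and the only library call used is the documented `officeextractor.extract(src, dest, log=True)`.

```python
from pathlib import Path

# Extensions copied from the supported file types table.
SUPPORTED_EXTENSIONS = {
    ".docx", ".docm", ".dotm", ".dotx",
    ".xlsx", ".xlsb", ".xlsm", ".xltm", ".xltx",
    ".potx", ".ppsm", ".ppsx", ".pptm", ".pptx", ".potm",
    ".odt", ".ott", ".ods", ".ots", ".odp", ".otp", ".odg",
}


def find_supported_files(folder):
    """Hypothetical helper: sorted paths (as str) of supported files in folder."""
    return sorted(
        str(p)
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def extract_folder(folder, dest, log=True):
    """Hypothetical helper: pass every supported file in folder to officeextractor."""
    import officeextractor  # imported here so the helper above works without it

    files = find_supported_files(folder)
    if files:
        # A list of strings is one of the documented forms of src.
        officeextractor.extract(src=files, dest=dest, log=log)
    return files
```

Calling `extract_folder("Folder", "Path/To/Output/Folder", log=False)` would then extract from every supported file in `Folder` without printing the summary.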
Release Notes
Licence