officeextractor extracts media files (images, videos, music) from Microsoft Office and LibreOffice files.
officeextractor
About
officeextractor is a Python library to extract media files like images, audio and video from office documents (Microsoft Office & LibreOffice).
Supported File Types
Application | File Types | Supported Media Formats |
---|---|---|
Microsoft Word | docx, docm, dotm, dotx | images |
Microsoft Excel | xlsx, xlsb, xlsm, xltm, xltx | images |
Microsoft PowerPoint | potx, ppsm, ppsx, pptm, pptx, potm | images, video & audio |
LibreOffice Writer | odt, ott | images |
LibreOffice Calc | ods, ots | images |
LibreOffice Impress | odp, otp, odg | images |
⚠ NOTE: Microsoft Office 2003 files (doc, dot, xls, xlt, ppt, pot) are not supported.
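For context, the supported formats (Office Open XML and OpenDocument) are ZIP containers, whereas the Office 2003 formats are binary files, which is likely why they are excluded. The sketch below is not officeextractor's implementation. It uses only the standard library to list the embedded media of such a container. The folder prefixes are the conventional locations and are an assumption, and the helper name `list_media` is hypothetical.

```python
import zipfile

# Conventional media folders inside the ZIP container (assumption: files
# produced by other tools may store media elsewhere).
MEDIA_PREFIXES = ("word/media/", "xl/media/", "ppt/media/", "Pictures/")


def list_media(path):
    """Return the archive member names that look like embedded media."""
    with zipfile.ZipFile(path) as archive:
        return [
            name
            for name in archive.namelist()
            # Keep members under a media folder, skipping directory entries.
            if name.startswith(MEDIA_PREFIXES) and not name.endswith("/")
        ]
```

Calling `list_media("File1.docx")` on a typical Word document would return names such as `word/media/image1.png`, without extracting anything.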
Installation
pip install officeextractor
Usage
>>> import officeextractor
>>> officeextractor.extract(src=("File1.docx", "Folder/File2.xlsx"), dest="Path/To/Output/Folder")
4 media files extracted from File1.docx:
- 2 jpeg
- 1 gif
- 1 png
1 media file extracted from Folder/File2.xlsx:
- 1 png
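When many documents live in one folder, the `src` list can be built programmatically. The following is a minimal sketch using only the standard library. The helper name `collect_office_files` is hypothetical, and the extension set is copied from the Supported File Types table.

```python
from pathlib import Path

# Extensions taken from the "Supported File Types" table.
SUPPORTED_EXTENSIONS = {
    ".docx", ".docm", ".dotm", ".dotx",
    ".xlsx", ".xlsb", ".xlsm", ".xltm", ".xltx",
    ".potx", ".ppsm", ".ppsx", ".pptm", ".pptx", ".potm",
    ".odt", ".ott", ".ods", ".ots", ".odp", ".otp", ".odg",
}


def collect_office_files(folder):
    """Return sorted paths (as strings) of supported files under folder."""
    return sorted(
        str(p)
        for p in Path(folder).rglob("*")  # search subfolders as well
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

The result is a list of strings, so it can be passed directly as `src`, for example `officeextractor.extract(src=collect_office_files("Folder"), dest="Path/To/Output/Folder")`.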
Parameters
officeextractor.extract(src, dest, log=True)
src : str, list of str or tuple of str
Either a single file (string) or several files (list or tuple of strings), given as relative or absolute paths.
dest : str
Output directory, given as a relative or absolute path. If the directory doesn't exist, it will be created.
log : bool, optional
Whether logging is activated. If True, a summary of the extraction is printed. Default is True.
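With `log=False` no summary is printed. If a tally is still wanted afterwards, the output directory can be inspected directly. This is a rough sketch under stated assumptions: it assumes the extracted files keep their original extensions, and it counts every file under `dest`, including any that were there before the extraction. The layout of the output directory is not documented here, and the helper name `summarize_output` is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_output(dest):
    """Count files under dest by lower-cased extension (without the dot)."""
    return Counter(
        p.suffix.lower().lstrip(".")
        for p in Path(dest).rglob("*")  # include subfolders, if any
        if p.is_file()
    )
```

Iterating over `summarize_output("Path/To/Output/Folder").most_common()` yields `(extension, count)` pairs that can be printed in the same style as the built-in summary.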
Licence
Hashes for officeextractor-0.1.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 74d74f5fef5e247286ea7eb391e64674f0bcb358c43f35d2514420e6968aa72f |
MD5 | ee9c3d402398e6c64ea73c67790137fa |
BLAKE2b-256 | 1e0d54f65b1043f3349f95911f5fb0854f9d4d8b2e77dfaf628a8aea39300180 |