officeextractor extracts media files (images, audio, video) from Microsoft Office and LibreOffice files.
officeextractor
About
officeextractor is a Python library to extract media files like images, audio and video from office documents (Microsoft Office & LibreOffice).
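As background only, and not a description of how officeextractor is implemented: the modern Office (OOXML) and OpenDocument formats are ZIP containers, and embedded media conventionally sits under folders such as `word/media/`, `xl/media/`, `ppt/media/` or `Pictures/`. The sketch below uses only the standard library to list such entries. The helper name `list_media` and the prefix list are assumptions made for illustration.

```python
import zipfile

# Folder prefixes where embedded media is conventionally stored.
# This list is an illustrative assumption, not taken from officeextractor.
MEDIA_PREFIXES = ("word/media/", "xl/media/", "ppt/media/", "Pictures/")


def list_media(path_or_file):
    """Return the archive member names that look like embedded media."""
    with zipfile.ZipFile(path_or_file) as archive:
        return [
            name
            for name in archive.namelist()
            # Keep entries under a media folder, skipping directory entries.
            if name.startswith(MEDIA_PREFIXES) and not name.endswith("/")
        ]
```

Calling `list_media("File1.docx")` on a document with embedded pictures would return names such as `word/media/image1.png`. The library itself may use a different approach.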
Supported File Types
Application | File Types | Supported Media Formats |
---|---|---|
Microsoft Word | docx, docm, dotm, dotx | images |
Microsoft Excel | xlsx, xlsb, xlsm, xltm, xltx | images |
Microsoft PowerPoint | potx, ppsm, ppsx, pptm, pptx, potm | images, video & audio |
LibreOffice Writer | odt, ott | images |
LibreOffice Calc | ods, ots | images |
LibreOffice Impress | odp, otp, odg | images |
⚠ NOTE: Microsoft Office 2003 files (doc, dot, xls, xlt, ppt, pot) are not supported.
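Since legacy formats are not supported, it can help to filter a batch of paths before passing them on. The following is a minimal sketch, not part of officeextractor: the names `SUPPORTED_EXTENSIONS`, `is_supported` and `split_supported` are hypothetical, the extension set is copied from the table above, and case-insensitive matching is an assumption.

```python
from pathlib import Path

# Extensions copied from the supported file types table (modern formats only).
SUPPORTED_EXTENSIONS = {
    "docx", "docm", "dotm", "dotx",
    "xlsx", "xlsb", "xlsm", "xltm", "xltx",
    "potx", "ppsm", "ppsx", "pptm", "pptx", "potm",
    "odt", "ott", "ods", "ots", "odp", "otp", "odg",
}


def is_supported(path):
    """Return True if the file extension appears in the supported set."""
    # Assumption: extensions are compared case-insensitively.
    return Path(path).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS


def split_supported(paths):
    """Split paths into (supported, unsupported) lists, preserving order."""
    supported = [p for p in paths if is_supported(p)]
    unsupported = [p for p in paths if not is_supported(p)]
    return supported, unsupported
```

For example, `split_supported(["a.pptx", "b.ppt"])` separates the modern presentation from the legacy one, so only the first would be handed to the extractor.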
Installation
pip install officeextractor
Usage
>>> import officeextractor
>>> officeextractor.extract(src=("File1.docx", "Folder/File2.xlsx"), dest="Path/To/Output/Folder")
4 media files extracted from File1.docx:
- 2 jpeg
- 1 gif
- 1 png
1 media file extracted from Folder/File2.xlsx:
- 1 png
Parameters
officeextractor.extract(src, dest, log=True)
src : str, list of str or tuple of str
Either a single file (string) or several files (list/tuple of strings), given as relative or absolute paths.
dest : str
Output directory as relative or full path. If the directory doesn't exist, it will be created.
log : bool, optional
Whether logging should be activated or not. If True, a summary of the extraction is printed. Default is True.
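The parameters can be combined in a small wrapper. The sketch below is an illustration under stated assumptions: `existing_files` and `extract_quietly` are hypothetical helper names, and the check for missing files is added here because how the library treats a non-existent source path is not documented above. Only the documented call `officeextractor.extract(src, dest, log)` is used.

```python
from pathlib import Path


def existing_files(src):
    """Normalise src (str, list or tuple) to a list of paths that exist."""
    candidates = [src] if isinstance(src, str) else list(src)
    return [p for p in candidates if Path(p).is_file()]


def extract_quietly(src, dest):
    """Call officeextractor.extract with log=False for files that exist."""
    # Imported lazily; requires `pip install officeextractor`.
    import officeextractor

    files = existing_files(src)
    if files:
        # log=False suppresses the printed summary.
        officeextractor.extract(src=files, dest=dest, log=False)
    return files
```

`extract_quietly("File1.docx", "Path/To/Output/Folder")` accepts a single string just as `extract` does, and returns the list of files that were actually passed on.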