ppt2txt
A pure-Python utility to extract text from PPT files.
The code is based on the official documentation for MS-PPT files available at https://msopenspecs.azureedge.net/files/MS-PPT/%5bMS-PPT%5d.pdf.
How to install?
pip install ppt2txt
How to run?
- From command line:
ppt2txt file.ppt -o output_dir
- From python:
import ppt2txt
# extract content
parsed_ppt_dict = ppt2txt.process("file.ppt")
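To process several files from Python, a small wrapper can loop over a folder. The sketch below is only an illustration: `extract_folder` is a hypothetical helper, not part of ppt2txt, and it does not claim to reproduce what the `-o output_dir` option writes. It takes the processing function as an argument and assumes that function returns a JSON-serializable dictionary.

```python
import json
from pathlib import Path


def extract_folder(input_dir, output_dir, process):
    """Run `process` on every .ppt file in input_dir and save each result as JSON.

    `process` is any callable that takes a file path and returns a
    JSON-serializable dictionary (assumed here, e.g. ppt2txt.process).
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for ppt_path in sorted(Path(input_dir).glob("*.ppt")):
        parsed = process(str(ppt_path))
        # Hypothetical naming scheme: one <name>.json file per presentation.
        target = out / (ppt_path.stem + ".json")
        target.write_text(
            json.dumps(parsed, indent=2, ensure_ascii=False), encoding="utf-8"
        )
        written.append(target)
    return written
```

With the package installed, a call would look like `extract_folder("decks", "out", ppt2txt.process)`, where `decks` and `out` are placeholder folder names.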
Output
parsed_ppt_dict is a dictionary with the following structure:
{
    "filename": "file.ppt",
    "slides": 4,
    "content": {
        "0": "Text from the first record",
        "1": "Text from the second record"
    }
}
where:
- filename is the name of the input file
- slides is the number of slides
- content is a dictionary containing an element for each record of type text found in the document
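As a rough illustration of working with this structure, the sketch below joins the text records into a single string. It assumes the keys of `content` are numeric strings, as in the example, and sorts them as integers so that "10" comes after "2". `records_in_order` is a hypothetical helper, and `sample` is a hand-written dictionary that mirrors the documented structure rather than real output.

```python
def records_in_order(parsed):
    """Return the text records sorted by their numeric key."""
    content = parsed.get("content", {})
    # Keys are assumed to be numeric strings ("0", "1", ...), so sort as ints.
    return [content[key] for key in sorted(content, key=int)]


# Hand-written sample mirroring the documented structure (not real output).
sample = {
    "filename": "file.ppt",
    "slides": 4,
    "content": {
        "0": "Text from the first record",
        "1": "Text from the second record",
    },
}

full_text = "\n".join(records_in_order(sample))
print(full_text)
```

In real use, `sample` would be replaced by the dictionary returned from `ppt2txt.process("file.ppt")`.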