ppt2txt
A pure-Python utility for extracting text from PPT files.
The code is based on the official documentation for MS-PPT files available at https://msopenspecs.azureedge.net/files/MS-PPT/%5bMS-PPT%5d.pdf.
How to install?
pip install ppt2txt
How to run?
- From command line:
ppt2txt file.ppt -o output_dir
- From Python:
import ppt2txt
# extract content
parsed_ppt_dict = ppt2txt.process("file.ppt")
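To process several files in one go, the call above can be wrapped in a small loop. The following is only a sketch: `extract_folder` and its `process` parameter are hypothetical names introduced for illustration, and it assumes `ppt2txt.process` accepts a path string exactly as in the call above.

```python
from pathlib import Path


def extract_folder(folder, process):
    """Run `process` on every .ppt file in `folder`.

    `process` is any callable that takes a file path string and returns
    the parsed result, for example `ppt2txt.process` (assumption: it is
    called the same way as in the single-file example).
    Returns a dict mapping each file name to its parsed result.
    """
    results = {}
    # The "*.ppt" pattern does not match .pptx files.
    for path in sorted(Path(folder).glob("*.ppt")):
        results[path.name] = process(str(path))
    return results


# Usage (requires ppt2txt to be installed):
# import ppt2txt
# parsed = extract_folder("slides", ppt2txt.process)
```

Passing the extraction function as an argument keeps the helper independent of the library, which also makes it easy to try out with a stand-in function.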
Output
parsed_ppt_dict is a dictionary with the following structure:
{
"filename": "file.ppt",
"slides": 4,
"content": {
"0": "Text from the first record",
"1": "Text from the second record"
}
}
where:
- filename is the name of the input file
- slides is the number of slides
- content is a dictionary containing one element for each text record found in the document
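As an illustration of how this structure might be consumed, the sketch below joins the text records into a single string. The helper names `records_in_order` and `as_plain_text` are hypothetical, and the code assumes that the keys of content are always integer-like strings ("0", "1", ...), as in the example above. Note that the number of records need not equal the value of slides.

```python
def records_in_order(parsed):
    """Return the text records sorted by their numeric key.

    Assumption: keys of parsed["content"] are integer-like strings,
    as in the documented example.
    """
    content = parsed["content"]
    return [content[key] for key in sorted(content, key=int)]


def as_plain_text(parsed):
    """Join all text records into one newline-separated string."""
    return "\n".join(records_in_order(parsed))


# Sample dictionary copied from the structure documented above.
example = {
    "filename": "file.ppt",
    "slides": 4,
    "content": {
        "0": "Text from the first record",
        "1": "Text from the second record",
    },
}

print(as_plain_text(example))
```

Sorting with `key=int` keeps record "10" after record "9", which a plain string sort would not.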
File details
Details for the file ppt2txt-0.1.0.tar.gz.
File metadata
- Download URL: ppt2txt-0.1.0.tar.gz
- Upload date:
- Size: 4.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bc9401f1859475a09657a453577063f396d0157b1d379d755ac6e49fdafd1e8a |
| MD5 | ed21311babf234a0a42573c77e506791 |
| BLAKE2b-256 | 6dc25a4b032934eb4c5518f269f9b18ef6453e0fe877c8642edda541013089ee |
File details
Details for the file ppt2txt-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ppt2txt-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c5535d9a3af1c048814bffbefef62677dfc5c9e051a1e690db2b011448c9198 |
| MD5 | 2d8d6a11cfbfb78357857237e82e71d3 |
| BLAKE2b-256 | 713c9d8e82b6ea753f35c543a615d0dd27521688669d7ea5a7ce6850a25070e0 |
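A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a minimal sketch: `sha256_of` and `matches_published_digest` are hypothetical helper names, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the source distribution was saved in the current directory):
# matches_published_digest(
#     "ppt2txt-0.1.0.tar.gz",
#     "bc9401f1859475a09657a453577063f396d0157b1d379d755ac6e49fdafd1e8a",
# )
```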