Extract text from files of various formats into txt
cameo-txt
cameo-txt is a Python library for converting files of different formats (such as docx, pdf, csv, and odt) into plain-text files.
Installation
You can install the package with the following command:
pip install cameo-txt
Usage
Here is a simple example of how to use the library:
from cameo_txt import convert_to_txt
# A single file
result = convert_to_txt('path/to/your/file.docx')
# Multiple files
results = convert_to_txt(['path/to/your/file1.pdf', 'path/to/your/file2.csv'])
# Save to a specific output folder
results = convert_to_txt(['path/to/your/file1.pdf', 'path/to/your/file2.csv'], output_folder='path/to/output/folder')
Features
cameo-txt provides the following main features:
File download
If a URL is provided, the library automatically downloads the file and saves it as a temporary file.
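The download step described above can be sketched with the standard library alone. This is a minimal illustration, not cameo-txt's actual code: the helper names `is_url` and `download_to_temp` are hypothetical, and it assumes the file extension can be read from the URL path.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True if the source looks like an http(s) URL, not a local path."""
    return urlparse(source).scheme in ("http", "https")


def download_to_temp(url: str) -> str:
    """Download url into a temporary file and return the file's path.

    Hypothetical helper for illustration; the caller is responsible for
    deleting the temporary file when it is no longer needed.
    """
    # Keep the original extension so a later step can choose a converter by suffix.
    suffix = os.path.splitext(urlparse(url).path)[1]
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
            shutil.copyfileobj(response, tmp)
            return tmp.name
```

A caller could check `is_url(source)` first and pass local paths through unchanged, so that the rest of the pipeline only ever sees file paths.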
Multiple format support
Supports files in docx, pdf, csv, and odt formats. You can easily add support for more formats.
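One common way to keep format support easy to extend is a registry that maps file extensions to converter functions. The sketch below shows that pattern under stated assumptions: `CONVERTERS`, `register`, and `extract_text` are hypothetical names, not part of cameo-txt's documented API, and only a csv converter is shown because it needs no third-party packages.

```python
import csv
import os
from typing import Callable, Dict

# Maps a lowercase file extension (with the dot) to a function that
# takes a path and returns the extracted text.
CONVERTERS: Dict[str, Callable[[str], str]] = {}


def register(extension: str):
    """Decorator that registers a converter for one file extension."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        CONVERTERS[extension.lower()] = func
        return func
    return decorator


@register(".csv")
def csv_to_text(path: str) -> str:
    """Join csv cells with tabs and rows with newlines (assumes UTF-8 input)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return "\n".join("\t".join(row) for row in csv.reader(handle))


def extract_text(path: str) -> str:
    """Look up the converter by extension and run it."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in CONVERTERS:
        raise ValueError(f"Unsupported format: {extension!r}")
    return CONVERTERS[extension](path)
```

With this layout, supporting a new format means writing one function and decorating it with `@register(".ext")`; the dispatch code does not change.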
Parallel processing
Uses concurrent.futures to process multiple files in parallel, which improves throughput.
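As a rough illustration of parallel conversion with concurrent.futures, the sketch below fans a list of paths out to a thread pool. The function name `convert_many` and the `convert_one` callback are assumptions made for this example; whether cameo-txt uses threads or processes internally is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def convert_many(
    paths: Sequence[str],
    convert_one: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Apply convert_one to every path using a pool of worker threads.

    executor.map returns results in the same order as the input paths,
    regardless of which conversion finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(convert_one, paths))
```

Threads suit this workload when conversions spend most of their time on file or network I/O. For CPU-heavy parsing, `ProcessPoolExecutor` from the same module accepts the same `map` call, provided `convert_one` is a picklable top-level function.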
Automatic encoding detection
Uses chardet to automatically detect and handle files with different encodings.
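Encoding detection with chardet typically means reading raw bytes, asking `chardet.detect` for its best guess, and decoding with that guess. The sketch below follows that approach; the helper names and the fallback list are assumptions for illustration, and the import is wrapped so the example still runs where chardet is not installed.

```python
try:
    import chardet  # third-party: pip install chardet
except ImportError:  # keep the sketch runnable without chardet
    chardet = None

# Encodings tried in order when chardet is missing or has no guess (an assumption).
FALLBACK_ENCODINGS = ("utf-8-sig", "big5", "cp1252")


def guess_encoding(raw: bytes) -> str:
    """Return a best-guess encoding name for raw bytes."""
    if chardet is not None:
        detected = chardet.detect(raw)["encoding"]  # None when chardet cannot tell
        if detected:
            return detected
    for candidate in FALLBACK_ENCODINGS:
        try:
            raw.decode(candidate)
            return candidate
        except UnicodeDecodeError:
            continue
    return "latin-1"  # decodes any byte sequence


def decode_bytes(raw: bytes) -> str:
    """Decode raw bytes using the guessed encoding, replacing undecodable bytes."""
    try:
        return raw.decode(guess_encoding(raw), errors="replace")
    except LookupError:  # guessed name is not a codec Python knows
        return raw.decode("utf-8", errors="replace")
```

Detection is statistical, so short inputs can be misidentified; `errors="replace"` keeps a wrong guess from raising, at the cost of substituting replacement characters.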