carloforte
Extract structured data from Excel files with minimal token usage.
carloforte uses an island-detection algorithm to convert Excel sheets into a compact intermediate representation (CSV, Markdown, or JSON), making it efficient to pass spreadsheet data to LLMs.
Installation
uv add carloforte
Usage
import carloforte
# Extract all sheets as CSV (default)
text = carloforte.extract("data.xlsx")
# Extract specific sheets as Markdown
text = carloforte.extract("data.xlsx", sheets=["Revenue", "Costs"], fmt="markdown")
# Extract as JSON
text = carloforte.extract("data.xlsx", fmt="json")
Formats
| Format | Best for |
|---|---|
| csv | Compact, low token count |
| markdown | Readable, good for LLM prompts |
| json | Structured output, programmatic use |
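To make the trade-offs between formats concrete, here is a minimal sketch that serialises one small table three ways with the standard library and compares character counts. The helpers `to_csv`, `to_markdown`, and `to_json` are hypothetical and are not carloforte's own serialisers. Character count is only a rough proxy for token count.

```python
import csv
import io
import json

# A tiny illustrative table; the first row is the header.
TABLE = [
    ["Region", "Q1", "Q2"],
    ["North", 120, 135],
    ["South", 98, 110],
]


def to_csv(rows):
    """Serialise rows as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_markdown(rows):
    """Serialise rows as a Markdown pipe table (first row is the header)."""
    header, *body = rows
    lines = ["| " + " | ".join(str(c) for c in header) + " |"]
    lines.append("|" + "---|" * len(header))
    lines += ["| " + " | ".join(str(c) for c in row) + " |" for row in body]
    return "\n".join(lines) + "\n"


def to_json(rows):
    """Serialise rows as a JSON list of records keyed by the header."""
    header, *body = rows
    return json.dumps([dict(zip(header, row)) for row in body])


for name, fn in [("csv", to_csv), ("markdown", to_markdown), ("json", to_json)]:
    print(f"{name:<8} {len(fn(TABLE)):>4} chars")
```

On this sample the CSV form is the shortest and the JSON form the longest, because JSON repeats the header keys in every record. The exact ratio depends on the data.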
CLI
carloforte data.xlsx --fmt markdown
carloforte data.xlsx --sheets Revenue Costs --fmt json
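For readers curious how flags like these can be parsed, the following is a minimal `argparse` sketch that accepts the two invocations above. It is an illustration, not carloforte's actual CLI code: `build_parser` is a hypothetical name, and the `csv` default for `--fmt` is an assumption carried over from the Python API's default.

```python
import argparse


def build_parser():
    """Build a parser that accepts: PATH [--sheets NAME ...] [--fmt FORMAT]."""
    parser = argparse.ArgumentParser(prog="carloforte")
    parser.add_argument("path", help="path to the .xlsx file")
    parser.add_argument(
        "--sheets",
        nargs="+",       # one or more sheet names after the flag
        default=None,    # None is taken to mean "all sheets" (assumption)
        help="sheet names to extract",
    )
    parser.add_argument(
        "--fmt",
        choices=["csv", "markdown", "json"],
        default="csv",   # assumed to mirror the Python API default
        help="output format",
    )
    return parser


args = build_parser().parse_args(
    ["data.xlsx", "--sheets", "Revenue", "Costs", "--fmt", "json"]
)
print(args.path, args.sheets, args.fmt)
```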
How it works
Excel sheets often contain multiple disconnected tables, empty rows, and scattered metadata. carloforte detects each contiguous block of data (an "island") independently and serialises only those blocks, reducing token usage by 60–75% compared to passing raw Excel content to an LLM.
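The island idea can be sketched as a breadth-first search over a grid of cell values. This is a minimal illustration under stated assumptions, not carloforte's implementation: the names `is_empty`, `find_islands`, and `crop` are hypothetical, cells count as connected only when they touch horizontally or vertically, and `None` or blank strings count as empty.

```python
from collections import deque


def is_empty(value):
    """Treat None and blank strings as empty cells (an assumption)."""
    return value is None or (isinstance(value, str) and not value.strip())


def find_islands(grid):
    """Return bounding boxes (top, left, bottom, right) of connected non-empty regions."""
    seen = set()
    boxes = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if is_empty(value) or (r, c) in seen:
                continue
            # Breadth-first search starting from this unvisited non-empty cell.
            queue = deque([(r, c)])
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while queue:
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                left, right = min(left, cc), max(right, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (
                        0 <= nr < len(grid)
                        and 0 <= nc < len(grid[nr])
                        and (nr, nc) not in seen
                        and not is_empty(grid[nr][nc])
                    ):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(grid, box):
    """Cut out the rectangular sub-grid covered by one bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in grid[top:bottom + 1]]


# Illustrative sheet: a 2x2 table plus a stray note separated by empty cells.
SHEET = [
    ["Name", "Qty", None, None, None],
    ["Apple", 3, None, None, None],
    [None, None, None, None, None],
    [None, None, None, "Note", "draft"],
]

for box in find_islands(SHEET):
    print(box, crop(SHEET, box))
```

Only the cropped islands would then be serialised, so the empty row and the empty columns between the two blocks never reach the output.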
Architecture
flowchart LR
A["📄 .xlsx file"] --> B["_reader\nload sheets"]
B --> C["dict[sheet → grid]"]
C --> D["_islands\nBFS detection"]
D --> E["dict[sheet → islands]"]
E --> F{"fmt?"}
F -->|csv| G["CSV"]
F -->|markdown| H["Markdown"]
F -->|json| I["JSON"]
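As an illustration of the first stage in the diagram, the sketch below loads each sheet into a rectangular grid keyed by sheet name. It assumes the third-party `openpyxl` library, which this README does not confirm as carloforte's reader. `rows_to_grid` and `load_grids` are hypothetical names, not the contents of `_reader`.

```python
def rows_to_grid(rows):
    """Pad rows with None so every row has the same width."""
    rows = [list(r) for r in rows]
    width = max((len(r) for r in rows), default=0)
    return [r + [None] * (width - len(r)) for r in rows]


def load_grids(path, sheets=None):
    """Load the requested sheets (all by default) as {sheet name: grid}."""
    # Imported lazily: openpyxl is an assumed dependency for this sketch.
    from openpyxl import load_workbook

    # data_only=True returns cached formula results instead of formula text.
    wb = load_workbook(path, read_only=True, data_only=True)
    try:
        names = sheets if sheets is not None else wb.sheetnames
        return {
            name: rows_to_grid(wb[name].iter_rows(values_only=True))
            for name in names
        }
    finally:
        wb.close()
```

A call such as `load_grids("data.xlsx", sheets=["Revenue"])` would return a mapping in the `dict[sheet → grid]` shape shown in the diagram, ready for island detection.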
License
MIT