llama-index readers structured_data integration
LlamaIndex Readers Integration: Structured-Data
The 'StructuredDataReader' class reads files in JSON, JSONL, CSV, and XLSX formats. Its 'col_index' parameter selects the columns written into the document's main text, and 'col_metadata' selects the columns stored as additional metadata. As the usage examples show, columns can be referenced by name or by integer position, either as a single value or as a list.
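To make the text/metadata split concrete, here is a minimal, stdlib-only sketch of the idea. It is not the package's implementation: the helpers 'resolve_column' and 'split_row' are hypothetical, and the "name: value" text layout is an assumption chosen for illustration.

```python
from typing import Any, Dict, Sequence, Tuple, Union

ColumnRef = Union[int, str]


def resolve_column(ref: ColumnRef, header: Sequence[str]) -> str:
    """Turn a column name or a (possibly negative) position into a column name."""
    if isinstance(ref, int):
        return header[ref]  # negative positions count from the end
    if ref not in header:
        raise KeyError(f"unknown column: {ref!r}")
    return ref


def split_row(
    row: Dict[str, Any],
    col_index: Sequence[ColumnRef],
    col_metadata: Sequence[ColumnRef],
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: build (text, metadata) for a single record."""
    header = list(row.keys())
    text_cols = [resolve_column(c, header) for c in col_index]
    meta_cols = [resolve_column(c, header) for c in col_metadata]
    # Assumed layout: one "name: value" line per text column.
    text = "\n".join(f"{name}: {row[name]}" for name in text_cols)
    metadata = {name: row[name] for name in meta_cols}
    return text, metadata


row = {"id": 7, "title": "Intro", "body": "Hello world"}
text, metadata = split_row(row, col_index=[1, -1], col_metadata=["id"])
print(text)
print(metadata)
```

In this sketch the positions 1 and -1 resolve to the 'title' and 'body' columns, while 'id' is kept out of the text and stored only as metadata.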
Install package
pip install llama-index-readers-structured-data
Or install locally:
pip install -e llama-index-integrations/readers/llama-index-readers-structured-data
Usage
- For a single document:
from pathlib import Path
from llama_index.readers.structured_data.base import StructuredDataReader
parser = StructuredDataReader(col_index=["col1", "col2"], col_metadata=0)
documents = parser.load_data(Path("your/file/path.json"))
- For a directory of documents:
from pathlib import Path
from llama_index.core import SimpleDirectoryReader
from llama_index.readers.structured_data.base import StructuredDataReader
parser = StructuredDataReader(col_index=[1, -1], col_metadata="col3")
file_extractor = {
".xlsx": parser,
".csv": parser,
".json": parser,
".jsonl": parser,
}
documents = SimpleDirectoryReader(
"your/dir/path", file_extractor=file_extractor
).load_data()
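The 'file_extractor' mapping ties each file extension to the reader that should handle it. The sketch below shows that dispatch idea using only the standard library. 'load_directory' and 'fake_parser' are hypothetical names, and unlike 'SimpleDirectoryReader' this simplified version just skips files whose extension is not in the mapping.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def load_directory(
    root: Path, file_extractor: Dict[str, Callable[[Path], str]]
) -> List[str]:
    """Hypothetical helper: route each file to a loader chosen by its suffix."""
    results = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        loader = file_extractor.get(path.suffix.lower())
        if loader is None:
            continue  # simplification: unmapped extensions are skipped
        results.append(loader(path))
    return results


def fake_parser(path: Path) -> str:
    """Stand-in for a real reader; it only reports which file it received."""
    return f"parsed {path.name}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.csv").write_text("x,y\n1,2\n")
    (root / "b.jsonl").write_text('{"x": 1}\n')
    (root / "notes.txt").write_text("not mapped")
    loaded = load_directory(root, {".csv": fake_parser, ".jsonl": fake_parser})

print(loaded)
```

Reusing one parser instance for several extensions, as in the usage example above, means the same column settings apply to every supported format.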
Hashes for llama_index_readers_structured_data-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | f5937459e09f400bb812f70b04a497d75a1b76a9b9c5a9fae8ac0b3006f5ff5e
MD5 | be7320bb00eb67553fabf3e005d7009f
BLAKE2b-256 | 863fa04ec9194d9ebf16d7f6f788a4dc8477672635ddf71e11b21026c4b88163
Hashes for llama_index_readers_structured_data-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | dfb4f449ac3e7afee4039ff68bbc9d20d28f3a9f4c910e2bc3ebbcc7012e52c0
MD5 | b017a3301bf8266c01ec5d18eb515f5b
BLAKE2b-256 | b32ebdde8fa52b815131aa6854811253b6ec8991826ec6fdc58872b25b15c435