llama-index readers HuggingFace Datasets integration

LlamaIndex Readers Integration: HuggingFace Datasets

Overview

HuggingFace Datasets Reader is a tool designed to load HuggingFace datasets as documents.

Installation

You can install HuggingFace Datasets Reader via pip:

pip install llama-index-readers-datasets

Usage

from llama_index.readers.datasets import DatasetsReader
from datasets import load_dataset

reader = DatasetsReader()

# Load train split (default) as metadata
docs = reader.load_data("lhoestq/demo1")

# Load test split as metadata
docs = reader.load_data("lhoestq/demo1", split="test")

# Specify the dictionary key to use as the text value
docs = reader.load_data("lhoestq/demo1", text_key="review")

# Pass additional arguments to datasets.load_dataset
docs = reader.load_data("lhoestq/demo1", cache_dir="/tmp/huggingface")

# Load from a preloaded dataset (all other arguments are ignored)
dataset = load_dataset("lhoestq/demo1", split="train")
docs = reader.load_data(dataset=dataset)

# Lazy loading (stream samples)
for it in reader.lazy_load_data(
    "lhoestq/demo1", split="test", text_key="review", doc_id_key="id"
):
    print(it)
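
The calls above can be pictured with a small, library-free sketch. It assumes the reader takes each document's text from `text_key`, its ID from `doc_id_key`, and keeps the remaining fields as metadata. That behavior is inferred from the comments in the usage example, not from the package source. `SimpleDoc` and `rows_to_docs` are hypothetical names used only for illustration; they are not part of `llama_index` or `datasets`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, Optional


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a LlamaIndex Document."""

    text: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)
    doc_id: Optional[str] = None


def rows_to_docs(
    rows: Iterable[Dict[str, Any]],
    text_key: Optional[str] = None,
    doc_id_key: Optional[str] = None,
) -> Iterator[SimpleDoc]:
    """Lazily turn dataset rows (dicts) into documents, one row at a time."""
    for row in rows:
        metadata = dict(row)  # copy so the source row is left untouched
        text = ""
        doc_id = None
        # Assumption: the text field is moved out of the metadata.
        if text_key is not None and text_key in metadata:
            text = str(metadata.pop(text_key))
        # Assumption: the ID field is moved out of the metadata as well.
        if doc_id_key is not None and doc_id_key in metadata:
            doc_id = str(metadata.pop(doc_id_key))
        yield SimpleDoc(text=text, metadata=metadata, doc_id=doc_id)


# Made-up rows standing in for a dataset split.
rows = [
    {"id": 1, "review": "Great app", "star": 5},
    {"id": 2, "review": "Crashes often", "star": 1},
]

# Eager: materialize every document at once (analogous to load_data).
docs = list(rows_to_docs(rows, text_key="review", doc_id_key="id"))

# Lazy: pull one document at a time (analogous to lazy_load_data).
first_without_text_key = next(rows_to_docs(rows))
```

Because `rows_to_docs` is a generator, no row is processed until it is requested, which is the usual reason to prefer a lazy loader for large datasets.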
