llama-index readers HuggingFace Datasets integration

LlamaIndex Readers Integration: HuggingFace Datasets

Overview

HuggingFace Datasets Reader loads HuggingFace datasets as LlamaIndex documents.

Installation

You can install HuggingFace Datasets Reader via pip:

pip install llama-index-readers-datasets

Usage

from llama_index.readers.datasets import DatasetsReader
from datasets import load_dataset

reader = DatasetsReader()

# Load the train split (the default) as metadata
docs = reader.load_data("lhoestq/demo1")

# Load the test split as metadata
docs = reader.load_data("lhoestq/demo1", split="test")

# Specify the dictionary key to use as the text value
docs = reader.load_data("lhoestq/demo1", text_key="review")

# Pass additional arguments to datasets.load_dataset
docs = reader.load_data("lhoestq/demo1", cache_dir="/tmp/huggingface")

# Load from a preloaded dataset (all other arguments are ignored)
dataset = load_dataset("lhoestq/demo1", split="train")
docs = reader.load_data(dataset=dataset)

# Lazy loading (stream samples)
for it in reader.lazy_load_data(
    "lhoestq/demo1", split="test", text_key="review", doc_id_key="id"
):
    print(it)
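
To make the roles of text_key and doc_id_key concrete, here is a minimal sketch of how a row-to-document mapping could work. It is an illustration only and not the reader's actual implementation: SketchDocument and rows_to_documents are hypothetical names, the sample rows are made up, and removing the text and ID fields from the metadata is an assumption about behavior that this page does not specify.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a LlamaIndex Document."""

    text: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)
    doc_id: Optional[str] = None


def rows_to_documents(
    rows: Iterable[Dict[str, Any]],
    text_key: Optional[str] = None,
    doc_id_key: Optional[str] = None,
) -> Iterator[SketchDocument]:
    """Lazily yield one document per dataset row.

    Assumption: the fields named by text_key and doc_id_key are moved out
    of the metadata; every remaining field stays in the metadata.
    """
    for row in rows:
        metadata = dict(row)  # copy so the caller's row is not mutated
        text = str(metadata.pop(text_key)) if text_key else ""
        doc_id = str(metadata.pop(doc_id_key)) if doc_id_key else None
        yield SketchDocument(text=text, metadata=metadata, doc_id=doc_id)


# Made-up rows standing in for a dataset split.
rows = [
    {"id": 1, "review": "Great product", "star": 5},
    {"id": 2, "review": "Stopped working", "star": 1},
]

docs = list(rows_to_documents(rows, text_key="review", doc_id_key="id"))
for doc in docs:
    print(doc.doc_id, doc.text, doc.metadata)
```

Because rows_to_documents is a generator, rows are converted one at a time as the caller iterates, which mirrors the streaming style of lazy_load_data; wrapping it in list() gives the eager behavior of load_data.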

Download files

Source Distribution

llama_index_readers_datasets-0.2.0.tar.gz (4.0 kB)

Uploaded Source

Built Distribution

llama_index_readers_datasets-0.2.0-py3-none-any.whl (3.7 kB)

Uploaded Python 3

File details

Details for the file llama_index_readers_datasets-0.2.0.tar.gz.

File metadata

  • Download URL: llama_index_readers_datasets-0.2.0.tar.gz
  • Upload date:
  • Size: 4.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9

File hashes

Hashes for llama_index_readers_datasets-0.2.0.tar.gz
Algorithm     Hash digest
SHA256        0a178acc9715b9bd78cba0578e8682af86e1cb191a537d423f15406ba29c26e5
MD5           2eca3b49030fbea315741a347054732e
BLAKE2b-256   0229a2ab200ab240424a157cc4d3c3200f7ef50719b52e2857fb2d0e272371bb

File details

Details for the file llama_index_readers_datasets-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: llama_index_readers_datasets-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 3.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9

File hashes

Hashes for llama_index_readers_datasets-0.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        ffbf98510e634e5685bcbc5f004720e4afcd5342225a4b0534f25bf2ff2398ef
MD5           a9f51266526dbb0f6d0bb64d2eb483cb
BLAKE2b-256   91669cf8e3d2b5887292e5999e1b685e0b88a6761cfb0f97ad2055234ca4e8b6
