
Pipeline to transform text chunks into embeddings and load to Qdrant


embedding-flow

Library for transforming text chunks into 768-dimensional embeddings and loading them into Qdrant.

Installation

# Install the CPU-only build of torch first (avoids downloading the CUDA packages)
pip install torch --index-url https://download.pytorch.org/whl/cpu

# Then install embedding-flow
pip install embedding-flow

Usage

from embedding_flow import embedding_flow

# Takes the path to the parquet file of chunks and loads the embeddings into Qdrant
embedding_flow("/path/to/chunks.parquet")

Environment variables

QDRANT_URL=http://localhost:6333
QDRANT_COLLECTION=embeddings_collection
VECTOR_SIZE=768
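
The three variables above can be read into a small settings object. The following is a minimal sketch, not the library's own code: `QdrantSettings` and `load_settings` are hypothetical names, and treating the listed values as fallback defaults is an assumption, since the README does not say what happens when a variable is unset.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class QdrantSettings:
    """Hypothetical container for the three documented variables."""
    url: str
    collection: str
    vector_size: int


def load_settings(env=None):
    """Read the settings from `env` (defaults to os.environ).

    Assumption: the values shown in the README act as defaults
    when a variable is missing.
    """
    env = os.environ if env is None else env
    return QdrantSettings(
        url=env.get("QDRANT_URL", "http://localhost:6333"),
        collection=env.get("QDRANT_COLLECTION", "embeddings_collection"),
        # Environment values are strings, so the size is converted explicitly.
        vector_size=int(env.get("VECTOR_SIZE", "768")),
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise without touching the real environment.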

Flow

  1. Reads chunks from a parquet file
  2. Generates embeddings (768 dimensions) with all-mpnet-base-v2
  3. Loads the embeddings into Qdrant (local Docker instance)
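
The three steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual implementation: `run_flow` and `batched` are hypothetical names, the `text` column name, the batch size, the cosine distance, and the random UUID point ids are all guesses, and the code relies on `pandas`, `sentence-transformers`, and `qdrant-client` being installed.

```python
import os
import uuid


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_flow(parquet_path, text_column="text", batch_size=64):
    """Hypothetical equivalent of the three documented steps."""
    # Heavy dependencies are imported lazily, only when the flow runs.
    import pandas as pd
    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams
    from sentence_transformers import SentenceTransformer

    url = os.environ.get("QDRANT_URL", "http://localhost:6333")
    collection = os.environ.get("QDRANT_COLLECTION", "embeddings_collection")
    vector_size = int(os.environ.get("VECTOR_SIZE", "768"))

    # 1. Read chunks from parquet (the column name is an assumption).
    texts = pd.read_parquet(parquet_path)[text_column].astype(str).tolist()

    # 2. Load the model; all-mpnet-base-v2 produces 768-dimensional vectors.
    model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

    # 3. Create the collection if it is missing, then upsert in batches.
    client = QdrantClient(url=url)
    if not client.collection_exists(collection):
        client.create_collection(
            collection_name=collection,
            # Cosine distance is an assumption; the README does not name a metric.
            vectors_config=VectorParams(size=vector_size, distance=Distance.COSINE),
        )

    total = 0
    for batch in batched(texts, batch_size):
        vectors = model.encode(batch)  # one row per input text
        points = [
            PointStruct(id=str(uuid.uuid4()), vector=vec.tolist(), payload={"text": txt})
            for vec, txt in zip(vectors, batch)
        ]
        client.upsert(collection_name=collection, points=points)
        total += len(points)
    return total
```

Encoding and upserting in batches keeps memory use bounded for large parquet files. Whether the library itself batches is not stated in the README.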

License

MIT
