A local UI package for pooling existing embedding zips into grouped vectors

poolin

A local UI package for pooling existing embedding vectors from an embedding zip into grouped higher-level vectors.

Important note

Sentence-transformer-style pooling normally happens inside the embedding model, where token embeddings are combined into a single sentence embedding. This package does something different: it pools vectors after embedding, combining chunk embeddings that already exist into one vector per group.
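To make the distinction concrete, here is a minimal sketch of post-embedding pooling over chunk vectors that already exist. The function names mean_pool and max_pool and the sample vectors are illustrative assumptions, not poolin's actual API.

```python
from typing import List, Sequence

Vector = List[float]


def _check(vectors: Sequence[Sequence[float]]) -> int:
    """Validate the input and return the shared vector dimension."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must share one dimension")
    return dim


def mean_pool(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise average of the chunk vectors."""
    dim = _check(vectors)
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def max_pool(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise maximum of the chunk vectors."""
    dim = _check(vectors)
    return [max(v[i] for v in vectors) for i in range(dim)]


# Hypothetical embeddings for two chunks of the same source document.
chunk_vectors = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]]
print(mean_pool(chunk_vectors))  # [2.0, 2.0, 1.0]
print(max_pool(chunk_vectors))   # [3.0, 4.0, 2.0]
```

No embedding model is involved at this stage: the inputs are finished vectors, and the output is one vector of the same dimension.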

What it does

  • launches with the poolin command
  • reads an embedding zip such as RAG_chunks_recursive_chunks_embeddings.zip
  • auto-groups related chunk embeddings by filename pattern, for example RAG_chunk_001_rcs_001.md -> RAG_chunk_001
  • pools vectors with one of these methods:
    • auto
    • mean
    • max
    • weighted_char_mean
    • weighted_word_mean
    • mean_sqrt_len
  • exports a zip with:
    • pooling_summary.json
    • pooling_manifest.csv
    • *_pooled_embeddings.jsonl (optional)
    • *_pooled_embeddings.csv (optional)
    • *_pooled_embeddings.npz (optional)
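The grouping and weighted pooling steps listed above can be sketched as follows. This is a rough illustration under stated assumptions, not the package's implementation: the suffix pattern is inferred from the single filename example, the weighted methods are read as length-weighted averages (character count or word count of each chunk's text), and all function names are hypothetical. The auto and mean_sqrt_len methods are left out because the description does not define them.

```python
import re
from collections import defaultdict
from typing import Dict, List, Sequence

# Assumed naming convention, inferred from the one example above:
# "<group>_rcs_<digits>.<extension>".
_SUFFIX = re.compile(r"_rcs_\d+(\.[A-Za-z0-9]+)?$")


def group_key(filename: str) -> str:
    """Strip the sub-chunk suffix; names that do not match stay unchanged."""
    return _SUFFIX.sub("", filename)


def group_names(filenames: Sequence[str]) -> Dict[str, List[str]]:
    """Collect filenames under their shared group key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        groups[group_key(name)].append(name)
    return dict(groups)


def weighted_mean(vectors: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Average the vectors, each scaled by its weight."""
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need one weight per vector")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]


# Hypothetical chunk texts and embeddings for one group.
texts = {"RAG_chunk_001_rcs_001.md": "alpha beta gamma",
         "RAG_chunk_001_rcs_002.md": "delta"}
vectors = {"RAG_chunk_001_rcs_001.md": [1.0, 0.0],
           "RAG_chunk_001_rcs_002.md": [0.0, 1.0]}

for key, names in group_names(list(texts)).items():
    vecs = [vectors[n] for n in names]
    by_chars = weighted_mean(vecs, [len(texts[n]) for n in names])
    by_words = weighted_mean(vecs, [len(texts[n].split()) for n in names])
    print(key, by_chars, by_words)
```

With the word-count weights 3 and 1, the pooled vector for RAG_chunk_001 is [0.75, 0.25], so the longer chunk dominates the result.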

Install

pip install poolin

Run

poolin

Suggested input

Use a zip produced by your embedding step, containing an embeddings .npz or .jsonl payload plus the summary file.
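For orientation, here is a minimal sketch of reading a .jsonl payload out of such a zip with the standard library only. The member names and the "id" and "embedding" field names are assumptions made for illustration; the layout of a real embedding zip may differ.

```python
import io
import json
import zipfile
from typing import Dict, List


def read_jsonl_embeddings(zip_bytes: bytes) -> Dict[str, List[float]]:
    """Return {record id: vector} from the first .jsonl member of a zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        members = [n for n in archive.namelist() if n.endswith(".jsonl")]
        if not members:
            raise ValueError("no .jsonl payload found in the zip")
        with archive.open(members[0]) as handle:
            text = handle.read().decode("utf-8")
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    # "id" and "embedding" are hypothetical field names.
    return {rec["id"]: rec["embedding"] for rec in records}


def make_demo_zip() -> bytes:
    """Build a tiny in-memory zip so the sketch runs without real data."""
    rows = [
        {"id": "RAG_chunk_001_rcs_001.md", "embedding": [0.1, 0.2]},
        {"id": "RAG_chunk_001_rcs_002.md", "embedding": [0.3, 0.4]},
    ]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("embeddings.jsonl",
                         "\n".join(json.dumps(r) for r in rows))
        archive.writestr("summary.json", json.dumps({"count": len(rows)}))
    return buffer.getvalue()


embeddings = read_jsonl_embeddings(make_demo_zip())
print(sorted(embeddings))
```

An .npz payload would need NumPy to load instead; the jsonl route is shown here because it has no third-party dependency.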

Ownership note

The package metadata and copyright notice are set to Wenxi Wang. You should still verify PyPI package-name availability, trademark questions, and any legal or patent issues yourself before publishing.
