docuverse

State-of-the-art Retrieval/Search engine models, including ElasticSearch, ChromaDB, Milvus, and PrimeQA

Project description

Repository for (almost) all your document search needs.

Part of the Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development.

DocUVerse is a public open source repository that enables researchers and developers to quickly experiment with various search engines (such as ElasticSearch, ChromaDB, Milvus, PrimeQA, and FAISS) in both direct search and reranking scenarios. With DocUVerse, a researcher can replicate the experiments outlined in a paper published at the latest NLP conference, download pre-trained models from an online repository, and run them on custom data. DocUVerse is built on top of the Transformers, PrimeQA, and Elasticsearch toolkits and uses datasets and models that are directly downloadable.
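
To make the two scenarios concrete, here is a minimal sketch in plain Python that does not use DocUVerse itself. The function names (`direct_search`, `rerank`), the token-overlap scorers, and the toy corpus are all illustrative assumptions; a real engine would typically use BM25 or dense embeddings for the first stage and a stronger model for the second.

```python
# Illustrative only: this is not DocUVerse's API. All names are hypothetical.

def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately crude stand-in)."""
    return text.lower().split()

def overlap_score(query, passage):
    """First-stage score: number of distinct query tokens found in the passage."""
    return len(set(tokenize(query)) & set(tokenize(passage)))

def direct_search(query, corpus, top_k=3):
    """Direct search: rank every passage with the cheap scorer and keep the top_k ids."""
    ranked = sorted(corpus, key=lambda pid: overlap_score(query, corpus[pid]), reverse=True)
    return ranked[:top_k]

def density_score(query, passage):
    """Second-stage score: overlap divided by passage length (favors focused passages)."""
    tokens = tokenize(passage)
    return overlap_score(query, passage) / len(tokens) if tokens else 0.0

def rerank(query, candidates, corpus):
    """Reranking: reorder only the first-stage candidates using a different scorer."""
    return sorted(candidates, key=lambda pid: density_score(query, corpus[pid]), reverse=True)

corpus = {
    "p1": "milvus is a vector database used for dense retrieval over large document collections",
    "p2": "dense retrieval with milvus",
    "p3": "elasticsearch supports keyword search",
}
query = "dense retrieval milvus"

candidates = direct_search(query, corpus, top_k=2)  # first stage over the whole corpus
reranked = rerank(query, candidates, corpus)        # second stage over the candidates only
print(candidates, reranked)
```

The first stage scores the whole corpus cheaply, while the second stage applies a different scorer to the short candidate list only, which is why reranking can afford a more expensive model.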

Design

The following code snippet shows how to ingest a new corpus (create an index for a specific engine), read the query file, run the search, compute the evaluation scores, and print them:

from docuverse import SearchEngine
engine = SearchEngine(config_or_path="data/clapnq_small/milvus-test.yaml")

# Read the ClapNQ dataset
data = engine.read_data()  # or engine.read_data(engine.config.input_passages)
# Ingest the data (create the index)
engine.ingest(data)

# Read the queries
queries = engine.read_questions()  # or engine.read_questions(engine.config.input_queries)
# Run the retrieval
results = engine.search(queries)
# Evaluate the results
scores = engine.compute_score(queries, results)

# Print the evaluation results in a human-readable format.
print(f"Results:\n{scores}")
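
The snippet above does not say which metrics `compute_score` reports. As a rough illustration of what a retrieval evaluation typically measures, the following sketch computes match@k (the fraction of queries with at least one relevant passage in the top k results) in plain Python. The function name `match_at_k`, the dictionary layout of the inputs, and the toy ids are assumptions made for this example and are not taken from DocUVerse.

```python
# Illustrative only: not DocUVerse's scoring code. Names and data layout are hypothetical.

def match_at_k(results, gold, k):
    """Fraction of queries with at least one gold passage id among the top-k results.

    results: dict mapping query id -> ranked list of passage ids (best first)
    gold:    dict mapping query id -> set of relevant passage ids
    """
    if not gold:
        return 0.0
    hits = sum(
        1
        for qid, relevant in gold.items()
        if relevant & set(results.get(qid, [])[:k])
    )
    return hits / len(gold)

# Toy search output and relevance judgments for two queries.
results = {"q1": ["p3", "p1", "p7"], "q2": ["p2", "p9", "p4"]}
gold = {"q1": {"p1"}, "q2": {"p5"}}

scores = {f"match@{k}": match_at_k(results, gold, k) for k in (1, 3)}
print(f"Results:\n{scores}")
```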

✔️ Getting Started

Installation

Installation doc

# cd to project root

# If you want to run on GPU make sure to install torch appropriately

# Install as editable (-e) or non-editable using pip, with extras (e.g. tests) as desired
# Example installation commands:

# Minimal install (non-editable)
pip install .

# Full install (editable)
pip install -e .

# Install milvus and/or elastic dependencies, and the pyizumo library (if you have access to it)
pip install -r requirements-milvus.txt
pip install -r requirements-elastic.txt
pip install -r requirements_extra.txt

Please note that dependencies (specified in setup.py) are pinned to provide a stable experience. When installing from source, these pins can be modified; however, this is not officially supported.
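
Because the dependencies are pinned, it can help to check which versions actually ended up in your environment after installation. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the package names other than `docuverse` are examples rather than a statement of what setup.py pins.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# "docuverse" is the distribution name; "torch" and "transformers" are example names only.
for name in ("docuverse", "torch", "transformers"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```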

🔭 Learn more (not yet working)

Section	Description
📒 Documentation	Start API documentation and tutorials
📓 Tutorials: Jupyter Notebooks	Notebooks to get started on QA tasks
🤗 Model sharing and uploading	Upload and share your fine-tuned models with the community
✅ Pull Request	PrimeQA Pull Request
📄 Generate Documentation	How Documentation works

❤️ DocUVerse collaborators include: Sara Rosenthal, Parul Awasthy, Scott McCarley, Jatin Ganhotra, and Radu Florian.

Project details

Release history

0.0.13 (this version): Dec 10, 2025
0.0.8: Nov 22, 2024
0.0.1: Jun 19, 2024

Download files

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distribution

docuverse-0.0.13-py3-none-any.whl (213.2 kB)

Uploaded Dec 10, 2025 Python 3

File details

Details for the file docuverse-0.0.13-py3-none-any.whl.

File metadata

Download URL: docuverse-0.0.13-py3-none-any.whl
Upload date: Dec 10, 2025
Size: 213.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for docuverse-0.0.13-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fddab8eed26afff1ccdbae05a47c9f0681cdca7a8a351e47d73478df6ea115c9`
MD5	`3047b712dcf1b0e73c58cf92f8c06b98`
BLAKE2b-256	`581047a0bd5bfeb3d89516702bf7a180ee6449e0e0434007b9b6f5877a7be37e`
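
The digests above can be used to confirm that a downloaded wheel is intact. The following is a minimal sketch using the standard library's `hashlib`; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local file path in the usage comment assumes the wheel was saved to the current directory.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()

# SHA256 digest listed above for docuverse-0.0.13-py3-none-any.whl.
EXPECTED_SHA256 = "fddab8eed26afff1ccdbae05a47c9f0681cdca7a8a351e47d73478df6ea115c9"

# Usage (assumes the wheel is in the current directory):
# print(matches_published_hash("docuverse-0.0.13-py3-none-any.whl", EXPECTED_SHA256))
```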

