Document ingestion and processing package with vector store and graph capabilities
Cognify
A Python package for document ingestion and processing with vector store capabilities.
Installation
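pip install zencognify
The distribution files are published as zencognify, while the import name used in the examples is cognify.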
Features
- Document processing from URLs (PDF, DOC, etc.)
- Vector storage with Qdrant
- Bulk document upload
- Document chunking and metadata management
- OpenAI embeddings integration
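The chunking and metadata features listed above can be pictured with a small sketch. This is not Cognify's implementation: `Chunk`, `chunk_text`, the default sizes, and the metadata keys are all assumptions made for illustration, and the package may well split on tokens or document structure instead of characters.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One slice of a document plus the metadata needed to trace it back."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_text(text, source_url, chunk_size=500, overlap=50):
    """Split text into fixed-size character windows that overlap by `overlap` characters.

    Hypothetical helper: names, defaults, and metadata keys are illustrative only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + chunk_size]
        chunks.append(
            Chunk(
                text=piece,
                metadata={"source": source_url, "chunk_index": index, "start_char": start},
            )
        )
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help retrieval; the `source` and `chunk_index` keys are the kind of metadata that would let all chunks of one document be found again later.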
Usage
Basic Setup
from cognify import VectorStore
# Initialize vector store
vector_store = VectorStore(qdrant_url="http://localhost:6333")
Create Collection
# Create a new collection
vector_store.create_collection("my_documents")
Bulk Upload Documents
# Upload multiple documents from URLs
urls = [
    "https://example.com/document1.pdf",
    "https://example.com/document2.pdf",
]
results = vector_store.bulk_url_upload("my_documents", urls)
print(f"Uploaded {results['successful_uploads']} documents")
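Before a bulk upload it can be useful to screen the URL list so that obviously unusable entries never reach the vector store. The sketch below is a hypothetical pre-check built only on the standard library; `split_valid_urls` and the extension list are assumptions, not part of Cognify, and the formats the package actually accepts may differ.

```python
from urllib.parse import urlparse

# Assumed extensions for illustration; the package's real list of supported formats may differ.
ASSUMED_EXTENSIONS = (".pdf", ".doc", ".docx")


def split_valid_urls(urls, extensions=ASSUMED_EXTENSIONS):
    """Return (valid, rejected).

    A URL counts as valid when it uses http or https, has a host,
    and its path ends with one of the given extensions.
    """
    suffixes = tuple(ext.lower() for ext in extensions)
    valid, rejected = [], []
    for url in urls:
        parsed = urlparse(url)
        ok = (
            parsed.scheme in ("http", "https")
            and bool(parsed.netloc)
            and parsed.path.lower().endswith(suffixes)
        )
        (valid if ok else rejected).append(url)
    return valid, rejected
```

The `valid` list can then be passed to `vector_store.bulk_url_upload("my_documents", valid)`, and `rejected` logged for review.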
Retrieve Document Chunks
# Get all chunks for a specific document
chunks = vector_store.get_document_chunks("my_documents", "document_id")
Delete Documents
# Delete specific document chunks
chunk_ids = ["chunk_id1", "chunk_id2"]
vector_store.delete_documents("my_documents", chunk_ids)
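When the list of chunk IDs is long, sending it in smaller batches keeps each request modest in size. The helper below is a hypothetical sketch: `delete_in_batches` and the default batch size are assumptions, and it relies only on the `delete_documents(collection, chunk_ids)` call shown above.

```python
def delete_in_batches(store, collection, chunk_ids, batch_size=100):
    """Call store.delete_documents(collection, ids) once per batch.

    Hypothetical helper: `store` is any object exposing delete_documents
    with the two-argument form used in the example above.
    Returns the total number of IDs sent for deletion.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ids = list(chunk_ids)
    for start in range(0, len(ids), batch_size):
        store.delete_documents(collection, ids[start:start + batch_size])
    return len(ids)
```

For example, `delete_in_batches(vector_store, "my_documents", chunk_ids, batch_size=50)` would issue one delete call per 50 IDs.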
Requirements
- Python 3.12+
- Qdrant vector database
- OpenAI API key (for embeddings)
Dependencies
- langchain
- langchain-openai
- langchain-qdrant
- qdrant-client
- docling
- python-dotenv
- openai
- tiktoken
Environment Variables
Set your OpenAI API key:
export OPENAI_API_KEY="your-api-key-here"
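A missing key usually surfaces only when the first embedding request fails, so checking for it up front can give a clearer error. This is a minimal sketch using the standard library; `require_env` is a hypothetical helper, not something the package provides.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the vector store")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before constructing `VectorStore` fails fast with a readable message. Since python-dotenv is among the listed dependencies, the key could presumably also be kept in a `.env` file and loaded with `load_dotenv()` first.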
License
MIT License
Download files
Source Distribution
zencognify-0.1.0.tar.gz (4.1 kB)
Built Distribution
zencognify-0.1.0-py3-none-any.whl (4.6 kB)
File details
Details for the file zencognify-0.1.0.tar.gz.
File metadata
- Download URL: zencognify-0.1.0.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce709763e582f497fdbeceeabbb49fc9bd2d3e69e654a36f8aaeca24cd0d07ba |
| MD5 | 88d93e9fbf1db767175dfc0700096b9c |
| BLAKE2b-256 | e9bcfa9339e9f732fe031880d933e58b64c25aec6a160e501341e2602448ee2a |
File details
Details for the file zencognify-0.1.0-py3-none-any.whl.
File metadata
- Download URL: zencognify-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e5e0b76c7ad5165bd39befaa04aa8ae1db13ea8d531c99ece70a3505669baa4a |
| MD5 | 31bc774b9fadd096b58de8bf2e1e7738 |
| BLAKE2b-256 | 017e5174d6539be4b334d7aec8b10b041e7e1358af88c88335e30b18eb1179f6 |
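The published digests can be used to check a manually downloaded file. The following is a minimal sketch with Python's standard `hashlib`; the function names are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib
import hmac


def sha256_of_file(path, block_size=65536):
    """Return the hex SHA256 digest of a file, reading it in blocks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's SHA256 digest with a published hex digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For instance, `matches_published_digest("zencognify-0.1.0.tar.gz", "<SHA256 value from the table>")` should return `True` for an intact download. pip can also enforce hashes itself through `--require-hashes` with a requirements file.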