Command-line tools for managing document stores like Qdrant and Solr
Project description
docstore-manager
A general-purpose command-line tool for managing document store databases, currently supporting Qdrant vector database and Solr search platform. Simplifies common document store management tasks through a unified CLI interface.
Features
- Multi-platform Support:
- Qdrant vector database for similarity search and vector operations
- Solr search platform for text search and faceted navigation
- Collection Management:
- Create, delete, and list collections
- Get detailed information about collections
- Document Operations:
- Add/update documents to collections
- Remove documents from collections
- Retrieve documents by ID
- Search Capabilities:
- Vector similarity search (Qdrant)
- Full-text search (Solr)
- Filtering and faceting
- Batch Operations:
- Add fields to documents
- Delete fields from documents
- Replace fields in documents
- Advanced Features:
- Support for JSON path selectors for precise document modifications
- Multiple configuration profiles support
- Flexible output formatting (JSON, YAML, CSV)
Installation
# From PyPI
pipx install docstore-manager
# From source
git clone https://github.com/allenday/docstore-manager.git
cd docstore-manager
pipx install -e .
Configuration
When first run, docstore-manager will create a configuration file at:
- Linux/macOS:
~/.config/docstore-manager/config.yaml - Windows:
%APPDATA%\docstore-manager\config.yaml
You can edit this file to add your connection details and schema configuration:
default:
# Common settings for all document stores
connection:
type: qdrant # or solr
collection: my-collection
# Qdrant-specific settings
qdrant:
url: localhost
port: 6333
api_key: ""
vectors:
size: 256
distance: cosine
indexing_threshold: 0
payload_indices:
- field: category
type: keyword
- field: created_at
type: datetime
- field: price
type: float
# Solr-specific settings
solr:
url: http://localhost:8983/solr
username: ""
password: ""
schema:
fields:
- name: id
type: string
- name: title
type: text_general
- name: content
type: text_general
- name: category
type: string
- name: created_at
type: pdate
production:
connection:
type: qdrant
collection: production-collection
qdrant:
url: your-production-instance.region.cloud.qdrant.io
port: 6333
api_key: your-production-api-key
vectors:
size: 1536 # For OpenAI embeddings
distance: cosine
indexing_threshold: 1000
payload_indices:
- field: product_id
type: keyword
- field: timestamp
type: datetime
solr:
url: https://your-production-solr.example.com/solr
username: admin
password: your-production-password
Each profile can define its own:
- Connection settings for both Qdrant and Solr
- Vector configuration for Qdrant (size, distance metric, indexing behavior)
- Schema configuration for Solr
- Payload indices for optimized search performance
The YAML format makes it easy to maintain a clean, organized configuration across multiple environments.
You can switch between profiles using the --profile flag:
docstore-manager --profile production list
You can also override any setting with command-line arguments.
Testing
This project uses pytest for testing. Tests are divided into two main categories:
- Unit Tests: These tests verify individual components in isolation and do not require external services. They are fast and should be run frequently during development.
- Integration Tests: These tests verify the interaction between the CLI tool and external services (Qdrant, Solr). They require these services to be running (e.g., via
docker-compose up -d) and are marked with@pytest.mark.integration.
Running Tests:
-
Run only Unit Tests (Default Behavior):
pytest -v(Integration tests are skipped by default)
-
Run only Integration Tests:
# First, ensure Qdrant/Solr containers are running (e.g., docker-compose up -d) RUN_INTEGRATION_TESTS=true pytest -m integration -v
(Requires setting the RUN_INTEGRATION_TESTS environment variable)
-
Run All Tests (Unit + Integration):
# First, ensure Qdrant/Solr containers are running RUN_INTEGRATION_TESTS=true pytest -v
Usage
docstore-manager <document-store> <command> [options]
Document Stores:
qdrant: Commands for Qdrant vector databasesolr: Commands for Solr search platform
Available Commands:
list: List all collectionscreate: Create a new collectiondelete: Delete an existing collectioninfo: Get detailed information about a collectionadd-documents: Add documents to a collectionremove-documents: Remove documents from a collectionget: Retrieve documents by IDsearch: Search documents in a collectionscroll: Scroll through documents in a collection (Qdrant only)count: Count documents in a collection (Qdrant only)config: View available configuration profiles
Connection Options:
--profile PROFILE Configuration profile to use
--url URL Server URL
--port PORT Server port (Qdrant only)
--api-key API_KEY API key (Qdrant only)
--username USER Username (Solr only)
--password PASS Password (Solr only)
--collection NAME Collection name
Examples:
Qdrant Examples:
# List all Qdrant collections
docstore-manager qdrant list
# Create a new Qdrant collection with custom settings
docstore-manager qdrant create --collection my-collection --size 1536 --distance euclid
# Get info about a Qdrant collection
docstore-manager qdrant info --collection my-collection
# Retrieve points by ID from Qdrant
docstore-manager qdrant get --ids "1,2,3" --with-vectors
# Search Qdrant using vector similarity
docstore-manager qdrant search --vector-file query_vector.json --limit 10
# Retrieve points using a filter and save as CSV
docstore-manager qdrant get --filter '{"key":"category","match":{"value":"product"}}' \
--format csv --output results.csv
# Add a field to documents matching a filter
docstore-manager qdrant batch --filter '{"key":"category","match":{"value":"product"}}' \
--add --doc '{"processed": true}'
# Delete a field from specific documents
docstore-manager qdrant batch --ids "doc1,doc2,doc3" --delete --selector "metadata.temp_data"
# Replace fields in documents from an ID file
docstore-manager qdrant batch --id-file my_ids.txt --replace --selector "metadata.source" \
--doc '{"provider": "new-provider", "date": "2025-03-31"}'
Solr Examples:
# List all Solr collections
docstore-manager solr list
# Create a new Solr collection
docstore-manager solr create --collection my-collection
# Get info about a Solr collection
docstore-manager solr info --collection my-collection
# Add documents to Solr from a file
docstore-manager solr add-documents --collection my-collection --file documents.json
# Search documents in Solr
docstore-manager solr search --collection my-collection --query "title:example" --fields "id,title,score"
# Get documents by ID from Solr
docstore-manager solr get --collection my-collection --ids "doc1,doc2,doc3"
# Remove documents from Solr by query
docstore-manager solr remove-documents --collection my-collection --query "category:obsolete"
Switching Between Profiles:
# Use the production profile with Qdrant
docstore-manager --profile production qdrant list
# Use the production profile with Solr
docstore-manager --profile production solr list
Changelog
v0.1.0 (2025-05-03)
- Initial release of docstore-manager
- Support for both Qdrant and Solr document stores
- Comprehensive usage examples for all operations
- Improved error handling and logging
- Standardized interfaces across document store implementations
- Configuration profiles for different environments
- Command-line interface for managing collections and documents
- Detailed documentation and API reference
- Renamed from "Qdrant Manager" to "docstore-manager"
- Consolidated CLI entry points to a single
docstore-managercommand - Improved test coverage and reliability
- Enhanced formatting options for command outputs
- Fixed collection info formatting issues
- Fixed CLI testing context handling
- Fixed parameter validation in get_documents function
- Fixed CollectionConfig validation
License
Apache-2.0
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file docstore_manager-0.1.0.tar.gz.
File metadata
- Download URL: docstore_manager-0.1.0.tar.gz
- Upload date:
- Size: 162.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.17
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0f129bcdc0bb0fd980974e4d4373491b6729ae4d9682e84fc5e50d96dd10858e
|
|
| MD5 |
8e4f879b891d1ccf338b2a9a3ec8014c
|
|
| BLAKE2b-256 |
e5a60699979bcfb83721bfd15288258f56443156fdc21d1b9e98e4dca6875209
|
File details
Details for the file docstore_manager-0.1.0-py3-none-any.whl.
File metadata
- Download URL: docstore_manager-0.1.0-py3-none-any.whl
- Upload date:
- Size: 123.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.17
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ed4b17694e18a8b9365800fad28fc34ecc1b86cb34ccafd3eb8d8eff77a86365
|
|
| MD5 |
2dd5e41e5cd5593e99a6bae5bf068e1b
|
|
| BLAKE2b-256 |
2800791db74c5c9170212564a8d26ed1b5a815f1718a8c79d2da08844a4018e1
|