h2oGPTe Migration Tool
Automated tool for migrating h2oGPTe collections with tracking, verification, and resume capabilities.
Overview
This tool helps administrators and collection owners migrate collections to new embedding models while:
- Preserving collection settings - permissions, lifecycle settings, scheduled connectors, document metadata
- Tracking migration state in a SQLite database with step-level granularity
- Supporting resume - interrupted migrations can be resumed without duplicating work
- Verifying job completion with document counts, embedding model, lifecycle settings, and document statuses
- RAG verification - optionally test each migrated collection with a chat query to confirm it works
- Optionally migrating chat sessions from old to new collections
- Manual move operations - move connectors and chats between any collections
- Supporting both admin bulk and self-service migrations
Installation
pip install h2ogpte-migration
This installs the h2ogpte-migrate command.
How It Works
Per-Collection Migration Flow
For each collection, the tool performs these steps:
1. Create new collection (target embedding model) --> DB: collection_created=1
2. Copy permissions (public, user, group) --> DB: permissions_copied=1
3. Import documents + settings (single server job) --> DB: import_submitted=1
- Documents (with preserved ingest modes)
- Lifecycle settings (expiry, inactivity, size limit)
- Document metadata
4. Migrate scheduled connectors (after import succeeds) --> DB: connectors_migrated=1
5. [Optional] Migrate chat sessions (after connectors) --> DB: chats_migrated=1
The import job (step 3) handles documents, lifecycle settings, and metadata in a single server-side operation. Connectors and chats are migrated separately after the import succeeds to ensure they are only moved to a working collection.
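The flow above can be sketched as a sequence of idempotent steps, each recorded in the tracking database before moving on. The sketch below is illustrative only — the step runner and its names are placeholders, not the tool's actual internals:

```python
# Sketch of the per-collection flow: each step flips a DB flag on success,
# so an interrupted run can resume without repeating completed work.
# All names here are illustrative, not the tool's real API.

STEPS = [
    ("collection_created", "create new collection with target model"),
    ("permissions_copied", "copy public/user/group permissions"),
    ("import_submitted", "submit server-side import job"),
    ("connectors_migrated", "move scheduled connectors"),
    ("chats_migrated", "move chat sessions (opt-in)"),
]

def run_pending_steps(state: dict, do_step) -> dict:
    """Run only the steps whose flag is still unset, flipping each on success."""
    for flag, description in STEPS:
        if not state.get(flag):
            do_step(flag, description)  # the real tool calls the h2oGPTe API here
            state[flag] = 1             # and persists the flag to SQLite
    return state
```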
Execution Modes
Parallel (default): Submit all import jobs without waiting. Jobs run in the background on the server. Use --verify later to check completion — verify also triggers connector migration for completed jobs, and optionally chat migration with --verify --migrate-chats. Note: passing --migrate-chats without --wait-for-completion will NOT migrate chats during the run — it is deferred to the next --verify --migrate-chats invocation.
Sequential (--wait-for-completion): Wait for each import job to complete before moving to the next collection. After each successful import, connectors are migrated immediately. If --migrate-chats is specified, chat sessions are migrated after connectors. Optionally use --max-concurrent-jobs N to process multiple collections concurrently while still waiting for each to complete.
Both execution modes produce identical end results — they differ only in when connectors/chats are migrated (inline for sequential, on next --verify for parallel).
Behavior Summary
| Action | --wait-for-completion | Parallel (default) |
|---|---|---|
| Create new collection | Automatic | Automatic |
| Copy permissions | Automatic | Automatic |
| Import documents + settings | Waits for completion | Submits and exits |
| Validate migration (doc counts, model, statuses) | Automatic after import | On --verify |
| RAG verification | Needs --verify-query in the migration command | On --verify --verify-query |
| Migrate connectors | Automatic after import | On --verify |
| Migrate chats | Needs --migrate-chats in the migration command | On --verify --migrate-chats |
Why connectors are automatic but chats are opt-in:
- Connectors handle scheduled document ingestion — if they aren't moved, the new collection won't receive future data updates
- Chats are optional — admins may want to verify the migration before moving users' chat history, giving flexibility on timing
What happens to the old collection
The old collection is not deleted by this tool. After migration:
- Documents are shared (referenced by both old and new collections via copy_document=False)
- Scheduled connectors have been moved to the new collection (the old collection has none)
- Chat sessions have been moved to the new collection (if --migrate-chats was used)
- The old collection can be manually deleted once you've confirmed the migration is complete
- With copy_document=False, deleting the old collection is safe — documents survive because the new collection still references them
Why connectors and chats are separate from the import job
Scheduled connectors are moved (not copied) from the old collection to the new one. This is a destructive operation — the old collection loses its connectors. If the import job failed (e.g., embedding model error), moving connectors to a broken collection would leave the old collection without connectors and the new collection unusable. By running connector migration only after a confirmed successful import, we ensure connectors are only moved to a working collection.
Chat sessions follow the same principle — they should only be moved after both the import and connector migration succeed.
Resume Capability
If the tool is interrupted, re-running the same command will:
- Skip collections that are fully migrated (unless --force-remigrate is used). If connectors/chats were previously moved to another collection, the tool logs the exact --move-connectors/--move-chats commands needed to recover them
- Reuse collections that were created but not yet imported (avoids orphaned collections)
- Direct to --verify for collections with submitted but incomplete imports
- Run pending post-import steps for collections where the import completed but connectors/chats haven't been migrated
- Retry failed collections when --retry-failed is specified (creates a new collection)
What Gets Migrated
| Item | How (API calls / configs) | When |
|---|---|---|
| Documents | Server-side import job (import_collection_into_collection) | During import |
| Document metadata | preserve_metadata=True | During import |
| Document ingest modes | preserve_document_status=True (agent_only stays agent_only) | During import |
| Lifecycle settings | copy_lifecycle_settings=True (expiry, inactivity, size limit) | During import |
| Public permissions | list_collection_public_permissions + make_collection_public | Before import |
| User/group permissions | list_collection_permissions + share_collection | Before import |
| Scheduled connectors | migrate_scheduled_connectors_to_collection (moved from old to new) | After import succeeds |
| Chat sessions | migrate_chat_sessions_to_collection (opt-in via --migrate-chats) | After connectors succeed |
Authentication Modes
Admin Mode (--admin-key)
- Migrate collections for any user
- Use --users, --all-users, or --collections to set scope
- Automatically creates temporary API keys for collection owners
- Best for bulk migrations across the organization
Self-Service Mode (--user-key)
- Migrate only collections you own
- Optionally use --collections to specify which collections
- No additional temporary API keys needed (your user key is used directly)
- For collection owners managing their own migrations
Model Mappings
When using the --use-model-mappings flag, the tool uses the following predefined source→target model mappings. Collections whose current embedding model matches a source model below will be migrated to the corresponding target model. Collections using any other model are skipped.
| Source Model (deprecated) | Target Model (compliant) |
|---|---|
| BAAI/bge-m3 | h2oai/embeddinggemma-300m-qat-q8_0-unquantized |
| BAAI/bge-large-en-v1.5 | mixedbread-ai/mxbai-embed-large-v1 |
To migrate to a model not listed here, use --target-model instead.
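With --use-model-mappings, collection selection amounts to a dictionary lookup. A minimal sketch of the behavior described above:

```python
# Predefined source -> target mappings from the table above.
MODEL_MAPPINGS = {
    "BAAI/bge-m3": "h2oai/embeddinggemma-300m-qat-q8_0-unquantized",
    "BAAI/bge-large-en-v1.5": "mixedbread-ai/mxbai-embed-large-v1",
}

def target_for(current_model):
    """Return the target model, or None when the collection should be skipped."""
    return MODEL_MAPPINGS.get(current_model)
```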
Flag Reference
| Flag | Description |
|---|---|
| Required | |
| --url <url> | h2oGPTe instance URL |
| Authentication (choose one) | |
| --admin-key [key] | Admin API key. Enables migration for any user via --users, --all-users, or --collections/--collections-file. Automatically creates and cleans up temporary API keys for collection owners. Pass without a value to read from the H2OGPTE_ADMIN_KEY env var |
| --user-key [key] | User API key. Migrates only collections you own. No additional temporary API keys created. Cannot use --users or --all-users. Pass without a value to read from the H2OGPTE_USER_KEY env var |
| Migration Scope (choose one) | |
| --users <names> | Comma-separated usernames to migrate (admin only). Example: "john.doe, jane.smith" |
| --all-users | Migrate all users in the organization (admin only). Use with caution on large organizations — this will consider all collections in the system |
| --collections <ids> | Comma-separated collection IDs. Works with both admin and user keys. Auto-detects owners in admin mode. Also used to filter --verify scope |
| --collections-file <path> | Path to a file containing collection IDs (one per line, # comments ignored). Works like --collections but reads from a file |
| Migration Mode (choose one) | |
| --use-model-mappings | Use predefined source→target model mappings (see Model Mappings section). Collections whose model isn't in the mapping are skipped. Cannot be combined with --source-model or --target-model |
| --target-model <model> | Target embedding model. Required when not using --use-model-mappings |
| --source-model <model> | Only migrate collections using this specific embedding model (optional). Without this, all collections are migrated to --target-model regardless of their current embedding model |
| Execution | |
| --wait-for-completion | Wait for each collection to fully complete migration before moving to the next (import, validation, connectors, and optionally chats). Without this flag, jobs are submitted in parallel and --verify must be run separately. Use --max-concurrent-jobs N to process multiple collections concurrently |
| --max-concurrent-jobs <N> | Number of collections to process concurrently with --wait-for-completion (default: 1). Each worker runs the full migration cycle (create, import, validate, connectors, chats) independently. Has no effect without --wait-for-completion |
| --verify | Check status of previously submitted import jobs. Validates completed imports (document counts, embedding model, lifecycle settings, document statuses). Migrates connectors for successfully completed imports. Often combined with --migrate-chats and --verify-query. Cannot be combined with migration flags (--use-model-mappings, --target-model, --source-model) |
| --migrate-chats | Migrate chat sessions from the old to the new collection after successful import and connector migration. With --wait-for-completion: migrates chats inline. Without it: deferred to --verify --migrate-chats. Chats are only moved after connectors succeed |
| --dry-run | Preview what would be migrated without making any changes. Shows target models, permissions, lifecycle settings, document counts, and import settings. No collections created, no database writes |
| Retry/Resume | |
| --retry-failed | Retry collections whose import jobs failed. Creates a new collection and re-submits the import. The previous failed collection remains and needs manual cleanup. Not needed when a collection's import succeeded but connector/chat migration failed — those are automatically retried on the next --verify --migrate-chats run. Cannot be combined with --force-remigrate |
| --force-remigrate | Re-migrate collections regardless of their migration status. Creates new collections even if previously migrated successfully. Overwrites database records. Use with caution — verify the state of the old collection before re-migrating. If a previous successful migration already moved connectors/chats, use --move-connectors/--move-chats to restore the original collection's state beforehand, or use the recovery commands logged during re-migration. Cannot be combined with --retry-failed |
| Manual Move (recovery actions, not part of regular migration workflows) | |
| --move-connectors | Move scheduled connectors from the --from collection to the --to collection. The h2oGPTe API enforces ownership — the user must own both collections. With --admin-key, the tool looks up the source collection owner and impersonates them. Cannot be combined with migration or verify flags |
| --move-chats | Move chat sessions from the --from collection to the --to collection. Can be combined with --move-connectors to move both in a single command. Same ownership rules apply |
| --from <id> | Source collection ID for --move-connectors/--move-chats |
| --to <id> | Target collection ID for --move-connectors/--move-chats. Must differ from --from |
| Verification | |
| --verify-query <query> | RAG verification query. For each completed migration, creates a temporary chat session on the new collection, sends the query, checks that the response includes document references, logs a response preview, and deletes the test chat session. Informational only — does not block connector/chat migration. Requires --verify or --wait-for-completion. The same query is sent to every collection being verified — often combined with --collections to target specific collections |
| Options | |
| --copy-document | Copy documents instead of referencing them. The default (False) references documents — both old and new collections point to the same document record (with different embeddings), which is faster (skips creating new document records, storage uploads, and cataloging) and saves storage. Use --copy-document for full storage isolation between collections |
| --skip-reparse | Re-embed existing chunks without re-parsing documents. Reads text chunks from the source collection, re-embeds them with the target embedding model, and stores them in the new collection — skipping file fetch, PDF conversion, OCR, and chunking. Significantly faster for migrations where only the embedding model changes. Requires copy_document=False (the default). Cannot be used with --copy-document |
| --ocr-model <model> | OCR model to use during document re-parsing (default: auto). Use this to override the source collection's OCR model, e.g., when migrating away from a CN model. Examples: auto, off, tesseract. Not applicable when --skip-reparse is used |
| --db-path <path> | Path to the SQLite database for tracking migration state (default: migration_tracking.db). The database file is created automatically in the directory where the tool is run; this flag is only needed for a custom location. Must match the path used for the original migration when running --verify, otherwise a new empty database is created and no pending jobs are found |
| --cert <path> | Path to a CA certificate file for SSL verification. Omit this flag if no certificate is required |
| --api-key-expiry <duration> | Expiry duration for temporary API keys created in admin mode (default: 30 days). Example: "7 days", "30 days" |
Usage Examples
Quick Start
1. Sequentially migrate specific collections (one at a time)
Migrate, validate, move chats, and verify RAG in a single command:
h2ogpte-migrate --url https://h2ogpte.example.com --user-key sk-xxx --collections "col-123, col-456" --use-model-mappings --wait-for-completion --migrate-chats --verify-query "What is our refund policy?"
What happens (for each collection, one at a time — fully completes before moving to the next):
- Creates a new collection with the target model from the predefined model mapping
- Copies permissions (public, user, group) and lifecycle settings (expiry, inactivity, size limit) to the new collection
- Imports documents and waits for the import job to complete
- Validates the import (document counts, embedding model, lifecycle settings, document statuses)
- Creates a temporary test chat session, sends the RAG verification query, checks for document references, logs the response preview, and deletes the test chat session
- Migrates scheduled connectors automatically
- Migrates chat sessions (because --migrate-chats is passed)
Note: The same --verify-query is sent to every collection. For collection-specific queries, run separate commands per collection (e.g., in multiple terminals).
Tip: Add --max-concurrent-jobs N to process multiple collections concurrently instead of one at a time. See Example 2 below.
2. Concurrent migration with controlled parallelism
Migrate many collections concurrently with --max-concurrent-jobs:
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --collections "col-1, col-2, ..., col-100" --use-model-mappings --wait-for-completion --max-concurrent-jobs 10 --migrate-chats
What happens:
- Up to 10 collections are processed concurrently
- Each worker independently: creates a new collection, copies permissions, imports documents, waits for completion, validates, migrates connectors and chats
- When a worker finishes one collection, it picks up the next from the queue
- At most 10 import jobs are active on the server at any time
- If any collection fails, the rest continue unaffected — failed collections can be retried later with --retry-failed
Note: Without --max-concurrent-jobs (or with --max-concurrent-jobs 1), --wait-for-completion processes one collection at a time. Use higher values to speed up large-scale migrations while controlling server load.
Tip: With concurrent workers, log lines from different collections are interleaved. Each line is prefixed with [Collection Name], so you can filter the log file for a specific collection:
grep "\[Collection Alpha\]" migration_20260316_225645.log
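The worker-pool behavior described in this example resembles a standard bounded executor: at most N collections in flight, and a worker picks up the next collection as soon as it finishes one. A minimal sketch under that assumption (migrate_one is a hypothetical stand-in for the full per-collection cycle, not the tool's real API):

```python
from concurrent.futures import ThreadPoolExecutor

def migrate_all(collections, migrate_one, max_concurrent_jobs=1):
    """Process collections with at most max_concurrent_jobs in flight.

    Failures are collected instead of aborting the batch, mirroring the
    tool's behavior of leaving failed collections for --retry-failed.
    """
    failed = []
    with ThreadPoolExecutor(max_workers=max_concurrent_jobs) as pool:
        futures = {pool.submit(migrate_one, c): c for c in collections}
        for future, collection in futures.items():
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as exc:
                failed.append((collection, str(exc)))
    return failed
```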
3. Migrate specific collections in parallel (multiple at the same time)
Submit all jobs at once, then verify separately:
# Step 1: Submit migration jobs (runs in background)
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --collections "col-123, col-456, col-789" --use-model-mappings
# Step 2: Verify completion, migrate connectors + chats, and run RAG check
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --collections "col-123, col-456, col-789" --verify --migrate-chats --verify-query "What is our refund policy?"
What happens in Step 1:
- For each collection: looks up the owner, creates a temporary API key for them
- Creates new collections with the target model from the predefined model mapping
- Copies permissions, lifecycle settings, and submits import jobs for each collection
- Exits immediately once all collections have had jobs created — jobs continue running in the background on the server
What happens in Step 2:
- Checks the status of each import job
- For successfully completed imports: validates document counts, embedding model, lifecycle settings, and document statuses
- Runs the RAG verification query on each completed collection (because --verify-query was included)
- Migrates scheduled connectors automatically for successfully completed imports
- Migrates chat sessions (because --migrate-chats is passed) — chats are only moved after the import succeeds, so it's safe to include in the verify step
Note: The --collections flag in Step 2 limits verification to those specific collections. The same --verify-query is sent to every collection specified. Without --collections, --verify checks all pending jobs in the database (admin mode). With --user-key and no --collections, only jobs belonging to your account are checked.
Tip for admins: --users "john.doe, jane.smith" can be used to scope to specific users instead of collection IDs. --all-users is also available to migrate every collection across all users, but use with caution on large organizations as it submits import jobs for all collections at once.
More Examples
4. Dry run (preview changes)
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --users john.doe --use-model-mappings --dry-run
What happens:
- Shows which collections would be migrated and the target embedding models
- Displays permissions that would be copied (public, user, group)
- Shows lifecycle settings that would be copied (expiry, inactivity interval, size limit)
- Shows document counts and import settings
- No actual changes — no collections created, no database writes
5. Specific model migration with OCR model override
# Step 1: Submit migration jobs for a specific source model, using tesseract for OCR
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --users john.doe --source-model "BAAI/bge-large-en-v1.5" --target-model "mixedbread-ai/mxbai-embed-large-v1" --ocr-model "tesseract"
# Step 2: Verify completion, migrate connectors + chats
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --users john.doe --verify --migrate-chats
What happens in Step 1:
- Creates a temporary API key for the user
- Scans all collections owned by the user
- Only processes collections using BAAI/bge-large-en-v1.5 (ignores all others)
- For each matching collection: creates a new collection with mixedbread-ai/mxbai-embed-large-v1, copies permissions and lifecycle settings, and submits an import job
- The --ocr-model "tesseract" flag overrides the OCR model used during document re-parsing (default: auto). Use this when migrating away from a CN OCR model or to preserve a specific OCR model like Tesseract
- Useful for phased migrations — migrate one model at a time instead of using predefined mappings
What happens in Step 2:
- Checks the status of each import job for the user
- For successfully completed imports: validates document counts, embedding model, lifecycle settings, and document statuses
- Migrates scheduled connectors automatically for successfully completed imports
- Migrates chat sessions for successfully completed imports (because --migrate-chats is passed)
- No RAG verification is done (--verify-query was not included — add it to Step 2 if needed, but keep in mind the same query would apply to all collections verified)
6. Verify and migrate chats (check job status, migrate connectors + chats)
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --verify --migrate-chats
What happens:
- Does NOT run any new migrations
- Queries the database for all pending/submitted/running jobs, plus completed jobs with pending post-import steps (i.e., connectors and chats that weren't migrated inline because --wait-for-completion was omitted)
- Checks the status of each import job
- For successfully completed imports: validates document counts, embedding model, lifecycle settings, and document statuses
- Migrates scheduled connectors automatically for successfully completed imports
- Migrates chat sessions for successfully completed imports (because --migrate-chats is passed)
- Reports a summary (completed/failed/running/canceled counts)
Without --users or --collections, checks all pending jobs in the database with --admin-key. With --user-key, only jobs belonging to your account are checked.
Filter by user or collection:
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --verify --users john.doe
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --verify --collections "col-123, col-456"
h2ogpte-migrate --url https://h2ogpte.example.com --user-key sk-xxx --verify --collections "col-123, col-456"
7. Retry failed migrations
# Admin: retry specific collections
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --collections "col-123, col-456" --use-model-mappings --retry-failed --wait-for-completion --migrate-chats
# Self-service: retry all your failed collections (no --collections scope needed)
h2ogpte-migrate --url https://h2ogpte.example.com --user-key sk-xxx --use-model-mappings --retry-failed --wait-for-completion --migrate-chats
What happens:
- Skips collections that are completed, submitted, or running
- For collections with a failed import job status: creates a new collection and re-submits the import
- Logs a warning with the previous failed collection ID for reference (needs manual cleanup)
- With --wait-for-completion: waits for each retried import to complete, validates, and migrates connectors and chats inline
- Without --wait-for-completion: submits jobs in the background — run --verify --migrate-chats later to check completion and migrate connectors + chats
8. Force re-migration
# Step 1: Force re-migrate specific collections
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --collections "col-123, col-456" --use-model-mappings --force-remigrate
# Step 2: Verify completion, migrate connectors + chats
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --collections "col-123, col-456" --verify --migrate-chats
What happens in Step 1:
- Collections are picked up regardless of their migration status
- Creates new collections for ALL specified collections, even if they were previously migrated successfully
- Overwrites previous local migration database records for those collections
- Previously migrated collections remain in the user's account (needs manual cleanup)
- Caution: If a previous successful migration already moved connectors/chats to another collection, either move them back to the original collection first (via --move-connectors/--move-chats), or rely on the exact move commands the tool logs after the re-migration completes
What happens in Step 2:
- Checks the status of each import job
- For successfully completed imports: validates document counts, embedding model, lifecycle settings, and document statuses
- Migrates scheduled connectors from the original collection, if they still exist there
- Migrates chat sessions from the original collection, if they still exist there (because --migrate-chats is passed)
- Important: If a previous migration already moved connectors/chats out of the original collection, they won't be found here — use --move-connectors/--move-chats to recover them (see commands logged in Step 1)
9. Manually move connectors and/or chats between collections (for recovery purposes)
h2ogpte-migrate --url https://h2ogpte.example.com --user-key sk-xxx --move-connectors --from "col-abc" --to "col-def"
h2ogpte-migrate --url https://h2ogpte.example.com --user-key sk-xxx --move-chats --from "col-abc" --to "col-def"
h2ogpte-migrate --url https://h2ogpte.example.com --admin-key sk-xxx --move-connectors --move-chats --from "col-abc" --to "col-def"
What happens:
- Moves scheduled connectors and/or chat sessions from the source collection to the target collection
- The source collection will no longer have the moved items after this operation
- Useful for recovering connectors/chats after --force-remigrate, or for reorganizing collections
- Works with both --admin-key and --user-key (the server enforces ownership)
- In admin mode, automatically creates a temporary API key for the source collection's owner
Database Tracking
Schema
CREATE TABLE collection_migrations (
old_collection_id TEXT PRIMARY KEY,
old_collection_name TEXT,
new_collection_id TEXT,
new_collection_name TEXT,
old_model TEXT,
new_model TEXT,
job_id TEXT,
job_status TEXT,
user_id TEXT,
username TEXT,
created_at TIMESTAMP,
completed_at TIMESTAMP,
error TEXT,
-- Step tracking
collection_created BOOLEAN DEFAULT 0,
permissions_copied BOOLEAN DEFAULT 0,
import_submitted BOOLEAN DEFAULT 0,
import_completed BOOLEAN DEFAULT 0,
connectors_migrated BOOLEAN DEFAULT 0,
chats_migrated BOOLEAN DEFAULT 0
);
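Any SQLite client can inspect the tracking database. For example, in Python, using an abbreviated, in-memory copy of the schema purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real file is migration_tracking.db
conn.execute("""CREATE TABLE collection_migrations (
    old_collection_id TEXT PRIMARY KEY,
    old_collection_name TEXT,
    job_status TEXT,
    import_completed BOOLEAN DEFAULT 0,
    connectors_migrated BOOLEAN DEFAULT 0,
    chats_migrated BOOLEAN DEFAULT 0)""")
conn.execute(
    "INSERT INTO collection_migrations VALUES (?, ?, ?, ?, ?, ?)",
    ("col-123", "Docs", "completed", 1, 1, 0),
)
# Collections whose import finished but still have pending post-import steps:
pending = conn.execute(
    "SELECT old_collection_id FROM collection_migrations "
    "WHERE import_completed = 1 "
    "AND (connectors_migrated = 0 OR chats_migrated = 0)"
).fetchall()
```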
Job Statuses
- pending - Collection created, import not yet submitted
- submitted - Import job submitted, running in background
- running - Import job verified as in-progress
- completed - Import job completed successfully
- failed - Import job failed, canceled, or had errors
Resume Behavior
| DB State | On Re-run |
|---|---|
| collection_created=1, import_submitted=0 | Reuses existing collection, re-copies permissions, submits import |
| import_submitted=1, import_completed=0 | Skips, tells user to run --verify |
| import_completed=1, connectors_migrated=0 | Migrates connectors |
| import_completed=1, connectors_migrated=1, chats_migrated=0 | With --migrate-chats: migrates chats |
| import_completed=1, connectors_migrated=1, chats_migrated=1 | Fully done, skips |
| job_status='failed' | With --retry-failed: creates new collection |
| Any state | With --force-remigrate: ignores DB, creates new collection. Logs --move-connectors/--move-chats commands if connectors/chats were previously moved |
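The resume table above can be read as a simple decision function. The sketch below is illustrative only — the return values are descriptive labels, not the tool's log messages:

```python
def resume_action(row, migrate_chats=False, retry_failed=False,
                  force_remigrate=False):
    """Map a tracking-DB row (dict of step flags plus job_status) to the
    action taken on re-run, mirroring the resume table above."""
    if force_remigrate:
        return "create new collection (ignore DB)"
    if row.get("job_status") == "failed":
        return "retry: create new collection" if retry_failed else "skip failed"
    if not row.get("import_submitted"):
        return "reuse collection, submit import"
    if not row.get("import_completed"):
        return "skip, run --verify"
    if not row.get("connectors_migrated"):
        return "migrate connectors"
    if migrate_chats and not row.get("chats_migrated"):
        return "migrate chats"
    return "fully done, skip"
```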
Output Files
- migration_YYYYMMDD_HHMMSS.log - Detailed log with timestamps, job IDs, and errors. Created in the directory where the tool is run
- migration_tracking.db - SQLite database with migration state. Created automatically in the directory where the tool is run. Use --db-path for a custom location. If you later run from a different directory (e.g., for --verify), pass --db-path pointing to the original database; otherwise a new empty database is created and no pending jobs are found
Troubleshooting
SSL Certificate Errors
--cert ~/path/to/ca-chain.crt # Provide certificate
# Omit --cert if no certificate is required
Check Migration Status
sqlite3 migration_tracking.db "SELECT old_collection_name, job_status, import_completed, connectors_migrated, chats_migrated, error FROM collection_migrations;"
Collection Already Migrated
--force-remigrate # Re-migrate (creates new collection, old one needs manual cleanup)
Caution: If a previous successful migration already moved connectors/chats, use --move-connectors/--move-chats to restore state before re-migrating, or use the recovery commands logged during re-migration.
Using --verify with a custom database path
If your initial migration used --db-path /custom/path/migration.db, you must use the same --db-path for --verify, otherwise it creates a new empty database and finds no pending jobs.
Failed Import - Retry
--retry-failed # Creates new collection for failed imports
Best Practices
- Always dry-run first - Use --dry-run to preview changes
- Test on a single collection or user - Understand and validate how the migration works before running at larger scale
- Run during off-hours - Minimize impact on users
- Use parallel mode for large batches - Submit jobs without waiting, verify later
- Always run --verify after parallel migrations - This checks completion, validates imports, and migrates connectors. Include --migrate-chats to also migrate chat sessions
- Use --verify-query for RAG validation - Sends a test query to each migrated collection, checks for document references, and cleans up the test chat session. Informational only — does not block connector/chat migration. The same query applies to all collections being verified, so use --collections to target collections with similar content for accurate results
- Use --move-connectors/--move-chats with --force-remigrate - A previous successful migration may have already moved connectors/chats to another collection; use --move-connectors/--move-chats to recover them to the appropriate collection. The tool logs the exact commands needed during re-migration
- Keep the database and logs - Archive them for an audit trail
- Clean up failed collections manually - After --retry-failed, old failed collections remain