User pattern learning with trajectory-aware DPO training
CognitiveTwin
User pattern learning with trajectory-aware DPO (Direct Preference Optimization) training.
Overview
CognitiveTwin learns user communication patterns through the following components:
- Corpus Surgery: Data cleaning, validation, and quality filtering
- WORMS: Trajectory generators for synthetic training data
  - Conversation Worm: Dialogue trajectory generation
  - Repo Worm: Code repository analysis
  - Task Worm: Task execution patterns
- DPO Generator: Preference pair generation
- Dataset Building: Preference pair labeling and export
- Evaluation Suite: Comprehensive testing framework
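To make the preference-pair and DPO steps listed above concrete, here is a minimal sketch of the standard DPO objective for a single pair. It is not taken from the package: `PreferencePair`, `dpo_loss`, and the default `beta=0.1` are hypothetical names and values chosen for illustration, and the log-probabilities are assumed to be summed token log-probabilities from the policy and a frozen reference model.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical container for one preference pair (not the package schema)."""
    prompt: str
    chosen: str    # response the user pattern prefers
    rejected: str  # response the user pattern disprefers


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    The margin is how much more the policy favors the chosen response over
    the rejected one, relative to the reference model.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


pair = PreferencePair(
    prompt="Summarize the change.",
    chosen="Short, direct summary.",
    rejected="Long, hedged summary.",
)
# Assumed log-probabilities, for illustration only.
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2; the loss falls below that as the policy shifts probability toward the chosen response.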
Installation
pip install cognitive-twin
# With training dependencies
pip install "cognitive-twin[training]"
Quick Start
from cognitive_twin.v3 import pipeline, schema
from cognitive_twin.framework import config
# Initialize pipeline
cfg = config.CognitiveTwinConfig(
    model_name="your-base-model",
    output_dir="./output",
)
# Run corpus surgery
pipeline.run_corpus_surgery(cfg)
# Generate training data
pipeline.generate_dpo_pairs(cfg)
# Train
pipeline.train(cfg)
Components
v3/ - Main Implementation
- corpus_surgery/ - Data cleaning and validation
- dataset/ - Dataset generation and labeling
- eval/ - Evaluation framework
- generators/ - Batch and DPO generators
- ingest/ - Data ingestion (Claude, OpenAI, Supabase)
- worms/ - Trajectory generators
- pipeline.py - Main orchestrator
- schema.py - Type definitions
framework/ - Supporting Infrastructure
- config.py - Configuration management
- twin.py - Core twin abstraction
- trainer.py - Training loop
Documentation
See the docs/ directory for detailed documentation:
- 00_OVERVIEW.md - System overview
- 01_CORPUS_SURGERY.md - Data cleaning pipeline
- 02_REPO_WORM.md - Repository analysis
- 03_CONVERSATION_WORM.md - Dialogue generation
- 04_ENHANCER_AGENT.md - Quality enhancement
- 05_DATASET_BUILDER.md - Dataset construction
- 06_TRAINING_PIPELINE.md - Training guide
- 07_EVALUATION_SUITE.md - Evaluation metrics
- 08_API_INTEGRATION.md - API usage
License
MIT
Download files
Source Distribution
cognitive_twin-3.0.0.tar.gz (319.1 kB)
Built Distribution
cognitive_twin-3.0.0-py3-none-any.whl (270.9 kB)
File details
Details for the file cognitive_twin-3.0.0.tar.gz.
File metadata
- Download URL: cognitive_twin-3.0.0.tar.gz
- Upload date:
- Size: 319.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 09bcf2435279bdc10d32c7eae62f75d87b2712ead25e9ffd60682bc4cd17a97c |
| MD5 | 3a8c96b09a0acc6a73e45301843f4556 |
| BLAKE2b-256 | 2790b7bac746fa0d42da6aa86e5ad86433c65950da4341d5e2b2854363e27e98 |
File details
Details for the file cognitive_twin-3.0.0-py3-none-any.whl.
File metadata
- Download URL: cognitive_twin-3.0.0-py3-none-any.whl
- Upload date:
- Size: 270.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c9f634c9b0b23418a663374d4122ce07ce5f55ebb260d5e8d95b755d79ae2962 |
| MD5 | ed65b6b72224b6c95bf9ae55adedbbde |
| BLAKE2b-256 | c449e369853b4940463e819da25d34c1ae650a000bd4c7ac1ab76fd6c717f20a |
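The SHA256 digests listed above can be checked against a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_published_digest` are illustrative and are not part of the package.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution into the current directory, `matches_published_digest("cognitive_twin-3.0.0.tar.gz", "09bcf2435279bdc10d32c7eae62f75d87b2712ead25e9ffd60682bc4cd17a97c")` should return `True` if the file is intact.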