ragstruct - A Pseudo-Finetuning RAG Framework for structured JSON-based retrieval

Maintainers

Joshikaran

Project description

🎵 ragstruct — A Pseudo-Finetuning RAG Framework

🔍 Lightweight semantic retriever using structured JSON
💡 Built with love by Joshikaran K. (Joshi Felix)

🧠 What is ragstruct?

ragstruct is a minimal, blazing-fast semantic retrieval library built for anyone who wants to simulate fine-tuned behavior without ever training a model. You give it structured JSON memory, and it gives your LLM meaningful context, fast.

No vector DBs. No finetuning. No heavy dependencies.

✨ Key Features

✅ Zero-database JSON-based retriever
🎯 Built on BGE embeddings (BAAI/bge-large-en-v1.5)
⏳ Fast top-k semantic matches
🔄 Memory tracking for context injection
💪 Works with any LLM (OpenAI, Mistral, local)
🙌 Great for agents, personal AIs, digital twins

🚀 Installation

pip install ragstruct

📊 Use Case Examples

👤 Digital Twin memory retrieval (e.g., Joshi AI)
🧑‍💼 Resume bots and personal agent context
🧠 Mental health / therapy state tracking
🎓 LLM Study-buddy with syllabus JSON
📚 Retrieval-based storytelling agents
🎮 Game character memory/NPCs

🎡 Why ragstruct Exists

I (Joshikaran) built ragstruct while creating Joshi AI, a digital twin that could talk like me, remember my projects, and reflect my mindset.

Every existing RAG pipeline felt like overkill. LangChain + vector DB + server just to search my own memory? Nah.

So I built this:

“I wanted a RAG system that was so simple it could run in a terminal, speak like me, and understand what part of me it's referring to.”

🕵️‍♂️ When to Use ragstruct

Use ragstruct if:

✅ You have structured memory or JSON knowledge
✅ You want fast retrieval from text keys
✅ You want context-aware LLMs without training
✅ You care about token savings + control
✅ You’re building personal AI or local agents

🪖 How it Works (Pseudo-Finetuning)

Instead of retraining the model, you remind the model what to say by:

Embedding your JSON keys
Matching input queries to relevant memory
Injecting that into the LLM prompt

This creates the effect of fine-tuning, without touching weights.
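
The three steps above can be sketched in a few lines of plain Python. This is an illustration of the idea, not ragstruct's own code: the names `embed`, `cosine`, `top_k`, and `build_prompt` are hypothetical, and a bag-of-words count vector stands in for the BGE embedding model so the example runs without downloading anything.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, memory, k=2):
    """Return the k (key, value) pairs whose keys best match the query."""
    q = embed(query)
    ranked = sorted(
        memory.items(), key=lambda kv: cosine(q, embed(kv[0])), reverse=True
    )
    return ranked[:k]


def build_prompt(query, memory, k=2):
    """Inject the best-matching memory entries ahead of the user's question."""
    context = "\n".join(f"- {key}: {value}" for key, value in top_k(query, memory, k))
    return f"Use this memory when answering:\n{context}\n\nQuestion: {query}"


# Illustrative memory; keys are what gets embedded and matched.
memory = {
    "current projects": "Felix AI, a crypto forecasting agent.",
    "tech stack": "Python and XGBoost.",
    "favorite editor": "vim",
}
prompt = build_prompt("What tech stack do you use?", memory, k=1)
```

The resulting `prompt` string is what you would send to whichever LLM you use. Swapping `embed` for a real sentence-embedding model changes the quality of the matches, not the shape of the pipeline.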

📊 Comparison: Finetuning vs ragstruct

Traditional Finetuning	ragstruct (Pseudo)
Requires large training data	Works off your real JSON
Needs GPUs, money, time	Just Python + CPU
Locked once trained	Dynamic memory updates
Expensive to iterate	Instant memory edits
One model only	Use any LLM (local/cloud)

🔄 Smart Tips

🔄 Format Your JSON

Nested or list-heavy JSON? ragstruct flattens and formats it like this:

{
  "name": "Felix AI",
  "description": "A crypto forecasting agent.",
  "tech_stack": ["Python", "XGBoost"]
}

...so your LLM sees clean chunks. Perfect for memory injection.
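
To make the flattening idea concrete, here is a minimal sketch of one way nested JSON could be turned into flat text chunks. The `flatten` helper and the `path: value` chunk format are assumptions made for illustration; ragstruct's actual output format may differ.

```python
def flatten(obj, prefix=""):
    """Yield (dotted_path, text) pairs for every leaf in nested JSON-like data."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(obj, list):
        if all(not isinstance(item, (dict, list)) for item in obj):
            # A list of scalars reads best as one comma-separated chunk.
            yield prefix, ", ".join(str(item) for item in obj)
        else:
            # Lists of objects get an index in the path.
            for index, item in enumerate(obj):
                yield from flatten(item, f"{prefix}[{index}]")
    else:
        yield prefix, str(obj)


record = {
    "name": "Felix AI",
    "description": "A crypto forecasting agent.",
    "tech_stack": ["Python", "XGBoost"],
}
chunks = [f"{path}: {text}" for path, text in flatten(record)]
```

Each entry in `chunks` is a short, self-describing line that can be embedded or pasted into a prompt on its own.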

🧐 Compress Chat History

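If injecting the full chat history is too heavy, summarize it first:

from transformers import pipeline

# chat_text: the conversation so far, joined into a single string
chat_text = "User: ...\nAssistant: ..."
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(chat_text, max_length=100)[0]["summary_text"]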

🌐 Structuring Text into JSON (Optional)

Use the Structurer module to convert long .txt docs into structured JSON using any LLM:

from ragstruct.structurer import Structurer
struct = Structurer(llm=your_llm)
structured = struct.structure_document("raw text block")

Handles chunking, cleaning, and LLM-guided structuring.
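
For intuition, the chunk, clean, and structure flow might look roughly like the sketch below. It does not show the Structurer's internals: `clean`, `chunk_paragraphs`, `structure_text`, the prompt wording, and the assumption that the LLM is a plain callable returning a JSON string are all hypothetical, and `fake_llm` is a stand-in so the example runs offline.

```python
import json
import re


def clean(text):
    """Collapse runs of whitespace so chunks are compact."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_paragraphs(text, max_chars=500):
    """Group paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks, current = [], ""
    for para in (clean(p) for p in re.split(r"\n\s*\n", text)):
        if not para:
            continue
        if current and len(current) + 1 + len(para) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current} {para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def structure_text(text, llm, max_chars=500):
    """Ask the LLM for one JSON object per chunk and merge the results."""
    merged = {}
    for chunk in chunk_paragraphs(text, max_chars):
        reply = llm(f"Return a JSON object summarizing this text:\n{chunk}")
        merged.update(json.loads(reply))
    return merged


def fake_llm(prompt):
    """Hypothetical stand-in for a real LLM call: echoes chunk metadata as JSON."""
    body = prompt.split("\n", 1)[1]
    return json.dumps({body[:12]: len(body)})


structured = structure_text(
    "First paragraph.\n\nSecond   paragraph.", fake_llm, max_chars=20
)
```

With a real model you would replace `fake_llm` with a function that sends the prompt to your LLM and returns its reply, and you would likely want to validate the reply before calling `json.loads`.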

⚠️ What ragstruct Is Not

Not a full generation pipeline — you supply the LLM
Not multi-user scalable out of the box (but extendable)
Not a replacement for real finetuning — it fakes it smartly

🔖 Summary

🔄 ragstruct injects only what matters
✅ JSON-only, no infra needed
🌍 Works with any LLM or chat agent
🚀 Fast, clean, dev-focused retrieval
🫠 Perfect for personal AI memory

“Don’t train your model. Train your memory.” — Joshi Felix

Ready to build something with soul? Plug in your JSON, choose your LLM, and go.

Built with vim & vision by Joshi Felix.

Project details

Release history

0.1.0

Apr 16, 2025

Download files

Source Distribution

ragstruct-0.1.0.tar.gz (7.0 kB)

Uploaded Apr 16, 2025 Source

Built Distribution

ragstruct-0.1.0-py3-none-any.whl (7.8 kB)

Uploaded Apr 16, 2025 Python 3

File details

Details for the file ragstruct-0.1.0.tar.gz.

File metadata

Download URL: ragstruct-0.1.0.tar.gz
Upload date: Apr 16, 2025
Size: 7.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ragstruct-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`bb0857e666eef291f76c9122cbfd649ced17307413cbb23c79c2a901bf745b4c`
MD5	`dfcbb9db15370665c5d64ba79c2d87e0`
BLAKE2b-256	`ef0fac8da4cac1749b957b2bcc40a941686b2f1cbce41c0f86297850c2d4feae`

Provenance

The following attestation bundles were made for ragstruct-0.1.0.tar.gz:

Publisher: publish.yml on Joshikarank/ragstruct

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ragstruct-0.1.0.tar.gz
- Subject digest: bb0857e666eef291f76c9122cbfd649ced17307413cbb23c79c2a901bf745b4c
- Sigstore transparency entry: 198221192
- Sigstore integration time: Apr 16, 2025
Source repository:
- Permalink: Joshikarank/ragstruct@aea0a9c0dbae4f3e70b9f58aad77286d00846de6
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Joshikarank
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@aea0a9c0dbae4f3e70b9f58aad77286d00846de6
- Trigger Event: release

File details

Details for the file ragstruct-0.1.0-py3-none-any.whl.

File metadata

Download URL: ragstruct-0.1.0-py3-none-any.whl
Upload date: Apr 16, 2025
Size: 7.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ragstruct-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a2314da57266e124a06a38116bc9e88aaed8a5d3c9e750f7f78074c8e05f1eab`
MD5	`468f773972b7559c49180c8a4856189c`
BLAKE2b-256	`990f5b98bca1669a76fc74e93d83ce4106ae5f2879599a3d2ea8e3ddb8c2cb88`

Provenance

The following attestation bundles were made for ragstruct-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Joshikarank/ragstruct

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ragstruct-0.1.0-py3-none-any.whl
- Subject digest: a2314da57266e124a06a38116bc9e88aaed8a5d3c9e750f7f78074c8e05f1eab
- Sigstore transparency entry: 198221194
- Sigstore integration time: Apr 16, 2025
Source repository:
- Permalink: Joshikarank/ragstruct@aea0a9c0dbae4f3e70b9f58aad77286d00846de6
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Joshikarank
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@aea0a9c0dbae4f3e70b9f58aad77286d00846de6
- Trigger Event: release
