Structured NLP tasks powered by a fine-tuned small language model
neural-txt
Structured NLP tasks powered by a fine-tuned 135M-parameter language model. Extract bullets, generate Q&A pairs, build knowledge graphs, and more, all running locally. The model is deliberately narrow and task-specific, so it runs cheaply in resource-constrained environments.
https://github.com/user-attachments/assets/04774af0-dc51-42e7-b2a6-d6f50bf4e258
Support
If you find this helpful, consider supporting the project on Patreon, which hosts all code, projects, slides, and write-ups from the YouTube channel.
Install
# Base (no inference backend)
pip install neural-txt
# With HuggingFace backend (torch)
# The quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "neural-txt[hf]"
# With MLX backend (Apple Silicon)
pip install "neural-txt[mlx]"
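Since the base install ships without an inference backend, it can help to check which one is present before constructing a model. The following is a minimal sketch, not part of neural-txt: `available_backends` and `BACKEND_MODULES` are hypothetical names, and the mapping of extras to importable modules (`torch` for `hf`, `mlx` for `mlx`) is an assumption drawn from the install comments above.

```python
from importlib.util import find_spec

# Assumption: the "hf" extra installs torch and the "mlx" extra installs mlx.
# These module names are inferred from the install comments, not from the
# package metadata.
BACKEND_MODULES = {"hf": "torch", "mlx": "mlx"}


def available_backends(is_installed=None):
    """Return the backend names whose marker module can be found."""
    if is_installed is None:
        # find_spec returns None when a top-level module is not installed.
        is_installed = lambda name: find_spec(name) is not None
    return [name for name, module in BACKEND_MODULES.items() if is_installed(module)]
```

Calling `available_backends()` returns a list such as `["hf"]`, which can be used to choose the `backend=` argument; an empty list suggests that only the base package is installed.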
Quick start
from neuraltxt import NeuralTxt
model = NeuralTxt(backend="mlx") # or backend="hf"
passage = """
Transformers have revolutionized NLP by introducing the self-attention
mechanism. Unlike RNNs, transformers process all tokens in parallel,
leading to significant training speedups.
"""
# Extract key points
bullets = model.extract_bullets(passage)
# Generate question-answer pairs
pairs = model.generate_qa_pairs(passage)
# Extract knowledge graph triplets
triplets = model.extract_triplets(passage)
JSON mode
Every method accepts json=True, which constrains generation with outlines so the result is guaranteed to parse into a structured pydantic model:
# Returns a BulletsOutput pydantic model
bullets = model.extract_bullets(passage, json=True)
print(bullets.bullets) # list[str]
# Returns a QAPairsOutput pydantic model
qa = model.generate_qa_pairs(passage, json=True)
for pair in qa.pairs:
    print(pair.question, pair.answer)
# Returns a TripletsOutput pydantic model
triplets = model.extract_triplets(passage, json=True)
for t in triplets.triplets:
    print(t.subject, t.relation, t.object)
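The triplets above can be turned into a small knowledge graph with plain Python. This sketch assumes only that each triplet exposes `subject`, `relation`, and `object` attributes, as in the loop above; `triplets_to_graph` is a hypothetical helper and the `Triplet` namedtuple is a stand-in for the library's own type so the example runs without a model.

```python
from collections import defaultdict, namedtuple


def triplets_to_graph(triplets):
    """Group (subject, relation, object) triplets into an adjacency map."""
    graph = defaultdict(list)
    for t in triplets:
        graph[t.subject].append((t.relation, t.object))
    return dict(graph)


# Stand-in for the library's triplet type, used only to make this runnable.
Triplet = namedtuple("Triplet", ["subject", "relation", "object"])

sample = [
    Triplet("transformers", "introduced", "self-attention"),
    Triplet("transformers", "process", "tokens in parallel"),
]
graph = triplets_to_graph(sample)
```

With real output, `triplets_to_graph(triplets.triplets)` should work the same way as long as the attribute names match.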
API
| Method | Input | Output | JSON Output |
|---|---|---|---|
| extract_bullets(passage) | passage | list[str] | BulletsOutput |
| generate_qa_pairs(passage) | passage | list[QAPair] | QAPairsOutput |
| generate_question(passage) | passage | str | QuestionOutput |
| generate_questions_list(passage) | passage | list[str] | QuestionsListOutput |
| extract_fact(passage) | passage | str | FactOutput |
| answer(question, passage) | question + passage | str | AnswerOutput |
| rephrase(passage) | passage | str | RephraseOutput |
| continue_from(passage) | passage start | str | ContinuationOutput |
| extract_triplets(passage) | passage | list[Triplet] | TripletsOutput |
| compare(passage_a, passage_b) | two passages | str | ComparisonOutput |
| find_relevant(question, passages) | question + passage list | RetrievalResult | RetrievalOutput |
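Because the methods share a passage-in, plain-value-out shape, several of them can be chained for one passage. The sketch below is an illustration under assumptions: `study_notes` is a hypothetical helper, it relies on the non-JSON return types listed in the table, and `StubModel` is a made-up stand-in that mimics the method names so the example runs without downloading a model.

```python
def study_notes(model, passage):
    """Collect several task outputs for one passage into a plain dict."""
    question = model.generate_question(passage)
    return {
        "bullets": model.extract_bullets(passage),
        "fact": model.extract_fact(passage),
        "question": question,
        "answer": model.answer(question, passage),
    }


class StubModel:
    """Hypothetical stand-in that mimics the method names in the table."""

    def extract_bullets(self, passage):
        # Naive sentence split, only to produce list[str] output.
        return [s.strip() for s in passage.split(".") if s.strip()]

    def extract_fact(self, passage):
        return self.extract_bullets(passage)[0]

    def generate_question(self, passage):
        return "What does the passage say?"

    def answer(self, question, passage):
        return self.extract_bullets(passage)[-1]


notes = study_notes(
    StubModel(),
    "Transformers use self-attention. They process tokens in parallel.",
)
```

Passing a real `NeuralTxt` instance in place of `StubModel()` is the intended use, assuming the methods behave as the table describes.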
Models
| Backend | Default model |
|---|---|
| hf | paperbd/smollm_135M_neuraltxt_v1 |
| mlx | paperbd/smollm_135M_neuraltxt_mlx_v1 |
Pass a custom path: NeuralTxt("path/to/model", backend="hf")
- Training dataset: paperbd/paper_instructions_300K-v1
- Synthetic data generation: text-albumentations
Gradio demo
pip install "neural-txt[app]"
# HuggingFace (default)
python app.py
# MLX (Apple Silicon)
python app.py --mlx
# Options
# --temperature 0.4 sampling temperature (default 0.4)
# -n 2 parallel generations, 1-4 (default 2)
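For readers wiring the same options into their own script, the documented flags map onto `argparse` roughly as follows. This is a sketch that mirrors the options listed above, not the actual contents of app.py; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Parser mirroring the documented demo flags; not the real app.py."""
    parser = argparse.ArgumentParser(description="neural-txt demo options (sketch)")
    parser.add_argument("--mlx", action="store_true",
                        help="use the MLX backend instead of HuggingFace")
    parser.add_argument("--temperature", type=float, default=0.4,
                        help="sampling temperature")
    parser.add_argument("-n", type=int, choices=range(1, 5), default=2,
                        help="parallel generations, 1-4")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--mlx", "-n", "3"])
backend = "mlx" if args.mlx else "hf"
```

Using `choices=range(1, 5)` makes argparse reject values outside the documented 1-4 range instead of failing later during generation.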
File details
Details for the file neural_txt-0.1.0-py3-none-any.whl.
File metadata
- Download URL: neural_txt-0.1.0-py3-none-any.whl
- Upload date:
- Size: 11.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.2 (macOS)
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 81e0db0592261448d671c0fc8f1c8f86d526b3386f827d38a2ea77d85a604064 |
| MD5 | 3a0d30264f370c070668354ebf456793 |
| BLAKE2b-256 | 4b593474b5e9939d0b483d7869199e904f7c34d74bef01c65b50c93fe2e6903a |