Structured NLP tasks powered by a fine-tuned small language model

neural-txt

Structured NLP tasks powered by a fine-tuned 135M-parameter language model. Extract bullets, generate Q&A pairs, build knowledge graphs, and more, all running locally. The model is deliberately narrow and task-specific, so it runs cheaply in resource-constrained environments.

https://github.com/user-attachments/assets/04774af0-dc51-42e7-b2a6-d6f50bf4e258

Support

If you find this helpful, consider supporting on Patreon — it hosts all code, projects, slides, and write-ups from the YouTube channel.

Install

# Base (no inference backend)
pip install neural-txt

# With HuggingFace backend (torch)
# Quoting the extra keeps shells such as zsh from expanding the brackets
pip install "neural-txt[hf]"

# With MLX backend (Apple Silicon)
pip install "neural-txt[mlx]"

NeuralTxtReward works with either backend: install neural-txt[hf] for the Hugging Face / torch scorer, or neural-txt[mlx] for Apple Silicon MLX.
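
Because the usable backend depends on which extra is installed, a script can check for the underlying packages before constructing a model. The following is a minimal sketch, not part of neural-txt: `pick_backend` is a hypothetical helper, and it assumes the `hf` extra makes `torch` importable and the `mlx` extra makes `mlx` importable.

```python
from importlib.util import find_spec
from typing import Optional


def pick_backend(prefer: str = "mlx") -> Optional[str]:
    """Return "mlx" or "hf" depending on which package is importable.

    Hypothetical helper, not part of neural-txt. Assumes the [mlx] extra
    provides the `mlx` package and the [hf] extra provides `torch`.
    """
    available = {
        "mlx": find_spec("mlx") is not None,
        "hf": find_spec("torch") is not None,
    }
    # Try the preferred backend first, then fall back to the other one.
    order = [prefer] + [name for name in available if name != prefer]
    for name in order:
        if available.get(name):
            return name
    return None  # base install: no inference backend present
```

The returned string can be passed as the `backend=` argument; a `None` result means only the base package is installed.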

Quick start

from neuraltxt import NeuralTxt

model = NeuralTxt(backend="mlx")  # or backend="hf"

passage = """
Transformers have revolutionized NLP by introducing the self-attention
mechanism. Unlike RNNs, transformers process all tokens in parallel,
leading to significant training speedups.
"""

# Extract key points
bullets = model.extract_bullets(passage)

# Generate question-answer pairs
pairs = model.generate_qa_pairs(passage)

# Extract knowledge graph triplets
triplets = model.extract_triplets(passage)

Reward scoring

NeuralTxtReward scores generated responses against a reference answer with paperbd/neuraltxt-reward-tiny. Use it to score one answer, score a batch, or rank candidate responses.

from neuraltxt import NeuralTxtReward

reward = NeuralTxtReward(backend="mlx")  # or backend="hf"

reference = "The capital of France is Paris."
responses = [
    "Paris is the capital of France.",
    "France's capital is Lyon.",
]
references = [
    "The capital of France is Paris.",
    "The capital of France is Paris.",
]

score = reward.score(responses[0], reference)          # float between 0 and 1
scores = reward.batch_score(responses, reference)      # list[float], batch_size=64
paired_scores = reward.batch_score(responses, references)
ranked = reward.rank(responses, reference)             # list[RankedResponse]

for item in ranked:
    print(item.index, item.score, item.response)

batch_score() scores responses in chunks of 64 by default. Pass batch_size= to tune memory use. Pass a list of references to score corresponding (response, reference) pairs; the list length must match responses. rank() preserves the original response index and sorts highest score first. Pass a local model directory with NeuralTxtReward("path/to/reward-model").
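
The chunking, reference pairing, and ranking rules described above can be illustrated without loading a model. This is a sketch under stated assumptions, not the library's implementation: `overlap_score` is a toy word-overlap scorer standing in for the reward model, and `batch_score_sketch`, `rank_sketch`, and `RankedSketch` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass
class RankedSketch:
    # Stand-in for RankedResponse; mirrors the index/score/response fields
    # printed in the example above.
    index: int
    score: float
    response: str


def overlap_score(response: str, reference: str) -> float:
    """Toy scorer (Jaccard word overlap) standing in for the reward model."""
    a, b = set(response.lower().split()), set(reference.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def batch_score_sketch(
    responses: Sequence[str],
    reference: Union[str, Sequence[str]],
    score_fn: Callable[[str, str], float] = overlap_score,
    batch_size: int = 64,
) -> List[float]:
    # A single reference is reused for every response; a list must pair up.
    if isinstance(reference, str):
        references = [reference] * len(responses)
    else:
        references = list(reference)
        if len(references) != len(responses):
            raise ValueError("references must match responses in length")
    scores: List[float] = []
    # Work through the inputs in chunks so memory use stays bounded.
    for start in range(0, len(responses), batch_size):
        chunk = zip(responses[start:start + batch_size],
                    references[start:start + batch_size])
        scores.extend(score_fn(resp, ref) for resp, ref in chunk)
    return scores


def rank_sketch(responses, reference, **kwargs) -> List[RankedSketch]:
    scores = batch_score_sketch(responses, reference, **kwargs)
    items = [RankedSketch(i, s, r)
             for i, (s, r) in enumerate(zip(scores, responses))]
    # Highest score first; each item keeps its original position in `index`.
    return sorted(items, key=lambda item: item.score, reverse=True)
```

A smaller `batch_size` changes only how many pairs are processed per chunk; the returned scores keep the order of `responses`.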

Multiple rollouts

Every generation method accepts rollouts. The default is 1, which preserves the usual single-output API. Set rollouts > 1 to get a list of parsed outputs.

answers = model.answer(
    question="What mechanism do transformers use?",
    passage=passage,
    temperature=0.7,
    rollouts=4,
)

for answer in answers:
    print(answer)

num_beams is still available as a decoding strategy. Use rollouts when you want multiple returned outputs; use num_beams when you want beam search.
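
Since the return shape changes with rollouts, calling code that handles both cases may want to normalize the result first. The helper below is a hypothetical sketch, not part of neural-txt; it assumes only what is stated above, namely that rollouts=1 yields a single parsed output and rollouts > 1 yields a list of them.

```python
from typing import Any, List


def as_rollout_list(result: Any, rollouts: int = 1) -> List[Any]:
    """Always return a list with one parsed output per rollout.

    Hypothetical helper, not part of neural-txt. The rollouts count is passed
    explicitly because some single outputs (for example a list of bullets)
    are already lists, so the shape alone cannot tell the two cases apart.
    """
    if rollouts < 1:
        raise ValueError("rollouts must be at least 1")
    if rollouts == 1:
        return [result]      # single-output API: wrap the one result
    return list(result)      # rollouts > 1: already one entry per rollout
```

With this in place, a loop such as `for answer in as_rollout_list(result, rollouts)` works regardless of the rollouts value used for the call.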

JSON mode

Every method supports json=True for guaranteed structured output via outlines:

# Returns a BulletsOutput pydantic model
bullets = model.extract_bullets(passage, json=True)
print(bullets.bullets)  # list[str]

# Returns a QAPairsOutput pydantic model
qa = model.generate_qa_pairs(passage, json=True)
for pair in qa.pairs:
    print(pair.question, pair.answer)

# Returns a TripletsOutput pydantic model
triplets = model.extract_triplets(passage, json=True)
for t in triplets.triplets:
    print(t.subject, t.relation, t.object)
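
The attribute access above implies a shape for each output model. As a rough illustration, the sketch below mirrors TripletsOutput with standard-library dataclasses. The names `TripletSketch`, `TripletsOutputSketch`, and `parse_triplets`, as well as the JSON layout, are assumptions inferred from the attribute names; the library's real classes are pydantic models and may differ.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class TripletSketch:
    # Field names taken from the attributes used in the example above.
    subject: str
    relation: str
    object: str


@dataclass
class TripletsOutputSketch:
    triplets: List[TripletSketch]


def parse_triplets(raw: str) -> TripletsOutputSketch:
    """Parse a JSON string into the sketch types; missing keys raise errors."""
    data = json.loads(raw)
    return TripletsOutputSketch(
        triplets=[TripletSketch(**item) for item in data["triplets"]]
    )


# Illustrative payload written by hand, not real model output.
raw = ('{"triplets": [{"subject": "transformers", '
       '"relation": "use", "object": "self-attention"}]}')
parsed = parse_triplets(raw)
for t in parsed.triplets:
    print(t.subject, t.relation, t.object)
```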

API

Generation API

| Method | Input | Output | JSON Output |
| --- | --- | --- | --- |
| extract_bullets(passage) | passage | list[str] | BulletsOutput |
| generate_qa_pairs(passage) | passage | list[QAPair] | QAPairsOutput |
| generate_question(passage) | passage | str | QuestionOutput |
| generate_questions_list(passage) | passage | list[str] | QuestionsListOutput |
| extract_fact(passage) | passage | str | FactOutput |
| answer(question, passage) | question + passage | str | AnswerOutput |
| rephrase(passage) | passage | str | RephraseOutput |
| continue_from(passage) | passage start | str | ContinuationOutput |
| extract_triplets(passage) | passage | list[Triplet] | TripletsOutput |
| compare(passage_a, passage_b) | two passages | str | ComparisonOutput |
| find_relevant(question, passages) | question + passage list | RetrievalResult | RetrievalOutput |

Reward API

| Method | Input | Output |
| --- | --- | --- |
| score(response, reference) | one response + reference answer | float |
| batch_score(responses, reference, batch_size=64) | response list + one reference or paired references | list[float] |
| rank(responses, reference) | response list + one reference or paired references | list[RankedResponse] |

NeuralTxtReward accepts backend="hf" or backend="mlx".

Models

| Interface | Default model |
| --- | --- |
| NeuralTxt(backend="hf") | paperbd/neuraltxt-v1-135M |
| NeuralTxt(backend="mlx") | paperbd/neuraltxt-v1-135M-mlx |
| NeuralTxtReward(backend="hf") | paperbd/neuraltxt-reward-tiny |
| NeuralTxtReward(backend="mlx") | paperbd/neuraltxt-reward-tiny-mlx |

Pass a custom path: NeuralTxt("path/to/model", backend="hf")
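
The relationship between interface, backend, and default model can be summarized as a lookup with an optional override. This is only a sketch of that mapping, not how the library resolves models internally; `DEFAULT_MODELS` and `resolve_model` are hypothetical names, and the model identifiers are copied from the list above.

```python
from typing import Optional

# Defaults copied from the Models section above.
DEFAULT_MODELS = {
    ("NeuralTxt", "hf"): "paperbd/neuraltxt-v1-135M",
    ("NeuralTxt", "mlx"): "paperbd/neuraltxt-v1-135M-mlx",
    ("NeuralTxtReward", "hf"): "paperbd/neuraltxt-reward-tiny",
    ("NeuralTxtReward", "mlx"): "paperbd/neuraltxt-reward-tiny-mlx",
}


def resolve_model(interface: str, backend: str,
                  model: Optional[str] = None) -> str:
    """Return the custom path if given, else the default for the pair."""
    if model is not None:
        return model  # an explicit path or model id always wins
    try:
        return DEFAULT_MODELS[(interface, backend)]
    except KeyError:
        raise ValueError(
            f"no default model for {interface} with backend {backend!r}"
        ) from None
```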

Gradio demo

pip install "neural-txt[app]"

# HuggingFace (default)
python app.py

# MLX (Apple Silicon)
python app.py --mlx

# Options
#   --temperature 0.4    sampling temperature (default 0.4)
#   --num-beams 2        beam candidates, 1-4 (default 1)
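
For readers who want to wire the same options into their own script, the flags listed above map naturally onto argparse. This is a minimal sketch based only on the option names, ranges, and defaults shown; it is not the project's actual app.py, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Gradio demo options (sketch)")
    # HuggingFace is the default backend; --mlx switches to MLX.
    parser.add_argument("--mlx", action="store_true",
                        help="use the MLX backend instead of HuggingFace")
    parser.add_argument("--temperature", type=float, default=0.4,
                        help="sampling temperature")
    # Beam candidates are restricted to the documented 1-4 range.
    parser.add_argument("--num-beams", type=int, default=1,
                        choices=range(1, 5), help="beam candidates, 1-4")
    return parser


args = build_parser().parse_args(["--mlx", "--num-beams", "2"])
backend = "mlx" if args.mlx else "hf"
print(backend, args.temperature, args.num_beams)
```

Using `choices` makes argparse reject values outside 1-4 at parse time rather than failing later during generation.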

