neural-txt

Structured NLP tasks powered by a fine-tuned 135M-parameter language model. Extract bullets, generate Q&A pairs, build knowledge graphs, and more, all running locally. The model is deliberately narrow and task-specific, so it runs cheaply in resource-constrained environments.

https://github.com/user-attachments/assets/04774af0-dc51-42e7-b2a6-d6f50bf4e258

Support

If you find this helpful, consider supporting on Patreon — it hosts all code, projects, slides, and write-ups from the YouTube channel.

Become a Patron!

Install

# Base (no inference backend)
pip install neural-txt

# With HuggingFace backend (torch)
pip install neural-txt[hf]

# With MLX backend (Apple Silicon)
pip install neural-txt[mlx]

NeuralTxtReward works with either backend: install neural-txt[hf] for the Hugging Face / torch scorer, or neural-txt[mlx] for Apple Silicon MLX.
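Since the base install ships without an inference backend, it can help to check which one is available before constructing a model. The following is a minimal sketch, not part of neural-txt: `pick_backend` is a hypothetical helper, and it assumes the `[hf]` extra provides the `torch` package and the `[mlx]` extra provides a top-level `mlx` package.

```python
from importlib.util import find_spec
from typing import Optional


def pick_backend() -> Optional[str]:
    """Return "mlx" or "hf" depending on which backend package is importable.

    Hypothetical helper. Assumes the [mlx] extra installs a top-level `mlx`
    package and the [hf] extra installs `torch`.
    """
    if find_spec("mlx") is not None:
        return "mlx"
    if find_spec("torch") is not None:
        return "hf"
    return None


backend = pick_backend()
if backend is None:
    print("No backend found; install neural-txt[hf] or neural-txt[mlx].")
else:
    print(f"Using backend: {backend}")
```

The returned string can then be passed as the `backend` argument shown in the sections below.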

Quick start

from neuraltxt import NeuralTxt

model = NeuralTxt(backend="mlx")  # or backend="hf"

passage = """
Transformers have revolutionized NLP by introducing the self-attention
mechanism. Unlike RNNs, transformers process all tokens in parallel,
leading to significant training speedups.
"""

# Extract key points
bullets = model.extract_bullets(passage)

# Generate question-answer pairs
pairs = model.generate_qa_pairs(passage)

# Extract knowledge graph triplets
triplets = model.extract_triplets(passage)

Reward scoring

NeuralTxtReward scores generated responses against a reference answer with paperbd/neuraltxt-reward-22M. Use it to score one answer, score a batch, or rank candidate responses.

from neuraltxt import NeuralTxtReward

reward = NeuralTxtReward(backend="mlx")  # or backend="hf"

reference = "The capital of France is Paris."
responses = [
    "Paris is the capital of France.",
    "France's capital is Lyon.",
]

score = reward.score(responses[0], reference)          # float between 0 and 1
scores = reward.batch_score(responses, reference)      # list[float]
ranked = reward.rank(responses, reference)             # list[RankedResponse]

for item in ranked:
    print(item.index, item.score, item.response)

rank() preserves the original response index and sorts highest score first. Pass a local model directory with NeuralTxtReward("path/to/reward-model").
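The ordering contract described above (keep each response's original index, sort by score descending) can be sketched in plain Python. This is an illustration only, not the library's implementation: `RankedItem` is a hypothetical stand-in for `RankedResponse`, and `token_overlap` is a toy scorer used in place of the reward model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RankedItem:
    """Hypothetical stand-in for RankedResponse (index, score, response)."""
    index: int
    score: float
    response: str


def rank_responses(
    responses: List[str],
    reference: str,
    score_fn: Callable[[str, str], float],
) -> List[RankedItem]:
    """Score every response, remember its original position, sort best first."""
    items = [
        RankedItem(index=i, score=score_fn(r, reference), response=r)
        for i, r in enumerate(responses)
    ]
    # sorted() is stable, so tied scores keep their original order.
    return sorted(items, key=lambda item: item.score, reverse=True)


def token_overlap(response: str, reference: str) -> float:
    """Toy scorer: Jaccard overlap of lowercase whitespace tokens, in [0, 1]."""
    a = set(response.lower().split())
    b = set(reference.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


reference = "The capital of France is Paris."
responses = [
    "France's capital is Lyon.",
    "Paris is the capital of France.",
]
ranked = rank_responses(responses, reference, token_overlap)
for item in ranked:
    print(item.index, item.score, item.response)
```

Here the second response ranks first while still reporting index 1, which is what lets a caller map a ranked item back to its position in the input list.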

Beam candidates

Generation methods accept num_beams, which defaults to 1. The public methods still return a single parsed result: the first (highest-ranked) candidate. With the HuggingFace backend, num_beams is forwarded as beam search with num_return_sequences=num_beams. With the MLX backend, candidates are produced by the existing repeated-generation path.

bullets = model.extract_bullets(passage, num_beams=4)

See examples/beam_candidates.py for a complete example, including how to inspect all raw beam candidates.
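For the HuggingFace path, the forwarding described above amounts to passing two keyword arguments that the transformers `generate()` method accepts. A minimal sketch of that mapping follows; `beam_generate_kwargs` is a hypothetical helper name, and how neural-txt builds these arguments internally is an assumption.

```python
def beam_generate_kwargs(num_beams: int = 1) -> dict:
    """Build keyword arguments for a transformers-style generate() call.

    Hypothetical helper: it requests as many returned sequences as beams,
    matching the num_return_sequences=num_beams behavior described above.
    """
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    return {"num_beams": num_beams, "num_return_sequences": num_beams}


kwargs = beam_generate_kwargs(4)
print(kwargs)
# A caller would then do something like:
#   outputs = hf_model.generate(**inputs, **kwargs)
# where hf_model and inputs are placeholders for a loaded model and tokenized input.
```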

JSON mode

Every method supports json=True for guaranteed structured output via outlines:

# Returns a BulletsOutput pydantic model
bullets = model.extract_bullets(passage, json=True)
print(bullets.bullets)  # list[str]

# Returns a QAPairsOutput pydantic model
qa = model.generate_qa_pairs(passage, json=True)
for pair in qa.pairs:
    print(pair.question, pair.answer)

# Returns a TripletsOutput pydantic model
triplets = model.extract_triplets(passage, json=True)
for t in triplets.triplets:
    print(t.subject, t.relation, t.object)
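Triplets with `subject`, `relation`, and `object` fields map naturally onto a small adjacency structure. The sketch below shows one way to do that, under stated assumptions: the `Triplet` dataclass is a stand-in for the library's type, `build_graph` is a hypothetical helper, and the sample triplets are hand-written for illustration rather than real model output.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Triplet:
    """Stand-in for the library's triplet type (same three field names)."""
    subject: str
    relation: str
    object: str


def build_graph(triplets: Iterable[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (relation, object) edges under each subject node."""
    graph = defaultdict(list)
    for t in triplets:
        graph[t.subject].append((t.relation, t.object))
    return dict(graph)


# Hand-written sample data, not actual model output.
sample = [
    Triplet("transformers", "introduce", "self-attention"),
    Triplet("transformers", "process", "tokens in parallel"),
    Triplet("parallel processing", "leads to", "training speedups"),
]

graph = build_graph(sample)
for subject, edges in graph.items():
    for relation, obj in edges:
        print(f"{subject} --{relation}--> {obj}")
```

In practice the input would be `triplets.triplets` from the JSON-mode call above, assuming its items expose the same three attributes.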

API

Generation API

| Method | Input | Output | JSON Output |
| --- | --- | --- | --- |
| extract_bullets(passage) | passage | list[str] | BulletsOutput |
| generate_qa_pairs(passage) | passage | list[QAPair] | QAPairsOutput |
| generate_question(passage) | passage | str | QuestionOutput |
| generate_questions_list(passage) | passage | list[str] | QuestionsListOutput |
| extract_fact(passage) | passage | str | FactOutput |
| answer(question, passage) | question + passage | str | AnswerOutput |
| rephrase(passage) | passage | str | RephraseOutput |
| continue_from(passage) | passage start | str | ContinuationOutput |
| extract_triplets(passage) | passage | list[Triplet] | TripletsOutput |
| compare(passage_a, passage_b) | two passages | str | ComparisonOutput |
| find_relevant(question, passages) | question + passage list | RetrievalResult | RetrievalOutput |

Reward API

| Method | Input | Output |
| --- | --- | --- |
| score(response, reference) | one response + reference answer | float |
| batch_score(responses, reference) | response list + reference answer | list[float] |
| rank(responses, reference) | response list + reference answer | list[RankedResponse] |

NeuralTxtReward accepts backend="hf" or backend="mlx".

Models

| Interface | Default model |
| --- | --- |
| NeuralTxt(backend="hf") | paperbd/neuraltxt-v1-135M |
| NeuralTxt(backend="mlx") | paperbd/neuraltxt-v1-135M-mlx |
| NeuralTxtReward(backend="hf") | paperbd/neuraltxt-reward-22M |
| NeuralTxtReward(backend="mlx") | paperbd/neuraltxt-reward-22M-mlx |

Pass a custom path: NeuralTxt("path/to/model", backend="hf")
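The defaults above, together with the custom-path override, can be summarized as a small lookup. This is a sketch for reference only: `DEFAULT_MODELS` and `resolve_model` are hypothetical names, and the library's actual resolution logic may differ.

```python
from typing import Optional

# Default model IDs as listed in the Models table.
DEFAULT_MODELS = {
    ("NeuralTxt", "hf"): "paperbd/neuraltxt-v1-135M",
    ("NeuralTxt", "mlx"): "paperbd/neuraltxt-v1-135M-mlx",
    ("NeuralTxtReward", "hf"): "paperbd/neuraltxt-reward-22M",
    ("NeuralTxtReward", "mlx"): "paperbd/neuraltxt-reward-22M-mlx",
}


def resolve_model(interface: str, backend: str, path: Optional[str] = None) -> str:
    """Return the custom path if given, else the default for this pairing."""
    if path is not None:
        return path
    try:
        return DEFAULT_MODELS[(interface, backend)]
    except KeyError:
        raise ValueError(
            f"no default model for {interface!r} with backend {backend!r}"
        ) from None


print(resolve_model("NeuralTxt", "mlx"))
print(resolve_model("NeuralTxtReward", "hf", path="path/to/reward-model"))
```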

Gradio demo

pip install neural-txt[app]

# HuggingFace (default)
python app.py

# MLX (Apple Silicon)
python app.py --mlx

# Options
#   --temperature 0.4    sampling temperature (default 0.4)
#   --num-beams 2        beam candidates, 1-4 (default 1)
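For readers wiring similar flags into their own script, the options listed above could be declared with argparse roughly as follows. This is a hypothetical mirror of the documented flags and defaults, not the actual contents of app.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the demo flags documented above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="neural-txt demo flags (sketch)")
    parser.add_argument(
        "--mlx",
        action="store_true",
        help="use the MLX backend instead of the HuggingFace default",
    )
    parser.add_argument(
        "--temperature",
        type=float,
        default=0.4,
        help="sampling temperature",
    )
    parser.add_argument(
        "--num-beams",
        type=int,
        choices=range(1, 5),  # allowed values: 1 through 4
        default=1,
        help="number of beam candidates",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--mlx", "--num-beams", "2"])
backend = "mlx" if args.mlx else "hf"
print(backend, args.temperature, args.num_beams)
```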
