Structured NLP tasks powered by a fine-tuned small language model

neural-txt

Structured NLP tasks powered by a fine-tuned 135M-parameter language model. Extract bullets, generate Q&A pairs, build knowledge graphs, and more, all running locally. The model is deliberately narrow: it targets a fixed set of text tasks so it can run cheaply in resource-constrained environments.

https://github.com/user-attachments/assets/04774af0-dc51-42e7-b2a6-d6f50bf4e258

Support

If you find this helpful, consider supporting on Patreon — it hosts all code, projects, slides, and write-ups from the YouTube channel.

Become a Patron!

Install

# Base (no inference backend)
pip install neural-txt

# With HuggingFace backend (torch)
# (extras are quoted so shells such as zsh do not expand the brackets)
pip install "neural-txt[hf]"

# With MLX backend (Apple Silicon)
pip install "neural-txt[mlx]"

NeuralTxtReward works with either backend: install neural-txt[hf] for the Hugging Face / torch scorer, or neural-txt[mlx] for Apple Silicon MLX.
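
If a script has to run on machines with different extras installed, the backend can be chosen at runtime. The sketch below is not part of neural-txt: `pick_backend` is a hypothetical helper, and it assumes the `hf` extra makes `torch` importable and the `mlx` extra makes `mlx` importable.

```python
import importlib.util
import platform
from typing import Optional


def pick_backend() -> Optional[str]:
    """Return "mlx", "hf", or None depending on what is importable.

    Hypothetical helper; the module names checked here are assumptions
    about what the neural-txt extras install.
    """
    on_apple_silicon = (
        platform.system() == "Darwin" and platform.machine() == "arm64"
    )
    # Prefer MLX only on Apple Silicon, and only if the package is present.
    if on_apple_silicon and importlib.util.find_spec("mlx") is not None:
        return "mlx"
    # Otherwise fall back to the torch-based Hugging Face backend.
    if importlib.util.find_spec("torch") is not None:
        return "hf"
    # Neither backend is installed (base install only).
    return None
```

The returned string can then be passed as `backend=` to `NeuralTxt` or `NeuralTxtReward`; a `None` result means only the base package is installed.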

Quick start

from neuraltxt import NeuralTxt

model = NeuralTxt(backend="mlx")  # or backend="hf"

passage = """
Transformers have revolutionized NLP by introducing the self-attention
mechanism. Unlike RNNs, transformers process all tokens in parallel,
leading to significant training speedups.
"""

# Extract key points
bullets = model.extract_bullets(passage)

# Generate question-answer pairs
pairs = model.generate_qa_pairs(passage)

# Extract knowledge graph triplets
triplets = model.extract_triplets(passage)

Reward scoring

NeuralTxtReward scores generated responses against a reference answer with paperbd/neuraltxt-reward-tiny. Use it to score one answer, score a batch, or rank candidate responses.

from neuraltxt import NeuralTxtReward

reward = NeuralTxtReward(backend="mlx")  # or backend="hf"

reference = "The capital of France is Paris."
responses = [
    "Paris is the capital of France.",
    "France's capital is Lyon.",
]
references = [
    "The capital of France is Paris.",
    "The capital of France is Paris.",
]

score = reward.score(responses[0], reference)          # float between 0 and 1
scores = reward.batch_score(responses, reference)      # list[float], batch_size=64
paired_scores = reward.batch_score(responses, references)
ranked = reward.rank(responses, reference)             # list[RankedResponse]

for item in ranked:
    print(item.index, item.score, item.response)

batch_score() scores responses in chunks of 64 by default; pass batch_size= to tune memory use. To score corresponding (response, reference) pairs, pass a list of references whose length matches responses. rank() sorts by highest score first and keeps each response's original index. To load a reward model from a local directory, pass its path: NeuralTxtReward("path/to/reward-model").
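
The chunking, pairing, and ranking behaviour described above can be sketched in plain Python. This is an illustration, not the library's implementation: `toy_score_batch` is a made-up word-overlap scorer standing in for the reward model, and `Ranked` only mirrors the index, score, and response fields printed in the example.

```python
from dataclasses import dataclass


@dataclass
class Ranked:
    index: int      # position in the original responses list
    score: float
    response: str


def toy_score_batch(pairs):
    """Stand-in scorer: fraction of reference words present in the response."""
    scores = []
    for response, reference in pairs:
        ref_words = set(reference.lower().split())
        resp_words = set(response.lower().split())
        overlap = len(ref_words & resp_words)
        scores.append(overlap / len(ref_words) if ref_words else 0.0)
    return scores


def batch_score(responses, reference, batch_size=64, score_batch=toy_score_batch):
    # A single reference string is broadcast to every response.
    if isinstance(reference, str):
        references = [reference] * len(responses)
    else:
        references = list(reference)
        if len(references) != len(responses):
            raise ValueError("references must have the same length as responses")
    pairs = list(zip(responses, references))
    scores = []
    # Score in chunks so only batch_size pairs are processed at once.
    for start in range(0, len(pairs), batch_size):
        scores.extend(score_batch(pairs[start:start + batch_size]))
    return scores


def rank(responses, reference, **kwargs):
    scores = batch_score(responses, reference, **kwargs)
    items = [Ranked(i, s, r) for i, (s, r) in enumerate(zip(scores, responses))]
    # Highest score first; each item keeps its original index.
    return sorted(items, key=lambda item: item.score, reverse=True)
```

Changing batch_size changes only how many pairs are scored per call, not the scores themselves.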

Multiple rollouts

Every generation method accepts a rollouts argument. The default is 1, which preserves the usual single-output API. Set rollouts > 1 to get a list of parsed outputs, one per rollout.

answers = model.answer(
    question="What mechanism do transformers use?",
    passage=passage,
    temperature=0.7,
    rollouts=4,
)

for answer in answers:
    print(answer)

num_beams is still available as a decoding strategy. Use rollouts when you want multiple returned outputs; use num_beams when you want beam search.
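
Because the return type depends on rollouts (a single output for 1, a list for more), calling code may want to normalize it before picking a winner. The helpers below are hypothetical and not part of neural-txt. They assume a method whose single output is a str, such as answer(); methods that already return a list for one rollout would need different handling.

```python
from typing import Callable, List, Union


def as_list(result: Union[str, List[str]]) -> List[str]:
    """Wrap a single str output so callers can always iterate."""
    return result if isinstance(result, list) else [result]


def best_of(candidates: Union[str, List[str]], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest score (best-of-n selection)."""
    options = as_list(candidates)
    if not options:
        raise ValueError("no candidates to choose from")
    return max(options, key=score)


# Possible use with the reward scorer shown earlier (not executed here):
#   best = best_of(answers, lambda a: reward.score(a, reference))
```

Any callable that maps a string to a number works as the scorer, so the selection logic can be tested without loading a model.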

JSON mode

Every method supports json=True, which returns guaranteed structured output as a pydantic model, generated via the outlines library:

# Returns a BulletsOutput pydantic model
bullets = model.extract_bullets(passage, json=True)
print(bullets.bullets)  # list[str]

# Returns a QAPairsOutput pydantic model
qa = model.generate_qa_pairs(passage, json=True)
for pair in qa.pairs:
    print(pair.question, pair.answer)

# Returns a TripletsOutput pydantic model
triplets = model.extract_triplets(passage, json=True)
for t in triplets.triplets:
    print(t.subject, t.relation, t.object)
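
Since each triplet exposes subject, relation, and object, the structured output can be folded into a small knowledge graph. The sketch below uses a stand-in `Triplet` dataclass that mirrors only those three fields (the real class is a pydantic model), and the sample triplets are hand-written, not model output.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Triplet:
    """Stand-in with the same field names used in the example above."""
    subject: str
    relation: str
    object: str


def build_graph(triplets: Iterable[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Group outgoing (relation, object) edges under each subject."""
    graph = defaultdict(list)
    for t in triplets:
        graph[t.subject].append((t.relation, t.object))
    return dict(graph)


# Hand-written sample data for illustration only.
sample = [
    Triplet("transformers", "use", "self-attention"),
    Triplet("transformers", "process", "tokens in parallel"),
    Triplet("RNNs", "process", "tokens sequentially"),
]
graph = build_graph(sample)
```

With real output, `build_graph(triplets.triplets)` should work the same way, since the function only reads the three attributes.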

API

Generation API

| Method | Input | Output | JSON Output |
| --- | --- | --- | --- |
| extract_bullets(passage) | passage | list[str] | BulletsOutput |
| generate_qa_pairs(passage) | passage | list[QAPair] | QAPairsOutput |
| generate_question(passage) | passage | str | QuestionOutput |
| generate_questions_list(passage) | passage | list[str] | QuestionsListOutput |
| extract_fact(passage) | passage | str | FactOutput |
| answer(question, passage) | question + passage | str | AnswerOutput |
| rephrase(passage) | passage | str | RephraseOutput |
| continue_from(passage) | passage start | str | ContinuationOutput |
| extract_triplets(passage) | passage | list[Triplet] | TripletsOutput |
| compare(passage_a, passage_b) | two passages | str | ComparisonOutput |
| find_relevant(question, passages) | question + passage list | RetrievalResult | RetrievalOutput |

Reward API

| Method | Input | Output |
| --- | --- | --- |
| score(response, reference) | one response + reference answer | float |
| batch_score(responses, reference, batch_size=64) | response list + one reference or paired references | list[float] |
| rank(responses, reference) | response list + one reference or paired references | list[RankedResponse] |

NeuralTxtReward accepts backend="hf" or backend="mlx".

Models

| Interface | Default model |
| --- | --- |
| NeuralTxt(backend="hf") | paperbd/neuraltxt-v1-135M |
| NeuralTxt(backend="mlx") | paperbd/neuraltxt-v1-135M-mlx |
| NeuralTxtReward(backend="hf") | paperbd/neuraltxt-reward-tiny |
| NeuralTxtReward(backend="mlx") | paperbd/neuraltxt-reward-tiny-mlx |

Pass a custom path: NeuralTxt("path/to/model", backend="hf")
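
For logging or pre-downloading weights, the defaults above can be written as a lookup table. This is only a restatement of the table in code: `DEFAULT_MODELS` and `default_model` are hypothetical names, and the library may resolve its defaults differently.

```python
# (interface, backend) -> default model ID, copied from the table above.
DEFAULT_MODELS = {
    ("NeuralTxt", "hf"): "paperbd/neuraltxt-v1-135M",
    ("NeuralTxt", "mlx"): "paperbd/neuraltxt-v1-135M-mlx",
    ("NeuralTxtReward", "hf"): "paperbd/neuraltxt-reward-tiny",
    ("NeuralTxtReward", "mlx"): "paperbd/neuraltxt-reward-tiny-mlx",
}


def default_model(interface: str, backend: str) -> str:
    """Return the documented default model ID for an interface and backend."""
    try:
        return DEFAULT_MODELS[(interface, backend)]
    except KeyError:
        raise ValueError(
            f"no documented default for {interface!r} with backend {backend!r}"
        ) from None
```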

Gradio demo

pip install "neural-txt[app]"

# HuggingFace (default)
python app.py

# MLX (Apple Silicon)
python app.py --mlx

# Options
#   --temperature 0.4    sampling temperature (default 0.4)
#   --num-beams 2        beam candidates, 1-4 (default 1)
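
The options listed above map naturally onto argparse. The sketch below is a guess at how such flags could be declared, not the contents of app.py; only the flag names, defaults, and the 1-4 beam range come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented demo options."""
    parser = argparse.ArgumentParser(description="Demo options (sketch)")
    # Without --mlx the HuggingFace backend is used, matching the default above.
    parser.add_argument("--mlx", action="store_true",
                        help="use the MLX backend (Apple Silicon)")
    parser.add_argument("--temperature", type=float, default=0.4,
                        help="sampling temperature")
    # Restrict beams to the documented 1-4 range.
    parser.add_argument("--num-beams", type=int, choices=range(1, 5), default=1,
                        help="beam candidates")
    return parser
```

Calling `build_parser().parse_args(["--mlx", "--num-beams", "2"])` yields a namespace with `mlx=True`, `num_beams=2`, and the default temperature.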
