neural-txt

Structured NLP tasks powered by a fine-tuned 135M-parameter language model. Extract bullets, generate Q&A pairs, build knowledge graphs, and more, all running locally. The model is narrow and task-specific, so it runs cheaply in resource-constrained environments.

https://github.com/user-attachments/assets/04774af0-dc51-42e7-b2a6-d6f50bf4e258

Support

If you find this helpful, consider supporting the project on Patreon, which hosts all code, projects, slides, and write-ups from the YouTube channel.

Install

# Base (no inference backend)
pip install neural-txt

# With HuggingFace backend (torch)
# Quotes keep shells such as zsh from treating the brackets as a glob
pip install "neural-txt[hf]"

# With MLX backend (Apple Silicon)
pip install "neural-txt[mlx]"

Quick start

from neuraltxt import NeuralTxt

model = NeuralTxt(backend="mlx")  # or backend="hf"

passage = """
Transformers have revolutionized NLP by introducing the self-attention
mechanism. Unlike RNNs, transformers process all tokens in parallel,
leading to significant training speedups.
"""

# Extract key points
bullets = model.extract_bullets(passage)

# Generate question-answer pairs
pairs = model.generate_qa_pairs(passage)

# Extract knowledge graph triplets
triplets = model.extract_triplets(passage)
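The quick start hard-codes backend="mlx". As a minimal sketch, a script meant to run on several machines could pick the backend from the platform instead. The helper name pick_backend is hypothetical and not part of neural-txt. It assumes that a native Python build on Apple Silicon reports "Darwin" and "arm64" (a Python running under Rosetta would report x86_64 and fall back to "hf").

```python
import platform
from typing import Optional


def pick_backend(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Return "mlx" on Apple Silicon macOS, otherwise "hf".

    Hypothetical helper, not part of neural-txt. The arguments exist only so
    the choice can be exercised without changing machines.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if system == "Darwin" and machine == "arm64":
        return "mlx"
    return "hf"


backend = pick_backend()
print(backend)

# Assumed usage with the constructor shown in the quick start:
# model = NeuralTxt(backend=pick_backend())
```

The matching extra ([mlx] or [hf]) still has to be installed for whichever backend is selected.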

Beam candidates

Generation methods accept a num_beams argument (default 1). The public methods still return a single parsed result: the first, highest-ranked candidate. With the HuggingFace backend, num_beams is forwarded to beam search with num_return_sequences=num_beams. With the MLX backend, candidates are produced by the existing repeated-generation path.

bullets = model.extract_bullets(passage, num_beams=4)

See examples/beam_candidates.py for a complete example, including how to inspect all raw beam candidates.
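The behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: hf_beam_kwargs and top_candidate are hypothetical names, and the candidate strings are hand-written placeholders rather than model output. The two keyword names are the ones the text says are forwarded to the HuggingFace backend.

```python
def hf_beam_kwargs(num_beams: int = 1) -> dict:
    """Build the generation keywords described above (hypothetical helper).

    Every beam is requested back, so num_return_sequences equals num_beams.
    """
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    return {"num_beams": num_beams, "num_return_sequences": num_beams}


def top_candidate(candidates: list) -> str:
    """Return the first candidate, assuming the list is ordered best-first."""
    if not candidates:
        raise ValueError("no candidates were generated")
    return candidates[0]


kwargs = hf_beam_kwargs(4)
print(kwargs)

# Placeholder strings standing in for raw beam outputs, best-ranked first.
raw_candidates = ["candidate A", "candidate B", "candidate C", "candidate D"]
print(top_candidate(raw_candidates))
```

For the actual way to inspect every raw candidate, examples/beam_candidates.py remains the reference.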

JSON mode

Every method accepts json=True, which uses the outlines library to constrain generation so that the output always parses into the expected schema:

# Returns a BulletsOutput pydantic model
bullets = model.extract_bullets(passage, json=True)
print(bullets.bullets)  # list[str]

# Returns a QAPairsOutput pydantic model
qa = model.generate_qa_pairs(passage, json=True)
for pair in qa.pairs:
    print(pair.question, pair.answer)

# Returns a TripletsOutput pydantic model
triplets = model.extract_triplets(passage, json=True)
for t in triplets.triplets:
    print(t.subject, t.relation, t.object)
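Since the triplets expose subject, relation, and object, one way to turn them into a small knowledge graph is an adjacency map keyed by subject. The sketch below is self-contained: the Triplet dataclass is a stand-in that only mirrors the three fields accessed above (it is not the library's pydantic class), build_graph is a hypothetical helper, and the sample triplets are hand-written for illustration rather than model output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Triplet:
    """Stand-in with the same three fields as the triplets printed above."""
    subject: str
    relation: str
    object: str


def build_graph(triplets):
    """Group (relation, object) edges under each subject, keeping input order."""
    graph = defaultdict(list)
    for t in triplets:
        graph[t.subject].append((t.relation, t.object))
    return dict(graph)


# Hand-written sample data, loosely based on the quick-start passage.
sample = [
    Triplet("transformers", "introduced", "self-attention mechanism"),
    Triplet("transformers", "process", "all tokens in parallel"),
    Triplet("parallel processing", "leads to", "training speedups"),
]

graph = build_graph(sample)
for subject, edges in graph.items():
    for relation, obj in edges:
        print(f"{subject} --{relation}--> {obj}")
```

With the real library, the assumption is that triplets.triplets from a json=True call could be passed to build_graph unchanged, because the function only reads those three attributes.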

API

| Method | Input | Output | JSON output |
| --- | --- | --- | --- |
| extract_bullets(passage) | passage | list[str] | BulletsOutput |
| generate_qa_pairs(passage) | passage | list[QAPair] | QAPairsOutput |
| generate_question(passage) | passage | str | QuestionOutput |
| generate_questions_list(passage) | passage | list[str] | QuestionsListOutput |
| extract_fact(passage) | passage | str | FactOutput |
| answer(question, passage) | question + passage | str | AnswerOutput |
| rephrase(passage) | passage | str | RephraseOutput |
| continue_from(passage) | passage start | str | ContinuationOutput |
| extract_triplets(passage) | passage | list[Triplet] | TripletsOutput |
| compare(passage_a, passage_b) | two passages | str | ComparisonOutput |
| find_relevant(question, passages) | question + passage list | RetrievalResult | RetrievalOutput |

Models

| Backend | Default model |
| --- | --- |
| hf | paperbd/smollm_135M_neuraltxt_dpo_v2 |
| mlx | paperbd/smollm_135M_neuraltxt_mlx_dpo_v2 |

To use a different model, pass its path as the first argument: NeuralTxt("path/to/model", backend="hf")
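The default-versus-custom choice can be pictured as a small lookup. The sketch below assumes nothing about how neural-txt resolves models internally. resolve_model and DEFAULT_MODELS are hypothetical names, and only the two model IDs come from the table above.

```python
from typing import Optional

# Default model IDs as listed in the Models table.
DEFAULT_MODELS = {
    "hf": "paperbd/smollm_135M_neuraltxt_dpo_v2",
    "mlx": "paperbd/smollm_135M_neuraltxt_mlx_dpo_v2",
}


def resolve_model(backend: str, model_path: Optional[str] = None) -> str:
    """Return the custom path if one is given, else the backend's default.

    Hypothetical helper for illustration; not part of neural-txt.
    """
    if backend not in DEFAULT_MODELS:
        raise ValueError(f"unknown backend: {backend!r}")
    return model_path or DEFAULT_MODELS[backend]


print(resolve_model("hf"))
print(resolve_model("mlx"))
print(resolve_model("hf", "path/to/model"))
```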

Gradio demo

pip install "neural-txt[app]"

# HuggingFace (default)
python app.py

# MLX (Apple Silicon)
python app.py --mlx

# Options
#   --temperature 0.4    sampling temperature (default 0.4)
#   --num-beams 2        beam candidates, 1-4 (default 1)
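For readers who want to wrap the library in their own script, the flags listed above map naturally onto argparse. This is a sketch of how such a command line could be declared, not the contents of app.py. The function name build_parser and the help strings are assumptions, while the flag names, defaults, and the 1-4 range come from the option list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three documented options (illustrative, not the real app.py)."""
    parser = argparse.ArgumentParser(description="neural-txt demo options (sketch)")
    parser.add_argument("--mlx", action="store_true",
                        help="use the MLX backend instead of HuggingFace")
    parser.add_argument("--temperature", type=float, default=0.4,
                        help="sampling temperature")
    parser.add_argument("--num-beams", type=int, choices=range(1, 5), default=1,
                        help="number of beam candidates, 1 to 4")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--mlx", "--num-beams", "2"])
backend = "mlx" if args.mlx else "hf"
print(backend, args.temperature, args.num_beams)
```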
