# texthumanizer 📝
Offline AI-text humanizer & plagiarism reducer for students and researchers. No internet needed after model download. Preserves research context, citations, and semantic meaning.
## ✨ Features
| Feature | Detail |
|---|---|
| Humanize AI text | Rewrites ChatGPT / Claude / Gemini output to sound natural |
| Plagiarism reduction | Paraphrase-based, not just synonym swap |
| Semantic preservation | Meaning, tone, and argument structure kept intact |
| Research-aware | Citations [1], abbreviations DNA, units 95%, et al. — all preserved |
| DOCX support | Paragraph-level processing; headings untouched |
| 100% offline | T5-based model runs locally after first download (~250 MB) |
| Lightweight | CPU-friendly, no GPU required |
## 📦 Installation

**Full install (recommended):**

```bash
pip install texthumanizer[all]
```

**Minimal (text only, no DOCX):**

```bash
pip install texthumanizer[ml]
```

**With DOCX support:**

```bash
pip install texthumanizer[ml,docx]
```

The first run downloads the T5 model (~250 MB) from HuggingFace once and caches it locally.
## 🚀 Quick Start

### 1. Humanize pasted text

```python
from texthumanizer import TextHumanizer

th = TextHumanizer()

ai_text = """
Artificial intelligence has rapidly transformed numerous sectors of society,
demonstrating unprecedented capabilities in natural language processing,
computer vision, and decision-making systems.
"""

result = th.humanize_text(ai_text)
print(result)
```

### 2a. Humanize a .docx → save a new .docx

```python
from texthumanizer import TextHumanizer

th = TextHumanizer()

# Saves "humanized_my_essay.docx" next to the original
output_path = th.humanize_doc("my_essay.docx", output="doc")
print(f"Saved: {output_path}")

# Custom output path
th.humanize_doc("my_essay.docx", output="doc", output_path="D:/final_essay.docx")
```

### 2b. Humanize a .docx → get plain text back

```python
from texthumanizer import TextHumanizer

th = TextHumanizer()
text = th.humanize_doc("my_essay.docx", output="text")
print(text)
```
## ⚙️ Configuration

```python
th = TextHumanizer(
    diversity=0.7,  # 0.0 = minimal changes, 1.0 = maximum rewriting (default: 0.7)
    device=-1,      # -1 = CPU, 0 = GPU (default: -1)
    verbose=True,   # show progress (default: True)
)
```

| Parameter | Range | Effect |
|---|---|---|
| `diversity=0.3` | Low | Light rewording; very safe for technical papers |
| `diversity=0.7` | Medium | Balanced; good for essays and reports ✅ |
| `diversity=0.9` | High | Heavy rewriting; good for blog posts or general text |
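A knob like `diversity` typically controls the model's sampling parameters (temperature, top-k, top-p). The mapping below is a purely hypothetical illustration of that idea, not texthumanizer's actual internals:

```python
# Hypothetical sketch only: how a 0.0-1.0 "diversity" knob could be mapped
# onto temperature / top-k / top-p sampling. This is NOT texthumanizer's
# actual implementation.

def sampling_params(diversity: float) -> dict:
    """Interpolate between conservative and aggressive sampling settings."""
    d = max(0.0, min(1.0, diversity))  # clamp to [0, 1]
    return {
        "temperature": 0.7 + 0.8 * d,  # 0.7 (safe) .. 1.5 (creative)
        "top_k": round(20 + 100 * d),  # 20 .. 120 candidate tokens
        "top_p": 0.85 + 0.1 * d,       # 0.85 .. 0.95 nucleus mass
    }

print(sampling_params(0.7))
```

The intuition: low diversity keeps sampling close to the model's most likely paraphrase, while high diversity widens the candidate pool and produces heavier rewrites.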
## 🖥️ CLI Usage

### Interactive mode

```bash
python -m texthumanizer.cli
```

### Direct text

```bash
python -m texthumanizer.cli text "Your AI-generated text here" --diversity 0.7
```

### Pipe from a file

```bash
cat essay.txt | python -m texthumanizer.cli text
```

### DOCX → humanized DOCX

```bash
python -m texthumanizer.cli doc essay.docx --output doc
```

### DOCX → print text

```bash
python -m texthumanizer.cli doc essay.docx --output text
```
## 🔬 How It Works

```
Input Text
    │
    ▼
[Mask technical terms]   ← citations, abbreviations, units, years
    │
    ▼
[Split into sentences]   ← smart splitter (handles abbreviations)
    │
    ▼
[T5 paraphrasing model]  ← humarin/chatgpt_paraphraser_on_T5_base
    │                      temperature + top-k + top-p sampling
    ▼
[Restore masked terms]   ← [1], DNA, 2023 put back exactly
    │
    ▼
Output Text
```

**Why T5 and not a GPT-style model?** T5 is an encoder–decoder model trained specifically on paraphrase tasks. It is:

- Much smaller (~250 MB vs. multi-GB GPT models)
- CPU-friendly and fast
- Better at preserving meaning than decoder-only models
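The mask-and-restore step above can be sketched as follows. The regex patterns and placeholder scheme here are illustrative assumptions, not texthumanizer's actual code:

```python
import re

# Sketch of the mask -> paraphrase -> restore strategy. The patterns and
# the "__MASKn__" placeholder format are our own assumptions.

PATTERNS = [
    re.compile(r"\[\d+(?:,\s*\d+)*\]"),  # citations like [1] or [1,2,3]
    re.compile(r"\b[A-Z]{2,}\b"),        # abbreviations like DNA, LSTM
    re.compile(r"\b\d+(?:\.\d+)?%"),     # percentages like 95%
    re.compile(r"\b(19|20)\d{2}\b"),     # years like 2023
]

def mask(text):
    """Replace protected spans with numbered placeholders; return text + table."""
    table = []
    for pat in PATTERNS:
        def _sub(m):
            table.append(m.group(0))
            return f"__MASK{len(table) - 1}__"
        text = pat.sub(_sub, text)
    return text, table

def restore(text, table):
    """Put the original spans back exactly where the placeholders ended up."""
    for i, original in enumerate(table):
        text = text.replace(f"__MASK{i}__", original)
    return text

masked, table = mask("Accuracy exceeded 95% in trials [1,2] using LSTM models.")
# ... run the paraphraser on `masked` here ...
print(restore(masked, table))
```

Because the paraphraser only ever sees opaque placeholders, it cannot reword a citation or unit; restoring the table afterwards puts each protected span back verbatim.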
## 📋 What Gets Preserved

| Type | Example | Preserved? |
|---|---|---|
| Academic citations | `[1]`, `[1,2,3]` | ✅ |
| Author citations | (Smith et al., 2021) | ✅ |
| Abbreviations | DNA, AI, COVID, LSTM | ✅ |
| Years | 2023, 1990 | ✅ |
| Percentages | 95%, 3.5% | ✅ |
| Scientific units | kg, MHz, nm, kcal | ✅ |
| Figure/Table refs | Fig. 3, Table 1 | ✅ |
| DOIs / URLs | doi:10.xxx, https://... | ✅ |
| Latin abbreviations | et al., e.g., i.e. | ✅ |
| Headings (in .docx) | Section titles | ✅ untouched |
## 🧪 Example Output

**Input (AI-generated):**

> The utilization of machine learning algorithms has demonstrated significant efficacy in the domain of medical diagnosis, achieving accuracy rates exceeding 95% in several clinical trials [1,2].

**Output (humanized):**

> Using machine learning methods has shown strong results in medical diagnosis, reaching accuracy levels above 95% in a number of clinical studies [1,2].
## 💡 Tips for Best Results

- **Research papers:** use `diversity=0.4–0.6` to keep technical accuracy
- **Essays / assignments:** use `diversity=0.7` (the default)
- **Blog posts / creative writing:** use `diversity=0.8–0.9`
- **Long papers:** process them section by section for the best control
- **GPU users:** set `device=0` for a ~5× speed improvement
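A simple driver for the section-by-section tip might look like this. The blank-line splitting heuristic and the `humanize_long_paper` helper are our own illustration; only `TextHumanizer` and `humanize_text` come from the library:

```python
def split_sections(text: str) -> list[str]:
    """Split a long document into sections on blank lines (our own heuristic)."""
    return [s.strip() for s in text.split("\n\n") if s.strip()]

def humanize_long_paper(path: str, diversity: float = 0.5) -> str:
    """Humanize a long paper section by section for finer control."""
    from texthumanizer import TextHumanizer  # imported lazily

    th = TextHumanizer(diversity=diversity, verbose=False)
    with open(path, encoding="utf-8") as f:
        sections = split_sections(f.read())
    return "\n\n".join(th.humanize_text(section) for section in sections)
```

Processing smaller chunks lets you spot-check each section's rewrite and rerun just the ones that drift too far from the original.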
## 🔬 How It Works (In-place Replacement)

Unlike other humanizers that strip away formatting, texthumanizer uses an in-place, run-level replacement strategy:

1. It creates a temporary copy of your `.docx`.
2. It identifies text-bearing "runs" within each paragraph.
3. It humanizes the text while skipping runs that contain images or drawings.
4. It injects the new text back into the original XML structure, keeping your layout 100% intact.
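The run-level idea can be sketched with the standard library on a raw WordprocessingML paragraph. This illustrates the concept only; it is not texthumanizer's code, which presumably operates on the full `.docx` package:

```python
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML paragraph XML inside a .docx.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def rewrite_runs(paragraph_xml: str, rewrite) -> str:
    """Apply `rewrite` to each text run, skipping runs that contain drawings."""
    root = ET.fromstring(paragraph_xml)
    for run in root.iter(f"{W}r"):
        if run.find(f"{W}drawing") is not None:
            continue  # leave image/drawing runs untouched
        for t in run.iter(f"{W}t"):
            if t.text:
                t.text = rewrite(t.text)
    return ET.tostring(root, encoding="unicode")

para = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:r><w:t>machine learning works well</w:t></w:r>"
    "<w:r><w:drawing/></w:r>"
    "</w:p>"
)
print(rewrite_runs(para, str.upper))
```

Because only the text nodes of text-bearing runs are touched, run-level formatting (bold, fonts, embedded images) survives the rewrite untouched.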
## 📄 License
MIT License — free for personal and academic use.
## ⚠️ Disclaimer & Ethics
This tool is designed to assist researchers in improving the readability of their own writing. It is NOT intended for academic dishonesty or bypassing plagiarism checks for unoriginal work. Use responsibly and always cite your AI assistance if required by your institution.
## Download files
### Source distribution: texthumanizer-1.1.0.tar.gz

- Size: 14.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e0b7b9e2de2c808be770b98c6dabec7ba9a39b19ace78279b792602efa755410` |
| MD5 | `e028b59da564fe672a372008ecc407d2` |
| BLAKE2b-256 | `0286d2ab3d929522e2d397b3e6fea2cff63e800495724341019bff9dc826e6e7` |
### Built distribution: texthumanizer-1.1.0-py3-none-any.whl

- Size: 13.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | `9569326ca802e8f04086d89188b59adb5df0d027837af61a14c6fead32111644` |
| MD5 | `61628e89355aad8ebb43cf1bdcc4456d` |
| BLAKE2b-256 | `2aa30299bb34c7ba175dc1cb0e32e9d0ee6f9520f352350badf09ad899ef95a0` |