A tool to separate truncated text.

Word Slicer

Split long texts that lack spaces (or rejoin texts that are 'too spaced').


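import wordslicer

# Train a model from a text file.
model = wordslicer.train('train_file')

# Read the text to process.
with open('input_file', 'r') as f:
    text = f.read()

text = wordslicer.separate(model, text) # or wordslicer.join(model, text)

# Write the result.
with open('output_file', 'w') as f:
    f.write(text)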


For an input of:

  • 161029 words to train
  • 1000 lines to separate

The results:

  • Text with 36889 words
  • Time: 1.368 s (real)


>>> wordslicer.separate(model, "Boromirhesitatedforasecond.'Yes,andno,'heansweredslowly.'Yes:Ifoundhimsomewayupthehill,andIspoketohim.IurgedhimtocometoMinasTirithandnottogoeast.Igrewangryandheleftme.Hevanished.Ihaveneverseensuchathinghappenbefore.thoughIhaveheardofitintales.HemusthaveputtheRingon.Icouldnotfindhimagain.Ithoughthewouldreturntoyou.'")

Boromir hesitated for a second. 'Yes, and no,' he answered slowly. 'Yes: I found him some way up the hill, and I spoke to him. I urged him to come to Minas Tirith and not to go east. I grew angry and he left me. He vanished. I have never seen such a thing happen before. though I have heard of it in tales. He must have put the Ring on. I could not find him again. I though the would return to you.'
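
For intuition about what a call like `separate` has to do, here is a minimal, self-contained sketch of one common approach: count word frequencies in a training text, then use dynamic programming to pick the split with the highest total log-probability. This is an assumption for illustration only, not wordslicer's actual implementation; `train_counts` and `segment` are hypothetical names that are not part of the package, and the sketch ignores punctuation and capitalisation.

```python
import math
import re
from collections import Counter


def train_counts(corpus):
    """Count lowercase word frequencies in a training text (hypothetical helper)."""
    return Counter(re.findall(r"\w+", corpus.lower()))


def segment(counts, text, max_len=20):
    """Split unspaced lowercase text into the most probable word sequence."""
    total = sum(counts.values())

    def score(word):
        # Log-probability of a known word; unknown words get a penalty
        # that grows with length, so long unknown chunks are discouraged.
        if word in counts:
            return math.log(counts[word] / total)
        return math.log(1.0 / (total * 10 ** len(word)))

    # best[i] = (score of the best split of text[:i], start index of its last word)
    best = [(0.0, 0)]
    for i in range(1, len(text) + 1):
        best.append(max(
            (best[j][0] + score(text[j:i]), j)
            for j in range(max(0, i - max_len), i)
        ))

    # Walk back through the stored start indices to recover the words.
    words = []
    i = len(text)
    while i > 0:
        j = best[i][1]
        words.append(text[j:i])
        i = j
    return list(reversed(words))


counts = train_counts("he left me and he vanished and i could not find him again")
print(" ".join(segment(counts, "heleftmeandhevanished")))
# prints: he left me and he vanished
```

With a tiny training text like this one, any word missing from the counts is split poorly, which is why the size and domain of the training file matter.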

How to Install

pip3 install wordslicer


  • Train your model: because the model is trained on text you supply, the package can work with any language.

  • Evaluate your model: check whether your training text is good enough for your input text.
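
One simple way to judge whether a training text suits a given input is vocabulary coverage: the share of words in a correctly spaced sample from the input's domain that also occur in the training text. The sketch below is a hypothetical helper written for illustration; it is not wordslicer's own evaluation function, whose name and signature are not shown in this description.

```python
import re


def tokenize(text):
    """Lowercase a text and return its words."""
    return re.findall(r"\w+", text.lower())


def coverage(training_text, sample_text):
    """Fraction of sample words that also appear in the training text."""
    vocabulary = set(tokenize(training_text))
    sample = tokenize(sample_text)
    if not sample:
        return 0.0
    known = sum(1 for word in sample if word in vocabulary)
    return known / len(sample)


training = "he left me and he vanished"
sample = "he vanished and returned"
print(f"coverage: {coverage(training, sample):.2f}")
# prints: coverage: 0.75
```

A low value suggests the training text is missing much of the vocabulary the input is likely to contain.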



This project was inspired by Generic Human on . Thank you!

Files for wordslicer, version 0.1.0

  • wordslicer-0.1.0-py3-none-any.whl (4.3 kB): wheel, Python version py3
  • wordslicer-0.1.0.tar.gz (3.3 kB): source
