A text summarizer

Summarizer [![Build Status](https://travis-ci.org/michigan-com/summarizer.svg)](https://travis-ci.org/michigan-com/summarizer)
==========

Summarizer is a Python package for automatic text summarization: given an article's title and text, it returns a small number of summary sentences.

Requirements
------------

* Python 2.7, 3.3, or 3.4
* NLTK

Install it
----------

```
pip install summarizer
```

Use it
------

```
from summarizer import summarize

# title and text are strings: the article's headline and body
sentences = summarize(title, text)
```

Documentation
-------------

Summarizer.summarize(title, text, count=5, summarizer=Summarizer())

* title: The title of the article
* text: The body text of the article
* count: The number of summary sentences to return (defaults to 5)
* summarizer: The class instance that will do all the work
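
To make the parameters above concrete, here is a toy stand-in that uses only the standard library and has the same `(title, text, count)` shape. It is not the package's algorithm: the name `toy_summarize`, the regex sentence splitter, and the title-overlap scoring are all assumptions made purely for illustration.

```python
import re


def toy_summarize(title, text, count=5):
    """Return up to `count` sentences, ranked by word overlap with the title.

    Hypothetical illustration only; this is NOT how the summarizer package works.
    """
    # Naive split after ., ! or ? followed by whitespace (no abbreviation handling).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    title_words = set(re.findall(r"\w+", title.lower()))

    def score(sentence):
        # Number of distinct words the sentence shares with the title.
        return len(title_words & set(re.findall(r"\w+", sentence.lower())))

    # Rank sentence indices by score (the stable sort keeps earlier sentences
    # first on ties), then restore document order for the selected ones.
    ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
    return [sentences[i] for i in sorted(ranked[:count])]


title = "City council approves new bike lanes"
text = (
    "The weather was mild on Tuesday. "
    "The city council voted to approve new bike lanes downtown. "
    "Residents packed the meeting. "
    "Construction of the bike lanes begins in spring."
)
summary = toy_summarize(title, text, count=2)
```

With `count=2`, the sketch keeps the two sentences that share the most words with the title, in their original order. If `count` exceeds the number of sentences, it simply returns all of them.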

The sanitizer module preprocesses the body text, removing common oddities before it is summarized.

Sanitizer.sanitize(text)
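
This README does not list which oddities are removed. As a rough idea of what this kind of preprocessing can look like, the sketch below normalizes curly quotes and collapses whitespace. The name `toy_sanitize` and its rules are hypothetical, not the package's actual sanitizer.

```python
import re

# Hypothetical replacement table; the package's real rules are not documented here.
QUOTES = {
    "\u201c": '"', "\u201d": '"',  # curly double quotes
    "\u2018": "'", "\u2019": "'",  # curly single quotes
}


def toy_sanitize(text):
    """Normalize curly quotes and collapse runs of whitespace."""
    for fancy, plain in QUOTES.items():
        text = text.replace(fancy, plain)
    # Collapse newlines, tabs, non-breaking spaces and repeated spaces.
    return re.sub(r"\s+", " ", text).strip()


cleaned = toy_sanitize("  \u201cHello,\u201d   she said.\n\nIt\u2019s\u00a0late. ")
```

Running a cleanup pass like this before sentence splitting keeps stray whitespace and typographic quotes from confusing the tokenizer.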

Contributing
------------

All contributions must be accompanied by unit tests.




CHANGES
=======

v0.0.6
------

* [FIX] Sgt, Gov, and No abbreviations accounted for

v0.0.5 10-01-2015
-----------------

* [FIX] Python 2.7 Unicode errors
* Updated PunktLanguageVars to detect special quotation marks

v0.0.4 10-01-2015
-----------------

* [FIX] PyPI didn't pick up the new training data in JSON format

v0.0.3 10-01-2015
-----------------

* [FIX] Tokenizer would think a custom abbreviation, F.B.I., was a sentence break
* [FIX] Added sanitizer module to preprocess text before summarizing it
* [TESTS] Added ability to pull valid tokenized articles from brevity.detroitnow.io and test that the new tokenizer is still valid

v0.0.2 08-26-2015
-----------------

* PyPI not picking up data files

v0.0.1 08-26-2015
-----------------

* Added setup.py
* Added \_\_version\_\_
* Added unit tests
