Custom scientific/research article summarization library based on statistical features

Project description

Summarizing research articles is harder than producing general-purpose summaries because scientific articles have a distinct semantic structure. Two factors contribute: the presence of inline citations, and the bias of summarization modules toward text features that work well on less technical text but fail to produce coherent summaries in this domain.

This library works around these challenges to produce more coherent, human-understandable summaries of manuscripts and research text.

Installation

You can easily install the package using the pip command:

  pip install articlesumm

Usage

The package takes a string as input (specify a path/directory for an article, or alternatively pass a string as a variable). Sentences and words are tokenized with the first function, purge:

  parse=purge(text)
  type(parse)  # tuple

Alternatively, you can tokenize the sentences and words with any other technique and pass the processed text to the summarization model.
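
As a minimal sketch of that alternative, the snippet below tokenizes text with the standard library only. The helper name simple_tokenize is hypothetical and not part of the package. The tuple order (words first, sentences second) is inferred from the example call in this README. The README does not document the exact format purge produces (for instance, whether words are lowercased or filtered), so treat the output shape as an assumption.

```python
import re


def simple_tokenize(text):
    """Return (words, sentences); hypothetical stand-in for purge, not the library's code."""
    # Split into sentences at whitespace that follows '.', '!' or '?'
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Lowercased alphanumeric word tokens, keeping internal hyphens and apostrophes
    words = re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())
    return words, sentences


words, sentences = simple_tokenize("Graphs rank sentences. Edges count shared words!")
print(sentences)
print(words)
```

If summarizer accepts plain Python lists (an assumption), the result could then be passed as summarizer(text, words=words, sentence_list=sentences, summary_length=3).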

Example

  text='''TextRank is a graph-based ranking model for text processing which can be used in order to find the most relevant sentences in text and also to find keywords. The algorithm is explained in detail in the paper at https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf . In order to find the most relevant sentences in text, a graph is constructed where the vertices of the graph represent each sentence in a document and the edges between sentences are based on content overlap, namely by calculating the number of words that 2 sentences have in common.'''

  from ArticleSumm import purge
  from ArticleSumm import summarizer

  # purge returns a tuple: index 0 is passed as words, index 1 as sentence_list
  parse=purge(text)

  # Positional form: summary=summarizer(text, parse[0], parse[1], summary_length=3)
  summary=summarizer(text, words=parse[0], sentence_list=parse[1], summary_length=3)
  print(summary)
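
This README does not say which statistical features the library scores. To give a sense of what statistical extractive summarization can look like, here is a minimal sketch that ranks sentences by average word frequency, one common feature. It is not ArticleSumm's implementation. The function name frequency_summary is hypothetical, and its summary_length parameter only mirrors the name used in the example above.

```python
import re
from collections import Counter


def frequency_summary(text, summary_length=3):
    """Illustrative only: keep the sentences with the highest average word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Document-wide word counts act as the single statistical feature here
    freq = Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z0-9]+", sentence.lower())
        # Average rather than sum, so long sentences are not favoured automatically
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, keep the top ones, then restore document order
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    top = sorted(ranked[:summary_length])
    return " ".join(sentences[i] for i in top)


print(frequency_summary("Cats sleep. Cats like cats. Dogs bark.", summary_length=2))
```

A real system for research text would likely combine several features (sentence position, title overlap, citation handling) instead of relying on frequency alone.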


Download files

Source Distribution: articlesumm-0.0.2.tar.gz (4.3 kB)

Built Distribution: articlesumm-0.0.2-py3-none-any.whl (4.1 kB, Python 3)
