Corpus library

For Developers

You can also see the Cython, Java, C++, Swift, Js, or C# repository.

Requirements

Python

To check if you have a compatible version of Python installed, use the following command:

python -V

You can find the latest version of Python here.

Git

Install the latest version of Git.

Pip Install

pip3 install NlpToolkit-Corpus
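
To confirm that the install succeeded, the installed distribution can be queried from Python. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of the library.

```python
from importlib import metadata


def installed_version(distribution="NlpToolkit-Corpus"):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # pip has not installed this distribution in the current environment
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "NlpToolkit-Corpus is not installed")
```

A `None` result usually means that pip3 installed the package into a different interpreter than the one running the script.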

Download Code

To work on the code, create a fork from the GitHub page. Clone the code to your local machine with Git, or use the line below on Ubuntu:

git clone <your-fork-git-link>

A directory called Corpus will be created. Alternatively, you can use the link below to explore the code:

git clone https://github.com/olcaytaner/Corpus-Py.git

Open project with PyCharm IDE

Steps for opening the cloned project:

  • Start the IDE
  • Select File | Open from the main menu
  • Choose the Corpus-Py directory
  • Select the "Open as project" option
  • After a couple of seconds, the dependencies will be downloaded.

Detailed Description

Corpus

To store a corpus in memory

a = Corpus("derlem.txt")

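If the corpus file is delimited with dots but its lines are not yet split into sentences, pass a sentence splitter as the second argument of the constructor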

Corpus(self, fileName=None, splitterOrChecker=None)

The number of sentences in the corpus

sentenceCount(self) -> int

To get the i-th sentence in the corpus

getSentence(self, index: int) -> Sentence
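
The two accessors above can be combined to walk a corpus sentence by sentence. The following is a minimal sketch: `iter_sentences` and `print_corpus` are hypothetical helpers, and the import path `Corpus.Corpus` is an assumption about the package layout that is not stated in this description.

```python
def iter_sentences(corpus):
    """Yield every sentence of a corpus-like object, in order.

    Uses only the two documented methods: sentenceCount() and getSentence(index).
    """
    for index in range(corpus.sentenceCount()):
        yield corpus.getSentence(index)


def print_corpus(file_name):
    """Load a corpus file and print its sentences one per line."""
    # Assumed import path; requires NlpToolkit-Corpus to be installed.
    from Corpus.Corpus import Corpus

    corpus = Corpus(file_name)
    for sentence in iter_sentences(corpus):
        print(sentence)
```

Calling `print_corpus("derlem.txt")` would load the file used in the example above, provided that file exists in the working directory.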

TurkishSplitter

The TurkishSplitter class is used to split text into sentences in accordance with the rules of Turkish.

split(self, line: str) -> list
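
As a rough illustration of how the splitter fits together with the Corpus constructor, the sketch below applies `split` to each line of a text and, separately, passes a splitter to `Corpus`. The helper names `split_lines` and `load_with_splitter` are hypothetical, and the import paths `Corpus.Corpus` and `Corpus.TurkishSplitter` are assumptions about the package layout.

```python
def split_lines(splitter, text):
    """Split each non-empty line of text into sentences.

    `splitter` is any object exposing the documented split(line) -> list method.
    """
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            sentences.extend(splitter.split(line))
    return sentences


def load_with_splitter(file_name):
    """Load a corpus file whose lines are not yet one sentence each."""
    # Assumed import paths; both require NlpToolkit-Corpus to be installed.
    from Corpus.Corpus import Corpus
    from Corpus.TurkishSplitter import TurkishSplitter

    return Corpus(file_name, TurkishSplitter())
```

Keeping `split_lines` independent of the concrete splitter class makes it possible to try the same loop with any object that offers a `split` method.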
