Skip to main content

Classification dataset generator library for high level Nlp tasks

Project description

Data Generator

For Developers

You can also see Python, Java, Js, C++, or C# repository.

Requirements

Python

To check if you have a compatible version of Python installed, use the following command:

python -V

You can find the latest version of Python here.

Git

Install the latest version of Git.

Pip Install

pip3 install NlpToolkit-DataGenerator-Cy

Download Code

In order to work on code, create a fork from GitHub page. Use Git for cloning the code to your local or below line for Ubuntu:

git clone <your-fork-git-link>

A directory called DataGenerator will be created. Or you can use below link for exploring the code:

git clone https://github.com/starlangsoftware/DataGenerator-Cy.git

Open project with Pycharm IDE

Steps for opening the cloned project:

  • Start IDE
  • Select File | Open from main menu
  • Choose DataGenerator-CY file
  • Select open as project option
  • Couple of seconds, dependencies will be downloaded.

Detailed Description

AnnotatedDataSetGenerator

DataSet yaratmak için AnnotatedDataSetGenerator sınıfı önce üretilir.

AnnotatedDataSetGenerator(self, folder: str, pattern: str, instanceGenerator: InstanceGenerator)

Ardından generate metodu ile DataSet yaratılır.

generate(self) -> DataSet

InstanceGenerator

DataGeneratorlerin InstanceGeneratorlere ihtiyacı vardır. Bunlar bir tek kelimeden bir Instance yaratan sınıflardır.

generateInstanceFromSentence(self, sentence: Sentence, wordIndex: int) -> Instance

NER problemi için NerInstanceGenerator, FeaturedNerInstanceGenerator ve VectorizedNerInstanceGeneratorsınıfı

ShallowParse problemi için ShallowParseInstanceGenerator, FeaturedShallowParseInstanceGenerator ve VectorizedShallowParseInstanceGenerator sınıfı

WSD problemi için SemanticInstanceGenerator, FeaturedSemanticInstanceGenerator ve VectorizedSemanticInstanceGenerator sınıfı

Morphological Disambiguation problemi için FeaturedDisambiguationInstanceGenerator sınıfı

Cite

If you use this resource on your research, please cite the following paper:

@article{acikgoz,
  title={All-words word sense disambiguation for {T}urkish},
  author={O. Açıkg{\"o}z and A. T. G{\"u}rkan and B. Ertopçu and O. Topsakal and B. {\"O}zenç and A. B. Kanburoğlu and {\.{I}}. Çam and B. Avar and G. Ercan and O. T. Y{\i}ld{\i}z},
  journal={2017 International Conference on Computer Science and Engineering (UBMK)},
  year={2017},
  pages={490-495}
}
@inproceedings{ertopcu17,  
	author={B. {Ertopçu} and A. B. {Kanburoğlu} and O. {Topsakal} and O. {Açıkgöz} and A. T. {Gürkan} and B. {Özenç} and İ. {Çam} and B. {Avar} and G. {Ercan} and O. T. {Yıldız}},  
	booktitle={2017 International Conference on Computer Science and Engineering (UBMK)},  title={A new approach for named entity recognition},   
	year={2017},  
	pages={474-479}
}


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

NlpToolkit-DataGenerator-Cy-1.0.2.tar.gz (1.7 MB view details)

Uploaded Source

File details

Details for the file NlpToolkit-DataGenerator-Cy-1.0.2.tar.gz.

File metadata

File hashes

Hashes for NlpToolkit-DataGenerator-Cy-1.0.2.tar.gz
Algorithm Hash digest
SHA256 74837689f015ca9c9d094a26341f3534f54a2457122d064566ee99fdde4fdc40
MD5 2fb409ee34ac40c252ba1ee96881cd4f
BLAKE2b-256 caf4f4d17dbcabad00320b6db4c6c0d2c533276ad8771d49249b25aff805f221

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page