
Classification dataset generator library for high-level NLP tasks

Data Generator
============

    Video Lectures
    ============
    
    [<img src=https://github.com/StarlangSoftware/DataGenerator/blob/master/video1.jpg width="50%">](https://youtu.be/E9rE_eCffPE)[<img src=https://github.com/StarlangSoftware/DataGenerator/blob/master/video2.jpg width="50%">](https://youtu.be/ISHmGWvHL7k)
    
    For Developers
    ============
    You can also see the [Python](https://github.com/starlangsoftware/DataGenerator-Py), [Java](https://github.com/starlangsoftware/DataGenerator), [Js](https://github.com/starlangsoftware/DataGenerator-Js), [C++](https://github.com/starlangsoftware/DataGenerator-CPP), [C](https://github.com/starlangsoftware/DataGenerator-C), [Swift](https://github.com/starlangsoftware/DataGenerator-Swift), or [C#](https://github.com/starlangsoftware/DataGenerator-CS) repositories.
    
    ## Requirements
    
    * [Python 3.7 or higher](#python)
    * [Git](#git)
    
    ### Python 
    
    To check if you have a compatible version of Python installed, use the following command:
    
        python -V
        
    You can find the latest version of Python [here](https://www.python.org/downloads/).
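
The same check can be done from inside Python, which helps when several interpreters are installed and it is unclear which one `python` resolves to. This is a minimal sketch; the helper name `is_supported` is made up for illustration, and the minimum version is the one listed under Requirements.

```python
import sys

# Minimum version taken from the Requirements list above.
MIN_PYTHON = (3, 7)


def is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # Print the running interpreter's version and whether it qualifies.
    print("Python", sys.version.split()[0], "supported:", is_supported())
```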
    
    ### Git
    
    Install the [latest version of Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).
    
    ## Pip Install
    
    	pip3 install NlpToolkit-DataGenerator-Cy
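
To confirm that the install succeeded, the installed version can be looked up by distribution name. This sketch uses only the standard library and assumes Python 3.8 or newer, where `importlib.metadata` is available; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    name = "NlpToolkit-DataGenerator-Cy"
    print(name, "->", installed_version(name))
```

A result of `None` means pip does not know the distribution in the current environment.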
    
    ## Download Code
    
    To work on the code, create a fork from the GitHub page.
    Use Git to clone the code to your local machine, or run the line below on Ubuntu:
    
    	git clone <your-fork-git-link>
    
    A directory called DataGenerator-Cy will be created. Alternatively, you can clone the original repository to explore the code:
    
    	git clone https://github.com/starlangsoftware/DataGenerator-Cy.git
    
    ## Open project with PyCharm IDE
    
    Steps for opening the cloned project:
    
    * Start the IDE
    * Select **File | Open** from the main menu
    * Choose the `DataGenerator-Cy` folder
    * Select the open as project option
    * After a few seconds, the dependencies will be downloaded.
    
    Detailed Description
    ============
    
    + [AnnotatedDataSetGenerator](#annotateddatasetgenerator)
    + [InstanceGenerator](#instancegenerator)
    
    ## AnnotatedDataSetGenerator
    
    To create a DataSet, first construct an AnnotatedDataSetGenerator object.
    
    	AnnotatedDataSetGenerator(self, folder: str, pattern: str, instanceGenerator: InstanceGenerator)
    
    Then the DataSet is created with the generate method.
    
    	generate(self) -> DataSet
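
The construct-then-generate flow can be sketched without the library installed. The classes below are hypothetical stand-ins that only mirror the constructor arguments and the `generate` call shown above; they are not the library's implementation. The sketch assumes plain text files with one whitespace-separated sentence per line, and it assumes `pattern` selects files by a substring of the file name. Both are illustration choices and not confirmed behavior of AnnotatedDataSetGenerator.

```python
import os
import tempfile


class ToyInstanceGenerator:
    """Hypothetical stand-in: turns one word of a sentence into an instance."""

    def generateInstanceFromSentence(self, sentence, wordIndex):
        # An "instance" here is just a dict; the real Instance type is richer.
        return {"word": sentence[wordIndex], "position": wordIndex}


class ToyDataSetGenerator:
    """Hypothetical stand-in mirroring (folder, pattern, instanceGenerator)."""

    def __init__(self, folder, pattern, instanceGenerator):
        self.folder = folder
        self.pattern = pattern
        self.instanceGenerator = instanceGenerator

    def generate(self):
        dataset = []
        for name in sorted(os.listdir(self.folder)):
            if self.pattern not in name:
                continue  # assumption: pattern filters file names by substring
            with open(os.path.join(self.folder, name), encoding="utf-8") as f:
                for line in f:
                    sentence = line.split()
                    # One instance per word, as produced by the instance generator.
                    for i in range(len(sentence)):
                        dataset.append(
                            self.instanceGenerator.generateInstanceFromSentence(sentence, i)
                        )
        return dataset


def demo():
    """Build a tiny temporary corpus and generate a dataset from it."""
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "0001.train"), "w", encoding="utf-8") as f:
            f.write("Ali topu at\n")
        with open(os.path.join(folder, "0002.dev"), "w", encoding="utf-8") as f:
            f.write("ignored sentence\n")
        generator = ToyDataSetGenerator(folder, ".train", ToyInstanceGenerator())
        return generator.generate()
```

Calling `demo()` yields one instance per word of the `.train` file, while the `.dev` file is skipped by the pattern.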
    
    ## InstanceGenerator
    
    DataGenerators need InstanceGenerators. These are classes that create an
    Instance from a single word.
    
    	generateInstanceFromSentence(self, sentence: Sentence, wordIndex: int) -> Instance
    
    For the NER problem, there are the NerInstanceGenerator, FeaturedNerInstanceGenerator, and
    VectorizedNerInstanceGenerator classes.
    
    For the ShallowParse problem, there are the ShallowParseInstanceGenerator,
    FeaturedShallowParseInstanceGenerator, and VectorizedShallowParseInstanceGenerator classes.
    
    For the WSD problem, there are the SemanticInstanceGenerator, FeaturedSemanticInstanceGenerator, and
    VectorizedSemanticInstanceGenerator classes.
    
    For the Morphological Disambiguation problem, there is the FeaturedDisambiguationInstanceGenerator class.
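
To make the "one word in, one Instance out" idea concrete, here is a minimal sketch of an NER-style instance generator. Every type in it is a hypothetical stand-in: `ToySentence` is a list of (word, tag) pairs and `ToyInstance` is a label plus a tuple of attributes. The three attributes are chosen only for illustration and do not describe what the library's generators actually extract.

```python
from typing import List, NamedTuple, Tuple


class ToyInstance(NamedTuple):
    """Hypothetical stand-in for Instance: a class label plus attributes."""
    label: str
    attributes: Tuple


# Hypothetical stand-in for Sentence: a list of (surface form, NER tag) pairs.
ToySentence = List[Tuple[str, str]]


class ToyNerInstanceGenerator:
    def generateInstanceFromSentence(self, sentence: ToySentence, wordIndex: int) -> ToyInstance:
        word, tag = sentence[wordIndex]
        # Use a start marker when there is no previous word.
        previous = sentence[wordIndex - 1][0] if wordIndex > 0 else "<s>"
        attributes = (
            word.lower(),          # lowercased surface form
            word[:1].isupper(),    # does the word start with a capital letter?
            previous.lower(),      # lowercased previous word
        )
        return ToyInstance(label=tag, attributes=attributes)


sentence = [("Ahmet", "PERSON"), ("Ankara'ya", "LOCATION"), ("gitti", "NONE")]
generator = ToyNerInstanceGenerator()
instances = [generator.generateInstanceFromSentence(sentence, i) for i in range(len(sentence))]
```

Each call looks only at the word at `wordIndex` and its left neighbor, so a data set generator can call it once per word of every sentence.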
    
    ## Cite
    If you use this resource in your research, please cite the following papers:
    
    ```
    @article{acikgoz,
      title={All-words word sense disambiguation for {T}urkish},
      author={O. Açıkg{\"o}z and A. T. G{\"u}rkan and B. Ertopçu and O. Topsakal and B. {\"O}zenç and A. B. Kanburoğlu and {\.{I}}. Çam and B. Avar and G. Ercan and O. T. Y{\i}ld{\i}z},
      journal={2017 International Conference on Computer Science and Engineering (UBMK)},
      year={2017},
      pages={490-495}
    }
    @inproceedings{ertopcu17,
      author={B. {Ertopçu} and A. B. {Kanburoğlu} and O. {Topsakal} and O. {Açıkgöz} and A. T. {Gürkan} and B. {Özenç} and İ. {Çam} and B. {Avar} and G. {Ercan} and O. T. {Yıldız}},
      booktitle={2017 International Conference on Computer Science and Engineering (UBMK)},
      title={A new approach for named entity recognition},
      year={2017},
      pages={474-479}
    }
    ```
