Iterator for prevert files

Prevert iterator

To use the prevert parser, copy the file prevert.py into your working directory.

Usage

# import libraries
from prevert import dataset
import pandas as pd

If you are using the MaCoCu corpora in the XML format, dataset() needs only the path to the file as its argument:

# Open the dataset with the prevert parser 
dset = dataset("/data/monolingual/mk.xml")

dset is an iterable of documents; the metadata of each document is available as doc.meta['attribute_name']. Each document is in turn an iterable of paragraphs, and the metadata of each paragraph is available as par.meta['attribute_name'].

Basic use:

import ast # safe parsing of literal strings, used instead of eval()

for doc in dset: # iterating through documents of a dataset
    print(doc.meta) # all attributes
    # lang_distr is stored as a string; parse it without executing arbitrary code
    print(ast.literal_eval(doc.meta['lang_distr'])[0][0]) # most prominent language in the document
    print(str(doc)) # whole document text
    for par in doc: # iterating through paragraphs of a document
        print(par.meta['id']) # specific attribute
        print(str(par)) # whole paragraph text
    print(doc.to_prevert()) # obtaining the original format
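
The loop above can also be used to flatten a dataset into one row per paragraph, for example as a step towards a pandas table. The following is only a minimal sketch under stated assumptions: that doc.meta and par.meta behave like dictionaries, and that lang_distr is a string holding a list of (language, share) tuples, as the basic example suggests. The helpers dominant_language and paragraph_rows, and the FakeDoc and FakePar stand-ins, are hypothetical and not part of prevert; the stand-ins only mimic the interface so the sketch runs without a corpus file.

```python
import ast


def dominant_language(meta):
    """Return the first language code listed in meta['lang_distr'], or None."""
    raw = meta.get("lang_distr")  # assumption: meta is dict-like
    if not raw:
        return None
    distr = ast.literal_eval(raw)  # parses the literal without executing code
    return distr[0][0] if distr else None


def paragraph_rows(docs):
    """Yield one dict per paragraph: paragraph id, document language, text."""
    for doc in docs:
        lang = dominant_language(doc.meta)
        for par in doc:
            yield {"par_id": par.meta["id"], "doc_lang": lang, "text": str(par)}


# Hypothetical stand-ins that mimic the doc/par interface described above.
class FakePar:
    def __init__(self, par_id, text):
        self.meta = {"id": par_id}
        self._text = text

    def __str__(self):
        return self._text


class FakeDoc:
    def __init__(self, lang_distr, pars):
        self.meta = {"lang_distr": lang_distr}
        self._pars = pars

    def __iter__(self):
        return iter(self._pars)


fake_dset = [
    FakeDoc("[('mk', 0.9), ('en', 0.1)]",
            [FakePar("p1", "Здраво."), FakePar("p2", "Hello.")]),
]
rows = list(paragraph_rows(fake_dset))
for row in rows:
    print(row)
```

With a real corpus, the same generator could be fed the object returned by dataset(), and the resulting rows passed to pd.DataFrame(list(paragraph_rows(dset))).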
