CQP and CL interfaces for Python
Project description
This is a Python wrapper to the low-level API of CQP which allows
you to access CQP corpora in the same way as Perl's CWB::CL
If you installed CQP in a non-standard location (which is the default for
newer versions of CQP), point the setup in the right direction with, e.g.
export CWB_DIR=/usr/local/cwb-3.4.10
To install the module, use the standard
python setup.py build
sudo python setup.py install
command sequence.
As a prerequisite, install Cython with
pip install Cython
If you use an old version of CQP (CWB 2.99 and older), you need to
change the value of the "extra_libs" variable in setup.py.
To give you an idea how to use the library, see the following sample:
--- 8< ---
from CWB.CL import Corpus
# open the corpus
corpus=Corpus('TUEPP')
# get sentences and words
sentences=corpus.attribute('s','s')
words=corpus.attribute('word','p')
postags=corpus.attribute('pos','p')
# retrieve offsets of the 1235th sentence (0-based)
s_1234=sentences[1234]
for w,p in zip(words[s_1234[0]:s_1234[1]+1],postags[s_1234[0]:s_1234[1]+1]):
print "%s/%s"%(w,p)
--- 8< ---
In order to test the CWB.CL module's correct installation
independently of any CQP corpora, you can do a
python -m doctest tests/idlist.txt
which should terminate with no output when everything is well.
you to access CQP corpora in the same way as Perl's CWB::CL
If you installed CQP in a non-standard location (which is the default for
newer versions of CQP), point the setup in the right direction with, e.g.
export CWB_DIR=/usr/local/cwb-3.4.10
To install the module, use the standard
python setup.py build
sudo python setup.py install
command sequence.
As a prerequisite, install Cython with
pip install Cython
If you use an old version of CQP (CWB 2.99 and older), you need to
change the value of the "extra_libs" variable in setup.py.
To give you an idea how to use the library, see the following sample:
--- 8< ---
from CWB.CL import Corpus
# open the corpus
corpus=Corpus('TUEPP')
# get sentences and words
sentences=corpus.attribute('s','s')
words=corpus.attribute('word','p')
postags=corpus.attribute('pos','p')
# retrieve offsets of the 1235th sentence (0-based)
s_1234=sentences[1234]
for w,p in zip(words[s_1234[0]:s_1234[1]+1],postags[s_1234[0]:s_1234[1]+1]):
print "%s/%s"%(w,p)
--- 8< ---
In order to test the CWB.CL module's correct installation
independently of any CQP corpora, you can do a
python -m doctest tests/idlist.txt
which should terminate with no output when everything is well.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
cwb-python-0.2.tar.gz
(62.8 kB
view hashes)