CQP and CL interfaces for Python
Project description
This is a Python wrapper for the low-level API of CQP, which lets
you access CQP corpora in the same way as Perl's CWB::CL.
If CQP is installed in a non-standard location (which is the default for
newer versions of CQP), tell the setup where to find it with, e.g.:
export CWB_DIR=/usr/local/cwb-3.4.10
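Before building, it can help to confirm that the variable is actually visible to the Python process. The following is only a minimal sketch: `check_cwb_dir` is a hypothetical helper, not part of this package, and it only checks that CWB_DIR is set and names an existing directory. It makes no assumption about what the setup expects to find inside that directory.

```python
import os


def check_cwb_dir(environ=None):
    """Return (ok, message) describing the CWB_DIR setting.

    Hypothetical helper for illustration; it is not part of CWB.CL.
    """
    # Allow a plain dict to be passed in so the check is easy to try out.
    env = os.environ if environ is None else environ
    cwb_dir = env.get("CWB_DIR")
    if not cwb_dir:
        return False, "CWB_DIR is not set"
    if not os.path.isdir(cwb_dir):
        return False, "CWB_DIR is not a directory: %s" % cwb_dir
    return True, "CWB_DIR points to %s" % cwb_dir
```

Calling `check_cwb_dir()` without arguments inspects the current environment; the returned message can simply be printed.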
To install the module, use the standard
python setup.py build
sudo python setup.py install
command sequence.
As a prerequisite, install Cython with
pip install Cython
If you use an old version of CQP (CWB 2.99 and older), you need to
change the value of the "extra_libs" variable in setup.py.
The following sample gives an idea of how to use the library:
--- 8< ---
from CWB.CL import Corpus

# open the corpus
corpus = Corpus('TUEPP')

# get sentences and words
sentences = corpus.attribute('s', 's')
words = corpus.attribute('word', 'p')
postags = corpus.attribute('pos', 'p')

# retrieve the offsets of the 1235th sentence (index 1234, 0-based)
s_1234 = sentences[1234]
start, end = s_1234[0], s_1234[1]

# print each token of that sentence as word/tag (the +1 includes the last token)
for w, p in zip(words[start:end + 1], postags[start:end + 1]):
    print("%s/%s" % (w, p))
--- 8< ---
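The slicing pattern in the sample can be tried without an installed corpus. The sketch below uses plain Python lists as stand-ins for the `words`, `postags` and `sentences` attributes; the helper name `tagged_tokens` and the toy data are assumptions made for illustration, not part of CWB.CL. It relies only on the behaviour the sample itself shows: a sentence entry yields a start and an end offset, and the end offset is included in the slice.

```python
def tagged_tokens(words, postags, span):
    """Return 'word/tag' strings for an inclusive (start, end) span.

    Hypothetical helper for illustration; it is not part of CWB.CL.
    """
    start, end = span[0], span[1]
    # The end offset is inclusive, so slice up to end + 1.
    return ["%s/%s" % (w, p)
            for w, p in zip(words[start:end + 1], postags[start:end + 1])]


# Stand-in data: plain lists instead of corpus attributes.
words = ["The", "cat", "sleeps", ".", "It", "purrs", "."]
postags = ["DT", "NN", "VBZ", ".", "PRP", "VBZ", "."]
sentences = [(0, 3), (4, 6)]

print(" ".join(tagged_tokens(words, postags, sentences[1])))
# prints: It/PRP purrs/VBZ ./.
```

With a real corpus, the same function should work on the attribute objects from the sample, provided they support slicing as shown there.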
To check that the CWB.CL module is installed correctly,
independently of any CQP corpora, run
python -m doctest tests/idlist.txt
which should terminate with no output if everything is working.
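The same check can be run from within Python, which is convenient when the result is needed as a value rather than as console output. This is a minimal sketch using the standard library's `doctest.testfile`; the wrapper name `run_doctest_file` is an assumption made for illustration.

```python
import doctest


def run_doctest_file(path):
    """Run the doctests in a text file and return (failed, attempted)."""
    # module_relative=False makes doctest treat `path` as an ordinary
    # file-system path rather than one relative to the calling module.
    results = doctest.testfile(path, module_relative=False)
    return results.failed, results.attempted
```

For example, `failed, attempted = run_doctest_file("tests/idlist.txt")` should give `failed == 0` on a working installation; as with the command-line form, failures are reported on standard output.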