
A latent semantic search engine implementation

Project description

Build status badge: https://travis-ci.org/odarbelaeze/condor-ir.svg?branch=master
DOI badge: https://zenodo.org/badge/DOI/10.5281/zenodo.495722.svg

This is a program for building and working with Latent Semantic Analysis (LSA) search engines. It accepts froac XML documents as input, as well as plain-text records from ISI Web of Knowledge.
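This page does not show condor-ir's internals, but the core LSA idea can be sketched in a few lines of plain NumPy: build a term-document matrix, truncate its SVD, and rank documents against a query by cosine similarity in the reduced space. Everything below (the document strings, variable names, and the choice of k) is illustrative only, not the package's actual code.

import numpy as np

documents = [
    "learning objects for semantic search",
    "latent semantic analysis of learning repositories",
    "information retrieval with term document matrices",
]
query = "semantic learning"

# Term-document count matrix (rows are terms, columns are documents).
vocabulary = sorted({word for doc in documents for word in doc.split()})
index = {word: i for i, word in enumerate(vocabulary)}
matrix = np.zeros((len(vocabulary), len(documents)))
for j, doc in enumerate(documents):
    for word in doc.split():
        matrix[index[word], j] += 1

# Keep only the k strongest latent dimensions of the SVD.
k = 2
u, s, vt = np.linalg.svd(matrix, full_matrices=False)
u_k, s_k, vt_k = u[:, :k], s[:k], vt[:k, :]

# Fold the query into the latent space: q_k = S_k^{-1} U_k^T q.
q = np.zeros(len(vocabulary))
for word in query.split():
    if word in index:
        q[index[word]] += 1
q_k = (u_k.T @ q) / s_k

# Rank documents (columns of vt_k) by cosine similarity to the query.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

ranking = sorted(
    ((cosine(q_k, vt_k[:, j]), doc) for j, doc in enumerate(documents)),
    reverse=True,
)
print(ranking)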

You can find more information about froac repositories at http://froac.manizales.unal.edu.co/froac/ and about ISI Web of Knowledge text files on the Thomson Reuters website.

Installing the condor-ir package

Install the program from its PyPI repository:

pip install -U condor-ir

The -U flag upgrades the package to the latest version, a recommended step for a package that is still unstable.

For specific database support, install the appropriate extra package:

pip install -U condor-ir[mysql]
pip install -U condor-ir[postgres]

Furthermore, part of the NLTK data package is required for stemming and stop-word removal to work:

python -m nltk.downloader snowball_data stopwords
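As a rough illustration of what that data is used for (not condor-ir's actual pipeline), the Snowball stemmer and the stop-word lists are typically applied to each record's text before it enters the term-document matrix:

from nltk.corpus import stopwords
from nltk.stem.snowball import SnowballStemmer

# English is used here for illustration; other languages available in
# snowball_data and the stopwords corpus work the same way.
stops = set(stopwords.words("english"))
stemmer = SnowballStemmer("english")

text = "Latent semantic engines index the stemmed terms of each record"
tokens = [w.lower() for w in text.split() if w.lower() not in stops]
print([stemmer.stem(w) for w in tokens])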

Finally, to prepare the database, or to reset it for a new version of condor-ir, run the database preparation script:

condor utils preparedb

If you need to specify a database other than the default, you can do so through an environment variable:

export CONDOR_DB_URL="mysql://localhost/condor"
condor utils preparedb # will now work on mysql://localhost/condor
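If you installed the postgres extra, the same variable should point at your PostgreSQL server. The dialect name and credentials below are assumptions following the common dialect://user:password@host/dbname URL convention, so adjust them to your setup:

export CONDOR_DB_URL="postgresql://condor:secret@localhost/condor"
condor utils preparedb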

CLI Interface

After installing the program you will have three basic commands at your disposal for handling bibliography sets, term-document matrices, and engines. The CLI gives you most CRUD operations in a hierarchical manner.

condor triggers the main program; you can get top-level help by running condor --help.

condor bibliography namespaces the bibliography-set related commands; you can list them and get help about them using condor bibliography --help.

condor model is a shortcut that offers the condor model create sub-command, which creates both a term-document matrix and an LSA search engine; get help on models using condor model --help.

condor query <string...> is a non-CRUD command that searches a bibliography set using a previously created search engine; the search engine can be targeted explicitly, and condor query --help explains how.

Feel free to check detailed descriptions of these commands using their --help flag.
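Putting the commands above together, a typical session might look like the following. Only commands named on this page are used, arguments are omitted, and each command's --help lists the exact options for loading records:

condor utils preparedb           # prepare the database schema
condor bibliography --help       # see how to create a bibliography set from froac XML or ISI records
condor model create              # build the term-document matrix and the LSA engine
condor query semantic learning objects    # search the bibliography set with the engine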



Download files

Download the file for your platform.

Source Distribution

condor-ir-1.2.2.tar.gz (29.2 kB)

Uploaded Source

Built Distribution

condor_ir-1.2.2-py3-none-any.whl (32.2 kB)

Uploaded Python 3

File details

Details for the file condor-ir-1.2.2.tar.gz.

File metadata

  • Download URL: condor-ir-1.2.2.tar.gz
  • Upload date:
  • Size: 29.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.8.0

File hashes

Hashes for condor-ir-1.2.2.tar.gz
Algorithm Hash digest
SHA256 5ab8a803cb4fc8c98c39687ed7c6f57629ce04292865d6ddea41663748d66b14
MD5 68c6a540ca6dbcd58f3877a62f6872ad
BLAKE2b-256 8a0fcf03b06407eb65e3a79ea06d5876df34a08fbc622342d6094f76ef1d7f62


File details

Details for the file condor_ir-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: condor_ir-1.2.2-py3-none-any.whl
  • Upload date:
  • Size: 32.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.8.0

File hashes

Hashes for condor_ir-1.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 2850fe96f175dae5552e175fc6b7269bfd079214e09183def7a6277328c449be
MD5 9170ef3d3e21b126dc90085a1e55f3e3
BLAKE2b-256 33969a763f9dde9138922c7cf2f68aa4967bce87b651c7621f7d27ec873a1792

