madai

Compute the difference between two corpora using chi2. The implementation is based on the paper "Measures for Corpus Similarity and Homogeneity".

I am not fully sure that this implementation follows the paper exactly. Feel free to open an issue if you find a problem.

Installation

pip install madai

Usage

madai implements two ways of computing the similarity between two corpora: chi2 and spearman. Use spearman when the two corpora differ in size.
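To give a rough idea of what a chi2 comparison of two corpora can look like, here is a minimal sketch. It is not madai's actual code: the function name `chi2_statistic`, the `top_n` parameter, and the whitespace tokenisation are all assumptions made for illustration. It takes the most frequent words of the joint corpus, computes the count each word would be expected to have in each corpus if both came from the same population, and sums the squared deviations.

```python
from collections import Counter


def chi2_statistic(docs_a, docs_b, top_n=500):
    """Chi-squared statistic over the top_n most frequent joint words.

    Hypothetical helper for illustration; not part of madai's API.
    """
    # Assumption: tokens are obtained by splitting on whitespace.
    counts_a = Counter(tok for doc in docs_a for tok in doc.split())
    counts_b = Counter(tok for doc in docs_b for tok in doc.split())
    size_a = sum(counts_a.values())
    size_b = sum(counts_b.values())
    if size_a == 0 or size_b == 0:
        raise ValueError("both corpora must contain at least one token")

    joint = counts_a + counts_b
    total = size_a + size_b
    stat = 0.0
    for word, joint_count in joint.most_common(top_n):
        # Expected counts if both corpora were drawn from the same population.
        expected_a = joint_count * size_a / total
        expected_b = joint_count * size_b / total
        stat += (counts_a[word] - expected_a) ** 2 / expected_a
        stat += (counts_b[word] - expected_b) ** 2 / expected_b
    return stat


if __name__ == "__main__":
    corpus_a = ["the cat sat", "the dog ran"]
    corpus_b = ["the cat ran", "a bird sang"]
    print(chi2_statistic(corpus_a, corpus_b, top_n=5))
```

A value of 0 means the selected words have identical relative frequencies in both corpora. The raw statistic grows with the number of tokens, which is one reason a rank-based measure may be easier to interpret when the corpora differ in size.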

Both corpora must be plain-text files, with one document or sentence per line.

madai chi2 /path/to/corpus/a /path/to/corpus/b

# or

madai spearman /path/to/corpus/a /path/to/corpus/b

To view the available parameters, run:

madai --help
