MTMinePy - A Multilingual Text Mining Platform for Academic Research

MTMinePy - Multilingual Text Miner with Python

License: GPL v3

MTMinePy is a Python-based academic text mining platform inspired by MTMineR. It runs as a Flask web application for text mining and analysis, with interactive visualization and statistical modeling.

Key Features

  • Advanced NLP: Integrated with jieba, HanLP, LTP, spaCy, and NLTK.
  • Multilingual: Native support for 10+ languages including Chinese, English, Japanese, and more.
  • Interactive Visualization: Powered by ECharts, supporting responsive force-directed networks, dynamic word clouds, and interactive scatter plots.
  • Academic Metric Analysis: Supports custom similarity measures (Hsim, Close, Esim) and visualization of the results.
  • Advanced Modeling: Comprehensive suite of Unsupervised (Clustering, Topic Modeling) and Supervised learning algorithms.

Screenshots

[Screenshots: co-occurrence network, word cloud, and clustering views, shown for both Chinese and English analysis]

Installation

Install from PyPI (Recommended)

pip install mtminepy

To install with all optional NLP backends (Janome, spaCy, HanLP, LTP, UMAP, Boruta, etc.):

pip install "mtminepy[full]"

Install from source

git clone https://github.com/EasyCam/MTMinePy.git
cd MTMinePy
pip install -e .

Usage

Run from command line

After installation, run directly:

mtminepy

Access the dashboard at http://localhost:5000.

Command-line options

mtminepy --help
mtminepy --port 8080          # Custom port
mtminepy --host 127.0.0.1    # Bind to localhost only
mtminepy --debug              # Flask debug mode
mtminepy --version            # Show version

Run from Python

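from mtminepy.app import create_app

# Build the Flask application.
app = create_app()
# 0.0.0.0 listens on all network interfaces; use 127.0.0.1 to bind to localhost only.
app.run(host='0.0.0.0', port=5000)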

Advanced Capabilities

Modeling Algorithms

MTMinePy supports a wide range of standard machine learning algorithms for text analysis:

  • Feature Engineering: TF-IDF, Bag of Words (CountVectorizer), N-gram support.
  • Unsupervised Learning:
    • Topic Modeling: Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), STM (Structural Topic Model).
    • Clustering: K-Means, Agglomerative (Hierarchical), DBSCAN, Spectral Clustering.
    • Dimensionality Reduction: PCA, t-SNE, UMAP, Factor Analysis.
  • Supervised Learning (Classification):
    • Support Vector Machines (SVM)
    • Random Forest
    • Linear Discriminant Analysis (LDA)
    • Quadratic Discriminant Analysis (QDA)
    • Logistic Regression (Elastic Net)
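
To make the feature engineering step concrete, the following is a minimal pure-Python sketch of bag-of-words counting with n-grams followed by TF-IDF weighting. It is illustrative only and is not MTMinePy's own code: the function names (`ngrams`, `bag_of_words`, `tfidf`), the whitespace tokenizer, and the choice of smoothed IDF with L2 normalization are all assumptions made for this example. Chinese or Japanese text would need a real tokenizer such as jieba in place of `split()`.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=1):
    """Return all contiguous n-grams (joined by spaces) for n in [n_min, n_max]."""
    out = []
    for n in range(n_min, n_max + 1):
        out.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return out


def bag_of_words(docs, n_min=1, n_max=1):
    """Count n-gram occurrences per document (whitespace tokenization is an assumption)."""
    return [Counter(ngrams(doc.lower().split(), n_min, n_max)) for doc in docs]


def tfidf(counts):
    """Weight raw counts by a smoothed IDF, then L2-normalize each document."""
    n_docs = len(counts)
    # Iterating a Counter yields its keys, so each document counts a term once.
    df = Counter(term for doc_counts in counts for term in doc_counts)
    # Smoothed IDF: ln((1 + N) / (1 + df)) + 1 (an assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    rows = []
    for doc_counts in counts:
        row = {t: tf * idf[t] for t, tf in doc_counts.items()}
        norm = math.sqrt(sum(w * w for w in row.values())) or 1.0
        rows.append({t: w / norm for t, w in row.items()})
    return rows


docs = ["text mining with python", "topic modeling with python", "word cloud"]
counts = bag_of_words(docs, n_min=1, n_max=2)  # unigrams and bigrams
weights = tfidf(counts)
```

In this toy corpus, "text" appears in one document and "python" in two, so "text" receives the larger weight in the first document. The resulting sparse rows are the kind of input that the clustering, topic modeling, and classification algorithms listed above operate on.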

Mathematical Models (Distance & Similarity)

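MTMinePy supports the following similarity measures for academic research. In each formula, $x_i$ and $x_j$ are feature vectors with $n$ components, and $x_{ik}$ is the $k$-th component of $x_i$. Each measure equals 1 when the two vectors are identical and decreases as their components diverge.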

Advanced Custom Similarity Measures

  1. Hsim (Yang Fengzhao, 2007) $$ Hsim(x_i, x_j) = \frac{1}{n} \sum_{k=1}^n \frac{1}{1+|x_{ik}-x_{jk}|} $$

  2. Close (Shao Changsheng, et al., 2011) $$ Close(x_i, x_j) = \frac{1}{n} \sum_{k=1}^n e^{-|x_{ik}-x_{jk}|} $$

  3. Esim (Wang Xiaoyang, et al., 2013) $$ Esim(x_i, x_j) = \frac{1}{n} \sum_{k=1}^n \omega_k e^{-\frac{|x_{ik}-x_{jk}|}{|x_{ik}-x_{jk}|+|x_{ik}+x_{jk}|/2}} $$ where $\omega_k$ is the weight of the $k$-th component.
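
The three measures can be sketched in plain Python as follows. This is a direct transcription of the formulas for illustration and is not taken from MTMinePy's source. The function names, the default of uniform weights (every weight equal to 1) in `esim`, and the handling of a zero denominator (both components equal to 0, treated as a perfect match) are assumptions. The Esim sum is read as running over all n components.

```python
import math


def _check(x, y):
    """Both vectors must be non-empty and of equal length."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")


def hsim(x, y):
    """Hsim: mean of 1 / (1 + |x_k - y_k|) over all components."""
    _check(x, y)
    return sum(1.0 / (1.0 + abs(a - b)) for a, b in zip(x, y)) / len(x)


def close(x, y):
    """Close: mean of exp(-|x_k - y_k|) over all components."""
    _check(x, y)
    return sum(math.exp(-abs(a - b)) for a, b in zip(x, y)) / len(x)


def esim(x, y, weights=None):
    """Esim: weighted mean of exp(-d / (d + |x_k + y_k| / 2)) with d = |x_k - y_k|."""
    _check(x, y)
    n = len(x)
    if weights is None:
        weights = [1.0] * n  # assumption: uniform weights when none are given
    total = 0.0
    for a, b, w in zip(x, y, weights):
        diff = abs(a - b)
        denom = diff + abs(a + b) / 2.0
        # denom is 0 only when a == b == 0; treat that as identical (assumption).
        ratio = diff / denom if denom else 0.0
        total += w * math.exp(-ratio)
    return total / n


a = [1.0, 2.0, 3.0]
b = [1.0, 2.0, 5.0]
print(hsim(a, b), close(a, b), esim(a, b))
```

With uniform weights, all three functions return 1.0 for identical vectors. Unlike Hsim and Close, the Esim term scales the absolute difference by the magnitude of the two components, so the same gap counts for less between large values than between small ones.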
