A toolkit for Corpus Linguistics Analysis
Kitconc 3.4.3
Kitconc is a package for Corpus Linguistics and text analysis with Python. It contains, among other things, tools for creating:
Corpora;
Frequency wordlists;
Keywords (Log-Likelihood, Chi-Square, TF-IDF);
Concordance lines;
Collocates;
N-gram lists;
Dispersion plots;
Excel data files;
Semantic search with sentence embeddings.
The package is built on top of platforms and packages for scientific research: numpy, pandas, NLTK, XlsxWriter and matplotlib.
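Log-Likelihood keyness, one of the keyword measures listed above, compares a word's frequency in a study corpus with its frequency in a reference corpus. The sketch below is a minimal plain-Python illustration of the widely used two-corpus log-likelihood (G2) calculation. It is not Kitconc's implementation, and the function name and toy counts are hypothetical.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-corpus log-likelihood (G2) for one word.

    Hypothetical helper for illustration; not part of Kitconc's API.
    """
    total = size_study + size_ref
    combined = freq_study + freq_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    # A zero observed frequency contributes nothing (limit of x*ln(x) is 0).
    if freq_study > 0:
        g2 += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * g2

# Toy counts: a word seen 120 times in a 10,000-token study corpus
# and 200 times in a 100,000-token reference corpus.
score = log_likelihood(120, 10_000, 200, 100_000)
print(round(score, 2))
```

A word with the same relative frequency in both corpora scores 0, and the score grows as the two relative frequencies diverge.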
Requirements
Kitconc requires Python 3.10 or later.
Package dependencies (pip install kitconc):
numpy>=1.26.4,<2.0.0
pandas>=2.2.0,<3.0.0
matplotlib>=3.7.0,<4.0.0
xlsxwriter>=3.2.3,<4.0.0
ttkbootstrap>=1.12.0,<2.0.0
pillow>=11.2.0,<12.0.0
requests>=2.31.0,<3.0.0
nltk>=3.9.1,<4.0.0
chardet>=5.2.0,<6.0.0
pypdf>=4.0.0,<7.0.0
cryptography>=3.1,<47.0.0
mcp>=1.0.0,<2.0.0
setuptools>=70.0.0
Additional dependencies listed in requirements.txt (full local environment):
torch>=2.6,<2.10 (CPU wheels via --extra-index-url https://download.pytorch.org/whl/cpu)
transformers>=4.45,<6.0.0
sentence-transformers>=3.0,<6.0.0
sqlite-vec>=0.1.7,<1.0.0
fastapi>=0.110,<1.0.0
uvicorn[standard]>=0.27,<1.0.0
python-dotenv>=1.0.0,<2.0.0
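To check whether an existing environment falls inside the version bounds listed above, the installed versions can be compared against them. The following is a rough sketch using only the standard library. The helper names are hypothetical, the parser reads only the leading numeric components (so pre-release tags are ignored), and just three of the listed bounds are included for brevity.

```python
import re
from importlib import metadata

def version_tuple(text, width=3):
    """Turn '1.26.4' into (1, 26, 4), padding short versions with zeros."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    parts = [int(p) for p in match.group().split(".")] if match else []
    parts += [0] * (width - len(parts))
    return tuple(parts)

def in_range(installed, minimum, below):
    """True if minimum <= installed < below, compared component-wise."""
    return version_tuple(minimum) <= version_tuple(installed) < version_tuple(below)

# A few of the bounds listed above: name -> (minimum, exclusive upper bound).
BOUNDS = {
    "numpy": ("1.26.4", "2.0.0"),
    "pandas": ("2.2.0", "3.0.0"),
    "nltk": ("3.9.1", "4.0.0"),
}

for name, (minimum, below) in BOUNDS.items():
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        print(f"{name}: not installed")
        continue
    status = "ok" if in_range(installed, minimum, below) else "out of range"
    print(f"{name} {installed}: {status}")
```

In practice pip resolves these constraints itself during installation, so a check like this is mainly useful for diagnosing an environment that was assembled by hand.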
Installation
pip install kitconc
Kitconc App (graphical interface)
kitconc-app
Agent Layer (internal actions)
Kitconc now includes an internal action layer for agent/tool orchestration:
kitconc.agent.actions.KitconcActions
Full parity with shell commands from kit_cmd.py (do_*)
Typed schemas in kitconc.agent.schemas
Contract documentation in kitconc/agent/CONTRACT.md
Semantic retrieval action: semantic_search(…)
Basic usage:
from kitconc.agent import KitconcActions

actions = KitconcActions("kitconc_workspace")
actions.create("ads", "kitconc_corpora/ads", "english")
actions.use("ads")
rows = actions.keywords(limit=10)
MCP Server (for agent integrations)
kitconc-mcp --transport stdio
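Many MCP clients register stdio servers through a JSON configuration file containing an mcpServers mapping. The snippet below is a sketch that builds such an entry for the stdio command above. The mcpServers layout is an assumption about the client (it is the convention used by Claude Desktop and several other clients), the label "kitconc" is arbitrary, and the location of the configuration file depends on the client.

```python
import json

def stdio_server_entry(command, args):
    """Build one server entry in the common mcpServers layout (assumed client format)."""
    return {"command": command, "args": list(args)}

config = {
    "mcpServers": {
        # "kitconc" is an arbitrary label chosen for this example.
        "kitconc": stdio_server_entry("kitconc-mcp", ["--transport", "stdio"]),
    }
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON can be pasted into the client's configuration, provided that kitconc-mcp is on the PATH of the process that launches it.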
For HTTP clients (recommended):
kitconc-mcp --transport streamable-http --host 127.0.0.1 --port 8001
or (legacy SSE):
kitconc-mcp --transport sse --host 127.0.0.1 --port 8001
The server includes a semantic retrieval tool: semantic_search (query, top_k, db_path, model_name)
The MCP runtime is included in the package dependencies, so pip install kitconc is enough.
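Conceptually, a semantic search with a top_k parameter ranks stored embeddings by their similarity to the query embedding and returns the best top_k matches. The sketch below illustrates that ranking step with cosine similarity over hand-written toy vectors in plain Python. It is not Kitconc's implementation; the function names and vectors are hypothetical, and the use of cosine similarity specifically is an assumption.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_matches(query_vec, index, top_k=2):
    """Return the top_k (text, score) pairs, most similar first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy 3-dimensional "embeddings"; real sentence embeddings have hundreds of dimensions.
index = {
    "cheap flights to Lisbon": [0.9, 0.1, 0.0],
    "discount airline tickets": [0.8, 0.2, 0.1],
    "organic dog food": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]  # stands in for an embedded query about air travel

results = top_k_matches(query, index, top_k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

In a real index the vectors would come from an embedding model and the nearest-neighbour lookup would be delegated to the vector store rather than computed in a Python loop.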
What’s new in 3.2.0
Tkinter launcher command – start the GUI with kitconc-app
Agent action layer – kitconc.agent.actions.KitconcActions with command parity from kit_cmd.py
Typed schemas – available in kitconc.agent.schemas
MCP server entrypoint – run with kitconc-mcp
Semantic search MCP tool – semantic_search for sqlite-vec indexes
Embedding index hardening – transactional writes and thread-safe SQLite access
Progress flag rename – use verbose=True (replacing show_progress=True)
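The "transactional writes and thread-safe SQLite access" item above can be pictured with Python's built-in sqlite3 module: a lock serializes access to a shared connection, and using the connection as a context manager commits on success and rolls back on error. This is a generic sketch of the pattern, not Kitconc's code; the class, table, and column names are hypothetical.

```python
import sqlite3
import threading

class SafeStore:
    """Hypothetical wrapper: one shared connection guarded by a lock."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets several threads share the connection;
        # the lock below is what actually makes that safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock, self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS chunks "
                "(id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
            )

    def add_many(self, texts):
        """Insert all rows or none: the transaction rolls back on any error."""
        with self._lock, self._conn:
            self._conn.executemany(
                "INSERT INTO chunks (text) VALUES (?)", [(t,) for t in texts]
            )

    def count(self):
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]

store = SafeStore()
threads = [
    threading.Thread(target=store.add_many, args=([f"doc {i}-{j}" for j in range(5)],))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

try:
    store.add_many(["valid row", None])  # None violates NOT NULL
except sqlite3.IntegrityError:
    pass  # the whole batch, including "valid row", was rolled back

print(store.count())
```

The failed batch leaves no partial rows behind, which is the property that keeps an index consistent if a write is interrupted.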
What’s new in 3.1.0
TF-IDF keywords – third keyword extraction method alongside Log-Likelihood and Chi-Square
Keyword filters – ignore numbers, ignore words with strange characters, minimum word length
PDF support – add PDF files directly to a corpus
Embeddings module – semantic search with sentence-transformers and SQLite vector storage
Dialog improvements – dialog boxes now center correctly in fullscreen and large-window mode
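TF-IDF, the keyword method added in this release, scores a word highly when it is frequent in one text but rare across the collection. The sketch below shows one common formulation (term count divided by document length, multiplied by the natural log of N over document frequency). Several variants exist and the notes above do not say which one Kitconc uses, so the weighting, the function name, and the toy texts are all assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: score} dict per tokenized document.

    Uses tf = count / doc length and idf = ln(N / df); other variants exist.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores

docs = [
    "the new phone has a great camera".split(),
    "the new car has a great engine".split(),
    "the camera on the phone is great".split(),
]
scores = tf_idf(docs)
# Words found in every document (e.g. "the") score 0; distinctive ones rank first.
top = sorted(scores[1].items(), key=lambda item: item[1], reverse=True)[:2]
print(top)
```

Unlike Log-Likelihood and Chi-Square, this measure needs no separate reference corpus: the other texts in the collection play that role.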
Language resources
Kitconc comes with built-in language resources for Portuguese and English corpora. It also provides functions for adding your own language resources.
Usage example
See how easy it is to use Kitconc:
File details
Details for the file kitconc-3.4.4.tar.gz.
File metadata
- Download URL: kitconc-3.4.4.tar.gz
- Upload date:
- Size: 6.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 102506d6ae34e4582df698e6ff0ab06fd9f2e3d274c5b649427bcd46a0f0204a |
| MD5 | 2ed115c9400bc21ad096b17510dfc32e |
| BLAKE2b-256 | 3ead6923404f54e9728ed5be6c5d058cea6e9741fb5ca2b00f3f8cb49fd2d2ba |
Provenance
The following attestation bundles were made for kitconc-3.4.4.tar.gz:
Publisher: fluxodetrabalho.yml on ilexistools/kitconc
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kitconc-3.4.4.tar.gz
- Subject digest: 102506d6ae34e4582df698e6ff0ab06fd9f2e3d274c5b649427bcd46a0f0204a
- Sigstore transparency entry: 1275414313
- Sigstore integration time:
- Permalink: ilexistools/kitconc@b13bc0e22267bb17a2ad73970d4fdfbcb9c76b1b
- Branch / Tag: refs/tags/v3.4.4
- Owner: https://github.com/ilexistools
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: fluxodetrabalho.yml@b13bc0e22267bb17a2ad73970d4fdfbcb9c76b1b
- Trigger Event: release
File details
Details for the file kitconc-3.4.4-py3-none-any.whl.
File metadata
- Download URL: kitconc-3.4.4-py3-none-any.whl
- Upload date:
- Size: 6.5 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0ee86c1cb3c14695137214abad069904860bd1af992f6825356ed97eb1500614 |
| MD5 | f5aff3aadd26ec50081b704f49b94c35 |
| BLAKE2b-256 | 36e13f4c641e7b89825dc5887aad5ac08edcc03affc4d937054eee0713588b03 |
Provenance
The following attestation bundles were made for kitconc-3.4.4-py3-none-any.whl:
Publisher: fluxodetrabalho.yml on ilexistools/kitconc
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kitconc-3.4.4-py3-none-any.whl
- Subject digest: 0ee86c1cb3c14695137214abad069904860bd1af992f6825356ed97eb1500614
- Sigstore transparency entry: 1275414363
- Sigstore integration time:
- Permalink: ilexistools/kitconc@b13bc0e22267bb17a2ad73970d4fdfbcb9c76b1b
- Branch / Tag: refs/tags/v3.4.4
- Owner: https://github.com/ilexistools
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: fluxodetrabalho.yml@b13bc0e22267bb17a2ad73970d4fdfbcb9c76b1b
- Trigger Event: release