Semantic and co-occurrence graphs for midsized texts

kenon

Kenon builds weighted graphs from text using only corpus-internal statistics; it requires no neural models or external training data. It supports co-occurrence windows, TF-IDF similarity, PMI embeddings, and backbone extraction with the disparity filter.

Installation

uv add kenon  # or: pip install kenon
python -m spacy download en_core_web_sm

Quickstart

from kenon import (
    Tokenizer,
    get_stopwords,
    build_cooccurrence_graph,
    extract_backbone,
)

# 1. Tokenize
tokenizer = Tokenizer("en_core_web_sm", lemmatize=True)
tokens = tokenizer.flat_tokens("The cat sat on the mat. The dog ran in the park.")

# 2. Build graph
stopwords = get_stopwords("english")
graph = build_cooccurrence_graph(tokens, window=2, stopwords=stopwords)

# 3. Extract backbone
backbone = extract_backbone(graph, min_alpha_ptile=0.3, min_degree=2)
print(f"Backbone: {backbone.number_of_nodes()} nodes, {backbone.number_of_edges()} edges")
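To make step 2 more concrete, here is a minimal, library-independent sketch of window-based co-occurrence counting. It is not kenon's implementation: the helper name `count_cooccurrences` is hypothetical, and it assumes that `window=2` pairs each token with the next two tokens once stopwords have been dropped. Kenon may define the window or treat stopwords differently.

```python
from collections import Counter


def count_cooccurrences(tokens, window=2, stopwords=frozenset()):
    """Count unordered token pairs that occur within `window` positions.

    Hypothetical helper for illustration only; not part of kenon's API.
    """
    # Assumption: stopwords are removed before the window is applied.
    kept = [t for t in tokens if t not in stopwords]
    counts = Counter()
    for i, left in enumerate(kept):
        # Pair the current token with up to `window` following tokens.
        for right in kept[i + 1 : i + 1 + window]:
            if left != right:
                # Sort the pair so (a, b) and (b, a) share one counter.
                counts[tuple(sorted((left, right)))] += 1
    return counts


tokens = ["cat", "sit", "on", "mat", "dog", "run", "in", "park"]
edge_weights = count_cooccurrences(tokens, window=2, stopwords={"on", "in"})
for (a, b), weight in sorted(edge_weights.items()):
    print(a, b, weight)
```

Each counted pair can then serve as a weighted edge, which is the kind of graph a backbone filter operates on.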

Features

  • Tokenization: spaCy-backed sentence splitting, tokenization, and lemmatization
  • Stopwords: Merged NLTK + sklearn stopword lists with custom extensions
  • Embeddings: Count vectors, TF-IDF, and PMI (via chronowords) — all corpus-internal
  • Co-occurrence graphs: Skip-gram window co-occurrence with collocation detection
  • Semantic graphs: Cosine similarity graphs from any embedder
  • Backbone extraction: Disparity filter for statistically significant edges
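As a rough illustration of the disparity filter named in the last bullet, the sketch below scores each edge of an undirected weighted graph and keeps the significant ones. It follows the standard formulation, alpha = (1 - w/s)^(k - 1) for an edge of weight w at a node with strength s and degree k, and uses only the standard library. The helper names `disparity_alphas` and `backbone_edges` and the fixed `alpha` cutoff are assumptions made for this example; kenon's `extract_backbone` takes a percentile-style `min_alpha_ptile` parameter and may differ in its details.

```python
from collections import defaultdict


def disparity_alphas(edges):
    """Return a disparity-filter alpha for each undirected weighted edge.

    `edges` maps (u, v) pairs to positive weights.
    Hypothetical helper for illustration only; not part of kenon's API.
    """
    strength = defaultdict(float)  # sum of incident edge weights per node
    degree = defaultdict(int)      # number of incident edges per node
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def alpha_from(node, w):
        k = degree[node]
        if k < 2:
            # Assumption: a degree-1 endpoint cannot certify its only edge.
            return 1.0
        # Probability, under a null model that splits the node's strength
        # uniformly at random over its k edges, of a share at least w / s.
        return (1.0 - w / strength[node]) ** (k - 1)

    # Score each edge by its more favourable endpoint.
    return {
        (u, v): min(alpha_from(u, w), alpha_from(v, w))
        for (u, v), w in edges.items()
    }


def backbone_edges(edges, alpha=0.05):
    """Keep only the edges whose alpha falls below the significance level."""
    alphas = disparity_alphas(edges)
    return {edge: w for edge, w in edges.items() if alphas[edge] < alpha}


edges = {
    ("hub", "a"): 100.0,
    ("hub", "b"): 1.0,
    ("hub", "c"): 1.0,
    ("hub", "d"): 1.0,
}
print(backbone_edges(edges, alpha=0.05))
```

In this toy star graph only the heavy `hub`-`a` edge survives, because it carries far more of the hub's total weight than a uniform random split would predict.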

Documentation

See the docs/ directory for the full API reference and examples.

Made by

Kenon is made by Crow Intelligence.

License

MIT

Download files

Source Distribution

kenon-0.1.1.tar.gz (13.4 kB)

Uploaded Source

Built Distribution

kenon-0.1.1-py3-none-any.whl (16.7 kB)

Uploaded Python 3

File details

Details for the file kenon-0.1.1.tar.gz.

File metadata

  • Download URL: kenon-0.1.1.tar.gz
  • Upload date:
  • Size: 13.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kenon-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        8858f3a58e25205c39b9676f77f528416b554db4d0c995ef21fd4925bea693c8
MD5           9c9cda1d6ee08067bf953d47d72a574a
BLAKE2b-256   e1fb09e9e08c8a00fb4e5a6add7fe9cac9cd57135c776921a1eb24feec70b862
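
A published digest can be compared against a downloaded file using Python's standard library. The sketch below is only an illustration and is not part of kenon: the helper name `sha256_of` and the local file path are assumptions, and the expected value is the SHA256 digest listed above for the source distribution.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


EXPECTED = "8858f3a58e25205c39b9676f77f528416b554db4d0c995ef21fd4925bea693c8"
archive = Path("kenon-0.1.1.tar.gz")  # assumed local download location
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED else "MISMATCH")
```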

Provenance

The following attestation bundles were made for kenon-0.1.1.tar.gz:

Publisher: publish.yml on crow-intelligence/kenon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kenon-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: kenon-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 16.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kenon-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        b4eafcb2bd7bdcbde77478310545cac919c45b079da0d7273192aa1e70d377fc
MD5           4c97f6713b159b87e869aec2343aa72b
BLAKE2b-256   a0697882ebb1c304f39f7f346295e63554da31cc658ea4ed603e18e29b5dbcba

Provenance

The following attestation bundles were made for kenon-0.1.1-py3-none-any.whl:

Publisher: publish.yml on crow-intelligence/kenon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
