Knot recognition and Gauss/PD extraction from images.

Knot Recognition

Abstract

This project provides a scientific pipeline for knot recognition from images. It combines a ResNet-based CNN classifier with a structured, heuristic Gauss/PD extractor operating on skeletonized drawings. The repository is organized to support reproducible experiments, clear documentation, and future extensions.

Installation

pip install knot-recognition

Quickstart

knot --image /path/to/image.png --checkpoint ./checkpoints/best.pth --mapping mapping_example.csv

Optional symmetry-invariant feature extraction:

knot --image /path/to/image.png --checkpoint ./checkpoints/best.pth --mapping mapping_example.csv --features

Force a device:

knot --image /path/to/image.png --checkpoint ./checkpoints/best.pth --device cpu

Write a moves overlay figure with knot-moves:

knot-moves --image /path/to/image.png --overlay results/figures/moves_overlay.png

Diagram reducer + classifier (solver):

knot-solve --image /path/to/image.png --checkpoint ./checkpoints/best.pth --mapping mapping_example.csv

Training:

python -m knot_recognition.train --data-dir /path/to/data --outdir ./checkpoints --epochs 20 --batch 32 --lr 1e-3

Training on a specific device:

python -m knot_recognition.train --data-dir /path/to/data --device cuda

Protein Knot Pipeline (Stages 1–4)

Stage 1: Extract Cα backbones (KnotProt chains):

PYTHONPATH=./src python scripts/extract_knotprot_stage1.py \
  --pdb-dir data/knotprot/pdb \
  --out data/knotprot/backbones.npz \
  --manifest data/knotprot/backbones.csv

Stage 2: Projection + crossing detection

  • sample_viewpoints, project_polyline, detect_crossings
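To make Stage 2 concrete, here is a minimal pure-Python sketch of orthographic projection followed by segment-intersection crossing detection. It is not the package's implementation: `project_sketch` and `crossings_sketch` are hypothetical names, and the depth convention (a larger value along the view direction counts as "over") is an assumption made for illustration.

```python
import math


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def project_sketch(points, view_dir):
    """Project 3D points onto the plane perpendicular to view_dir.

    Returns (xy, depth): 2D coordinates in an orthonormal basis of that
    plane, and the signed distance of each point along view_dir.
    """
    w = _unit(view_dir)
    # Pick a helper axis that is not (nearly) parallel to the view direction.
    helper = (1.0, 0.0, 0.0) if abs(w[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _unit(_cross(helper, w))
    v = _cross(w, u)
    xy = [(_dot(p, u), _dot(p, v)) for p in points]
    depth = [_dot(p, w) for p in points]
    return xy, depth


def crossings_sketch(xy, depth, eps=1e-12):
    """Find proper intersections between non-adjacent polyline segments.

    Each crossing is (i + t, j + s, first_is_over): the arc parameters of
    the two visits and whether the earlier strand has the larger depth
    (assumed here to mean it passes over).
    """
    found = []
    n_seg = len(xy) - 1
    for i in range(n_seg):
        px, py = xy[i]
        rx, ry = xy[i + 1][0] - px, xy[i + 1][1] - py
        for j in range(i + 2, n_seg):  # skip the adjacent segment
            qx, qy = xy[j]
            ex, ey = xy[j + 1][0] - qx, xy[j + 1][1] - qy
            denom = rx * ey - ry * ex
            if abs(denom) < eps:  # parallel or degenerate segments
                continue
            t = ((qx - px) * ey - (qy - py) * ex) / denom
            s = ((qx - px) * ry - (qy - py) * rx) / denom
            if 0.0 < t < 1.0 and 0.0 < s < 1.0:
                zi = depth[i] + t * (depth[i + 1] - depth[i])
                zj = depth[j] + s * (depth[j + 1] - depth[j])
                found.append((i + t, j + s, zi > zj))
    return found


chain = [(0, 0, 0), (2, 0, 0), (1, 1, 1), (1, -1, 1)]
xy, depth = project_sketch(chain, (0, 0, 1))
crossings = crossings_sketch(xy, depth)
```

The pairwise loop is quadratic in the number of segments, which is one plausible reason for options such as --stride and --max-points in the Stage 4 command.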

Stage 3: Gauss code from crossings

  • gauss_code_from_crossings
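As an illustration of Stage 3, the sketch below builds a signed Gauss code from a list of crossings. The input format is an assumption: each crossing is a tuple `(param_a, param_b, a_is_over)` giving the two arc parameters at which the curve visits the crossing and whether the first visit is the over-strand. The function name `gauss_code_sketch` is hypothetical, and the sign convention (positive for over, negative for under) is chosen to match the signed integers in the mapping CSV example, not confirmed against `gauss_code_from_crossings`.

```python
def gauss_code_sketch(crossings):
    """Return a signed Gauss code as a list of integers.

    Crossings are labelled 1, 2, 3, ... in the order they are first met
    while walking along the curve. An over-pass is written +label and an
    under-pass is written -label.
    """
    visits = []
    for idx, (param_a, param_b, a_is_over) in enumerate(crossings):
        visits.append((param_a, idx, a_is_over))
        visits.append((param_b, idx, not a_is_over))
    visits.sort(key=lambda visit: visit[0])  # walk the curve in order

    labels = {}
    code = []
    for _, idx, is_over in visits:
        label = labels.setdefault(idx, len(labels) + 1)
        code.append(label if is_over else -label)
    return code


# Three crossings with alternating over/under passes.
example = [(0.5, 3.5, True), (1.5, 4.5, False), (2.5, 5.5, True)]
code = gauss_code_sketch(example)
```

Every label appears exactly twice in the result, once with each sign, which is a cheap sanity check to run on extractor output.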

Stage 4: Hybrid ML classifier (projection image + Gauss embedding):

PYTHONPATH=./src python scripts/build_knotprot_hybrid_dataset.py \
  --viewpoints 32 --limit 20 --offset 0 --stride 3 --max-points 300 \
  --out data/knotprot/hybrid_dataset_part1.npz \
  --manifest data/knotprot/hybrid_manifest_part1.csv

PYTHONPATH=./src python scripts/merge_hybrid_parts.py \
  --out data/knotprot/hybrid_dataset.npz

python scripts/train_hybrid_classifier.py \
  --data data/knotprot/hybrid_dataset.npz \
  --out checkpoints/hybrid_classifier.pth \
  --epochs 2 --batch 64 --lr 1e-3

Project Structure

  • src/knot_recognition/: Core Python package (models, dataset, preprocessing, inference, Gauss/PD extraction).
  • docs/: Methods and reproducibility notes.
  • notebooks/: Exploratory analysis and ablations.
  • scripts/: Experiment drivers and automation helpers.
  • data/: Reserved for datasets and processed artifacts.
  • results/: Reserved for experiment outputs and figures.
  • raw_knot/: Legacy dataset location (kept for compatibility).
  • outputs/: Legacy outputs location (kept for compatibility).
  • tests/: Synthetic tests for Gauss/PD extraction.

Data Format

Folder-structured dataset:

data_root/
  3_1/
  4_1/

Each subfolder is a class label and contains images.
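The following sketch shows one way to turn such a folder tree into (path, label index) pairs using only the standard library. The function name `scan_dataset`, the set of accepted extensions, and the alphabetical label ordering are assumptions for illustration; the package's own dataset class may behave differently.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the real data.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def scan_dataset(data_root):
    """Map each class subfolder to an index and list its image files."""
    root = Path(data_root)
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: i for i, name in enumerate(classes)}

    samples = []
    for name in classes:
        for path in sorted((root / name).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
                samples.append((path, class_to_idx[name]))
    return class_to_idx, samples
```

Sorting both the class names and the files keeps the label indices and sample order stable across runs, which helps when comparing checkpoints.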

Methods (Summary)

Classifier (ResNet-based CNN):

get_resnet(num_classes=1000, pretrained=True, model_name="resnet18", freeze_backbone=False)

Gauss/PD extractor (heuristic, operates on a skeletonized drawing):

  • Skeleton graph -> spur pruning -> junction clustering -> graph simplification
  • Edge pairing at crossings -> curve traversal -> PD construction
  • Entry point: extract_gauss_code(skel, img_gray=None, cfg=None, return_debug=False)
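To show why junction clustering appears in the steps above, here is a small sketch of a common first step in building a skeleton graph: classifying skeleton pixels by their number of 8-connected neighbours. It is an illustration under assumptions, not the extractor's actual code; `classify_skeleton_pixels` is a hypothetical name and the skeleton is represented as a plain list of 0/1 rows.

```python
def classify_skeleton_pixels(skel):
    """Return (endpoints, junctions) as lists of (row, col) pixels.

    A skeleton pixel with exactly one 8-connected neighbour is treated
    as an endpoint; one with three or more is a junction candidate.
    """
    height = len(skel)
    width = len(skel[0]) if height else 0
    endpoints, junctions = [], []
    for r in range(height):
        for c in range(width):
            if not skel[r][c]:
                continue
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width and skel[rr][cc]:
                        neighbours += 1
            if neighbours == 1:
                endpoints.append((r, c))
            elif neighbours >= 3:
                junctions.append((r, c))
    return endpoints, junctions


# A plus-shaped skeleton: one horizontal and one vertical stroke.
plus = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
endpoints, junctions = classify_skeleton_pixels(plus)
```

On this input the single crossing produces five adjacent junction candidates (the centre pixel and its four stroke neighbours), so nearby candidates have to be merged into one node before edges can be paired.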

Mapping CSV Schema

mapping_example.csv:

label,pd_code,gauss_code
3_1,"PD[ [1,2],[3,4] ]","1 -2 3"
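A mapping file with this schema can be read with the standard `csv` module, which handles the quoted fields containing commas. The sketch below is illustrative only: `load_mapping` is a hypothetical helper, and parsing the Gauss code as whitespace-separated signed integers is an assumption based on the example row.

```python
import csv
import io


def load_mapping(fileobj):
    """Read label -> {pd_code, gauss_code} from a CSV file object."""
    mapping = {}
    for row in csv.DictReader(fileobj):
        mapping[row["label"]] = {
            # Kept as the raw string; its grammar is not specified here.
            "pd_code": row["pd_code"],
            # Assumed format: signed integers separated by whitespace.
            "gauss_code": [int(tok) for tok in row["gauss_code"].split()],
        }
    return mapping


sample = io.StringIO(
    'label,pd_code,gauss_code\n'
    '3_1,"PD[ [1,2],[3,4] ]","1 -2 3"\n'
)
mapping = load_mapping(sample)
```

For a file on disk, open it with `newline=""` as the `csv` documentation recommends, for example `open("mapping_example.csv", newline="")`.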

Reproducibility

  • Documented environment and workflow notes are in docs/reproducibility.md.
  • Scientific documentation is in docs/scientific.md.
  • Use a clean virtual environment and pinned versions for formal experiments.

Tests

pytest -q

Citation

See CITATION.cff for citation metadata.

Usage

See USAGE.md for end-to-end examples.

Known Limitations

  • Chirality detection is heuristic and depends on how flips affect CNN confidence.
  • Gauss/PD extractor assumes clean, high-contrast drawings.
  • Over/under (sign) inference is not reliable from skeletons alone.
