Read marked paper forms (bubble sheets, surveys, checklists, exams) into structured data — locally, no cloud.
marksense
Read marked paper forms — bubble sheets, surveys, checklists, exams — into structured data (JSON/CSV). Runs locally: no cloud, no account, no telemetry.
Point marksense at a scan or phone photo of a filled form plus a template you define once, and it returns every answer with a confidence score:
marksense read examples/samples/quiz_filled_01.png -t examples/templates/quiz.json
{
  "form_type": "quiz",
  "source": "examples/samples/quiz_filled_01.png",
  "answers": {
    "Q1": "A",
    "Q2": "D",
    "Q3": "D"
  }
}
(output truncated — the full result carries all 20 answers plus per-question confidence, multi-mark flags, and per-page alignment confidence)
Why marksense
- Any layout. Layout knowledge lives in a template JSON, not in code — checkboxes, bubbles, grids, multi-page forms, mixed mark types. Adding a form means writing JSON, never code.
- Robust to real-world scans. Every page is aligned onto the blank template (feature matching + ECC refinement) before detection, so skewed scans and phone photos read correctly.
- Model-free by default, learned detection when you want it. A pixel-density detector works out of the box with zero downloads; a small ONNX mark-detection model can be plugged in (--model/--download) for harder real-world scans.
- Lean runtime. onnxruntime, OpenCV, NumPy, PyMuPDF. No PyTorch, no GPU needed.
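To make the "pixel-density detector" idea concrete, here is a minimal sketch of how such a check could work on an already aligned grayscale page. It is an illustration only, not marksense's actual implementation: the function names, the ROI format `(x, y, width, height)`, and both thresholds are assumptions chosen for the example.

```python
import numpy as np


def fill_ratio(gray, roi, dark_threshold=128):
    """Return the fraction of pixels inside roi that are darker than dark_threshold.

    gray: 2-D uint8 array (0 = black, 255 = white).
    roi:  (x, y, width, height) in pixel coordinates (hypothetical format).
    """
    x, y, w, h = roi
    patch = gray[y:y + h, x:x + w]
    if patch.size == 0:
        raise ValueError("ROI lies outside the image")
    return float((patch < dark_threshold).mean())


def is_marked(gray, roi, min_fill=0.35):
    """Treat an option as marked when enough of its ROI is dark (assumed cutoff)."""
    return fill_ratio(gray, roi) >= min_fill


# Synthetic page: white background with one fully filled 20x20 bubble.
page = np.full((100, 200), 255, dtype=np.uint8)
page[20:40, 30:50] = 0

filled_roi = (30, 20, 20, 20)
empty_roi = (100, 20, 20, 20)

print(is_marked(page, filled_roi), is_marked(page, empty_roi))
```

A density check like this depends on the page being aligned first, since the ROIs are fixed template coordinates.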
Install
pip install marksense
Quickstart
The repository ships a self-contained synthetic demo (generated by
examples/generate_samples.py):
# Read one form -> JSON on stdout
marksense read examples/samples/quiz_filled_01.png -t examples/templates/quiz.json
# A whole stack -> one CSV row per form
marksense batch examples/samples/ -t examples/templates/survey.json -o results.csv
# Check a template you are authoring
marksense template validate examples/templates/quiz.json
Or from Python:
from marksense import read_form
result = read_form("scan.jpg", template="my-form.json")
result.answers # {"Q1": "3", ...}
result.confidence # per-question confidence
result.multi_marked # questions with more than one mark (review these)
result.to_csv() # question,answer,confidence,flags
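One way to use these fields is to collect the questions that deserve a human look. The sketch below works on plain Python values so it runs without marksense installed. It assumes `confidence` behaves like a dict of question to float and `multi_marked` like a list of question ids; the helper name and the 0.8 cutoff are hypothetical, and the sample values are made up.

```python
def questions_to_review(answers, confidence, multi_marked, min_confidence=0.8):
    """Return question ids to double-check, in the order they appear in answers."""
    flagged = set(multi_marked)
    review = []
    for question in answers:
        # A missing confidence entry is treated as 0.0, i.e. always reviewed.
        score = confidence.get(question, 0.0)
        if question in flagged or score < min_confidence:
            review.append(question)
    return review


# Made-up values standing in for result.answers, result.confidence
# and result.multi_marked.
answers = {"Q1": "A", "Q2": "D", "Q3": "D"}
confidence = {"Q1": 0.97, "Q2": 0.55, "Q3": 0.91}
multi_marked = ["Q3"]

print(questions_to_review(answers, confidence, multi_marked))  # ['Q2', 'Q3']
```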
Run as a service
pip install "marksense[service]"
marksense serve # http://127.0.0.1:8000, bundled demo templates
marksense serve --templates-dir ./my-templates --port 9000
Or with Docker:
docker build -t marksense .
docker run -p 8000:8000 -v ./my-templates:/templates marksense
Endpoints: GET /health, GET /templates, POST /read?form_type=<name> (multipart file).
Interactive docs at /docs.
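A small client for the POST /read endpoint might look like the sketch below. The multipart field name `file`, the JSON response body, and both function names are assumptions; check the interactive docs at /docs for the real request shape. It uses the third-party `requests` package, which is not listed among marksense's dependencies.

```python
from urllib.parse import urlencode


def build_read_url(base_url, form_type):
    """Build the /read URL with the form_type query parameter."""
    query = urlencode({"form_type": form_type})
    return f"{base_url.rstrip('/')}/read?{query}"


def read_via_service(image_path, form_type, base_url="http://127.0.0.1:8000"):
    """Upload one scan to a running `marksense serve` instance and return the parsed body."""
    import requests  # third-party: pip install requests

    with open(image_path, "rb") as handle:
        response = requests.post(
            build_read_url(base_url, form_type),
            files={"file": handle},  # field name "file" is an assumption
            timeout=30,
        )
    response.raise_for_status()
    return response.json()  # assumes the service replies with JSON


print(build_read_url("http://127.0.0.1:8000", "quiz"))
```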
Reading your own forms
- Get a clean image of the blank form (render the PDF or scan an empty copy).
- Write a template JSON describing where each option is — see the template authoring guide.
- Run marksense template validate my-form.json, then marksense read.
How it works
input (PDF / image)
└─ render pages ──> align to template ──> detect marks ────> map to answers
                    (ORB → SIFT → ECC)    (ONNX YOLO or      (nearest ROI +
                                           density fallback)  confidence)
The detector only knows two things: what a check looks like and what a circle looks like. All form-specific knowledge — page sizes, question positions, answer values — is declarative template JSON. That separation is what makes the engine general.
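The "map to answers" step can be pictured as a nearest-neighbour lookup from detected mark centres to template ROIs. The sketch below is illustrative only: the template fragment, its field names, and the distance cutoff are hypothetical and do not reflect marksense's real template schema.

```python
import math

# Hypothetical template fragment: option centres in template pixel coordinates.
TEMPLATE_ROIS = [
    {"question": "Q1", "value": "A", "center": (100, 200)},
    {"question": "Q1", "value": "B", "center": (150, 200)},
    {"question": "Q2", "value": "A", "center": (100, 250)},
    {"question": "Q2", "value": "B", "center": (150, 250)},
]


def map_marks_to_answers(mark_centers, rois, max_distance=20.0):
    """Assign each detected mark to its nearest ROI; ignore marks far from every ROI."""
    answers = {}
    for mark in mark_centers:
        nearest = min(rois, key=lambda roi: math.dist(mark, roi["center"]))
        if math.dist(mark, nearest["center"]) > max_distance:
            continue  # stray mark, e.g. a doodle in the margin
        answers.setdefault(nearest["question"], []).append(nearest["value"])
    return answers


# Two marks close to real options plus one stray mark.
marks = [(102, 198), (149, 252), (400, 400)]
print(map_marks_to_answers(marks, TEMPLATE_ROIS))  # {'Q1': ['A'], 'Q2': ['B']}
```

Keeping the detector output as bare coordinates and the form knowledge in data is what lets the same function serve any layout. A question whose list holds more than one value would be a candidate for a multi-mark flag.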
v0.1 uses the density detector by default — no downloads, fully offline. Learned-model weights
trained on public datasets ship in an upcoming release; pass --download /
auto_download=True to fetch them once published (cached under ~/.marksense/models/), or
--model path/to/weights.onnx to use your own.
Roadmap
- Self-hosted REST service + Docker image (shipped in v0.2)
- Published accuracy benchmarks
- Clean-provenance model weights (training pipeline and guide: docs/training.md)
- Template authoring helpers (auto-detect form regions)
Development
git clone https://github.com/RoyAbra27/marksense
cd marksense
uv venv && uv pip install -e ".[dev]"
pytest # full suite runs with no model file and no network
ruff check .
Design docs live in docs/design/; start with
0001-marksense-v1.
License
MIT — see LICENSE.