Aggregate build for the nl_processing multi-package repository
nl_processing
Dutch language processing toolkit organized as a multi-package Python repository.
Install
pip install nl_processing
The published nl_processing package is the aggregate build from the repo root. Day-to-day development happens inside the package folders under packages/.
Repository Layout
packages/
  core/
  extract_text_from_image/
  extract_words_from_text/
  translate_text/
  translate_word/
  database/
  database_cache/
  sampling/
docs/
pyproject.toml   # aggregate build for the published nl_processing package
Makefile         # repo-wide lint/test entrypoint
Each package has its own:
- pyproject.toml
- ruff.toml
- pytest.ini
- tests/
- docs/
Modules
| Module | Class | Description | Docs |
|---|---|---|---|
| `core` | N/A | Shared models, ports, exceptions, and prompt helpers | docs |
| `extract_text_from_image` | `ImageTextExtractor` | Extract Dutch text from images via Vision API | docs |
| `extract_words_from_text` | `WordExtractor` | Extract and normalize words from markdown text | docs |
| `translate_text` | `TextTranslator` | Translate text (NL -> RU) with markdown preservation | docs |
| `translate_word` | `WordTranslator` | Batch-translate words (NL -> RU) | docs |
| `database` | `DatabaseService` | Remote source of truth and default progress/sync provider | docs |
| `database_cache` | `DatabaseCacheService` | Local-first SQLite cache with injectable remote progress sync | docs |
| `sampling` | `WordSampler` | Weighted word sampling over any compatible scored-pair provider | docs |
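
To make the `sampling` row more concrete, here is a minimal sketch of weighted sampling over a scored-pair provider. It does not use the real `WordSampler` API, which this README does not show: `ScoredPair`, `ScoredPairProvider`, `InMemoryProvider`, `sample_words`, and the weighting rule (lower score means higher weight) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional, Protocol, Sequence


@dataclass(frozen=True)
class ScoredPair:
    """Hypothetical word pair with a progress score (higher = better known)."""

    source: str
    target: str
    score: int


class ScoredPairProvider(Protocol):
    """Hypothetical provider contract: anything that can list scored pairs."""

    def get_scored_pairs(self) -> Sequence[ScoredPair]: ...


class InMemoryProvider:
    """Toy provider backed by a list; stands in for a database-backed one."""

    def __init__(self, pairs: Sequence[ScoredPair]) -> None:
        self._pairs = list(pairs)

    def get_scored_pairs(self) -> Sequence[ScoredPair]:
        return self._pairs


def sample_words(
    provider: ScoredPairProvider,
    k: int,
    rng: Optional[random.Random] = None,
) -> list:
    """Draw k pairs with replacement, favouring pairs with lower scores."""
    rng = rng or random.Random()
    pairs = list(provider.get_scored_pairs())
    if not pairs:
        return []
    # Assumed weighting rule: the lowest-scored pair gets the largest weight,
    # and every pair keeps a weight of at least 1 so it can still be drawn.
    max_score = max(p.score for p in pairs)
    weights = [max_score - p.score + 1 for p in pairs]
    return rng.choices(pairs, weights=weights, k=k)
```

Because the function depends only on the provider protocol, any object exposing `get_scored_pairs()` can be passed in, which is the kind of decoupling the table describes. Note that `random.choices` samples with replacement, so the same pair may appear more than once.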
Development
Work inside one package when you only touch one module:
cd packages/translate_word
uv sync --all-groups
uv run pytest tests/unit
Run the repo-wide quality gate from the root:
make check
Useful package-local examples:
cd packages/core
uv run pytest tests/unit/core
cd packages/database
doppler run -- uv run pytest tests/integration/database
Dependency Rule
Modules are independent packages. Cross-module dependencies must be explicit in the consuming package's pyproject.toml.
Shared cross-module storage contracts live in nl_processing.core.ports. database and database_cache are concrete implementations and adapters, not the owners of those shared interfaces.
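
The ownership rule above can be sketched with `typing.Protocol`. This is an illustration only, not the actual contents of `nl_processing.core.ports`: `ProgressStore`, `RemoteStore`, `CachedStore`, and `bump` are hypothetical names, and the write-through behaviour is a simplification chosen to keep the example short.

```python
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class ProgressStore(Protocol):
    """Hypothetical shared contract, playing the role of a port in core."""

    def get_score(self, word: str) -> int: ...

    def set_score(self, word: str, score: int) -> None: ...


class RemoteStore:
    """Stand-in for a remote implementation (the role `database` plays)."""

    def __init__(self) -> None:
        self._scores: Dict[str, int] = {}

    def get_score(self, word: str) -> int:
        return self._scores.get(word, 0)

    def set_score(self, word: str, score: int) -> None:
        self._scores[word] = score


class CachedStore:
    """Stand-in for a local-first adapter (the role `database_cache` plays)."""

    def __init__(self, remote: ProgressStore) -> None:
        self._remote = remote  # injected; typed against the port only
        self._local: Dict[str, int] = {}

    def get_score(self, word: str) -> int:
        # Read from the local copy, falling back to the remote on a miss.
        if word not in self._local:
            self._local[word] = self._remote.get_score(word)
        return self._local[word]

    def set_score(self, word: str, score: int) -> None:
        # Simplified write-through: update local state, then the remote.
        self._local[word] = score
        self._remote.set_score(word, score)


def bump(store: ProgressStore, word: str) -> int:
    """Consumer code: depends on the port, not on either implementation."""
    new_score = store.get_score(word) + 1
    store.set_score(word, new_score)
    return new_score
```

Neither implementation imports the other, and the consumer `bump` works with both, which is the point of keeping the interface in a shared module rather than in one of the storage packages.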
One intentional design change in this layout: database no longer imports translate_word directly. If you want automatic translation on add_words(), compose it explicitly:
from nl_processing.core.models import Language
from nl_processing.database.service import DatabaseService
from nl_processing.translate_word.service import WordTranslator

# Inject the translator explicitly; database no longer imports translate_word.
db = DatabaseService(
    user_id="alex",
    translator=WordTranslator(
        source_language=Language.NL,
        target_language=Language.RU,
    ),
)
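
For readers who want to see why explicit composition keeps the packages independent, the following self-contained sketch mimics the pattern with stand-in classes. It is not the real `DatabaseService` implementation: `WordStore`, `FakeTranslator`, the `translate()` method, and the `rows` attribute are hypothetical, and the behaviour without a translator (storing `None`) is an assumption.

```python
from typing import Dict, List, Optional, Protocol


class Translator(Protocol):
    """Hypothetical minimal translator contract used by the store."""

    def translate(self, words: List[str]) -> List[str]: ...


class FakeTranslator:
    """Stand-in for a real translator; looks words up in a fixed table."""

    def __init__(self, table: Dict[str, str]) -> None:
        self._table = table

    def translate(self, words: List[str]) -> List[str]:
        # Unknown words map to an empty string in this toy version.
        return [self._table.get(word, "") for word in words]


class WordStore:
    """Stand-in for a storage service with an optional injected translator."""

    def __init__(self, user_id: str, translator: Optional[Translator] = None) -> None:
        self.user_id = user_id
        self._translator = translator
        self.rows: Dict[str, Optional[str]] = {}

    def add_words(self, words: List[str]) -> None:
        # Translate only when a translator was composed in by the caller.
        if self._translator is not None:
            translations: List[Optional[str]] = list(self._translator.translate(words))
        else:
            translations = [None] * len(words)
        for word, translation in zip(words, translations):
            self.rows[word] = translation
```

The store never imports a concrete translator, so the dependency exists only in the code that wires the two together.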
Docs
- Repository module spec: docs/module-spec.md
- Environment variables: docs/ENV_VARS.md
- Release workflow: docs/REALEASE_WORKFLOW.md
File details
Details for the file nl_processing-1.0.2.tar.gz.
File metadata
- Download URL: nl_processing-1.0.2.tar.gz
- Upload date:
- Size: 2.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d17b9f88e065ecea4041c84525186e28a922686301436bbb832d06cac6a6993b |
| MD5 | ac174a0c5edb0e79870232ab6235c0c1 |
| BLAKE2b-256 | a0d363bf9bb47f985bc676c580156c4d168c4b8817de70f1608315188377e67a |
File details
Details for the file nl_processing-1.0.2-py3-none-any.whl.
File metadata
- Download URL: nl_processing-1.0.2-py3-none-any.whl
- Upload date:
- Size: 2.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 47e3be2f1a2f19af843d25df221dddf5c8eecd2f01ce17e9a01471daa1fdf340 |
| MD5 | 9fb369f5116ab021e159d113d459aebc |
| BLAKE2b-256 | 29d366ce9e6c6c67ffb8293a18b4d18079fc25bfb70c1ecd9f94fcecc7b435e2 |