datannurpy

Python library for datannur catalog metadata management.

Supported formats

A lightweight catalog compatible with most data sources:

Category	Formats
Spreadsheets	CSV, Excel (.xlsx, .xls)
Columnar	Parquet, Delta Lake, Apache Iceberg, Hive partitioned
Statistical	SAS (.sas7bdat), SPSS (.sav), Stata (.dta)
Databases	PostgreSQL, MySQL, Oracle, SQL Server, SQLite, DuckDB

All formats support automatic schema inference and statistics computation.

Installation

pip install datannurpy

Optional extras

# Databases
pip install datannurpy[postgres]  # PostgreSQL
pip install datannurpy[mysql]     # MySQL
pip install datannurpy[oracle]    # Oracle
pip install datannurpy[mssql]     # SQL Server

# File formats
pip install datannurpy[stat]      # SAS, SPSS, Stata
pip install datannurpy[delta]     # Delta Lake metadata extraction
pip install datannurpy[iceberg]   # Apache Iceberg metadata extraction

# Cloud storage
pip install datannurpy[s3]        # Amazon S3
pip install datannurpy[azure]     # Azure Blob Storage
pip install datannurpy[gcs]       # Google Cloud Storage
pip install datannurpy[cloud]     # All cloud providers

# Multiple extras
pip install datannurpy[postgres,stat,delta]

Note: The iceberg, s3, azure, gcs, and cloud extras require Python 3.10+.

SQL Server note: Requires an ODBC driver on the system:

macOS: brew install unixodbc freetds
Linux: apt install unixodbc-dev tdsodbc
Windows: Microsoft ODBC Driver

Quick start

# catalog.yml
app_path: ./my-catalog
open_browser: true

add:
  - folder: ./data
    include: ["*.csv", "*.xlsx", "*.parquet"]

  - database: sqlite:///mydb.sqlite

python -m datannurpy catalog.yml

Or use the Python API:

from datannurpy import Catalog

catalog = Catalog()
catalog.add_folder("./data", include=["*.csv", "*.xlsx", "*.parquet"])
catalog.add_database("sqlite:///mydb.sqlite")
catalog.export_app("./my-catalog", open_browser=True)

CLI

python -m datannurpy catalog.yml
python -m datannurpy --help     # show usage
python -m datannurpy --version  # show version

Scan depth

The depth parameter controls how much metadata is extracted. Set it globally or per entry:

depth: variable                   # global default

add:
  - folder: ./data                # inherits "variable"

  - folder: ./big
    depth: stat                   # override for this entry

  - database: sqlite:///db.sqlite
    depth: dataset

Feature	`dataset`	`variable`	`stat`	`value` (default)
Folders	✓	✓	✓	✓
Datasets (format, path, mtime)	✓	✓	✓	✓
Variables (names, types)		✓	✓	✓
DB introspection (PK, FK, comments)		✓	✓	✓
Row count, statistics			✓	✓
Modalities, frequencies, patterns				✓
Auto-tagging (format, security, text)				✓

Note: At depth="variable", CSV and Excel files yield column names only, because inferring their types requires reading the data (available from depth="stat"). All other formats report types at this level.

Typical use cases:

dataset — quick inventory of available files/tables without reading data
variable — lightweight schema discovery (column names and types)
stat — data profiling without modality detection (faster than value)
value — full catalog with frequency tables and modality assignment (default)
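The feature table above can be read as a "minimum depth per feature" lookup. The following is a minimal sketch of that reading in plain Python; the feature keys and the function name features_at are made up for illustration and are not part of datannurpy.

```python
# Ordered from shallowest to deepest, as in the table above.
DEPTHS = ["dataset", "variable", "stat", "value"]

# Minimum depth at which each feature becomes available (taken from the table).
# The key names are hypothetical labels chosen for this sketch.
FEATURE_MIN_DEPTH = {
    "folders": "dataset",
    "datasets": "dataset",
    "variables": "variable",
    "db_introspection": "variable",
    "row_count_statistics": "stat",
    "modalities_frequencies_patterns": "value",
    "auto_tagging": "value",
}


def features_at(depth: str) -> list[str]:
    """Return the features extracted at the given depth."""
    level = DEPTHS.index(depth)  # raises ValueError for an unknown depth
    return [
        name
        for name, minimum in FEATURE_MIN_DEPTH.items()
        if DEPTHS.index(minimum) <= level
    ]


if __name__ == "__main__":
    for depth in DEPTHS:
        print(f"{depth:>8}: {', '.join(features_at(depth))}")
```

Each deeper level is a superset of the previous one, which is why a single ordered list is enough to model it.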

Auto-tagging

At depth="value" (default), string columns are automatically tagged by content type. Tags use a two-level hierarchy under the auto parent:

Category	Tags
Format	`auto---email`, `auto---phone`, `auto---uuid`, `auto---iban`
Security	`auto---bcrypt`, `auto---argon2`, `auto---jwt`, `auto---secret`
Text	`auto---structured`, `auto---semi-structured`, `auto---free-text` → `auto---natural-text`

Each variable receives at most one leaf tag. The frontend can use parent_id to filter by category (e.g. selecting auto---security shows all bcrypt/argon2/jwt/secret variables).

Security-tagged columns (bcrypt, argon2, jwt, secret) have their raw frequency values suppressed — only pattern frequencies are emitted, so no actual secrets appear in the exported catalog.
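To make the idea concrete, here is a rough sketch of content-based tagging and of frequency suppression for security tags. It is not datannurpy's detection logic: the regular expressions, the 90% threshold, the shape() reduction, and the function names are all assumptions chosen for illustration, and every value is assumed to be a string.

```python
import re
from collections import Counter

# Illustrative patterns only: assumptions for this sketch, not datannurpy's rules.
PATTERNS = {
    "auto---email": re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}"),
    "auto---uuid": re.compile(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
    ),
    "auto---bcrypt": re.compile(r"\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}"),
}
SECURITY_TAGS = {"auto---bcrypt", "auto---argon2", "auto---jwt", "auto---secret"}


def tag_for(values, min_share=0.9):
    """Return the single leaf tag matched by at least min_share of values, or None."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for tag, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.fullmatch(v))
        if hits / len(non_empty) >= min_share:
            return tag
    return None


def shape(value):
    """Reduce a value to a coarse pattern: letters become 'a', digits become '9'."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "a", value))


def frequencies(values):
    """Count raw values, or only their shapes when the column is security-tagged."""
    if tag_for(values) in SECURITY_TAGS:
        return Counter(shape(v) for v in values)
    return Counter(values)
```

Returning at most one tag per column mirrors the "at most one leaf tag" rule; the shape-only counts mirror the statement that raw values of security-tagged columns never reach the export.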

Scanning files

add:
  # Scan a folder (CSV, Excel, SAS)
  - folder: ./data

  # With custom folder metadata
  - folder: ./data
    id: prod
    name: Production

  # With filtering options
  - folder: ./data
    include: ["*.csv", "*.xlsx"]
    exclude: ["**/tmp/**"]
    recursive: true
    csv_encoding: utf-8        # or cp1252, iso-8859-1 (auto-detected by default)

  # Multiple folders with shared options
  - folder: [./data/sales, ./data/hr]
    include: ["*.csv"]

  # A single file
  - dataset: ./data/sales.csv

  # Multiple files
  - dataset:
      - ./data/sales.csv
      - ./data/products.csv

Time series detection

When time_series: true (default), files with temporal patterns in their names are automatically grouped into a single dataset:

data/
├── enquete_2020.csv    ─┐
├── enquete_2021.csv     ├─→ Single dataset "enquete" with nb_resources=3
├── enquete_2022.csv    ─┘
└── reference.csv       ─→ Separate dataset "reference"

Detected patterns: year (2024), quarter (2024Q1, 2024T2), month (2024-03, 202403), date (2024-03-15).

The resulting dataset includes:

nb_resources: number of resources in the series
start_date / end_date: temporal coverage
Variables track their own start_date / end_date based on presence across periods

Set time_series: false to treat each file as a separate dataset.
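The grouping can be pictured as "strip the temporal token from the file name, then group by what is left". The sketch below follows that idea for the four pattern families listed above. The regular expression and the function name group_by_series are assumptions for illustration; the library's actual matching rules may differ.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

YEAR = r"(?:19|20)\d{2}"
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12]\d|3[01])"
# Longest alternatives first so "2024-03-15" is not read as the year "2024".
PERIOD = re.compile(
    rf"(?<!\d)(?:{YEAR}-{MONTH}-{DAY}|{YEAR}[QT][1-4]|{YEAR}-?{MONTH}|{YEAR})(?!\d)"
)


def group_by_series(filenames):
    """Map a dataset name to the sorted list of files that belong to it."""
    groups = defaultdict(list)
    for filename in filenames:
        stem = PurePosixPath(filename).stem
        match = PERIOD.search(stem)
        if match:
            # Drop the period and any separator left dangling around it.
            stem = (stem[: match.start()] + stem[match.end():]).strip("_-. ")
        groups[stem or filename].append(filename)
    return {name: sorted(files) for name, files in groups.items()}


if __name__ == "__main__":
    files = ["enquete_2020.csv", "enquete_2021.csv", "enquete_2022.csv", "reference.csv"]
    for name, members in group_by_series(files).items():
        print(name, len(members))
```

With the directory listing shown earlier, the three enquete files end up under one name and reference.csv stays on its own.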

Parquet formats

Supports simple Parquet files and partitioned datasets (Delta, Hive, Iceberg):

add:
  # add_folder auto-detects all formats
  - folder: ./data             # scans *.parquet + Delta/Hive/Iceberg directories

  # Single partitioned directory with metadata override
  - dataset: ./data/sales_delta
    name: Sales Data
    description: Monthly sales
    folder:
      id: sales
      name: Sales

With extras [delta] and [iceberg], metadata (name, description, column docs) is extracted when available.

Remote storage

Scan files on SFTP servers or cloud storage (S3, Azure, GCS):

env_file: .env               # SFTP_PASSWORD, AWS_KEY, AWS_SECRET, etc.

add:
  # SFTP (paramiko included by default)
  - folder: sftp://user@host/path/to/data
    storage_options:
      password: ${SFTP_PASSWORD}   # or key_filename: /path/to/key

  # Amazon S3 (requires: pip install datannurpy[s3])
  - folder: s3://my-bucket/data
    storage_options:
      key: ${AWS_KEY}
      secret: ${AWS_SECRET}

  # Azure Blob (requires: pip install datannurpy[azure])
  - folder: az://container/data
    storage_options:
      account_name: ${AZURE_ACCOUNT}
      account_key: ${AZURE_KEY}

  # Google Cloud Storage (requires: pip install datannurpy[gcs])
  - folder: gs://my-bucket/data
    storage_options:
      token: /path/to/credentials.json

  # Single remote file
  - dataset: s3://my-bucket/data/sales.parquet
    storage_options:
      key: ${AWS_KEY}
      secret: ${AWS_SECRET}

  # Remote SQLite / GeoPackage database
  - database: sftp://host/path/to/db.sqlite
    storage_options:
      key_filename: /path/to/key
  - database: s3://bucket/geodata.gpkg
    storage_options:
      key: ${AWS_KEY}
      secret: ${AWS_SECRET}

The storage_options dict is passed directly to fsspec. See each provider's documentation for the available options.

Scanning databases

add:
  # SQLite / GeoPackage
  - database: sqlite:///path/to/db.sqlite
  - database: sqlite:///path/to/geodata.gpkg

  # PostgreSQL / MySQL / Oracle / SQL Server
  - database: postgresql://user:pass@host:5432/mydb
  - database: mysql://user:pass@host:3306/mydb
  - database: oracle://user:pass@host:1521/service_name
  - database: mssql://user:pass@host:1433/mydb

  # SSL/TLS
  - database: postgresql://user:pass@host/db?sslmode=require

  # SQL Server with Windows auth (requires proper Kerberos setup)
  - database: mssql://host/db?TrustedConnection=yes

  # With options
  - database: postgresql://localhost/mydb
    schema: public
    include: ["sales_*"]
    exclude: ["*_tmp"]
    sample_size: 10000
    group_by_prefix: true       # group tables by common prefix (default)
    prefix_min_tables: 2        # minimum tables to form a group

  # Multiple schemas
  - database: postgresql://localhost/mydb
    schema: [public, sales, hr]

  # SSH tunnel (for databases behind a firewall)
  - database: mysql://user:pass@dbhost/mydb
    ssh_tunnel:
      host: ssh.example.com
      user: sshuser

  # SSH tunnel with more options
  - database: postgresql://user:pass@dbhost/mydb
    ssh_tunnel:
      host: bastion.example.com
      port: 2222
      user: admin
      key_file: ~/.ssh/id_rsa
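One way to picture group_by_prefix and prefix_min_tables is shown below. This sketch assumes that a "prefix" is the text before the first underscore in a table name, which the documentation does not state; group_tables is a hypothetical helper, not a datannurpy function.

```python
from collections import defaultdict


def group_tables(tables, prefix_min_tables=2):
    """Group table names by the text before their first underscore.

    Tables without an underscore, or whose prefix is shared by fewer than
    prefix_min_tables tables, are kept under the key None (ungrouped).
    """
    by_prefix = defaultdict(list)
    for table in tables:
        prefix, sep, _ = table.partition("_")
        by_prefix[prefix if sep else None].append(table)

    groups = {None: list(by_prefix.pop(None, []))}
    for prefix, members in by_prefix.items():
        if len(members) >= prefix_min_tables:
            groups[prefix] = members
        else:
            groups[None].extend(members)  # too few tables: leave ungrouped
    return groups


if __name__ == "__main__":
    print(group_tables(["sales_2023", "sales_2024", "hr_staff", "config"]))
```

Raising prefix_min_tables makes groups rarer, which keeps a schema with many one-off prefixes from turning into a folder per table.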

Connection string formats:

SQLite: sqlite:///path/to/db.sqlite or sftp://host/path/db.sqlite (remote)
PostgreSQL: postgresql://user:pass@host:5432/database
MySQL: mysql://user:pass@host:3306/database
Oracle: oracle://user:pass@host:1521/service_name
SQL Server: mssql://user:pass@host:1433/database
DuckDB: pass an ibis.duckdb.connect(...) backend directly (no connection string)
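The connection strings above follow the usual URL layout, so their parts can be inspected with the standard library before handing them to the catalog. This is only a sketch for checking a string by eye; describe is a hypothetical helper, and it assumes the common convention that the single slash after the host (or after sqlite://) is a separator rather than part of the database name.

```python
from urllib.parse import parse_qs, urlsplit


def describe(connection: str) -> dict:
    """Split a connection string into the parts shown in the list above."""
    parts = urlsplit(connection)
    return {
        "backend": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # Drop the separator slash: a database name, or a file path for SQLite.
        "database": parts.path.removeprefix("/"),
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    for url in (
        "postgresql://user:pass@host:5432/mydb",
        "sqlite:///mydb.sqlite",
        "postgresql://user:pass@host/db?sslmode=require",
    ):
        print(describe(url))
```

Query-string options such as sslmode=require end up in the options entry, which makes typos in option names easy to spot.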

Database metadata enrichment (requires depth: variable or higher):

Metadata	Target field	Backends
Primary keys	`Variable.key`	All 6
Foreign keys	`Variable.fk_var_id`	All 6
Table/column comments	`description`	All except SQLite
NOT NULL, UNIQUE, INDEX	Auto tags (`db---*`)	All 6
Auto-increment	Auto tag	All 6

This metadata is always refreshed, even when table data is unchanged (cache hit).

Sampling

By default, sample_size is 100000. All entries inherit this value. Override per entry, or set null to disable:

sample_size: 100000               # default

add:
  - folder: ./data                # inherits 100000

  - folder: ./small
    sample_size: null             # no sampling

  - database: postgresql://localhost/mydb
    sample_size: 50000            # override

To disable sampling globally:

sample_size: null

When a dataset has more rows than sample_size, a uniform random sample is used for frequency counts and modality detection. All other statistics (nb_row, nb_missing, nb_distinct, min, max, mean, std) are computed on the full dataset.

The actual number of sampled rows is recorded in Dataset.sample_size (null when no sampling was applied).
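The split between full-data statistics and sampled frequencies can be sketched as follows. This is a simplified in-memory illustration, not the library's implementation: profile is a hypothetical function, the column is a plain list, and at least one non-missing numeric value is assumed.

```python
import random
import statistics


def profile(values, sample_size=100_000, seed=0):
    """Full-data statistics plus frequencies from a uniform random sample."""
    present = [v for v in values if v is not None]
    stats = {
        "nb_row": len(values),
        "nb_missing": len(values) - len(present),
        "nb_distinct": len(set(present)),
        "min": min(present),
        "max": max(present),
        "mean": statistics.fmean(present),
    }
    if sample_size is None or len(values) <= sample_size:
        sample, recorded = values, None  # no sampling applied
    else:
        sample = random.Random(seed).sample(values, sample_size)
        recorded = sample_size
    freq = {}
    for v in sample:
        freq[v] = freq.get(v, 0) + 1
    stats["sample_size"] = recorded
    return stats, freq


if __name__ == "__main__":
    stats, freq = profile(list(range(1000)), sample_size=100)
    print(stats["nb_row"], stats["sample_size"], sum(freq.values()))
```

The exact statistics stay exact regardless of sample_size; only the frequency table becomes an estimate once sampling kicks in.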

CSV options

Skip the temporary UTF-8 copy when files are already local and UTF-8 encoded (the scanner falls back automatically if encoding detection fails):

csv_skip_copy: true

Manual metadata

add:
  # Load from a folder containing metadata files
  - metadata: ./metadata

  # Load from a database
  - metadata: sqlite:///metadata.db

Manual metadata can be used alone or combined with auto-scanned metadata (add_folder, add_database).

Expected structure: One file/table per entity, named after the entity type:

metadata/
├── variable.csv      # Variables (descriptions, tags...)
├── dataset.xlsx      # Datasets
├── institution.json  # Institutions (owners, managers)
├── tag.csv           # Tags
├── modality.csv      # Modalities
├── value.csv         # Modality values
└── ...

Supported formats: CSV, Excel (.xlsx), JSON, SAS (.sas7bdat), or database tables.

File format: Standard tabular structure following datannur schemas.

# variable.csv
id,description,tag_ids
source---employees_csv---salary,"Monthly gross salary in euros","finance,hr"
source---employees_csv---department,"Department code","hr"

Merge behavior:

Existing entities are updated (manual values override auto-scanned values)
New entities are created
List fields (tag_ids, doc_ids, etc.) are merged

Ordering: In YAML, add_metadata is automatically processed last regardless of declaration order. In Python, call add_metadata after add_folder, add_dataset, and add_database so manual values take precedence.
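The three merge rules can be illustrated with plain dictionaries. The sketch below assumes that "merged" for list fields means "union without duplicates, existing items first", which the documentation does not spell out; merge_entities is a hypothetical helper and the rows are assumed to be already parsed into Python values.

```python
def merge_entities(scanned, manual):
    """Merge manual rows into scanned rows, both keyed by entity id."""
    merged = {entity_id: dict(fields) for entity_id, fields in scanned.items()}
    for entity_id, fields in manual.items():
        target = merged.setdefault(entity_id, {})  # new entities are created
        for key, value in fields.items():
            if isinstance(value, list):
                # List fields are merged: keep existing items, append new ones once.
                existing = target.get(key, [])
                target[key] = existing + [v for v in value if v not in existing]
            else:
                target[key] = value  # manual value overrides the scanned one
    return merged


if __name__ == "__main__":
    scanned = {"source---employees_csv---salary": {"description": None, "tag_ids": ["finance"]}}
    manual = {
        "source---employees_csv---salary": {
            "description": "Monthly gross salary in euros",
            "tag_ids": ["finance", "hr"],
        }
    }
    print(merge_entities(scanned, manual))
```

Applying manual rows last is what makes them win, which matches the ordering advice for the Python API.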

Environment variables

Environment variables ($VAR or ${VAR}) are expanded in all YAML values. All sources are loaded — env:, env_file, and .env next to the YAML file:

env:
  data_dir: /shared/data
  db_host: db.example.com
env_file: /secure/path/.env    # secrets: DB_USER, DB_PASSWORD

add:
  - folder: ${data_dir}/sales
  - folder: ${data_dir}/hr
  - database: oracle://${DB_USER}:${DB_PASSWORD}@${db_host}:1521/ORCL

env_file supports a list of paths (last overrides first):

env_file:
  - /shared/common.env         # defaults
  - /secure/credentials.env    # overrides common.env

Priority (first set wins): system env vars > env: YAML > env_file > .env local.
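The priority order can be modelled by applying the sources from lowest to highest priority, so that later updates overwrite earlier ones. The sketch below uses string.Template from the standard library for the $VAR and ${VAR} expansion; resolve_env and expand are hypothetical helpers, and each source is assumed to be already loaded into a dict.

```python
from string import Template


def resolve_env(system, yaml_env, env_files, local_env):
    """Combine the sources so that higher-priority sources win."""
    merged = {}
    # Apply the lowest priority first; later updates overwrite earlier ones.
    merged.update(local_env)       # .env next to the YAML file
    for file_vars in env_files:    # within env_file, last overrides first
        merged.update(file_vars)
    merged.update(yaml_env)        # env: block in the YAML
    merged.update(system)          # system environment variables
    return merged


def expand(value, variables):
    """Expand $VAR and ${VAR}; unknown names are left untouched."""
    return Template(value).safe_substitute(variables)


if __name__ == "__main__":
    variables = resolve_env(
        system={"DB_USER": "from_system"},
        yaml_env={"db_host": "db.example.com"},
        env_files=[{"DB_USER": "common", "DB_PASSWORD": "a"}, {"DB_PASSWORD": "b"}],
        local_env={"db_host": "localhost"},
    )
    print(expand("oracle://${DB_USER}:${DB_PASSWORD}@${db_host}:1521/ORCL", variables))
```

Leaving unknown names untouched is a choice made for this sketch; how datannurpy treats an undefined variable is not stated above.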

Output

# Complete standalone app
app_path: ./my-catalog
open_browser: true

# JSON metadata only (for existing datannur instance)
output_dir: ./output

Incremental scan

Re-run with the same app_path to only rescan changed files (compares mtime) or tables (compares schema + row count):

app_path: ./my-catalog

add:
  - folder: ./data               # skips unchanged files

Use refresh: true to force a full rescan.
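For files, the mtime comparison can be sketched with the standard library alone. This is an illustration of the idea rather than the library's cache: files_to_rescan and the cache dict are hypothetical, only one directory level is inspected, and the schema and row-count check used for database tables is not covered.

```python
import os
import tempfile
from pathlib import Path


def files_to_rescan(folder, cache, refresh=False):
    """Return files whose mtime differs from the cached one, and update the cache."""
    changed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime_ns
        if refresh or cache.get(path.name) != mtime:
            changed.append(path.name)
        cache[path.name] = mtime
    return changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        data_file = Path(folder) / "sales.csv"
        data_file.write_text("amount\n1\n")
        cache = {}
        print(files_to_rescan(folder, cache))  # first run: everything is new
        print(files_to_rescan(folder, cache))  # second run: nothing changed
        os.utime(data_file, ns=(1, 1))         # simulate a modification
        print(files_to_rescan(folder, cache))
```

Passing refresh=True bypasses the comparison entirely, which corresponds to forcing a full rescan.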

Evolution tracking

Changes between exports are automatically tracked in evolution.json:

add: new folder, dataset, variable, modality, etc.
update: modified field (shows old and new value)
delete: removed entity

Cascade filtering: when a parent entity is added or deleted, its children are automatically filtered out to reduce noise. For example, adding a new dataset won't generate separate entries for each variable.
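Cascade filtering can be sketched as a diff between two snapshots in which a child is skipped whenever its parent has the same change type. The entry layout and the "parent" field below are assumptions for illustration; the actual structure of evolution.json is not described here, and diff_snapshots is a hypothetical helper.

```python
def diff_snapshots(old, new):
    """Compare two {id: {"parent": id_or_None, ...fields}} snapshots."""
    added = new.keys() - old.keys()
    deleted = old.keys() - new.keys()
    entries = []
    for entity_id in sorted(added):
        # Cascade filtering: skip children of an entity that was itself added.
        if new[entity_id].get("parent") not in added:
            entries.append({"type": "add", "id": entity_id})
    for entity_id in sorted(deleted):
        # Same rule for deletions: only the topmost deleted entity is reported.
        if old[entity_id].get("parent") not in deleted:
            entries.append({"type": "delete", "id": entity_id})
    for entity_id in sorted(old.keys() & new.keys()):
        for field, value in new[entity_id].items():
            if old[entity_id].get(field) != value:
                entries.append({
                    "type": "update",
                    "id": entity_id,
                    "field": field,
                    "old": old[entity_id].get(field),
                    "new": value,
                })
    return entries


if __name__ == "__main__":
    old = {"src": {"parent": None, "name": "Source"}}
    new = {
        "src": {"parent": None, "name": "Sources"},
        "src---sales": {"parent": "src", "name": "sales"},
        "src---sales---amount": {"parent": "src---sales", "name": "amount"},
    }
    for entry in diff_snapshots(old, new):
        print(entry)
```

In the example, the new dataset produces a single add entry and its variable is filtered out, while the renamed folder produces one update entry carrying the old and new value.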

Disable tracking:

track_evolution: false

app_config

Configure the web app with key-value entries (written as config.json):

app_path: ./my-catalog
app_config:
  contact_email: contact@example.com
  more_info: "Data from [open data portal](https://example.com)."

If app_config is not provided, no config.json is generated.

post_export

Run Python scripts automatically after export:

# Single script (bare name → python-scripts/generate_links.py)
post_export: generate_links

# Multiple scripts
post_export:
  - generate_links
  - start_app

Script resolution:

Format	Resolved path
`generate_links`	`{output}/python-scripts/generate_links.py`
`hook.py`	`{output}/hook.py`
`scripts/hook.py`	`{output}/scripts/hook.py`
`/absolute/path.py`	`/absolute/path.py`

Works with both app_path and output_dir exports.
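The resolution table can be expressed as three path rules. The sketch below reproduces the four documented cases with pathlib; resolve_script is a hypothetical helper, POSIX-style paths are assumed, and inputs not covered by the table (for example a relative path without the .py extension) are simply joined to the output directory.

```python
from pathlib import PurePosixPath


def resolve_script(entry: str, output: str) -> PurePosixPath:
    """Resolve a post_export entry following the table above."""
    path = PurePosixPath(entry)
    if path.is_absolute():
        return path  # absolute paths are used as-is
    if path.suffix != ".py" and len(path.parts) == 1:
        # Bare name: look in the python-scripts folder and add the extension.
        return PurePosixPath(output) / "python-scripts" / f"{entry}.py"
    return PurePosixPath(output) / path  # relative to the export directory


if __name__ == "__main__":
    for entry in ("generate_links", "hook.py", "scripts/hook.py", "/absolute/path.py"):
        print(entry, "->", resolve_script(entry, "my-catalog"))
```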

Python API

All YAML features are also available programmatically via the Python API.

`Catalog`

Catalog(app_path=None, depth="value", refresh=False, freq_threshold=100, csv_encoding=None, sample_size=100_000, csv_skip_copy=False, app_config=None, quiet=False, verbose=False, log_file=None)

Attribute	Type	Description
app_path	str | Path | None	Load existing catalog for incremental scan
depth	"dataset" | "variable" | "stat" | "value"	Default scan depth (default: "value")
refresh	bool	Force full rescan ignoring cache (default: False)
freq_threshold	int	Max distinct values for frequency/modality detection. Strings above this threshold get pattern frequencies instead
csv_encoding	str | None	Default CSV encoding (utf-8, cp1252, etc.)
sample_size	int | None	Default sample size for stats (default: 100_000)
csv_skip_copy	bool	Skip UTF-8 temp copy for local CSV (default: False)
app_config	dict[str, str] | None	Key-value config for the web app
quiet	bool	Suppress progress logging (default: False)
verbose	bool	Show full tracebacks on errors (default: False)
log_file	str | Path | None	Write full scan log to file (truncated each run)
folder	Table[Folder]	Folder table (`.all()`, `.count`, `.get_by(...)`)
dataset	Table[Dataset]	Dataset table
variable	Table[Variable]	Variable table
modality	Table[Modality]	Modality table
value	Table[Value]	Modality value table
freq	Table[Freq]	Frequency table (computed)
institution	Table[Institution]	Institution table
tag	Table[Tag]	Tag table
doc	Table[Doc]	Document table

`Catalog.add_folder()`

catalog.add_folder(
    path,
    folder=None,
    *,
    depth=None,
    include=None,
    exclude=None,
    recursive=True,
    csv_encoding=None,
    sample_size=None,
    csv_skip_copy=None,
    storage_options=None,
    refresh=None,
    quiet=None,
    time_series=True,
    id=None,
    name=None,
    description=None,
)

Parameter	Type	Default	Description
path	str | Path | list[str | Path]	required	Directory or list of directories to scan
folder	Folder | None	None	Custom folder metadata
depth	"dataset" | "variable" | "stat" | "value" | None	None	Scan depth (uses catalog.depth if None)
include	list[str] | None	None	Glob patterns to include
exclude	list[str] | None	None	Glob patterns to exclude
recursive	bool	True	Scan subdirectories
csv_encoding	str | None	None	Override CSV encoding
sample_size	int | None	None	Sample rows for stats (overrides catalog)
csv_skip_copy	bool | None	None	Skip UTF-8 temp copy (overrides catalog)
storage_options	dict | None	None	Options for remote storage (passed to fsspec)
refresh	bool | None	None	Force rescan (overrides catalog setting)
quiet	bool | None	None	Override catalog quiet setting
time_series	bool	True	Group files with temporal patterns
id	str | None	None	Override folder ID
name	str | None	None	Override folder name
description	str | None	None	Override folder description

`Catalog.add_dataset()`

catalog.add_dataset(
    path,
    folder=None,
    *,
    folder_id=None,
    depth=None,
    csv_encoding=None,
    sample_size=None,
    csv_skip_copy=None,
    storage_options=None,
    refresh=None,
    quiet=None,
    name=None,
    description=None,
    ...,
)

Parameter	Type	Default	Description
path	str | Path | list[str | Path]	required	File(s) or partitioned directory (local/remote)
folder	Folder | None	None	Parent folder
folder_id	str | None	None	Parent folder ID (alternative to folder)
depth	"dataset" | "variable" | "stat" | "value" | None	None	Scan depth (uses catalog.depth if None)
csv_encoding	str | None	None	Override CSV encoding
sample_size	int | None	None	Sample rows for stats (overrides catalog)
csv_skip_copy	bool | None	None	Skip UTF-8 temp copy (overrides catalog)
storage_options	dict | None	None	Options for remote storage (passed to fsspec)
refresh	bool | None	None	Force rescan (overrides catalog setting)
quiet	bool | None	None	Override catalog quiet setting
name	str | None	None	Override dataset name
description	str | None	None	Override dataset description

Additional metadata parameters: type, link, localisation, manager_id, owner_id, tag_ids, doc_ids, start_date, end_date, updating_each, no_more_update

`Catalog.add_database()`

catalog.add_database(
    connection,
    folder=None,
    *,
    depth=None,
    schema=None,
    include=None,
    exclude=None,
    sample_size=None,
    group_by_prefix=True,
    prefix_min_tables=2,
    time_series=True,
    storage_options=None,
    refresh=None,
    quiet=None,
    oracle_client_path=None,
    ssh_tunnel=None,
    id=None,
    name=None,
    description=None,
)

Parameter	Type	Default	Description
connection	str | ibis.BaseBackend	required	Connection string or ibis backend object
folder	Folder | None	None	Custom root folder
depth	"dataset" | "variable" | "stat" | "value" | None	None	Scan depth (uses catalog.depth if None)
schema	str | list[str] | None	None	Schema(s) to scan
include	list[str] | None	None	Table name patterns to include
exclude	list[str] | None	None	Table name patterns to exclude
sample_size	int | None	None	Sample rows for stats (overrides catalog)
group_by_prefix	bool | str	True	Group tables by prefix into subfolders
prefix_min_tables	int	2	Min tables to form a prefix group
time_series	bool	True	Detect temporal table patterns
storage_options	dict | None	None	Options for remote SQLite/GeoPackage
refresh	bool | None	None	Force rescan (overrides catalog setting)
quiet	bool | None	None	Override catalog quiet setting
oracle_client_path	str | None	None	Path to Oracle Instant Client libraries
ssh_tunnel	dict | None	None	SSH tunnel config (host, user, port, etc.)
id	str | None	None	Override folder ID
name	str | None	None	Override folder name
description	str | None	None	Override folder description

`Catalog.add_metadata()`

catalog.add_metadata(path, depth=None, quiet=None)

Parameter	Type	Default	Description
path	str | Path	required	Folder or database containing metadata files
depth	"dataset" | "variable" | "stat" | "value" | None	None	Filter which entities to load
quiet	bool | None	None	Override catalog quiet setting

Supported entity files/tables: all catalog entities. The id column is not required for value and freq (composite key computed automatically).

`Catalog.export_db()`

catalog.export_db(output_dir=None, track_evolution=True, quiet=None)

Parameter	Type	Default	Description
output_dir	str | Path | None	None	Output directory (uses app_path if None)
track_evolution	bool	True	Track changes between exports
quiet	bool | None	None	Override catalog quiet setting

Exports JSON metadata files. Calls finalize() automatically when data has been scanned.

`Catalog.finalize()`

catalog.finalize()

Removes entities that were not seen during the scan. Called automatically by export_db() and export_app().

`Catalog.export_app()`

catalog.export_app(output_dir=None, open_browser=False, track_evolution=True, quiet=None)

Parameter	Type	Default	Description
output_dir	str | Path | None	None	Output directory (uses app_path if None)
open_browser	bool	False	Open app in browser after export
track_evolution	bool	True	Track changes between exports
quiet	bool | None	None	Override catalog quiet setting

Exports a complete standalone datannur app with data. Uses app_path by default if it was set at init.

`Folder`

Folder(id, parent_id=None, tag_ids=[], doc_ids=[], name=None, description=None, type=None, data_path=None)

Parameter	Type	Description
id	str	Unique identifier
parent_id	str | None	Parent folder ID
tag_ids	list[str]	Associated tag IDs
doc_ids	list[str]	Associated document IDs
name	str | None	Display name
description	str | None	Description
type	str | None	Folder type
data_path	str | None	Path to the data source

ID helpers

from datannurpy import sanitize_id, build_dataset_id, build_variable_id

Function	Description	Example
sanitize_id(s)	Clean string for use as ID	"My File (v2)" → "My File v2"
build_dataset_id(folder_id, dataset_name)	Build dataset ID	("src", "sales") → "src---sales"
build_variable_id(folder_id, dataset_name, var)	Build variable ID	("src", "sales", "amount") → "src---sales---amount"
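The examples in the table suggest that IDs are hierarchical parts joined with "---". The stand-ins below reproduce the two documented build examples without importing the library. join_id and parent_id are hypothetical names for this sketch, not datannurpy functions, and the parent lookup assumes that individual parts never contain the separator.

```python
from typing import Optional

SEPARATOR = "---"  # separator visible in the documented examples


def join_id(*parts: str) -> str:
    """Join ID parts the way the documented examples show."""
    return SEPARATOR.join(parts)


def parent_id(entity_id: str) -> Optional[str]:
    """Return the ID one level up, or None for a top-level ID."""
    head, sep, _ = entity_id.rpartition(SEPARATOR)
    return head if sep else None


if __name__ == "__main__":
    variable_id = join_id("src", "sales", "amount")
    print(variable_id)
    print(parent_id(variable_id))
```

In real code the library's own build_dataset_id and build_variable_id should be preferred, since they are the documented way to produce these IDs.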

License

MIT License - see the LICENSE file for details.

datannurpy 0.16.2
