Fast Polars-based reading/writing of dBase/DBF files with async I/O and compression support

klaw-dbase

A Polars plugin for reading and writing dBase III files (.DBF), with built-in support for DATASUS compressed files (.DBC).

Features

  • Polars IO plugin with lazy scanning, projection pushdown, and predicate pushdown
  • DATASUS .DBC support for compressed Brazilian health system files
  • Parallel reading across multiple files
  • Flexible encodings (cp1252, utf-8, iso-8859-1, etc.)
  • Globbing and directory scanning

Installation

pip install klaw-dbase

Requirements: Python 3.13+

Quickstart

Read a .DBF file

from klaw_dbase import read_dbase

df = read_dbase('data.dbf')

Lazy scan for large files

import polars as pl
from klaw_dbase import scan_dbase

lf = scan_dbase('data.dbf')
result = lf.filter(pl.col('age') > 30).select('name', 'age').collect()

Write a DataFrame

import polars as pl
from klaw_dbase import write_dbase

df = pl.DataFrame({'name': ['Alice', 'Bob'], 'age': [25, 30]})
write_dbase(df, 'output.dbf', overwrite=True)

DATASUS .DBC Files

The primary use case for this library is handling DATASUS files from Brazil's public health system, both compressed (.DBC) and uncompressed (.DBF).

Read a compressed .DBC file

from klaw_dbase import read_dbase

# Auto-detected by .dbc extension
df = read_dbase('RDPA2402.dbc')

# Or explicitly
df = read_dbase('RDPA2402.dbc', compressed=True)

Read multiple DATASUS files

from klaw_dbase import read_dbase

files = [
    'RDPA2401.dbc',
    'RDPA2402.dbc',
    'RDPA2403.dbc',
]
df = read_dbase(files)
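
A list like the one above can also be generated rather than typed out. The sketch below assumes the trailing four digits of these file names encode a two-digit year followed by a two-digit month; that is an assumption about DATASUS naming, not something this README states. `monthly_files` is a hypothetical helper and not part of klaw-dbase.

```python
def monthly_files(prefix: str, year: int, months: range, ext: str = ".dbc") -> list[str]:
    """Build names like 'RDPA2401.dbc' from a prefix, a year, and a range of months.

    Hypothetical helper: assumes a <prefix><YY><MM><ext> naming scheme.
    """
    yy = year % 100  # keep only the last two digits of the year
    return [f"{prefix}{yy:02d}{m:02d}{ext}" for m in months]


# January through March 2024 (range end is exclusive).
files = monthly_files("RDPA", 2024, range(1, 4))
print(files)
```

The resulting list can then be passed as `sources` in the same way as the hand-written list.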

Lazy scan with glob patterns

import polars as pl
from klaw_dbase import scan_dbase

lf = scan_dbase('data/RDPA24*.dbc')
summary = lf.filter(pl.col('IDADE') >= 65).group_by('UF_RESID').agg(pl.len().alias('count')).collect()
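
Before scanning, it can help to check which files a pattern would pick up. The sketch below uses Python's standard `glob` module for that preview; whether klaw-dbase expands patterns with exactly the same rules is an assumption. `preview_matches` is a hypothetical helper, and the files are empty placeholders created in a temporary directory.

```python
import glob
import tempfile
from pathlib import Path


def preview_matches(pattern: str) -> list[str]:
    """Return the sorted file names that a glob pattern matches on disk."""
    return sorted(Path(p).name for p in glob.glob(pattern))


# Create empty placeholder files so the pattern has something to match.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("RDPA2401.dbc", "RDPA2402.dbc", "RDPA2312.dbc", "notes.txt"):
        (Path(tmp) / name).touch()
    matched = preview_matches(str(Path(tmp) / "RDPA24*.dbc"))

print(matched)
```

Here the 2023 file and the text file are left out, because neither matches `RDPA24*.dbc`.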

Get record count without loading data

from klaw_dbase import get_dbase_record_count

n = get_dbase_record_count('RDPA2402.dbc')
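
A record count can be obtained without loading data because the dBase III header stores it in a fixed position. The sketch below illustrates that file-format detail for an uncompressed .DBF header only; it is not klaw-dbase's implementation, and a compressed .DBC file would need different handling. `dbf_record_count` is a hypothetical helper and the header bytes are synthetic.

```python
import struct


def dbf_record_count(header: bytes) -> int:
    """Read the record count from the first 32 bytes of a dBase III header."""
    if len(header) < 32:
        raise ValueError("a dBase header is at least 32 bytes long")
    # Layout of the first 12 bytes (little-endian):
    #   version (1 byte), last-update date YY MM DD (3 bytes),
    #   record count (uint32), header length (uint16), record length (uint16)
    _version, _yy, _mm, _dd, n_records, _header_len, _record_len = struct.unpack_from(
        "<4BIHH", header, 0
    )
    return n_records


# Synthetic header: version 0x03, date 2024-02-01 (year stored as years since 1900),
# 1500 records, 65-byte header, 20-byte records; padded to 32 bytes.
fake_header = struct.pack("<4BIHH", 0x03, 124, 2, 1, 1500, 65, 20).ljust(32, b"\x00")
count = dbf_record_count(fake_header)
print(count)
```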

API Reference

read_dbase

read_dbase(
    sources,                    # path, list of paths, directory, or glob pattern
    *,
    columns=None,               # columns to select (names or indices)
    n_rows=None,                # limit number of rows
    row_index_name=None,        # add row index column
    row_index_offset=0,
    rechunk=False,
    batch_size=8192,
    n_workers=None,             # parallel readers (default: all CPUs)
    glob=True,
    encoding="cp1252",
    character_trim="begin_end",
    skip_deleted=True,
    validate_schema=True,
    compressed=False,           # auto-detected for .dbc files
) -> pl.DataFrame
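
The `skip_deleted` option presumably relates to the dBase deletion flag: each record begins with one byte that is a space for an active record and an asterisk for a record marked as deleted. The sketch below shows that format detail on synthetic bytes. It is not the library's code, `split_records` is a hypothetical helper, and the final `strip()` only stands in for whatever trimming `character_trim` actually performs.

```python
def split_records(data: bytes, record_len: int) -> list[tuple[bool, bytes]]:
    """Split a raw record area into (is_deleted, field_bytes) pairs."""
    records = []
    for start in range(0, len(data) - record_len + 1, record_len):
        chunk = data[start:start + record_len]
        # First byte is the deletion flag: b"*" means deleted, b" " means active.
        records.append((chunk[:1] == b"*", chunk[1:]))
    return records


# Three synthetic 6-byte records: flag byte plus a 5-byte, space-padded name field.
raw = b" Alice" + b"*Bob  " + b" Carol"
active = [
    fields.decode("cp1252").strip()
    for deleted, fields in split_records(raw, record_len=6)
    if not deleted
]
print(active)
```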

scan_dbase

scan_dbase(
    sources,
    *,
    batch_size=8192,
    n_workers=None,
    single_col_name=None,
    encoding="cp1252",
    character_trim="begin_end",
    skip_deleted=True,
    validate_schema=True,
    compressed=False,
    glob=True,
    progress=False,
) -> pl.LazyFrame

write_dbase

write_dbase(
    df,                         # polars DataFrame
    dest,                       # path or file-like object
    *,
    batch_size=None,
    encoding="cp1252",
    overwrite=False,
) -> None
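
The dBase III format reserves 11 bytes per field name, including a terminating null, so names longer than 10 characters cannot be stored as-is. How `write_dbase` reacts to such columns (truncation, error, or something else) is not documented here. The sketch below is a hypothetical pre-check that can run on `df.columns` before writing; the ASCII test is a conservative assumption rather than a documented requirement.

```python
def invalid_dbf_column_names(names: list[str], max_len: int = 10) -> list[str]:
    """Return the column names that may not fit a dBase III field descriptor."""
    bad = []
    for name in names:
        # Flag empty names, names over the 10-character limit, and non-ASCII names.
        if not name or len(name) > max_len or not name.isascii():
            bad.append(name)
    return bad


columns = ["name", "age", "municipality_code", "idade_média"]
problems = invalid_dbf_column_names(columns)
print(problems)
```

Renaming the flagged columns before writing avoids depending on unspecified behavior.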

get_dbase_record_count

get_dbase_record_count(path) -> int

Encodings

Common encodings for dBase files:

Encoding      Use case
cp1252        Windows Latin-1 (default, common for DATASUS)
utf-8         Unicode
iso-8859-1    Latin-1
iso-8859-15   Latin-9 (Euro sign)
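
Choosing the wrong encoding usually shows up as garbled accents or a decode error. The sketch below uses only Python's built-in codecs on hand-written bytes, not klaw-dbase, to show where these encodings agree and where they differ.

```python
# "São Paulo" as stored in a single-byte Latin encoding (0xE3 is "ã").
raw = b"S\xe3o Paulo"

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")

# The Euro sign sits at different byte values depending on the encoding.
euro_cp1252 = b"\x80".decode("cp1252")        # Euro sign in cp1252
euro_latin9 = b"\xa4".decode("iso-8859-15")   # Euro sign in Latin-9
currency_latin1 = b"\xa4".decode("iso-8859-1")  # generic currency sign in Latin-1

# The same bytes are not valid UTF-8, so decoding them as UTF-8 fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(as_cp1252, as_latin1, euro_cp1252, euro_latin9, currency_latin1, utf8_ok)
```

For accented Portuguese text, cp1252 and iso-8859-1 give the same result; they differ mainly in the 0x80 to 0x9F byte range.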

Error Handling

Exception        When raised
DbaseError       Corrupted or invalid dBase file
DbcError         Compression-specific problems
EmptySources     No input files or empty DataFrame on write
SchemaMismatch   Multiple files with incompatible schemas
EncodingError    Invalid or unsupported encoding

from klaw_dbase import DbaseError, DbcError, EmptySources, read_dbase

try:
    df = read_dbase('corrupted.dbf')
except DbaseError as e:
    # Raised for a corrupted or invalid dBase file
    print(f'Failed to read: {e}')

License

MIT
