Fast Polars-based reading/writing of dBase/DBF files with async I/O and compression support

klaw-dbase

A Polars plugin for reading and writing dBase III files (.DBF), with built-in support for DATASUS compressed files (.DBC).

Features

  • Polars IO plugin with lazy scanning, projection pushdown, and predicate pushdown
  • DATASUS .DBC support for compressed Brazilian health system files
  • Parallel reading across multiple files
  • Flexible encodings (cp1252, utf-8, iso-8859-1, etc.)
  • Globbing and directory scanning

Installation

pip install klaw-dbase

Requirements: Python 3.13+

Quickstart

Read a .DBF file

from klaw_dbase import read_dbase

df = read_dbase('data.dbf')

Lazy scan for large files

import polars as pl
from klaw_dbase import scan_dbase

lf = scan_dbase('data.dbf')
result = lf.filter(pl.col('age') > 30).select('name', 'age').collect()

Write a DataFrame

import polars as pl
from klaw_dbase import write_dbase

df = pl.DataFrame({'name': ['Alice', 'Bob'], 'age': [25, 30]})
write_dbase(df, 'output.dbf', overwrite=True)

DATASUS .DBC Files

The primary use case for this library is handling DATASUS files from Brazil's public health system, both compressed (.DBC) and uncompressed (.DBF).

Read a compressed .DBC file

from klaw_dbase import read_dbase

# Auto-detected by .dbc extension
df = read_dbase('RDPA2402.dbc')

# Or explicitly
df = read_dbase('RDPA2402.dbc', compressed=True)

Read multiple DATASUS files

from klaw_dbase import read_dbase

files = [
    'RDPA2401.dbc',
    'RDPA2402.dbc',
    'RDPA2403.dbc',
]
df = read_dbase(files)
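
When the file names follow a regular scheme, the list can be generated instead of typed out. The sketch below assumes that the names shown above carry a two-digit year and a two-digit month after the `RDPA` prefix (so `RDPA2401` would be January 2024). `monthly_files` is a hypothetical helper for illustration, not part of klaw-dbase.

```python
def monthly_files(prefix: str, year: int, months: range) -> list[str]:
    """Build names such as 'RDPA2401.dbc' from a prefix, a year, and months."""
    yy = year % 100  # assumption: the file name stores a two-digit year
    return [f"{prefix}{yy:02d}{month:02d}.dbc" for month in months]


# First quarter of 2024, matching the hand-written list above.
files = monthly_files("RDPA", 2024, range(1, 4))
print(files)  # ['RDPA2401.dbc', 'RDPA2402.dbc', 'RDPA2403.dbc']
```

The resulting list can be passed to read_dbase as `sources`, exactly like the literal list.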

Lazy scan with glob patterns

import polars as pl
from klaw_dbase import scan_dbase

lf = scan_dbase('data/RDPA24*.dbc')
summary = lf.filter(pl.col('IDADE') >= 65).group_by('UF_RESID').agg(pl.len().alias('count')).collect()
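
To preview which file names a pattern such as `RDPA24*.dbc` would select, Python's standard-library `fnmatch` module is enough. This only illustrates shell-style wildcard matching; whether klaw-dbase's globbing follows exactly the same rules is an assumption, and the candidate names are made up for the example.

```python
from fnmatch import fnmatchcase

pattern = "RDPA24*.dbc"

# Hypothetical directory listing.
candidates = [
    "RDPA2401.dbc",
    "RDPA2412.dbc",
    "RDPA2301.dbc",  # different year
    "RDSP2401.dbc",  # different prefix
    "RDPA2401.dbf",  # different extension
]

# fnmatchcase compares case-sensitively on every platform.
matched = [name for name in candidates if fnmatchcase(name, pattern)]
print(matched)  # ['RDPA2401.dbc', 'RDPA2412.dbc']
```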

Get record count without loading data

from klaw_dbase import get_dbase_record_count

n = get_dbase_record_count('RDPA2402.dbc')
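
Counting records without loading data is possible because an uncompressed dBase III file stores the record count in its fixed 32-byte header (bytes 4 to 7, little-endian). The sketch below reads that field with the standard library from a synthetic header. It is an illustration of the file format only, not necessarily how get_dbase_record_count is implemented, and it does not cover compressed .DBC files. `record_count_from_header` is a hypothetical name.

```python
import io
import struct

# version, date (YY, MM, DD), record count, header length, record length, 20 reserved bytes
HEADER_FORMAT = "<B3BIHH20x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes


def record_count_from_header(stream) -> int:
    """Return the record count stored in a dBase III header."""
    header = stream.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file too short to contain a dBase header")
    _version, _yy, _mm, _dd, n_records, _header_len, _record_len = struct.unpack(
        HEADER_FORMAT, header
    )
    return n_records


# Synthetic header: dBase III (0x03), dated 2024-02-01 (year stored as year - 1900),
# 1234 records, header length 65, record length 11.
fake_header = struct.pack(HEADER_FORMAT, 0x03, 124, 2, 1, 1234, 65, 11)
print(record_count_from_header(io.BytesIO(fake_header)))  # 1234
```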

API Reference

read_dbase

read_dbase(
    sources,                    # path, list of paths, directory, or glob pattern
    *,
    columns=None,               # columns to select (names or indices)
    n_rows=None,                # limit number of rows
    row_index_name=None,        # add row index column
    row_index_offset=0,
    rechunk=False,
    batch_size=8192,
    n_workers=None,             # parallel readers (default: all CPUs)
    glob=True,
    encoding="cp1252",
    character_trim="begin_end",
    skip_deleted=True,
    validate_schema=True,
    compressed=False,           # auto-detected for .dbc files
) -> pl.DataFrame

scan_dbase

scan_dbase(
    sources,
    *,
    batch_size=8192,
    n_workers=None,
    single_col_name=None,
    encoding="cp1252",
    character_trim="begin_end",
    skip_deleted=True,
    validate_schema=True,
    compressed=False,
    glob=True,
    progress=False,
) -> pl.LazyFrame

write_dbase

write_dbase(
    df,                         # polars DataFrame
    dest,                       # path or file-like object
    *,
    batch_size=None,
    encoding="cp1252",
    overwrite=False,
) -> None

get_dbase_record_count

get_dbase_record_count(path) -> int

Encodings

Common encodings for dBase files:

Encoding      Use case
cp1252        Windows Latin-1 (default, common for DATASUS)
utf-8         Unicode
iso-8859-1    Latin-1
iso-8859-15   Latin-9 (Euro sign)
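
The choice matters because the same bytes decode differently under each encoding. A wrong single-byte encoding typically decodes without error but yields the wrong characters, while Latin-1-style bytes decoded as UTF-8 tend to fail outright. The sketch below shows this with Python's built-in codecs only; how klaw-dbase itself reports decoding problems is not covered here.

```python
# Text as it might be stored in a cp1252-encoded character field.
raw = "São Paulo".encode("cp1252")  # b'S\xe3o Paulo'

# cp1252 and iso-8859-1 agree on 0xE3, so both recover the text.
as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")

# The Euro sign is where the encodings differ.
euro_cp1252 = b"\x80".decode("cp1252")          # Euro sign in cp1252
euro_latin9 = b"\xa4".decode("iso-8859-15")     # Euro sign in Latin-9
currency_latin1 = b"\xa4".decode("iso-8859-1")  # generic currency sign in Latin-1

# The same bytes are not valid UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(as_cp1252, as_latin1, euro_cp1252, euro_latin9, currency_latin1, utf8_ok)
```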

Error Handling

Exception        When raised
DbaseError       Corrupted or invalid dBase file
DbcError         Compression-specific problems
EmptySources     No input files or empty DataFrame on write
SchemaMismatch   Multiple files with incompatible schemas
EncodingError    Invalid or unsupported encoding

from klaw_dbase import DbaseError, DbcError, EmptySources, read_dbase

try:
    df = read_dbase('corrupted.dbf')
except DbaseError as e:
    print(f'Failed to read: {e}')

License

MIT
