Fast Polars-based reading/writing of dBase/DBF files with async I/O and compression support

klaw-dbase

A Polars plugin for reading and writing dBase III files (.DBF), with built-in support for DATASUS compressed files (.DBC).

Features

  • Polars IO plugin with lazy scanning, projection pushdown, and predicate pushdown
  • DATASUS .DBC support for compressed Brazilian health system files
  • Parallel reading across multiple files
  • Flexible encodings (cp1252, utf-8, iso-8859-1, etc.)
  • Globbing and directory scanning

Installation

pip install klaw-dbase

Requirements: Python 3.13+

Quickstart

Read a .DBF file

from klaw_dbase import read_dbase

df = read_dbase('data.dbf')

Lazy scan for large files

import polars as pl
from klaw_dbase import scan_dbase

lf = scan_dbase('data.dbf')
result = lf.filter(pl.col('age') > 30).select('name', 'age').collect()

Write a DataFrame

import polars as pl
from klaw_dbase import write_dbase

df = pl.DataFrame({'name': ['Alice', 'Bob'], 'age': [25, 30]})
write_dbase(df, 'output.dbf', overwrite=True)

DATASUS .DBC Files

The primary use case for this library is handling DATASUS files from Brazil's public health system—both compressed (.DBC) and uncompressed (.DBF).

Read a compressed .DBC file

from klaw_dbase import read_dbase

# Auto-detected by .dbc extension
df = read_dbase('RDPA2402.dbc')

# Or explicitly
df = read_dbase('RDPA2402.dbc', compressed=True)

Read multiple DATASUS files

from klaw_dbase import read_dbase

files = [
    'RDPA2401.dbc',
    'RDPA2402.dbc',
    'RDPA2403.dbc',
]
df = read_dbase(files)
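
When the files cover a contiguous run of months, the list can be generated instead of typed out. This is a minimal sketch, assuming the names follow the prefix + state + two-digit year + two-digit month pattern visible in the example above; `monthly_files` is a hypothetical helper and not part of klaw-dbase.

```python
def monthly_files(prefix, state, year, months, ext='.dbc'):
    """Build names like 'RDPA2401.dbc' (assumed pattern: prefix + state + YY + MM)."""
    yy = year % 100  # keep only the last two digits of the year
    return [f'{prefix}{state}{yy:02d}{month:02d}{ext}' for month in months]


# January through March 2024 for the 'PA' files used in the example above.
files = monthly_files('RD', 'PA', 2024, range(1, 4))
print(files)
```

The resulting list has the same shape as the hand-written one and can be passed to `read_dbase(files)` in the same way.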

Lazy scan with glob patterns

import polars as pl
from klaw_dbase import scan_dbase

lf = scan_dbase('data/RDPA24*.dbc')
summary = lf.filter(pl.col('IDADE') >= 65).group_by('UF_RESID').agg(pl.len().alias('count')).collect()

Get record count without loading data

from klaw_dbase import get_dbase_record_count

n = get_dbase_record_count('RDPA2402.dbc')
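
For an uncompressed .DBF file, the record count is stored in the file header, which is why it can be obtained without reading any records. The sketch below shows the idea with the standard library only. It is not a description of how klaw-dbase is implemented, `dbf_record_count` is a hypothetical helper, and compressed .DBC files are not covered.

```python
import io
import struct


def dbf_record_count(stream):
    """Read the record count from a dBase III header (bytes 4-7, little-endian)."""
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError('file too short to contain a dBase header')
    # Layout: version (1 byte), last-update date as YY MM DD (3 bytes),
    # record count (uint32), header length (uint16), record length (uint16).
    _version, _yy, _mm, _dd, n_records, _header_len, _record_len = struct.unpack(
        '<4BIHH', header
    )
    return n_records


# Fake header for illustration: version 0x03, date 2024-02-01 (year stored as
# years since 1900), 1234 records, 65-byte header, 11-byte records.
fake = struct.pack('<4BIHH', 0x03, 124, 2, 1, 1234, 65, 11)
count = dbf_record_count(io.BytesIO(fake))
print(count)
```

With a real file, the same function could be called on a handle opened with `open(path, 'rb')`.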

API Reference

read_dbase

read_dbase(
    sources,                    # path, list of paths, directory, or glob pattern
    *,
    columns=None,               # columns to select (names or indices)
    n_rows=None,                # limit number of rows
    row_index_name=None,        # add row index column
    row_index_offset=0,
    rechunk=False,
    batch_size=8192,
    n_workers=None,             # parallel readers (default: all CPUs)
    glob=True,
    encoding="cp1252",
    character_trim="begin_end",
    skip_deleted=True,
    validate_schema=True,
    compressed=False,           # auto-detected for .dbc files
) -> pl.DataFrame
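
The `skip_deleted` and `character_trim` options correspond to two details of the dBase record format: every record begins with a one-byte deletion flag (a space for an active record, `*` for a deleted one), and character fields are padded with spaces to a fixed width. The standard-library sketch below illustrates both. `parse_records` is a hypothetical helper, and reading `character_trim="begin_end"` as "strip both ends" is an assumption based on the option's name.

```python
def parse_records(raw, field_width, encoding='cp1252', skip_deleted=True, trim=True):
    """Parse fixed-width records that each hold a single character field."""
    record_len = 1 + field_width  # 1-byte deletion flag + field bytes
    values = []
    for start in range(0, len(raw) - record_len + 1, record_len):
        record = raw[start:start + record_len]
        deleted = record[0:1] == b'*'  # b' ' marks an active record
        if deleted and skip_deleted:
            continue
        text = record[1:].decode(encoding)
        values.append(text.strip(' ') if trim else text)
    return values


# Three records with a 7-byte field each; the second one is flagged as deleted.
raw = (
    b' ' + b'Alice'.ljust(7)
    + b'*' + b'Bob'.ljust(7)
    + b' ' + b'  Caio'.ljust(7)
)
kept = parse_records(raw, field_width=7)
everything = parse_records(raw, field_width=7, skip_deleted=False, trim=False)
print(kept)
print(everything)
```

In this sketch the deleted record only appears when `skip_deleted=False`, and the padding survives only when trimming is turned off.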

scan_dbase

scan_dbase(
    sources,
    *,
    batch_size=8192,
    n_workers=None,
    single_col_name=None,
    encoding="cp1252",
    character_trim="begin_end",
    skip_deleted=True,
    validate_schema=True,
    compressed=False,
    glob=True,
    progress=False,
) -> pl.LazyFrame

write_dbase

write_dbase(
    df,                         # polars DataFrame
    dest,                       # path or file-like object
    *,
    batch_size=None,
    encoding="cp1252",
    overwrite=False,
) -> None
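
dBase III stores each field name in an 11-byte, NUL-terminated slot, so a name longer than 10 bytes cannot be written verbatim. How `write_dbase` reacts to such names is not stated here, so a pre-flight check along the following lines may be useful. This is a sketch; `long_field_names` is a hypothetical helper, not part of klaw-dbase.

```python
DBASE_MAX_FIELD_NAME = 10  # 11-byte descriptor slot minus the NUL terminator


def long_field_names(columns, encoding='cp1252'):
    """Return the column names whose encoded form exceeds the dBase III limit."""
    return [name for name in columns if len(name.encode(encoding)) > DBASE_MAX_FIELD_NAME]


columns = ['name', 'age', 'municipality_code']
too_long = long_field_names(columns)
print(too_long)
```

With a Polars DataFrame, the check could be run as `long_field_names(df.columns)` before writing.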

get_dbase_record_count

get_dbase_record_count(path) -> int

Encodings

Common encodings for dBase files:

Encoding      Use case
cp1252        Windows Latin-1 (default, common for DATASUS)
utf-8         Unicode
iso-8859-1    Latin-1
iso-8859-15   Latin-9 (Euro sign)
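
The choice of encoding matters because the same byte can decode to different characters, or fail to decode at all. The sketch below illustrates this with Python's built-in codecs only; it does not call klaw-dbase.

```python
raw = 'São Paulo'.encode('cp1252')  # 'ã' becomes the single byte 0xE3

as_cp1252 = raw.decode('cp1252')
as_latin1 = raw.decode('iso-8859-1')  # identical here: 0xE3 is 'ã' in both

# The Euro sign shows where the code pages diverge.
euro_cp1252 = '€'.encode('cp1252')       # byte 0x80
euro_latin9 = '€'.encode('iso-8859-15')  # byte 0xA4
currency_latin1 = euro_latin9.decode('iso-8859-1')  # the generic currency sign, not the Euro sign

# Decoding Latin-encoded bytes as UTF-8 fails outright.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(euro_cp1252, euro_latin9, utf8_ok)
```

If accented characters come out wrong after a read, trying another value for the `encoding` parameter is a reasonable first step.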

Error Handling

Exception        When raised
DbaseError       Corrupted or invalid dBase file
DbcError         Compression-specific problems
EmptySources     No input files or empty DataFrame on write
SchemaMismatch   Multiple files with incompatible schemas
EncodingError    Invalid or unsupported encoding

from klaw_dbase import DbaseError, DbcError, EmptySources, read_dbase

try:
    df = read_dbase('corrupted.dbf')
except DbaseError as e:
    print(f'Failed to read: {e}')

License

MIT

Download files

Download the file for your platform.

Source Distribution

  • klaw_dbase-0.1.1.tar.gz (78.7 kB): Source

Built Distributions

  • klaw_dbase-0.1.1-pp311-pypy311_pp73-musllinux_1_2_x86_64.whl (5.8 MB): PyPy; musllinux: musl 1.2+ x86-64
  • klaw_dbase-0.1.1-pp311-pypy311_pp73-musllinux_1_2_aarch64.whl (5.1 MB): PyPy; musllinux: musl 1.2+ ARM64
  • klaw_dbase-0.1.1-pp310-pypy310_pp73-musllinux_1_2_x86_64.whl (5.8 MB): PyPy; musllinux: musl 1.2+ x86-64
  • klaw_dbase-0.1.1-pp310-pypy310_pp73-musllinux_1_2_aarch64.whl (5.1 MB): PyPy; musllinux: musl 1.2+ ARM64
  • klaw_dbase-0.1.1-pp39-pypy39_pp73-musllinux_1_2_x86_64.whl (5.8 MB): PyPy; musllinux: musl 1.2+ x86-64
  • klaw_dbase-0.1.1-pp39-pypy39_pp73-musllinux_1_2_aarch64.whl (5.1 MB): PyPy; musllinux: musl 1.2+ ARM64
  • klaw_dbase-0.1.1-cp314-cp314t-musllinux_1_2_x86_64.whl (5.8 MB): CPython 3.14t; musllinux: musl 1.2+ x86-64
  • klaw_dbase-0.1.1-cp314-cp314t-musllinux_1_2_aarch64.whl (5.1 MB): CPython 3.14t; musllinux: musl 1.2+ ARM64
  • klaw_dbase-0.1.1-cp313-cp313t-musllinux_1_2_x86_64.whl (5.8 MB): CPython 3.13t; musllinux: musl 1.2+ x86-64
  • klaw_dbase-0.1.1-cp313-cp313t-musllinux_1_2_aarch64.whl (5.1 MB): CPython 3.13t; musllinux: musl 1.2+ ARM64
  • klaw_dbase-0.1.1-cp313-abi3-win_amd64.whl (6.3 MB): CPython 3.13+; Windows x86-64
  • klaw_dbase-0.1.1-cp313-abi3-musllinux_1_2_x86_64.whl (5.8 MB): CPython 3.13+; musllinux: musl 1.2+ x86-64
  • klaw_dbase-0.1.1-cp313-abi3-musllinux_1_2_aarch64.whl (5.1 MB): CPython 3.13+; musllinux: musl 1.2+ ARM64
  • klaw_dbase-0.1.1-cp313-abi3-manylinux_2_24_x86_64.whl (5.6 MB): CPython 3.13+; manylinux: glibc 2.24+ x86-64
  • klaw_dbase-0.1.1-cp313-abi3-manylinux_2_24_aarch64.whl (5.0 MB): CPython 3.13+; manylinux: glibc 2.24+ ARM64
  • klaw_dbase-0.1.1-cp313-abi3-macosx_11_0_arm64.whl (4.8 MB): CPython 3.13+; macOS 11.0+ ARM64

File details

Details for the file klaw_dbase-0.1.1.tar.gz.

File metadata

  • Download URL: klaw_dbase-0.1.1.tar.gz
  • Upload date:
  • Size: 78.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for klaw_dbase-0.1.1.tar.gz
Algorithm Hash digest
SHA256 3c0156cc8da704600b25940276b19aadade5e0152b25a48d0dffe28580fd3e19
MD5 6c46ad85a9c0d1a14c168c7d4f5e066b
BLAKE2b-256 4f3dcb9c59cc53abd41e9cb302703636ddb24a9d460b86ba389169f9af11392c

See more details on using hashes here.

Provenance

The following attestation bundles were made for klaw_dbase-0.1.1.tar.gz:

Publisher: release.yml on klaw-python/klaw

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

