Handle lots of files

Project description

hlfiles

Handles lots of files for you.

Modules

read

A basic module that provides a function, scan_files(), to read all files with a specified suffix in a root directory.
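
The signature of scan_files() is not shown in this description, so the following is only a minimal sketch of the general idea, not the hlfiles implementation. The name scan_files_sketch and its parameters are hypothetical, and the sketch assumes that subdirectories should be searched as well.

```python
import os
from typing import Iterator


def scan_files_sketch(root_dir: str, suffix: str) -> Iterator[str]:
    """Yield paths of files under root_dir whose names end with suffix.

    Hypothetical stand-in for illustration; not the hlfiles code.
    """
    # os.walk visits root_dir and every directory below it (an assumption
    # about the desired behavior).
    for dir_path, _dir_names, file_names in os.walk(root_dir):
        for file_name in sorted(file_names):
            # str.endswith handles multi-part suffixes such as ".csv.gz".
            if file_name.endswith(suffix):
                yield os.path.join(dir_path, file_name)
```

Matching on the file name with endswith, rather than splitting off the last extension, is what lets a suffix like ".csv.gz" be treated as a single unit.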

convert

Provides a class, DataFileConverter, which is used to generate file-converting functions, such as a function that converts all csv.gz files into parquet files (see the example).

Example

pip install hlfiles

import pandas as pd
from hlfiles.convert import DataFileConverter


# You can create any file-converting function with DataFileConverter
csvgz_to_parquet = DataFileConverter(
    read_func=pd.read_csv,
    write_func=lambda data, file_path: data.to_parquet(file_path),
    read_file_extension="csv.gz",
    write_file_extension="parquet",
    read_func_kwargs={"compression": "gzip"},
)


if __name__ == "__main__":
    root_dir = r"xxx"
    # Convert all csv.gz files into parquet files in root_dir
    csvgz_to_parquet(root_dir, inplace=True)
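
To illustrate what a converter of this kind might do, here is a rough sketch under stated assumptions. It is not the hlfiles source. The class name ConverterSketch is hypothetical, and its constructor parameters simply mirror the keyword arguments used in the example above. The meaning of inplace=True is not documented here, so the sketch uses a separate, hypothetical remove_source flag instead of guessing at it.

```python
import os
from typing import Any, Callable, Dict, List, Optional


class ConverterSketch:
    """Hypothetical stand-in for a converter factory; not the hlfiles code."""

    def __init__(
        self,
        read_func: Callable[..., Any],
        write_func: Callable[[Any, str], None],
        read_file_extension: str,
        write_file_extension: str,
        read_func_kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.read_func = read_func
        self.write_func = write_func
        # Normalize extensions so "csv.gz" and ".csv.gz" behave the same.
        self.read_suffix = "." + read_file_extension.lstrip(".")
        self.write_suffix = "." + write_file_extension.lstrip(".")
        self.read_func_kwargs = read_func_kwargs or {}

    def __call__(self, root_dir: str, remove_source: bool = False) -> List[str]:
        """Convert every matching file under root_dir; return the new paths."""
        written = []
        for dir_path, _dir_names, file_names in os.walk(root_dir):
            for file_name in sorted(file_names):
                if not file_name.endswith(self.read_suffix):
                    continue
                source = os.path.join(dir_path, file_name)
                # Swap the whole multi-part extension, not just the last part.
                target = source[: -len(self.read_suffix)] + self.write_suffix
                data = self.read_func(source, **self.read_func_kwargs)
                self.write_func(data, target)
                if remove_source:
                    # Assumption: the caller explicitly opts in to deleting sources.
                    os.remove(source)
                written.append(target)
        return written
```

Because the reader and writer are plain callables, the same sketch works with any pair of functions that agree on the in-memory data type, for example pd.read_csv together with a wrapper around DataFrame.to_parquet.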

Download files

Source Distribution

hlfiles-0.0.1.tar.gz (2.7 kB view details)

Uploaded Source

Built Distribution

hlfiles-0.0.1-py3-none-any.whl (3.2 kB view details)

Uploaded Python 3

File details

Details for the file hlfiles-0.0.1.tar.gz.

File metadata

  • Download URL: hlfiles-0.0.1.tar.gz
  • Size: 2.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.2

File hashes

Hashes for hlfiles-0.0.1.tar.gz
Algorithm Hash digest
SHA256 ca88cac8c11cb8fcb08748dd184f2765a3effa42bfb51c7379e95d0de0eabbd4
MD5 75c97af8ab61c1c25b66541d7381e2b3
BLAKE2b-256 be68a1eb6f42b68f9049610c3a74b2f1941e173180b4db51b5e158e50e932265

File details

Details for the file hlfiles-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: hlfiles-0.0.1-py3-none-any.whl
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.2

File hashes

Hashes for hlfiles-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 852dc65cbab796358ac59860942525fdd49761cbc4b8ce4f183d0b168048c7cb
MD5 db9725ab1c0bf6c9155fc749daa33a90
BLAKE2b-256 7ab050e4aa90b38c52e0d2865d2c6e64fcc86920e024082d3cdaf3a082985315
