mft to parquet (pyarrow dtypes)


Tested against Windows 10 / Python 3.11 / Anaconda

pip install mft2parquet

Reads file information (via the NTFS Master File Table) from a specified drive and returns it as a pandas DataFrame with pyarrow-backed dtypes.

Args:
    drive (str, optional): The drive path to scan. Defaults to "c:\\".
    outputfile (str, optional): If provided, the DataFrame is also saved as a
        Parquet file at this path. Defaults to None.

Returns:
    pd.DataFrame: A DataFrame with pyarrow dtypes, one row per file system
    entry, with the columns listed below.

Raises:
    subprocess.CalledProcessError: If the external command fails to execute.
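Since the scan shells out to an external utility, callers may want to handle this exception. A minimal sketch of the pattern, using a stand-in command rather than the package's internal call:

```python
import subprocess
import sys

try:
    # Stand-in for the external scan command; check=True makes
    # subprocess.run raise CalledProcessError on a non-zero exit code.
    subprocess.run([sys.executable, "-c", "raise SystemExit(2)"], check=True)
except subprocess.CalledProcessError as exc:
    print(f"scan failed with exit code {exc.returncode}")
```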

Note:
    - This function uses the external command-line utility
      Ultra-Fast-File-Search (https://github.com/githubrobbi/Ultra-Fast-File-Search)
      to retrieve the file information.
    - The DataFrame will have the following columns:
        - aa_path
        - aa_name
        - aa_path_only
        - aa_size
        - aa_size_on_disk
        - aa_created
        - aa_last_written
        - aa_last_accessed
        - aa_descendents
        - aa_read-only
        - aa_archive
        - aa_system
        - aa_hidden
        - aa_offline
        - aa_not_content_indexed_file
        - aa_no_scrub_file
        - aa_integrity
        - aa_pinned
        - aa_unpinned
        - aa_directory_flag
        - aa_compressed
        - aa_encrypted
        - aa_sparse
        - aa_reparse
        - aa_attributes

Example:
    df = read_hdd(drive="d:\\", outputfile="hdd_info.parquet")
    # Scans the 'D:' drive and saves the result as 'hdd_info.parquet'.
