Package for easily accessing and managing a large number of files and dirs (e.g. in ML datasets).
Project description
File Management Utilities (fmutils)
For easily accessing and managing a large number of files and dirs in ML datasets.
Updates
Additional sort functionality added in the plot_data_dist function.
split_some_data
rename_wrt_dirname
filename_replacer
plot_data_dist
Implemented Functions
class DirectoryTree
Directory-tree generator: builds a dir tree displaying the full structure of the root dir, showing all the sub-dirs and the files.
Parameters:
- root_dir – absolute/relative path to root directory containing all files.
- dir_only – whether to show only sub-dirs in the dir-tree (excluding the files inside each dir and sub-dir; good for getting an overview of large databases). The default is False.
- write_tree – write the full dir-tree to a txt file in the current working dir. The default is True.
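To illustrate the kind of output described above, here is a minimal standard-library sketch. It is not the package's implementation: the helper name `sketch_dir_tree` and the four-space indentation are assumptions made for this example, and the `write_tree` behaviour is left out.

```python
import os


def sketch_dir_tree(root_dir, dir_only=False):
    """Return the lines of an indented tree for root_dir (hypothetical helper)."""
    # First line is the root directory itself.
    lines = [os.path.basename(os.path.normpath(root_dir)) + "/"]

    def walk(path, depth):
        # Visit entries in alphabetical order so the output is stable.
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            if os.path.isdir(full):
                lines.append("    " * depth + name + "/")
                walk(full, depth + 1)
            elif not dir_only:
                # Files are listed only when dir_only is False.
                lines.append("    " * depth + name)

    walk(root_dir, 1)
    return lines
```

Calling `print("\n".join(sketch_dir_tree("path/to/dataset", dir_only=True)))` would print only the directory skeleton, which is the overview use case mentioned for `dir_only`.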
get_all_files(main_dir, sort=True)
Returns the list of all files inside the root dir.
Parameters:
- main_dir – absolute/relative path to root directory containing all files.
- sort – whether to sort the output list in alphabetical order.
get_all_dirs(main_dir, sort=True)
Returns the list of all the sub-dirs inside the root dir.
Parameters:
- main_dir – absolute/relative path to root directory containing all files.
- sort – whether to sort the output list in alphabetical order.
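The two listing functions above can be approximated with `os.walk`. The sketch below is only an illustration of the described behaviour: the names `list_files` and `list_dirs` are hypothetical, and returning full paths (rather than bare names) is an assumption.

```python
import os


def list_files(main_dir, sort=True):
    """Collect every file under main_dir, recursively (hypothetical helper)."""
    files = [
        os.path.join(root, name)
        for root, _, names in os.walk(main_dir)
        for name in names
    ]
    return sorted(files) if sort else files


def list_dirs(main_dir, sort=True):
    """Collect every sub-directory under main_dir, recursively (hypothetical helper)."""
    dirs = [
        os.path.join(root, name)
        for root, subdirs, _ in os.walk(main_dir)
        for name in subdirs
    ]
    return sorted(dirs) if sort else dirs
```

With `sort=False` the order is whatever the file system reports, which can differ between machines.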
get_num_of_files(main_dir)
Counts the number of files inside each sub-dir of the root.
Parameters:
- main_dir – absolute/relative path to root directory containing all files.
Returns a dictionary with the resulting file-count information.
get_basename(full_path, include_extension=True)
Returns the basename of the file, or the dir name, at the end of the given path. For a file, you can choose whether to include the extension.
Parameters:
- full_path – absolute/relative path to a file or directory.
- include_extension – if full_path leads to a file, the file's extension is included in the output string by default.
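The behaviour described above maps closely onto `os.path.basename` and `os.path.splitext`. A small sketch, with the hypothetical name `basename_of` standing in for the package function:

```python
import os


def basename_of(full_path, include_extension=True):
    """Return the last path component, optionally without its extension."""
    # normpath drops a trailing separator so "data/train/" yields "train".
    name = os.path.basename(os.path.normpath(full_path))
    if not include_extension:
        # Note: this also strips a dotted suffix from a directory name.
        name = os.path.splitext(name)[0]
    return name


print(basename_of("data/train/cat_01.jpg"))                           # cat_01.jpg
print(basename_of("data/train/cat_01.jpg", include_extension=False))  # cat_01
print(basename_of("data/train/"))                                     # train
```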
get_random_files(main_dir, count=1)
Returns a list of randomly selected files from the root dir.
Parameters:
- main_dir – absolute/relative path to root directory containing all files.
- count – the number of files to get from root and its sub-dirs.
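Random selection of this kind can be sketched with `random.sample` over a recursive file listing. The name `pick_random_files` and the extra `seed` argument are assumptions added for reproducibility in this example; they are not part of the documented signature.

```python
import os
import random


def pick_random_files(main_dir, count=1, seed=None):
    """Pick up to `count` distinct files from main_dir and its sub-dirs."""
    files = [
        os.path.join(root, name)
        for root, _, names in os.walk(main_dir)
        for name in names
    ]
    rng = random.Random(seed)  # seed=None gives a different pick on each call
    # Never ask for more files than exist, which would raise ValueError.
    return rng.sample(files, min(count, len(files)))
```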
split_some_data(origin_dir, dest_dir, split=0.3, move=False)
Copies a portion of the data to a new 'dest_dir'.
Parameters:
- origin_dir – origin dir which contains all the sub-dirs holding the files.
- dest_dir – destination dir in which to put the split data.
- split – float in [0, 1], the fraction of data to split off. The default is 0.3.
- move – if True, the selected files are moved to the new dir instead of copied. The default is False.
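One way such a split could work is sketched below using only the standard library. Several details are assumptions rather than documented behaviour: the split is applied per sub-directory, the sub-directory layout is recreated under `dest_dir`, the file count is rounded down, and the `seed` argument and the name `split_fraction` are hypothetical. `dest_dir` should lie outside `origin_dir`.

```python
import os
import random
import shutil


def split_fraction(origin_dir, dest_dir, split=0.3, move=False, seed=None):
    """Copy (or move) a random fraction of each directory's files to dest_dir."""
    rng = random.Random(seed)
    transfer = shutil.move if move else shutil.copy2
    transferred = []
    for root, _, names in os.walk(origin_dir):
        k = int(len(names) * split)  # number of files to take, rounded down
        if k == 0:
            continue
        # Mirror the sub-directory structure under dest_dir.
        rel = os.path.relpath(root, origin_dir)
        target = os.path.normpath(os.path.join(dest_dir, rel))
        os.makedirs(target, exist_ok=True)
        for name in rng.sample(sorted(names), k):
            dst = os.path.join(target, name)
            transfer(os.path.join(root, name), dst)
            transferred.append(dst)
    return transferred
```

With ten files per class and `split=0.3`, three files per class end up in the destination; with `move=True` the origin keeps the remaining seven.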
rename_wrt_dirname(main_dir)
Changes the names of all files inside main_dir according to their sub-dir names.
Parameters:
- main_dir – absolute/relative path to root directory containing all files.
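The exact naming scheme is not spelled out here, so the sketch below simply assumes one: each file gets its parent directory's name as a prefix (`cat/1.jpg` becomes `cat/cat_1.jpg`). Both that scheme and the name `prefix_with_dirname` are assumptions for illustration only.

```python
import os


def prefix_with_dirname(main_dir):
    """Rename every file to '<parent dir name>_<old name>' (assumed scheme)."""
    renamed = []
    for root, _, names in os.walk(main_dir):
        prefix = os.path.basename(os.path.normpath(root))
        for name in sorted(names):
            src = os.path.join(root, name)
            dst = os.path.join(root, f"{prefix}_{name}")
            # Refuse to overwrite an existing file instead of losing data.
            if os.path.exists(dst):
                raise FileExistsError(dst)
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Renaming like this keeps file names unique across classes once the files are pooled into a single directory.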
file_name_replacer(data_dir, new_name, name2replace)
Changes the names of all files inside a dir by replacing specific strings in the old file names with new ones, specified via two input lists.
Parameters:
- data_dir – absolute/relative path to root directory containing all files.
- new_name – list containing the new strings which will replace the old ones.
- name2replace – list containing the strings which will be replaced with new ones.
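A minimal sketch of list-driven renaming follows. It assumes the two lists are paired by position (the first entry of `name2replace` is replaced by the first entry of `new_name`, and so on) and that sub-directories are processed too; the name `replace_in_filenames` is hypothetical.

```python
import os


def replace_in_filenames(data_dir, new_name, name2replace):
    """Replace each name2replace[i] with new_name[i] in every file name."""
    if len(new_name) != len(name2replace):
        raise ValueError("new_name and name2replace must have the same length")
    renamed = []
    for root, _, names in os.walk(data_dir):
        for name in names:
            updated = name
            # Apply the replacements in list order.
            for old, new in zip(name2replace, new_name):
                updated = updated.replace(old, new)
            if updated == name:
                continue
            dst = os.path.join(root, updated)
            # Refuse to overwrite an existing file.
            if os.path.exists(dst):
                raise FileExistsError(dst)
            os.rename(os.path.join(root, name), dst)
            renamed.append(dst)
    return renamed
```

For example, `new_name=["val", ".jpg"]` with `name2replace=["train", ".jpeg"]` would turn `img_train_01.jpeg` into `img_val_01.jpg`.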
del_all_files(main_dir, confirmation=True)
Deletes all files from the root dir and all its sub-dirs.
Parameters:
- main_dir – absolute/relative path to root directory containing all files.
- confirmation – ask for confirmation before deleting the files. The default is True.
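Because deletion is irreversible, a confirmation step is worth seeing in code. The sketch below is an assumption about how such a guard might look, not the package's code: the name `delete_all_files`, the prompt text, and the injectable `ask` callable (which defaults to `input`) are all hypothetical. Directories are left in place; only files are removed.

```python
import os


def delete_all_files(main_dir, confirmation=True, ask=input):
    """Delete every file under main_dir; return how many were removed."""
    files = [
        os.path.join(root, name)
        for root, _, names in os.walk(main_dir)
        for name in names
    ]
    if confirmation:
        answer = ask(f"Delete {len(files)} files under {main_dir}? [y/N] ")
        if answer.strip().lower() != "y":
            return 0  # anything other than "y" aborts
    for path in files:
        os.remove(path)
    return len(files)
```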
plot_data_dist(main_dir)
Plots a bar graph showing the number of files in each sub-dir of the main dir.
Parameters:
- main_dir – main directory which contains all the classes.
- sort – whether to sort the data. None: the data is not sorted and the dirs will also be shown. 1: sort by class name. 2: sort by file count.
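The counting and sorting that feed such a bar graph can be sketched without any plotting library. In this illustration the name `count_files_per_dir` is hypothetical, only the immediate sub-dirs of `main_dir` are treated as classes, and sorting by file count is assumed to be ascending.

```python
import os


def count_files_per_dir(main_dir, sort=None):
    """Return (class name, file count) pairs for each immediate sub-dir."""
    counts = []
    with os.scandir(main_dir) as entries:
        for entry in entries:
            if entry.is_dir():
                # Count files in the class dir and everything below it.
                n = sum(len(names) for _, _, names in os.walk(entry.path))
                counts.append((entry.name, n))
    if sort == 1:
        counts.sort(key=lambda item: item[0])  # by class name
    elif sort == 2:
        counts.sort(key=lambda item: item[1])  # by file count (ascending, assumed)
    return counts
```

The resulting names and counts could then be handed to a plotting call such as matplotlib's `plt.bar`.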
Usage
For further details and more examples, visit my GitHub.
Source Distribution
File details
Details for the file fmutils-0.1.9.tar.gz.
File metadata
- Download URL: fmutils-0.1.9.tar.gz
- Upload date:
- Size: 8.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | bbb20130aca9061f747e1a929f49473cd3689e7b1f01b6f06b778583a272f91b
MD5 | d037888056ea4586d7fe348f50a0b618
BLAKE2b-256 | 4f96d3a9624b05c28b5cf5a7bc4992491c4db8ec8728c9e05ba76f7adff71bcd