Fluent interface for data processing, advanced toolkit for data science
This package is a data-science-specific update for
yo_fluq that introduces:
- querying and output for
pandas data structures and files in
feed-based extension methods.
The main reason for keeping this package separate from
yo_fluq is that data science functionality requires large packages like
matplotlib, which I didn't want to include in a basic package.
Small useful classes
Obj is an ordered dict with member-like access:
obj.a=12 works exactly as
Enum with ordering, it's useful when using enums in
pandas, because the basic enumeration cannot be used as keys for
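The two helpers described above can be approximated with the standard library alone. The following is a minimal sketch, not the package's implementation: `AttrDict`, `ComparableEnum` and `Status` are hypothetical names chosen for illustration, and the sketch assumes that "ordering" means comparing enum members by their values.

```python
from enum import Enum


class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys are also reachable as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment is routed to item assignment.
        self[name] = value


class ComparableEnum(Enum):
    """Hypothetical stand-in: enum members of the same class compare by value."""

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented


class Status(ComparableEnum):
    NEW = 1
    SHIPPED = 2
    DELIVERED = 3


obj = AttrDict()
obj.a = 12          # stored as a regular dict item
obj["b"] = 7        # readable as obj.b
keys_in_order = list(obj)  # dicts preserve insertion order
sorted_statuses = sorted([Status.DELIVERED, Status.NEW, Status.SHIPPED])
```

A plain `Enum` does not define `<`, so sorting its members raises `TypeError`; adding the comparison methods is what makes the members sortable.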
Query.combinatorics has some useful methods to create lazy combinatorial enumerations:
cartesian(en1,en2,...) will create a cartesian product of enumerations in
grid(field1=en1,field2=en2) will create an enumeration of
field2 that runs over cartesian product of
triangle is a query-like replacement for loops
powerset produces all the subsets of a given set
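To make the behaviour of these enumerations concrete, here is a rough standard-library sketch of three of them. It does not use or reproduce the package's code: the function names merely mirror the ones listed above, and `grid` is assumed to yield plain dicts keyed by field name, which may differ from what the package returns.

```python
from itertools import chain, combinations, product


def cartesian(*iterables):
    """Yield one tuple per combination of one element from each input."""
    # itertools.product reads its inputs up front but yields results one at a time.
    return product(*iterables)


def grid(**named_iterables):
    """Yield one dict per point of the cartesian product, keyed by field name."""
    names = list(named_iterables)
    for values in product(*named_iterables.values()):
        yield dict(zip(names, values))


def powerset(items):
    """Yield every subset of `items` as a tuple, smallest subsets first."""
    pool = list(items)
    return chain.from_iterable(
        combinations(pool, size) for size in range(len(pool) + 1)
    )


pairs = list(cartesian([1, 2], "ab"))
points = list(grid(field1=[1, 2], field2=["x"]))
subsets = list(powerset([1, 2, 3]))
```

All three return iterators, so nothing is computed until the results are consumed; the `list(...)` calls at the end are only there to show the contents.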
Adds several aggregators/query sources to work with files.
Query.file.text: text file, its lines are interpreted as enumeration's objects
Query.file.zipped_text: zipped text file
Query.file.pickle: an internal format, lazily writes a sequence of objects in pickle format to one file.
Query.to_zipped_folder: representation for
KeyValuePair: filenames are keys, file contents are values
FileIO class with one-line instructions to read text, json, pickle, jsonpickle, yaml files.
Query.folder method to create enumeration of
Path objects from folder
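The idea of lazily writing a sequence of objects into a single pickle file can be sketched with the standard library as follows. This is an illustration under assumptions, not the package's actual file format or API: `write_pickle_stream` and `read_pickle_stream` are hypothetical helper names, and the sketch simply appends one pickle record per object.

```python
import pickle
import tempfile
from pathlib import Path


def write_pickle_stream(objects, path):
    """Write objects one after another, so the whole sequence is never held in memory."""
    with open(path, "wb") as stream:
        for obj in objects:
            pickle.dump(obj, stream)


def read_pickle_stream(path):
    """Yield objects back one at a time until the end of the file is reached."""
    with open(path, "rb") as stream:
        while True:
            try:
                yield pickle.load(stream)
            except EOFError:
                return


with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "items.pkl"
    # The source is a generator, so items are produced and written one by one.
    write_pickle_stream(({"id": i} for i in range(3)), target)
    restored = list(read_pickle_stream(target))
```

Because both sides work item by item, the same approach scales to sequences that do not fit into memory.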
Query.series to convert series in
Query.df to convert dataframe in
feed method to
SeriesGroupBy by monkey-patching. It is now possible to write something like:
    (df
        .loc[df.status=='shipped']
        .feed(lambda z: z.groupby(z.date.dt.to_period('M')))
        .size()
    )
When the lambda is called inside feed,
z is bound to the dataframe as it is after filtering.
This technique allows longer fluent chains in
pandas, which are otherwise impossible once a filtering step is involved, because the filtered dataframe has no name to refer to.
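The mechanism can be illustrated without pandas. The sketch below is not the package's code: `Rows` is a hypothetical container standing in for a dataframe, `count_by_month` is a hypothetical helper, and `feed` is assumed to do nothing more than pass the current object to a callable and return the result.

```python
class Rows:
    """Hypothetical minimal container standing in for a dataframe."""

    def __init__(self, records):
        self.records = list(records)

    def where(self, predicate):
        # Filtering returns a new, unnamed object, like df.loc[...] inside a chain.
        return Rows(r for r in self.records if predicate(r))


def feed(self, func):
    """Hand the current object to `func` and return whatever it produces."""
    return func(self)


# Monkey-patching: attach the method to an existing class after its definition.
Rows.feed = feed


def count_by_month(rows):
    """Count records per value of the 'month' field."""
    counts = {}
    for record in rows.records:
        counts[record["month"]] = counts.get(record["month"], 0) + 1
    return counts


orders = Rows([
    {"status": "shipped", "month": "2020-01"},
    {"status": "shipped", "month": "2020-02"},
    {"status": "new", "month": "2020-02"},
    {"status": "shipped", "month": "2020-02"},
])

shipped_per_month = (
    orders
    .where(lambda r: r["status"] == "shipped")
    .feed(lambda z: count_by_month(z))  # z is the filtered Rows object
)
```

Inside the lambda, `z` gives a name to the intermediate result, so the chain can continue without assigning the filtered object to a variable first.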
Some methods from
yo_fluq_ds are not incorporated into
Queryables, because they are not used that often and I want to avoid overloading
Queryable with such methods. So, they are accessible only via
All of them are inside
fluq.with_progress_bar is a Queryable-friendly wrapper over
tqdm. It automatically detects notebook/console environments. The
total (length of the enumerable) in most cases is known from the
Queryable.length field, but sometimes needs to be provided.
fluq.with_plots(columns,figsize) will create a plot for each element in the enumerable and return an enumerable of
ItemWithAx. Very handy for drawing several plots at once, e.g. for different columns in a dataframe
fluq.pairwise converts an enumerable to an enumerable of pairs of neighbouring elements
fluq.fractions can be used where size is normally used, to determine the relative size of the groups
fluq.trimmer can be used to trim too high/too low values from a series, thus facilitating the creation of histograms.
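Two of these helpers are simple enough to sketch with the standard library. The functions below are illustrative stand-ins and not the package's implementation: they share only their names with `fluq.pairwise` and `fluq.fractions`, and `fractions` is assumed to operate on a flat sequence of group labels rather than on a pandas group-by object.

```python
from collections import Counter
from itertools import tee


def pairwise(iterable):
    """Yield (previous, current) for each pair of neighbouring elements."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one element
    return zip(first, second)


def fractions(labels):
    """Return each label's share of the total instead of its absolute count."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


neighbours = list(pairwise([1, 2, 4, 7]))
shares = fractions(["a", "b", "a", "a"])
```

Here `neighbours` holds the consecutive pairs of the input list, and `shares` maps each label to its proportion of all labels.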