Fluent interface for data processing, advanced toolkit for data science
Project description
yo_fluq_ds
This package is a data science-specific extension of yo_fluq that introduces:
- querying and output for `pandas` data structures and files in `Queryable`
- handy feed-based extension methods.
The main reason for separating yo_fluq_ds from yo_fluq is that data science functionality requires huge packages like pandas and matplotlib, which I didn't want to include in a basic package.
Small useful classes
- `Obj` is an ordered dict with member-like access: `obj.a=12` works exactly as `obj['a']=12`
- `OrderedEnum` is an `Enum` with ordering. It is useful when using enums in `pandas`, because the basic enumeration cannot be used as keys for `group_by`
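The behaviour described for these two classes can be sketched with the standard library alone. `SketchObj`, `SketchOrderedEnum` and `Size` below are hypothetical stand-ins, not the classes shipped in yo_fluq_ds. They only illustrate member-like access on a dict and an enum whose members can be compared and sorted.

```python
from enum import Enum
from functools import total_ordering


class SketchObj(dict):
    """Hypothetical stand-in for Obj: attribute access maps onto dict keys.

    A plain dict keeps insertion order in Python 3.7+.
    """

    def __getattr__(self, name):
        # Only called when the normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


@total_ordering
class SketchOrderedEnum(Enum):
    """Hypothetical stand-in for OrderedEnum: members compare by value."""

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented


class Size(SketchOrderedEnum):
    SMALL = 1
    LARGE = 3
    MEDIUM = 2


obj = SketchObj()
obj.a = 12          # same effect as obj['a'] = 12
obj['b'] = 'x'      # readable as obj.b

ordered = sorted([Size.LARGE, Size.SMALL, Size.MEDIUM])
```

Because the members are orderable, they can be sorted or used wherever a comparison between keys is required.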
Pull-queries updates
Combinatorics
`Query.combinatorics` has some useful methods for creating lazy combinatorial enumerations:
- `cartesian(en1,en2,...)` creates a cartesian product of the enumerations `en1`, `en2`, etc.
- `grid(field1=en1,field2=en2)` creates an enumeration of `Obj` with fields `field1`, `field2` that runs over the cartesian product of `en1`, `en2`, etc.
- `triangle` is a query-like replacement for the loops `i=0..N`, `j=0..i`
- `powerset` produces all the subsets of a given set
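For readers who want to see what these enumerations contain, here is a minimal sketch built on `itertools`. The functions are illustrative equivalents, not the yo_fluq_ds implementations: `grid` yields plain dicts instead of `Obj`, and `triangle` assumes the half-open bounds `j < i < n`, which the description above does not specify.

```python
from itertools import chain, combinations, product


def cartesian(*enumerables):
    # Tuples are produced one at a time, although itertools.product
    # reads its inputs fully before yielding the first tuple.
    return product(*enumerables)


def grid(**fields):
    # One dict per point of the cartesian product, keyed by field name.
    names = list(fields)
    for values in product(*fields.values()):
        yield dict(zip(names, values))


def triangle(n):
    # Assumption: pairs (i, j) with 0 <= j < i < n.
    for i in range(n):
        for j in range(i):
            yield (i, j)


def powerset(items):
    # All subsets, from the empty one up to the full set.
    items = list(items)
    return chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )


points = list(grid(x=[1, 2], y=['a']))
pairs = list(triangle(3))
subsets = list(powerset([1, 2]))
```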
File system
Adds several aggregators/query sources to work with files.
- `to_text_file`/`Query.file.text`: text file; its lines are interpreted as the enumeration's objects
- `to_zip_file`/`Query.file.zipped_text`: zipped text file
- `to_pickle_file`/`Query.file.pickle`: an internal format that lazily writes a sequence of objects in pickle format to one file
- `to_zip_folder`/`Query.to_zipped_folder`: representation for `KeyValuePair`: filenames are keys, file contents are values
Adds the `FileIO` class with one-line instructions for reading text, json, pickle, jsonpickle and yaml files.
Adds the `Query.folder` method to create an enumeration of `Path` objects from a folder.
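The idea of lazily storing a sequence of objects as consecutive pickles in one file can be sketched as follows. This is an assumption about the general technique, not the on-disk format yo_fluq_ds actually uses. The helper names `write_pickle_stream` and `read_pickle_stream` are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def write_pickle_stream(items, path):
    # Each object is pickled as soon as it is produced, so the whole
    # sequence never has to be held in memory.
    with open(path, 'wb') as stream:
        for item in items:
            pickle.dump(item, stream)


def read_pickle_stream(path):
    # Objects are unpickled one at a time until the file is exhausted.
    with open(path, 'rb') as stream:
        while True:
            try:
                yield pickle.load(stream)
            except EOFError:
                return


with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'items.pkl'
    write_pickle_stream(({'id': i} for i in range(3)), path)
    restored = list(read_pickle_stream(path))
```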
pandas
- Adds `to_series`, `to_dataframe` and `to_ndarray` aggregators
- Adds `Query.series` to convert a series into a `KeyValuePair` enumeration
- Adds `Query.df` to convert a dataframe into an `Obj` (or `dict`) enumeration
Adds feed method to DataFrame, Series, DataFrameGroupBy and SeriesGroupBy by monkey-patching. It is now possible to write something like:
(df
 .loc[df.status=='shipped']
 .feed(lambda z: z.groupby(z.date.dt.to_period('M')))
 .size()
)
When the lambda inside feed is called, z is bound to the dataframe that remains after filtering.
This technique allows longer fluent chains in pandas. Without it, the filtered dataframe would first have to be stored in a variable, because the grouping key must refer to the filtered data.
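The mechanism itself is small and can be shown without pandas. In the sketch below, `Records` is a hypothetical list-based container standing in for a dataframe, and `feed` is attached to it after the class is defined, which is the monkey-patching step. The actual implementation in yo_fluq_ds may differ.

```python
from collections import Counter


def feed(self, func):
    """Pass the object itself to func and return whatever func returns."""
    return func(self)


class Records(list):
    """Hypothetical container standing in for a DataFrame."""

    def where(self, predicate):
        return Records(row for row in self if predicate(row))


# Monkey-patching: attach feed to an already existing class.
Records.feed = feed

orders = Records([
    {'status': 'shipped', 'month': '2020-01'},
    {'status': 'shipped', 'month': '2020-02'},
    {'status': 'open', 'month': '2020-01'},
    {'status': 'shipped', 'month': '2020-01'},
])

# Inside the lambda, z is the filtered container, not the original one.
shipped_per_month = (orders
    .where(lambda row: row['status'] == 'shipped')
    .feed(lambda z: Counter(row['month'] for row in z))
)
```

The chain never needs a name for the intermediate, filtered object, which is the point of the technique.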
feed-extension methods
Some methods from yo_fluq_ds are not incorporated into Queryable, because they are not used that often and I want to avoid overloading Queryable with them. They are therefore accessible only via the feed method.
All of them are inside the fluq module.
For Queryable
- `fluq.with_progress_bar` is a Queryable-friendly wrapper over `tqdm`. It automatically detects notebook/console environments. The `total` (length of the enumerable) is in most cases known from the `Queryable.length` field, but sometimes needs to be provided.
- `fluq.with_plots(columns,figsize)` creates a plot for each element of the enumerable and returns an enumerable of `ItemWithAx`. Very handy for drawing several plots at once, e.g. for different columns in a dataframe
- `fluq.pairwise` converts an enumerable to an enumerable of pairs of neighbouring elements
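As an illustration of what a pairwise transformation yields, here is a standard-library sketch following the well-known `itertools` recipe. The function below is a stand-in for the idea, not the `fluq.pairwise` implementation.

```python
from itertools import tee


def pairwise(enumerable):
    # Two independent iterators over the same source; the second one
    # is advanced by one element so that zip pairs up neighbours.
    first, second = tee(enumerable)
    next(second, None)
    return zip(first, second)


# Differences between neighbouring elements.
steps = [b - a for a, b in pairwise([1, 4, 9, 16])]
```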
For pandas
- `fluq.fractions` can be used where `size` is normally used, to determine the relative sizes of the groups
- `fluq.trimmer` can be used to trim values that are too high or too low from a series, which makes histograms easier to create.
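Both ideas can be sketched on plain Python lists. The helpers below are hypothetical and only approximate the description: `fractions` returns the share of each group instead of its count, and `trim` assumes that out-of-range values are dropped using explicit bounds. How `fluq.trimmer` chooses its bounds, and whether it drops or clips, is not stated here.

```python
from collections import Counter


def fractions(labels):
    # Relative size of each group: count divided by the total count.
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def trim(values, lower, upper):
    # Assumption: values outside [lower, upper] are dropped.
    return [value for value in values if lower <= value <= upper]


shares = fractions(['a', 'a', 'b', 'c'])
trimmed = trim([1, 5, 100, 7, -50], lower=0, upper=10)
```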
Download files
File details
Details for the file yo_fluq_ds-1.1.9.tar.gz.
File metadata
- Download URL: yo_fluq_ds-1.1.9.tar.gz
- Upload date:
- Size: 20.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2.post20191203 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ee9a6951c08006d5d4b60743ecec102555096a7d26638753edcd9ecf00084272 |
| MD5 | 0423ae8e092fbb74fee2bbc9a1b06a0f |
| BLAKE2b-256 | c7ae34f97aa4c24d49b94f8486645dc8e584bdf3da0b9163c44cb857c8e76ae6 |
File details
Details for the file yo_fluq_ds-1.1.9-py3-none-any.whl.
File metadata
- Download URL: yo_fluq_ds-1.1.9-py3-none-any.whl
- Upload date:
- Size: 37.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2.post20191203 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 95fab67c9fc3f07d80800ad21a309e74a75d16ee1d2d92fa321ae368246fa865 |
| MD5 | bec4b2e769b0cd9e2f8b84d90c679083 |
| BLAKE2b-256 | d198df0515c946f4630144874fe1996ed03f86e40695b405cb63517da89f74fd |