Alpha version of the Rasgo Python interface.
Project description
pyRasgo is a Python SDK for interacting with the Rasgo API. Rasgo accelerates feature engineering for data scientists.
Visit us at https://www.rasgoml.com/ to turn your data into Features in minutes!
Documentation is available at: https://docs.rasgoml.com/rasgo-docs/pyrasgo/
Package Dependencies
- idna>=2.5,<3
- more-itertools
- pandas
- pyarrow>=3.0
- pydantic
- pyyaml
- requests
- snowflake-connector-python>=2.4.0
- tqdm
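The version bounds above can be checked against a version string before installing or upgrading. What follows is a minimal sketch using only the standard library. `parse_version` and `satisfies` are hypothetical helpers written for illustration (they are not part of pyRasgo), and they only understand plain dotted numeric versions. For full PEP 440 handling, the `packaging` library is the usual tool.

```python
import operator

# Hypothetical helpers for illustration; not part of pyRasgo.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '2.4.0' into (2, 4, 0). Only plain dotted integers are supported."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                # Pad to equal length so (3,) compares like (3, 0).
                width = max(len(installed), len(bound))
                left = installed + (0,) * (width - len(installed))
                right = bound + (0,) * (width - len(bound))
                if not _OPS[symbol](left, right):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


# Bounds taken from the dependency list above.
print(satisfies("2.10", ">=2.5,<3"))  # idna
print(satisfies("3.0.0", ">=3.0"))    # pyarrow
```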
Release Notes
v0.2.3 (Alpha2)
- Add support for the optional CatBoost parameter `train_dir` to `evaluate.feature_importance()`, letting users dictate where temporary training files are generated
v0.2.3 (Alpha)
- introduces `publish.features_from_source_code()` function. This function allows customers to pass a custom SQL string to create a view in Snowflake using their own code. This function will: register a child source based off the parent source provided as input, then register features from the new child source table. (NOTE: custom Python functionality is coming later; the MVP supports custom SQL only)
- introduces a new workflow to the `publish.source_data()` function. Pass in `source_type="sql", sql_definition="<valid sql select string>"` to create a new Rasgo DataSource as a view in Snowflake using custom SQL. (NOTE: custom Python functionality is coming later; the MVP supports custom SQL only)
- makes the `features` parameter optional in the `publish.features_from_source()` function. If the parameter is not passed, all columns in the underlying table that are not in the `dimensions` list will be registered as features
- adds a `verbose` parameter to all publish methods. When set to True, prints status messages to stdout to update users on the progress of the function. Default = False.
- introduces `.sourceCode` property on Rasgo DataSource and FeatureSet classes to display the SQL or Python source code used to build the underlying table
- introduces `.display_source_code()` method on Rasgo DataSource, FeatureSet, Feature classes to display the SQL or Python code used to build the underlying table (NOTE: This is redundant with the above property. It is included in the alpha preview for feedback on which experience is better)
- introduces `.rebuild_from_source_code()` method on Rasgo DataSource, FeatureSet, Feature classes to run the SQL or Python code used to build the underlying table, effectively rebuilding that table. (NOTE: This is not functional in the alpha preview. More work is needed before including this in the next version push.)
- introduces `.render_sql_definition()` method on Collection class to display the SQL used to create the underlying collection view
- introduces `.dimensions` property on Rasgo Collection class to display all unique dimension columns in a Collection
- introduces `trigger_stats` parameter in `collection.generate_training_data()` method to allow users to generate a SQL view without kicking off correlation and join stats. Set to False to suppress stats jobs. Default = True. (NOTE: This is not fully functional in the alpha preview; more API work is needed before adding this to the next version push. It is included only for feedback on experience, not functionality.)
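Two of the behaviours in this release can be illustrated without a Rasgo account: building a source as a view from a custom SELECT string, and defaulting features to every column that is not a dimension. The sketch below is only an approximation of the idea. It uses SQLite from the standard library as a stand-in for Snowflake, and `create_view_source`, `default_features`, and the `sales` table are hypothetical names invented for the example, not pyRasgo functions.

```python
import sqlite3


def create_view_source(conn, view_name, sql_definition):
    """Create a view from a custom SELECT string (SQLite stand-in for Snowflake)."""
    # view_name is interpolated directly, so only pass trusted identifiers here.
    conn.execute(f'CREATE VIEW "{view_name}" AS {sql_definition}')


def default_features(conn, table_name, dimensions):
    """Return every column of `table_name` that is not listed in `dimensions`."""
    # LIMIT 0 fetches no rows but still exposes the column names.
    cursor = conn.execute(f'SELECT * FROM "{table_name}" LIMIT 0')
    columns = [description[0] for description in cursor.description]
    excluded = {name.lower() for name in dimensions}
    return [name for name in columns if name.lower() not in excluded]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, store_id INTEGER, units INTEGER, price REAL)"
)
conn.execute("INSERT INTO sales VALUES ('2021-07-01', 1, 3, 9.99)")

# The custom SQL plays the role of sql_definition in the release note above.
create_view_source(
    conn,
    "sales_enriched",
    "SELECT sale_date, store_id, units, units * price AS revenue FROM sales",
)

# With no explicit feature list, everything outside the dimensions is a feature.
features = default_features(
    conn, "sales_enriched", dimensions=["sale_date", "store_id"]
)
print(features)
```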
v0.2.2 (July 14, 2021)
- Allow for consistency in `evaluate.feature_importance()` evaluation metrics for unchanged dataframes
- Allow users to control certain CatBoost parameters when running `evaluate.feature_importance()`
v0.2.1 (July 01, 2021)
- expand `evaluate.feature_importance()` to support calculating importance for collections
v0.2.0 (June 24, 2021)
- introduce `publish.experiment()` method to fast track dataframes to Rasgo objects
- fix register bug
v0.1.14 (June 17, 2021)
- improve new user signup experience in `register()` method
- fix dataframe bug when experiment wasn't set
v0.1.13 (June 16, 2021)
- intelligently run Regressor or Classifier model in `evaluate.feature_importance()`
- improve model performance statistics in `evaluate.feature_importance()`: include AUC, Logloss, precision, recall for classification
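For readers unfamiliar with the classification statistics named in this release, the sketch below computes precision, recall, log loss, and AUC for a binary target in plain Python. It also includes one possible heuristic for choosing between a regressor and a classifier. Both `infer_task` and `binary_metrics` are hypothetical helpers, and the heuristic is an assumption made for illustration; the release notes do not say how pyRasgo makes that choice.

```python
import math


def infer_task(target_values, max_classes=10):
    """Guess the model type from the target column (assumed heuristic)."""
    distinct = set(target_values)
    # Non-numeric targets (for example strings) can only be class labels.
    if not all(isinstance(value, (int, float)) for value in distinct):
        return "classification"
    all_integral = all(float(value).is_integer() for value in distinct)
    if all_integral and len(distinct) <= max_classes:
        return "classification"
    return "regression"


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Precision, recall, log loss, and AUC for 0/1 labels and probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Clip probabilities so log(0) never occurs.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    logloss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    # AUC as the share of (positive, negative) pairs ranked correctly; ties count half.
    positives = [p for t, p in zip(y_true, y_prob) if t == 1]
    negatives = [p for t, p in zip(y_true, y_prob) if t == 0]
    pairs = len(positives) * len(negatives)
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    auc = wins / pairs if pairs else float("nan")

    return {"precision": precision, "recall": recall, "logloss": logloss, "auc": auc}


labels = [1, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.6]
print(infer_task(labels))
print(binary_metrics(labels, probabilities))
```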
v0.1.12 (June 11, 2021)
- support fully qualified table names (fqtn) in the `publish.source_data(table)` parameter
- trim timestamps in dataframe profiles to second grain
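Both changes in this release are small transformations that are easy to show in isolation. The sketch below assumes a fully qualified table name has the Snowflake-style form `DATABASE.SCHEMA.TABLE` and does not handle quoted identifiers containing dots. `split_fqtn` and `trim_to_second` are hypothetical helpers, not pyRasgo functions.

```python
from datetime import datetime


def split_fqtn(fqtn):
    """Split 'DATABASE.SCHEMA.TABLE' into its three parts (assumed format)."""
    parts = fqtn.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected DATABASE.SCHEMA.TABLE, got {fqtn!r}")
    return tuple(parts)


def trim_to_second(timestamp):
    """Drop sub-second precision so the value is at second grain."""
    return timestamp.replace(microsecond=0)


database, schema, table = split_fqtn("ANALYTICS.PUBLIC.SALES")
print(database, schema, table)
print(trim_to_second(datetime(2021, 6, 11, 12, 30, 45, 123456)))
```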
v0.1.11 (June 9, 2021)
- hotfix for unexpected histogram output
v0.1.10 (June 8, 2021)
- pin pyarrow dependency to versions below 4.0 to prevent segmentation fault errors
v0.1.9 (June 8, 2021)
- improve model performance in `evaluate.feature_importance()` by adding test set to catboost eval
v0.1.8 (June 7, 2021)
- `evaluate.train_test_split()` function supports non-timeseries dataframes
- `evaluate.feature_importance()` function now runs on an 80% training set
- adds `timeseries_index` parameter to `evaluate.feature_importance()` & `prune.features()` functions
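The difference between a timeseries split and a non-timeseries split can be sketched in a few lines. The example below is an illustration under stated assumptions, not pyRasgo's implementation: `split_rows` is a hypothetical helper that works on a list of dicts, holds out the most recent 20% when a `timeseries_index` column is given, and otherwise shuffles with a fixed seed so repeated runs on unchanged data give the same split.

```python
import random


def split_rows(rows, timeseries_index=None, train_fraction=0.8, seed=42):
    """Split a list of dict rows into (train, test)."""
    if timeseries_index is not None:
        # Timeseries data: train on the earliest rows, test on the latest.
        ordered = sorted(rows, key=lambda row: row[timeseries_index])
    else:
        # Non-timeseries data: a seeded shuffle keeps the split reproducible.
        ordered = list(rows)
        random.Random(seed).shuffle(ordered)
    cutoff = int(len(ordered) * train_fraction)
    return ordered[:cutoff], ordered[cutoff:]


rows = [{"day": day, "y": day * 2} for day in range(10, 0, -1)]

train, test = split_rows(rows, timeseries_index="day")
print(len(train), [row["day"] for row in test])

shuffled_train, shuffled_test = split_rows(rows)
print(len(shuffled_train), len(shuffled_test))
```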
v0.1.7 (June 2, 2021)
- expands dataframe series type recognition for profiling
v0.1.6 (June 2, 2021)
- cleans up dataframe profiles to enhance stats and visualization for non-numeric data
v0.1.5 (June 2, 2021)
- introduces `pip install "pyrasgo[df]"` option which will install: shap, catboost, & scikit-learn
v0.1.4 (June 2, 2021)
- various improvements to dataframe profiles & feature_importance
v0.1.3 (May 27, 2021)
- introduces experiment tracking on dataframes
- fixes errors when running feature_importance on dataframes with NaN values
v0.1.2 (May 26, 2021)
- generates column profile automatically when running feature_importance
v0.1.1 (May 24, 2021)
- supports sharing public dataframe profiles
- enforces assignment of granularity to dimensions in Publish methods based on list ordering
v0.1.0 (May 17, 2021)
- introduces dataframe methods: evaluate, prune, transform
- supports free pyrasgo trial registration
v0.0.79 (April 19, 2021)
- support additional datetime data types on Features
- resolve import errors
v0.0.78 (April 5, 2021)
- adds include_shared param to get_collections() method
v0.0.77 (April 5, 2021)
- adds convenience method to rename a Feature’s displayName
- adds convenience method to promote a Feature from Sandbox to Production status
- fixes permissions bug when trying to read Community data sources from a public org
v0.0.76 (April 5, 2021)
- adds columns to DataSource primitive
- adds verbose error message to inform users when a Feature name conflict is preventing creation
v0.0.75 (April 5, 2021)
- introduce interactive Rasgo primitives
v0.0.74 (March 25, 2021)
- upgrade Snowflake python connector dependency to 2.4.0
- upgrade pyarrow dependency to 3.0
File details
Details for the file pyrasgo-0.2.3a2.tar.gz.
File metadata
- Download URL: pyrasgo-0.2.3a2.tar.gz
- Upload date:
- Size: 56.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.7.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8dce848615722b8e2c042f4db7c669ee7b3051d69647567b360663161051a446 |
| MD5 | 0d9c9866ae86dcda315cac96a8949ce2 |
| BLAKE2b-256 | 7e549d651c76272f9b54cce4140b1428b23005c32bc574155666ee8531be25b6 |
File details
Details for the file pyrasgo-0.2.3a2-py3-none-any.whl.
File metadata
- Download URL: pyrasgo-0.2.3a2-py3-none-any.whl
- Upload date:
- Size: 73.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.7.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7bf1b28bd967a6b9897cf5aefbee956c1b7cc4501126a24188c1de75ccd4ffa9 |
| MD5 | e56317879e96b3263b92c6a548f28054 |
| BLAKE2b-256 | 334435909809ded85c571848cb8687c7c09c6e934fd66a22c6b18feabc8dcb6e |
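The digests listed for each file can be checked locally after download. Below is a minimal sketch using `hashlib` from the standard library. `file_digests` and `matches` are hypothetical helper names, and the demo runs on a temporary file so that it works without downloading anything.

```python
import hashlib
import os
import tempfile


def file_digests(path, chunk_size=65536):
    """Return the SHA256, MD5, and BLAKE2b-256 hex digests of a file."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches(path, expected):
    """Return True if every expected digest equals the computed one."""
    actual = file_digests(path)
    return all(actual[name] == digest.lower() for name, digest in expected.items())


# Self-contained demo on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name
try:
    demo = file_digests(demo_path)
    print(matches(demo_path, {"SHA256": demo["SHA256"]}))
finally:
    os.remove(demo_path)
```

To check a real download, pass the path of the downloaded file (for example `pyrasgo-0.2.3a2.tar.gz`) together with a dict of the digests from the matching table.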