"The library contains base functions of the RoughSets Theory (introduced by Zdzisław Pawlak in 1982)."

RoughSets library (Pandas version)

The goal of the library is to provide the base functions of Rough Sets Theory and a foundation for extensions that build further methods on top of it, such as:

  • pre-processing methods
  • finding the core and reducts
  • classifiers
  • post-processing methods

The library avoids explicit Python loops, which should help extensions stay fast when working with large datasets.

The library implements these main functions:

  • computation of indiscernibility relations - function: get_indiscernibility_relations

  • computation of lower and upper approximations, boundary and negative regions - all four regions are computed by the function get_approximation_indices. For optimization, the function returns only the indices of X and y, so they can be used for further computations before slicing X and y.
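The two computations listed above can be illustrated with plain pandas. This is a minimal sketch of the underlying idea, not the library's implementation: the helper names indiscernibility_class_ids and approximation_indices, their signatures, and the toy data are all hypothetical and are not part of roughsets-base. It groups rows by their attribute values instead of looping over them, and returns index objects rather than sliced data.

```python
import pandas as pd

# Toy decision table (made-up data): two conditional attributes and a decision.
X = pd.DataFrame({"a": [0, 0, 1, 1, 2], "b": [1, 1, 0, 0, 1]})
y = pd.Series([1, 0, 1, 1, 0], name="d")


def indiscernibility_class_ids(X, attributes):
    """Hypothetical helper: label each row with the id of its indiscernibility class.

    Rows with identical values on `attributes` receive the same id.
    """
    return X.groupby(list(attributes), sort=False).ngroup()


def approximation_indices(X, y, attributes, concept):
    """Hypothetical helper: row indices of the four regions for one concept."""
    class_ids = indiscernibility_class_ids(X, attributes)
    in_concept = (y == concept).astype(int)
    # Broadcast per-class min/max membership back to every row of the class.
    all_in = in_concept.groupby(class_ids).transform("min") == 1
    any_in = in_concept.groupby(class_ids).transform("max") == 1
    lower = X.index[all_in.to_numpy()]   # classes fully inside the concept
    upper = X.index[any_in.to_numpy()]   # classes overlapping the concept
    boundary = upper.difference(lower)   # upper minus lower
    negative = X.index.difference(upper) # everything outside the upper approximation
    return lower, upper, boundary, negative


lower, upper, boundary, negative = approximation_indices(X, y, ["a", "b"], concept=1)
print(list(lower), list(upper), list(boundary), list(negative))
```

Because only indices are returned, they can be combined with set operations first and applied to X and y once at the end, for example with X.loc[lower].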

The library includes unit tests for different datasets, subsets and concepts.

Requirements

Python >= 3.8
OS: Linux, Windows

Install from PyPI

pip install roughsets-base

Build and install the library from source code

pip install --upgrade pip
pip install --upgrade build

On Linux:
python3 -m build

On Windows:
py -m build

pip install dist/roughsets_base--py3-none-any.whl

Install CI and dev tools

pip install -r requirements.dev.txt

Unit tests

The file tests/test_dataset_KDD.py contains tests that use the KDD99 dataset.
By default the file is downloaded from: http://kdd.ics.uci.edu/databases/kddcup99/corrected.gz
If you already have the file on your system, you can set the OS environment variable ROUGHSETS_KDD99_TEST_DATA_FOLDER
to the path to the file.
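The fallback described above (use a local copy if the environment variable points to one, otherwise use the download URL) could look roughly like the sketch below. The function resolve_kdd99_source is hypothetical and is not taken from the test suite. Since the text does not say whether the variable should hold a folder or a file path, the sketch accepts either, and the file name corrected.gz is assumed from the download URL.

```python
import os
from pathlib import Path

KDD99_URL = "http://kdd.ics.uci.edu/databases/kddcup99/corrected.gz"
ENV_VAR = "ROUGHSETS_KDD99_TEST_DATA_FOLDER"


def resolve_kdd99_source(environ=None):
    """Hypothetical helper: return a local path if one is configured, else the URL."""
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR)
    if value:
        path = Path(value)
        if path.is_file():
            # The variable points directly at the data file.
            return str(path)
        # Assumption: inside a folder, the file keeps its name from the URL.
        candidate = path / "corrected.gz"
        if candidate.is_file():
            return str(candidate)
    # Nothing usable is configured locally, so fall back to the download URL.
    return KDD99_URL


print(resolve_kdd99_source())
```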
The file tests/KDD99_compare_with_R_RoughSets.R contains an R script which generates reference data using the well-known reference library RoughSets.
See: https://www.rdocumentation.org/packages/RoughSets/topics/RoughSets-package (R language). Reference datasets with results from the R RoughSets library are saved in the folder tests/datasets/KDD99.
You can disable running tests for a specific dataset in the file test_dataset_X.py (X - symbol of the dataset), in the method setUp.

The unit tests' results were checked against algorithms from the well-known reference library:
https://www.rdocumentation.org/packages/RoughSets/topics/RoughSets-package

Rebuild Sphinx documentation

pip install -r requirements.dev.txt
sphinx-build -b html ./doc ./doc/_build/html
sphinx-build -b man ./doc ./doc/_build/man

Recommended packages

sklearn-pandas https://github.com/scikit-learn-contrib/sklearn-pandas
pandas ecosystem: https://pandas.pydata.org/community/ecosystem.html

Citation

If you would like to use this library in your research, please cite it.

Jankowski, D. (2022). Roughsets-base: Python library for base Roughsets methods. (v1.0.1) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.5957474
