Feature selection and missing data imputation

Fuzzy Imputation and Critical Attribute Reduction for Intelligent Analysis


🔷 About The Package

The ficaria package provides custom, scikit-learn–compatible transformers for data imputation and feature selection. The transformers integrate with scikit-learn pipelines, so they can be used in real-world machine learning workflows and extended for custom or research-oriented use cases.

The package was developed as part of a Bachelor’s degree thesis at the Warsaw University of Technology, Faculty of Mathematics and Information Science. All implemented methods are fuzzy-based, leveraging concepts from fuzzy set theory to handle uncertainty, vagueness, and incomplete data in a principled and interpretable manner. This makes ficaria particularly suitable for datasets where classical crisp methods may be insufficient or overly restrictive.
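To illustrate what "fuzzy" means here, the sketch below contrasts a crisp yes/no test with a graded membership degree. It is a generic textbook construction (a triangular membership function), not code from ficaria; the function name, the "warm temperature" set, and its bounds are made up for this example.

```python
def triangular_membership(x: float, left: float, peak: float, right: float) -> float:
    """Degree in [0, 1] to which x belongs to a triangular fuzzy set.

    Assumes left < peak < right. Hypothetical helper, not part of ficaria.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        # Rising edge: membership grows linearly from 0 at `left` to 1 at `peak`.
        return (x - left) / (peak - left)
    # Falling edge: membership shrinks linearly from 1 at `peak` to 0 at `right`.
    return (right - x) / (right - peak)


def is_warm_crisp(x: float) -> bool:
    """Crisp counterpart: a value is either inside the interval or not."""
    return 18.0 <= x <= 26.0


# A temperature of 18.5 is "warm" to degree 0.5 in the fuzzy set,
# while the crisp rule can only answer True or False.
degree = triangular_membership(18.5, left=15.0, peak=22.0, right=30.0)
crisp = is_warm_crisp(18.5)
```

Graded degrees like this let a method express that a value only partly belongs to a group, which is the kind of vagueness crisp thresholds cannot represent.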

⚙️ Prerequisites

The ficaria package depends on the following Python libraries:

  • NumPy
  • Pandas
  • SciPy
  • scikit-learn
  • kneed

🛠 Setup

Ficaria can be installed from PyPI:

pip install ficaria

pip installs all dependencies automatically.
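One way to confirm that the installation succeeded is to query the installed distribution's version from Python. The sketch below uses only the standard library; the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("ficaria")
    print(f"ficaria version: {found}" if found else "ficaria is not installed")
```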

🚀 Usage

All ficaria transformers implement the standard fit / transform interface, so they can be used directly in pipelines alongside scalers, estimators, and cross-validation tools.
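For readers new to this interface, the sketch below shows the fit / transform contract with a deliberately simple stand-in. ColumnMeanImputer is a hypothetical class written for this example and is not part of ficaria; it uses plain mean imputation rather than any fuzzy method, purely to show how a transformer learns in fit, applies in transform, and slots into a scikit-learn Pipeline.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ColumnMeanImputer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in transformer; NOT part of ficaria."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Learn per-column means from the training data, ignoring NaNs.
        self.means_ = np.nanmean(X, axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float).copy()
        # Fill each missing entry with the column mean learned in fit().
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = self.means_[cols]
        return X


X_train = np.array([[1.0, 2.0], [np.nan, 4.0], [3.0, np.nan]])
X_test = np.array([[np.nan, 1.0]])

# The transformer is chained with a scaler like any built-in step.
pipeline = Pipeline([
    ("imputer", ColumnMeanImputer()),
    ("scaler", StandardScaler()),
])
pipeline.fit(X_train)
X_out = pipeline.transform(X_test)
```

The key point is that statistics are learned from the training data only and then reused on the test data, which is what the pipeline enforces when fit and transform are kept separate.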

Example 1 — Feature Selection with FuzzyGranularitySelector

from ficaria import FuzzyGranularitySelector

selector = FuzzyGranularitySelector(n_features=5, eps=0.3)
selector.fit(X_train, y_train)
X_reduced = selector.transform(X_test)

Example 2 — Data Imputation with FCMKIterativeImputer

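from ficaria import FCMKIterativeImputer

# Create the imputer with default parameters and fit it on the training data
imputer = FCMKIterativeImputer()
imputer.fit(X_train, y_train)

# Fill in the missing values of the test data
X_transformed = imputer.transform(X_test)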

Example 3 — Combining Transformers in a Pipeline

Since all transformers implement fit and transform, they can be combined:

from sklearn.pipeline import Pipeline
from ficaria import FuzzyGranularitySelector, FCMKIterativeImputer

pipeline = Pipeline([
    ("imputer", FCMKIterativeImputer()),
    ("selector", FuzzyGranularitySelector(n_features=5, eps=0.3)),
])

pipeline.fit(X_train, y_train)
X_final = pipeline.transform(X_test)
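The examples above assume that X_train, X_test, and y_train already exist. As a rough sketch, one way to build such inputs for experimentation is to generate synthetic data with scikit-learn and blank out some entries. The dataset sizes, the 10% missing rate, and the variable names are arbitrary choices for this example, and whether the ficaria transformers expect NumPy arrays or pandas DataFrames is not stated here, so the arrays may need wrapping in a DataFrame.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 10 features, binary target.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Blank out roughly 10% of the entries to simulate missing values.
missing_mask = rng.random(X.shape) < 0.10
X_missing = X.copy()
X_missing[missing_mask] = np.nan

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X_missing, y, test_size=0.25, random_state=0
)
```

With these arrays in place, the pipeline from Example 3 can be fitted on X_train and y_train and then applied to X_test.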

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

👥 Authors

Aleksandra Kwiatkowska
Email: aleksandra.kwiatkowska263@gmail.com
GitHub: @kwiatkowskaa

Małgorzata Mokwa
Email: malgosiam628@gmail.com
GitHub: @malgosiam2

Bogumiła Okrojek
Email: bogumila.okrojek@gmail.com
GitHub: @szostkawron
